[incubator-pinot] annotated tag release-0.4.0-rc4 updated (1ab389e -> 3747154)

2020-05-29 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to annotated tag release-0.4.0-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


*** WARNING: tag release-0.4.0-rc4 was modified! ***

from 1ab389e  (commit)
  to 3747154  (tag)
 tagging 1ab389e5c09f8abc20cd4d4eed10df2f6cac4638 (commit)
  by Xiang Fu
  on Fri May 29 22:23:48 2020 -0700

- Log -
[maven-release-plugin] copy for tag release-0.4.0-rc4
---


No new revisions were added by this update.

Summary of changes:


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch release-0.4.0-rc4 updated: [maven-release-plugin] prepare for next development iteration

2020-05-29 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a commit to branch release-0.4.0-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/release-0.4.0-rc4 by this push:
 new b8b5d8b  [maven-release-plugin] prepare for next development iteration
b8b5d8b is described below

commit b8b5d8b133865346df849840a8572283a0cd8f9f
Author: Xiang Fu 
AuthorDate: Fri May 29 22:23:52 2020 -0700

[maven-release-plugin] prepare for next development iteration
---
 pinot-broker/pom.xml  | 2 +-
 pinot-clients/pinot-java-client/pom.xml   | 2 +-
 pinot-clients/pom.xml | 2 +-
 pinot-common/pom.xml  | 2 +-
 pinot-controller/pom.xml  | 2 +-
 pinot-core/pom.xml| 2 +-
 pinot-distribution/pom.xml| 2 +-
 pinot-integration-tests/pom.xml   | 2 +-
 pinot-minion/pom.xml  | 2 +-
 pinot-perf/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-standalone/pom.xml| 2 +-
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 2 +-
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 2 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 2 +-
 pinot-plugins/pinot-file-system/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 2 +-
 pinot-plugins/pom.xml | 2 +-
 pinot-server/pom.xml  | 2 +-
 pinot-spi/pom.xml | 2 +-
 pinot-tools/pom.xml   | 2 +-
 pom.xml   | 4 ++--
 42 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index 2fd885d..a884630 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>
diff --git a/pinot-clients/pinot-java-client/pom.xml b/pinot-clients/pinot-java-client/pom.xml
index 615c5e9..60177d7 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot-clients</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-java-client</artifactId>
diff --git a/pinot-clients/pom.xml b/pinot-clients/pom.xml
index e05e7ec..902f54f 100644
--- a/pinot-clients/pom.xml
+++ b/pinot-clients/pom.xml
@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-clients</artifactId>
diff --git a/pinot-common/pom.xml b/pinot-common/pom.xml
index db2d28d..7f2a7fc 100644
--- a/pinot-common/pom.xml
+++ b/pinot-common/pom.xml
@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   pin

[incubator-pinot] branch release-0.4.0-rc4 created (now 1ab389e)

2020-05-29 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch release-0.4.0-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 1ab389e  [maven-release-plugin] prepare release release-0.4.0-rc4

This branch includes the following new commits:

 new 1ab389e  [maven-release-plugin] prepare release release-0.4.0-rc4

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] 01/01: [maven-release-plugin] prepare release release-0.4.0-rc4

2020-05-29 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a commit to branch release-0.4.0-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 1ab389e5c09f8abc20cd4d4eed10df2f6cac4638
Author: Xiang Fu 
AuthorDate: Fri May 29 22:23:32 2020 -0700

[maven-release-plugin] prepare release release-0.4.0-rc4
---
 pinot-broker/pom.xml  | 5 ++---
 pinot-clients/pinot-java-client/pom.xml   | 5 ++---
 pinot-clients/pom.xml | 6 ++
 pinot-common/pom.xml  | 5 ++---
 pinot-controller/pom.xml  | 5 ++---
 pinot-core/pom.xml| 5 ++---
 pinot-distribution/pom.xml| 7 +++
 pinot-integration-tests/pom.xml   | 5 ++---
 pinot-minion/pom.xml  | 5 ++---
 pinot-perf/pom.xml| 5 ++---
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 6 ++
 .../pinot-batch-ingestion-standalone/pom.xml  | 6 ++
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 7 +++
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml   | 7 +++
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 6 ++
 pinot-plugins/pinot-file-system/pom.xml   | 6 ++
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 5 ++---
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 6 ++
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 6 ++
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 6 ++
 pinot-plugins/pom.xml | 8 +++-
 pinot-server/pom.xml  | 5 ++---
 pinot-spi/pom.xml | 5 ++---
 pinot-tools/pom.xml   | 5 ++---
 pom.xml   | 7 +++
 42 files changed, 89 insertions(+), 149 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index d9e48e1..2fd885d 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -19,13 +19,12 @@
 under the License.

 -->
-<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>${revision}${sha1}</version>
+    <version>0.4.0</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>
diff --git a/pinot-clients/pinot-java-client/pom.xml b/pinot-clients/pinot-java-client/pom.xml
index 6a98e3d..615c5e9 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -19,13 +19,12 @@
 under the License.

 -->
-<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4

[GitHub] [incubator-pinot] vincentchenjl commented on a change in pull request #5435: [TE] clean up decprecated/unused code

2020-05-29 Thread GitBox


vincentchenjl commented on a change in pull request #5435:
URL: https://github.com/apache/incubator-pinot/pull/5435#discussion_r432790440



##
File path: thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/dashboard/resources/AnomalyResource.java
##
@@ -101,21 +99,18 @@
   private MergedAnomalyResultManager anomalyMergedResultDAO;
   private AlertConfigManager emailConfigurationDAO;
   private MergedAnomalyResultManager mergedAnomalyResultDAO;
-  private AutotuneConfigManager autotuneConfigDAO;
   private DatasetConfigManager datasetConfigDAO;
   private AnomalyFunctionFactory anomalyFunctionFactory;
   private AlertFilterFactory alertFilterFactory;

Review comment:
   It is used in line 141. 
   `anomalyResults = AlertFilterHelper.applyFiltrationRule(anomalyResults, alertFilterFactory);`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


Jackie-Jiang merged pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated: Refactor DistinctTable to use PriorityQueue based algorithm (#5451)

2020-05-29 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new 44a1e2e  Refactor DistinctTable to use PriorityQueue based algorithm (#5451)
44a1e2e is described below

commit 44a1e2e237fc6ce7570241728514b55240acb52f
Author: Xiaotian (Jackie) Jiang <1751+jackie-ji...@users.noreply.github.com>
AuthorDate: Fri May 29 17:19:39 2020 -0700

Refactor DistinctTable to use PriorityQueue based algorithm (#5451)

Currently the DISTINCT query is solved the same way as GROUP-BY queries,
which is unnecessary (it consumes much more memory and CPU) and does
not guarantee accuracy of the result.

Instead, DISTINCT query can be solved by a set and a heap efficiently
(similar to SelectionOrderBy but unique records need to be tracked).

The new DistinctTable does not implement the Table interface because
the table interface is designed mostly for the GROUP-BY queries, and
is not efficient for DISTINCT. If in the future we want to use Table
interface to uniform all the input/output, we can redesign the Table
interface to make it suitable for all types of queries.
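The set-plus-heap approach described in the commit message can be sketched roughly as follows. This is a simplified illustration of the technique, not the actual `DistinctTable` code; the class, method, and use of plain `int` values are assumptions for brevity (the real implementation tracks full records and serializes across servers):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

// Sketch: DISTINCT with ORDER BY ASC and LIMIT, using a set for uniqueness
// and a bounded max-heap to keep only the best `limit` distinct values.
public class DistinctSketch {
  public static List<Integer> distinctTopN(int[] values, int limit) {
    Set<Integer> seen = new HashSet<>();
    // Max-heap so the worst (largest) retained value sits on top for eviction.
    PriorityQueue<Integer> heap = new PriorityQueue<>(Comparator.reverseOrder());
    for (int v : values) {
      if (!seen.add(v)) {
        continue; // duplicate record, skip
      }
      if (heap.size() < limit) {
        heap.offer(v);
      } else if (v < heap.peek()) {
        // New value beats the current worst retained value: replace it.
        seen.remove(heap.poll());
        heap.offer(v);
      } else {
        seen.remove(v); // not retained; allow reconsideration later
      }
    }
    List<Integer> result = new ArrayList<>(heap);
    result.sort(null);
    return result;
  }

  public static void main(String[] args) {
    System.out.println(distinctTopN(new int[]{5, 1, 5, 3, 2, 1, 4}, 3));
  }
}
```

Unlike the GROUP-BY path, this never aggregates anything per group; it only deduplicates and bounds the retained set, which is where the memory and CPU savings come from.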
---
 .../requesthandler/BaseBrokerRequestHandler.java   |   4 +
 .../apache/pinot/core/common/ObjectSerDeUtils.java |   2 +-
 .../apache/pinot/core/data/table/TableResizer.java | 191 ++--
 .../core/query/aggregation/DistinctTable.java  | 249 ---
 .../function/DistinctAggregationFunction.java  |  59 ++--
 .../function/customobject/DistinctTable.java   | 334 +
 .../query/reduce/DistinctDataTableReducer.java |  91 +++---
 .../core/query/reduce/ResultReducerFactory.java|   2 +-
 .../pinot/core/data/table/TableResizerTest.java| 158 --
 .../apache/pinot/queries/DistinctQueriesTest.java  |  57 ++--
 ...erSegmentAggregationSingleValueQueriesTest.java |  10 +-
 11 files changed, 466 insertions(+), 691 deletions(-)

diff --git a/pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java b/pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
index 6a00878..e7869a2 100644
--- a/pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
+++ b/pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
@@ -715,6 +715,10 @@ public abstract class BaseBrokerRequestHandler implements BrokerRequestHandler {
         // TODO: Explore if DISTINCT should be supported with GROUP BY
         throw new UnsupportedOperationException("DISTINCT with GROUP BY is currently not supported");
       }
+      if (brokerRequest.getLimit() == 0) {
+        // TODO: Consider changing it to SELECTION query for LIMIT 0
+        throw new UnsupportedOperationException("DISTINCT must have positive LIMIT");
+      }
       if (brokerRequest.isSetOrderBy()) {
         Set<String> expressionSet = new HashSet<>(AggregationFunctionUtils.getArguments(aggregationInfo));
         List<SelectionSort> orderByColumns = brokerRequest.getOrderBy();
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/common/ObjectSerDeUtils.java b/pinot-core/src/main/java/org/apache/pinot/core/common/ObjectSerDeUtils.java
index 88c5cf8..f471e37 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/common/ObjectSerDeUtils.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/common/ObjectSerDeUtils.java
@@ -36,8 +36,8 @@ import java.util.Map;
 import org.apache.datasketches.memory.Memory;
 import org.apache.datasketches.theta.Sketch;
 import org.apache.pinot.common.utils.StringUtil;
-import org.apache.pinot.core.query.aggregation.DistinctTable;
 import org.apache.pinot.core.query.aggregation.function.customobject.AvgPair;
+import org.apache.pinot.core.query.aggregation.function.customobject.DistinctTable;
 import org.apache.pinot.core.query.aggregation.function.customobject.MinMaxRangePair;
 import org.apache.pinot.core.query.aggregation.function.customobject.QuantileDigest;
 
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/data/table/TableResizer.java b/pinot-core/src/main/java/org/apache/pinot/core/data/table/TableResizer.java
index 1fa1deb..a474c3d 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/data/table/TableResizer.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/data/table/TableResizer.java
@@ -19,7 +19,6 @@
 package org.apache.pinot.core.data.table;
 
 import com.google.common.annotations.VisibleForTesting;
-import com.google.common.base.Preconditions;
 import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;
 import java.util.ArrayList;
 import java.util.Arrays;
@@ -29,7 +28,6 @@ import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
 import java.util.Priorit

[GitHub] [incubator-pinot] vincentchenjl commented on a change in pull request #5435: [TE] clean up decprecated/unused code

2020-05-29 Thread GitBox


vincentchenjl commented on a change in pull request #5435:
URL: https://github.com/apache/incubator-pinot/pull/5435#discussion_r432789651



##
File path: thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/dashboard/ThirdEyeDashboardApplication.java
##
@@ -160,14 +155,12 @@ public void run(ThirdEyeDashboardConfiguration config, 
Environment env)
 
 AnomalyFunctionFactory anomalyFunctionFactory = new 
AnomalyFunctionFactory(config.getFunctionConfigPath());

Review comment:
   When we search for anomalies by using `/anomalies/search/time/`, we 
construct `AnomalyDetails`, and it requires `BaseAnomalyFunction` to do it. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5440: Add GenericTransformFunction wrapper for simple ScalarFunctions

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5440:
URL: https://github.com/apache/incubator-pinot/pull/5440#discussion_r432782384



##
File path: pinot-common/pom.xml
##
@@ -33,6 +33,7 @@
   https://pinot.apache.org/
   
 ${basedir}/..
+0.9.11

Review comment:
   Move the version info into the root pom file
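As a rough illustration of this suggestion (the property name below is hypothetical, and which library the `0.9.11` version pins is not stated in the snippet), centralizing a version in the root `pom.xml` typically looks like:

```xml
<!-- Root pom.xml: declare the version once as a property.
     The property name here is made up for illustration. -->
<properties>
  <some.library.version>0.9.11</some.library.version>
</properties>
```

Child modules such as `pinot-common/pom.xml` then reference `${some.library.version}` in their `<dependency>` entries instead of hard-coding `0.9.11`, so a future upgrade touches only one file.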

##
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/TransformFunctionFactory.java
##
@@ -112,13 +115,24 @@ public static TransformFunction get(TransformExpressionTree expression, Map<String, DataSource> dataSourceMap) {
-    Class<? extends TransformFunction> transformFunctionClass = TRANSFORM_FUNCTION_MAP.get(functionName);
+Class transformFunctionClass;
+FunctionInfo functionInfo = null;
+if (FunctionRegistry.containsFunctionByName(functionName)) {
+  transformFunctionClass = ScalarTransformFunctionWrapper.class;
+  functionInfo = FunctionRegistry.getFunctionByName(functionName);
+} else {
+  transformFunctionClass = TRANSFORM_FUNCTION_MAP.get(functionName);
+}
+
 if (transformFunctionClass == null) {
   throw new BadQueryRequestException("Unsupported transform function: 
" + functionName);
 }
 try {
   transformFunction = transformFunctionClass.newInstance();
-} catch (InstantiationException | IllegalAccessException e) {
+  if (functionInfo != null) {
+    ((ScalarTransformFunctionWrapper) transformFunction).setFunction(functionName, functionInfo);

Review comment:
   Suggest using a constructor with `functionName` and `functionInfo` 
instead of using `newInstance()`
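A minimal sketch of what this suggestion amounts to, using stand-in classes (the real `FunctionInfo` and factory wiring in Pinot differ; everything below is illustrative):

```java
// Stand-in for Pinot's FunctionInfo, for illustration only.
class FunctionInfo {
  final String _description;

  FunctionInfo(String description) {
    _description = description;
  }
}

class ScalarTransformFunctionWrapper {
  private final String _name;
  private final FunctionInfo _functionInfo;

  // Taking the metadata in the constructor lets the fields be final, and the
  // object is never observable half-initialized, unlike the reflective
  // newInstance() followed by a setFunction(...) call.
  ScalarTransformFunctionWrapper(String name, FunctionInfo functionInfo) {
    _name = name;
    _functionInfo = functionInfo;
  }

  String getName() {
    return _name;
  }
}

public class FactorySketch {
  static ScalarTransformFunctionWrapper create(String functionName, FunctionInfo info) {
    // No reflection, no mutable post-construction setup step.
    return new ScalarTransformFunctionWrapper(functionName, info);
  }

  public static void main(String[] args) {
    System.out.println(create("reverse", new FunctionInfo("demo")).getName());
  }
}
```

The trade-off is that the factory can no longer instantiate every transform function through a uniform no-arg `newInstance()` path; the wrapper needs its own construction branch.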

##
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/ScalarTransformFunctionWrapper.java
##
@@ -0,0 +1,301 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator.transform.function;
+
+import com.google.common.base.Preconditions;
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Map;
+import org.apache.pinot.common.function.FunctionInfo;
+import org.apache.pinot.common.function.FunctionInvoker;
+import org.apache.pinot.core.common.DataSource;
+import org.apache.pinot.core.operator.blocks.ProjectionBlock;
+import org.apache.pinot.core.operator.transform.TransformResultMetadata;
+import org.apache.pinot.core.plan.DocIdSetPlanNode;
+import org.apache.pinot.spi.data.FieldSpec;
+
+
+public class ScalarTransformFunctionWrapper extends BaseTransformFunction {
+
+  FunctionInvoker _functionInvoker;
+  String _name;
+  Object[] _args;
+  List<Integer> _nonLiteralArgIndices;
+  List<FieldSpec.DataType> _nonLiteralArgType;
+  List<TransformFunction> _nonLiteralTransformFunction;
+  TransformResultMetadata _transformResultMetadata;
+  String[] _stringResult;
+  int[] _integerResult;
+  float[] _floatResult;
+  double[] _doubleResult;
+  long[] _longResult;
+
+  public ScalarTransformFunctionWrapper() {
+_nonLiteralArgIndices = new ArrayList<>();
+_nonLiteralArgType = new ArrayList<>();
+_nonLiteralTransformFunction = new ArrayList<>();
+  }
+
+  @Override
+  public String getName() {
+return _name;
+  }
+
+  public void setFunction(String functionName, FunctionInfo info)
+  throws Exception {
+_name = functionName;
+_functionInvoker = new FunctionInvoker(info);
+  }
+
+  @Override
+  public void init(List<TransformFunction> arguments, Map<String, DataSource> dataSourceMap) {
+    Preconditions.checkArgument(arguments.size() == _functionInvoker.getParameterTypes().length,
+        "The number of arguments are not same for scalar function and transform function: %s", getName());
+
+_args = new Object[arguments.size()];
+for (int i = 0; i < arguments.size(); i++) {
+  TransformFunction function = arguments.get(i);
+  if (function instanceof LiteralTransformFunction) {
+String literal = ((LiteralTransformFunction) function).getLiteral();
+Class paramType = _functionInvoker.getParameterTypes()[i];
+switch (paramType.getTypeName()) {

Review comment:
   For readability, can we always order them as INT, LONG, FLOAT, DOUBLE, 
STRING? Sam

[incubator-pinot] branch release-0.4.0-rc3 updated: [maven-release-plugin] prepare for next development iteration

2020-05-29 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/release-0.4.0-rc3 by this push:
 new 46863b1  [maven-release-plugin] prepare for next development iteration
46863b1 is described below

commit 46863b12ddbb45f6fdd82b592b55ea05ed8c8470
Author: Haibo Wang 
AuthorDate: Fri May 29 17:13:51 2020 -0700

[maven-release-plugin] prepare for next development iteration
---
 pinot-broker/pom.xml  | 2 +-
 pinot-clients/pinot-java-client/pom.xml   | 2 +-
 pinot-clients/pom.xml | 2 +-
 pinot-common/pom.xml  | 2 +-
 pinot-controller/pom.xml  | 2 +-
 pinot-core/pom.xml| 2 +-
 pinot-distribution/pom.xml| 2 +-
 pinot-integration-tests/pom.xml   | 2 +-
 pinot-minion/pom.xml  | 2 +-
 pinot-perf/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-standalone/pom.xml| 2 +-
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 2 +-
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 2 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 2 +-
 pinot-plugins/pinot-file-system/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 2 +-
 pinot-plugins/pom.xml | 2 +-
 pinot-server/pom.xml  | 2 +-
 pinot-spi/pom.xml | 2 +-
 pinot-tools/pom.xml   | 2 +-
 pom.xml   | 4 ++--
 42 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index 2fd885d..a884630 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>
diff --git a/pinot-clients/pinot-java-client/pom.xml b/pinot-clients/pinot-java-client/pom.xml
index 615c5e9..60177d7 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot-clients</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-java-client</artifactId>
diff --git a/pinot-clients/pom.xml b/pinot-clients/pom.xml
index e05e7ec..902f54f 100644
--- a/pinot-clients/pom.xml
+++ b/pinot-clients/pom.xml
@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-clients</artifactId>
diff --git a/pinot-common/pom.xml b/pinot-common/pom.xml
index db2d28d..7f2a7fc 100644
--- a/pinot-common/pom.xml
+++ b/pinot-common/pom.xml
@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   pi

[incubator-pinot] annotated tag release-0.4.0-rc3 updated (c3f36c2 -> bf16d5d)

2020-05-29 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a change to annotated tag release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


*** WARNING: tag release-0.4.0-rc3 was modified! ***

from c3f36c2  (commit)
  to bf16d5d  (tag)
 tagging c3f36c28b37e4ea216433be3d6eb8b5dc3603f3f (commit)
  by Haibo Wang
  on Fri May 29 17:13:47 2020 -0700

- Log -
[maven-release-plugin] copy for tag release-0.4.0-rc3
---


No new revisions were added by this update.

Summary of changes:


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] 02/02: [maven-release-plugin] prepare release release-0.4.0-rc3

2020-05-29 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit c3f36c28b37e4ea216433be3d6eb8b5dc3603f3f
Author: Haibo Wang 
AuthorDate: Fri May 29 17:13:32 2020 -0700

[maven-release-plugin] prepare release release-0.4.0-rc3
---
 pinot-broker/pom.xml  | 5 ++---
 pinot-clients/pinot-java-client/pom.xml   | 5 ++---
 pinot-clients/pom.xml | 6 ++
 pinot-common/pom.xml  | 5 ++---
 pinot-controller/pom.xml  | 5 ++---
 pinot-core/pom.xml| 5 ++---
 pinot-distribution/pom.xml| 7 +++
 pinot-integration-tests/pom.xml   | 5 ++---
 pinot-minion/pom.xml  | 5 ++---
 pinot-perf/pom.xml| 5 ++---
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 6 ++
 .../pinot-batch-ingestion-standalone/pom.xml  | 6 ++
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 7 +++
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml   | 7 +++
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 6 ++
 pinot-plugins/pinot-file-system/pom.xml   | 6 ++
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 5 ++---
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 6 ++
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 6 ++
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 6 ++
 pinot-plugins/pom.xml | 8 +++-
 pinot-server/pom.xml  | 5 ++---
 pinot-spi/pom.xml | 5 ++---
 pinot-tools/pom.xml   | 5 ++---
 pom.xml   | 7 +++
 42 files changed, 89 insertions(+), 149 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index d9e48e1..2fd885d 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -19,13 +19,12 @@
 under the License.

 -->
-<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>${revision}${sha1}</version>
+    <version>0.4.0</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>
diff --git a/pinot-clients/pinot-java-client/pom.xml b/pinot-clients/pinot-java-client/pom.xml
index 6a98e3d..615c5e9 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -19,13 +19,12 @@
 under the License.

 -->
-<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/

[incubator-pinot] branch release-0.4.0-rc3 created (now c3f36c2)

2020-05-29 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a change to branch release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at c3f36c2  [maven-release-plugin] prepare release release-0.4.0-rc3

This branch includes the following new commits:

 new 7f65bfe  Add license
 new c3f36c2  [maven-release-plugin] prepare release release-0.4.0-rc3

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] 01/02: Add license

2020-05-29 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 7f65bfe5e87dc3a3b2e743c991c534ab34a7aeac
Author: Haibo Wang 
AuthorDate: Fri May 29 01:10:14 2020 -0700

Add license
---
 licenses-binary/LICENSE-gpl-2.0.txt | 641 
 1 file changed, 641 insertions(+)

diff --git a/licenses-binary/LICENSE-gpl-2.0.txt 
b/licenses-binary/LICENSE-gpl-2.0.txt
new file mode 100644
index 000..b4a4b30
--- /dev/null
+++ b/licenses-binary/LICENSE-gpl-2.0.txt
@@ -0,0 +1,641 @@
+Apache Pinot (incubating)
+Copyright 2018 The Apache Software Foundation
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
+// --
+// NOTICE file corresponding to the section 4d of The Apache License,
+// Version 2.0, in this case for 
+// --
+
+The HermiteInterpolator class and its corresponding test have been imported 
from
+the orekit library distributed under the terms of the Apache 2 licence. 
Original
+source copyright:
+Copyright 2010-2012 CS Systèmes d'Information
+===
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
+Apache Commons Configuration
+Copyright 2001-2008 The Apache Software Foundation
+
+This product includes software developed by
+The Apache Software Foundation (http://www.apache.org/).
+
+Apache Commons Collections
+Copyright 2001-2008 The Apache Software Foundation
+
+Apache Jakarta Commons Digester
+Copyright 2001-2006 The Apache Software Foundation
+
+Apache Commons BeanUtils
+Copyright 2000-2010 The Apache Software Foundation
+
+Apache Commons BeanUtils
+Copyright 2000-2008 The Apache Software Foundation
+
+Apache Commons Codec
+Copyright 2002-2011 The Apache Software Foundation
+
+
+src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java contains 
+test data from http://aspell.sourceforge.net/test/batch0.tab.
+
+Copyright (C) 2002 Kevin Atkinson (kev...@gnu.org). Verbatim copying
+and distribution of this entire article is permitted in any medium,
+provided this notice is preserved.
+
+
+Apache Commons IO
+Copyright 2002-2012 The Apache Software Foundation
+
+Apache Commons Lang
+Copyright 2001-2011 The Apache Software Foundation
+
+Apache Commons Logging
+Copyright 2003-2014 The Apache Software Foundation
+
+Apache Commons Lang
+Copyright 2001-2016 The Apache Software Foundation
+
+This product includes software from the Spring Framework,
+under the Apache License 2.0 (see: StringUtils.containsWhitespace())
+
+Apache Log4j SLF4J Binding
+Copyright 1999-2019 The Apache Software Foundation
+
+Apache Log4j API
+Copyright 1999-2019 The Apache Software Foundation
+
+Apache Log4j 1.x Compatibility API
+Copyright 1999-2019 The Apache Software Foundation
+
+=
+= NOTICE file corresponding to section 4d of the Apache License Version 2.0 =
+=
+This product includes software developed by
+Joda.org (https://www.joda.org/).
+
+# Jackson JSON processor
+
+Jackson is a high-performance, Free/Open Source JSON processing library.
+It was originally written by Tatu Saloranta (tatu.salora...@iki.fi), and has
+been in development since 2007.
+It is currently developed by a community of developers, as well as supported
+commercially by FasterXML.com.
+
+## Licensing
+
+Jackson core and extension components may be licensed under different licenses.
+To find the details that apply to this artifact see the accompanying LICENSE 
file.
+For more information, including possible other licensing options, contact
+FasterXML.com (http://fasterxml.com).
+
+## Credits
+
+A list of contributors may be found from CREDITS file, which is included
+in some artifacts (usually source distributions); but is always available
+from the source code management (SCM) system project uses.
+
+Apache Groovy
+Copyright 2003-2017 The Apache Software Foundation
+
+This product includes/uses ANTLR (http://www.antlr2.org/)
+developed by Terence Parr 1989-2006
+
+This product bundles icons from the famfamfam.com silk icons set
+http://www.famfamfam.com/lab/icons/silk/
+Licensed under the Creative Commons Attribution Licence v2.5
+http://creativecommons.org/licenses/by/2.5/
+
+Apache HttpClient Mime
+Copyright 1999-2017 The Apache Software Foundation
+
+Apache HttpClient
+Copyright 1999-2017 The Apache Software Foundation
+
+Apache HttpCore
+Copyright 2005-2017 The Apac

[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


Jackie-Jiang commented on pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#issuecomment-636230181


   @siddharthteotia Addressed all the comments



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[incubator-pinot] branch master updated: Adding files generated by running quickstart to gitignore (#5441)

2020-05-29 Thread kishoreg
This is an automated email from the ASF dual-hosted git repository.

kishoreg pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new fb5b75f  Adding files generated by running quickstart to gitignore 
(#5441)
fb5b75f is described below

commit fb5b75f07f04bdb8d0f1a474af1640cad3b75c0a
Author: Kishore Gopalakrishna 
AuthorDate: Fri May 29 16:02:45 2020 -0700

Adding files generated by running quickstart to gitignore (#5441)
---
 .gitignore | 4 
 1 file changed, 4 insertions(+)

diff --git a/.gitignore b/.gitignore
index c74cf56..f0a1088 100644
--- a/.gitignore
+++ b/.gitignore
@@ -47,3 +47,7 @@ yarn-debug.log*
 yarn-error.log*
 
 .github/PULL_REQUEST_TEMPLATE.md
+
+#quickstart files
+examples/
+quickstart*





[GitHub] [incubator-pinot] kishoreg merged pull request #5441: Adding files generated by running quickstart to gitignore

2020-05-29 Thread GitBox


kishoreg merged pull request #5441:
URL: https://github.com/apache/incubator-pinot/pull/5441


   









[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432775396



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##
@@ -0,0 +1,297 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.query.aggregation.function.customobject;
+
+import it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap;
+import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.PriorityQueue;
+import java.util.Set;
+import javax.annotation.Nullable;
+import org.apache.pinot.common.request.SelectionSort;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.datatable.DataTableBuilder;
+import org.apache.pinot.core.common.datatable.DataTableFactory;
+import org.apache.pinot.core.data.table.Record;
+import org.apache.pinot.spi.utils.ByteArray;
+
+
+/**
+ * The {@code DistinctTable} class serves as the intermediate result of {@code DistinctAggregationFunction}.
+ */
+@SuppressWarnings({"rawtypes", "unchecked"})
+public class DistinctTable {
+  private static final int MAX_INITIAL_CAPACITY = 10_000;
+
+  private final DataSchema _dataSchema;
+  private final int _limit;
+  private final Set<Record> _uniqueRecords;
+  private final PriorityQueue<Record> _sortedRecords;
+  private final List<Record> _records;
+
+  /**
+   * Constructor of the main {@code DistinctTable} which can be used to add records and merge other
+   * {@code DistinctTable}s.
+   */
+  public DistinctTable(DataSchema dataSchema, @Nullable List<SelectionSort> orderBy, int limit) {
+    _dataSchema = dataSchema;
+    _limit = limit;
+
+    // TODO: see if 10k is the right max initial capacity to use
+    // NOTE: When LIMIT is smaller than or equal to the MAX_INITIAL_CAPACITY, no resize is required.
+    int initialCapacity = Math.min(limit, MAX_INITIAL_CAPACITY);
+    _uniqueRecords = new ObjectOpenHashSet<>(initialCapacity);
+    if (orderBy != null) {
+      String[] columns = dataSchema.getColumnNames();
+      int numColumns = columns.length;
+      Object2IntOpenHashMap<String> columnIndexMap = new Object2IntOpenHashMap<>(numColumns);
+      for (int i = 0; i < numColumns; i++) {
+        columnIndexMap.put(columns[i], i);
+      }
+      int numOrderByColumns = orderBy.size();
+      int[] orderByColumnIndexes = new int[numOrderByColumns];
+      boolean[] orderByAsc = new boolean[numOrderByColumns];
+      for (int i = 0; i < numOrderByColumns; i++) {
+        SelectionSort selectionSort = orderBy.get(i);
+        orderByColumnIndexes[i] = columnIndexMap.getInt(selectionSort.getColumn());
+        orderByAsc[i] = selectionSort.isIsAsc();
+      }
+      _sortedRecords = new PriorityQueue<>(initialCapacity, (record1, record2) -> {
+        Object[] values1 = record1.getValues();
+        Object[] values2 = record2.getValues();
+        for (int i = 0; i < numOrderByColumns; i++) {
+          Comparable valueToCompare1 = (Comparable) values1[orderByColumnIndexes[i]];
+          Comparable valueToCompare2 = (Comparable) values2[orderByColumnIndexes[i]];
+          int result =
+              orderByAsc[i] ? valueToCompare2.compareTo(valueToCompare1) : valueToCompare1.compareTo(valueToCompare2);
+          if (result != 0) {
+            return result;
+          }
+        }
+        return 0;
+      });
+    } else {
+      _sortedRecords = null;
+    }
+    _records = null;
+  }
+
+  /**
+   * Returns the {@code DataSchema} of the {@code DistinctTable}.
+   */
+  public DataSchema getDataSchema() {
+    return _dataSchema;
+  }
+
+  /**
+   * Returns the number of unique records within the {@code DistinctTable}.
+   */
+  public int size() {
+    if (_uniqueRecords != null) {
+      // Server-side
+      return _uniqueRecords.size();
+    } else {
+      // Broker-side
+      return _records.size();
+    }
+  }
+
+  /**
+   * Adds a record into the DistinctTable and returns whether more reco
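
For readers skimming the patch, the core technique above is: a hash set enforces uniqueness while a bounded heap ordered by the *reversed* ORDER BY comparator keeps only the best LIMIT records, so the heap head is always the worst retained record and can be evicted cheaply. A standalone sketch of that idea over plain integers with an ascending order — `DistinctSketch` and `distinctTopN` are hypothetical names for illustration, not Pinot's `Record`/`SelectionSort` API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;

public class DistinctSketch {
  // Collects at most `limit` distinct values, keeping the smallest ones
  // (i.e. DISTINCT ... ORDER BY v ASC LIMIT n). The heap is a max-heap
  // (reversed comparator) so peek() returns the worst value kept so far.
  static List<Integer> distinctTopN(int[] values, int limit) {
    Set<Integer> unique = new HashSet<>();
    PriorityQueue<Integer> heap = new PriorityQueue<>(limit, Comparator.reverseOrder());
    for (int v : values) {
      if (unique.contains(v)) {
        continue; // duplicate, already collected
      }
      if (heap.size() < limit) {
        unique.add(v);
        heap.offer(v);
      } else if (v < heap.peek()) {
        // Better than the current worst: evict the heap head to make room.
        unique.remove(heap.poll());
        unique.add(v);
        heap.offer(v);
      }
    }
    List<Integer> result = new ArrayList<>(heap);
    result.sort(null); // natural ascending order for the final answer
    return result;
  }

  public static void main(String[] args) {
    int[] data = {5, 3, 5, 9, 1, 3, 7, 1};
    System.out.println(distinctTopN(data, 3)); // prints [1, 3, 5]
  }
}
```

Reversing the comparator for an ascending order is the same trick the quoted constructor plays: a single `peek()` comparison decides whether a new record displaces the current worst, instead of scanning all retained records.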

[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432774922



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##

[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432774854



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##

[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432774276



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##

[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432774378



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##

[incubator-pinot] branch hotfix_chunkwriter_realtime updated (48553ab -> f6a3476)

2020-05-29 Thread jlli
This is an automated email from the ASF dual-hosted git repository.

jlli pushed a change to branch hotfix_chunkwriter_realtime
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


 discard 48553ab  remove master branch restriction in travis.yml
 new f6a3476  remove master branch restriction in travis.yml

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (48553ab)
\
 N -- N -- N   refs/heads/hotfix_chunkwriter_realtime (f6a3476)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml | 1 -
 1 file changed, 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] 01/01: remove master branch restriction in travis.yml

2020-05-29 Thread jlli
This is an automated email from the ASF dual-hosted git repository.

jlli pushed a commit to branch hotfix_chunkwriter_realtime
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit f6a34765160212e31e9e14f3948511747c9ef541
Author: Siddharth Teotia 
AuthorDate: Thu May 28 23:40:03 2020 -0700

remove master branch restriction in travis.yml
---
 .travis.yml| 8 
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml | 1 -
 2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 84dfceb..a72e68e 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -25,14 +25,14 @@ addons:
 install:
   - ./.travis/.travis_install.sh
 
-branches:
-  only:
-- master
+#branches:
+ # only:
+ #   - master
 
 stages:
   - test
   - name: deploy
-if: branch = master
+  #  if: branch = master
 
 jobs:
   include:
diff --git a/pinot-plugins/pinot-file-system/pinot-adls/pom.xml b/pinot-plugins/pinot-file-system/pinot-adls/pom.xml
index 2f598e1..5cad052 100644
--- a/pinot-plugins/pinot-file-system/pinot-adls/pom.xml
+++ b/pinot-plugins/pinot-file-system/pinot-adls/pom.xml
@@ -29,7 +29,6 @@
 ..
   
   pinot-adls
-  org.apache.pinot.plugins
   Pinot Azure Data Lake Storage
   https://pinot.apache.org/
   





[GitHub] [incubator-pinot] fx19880617 commented on a change in pull request #5461: Adding Support for SQL CASE Statement

2020-05-29 Thread GitBox


fx19880617 commented on a change in pull request #5461:
URL: https://github.com/apache/incubator-pinot/pull/5461#discussion_r432769120



##
File path: pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java
##
@@ -610,6 +611,25 @@ private static Expression toExpression(SqlNode node) {
 }
 
asFuncExpr.getFunctionCall().addToOperands(RequestUtils.getIdentifierExpression(aliasName));
 return asFuncExpr;
+  case CASE:
+// A CASE WHEN statement is modeled as a function with variable-length parameters.
+// Assume N is the number of WHEN statements; the total number of parameters is (2 * N + 1):
+// - N: Convert each WHEN Statement into a function Expression;
+// - N: Convert each THEN Statement into an Expression;
+// - 1: Convert ELSE Statement into an Expression.
+SqlCase caseSqlNode = (SqlCase) node;
+SqlNodeList whenOperands = caseSqlNode.getWhenOperands();
+SqlNodeList thenOperands = caseSqlNode.getThenOperands();
+SqlNode elseOperand = caseSqlNode.getElseOperand();
+Expression caseFuncExpr = RequestUtils.getFunctionExpression(SqlKind.CASE.name());
+for (SqlNode whenSqlNode : whenOperands.getList()) {

Review comment:
   Added.
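The (2 * N + 1) operand layout described in the quoted comment can be illustrated with a small, self-contained sketch. The `flattenCase` helper below is hypothetical (it is not Pinot's parser code); it only demonstrates the operand ordering: all WHEN expressions first, then all THEN expressions, then the single ELSE expression.

```java
import java.util.ArrayList;
import java.util.List;

public class CaseFlattenSketch {
  // Flattens WHEN/THEN/ELSE operands into the (2 * N + 1) parameter list
  // described in the quoted comment: N WHEN operands, then N THEN operands,
  // then the single ELSE operand.
  static List<String> flattenCase(List<String> whens, List<String> thens, String elseExpr) {
    List<String> operands = new ArrayList<>(whens);  // N WHEN operands
    operands.addAll(thens);                          // N THEN operands
    operands.add(elseExpr);                          // 1 ELSE operand
    return operands;
  }

  public static void main(String[] args) {
    List<String> ops =
        flattenCase(List.of("a > 1", "a > 2"), List.of("'low'", "'high'"), "'other'");
    System.out.println(ops);  // [a > 1, a > 2, 'low', 'high', 'other']
  }
}
```

With this layout, a consumer can recover N as `(operands.size() - 1) / 2` and pair WHEN i with THEN i by index.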





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[incubator-pinot] branch support_case_when_statement updated (43bb868 -> 8e00513)

2020-05-29 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch support_case_when_statement
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 43bb868  Checks on then statements result type
 add 8e00513  Not allowing aggregation functions in case statements for now

No new revisions were added by this update.

Summary of changes:
 .../apache/pinot/sql/parsers/CalciteSqlParser.java | 24 +-
 .../pinot/sql/parsers/CalciteSqlCompilerTest.java  | 15 ++
 2 files changed, 34 insertions(+), 5 deletions(-)





[incubator-pinot] branch support_case_when_statement updated (385a8b8 -> 43bb868)

2020-05-29 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch support_case_when_statement
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 385a8b8  Adding transform function support for case-when-else
 add 43bb868  Checks on then statements result type

No new revisions were added by this update.

Summary of changes:
 .../transform/function/CaseTransformFunction.java  | 359 +++--
 .../tests/OfflineClusterIntegrationTest.java   |  20 ++
 2 files changed, 353 insertions(+), 26 deletions(-)





[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432742947



##
File path: pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##
@@ -0,0 +1,297 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.query.aggregation.function.customobject;
+
+import it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap;
+import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.PriorityQueue;
+import java.util.Set;
+import javax.annotation.Nullable;
+import org.apache.pinot.common.request.SelectionSort;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.datatable.DataTableBuilder;
+import org.apache.pinot.core.common.datatable.DataTableFactory;
+import org.apache.pinot.core.data.table.Record;
+import org.apache.pinot.spi.utils.ByteArray;
+
+
+/**
+ * The {@code DistinctTable} class serves as the intermediate result of {@code DistinctAggregationFunction}.
+ */
+@SuppressWarnings({"rawtypes", "unchecked"})
+public class DistinctTable {
+  private static final int MAX_INITIAL_CAPACITY = 1;
+
+  private final DataSchema _dataSchema;
+  private final int _limit;
+  private final Set<Record> _uniqueRecords;
+  private final PriorityQueue<Record> _sortedRecords;
+  private final List<Record> _records;

Review comment:
   Added more javadoc and comments
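The comparator-driven PriorityQueue under review can be illustrated with a minimal bounded top-N sketch. For ascending order the comparison is inverted, so the heap head is the current *worst* retained value and can be replaced cheaply when a better record arrives. `smallestN` below is a hypothetical standalone helper, not the actual `DistinctTable` implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

public class BoundedTopNSketch {
  // Keep the `limit` smallest values using a max-heap: the head is the
  // largest (worst) retained value, mirroring the inverted comparison used
  // for ascending order in the quoted comparator.
  static List<Integer> smallestN(int[] values, int limit) {
    PriorityQueue<Integer> heap = new PriorityQueue<>(limit, (a, b) -> b.compareTo(a));
    for (int v : values) {
      if (heap.size() < limit) {
        heap.offer(v);
      } else if (v < heap.peek()) {  // strictly better than the current worst
        heap.poll();
        heap.offer(v);
      }
    }
    List<Integer> result = new ArrayList<>(heap);
    result.sort(null);  // natural (ascending) order for presentation
    return result;
  }

  public static void main(String[] args) {
    System.out.println(smallestN(new int[]{5, 1, 9, 2, 7, 3}, 3));  // [1, 2, 3]
  }
}
```

The same shape generalizes to multi-column ORDER BY by comparing column values in sequence, as the quoted comparator does.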











[GitHub] [incubator-pinot] siddharthteotia edited a comment on pull request #5440: Add GenericTransformFunction wrapper for simple ScalarFunctions

2020-05-29 Thread GitBox


siddharthteotia edited a comment on pull request #5440:
URL: https://github.com/apache/incubator-pinot/pull/5440#issuecomment-636196791


   > @fx19880617 @siddharthteotia Should I add tests in CalciteSQL for all the 
functions?
   
   @KKcorps , sorry missed this. Yes, the query compilation tests should be in 
CalciteSqlCompilerTest. Here we can verify that PinotQuery is being built 
correctly and that gets converted to BrokerRequest correctly. Most other tests 
in this file do this validation.
   
   The other suggestion was to also add unit tests for exercising end-to-end 
execution path. Please consider adding these tests to an appropriate file in 
`/incubator-pinot/pinot-core/src/test/java/org/apache/pinot/queries/`. Maybe TransformQueriesTest.









[GitHub] [incubator-pinot] siddharthteotia edited a comment on pull request #5440: Add GenericTransformFunction wrapper for simple ScalarFunctions

2020-05-29 Thread GitBox


siddharthteotia edited a comment on pull request #5440:
URL: https://github.com/apache/incubator-pinot/pull/5440#issuecomment-636196791


   > @fx19880617 @siddharthteotia Should I add tests in CalciteSQL for all the 
functions?
   
   @KKcorps , sorry missed this. Yes, the query compilation tests should be in 
CalciteSqlCompilerTest. Here we can verify that PinotQuery is being built 
correctly and that gets converted to BrokerRequest correctly. Most other tests 
in this file do this.
   
   The other suggestion was to also add unit tests for exercising end-to-end 
execution path. Please consider adding these tests to an appropriate file in 
`/incubator-pinot/pinot-core/src/test/java/org/apache/pinot/queries/`. Maybe TransformQueriesTest.









[GitHub] [incubator-pinot] siddharthteotia commented on pull request #5440: Add GenericTransformFunction wrapper for simple ScalarFunctions

2020-05-29 Thread GitBox


siddharthteotia commented on pull request #5440:
URL: https://github.com/apache/incubator-pinot/pull/5440#issuecomment-636196791


   > @fx19880617 @siddharthteotia Should I add tests in CalciteSQL for all the 
functions?
   
   @KKcorps , sorry missed seeing this. Yes, the query compilation tests should 
be in CalciteSqlCompilerTest. Here we can verify that PinotQuery is being built 
correctly and that gets converted to BrokerRequest correctly. Most other tests 
in this file do this.
   
   The other suggestion was to also add unit tests for exercising end-to-end 
execution path. Please consider adding these tests to an appropriate file in 
`/incubator-pinot/pinot-core/src/test/java/org/apache/pinot/queries/`. Maybe TransformQueriesTest.









[GitHub] [incubator-pinot] KKcorps edited a comment on pull request #5440: Add GenericTransformFunction wrapper for simple ScalarFunctions

2020-05-29 Thread GitBox


KKcorps edited a comment on pull request #5440:
URL: https://github.com/apache/incubator-pinot/pull/5440#issuecomment-634088300


   @fx19880617 @siddharthteotia  Should I add tests in CalciteSQL for all the 
functions?
   
   









[GitHub] [incubator-pinot] KKcorps edited a comment on pull request #5440: Add GenericTransformFunction wrapper for simple ScalarFunctions

2020-05-29 Thread GitBox


KKcorps edited a comment on pull request #5440:
URL: https://github.com/apache/incubator-pinot/pull/5440#issuecomment-634088300


   @fx19880617 @siddharthteotia  Should I add tests in CalciteSQL for all the 
functions?
   
   @KKcorps , sorry missed seeing this. Yes, the query compilation tests should 
be in CalciteSqlCompilerTest. Here we can verify that PinotQuery is being built 
correctly and that gets converted to BrokerRequest correctly. Most other tests 
in this file do this.
   
   The other suggestion was to also add unit tests for exercising end-to-end 
execution path. Please consider adding these tests to an appropriate file in 
`/incubator-pinot/pinot-core/src/test/java/org/apache/pinot/queries/`. Maybe TransformQueriesTest.









[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432323335



##
File path: pinot-core/src/main/java/org/apache/pinot/core/operator/dociditerators/SVScanDocIdIterator.java
##
@@ -18,238 +18,138 @@
  */
 package org.apache.pinot.core.operator.dociditerators;
 
-import org.apache.pinot.core.common.BlockMetadata;
 import org.apache.pinot.core.common.BlockSingleValIterator;
-import org.apache.pinot.core.common.BlockValSet;
 import org.apache.pinot.core.common.Constants;
 import org.apache.pinot.core.operator.filter.predicate.PredicateEvaluator;
-import org.apache.pinot.spi.data.FieldSpec;
 import org.roaringbitmap.IntIterator;
 import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
 import org.roaringbitmap.buffer.MutableRoaringBitmap;
 
 
-public class SVScanDocIdIterator implements ScanBasedDocIdIterator {
-  private int _currentDocId = -1;
+public final class SVScanDocIdIterator implements ScanBasedDocIdIterator {
+  private final PredicateEvaluator _predicateEvaluator;
   private final BlockSingleValIterator _valueIterator;
-  private int _startDocId;
-  private int _endDocId;
-  private PredicateEvaluator _evaluator;
-  private String _operatorName;
-  private int _numEntriesScanned = 0;
+  private final int _numDocs;
   private final ValueMatcher _valueMatcher;
 
-  public SVScanDocIdIterator(String operatorName, BlockValSet blockValSet, BlockMetadata blockMetadata,
-  PredicateEvaluator evaluator) {
-_operatorName = operatorName;
-_evaluator = evaluator;
-_valueIterator = (BlockSingleValIterator) blockValSet.iterator();
-
-if (evaluator.isAlwaysFalse()) {
-  _currentDocId = Constants.EOF;
-  setStartDocId(Constants.EOF);
-  setEndDocId(Constants.EOF);
-} else {
-  setStartDocId(blockMetadata.getStartDocId());
-  setEndDocId(blockMetadata.getEndDocId());
-}
+  private int _nextDocId = 0;
+  private long _numEntriesScanned = 0L;
 
-if (evaluator.isDictionaryBased()) {
-  _valueMatcher = new IntMatcher(); // Match using dictionary id's that are integers.
-} else {
-  _valueMatcher = getValueMatcherForType(blockMetadata.getDataType());
-}
-_valueMatcher.setEvaluator(evaluator);
-  }
-
-  /**
-   * After setting the startDocId, next calls will always return from >=startDocId
-   *
-   * @param startDocId Start doc id
-   */
-  public void setStartDocId(int startDocId) {
-_currentDocId = startDocId - 1;
-_valueIterator.skipTo(startDocId);
-_startDocId = startDocId;
-  }
-
-  /**
-   * After setting the endDocId, next call will return Constants.EOF after currentDocId exceeds
-   * endDocId
-   *
-   * @param endDocId End doc id
-   */
-  public void setEndDocId(int endDocId) {
-_endDocId = endDocId;
-  }
-
-  @Override
-  public boolean isMatch(int docId) {
-if (_currentDocId == Constants.EOF) {
-  return false;
-}
-_valueIterator.skipTo(docId);
-_numEntriesScanned++;
-return _valueMatcher.doesCurrentEntryMatch(_valueIterator);
-  }
-
-  @Override
-  public int advance(int targetDocId) {
-if (_currentDocId == Constants.EOF) {
-  return _currentDocId;
-}
-if (targetDocId < _startDocId) {
-  targetDocId = _startDocId;
-} else if (targetDocId > _endDocId) {
-  _currentDocId = Constants.EOF;
-}
-if (_currentDocId >= targetDocId) {
-  return _currentDocId;
-} else {
-  _currentDocId = targetDocId - 1;
-  _valueIterator.skipTo(targetDocId);
-  return next();
-}
+  public SVScanDocIdIterator(PredicateEvaluator predicateEvaluator, BlockSingleValIterator valueIterator, int numDocs) {
+_predicateEvaluator = predicateEvaluator;
+_valueIterator = valueIterator;
+_numDocs = numDocs;
+_valueMatcher = getValueMatcher();
   }
 
   @Override
   public int next() {
-if (_currentDocId == Constants.EOF) {
-  return Constants.EOF;
-}
-while (_valueIterator.hasNext() && _currentDocId < _endDocId) {
-  _currentDocId = _currentDocId + 1;
+while (_nextDocId < _numDocs) {
+  int nextDocId = _nextDocId++;
   _numEntriesScanned++;
-  if (_valueMatcher.doesCurrentEntryMatch(_valueIterator)) {
-return _currentDocId;
+  if (_valueMatcher.doesNextValueMatch()) {
+return nextDocId;
   }
 }
-_currentDocId = Constants.EOF;
 return Constants.EOF;
   }
 
   @Override
-  public int currentDocId() {
-return _currentDocId;
-  }
-
-  @Override
-  public String toString() {
-return SVScanDocIdIterator.class.getSimpleName() + "[" + _operatorName + "]";
+  public int advance(int targetDocId) {
+_nextDocId = targetDocId;
+_valueIterator.skipTo(targetDocId);
+return next();
   }
 
   @Override
   public MutableRoaringBitmap applyAnd(ImmutableRoaringBitmap docIds) {
 MutableRoaringBitmap result = new MutableRoaringBitmap();
-if (_evaluator.isAlwaysFalse()) {
-  ret

[incubator-pinot] branch master updated: Faster bit unpacking (Part 1) (#5409)

2020-05-29 Thread siddteotia
This is an automated email from the ASF dual-hosted git repository.

siddteotia pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new b40dd99  Faster bit unpacking (Part 1) (#5409)
b40dd99 is described below

commit b40dd992874f9bc38b911870e041a8f6e24c3776
Author: Sidd 
AuthorDate: Fri May 29 14:08:03 2020 -0700

Faster bit unpacking (Part 1) (#5409)

* Faster bit unpacking

* Add unit tests

* new

* Improved degree of vectorization and more tests

* fix build

* cleanup

* docs

* change file name

* address review comments and add more benchmarks

Co-authored-by: Siddharth Teotia 
---
 .../core/io/util/FixedBitIntReaderWriterV2.java| 149 +
 .../pinot/core/io/util/PinotDataBitSetV2.java  | 729 +
 .../pinot/core/io/util/PinotDataBitSetV2Test.java  | 438 +
 .../pinot/perf/BenchmarkPinotDataBitSet.java   | 546 +++
 4 files changed, 1862 insertions(+)

diff --git a/pinot-core/src/main/java/org/apache/pinot/core/io/util/FixedBitIntReaderWriterV2.java b/pinot-core/src/main/java/org/apache/pinot/core/io/util/FixedBitIntReaderWriterV2.java
new file mode 100644
index 000..55c8c94
--- /dev/null
+++ b/pinot-core/src/main/java/org/apache/pinot/core/io/util/FixedBitIntReaderWriterV2.java
@@ -0,0 +1,149 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.io.util;
+
+import com.google.common.base.Preconditions;
+import java.io.Closeable;
+import java.io.IOException;
+import org.apache.pinot.core.segment.memory.PinotDataBuffer;
+
+
+public final class FixedBitIntReaderWriterV2 implements Closeable {
+  private PinotDataBitSetV2 _dataBitSet;
+
+  public FixedBitIntReaderWriterV2(PinotDataBuffer dataBuffer, int numValues, int numBitsPerValue) {
+Preconditions
+.checkState(dataBuffer.size() == (int) (((long) numValues * numBitsPerValue + Byte.SIZE - 1) / Byte.SIZE));
+_dataBitSet = PinotDataBitSetV2.createBitSet(dataBuffer, numBitsPerValue);
+  }
+
+  /**
+   * Read dictionaryId for a particular docId
+   * @param index docId to get the dictionaryId for
+   * @return dictionaryId
+   */
+  public int readInt(int index) {
+return _dataBitSet.readInt(index);
+  }
+
+  /**
+   * Array based API to read dictionaryIds for a contiguous
+   * range of docIds starting at startDocId for a given length
+   * @param startDocId docId range start
+   * @param length length of contiguous docId range
+   * @param buffer out buffer to read dictionaryIds into
+   */
+  public void readInt(int startDocId, int length, int[] buffer) {
+_dataBitSet.readInt(startDocId, length, buffer);
+  }
+
+  /**
+   * Array based API to read dictionaryIds for an array of docIds
+   * which are monotonically increasing but not necessarily contiguous.
+   * The difference between this and previous array based API {@link 
#readInt(int, int, int[])}
+   * is that unlike the other API, we are provided an array of docIds.
+   * So even though the docIds in docIds[] array are monotonically increasing,
+   * they may not necessarily be contiguous. They can have gaps.
+   *
+   * {@link PinotDataBitSetV2} implements efficient bulk contiguous API
+   * {@link PinotDataBitSetV2#readInt(long, int, int[])}
+   * to read dictionaryIds for a contiguous range of docIds represented
+   * by startDocId and length.
+   *
+   * This API although works on docIds with gaps, it still tries to
+   * leverage the underlying bulk contiguous API as much as possible to
+   * get benefits of vectorization.
+   *
+   * For a given docIds[] array, we determine if we should use the
+   * bulk contiguous API or not by checking if the length of the array
+   * is >= 50% of actual docIdRange (lastDocId - firstDocId + 1). This
+   * sort of gives a very rough idea of the gaps in docIds. We will benefit
+   * from bulk contiguous read if the gaps are narrow implying fewer dictIds
+   * unpacked as part of contiguous read will have to be thrown away/ignored.
+   * If the gaps are wide, a higher numb

[GitHub] [incubator-pinot] siddharthteotia merged pull request #5409: Faster bit unpacking (Part 1)

2020-05-29 Thread GitBox


siddharthteotia merged pull request #5409:
URL: https://github.com/apache/incubator-pinot/pull/5409


   









[GitHub] [incubator-pinot] siddharthteotia commented on pull request #5409: Faster bit unpacking (Part 1)

2020-05-29 Thread GitBox


siddharthteotia commented on pull request #5409:
URL: https://github.com/apache/incubator-pinot/pull/5409#issuecomment-636192732


   Merging it. 
   
   Follow-ups coming next:
   
   - Wire with the reader and writer. Currently the fast methods here can be 
used for existing reader and writer if the index is power of 2 bit encoded. 
   - Consider changing the format to Little Endian
   - Consider aligning the bytes on the writer side. This will remove a few 
branches for 2/4 bit encoding to handle unaligned reads. It will also make it 
easier to add fast methods for non power of 2 bit encodings. 









[GitHub] [incubator-pinot] xiaohui-sun merged pull request #5447: [TE] frontend - harleyjj/rca - update frontend for new AC event format

2020-05-29 Thread GitBox


xiaohui-sun merged pull request #5447:
URL: https://github.com/apache/incubator-pinot/pull/5447


   









[incubator-pinot] branch master updated: [TE] frontend - harleyjj/rca - update frontend for new AC event format (#5447)

2020-05-29 Thread xhsun
This is an automated email from the ASF dual-hosted git repository.

xhsun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new 79260b3  [TE] frontend - harleyjj/rca - update frontend for new AC event format (#5447)
79260b3 is described below

commit 79260b3078e8553fd6d33ff7284afe44d8b495c5
Author: Harley Jackson 
AuthorDate: Fri May 29 13:36:54 2020 -0700

[TE] frontend - harleyjj/rca - update frontend for new AC event format (#5447)
---
 .../thirdeye-frontend/app/shared/filterBarConfig.js | 21 -
 1 file changed, 8 insertions(+), 13 deletions(-)

diff --git a/thirdeye/thirdeye-frontend/app/shared/filterBarConfig.js b/thirdeye/thirdeye-frontend/app/shared/filterBarConfig.js
index b8cc5cd..ea29fe5 100644
--- a/thirdeye/thirdeye-frontend/app/shared/filterBarConfig.js
+++ b/thirdeye/thirdeye-frontend/app/shared/filterBarConfig.js
@@ -100,28 +100,23 @@ export default [
 color: "blue",
 inputs: [
   {
-label: "services",
-labelMapping: "services",
-type: "dropdown"
-  },
-  {
-label: "endpoints",
-labelMapping: "endpoint",
+label: "fabrics",
+labelMapping: "fabric",
 type: "dropdown"
   },
   {
-label: "upstreams",
-labelMapping: "upstream",
+label: "services",
+labelMapping: "services",
 type: "dropdown"
   },
   {
-label: "upstream endpoints",
-labelMapping: "upstreamendpoint",
+label: "downstream services",
+labelMapping: "downstreamservice",
 type: "dropdown"
   },
   {
-label: "fabrics",
-labelMapping: "fabric",
+label: "downstream endpoints",
+labelMapping: "downstreamendpoint",
 type: "dropdown"
   }
 ]





[incubator-pinot] branch master updated (6bfcacb -> 71ce427)

2020-05-29 Thread xhsun
This is an automated email from the ASF dual-hosted git repository.

xhsun pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 6bfcacb  [Cleanup] Merge RealtimeSegmentOnlineOfflineStateModel and SegmentOnlineOfflineStateModel in CommonConstants (#5459)
 add 71ce427  [TE] frontend - harleyjj/packages - remove bower from frontend (#5460)

No new revisions were added by this update.

Summary of changes:
 thirdeye/thirdeye-frontend/.bowerrc|4 -
 thirdeye/thirdeye-frontend/.gitignore  |2 -
 .../pods/components/timeseries-chart/component.js  |8 +-
 thirdeye/thirdeye-frontend/bower.json  |8 -
 thirdeye/thirdeye-frontend/ember-cli-build.js  |4 +-
 thirdeye/thirdeye-frontend/jsconfig.json   |2 +-
 thirdeye/thirdeye-frontend/package.json|7 +-
 thirdeye/thirdeye-frontend/pom.xml |9 -
 thirdeye/thirdeye-frontend/yarn.lock   | 5119 +++-
 9 files changed, 2715 insertions(+), 2448 deletions(-)
 delete mode 100644 thirdeye/thirdeye-frontend/.bowerrc
 delete mode 100644 thirdeye/thirdeye-frontend/bower.json





[incubator-pinot] branch master updated: [TE] frontend - harleyjj/components - remove dead components (#5466)

2020-05-29 Thread xhsun
This is an automated email from the ASF dual-hosted git repository.

xhsun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new 2807584  [TE] frontend - harleyjj/components - remove dead components (#5466)
2807584 is described below

commit 2807584edfefc710d7c084747a0a509526205587
Author: Harley Jackson 
AuthorDate: Fri May 29 13:36:04 2020 -0700

[TE] frontend - harleyjj/components - remove dead components (#5466)
---
 .../app/pods/components/alert-details/template.hbs |   1 +
 .../components/alert-report-modal/template.hbs |   6 +
 .../app/pods/components/anomaly-graph/component.js | 764 -
 .../app/pods/components/anomaly-graph/template.hbs | 120 
 .../app/pods/components/anomaly-id/component.js|  37 -
 .../app/pods/components/anomaly-id/template.hbs|  15 -
 .../components/anomaly-stats-block/component.js|  21 -
 .../components/anomaly-stats-block/template.hbs|  53 --
 .../pods/components/dimension-heatmap/component.js | 199 --
 .../pods/components/dimension-heatmap/template.hbs |  16 -
 .../pods/components/dimension-summary/component.js |   5 -
 .../pods/components/dimension-summary/template.hbs |  16 -
 .../app/pods/components/events-header/component.js |  86 ---
 .../app/pods/components/events-header/template.hbs |  26 -
 .../app/pods/components/events-table/component.js  | 134 
 .../app/pods/components/events-table/template.hbs  |  16 -
 .../modals/manage-groups-modal/component.js| 716 ---
 .../modals/manage-groups-modal/template.hbs| 239 ---
 .../components/performance-tooltip/component.js|   5 -
 .../components/performance-tooltip/template.hbs|  17 -
 .../self-serve-alert-yaml-details/component.js |   2 +-
 .../pods/components/self-serve-graph/component.js  | 114 ---
 .../pods/components/self-serve-graph/template.hbs  |  38 -
 .../pods/components/thirdeye-chart/component.js|   8 -
 .../pods/components/thirdeye-chart/template.hbs|   1 -
 .../pods/custom/anomalies-table/rule/template.hbs  |   2 +-
 .../app/pods/example/controller.js |  10 -
 .../thirdeye-frontend/app/pods/example/route.js|  32 -
 .../app/pods/example/template.hbs  |  23 -
 .../pods/manage/alerts/performance/controller.js   | 101 ---
 .../app/pods/manage/alerts/performance/route.js| 348 --
 .../pods/manage/alerts/performance/template.hbs| 140 
 .../app/pods/self-serve/create-alert/template.hbs  |  15 -
 thirdeye/thirdeye-frontend/app/router.js   |   2 +-
 thirdeye/thirdeye-frontend/app/styles/app.scss |   4 -
 .../app/styles/components/anomaly-graph.scss   | 151 
 .../app/styles/components/anomaly-id.scss  |  32 -
 .../app/styles/components/dimension-heatmap.scss   |  69 --
 .../app/styles/components/dimension-summary.scss   |  30 -
 .../app/styles/pods/manage/alerts-performance.scss |  20 -
 .../components/anomaly-graph/component-test.js |  23 -
 .../pods/components/anomaly-id/component-test.js   |  55 --
 .../components/thirdeye-chart/component-test.js|  26 -
 43 files changed, 10 insertions(+), 3728 deletions(-)

diff --git 
a/thirdeye/thirdeye-frontend/app/pods/components/alert-details/template.hbs 
b/thirdeye/thirdeye-frontend/app/pods/components/alert-details/template.hbs
index b622e43..9760ad6 100644
--- a/thirdeye/thirdeye-frontend/app/pods/components/alert-details/template.hbs
+++ b/thirdeye/thirdeye-frontend/app/pods/components/alert-details/template.hbs
@@ -247,6 +247,7 @@
   isMetricDataLoading=isMetricDataLoading
   isMetricDataInvalid=isMetricDataInvalid
   inputAction=(action "onInputMissingAnomaly")
+  isReportFailure=isReportFailure
 }}
   {{else}}
 {{ember-spinner}}
diff --git 
a/thirdeye/thirdeye-frontend/app/pods/components/alert-report-modal/template.hbs
 
b/thirdeye/thirdeye-frontend/app/pods/components/alert-report-modal/template.hbs
index 8bb3444..08e85fb 100644
--- 
a/thirdeye/thirdeye-frontend/app/pods/components/alert-report-modal/template.hbs
+++ 
b/thirdeye/thirdeye-frontend/app/pods/components/alert-report-modal/template.hbs
@@ -47,6 +47,12 @@
 {{/if}}
   
 
+  {{#if isReportFailure}}
+    {{#bs-alert type="danger" class="te-form__banner te-form__banner--failure"}}
+      Error: Failed to save reported anomaly. Did you select dates and times?
+    {{/bs-alert}}
+  {{/if}}
+
   
 
   Mark the Anomaly Region
diff --git 
a/thirdeye/thirdeye-frontend/app/pods/components/anomaly-graph/component.js 
b/thirdeye/thirdeye-frontend/app/pods/components/anomaly-graph/component.js
deleted file mode 100644
index 3ac6d93..000
--- a/thirdeye/thirdeye-frontend/app/pods/components/anomaly-graph/component.js
+++ /dev/null
@@ -1,764 +0,0 @@
-import $ from 'jquery';
-impor

[GitHub] [incubator-pinot] xiaohui-sun merged pull request #5466: [TE] frontend - harleyjj/components - remove dead components

2020-05-29 Thread GitBox


xiaohui-sun merged pull request #5466:
URL: https://github.com/apache/incubator-pinot/pull/5466


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [incubator-pinot] xiaohui-sun merged pull request #5460: [TE] frontend - harleyjj/packages - remove bower from frontend

2020-05-29 Thread GitBox


xiaohui-sun merged pull request #5460:
URL: https://github.com/apache/incubator-pinot/pull/5460


   









[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432711634



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctAggregationFunction.java
##
@@ -123,21 +120,20 @@ public void aggregate(int length, AggregationResultHolder 
aggregationResultHolde
 columnDataTypes[i] = 
ColumnDataType.fromDataTypeSV(blockValSetMap.get(_inputExpressions.get(i)).getValueType());
   }
   DataSchema dataSchema = new DataSchema(_columns, columnDataTypes);
-  distinctTable = new DistinctTable(dataSchema, _orderBy, _capacity);
+  distinctTable = new DistinctTable(dataSchema, _orderBy, _limit);
   aggregationResultHolder.setValue(distinctTable);
+} else if (distinctTable.shouldNotAddMore()) {
+  return;
 }
 
-// TODO: Follow up PR will make few changes to start using 
DictionaryBasedAggregationOperator
-// for DISTINCT queries without filter.
+// TODO: Follow up PR will make few changes to start using 
DictionaryBasedAggregationOperator for DISTINCT queries
+//   without filter.
 RowBasedBlockValueFetcher blockValueFetcher = new 
RowBasedBlockValueFetcher(blockValSets);
 
-// TODO: Do early termination in the operator itself which should
-// not call aggregate function at all if the limit has reached
-// that will require the interface change since this function
-// has to communicate back that required number of records have
-// been collected
 for (int i = 0; i < length; i++) {
-  distinctTable.upsert(new Record(blockValueFetcher.getRow(i)));
+  if (!distinctTable.add(new Record(blockValueFetcher.getRow(i {

Review comment:
   Good point, added











[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


Jackie-Jiang commented on pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#issuecomment-636166279













[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432704090



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/docidsets/AndDocIdSet.java
##
@@ -0,0 +1,156 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator.docidsets;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.pinot.common.utils.Pairs.IntPair;
+import org.apache.pinot.core.common.BlockDocIdIterator;
+import org.apache.pinot.core.operator.dociditerators.AndDocIdIterator;
+import org.apache.pinot.core.operator.dociditerators.BitmapBasedDocIdIterator;
+import 
org.apache.pinot.core.operator.dociditerators.RangelessBitmapDocIdIterator;
+import org.apache.pinot.core.operator.dociditerators.ScanBasedDocIdIterator;
+import org.apache.pinot.core.operator.dociditerators.SortedDocIdIterator;
+import org.apache.pinot.core.util.SortedRangeIntersection;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+
+/**
+ * The FilterBlockDocIdSet that performs AND on all child FilterBlockDocIdSets.
+ * The AndDocIdSet will construct the BlockDocIdIterator based on the BlockDocIdIterators from the child
+ * FilterBlockDocIdSets:
+ * 
+ *   
+ * When there is at least one index-based BlockDocIdIterator (SortedDocIdIterator or BitmapBasedDocIdIterator) and
+ * at least one ScanBasedDocIdIterator, or more than one index-based BlockDocIdIterator, merge them and construct a
+ * RangelessBitmapDocIdIterator from the merged document ids. If there are no remaining BlockDocIdIterators, directly
+ * return the merged RangelessBitmapDocIdIterator; otherwise, construct and return an AndDocIdIterator with the
+ * merged RangelessBitmapDocIdIterator and the remaining BlockDocIdIterators.
+ *   
+ *   
+ * Otherwise, construct and return an AndDocIdIterator with all 
BlockDocIdIterators.
+ *   
+ * 
+ */
+public final class AndDocIdSet implements FilterBlockDocIdSet {
+  private final List _docIdSets;
+
+  public AndDocIdSet(List docIdSets) {
+_docIdSets = docIdSets;
+  }
+
+  @Override
+  public BlockDocIdIterator iterator() {
+int numDocIdSets = _docIdSets.size();
+// NOTE: Keep the order of FilterBlockDocIdSets to preserve the order 
decided within FilterOperatorUtils.
+// TODO: Consider deciding the order based on the stats of 
BlockDocIdIterators
+BlockDocIdIterator[] allDocIdIterators = new 
BlockDocIdIterator[numDocIdSets];
+List sortedDocIdIterators = new ArrayList<>();

Review comment:
   For the currently supported iterators, yes, `remainingDocIdIterators` can only be AND/OR.
   
   There could be multiple sorted iterators if there are multiple predicates on the same sorted column, or there are multiple sorted columns.
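The AND evaluation discussed above ultimately intersects several sorted docId streams through their `next()`/`advance()` contract. Below is a minimal, self-contained leapfrog-intersection sketch; the class and method names are illustrative stand-ins, not Pinot's actual `AndDocIdIterator`:

```java
import java.util.ArrayList;
import java.util.List;

public class AndIteratorSketch {
  static final int EOF = Integer.MIN_VALUE;

  // Minimal stand-in for BlockDocIdIterator: next() returns the next matched
  // docId or EOF; advance(target) behaves like skipTo(target) + next().
  interface DocIdIterator {
    int next();
    int advance(int targetDocId);
  }

  static DocIdIterator fromSortedArray(int[] docIds) {
    return new DocIdIterator() {
      private int _index = 0;

      @Override
      public int next() {
        return _index < docIds.length ? docIds[_index++] : EOF;
      }

      @Override
      public int advance(int targetDocId) {
        // skipTo(targetDocId), then next(): never returns the same docId twice.
        while (_index < docIds.length && docIds[_index] < targetDocId) {
          _index++;
        }
        return next();
      }
    };
  }

  // Leapfrog intersection: keep advancing iterators to the current leader
  // until all of them agree on the same docId, then emit it.
  static int[] intersect(DocIdIterator[] iterators) {
    int numIterators = iterators.length;
    List<Integer> matched = new ArrayList<>();
    int leader = iterators[0].next();
    int numAgreeing = 1;              // iterators currently positioned at `leader`
    int i = 1 % numIterators;         // next iterator to advance (round-robin)
    while (leader != EOF) {
      if (numAgreeing == numIterators) {  // all agree: emit and start a new round
        matched.add(leader);
        leader = iterators[0].next();
        numAgreeing = 1;
        i = 1 % numIterators;
        continue;
      }
      int docId = iterators[i].advance(leader);
      if (docId == EOF) {
        break;
      }
      if (docId == leader) {
        numAgreeing++;
      } else {
        leader = docId;   // overshoot: everyone must catch up to the new leader
        numAgreeing = 1;
      }
      i = (i + 1) % numIterators;
    }
    return matched.stream().mapToInt(Integer::intValue).toArray();
  }

  public static void main(String[] args) {
    DocIdIterator[] its = {
        fromSortedArray(new int[]{1, 3, 5, 7, 9}),
        fromSortedArray(new int[]{2, 3, 5, 8, 9}),
        fromSortedArray(new int[]{3, 5, 9}),
    };
    System.out.println(java.util.Arrays.toString(intersect(its)));  // [3, 5, 9]
  }
}
```

Note how the round-robin advance relies on the contract discussed in this thread: each `advance(target)` call passes a target greater than the docId that iterator previously returned.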











[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432700876



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/common/BlockDocIdIterator.java
##
@@ -25,25 +25,16 @@
 public interface BlockDocIdIterator {
 
   /**
-   * Get the next document id.
-   *
-   * @return Next document id or EOF if there is no more documents
+   * Returns the next matched document id, or {@link Constants#EOF} if there 
is no more matched document.
+   * NOTE: There should be no more call to this method after it returns 
{@link Constants#EOF}.
*/
   int next();
 
   /**
-   * Advance to the first document whose id is equal or greater than the given 
target document id.
-   * If the given target document id is smaller or equal to the current 
document id, then return the current one.
-   *
-   * @param targetDocId The target document id
-   * @return First document id that is equal or greater than target or EOF if 
no document matches
+   * Returns the first matched document whose id is equal to or greater than 
the given target document id, or
+   * {@link Constants#EOF} if there is no such document.
+   * NOTE: The target document id should be greater than the document id previously returned.

Review comment:
   No, `advance(targetDocId)` here is equivalent to `skipTo(targetDocId)` followed by `next()`, where the iterator has already iterated past `targetDocId`. From an iterator's perspective, it should not return the same value twice. With this assumption, we can save one if-check inside `advance(targetDocId)`.
   Updated the javadoc to make it clearer.











[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


kishoreg commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432699285



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/dociditerators/SVScanDocIdIterator.java
##
@@ -18,238 +18,138 @@
  */
 package org.apache.pinot.core.operator.dociditerators;
 
-import org.apache.pinot.core.common.BlockMetadata;
 import org.apache.pinot.core.common.BlockSingleValIterator;
-import org.apache.pinot.core.common.BlockValSet;
 import org.apache.pinot.core.common.Constants;
 import org.apache.pinot.core.operator.filter.predicate.PredicateEvaluator;
-import org.apache.pinot.spi.data.FieldSpec;
 import org.roaringbitmap.IntIterator;
 import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
 import org.roaringbitmap.buffer.MutableRoaringBitmap;
 
 
-public class SVScanDocIdIterator implements ScanBasedDocIdIterator {
-  private int _currentDocId = -1;
+public final class SVScanDocIdIterator implements ScanBasedDocIdIterator {
+  private final PredicateEvaluator _predicateEvaluator;
   private final BlockSingleValIterator _valueIterator;
-  private int _startDocId;
-  private int _endDocId;
-  private PredicateEvaluator _evaluator;
-  private String _operatorName;
-  private int _numEntriesScanned = 0;
+  private final int _numDocs;
   private final ValueMatcher _valueMatcher;
 
-  public SVScanDocIdIterator(String operatorName, BlockValSet blockValSet, 
BlockMetadata blockMetadata,
-  PredicateEvaluator evaluator) {
-_operatorName = operatorName;
-_evaluator = evaluator;
-_valueIterator = (BlockSingleValIterator) blockValSet.iterator();
-
-if (evaluator.isAlwaysFalse()) {
-  _currentDocId = Constants.EOF;
-  setStartDocId(Constants.EOF);
-  setEndDocId(Constants.EOF);
-} else {
-  setStartDocId(blockMetadata.getStartDocId());
-  setEndDocId(blockMetadata.getEndDocId());
-}
+  private int _nextDocId = 0;
+  private long _numEntriesScanned = 0L;
 
-if (evaluator.isDictionaryBased()) {
-  _valueMatcher = new IntMatcher(); // Match using dictionary id's that 
are integers.
-} else {
-  _valueMatcher = getValueMatcherForType(blockMetadata.getDataType());
-}
-_valueMatcher.setEvaluator(evaluator);
-  }
-
-  /**
-   * After setting the startDocId, next calls will always return from 
>=startDocId
-   *
-   * @param startDocId Start doc id
-   */
-  public void setStartDocId(int startDocId) {
-_currentDocId = startDocId - 1;
-_valueIterator.skipTo(startDocId);
-_startDocId = startDocId;
-  }
-
-  /**
-   * After setting the endDocId, next call will return Constants.EOF after 
currentDocId exceeds
-   * endDocId
-   *
-   * @param endDocId End doc id
-   */
-  public void setEndDocId(int endDocId) {
-_endDocId = endDocId;
-  }
-
-  @Override
-  public boolean isMatch(int docId) {
-if (_currentDocId == Constants.EOF) {
-  return false;
-}
-_valueIterator.skipTo(docId);
-_numEntriesScanned++;
-return _valueMatcher.doesCurrentEntryMatch(_valueIterator);
-  }
-
-  @Override
-  public int advance(int targetDocId) {
-if (_currentDocId == Constants.EOF) {
-  return _currentDocId;
-}
-if (targetDocId < _startDocId) {
-  targetDocId = _startDocId;
-} else if (targetDocId > _endDocId) {
-  _currentDocId = Constants.EOF;
-}
-if (_currentDocId >= targetDocId) {
-  return _currentDocId;
-} else {
-  _currentDocId = targetDocId - 1;
-  _valueIterator.skipTo(targetDocId);
-  return next();
-}
+  public SVScanDocIdIterator(PredicateEvaluator predicateEvaluator, 
BlockSingleValIterator valueIterator, int numDocs) {
+_predicateEvaluator = predicateEvaluator;
+_valueIterator = valueIterator;
+_numDocs = numDocs;
+_valueMatcher = getValueMatcher();
   }
 
   @Override
   public int next() {
-if (_currentDocId == Constants.EOF) {
-  return Constants.EOF;
-}
-while (_valueIterator.hasNext() && _currentDocId < _endDocId) {
-  _currentDocId = _currentDocId + 1;
+while (_nextDocId < _numDocs) {
+  int nextDocId = _nextDocId++;
   _numEntriesScanned++;
-  if (_valueMatcher.doesCurrentEntryMatch(_valueIterator)) {
-return _currentDocId;
+  if (_valueMatcher.doesNextValueMatch()) {
+return nextDocId;
   }
 }
-_currentDocId = Constants.EOF;
 return Constants.EOF;
   }
 
   @Override
-  public int currentDocId() {
-return _currentDocId;
-  }
-
-  @Override
-  public String toString() {
-return SVScanDocIdIterator.class.getSimpleName() + "[" + _operatorName + 
"]";
+  public int advance(int targetDocId) {
+_nextDocId = targetDocId;
+_valueIterator.skipTo(targetDocId);
+return next();
   }
 
   @Override
   public MutableRoaringBitmap applyAnd(ImmutableRoaringBitmap docIds) {
 MutableRoaringBitmap result = new MutableRoaringBitmap();
-if (_evaluator.isAlwaysFalse()) {
-  return res

[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432695654



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/dociditerators/SVScanDocIdIterator.java
##
@@ -18,238 +18,138 @@
  */
 package org.apache.pinot.core.operator.dociditerators;
 
-import org.apache.pinot.core.common.BlockMetadata;
 import org.apache.pinot.core.common.BlockSingleValIterator;
-import org.apache.pinot.core.common.BlockValSet;
 import org.apache.pinot.core.common.Constants;
 import org.apache.pinot.core.operator.filter.predicate.PredicateEvaluator;
-import org.apache.pinot.spi.data.FieldSpec;
 import org.roaringbitmap.IntIterator;
 import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
 import org.roaringbitmap.buffer.MutableRoaringBitmap;
 
 
-public class SVScanDocIdIterator implements ScanBasedDocIdIterator {
-  private int _currentDocId = -1;
+public final class SVScanDocIdIterator implements ScanBasedDocIdIterator {
+  private final PredicateEvaluator _predicateEvaluator;
   private final BlockSingleValIterator _valueIterator;
-  private int _startDocId;
-  private int _endDocId;
-  private PredicateEvaluator _evaluator;
-  private String _operatorName;
-  private int _numEntriesScanned = 0;
+  private final int _numDocs;
   private final ValueMatcher _valueMatcher;
 
-  public SVScanDocIdIterator(String operatorName, BlockValSet blockValSet, 
BlockMetadata blockMetadata,
-  PredicateEvaluator evaluator) {
-_operatorName = operatorName;
-_evaluator = evaluator;
-_valueIterator = (BlockSingleValIterator) blockValSet.iterator();
-
-if (evaluator.isAlwaysFalse()) {
-  _currentDocId = Constants.EOF;
-  setStartDocId(Constants.EOF);
-  setEndDocId(Constants.EOF);
-} else {
-  setStartDocId(blockMetadata.getStartDocId());
-  setEndDocId(blockMetadata.getEndDocId());
-}
+  private int _nextDocId = 0;
+  private long _numEntriesScanned = 0L;
 
-if (evaluator.isDictionaryBased()) {
-  _valueMatcher = new IntMatcher(); // Match using dictionary id's that 
are integers.
-} else {
-  _valueMatcher = getValueMatcherForType(blockMetadata.getDataType());
-}
-_valueMatcher.setEvaluator(evaluator);
-  }
-
-  /**
-   * After setting the startDocId, next calls will always return from 
>=startDocId
-   *
-   * @param startDocId Start doc id
-   */
-  public void setStartDocId(int startDocId) {
-_currentDocId = startDocId - 1;
-_valueIterator.skipTo(startDocId);
-_startDocId = startDocId;
-  }
-
-  /**
-   * After setting the endDocId, next call will return Constants.EOF after 
currentDocId exceeds
-   * endDocId
-   *
-   * @param endDocId End doc id
-   */
-  public void setEndDocId(int endDocId) {
-_endDocId = endDocId;
-  }
-
-  @Override
-  public boolean isMatch(int docId) {
-if (_currentDocId == Constants.EOF) {
-  return false;
-}
-_valueIterator.skipTo(docId);
-_numEntriesScanned++;
-return _valueMatcher.doesCurrentEntryMatch(_valueIterator);
-  }
-
-  @Override
-  public int advance(int targetDocId) {
-if (_currentDocId == Constants.EOF) {
-  return _currentDocId;
-}
-if (targetDocId < _startDocId) {
-  targetDocId = _startDocId;
-} else if (targetDocId > _endDocId) {
-  _currentDocId = Constants.EOF;
-}
-if (_currentDocId >= targetDocId) {
-  return _currentDocId;
-} else {
-  _currentDocId = targetDocId - 1;
-  _valueIterator.skipTo(targetDocId);
-  return next();
-}
+  public SVScanDocIdIterator(PredicateEvaluator predicateEvaluator, 
BlockSingleValIterator valueIterator, int numDocs) {
+_predicateEvaluator = predicateEvaluator;
+_valueIterator = valueIterator;
+_numDocs = numDocs;
+_valueMatcher = getValueMatcher();
   }
 
   @Override
   public int next() {
-if (_currentDocId == Constants.EOF) {
-  return Constants.EOF;
-}
-while (_valueIterator.hasNext() && _currentDocId < _endDocId) {
-  _currentDocId = _currentDocId + 1;
+while (_nextDocId < _numDocs) {
+  int nextDocId = _nextDocId++;
   _numEntriesScanned++;
-  if (_valueMatcher.doesCurrentEntryMatch(_valueIterator)) {
-return _currentDocId;
+  if (_valueMatcher.doesNextValueMatch()) {
+return nextDocId;
   }
 }
-_currentDocId = Constants.EOF;
 return Constants.EOF;
   }
 
   @Override
-  public int currentDocId() {
-return _currentDocId;
-  }
-
-  @Override
-  public String toString() {
-return SVScanDocIdIterator.class.getSimpleName() + "[" + _operatorName + 
"]";
+  public int advance(int targetDocId) {
+_nextDocId = targetDocId;
+_valueIterator.skipTo(targetDocId);
+return next();
   }
 
   @Override
   public MutableRoaringBitmap applyAnd(ImmutableRoaringBitmap docIds) {
 MutableRoaringBitmap result = new MutableRoaringBitmap();
-if (_evaluator.isAlwaysFalse()) {
-  return

[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432692609



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/dociditerators/ScanBasedDocIdIterator.java
##
@@ -24,22 +24,18 @@
 
 
 /**
- * All scan based filter iterators must implement this interface. This allows 
intersection to be
- * optimized.
- * For example, if the we have two iterators one index based and another scan 
based, instead of

Review comment:
   This part is explained in the AndDocIdSet javadoc. Updated the javadoc here so that it is clearer as well.
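The intersection optimization the removed javadoc alluded to (and which `AndDocIdSet` now applies through `applyAnd`) is: rather than walking a scan-based iterator over every docId, evaluate the scan predicate only on the docIds the index-based side already matched. A toy version, with `java.util.BitSet` standing in for RoaringBitmap and illustrative names (not Pinot's code):

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

public class ScanApplyAndSketch {
  // Instead of scanning the column for every docId in [0, numDocs), evaluate
  // the scan predicate only on the candidates the index-based side matched.
  static BitSet applyAnd(BitSet indexMatches, IntPredicate scanPredicate) {
    BitSet result = new BitSet();
    for (int docId = indexMatches.nextSetBit(0); docId >= 0;
        docId = indexMatches.nextSetBit(docId + 1)) {
      if (scanPredicate.test(docId)) {  // one value lookup instead of a full scan
        result.set(docId);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    BitSet indexMatches = new BitSet();
    indexMatches.set(2);
    indexMatches.set(5);
    indexMatches.set(8);
    // Pretend the column scan matches even docIds only.
    System.out.println(applyAnd(indexMatches, docId -> docId % 2 == 0));  // {2, 8}
  }
}
```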











[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #5465: Support distinctCountRawThetaSketch aggregation that returns serialized sketch.

2020-05-29 Thread GitBox


kishoreg commented on a change in pull request #5465:
URL: https://github.com/apache/incubator-pinot/pull/5465#discussion_r432689807



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionFactory.java
##
@@ -137,6 +137,8 @@ public static AggregationFunction 
getAggregationFunction(AggregationInfo aggrega
 return new FastHLLAggregationFunction(column);
   case DISTINCTCOUNTTHETASKETCH:
 return new DistinctCountThetaSketchAggregationFunction(arguments);
+  case DISTINCTCOUNTRAWTHETASKETCH:

Review comment:
   This should not be called distinctCount, right? You can do anything with a theta sketch. Maybe just RAWTHETASKETCH.











[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #5461: Adding Support for SQL CASE Statement

2020-05-29 Thread GitBox


kishoreg commented on a change in pull request #5461:
URL: https://github.com/apache/incubator-pinot/pull/5461#discussion_r432688497



##
File path: 
pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java
##
@@ -610,6 +611,25 @@ private static Expression toExpression(SqlNode node) {
 }
 
asFuncExpr.getFunctionCall().addToOperands(RequestUtils.getIdentifierExpression(aliasName));
 return asFuncExpr;
+  case CASE:
+// CASE WHEN statement is modeled as a function with variable-length parameters.
+// Assume N is the number of WHEN statements; the total number of parameters is (2 * N + 1):
+// - N: Convert each WHEN statement into a function Expression;
+// - N: Convert each THEN statement into an Expression;
+// - 1: Convert the ELSE statement into an Expression.
+SqlCase caseSqlNode = (SqlCase) node;
+SqlNodeList whenOperands = caseSqlNode.getWhenOperands();
+SqlNodeList thenOperands = caseSqlNode.getThenOperands();
+SqlNode elseOperand = caseSqlNode.getElseOperand();
+Expression caseFuncExpr = 
RequestUtils.getFunctionExpression(SqlKind.CASE.name());
+for (SqlNode whenSqlNode : whenOperands.getList()) {

Review comment:
   We want to add a validation that none of them are aggregation functions, since the current implementation only handles transform functions.
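The 2N + 1 operand layout described in the parser comment can be illustrated with a small stand-alone sketch. Strings stand in for Pinot's parsed `Expression` objects here; this is not the actual `CalciteSqlParser` code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CaseFlattenSketch {
  // Flatten CASE WHEN into one variable-arity function call with 2N + 1
  // operands: N WHEN conditions, then N THEN results, then the ELSE expression.
  static List<String> flattenCase(List<String> whenOperands, List<String> thenOperands, String elseOperand) {
    if (whenOperands.size() != thenOperands.size()) {
      throw new IllegalArgumentException("every WHEN needs a matching THEN");
    }
    List<String> operands = new ArrayList<>(whenOperands);
    operands.addAll(thenOperands);
    operands.add(elseOperand);  // 2N + 1 operands in total
    return operands;
  }

  public static void main(String[] args) {
    List<String> ops = flattenCase(
        Arrays.asList("col > 10", "col > 5"),  // N WHEN conditions
        Arrays.asList("'high'", "'medium'"),   // N THEN results
        "'low'");                              // 1 ELSE
    System.out.println(ops);  // [col > 10, col > 5, 'high', 'medium', 'low']
  }
}
```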











[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432681482



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/BitmapBasedFilterOperator.java
##
@@ -25,62 +25,73 @@
 import org.apache.pinot.core.operator.filter.predicate.PredicateEvaluator;
 import org.apache.pinot.core.segment.index.readers.InvertedIndexReader;
 import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
 
 
+@SuppressWarnings("rawtypes")
 public class BitmapBasedFilterOperator extends BaseFilterOperator {
   private static final String OPERATOR_NAME = "BitmapBasedFilterOperator";
 
   private final PredicateEvaluator _predicateEvaluator;
-  private final DataSource _dataSource;
-  private final ImmutableRoaringBitmap[] _bitmaps;
-  private final int _startDocId;
-  // TODO: change it to exclusive
-  // Inclusive
-  private final int _endDocId;
+  private final InvertedIndexReader _invertedIndexReader;
+  private final ImmutableRoaringBitmap _docIds;
   private final boolean _exclusive;
+  private final int _numDocs;
 
-  BitmapBasedFilterOperator(PredicateEvaluator predicateEvaluator, DataSource 
dataSource, int startDocId,
-  int endDocId) {
-// NOTE:
-// Predicate that is always evaluated as true or false should not be 
passed into the BitmapBasedFilterOperator for
-// performance concern.
-// If predicate is always evaluated as true, use MatchAllFilterOperator; 
if predicate is always evaluated as false,
-// use EmptyFilterOperator.
-Preconditions.checkArgument(!predicateEvaluator.isAlwaysTrue() && 
!predicateEvaluator.isAlwaysFalse());
-
+  BitmapBasedFilterOperator(PredicateEvaluator predicateEvaluator, DataSource 
dataSource, int numDocs) {
 _predicateEvaluator = predicateEvaluator;
-_dataSource = dataSource;
-_bitmaps = null;
-_startDocId = startDocId;
-_endDocId = endDocId;
+_invertedIndexReader = dataSource.getInvertedIndex();
+_docIds = null;
 _exclusive = predicateEvaluator.isExclusive();
+_numDocs = numDocs;
   }
 
-  public BitmapBasedFilterOperator(ImmutableRoaringBitmap[] bitmaps, int 
startDocId, int endDocId, boolean exclusive) {
+  public BitmapBasedFilterOperator(ImmutableRoaringBitmap docIds, boolean 
exclusive, int numDocs) {
 _predicateEvaluator = null;
-_dataSource = null;
-_bitmaps = bitmaps;
-_startDocId = startDocId;
-_endDocId = endDocId;
+_invertedIndexReader = null;
+_docIds = docIds;
 _exclusive = exclusive;
+_numDocs = numDocs;
   }
 
   @Override
   protected FilterBlock getNextBlock() {
-if (_bitmaps != null) {
-  return new FilterBlock(new BitmapDocIdSet(_bitmaps, _startDocId, 
_endDocId, _exclusive));
+if (_docIds != null) {
+  if (_exclusive) {
+return new FilterBlock(new 
BitmapDocIdSet(ImmutableRoaringBitmap.flip(_docIds, 0L, _numDocs), _numDocs));
+  } else {
+return new FilterBlock(new BitmapDocIdSet(_docIds, _numDocs));
+  }
 }
 
 int[] dictIds = _exclusive ? _predicateEvaluator.getNonMatchingDictIds() : 
_predicateEvaluator.getMatchingDictIds();

Review comment:
   For an exclusive predicate, we need to flip (invert) the bitmap so that the result bitmap reflects the matching docIds.
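The exclusive-predicate handling can be sketched as follows: for a predicate like NOT IN, the inverted index yields the *non*-matching docIds, and flipping over `[0, numDocs)` yields the matching ones. `java.util.BitSet` stands in for `ImmutableRoaringBitmap` in this illustrative sketch:

```java
import java.util.BitSet;

public class ExclusiveFlipSketch {
  // The inverted index gives the NON-matching docIds of an exclusive
  // predicate; flip over [0, numDocs) to obtain the matching docIds
  // (analogous to ImmutableRoaringBitmap.flip(docIds, 0L, numDocs)).
  static BitSet matchingDocIds(BitSet nonMatching, int numDocs) {
    BitSet matching = (BitSet) nonMatching.clone();
    matching.flip(0, numDocs);
    return matching;
  }

  public static void main(String[] args) {
    BitSet nonMatching = new BitSet();
    nonMatching.set(1);
    nonMatching.set(3);
    System.out.println(matchingDocIds(nonMatching, 5));  // {0, 2, 4}
  }
}
```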











[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432678859



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctAggregationFunction.java
##
@@ -123,21 +120,20 @@ public void aggregate(int length, AggregationResultHolder 
aggregationResultHolde
 columnDataTypes[i] = 
ColumnDataType.fromDataTypeSV(blockValSetMap.get(_inputExpressions.get(i)).getValueType());
   }
   DataSchema dataSchema = new DataSchema(_columns, columnDataTypes);
-  distinctTable = new DistinctTable(dataSchema, _orderBy, _capacity);
+  distinctTable = new DistinctTable(dataSchema, _orderBy, _limit);
   aggregationResultHolder.setValue(distinctTable);
+} else if (distinctTable.shouldNotAddMore()) {
+  return;
 }
 
-// TODO: Follow up PR will make few changes to start using DictionaryBasedAggregationOperator
-// for DISTINCT queries without filter.
+// TODO: Follow up PR will make few changes to start using DictionaryBasedAggregationOperator for DISTINCT queries
+//   without filter.
  RowBasedBlockValueFetcher blockValueFetcher = new RowBasedBlockValueFetcher(blockValSets);
 
-// TODO: Do early termination in the operator itself which should
-// not call aggregate function at all if the limit has reached
-// that will require the interface change since this function
-// has to communicate back that required number of records have
-// been collected
 for (int i = 0; i < length; i++) {
-  distinctTable.upsert(new Record(blockValueFetcher.getRow(i)));
+  if (!distinctTable.add(new Record(blockValueFetcher.getRow(i)))) {

Review comment:
   I think this for loop should be written separately for order-by and non-order-by.
   
   For order-by, there is no early termination, so the if check can be avoided since the return value will always be true.
   For non-order-by, after adding each record, check the return value to see if the limit has been reached and terminate early.
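The early-termination path suggested above can be sketched as follows. This is a hypothetical stand-in, not Pinot's actual API: `add` here mimics `DistinctTable.add` under the assumption that it returns false once `limit` unique values have been collected (in the order-by case it would always return true, so the check could be dropped):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistinctLoopSketch {
  // Hypothetical stand-in for DistinctTable.add(): collects up to `limit`
  // unique values and returns false once the limit has been reached.
  // (Illustrative only; not Pinot's actual API.)
  static boolean add(Set<Integer> uniqueValues, int value, int limit) {
    if (uniqueValues.size() < limit) {
      uniqueValues.add(value);
    }
    return uniqueValues.size() < limit;
  }

  // Non-order-by case: check the return value after every add and break early.
  static Set<Integer> collectWithEarlyTermination(List<Integer> rows, int limit) {
    Set<Integer> uniqueValues = new HashSet<>();
    for (int row : rows) {
      if (!add(uniqueValues, row, limit)) {
        break; // limit reached; the remaining rows can be skipped
      }
    }
    return uniqueValues;
  }

  public static void main(String[] args) {
    // With limit 2, scanning stops as soon as two unique values are seen
    System.out.println(collectWithEarlyTermination(List.of(1, 1, 2, 3, 4), 2));
  }
}
```

Splitting the loop per the review comment would move the `if` check out of the order-by branch entirely.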

##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctAggregationFunction.java
##
@@ -123,21 +120,20 @@ public void aggregate(int length, AggregationResultHolder 
aggregationResultHolde
  columnDataTypes[i] = ColumnDataType.fromDataTypeSV(blockValSetMap.get(_inputExpressions.get(i)).getValueType());
   }
   DataSchema dataSchema = new DataSchema(_columns, columnDataTypes);
-  distinctTable = new DistinctTable(dataSchema, _orderBy, _capacity);
+  distinctTable = new DistinctTable(dataSchema, _orderBy, _limit);
   aggregationResultHolder.setValue(distinctTable);
+} else if (distinctTable.shouldNotAddMore()) {
+  return;
 }
 
-// TODO: Follow up PR will make few changes to start using DictionaryBasedAggregationOperator
-// for DISTINCT queries without filter.
+// TODO: Follow up PR will make few changes to start using DictionaryBasedAggregationOperator for DISTINCT queries
+//   without filter.
  RowBasedBlockValueFetcher blockValueFetcher = new RowBasedBlockValueFetcher(blockValSets);
 
-// TODO: Do early termination in the operator itself which should
-// not call aggregate function at all if the limit has reached
-// that will require the interface change since this function
-// has to communicate back that required number of records have
-// been collected
 for (int i = 0; i < length; i++) {
-  distinctTable.upsert(new Record(blockValueFetcher.getRow(i)));
+  if (!distinctTable.add(new Record(blockValueFetcher.getRow(i)))) {

Review comment:
   I think this for loop should be written separately for order-by and non-order-by.
   
   For order-by, there is no early termination, so the if check can be avoided since the return value will always be true.
   For non-order-by, after adding each record, check the return value to see if the limit has been reached and terminate early within the loop.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432679650



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##
@@ -0,0 +1,297 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.query.aggregation.function.customobject;
+
+import it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap;
+import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.PriorityQueue;
+import java.util.Set;
+import javax.annotation.Nullable;
+import org.apache.pinot.common.request.SelectionSort;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.datatable.DataTableBuilder;
+import org.apache.pinot.core.common.datatable.DataTableFactory;
+import org.apache.pinot.core.data.table.Record;
+import org.apache.pinot.spi.utils.ByteArray;
+
+
+/**
+ * The {@code DistinctTable} class serves as the intermediate result of {@code 
DistinctAggregationFunction}.
+ */
+@SuppressWarnings({"rawtypes", "unchecked"})
+public class DistinctTable {
+  private static final int MAX_INITIAL_CAPACITY = 10_000;
+
+  private final DataSchema _dataSchema;
+  private final int _limit;
+  private final Set<Record> _uniqueRecords;
+  private final PriorityQueue<Record> _sortedRecords;
+  private final List<Record> _records;
+
+  /**
+   * Constructor of the main {@code DistinctTable} which can be used to add 
records and merge other
+   * {@code DistinctTable}s.
+   */
+  public DistinctTable(DataSchema dataSchema, @Nullable List<SelectionSort> orderBy, int limit) {
+_dataSchema = dataSchema;
+_limit = limit;
+
+// TODO: see if 10k is the right max initial capacity to use
+// NOTE: When LIMIT is smaller than or equal to the MAX_INITIAL_CAPACITY, no resize is required.
+int initialCapacity = Math.min(limit, MAX_INITIAL_CAPACITY);
+_uniqueRecords = new ObjectOpenHashSet<>(initialCapacity);
+if (orderBy != null) {
+  String[] columns = dataSchema.getColumnNames();
+  int numColumns = columns.length;
+  Object2IntOpenHashMap<String> columnIndexMap = new Object2IntOpenHashMap<>(numColumns);
+  for (int i = 0; i < numColumns; i++) {
+columnIndexMap.put(columns[i], i);
+  }
+  int numOrderByColumns = orderBy.size();
+  int[] orderByColumnIndexes = new int[numOrderByColumns];
+  boolean[] orderByAsc = new boolean[numOrderByColumns];
+  for (int i = 0; i < numOrderByColumns; i++) {
+SelectionSort selectionSort = orderBy.get(i);
+orderByColumnIndexes[i] = columnIndexMap.getInt(selectionSort.getColumn());
+orderByAsc[i] = selectionSort.isIsAsc();
+  }
+  _sortedRecords = new PriorityQueue<>(initialCapacity, (record1, record2) -> {
+Object[] values1 = record1.getValues();
+Object[] values2 = record2.getValues();
+for (int i = 0; i < numOrderByColumns; i++) {
+  Comparable valueToCompare1 = (Comparable) values1[orderByColumnIndexes[i]];
+  Comparable valueToCompare2 = (Comparable) values2[orderByColumnIndexes[i]];
+  int result = orderByAsc[i] ? valueToCompare2.compareTo(valueToCompare1) : valueToCompare1.compareTo(valueToCompare2);
+  if (result != 0) {
+return result;
+  }
+}
+return 0;
+  });
+} else {
+  _sortedRecords = null;
+}
+_records = null;
+  }
+
+  /**
+   * Returns the {@code DataSchema} of the {@code DistinctTable}.
+   */
+  public DataSchema getDataSchema() {
+return _dataSchema;
+  }
+
+  /**
+   * Returns the number of unique records within the {@code DistinctTable}.
+   */
+  public int size() {
+if (_uniqueRecords != null) {
+  // Server-side
+  return _uniqueRecords.size();
+} else {
+  // Broker-side
+  return _records.size();
+}
+  }
+
+  /**
+   * Adds a record into the DistinctTable and returns whether more r

[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432679222



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##
@@ -0,0 +1,297 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.query.aggregation.function.customobject;
+
+import it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap;
+import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.PriorityQueue;
+import java.util.Set;
+import javax.annotation.Nullable;
+import org.apache.pinot.common.request.SelectionSort;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.datatable.DataTableBuilder;
+import org.apache.pinot.core.common.datatable.DataTableFactory;
+import org.apache.pinot.core.data.table.Record;
+import org.apache.pinot.spi.utils.ByteArray;
+
+
+/**
+ * The {@code DistinctTable} class serves as the intermediate result of {@code 
DistinctAggregationFunction}.
+ */
+@SuppressWarnings({"rawtypes", "unchecked"})
+public class DistinctTable {
+  private static final int MAX_INITIAL_CAPACITY = 10_000;
+
+  private final DataSchema _dataSchema;
+  private final int _limit;
+  private final Set<Record> _uniqueRecords;
+  private final PriorityQueue<Record> _sortedRecords;
+  private final List<Record> _records;
+
+  /**
+   * Constructor of the main {@code DistinctTable} which can be used to add 
records and merge other
+   * {@code DistinctTable}s.
+   */
+  public DistinctTable(DataSchema dataSchema, @Nullable List<SelectionSort> orderBy, int limit) {
+_dataSchema = dataSchema;
+_limit = limit;
+
+// TODO: see if 10k is the right max initial capacity to use
+// NOTE: When LIMIT is smaller than or equal to the MAX_INITIAL_CAPACITY, no resize is required.
+int initialCapacity = Math.min(limit, MAX_INITIAL_CAPACITY);
+_uniqueRecords = new ObjectOpenHashSet<>(initialCapacity);
+if (orderBy != null) {
+  String[] columns = dataSchema.getColumnNames();
+  int numColumns = columns.length;
+  Object2IntOpenHashMap<String> columnIndexMap = new Object2IntOpenHashMap<>(numColumns);
+  for (int i = 0; i < numColumns; i++) {
+columnIndexMap.put(columns[i], i);
+  }
+  int numOrderByColumns = orderBy.size();
+  int[] orderByColumnIndexes = new int[numOrderByColumns];
+  boolean[] orderByAsc = new boolean[numOrderByColumns];
+  for (int i = 0; i < numOrderByColumns; i++) {
+SelectionSort selectionSort = orderBy.get(i);
+orderByColumnIndexes[i] = columnIndexMap.getInt(selectionSort.getColumn());
+orderByAsc[i] = selectionSort.isIsAsc();
+  }
+  _sortedRecords = new PriorityQueue<>(initialCapacity, (record1, record2) -> {
+Object[] values1 = record1.getValues();
+Object[] values2 = record2.getValues();
+for (int i = 0; i < numOrderByColumns; i++) {
+  Comparable valueToCompare1 = (Comparable) values1[orderByColumnIndexes[i]];
+  Comparable valueToCompare2 = (Comparable) values2[orderByColumnIndexes[i]];
+  int result = orderByAsc[i] ? valueToCompare2.compareTo(valueToCompare1) : valueToCompare1.compareTo(valueToCompare2);
+  if (result != 0) {
+return result;
+  }
+}
+return 0;
+  });
+} else {
+  _sortedRecords = null;
+}
+_records = null;
+  }
+
+  /**
+   * Returns the {@code DataSchema} of the {@code DistinctTable}.
+   */
+  public DataSchema getDataSchema() {
+return _dataSchema;
+  }
+
+  /**
+   * Returns the number of unique records within the {@code DistinctTable}.
+   */
+  public int size() {
+if (_uniqueRecords != null) {
+  // Server-side
+  return _uniqueRecords.size();
+} else {
+  // Broker-side
+  return _records.size();
+}
+  }
+
+  /**
+   * Adds a record into the DistinctTable and returns whether more r

[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5451: Refactor DistinctTable to use PriorityQueue based algorithm

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5451:
URL: https://github.com/apache/incubator-pinot/pull/5451#discussion_r432678859



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctAggregationFunction.java
##
@@ -123,21 +120,20 @@ public void aggregate(int length, AggregationResultHolder 
aggregationResultHolde
  columnDataTypes[i] = ColumnDataType.fromDataTypeSV(blockValSetMap.get(_inputExpressions.get(i)).getValueType());
   }
   DataSchema dataSchema = new DataSchema(_columns, columnDataTypes);
-  distinctTable = new DistinctTable(dataSchema, _orderBy, _capacity);
+  distinctTable = new DistinctTable(dataSchema, _orderBy, _limit);
   aggregationResultHolder.setValue(distinctTable);
+} else if (distinctTable.shouldNotAddMore()) {
+  return;
 }
 
-// TODO: Follow up PR will make few changes to start using DictionaryBasedAggregationOperator
-// for DISTINCT queries without filter.
+// TODO: Follow up PR will make few changes to start using DictionaryBasedAggregationOperator for DISTINCT queries
+//   without filter.
  RowBasedBlockValueFetcher blockValueFetcher = new RowBasedBlockValueFetcher(blockValSets);
 
-// TODO: Do early termination in the operator itself which should
-// not call aggregate function at all if the limit has reached
-// that will require the interface change since this function
-// has to communicate back that required number of records have
-// been collected
 for (int i = 0; i < length; i++) {
-  distinctTable.upsert(new Record(blockValueFetcher.getRow(i)));
+  if (!distinctTable.add(new Record(blockValueFetcher.getRow(i)))) {

Review comment:
   I think this for loop should be written separately for order-by and non-order-by.
   
   For order-by, there is no early termination, so the if check can be avoided since the return value will always be true.
   For non-order-by, after adding each record, check the return value to see if the limit has been reached.

##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##
@@ -0,0 +1,297 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.query.aggregation.function.customobject;
+
+import it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap;
+import it.unimi.dsi.fastutil.objects.ObjectOpenHashSet;
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.PriorityQueue;
+import java.util.Set;
+import javax.annotation.Nullable;
+import org.apache.pinot.common.request.SelectionSort;
+import org.apache.pinot.common.utils.DataSchema;
+import org.apache.pinot.common.utils.DataTable;
+import org.apache.pinot.core.common.datatable.DataTableBuilder;
+import org.apache.pinot.core.common.datatable.DataTableFactory;
+import org.apache.pinot.core.data.table.Record;
+import org.apache.pinot.spi.utils.ByteArray;
+
+
+/**
+ * The {@code DistinctTable} class serves as the intermediate result of {@code 
DistinctAggregationFunction}.
+ */
+@SuppressWarnings({"rawtypes", "unchecked"})
+public class DistinctTable {
+  private static final int MAX_INITIAL_CAPACITY = 10_000;
+
+  private final DataSchema _dataSchema;
+  private final int _limit;
+  private final Set<Record> _uniqueRecords;
+  private final PriorityQueue<Record> _sortedRecords;
+  private final List<Record> _records;

Review comment:
   A comment noting that the list is not used on the server and/or is only used on the broker side for deserialization and reduce would be good to have.

##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/customobject/DistinctTable.java
##
@@ -0,0 +1,297 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this 

[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


Jackie-Jiang commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432678997



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/SortedIndexBasedFilterOperator.java
##
@@ -0,0 +1,145 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator.filter;
+
+import com.google.common.base.Preconditions;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import org.apache.pinot.common.utils.Pairs.IntPair;
+import org.apache.pinot.core.common.DataSource;
+import org.apache.pinot.core.io.reader.impl.v1.SortedIndexReader;
+import org.apache.pinot.core.operator.blocks.FilterBlock;
+import org.apache.pinot.core.operator.docidsets.SortedDocIdSet;
+import org.apache.pinot.core.operator.filter.predicate.PredicateEvaluator;
+import org.apache.pinot.core.operator.filter.predicate.RangePredicateEvaluatorFactory.OfflineDictionaryBasedRangePredicateEvaluator;
+
+
+@SuppressWarnings("rawtypes")
+public class SortedIndexBasedFilterOperator extends BaseFilterOperator {
+  private static final String OPERATOR_NAME = "SortedIndexBasedFilterOperator";
+
+  private final PredicateEvaluator _predicateEvaluator;
+  private final SortedIndexReader _sortedIndexReader;
+  private final int _numDocs;
+
+  SortedIndexBasedFilterOperator(PredicateEvaluator predicateEvaluator, DataSource dataSource, int numDocs) {
+_predicateEvaluator = predicateEvaluator;
+_sortedIndexReader = (SortedIndexReader) dataSource.getInvertedIndex();
+_numDocs = numDocs;
+  }
+
+  @Override
+  protected FilterBlock getNextBlock() {
+// At this point, we need to create a list of matching docId ranges. There are two kinds of operators:
+//
+// - "Additive" operators, such as EQ, IN and RANGE build up a list of ranges and merge overlapping/adjacent ones,
+//   clipping the ranges to [startDocId; endDocId]
+//
+// - "Subtractive" operators, such as NEQ and NOT IN build up a list of ranges that do not match and build a list of
+//   matching intervals by subtracting a list of non-matching intervals from the given range of
+//   [startDocId; endDocId]
+//
+// For now, we don't look at the cardinality of the column's dictionary, although we should do that if someone

Review comment:
   I didn't change this block of comments, but it seems the diff identifies this as a new class. Let me update it.
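The "additive" case described in the quoted comment, building ranges and merging overlapping or adjacent ones, can be sketched with plain `int[]` pairs. This is illustrative only; Pinot's operator works with its own `IntPair` ranges from the sorted index, and the method names here are assumptions:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RangeMergeSketch {
  // Given [start, end] docId ranges (inclusive, sorted by start), merge
  // overlapping or adjacent ranges into a minimal list of disjoint ranges.
  static List<int[]> mergeRanges(List<int[]> sorted) {
    List<int[]> merged = new ArrayList<>();
    for (int[] range : sorted) {
      if (!merged.isEmpty() && range[0] <= merged.get(merged.size() - 1)[1] + 1) {
        // Overlapping or adjacent: extend the previous range
        int[] last = merged.get(merged.size() - 1);
        last[1] = Math.max(last[1], range[1]);
      } else {
        merged.add(new int[]{range[0], range[1]});
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    List<int[]> ranges = Arrays.asList(
        new int[]{0, 4}, new int[]{3, 7}, new int[]{9, 12}, new int[]{13, 15});
    // [3,7] overlaps [0,4]; [13,15] is adjacent to [9,12]
    for (int[] r : mergeRanges(ranges)) {
      System.out.println(r[0] + "-" + r[1]);
    }
  }
}
```

The subtractive case would then complement the merged list against [startDocId; endDocId].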





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #5459: [Cleanup] Merge RealtimeSegmentOnlineOfflineStateModel and SegmentOnlineOfflineStateModel in CommonConstants

2020-05-29 Thread GitBox


Jackie-Jiang merged pull request #5459:
URL: https://github.com/apache/incubator-pinot/pull/5459


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated: [Cleanup] Merge RealtimeSegmentOnlineOfflineStateModel and SegmentOnlineOfflineStateModel in CommonConstants (#5459)

2020-05-29 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new 6bfcacb  [Cleanup] Merge RealtimeSegmentOnlineOfflineStateModel and 
SegmentOnlineOfflineStateModel in CommonConstants (#5459)
6bfcacb is described below

commit 6bfcacb239f55c27491c2a580bf4a63cf2ec88fe
Author: Xiaotian (Jackie) Jiang <1751+jackie-ji...@users.noreply.github.com>
AuthorDate: Fri May 29 11:39:44 2020 -0700

[Cleanup] Merge RealtimeSegmentOnlineOfflineStateModel and 
SegmentOnlineOfflineStateModel in CommonConstants (#5459)

We only have one SegmentOnlineOfflineStateModel, so there is no value 
keeping both of them
---
 .../HelixExternalViewBasedQueryQuotaManager.java   |  4 +-
 .../pinot/broker/routing/RoutingManager.java   |  6 +--
 .../instanceselector/BaseInstanceSelector.java |  6 +--
 .../segmentselector/RealtimeSegmentSelector.java   |  4 +-
 .../instanceselector/InstanceSelectorTest.java |  8 ++--
 .../segmentselector/SegmentSelectorTest.java   |  4 +-
 .../apache/pinot/common/utils/CommonConstants.java | 10 +---
 .../helix/core/PinotHelixResourceManager.java  | 20 
 .../segment/OfflineSegmentAssignment.java  |  6 +--
 .../segment/RealtimeSegmentAssignment.java | 14 +++---
 .../assignment/segment/SegmentAssignmentUtils.java | 13 +++---
 .../realtime/PinotLLCRealtimeSegmentManager.java   | 39 
 .../helix/core/rebalance/TableRebalancer.java  |  6 +--
 .../helix/core/retention/RetentionManager.java |  3 +-
 ...fflineNonReplicaGroupSegmentAssignmentTest.java | 14 +++---
 .../OfflineReplicaGroupSegmentAssignmentTest.java  | 26 +--
 ...altimeNonReplicaGroupSegmentAssignmentTest.java | 19 
 .../RealtimeReplicaGroupSegmentAssignmentTest.java | 19 
 .../segment/SegmentAssignmentUtilsTest.java|  8 ++--
 .../PinotLLCRealtimeSegmentManagerTest.java| 54 +++---
 .../core/rebalance/TableRebalancerClusterTest.java |  2 +-
 .../helix/core/rebalance/TableRebalancerTest.java  |  8 ++--
 .../ControllerPeriodicTasksIntegrationTest.java|  8 ++--
 .../server/starter/helix/HelixServerStarter.java   |  5 +-
 24 files changed, 148 insertions(+), 158 deletions(-)

diff --git 
a/pinot-broker/src/main/java/org/apache/pinot/broker/queryquota/HelixExternalViewBasedQueryQuotaManager.java
 
b/pinot-broker/src/main/java/org/apache/pinot/broker/queryquota/HelixExternalViewBasedQueryQuotaManager.java
index 172b13a..3306e58 100644
--- 
a/pinot-broker/src/main/java/org/apache/pinot/broker/queryquota/HelixExternalViewBasedQueryQuotaManager.java
+++ 
b/pinot-broker/src/main/java/org/apache/pinot/broker/queryquota/HelixExternalViewBasedQueryQuotaManager.java
@@ -168,7 +168,7 @@ public class HelixExternalViewBasedQueryQuotaManager 
implements ClusterChangeHan
 if (stateMap != null) {
   for (Map.Entry state : stateMap.entrySet()) {
 if (!_helixManager.getInstanceName().equals(state.getKey()) && 
state.getValue()
-
.equals(CommonConstants.Helix.StateModel.SegmentOnlineOfflineStateModel.ONLINE))
 {
+
.equals(CommonConstants.Helix.StateModel.SegmentStateModel.ONLINE)) {
   otherOnlineBrokerCount++;
 }
   }
@@ -304,7 +304,7 @@ public class HelixExternalViewBasedQueryQuotaManager 
implements ClusterChangeHan
   int otherOnlineBrokerCount = 0;
   for (Map.Entry state : stateMap.entrySet()) {
 if (!_helixManager.getInstanceName().equals(state.getKey()) && 
state.getValue()
-
.equals(CommonConstants.Helix.StateModel.SegmentOnlineOfflineStateModel.ONLINE))
 {
+
.equals(CommonConstants.Helix.StateModel.SegmentStateModel.ONLINE)) {
   otherOnlineBrokerCount++;
 }
   }
diff --git 
a/pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
 
b/pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
index 62eece4..e8bf81f 100644
--- 
a/pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
+++ 
b/pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
@@ -51,7 +51,7 @@ import org.apache.pinot.common.metrics.BrokerMeter;
 import org.apache.pinot.common.metrics.BrokerMetrics;
 import org.apache.pinot.common.request.BrokerRequest;
 import org.apache.pinot.common.utils.CommonConstants;
-import 
org.apache.pinot.common.utils.CommonConstants.Helix.StateModel.RealtimeSegmentOnlineOfflineStateModel;
+import 
org.apache.pinot.common.utils.CommonConstants.Helix.StateModel.SegmentStateModel;
 import org.apache.pinot.common.utils.HashUtil;
 import org.apache.pinot.core.transport.ServerInstance;
 import org.apache.pinot.spi.config.table.QueryConfig;
@@ -195,8 +195,8 @@ public class RoutingManager implements ClusterChangeHandler 
{
   Set online

[incubator-pinot] branch master updated: Remove master branch restriction (#5467)

2020-05-29 Thread jlli
This is an automated email from the ASF dual-hosted git repository.

jlli pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new 7310ffb  Remove master branch restriction (#5467)
7310ffb is described below

commit 7310ffb6b7364e155ba8bb041f3015ba9b16c9d9
Author: Jialiang Li 
AuthorDate: Fri May 29 09:05:25 2020 -0700

Remove master branch restriction (#5467)

Co-authored-by: Jack Li(Analytics Engineering) 
---
 .travis.yml | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 0e5a142..2315bc6 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -25,14 +25,14 @@ addons:
 install:
   - ./.travis/.travis_install.sh
 
-branches:
-  only:
-- master
+#branches:
+#  only:
+#- master
 
 stages:
   - test
   - name: deploy
-if: branch = master
+#if: branch = master
 
 jobs:
   include:


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] jackjlli merged pull request #5467: Remove master branch restriction to trigger Travis builds

2020-05-29 Thread GitBox


jackjlli merged pull request #5467:
URL: https://github.com/apache/incubator-pinot/pull/5467


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] jackjlli commented on pull request #5467: Remove master branch restriction to trigger Travis builds

2020-05-29 Thread GitBox


jackjlli commented on pull request #5467:
URL: https://github.com/apache/incubator-pinot/pull/5467#issuecomment-636049216


   > Does this change will start to run Travis job for all existing branches? 
(we have `155` branches as of writing this comment). We should be able to 
trigger in ad-hoc basis for branches other than master.
   
   No, the Travis build won't be triggered automatically unless we set it that way in the Travis settings. Plus, only the repo admin can change those settings. So we are good to remove this restriction in the yml file.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee edited a comment on pull request #5467: Remove master branch restriction to trigger Travis builds

2020-05-29 Thread GitBox


snleee edited a comment on pull request #5467:
URL: https://github.com/apache/incubator-pinot/pull/5467#issuecomment-635850318


   Will this change start to run the Travis job for all existing branches? (we have `155` branches as of writing this comment). We should be able to trigger on an ad-hoc basis for branches other than master.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee commented on pull request #5467: Remove master branch restriction to trigger Travis builds

2020-05-29 Thread GitBox


snleee commented on pull request #5467:
URL: https://github.com/apache/incubator-pinot/pull/5467#issuecomment-635850318


   Will this start to run the Travis job for all existing branches? (we have `155` branches as of writing this comment). We should be able to trigger on an ad-hoc basis for branches other than master.






[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432331313



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/docidsets/AndDocIdSet.java
##
@@ -0,0 +1,156 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator.docidsets;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.pinot.common.utils.Pairs.IntPair;
+import org.apache.pinot.core.common.BlockDocIdIterator;
+import org.apache.pinot.core.operator.dociditerators.AndDocIdIterator;
+import org.apache.pinot.core.operator.dociditerators.BitmapBasedDocIdIterator;
+import 
org.apache.pinot.core.operator.dociditerators.RangelessBitmapDocIdIterator;
+import org.apache.pinot.core.operator.dociditerators.ScanBasedDocIdIterator;
+import org.apache.pinot.core.operator.dociditerators.SortedDocIdIterator;
+import org.apache.pinot.core.util.SortedRangeIntersection;
+import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
+
+
+/**
+ * The FilterBlockDocIdSet to perform AND on all child FilterBlockDocIdSets.
+ * The AndDocIdSet will construct the BlockDocIdIterator based on the BlockDocIdIterators from the child
+ * FilterBlockDocIdSets:
+ * <ul>
+ *   <li>
+ *     When there is at least one index-based BlockDocIdIterator (SortedDocIdIterator or BitmapBasedDocIdIterator) and
+ *     at least one ScanBasedDocIdIterator, or more than one index-based BlockDocIdIterator, merge them and construct a
+ *     RangelessBitmapDocIdIterator from the merged document ids. If there are no remaining BlockDocIdIterators,
+ *     directly return the merged RangelessBitmapDocIdIterator; otherwise, construct and return an AndDocIdIterator
+ *     with the merged RangelessBitmapDocIdIterator and the remaining BlockDocIdIterators.
+ *   </li>
+ *   <li>
+ *     Otherwise, construct and return an AndDocIdIterator with all BlockDocIdIterators.
+ *   </li>
+ * </ul>
+ */
+public final class AndDocIdSet implements FilterBlockDocIdSet {
+  private final List<FilterBlockDocIdSet> _docIdSets;
+
+  public AndDocIdSet(List<FilterBlockDocIdSet> docIdSets) {
+_docIdSets = docIdSets;
+  }
+
+  @Override
+  public BlockDocIdIterator iterator() {
+int numDocIdSets = _docIdSets.size();
+// NOTE: Keep the order of FilterBlockDocIdSets to preserve the order 
decided within FilterOperatorUtils.
+// TODO: Consider deciding the order based on the stats of 
BlockDocIdIterators
+BlockDocIdIterator[] allDocIdIterators = new 
BlockDocIdIterator[numDocIdSets];
+List<SortedDocIdIterator> sortedDocIdIterators = new ArrayList<>();

Review comment:
   These 3 (sorted, inv index, and scan) are basically for the left or right 
leaf operators of AND. The `remainingDocIdIterators` is for non-leaves (child 
AND/OR), right? 
   
   On that note, for any sub-tree rooted at AND, there can be at most one child 
with a sorted iterator. Right?
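The strategy discussed in this thread (merge the index-based children eagerly, then apply the scan-based children only to the surviving doc ids) can be sketched with `java.util.BitSet` standing in for the RoaringBitmap types. The class and method names below are hypothetical, not Pinot's actual API:

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

// Hypothetical sketch: merge the index-based AND children (a sorted doc-id
// range and an inverted-index bitmap) eagerly, then evaluate a scan-based
// child lazily against only the merged doc ids.
class AndSketch {
    // Intersect a sorted doc-id range [rangeStart, rangeEnd) with an
    // inverted-index bitmap of matching doc ids.
    static BitSet mergeIndexBased(int rangeStart, int rangeEnd, BitSet invertedIndexDocIds) {
        BitSet merged = new BitSet();
        merged.set(rangeStart, rangeEnd);   // docs matched by the sorted column
        merged.and(invertedIndexDocIds);    // AND with the bitmap-based child
        return merged;
    }

    // Apply a scan-based child only to the doc ids that survived the merge.
    static BitSet applyScan(BitSet merged, IntPredicate scanMatcher) {
        BitSet result = new BitSet();
        for (int docId = merged.nextSetBit(0); docId >= 0; docId = merged.nextSetBit(docId + 1)) {
            if (scanMatcher.test(docId)) {
                result.set(docId);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet invIndex = new BitSet();
        invIndex.set(2);
        invIndex.set(4);
        invIndex.set(7);
        BitSet merged = AndSketch.mergeIndexBased(0, 6, invIndex);          // {2, 4}
        BitSet result = AndSketch.applyScan(merged, docId -> docId % 2 == 0);
        System.out.println(result);                                         // {2, 4}
    }
}
```

The point of the ordering is that the scan-based matcher is only invoked for doc ids already allowed by the cheap index-based children.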








[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432323606



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/common/BlockDocIdSet.java
##
@@ -18,9 +18,15 @@
  */
 package org.apache.pinot.core.common;
 
+/**
+ * The interface BlockDocIdSet represents all the matching 
document ids for a predicate.

Review comment:
   +1000 on removing this.

##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/common/BlockDocIdIterator.java
##
@@ -25,25 +25,16 @@
 public interface BlockDocIdIterator {
 
   /**
-   * Get the next document id.
-   *
-   * @return Next document id or EOF if there is no more documents
+   * Returns the next matched document id, or {@link Constants#EOF} if there is no more matched document.
+   * NOTE: This method should not be called again after it returns {@link Constants#EOF}.
*/
   int next();
 
   /**
-   * Advance to the first document whose id is equal or greater than the given 
target document id.
-   * If the given target document id is smaller or equal to the current 
document id, then return the current one.
-   *
-   * @param targetDocId The target document id
-   * @return First document id that is equal or greater than target or EOF if 
no document matches
+   * Returns the first matched document whose id is equal to or greater than 
the given target document id, or
+   * {@link Constants#EOF} if there is no such document.
+   * NOTE: The target document id should be greater than the document id previously returned.

Review comment:
   What happens from an API point of view if the target is the same as the 
last returned matching docId?
   We won't throw an error; we'll return the target as is. So the comment 
should state "greater than or equal to".
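The next()/advance() contract being discussed can be illustrated with a minimal array-backed iterator. The class name, the `EOF` sentinel value, and the array representation are assumptions for illustration, not Pinot's implementation:

```java
// Hypothetical minimal iterator over a sorted array of matching doc ids,
// illustrating the next()/advance() contract discussed above.
class ArrayDocIdIterator {
    static final int EOF = Integer.MIN_VALUE; // sentinel: no more documents

    private final int[] _docIds; // sorted matching doc ids
    private int _cursor = 0;

    ArrayDocIdIterator(int[] docIds) {
        _docIds = docIds;
    }

    // Returns the next matching doc id, or EOF when exhausted.
    int next() {
        return _cursor < _docIds.length ? _docIds[_cursor++] : EOF;
    }

    // Returns the first matching doc id >= targetDocId, or EOF. If the target
    // equals the last returned doc id, that doc id is returned again (no error
    // is thrown) -- which is why the javadoc should say "greater than or equal to".
    int advance(int targetDocId) {
        if (_cursor > 0 && _docIds[_cursor - 1] >= targetDocId) {
            return _docIds[_cursor - 1]; // last returned doc already satisfies the target
        }
        int docId;
        while ((docId = next()) != EOF) {
            if (docId >= targetDocId) {
                return docId;
            }
        }
        return EOF;
    }
}
```

For example, after `next()` has returned 3, both `advance(3)` and a later `advance(3)` return 3 again rather than failing.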








[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432323266



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/dociditerators/ScanBasedDocIdIterator.java
##
@@ -24,22 +24,18 @@
 
 
 /**
- * All scan based filter iterators must implement this interface. This allows 
intersection to be
- * optimized.
- * For example, if the we have two iterators one index based and another scan 
based, instead of

Review comment:
   This is a good piece of information. Why delete it? That's how the 
filtering will work, right, if we do 
   `WHERE col1 = 200 AND col2 = 10` -- if there is an inverted index on col1 and 
no index on col2.
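The optimization the deleted comment described can be shown concretely: with an inverted index on col1 and no index on col2, the col1 bitmap limits which rows the col2 scan has to touch. The column values and the `filter` helper below are made up for illustration:

```java
import java.util.BitSet;

// Hypothetical illustration: for WHERE col1 = 200 AND col2 = 10, the
// inverted-index bitmap on col1 pre-selects rows, and only those rows pay
// the cost of scanning col2.
class IndexDrivenScan {
    static BitSet filter(int[] col1, int[] col2) {
        // Simulated inverted-index lookup: bitmap of rows where col1 == 200.
        BitSet col1Matches = new BitSet();
        for (int i = 0; i < col1.length; i++) {
            if (col1[i] == 200) {
                col1Matches.set(i);
            }
        }
        // Scan col2 only at the rows the bitmap allows, instead of all rows.
        BitSet result = new BitSet();
        for (int i = col1Matches.nextSetBit(0); i >= 0; i = col1Matches.nextSetBit(i + 1)) {
            if (col2[i] == 10) {
                result.set(i); // only pre-selected rows are scanned here
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] col1 = {200, 5, 200, 7, 200, 200};
        int[] col2 = {10, 10, 3, 10, 10, 9};
        System.out.println(IndexDrivenScan.filter(col1, col2)); // {0, 4}
    }
}
```

Here only 4 of the 6 rows are scanned for col2, because rows 1 and 3 were already excluded by the col1 bitmap.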

##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/dociditerators/SVScanDocIdIterator.java
##
@@ -18,238 +18,138 @@
  */
 package org.apache.pinot.core.operator.dociditerators;
 
-import org.apache.pinot.core.common.BlockMetadata;
 import org.apache.pinot.core.common.BlockSingleValIterator;
-import org.apache.pinot.core.common.BlockValSet;
 import org.apache.pinot.core.common.Constants;
 import org.apache.pinot.core.operator.filter.predicate.PredicateEvaluator;
-import org.apache.pinot.spi.data.FieldSpec;
 import org.roaringbitmap.IntIterator;
 import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
 import org.roaringbitmap.buffer.MutableRoaringBitmap;
 
 
-public class SVScanDocIdIterator implements ScanBasedDocIdIterator {
-  private int _currentDocId = -1;
+public final class SVScanDocIdIterator implements ScanBasedDocIdIterator {
+  private final PredicateEvaluator _predicateEvaluator;
   private final BlockSingleValIterator _valueIterator;
-  private int _startDocId;
-  private int _endDocId;
-  private PredicateEvaluator _evaluator;
-  private String _operatorName;
-  private int _numEntriesScanned = 0;
+  private final int _numDocs;
   private final ValueMatcher _valueMatcher;
 
-  public SVScanDocIdIterator(String operatorName, BlockValSet blockValSet, 
BlockMetadata blockMetadata,
-  PredicateEvaluator evaluator) {
-_operatorName = operatorName;
-_evaluator = evaluator;
-_valueIterator = (BlockSingleValIterator) blockValSet.iterator();
-
-if (evaluator.isAlwaysFalse()) {
-  _currentDocId = Constants.EOF;
-  setStartDocId(Constants.EOF);
-  setEndDocId(Constants.EOF);
-} else {
-  setStartDocId(blockMetadata.getStartDocId());
-  setEndDocId(blockMetadata.getEndDocId());
-}
+  private int _nextDocId = 0;
+  private long _numEntriesScanned = 0L;
 
-if (evaluator.isDictionaryBased()) {
-  _valueMatcher = new IntMatcher(); // Match using dictionary id's that 
are integers.
-} else {
-  _valueMatcher = getValueMatcherForType(blockMetadata.getDataType());
-}
-_valueMatcher.setEvaluator(evaluator);
-  }
-
-  /**
-   * After setting the startDocId, next calls will always return from 
>=startDocId
-   *
-   * @param startDocId Start doc id
-   */
-  public void setStartDocId(int startDocId) {
-_currentDocId = startDocId - 1;
-_valueIterator.skipTo(startDocId);
-_startDocId = startDocId;
-  }
-
-  /**
-   * After setting the endDocId, next call will return Constants.EOF after 
currentDocId exceeds
-   * endDocId
-   *
-   * @param endDocId End doc id
-   */
-  public void setEndDocId(int endDocId) {
-_endDocId = endDocId;
-  }
-
-  @Override
-  public boolean isMatch(int docId) {
-if (_currentDocId == Constants.EOF) {
-  return false;
-}
-_valueIterator.skipTo(docId);
-_numEntriesScanned++;
-return _valueMatcher.doesCurrentEntryMatch(_valueIterator);
-  }
-
-  @Override
-  public int advance(int targetDocId) {
-if (_currentDocId == Constants.EOF) {
-  return _currentDocId;
-}
-if (targetDocId < _startDocId) {
-  targetDocId = _startDocId;
-} else if (targetDocId > _endDocId) {
-  _currentDocId = Constants.EOF;
-}
-if (_currentDocId >= targetDocId) {
-  return _currentDocId;
-} else {
-  _currentDocId = targetDocId - 1;
-  _valueIterator.skipTo(targetDocId);
-  return next();
-}
+  public SVScanDocIdIterator(PredicateEvaluator predicateEvaluator, 
BlockSingleValIterator valueIterator, int numDocs) {
+_predicateEvaluator = predicateEvaluator;
+_valueIterator = valueIterator;
+_numDocs = numDocs;
+_valueMatcher = getValueMatcher();
   }
 
   @Override
   public int next() {
-if (_currentDocId == Constants.EOF) {
-  return Constants.EOF;
-}
-while (_valueIterator.hasNext() && _currentDocId < _endDocId) {
-  _currentDocId = _currentDocId + 1;
+while (_nextDocId < _numDocs) {
+  int nextDocId = _nextDocId++;
   _numEntriesScanned++;
-  if (_valueMatcher.doesCurrentEntryMatch(_valueIterator)) {
-return _currentDocId;
+  if (_valueMatcher.doesNextValueMatch()) {
+return nextDocId;
   }
 }
-_currentDo

[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432322094



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/BitmapBasedFilterOperator.java
##
@@ -25,62 +25,73 @@
 import org.apache.pinot.core.operator.filter.predicate.PredicateEvaluator;
 import org.apache.pinot.core.segment.index.readers.InvertedIndexReader;
 import org.roaringbitmap.buffer.ImmutableRoaringBitmap;
+import org.roaringbitmap.buffer.MutableRoaringBitmap;
 
 
+@SuppressWarnings("rawtypes")
 public class BitmapBasedFilterOperator extends BaseFilterOperator {
   private static final String OPERATOR_NAME = "BitmapBasedFilterOperator";
 
   private final PredicateEvaluator _predicateEvaluator;
-  private final DataSource _dataSource;
-  private final ImmutableRoaringBitmap[] _bitmaps;
-  private final int _startDocId;
-  // TODO: change it to exclusive
-  // Inclusive
-  private final int _endDocId;
+  private final InvertedIndexReader _invertedIndexReader;
+  private final ImmutableRoaringBitmap _docIds;
   private final boolean _exclusive;
+  private final int _numDocs;
 
-  BitmapBasedFilterOperator(PredicateEvaluator predicateEvaluator, DataSource 
dataSource, int startDocId,
-  int endDocId) {
-// NOTE:
-// Predicate that is always evaluated as true or false should not be 
passed into the BitmapBasedFilterOperator for
-// performance concern.
-// If predicate is always evaluated as true, use MatchAllFilterOperator; 
if predicate is always evaluated as false,
-// use EmptyFilterOperator.
-Preconditions.checkArgument(!predicateEvaluator.isAlwaysTrue() && 
!predicateEvaluator.isAlwaysFalse());
-
+  BitmapBasedFilterOperator(PredicateEvaluator predicateEvaluator, DataSource 
dataSource, int numDocs) {
 _predicateEvaluator = predicateEvaluator;
-_dataSource = dataSource;
-_bitmaps = null;
-_startDocId = startDocId;
-_endDocId = endDocId;
+_invertedIndexReader = dataSource.getInvertedIndex();
+_docIds = null;
 _exclusive = predicateEvaluator.isExclusive();
+_numDocs = numDocs;
   }
 
-  public BitmapBasedFilterOperator(ImmutableRoaringBitmap[] bitmaps, int 
startDocId, int endDocId, boolean exclusive) {
+  public BitmapBasedFilterOperator(ImmutableRoaringBitmap docIds, boolean 
exclusive, int numDocs) {
 _predicateEvaluator = null;
-_dataSource = null;
-_bitmaps = bitmaps;
-_startDocId = startDocId;
-_endDocId = endDocId;
+_invertedIndexReader = null;
+_docIds = docIds;
 _exclusive = exclusive;
+_numDocs = numDocs;
   }
 
   @Override
   protected FilterBlock getNextBlock() {
-if (_bitmaps != null) {
-  return new FilterBlock(new BitmapDocIdSet(_bitmaps, _startDocId, 
_endDocId, _exclusive));
+if (_docIds != null) {
+  if (_exclusive) {
+return new FilterBlock(new 
BitmapDocIdSet(ImmutableRoaringBitmap.flip(_docIds, 0L, _numDocs), _numDocs));
+  } else {
+return new FilterBlock(new BitmapDocIdSet(_docIds, _numDocs));
+  }
 }
 
 int[] dictIds = _exclusive ? _predicateEvaluator.getNonMatchingDictIds() : 
_predicateEvaluator.getMatchingDictIds();

Review comment:
   Is it possible to handle NOT_IN, NEQ exactly once?
   We checked for `_exclusive` and accordingly get non-matching dictIds or 
matching dictIds based on whether it is true or false. So the predicate is 
already evaluated correctly. Now why can't we just work on the docIds for 
these dictIds?
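The exclusive-predicate path in the diff above (`ImmutableRoaringBitmap.flip(_docIds, 0L, _numDocs)`) can be sketched with `java.util.BitSet` in place of RoaringBitmap. The class and method names are assumptions for illustration:

```java
import java.util.BitSet;

// Sketch of exclusive-predicate handling: for NOT_IN / NEQ, collect the doc
// ids of the NON-matching values and flip them over [0, numDocs) to get the
// matching set, mirroring ImmutableRoaringBitmap.flip(docIds, 0, numDocs).
class ExclusivePredicateSketch {
    static BitSet evaluate(BitSet docIds, int numDocs, boolean exclusive) {
        BitSet result = (BitSet) docIds.clone();
        if (exclusive) {
            result.flip(0, numDocs); // complement within the doc-id range
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet docIds = new BitSet();
        docIds.set(1);
        docIds.set(3);
        // NOT_IN: docs 1 and 3 hold the excluded values, so 0, 2, 4 match.
        System.out.println(ExclusivePredicateSketch.evaluate(docIds, 5, true)); // {0, 2, 4}
    }
}
```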








[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


siddharthteotia commented on a change in pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#discussion_r432321846



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/SortedIndexBasedFilterOperator.java
##
@@ -0,0 +1,145 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator.filter;
+
+import com.google.common.base.Preconditions;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import org.apache.pinot.common.utils.Pairs.IntPair;
+import org.apache.pinot.core.common.DataSource;
+import org.apache.pinot.core.io.reader.impl.v1.SortedIndexReader;
+import org.apache.pinot.core.operator.blocks.FilterBlock;
+import org.apache.pinot.core.operator.docidsets.SortedDocIdSet;
+import org.apache.pinot.core.operator.filter.predicate.PredicateEvaluator;
+import 
org.apache.pinot.core.operator.filter.predicate.RangePredicateEvaluatorFactory.OfflineDictionaryBasedRangePredicateEvaluator;
+
+
+@SuppressWarnings("rawtypes")
+public class SortedIndexBasedFilterOperator extends BaseFilterOperator {
+  private static final String OPERATOR_NAME = "SortedIndexBasedFilterOperator";
+
+  private final PredicateEvaluator _predicateEvaluator;
+  private final SortedIndexReader _sortedIndexReader;
+  private final int _numDocs;
+
+  SortedIndexBasedFilterOperator(PredicateEvaluator predicateEvaluator, 
DataSource dataSource, int numDocs) {
+_predicateEvaluator = predicateEvaluator;
+_sortedIndexReader = (SortedIndexReader) dataSource.getInvertedIndex();
+_numDocs = numDocs;
+  }
+
+  @Override
+  protected FilterBlock getNextBlock() {
+// At this point, we need to create a list of matching docId ranges. There 
are two kinds of operators:
+//
+// - "Additive" operators, such as EQ, IN and RANGE build up a list of 
ranges and merge overlapping/adjacent ones,
+//   clipping the ranges to [startDocId; endDocId]
+//
+// - "Subtractive" operators, such as NEQ and NOT IN build up a list of 
ranges that do not match and build a list of
+//   matching intervals by subtracting a list of non-matching intervals 
from the given range of
+//   [startDocId; endDocId]
+//
+// For now, we don't look at the cardinality of the column's dictionary, 
although we should do that if someone

Review comment:
   (nit): should just be "cardinality of the column" (since the size of the 
dictionary is equal to the cardinality)
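The "subtractive" case described in the quoted comment (NEQ / NOT IN on a sorted column) amounts to subtracting a list of non-matching doc-id ranges from the full range. A sketch under the assumption that ranges are `int[]{startInclusive, endExclusive}` (Pinot uses `IntPair`, which this does not replicate):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the subtractive-operator logic: given sorted, non-overlapping
// NON-matching doc-id ranges, produce the matching ranges over [0, numDocs).
class SubtractiveRanges {
    static List<int[]> subtract(List<int[]> nonMatching, int numDocs) {
        List<int[]> matching = new ArrayList<>();
        int cursor = 0;
        for (int[] hole : nonMatching) {
            if (hole[0] > cursor) {
                matching.add(new int[]{cursor, hole[0]}); // gap before this hole matches
            }
            cursor = Math.max(cursor, hole[1]);
        }
        if (cursor < numDocs) {
            matching.add(new int[]{cursor, numDocs});     // tail after the last hole
        }
        return matching;
    }

    public static void main(String[] args) {
        List<int[]> holes = new ArrayList<>();
        holes.add(new int[]{2, 4});
        holes.add(new int[]{7, 9});
        for (int[] r : SubtractiveRanges.subtract(holes, 10)) {
            System.out.println(r[0] + ".." + r[1]); // 0..2, 4..7, 9..10
        }
    }
}
```

The additive case (EQ, IN, RANGE) is the mirror image: the ranges themselves are collected and overlapping/adjacent ones are merged.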








[incubator-pinot] tag release-0.4.0-rc2 created (now 0ee9083)

2020-05-29 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a change to tag release-0.4.0-rc2
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 0ee9083  (commit)
No new revisions were added by this update.





[GitHub] [incubator-pinot] fx19880617 commented on a change in pull request #5461: Adding Support for SQL CASE Statement

2020-05-29 Thread GitBox


fx19880617 commented on a change in pull request #5461:
URL: https://github.com/apache/incubator-pinot/pull/5461#discussion_r432313174



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/BinaryOperatorTransformFunction.java
##
@@ -0,0 +1,114 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator.transform.function;
+
+import java.util.List;
+import java.util.Map;
+import org.apache.pinot.core.common.DataSource;
+import org.apache.pinot.core.operator.blocks.ProjectionBlock;
+import org.apache.pinot.core.operator.transform.TransformResultMetadata;
+import org.apache.pinot.core.plan.DocIdSetPlanNode;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.utils.ByteArray;
+
+
+/**
+ * BinaryOperatorTransformFunction abstracts common functions for binary operators (=, !=, >=, >, <=, <).
+ * The results are in boolean format and stored as an integer array where 1 represents true and 0 represents false.
+ *
+ */
+public abstract class BinaryOperatorTransformFunction extends 
BaseTransformFunction {
+
+  protected TransformFunction _leftTransformFunction;
+  protected TransformFunction _rightTransformFunction;
+  protected int[] _results;
+
+  @Override
+  public void init(List<TransformFunction> arguments, Map<String, DataSource> dataSourceMap) {
+// Check that there are exactly 2 arguments
+if (arguments.size() != 2) {

Review comment:
   Added data type check.
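The boolean-as-int result convention from the javadoc in the diff above (an integer array with 1 for true and 0 for false) can be sketched for one concrete operator. The class name and the plain-double comparison are illustrative assumptions:

```java
import java.util.Arrays;

// Hypothetical sketch of how a binary operator fills its result array:
// 1 where the comparison holds, 0 where it does not.
class GreaterThanSketch {
    static int[] evaluate(double[] leftValues, double[] rightValues) {
        int[] results = new int[leftValues.length];
        for (int i = 0; i < leftValues.length; i++) {
            results[i] = leftValues[i] > rightValues[i] ? 1 : 0; // boolean encoded as int
        }
        return results;
    }

    public static void main(String[] args) {
        int[] r = GreaterThanSketch.evaluate(new double[]{3, 1, 2}, new double[]{2, 2, 2});
        System.out.println(Arrays.toString(r)); // [1, 0, 0]
    }
}
```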
   








[incubator-pinot] branch support_case_when_statement updated (5667776 -> 385a8b8)

2020-05-29 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch support_case_when_statement
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


 discard 5667776  Adding transform function support for case-when-else
 add 385a8b8  Adding transform function support for case-when-else

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (5667776)
\
 N -- N -- N   refs/heads/support_case_when_statement (385a8b8)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

No new revisions were added by this update.

Summary of changes:
 .../function/BinaryOperatorTransformFunction.java  | 246 +++--
 1 file changed, 226 insertions(+), 20 deletions(-)





[GitHub] [incubator-pinot] fx19880617 commented on a change in pull request #5461: Adding Support for SQL CASE Statement

2020-05-29 Thread GitBox


fx19880617 commented on a change in pull request #5461:
URL: https://github.com/apache/incubator-pinot/pull/5461#discussion_r432305059



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/function/BinaryOperatorTransformFunction.java
##
@@ -0,0 +1,114 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.operator.transform.function;
+
+import java.util.List;
+import java.util.Map;
+import org.apache.pinot.core.common.DataSource;
+import org.apache.pinot.core.operator.blocks.ProjectionBlock;
+import org.apache.pinot.core.operator.transform.TransformResultMetadata;
+import org.apache.pinot.core.plan.DocIdSetPlanNode;
+import org.apache.pinot.spi.data.FieldSpec;
+import org.apache.pinot.spi.utils.ByteArray;
+
+
+/**
+ * BinaryOperatorTransformFunction abstracts common functions for binary operators (=, !=, >=, >, <=, <).
+ * The results are in boolean format and stored as an integer array where 1 represents true and 0 represents false.
+ *
+ */
+public abstract class BinaryOperatorTransformFunction extends 
BaseTransformFunction {
+
+  protected TransformFunction _leftTransformFunction;
+  protected TransformFunction _rightTransformFunction;
+  protected int[] _results;
+
+  @Override
+  public void init(List<TransformFunction> arguments, Map<String, DataSource> dataSourceMap) {
+// Check that there are exactly 2 arguments
+if (arguments.size() != 2) {
+  throw new IllegalArgumentException("Exactly 2 arguments are required for binary operator transform function");
+}
+_leftTransformFunction = arguments.get(0);
+_rightTransformFunction = arguments.get(1);
+  }
+
+  @Override
+  public TransformResultMetadata getResultMetadata() {
+return INT_SV_NO_DICTIONARY_METADATA;
+  }
+
+  protected void fillResultArray(ProjectionBlock projectionBlock) {
+if (_results == null) {
+  _results = new int[DocIdSetPlanNode.MAX_DOC_PER_CALL];
+}
+FieldSpec.DataType dataType = 
_leftTransformFunction.getResultMetadata().getDataType();

Review comment:
   Yes.
   
   Right now I will also try to parse the String to a BigDecimal and then do 
the comparison, since LiteralTransformFunction's result type is always String.
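The approach described here (parse the String literal to a BigDecimal so comparisons are numeric, not lexicographic) can be sketched as follows; the helper class and method are hypothetical:

```java
import java.math.BigDecimal;

// Sketch: a literal arrives as a String, so parse both sides to BigDecimal.
// Numerically, "10" and 10.0 then compare as equal, which a plain String
// comparison would not give.
class LiteralComparisonSketch {
    static int compare(String literal, double columnValue) {
        // compareTo is value-based: BigDecimal("10").compareTo(BigDecimal("10.0")) == 0
        return new BigDecimal(literal).compareTo(BigDecimal.valueOf(columnValue));
    }

    public static void main(String[] args) {
        System.out.println(LiteralComparisonSketch.compare("10", 10.0));   // 0  (equal)
        System.out.println(LiteralComparisonSketch.compare("10.5", 10.0)); // 1  (literal greater)
    }
}
```

Note that `BigDecimal.equals` is scale-sensitive, which is why `compareTo` is the right call for this kind of predicate evaluation.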
   








[GitHub] [incubator-pinot] chenboat edited a comment on pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


chenboat edited a comment on pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#issuecomment-635802473


   > @chenboat There are no newly added major classes. Most of the changes are 
making the classes compatible with the filter interface change, so it is very 
hard to break them into multiple PRs. I can make the removed interface in 
BlockValIterator a separate PR as it is independent of the filtering.
   > This PR is the first step of re-structuring the filtering in Pinot, so for 
historical reasons the name for some interfaces (e.g. BlockDocIdSet) might be 
confusing. I wouldn't spend too much time renaming and documenting them because 
they are going to be re-structured in the following steps.
   
   Thanks for reducing the number of changed files. Can you update the summary 
as well? I also realized some new classes are just renaming of previously 
existing classes -- my bad.

   I still think there is room for breaking up this PR. Based on your summary 
there are multiple things going on in this PR:
> 1. Uniformed the behavior of all filter-related classes to bound the 
return docIds with numDocs
> 2. Simplified the logic of AND/OR handling
> 3. Pushed down ... 
   
   Can we put the 3 items in 3 PRs? I found there is non-trivial code 
refactoring (e.g., 
pinot-core/src/main/java/org/apache/pinot/core/plan/FilterPlanNode.java) mixed 
with interface changes in this PR. Can we separate these into smaller PRs? 
   






[GitHub] [incubator-pinot] chenboat edited a comment on pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


chenboat edited a comment on pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#issuecomment-635802473


   > @chenboat There are no newly added major classes. Most of the changes are 
making the classes compatible with the filter interface change, so it is very 
hard to break them into multiple PRs. I can make the removed interface in 
BlockValIterator a separate PR as it is independent of the filtering.
   > This PR is the first step of re-structuring the filtering in Pinot, so for 
historical reasons the name for some interfaces (e.g. BlockDocIdSet) might be 
confusing. I wouldn't spend too much time renaming and documenting them because 
they are going to be re-structured in the following steps.
   
   Thanks for reducing the number of changed files. Can you update the summary 
as well? I also realized some new classes are just renaming of previously 
existing classes -- my bad.

   I still think there is room for breaking up this PR. Based on your summary 
there are multiple things going on in this PR:
> 1. Uniformed the behavior of all filter-related classes to bound the 
return docIds with numDocs
> 2. Simplified the logic of AND/OR handling
> 3. Pushed down ... 
   
   Can we put the 3 items in 3 PRs? I found there is non-trivial code 
refactoring (e.g., 
pinot-core/src/main/java/org/apache/pinot/core/plan/FilterPlanNode.java) mixed 
with interface changes in this PR. Can we separate these into smaller PRs? 
   






[GitHub] [incubator-pinot] chenboat commented on pull request #5444: Enhance and simplify the filtering

2020-05-29 Thread GitBox


chenboat commented on pull request #5444:
URL: https://github.com/apache/incubator-pinot/pull/5444#issuecomment-635802473


   > @chenboat There are no newly added major classes. Most of the changes are 
making the classes compatible with the filter interface change, so it is very 
hard to break them into multiple PRs. I can make the removed interface in 
BlockValIterator a separate PR as it is independent of the filtering.
   > This PR is the first step of re-structuring the filtering in Pinot, so for 
historical reasons the name for some interfaces (e.g. BlockDocIdSet) might be 
confusing. I wouldn't spend too much time renaming and documenting them because 
they are going to be re-structured in the following steps.
   
   Thanks for reducing the number of changed files. Can you update the summary 
as well? I also realized some new classes are just renaming of previously 
existing classes -- my bad.

   I still think there is room for breaking up this PR. Based on your summary 
there are multiple things going on in this PR:
> 1. Uniformed the behavior of all filter-related classes to bound the 
return docIds with numDocs
> 2. Simplified the logic of AND/OR handling
> 3. Pushed down ...  
   Can we put the 3 items in 3 PRs? I found there is non-trivial code 
refactoring (e.g., 
pinot-core/src/main/java/org/apache/pinot/core/plan/FilterPlanNode.java) mixed 
with interface changes in this PR. Can we separate these into smaller PRs? 
   


