[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #6093: Use star-tree index when filteres not part of index match all rows of a segment

2020-10-05 Thread GitBox


Jackie-Jiang commented on issue #6093:
URL: 
https://github.com/apache/incubator-pinot/issues/6093#issuecomment-703984609


   Enhance star-tree to skip matching-all predicate on non-star-tree dimension: 
#6109 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #6109: Enhance star-tree to skip matching-all predicate on non-star-tree dimension

2020-10-05 Thread GitBox


Jackie-Jiang opened a new pull request #6109:
URL: https://github.com/apache/incubator-pinot/pull/6109


   ## Description
   For #6093 
   
   Currently in order to use the star-tree index, all the predicates must be 
applied to the star-tree dimensions. This could limit the usage of star-tree 
when a predicates is not applied to the star-tree dimensions, but can be 
skipped because it matches all the records. This is especially common for the 
implicit predicate applied to time column for the hybrid tables.
   This PR enhances the planning phase for star-tree to skip the predicates 
that matches all the records, so that star-tree can be applied to the use cases 
described above.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch enhance_star_tree created (now 6519f22)

2020-10-05 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a change to branch enhance_star_tree
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 6519f22  Enhance star-tree to skip matching-all predicate on 
non-star-tree dimension

This branch includes the following new commits:

 new 6519f22  Enhance star-tree to skip matching-all predicate on 
non-star-tree dimension

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] 01/01: Enhance star-tree to skip matching-all predicate on non-star-tree dimension

2020-10-05 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a commit to branch enhance_star_tree
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 6519f22df04d8174715530f16c101ae86841024b
Author: Xiaotian (Jackie) Jiang 
AuthorDate: Mon Oct 5 18:44:09 2020 -0700

Enhance star-tree to skip matching-all predicate on non-star-tree dimension
---
 .../plan/AggregationGroupByOrderByPlanNode.java|  34 ++---
 .../core/plan/AggregationGroupByPlanNode.java  |  34 ++---
 .../pinot/core/plan/AggregationPlanNode.java   |  36 ++---
 .../apache/pinot/core/startree/StarTreeUtils.java  | 147 +++--
 .../startree/operator/StarTreeFilterOperator.java  | 101 +++---
 .../startree/plan/StarTreeDocIdSetPlanNode.java|   8 +-
 .../core/startree/plan/StarTreeFilterPlanNode.java |  13 +-
 .../startree/plan/StarTreeProjectionPlanNode.java  |  10 +-
 .../startree/plan/StarTreeTransformPlanNode.java   |   9 +-
 .../pinot/core/startree/v2/BaseStarTreeV2Test.java |  30 ++---
 .../tests/OfflineClusterIntegrationTest.java   |  12 ++
 11 files changed, 210 insertions(+), 224 deletions(-)

diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/plan/AggregationGroupByOrderByPlanNode.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/plan/AggregationGroupByOrderByPlanNode.java
index 5cb94bd..892a96d 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/plan/AggregationGroupByOrderByPlanNode.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/plan/AggregationGroupByOrderByPlanNode.java
@@ -19,13 +19,14 @@
 package org.apache.pinot.core.plan;
 
 import java.util.List;
+import java.util.Map;
 import java.util.Set;
 import org.apache.pinot.core.indexsegment.IndexSegment;
+import org.apache.pinot.core.operator.filter.predicate.PredicateEvaluator;
 import org.apache.pinot.core.operator.query.AggregationGroupByOrderByOperator;
 import org.apache.pinot.core.query.aggregation.function.AggregationFunction;
 import 
org.apache.pinot.core.query.aggregation.function.AggregationFunctionUtils;
 import org.apache.pinot.core.query.request.context.ExpressionContext;
-import org.apache.pinot.core.query.request.context.FilterContext;
 import org.apache.pinot.core.query.request.context.QueryContext;
 import org.apache.pinot.core.startree.StarTreeUtils;
 import org.apache.pinot.core.startree.plan.StarTreeTransformPlanNode;
@@ -59,32 +60,21 @@ public class AggregationGroupByOrderByPlanNode implements 
PlanNode {
 _groupByExpressions = groupByExpressions.toArray(new ExpressionContext[0]);
 
 List starTrees = indexSegment.getStarTrees();
-if (starTrees != null) {
-  if (!StarTreeUtils.isStarTreeDisabled(queryContext)) {
-int numAggregationFunctions = _aggregationFunctions.length;
-AggregationFunctionColumnPair[] aggregationFunctionColumnPairs =
-new AggregationFunctionColumnPair[numAggregationFunctions];
-boolean hasUnsupportedAggregationFunction = false;
-for (int i = 0; i < numAggregationFunctions; i++) {
-  AggregationFunctionColumnPair aggregationFunctionColumnPair =
-  
AggregationFunctionUtils.getAggregationFunctionColumnPair(_aggregationFunctions[i]);
-  if (aggregationFunctionColumnPair != null) {
-aggregationFunctionColumnPairs[i] = aggregationFunctionColumnPair;
-  } else {
-hasUnsupportedAggregationFunction = true;
-break;
-  }
-}
-if (!hasUnsupportedAggregationFunction) {
-  FilterContext filter = queryContext.getFilter();
+if (starTrees != null && !StarTreeUtils.isStarTreeDisabled(queryContext)) {
+  AggregationFunctionColumnPair[] aggregationFunctionColumnPairs =
+  StarTreeUtils.extractAggregationFunctionPairs(_aggregationFunctions);
+  if (aggregationFunctionColumnPairs != null) {
+Map> predicateEvaluatorsMap =
+StarTreeUtils.extractPredicateEvaluatorsMap(indexSegment, 
queryContext.getFilter());
+if (predicateEvaluatorsMap != null) {
   for (StarTreeV2 starTreeV2 : starTrees) {
 if (StarTreeUtils
 .isFitForStarTree(starTreeV2.getMetadata(), 
aggregationFunctionColumnPairs, _groupByExpressions,
-filter)) {
+predicateEvaluatorsMap.keySet())) {
   _transformPlanNode = null;
   _starTreeTransformPlanNode =
-  new StarTreeTransformPlanNode(starTreeV2, 
aggregationFunctionColumnPairs, _groupByExpressions, filter,
-  queryContext.getDebugOptions());
+  new StarTreeTransformPlanNode(starTreeV2, 
aggregationFunctionColumnPairs, _groupByExpressions,
+  predicateEvaluatorsMap, queryContext.getDebugOptions());
   return;
 }
   }
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/plan/AggregationGroupByPlanNode.java
 

[GitHub] [incubator-pinot] Jackie-Jiang closed issue #6092: Regenerate star-tree index when the index config is changed on reloads

2020-10-05 Thread GitBox


Jackie-Jiang closed issue #6092:
URL: https://github.com/apache/incubator-pinot/issues/6092


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (be99d78 -> b658925)

2020-10-05 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from be99d78  Add the primary key reading from the GenericRow (#6102)
 add b658925  Allow modifying/removing existing star-trees during segment 
reload (#6100)

No new revisions were added by this update.

Summary of changes:
 .../segment/index/loader/IndexLoadingConfig.java   |  18 ++--
 .../segment/index/loader/SegmentPreProcessor.java  |  41 ++--
 .../pinot/core/startree/StarTreeBuilderUtils.java  |  27 +
 .../apache/pinot/core/startree/StarTreeUtils.java  |  62 +++
 .../core/startree/v2/StarTreeV2Constants.java  |   5 +-
 .../pinot/core/startree/v2/StarTreeV2Metadata.java |  35 ++-
 .../startree/v2/builder/MultipleTreesBuilder.java  |  36 ---
 .../v2/builder/StarTreeV2BuilderConfig.java|  96 +++--
 .../MultiNodesOfflineClusterIntegrationTest.java   |  15 +++
 .../tests/OfflineClusterIntegrationTest.java   | 114 ++---
 .../spi/config/table/StarTreeIndexConfig.java  |   7 +-
 11 files changed, 320 insertions(+), 136 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (24147dd -> be99d78)

2020-10-05 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 24147dd  [Issue 6068] Fixing the calls to Helix to throw exception if 
zk conne… (#6069)
 add be99d78  Add the primary key reading from the GenericRow (#6102)

No new revisions were added by this update.

Summary of changes:
 .../inputformat/csv/CSVRecordReaderTest.java   | 15 ++---
 .../inputformat/json/JSONRecordReaderTest.java | 13 +---
 .../protobuf/ProtoBufRecordReaderTest.java |  5 +--
 .../apache/pinot/spi/data/readers/GenericRow.java  | 10 ++
 .../Plugin.java => data/readers/PrimaryKey.java}   | 34 +++-
 .../spi/data/readers/AbstractRecordReaderTest.java | 36 ++
 6 files changed, 83 insertions(+), 30 deletions(-)
 copy pinot-spi/src/main/java/org/apache/pinot/spi/{plugin/Plugin.java => 
data/readers/PrimaryKey.java} (60%)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (24147dd -> be99d78)

2020-10-05 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 24147dd  [Issue 6068] Fixing the calls to Helix to throw exception if 
zk conne… (#6069)
 add be99d78  Add the primary key reading from the GenericRow (#6102)

No new revisions were added by this update.

Summary of changes:
 .../inputformat/csv/CSVRecordReaderTest.java   | 15 ++---
 .../inputformat/json/JSONRecordReaderTest.java | 13 +---
 .../protobuf/ProtoBufRecordReaderTest.java |  5 +--
 .../apache/pinot/spi/data/readers/GenericRow.java  | 10 ++
 .../Plugin.java => data/readers/PrimaryKey.java}   | 34 +++-
 .../spi/data/readers/AbstractRecordReaderTest.java | 36 ++
 6 files changed, 83 insertions(+), 30 deletions(-)
 copy pinot-spi/src/main/java/org/apache/pinot/spi/{plugin/Plugin.java => 
data/readers/PrimaryKey.java} (60%)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #6100: Allow modifying/removing existing star-trees during segment reload

2020-10-05 Thread GitBox


Jackie-Jiang merged pull request #6100:
URL: https://github.com/apache/incubator-pinot/pull/6100


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang removed a comment on pull request #6100: Allow modifying/removing existing star-trees during segment reload

2020-10-05 Thread GitBox


Jackie-Jiang removed a comment on pull request #6100:
URL: https://github.com/apache/incubator-pinot/pull/6100#issuecomment-703974794


   For #6092 
   Currently we allow generating new star-trees during the segment reload when 
there is no existing ones.
   This PR adds the support of modifying or removing the existing star-trees 
during segment reload when it detects changes for the star-tree index config.
   
   Other enhancement:
   - Deduplicate the star-tree builder config to avoid generating the same 
star-tree multiple times when user put multiple identical star-tree index 
configs, or the star-tree index config is the same as the default star-tree.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #6100: Allow modifying/removing existing star-trees during segment reload

2020-10-05 Thread GitBox


Jackie-Jiang commented on pull request #6100:
URL: https://github.com/apache/incubator-pinot/pull/6100#issuecomment-703974794


   For #6092 
   Currently we allow generating new star-trees during the segment reload when 
there is no existing ones.
   This PR adds the support of modifying or removing the existing star-trees 
during segment reload when it detects changes for the star-tree index config.
   
   Other enhancement:
   - Deduplicate the star-tree builder config to avoid generating the same 
star-tree multiple times when user put multiple identical star-tree index 
configs, or the star-tree index config is the same as the default star-tree.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #6102: Add the primary key reading from the GenericRow

2020-10-05 Thread GitBox


Jackie-Jiang merged pull request #6102:
URL: https://github.com/apache/incubator-pinot/pull/6102


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] yupeng9 commented on pull request #6102: Add the primary key reading from the GenericRow

2020-10-05 Thread GitBox


yupeng9 commented on pull request #6102:
URL: https://github.com/apache/incubator-pinot/pull/6102#issuecomment-703956599


   @Jackie-Jiang done. Thanks for taking another look



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu merged pull request #6069: [Issue 6068] Fixing the calls to Helix to throw exception if zk conne…

2020-10-05 Thread GitBox


mcvsubbu merged pull request #6069:
URL: https://github.com/apache/incubator-pinot/pull/6069


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated: [Issue 6068] Fixing the calls to Helix to throw exception if zk conne… (#6069)

2020-10-05 Thread mcvsubbu
This is an automated email from the ASF dual-hosted git repository.

mcvsubbu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new 24147dd  [Issue 6068] Fixing the calls to Helix to throw exception if 
zk conne… (#6069)
24147dd is described below

commit 24147dd5b7dc188e01af283bd0d025a6cec3527b
Author: Subbu Subramaniam 
AuthorDate: Mon Oct 5 17:07:53 2020 -0700

[Issue 6068] Fixing the calls to Helix to throw exception if zk conne… 
(#6069)

* [Issue 6068] Fixing the calls to Helix to throw exception if zk 
connection is broken

See Issue #6068

* Update zk metadata APIs to use timeout zk timeout

These APIs will ensure that if there is a zk disconnect
we will get an exception after a minimal number of retries.
We can change this to retry once and implement a backoff retry
if needed later on.

Note that the underlying helix library ends up calling the previous
API (as yet), but we will upgrade to a helix version soon that actually
implements these

* Fixing a unit test mock call

* Updating more calls to getChildren
---
 .../java/org/apache/pinot/broker/routing/RoutingManager.java   |  3 ++-
 .../org/apache/pinot/common/metadata/ZKMetadataProvider.java   | 10 +++---
 .../java/org/apache/pinot/common/utils/CommonConstants.java|  3 +++
 .../java/org/apache/pinot/common/utils/helix/HelixHelper.java  |  2 +-
 .../pinot/controller/helix/core/PinotHelixResourceManager.java |  2 +-
 .../pinot/controller/helix/core/rebalance/TableRebalancer.java |  2 +-
 .../segment/OfflineReplicaGroupSegmentAssignmentTest.java  |  2 +-
 .../main/java/org/apache/pinot/tools/UpdateSegmentState.java   |  4 +++-
 8 files changed, 19 insertions(+), 9 deletions(-)

diff --git 
a/pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
 
b/pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
index 8fe3f52..940e40c 100644
--- 
a/pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
+++ 
b/pinot-broker/src/main/java/org/apache/pinot/broker/routing/RoutingManager.java
@@ -213,7 +213,8 @@ public class RoutingManager implements ClusterChangeHandler 
{
 long startTimeMs = System.currentTimeMillis();
 
 List instanceConfigZNRecords =
-_zkDataAccessor.getChildren(_instanceConfigsPath, null, 
AccessOption.PERSISTENT);
+_zkDataAccessor.getChildren(_instanceConfigsPath, null, 
AccessOption.PERSISTENT,
+CommonConstants.Helix.ZkClient.RETRY_COUNT, 
CommonConstants.Helix.ZkClient.RETRY_INTERVAL_MS);
 long fetchInstanceConfigsEndTimeMs = System.currentTimeMillis();
 
 // Calculate new enabled and disabled instances
diff --git 
a/pinot-common/src/main/java/org/apache/pinot/common/metadata/ZKMetadataProvider.java
 
b/pinot-common/src/main/java/org/apache/pinot/common/metadata/ZKMetadataProvider.java
index 8278777..af5a5d6 100644
--- 
a/pinot-common/src/main/java/org/apache/pinot/common/metadata/ZKMetadataProvider.java
+++ 
b/pinot-common/src/main/java/org/apache/pinot/common/metadata/ZKMetadataProvider.java
@@ -31,6 +31,7 @@ import 
org.apache.pinot.common.metadata.instance.InstanceZKMetadata;
 import org.apache.pinot.common.metadata.segment.LLCRealtimeSegmentZKMetadata;
 import org.apache.pinot.common.metadata.segment.OfflineSegmentZKMetadata;
 import org.apache.pinot.common.metadata.segment.RealtimeSegmentZKMetadata;
+import org.apache.pinot.common.utils.CommonConstants;
 import org.apache.pinot.common.utils.SchemaUtils;
 import org.apache.pinot.common.utils.SegmentName;
 import org.apache.pinot.common.utils.StringUtil;
@@ -305,7 +306,8 @@ public class ZKMetadataProvider {
   ZkHelixPropertyStore propertyStore, String tableName) {
 String offlineTableName = 
TableNameBuilder.OFFLINE.tableNameWithType(tableName);
 String parentPath = 
constructPropertyStorePathForResource(offlineTableName);
-List znRecords = propertyStore.getChildren(parentPath, null, 
AccessOption.PERSISTENT);
+List znRecords = propertyStore.getChildren(parentPath, null, 
AccessOption.PERSISTENT,
+CommonConstants.Helix.ZkClient.RETRY_COUNT, 
CommonConstants.Helix.ZkClient.RETRY_INTERVAL_MS);
 if (znRecords != null) {
   int numZNRecords = znRecords.size();
   List offlineSegmentZKMetadataList = new 
ArrayList<>(numZNRecords);
@@ -335,7 +337,8 @@ public class ZKMetadataProvider {
   ZkHelixPropertyStore propertyStore, String tableName) {
 String realtimeTableName = 
TableNameBuilder.REALTIME.tableNameWithType(tableName);
 String parentPath = 
constructPropertyStorePathForResource(realtimeTableName);
-List znRecords = propertyStore.getChildren(parentPath, null, 
AccessOption.PERSISTENT);
+List znRecords = propertyStore.getChildren(parentPath, null, 
AccessOption.PERSISTENT,
+

[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6102: Add the primary key reading from the GenericRow

2020-10-05 Thread GitBox


Jackie-Jiang commented on a change in pull request #6102:
URL: https://github.com/apache/incubator-pinot/pull/6102#discussion_r499935781



##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/GenericRow.java
##
@@ -123,6 +124,15 @@ public void putValue(String fieldName, @Nullable Object 
value) {
 _fieldToValueMap.put(fieldName, value);
   }
 
+  public PrimaryKey getPrimaryKey(List primaryKeyColumns) {
+Object[] values = new Object[primaryKeyColumns.size()];
+int size = primaryKeyColumns.size();
+for (int i = 0; i < size; i++) {

Review comment:
   (nit)
   ```suggestion
   int numPrimaryKeyColumns = primaryKeyColumns.size();
   Object[] values = new Object[numPrimaryKeyColumns];
   for (int i = 0; i < numPrimaryKeyColumns; i++) {
   ```

##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/PrimaryKey.java
##
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.data.readers;
+
+import java.util.Arrays;
+
+
+/**
+ * The primary key of a record. Note that the value used in the primary key 
must be single-value.
+ */
+public class PrimaryKey {
+  private final Object[] _values;
+
+  public PrimaryKey(Object[] fields) {
+_values = fields;

Review comment:
   (nit)
   ```suggestion
 public PrimaryKey(Object[] values) {
   _values = values;
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #6069: [Issue 6068] Fixing the calls to Helix to throw exception if zk conne…

2020-10-05 Thread GitBox


Jackie-Jiang commented on pull request #6069:
URL: https://github.com/apache/incubator-pinot/pull/6069#issuecomment-703951253


   @mcvsubbu The unexpected behavior I was referring to is that we explicitly 
set retry in the method, but Helix won't do the retry, and won't throw 
exception when the read fails. So I suggest not doing the explicit retry before 
upgrading the Helix version. But this is not critical. Feel free to merge



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] adriancole commented on a change in pull request #6039: WIP: ServiceManager ADD_TABLE role

2020-10-05 Thread GitBox


adriancole commented on a change in pull request #6039:
URL: https://github.com/apache/incubator-pinot/pull/6039#discussion_r499930870



##
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/ControllerStarter.java
##
@@ -350,6 +350,8 @@ private void setUpPinotController() {
 LOGGER.info("Starting Pinot Helix resource manager and connecting to 
Zookeeper");
 _helixResourceManager.start(_helixParticipantManager);
 
+// TODO: ideally we can block to do schema/table install here, as it is 
before the thing is listening..

Review comment:
   when a table is in the bootstrap roles for servicemanager, it can make 
sense that this is a part of bootstrap. Not just controller, but nothing 
ideally should listen until bootstrap is complete. ServiceManager is 
effectively a composite service.
   
   In talking also at length with @mayankshriv  and with your comment also, I 
can see this point is not understandable outside my head. The impact is more 
external orchestration, which imho is a bad experience. We are allowing a 
composite service, but deciding against taking advantage of it.
   
   Another point lost despite various attempts to convey it is that I don't 
mean to literally block here. I mean to add a callback iff servicemanager is in 
use implements it. otherwise a noop.
   
   If this above comment changes things, lemme know. Otherwise, I'm not really 
sure this PR is worth proceeding with as if we aren't taking these concerns 
internally, the external approach (shell scripts etc) seems more coherent..





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] adriancole commented on a change in pull request #6039: WIP: ServiceManager ADD_TABLE role

2020-10-05 Thread GitBox


adriancole commented on a change in pull request #6039:
URL: https://github.com/apache/incubator-pinot/pull/6039#discussion_r499929548



##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/services/ServiceRole.java
##
@@ -22,5 +22,5 @@
  * ServiceRole defines a role that Pinot Service could start/stop.
  */
 public enum ServiceRole {
-  CONTROLLER, BROKER, SERVER, MINION,
+  CONTROLLER, BROKER, SERVER, MINION, ADD_TABLE

Review comment:
   ok





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] adriancole commented on a change in pull request #6039: WIP: ServiceManager ADD_TABLE role

2020-10-05 Thread GitBox


adriancole commented on a change in pull request #6039:
URL: https://github.com/apache/incubator-pinot/pull/6039#discussion_r499929479



##
File path: 
pinot-tools/src/main/java/org/apache/pinot/tools/service/PinotServiceManager.java
##
@@ -92,6 +92,8 @@ public String startRole(ServiceRole role, Map 
properties)
 return startBroker(new PinotConfiguration(properties));
   case SERVER:
 return startServer(new PinotConfiguration(properties));
+  case ADD_TABLE:

Review comment:
   ServiceManagerRole perhaps? I currently don't have any other use case 
except setting up a table and prefer to not be too speculative.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


Jackie-Jiang merged pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated: Making pushType non-mandatory (#6107)

2020-10-05 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new 11ff74a  Making pushType non-mandatory (#6107)
11ff74a is described below

commit 11ff74a5af6cf2aaa7dd642775ef9553ee4ffe67
Author: icefury71 
AuthorDate: Mon Oct 5 16:35:27 2020 -0700

Making pushType non-mandatory (#6107)
---
 .../src/main/java/org/apache/pinot/core/util/TableConfigUtils.java  | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java 
b/pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
index bc23bb9..d793f34 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/util/TableConfigUtils.java
@@ -108,11 +108,7 @@ public final class TableConfigUtils {
 
 String segmentPushType = segmentsConfig.getSegmentPushType();
 // segmentPushType is not needed for Realtime table
-if (tableConfig.getTableType() == TableType.OFFLINE) {
-  if (segmentPushType == null) {
-throw new IllegalStateException(String.format("Table: %s, null push 
type", tableName));
-  }
-
+if (tableConfig.getTableType() == TableType.OFFLINE && segmentPushType != 
null && !segmentPushType.isEmpty()) {
   if (!segmentPushType.equalsIgnoreCase("REFRESH") && 
!segmentPushType.equalsIgnoreCase("APPEND")) {
 throw new IllegalStateException(String.format("Table: %s, invalid push 
type: %s", tableName, segmentPushType));
   }


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] yupeng9 commented on pull request #6102: Add the primary key reading from the GenericRow

2020-10-05 Thread GitBox


yupeng9 commented on pull request #6102:
URL: https://github.com/apache/incubator-pinot/pull/6102#issuecomment-703942976


   @Jackie-Jiang thanks for the review. Comments addressed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] icefury71 commented on pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


icefury71 commented on pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107#issuecomment-703941715


   Removed the default value based on this conversation. We're now back to the 
old behaviour where push type is not mandatory and no default value assigned. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] yupeng9 commented on a change in pull request #6102: Add the primary key reading from the GenericRow

2020-10-05 Thread GitBox


yupeng9 commented on a change in pull request #6102:
URL: https://github.com/apache/incubator-pinot/pull/6102#discussion_r499923780



##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/PrimaryKey.java
##
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.data.readers;
+
+import java.util.Arrays;
+
+
+/**
+ * The primary key of a record.

Review comment:
   Yes. Table config shall validate this.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] jackjlli commented on pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


jackjlli commented on pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107#issuecomment-703938464


   @icefury71 The 4th item I pasted above is the one we should be cautious 
about:
   
https://github.com/apache/incubator-pinot/blob/540853d786c9e4b4bb27bf4bbc6fa73837ff1bc3/pinot-core/src/main/java/org/apache/pinot/core/segment/name/NormalizedDateSegmentNameGenerator.java#L53
   
   The boolean value depends on whether the push type is shown as `APPEND` in 
the table config. If it's missing or it's REFRESH, (or it's mis-spelled), the 
behavior will be different. In this case, the missing pushType behavior is the 
same as `REFRESH` push type, which violates your default assumption.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on pull request #6069: [Issue 6068] Fixing the calls to Helix to throw exception if zk conne…

2020-10-05 Thread GitBox


mcvsubbu commented on pull request #6069:
URL: https://github.com/apache/incubator-pinot/pull/6069#issuecomment-703938266


   > @mcvsubbu We want both `getChildNames()` and `getChildren()` to have the 
same behavior, and should throw exception on failures. Since Helix does not 
have the correct APIs for `getChildren()` yet, I would suggest not changing 
`getChildren()` for now and wait until we upgrade to the new Helix version, or 
it will be very confusing for the developers because of the unexpected Helix 
behavior. Thoughts?
   
   Since the new API has not been implemented yet, there will not be any 
unexpected behavior at this time, right?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


mcvsubbu commented on pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107#issuecomment-703937589


   ah ok, I was thinking push-frequency



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] icefury71 commented on pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


icefury71 commented on pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107#issuecomment-703935277


   > @mcvsubbu @jackjlli @snleee The mandatory check is just introduced in 
#6073, and might break the existing use cases without the push type configured 
(correct me if I was wrong @icefury71 ). I think there are offline use cases 
within LinkedIn without the push type, which will break.
   > We cannot directly change the default behavior to use `APPEND` if the push 
type is not mandatory for the existing use cases.
   
   Yes, this was mostly to conserve the existing use cases that don't 
explicitly specify a push type. Hence I removed the mandate.
   
   However, I did choose to add a default value (APPEND) since it will break 
the retention manager otherwise. @Jackie-Jiang lemme know if there are any 
implications of this assumption ? As @mcvsubbu mentioned, most of the code 
places kinda assume APPEND semantics.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (93238c9 -> 2afea5c)

2020-10-05 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 93238c9  add upsert related configs (#6096)
 add 2afea5c  Fix missing segment count reporting for realtime llc segment 
(#6103)

No new revisions were added by this update.

Summary of changes:
 .../pinot/core/data/manager/realtime/RealtimeTableDataManager.java  | 2 ++
 1 file changed, 2 insertions(+)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] fx19880617 merged pull request #6103: Fix missing segment count reporting for realtime llc segment

2020-10-05 Thread GitBox


fx19880617 merged pull request #6103:
URL: https://github.com/apache/incubator-pinot/pull/6103


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


Jackie-Jiang commented on pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107#issuecomment-703932617


   @mcvsubbu @jackjlli @snleee The mandatory check is just introduced in #6073, 
and might break the existing use cases without the push type configured 
(correct me if I was wrong @icefury71 ). I think there are offline use cases 
within LinkedIn without the push type, which will break.
   We cannot directly change the default behavior to use `APPEND` if the push 
type is not mandatory for the existing use cases.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #6069: [Issue 6068] Fixing the calls to Helix to throw exception if zk conne…

2020-10-05 Thread GitBox


Jackie-Jiang commented on pull request #6069:
URL: https://github.com/apache/incubator-pinot/pull/6069#issuecomment-703929767


   @mcvsubbu We want both `getChildNames()` and `getChildren()` to have the 
same behavior, and should throw exception on failures. Since Helix does not 
have the correct APIs for `getChildren()` yet, I would suggest not changing 
`getChildren()` for now and wait until we upgrade to the new Helix version, or 
it will be very confusing for the developers because of the unexpected Helix 
behavior. Thoughts?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6105: scalar functions for array

2020-10-05 Thread GitBox


Jackie-Jiang commented on a change in pull request #6105:
URL: https://github.com/apache/incubator-pinot/pull/6105#discussion_r499909017



##
File path: 
pinot-common/src/main/java/org/apache/pinot/common/function/FunctionUtils.java
##
@@ -56,6 +61,12 @@ private FunctionUtils() {
 put(Double.class, PinotDataType.DOUBLE);
 put(String.class, PinotDataType.STRING);
 put(byte[].class, PinotDataType.BYTES);
+put(int[].class, PinotDataType.INTEGER_ARRAY);
+put(int.class, PinotDataType.INTEGER);

Review comment:
   Why do you need `int` here? The argument should always be `Object`

##
File path: 
pinot-common/src/main/java/org/apache/pinot/common/function/FunctionUtils.java
##
@@ -42,6 +42,11 @@ private FunctionUtils() {
 put(Double.class, PinotDataType.DOUBLE);
 put(String.class, PinotDataType.STRING);
 put(byte[].class, PinotDataType.BYTES);
+put(int[].class, PinotDataType.INTEGER_ARRAY);

Review comment:
   Can we keep the same order as the SV ones (int, long, float, double, 
string), same for other places

##
File path: 
pinot-common/src/main/java/org/apache/pinot/common/function/FunctionUtils.java
##
@@ -69,6 +80,12 @@ private FunctionUtils() {
 put(Double.class, DataType.DOUBLE);
 put(String.class, DataType.STRING);
 put(byte[].class, DataType.BYTES);
+put(int[].class, DataType.INT);
+put(String[].class, DataType.STRING);
+put(long[].class, DataType.LONG);
+put(float[].class, DataType.FLOAT);
+put(double[].class, DataType.DOUBLE);
+put(boolean.class, DataType.BOOLEAN);

Review comment:
   Don't add `boolean` here as `DataType.BOOLEAN` is not a valid internal 
`DataType` (we use `STRING` to represent boolean). All the unrecognized types 
will be handled as `STRING`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee commented on pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


snleee commented on pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107#issuecomment-703925780


   If we want to remove the mandatory check, we should adjust codes to align 
all the usage with falling back to the default behavior. We can make `APPEND` 
as the default behavior since it's the most common use case. Currently, the 
code explicitly checks the `APPEND` in most cases, so `null` value for push 
type will end up with the `REFRESH` behavior. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6046: Deep Extraction Support for ORC, Thrift, and ProtoBuf Records

2020-10-05 Thread GitBox


Jackie-Jiang commented on a change in pull request #6046:
URL: https://github.com/apache/incubator-pinot/pull/6046#discussion_r499902967



##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/BaseRecordExtractor.java
##
@@ -0,0 +1,174 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.data.readers;
+
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Map;
+import javax.annotation.Nullable;
+
+
+/**
+ * Base abstract class for extracting and converting the fields of various 
data formats into supported Pinot data types.
+ *
+ * @param  the format of the input record
+ */
+public abstract class BaseRecordExtractor implements RecordExtractor {
+
+  /**
+   * Converts the field value to either a single value (string, number, 
byte[]), multi value (Object[]) or a Map.
+   * Returns {@code null} if the value is an empty array/collection/map.
+   *
+   * Natively Pinot only understands single values and multi values.
+   * Map is useful only if some ingestion transform functions operates on it 
in the transformation layer.
+   */
+  @Nullable
+  public Object convert(Object value) {
+Object convertedValue;
+if (isInstanceOfMultiValue(value)) {
+  convertedValue = convertMultiValue(value);
+} else if (isInstanceOfMap(value)) {
+  convertedValue = convertMap(value);
+} else if (isInstanceOfRecord(value)) {
+  convertedValue = convertRecord(value);
+} else {
+  convertedValue = convertSingleValue(value);
+}
+return convertedValue;

Review comment:
   Suggest renaming some methods:
   ```suggestion
   if (isMultiValue(value)) {
 return convertMultiValue(value);
   } else if (isMap(value)) {
 return convertMap(value);
   } else if (isRecord(value)) {
 return convertRecord(value);
   } else {
 return convertSingleValue(value);
   }
   ```

##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/BaseRecordExtractor.java
##
@@ -0,0 +1,174 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.data.readers;
+
+import java.nio.ByteBuffer;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Map;
+import javax.annotation.Nullable;
+
+
+/**
+ * Base abstract class for extracting and converting the fields of various 
data formats into supported Pinot data types.
+ *
+ * @param  the format of the input record
+ */
+public abstract class BaseRecordExtractor implements RecordExtractor {
+
+  /**
+   * Converts the field value to either a single value (string, number, 
byte[]), multi value (Object[]) or a Map.
+   * Returns {@code null} if the value is an empty array/collection/map.
+   *
+   * Natively Pinot only understands single values and multi values.
+   * Map is useful only if some ingestion transform functions operates on it 
in the transformation layer.
+   */
+  @Nullable
+  public Object convert(Object value) {
+Object convertedValue;
+if (isInstanceOfMultiValue(value)) {
+  convertedValue = convertMultiValue(value);
+} else if (isInstanceOfMap(value)) {
+  convertedValue = convertMap(value);
+} else if (isInstanceOfRecord(value)) {
+  convertedValue = convertRecord(value);
+} else {
+  convertedValue = 

[GitHub] [incubator-pinot] suvodeep-pyne commented on pull request #6108: [TE] Datalayer refactor. Reorganizing Guice Module inside DaoProviderUtil

2020-10-05 Thread GitBox


suvodeep-pyne commented on pull request #6108:
URL: https://github.com/apache/incubator-pinot/pull/6108#issuecomment-703922813


   Yes. That is correct. This streamlines access to the datalayer classes which 
is currently behind a static interface wall.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] jackjlli commented on pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


jackjlli commented on pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107#issuecomment-703921970


   The config `segmentPushType` is needed at least in the following scenarios:
   1. Retention manager needs it to purge segments: 
https://github.com/apache/incubator-pinot/blob/9929dad2e804f525b8ace925e859e07c6fdee9da/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java#L104
   2. `TableRetentionValidator` gets called 
https://github.com/apache/incubator-pinot/blob/fb043dab7d5fef3ef726896d2f53e45a2fd69f22/pinot-controller/src/main/java/org/apache/pinot/controller/util/TableRetentionValidator.java#L112
   3. Segment creation Hadoop/Spark job requires it to validate the time 
column: 
https://github.com/apache/incubator-pinot/blob/fb043dab7d5fef3ef726896d2f53e45a2fd69f22/pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-hadoop/src/main/java/org/apache/pinot/hadoop/job/HadoopSegmentCreationJob.java#L145
   4. Segment creation job requires it to generate the segment name: 
https://github.com/apache/incubator-pinot/blob/540853d786c9e4b4bb27bf4bbc6fa73837ff1bc3/pinot-core/src/main/java/org/apache/pinot/core/segment/name/NormalizedDateSegmentNameGenerator.java#L48
   
   So if you remove the check, you'd also need to adjust those logic to avoid 
breaking the current behaviors.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6102: Add the primary key reading from the GenericRow

2020-10-05 Thread GitBox


Jackie-Jiang commented on a change in pull request #6102:
URL: https://github.com/apache/incubator-pinot/pull/6102#discussion_r499901568



##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/PrimaryKey.java
##
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.data.readers;
+
+import java.util.Arrays;
+
+
+/**
+ * The primary key of a record.
+ */
+public class PrimaryKey {
+  private final Object[] _fields;

Review comment:
   Suggest renaming it to `_values`

##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/GenericRow.java
##
@@ -123,6 +124,14 @@ public void putValue(String fieldName, @Nullable Object 
value) {
 _fieldToValueMap.put(fieldName, value);
   }
 
+  public PrimaryKey getPrimaryKey(List primaryKeyColumns) {
+Object[] fields = new Object[primaryKeyColumns.size()];

Review comment:
   Cache `primaryKeyColumns.size()` into a local variable

##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/PrimaryKey.java
##
@@ -0,0 +1,59 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.spi.data.readers;
+
+import java.util.Arrays;
+
+
+/**
+ * The primary key of a record.

Review comment:
   Add some javadoc stating that the values should be SV (MV values won't 
work for equals and hashcode)

##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/GenericRow.java
##
@@ -123,6 +124,14 @@ public void putValue(String fieldName, @Nullable Object 
value) {
 _fieldToValueMap.put(fieldName, value);
   }
 
+  public PrimaryKey getPrimaryKey(List primaryKeyColumns) {
+Object[] fields = new Object[primaryKeyColumns.size()];

Review comment:
   Suggest renaming it to `values`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6069: [Issue 6068] Fixing the calls to Helix to throw exception if zk conne…

2020-10-05 Thread GitBox


mcvsubbu commented on a change in pull request #6069:
URL: https://github.com/apache/incubator-pinot/pull/6069#discussion_r499900482



##
File path: 
pinot-common/src/main/java/org/apache/pinot/common/metadata/ZKMetadataProvider.java
##
@@ -305,7 +308,7 @@ public static Schema getTableSchema(@Nonnull 
ZkHelixPropertyStore prop
   ZkHelixPropertyStore propertyStore, String tableName) {
 String offlineTableName = 
TableNameBuilder.OFFLINE.tableNameWithType(tableName);
 String parentPath = 
constructPropertyStorePathForResource(offlineTableName);
-List znRecords = propertyStore.getChildren(parentPath, null, 
AccessOption.PERSISTENT);
+List znRecords = propertyStore.getChildren(parentPath, null, 
AccessOption.PERSISTENT, ZK_OP_RETRY_COUNT, ZK_OP_RETRY_INTERVAL_MS);

Review comment:
   We are working with the helix team to give us a new version with the API 
implemented. Since I took the trouble to trace all the APIs, I took the liberty 
to modify them so that when we upgrade to newer helix version, we have the 
right calls





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] yupeng9 commented on issue #5942: Better Table config validation

2020-10-05 Thread GitBox


yupeng9 commented on issue #5942:
URL: 
https://github.com/apache/incubator-pinot/issues/5942#issuecomment-703891914


   +1
   
   Another use case is the upcoming upsert. There are several expectations for 
upsert table like primary key in schema, replica group setup, routing by 
replicas group, partition config, ingestion via LLC etc



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] suvodeep-pyne opened a new pull request #6108: [TE] Datalayer refactor. Reorganizing Guice Module inside DaoProviderUtil

2020-10-05 Thread GitBox


suvodeep-pyne opened a new pull request #6108:
URL: https://github.com/apache/incubator-pinot/pull/6108


   Changes
   - Moved DataSourceModule outside to ThirdEyePersistenceModule
   - Refactored module logic to leverage Guice to inject dependencies
   - Making all services singletons on Guice level instead of manual handling
   - Moved unit test from `thirdeye-dashboard` to `thirdeye-pinot`
   
   No logic changes.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] npawar commented on pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


npawar commented on pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107#issuecomment-703889578


   Along with this, I think we also need to make APPEND the default? I remember 
there were some retention related issues.
   From retention manager
   ```
   String segmentPushType = validationConfig.getSegmentPushType();
   if (tableConfig.getTableType() == TableType.OFFLINE && 
!"APPEND".equalsIgnoreCase(segmentPushType)) {
 LOGGER.info("Segment push type is not APPEND for table: {}, skip", 
tableNameWithType);
 return;
   }
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] icefury71 opened a new pull request #6107: Making pushType non-mandatory

2020-10-05 Thread GitBox


icefury71 opened a new pull request #6107:
URL: https://github.com/apache/incubator-pinot/pull/6107


   ## Description
   As part of #5942 pushType was made mandatory for offline tables my mistake. 
Removing this mandate. 
   
   ## Upgrade Notes
   Does this PR prevent a zero down-time upgrade?
   No
   
   Does this PR fix a zero-downtime upgrade introduced earlier?
   No
   
   Does this PR otherwise need attention when creating release notes?
   No



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] npawar commented on a change in pull request #6105: scalar functions for array

2020-10-05 Thread GitBox


npawar commented on a change in pull request #6105:
URL: https://github.com/apache/incubator-pinot/pull/6105#discussion_r499821865



##
File path: 
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/ArrayFunctions.java
##
@@ -0,0 +1,133 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.function.scalar;
+
+import java.util.Arrays;
+import org.apache.commons.lang3.ArrayUtils;
+import org.apache.pinot.spi.annotations.ScalarFunction;
+
+/**
+ * Inbuilt Array Transformation Functions
+ * The functions can be used as UDFs in Query when added in the 
FunctionRegistry.
+ * @ScalarFunction annotation is used with each method for the registration
+ *
+ * Example usage:
+ *  SELECT reverseArray(array) FROM baseballStats LIMIT 10 
+ */
+public class ArrayFunctions {
+  private ArrayFunctions() {
+
+  }
+
+  /**
+   * @see ArrayUtils#reverse(Object[])
+   * @param value
+   * @return reversed input array in from end to start
+   */
+  @ScalarFunction
+  public static int[] reverseIntArray(int[] value) {

Review comment:
   for every function that we want to add, we'll have to always write 
fooLong, fooInt, fooDouble, fooFloat, fooString. In the presto-pinot connector 
we'll have to do additional mapping. Is there no way around that?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6069: [Issue 6068] Fixing the calls to Helix to throw exception if zk conne…

2020-10-05 Thread GitBox


Jackie-Jiang commented on a change in pull request #6069:
URL: https://github.com/apache/incubator-pinot/pull/6069#discussion_r499809705



##
File path: 
pinot-common/src/main/java/org/apache/pinot/common/metadata/ZKMetadataProvider.java
##
@@ -305,7 +308,7 @@ public static Schema getTableSchema(@Nonnull 
ZkHelixPropertyStore prop
   ZkHelixPropertyStore propertyStore, String tableName) {
 String offlineTableName = 
TableNameBuilder.OFFLINE.tableNameWithType(tableName);
 String parentPath = 
constructPropertyStorePathForResource(offlineTableName);
-List znRecords = propertyStore.getChildren(parentPath, null, 
AccessOption.PERSISTENT);
+List znRecords = propertyStore.getChildren(parentPath, null, 
AccessOption.PERSISTENT, ZK_OP_RETRY_COUNT, ZK_OP_RETRY_INTERVAL_MS);

Review comment:
   This won't take effect... Here is the Helix code for it (I would count 
it as a bug):
   ```
 @Override
 public List getChildren(String parentPath, List stats, int 
options, int retryCount,
 int retryInterval) throws HelixException {
   return getChildren(parentPath, stats, options);
 }
   ```
   I don't see an easy way to throw exception for this API unless we create the 
children paths manually.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6103: Fix missing segment count reporting for realtime llc segment

2020-10-05 Thread GitBox


Jackie-Jiang commented on a change in pull request #6103:
URL: https://github.com/apache/incubator-pinot/pull/6103#discussion_r499804758



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java
##
@@ -269,6 +270,7 @@ public void addSegment(String segmentName, TableConfig 
tableConfig, IndexLoading
   }
   _logger.info("Initialize RealtimeSegmentDataManager - " + segmentName);
   _segmentDataManagerMap.put(segmentName, manager);
+  _serverMetrics.addValueToTableGauge(_tableNameWithType, 
ServerGauge.SEGMENT_COUNT, 1L);

Review comment:
   Oh, yes you are right haha





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] deemoliu opened a new issue #6106: Support customized maxLength for fields in schemaBuilder

2020-10-05 Thread GitBox


deemoliu opened a new issue #6106:
URL: https://github.com/apache/incubator-pinot/issues/6106


   When construct a Pinot schema using 
[schemaBuilder](https://github.com/apache/incubator-pinot/blob/master/pinot-spi/src/main/java/org/apache/pinot/spi/data/Schema.java#L474),
 the maxLength of fields are always set to DEFAULT_MAX_LENGTH which is 512. 
Currently we need to tune "maxLength" in table schema manually.
   
   Would it be better to expose the maxLength configuration to schemaBuilder?
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] fx19880617 commented on a change in pull request #6103: Fix missing segment count reporting for realtime llc segment

2020-10-05 Thread GitBox


fx19880617 commented on a change in pull request #6103:
URL: https://github.com/apache/incubator-pinot/pull/6103#discussion_r499786861



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java
##
@@ -269,6 +270,7 @@ public void addSegment(String segmentName, TableConfig 
tableConfig, IndexLoading
   }
   _logger.info("Initialize RealtimeSegmentDataManager - " + segmentName);
   _segmentDataManagerMap.put(segmentName, manager);
+  _serverMetrics.addValueToTableGauge(_tableNameWithType, 
ServerGauge.SEGMENT_COUNT, 1L);

Review comment:
   Yes, this is inside this else block, just the last line :) 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #6103: Fix missing segment count reporting for realtime llc segment

2020-10-05 Thread GitBox


Jackie-Jiang commented on a change in pull request #6103:
URL: https://github.com/apache/incubator-pinot/pull/6103#discussion_r499783811



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java
##
@@ -269,6 +270,7 @@ public void addSegment(String segmentName, TableConfig 
tableConfig, IndexLoading
   }
   _logger.info("Initialize RealtimeSegmentDataManager - " + segmentName);
   _segmentDataManagerMap.put(segmentName, manager);
+  _serverMetrics.addValueToTableGauge(_tableNameWithType, 
ServerGauge.SEGMENT_COUNT, 1L);

Review comment:
   This one should be added inside the else block in line 237, or we will 
double counting because this gauge will also be updated in 
`addSegment(ImmutableSegmentLoader.load(indexDir, indexLoadingConfig, schema));`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6039: WIP: ServiceManager ADD_TABLE role

2020-10-05 Thread GitBox


mcvsubbu commented on a change in pull request #6039:
URL: https://github.com/apache/incubator-pinot/pull/6039#discussion_r499763424



##
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/ControllerStarter.java
##
@@ -350,6 +350,8 @@ private void setUpPinotController() {
 LOGGER.info("Starting Pinot Helix resource manager and connecting to 
Zookeeper");
 _helixResourceManager.start(_helixParticipantManager);
 
+// TODO: ideally we can block to do schema/table install here, as it is 
before the thing is listening..

Review comment:
   Why would you want to block before schema/table install? A controller 
can start up without any tables just fine

##
File path: 
pinot-tools/src/main/java/org/apache/pinot/tools/service/PinotServiceManager.java
##
@@ -92,6 +92,8 @@ public String startRole(ServiceRole role, Map 
properties)
 return startBroker(new PinotConfiguration(properties));
   case SERVER:
 return startServer(new PinotConfiguration(properties));
+  case ADD_TABLE:

Review comment:
   This is not a pinot service. Perhaps you should pull out a base class, 
and add a `PinotSetupServiceManager` (I am sure there is a better name) in 
which you can add tables, ingest data, check for data being online, etc.? 

##
File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/services/ServiceRole.java
##
@@ -22,5 +22,5 @@
  * ServiceRole defines a role that Pinot Service could start/stop.
  */
 public enum ServiceRole {
-  CONTROLLER, BROKER, SERVER, MINION,
+  CONTROLLER, BROKER, SERVER, MINION, ADD_TABLE

Review comment:
   `ADD_TABLE` is not a service in pinot, it should be an external service.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #6096: add upsert related configs

2020-10-05 Thread GitBox


Jackie-Jiang merged pull request #6096:
URL: https://github.com/apache/incubator-pinot/pull/6096


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (e4d7a10 -> 93238c9)

2020-10-05 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from e4d7a10  Add a property to set the s3 endpoint (#6104)
 add 93238c9  add upsert related configs (#6096)

No new revisions were added by this update.

Summary of changes:
 .../org/apache/pinot/common/data/SchemaTest.java   | 12 --
 .../common/utils/config/TableConfigSerDeTest.java  |  8 +---
 .../apache/pinot/spi/config/table/TableConfig.java |  6 +++
 .../pinot/spi/config/table/UpsertConfig.java   | 44 ++
 .../java/org/apache/pinot/spi/data/Schema.java | 37 +++---
 .../pinot/spi/config/table/UpsertConfigTest.java   | 36 ++
 6 files changed, 62 insertions(+), 81 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mayankshriv commented on pull request #6039: WIP: ServiceManager ADD_TABLE role

2020-10-05 Thread GitBox


mayankshriv commented on pull request #6039:
URL: https://github.com/apache/incubator-pinot/pull/6039#issuecomment-703762755


   I put some more thought on this, and still unable to convince myself about 
ServiceManager controlling when the service should announce health status 
(especially given that the use case is for dev/test env). I feel a cleaner way 
is for client side to check for something like /tableReady or /tableHealthy 
(that could be added).
   
   Perhaps this is because I still see ServiceManager as a utility to setup the 
cluster (and not an integral part of the service itself). We can solicit inputs 
from other folks as well: @mcvsubbu @siddharthteotia @kishoreg 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] KKcorps commented on a change in pull request #6105: scalar functions for array

2020-10-05 Thread GitBox


KKcorps commented on a change in pull request #6105:
URL: https://github.com/apache/incubator-pinot/pull/6105#discussion_r499737262



##
File path: 
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/ArrayFunctions.java
##
@@ -0,0 +1,133 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.function.scalar;
+
+import java.util.Arrays;
+import org.apache.commons.lang3.ArrayUtils;
+import org.apache.pinot.spi.annotations.ScalarFunction;
+
+/**
+ * Inbuilt Array Transformation Functions
+ * The functions can be used as UDFs in Query when added in the 
FunctionRegistry.
+ * @ScalarFunction annotation is used with each method for the registration
+ *
+ * Example usage:
+ *  SELECT reverseArray(array) FROM baseballStats LIMIT 10 
+ */
+public class ArrayFunctions {
+  private ArrayFunctions() {
+
+  }
+
+  /**
+   * @see ArrayUtils#reverse(Object[])
+   * @param value
+   * @return reversed input array in from end to start
+   */
+  @ScalarFunction
+  public static int[] reverseIntArray(int[] value) {
+  ArrayUtils.reverse(value);
+  return value;
+  }
+
+  /**
+   * @see ArrayUtils#reverse(Object[])
+   * @param value
+   * @return reversed input array in from end to start
+   */
+  @ScalarFunction
+  public static String[] reverseStringArray(String[] value) {
+ArrayUtils.reverse(value);
+return value;
+  }
+
+  /**
+   * @see Arrays#stream(double[]).max()
+   * @param value
+   * @return find minimum value in input array
+   */
+  @ScalarFunction
+  public static int[] arrayMax(int[] value) {
+return new int[] {Arrays.stream(value).max().getAsInt()};
+  }
+
+  /**
+   * @see Arrays#stream(double[]).min()
+   * @param value
+   * @return find maximum value in input array
+   */
+  @ScalarFunction
+  public static int[] arrayMin(int[] value) {
+return new int[] {Arrays.stream(value).min().getAsInt()};
+  }
+
+  /**
+   * @see Arrays#stream(double[]).sum()
+   * @param value
+   * @return calculate sum of all values in input array
+   */
+  @ScalarFunction
+  public static int[] arraySum(int[] value) {
+return new int[] {Arrays.stream(value).sum()};
+  }
+
+  /**
+   * @see ArrayUtils#indexOf(double[], double)
+   * @param value
+   * @param valueToFind
+   * @return return position of value in input array
+   */
+  @ScalarFunction
+  public static int[] arrayPosition(int[] value, int valueToFind) {

Review comment:
   Can we keep function names to match the 
https://prestodb.io/docs/current/functions/array.html
   
   Also, please add any other possible functions from the link.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] KKcorps commented on a change in pull request #6105: scalar functions for array

2020-10-05 Thread GitBox


KKcorps commented on a change in pull request #6105:
URL: https://github.com/apache/incubator-pinot/pull/6105#discussion_r499735864



##
File path: 
pinot-common/src/main/java/org/apache/pinot/common/function/FunctionUtils.java
##
@@ -42,6 +42,11 @@ private FunctionUtils() {
 put(Double.class, PinotDataType.DOUBLE);
 put(String.class, PinotDataType.STRING);
 put(byte[].class, PinotDataType.BYTES);
+put(int[].class, PinotDataType.INTEGER_ARRAY);
+put(String[].class, PinotDataType.STRING_ARRAY);
+put(long[].class, PinotDataType.LONG_ARRAY);
+put(float[].class, PinotDataType.FLOAT_ARRAY);
+put(double[].class, PinotDataType.DOUBLE_ARRAY);

Review comment:
   Also, add non-primitive data types e.g. Integer[], Long[]





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] KKcorps commented on pull request #6105: scalar functions for array

2020-10-05 Thread GitBox


KKcorps commented on pull request #6105:
URL: https://github.com/apache/incubator-pinot/pull/6105#issuecomment-703750862


   @SandishKumarHN Can you also add some tests for these functions along with 
an example in the description?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] codecov-commenter commented on pull request #6105: scalar functions for array

2020-10-05 Thread GitBox


codecov-commenter commented on pull request #6105:
URL: https://github.com/apache/incubator-pinot/pull/6105#issuecomment-703732970


   # 
[Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6105?src=pr=h1) 
Report
   > Merging 
[#6105](https://codecov.io/gh/apache/incubator-pinot/pull/6105?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-pinot/commit/1beaab59b73f26c4e35f3b9bc856b03806cddf5a?el=desc)
 will **increase** coverage by `6.28%`.
   > The diff coverage is `60.07%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-pinot/pull/6105/graphs/tree.svg?width=650=150=pr=4ibza2ugkz)](https://codecov.io/gh/apache/incubator-pinot/pull/6105?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#6105  +/-   ##
   ==
   + Coverage   66.44%   72.73%   +6.28% 
   ==
 Files1075 1224 +149 
 Lines   5477357825+3052 
 Branches 8168 8530 +362 
   ==
   + Hits3639642059+5663 
   + Misses  1570013004-2696 
   - Partials 2677 2762  +85 
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | #integration | `45.26% <48.50%> (?)` | |
   | #unittests | `63.99% <38.12%> (?)` | |
   
   Flags with carried forward coverage won't be shown. [Click 
here](https://docs.codecov.io/docs/carryforward-flags#carryforward-flags-in-the-pull-request-comment)
 to find out more.
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-pinot/pull/6105?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...ot/broker/broker/AllowAllAccessControlFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL0FsbG93QWxsQWNjZXNzQ29udHJvbEZhY3RvcnkuamF2YQ==)
 | `100.00% <ø> (ø)` | |
   | 
[.../helix/BrokerUserDefinedMessageHandlerFactory.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtYnJva2VyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9icm9rZXIvYnJva2VyL2hlbGl4L0Jyb2tlclVzZXJEZWZpbmVkTWVzc2FnZUhhbmRsZXJGYWN0b3J5LmphdmE=)
 | `52.83% <0.00%> (-13.84%)` | :arrow_down: |
   | 
[...ava/org/apache/pinot/client/AbstractResultSet.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Fic3RyYWN0UmVzdWx0U2V0LmphdmE=)
 | `53.33% <0.00%> (-3.81%)` | :arrow_down: |
   | 
[.../main/java/org/apache/pinot/client/Connection.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L0Nvbm5lY3Rpb24uamF2YQ==)
 | `44.44% <0.00%> (-4.40%)` | :arrow_down: |
   | 
[.../org/apache/pinot/client/ResultTableResultSet.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtY2xpZW50cy9waW5vdC1qYXZhLWNsaWVudC9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY2xpZW50L1Jlc3VsdFRhYmxlUmVzdWx0U2V0LmphdmE=)
 | `24.00% <0.00%> (-10.29%)` | :arrow_down: |
   | 
[...not/common/assignment/InstancePartitionsUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vYXNzaWdubWVudC9JbnN0YW5jZVBhcnRpdGlvbnNVdGlscy5qYXZh)
 | `78.57% <ø> (+5.40%)` | :arrow_up: |
   | 
[.../apache/pinot/common/exception/QueryException.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZXhjZXB0aW9uL1F1ZXJ5RXhjZXB0aW9uLmphdmE=)
 | `90.27% <ø> (+5.55%)` | :arrow_up: |
   | 
[...pinot/common/function/AggregationFunctionType.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vQWdncmVnYXRpb25GdW5jdGlvblR5cGUuamF2YQ==)
 | `98.27% <ø> (-1.73%)` | :arrow_down: |
   | 
[.../pinot/common/function/DateTimePatternHandler.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vRGF0ZVRpbWVQYXR0ZXJuSGFuZGxlci5qYXZh)
 | `83.33% <ø> (ø)` | |
   | 
[...ot/common/function/FunctionDefinitionRegistry.java](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vZnVuY3Rpb24vRnVuY3Rpb25EZWZpbml0aW9uUmVnaXN0cnkuamF2YQ==)
 | `88.88% <ø> (+44.44%)` | :arrow_up: |
   | ... and [956 
more](https://codecov.io/gh/apache/incubator-pinot/pull/6105/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 

[GitHub] [incubator-pinot] mcvsubbu commented on pull request #6020: Add Caching in Controller Broker API

2020-10-05 Thread GitBox


mcvsubbu commented on pull request #6020:
URL: https://github.com/apache/incubator-pinot/pull/6020#issuecomment-703717382


   > @mcvsubbu The issue is the changes are needed for new /Brokers api.
   > The plan is to call the API to fetch the brokers for a tenant from JDBC 
driver. The current implementation requires too many calls to zookeeper.
   > I was not aware of the watchers concern though. @kishoreg @fx19880617 What 
should we do about this?
   
   Why is calls to zookeeper bad? How many calls is it? Zookeepers can handle a 
lot more calls to it than pinot controllers can.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on pull request #6020: Add Caching in Controller Broker API

2020-10-05 Thread GitBox


mcvsubbu commented on pull request #6020:
URL: https://github.com/apache/incubator-pinot/pull/6020#issuecomment-703716431


   > @mcvsubbu I don't think this change will add more watchers though. These 
watchers are always set by Helix, and we are just registering some extra 
callbacks to Helix
   
   That is not true. We have separated Helix and pinot controllers. The watches 
are registered by helix controllers. Pinot controllers have zero watches in them



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] SandishKumarHN opened a new pull request #6105: scalar functions for array

2020-10-05 Thread GitBox


SandishKumarHN opened a new pull request #6105:
URL: https://github.com/apache/incubator-pinot/pull/6105


Inbuilt functions - Scalar funtions for array columns



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (1126cac -> e4d7a10)

2020-10-05 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 1126cac  [TE] move dashboard resources for refactoring (#6058)
 add e4d7a10  Add a property to set the s3 endpoint (#6104)

No new revisions were added by this update.

Summary of changes:
 .../java/org/apache/pinot/plugin/filesystem/S3PinotFS.java  | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] fx19880617 merged pull request #6104: Add a property to set the s3 endpoint

2020-10-05 Thread GitBox


fx19880617 merged pull request #6104:
URL: https://github.com/apache/incubator-pinot/pull/6104


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] fx19880617 commented on a change in pull request #6104: Add a property to set the s3 endpoint

2020-10-05 Thread GitBox


fx19880617 commented on a change in pull request #6104:
URL: https://github.com/apache/incubator-pinot/pull/6104#discussion_r499382289



##
File path: 
pinot-plugins/pinot-file-system/pinot-s3/src/main/java/org/apache/pinot/plugin/filesystem/S3PinotFS.java
##
@@ -93,7 +95,16 @@ public void init(PinotConfiguration config) {
 awsCredentialsProvider = DefaultCredentialsProvider.create();
   }
 
-  _s3Client = 
S3Client.builder().region(Region.of(region)).credentialsProvider(awsCredentialsProvider).build();
+  S3ClientBuilder s3ClientBuilder = 
S3Client.builder().region(Region.of(region)).credentialsProvider(awsCredentialsProvider);

Review comment:
   I think your code to not touch this should be ok, we can clean this up 
later.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org