[GitHub] [incubator-pinot] chenboat commented on issue #4686: Pinot-server getting timedout with zookeeper and it is not reconnecting back

2019-10-08 Thread GitBox
chenboat commented on issue #4686: Pinot-server getting timedout with zookeeper 
and it is not reconnecting back
URL: 
https://github.com/apache/incubator-pinot/issues/4686#issuecomment-539830511
 
 
   Do you know why the Pinot server's connection to zookeeper got timeout? We 
ran into similar problem here in Uber because the Pinot servers experienced 
Garbage Collection due to expensive queries. Thus, you may take a look at the 
query traffic on the servers. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee opened a new pull request #4687: Remove "refresh" usecase check in RoutingTableBuilderFactory

2019-10-08 Thread GitBox
snleee opened a new pull request #4687: Remove "refresh" usecase check in 
RoutingTableBuilderFactory
URL: https://github.com/apache/incubator-pinot/pull/4687
 
 
   We have created "DefaultOfflineRoutingTableBuilder" when the
   use case is a refresh use case. Since PartitionAwareOffline
   should work with the refresh table, we are removing the check.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP 
BY with ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#discussion_r332783799
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/reduce/BrokerReduceService.java
 ##
 @@ -428,6 +469,217 @@ private void setAggregationResults(BrokerResponseNative 
brokerResponseNative,
 }
   }
 
+  /**
+   * Extract group by order by results and set into {@link ResultTable}
+   * @param brokerResponseNative broker response
+   * @param dataSchema data schema
+   * @param aggregationInfos aggregations info
+   * @param groupBy group by info
+   * @param orderBy order by info
+   * @param dataTableMap map from server to data table
+   */
+  private void setSQLGroupByOrderByResults(BrokerResponseNative 
brokerResponseNative, DataSchema dataSchema,
+  List aggregationInfos, GroupBy groupBy, 
List orderBy,
+  Map dataTableMap, boolean preserveType) {
+
+List columns = new ArrayList<>(dataSchema.size());
+for (int i = 0; i < dataSchema.size(); i++) {
+  columns.add(dataSchema.getColumnName(i));
+}
+
+int numGroupBy = groupBy.getExpressionsSize();
+int numAggregations = aggregationInfos.size();
+
+IndexedTable indexedTable;
+try {
+  indexedTable =
+  getIndexedTable(numGroupBy, numAggregations, groupBy, 
aggregationInfos, orderBy, dataSchema, dataTableMap);
+} catch (Throwable throwable) {
+  throw new IllegalStateException(throwable);
+}
+
+List aggregationFunctions = new 
ArrayList<>(aggregationInfos.size());
+for (AggregationInfo aggregationInfo : aggregationInfos) {
+  aggregationFunctions.add(
+  
AggregationFunctionUtils.getAggregationFunctionContext(aggregationInfo).getAggregationFunction());
+}
+
+List rows = new ArrayList<>();
+int numColumns = columns.size();
+Iterator sortedIterator = indexedTable.iterator();
+int numRows = 0;
+while (numRows < groupBy.getTopN() && sortedIterator.hasNext()) {
+
+  Record nextRecord = sortedIterator.next();
+  Serializable[] row = new Serializable[numColumns];
+  int index = 0;
+  for (Object keyColumn : nextRecord.getKey().getColumns()) {
+row[index ++] = getSerializableValue(keyColumn);
+  }
+  int aggNum = 0;
+  for (Object valueColumn : nextRecord.getValues()) {
+row[index] = 
getSerializableValue(aggregationFunctions.get(aggNum).extractFinalResult(valueColumn));
+if (preserveType) {
+  row[index] = AggregationFunctionUtils.formatValue(row[index]);
+}
+index ++;
+  }
+  rows.add(row);
+  numRows++;
+}
+
+brokerResponseNative.setResultTable(new ResultTable(columns, rows));
+  }
+
+  private IndexedTable getIndexedTable(int numGroupBy, int numAggregations, 
GroupBy groupBy,
+  List aggregationInfos, List orderBy, 
DataSchema dataSchema, Map dataTableMap)
+  throws Throwable {
+
+IndexedTable indexedTable = new ConcurrentIndexedTable();
+int indexedTableCapacity = 1_000_000;
+// FIXME: indexedTableCapacity should be derived from TOP. Hardcoding this 
value to a higher number until we can tune the resize
+// int capacity = GroupByUtils.getTableCapacity((int) groupBy.getTopN());
+indexedTable.init(dataSchema, aggregationInfos, orderBy, 
indexedTableCapacity);
+
+for (DataTable dataTable : dataTableMap.values()) {
+  CheckedFunction2[] functions = new CheckedFunction2[dataSchema.size()];
 
 Review comment:
   Is this a commonly used interface? `BiFunction` from `java.util.function` 
might be better?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP 
BY with ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#discussion_r332764404
 
 

 ##
 File path: 
pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
 ##
 @@ -60,10 +60,6 @@
 import org.slf4j.LoggerFactory;
 
 import static org.apache.pinot.common.utils.CommonConstants.Broker.*;
 
 Review comment:
   (nit) Can we change this to non-static import `import 
org.apache.pinot.common.utils.CommonConstants.Broker;`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP 
BY with ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#discussion_r332781939
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/reduce/BrokerReduceService.java
 ##
 @@ -202,10 +210,10 @@ public BrokerResponseNative 
reduceOnDataTable(BrokerRequest brokerRequest,
 String rawTableName = TableNameBuilder.extractRawTableName(tableName);
 if (brokerMetrics != null) {
   brokerMetrics.addMeteredTableValue(rawTableName, 
BrokerMeter.DOCUMENTS_SCANNED, numDocsScanned);
-  brokerMetrics
-  .addMeteredTableValue(rawTableName, 
BrokerMeter.ENTRIES_SCANNED_IN_FILTER, numEntriesScannedInFilter);
-  brokerMetrics
-  .addMeteredTableValue(rawTableName, 
BrokerMeter.ENTRIES_SCANNED_POST_FILTER, numEntriesScannedPostFilter);
+  brokerMetrics.addMeteredTableValue(rawTableName, 
BrokerMeter.ENTRIES_SCANNED_IN_FILTER,
 
 Review comment:
   Are you using Pinot style? I think this difference is from Pinot style vs 
LinkedIn style


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP 
BY with ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#discussion_r332778952
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/table/SimpleIndexedTable.java
 ##
 @@ -116,11 +158,20 @@ public int size() {
 
   @Override
   public Iterator iterator() {
-return _records.iterator();
+return _iterator;
+  }
+
+  @Override
+  public void finish(boolean sort) {
+sortAndResize(_maxCapacity);
 
 Review comment:
   Add a TODO here for resize without sort


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP 
BY with ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#discussion_r332779790
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/operator/query/AggregationGroupByOperator.java
 ##
 @@ -18,8 +18,12 @@
  */
 package org.apache.pinot.core.operator.query;
 
+import java.util.HashMap;
 
 Review comment:
   This file need to be reverted


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
Jackie-Jiang commented on a change in pull request #4602: First pass of GROUP 
BY with ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#discussion_r332764719
 
 

 ##
 File path: 
pinot-common/src/main/java/org/apache/pinot/common/response/broker/ResultTable.java
 ##
 @@ -0,0 +1,54 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.response.broker;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonPropertyOrder;
+import java.io.Serializable;
+import java.util.List;
+
+
+/**
+ * Holds the results in a standard tabular structure
+ *
+ * FIXME: Column types and Multi-value support are missing, and deserialize 
might not work properly
+ */
+@JsonPropertyOrder({"columns", "rows"})
+public class ResultTable {
+  final private List _columns;
 
 Review comment:
   (nit) `private final ...`, same for next line


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] codecov-io edited a comment on issue #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
codecov-io edited a comment on issue #4602: First pass of GROUP BY with ORDER 
BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#issuecomment-530129911
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4602?src=pr=h1) 
Report
   > Merging 
[#4602](https://codecov.io/gh/apache/incubator-pinot/pull/4602?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-pinot/commit/af48abce79de1482c757c3f250ee0a584f423b05?src=pr=desc)
 will **increase** coverage by `0.19%`.
   > The diff coverage is `80.06%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-pinot/pull/4602/graphs/tree.svg?width=650=4ibza2ugkz=150=pr)](https://codecov.io/gh/apache/incubator-pinot/pull/4602?src=pr=tree)
   
   ```diff
   @@ Coverage Diff  @@
   ## master#4602  +/-   ##
   
   + Coverage 64.66%   64.85%   +0.19% 
 Complexity   16   16  
   
 Files  1056 1061   +5 
 Lines 5619256755 +563 
 Branches   8331 8436 +105 
   
   + Hits  3633836811 +473 
   - Misses1716017212  +52 
   - Partials   2694 2732  +38
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-pinot/pull/4602?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...org/apache/pinot/common/utils/CommonConstants.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvQ29tbW9uQ29uc3RhbnRzLmphdmE=)
 | `42.3% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...e/query/aggregation/groupby/GroupKeyGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9ncm91cGJ5L0dyb3VwS2V5R2VuZXJhdG9yLmphdmE=)
 | `100% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...ore/operator/query/AggregationGroupByOperator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9xdWVyeS9BZ2dyZWdhdGlvbkdyb3VwQnlPcGVyYXRvci5qYXZh)
 | `95.65% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...ain/java/org/apache/pinot/core/data/table/Key.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL0tleS5qYXZh)
 | `87.5% <0%> (+37.5%)` | `0 <0> (ø)` | :arrow_down: |
   | 
[.../java/org/apache/pinot/core/data/table/Record.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL1JlY29yZC5qYXZh)
 | `85.71% <0%> (-14.29%)` | `0 <0> (ø)` | |
   | 
[...ion/groupby/AggregationGroupByTrimmingService.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9ncm91cGJ5L0FnZ3JlZ2F0aW9uR3JvdXBCeVRyaW1taW5nU2VydmljZS5qYXZh)
 | `94.44% <100%> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...t/common/response/broker/BrokerResponseNative.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vcmVzcG9uc2UvYnJva2VyL0Jyb2tlclJlc3BvbnNlTmF0aXZlLmphdmE=)
 | `91.57% <100%> (+0.27%)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...tion/groupby/DictionaryBasedGroupKeyGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9ncm91cGJ5L0RpY3Rpb25hcnlCYXNlZEdyb3VwS2V5R2VuZXJhdG9yLmphdmE=)
 | `96.84% <100%> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...aggregation/function/AggregationFunctionUtils.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9xdWVyeS9hZ2dyZWdhdGlvbi9mdW5jdGlvbi9BZ2dyZWdhdGlvbkZ1bmN0aW9uVXRpbHMuamF2YQ==)
 | `95.65% <100%> (+0.09%)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...org/apache/pinot/core/data/table/IndexedTable.java](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9kYXRhL3RhYmxlL0luZGV4ZWRUYWJsZS5qYXZh)
 | `100% <100%> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | ... and [81 
more](https://codecov.io/gh/apache/incubator-pinot/pull/4602/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4602?src=pr=continue).
   > **Legend** - [Click here to learn 

[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #4608: Unit tests and bug fixes for DeleteTable rest API for controller.

2019-10-08 Thread GitBox
Jackie-Jiang merged pull request #4608: Unit tests and bug fixes for 
DeleteTable rest API for controller.
URL: https://github.com/apache/incubator-pinot/pull/4608
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (2d20c64 -> ce298ba)

2019-10-08 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 2d20c64  Possible fix for NPE seen in this test. (#4684)
 add ce298ba  Unit tests and bug fixes for DeleteTable rest API for 
controller. (#4608)

No new revisions were added by this update.

Summary of changes:
 .../api/resources/PinotTableRestletResource.java   | 39 +++--
 .../api/PinotTableRestletResourceTest.java | 98 ++
 2 files changed, 132 insertions(+), 5 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] chenboat commented on issue #4608: Unit tests and bug fixes for DeleteTable rest API for controller.

2019-10-08 Thread GitBox
chenboat commented on issue #4608: Unit tests and bug fixes for DeleteTable 
rest API for controller.
URL: https://github.com/apache/incubator-pinot/pull/4608#issuecomment-539694652
 
 
   > Please address the comments and we will merge the change
   
   Done. Please take another look. @Jackie-Jiang 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] npawar commented on a change in pull request #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
npawar commented on a change in pull request #4602: First pass of GROUP BY with 
ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#discussion_r332713325
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/table/Record.java
 ##
 @@ -43,4 +46,9 @@ public Key getKey() {
   public Object[] getValues() {
 return _values;
   }
+
+  @Override
+  public String toString() {
+return _key.toString() + " " + Arrays.deepToString(_values);
 
 Review comment:
   This String will never be used for de-serialization. We will always be using 
the granular values within the array for ser/deser.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] vincentchenjl commented on a change in pull request #4685: [TE] handle data insufficient exceptions in the detection pipeline

2019-10-08 Thread GitBox
vincentchenjl commented on a change in pull request #4685: [TE] handle data 
insufficient exceptions in the detection pipeline
URL: https://github.com/apache/incubator-pinot/pull/4685#discussion_r332664162
 
 

 ##
 File path: 
thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/detection/algorithm/DimensionWrapper.java
 ##
 @@ -278,22 +286,48 @@ public DetectionPipelineResult run() throws Exception {
 .collect(Collectors.toList());
   }
 
-  private void checkEarlyStop(long totalNestedMetrics, long 
successNestedMetrics, int i, Exception lastException) throws 
DetectionPipelineException {
+  private void checkEarlyStop(long totalNestedMetrics, long 
successNestedMetrics, int i, List exceptions)
+  throws DetectionPipelineException {
 // if the first certain number of dimensions all failed, throw an exception
 if (i == EARLY_STOP_THRESHOLD && successNestedMetrics == 0) {
   throw new DetectionPipelineException(String.format(
   "Detection failed for first %d out of %d metric dimensions for 
monitoring window %d to %d, stop processing.",
-  i, totalNestedMetrics, this.getStartTime(), this.getEndTime()), 
lastException);
+  i, totalNestedMetrics, this.getStartTime(), this.getEndTime()), 
Iterables.getLast(exceptions));
 }
   }
 
-  private void checkNestedMetricsStatus(long totalNestedMetrics, long 
successNestedMetrics, Exception lastException)
-  throws DetectionPipelineException {
+  /**
+   * Check the exception for all nested metric and determine whether to fail 
the detection.
+   *
+   * This method will throw an exception to outer wrappers if the all nested 
metrics failed.
+   * UNLESS the exception for all nested metrics are failed by
+   * {@link DetectorDataInsufficientException} and multiple detector are 
configured to run,
+   * and some of them run successfully. In such case, this method will
+   * still allow the detection to finish and return the generated anomalies. 
Because the data insufficient exception
+   * thrown by one detector should not block other detectors's result.
+   *
+   * @param totalNestedMetrics the total number of nested metrics
+   * @param successNestedMetrics the successfully generated nested metrics
+   * @param exceptions the list of all exceptions
+   * @param predictions the prediction results generated, which can tell us 
whether there are detectors run successfully.
+   * @throws DetectionPipelineException the exception to throw
+   */
+  private void checkNestedMetricsStatus(long totalNestedMetrics, long 
successNestedMetrics, List exceptions,
+  List predictions) throws DetectionPipelineException {
 // if all nested metrics failed, throw an exception
 if (successNestedMetrics == 0 && totalNestedMetrics > 0) {
-  throw new DetectionPipelineException(String.format(
-  "Detection failed for all nested dimensions for detection config id 
%d for monitoring window %d to %d.",
-  this.config.getId(), this.getStartTime(), this.getEndTime()), 
lastException);
+  // if all exceptions are caused by DetectorDataInsufficientException and
+  // there are other detectors run successfully, keep the detection running
+  if (exceptions.stream().allMatch(e -> e.getCause() instanceof 
DetectorDataInsufficientException)
+  && predictions.size() != 0) {
 
 Review comment:
   I think that `predictions.size() == 0` will always true if 
`successNestedMetrics == 0` is true. Please let me know if I miss something 
here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] vincentchenjl commented on a change in pull request #4685: [TE] handle data insufficient exceptions in the detection pipeline

2019-10-08 Thread GitBox
vincentchenjl commented on a change in pull request #4685: [TE] handle data 
insufficient exceptions in the detection pipeline
URL: https://github.com/apache/incubator-pinot/pull/4685#discussion_r332662501
 
 

 ##
 File path: 
thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/detection/algorithm/DimensionWrapper.java
 ##
 @@ -278,22 +286,48 @@ public DetectionPipelineResult run() throws Exception {
 .collect(Collectors.toList());
   }
 
-  private void checkEarlyStop(long totalNestedMetrics, long 
successNestedMetrics, int i, Exception lastException) throws 
DetectionPipelineException {
+  private void checkEarlyStop(long totalNestedMetrics, long 
successNestedMetrics, int i, List exceptions)
+  throws DetectionPipelineException {
 // if the first certain number of dimensions all failed, throw an exception
 if (i == EARLY_STOP_THRESHOLD && successNestedMetrics == 0) {
   throw new DetectionPipelineException(String.format(
   "Detection failed for first %d out of %d metric dimensions for 
monitoring window %d to %d, stop processing.",
-  i, totalNestedMetrics, this.getStartTime(), this.getEndTime()), 
lastException);
+  i, totalNestedMetrics, this.getStartTime(), this.getEndTime()), 
Iterables.getLast(exceptions));
 }
   }
 
-  private void checkNestedMetricsStatus(long totalNestedMetrics, long 
successNestedMetrics, Exception lastException)
-  throws DetectionPipelineException {
+  /**
+   * Check the exception for all nested metric and determine whether to fail 
the detection.
+   *
+   * This method will throw an exception to outer wrappers if the all nested 
metrics failed.
+   * UNLESS the exception for all nested metrics are failed by
+   * {@link DetectorDataInsufficientException} and multiple detector are 
configured to run,
+   * and some of them run successfully. In such case, this method will
+   * still allow the detection to finish and return the generated anomalies. 
Because the data insufficient exception
+   * thrown by one detector should not block other detectors's result.
+   *
+   * @param totalNestedMetrics the total number of nested metrics
+   * @param successNestedMetrics the successfully generated nested metrics
+   * @param exceptions the list of all exceptions
+   * @param predictions the prediction results generated, which can tell us 
whether there are detectors run successfully.
+   * @throws DetectionPipelineException the exception to throw
+   */
+  private void checkNestedMetricsStatus(long totalNestedMetrics, long 
successNestedMetrics, List exceptions,
+  List predictions) throws DetectionPipelineException {
 // if all nested metrics failed, throw an exception
 if (successNestedMetrics == 0 && totalNestedMetrics > 0) {
-  throw new DetectionPipelineException(String.format(
-  "Detection failed for all nested dimensions for detection config id 
%d for monitoring window %d to %d.",
-  this.config.getId(), this.getStartTime(), this.getEndTime()), 
lastException);
+  // if all exceptions are caused by DetectorDataInsufficientException and
+  // there are other detectors run successfully, keep the detection running
 
 Review comment:
   Just curious what is the behavior if detectors in some dimensions throws 
other exceptions than `DetectorDataInsufficientException`. Do we still consider 
the detection is running successfully if one of the dimension fails due to 
other exceptions?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mayankshriv commented on a change in pull request #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
mayankshriv commented on a change in pull request #4602: First pass of GROUP BY 
with ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#discussion_r332620855
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/table/Record.java
 ##
 @@ -43,4 +46,9 @@ public Key getKey() {
   public Object[] getValues() {
 return _values;
   }
+
+  @Override
+  public String toString() {
+return _key.toString() + " " + Arrays.deepToString(_values);
 
 Review comment:
   Will this String be used for de-serialization? And can key have whitespaces?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mayankshriv commented on a change in pull request #4602: First pass of GROUP BY with ORDER BY support

2019-10-08 Thread GitBox
mayankshriv commented on a change in pull request #4602: First pass of GROUP BY 
with ORDER BY support
URL: https://github.com/apache/incubator-pinot/pull/4602#discussion_r332618635
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/data/table/ConcurrentIndexedTable.java
 ##
 @@ -19,37 +19,68 @@
 package org.apache.pinot.core.data.table;
 
 import com.google.common.base.Preconditions;
+import java.util.ArrayList;
 import java.util.Comparator;
 import java.util.Iterator;
 import java.util.List;
 import java.util.PriorityQueue;
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.ConcurrentMap;
+import java.util.concurrent.atomic.AtomicBoolean;
+import java.util.concurrent.atomic.LongAccumulator;
+import java.util.concurrent.atomic.LongAdder;
 import java.util.concurrent.locks.ReentrantReadWriteLock;
 import javax.annotation.Nonnull;
+import org.apache.commons.collections.CollectionUtils;
 import org.apache.pinot.common.request.AggregationInfo;
 import org.apache.pinot.common.request.SelectionSort;
 import org.apache.pinot.common.utils.DataSchema;
 import org.apache.pinot.core.data.order.OrderByUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
 
 
 /**
  * Thread safe {@link Table} implementation for aggregating TableRecords based 
on combination of keys
  */
 public class ConcurrentIndexedTable extends IndexedTable {
 
+  private static final Logger LOGGER = 
LoggerFactory.getLogger(ConcurrentIndexedTable.class);
+
   private ConcurrentMap _lookupMap;
-  private Comparator _minHeapComparator;
   private ReentrantReadWriteLock _readWriteLock;
 
+  private boolean _isOrderBy;
+  private Comparator _orderByComparator;
+  private Comparator _finalOrderByComparator;
+  private List _aggregationIndexes;
+
+  private AtomicBoolean _noMoreNewRecords = new AtomicBoolean();
+  private LongAdder _numResizes = new LongAdder();
+  private LongAccumulator _resizeTime = new LongAccumulator(Long::sum, 0);
+
+  /**
+   * Initializes the data structures and comparators needed for this Table
+   * @param dataSchema data schema of the record's keys and values
+   * @param aggregationInfos aggregation infors for the aggregations in 
record'd values
+   * @param orderBy list of {@link SelectionSort} defining the order by
+   * @param maxCapacity the max number of records to hold
+   * @param sort does final result need to be sorted
+   */
   @Override
   public void init(@Nonnull DataSchema dataSchema, List 
aggregationInfos, List orderBy,
-  int maxCapacity) {
-super.init(dataSchema, aggregationInfos, orderBy, maxCapacity);
+  int maxCapacity, boolean sort) {
+super.init(dataSchema, aggregationInfos, orderBy, maxCapacity, sort);
 
-_minHeapComparator = OrderByUtils.getKeysAndValuesComparator(dataSchema, 
orderBy, aggregationInfos).reversed();
 _lookupMap = new ConcurrentHashMap<>();
 _readWriteLock = new ReentrantReadWriteLock();
+_isOrderBy = CollectionUtils.isNotEmpty(orderBy);
+if (_isOrderBy) {
+  _orderByComparator = OrderByUtils.getKeysAndValuesComparator(dataSchema, 
orderBy, aggregationInfos, false);
 
 Review comment:
   May be add a TODO here in the code as well?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org