[GitHub] [incubator-pinot] kishoreg commented on pull request #6033: Add Broker Reduce Time Log

2020-09-21 Thread GitBox


kishoreg commented on pull request #6033:
URL: https://github.com/apache/incubator-pinot/pull/6033#issuecomment-696501753


   This is the one that uncovered the issue that the broker was indeed taking a 
long time. This will probably help at LinkedIn as well to identify the use cases 
that can benefit from parallel reduce in the broker.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] 01/01: [maven-release-plugin] prepare for next development iteration

2020-09-21 Thread tingchen
This is an automated email from the ASF dual-hosted git repository.

tingchen pushed a commit to branch release-0.5.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 7e7af116684950a8e5e86600d9c1a07bc5e0759c
Author: TING CHEN 
AuthorDate: Wed Sep 2 17:49:37 2020 -0700

[maven-release-plugin] prepare for next development iteration
---
 pinot-broker/pom.xml  | 2 +-
 pinot-clients/pinot-java-client/pom.xml   | 2 +-
 pinot-clients/pinot-jdbc-client/pom.xml   | 2 +-
 pinot-clients/pom.xml | 2 +-
 pinot-common/pom.xml  | 2 +-
 pinot-controller/pom.xml  | 2 +-
 pinot-core/pom.xml| 2 +-
 pinot-distribution/pom.xml| 2 +-
 pinot-integration-tests/pom.xml   | 2 +-
 pinot-minion/pom.xml  | 2 +-
 pinot-perf/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-standalone/pom.xml| 2 +-
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 2 +-
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 2 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 2 +-
 pinot-plugins/pinot-file-system/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-protobuf/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 2 +-
 pinot-plugins/pom.xml | 2 +-
 pinot-server/pom.xml  | 2 +-
 pinot-spi/pom.xml | 2 +-
 pinot-tools/pom.xml   | 2 +-
 pom.xml   | 4 ++--
 44 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index e578c82..146ca13 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -24,7 +24,7 @@
   
 pinot
 org.apache.pinot
-0.5.0
+0.6.0-SNAPSHOT
 ..
   
   pinot-broker
diff --git a/pinot-clients/pinot-java-client/pom.xml 
b/pinot-clients/pinot-java-client/pom.xml
index a5f06df..8681625 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -24,7 +24,7 @@
   
 pinot-clients
 org.apache.pinot
-0.5.0
+0.6.0-SNAPSHOT
 ..
   
   pinot-java-client
diff --git a/pinot-clients/pinot-jdbc-client/pom.xml 
b/pinot-clients/pinot-jdbc-client/pom.xml
index 528822b..e729330 100644
--- a/pinot-clients/pinot-jdbc-client/pom.xml
+++ b/pinot-clients/pinot-jdbc-client/pom.xml
@@ -24,7 +24,7 @@
   
 pinot-clients
 org.apache.pinot
-0.5.0
+0.6.0-SNAPSHOT
 ..
   
   pinot-jdbc-client
diff --git a/pinot-clients/pom.xml b/pinot-clients/pom.xml
index 6a6d988..5b7c75c 100644
--- a/pinot-clients/pom.xml
+++ b/pinot-clients/pom.xml
@@ -24,7 +24,7 @@
   
 pinot
 

[incubator-pinot] branch release-0.5.0-rc created (now 7e7af11)

2020-09-21 Thread tingchen
This is an automated email from the ASF dual-hosted git repository.

tingchen pushed a change to branch release-0.5.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 7e7af11  [maven-release-plugin] prepare for next development iteration

This branch includes the following new commits:

 new 7e7af11  [maven-release-plugin] prepare for next development iteration

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[incubator-pinot] branch release-0.5.0-rc created (now 67299cd)

2020-09-21 Thread tingchen
This is an automated email from the ASF dual-hosted git repository.

tingchen pushed a change to branch release-0.5.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 67299cd  Fix built-in virtual columns for immutable segment (#6042)

No new revisions were added by this update.





[GitHub] [incubator-pinot] mayankshriv edited a comment on issue #6028: Make BrokerReduceService.reduceOnDataTable Multi Threaded to increase aggregation performance

2020-09-21 Thread GitBox


mayankshriv edited a comment on issue #6028:
URL: 
https://github.com/apache/incubator-pinot/issues/6028#issuecomment-696510899


   @mr-agrwal Could you please try the changes in this PR and provide some 
feedback: https://github.com/apache/incubator-pinot/pull/6044
   






[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #6043: Add IN_PARTITIONED_SUBQUERY support

2020-09-21 Thread GitBox


Jackie-Jiang opened a new pull request #6043:
URL: https://github.com/apache/incubator-pinot/pull/6043


   ## Description
   Add `IN_PARTITIONED_SUBQUERY` transform function to support `IDSET` 
aggregation function as the subquery on the server side. Because the subquery 
is evaluated on the server side, the subquery must hit the same table as the 
main query, and the table must be partitioned at server level (all the 
segments for a partition are served by a single server).
   
   E.g. The following 2 queries can be combined into one query:
   SELECT ID_SET(col) FROM table WHERE date = 20200901
   SELECT DISTINCT_COUNT(col), date FROM table WHERE IN_ID_SET(col, 
'') = 1 GROUP BY date
   ->
   SELECT DISTINCT_COUNT(col), date FROM table WHERE 
IN_PARTITIONED_SUBQUERY(col, 'SELECT ID_SET(col) FROM table WHERE date = 
20200901') = 1 GROUP BY date
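   The server-level partitioning requirement above can be illustrated with a small check. This is only a sketch under assumptions: the class name, the method name, and the two input maps (segment to partition id, segment to server) are hypothetical and are not Pinot APIs, and replication is ignored.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Pinot: checks the precondition described in the PR,
// i.e. that all segments of a partition are served by a single server.
public class ServerLevelPartitionCheck {

  public static boolean isPartitionedAtServerLevel(Map<String, Integer> segmentToPartition,
      Map<String, String> segmentToServer) {
    // Collect the set of servers that host at least one segment of each partition.
    Map<Integer, Set<String>> serversByPartition = new HashMap<>();
    for (Map.Entry<String, Integer> entry : segmentToPartition.entrySet()) {
      String server = segmentToServer.get(entry.getKey());
      if (server == null) {
        // A segment without an assigned server cannot satisfy the precondition.
        return false;
      }
      serversByPartition.computeIfAbsent(entry.getValue(), k -> new HashSet<>()).add(server);
    }
    // The table qualifies only if every partition maps to exactly one server.
    for (Set<String> servers : serversByPartition.values()) {
      if (servers.size() != 1) {
        return false;
      }
    }
    return true;
  }
}
```

   If a partition were spread over two servers, each server would build its ID set from only part of that partition's data, so the subquery result seen by the main query would be incomplete.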






[GitHub] [incubator-pinot] mayankshriv opened a new pull request #6044: Support for multi-threaded Group By reducer for SQL.

2020-09-21 Thread GitBox


mayankshriv opened a new pull request #6044:
URL: https://github.com/apache/incubator-pinot/pull/6044


   The existing implementation of the Broker reduce phase is single-threaded.
   For group-by queries where large responses are being sent back from multiple 
servers,
   this could become a bottleneck.
   
   Given that brokers are generally light on CPU usage, making the reduce phase
   multi-threaded would be a good idea to boost performance. This PR adds a 
multi-threaded
   implementation for the Group-By reducer for SQL.
   
   - Added an executor service in BrokerReduceService that can be used by the 
reduce phase.
   
   - In this PR, the executor service defaults to a single thread, until 
the performance
 impact can be studied under various conditions (e.g. high QPS, where brokers 
have higher
 CPU usage).
   
   - Added a broker side config to specify the number of threads to be used for 
reduce phase.
 `pinot.broker.num.reduce.threads`
   
   - For testing, explicitly sets num threads to reduce to be > 1 to ensure 
functional
 correctness is tested.
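   The idea in the bullets above can be sketched as follows. This is a minimal illustration, not the code in this PR: the class `ParallelGroupByReduceSketch`, its `reduce` method, and the use of plain maps of partial sums in place of server responses are all assumptions. The `numReduceThreads` argument stands in for the value of `pinot.broker.num.reduce.threads`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration; not the BrokerReduceService code from the PR.
public class ParallelGroupByReduceSketch {

  // Merges per-server partial SUM results (group key -> partial sum) with a fixed thread pool.
  public static Map<String, Long> reduce(List<Map<String, Long>> serverResults, int numReduceThreads) {
    Map<String, Long> merged = new ConcurrentHashMap<>();
    ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, numReduceThreads));
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (Map<String, Long> partial : serverResults) {
        // One task per server response; ConcurrentHashMap.merge is atomic per key.
        futures.add(executor.submit(
            () -> partial.forEach((key, value) -> merged.merge(key, value, Long::sum))));
      }
      // Wait for every merge task to finish before returning the result.
      for (Future<?> future : futures) {
        future.get();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while reducing", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Reduce task failed", e.getCause());
    } finally {
      executor.shutdown();
    }
    return merged;
  }
}
```

   With `numReduceThreads` set to 1 the sketch behaves like a sequential merge, which mirrors the single-thread default described above. A real broker would presumably share one long-lived executor across queries instead of creating a pool per call.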
   
   






[GitHub] [incubator-pinot] mayankshriv commented on issue #6028: Make BrokerReduceService.reduceOnDataTable Multi Threaded to increase aggregation performance

2020-09-21 Thread GitBox


mayankshriv commented on issue #6028:
URL: 
https://github.com/apache/incubator-pinot/issues/6028#issuecomment-696510899


   @mr-agrwal Could you please try the changes in this PR and provide some 
feedback:
   
   https://github.com/apache/incubator-pinot/pull/6044
   






[GitHub] [incubator-pinot] apucher merged pull request #6041: remove default javaagent opts in generator.sh script to avoid javaagent port colission

2020-09-21 Thread GitBox


apucher merged pull request #6041:
URL: https://github.com/apache/incubator-pinot/pull/6041


   






[GitHub] [incubator-pinot] mayankshriv commented on pull request #6033: Add Broker Reduce Time Log

2020-09-21 Thread GitBox


mayankshriv commented on pull request #6033:
URL: https://github.com/apache/incubator-pinot/pull/6033#issuecomment-696497353


   @mr-agrwal @fx19880617 We do have metrics for each phase of the query on the 
broker side (including scatter/gather, reduce, etc.). Adding just the reduce 
part to the log does not make a lot of sense (because then we could argue why 
not add the other times, and we definitely want to limit log size for high 
throughput cases).






[GitHub] [incubator-pinot] kishoreg commented on issue #5942: Better Table config validation

2020-09-21 Thread GitBox


kishoreg commented on issue #5942:
URL: 
https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696214712


   +1 to a utility to validate, we can add a validate endpoint in the controller






[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #6042: Fix built-in virtual columns for immutable segment

2020-09-21 Thread GitBox


Jackie-Jiang merged pull request #6042:
URL: https://github.com/apache/incubator-pinot/pull/6042


   






[GitHub] [incubator-pinot] mcvsubbu commented on issue #5942: Better Table config validation

2020-09-21 Thread GitBox


mcvsubbu commented on issue #5942:
URL: 
https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696207582


   With all the changes proposed/added, it will be useful if a utility is also 
provided to validate an existing tableconfig. Many of them are tightening the 
(lax) rules from before, and are valid. It will be unfortunate if an existing 
installation suddenly stopped working because code elsewhere (not just in the 
table addition path) starts assuming things about the tableconfig.






[GitHub] [incubator-pinot] adriancole commented on issue #5977: Allow ServiceManager to install tables prior to listening on service ports or a healthy status

2020-09-21 Thread GitBox


adriancole commented on issue #5977:
URL: 
https://github.com/apache/incubator-pinot/issues/5977#issuecomment-696109515


   ok I added a sketch for feedback. I will work on this again next week 
https://github.com/apache/incubator-pinot/pull/6039






[GitHub] [incubator-pinot] KKcorps commented on a change in pull request #6020: Add Caching in Controller Broker API

2020-09-21 Thread GitBox


KKcorps commented on a change in pull request #6020:
URL: https://github.com/apache/incubator-pinot/pull/6020#discussion_r491796919



##
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/listener/ClusterLiveInstanceChangeListener.java
##
@@ -0,0 +1,34 @@
+package org.apache.pinot.controller.helix.core.listener;
+
+import java.util.ArrayList;
+import org.apache.helix.HelixDataAccessor;
+import org.apache.helix.NotificationContext;
+import org.apache.helix.api.listeners.LiveInstanceChangeListener;
+import org.apache.helix.model.LiveInstance;
+
+import java.util.List;
+import org.apache.helix.PropertyKey.Builder;
+
+
+public class ClusterLiveInstanceChangeListener implements 
LiveInstanceChangeListener {
+  private HelixDataAccessor _helixDataAccessor;
+  private Builder _keyBuilder;
+  private List _liveInstances = new ArrayList<>();
+
+  public ClusterLiveInstanceChangeListener(HelixDataAccessor 
helixDataAccessor, Builder keyBuilder) {
+_helixDataAccessor = helixDataAccessor;
+_keyBuilder = keyBuilder;
+  }
+
+  @Override
+  public void onLiveInstanceChange(List liveInstances, 
NotificationContext changeContext) {
+_liveInstances = liveInstances;

Review comment:
   This was returning older values rather than updated values in onChange 
methods.

##
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/listener/ClusterInstanceConfigChangeListener.java
##
@@ -0,0 +1,32 @@
+package org.apache.pinot.controller.helix.core.listener;
+
+import org.apache.helix.HelixManager;
+import org.apache.helix.NotificationContext;
+import org.apache.helix.api.listeners.InstanceConfigChangeListener;
+import org.apache.helix.model.InstanceConfig;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+
+
+public class ClusterInstanceConfigChangeListener implements 
InstanceConfigChangeListener {
+private HelixManager _helixManager;
+private List _instanceConfigs = new ArrayList<>();
+
+public ClusterInstanceConfigChangeListener(HelixManager helixManager) {
+_helixManager = helixManager;
+}
+
+@Override
+public void onInstanceConfigChange(List instanceConfigs, 
NotificationContext context) {
+_instanceConfigs = instanceConfigs;

Review comment:
   This was returning older values rather than updated values in onChange 
methods.
   
   

##
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/listener/ClusterInstanceConfigChangeListener.java
##
@@ -0,0 +1,32 @@
+package org.apache.pinot.controller.helix.core.listener;
+
+import org.apache.helix.HelixManager;
+import org.apache.helix.NotificationContext;
+import org.apache.helix.api.listeners.InstanceConfigChangeListener;
+import org.apache.helix.model.InstanceConfig;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+
+
+public class ClusterInstanceConfigChangeListener implements 
InstanceConfigChangeListener {
+private HelixManager _helixManager;
+private List _instanceConfigs = new ArrayList<>();
+
+public ClusterInstanceConfigChangeListener(HelixManager helixManager) {
+_helixManager = helixManager;

Review comment:
   Returns null or empty list

##
File path: 
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/listener/ClusterInstanceConfigChangeListener.java
##
@@ -0,0 +1,32 @@
+package org.apache.pinot.controller.helix.core.listener;
+
+import org.apache.helix.HelixManager;
+import org.apache.helix.NotificationContext;
+import org.apache.helix.api.listeners.InstanceConfigChangeListener;
+import org.apache.helix.model.InstanceConfig;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+
+
+public class ClusterInstanceConfigChangeListener implements 
InstanceConfigChangeListener {
+private HelixManager _helixManager;
+private List _instanceConfigs = new ArrayList<>();
+
+public ClusterInstanceConfigChangeListener(HelixManager helixManager) {
+_helixManager = helixManager;
+}
+
+@Override
+public void onInstanceConfigChange(List instanceConfigs, 
NotificationContext context) {
+_instanceConfigs = instanceConfigs;
+}
+
+public List getInstanceConfigs() {
+if(_instanceConfigs.isEmpty()){

Review comment:
   Returns null or empty list
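   The caching pattern under review (keep the list from the latest change callback, and fall back to a direct read when the cache is still empty) can be sketched without Helix types. This is a hedged illustration only: `CachedChangeListenerSketch`, `onChange`, `get`, and the `Supplier` used as the fallback fetcher are hypothetical names, not the classes in this PR or any Helix API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for a listener that caches the latest notified values.
public class CachedChangeListenerSketch<T> {
  // Assumed fallback: a direct read from the backing store, used while the cache is empty.
  private final Supplier<List<T>> _fallbackFetcher;
  // volatile so readers on other threads see the list set by the latest callback.
  private volatile List<T> _cached = Collections.<T>emptyList();

  public CachedChangeListenerSketch(Supplier<List<T>> fallbackFetcher) {
    _fallbackFetcher = fallbackFetcher;
  }

  // Invoked by the notification framework with the current values.
  public void onChange(List<T> latest) {
    _cached = (latest == null)
        ? Collections.<T>emptyList()
        : Collections.unmodifiableList(new ArrayList<>(latest));
  }

  // Never returns null: falls back to a direct read if no callback has populated the cache.
  public List<T> get() {
    List<T> snapshot = _cached;
    if (snapshot.isEmpty()) {
      List<T> fetched = _fallbackFetcher.get();
      return (fetched == null) ? Collections.<T>emptyList() : fetched;
    }
    return snapshot;
  }
}
```

   Copying the notified list guards against the caller mutating it afterwards, and normalizing null to an empty list addresses the "null or empty list" concern raised in the review comments.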






[GitHub] [incubator-pinot] sajjad-moradi commented on pull request #6037: Add list of allowed tables for emitting table level metrics

2020-09-21 Thread GitBox


sajjad-moradi commented on pull request #6037:
URL: https://github.com/apache/incubator-pinot/pull/6037#issuecomment-696225516


   > Aren’t we trying to address the limitations of monitoring systems in Pinot?
   
   I'm not aware of that. Could you elaborate a bit more? What exactly is the 
plan and what is the timeline for it?
   The solution presented here is for a large cluster at LinkedIn for 
which table level metrics are disabled for all tables. The current situation is 
risky because for some high priority tables, we don't get alerted. This PR 
immediately alleviates the existing issue. 






[GitHub] [incubator-pinot] jihaozh merged pull request #6026: [TE] Creating a thirdeye-dashboard module to host the dashboard server

2020-09-21 Thread GitBox


jihaozh merged pull request #6026:
URL: https://github.com/apache/incubator-pinot/pull/6026


   






[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #6040: Remove the partition info from the consuming segment ZK metadata

2020-09-21 Thread GitBox


Jackie-Jiang commented on pull request #6040:
URL: https://github.com/apache/incubator-pinot/pull/6040#issuecomment-696415308


   @mcvsubbu The segment partition pruning for consuming segment will happen on 
server side instead of broker side. Yes there will be some overhead, and we 
need to measure the impact with a high QPS use case.






[GitHub] [incubator-pinot] jihaozh merged pull request #6036: [TE] fix labeler config mapping and timeout when fetching anomalies

2020-09-21 Thread GitBox


jihaozh merged pull request #6036:
URL: https://github.com/apache/incubator-pinot/pull/6036


   






[GitHub] [incubator-pinot] kishoreg commented on pull request #6037: Add list of allowed tables for emitting table level metrics

2020-09-21 Thread GitBox


kishoreg commented on pull request #6037:
URL: https://github.com/apache/incubator-pinot/pull/6037#issuecomment-696321992


   > > Aren’t we trying to address the limitations of monitoring systems in 
Pinot?
   > 
   > I'm not aware that. Could you elaborate a bit more? what's exactly the 
plan and what's the timeline for it?
   > The solution presented here is regarding a large cluster at Linkedin for 
which table level metrics are disabled for all tables. The current situation is 
risky as for some high priority tables, we don't get alerted. This PR 
immediately alleviates the existing issue.
   
   Sorry, I was referring to the changes in this PR. Ideally, we should be 
logging metrics for all tables/resources. It's up to the operators to set 
alerts on the right tables that are important for the business.
   
   By adding this new config, we are adding a workaround in Pinot to overcome 
the limitation with the metrics system which cannot handle thousands of tables.
   
   






[GitHub] [incubator-pinot] icefury71 commented on issue #5942: Better Table config validation

2020-09-21 Thread GitBox


icefury71 commented on issue #5942:
URL: 
https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696164071













[incubator-pinot] branch master updated: Fix built-in virtual columns for immutable segment (#6042)

2020-09-21 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new 67299cd  Fix built-in virtual columns for immutable segment (#6042)
67299cd is described below

commit 67299cde563b8a9aaecea02b2cc989e234bffcaf
Author: Xiaotian (Jackie) Jiang <1751+jackie-ji...@users.noreply.github.com>
AuthorDate: Mon Sep 21 17:42:08 2020 -0700

Fix built-in virtual columns for immutable segment (#6042)

Fix the bug in `ImmutableSegmentLoader` where built-in virtual columns are 
not added to the schema in the segment metadata, which causes wrong results 
when explicitly querying the built-in virtual columns (e.g. `SELECT $docID FROM 
myTable`).
---
 .../immutable/ImmutableSegmentLoader.java  | 11 +--
 .../core/segment/index/loader/LoaderTest.java  | 80 +-
 2 files changed, 64 insertions(+), 27 deletions(-)

diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/immutable/ImmutableSegmentLoader.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/immutable/ImmutableSegmentLoader.java
index 6ddaaa5..b7b1fe0 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/immutable/ImmutableSegmentLoader.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/immutable/ImmutableSegmentLoader.java
@@ -112,15 +112,10 @@ public class ImmutableSegmentLoader {
   new PhysicalColumnIndexContainer(segmentReader, entry.getValue(), 
indexLoadingConfig, indexDir));
 }
 
-if (schema == null) {
-  schema = segmentMetadata.getSchema();
-}
-
-// Ensure that the schema has the virtual columns added
-
VirtualColumnProviderFactory.addBuiltInVirtualColumnsToSegmentSchema(schema, 
segmentName);
-
 // Instantiate virtual columns
-for (FieldSpec fieldSpec : schema.getAllFieldSpecs()) {
+Schema segmentSchema = segmentMetadata.getSchema();
+
VirtualColumnProviderFactory.addBuiltInVirtualColumnsToSegmentSchema(segmentSchema,
 segmentName);
+for (FieldSpec fieldSpec : segmentSchema.getAllFieldSpecs()) {
   if (fieldSpec.isVirtualColumn()) {
 String columnName = fieldSpec.getName();
 VirtualColumnContext context = new VirtualColumnContext(fieldSpec, 
segmentMetadata.getTotalDocs());
diff --git 
a/pinot-core/src/test/java/org/apache/pinot/core/segment/index/loader/LoaderTest.java
 
b/pinot-core/src/test/java/org/apache/pinot/core/segment/index/loader/LoaderTest.java
index 6fbd11a..32a4c0a 100644
--- 
a/pinot-core/src/test/java/org/apache/pinot/core/segment/index/loader/LoaderTest.java
+++ 
b/pinot-core/src/test/java/org/apache/pinot/core/segment/index/loader/LoaderTest.java
@@ -25,6 +25,7 @@ import java.util.HashSet;
 import java.util.List;
 import org.apache.commons.io.FileUtils;
 import org.apache.pinot.common.segment.ReadMode;
+import 
org.apache.pinot.common.utils.CommonConstants.Segment.BuiltInVirtualColumn;
 import org.apache.pinot.common.utils.TarGzCompressionUtils;
 import org.apache.pinot.core.indexsegment.IndexSegment;
 import org.apache.pinot.core.indexsegment.generator.SegmentGeneratorConfig;
@@ -151,6 +152,28 @@ public class LoaderTest {
   }
 
   @Test
+  public void testBuiltInVirtualColumns()
+  throws Exception {
+Schema schema = constructV1Segment();
+
+IndexSegment indexSegment = ImmutableSegmentLoader.load(_indexDir, _v1IndexLoadingConfig, schema);
+testBuiltInVirtualColumns(indexSegment);
+indexSegment.destroy();
+
+indexSegment = ImmutableSegmentLoader.load(_indexDir, _v1IndexLoadingConfig, null);
+testBuiltInVirtualColumns(indexSegment);
+indexSegment.destroy();
+  }
+
+  private void testBuiltInVirtualColumns(IndexSegment indexSegment) {
+Assert.assertTrue(indexSegment.getColumnNames().containsAll(
+Arrays.asList(BuiltInVirtualColumn.DOCID, BuiltInVirtualColumn.HOSTNAME, BuiltInVirtualColumn.SEGMENTNAME)));
+Assert.assertNotNull(indexSegment.getDataSource(BuiltInVirtualColumn.DOCID));
+Assert.assertNotNull(indexSegment.getDataSource(BuiltInVirtualColumn.HOSTNAME));
+Assert.assertNotNull(indexSegment.getDataSource(BuiltInVirtualColumn.SEGMENTNAME));
+  }
+
+  @Test
   public void testPadding()
   throws Exception {
 // Old Format
@@ -270,7 +293,8 @@ public class LoaderTest {
   }
 
   @Test
-  public void testTextIndexLoad() throws Exception {
+  public void testTextIndexLoad()
+  throws Exception {
 // Tests for scenarios by creating on-disk segment in V3 and then loading
 // the segment with and without specifying segmentVersion in IndexLoadingConfig
 
@@ -288,7 +312,8 @@ public class LoaderTest {
 File textIndexFile = SegmentDirectoryPaths.findTextIndexIndexFile(_indexDir, TEXT_INDEX_COL_NAME);
 Assert.assertNotNull(textIndexFile);
 

[incubator-pinot] branch master updated (274b4c2 -> b5e67c9)

2020-09-21 Thread jihao
This is an automated email from the ASF dual-hosted git repository.

jihao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 274b4c2  remove default javaagent opts in generator.sh script to avoid 
javaagent port colission (#6041)
 add b5e67c9  [TE] Creating a thirdeye-dashboard module to host the 
dashboard server (#6026)

No new revisions were added by this update.

Summary of changes:
 thirdeye/pom.xml   |1 +
 thirdeye/run-backend.sh|4 +-
 thirdeye/run-frontend.sh   |4 +-
 .../config/dashboard.yml   |0
 .../config/data-sources/cache-config.yml   |0
 .../config/data-sources/data-sources-config.yml|0
 .../config/data/README.md  |0
 .../config/data/daily.csv  |0
 .../config/data/hourly.csv |0
 .../config/data/pageviews.csv  |0
 .../anomaly-functions/alertFilter.properties   |0
 .../alertFilterAutotune.properties |0
 .../anomaly-functions/functions.properties |0
 .../config/detector.yml|0
 .../config/h2db.mv.db  |  Bin 2490368 -> 2498560 
bytes
 .../config/persistence.yml |0
 .../config/rca.yml |0
 thirdeye/thirdeye-dashboard/pom.xml|  113 ++
 .../api/application/ApplicationResource.java   |0
 .../api/detection/AnomalyDetectionResource.java|0
 .../api/user/dashboard/UserDashboardResource.java  |0
 .../dashboard/DetectionPreviewConfiguration.java   |0
 .../thirdeye/dashboard/DetectorHttpUtils.java  |0
 .../thirdeye/dashboard/HandlebarsHelperBundle.java |0
 .../thirdeye/dashboard/HandlebarsViewRenderer.java |0
 .../pinot/thirdeye/dashboard/HelperBundle.java |0
 .../thirdeye/dashboard/RootCauseConfiguration.java |0
 .../dashboard/RootCauseResourceProvider.java   |0
 .../dashboard/ThirdEyeDashboardApplication.java|0
 .../dashboard/ThirdEyeDashboardConfiguration.java  |0
 .../dashboard/ThirdEyeDashboardModule.java |0
 .../apache/pinot/thirdeye/dashboard/ViewType.java  |0
 .../dashboard/configs/AuthConfiguration.java   |0
 .../dashboard/configs/ResourceConfiguration.java   |0
 .../dashboard/resources/AdminResource.java |0
 .../resources/AnomalyFlattenResource.java  |0
 .../dashboard/resources/AnomalyResource.java   |0
 .../dashboard/resources/AutoOnboardResource.java   |0
 .../resources/BadRequestWebException.java  |0
 .../dashboard/resources/CacheResource.java |0
 .../resources/CustomizedEventResource.java |0
 .../dashboard/resources/DashboardResource.java |0
 .../dashboard/resources/DatasetConfigResource.java |0
 .../dashboard/resources/EntityManagerResource.java |0
 .../dashboard/resources/EntityMappingResource.java |0
 .../dashboard/resources/MetricConfigResource.java  |0
 .../resources/OnboardDatasetMetricResource.java|0
 .../dashboard/resources/ResourceUtils.java |0
 .../thirdeye/dashboard/resources/RootResource.java |0
 .../dashboard/resources/SummaryResource.java   |0
 .../dashboard/resources/ThirdEyeResource.java  |0
 .../dashboard/resources/v2/AnomaliesResource.java  |   46 +-
 .../dashboard/resources/v2/AuthResource.java   |2 +-
 .../dashboard/resources/v2/ConfigResource.java |0
 .../dashboard/resources/v2/DataResource.java   |3 +-
 .../resources/v2/DetectionAlertResource.java   |0
 .../dashboard/resources/v2/ResourceUtils.java  |0
 .../resources/v2/RootCauseEntityFormatter.java |0
 .../v2/RootCauseEventEntityFormatter.java  |0
 .../resources/v2/RootCauseMetricResource.java  |0
 .../dashboard/resources/v2/RootCauseResource.java  |0
 .../resources/v2/RootCauseSessionResource.java |0
 .../resources/v2/RootCauseTemplateResource.java|0
 .../resources/v2/alerts/AlertResource.java |0
 .../resources/v2/alerts/AlertSearchFilter.java |0
 .../resources/v2/alerts/AlertSearcher.java |0
 .../v2/anomalies/AnomalySearchFilter.java  |0
 .../v2/anomalies/AnomalySearchResource.java|0
 .../resources/v2/anomalies/AnomalySearcher.java|0
 .../resources/v2/pojo/AnomaliesSummary.java|0
 .../resources/v2/pojo/AnomaliesWrapper.java|0
 .../v2/pojo/AnomalyClassificationType.java |0
 .../resources/v2/pojo/AnomalyDetails.java  |0
 .../resources/v2/pojo/AnomalySummary.java  |0
 .../dashboard/resources/v2/pojo/MetricSummary.java |0
 .../resources/v2/pojo/RootCauseEntity.java |0
 

[GitHub] [incubator-pinot] jihaozh merged pull request #6026: [TE] Creating a thirdeye-dashboard module to host the dashboard server

2020-09-21 Thread GitBox


jihaozh merged pull request #6026:
URL: https://github.com/apache/incubator-pinot/pull/6026


   






[incubator-pinot] branch master updated (b65fe43 -> 274b4c2)

2020-09-21 Thread apucher
This is an automated email from the ASF dual-hosted git repository.

apucher pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from b65fe43  [TE] fix labeler config mapping and timeout when fetching 
anomalies (#6036)
 add 274b4c2  remove default javaagent opts in generator.sh script to avoid 
javaagent port colission (#6041)

No new revisions were added by this update.

Summary of changes:
 docker/images/pinot/bin/generator.sh | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)





[GitHub] [incubator-pinot] apucher merged pull request #6041: remove default javaagent opts in generator.sh script to avoid javaagent port colission

2020-09-21 Thread GitBox


apucher merged pull request #6041:
URL: https://github.com/apache/incubator-pinot/pull/6041


   






[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #6042: Fix built-in virtual columns for immutable segment

2020-09-21 Thread GitBox


Jackie-Jiang opened a new pull request #6042:
URL: https://github.com/apache/incubator-pinot/pull/6042


   ## Description
   Fix the bug in `ImmutableSegmentLoader` where built-in virtual columns are 
not added to the schema in the segment metadata, which causes wrong results 
when explicitly querying the built-in virtual columns (e.g. `SELECT $docID FROM 
myTable`).
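
As a rough illustration of the intended behavior (not the actual Pinot code), the sketch below always augments the columns read from the segment metadata with the built-in virtual columns. The class and method names are hypothetical, and the column names `$docId`, `$hostName`, and `$segmentName` are assumed stand-ins for `BuiltInVirtualColumn.DOCID`, `HOSTNAME`, and `SEGMENTNAME`.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; this is not the real ImmutableSegmentLoader.
class VirtualColumnSketch {
  // Assumed names for the built-in virtual columns.
  static final List<String> BUILT_IN_VIRTUAL_COLUMNS =
      Arrays.asList("$docId", "$hostName", "$segmentName");

  // Returns a copy of the segment metadata columns with the built-in
  // virtual columns appended, so explicit queries on them can resolve.
  static Set<String> withBuiltInVirtualColumns(Set<String> metadataColumns) {
    Set<String> columns = new LinkedHashSet<>(metadataColumns);
    columns.addAll(BUILT_IN_VIRTUAL_COLUMNS);
    return columns;
  }

  public static void main(String[] args) {
    Set<String> metadataColumns = new LinkedHashSet<>(Arrays.asList("memberId", "country"));
    System.out.println(withBuiltInVirtualColumns(metadataColumns));
  }
}
```

The point of the sketch is only that the augmentation is applied to the segment's own schema on every load path, rather than to an optional caller-supplied schema.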






[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #6040: Remove the partition info from the consuming segment ZK metadata

2020-09-21 Thread GitBox


Jackie-Jiang commented on pull request #6040:
URL: https://github.com/apache/incubator-pinot/pull/6040#issuecomment-696415308


   @mcvsubbu Partition pruning for the consuming segment will happen on the 
server side instead of the broker side. Yes, there will be some overhead, and we 
need to measure the impact with a high-QPS use case.






[GitHub] [incubator-pinot] apucher opened a new pull request #6041: remove default javaagent opts in generator.sh script to avoid javaagent port colission

2020-09-21 Thread GitBox


apucher opened a new pull request #6041:
URL: https://github.com/apache/incubator-pinot/pull/6041


   ## Description
   This bugfix removes the default javaagent opts in the `generator.sh` script 
(**only**) to avoid javaagent port collisions when running pinot-admin from a 
container that already has an active controller, broker, or server.
   
   ## Upgrade Notes
   Does this PR prevent a zero down-time upgrade? (Assume upgrade order: 
Controller, Broker, Server, Minion)
   No
   
   Does this PR fix a zero-downtime upgrade introduced earlier?
   No
   
   Does this PR otherwise need attention when creating release notes? Things to 
consider:
   No






[incubator-pinot] branch data-generator-javaagent-fix-20200921 created (now 918bb82)

2020-09-21 Thread apucher
This is an automated email from the ASF dual-hosted git repository.

apucher pushed a change to branch data-generator-javaagent-fix-20200921
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 918bb82  remove default javaagent opts in generator.sh script to avoid 
javaagent port colission

This branch includes the following new commits:

 new 918bb82  remove default javaagent opts in generator.sh script to avoid 
javaagent port colission

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[incubator-pinot] 01/01: remove default javaagent opts in generator.sh script to avoid javaagent port colission

2020-09-21 Thread apucher
This is an automated email from the ASF dual-hosted git repository.

apucher pushed a commit to branch data-generator-javaagent-fix-20200921
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 918bb822bac9bff736c63088fb242d5c36fc8496
Author: Alexander Pucher 
AuthorDate: Mon Sep 21 15:09:06 2020 -0700

remove default javaagent opts in generator.sh script to avoid javaagent 
port colission
---
 docker/images/pinot/bin/generator.sh | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docker/images/pinot/bin/generator.sh b/docker/images/pinot/bin/generator.sh
index a79afa1..1f98f29 100755
--- a/docker/images/pinot/bin/generator.sh
+++ b/docker/images/pinot/bin/generator.sh
@@ -47,7 +47,7 @@ sed -i -e "s/\"schemaName\": \"$TEMPLATE_NAME\"/\"schemaName\": \"$TABLE_NAME\"/
 sed -i -e "s/\"schemaName\": \"$TEMPLATE_NAME\"/\"schemaName\": \"$TABLE_NAME\"/g" "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_schema.json"
 
 echo "Generating data for ${TEMPLATE_NAME} in ${DATA_DIR}"
-${ADMIN_PATH} GenerateData \
+JAVA_OPTS="" ${ADMIN_PATH} GenerateData \
 -numFiles 1 -numRecords 631152  -format csv \
 -schemaFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_schema.json" \
 -schemaAnnotationFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_generator.json" \
@@ -60,7 +60,7 @@ if [ ! -d "${DATA_DIR}" ]; then
 fi
 
 echo "Creating segment for ${TEMPLATE_NAME} in ${SEGMENT_DIR}"
-${ADMIN_PATH} CreateSegment \
+JAVA_OPTS="" ${ADMIN_PATH} CreateSegment \
 -format csv \
 -tableConfigFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_config.json" \
 -schemaFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_schema.json" \
@@ -74,12 +74,12 @@ if [ ! -d "${SEGMENT_DIR}" ]; then
 fi
 
 echo "Adding table ${TABLE_NAME} from template ${TEMPLATE_NAME}"
-${ADMIN_PATH} AddTable -exec \
+JAVA_OPTS="" ${ADMIN_PATH} AddTable -exec \
 -tableConfigFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_config.json" \
 -schemaFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_schema.json" || exit 1
 
 echo "Uploading segment for ${TEMPLATE_NAME}"
-${ADMIN_PATH} UploadSegment \
+JAVA_OPTS="" ${ADMIN_PATH} UploadSegment \
 -tableName "${TABLE_NAME}" \
 -segmentDir "${SEGMENT_DIR}"
 





[incubator-pinot] branch master updated (73f0459 -> b65fe43)

2020-09-21 Thread jihao
This is an automated email from the ASF dual-hosted git repository.

jihao pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 73f0459  Add Broker Reduce Time Log (#6033)
 add b65fe43  [TE] fix labeler config mapping and timeout when fetching 
anomalies (#6036)

No new revisions were added by this update.

Summary of changes:
 .../cache/builder/AnomaliesCacheBuilder.java   |  9 --
 .../components/ThresholdSeverityLabeler.java   | 33 --
 .../spec/SeverityThresholdLabelerSpec.java | 22 ---
 .../detection/wrapper/AnomalyLabelerWrapper.java   |  7 +
 .../alert/commons/TestAnomalyFeedFactory.java  |  2 +-
 .../components/ThresholdSeverityLabelerTest.java   | 29 +--
 .../email/filter/TestPrecisionRecallEvaluator.java |  2 +-
 .../detector/email/filter/TestUserReportUtils.java |  2 +-
 .../TestAlertContentFormatterFactory.java  |  2 +-
 thirdeye/thirdeye-spi/pom.xml  |  6 
 10 files changed, 72 insertions(+), 42 deletions(-)





[GitHub] [incubator-pinot] jihaozh merged pull request #6036: [TE] fix labeler config mapping and timeout when fetching anomalies

2020-09-21 Thread GitBox


jihaozh merged pull request #6036:
URL: https://github.com/apache/incubator-pinot/pull/6036


   






[GitHub] [incubator-pinot] kishoreg commented on pull request #6037: Add list of allowed tables for emitting table level metrics

2020-09-21 Thread GitBox


kishoreg commented on pull request #6037:
URL: https://github.com/apache/incubator-pinot/pull/6037#issuecomment-696321992


   > > Aren’t we trying to address the limitations of monitoring systems in 
Pinot?
   > 
   > I'm not aware of that. Could you elaborate a bit more? What exactly is the 
plan, and what is the timeline for it?
   > The solution presented here is for a large cluster at LinkedIn for 
which table-level metrics are disabled for all tables. The current situation is 
risky because we don't get alerted for some high-priority tables. This PR 
immediately alleviates the existing issue.
   
   Sorry, I was referring to the changes in this PR. Ideally, we should be 
logging metrics for all tables/resources. It's up to the operators to set 
alerts on the tables that are important for the business.
   
   By adding this new config, we are adding a workaround in Pinot to overcome 
a limitation of the metrics system, which cannot handle thousands of tables.
   
   






[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #6040: Remove the partition info from the consuming segment ZK metadata

2020-09-21 Thread GitBox


Jackie-Jiang opened a new pull request #6040:
URL: https://github.com/apache/incubator-pinot/pull/6040


   ## Description
   For issue: #6029 
   Do not persist the partition info to the consuming segment ZK metadata, 
because it might not reflect the correct segment partitioning when the data is 
not partitioned correctly in the stream or the partitions change during 
consumption. Persisting the partition info could cause segments to be 
mis-pruned and thus produce wrong results.
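
The broker-side handling this implies can be sketched as a three-way result: no answer yet (record missing or segment still consuming, so try again on the next refresh), a sentinel for "no usable partition metadata", or the metadata itself. The code below is a minimal, self-contained sketch under those assumptions and is not the actual `PartitionSegmentPruner`; the class name, the map-based record, and the field keys `status` and `partitionMetadata` are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for a broker-side partition info cache.
class PartitionInfoCacheSketch {
  // Sentinel: the segment has no usable partition metadata, so it is never pruned.
  static final String INVALID_PARTITION_INFO = "INVALID";

  private final Map<String, String> _partitionInfoMap = new HashMap<>();

  // Returns null when the record is missing or the segment is still consuming,
  // the sentinel when partition metadata is absent, else the metadata itself.
  // The keys "status" and "partitionMetadata" are illustrative field names.
  static String extractPartitionInfo(Map<String, String> record) {
    if (record == null) {
      return null;
    }
    if ("IN_PROGRESS".equals(record.get("status"))) {
      return null;
    }
    String partitionMetadata = record.get("partitionMetadata");
    return partitionMetadata != null ? partitionMetadata : INVALID_PARTITION_INFO;
  }

  // Caches usable results and drops any stale entry when the lookup returns null.
  void refreshSegment(String segment, Map<String, String> record) {
    String partitionInfo = extractPartitionInfo(record);
    if (partitionInfo != null) {
      _partitionInfoMap.put(segment, partitionInfo);
    } else {
      _partitionInfoMap.remove(segment);
    }
  }

  // A segment is a pruning candidate only when real partition info is cached.
  boolean isPrunable(String segment) {
    String partitionInfo = _partitionInfoMap.get(segment);
    return partitionInfo != null && !INVALID_PARTITION_INFO.equals(partitionInfo);
  }

  public static void main(String[] args) {
    PartitionInfoCacheSketch cache = new PartitionInfoCacheSketch();
    Map<String, String> consuming = new HashMap<>();
    consuming.put("status", "IN_PROGRESS");
    consuming.put("partitionMetadata", "partition-0");
    cache.refreshSegment("seg_consuming", consuming);
    System.out.println(cache.isPrunable("seg_consuming"));
  }
}
```

In this sketch a consuming segment is never a pruning candidate, even if its record still carries partition metadata, which is the property the description above is after.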
   






[incubator-pinot] 01/01: Remove the partition info from the consuming segment ZK metadata

2020-09-21 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a commit to branch remove_consuming_partition_info
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit f83cbb569448d883250ae6d57d1070e80f5b3657
Author: Xiaotian (Jackie) Jiang 
AuthorDate: Mon Sep 21 12:10:37 2020 -0700

Remove the partition info from the consuming segment ZK metadata
---
 .../segmentpruner/PartitionSegmentPruner.java  |  32 +++-
 .../realtime/PinotLLCRealtimeSegmentManager.java   |  28 +--
 ...PartitionLLCRealtimeClusterIntegrationTest.java | 199 +
 3 files changed, 192 insertions(+), 67 deletions(-)

diff --git a/pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java b/pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
index 8320b30..181bad9 100644
--- a/pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
+++ b/pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
@@ -32,7 +32,7 @@ import org.apache.pinot.common.metadata.ZKMetadataProvider;
 import org.apache.pinot.common.metadata.segment.ColumnPartitionMetadata;
 import org.apache.pinot.common.metadata.segment.SegmentPartitionMetadata;
 import org.apache.pinot.common.request.BrokerRequest;
-import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.CommonConstants.Segment;
 import org.apache.pinot.common.utils.request.FilterQueryTree;
 import org.apache.pinot.common.utils.request.RequestUtils;
 import org.apache.pinot.core.data.partition.PartitionFunction;
@@ -76,17 +76,32 @@ public class PartitionSegmentPruner implements SegmentPruner {
 List<ZNRecord> znRecords = _propertyStore.get(segmentZKMetadataPaths, null, AccessOption.PERSISTENT);
 for (int i = 0; i < numSegments; i++) {
   String segment = segments.get(i);
-  _partitionInfoMap.put(segment, extractPartitionInfoFromSegmentZKMetadataZNRecord(segment, znRecords.get(i)));
+  PartitionInfo partitionInfo = extractPartitionInfoFromSegmentZKMetadataZNRecord(segment, znRecords.get(i));
+  if (partitionInfo != null) {
+_partitionInfoMap.put(segment, partitionInfo);
+  }
 }
   }
 
+  /**
+   * NOTE: Returns {@code null} when the ZNRecord is missing (could be transient Helix issue), or the segment is a
+   *   consuming segment so that we can retry later. Returns {@link #INVALID_PARTITION_INFO} when the segment does
+   *   not have valid partition metadata in its ZK metadata, in which case we won't retry later.
+   */
+  @Nullable
   private PartitionInfo extractPartitionInfoFromSegmentZKMetadataZNRecord(String segment, @Nullable ZNRecord znRecord) {
 if (znRecord == null) {
   LOGGER.warn("Failed to find segment ZK metadata for segment: {}, table: {}", segment, _tableNameWithType);
-  return INVALID_PARTITION_INFO;
+  return null;
 }
 
-String partitionMetadataJson = znRecord.getSimpleField(CommonConstants.Segment.PARTITION_METADATA);
+// Skip processing the partition metadata for the consuming segment because the partition metadata is updated when
+// the consuming segment is committed
+if (Segment.Realtime.Status.IN_PROGRESS.name().equals(znRecord.getSimpleField(Segment.Realtime.STATUS))) {
+  return null;
+}
+
+String partitionMetadataJson = znRecord.getSimpleField(Segment.PARTITION_METADATA);
 if (partitionMetadataJson == null) {
   LOGGER.warn("Failed to find segment partition metadata for segment: {}, table: {}", segment, _tableNameWithType);
   return INVALID_PARTITION_INFO;
@@ -127,8 +142,13 @@ public class PartitionSegmentPruner implements SegmentPruner {
 
   @Override
   public synchronized void refreshSegment(String segment) {
-_partitionInfoMap.put(segment, extractPartitionInfoFromSegmentZKMetadataZNRecord(segment,
-_propertyStore.get(_segmentZKMetadataPathPrefix + segment, null, AccessOption.PERSISTENT)));
+PartitionInfo partitionInfo = extractPartitionInfoFromSegmentZKMetadataZNRecord(segment,
+_propertyStore.get(_segmentZKMetadataPathPrefix + segment, null, AccessOption.PERSISTENT));
+if (partitionInfo != null) {
+  _partitionInfoMap.put(segment, partitionInfo);
+} else {
+  _partitionInfoMap.remove(segment);
+}
   }
 
   @Override
diff --git a/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java b/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
index 43ea74c..b85bdb6 100644
--- a/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
+++ b/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
@@ -27,7 

[incubator-pinot] branch remove_consuming_partition_info updated (200bef6 -> f83cbb5)

2020-09-21 Thread jackie
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a change to branch remove_consuming_partition_info
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


 discard 200bef6  Remove the partition info from the consuming segment ZK 
metadata
 discard fcf2b7a  Handle the partitioning mismatch between table config and 
stream
 add d9aec17  Improve the realtime time creation unit test (#6032)
 add 5548e79  Table indexing config validation (#6017)
 add 8511410  Publish helm package pinot 0.2.1 (#6034)
 add 0dbe06d  Publish helm repo with new index (#6035)
 add fe047fd  Support streaming query in QueryExecutor (#6027)
 add 919f407  Handle the partitioning mismatch between table config and 
stream (#6031)
 add 73f0459  Add Broker Reduce Time Log (#6033)
 new f83cbb5  Remove the partition info from the consuming segment ZK 
metadata

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (200bef6)
\
 N -- N -- N   refs/heads/remove_consuming_partition_info (f83cbb5)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 docs/README.md => kubernetes/helm/README-dev.md|  25 +-
 kubernetes/helm/index.yaml |  34 ++-
 kubernetes/helm/pinot-0.2.1.tgz| Bin 0 -> 23883 bytes
 kubernetes/helm/pinot/Chart.yaml   |   4 +-
 .../requesthandler/BaseBrokerRequestHandler.java   |   4 +-
 .../segmentpruner/PartitionSegmentPruner.java  |  32 +-
 .../helix/core/PinotHelixResourceManager.java  |   1 -
 .../realtime/PinotLLCRealtimeSegmentManager.java   |   6 +-
 .../api/PinotTableRestletResourceTest.java |  13 +-
 .../core/query/executor/GrpcQueryExecutor.java | 327 -
 .../pinot/core/query/executor/QueryExecutor.java   |  26 +-
 .../query/executor/ServerQueryExecutorV1Impl.java  |  29 +-
 .../pinot/core/transport/grpc/GrpcQueryServer.java |  76 -
 .../apache/pinot/core/util/TableConfigUtils.java   | 114 ++-
 .../query/scheduler/PrioritySchedulerTest.java |  36 ++-
 .../pinot/core/util/TableConfigUtilsTest.java  | 154 ++
 ...PartitionLLCRealtimeClusterIntegrationTest.java | 170 ++-
 ...ulls_default_column_test_missing_columns.schema |   4 +-
 .../pinot/server/starter/ServerInstance.java   |   5 +-
 .../spi/utils/builder/TableConfigBuilder.java  |  15 +
 20 files changed, 668 insertions(+), 407 deletions(-)
 copy docs/README.md => kubernetes/helm/README-dev.md (67%)
 create mode 100644 kubernetes/helm/pinot-0.2.1.tgz
 delete mode 100644 
pinot-core/src/main/java/org/apache/pinot/core/query/executor/GrpcQueryExecutor.java





[GitHub] [incubator-pinot] icefury71 commented on issue #5942: Better Table config validation

2020-09-21 Thread GitBox


icefury71 commented on issue #5942:
URL: 
https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696260153


   Updated ticket description with all the suggestions.






[GitHub] [incubator-pinot] sajjad-moradi commented on pull request #6037: Add list of allowed tables for emitting table level metrics

2020-09-21 Thread GitBox


sajjad-moradi commented on pull request #6037:
URL: https://github.com/apache/incubator-pinot/pull/6037#issuecomment-696225516


   > Aren’t we trying to address the limitations of monitoring systems in Pinot?
   
   I'm not aware of that. Could you elaborate a bit more? What exactly is the 
plan, and what is the timeline for it?
   The solution presented here is for a large cluster at LinkedIn for 
which table-level metrics are disabled for all tables. The current situation is 
risky because we don't get alerted for some high-priority tables. This PR 
immediately alleviates the existing issue.






[GitHub] [incubator-pinot] kishoreg commented on issue #5942: Better Table config validation

2020-09-21 Thread GitBox


kishoreg commented on issue #5942:
URL: 
https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696214712


   +1 to a utility to validate, we can add a validate endpoint in the controller






[GitHub] [incubator-pinot] mcvsubbu commented on issue #5942: Better Table config validation

2020-09-21 Thread GitBox


mcvsubbu commented on issue #5942:
URL: 
https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696207582


   With all the changes proposed/added, it would be useful if a utility were 
also provided to validate an existing tableconfig. Many of the checks tighten 
the (lax) rules from before, and are valid. It would be unfortunate if an 
existing installation suddenly stopped working because code elsewhere (not just 
in the table addition path) starts assuming things about the tableconfig.






[GitHub] [incubator-pinot] icefury71 commented on issue #5942: Better Table config validation

2020-09-21 Thread GitBox


icefury71 commented on issue #5942:
URL: 
https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696164071


   One more check: Currently retention does not work if pushType is null (for 
real-time table). Check for a case where pushType is null and retention is 
configured for a table. 






[GitHub] [incubator-pinot] adriancole commented on issue #5977: Allow ServiceManager to install tables prior to listening on service ports or a healthy status

2020-09-21 Thread GitBox


adriancole commented on issue #5977:
URL: 
https://github.com/apache/incubator-pinot/issues/5977#issuecomment-696109515


   ok I added a sketch for feedback. I will work on this again next week 
https://github.com/apache/incubator-pinot/pull/6039






[GitHub] [incubator-pinot] adriancole opened a new pull request #6039: WIP: ServiceManager ADD_TABLE role

2020-09-21 Thread GitBox


adriancole opened a new pull request #6039:
URL: https://github.com/apache/incubator-pinot/pull/6039


   This is a work in progress towards #5977
   
   Besides an overall review of this sketch, this needs:
   
   [ ] how do we get a reference to the _helixResourceManager?
   [ ] do we need to block until the Pinot Broker registers with the controller 
before installing the tables?
   [ ] can we stop the controller from listening on a port until these 
tables are added?
   [ ] add a test that relative file paths work
   [ ] add a test that multiple schemas install in parallel
   [ ] add a test that the health check fails if a schema fails due to missing 
files or an inability to run addSchema or addTable
   [ ] figure out where to document this
   [ ] determine if quickStart could or should use this
   
   ## Description
   
   This would allow one or more bootstrap config files that specify one or more 
tables to add.
   
   For example, given a file `etc/pinot-backendEntityView.conf`
   ```
   pinot.service.role=ADD_TABLE
   pinot.addTable.schemaFile=./schemas/backendEntityView-schemaFile.json
   pinot.addTable.tableConfigFile=./schemas/backendEntityView-tableConfigFile.json
   ```
   and a file `etc/pinot-rawServiceView.conf`
   ```
   pinot.service.role=ADD_TABLE
   pinot.addTable.schemaFile=./schemas/rawServiceView-schemaFile.json
   pinot.addTable.tableConfigFile=./schemas/rawServiceView-tableConfigFile.json
   ```
   
   you could pass the following to `StartServiceManager`:  
`-bootstrapConfigPaths etc/pinot-controller.conf etc/pinot-broker.conf 
etc/pinot-server.conf etc/pinot-backendEntityView.conf 
etc/pinot-rawServiceView.conf`
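   
   To make the selection step concrete, here is a minimal, language-neutral 
sketch in Python (not the Java code in this PR). It assumes the bootstrap files 
are plain `key=value` properties and uses the three property keys from the 
examples above. The names `AddTableJob`, `parse_properties` and 
`collect_add_table_jobs` are hypothetical, and how relative paths get resolved 
is deliberately left open.

```python
from typing import Dict, List, NamedTuple


class AddTableJob(NamedTuple):
    """One table to install, as declared by a bootstrap config file."""
    conf_path: str
    schema_file: str
    table_config_file: str


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and '#' / '!' comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def collect_add_table_jobs(confs: Dict[str, str]) -> List[AddTableJob]:
    """Return the ADD_TABLE entries, in the order the configs were given.

    `confs` maps a config path to its text, so the sketch needs no file I/O.
    Paths are returned exactly as written in the config (assumption: the
    caller decides what relative paths are relative to).
    """
    jobs: List[AddTableJob] = []
    for path, text in confs.items():
        props = parse_properties(text)
        if props.get("pinot.service.role") != "ADD_TABLE":
            continue  # controller, broker, server, ... are handled elsewhere
        jobs.append(
            AddTableJob(
                conf_path=path,
                schema_file=props["pinot.addTable.schemaFile"],
                table_config_file=props["pinot.addTable.tableConfigFile"],
            )
        )
    return jobs
```

   A missing `schemaFile` or `tableConfigFile` key raises `KeyError` here, which 
is one possible way to fail fast on an incomplete ADD_TABLE config.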
   
   In the ideal case, the controller starts, the tables install, and only then 
does the controller begin listening. If we can't block the controller from 
listening until this is done, we can at least make sure the health check does 
not report OK.
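   
   The fallback behaviour (health stays not-OK until every table is installed) 
could look roughly like the following Python sketch. It is an illustration of 
the idea only, not Pinot code: `BootstrapHealth` and `install_tables` are 
hypothetical names, and `add_schema` / `add_table` are caller-supplied 
callables standing in for whatever the real addSchema and addTable calls turn 
out to be.

```python
from typing import Callable, Dict, Iterable, Set, Tuple


class BootstrapHealth:
    """Reports OK only once every expected table is installed without error."""

    def __init__(self, expected: Iterable[str]) -> None:
        self._pending: Set[str] = set(expected)
        self._failed: Dict[str, str] = {}

    def mark_installed(self, name: str) -> None:
        self._pending.discard(name)

    def mark_failed(self, name: str, reason: str) -> None:
        # The table stays pending, so the check can never flip to OK.
        self._failed[name] = reason

    def is_ok(self) -> bool:
        return not self._pending and not self._failed


def install_tables(
    jobs: Iterable[Tuple[str, str, str]],
    add_schema: Callable[[str], None],
    add_table: Callable[[str], None],
    health: BootstrapHealth,
) -> None:
    """Install each (name, schema_file, table_config_file) and record the result."""
    for name, schema_file, table_config_file in jobs:
        try:
            add_schema(schema_file)        # schema first, then the table
            add_table(table_config_file)
        except Exception as exc:           # e.g. missing file, rejected config
            health.mark_failed(name, str(exc))
        else:
            health.mark_installed(name)
```

   With this shape, a health endpoint would only need to consult `is_ok()`. 
A missing schema file or a failed add call keeps it reporting not-OK, which 
matches the failure case listed in the checklist above.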
   
   ## Upgrade Notes
   Does this PR prevent a zero-downtime upgrade? **No**
   Does this PR fix a zero-downtime upgrade introduced earlier? **No**
   Does this PR otherwise need attention when creating release notes?
   * [x] Yes (Please label this PR as **release-notes** and 
complete the section on Release Notes)
   
   ## Release Notes
   TODO
   
   ## Documentation
   TODO
   
   If you have introduced a new feature or configuration, please add it to the 
documentation as well.
   See 
https://docs.pinot.apache.org/developers/developers-and-contributors/update-document
   


