[GitHub] [incubator-pinot] kishoreg commented on pull request #6033: Add Broker Reduce Time Log
kishoreg commented on pull request #6033: URL: https://github.com/apache/incubator-pinot/pull/6033#issuecomment-696501753 This is the one that uncovered the issue that broker was indeed taking a long time. This will probably help at LinkedIn as well to identify the usecases that can benefit from parallel reduce in broker This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot] 01/01: [maven-release-plugin] prepare for next development iteration
This is an automated email from the ASF dual-hosted git repository.

tingchen pushed a commit to branch release-0.5.0-rc in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 7e7af116684950a8e5e86600d9c1a07bc5e0759c
Author: TING CHEN
AuthorDate: Wed Sep 2 17:49:37 2020 -0700

    [maven-release-plugin] prepare for next development iteration
---
 pinot-broker/pom.xml | 2 +-
 pinot-clients/pinot-java-client/pom.xml | 2 +-
 pinot-clients/pinot-jdbc-client/pom.xml | 2 +-
 pinot-clients/pom.xml | 2 +-
 pinot-common/pom.xml | 2 +-
 pinot-controller/pom.xml | 2 +-
 pinot-core/pom.xml | 2 +-
 pinot-distribution/pom.xml | 2 +-
 pinot-integration-tests/pom.xml | 2 +-
 pinot-minion/pom.xml | 2 +-
 pinot-perf/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-standalone/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/pom.xml | 2 +-
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml | 2 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-json/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-protobuf/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pom.xml | 2 +-
 pinot-plugins/pom.xml | 2 +-
 pinot-server/pom.xml | 2 +-
 pinot-spi/pom.xml | 2 +-
 pinot-tools/pom.xml | 2 +-
 pom.xml | 4 ++--
 44 files changed, 45 insertions(+), 45 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index e578c82..146ca13 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -24,7 +24,7 @@
 pinot
 org.apache.pinot
-0.5.0
+0.6.0-SNAPSHOT
 ..
 pinot-broker
diff --git a/pinot-clients/pinot-java-client/pom.xml b/pinot-clients/pinot-java-client/pom.xml
index a5f06df..8681625 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -24,7 +24,7 @@
 pinot-clients
 org.apache.pinot
-0.5.0
+0.6.0-SNAPSHOT
 ..
 pinot-java-client
diff --git a/pinot-clients/pinot-jdbc-client/pom.xml b/pinot-clients/pinot-jdbc-client/pom.xml
index 528822b..e729330 100644
--- a/pinot-clients/pinot-jdbc-client/pom.xml
+++ b/pinot-clients/pinot-jdbc-client/pom.xml
@@ -24,7 +24,7 @@
 pinot-clients
 org.apache.pinot
-0.5.0
+0.6.0-SNAPSHOT
 ..
 pinot-jdbc-client
diff --git a/pinot-clients/pom.xml b/pinot-clients/pom.xml
index 6a6d988..5b7c75c 100644
--- a/pinot-clients/pom.xml
+++ b/pinot-clients/pom.xml
@@ -24,7 +24,7 @@
 pinot
[incubator-pinot] branch release-0.5.0-rc created (now 7e7af11)
tingchen pushed a change to branch release-0.5.0-rc in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. at 7e7af11 [maven-release-plugin] prepare for next development iteration This branch includes the following new commits: new 7e7af11 [maven-release-plugin] prepare for next development iteration The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[incubator-pinot] branch release-0.5.0-rc created (now 67299cd)
tingchen pushed a change to branch release-0.5.0-rc in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. at 67299cd Fix built-in virtual columns for immutable segment (#6042) No new revisions were added by this update.
[GitHub] [incubator-pinot] mayankshriv edited a comment on issue #6028: Make BrokerReduceService.reduceOnDataTable Multi Threaded to increase aggregation performance
mayankshriv edited a comment on issue #6028: URL: https://github.com/apache/incubator-pinot/issues/6028#issuecomment-696510899 @mr-agrwal Could you please try the changes in this PR and provide some feedback: https://github.com/apache/incubator-pinot/pull/6044
[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #6043: Add IN_PARTITIONED_SUBQUERY support
Jackie-Jiang opened a new pull request #6043: URL: https://github.com/apache/incubator-pinot/pull/6043

## Description
Add `IN_PARTITIONED_SUBQUERY` transform function to support `IDSET` aggregation function as the subquery on the server side. Because the subquery is solved on the server side, the subquery must hit the same table as the main query, and the table must be partitioned at server level (all the segments for a partition are served by a single server).

E.g. the following 2 queries can be combined into one query:

SELECT ID_SET(col) FROM table WHERE date = 20200901
SELECT DISTINCT_COUNT(col), date FROM table WHERE IN_ID_SET(col, '') = 1 GROUP BY date
->
SELECT DISTINCT_COUNT(col), date FROM table WHERE IN_PARTITIONED_SUBQUERY(col, 'SELECT ID_SET(col) FROM table WHERE date = 20200901') = 1 GROUP BY date
[GitHub] [incubator-pinot] mayankshriv opened a new pull request #6044: Support for multi-threaded Group By reducer for SQL.
mayankshriv opened a new pull request #6044: URL: https://github.com/apache/incubator-pinot/pull/6044

The existing implementation of the Broker reduce phase is single-threaded. For group-by queries where large responses are sent back from multiple servers, this could become a bottleneck. Given that brokers are generally light on CPU usage, making the reduce phase multi-threaded would be a good idea to boost performance. This PR adds a multi-threaded implementation for the Group-By reducer for SQL.

- Added an executor service in BrokerReduceService that can be used by the reduce phase.
- In this PR, the executor service defaults to a single thread, until the performance impact can be studied under various conditions (e.g. high QPS, where brokers have higher CPU usage).
- Added a broker-side config to specify the number of threads to be used for the reduce phase: `pinot.broker.num.reduce.threads`
- For testing, the number of reduce threads is explicitly set to be > 1 to ensure functional correctness is tested.
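To make the idea in this PR concrete, here is a minimal sketch of a multi-threaded group-by merge. It is not the code from #6044: the class `ParallelGroupByReduceSketch`, the method `reduce`, and the use of a plain `Map<String, Long>` as a stand-in for one server's partial SUM result are all assumptions made for illustration. The PR description mentions an executor service held by `BrokerReduceService`; this sketch creates a pool per call only to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not Pinot's BrokerReduceService.
public class ParallelGroupByReduceSketch {

  // Merges per-server partial results (group key -> partial SUM) using
  // numReduceThreads worker threads. With numReduceThreads = 1 this
  // degenerates to the single-threaded behaviour.
  public static Map<String, Long> reduce(List<Map<String, Long>> serverResults, int numReduceThreads) {
    ExecutorService executor = Executors.newFixedThreadPool(numReduceThreads);
    try {
      // ConcurrentHashMap.merge is atomic per key, so workers can share one map.
      Map<String, Long> merged = new ConcurrentHashMap<>();
      List<Future<?>> futures = new ArrayList<>();
      for (Map<String, Long> partial : serverResults) {
        futures.add(executor.submit(() -> {
          for (Map.Entry<String, Long> entry : partial.entrySet()) {
            merged.merge(entry.getKey(), entry.getValue(), Long::sum);
          }
        }));
      }
      // Wait for every merge task before returning the combined result.
      for (Future<?> future : futures) {
        future.get();
      }
      return merged;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while reducing", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Reduce task failed", e);
    } finally {
      executor.shutdown();
    }
  }
}
```

Submitting one task per server response keeps the sketch short; whether that granularity pays off under high QPS is exactly the open question the PR leaves for later measurement.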
[GitHub] [incubator-pinot] apucher merged pull request #6041: remove default javaagent opts in generator.sh script to avoid javaagent port colission
apucher merged pull request #6041: URL: https://github.com/apache/incubator-pinot/pull/6041
[GitHub] [incubator-pinot] mayankshriv commented on pull request #6033: Add Broker Reduce Time Log
mayankshriv commented on pull request #6033: URL: https://github.com/apache/incubator-pinot/pull/6033#issuecomment-696497353 @mr-agrwal @fx19880617 We do have metrics for each phase of query in the broker side (including scatter/gather, reduce, etc). Adding just the reduce part in the log does not make a lot of sense (because then we could argue why not add other times, and we definitely want to limit log size for high throughput cases).
[GitHub] [incubator-pinot] kishoreg commented on issue #5942: Better Table config validation
kishoreg commented on issue #5942: URL: https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696214712 +1 to a utility to validate, we can add a validate endpoint in the controller
[GitHub] [incubator-pinot] Jackie-Jiang merged pull request #6042: Fix built-in virtual columns for immutable segment
Jackie-Jiang merged pull request #6042: URL: https://github.com/apache/incubator-pinot/pull/6042
[GitHub] [incubator-pinot] mcvsubbu commented on issue #5942: Better Table config validation
mcvsubbu commented on issue #5942: URL: https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696207582 With all the changes proposed/added, it will be useful if a utility is also provided to validate an existing tableconfig. Many of them are tightening the (lax) rules from before, and are valid. It will be unfortunate if an existing installation suddenly stopped working because code elsewhere (not just in the table addition path) starts assuming things about the tableconfig.
[GitHub] [incubator-pinot] adriancole commented on issue #5977: Allow ServiceManager to install tables prior to listening on service ports or a healthy status
adriancole commented on issue #5977: URL: https://github.com/apache/incubator-pinot/issues/5977#issuecomment-696109515 ok I added a sketch for feedback. I will work on this again next week https://github.com/apache/incubator-pinot/pull/6039
[GitHub] [incubator-pinot] KKcorps commented on a change in pull request #6020: Add Caching in Controller Broker API
KKcorps commented on a change in pull request #6020: URL: https://github.com/apache/incubator-pinot/pull/6020#discussion_r491796919

## File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/listener/ClusterLiveInstanceChangeListener.java ##
@@ -0,0 +1,34 @@
+package org.apache.pinot.controller.helix.core.listener;
+
+import java.util.ArrayList;
+import org.apache.helix.HelixDataAccessor;
+import org.apache.helix.NotificationContext;
+import org.apache.helix.api.listeners.LiveInstanceChangeListener;
+import org.apache.helix.model.LiveInstance;
+
+import java.util.List;
+import org.apache.helix.PropertyKey.Builder;
+
+
+public class ClusterLiveInstanceChangeListener implements LiveInstanceChangeListener {
+  private HelixDataAccessor _helixDataAccessor;
+  private Builder _keyBuilder;
+  private List<LiveInstance> _liveInstances = new ArrayList<>();
+
+  public ClusterLiveInstanceChangeListener(HelixDataAccessor helixDataAccessor, Builder keyBuilder) {
+    _helixDataAccessor = helixDataAccessor;
+    _keyBuilder = keyBuilder;
+  }
+
+  @Override
+  public void onLiveInstanceChange(List<LiveInstance> liveInstances, NotificationContext changeContext) {
+    _liveInstances = liveInstances;

Review comment: This was returning older values rather than updated values in onChange methods.

## File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/listener/ClusterInstanceConfigChangeListener.java ##
@@ -0,0 +1,32 @@
+package org.apache.pinot.controller.helix.core.listener;
+
+import org.apache.helix.HelixManager;
+import org.apache.helix.NotificationContext;
+import org.apache.helix.api.listeners.InstanceConfigChangeListener;
+import org.apache.helix.model.InstanceConfig;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+
+
+public class ClusterInstanceConfigChangeListener implements InstanceConfigChangeListener {
+    private HelixManager _helixManager;
+    private List<InstanceConfig> _instanceConfigs = new ArrayList<>();
+
+    public ClusterInstanceConfigChangeListener(HelixManager helixManager) {
+        _helixManager = helixManager;
+    }
+
+    @Override
+    public void onInstanceConfigChange(List<InstanceConfig> instanceConfigs, NotificationContext context) {
+        _instanceConfigs = instanceConfigs;

Review comment: This was returning older values rather than updated values in onChange methods.

## File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/listener/ClusterInstanceConfigChangeListener.java ##
@@ -0,0 +1,32 @@
+package org.apache.pinot.controller.helix.core.listener;
+
+import org.apache.helix.HelixManager;
+import org.apache.helix.NotificationContext;
+import org.apache.helix.api.listeners.InstanceConfigChangeListener;
+import org.apache.helix.model.InstanceConfig;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+
+
+public class ClusterInstanceConfigChangeListener implements InstanceConfigChangeListener {
+    private HelixManager _helixManager;
+    private List<InstanceConfig> _instanceConfigs = new ArrayList<>();
+
+    public ClusterInstanceConfigChangeListener(HelixManager helixManager) {
+        _helixManager = helixManager;

Review comment: Returns null or empty list

## File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/listener/ClusterInstanceConfigChangeListener.java ##
@@ -0,0 +1,32 @@
+package org.apache.pinot.controller.helix.core.listener;
+
+import org.apache.helix.HelixManager;
+import org.apache.helix.NotificationContext;
+import org.apache.helix.api.listeners.InstanceConfigChangeListener;
+import org.apache.helix.model.InstanceConfig;
+
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.pinot.common.utils.helix.HelixHelper;
+
+
+public class ClusterInstanceConfigChangeListener implements InstanceConfigChangeListener {
+    private HelixManager _helixManager;
+    private List<InstanceConfig> _instanceConfigs = new ArrayList<>();
+
+    public ClusterInstanceConfigChangeListener(HelixManager helixManager) {
+        _helixManager = helixManager;
+    }
+
+    @Override
+    public void onInstanceConfigChange(List<InstanceConfig> instanceConfigs, NotificationContext context) {
+        _instanceConfigs = instanceConfigs;
+    }
+
+    public List<InstanceConfig> getInstanceConfigs() {
+        if(_instanceConfigs.isEmpty()){

Review comment: Returns null or empty list
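The review comments above touch on two hazards of a listener-backed cache: readers seeing a stale list, and the cached field being null or empty before the first callback. The following is a minimal sketch of one way to handle both, under stated assumptions: `ListenerBackedCacheSketch`, `onChange`, and the `Supplier`-based fallback loader are hypothetical names, and no Helix types are used so the example stays self-contained. It is not the code from #6020.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration only; not the listener classes under review.
public class ListenerBackedCacheSketch<T> {
  // Stand-in for a direct read from the cluster store when no callback has arrived yet.
  private final Supplier<List<T>> _fallbackLoader;

  // volatile so a reader thread always sees the most recent list published by the callback thread.
  private volatile List<T> _cached = Collections.<T>emptyList();

  public ListenerBackedCacheSketch(Supplier<List<T>> fallbackLoader) {
    _fallbackLoader = fallbackLoader;
  }

  // Called from the change-listener callback with the latest values.
  public void onChange(List<T> latest) {
    _cached = snapshotOf(latest);
  }

  // Never returns null; falls back to a direct load while the cache is still empty.
  public List<T> get() {
    List<T> snapshot = _cached;
    if (snapshot.isEmpty()) {
      snapshot = snapshotOf(_fallbackLoader.get());
      _cached = snapshot;
    }
    return snapshot;
  }

  // Defensive, unmodifiable copy so later mutation of the caller's list cannot leak into the cache.
  private List<T> snapshotOf(List<T> values) {
    if (values == null) {
      return Collections.<T>emptyList();
    }
    return Collections.unmodifiableList(new ArrayList<>(values));
  }
}
```

Copying the list on every callback trades a small allocation for a simple guarantee: whatever `get()` returns is a consistent snapshot that no other thread will modify.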
[GitHub] [incubator-pinot] sajjad-moradi commented on pull request #6037: Add list of allowed tables for emitting table level metrics
sajjad-moradi commented on pull request #6037: URL: https://github.com/apache/incubator-pinot/pull/6037#issuecomment-696225516

> Aren’t we trying to address the limitations of monitoring systems in Pinot?

I'm not aware of that. Could you elaborate a bit more? What exactly is the plan, and what is the timeline for it? The solution presented here is for a large cluster at LinkedIn for which table-level metrics are disabled for all tables. The current situation is risky because for some high-priority tables we don't get alerted. This PR immediately alleviates the existing issue.
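As a rough illustration of the allow-list idea discussed in this thread, the sketch below gates table-level metric names on a set of allowed tables while everything else falls back to a cluster-wide metric name. All names here (`TableMetricAllowListSketch`, `metricName`, the dotted naming scheme) are assumptions for the example and are not taken from #6037.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the implementation in the PR.
public class TableMetricAllowListSketch {
  private final boolean _tableLevelMetricsEnabled;
  private final Set<String> _allowedTables;

  public TableMetricAllowListSketch(boolean tableLevelMetricsEnabled, Set<String> allowedTables) {
    _tableLevelMetricsEnabled = tableLevelMetricsEnabled;
    _allowedTables = Collections.unmodifiableSet(new HashSet<>(allowedTables));
  }

  // Returns a per-table metric name when table-level metrics are enabled globally
  // or the table is explicitly allowed; otherwise returns the cluster-wide name.
  public String metricName(String tableName, String metric) {
    if (_tableLevelMetricsEnabled || _allowedTables.contains(tableName)) {
      return tableName + "." + metric;
    }
    return metric;
  }
}
```

With table-level metrics disabled and only a few high-priority tables on the list, the metrics system sees a bounded number of per-table series, which is the trade-off debated in the comments.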
[GitHub] [incubator-pinot] jihaozh merged pull request #6026: [TE] Creating a thirdeye-dashboard module to host the dashboard server
jihaozh merged pull request #6026: URL: https://github.com/apache/incubator-pinot/pull/6026
[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #6040: Remove the partition info from the consuming segment ZK metadata
Jackie-Jiang commented on pull request #6040: URL: https://github.com/apache/incubator-pinot/pull/6040#issuecomment-696415308 @mcvsubbu The segment partition pruning for consuming segment will happen on server side instead of broker side. Yes there will be some overhead, and we need to measure the impact with a high QPS use case.
[GitHub] [incubator-pinot] jihaozh merged pull request #6036: [TE] fix labeler config mapping and timeout when fetching anomalies
jihaozh merged pull request #6036: URL: https://github.com/apache/incubator-pinot/pull/6036
[GitHub] [incubator-pinot] kishoreg commented on pull request #6037: Add list of allowed tables for emitting table level metrics
kishoreg commented on pull request #6037: URL: https://github.com/apache/incubator-pinot/pull/6037#issuecomment-696321992

> > Aren’t we trying to address the limitations of monitoring systems in Pinot?
>
> I'm not aware of that. Could you elaborate a bit more? What exactly is the plan, and what is the timeline for it?
> The solution presented here is for a large cluster at LinkedIn for which table-level metrics are disabled for all tables. The current situation is risky because for some high-priority tables we don't get alerted. This PR immediately alleviates the existing issue.

Sorry, I was referring to the changes in this PR. Ideally, we should be logging metrics for all tables/resources. It's up to the operators to set alerts on the right tables that are important for the business. By adding this new config, we are adding a workaround in Pinot to overcome the limitation of the metrics system, which cannot handle thousands of tables.
[GitHub] [incubator-pinot] icefury71 commented on issue #5942: Better Table config validation
icefury71 commented on issue #5942: URL: https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696164071
[incubator-pinot] branch master updated: Fix built-in virtual columns for immutable segment (#6042)
jackie pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

The following commit(s) were added to refs/heads/master by this push:
     new 67299cd Fix built-in virtual columns for immutable segment (#6042)
67299cd is described below

commit 67299cde563b8a9aaecea02b2cc989e234bffcaf
Author: Xiaotian (Jackie) Jiang <1751+jackie-ji...@users.noreply.github.com>
AuthorDate: Mon Sep 21 17:42:08 2020 -0700

    Fix built-in virtual columns for immutable segment (#6042)

    Fix the bug in `ImmutableSegmentLoader` where built-in virtual columns are not added to the schema in the segment metadata, which will cause wrong result when explicitly querying the built-in virtual columns (e.g. `SELECT $docID FROM myTable`).
---
 .../immutable/ImmutableSegmentLoader.java | 11 +--
 .../core/segment/index/loader/LoaderTest.java | 80 +-
 2 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/immutable/ImmutableSegmentLoader.java b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/immutable/ImmutableSegmentLoader.java
index 6ddaaa5..b7b1fe0 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/immutable/ImmutableSegmentLoader.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/immutable/ImmutableSegmentLoader.java
@@ -112,15 +112,10 @@ public class ImmutableSegmentLoader {
           new PhysicalColumnIndexContainer(segmentReader, entry.getValue(), indexLoadingConfig, indexDir));
     }
 
-    if (schema == null) {
-      schema = segmentMetadata.getSchema();
-    }
-
-    // Ensure that the schema has the virtual columns added
-    VirtualColumnProviderFactory.addBuiltInVirtualColumnsToSegmentSchema(schema, segmentName);
-
     // Instantiate virtual columns
-    for (FieldSpec fieldSpec : schema.getAllFieldSpecs()) {
+    Schema segmentSchema = segmentMetadata.getSchema();
+    VirtualColumnProviderFactory.addBuiltInVirtualColumnsToSegmentSchema(segmentSchema, segmentName);
+    for (FieldSpec fieldSpec : segmentSchema.getAllFieldSpecs()) {
       if (fieldSpec.isVirtualColumn()) {
         String columnName = fieldSpec.getName();
         VirtualColumnContext context = new VirtualColumnContext(fieldSpec, segmentMetadata.getTotalDocs());
diff --git a/pinot-core/src/test/java/org/apache/pinot/core/segment/index/loader/LoaderTest.java b/pinot-core/src/test/java/org/apache/pinot/core/segment/index/loader/LoaderTest.java
index 6fbd11a..32a4c0a 100644
--- a/pinot-core/src/test/java/org/apache/pinot/core/segment/index/loader/LoaderTest.java
+++ b/pinot-core/src/test/java/org/apache/pinot/core/segment/index/loader/LoaderTest.java
@@ -25,6 +25,7 @@ import java.util.HashSet;
 import java.util.List;
 import org.apache.commons.io.FileUtils;
 import org.apache.pinot.common.segment.ReadMode;
+import org.apache.pinot.common.utils.CommonConstants.Segment.BuiltInVirtualColumn;
 import org.apache.pinot.common.utils.TarGzCompressionUtils;
 import org.apache.pinot.core.indexsegment.IndexSegment;
 import org.apache.pinot.core.indexsegment.generator.SegmentGeneratorConfig;
@@ -151,6 +152,28 @@ public class LoaderTest {
   }
 
   @Test
+  public void testBuiltInVirtualColumns()
+      throws Exception {
+    Schema schema = constructV1Segment();
+
+    IndexSegment indexSegment = ImmutableSegmentLoader.load(_indexDir, _v1IndexLoadingConfig, schema);
+    testBuiltInVirtualColumns(indexSegment);
+    indexSegment.destroy();
+
+    indexSegment = ImmutableSegmentLoader.load(_indexDir, _v1IndexLoadingConfig, null);
+    testBuiltInVirtualColumns(indexSegment);
+    indexSegment.destroy();
+  }
+
+  private void testBuiltInVirtualColumns(IndexSegment indexSegment) {
+    Assert.assertTrue(indexSegment.getColumnNames().containsAll(
+        Arrays.asList(BuiltInVirtualColumn.DOCID, BuiltInVirtualColumn.HOSTNAME, BuiltInVirtualColumn.SEGMENTNAME)));
+    Assert.assertNotNull(indexSegment.getDataSource(BuiltInVirtualColumn.DOCID));
+    Assert.assertNotNull(indexSegment.getDataSource(BuiltInVirtualColumn.HOSTNAME));
+    Assert.assertNotNull(indexSegment.getDataSource(BuiltInVirtualColumn.SEGMENTNAME));
+  }
+
+  @Test
   public void testPadding()
       throws Exception {
     // Old Format
@@ -270,7 +293,8 @@ public class LoaderTest {
   }
 
   @Test
-  public void testTextIndexLoad() throws Exception {
+  public void testTextIndexLoad()
+      throws Exception {
     // Tests for scenarios by creating on-disk segment in V3 and then loading
     // the segment with and without specifying segmentVersion in IndexLoadingConfig
 
@@ -288,7 +312,8 @@ public class LoaderTest {
     File textIndexFile = SegmentDirectoryPaths.findTextIndexIndexFile(_indexDir, TEXT_INDEX_COL_NAME);
     Assert.assertNotNull(textIndexFile);
[incubator-pinot] branch master updated (274b4c2 -> b5e67c9)
This is an automated email from the ASF dual-hosted git repository. jihao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. from 274b4c2 remove default javaagent opts in generator.sh script to avoid javaagent port colission (#6041) add b5e67c9 [TE] Creating a thirdeye-dashboard module to host the dashboard server (#6026) No new revisions were added by this update. Summary of changes: thirdeye/pom.xml |1 + thirdeye/run-backend.sh|4 +- thirdeye/run-frontend.sh |4 +- .../config/dashboard.yml |0 .../config/data-sources/cache-config.yml |0 .../config/data-sources/data-sources-config.yml|0 .../config/data/README.md |0 .../config/data/daily.csv |0 .../config/data/hourly.csv |0 .../config/data/pageviews.csv |0 .../anomaly-functions/alertFilter.properties |0 .../alertFilterAutotune.properties |0 .../anomaly-functions/functions.properties |0 .../config/detector.yml|0 .../config/h2db.mv.db | Bin 2490368 -> 2498560 bytes .../config/persistence.yml |0 .../config/rca.yml |0 thirdeye/thirdeye-dashboard/pom.xml| 113 ++ .../api/application/ApplicationResource.java |0 .../api/detection/AnomalyDetectionResource.java|0 .../api/user/dashboard/UserDashboardResource.java |0 .../dashboard/DetectionPreviewConfiguration.java |0 .../thirdeye/dashboard/DetectorHttpUtils.java |0 .../thirdeye/dashboard/HandlebarsHelperBundle.java |0 .../thirdeye/dashboard/HandlebarsViewRenderer.java |0 .../pinot/thirdeye/dashboard/HelperBundle.java |0 .../thirdeye/dashboard/RootCauseConfiguration.java |0 .../dashboard/RootCauseResourceProvider.java |0 .../dashboard/ThirdEyeDashboardApplication.java|0 .../dashboard/ThirdEyeDashboardConfiguration.java |0 .../dashboard/ThirdEyeDashboardModule.java |0 .../apache/pinot/thirdeye/dashboard/ViewType.java |0 .../dashboard/configs/AuthConfiguration.java |0 .../dashboard/configs/ResourceConfiguration.java |0 .../dashboard/resources/AdminResource.java |0 .../resources/AnomalyFlattenResource.java |0 
.../dashboard/resources/AnomalyResource.java |0 .../dashboard/resources/AutoOnboardResource.java |0 .../resources/BadRequestWebException.java |0 .../dashboard/resources/CacheResource.java |0 .../resources/CustomizedEventResource.java |0 .../dashboard/resources/DashboardResource.java |0 .../dashboard/resources/DatasetConfigResource.java |0 .../dashboard/resources/EntityManagerResource.java |0 .../dashboard/resources/EntityMappingResource.java |0 .../dashboard/resources/MetricConfigResource.java |0 .../resources/OnboardDatasetMetricResource.java|0 .../dashboard/resources/ResourceUtils.java |0 .../thirdeye/dashboard/resources/RootResource.java |0 .../dashboard/resources/SummaryResource.java |0 .../dashboard/resources/ThirdEyeResource.java |0 .../dashboard/resources/v2/AnomaliesResource.java | 46 +- .../dashboard/resources/v2/AuthResource.java |2 +- .../dashboard/resources/v2/ConfigResource.java |0 .../dashboard/resources/v2/DataResource.java |3 +- .../resources/v2/DetectionAlertResource.java |0 .../dashboard/resources/v2/ResourceUtils.java |0 .../resources/v2/RootCauseEntityFormatter.java |0 .../v2/RootCauseEventEntityFormatter.java |0 .../resources/v2/RootCauseMetricResource.java |0 .../dashboard/resources/v2/RootCauseResource.java |0 .../resources/v2/RootCauseSessionResource.java |0 .../resources/v2/RootCauseTemplateResource.java|0 .../resources/v2/alerts/AlertResource.java |0 .../resources/v2/alerts/AlertSearchFilter.java |0 .../resources/v2/alerts/AlertSearcher.java |0 .../v2/anomalies/AnomalySearchFilter.java |0 .../v2/anomalies/AnomalySearchResource.java|0 .../resources/v2/anomalies/AnomalySearcher.java|0 .../resources/v2/pojo/AnomaliesSummary.java|0 .../resources/v2/pojo/AnomaliesWrapper.java|0 .../v2/pojo/AnomalyClassificationType.java |0 .../resources/v2/pojo/AnomalyDetails.java |0 .../resources/v2/pojo/AnomalySummary.java |0 .../dashboard/resources/v2/pojo/MetricSummary.java |0 .../resources/v2/pojo/RootCauseEntity.java |0
[GitHub] [incubator-pinot] jihaozh merged pull request #6026: [TE] Creating a thirdeye-dashboard module to host the dashboard server
jihaozh merged pull request #6026: URL: https://github.com/apache/incubator-pinot/pull/6026 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org
[incubator-pinot] branch master updated (b65fe43 -> 274b4c2)
This is an automated email from the ASF dual-hosted git repository. apucher pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. from b65fe43 [TE] fix labeler config mapping and timeout when fetching anomalies (#6036) add 274b4c2 remove default javaagent opts in generator.sh script to avoid javaagent port colission (#6041) No new revisions were added by this update. Summary of changes: docker/images/pinot/bin/generator.sh | 8 1 file changed, 4 insertions(+), 4 deletions(-)
[GitHub] [incubator-pinot] apucher merged pull request #6041: remove default javaagent opts in generator.sh script to avoid javaagent port colission
apucher merged pull request #6041: URL: https://github.com/apache/incubator-pinot/pull/6041
[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #6042: Fix built-in virtual columns for immutable segment
Jackie-Jiang opened a new pull request #6042: URL: https://github.com/apache/incubator-pinot/pull/6042 ## Description Fix the bug in `ImmutableSegmentLoader` where built-in virtual columns are not added to the schema in the segment metadata, which causes wrong results when the built-in virtual columns are queried explicitly (e.g. `SELECT $docID FROM myTable`).
[GitHub] [incubator-pinot] Jackie-Jiang commented on pull request #6040: Remove the partition info from the consuming segment ZK metadata
Jackie-Jiang commented on pull request #6040: URL: https://github.com/apache/incubator-pinot/pull/6040#issuecomment-696415308 @mcvsubbu The segment partition pruning for consuming segments will happen on the server side instead of the broker side. Yes, there will be some overhead, and we need to measure the impact with a high-QPS use case.
[GitHub] [incubator-pinot] apucher opened a new pull request #6041: remove default javaagent opts in generator.sh script to avoid javaagent port colission
apucher opened a new pull request #6041: URL: https://github.com/apache/incubator-pinot/pull/6041 ## Description This bugfix removes the default javaagent opts in generator.sh script (**only**) to avoid javaagent port collisions when running pinot-admin from a container that already has an active controller, broker, or server. ## Upgrade Notes Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion) No Does this PR fix a zero-downtime upgrade introduced earlier? No Does this PR otherwise need attention when creating release notes? Things to consider: No
[incubator-pinot] branch data-generator-javaagent-fix-20200921 created (now 918bb82)
This is an automated email from the ASF dual-hosted git repository. apucher pushed a change to branch data-generator-javaagent-fix-20200921 in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. at 918bb82 remove default javaagent opts in generator.sh script to avoid javaagent port colission This branch includes the following new commits: new 918bb82 remove default javaagent opts in generator.sh script to avoid javaagent port colission The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.
[incubator-pinot] 01/01: remove default javaagent opts in generator.sh script to avoid javaagent port colission
This is an automated email from the ASF dual-hosted git repository.

apucher pushed a commit to branch data-generator-javaagent-fix-20200921
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 918bb822bac9bff736c63088fb242d5c36fc8496
Author: Alexander Pucher
AuthorDate: Mon Sep 21 15:09:06 2020 -0700

    remove default javaagent opts in generator.sh script to avoid javaagent port colission
---
 docker/images/pinot/bin/generator.sh | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/docker/images/pinot/bin/generator.sh b/docker/images/pinot/bin/generator.sh
index a79afa1..1f98f29 100755
--- a/docker/images/pinot/bin/generator.sh
+++ b/docker/images/pinot/bin/generator.sh
@@ -47,7 +47,7 @@ sed -i -e "s/\"schemaName\": \"$TEMPLATE_NAME\"/\"schemaName\": \"$TABLE_NAME\"/
 sed -i -e "s/\"schemaName\": \"$TEMPLATE_NAME\"/\"schemaName\": \"$TABLE_NAME\"/g" "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_schema.json"
 
 echo "Generating data for ${TEMPLATE_NAME} in ${DATA_DIR}"
-${ADMIN_PATH} GenerateData \
+JAVA_OPTS="" ${ADMIN_PATH} GenerateData \
   -numFiles 1 -numRecords 631152 -format csv \
   -schemaFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_schema.json" \
   -schemaAnnotationFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_generator.json" \
@@ -60,7 +60,7 @@ if [ ! -d "${DATA_DIR}" ]; then
 fi
 
 echo "Creating segment for ${TEMPLATE_NAME} in ${SEGMENT_DIR}"
-${ADMIN_PATH} CreateSegment \
+JAVA_OPTS="" ${ADMIN_PATH} CreateSegment \
   -format csv \
   -tableConfigFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_config.json" \
   -schemaFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_schema.json" \
@@ -74,12 +74,12 @@ if [ ! -d "${SEGMENT_DIR}" ]; then
 fi
 
 echo "Adding table ${TABLE_NAME} from template ${TEMPLATE_NAME}"
-${ADMIN_PATH} AddTable -exec \
+JAVA_OPTS="" ${ADMIN_PATH} AddTable -exec \
   -tableConfigFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_config.json" \
   -schemaFile "${TEMPLATE_BASEDIR}/${TEMPLATE_NAME}_schema.json" || exit 1
 
 echo "Uploading segment for ${TEMPLATE_NAME}"
-${ADMIN_PATH} UploadSegment \
+JAVA_OPTS="" ${ADMIN_PATH} UploadSegment \
   -tableName "${TABLE_NAME}" \
   -segmentDir "${SEGMENT_DIR}"
[incubator-pinot] branch master updated (73f0459 -> b65fe43)
This is an automated email from the ASF dual-hosted git repository. jihao pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. from 73f0459 Add Broker Reduce Time Log (#6033) add b65fe43 [TE] fix labeler config mapping and timeout when fetching anomalies (#6036) No new revisions were added by this update. Summary of changes: .../cache/builder/AnomaliesCacheBuilder.java | 9 -- .../components/ThresholdSeverityLabeler.java | 33 -- .../spec/SeverityThresholdLabelerSpec.java | 22 --- .../detection/wrapper/AnomalyLabelerWrapper.java | 7 + .../alert/commons/TestAnomalyFeedFactory.java | 2 +- .../components/ThresholdSeverityLabelerTest.java | 29 +-- .../email/filter/TestPrecisionRecallEvaluator.java | 2 +- .../detector/email/filter/TestUserReportUtils.java | 2 +- .../TestAlertContentFormatterFactory.java | 2 +- thirdeye/thirdeye-spi/pom.xml | 6 10 files changed, 72 insertions(+), 42 deletions(-)
[GitHub] [incubator-pinot] jihaozh merged pull request #6036: [TE] fix labeler config mapping and timeout when fetching anomalies
jihaozh merged pull request #6036: URL: https://github.com/apache/incubator-pinot/pull/6036
[GitHub] [incubator-pinot] kishoreg commented on pull request #6037: Add list of allowed tables for emitting table level metrics
kishoreg commented on pull request #6037: URL: https://github.com/apache/incubator-pinot/pull/6037#issuecomment-696321992

> > Aren’t we trying to address the limitations of monitoring systems in Pinot?
>
> I'm not aware of that. Could you elaborate a bit more? What exactly is the plan, and what is the timeline for it?
> The solution presented here is for a large cluster at LinkedIn for which table-level metrics are disabled for all tables. The current situation is risky because we don't get alerted for some high-priority tables. This PR immediately alleviates the existing issue.

Sorry, I was referring to the changes in this PR. Ideally, we should be logging metrics for all tables/resources. It's up to the operators to set alerts on the tables that are important for the business. By adding this new config, we are adding a workaround in Pinot to overcome a limitation of the metrics system, which cannot handle thousands of tables.
[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #6040: Remove the partition info from the consuming segment ZK metadata
Jackie-Jiang opened a new pull request #6040: URL: https://github.com/apache/incubator-pinot/pull/6040 ## Description For issue: #6029 Do not persist the partition info to the consuming segment ZK metadata, because it might not reflect the correct segment partitioning when the data is not partitioned correctly in the stream, or when the partitions change during consumption. Persisting the partition info could cause segments to be mis-pruned and thus produce wrong results.
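To illustrate the description above, here is a minimal sketch of a broker-side pruner that keeps no partition entry for a consuming segment, so that segment is never pruned and is always queried. This is not the actual `PartitionSegmentPruner` code: the class name, the use of a plain integer partition id, and both method signatures are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PartitionPruneSketch {
  // Hypothetical stand-in for the broker-side cache: segment name -> partition id from committed ZK metadata.
  private final Map<String, Integer> partitionBySegment = new HashMap<>();

  /** Stores partition info for a segment; a null id (e.g. a consuming segment) removes any stale entry. */
  public void refreshSegment(String segment, Integer partitionId) {
    if (partitionId != null) {
      partitionBySegment.put(segment, partitionId);
    } else {
      partitionBySegment.remove(segment);
    }
  }

  /** Keeps a segment when its partition is unknown or matches the queried partition. */
  public List<String> prune(List<String> segments, int queriedPartition) {
    List<String> selected = new ArrayList<>();
    for (String segment : segments) {
      Integer partitionId = partitionBySegment.get(segment);
      if (partitionId == null || partitionId == queriedPartition) {
        selected.add(segment);
      }
    }
    return selected;
  }
}
```

With this shape, a segment whose partition info is absent costs an extra server-side lookup but can never be dropped from a query by mistake.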
[incubator-pinot] 01/01: Remove the partition info from the consuming segment ZK metadata
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a commit to branch remove_consuming_partition_info
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit f83cbb569448d883250ae6d57d1070e80f5b3657
Author: Xiaotian (Jackie) Jiang
AuthorDate: Mon Sep 21 12:10:37 2020 -0700

    Remove the partition info from the consuming segment ZK metadata
---
 .../segmentpruner/PartitionSegmentPruner.java | 32 +++-
 .../realtime/PinotLLCRealtimeSegmentManager.java | 28 +--
 ...PartitionLLCRealtimeClusterIntegrationTest.java | 199 +
 3 files changed, 192 insertions(+), 67 deletions(-)

diff --git a/pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java b/pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
index 8320b30..181bad9 100644
--- a/pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
+++ b/pinot-broker/src/main/java/org/apache/pinot/broker/routing/segmentpruner/PartitionSegmentPruner.java
@@ -32,7 +32,7 @@ import org.apache.pinot.common.metadata.ZKMetadataProvider;
 import org.apache.pinot.common.metadata.segment.ColumnPartitionMetadata;
 import org.apache.pinot.common.metadata.segment.SegmentPartitionMetadata;
 import org.apache.pinot.common.request.BrokerRequest;
-import org.apache.pinot.common.utils.CommonConstants;
+import org.apache.pinot.common.utils.CommonConstants.Segment;
 import org.apache.pinot.common.utils.request.FilterQueryTree;
 import org.apache.pinot.common.utils.request.RequestUtils;
 import org.apache.pinot.core.data.partition.PartitionFunction;
@@ -76,17 +76,32 @@ public class PartitionSegmentPruner implements SegmentPruner {
     List<ZNRecord> znRecords = _propertyStore.get(segmentZKMetadataPaths, null, AccessOption.PERSISTENT);
     for (int i = 0; i < numSegments; i++) {
       String segment = segments.get(i);
-      _partitionInfoMap.put(segment, extractPartitionInfoFromSegmentZKMetadataZNRecord(segment, znRecords.get(i)));
+      PartitionInfo partitionInfo = extractPartitionInfoFromSegmentZKMetadataZNRecord(segment, znRecords.get(i));
+      if (partitionInfo != null) {
+        _partitionInfoMap.put(segment, partitionInfo);
+      }
     }
   }
 
+  /**
+   * NOTE: Returns {@code null} when the ZNRecord is missing (could be transient Helix issue), or the segment is a
+   *       consuming segment so that we can retry later. Returns {@link #INVALID_PARTITION_INFO} when the segment does
+   *       not have valid partition metadata in its ZK metadata, in which case we won't retry later.
+   */
+  @Nullable
   private PartitionInfo extractPartitionInfoFromSegmentZKMetadataZNRecord(String segment, @Nullable ZNRecord znRecord) {
     if (znRecord == null) {
       LOGGER.warn("Failed to find segment ZK metadata for segment: {}, table: {}", segment, _tableNameWithType);
-      return INVALID_PARTITION_INFO;
+      return null;
     }
 
-    String partitionMetadataJson = znRecord.getSimpleField(CommonConstants.Segment.PARTITION_METADATA);
+    // Skip processing the partition metadata for the consuming segment because the partition metadata is updated when
+    // the consuming segment is committed
+    if (Segment.Realtime.Status.IN_PROGRESS.name().equals(znRecord.getSimpleField(Segment.Realtime.STATUS))) {
+      return null;
+    }
+
+    String partitionMetadataJson = znRecord.getSimpleField(Segment.PARTITION_METADATA);
     if (partitionMetadataJson == null) {
       LOGGER.warn("Failed to find segment partition metadata for segment: {}, table: {}", segment, _tableNameWithType);
       return INVALID_PARTITION_INFO;
@@ -127,8 +142,13 @@ public class PartitionSegmentPruner implements SegmentPruner {
 
   @Override
   public synchronized void refreshSegment(String segment) {
-    _partitionInfoMap.put(segment, extractPartitionInfoFromSegmentZKMetadataZNRecord(segment,
-        _propertyStore.get(_segmentZKMetadataPathPrefix + segment, null, AccessOption.PERSISTENT)));
+    PartitionInfo partitionInfo = extractPartitionInfoFromSegmentZKMetadataZNRecord(segment,
+        _propertyStore.get(_segmentZKMetadataPathPrefix + segment, null, AccessOption.PERSISTENT));
+    if (partitionInfo != null) {
+      _partitionInfoMap.put(segment, partitionInfo);
+    } else {
+      _partitionInfoMap.remove(segment);
+    }
   }
 
   @Override
diff --git a/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java b/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
index 43ea74c..b85bdb6 100644
--- a/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
+++ b/pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/realtime/PinotLLCRealtimeSegmentManager.java
@@ -27,7
[incubator-pinot] branch remove_consuming_partition_info updated (200bef6 -> f83cbb5)
This is an automated email from the ASF dual-hosted git repository.

jackie pushed a change to branch remove_consuming_partition_info
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.

 discard 200bef6 Remove the partition info from the consuming segment ZK metadata
 discard fcf2b7a Handle the partitioning mismatch between table config and stream
     add d9aec17 Improve the realtime time creation unit test (#6032)
     add 5548e79 Table indexing config validation (#6017)
     add 8511410 Publish helm package pinot 0.2.1 (#6034)
     add 0dbe06d Publish helm repo with new index (#6035)
     add fe047fd Support streaming query in QueryExecutor (#6027)
     add 919f407 Handle the partitioning mismatch between table config and stream (#6031)
     add 73f0459 Add Broker Reduce Time Log (#6033)
     new f83cbb5 Remove the partition info from the consuming segment ZK metadata

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version. This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (200bef6)
            \
             N -- N -- N   refs/heads/remove_consuming_partition_info (f83cbb5)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them. Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "add" were already present in the repository and have only
been added to this reference.

Summary of changes:
 docs/README.md => kubernetes/helm/README-dev.md | 25 +-
 kubernetes/helm/index.yaml | 34 ++-
 kubernetes/helm/pinot-0.2.1.tgz | Bin 0 -> 23883 bytes
 kubernetes/helm/pinot/Chart.yaml | 4 +-
 .../requesthandler/BaseBrokerRequestHandler.java | 4 +-
 .../segmentpruner/PartitionSegmentPruner.java | 32 +-
 .../helix/core/PinotHelixResourceManager.java | 1 -
 .../realtime/PinotLLCRealtimeSegmentManager.java | 6 +-
 .../api/PinotTableRestletResourceTest.java | 13 +-
 .../core/query/executor/GrpcQueryExecutor.java | 327 -
 .../pinot/core/query/executor/QueryExecutor.java | 26 +-
 .../query/executor/ServerQueryExecutorV1Impl.java | 29 +-
 .../pinot/core/transport/grpc/GrpcQueryServer.java | 76 -
 .../apache/pinot/core/util/TableConfigUtils.java | 114 ++-
 .../query/scheduler/PrioritySchedulerTest.java | 36 ++-
 .../pinot/core/util/TableConfigUtilsTest.java | 154 ++
 ...PartitionLLCRealtimeClusterIntegrationTest.java | 170 ++-
 ...ulls_default_column_test_missing_columns.schema | 4 +-
 .../pinot/server/starter/ServerInstance.java | 5 +-
 .../spi/utils/builder/TableConfigBuilder.java | 15 +
 20 files changed, 668 insertions(+), 407 deletions(-)
 copy docs/README.md => kubernetes/helm/README-dev.md (67%)
 create mode 100644 kubernetes/helm/pinot-0.2.1.tgz
 delete mode 100644 pinot-core/src/main/java/org/apache/pinot/core/query/executor/GrpcQueryExecutor.java
[GitHub] [incubator-pinot] icefury71 commented on issue #5942: Better Table config validation
icefury71 commented on issue #5942: URL: https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696260153 Updated ticket description with all the suggestions.
[GitHub] [incubator-pinot] sajjad-moradi commented on pull request #6037: Add list of allowed tables for emitting table level metrics
sajjad-moradi commented on pull request #6037: URL: https://github.com/apache/incubator-pinot/pull/6037#issuecomment-696225516

> Aren’t we trying to address the limitations of monitoring systems in Pinot?

I'm not aware of that. Could you elaborate a bit more? What exactly is the plan, and what is the timeline for it?
The solution presented here is for a large cluster at LinkedIn for which table-level metrics are disabled for all tables. The current situation is risky because we don't get alerted for some high-priority tables. This PR immediately alleviates the existing issue.
[GitHub] [incubator-pinot] kishoreg commented on issue #5942: Better Table config validation
kishoreg commented on issue #5942: URL: https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696214712 +1 to a utility to validate; we can add a validate endpoint in the controller.
[GitHub] [incubator-pinot] mcvsubbu commented on issue #5942: Better Table config validation
mcvsubbu commented on issue #5942: URL: https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696207582 With all the changes proposed/added, it will be useful if a utility is also provided to validate an existing tableconfig. Many of the changes tighten the (lax) rules from before, and are valid. It would be unfortunate if an existing installation suddenly stopped working because code elsewhere (not just in the table addition path) started assuming things about the tableconfig.
[GitHub] [incubator-pinot] icefury71 commented on issue #5942: Better Table config validation
icefury71 commented on issue #5942: URL: https://github.com/apache/incubator-pinot/issues/5942#issuecomment-696164071 One more check: Currently retention does not work if pushType is null (for real-time table). Check for a case where pushType is null and retention is configured for a table.
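The check suggested above could look roughly like the following sketch. It is only an illustration under stated assumptions: `RetentionCheckSketch` and its `check` method are hypothetical names, the three string parameters stand in for whatever the table config actually exposes, and none of this is Pinot's real validation API.

```java
import java.util.Optional;

public class RetentionCheckSketch {
  /**
   * Returns a validation error when retention is configured but pushType is missing,
   * otherwise an empty Optional. All parameter names are hypothetical.
   */
  public static Optional<String> check(String pushType, String retentionTimeUnit, String retentionTimeValue) {
    boolean retentionConfigured = retentionTimeUnit != null && retentionTimeValue != null;
    if (retentionConfigured && pushType == null) {
      return Optional.of("Retention is configured but pushType is null, so retention would not be applied");
    }
    return Optional.empty();
  }
}
```

Returning a message instead of throwing would let a validation utility collect several problems for an existing table config in one pass.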
[GitHub] [incubator-pinot] adriancole commented on issue #5977: Allow ServiceManager to install tables prior to listening on service ports or a healthy status
adriancole commented on issue #5977: URL: https://github.com/apache/incubator-pinot/issues/5977#issuecomment-696109515 OK, I added a sketch for feedback: https://github.com/apache/incubator-pinot/pull/6039 I will work on this again next week.
[GitHub] [incubator-pinot] adriancole opened a new pull request #6039: WIP: ServiceManager ADD_TABLE role
adriancole opened a new pull request #6039: URL: https://github.com/apache/incubator-pinot/pull/6039

This is a work in progress towards #5977

Besides an overall review of this sketch, this needs:
[ ] how do we get a reference to the _helixResourceManager?
[ ] do we need to block until the Pinot Broker registers with the controller before installing the tables?
[ ] can we stop the controller from listening on a port until these tables are added?
[ ] add test: relative file paths can work
[ ] add test: multiple schemas install in parallel
[ ] add test: health check fails if a schema fails due to missing files, or inability to run addSchema or addTable.
[ ] figure out where to document this
[ ] determine if quickStart could or should use this

## Description
This would allow one or more bootstrap config files that include one or more tables to add.

For example, given a file `etc/pinot-backendEntityView.conf`
```
pinot.service.role=ADD_TABLE
pinot.addTable.schemaFile=./schemas/backendEntityView-schemaFile.json
pinot.addTable.tableConfigFile=./schemas/backendEntityView-tableConfigFile.json
```
and a file `etc/pinot-rawServiceView.conf`
```
pinot.service.role=ADD_TABLE
pinot.addTable.schemaFile=./schemas/rawServiceView-schemaFile.json
pinot.addTable.tableConfigFile=./schemas/rawServiceView-tableConfigFile.json
```
you could pass the following to `StartServiceManager`:
`-bootstrapConfigPaths etc/pinot-controller.conf etc/pinot-broker.conf etc/pinot-server.conf etc/pinot-backendEntityView.conf etc/pinot-rawServiceView.conf`

In the ideal case, the controller starts, the tables install, and only then does the controller start listening. If we can't block the controller from listening until this is done, we can at least make sure the health check is not OK.

## Upgrade Notes
Does this PR prevent a zero down-time upgrade? **No**
Does this PR fix a zero-downtime upgrade introduced earlier? **No**
Does this PR otherwise need attention when creating release notes?
* [x] Yes (Please label this PR as **release-notes** and complete the section on Release Notes)

## Release Notes
TODO

## Documentation
TODO
If you have introduced a new feature or configuration, please add it to the documentation as well. See https://docs.pinot.apache.org/developers/developers-and-contributors/update-document
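As a rough illustration of how such a bootstrap file might be sanity-checked before any table is installed, here is a minimal sketch using only `java.util.Properties`. The three property keys are the ones proposed in this PR; the class `AddTableConfigSketch` and its method are hypothetical, and nothing here reflects how `StartServiceManager` actually reads its config files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class AddTableConfigSketch {
  /** Returns true when the contents declare the proposed ADD_TABLE role and name both required files. */
  public static boolean isValidAddTableConfig(String contents) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(contents));
    } catch (IOException e) {
      // Reading from an in-memory string should not fail; rethrow unchecked if it somehow does.
      throw new UncheckedIOException(e);
    }
    return "ADD_TABLE".equals(props.getProperty("pinot.service.role"))
        && props.getProperty("pinot.addTable.schemaFile") != null
        && props.getProperty("pinot.addTable.tableConfigFile") != null;
  }
}
```

A check like this could feed the health check mentioned above: a file that fails it would mark the service unhealthy instead of silently skipping the table.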