[incubator-pinot] 01/01: [maven-release-plugin] prepare for next development iteration

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a commit to branch release-0.4.0-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit ec5fb8ceb0b7c4465e92a9e0ec697757defa35ed
Author: Xiang Fu 
AuthorDate: Sat May 30 04:12:22 2020 -0700

[maven-release-plugin] prepare for next development iteration
---
 pinot-broker/pom.xml  | 2 +-
 pinot-clients/pinot-java-client/pom.xml   | 2 +-
 pinot-clients/pom.xml | 2 +-
 pinot-common/pom.xml  | 2 +-
 pinot-controller/pom.xml  | 2 +-
 pinot-core/pom.xml| 2 +-
 pinot-distribution/pom.xml| 2 +-
 pinot-integration-tests/pom.xml   | 2 +-
 pinot-minion/pom.xml  | 2 +-
 pinot-perf/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-standalone/pom.xml| 2 +-
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 2 +-
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 2 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 2 +-
 pinot-plugins/pinot-file-system/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 2 +-
 pinot-plugins/pom.xml | 2 +-
 pinot-server/pom.xml  | 2 +-
 pinot-spi/pom.xml | 2 +-
 pinot-tools/pom.xml   | 2 +-
 pom.xml   | 4 ++--
 42 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index 2fd885d..a884630 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
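@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>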
diff --git a/pinot-clients/pinot-java-client/pom.xml 
b/pinot-clients/pinot-java-client/pom.xml
index 615c5e9..60177d7 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
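@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot-clients</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-java-client</artifactId>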
diff --git a/pinot-clients/pom.xml b/pinot-clients/pom.xml
index e05e7ec..902f54f 100644
--- a/pinot-clients/pom.xml
+++ b/pinot-clients/pom.xml
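@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-clients</artifactId>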
diff --git a/pinot-common/pom.xml b/pinot-common/pom.xml
index db2d28d..7f2a7fc 100644
--- a/pinot-common/pom.xml
+++ b/pinot-common/pom.xml
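@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-common</artifactId>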
diff --git a/pinot-controller/pom.xml b/pinot-controller/pom.xml
index 8c72703..08eb80c 100644
--- a/pinot-controller/pom.xml
+++ b/pinot-controller/pom.xml
@@ -24,7 +24,7 @@
   
 

[incubator-pinot] branch release-0.4.0-rc4 created (now ec5fb8c)

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch release-0.4.0-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at ec5fb8c  [maven-release-plugin] prepare for next development iteration

This branch includes the following new commits:

 new ec5fb8c  [maven-release-plugin] prepare for next development iteration

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] annotated tag release-0.4.0-rc4 updated (996e9cb -> 107ad58)

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to annotated tag release-0.4.0-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


*** WARNING: tag release-0.4.0-rc4 was modified! ***

from 996e9cb  (commit)
  to 107ad58  (tag)
 tagging 996e9cb40a7ffc9e39c2deaaad1e089d324d589f (commit)
  by Xiang Fu
  on Sat May 30 04:12:17 2020 -0700

- Log -
[maven-release-plugin] copy for tag release-0.4.0-rc4
---


No new revisions were added by this update.

Summary of changes:





[incubator-pinot] branch hotfix-0530 created (now 4f01114)

2020-05-30 Thread siddteotia
This is an automated email from the ASF dual-hosted git repository.

siddteotia pushed a change to branch hotfix-0530
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 4f01114  Set hashmap initial size to 16 in 
DictionaryBasedGroupKeyGenerator. (#5421)

No new revisions were added by this update.





[GitHub] [incubator-pinot] KKcorps opened a new issue #5468: Add Schema Registry for Protocol Buffer Input Format

2020-05-30 Thread GitBox


KKcorps opened a new issue #5468:
URL: https://github.com/apache/incubator-pinot/issues/5468


   Currently, the Protocol Buffers input format requires a .desc file on disk to parse the data. 
That approach should be deprecated; instead, the user should be able to register 
any protobuf schema with a simple REST API call.
   For more details, take a look at the [Kafka schema 
registry](https://docs.confluent.io/current/schema-registry/schema_registry_tutorial.html).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [incubator-pinot] ChethanUK opened a new pull request #5469: Fixing mobile site image responsiveness

2020-05-30 Thread GitBox


ChethanUK opened a new pull request #5469:
URL: https://github.com/apache/incubator-pinot/pull/5469


   ## Description
   
   - Fixing the SVG problem on the main page.
   
   The mobile site, as a few people pointed out, was not responsive (SVG images):
   
![SVG_Mobile_Before](https://user-images.githubusercontent.com/16241795/83338400-a0f56700-a2e1-11ea-88f8-12985aa48403.png)
   
   Now, with this fix, the SVGs are responsive in mobile views:
   
![SVG_Mobile_After](https://user-images.githubusercontent.com/16241795/83338409-b4a0cd80-a2e1-11ea-8e3a-14e8078126da.png)
   
   
   Does this PR fix a zero-downtime upgrade introduced earlier?
   * [Yes] Yes (Please label this as **backward-incompat**, and 
complete the section below on Release Notes)
   Yes 






[GitHub] [incubator-pinot] jackjlli commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


jackjlli commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432896019



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
*/
   private boolean isNoDictionaryColumn(Set<String> noDictionaryColumns, 
Set<String> invertedIndexColumns,
   Set<String> textIndexColumns, FieldSpec fieldSpec, String column) {
-return textIndexColumns.contains(column) || 
(noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-&& !invertedIndexColumns.contains(column));
+if (textIndexColumns.contains(column)) {

Review comment:
   The description of textIndexColumns is missing in this method.








[GitHub] [incubator-pinot] snleee commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


snleee commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432896575



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
*/
   private boolean isNoDictionaryColumn(Set<String> noDictionaryColumns, 
Set<String> invertedIndexColumns,
   Set<String> textIndexColumns, FieldSpec fieldSpec, String column) {
-return textIndexColumns.contains(column) || 
(noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-&& !invertedIndexColumns.contains(column));
+if (textIndexColumns.contains(column)) {
+  // text column is no dictionary currently
+  return true;
+}
+FieldSpec.DataType dataType = fieldSpec.getDataType();
+if (noDictionaryColumns.contains(column)) {
+  // Earlier we didn't support noDict in consuming segments for STRING and 
BYTES columns.
+  // So even if the user had the column in noDictionaryColumns set in 
table config, we still
+  // created dictionary in consuming segments.
+  // Later on we added this support. There is a particular impact of this 
change on the use cases
+  // that have set noDict on their STRING dimension columns for other 
performance
+  // reasons and also want metricsAggregation. These use cases don't get to
+  // aggregateMetrics because the new implementation is able to honor 
their table config setting
+  // of noDict on STRING/BYTES. Without metrics aggregation, memory 
pressure increases.
+  // So to continue aggregating metrics for such cases, we will create 
dictionary even
+  // if the column is part of noDictionary set from table config
+  if (fieldSpec instanceof DimensionFieldSpec && _aggregateMetrics && 
(dataType == FieldSpec.DataType.STRING ||

Review comment:
   Sounds good to me  








[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


mcvsubbu commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432905885



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
*/
   private boolean isNoDictionaryColumn(Set<String> noDictionaryColumns, 
Set<String> invertedIndexColumns,
   Set<String> textIndexColumns, FieldSpec fieldSpec, String column) {
-return textIndexColumns.contains(column) || 
(noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-&& !invertedIndexColumns.contains(column));
+if (textIndexColumns.contains(column)) {

Review comment:
   Excellent. I think we should still rename the method to something along the 
lines of `shouldCreateDictionaryForColumn()`, since it has some logic and is 
not just checking the table config for the noDictionary setting.








[GitHub] [incubator-pinot] siddharthteotia opened a new pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


siddharthteotia opened a new pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470


   (1) PR https://github.com/apache/incubator-pinot/pull/5256 added support for 
deriving the number of docs per chunk from the column length when creating the 
var byte raw index. This was done specifically to support large text values. 
High-QPS use cases that don't want this feature see a negative impact, since 
the chunk size increases (numDocsPerChunk was previously hardcoded to 1000) 
and, depending on the access pattern, we might end up uncompressing a bigger 
chunk to get values for a set of docIds. We have made this behavior 
configurable, so the default is the same as before (1000 docs per chunk). 
It can be enabled as follows:
   
   `"fieldConfigList":[
      {
        "name":"textCol",
        "encodingType":"RAW",
        "indexType":"TEXT",
        "properties":{
          "derive.num.chunks.raw.index":"true"
        }
      }
   ]
   `
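
A minimal sketch of how such a per-column switch could be consulted, assuming the property key shown in the config above. The class name, the 1 MB target chunk size, and the per-entry overhead are hypothetical and chosen only for illustration; they are not taken from the PR.

```java
import java.util.HashMap;
import java.util.Map;

class RawIndexChunkSizing {
  // Property key as written in the config snippet above (assumed to be the effective key).
  static final String DERIVE_KEY = "derive.num.chunks.raw.index";
  // Fixed value used when derivation is disabled (the old behavior per the description).
  static final int DEFAULT_NUM_DOCS_PER_CHUNK = 1000;
  // Hypothetical target chunk size, for illustration only.
  static final int TARGET_CHUNK_BYTES = 1024 * 1024;

  // Returns true only if the column explicitly sets the property to "true".
  static boolean shouldDerive(String column, Map<String, Map<String, String>> columnProperties) {
    if (columnProperties == null) {
      return false;
    }
    Map<String, String> properties = columnProperties.get(column);
    // Boolean.parseBoolean(null) is false, so a missing key keeps the default.
    return properties != null && Boolean.parseBoolean(properties.get(DERIVE_KEY));
  }

  // Picks the number of docs per chunk: fixed by default, derived from value length if enabled.
  static int numDocsPerChunk(boolean derive, int maxValueLengthBytes) {
    if (!derive) {
      return DEFAULT_NUM_DOCS_PER_CHUNK;
    }
    // Assumed per-entry cost: the value bytes plus one int offset.
    int bytesPerEntry = maxValueLengthBytes + Integer.BYTES;
    return Math.max(TARGET_CHUNK_BYTES / bytesPerEntry, 1);
  }

  public static void main(String[] args) {
    Map<String, String> textColProps = new HashMap<>();
    textColProps.put(DERIVE_KEY, "true");
    Map<String, Map<String, String>> columnProperties = new HashMap<>();
    columnProperties.put("textCol", textColProps);

    boolean derive = shouldDerive("textCol", columnProperties);
    System.out.println(numDocsPerChunk(derive, 50_000));
    System.out.println(numDocsPerChunk(shouldDerive("otherCol", columnProperties), 50_000));
  }
}
```

With these assumed constants, a column with large values ends up with far fewer docs per chunk than the fixed 1000, while columns that do not opt in are unaffected.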
   
   (2) PR https://github.com/apache/incubator-pinot/pull/4791 added support for 
noDict for STRING/BYTES in consuming segments. Before PR 4791, even if user had 
STRING/BYTES as no dictionary in table config, consuming segment still created 
dictionary because of the lack of support for raw index.  There is a particular 
impact of this change on the use cases that have set noDict on their STRING 
dimension columns for other performance reasons and also want 
metricsAggregation. These use cases don't get to aggregateMetrics because the 
new implementation was able to honor their table config setting of noDict on 
STRING/BYTES and created a raw index. Without metrics aggregation, memory 
pressure increases. So to continue aggregating metrics for such cases, we will 
create dictionary for STRING/BYTES even if the column is part of noDictionary 
set from table config.
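
The decision described in (2) can be sketched roughly as below. This is a simplified, self-contained illustration rather than the code in the PR: the enum and the boolean parameters stand in for `FieldSpec`, and the final fallback simply reuses the pre-change condition (single-value column without an inverted index), which is an assumption about the part of the method not quoted here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class NoDictionaryDecision {
  // Stand-in for FieldSpec.DataType, reduced to what the sketch needs.
  enum DataType { INT, LONG, STRING, BYTES }

  // Returns true if the consuming segment should skip the dictionary for this column.
  static boolean isNoDictionaryColumn(Set<String> noDictionaryColumns, Set<String> invertedIndexColumns,
      Set<String> textIndexColumns, String column, DataType dataType, boolean isDimension,
      boolean isSingleValue, boolean aggregateMetrics) {
    if (textIndexColumns.contains(column)) {
      // Text columns are always stored without a dictionary.
      return true;
    }
    if (!noDictionaryColumns.contains(column)) {
      return false;
    }
    boolean isStringOrBytes = dataType == DataType.STRING || dataType == DataType.BYTES;
    if (isDimension && aggregateMetrics && isStringOrBytes) {
      // Keep the dictionary so that metrics aggregation continues to work.
      return false;
    }
    // Assumed fallback: the condition used before the change.
    return isSingleValue && !invertedIndexColumns.contains(column);
  }

  public static void main(String[] args) {
    Set<String> noDict = new HashSet<>(Arrays.asList("dimCol"));
    Set<String> none = Collections.emptySet();
    // STRING dimension marked noDict, with metrics aggregation on: dictionary is still created.
    System.out.println(isNoDictionaryColumn(noDict, none, none, "dimCol", DataType.STRING, true, true, true));
    // Same column with metrics aggregation off: the noDict setting is honored.
    System.out.println(isNoDictionaryColumn(noDict, none, none, "dimCol", DataType.STRING, true, true, false));
  }
}
```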
   
   ## Description
   Add a description of your PR here.
   A good description should include pointers to an issue or design document, 
etc.
   ## Upgrade Notes
   Does this PR prevent a zero down-time upgrade? (Assume upgrade order: 
Controller, Broker, Server, Minion)
   * [ ] Yes (Please label as **backward-incompat**, and complete 
the section below on Release Notes)
   
   Does this PR fix a zero-downtime upgrade introduced earlier?
   * [ ] Yes (Please label this as **backward-incompat**, and 
complete the section below on Release Notes)
   
   Does this PR otherwise need attention when creating release notes? Things to 
consider:
   - New configuration options
   - Deprecation of configurations
   - Signature changes to public methods/interfaces
   - New plugins added or old plugins removed
   * [ ] Yes (Please label this PR as **release-notes** and 
complete the section on Release Notes)
   ## Release Notes
   If you have tagged this as either backward-incompat or release-notes,
   you MUST add text here that you would like to see appear in release notes of 
the
   next release.
   
   If you have a series of commits adding or enabling a feature, then
   add this section only in final commit that marks the feature completed.
   Refer to earlier release notes to see examples of text
   
   ## Documentation
   If you have introduced a new feature or configuration, please add it to the 
documentation as well.
   See 
https://docs.pinot.apache.org/developers/developers-and-contributors/update-document
   






[GitHub] [incubator-pinot] mayankshriv commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


mayankshriv commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432896660



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
*/
   private boolean isNoDictionaryColumn(Set<String> noDictionaryColumns, 
Set<String> invertedIndexColumns,
   Set<String> textIndexColumns, FieldSpec fieldSpec, String column) {
-return textIndexColumns.contains(column) || 
(noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-&& !invertedIndexColumns.contains(column));
+if (textIndexColumns.contains(column)) {
+  // text column is no dictionary currently
+  return true;
+}
+FieldSpec.DataType dataType = fieldSpec.getDataType();
+if (noDictionaryColumns.contains(column)) {
+  // Earlier we didn't support noDict in consuming segments for STRING and 
BYTES columns.
+  // So even if the user had the column in noDictionaryColumns set in 
table config, we still
+  // created dictionary in consuming segments.
+  // Later on we added this support. There is a particular impact of this 
change on the use cases
+  // that have set noDict on their STRING dimension columns for other 
performance
+  // reasons and also want metricsAggregation. These use cases don't get to
+  // aggregateMetrics because the new implementation is able to honor 
their table config setting
+  // of noDict on STRING/BYTES. Without metrics aggregation, memory 
pressure increases.
+  // So to continue aggregating metrics for such cases, we will create 
dictionary even
+  // if the column is part of noDictionary set from table config
+  if (fieldSpec instanceof DimensionFieldSpec && _aggregateMetrics && 
(dataType == FieldSpec.DataType.STRING ||

Review comment:
   Log message for this?

##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentColumnarIndexCreator.java
##
@@ -213,6 +215,14 @@ public void init(SegmentGeneratorConfig 
segmentCreationSpec, SegmentIndexCreatio
 }
   }
 
+  public static boolean shouldDeriveNumDocsPerChunk(String columnName, 
Map<String, Map<String, String>> columnProperties) {
+if (columnProperties != null) {
+  Map<String, String> properties = columnProperties.get(columnName);
+  return properties != null && 
Boolean.parseBoolean(properties.get(FieldConfig.DERIVE_NUM_DOCS_PER_CHUNK_RAW_INDEX_KEY));

Review comment:
   Defaults to false, right?








[incubator-pinot] annotated tag release-0.4.0-rc3 updated (7bcdd1b -> 257a8d4)

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a change to annotated tag release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


*** WARNING: tag release-0.4.0-rc3 was modified! ***

from 7bcdd1b  (commit)
  to 257a8d4  (tag)
 tagging 7bcdd1bb8880ddb958f06b4e0d50672ef93a8292 (commit)
  by Haibo Wang
  on Sat May 30 20:39:15 2020 -0700

- Log -
add tag release-0.4.0-rc3
---


No new revisions were added by this update.

Summary of changes:





[incubator-pinot] branch release-0.4.0-rc created (now 9c943e3)

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a change to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 9c943e3  [maven-release-plugin] prepare for next development iteration

No new revisions were added by this update.





[incubator-pinot] 03/03: [maven-release-plugin] prepare for next development iteration

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 9c943e35456a433e8232b9b8da8743ab243c756a
Author: Haibo Wang 
AuthorDate: Fri May 29 17:13:51 2020 -0700

[maven-release-plugin] prepare for next development iteration
---
 pinot-broker/pom.xml  | 2 +-
 pinot-clients/pinot-java-client/pom.xml   | 2 +-
 pinot-clients/pom.xml | 2 +-
 pinot-common/pom.xml  | 2 +-
 pinot-controller/pom.xml  | 2 +-
 pinot-core/pom.xml| 2 +-
 pinot-distribution/pom.xml| 2 +-
 pinot-integration-tests/pom.xml   | 2 +-
 pinot-minion/pom.xml  | 2 +-
 pinot-perf/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-standalone/pom.xml| 2 +-
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 2 +-
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 2 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 2 +-
 pinot-plugins/pinot-file-system/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 2 +-
 pinot-plugins/pom.xml | 2 +-
 pinot-server/pom.xml  | 2 +-
 pinot-spi/pom.xml | 2 +-
 pinot-tools/pom.xml   | 2 +-
 pom.xml   | 4 ++--
 42 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index 2fd885d..a884630 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
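@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>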
diff --git a/pinot-clients/pinot-java-client/pom.xml 
b/pinot-clients/pinot-java-client/pom.xml
index 615c5e9..60177d7 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
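@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot-clients</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-java-client</artifactId>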
diff --git a/pinot-clients/pom.xml b/pinot-clients/pom.xml
index e05e7ec..902f54f 100644
--- a/pinot-clients/pom.xml
+++ b/pinot-clients/pom.xml
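@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-clients</artifactId>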
diff --git a/pinot-common/pom.xml b/pinot-common/pom.xml
index db2d28d..7f2a7fc 100644
--- a/pinot-common/pom.xml
+++ b/pinot-common/pom.xml
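@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-common</artifactId>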
diff --git a/pinot-controller/pom.xml b/pinot-controller/pom.xml
index 8c72703..08eb80c 100644
--- a/pinot-controller/pom.xml
+++ b/pinot-controller/pom.xml
@@ -24,7 +24,7 @@
   

[incubator-pinot] 01/03: remove distributionManagement

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 6ab6d4c4191bae8d7f51b04078d651daf310a093
Author: Haibo Wang 
AuthorDate: Sat May 30 20:33:47 2020 -0700

remove distributionManagement
---
 pom.xml | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/pom.xml b/pom.xml
index 2b6ae90..7024d3a 100644
--- a/pom.xml
+++ b/pom.xml
@@ -97,14 +97,6 @@
   
   2018
 
-  
-
-  bintray-linkedin-maven
-  linkedin-maven
-  
https://api.bintray.com/maven/linkedin/maven/pinot/;publish=1;override=1
-
-  
-
   
 ${basedir}
 0.4.0





[incubator-pinot] 02/03: [maven-release-plugin] prepare release release-0.4.0-rc3

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 7bcdd1bb8880ddb958f06b4e0d50672ef93a8292
Author: Haibo Wang 
AuthorDate: Fri May 29 17:13:32 2020 -0700

[maven-release-plugin] prepare release release-0.4.0-rc3
---
 pinot-broker/pom.xml  | 5 ++---
 pinot-clients/pinot-java-client/pom.xml   | 5 ++---
 pinot-clients/pom.xml | 6 ++
 pinot-common/pom.xml  | 5 ++---
 pinot-controller/pom.xml  | 5 ++---
 pinot-core/pom.xml| 5 ++---
 pinot-distribution/pom.xml| 7 +++
 pinot-integration-tests/pom.xml   | 5 ++---
 pinot-minion/pom.xml  | 5 ++---
 pinot-perf/pom.xml| 5 ++---
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 6 ++
 .../pinot-batch-ingestion-standalone/pom.xml  | 6 ++
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 7 +++
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml   | 7 +++
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 6 ++
 pinot-plugins/pinot-file-system/pom.xml   | 6 ++
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 5 ++---
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 6 ++
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 6 ++
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 6 ++
 pinot-plugins/pom.xml | 8 +++-
 pinot-server/pom.xml  | 5 ++---
 pinot-spi/pom.xml | 5 ++---
 pinot-tools/pom.xml   | 5 ++---
 pom.xml   | 7 +++
 42 files changed, 89 insertions(+), 149 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index d9e48e1..2fd885d 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -19,13 +19,12 @@
 under the License.
 
 -->
-<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>${revision}${sha1}</version>
+    <version>0.4.0</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>
diff --git a/pinot-clients/pinot-java-client/pom.xml 
b/pinot-clients/pinot-java-client/pom.xml
index 6a98e3d..615c5e9 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -19,13 +19,12 @@
 under the License.
 
 -->
-http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance;
- xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 

[incubator-pinot] branch release-0.4.0-rc3 updated (46863b1 -> 9c943e3)

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a change to branch release-0.4.0-rc3
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


 discard 46863b1  [maven-release-plugin] prepare for next development iteration
 discard c3f36c2  [maven-release-plugin] prepare release release-0.4.0-rc3
 new 6ab6d4c  remove distributionManagement
 new 7bcdd1b  [maven-release-plugin] prepare release release-0.4.0-rc3
 new 9c943e3  [maven-release-plugin] prepare for next development iteration

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (46863b1)
            \
             N -- N -- N   refs/heads/release-0.4.0-rc3 (9c943e3)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 pom.xml | 8 
 1 file changed, 8 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch hotfix-0530 updated: Check aggregateMetrics from RealtimeSegmentConfig

2020-05-30 Thread siddteotia
This is an automated email from the ASF dual-hosted git repository.

siddteotia pushed a commit to branch hotfix-0530
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/hotfix-0530 by this push:
 new 7a18f68  Check aggregateMetrics from RealtimeSegmentConfig
7a18f68 is described below

commit 7a18f68d0220fe8e4ff7ade1c214e82876ae9772
Author: Siddharth Teotia 
AuthorDate: Sat May 30 21:10:30 2020 -0700

Check aggregateMetrics from RealtimeSegmentConfig
---
 .../apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java  | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
index 0e24664..527572f 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
@@ -219,7 +219,7 @@ public class MutableSegmentImpl implements MutableSegment {
   FieldSpec.DataType dataType = fieldSpec.getDataType();
   boolean isFixedWidthColumn = dataType.isFixedWidth();
   int forwardIndexColumnSize = -1;
-  if (isNoDictionaryColumn(noDictionaryColumns, invertedIndexColumns, 
fieldSpec, column)) {
+  if (isNoDictionaryColumn(noDictionaryColumns, invertedIndexColumns, 
fieldSpec, column, config)) {
 // no dictionary
 // each forward index entry will be equal to size of data for that row
 // For INT, LONG, FLOAT, DOUBLE it is equal to the number of fixed 
bytes used to store the value,
@@ -329,7 +329,7 @@ public class MutableSegmentImpl implements MutableSegment {
* @return true if column is no-dictionary, false if dictionary encoded
*/
   private boolean isNoDictionaryColumn(Set noDictionaryColumns, 
Set invertedIndexColumns,
-  FieldSpec fieldSpec, String column) {
+  FieldSpec fieldSpec, String column, RealtimeSegmentConfig config) {
 FieldSpec.DataType dataType = fieldSpec.getDataType();
 if (noDictionaryColumns.contains(column)) {
   // Earlier we didn't support noDict in consuming segments for STRING and 
BYTES columns.
@@ -342,7 +342,7 @@ public class MutableSegmentImpl implements MutableSegment {
   // of noDict on STRING/BYTES. Without metrics aggregation, memory 
pressure increases.
   // So to continue aggregating metrics for such cases, we will create 
dictionary even
   // if the column is part of noDictionary set from table config
-  if (fieldSpec instanceof DimensionFieldSpec && _aggregateMetrics && 
(dataType == FieldSpec.DataType.STRING ||
+  if (fieldSpec instanceof DimensionFieldSpec && config.aggregateMetrics() 
&& (dataType == FieldSpec.DataType.STRING ||
   dataType == FieldSpec.DataType.BYTES)) {
 _logger.info("Not creating dictionary in consuming segment for column 
{} of type {}", column, dataType.toString());
 return false;





[GitHub] [incubator-pinot] fx19880617 merged pull request #5471: Update Superset image build

2020-05-30 Thread GitBox


fx19880617 merged pull request #5471:
URL: https://github.com/apache/incubator-pinot/pull/5471


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[incubator-pinot] branch master updated (01a316e -> 563d289)

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 01a316e  Support distinctCountRawThetaSketch aggregation that returns 
serialized sketch. (#5465)
 add 563d289  Update Superset image build (#5471)

No new revisions were added by this update.

Summary of changes:
 docker/images/pinot-superset/Dockerfile  |  3 ++-
 docker/images/pinot-superset/bin/superset-init   | 13 +
 docker/images/pinot-superset/requirements-db.txt |  2 +-
 3 files changed, 16 insertions(+), 2 deletions(-)
 create mode 100644 docker/images/pinot-superset/bin/superset-init





[incubator-pinot] 01/01: [maven-release-plugin] prepare for next development iteration

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a commit to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 24c545263a8bcaa3171b09478a0e0cd4cc96d2ed
Author: Xiang Fu 
AuthorDate: Sat May 30 14:10:35 2020 -0700

[maven-release-plugin] prepare for next development iteration
---
 pinot-broker/pom.xml  | 2 +-
 pinot-clients/pinot-java-client/pom.xml   | 2 +-
 pinot-clients/pom.xml | 2 +-
 pinot-common/pom.xml  | 2 +-
 pinot-controller/pom.xml  | 2 +-
 pinot-core/pom.xml| 2 +-
 pinot-distribution/pom.xml| 2 +-
 pinot-integration-tests/pom.xml   | 2 +-
 pinot-minion/pom.xml  | 2 +-
 pinot-perf/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-standalone/pom.xml| 2 +-
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 2 +-
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 2 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 2 +-
 pinot-plugins/pinot-file-system/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 2 +-
 pinot-plugins/pom.xml | 2 +-
 pinot-server/pom.xml  | 2 +-
 pinot-spi/pom.xml | 2 +-
 pinot-tools/pom.xml   | 2 +-
 pom.xml   | 4 ++--
 42 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index 2fd885d..a884630 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -24,7 +24,7 @@
   
 pinot
 org.apache.pinot
-0.4.0
+0.5.0-SNAPSHOT
 ..
   
   pinot-broker
diff --git a/pinot-clients/pinot-java-client/pom.xml 
b/pinot-clients/pinot-java-client/pom.xml
index 615c5e9..60177d7 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -24,7 +24,7 @@
   
 pinot-clients
 org.apache.pinot
-0.4.0
+0.5.0-SNAPSHOT
 ..
   
   pinot-java-client
diff --git a/pinot-clients/pom.xml b/pinot-clients/pom.xml
index e05e7ec..902f54f 100644
--- a/pinot-clients/pom.xml
+++ b/pinot-clients/pom.xml
@@ -24,7 +24,7 @@
   
 pinot
 org.apache.pinot
-0.4.0
+0.5.0-SNAPSHOT
 ..
   
   pinot-clients
diff --git a/pinot-common/pom.xml b/pinot-common/pom.xml
index db2d28d..7f2a7fc 100644
--- a/pinot-common/pom.xml
+++ b/pinot-common/pom.xml
@@ -24,7 +24,7 @@
   
 pinot
 org.apache.pinot
-0.4.0
+0.5.0-SNAPSHOT
 ..
   
   pinot-common
diff --git a/pinot-controller/pom.xml b/pinot-controller/pom.xml
index 8c72703..08eb80c 100644
--- a/pinot-controller/pom.xml
+++ b/pinot-controller/pom.xml
@@ -24,7 +24,7 @@
   
  

[incubator-pinot] annotated tag release-0.4.0-rc4 updated (b726119 -> 6aca8fc)

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to annotated tag release-0.4.0-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


*** WARNING: tag release-0.4.0-rc4 was modified! ***

from b726119  (commit)
  to 6aca8fc  (tag)
 tagging b72611946e3a792b548490b2c418526701fc8e94 (commit)
  by Xiang Fu
  on Sat May 30 14:10:30 2020 -0700

- Log -
[maven-release-plugin] copy for tag release-0.4.0-rc4
---


No new revisions were added by this update.

Summary of changes:





[incubator-pinot] branch release-0.4.0-rc created (now 24c5452)

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 24c5452  [maven-release-plugin] prepare for next development iteration

This branch includes the following new commits:

 new 24c5452  [maven-release-plugin] prepare for next development iteration

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[incubator-pinot] branch hotfix-0530 updated: Two changes:

2020-05-30 Thread siddteotia
This is an automated email from the ASF dual-hosted git repository.

siddteotia pushed a commit to branch hotfix-0530
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/hotfix-0530 by this push:
 new 54cac4a  Two changes:
54cac4a is described below

commit 54cac4a84c713b98fef7d2a3e77d0bac2f9fb2ae
Author: Siddharth Teotia 
AuthorDate: Sat May 30 14:14:38 2020 -0700

Two changes:

(1) PR https://github.com/apache/incubator-pinot/pull/5256
added support for deriving the number of docs per chunk for the var byte
raw index creator from the column length. This was specifically
done as part of supporting text blobs. High-QPS use cases that
don't want this feature see a negative impact, since the chunk
size increases (earlier, numDocsPerChunk was hardcoded to 1000)
and, depending on the access pattern, we might end up uncompressing
a bigger chunk to get values for a set of docIds. This change makes
the behaviour configurable, so the default is the same as before
(1000 docs per chunk).

(2) PR https://github.com/apache/incubator-pinot/pull/4791
added support for noDict for STRING/BYTES in consuming segments.
This change affects use cases that have set noDict on their STRING
dimension columns for other performance reasons and also want
metrics aggregation. These use cases no longer get to aggregate
metrics, because the new implementation honors their table config
setting of noDict on STRING/BYTES. Without metrics aggregation,
memory pressure increases. So to continue aggregating metrics for
such cases, we will create a dictionary even if the column is part
of the noDictionary set from table config.
---
 .../generator/SegmentGeneratorConfig.java  | 15 
 .../indexsegment/mutable/MutableSegmentImpl.java   | 28 --
 .../creator/impl/SegmentColumnarIndexCreator.java  | 22 +++--
 .../fwd/SingleValueVarByteRawIndexCreator.java | 10 +++-
 .../apache/pinot/spi/config/table/FieldConfig.java |  1 +
 5 files changed, 71 insertions(+), 5 deletions(-)

diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
index 59531fe..9af9c16 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
@@ -102,6 +102,9 @@ public class SegmentGeneratorConfig {
   private boolean _skipTimeValueCheck = false;
   private boolean _nullHandlingEnabled = false;
 
+  // constructed from FieldConfig
+  private Map> _columnProperties = new HashMap<>();
+
   @Deprecated
   public SegmentGeneratorConfig() {
   }
@@ -174,12 +177,24 @@ public class SegmentGeneratorConfig {
 _invertedIndexCreationColumns = 
indexingConfig.getInvertedIndexColumns();
   }
 
+  List fieldConfigList = tableConfig.getFieldConfigList();
+  if (fieldConfigList != null) {
+for (FieldConfig fieldConfig : fieldConfigList) {
+  _columnProperties.put(fieldConfig.getName(), 
fieldConfig.getProperties());
+}
+  }
+
   extractTextIndexColumnsFromTableConfig(tableConfig);
 
   _nullHandlingEnabled = indexingConfig.isNullHandlingEnabled();
 }
   }
 
+  @Nonnull
+  public Map> getColumnProperties() {
+return _columnProperties;
+  }
+
   /**
* Set time column details using the given time column
*/
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
index 07e5ec9..1f3f2e0 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
@@ -330,8 +330,32 @@ public class MutableSegmentImpl implements MutableSegment {
*/
   private boolean isNoDictionaryColumn(Set noDictionaryColumns, 
Set invertedIndexColumns,
   Set textIndexColumns, FieldSpec fieldSpec, String column) {
-return textIndexColumns.contains(column) || 
(noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-&& !invertedIndexColumns.contains(column));
+if (textIndexColumns.contains(column)) {
+  // text column is no dictionary currently
+  return true;
+}
+FieldSpec.DataType dataType = fieldSpec.getDataType();
+if (noDictionaryColumns.contains(column)) {
+  // Earlier we didn't support noDict in consuming segments for STRING and 
BYTES columns.
+  // So even if the user had the column in noDictionaryColumns set in 
table config, we still
+  // created 

[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


siddharthteotia commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432896416



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
*/
   private boolean isNoDictionaryColumn(Set noDictionaryColumns, 
Set invertedIndexColumns,
   Set textIndexColumns, FieldSpec fieldSpec, String column) {
-return textIndexColumns.contains(column) || 
(noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-&& !invertedIndexColumns.contains(column));
+if (textIndexColumns.contains(column)) {
+  // text column is no dictionary currently
+  return true;
+}
+FieldSpec.DataType dataType = fieldSpec.getDataType();
+if (noDictionaryColumns.contains(column)) {
+  // Earlier we didn't support noDict in consuming segments for STRING and 
BYTES columns.
+  // So even if the user had the column in noDictionaryColumns set in 
table config, we still
+  // created dictionary in consuming segments.
+  // Later on we added this support. There is a particular impact of this 
change on the use cases
+  // that have set noDict on their STRING dimension columns for other 
performance
+  // reasons and also want metricsAggregation. These use cases don't get to
+  // aggregateMetrics because the new implementation is able to honor 
their table config setting
+  // of noDict on STRING/BYTES. Without metrics aggregation, memory 
pressure increases.
+  // So to continue aggregating metrics for such cases, we will create 
dictionary even
+  // if the column is part of noDictionary set from table config
+  if (fieldSpec instanceof DimensionFieldSpec && _aggregateMetrics && 
(dataType == FieldSpec.DataType.STRING ||

Review comment:
   Yes we checked for STRING. See here 
https://github.com/apache/incubator-pinot/pull/4791
   
   Note that there is a method `enableMetricsAggregationIfPossible` that 
decides whether metrics can be aggregated. It checks that all dimensions 
have a dictionary, that no metric has a dictionary, that metrics are SV, 
etc. That method is still intact.
   
   It is just that, during initialization of MutableSegmentImpl, we used to 
check for STRING and remove it from the noDictionaryColumns set, since raw index 
wasn't supported. This is actually why the use cases were able to specify 
noDict in the config and still aggregate metrics. 
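
   The decision described in this comment can be sketched in isolation. This is only an illustration under stated assumptions: `NoDictionarySketch`, its `DataType` enum, and the boolean parameters are hypothetical stand-ins for Pinot's `FieldSpec` and segment config, and the final fallback reuses the pre-change condition from the removed lines of the hunk, since the rest of the new method is not visible in this excerpt.

```java
import java.util.Collections;
import java.util.Set;

final class NoDictionarySketch {
  // Hypothetical stand-in for FieldSpec.DataType; not Pinot's actual enum.
  enum DataType { INT, LONG, FLOAT, DOUBLE, STRING, BYTES }

  /** Returns true if the column should be stored raw (no dictionary). */
  static boolean isNoDictionaryColumn(Set<String> noDictionaryColumns, Set<String> invertedIndexColumns,
      Set<String> textIndexColumns, String column, DataType dataType, boolean isDimension,
      boolean isSingleValue, boolean aggregateMetrics) {
    if (textIndexColumns.contains(column)) {
      // Text index columns are treated as no-dictionary.
      return true;
    }
    if (!noDictionaryColumns.contains(column)) {
      return false;
    }
    boolean isStringOrBytes = dataType == DataType.STRING || dataType == DataType.BYTES;
    if (isDimension && aggregateMetrics && isStringOrBytes) {
      // Keep the dictionary so that metrics aggregation stays possible,
      // even though the table config asked for noDict on this column.
      return false;
    }
    // Assumption: otherwise fall back to the pre-change condition.
    return isSingleValue && !invertedIndexColumns.contains(column);
  }

  public static void main(String[] args) {
    Set<String> noDict = Collections.singleton("city");
    Set<String> none = Collections.emptySet();
    // STRING dimension with aggregateMetrics enabled: dictionary is still created.
    System.out.println(isNoDictionaryColumn(noDict, none, none, "city", DataType.STRING, true, true, true));
    // Same column without metrics aggregation: the noDict setting is honored.
    System.out.println(isNoDictionaryColumn(noDict, none, none, "city", DataType.STRING, true, true, false));
  }
}
```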

##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
*/
   private boolean isNoDictionaryColumn(Set noDictionaryColumns, 
Set invertedIndexColumns,
   Set textIndexColumns, FieldSpec fieldSpec, String column) {
-return textIndexColumns.contains(column) || 
(noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-&& !invertedIndexColumns.contains(column));
+if (textIndexColumns.contains(column)) {

Review comment:
   done

##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentColumnarIndexCreator.java
##
@@ -193,9 +194,10 @@ public void init(SegmentGeneratorConfig 
segmentCreationSpec, SegmentIndexCreatio
 getColumnCompressionType(segmentCreationSpec, fieldSpec);
 
 // Initialize forward index creator
+boolean deriveNumChunksForVarByteRawIndex = 
shouldDeriveNumChunksForRawIndex(columnName, 
segmentCreationSpec.getColumnProperties());

Review comment:
   done
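
   For readers following the thread, the opt-in check in the hunk above might look roughly like the sketch below. Everything here is an assumption for illustration: the class name, the property key `deriveNumDocsPerChunkForRawIndex`, and the derivation formula are hypothetical, since this excerpt shows neither the body of `shouldDeriveNumChunksForRawIndex` nor the key added to `FieldConfig`. Only the default of 1000 docs per chunk comes from the commit message.

```java
import java.util.Collections;
import java.util.Map;

final class RawIndexConfigSketch {
  // Assumption: the property key is illustrative; the real key is not shown in this excerpt.
  static final String DERIVE_KEY = "deriveNumDocsPerChunkForRawIndex";
  // From the commit message: the earlier hardcoded value.
  static final int DEFAULT_NUM_DOCS_PER_CHUNK = 1000;

  /** True only if the column's properties explicitly opt in to deriving the chunk size. */
  static boolean shouldDerive(String column, Map<String, Map<String, String>> columnProperties) {
    Map<String, String> props = columnProperties.get(column);
    // Boolean.parseBoolean(null) is false, so a missing key keeps the default behaviour.
    return props != null && Boolean.parseBoolean(props.get(DERIVE_KEY));
  }

  /** Hypothetical derivation: fit as many max-length values as possible into a target chunk size. */
  static int numDocsPerChunk(boolean derive, int maxValueLengthBytes, int targetChunkBytes) {
    if (!derive) {
      return DEFAULT_NUM_DOCS_PER_CHUNK;
    }
    return Math.max(1, targetChunkBytes / Math.max(1, maxValueLengthBytes));
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> props =
        Collections.singletonMap("blob", Collections.singletonMap(DERIVE_KEY, "true"));
    System.out.println(shouldDerive("blob", props));
    System.out.println(shouldDerive("other", props));
    System.out.println(numDocsPerChunk(false, 4096, 1 << 20));
  }
}
```

   Keeping the flag per column means existing tables see no change unless they opt in, which matches the stated goal of preserving the old default.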








[GitHub] [incubator-pinot] mayankshriv commented on a change in pull request #5465: Support distinctCountRawThetaSketch aggregation that returns serialized sketch.

2020-05-30 Thread GitBox


mayankshriv commented on a change in pull request #5465:
URL: https://github.com/apache/incubator-pinot/pull/5465#discussion_r432909781



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionFactory.java
##
@@ -137,6 +137,8 @@ public static AggregationFunction 
getAggregationFunction(AggregationInfo aggrega
 return new FastHLLAggregationFunction(column);
   case DISTINCTCOUNTTHETASKETCH:
 return new DistinctCountThetaSketchAggregationFunction(arguments);
+  case DISTINCTCOUNTRAWTHETASKETCH:

Review comment:
   Agree, but this follows the convention of distinctCountRawHLL. Perhaps 
we can alias them both later.








[GitHub] [incubator-pinot] mayankshriv merged pull request #5465: Support distinctCountRawThetaSketch aggregation that returns serialized sketch.

2020-05-30 Thread GitBox


mayankshriv merged pull request #5465:
URL: https://github.com/apache/incubator-pinot/pull/5465


   






[incubator-pinot] branch master updated: Support distinctCountRawThetaSketch aggregation that returns serialized sketch. (#5465)

2020-05-30 Thread mayanks
This is an automated email from the ASF dual-hosted git repository.

mayanks pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new 01a316e  Support distinctCountRawThetaSketch aggregation that returns 
serialized sketch. (#5465)
01a316e is described below

commit 01a316ef16001786acddefa1d4334d0eb1b86689
Author: Mayank Shrivastava 
AuthorDate: Sat May 30 21:56:06 2020 -0700

Support distinctCountRawThetaSketch aggregation that returns serialized 
sketch. (#5465)

1. Support a variation of the theta-sketch-based distinct count aggregation
   function that returns the serialized bytes of the final aggregated sketch,
   instead of the actual distinct count.

2. The return value is a hex-encoded String of the serialized sketch bytes.
   It can be decoded on the client side using
   org.apache.commons.codec.binary as:
   `Hex.decodeHex(stringValue.toCharArray())`. This is the same as any
   other byte[] value returned by Pinot.

3. Added unit test for the new aggregation function.
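
To illustrate the client-side decoding step in item 2, here is a minimal, dependency-free sketch. It is an assumption-laden stand-in: `HexDecodeSketch.decodeHex` is a hypothetical helper that mirrors what commons-codec's `Hex.decodeHex` does for well-formed input, and the sample string is made up rather than a real serialized sketch.

```java
import java.util.Arrays;

final class HexDecodeSketch {
  /** Decodes a hex string (two characters per byte) into the bytes it encodes. */
  static byte[] decodeHex(String hex) {
    if (hex.length() % 2 != 0) {
      throw new IllegalArgumentException("Hex string must have an even length");
    }
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(hex.charAt(2 * i), 16);
      int lo = Character.digit(hex.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("Invalid hex character near index " + (2 * i));
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }

  public static void main(String[] args) {
    // Made-up value standing in for the String returned by the aggregation.
    byte[] sketchBytes = decodeHex("00ff10");
    System.out.println(Arrays.toString(sketchBytes)); // prints [0, -1, 16]
  }
}
```

The resulting byte[] would then be handed to the theta sketch library for deserialization; that step is omitted here because it depends on the client's DataSketches version.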
---
 .../common/function/AggregationFunctionType.java   |   1 +
 .../function/AggregationFunctionFactory.java   |   2 +
 ...inctCountRawThetaSketchAggregationFunction.java | 140 +
 ...istinctCountThetaSketchAggregationFunction.java |  16 ++-
 .../queries/DistinctCountThetaSketchTest.java  | 115 +++--
 5 files changed, 235 insertions(+), 39 deletions(-)

diff --git 
a/pinot-common/src/main/java/org/apache/pinot/common/function/AggregationFunctionType.java
 
b/pinot-common/src/main/java/org/apache/pinot/common/function/AggregationFunctionType.java
index af31639..ff3fb50 100644
--- 
a/pinot-common/src/main/java/org/apache/pinot/common/function/AggregationFunctionType.java
+++ 
b/pinot-common/src/main/java/org/apache/pinot/common/function/AggregationFunctionType.java
@@ -31,6 +31,7 @@ public enum AggregationFunctionType {
   DISTINCTCOUNTRAWHLL("distinctCountRawHLL"),
   FASTHLL("fastHLL"),
   DISTINCTCOUNTTHETASKETCH("distinctCountThetaSketch"),
+  DISTINCTCOUNTRAWTHETASKETCH("distinctCountRawThetaSketch"),
   PERCENTILE("percentile"),
   PERCENTILEEST("percentileEst"),
   PERCENTILETDIGEST("percentileTDigest"),
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionFactory.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionFactory.java
index f21f7fb..496b50e 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionFactory.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionFactory.java
@@ -137,6 +137,8 @@ public class AggregationFunctionFactory {
 return new FastHLLAggregationFunction(column);
   case DISTINCTCOUNTTHETASKETCH:
 return new DistinctCountThetaSketchAggregationFunction(arguments);
+  case DISTINCTCOUNTRAWTHETASKETCH:
+return new 
DistinctCountRawThetaSketchAggregationFunction(arguments);
   case COUNTMV:
 return new CountMVAggregationFunction(column);
   case MINMV:
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountRawThetaSketchAggregationFunction.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountRawThetaSketchAggregationFunction.java
new file mode 100644
index 000..a620c00
--- /dev/null
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/query/aggregation/function/DistinctCountRawThetaSketchAggregationFunction.java
@@ -0,0 +1,140 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.query.aggregation.function;
+
+import java.util.List;
+import java.util.Map;
+import org.apache.calcite.sql.parser.SqlParseException;
+import org.apache.datasketches.theta.Sketch;
+import org.apache.pinot.common.function.AggregationFunctionType;
+import org.apache.pinot.common.request.transform.TransformExpressionTree;
+import 

[incubator-pinot] annotated tag release-0.4.0-rc4 updated (a694481 -> 1b29b98)

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to annotated tag release-0.4.0-rc4
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


*** WARNING: tag release-0.4.0-rc4 was modified! ***

from a694481  (commit)
  to 1b29b98  (tag)
 tagging a6944816ee3c8ddfa272e7ae66ebf6b9d3029653 (commit)
  by Xiang Fu
  on Sat May 30 13:13:24 2020 -0700

- Log -
[maven-release-plugin] copy for tag release-0.4.0-rc4
---


No new revisions were added by this update.

Summary of changes:





[incubator-pinot] branch release-0.4.0-rc updated: [maven-release-plugin] prepare for next development iteration

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a commit to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/release-0.4.0-rc by this push:
 new 669627b  [maven-release-plugin] prepare for next development iteration
669627b is described below

commit 669627bee18b78a91fd996331a2bca82172ae2c7
Author: Xiang Fu 
AuthorDate: Sat May 30 13:13:29 2020 -0700

[maven-release-plugin] prepare for next development iteration
---
 pinot-broker/pom.xml  | 2 +-
 pinot-clients/pinot-java-client/pom.xml   | 2 +-
 pinot-clients/pom.xml | 2 +-
 pinot-common/pom.xml  | 2 +-
 pinot-controller/pom.xml  | 2 +-
 pinot-core/pom.xml| 2 +-
 pinot-distribution/pom.xml| 2 +-
 pinot-integration-tests/pom.xml   | 2 +-
 pinot-minion/pom.xml  | 2 +-
 pinot-perf/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-standalone/pom.xml| 2 +-
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 2 +-
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 2 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 2 +-
 pinot-plugins/pinot-file-system/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 2 +-
 pinot-plugins/pom.xml | 2 +-
 pinot-server/pom.xml  | 2 +-
 pinot-spi/pom.xml | 2 +-
 pinot-tools/pom.xml   | 2 +-
 pom.xml   | 4 ++--
 42 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index 2fd885d..a884630 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -24,7 +24,7 @@
   
 pinot
 org.apache.pinot
-0.4.0
+0.5.0-SNAPSHOT
 ..
   
   pinot-broker
diff --git a/pinot-clients/pinot-java-client/pom.xml 
b/pinot-clients/pinot-java-client/pom.xml
index 615c5e9..60177d7 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -24,7 +24,7 @@
   
 pinot-clients
 org.apache.pinot
-0.4.0
+0.5.0-SNAPSHOT
 ..
   
   pinot-java-client
diff --git a/pinot-clients/pom.xml b/pinot-clients/pom.xml
index e05e7ec..902f54f 100644
--- a/pinot-clients/pom.xml
+++ b/pinot-clients/pom.xml
@@ -24,7 +24,7 @@
   
 pinot
 org.apache.pinot
-0.4.0
+0.5.0-SNAPSHOT
 ..
   
   pinot-clients
diff --git a/pinot-common/pom.xml b/pinot-common/pom.xml
index db2d28d..7f2a7fc 100644
--- a/pinot-common/pom.xml
+++ b/pinot-common/pom.xml
@@ -24,7 +24,7 @@
   
 pinot
 org.apache.pinot
-0.4.0
+0.5.0-SNAPSHOT
 ..
   
   

[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


mcvsubbu commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432897195



##
File path: pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
    */
   private boolean isNoDictionaryColumn(Set<String> noDictionaryColumns, Set<String> invertedIndexColumns,
       Set<String> textIndexColumns, FieldSpec fieldSpec, String column) {
-    return textIndexColumns.contains(column) || (noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-        && !invertedIndexColumns.contains(column));
+    if (textIndexColumns.contains(column)) {

Review comment:
   This check here seems a little dangerous to me. We do have column-level
settings, so it is better to throw an exception if a column has both a text
index and a dictionary. If we somehow add a dictionary for text columns later,
we will have to remember to come back and change this place.
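
A minimal sketch of the fail-fast check suggested in this comment. It is an illustration under stated assumptions, not code from the PR: the class and method names are hypothetical, and a column is treated as "having a dictionary" when it is absent from the no-dictionary set.

```java
import java.util.Set;

final class TextIndexConfigCheck {
  /**
   * Hypothetical helper, not part of Pinot: fails fast when a text-index column would also
   * get a dictionary, i.e. when it is missing from the no-dictionary set.
   */
  static void validateTextIndexColumn(String column, Set<String> textIndexColumns,
      Set<String> noDictionaryColumns) {
    if (textIndexColumns.contains(column) && !noDictionaryColumns.contains(column)) {
      throw new IllegalStateException(
          "Column '" + column + "' has a text index and a dictionary; declare it as no-dictionary");
    }
  }

  public static void main(String[] args) {
    // Consistent config: the text column is also declared no-dictionary, so this passes.
    validateTextIndexColumn("body", Set.of("body"), Set.of("body"));
    try {
      // Conflicting config: text index requested, but the column would get a dictionary.
      validateTextIndexColumn("body", Set.of("body"), Set.of());
    } catch (IllegalStateException e) {
      System.out.println("Rejected: " + e.getMessage());
    }
  }
}
```

Running such a check once, when the table config is read, would keep the dictionary decision itself free of special cases for text columns.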





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] fx19880617 merged pull request #5469: Fixing mobile site image responsiveness

2020-05-30 Thread GitBox


fx19880617 merged pull request #5469:
URL: https://github.com/apache/incubator-pinot/pull/5469


   






[incubator-pinot] branch master updated (44a1e2e -> de97edc)

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 44a1e2e  Refactor DistinctTable to use PriorityQueue based algorithm (#5451)
 add de97edc  Fixing mobile site image responsiveness (#5469)

No new revisions were added by this update.

Summary of changes:
 website/package.json| 32 
 website/src/pages/index.css |  7 +--
 website/src/pages/index.js  |  5 +++--
 3 files changed, 24 insertions(+), 20 deletions(-)





[GitHub] [incubator-pinot] snleee commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


snleee commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432898475



##
File path: pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentColumnarIndexCreator.java
##
@@ -213,6 +215,14 @@ public void init(SegmentGeneratorConfig segmentCreationSpec, SegmentIndexCreatio
     }
   }
 
+  public static boolean shouldDeriveNumDocsPerChunk(String columnName, Map<String, Map<String, String>> columnProperties) {
+    if (columnProperties != null) {
+      Map<String, String> properties = columnProperties.get(columnName);
+      return properties != null && Boolean.parseBoolean(properties.get(FieldConfig.DERIVE_NUM_DOCS_PER_CHUNK_RAW_INDEX_KEY));

Review comment:
   @mcvsubbu deriving was the cause of our production issue, so we need a
way to control whether we derive the value or use the default.
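
A rough, self-contained sketch of the gate being discussed. The property-key string, the 1 MB target chunk size, and the per-entry offset overhead are assumptions made for illustration and are not taken from this thread; the default of 1000 is the previously hardcoded value mentioned in the PR's commit message.

```java
import java.util.HashMap;
import java.util.Map;

final class NumDocsPerChunkSketch {
  // Assumed key string; the PR refers to FieldConfig.DERIVE_NUM_DOCS_PER_CHUNK_RAW_INDEX_KEY,
  // whose actual value is not shown in this thread.
  static final String DERIVE_KEY = "deriveNumDocsPerChunkForRawIndex";
  // Previously hardcoded value, per the commit message of this PR.
  static final int DEFAULT_NUM_DOCS_PER_CHUNK = 1000;
  // Assumed target chunk size and per-entry offset overhead (illustrative only).
  static final int TARGET_MAX_CHUNK_SIZE_BYTES = 1024 * 1024;
  static final int ROW_OFFSET_SIZE_BYTES = Integer.BYTES;

  /** Returns true only if the column explicitly opts in through its properties. */
  static boolean shouldDeriveNumDocsPerChunk(String columnName,
      Map<String, Map<String, String>> columnProperties) {
    if (columnProperties == null) {
      return false;
    }
    Map<String, String> properties = columnProperties.get(columnName);
    // Boolean.parseBoolean(null) is false, so a missing key keeps the default behaviour.
    return properties != null && Boolean.parseBoolean(properties.get(DERIVE_KEY));
  }

  /** Uses the fixed default unless derivation is enabled for the column. */
  static int numDocsPerChunk(boolean derive, int lengthOfLongestEntry) {
    if (!derive) {
      return DEFAULT_NUM_DOCS_PER_CHUNK;
    }
    int bytesPerEntry = lengthOfLongestEntry + ROW_OFFSET_SIZE_BYTES;
    return Math.max(TARGET_MAX_CHUNK_SIZE_BYTES / bytesPerEntry, 1);
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> columnProperties = new HashMap<>();
    Map<String, String> textColumnProps = new HashMap<>();
    textColumnProps.put(DERIVE_KEY, "true");
    columnProperties.put("textBlob", textColumnProps);

    for (String column : new String[] {"textBlob", "shortString"}) {
      boolean derive = shouldDeriveNumDocsPerChunk(column, columnProperties);
      System.out.println(column + ": derive=" + derive
          + ", numDocsPerChunk=" + numDocsPerChunk(derive, 10));
    }
  }
}
```

Under these assumed constants, a short column that opts in gets far more docs per chunk than the default of 1000, which is the effect the high-QPS use cases want to avoid.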








[incubator-pinot] branch master updated: Two changes: (#5470)

2020-05-30 Thread siddteotia
This is an automated email from the ASF dual-hosted git repository.

siddteotia pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/master by this push:
 new ee21e79  Two changes: (#5470)
ee21e79 is described below

commit ee21e793c0365f538bcb3b801f47122c59fc0e04
Author: Sidd 
AuthorDate: Sat May 30 19:21:23 2020 -0700

Two changes: (#5470)

(1) PR https://github.com/apache/incubator-pinot/pull/5256
added support for deriving the number of docs per chunk for var byte
raw index creation from the column length. This was specifically
done as part of supporting text blobs. High-QPS use cases that
don't want this feature see a negative impact, since the chunk size
increases (the earlier value of numDocsPerChunk was hardcoded to
1000) and, depending on the access pattern, we might end up
uncompressing a bigger chunk to get values for a set of docIds.
We have made the derivation configurable, so the default behaviour
is the same as before (1000 docs per chunk).

(2) PR https://github.com/apache/incubator-pinot/pull/4791
added support for noDict for STRING/BYTES in consuming segments.
This change affects use cases that have set noDict on their STRING
dimension columns for other performance reasons and also want
metrics aggregation. These use cases no longer get metrics
aggregation, because the new implementation honors their table
config setting of noDict on STRING/BYTES. Without metrics
aggregation, memory pressure increases. So, to continue aggregating
metrics for such cases, we will create a dictionary even if the
column is part of the noDictionary set from the table config.

Co-authored-by: Siddharth Teotia 
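
The dictionary decision described in (2) can be sketched roughly as below. This is one reading of the commit message, not the committed code: the class, enum, and parameter names are hypothetical, and other conditions the real method may apply (for example single-value or inverted-index checks) are left out.

```java
final class NoDictionaryDecisionSketch {
  // Simplified stand-in for the column data types mentioned in the commit message.
  enum DataType { INT, LONG, FLOAT, DOUBLE, STRING, BYTES }

  /**
   * Returns true if the consuming segment should skip the dictionary for this column.
   * A STRING/BYTES dimension column keeps its dictionary when metrics aggregation is on,
   * even if the table config lists it as no-dictionary.
   */
  static boolean isNoDictionaryColumn(boolean inNoDictionarySet, boolean isDimension,
      boolean aggregateMetrics, DataType dataType) {
    if (!inNoDictionarySet) {
      return false;
    }
    boolean isVarWidth = dataType == DataType.STRING || dataType == DataType.BYTES;
    if (isDimension && aggregateMetrics && isVarWidth) {
      // Create the dictionary anyway so that metrics aggregation keeps working.
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    System.out.println("STRING dimension, aggregation on:  noDict="
        + isNoDictionaryColumn(true, true, true, DataType.STRING));
    System.out.println("STRING dimension, aggregation off: noDict="
        + isNoDictionaryColumn(true, true, false, DataType.STRING));
  }
}
```

The trade-off is that the table config's noDict setting is overridden for these columns, in exchange for lower memory pressure from aggregated metrics.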
---
 .../generator/SegmentGeneratorConfig.java  | 15 +++
 .../indexsegment/mutable/MutableSegmentImpl.java   | 29 +++---
 .../creator/impl/SegmentColumnarIndexCreator.java  | 22 ++--
 .../fwd/SingleValueVarByteRawIndexCreator.java | 10 +++-
 .../defaultcolumn/BaseDefaultColumnHandler.java|  3 ++-
 .../apache/pinot/spi/config/table/FieldConfig.java |  1 +
 6 files changed, 72 insertions(+), 8 deletions(-)

diff --git a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
index 59531fe..9af9c16 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
@@ -102,6 +102,9 @@ public class SegmentGeneratorConfig {
   private boolean _skipTimeValueCheck = false;
   private boolean _nullHandlingEnabled = false;
 
+  // constructed from FieldConfig
+  private Map<String, Map<String, String>> _columnProperties = new HashMap<>();
+
   @Deprecated
   public SegmentGeneratorConfig() {
   }
@@ -174,12 +177,24 @@ public class SegmentGeneratorConfig {
         _invertedIndexCreationColumns = indexingConfig.getInvertedIndexColumns();
       }
 
+      List<FieldConfig> fieldConfigList = tableConfig.getFieldConfigList();
+      if (fieldConfigList != null) {
+        for (FieldConfig fieldConfig : fieldConfigList) {
+          _columnProperties.put(fieldConfig.getName(), fieldConfig.getProperties());
+        }
+      }
+
       extractTextIndexColumnsFromTableConfig(tableConfig);
 
       _nullHandlingEnabled = indexingConfig.isNullHandlingEnabled();
     }
   }
 
+  @Nonnull
+  public Map<String, Map<String, String>> getColumnProperties() {
+    return _columnProperties;
+  }
+
   /**
    * Set time column details using the given time column
    */
diff --git a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
index 07e5ec9..0e24664 100644
--- a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
+++ b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
@@ -219,7 +219,7 @@ public class MutableSegmentImpl implements MutableSegment {
       FieldSpec.DataType dataType = fieldSpec.getDataType();
       boolean isFixedWidthColumn = dataType.isFixedWidth();
       int forwardIndexColumnSize = -1;
-      if (isNoDictionaryColumn(noDictionaryColumns, invertedIndexColumns, textIndexColumns, fieldSpec, column)) {
+      if (isNoDictionaryColumn(noDictionaryColumns, invertedIndexColumns, fieldSpec, column)) {
         // no dictionary
         // each forward index entry will be equal to size of data for that row
         // For INT, LONG, FLOAT, DOUBLE it is equal to the number of fixed bytes used to store the value,
@@ -329,9 +329,30 @@ public class MutableSegmentImpl implements MutableSegment {
* @return true if 

[GitHub] [incubator-pinot] siddharthteotia merged pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


siddharthteotia merged pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470


   






[incubator-pinot] 01/03: Add license

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 7f65bfe5e87dc3a3b2e743c991c534ab34a7aeac
Author: Haibo Wang 
AuthorDate: Fri May 29 01:10:14 2020 -0700

Add license
---
 licenses-binary/LICENSE-gpl-2.0.txt | 641 
 1 file changed, 641 insertions(+)

diff --git a/licenses-binary/LICENSE-gpl-2.0.txt b/licenses-binary/LICENSE-gpl-2.0.txt
new file mode 100644
index 0000000..b4a4b30
--- /dev/null
+++ b/licenses-binary/LICENSE-gpl-2.0.txt
@@ -0,0 +1,641 @@
+Apache Pinot (incubating)
+Copyright 2018 The Apache Software Foundation
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
+// --
+// NOTICE file corresponding to the section 4d of The Apache License,
+// Version 2.0, in this case for 
+// --
+
+The HermiteInterpolator class and its corresponding test have been imported from
+the orekit library distributed under the terms of the Apache 2 licence. Original
+source copyright:
+Copyright 2010-2012 CS Systèmes d'Information
+===
+
+This product includes software developed at
+The Apache Software Foundation (http://www.apache.org/).
+
+Apache Commons Configuration
+Copyright 2001-2008 The Apache Software Foundation
+
+This product includes software developed by
+The Apache Software Foundation (http://www.apache.org/).
+
+Apache Commons Collections
+Copyright 2001-2008 The Apache Software Foundation
+
+Apache Jakarta Commons Digester
+Copyright 2001-2006 The Apache Software Foundation
+
+Apache Commons BeanUtils
+Copyright 2000-2010 The Apache Software Foundation
+
+Apache Commons BeanUtils
+Copyright 2000-2008 The Apache Software Foundation
+
+Apache Commons Codec
+Copyright 2002-2011 The Apache Software Foundation
+
+
+src/test/org/apache/commons/codec/language/DoubleMetaphoneTest.java contains 
+test data from http://aspell.sourceforge.net/test/batch0.tab.
+
+Copyright (C) 2002 Kevin Atkinson (kev...@gnu.org). Verbatim copying
+and distribution of this entire article is permitted in any medium,
+provided this notice is preserved.
+
+
+Apache Commons IO
+Copyright 2002-2012 The Apache Software Foundation
+
+Apache Commons Lang
+Copyright 2001-2011 The Apache Software Foundation
+
+Apache Commons Logging
+Copyright 2003-2014 The Apache Software Foundation
+
+Apache Commons Lang
+Copyright 2001-2016 The Apache Software Foundation
+
+This product includes software from the Spring Framework,
+under the Apache License 2.0 (see: StringUtils.containsWhitespace())
+
+Apache Log4j SLF4J Binding
+Copyright 1999-2019 The Apache Software Foundation
+
+Apache Log4j API
+Copyright 1999-2019 The Apache Software Foundation
+
+Apache Log4j 1.x Compatibility API
+Copyright 1999-2019 The Apache Software Foundation
+
+=
+= NOTICE file corresponding to section 4d of the Apache License Version 2.0 =
+=
+This product includes software developed by
+Joda.org (https://www.joda.org/).
+
+# Jackson JSON processor
+
+Jackson is a high-performance, Free/Open Source JSON processing library.
+It was originally written by Tatu Saloranta (tatu.salora...@iki.fi), and has
+been in development since 2007.
+It is currently developed by a community of developers, as well as supported
+commercially by FasterXML.com.
+
+## Licensing
+
+Jackson core and extension components may be licensed under different licenses.
+To find the details that apply to this artifact see the accompanying LICENSE file.
+For more information, including possible other licensing options, contact
+FasterXML.com (http://fasterxml.com).
+
+## Credits
+
+A list of contributors may be found from CREDITS file, which is included
+in some artifacts (usually source distributions); but is always available
+from the source code management (SCM) system project uses.
+
+Apache Groovy
+Copyright 2003-2017 The Apache Software Foundation
+
+This product includes/uses ANTLR (http://www.antlr2.org/)
+developed by Terence Parr 1989-2006
+
+This product bundles icons from the famfamfam.com silk icons set
+http://www.famfamfam.com/lab/icons/silk/
+Licensed under the Creative Commons Attribution Licence v2.5
+http://creativecommons.org/licenses/by/2.5/
+
+Apache HttpClient Mime
+Copyright 1999-2017 The Apache Software Foundation
+
+Apache HttpClient
+Copyright 1999-2017 The Apache Software Foundation
+
+Apache HttpCore
+Copyright 2005-2017 The 

[incubator-pinot] 03/03: [maven-release-plugin] prepare release release-0.4.0-rc1

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit fcfd016ae9c4f922c31b46bd8dc3fd624e1a297a
Author: Haibo Wang 
AuthorDate: Sat May 30 21:41:19 2020 -0700

[maven-release-plugin] prepare release release-0.4.0-rc1
---
 pinot-broker/pom.xml  | 5 ++---
 pinot-clients/pinot-java-client/pom.xml   | 5 ++---
 pinot-clients/pom.xml | 6 ++
 pinot-common/pom.xml  | 5 ++---
 pinot-controller/pom.xml  | 5 ++---
 pinot-core/pom.xml| 5 ++---
 pinot-distribution/pom.xml| 7 +++
 pinot-integration-tests/pom.xml   | 5 ++---
 pinot-minion/pom.xml  | 5 ++---
 pinot-perf/pom.xml| 5 ++---
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 6 ++
 .../pinot-batch-ingestion-standalone/pom.xml  | 6 ++
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 7 +++
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml   | 7 +++
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 6 ++
 pinot-plugins/pinot-file-system/pom.xml   | 6 ++
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 5 ++---
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 6 ++
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 6 ++
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 6 ++
 pinot-plugins/pom.xml | 8 +++-
 pinot-server/pom.xml  | 5 ++---
 pinot-spi/pom.xml | 5 ++---
 pinot-tools/pom.xml   | 5 ++---
 pom.xml   | 7 +++
 42 files changed, 89 insertions(+), 149 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index d9e48e1..2fd885d 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -19,13 +19,12 @@
     under the License.
 
 -->
-<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>${revision}${sha1}</version>
+    <version>0.4.0</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>
diff --git a/pinot-clients/pinot-java-client/pom.xml b/pinot-clients/pinot-java-client/pom.xml
index 6a98e3d..615c5e9 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -19,13 +19,12 @@
     under the License.
 
 -->
-<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0

[incubator-pinot] 02/03: remove distributionManagement

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 6ab6d4c4191bae8d7f51b04078d651daf310a093
Author: Haibo Wang 
AuthorDate: Sat May 30 20:33:47 2020 -0700

remove distributionManagement
---
 pom.xml | 8 
 1 file changed, 8 deletions(-)

diff --git a/pom.xml b/pom.xml
index 2b6ae90..7024d3a 100644
--- a/pom.xml
+++ b/pom.xml
@@ -97,14 +97,6 @@
   
   2018
 
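-  <distributionManagement>
-    <repository>
-      <id>bintray-linkedin-maven</id>
-      <name>linkedin-maven</name>
-      <url>https://api.bintray.com/maven/linkedin/maven/pinot/;publish=1;override=1</url>
-    </repository>
-  </distributionManagement>
-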
   
 ${basedir}
 0.4.0





[incubator-pinot] branch release-0.4.0-rc updated: [maven-release-plugin] prepare for next development iteration

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a commit to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git


The following commit(s) were added to refs/heads/release-0.4.0-rc by this push:
 new 938b171  [maven-release-plugin] prepare for next development iteration
938b171 is described below

commit 938b1711617a7d2b95e38ff03b8dbb1a32dcfce6
Author: Haibo Wang 
AuthorDate: Sat May 30 21:41:39 2020 -0700

[maven-release-plugin] prepare for next development iteration
---
 pinot-broker/pom.xml  | 2 +-
 pinot-clients/pinot-java-client/pom.xml   | 2 +-
 pinot-clients/pom.xml | 2 +-
 pinot-common/pom.xml  | 2 +-
 pinot-controller/pom.xml  | 2 +-
 pinot-core/pom.xml| 2 +-
 pinot-distribution/pom.xml| 2 +-
 pinot-integration-tests/pom.xml   | 2 +-
 pinot-minion/pom.xml  | 2 +-
 pinot-perf/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 2 +-
 .../pinot-batch-ingestion/pinot-batch-ingestion-standalone/pom.xml| 2 +-
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 2 +-
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 2 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml | 2 +-
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 2 +-
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 2 +-
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 2 +-
 pinot-plugins/pinot-file-system/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 2 +-
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 2 +-
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 2 +-
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 2 +-
 pinot-plugins/pinot-input-format/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 2 +-
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 2 +-
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 2 +-
 pinot-plugins/pom.xml | 2 +-
 pinot-server/pom.xml  | 2 +-
 pinot-spi/pom.xml | 2 +-
 pinot-tools/pom.xml   | 2 +-
 pom.xml   | 4 ++--
 42 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index 2fd885d..a884630 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
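@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>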
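diff --git a/pinot-clients/pinot-java-client/pom.xml b/pinot-clients/pinot-java-client/pom.xml
index 615c5e9..60177d7 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot-clients</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-java-client</artifactId>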
diff --git a/pinot-clients/pom.xml b/pinot-clients/pom.xml
index e05e7ec..902f54f 100644
--- a/pinot-clients/pom.xml
+++ b/pinot-clients/pom.xml
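@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-clients</artifactId>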
diff --git a/pinot-common/pom.xml b/pinot-common/pom.xml
index db2d28d..7f2a7fc 100644
--- a/pinot-common/pom.xml
+++ b/pinot-common/pom.xml
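@@ -24,7 +24,7 @@
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>0.4.0</version>
+    <version>0.5.0-SNAPSHOT</version>
     <relativePath>..</relativePath>
   </parent>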
   

[incubator-pinot] annotated tag release-0.4.0-rc1 updated (fcfd016 -> 4bbd301)

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a change to annotated tag release-0.4.0-rc1
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


*** WARNING: tag release-0.4.0-rc1 was modified! ***

from fcfd016  (commit)
  to 4bbd301  (tag)
 tagging fcfd016ae9c4f922c31b46bd8dc3fd624e1a297a (commit)
  by Haibo Wang
  on Sat May 30 21:41:34 2020 -0700

- Log -
[maven-release-plugin] copy for tag release-0.4.0-rc1
---


No new revisions were added by this update.

Summary of changes:





[incubator-pinot] branch release-0.4.0-rc created (now fcfd016)

2020-05-30 Thread haibow
This is an automated email from the ASF dual-hosted git repository.

haibow pushed a change to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at fcfd016  [maven-release-plugin] prepare release release-0.4.0-rc1

This branch includes the following new commits:

 new 7f65bfe  Add license
 new 6ab6d4c  remove distributionManagement
 new fcfd016  [maven-release-plugin] prepare release release-0.4.0-rc1

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[incubator-pinot] 01/01: [maven-release-plugin] prepare release release-0.4.0-rc4

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a commit to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit a6944816ee3c8ddfa272e7ae66ebf6b9d3029653
Author: Xiang Fu 
AuthorDate: Sat May 30 13:09:50 2020 -0700

[maven-release-plugin] prepare release release-0.4.0-rc4
---
 pinot-broker/pom.xml  | 5 ++---
 pinot-clients/pinot-java-client/pom.xml   | 5 ++---
 pinot-clients/pom.xml | 6 ++
 pinot-common/pom.xml  | 5 ++---
 pinot-controller/pom.xml  | 5 ++---
 pinot-core/pom.xml| 5 ++---
 pinot-distribution/pom.xml| 7 +++
 pinot-integration-tests/pom.xml   | 5 ++---
 pinot-minion/pom.xml  | 5 ++---
 pinot-perf/pom.xml| 5 ++---
 .../pinot-batch-ingestion/pinot-batch-ingestion-common/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-hadoop/pom.xml| 6 ++
 .../pinot-batch-ingestion/pinot-batch-ingestion-spark/pom.xml | 6 ++
 .../pinot-batch-ingestion-standalone/pom.xml  | 6 ++
 pinot-plugins/pinot-batch-ingestion/pom.xml   | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-hadoop/pom.xml  | 7 +++
 .../v0_deprecated/pinot-ingestion-common/pom.xml  | 6 ++
 .../pinot-batch-ingestion/v0_deprecated/pinot-spark/pom.xml   | 7 +++
 pinot-plugins/pinot-batch-ingestion/v0_deprecated/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-adls/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-gcs/pom.xml | 6 ++
 pinot-plugins/pinot-file-system/pinot-hdfs/pom.xml| 5 ++---
 pinot-plugins/pinot-file-system/pinot-s3/pom.xml  | 6 ++
 pinot-plugins/pinot-file-system/pom.xml   | 6 ++
 pinot-plugins/pinot-input-format/pinot-avro-base/pom.xml  | 5 ++---
 pinot-plugins/pinot-input-format/pinot-avro/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-confluent-avro/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pinot-csv/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-json/pom.xml   | 5 ++---
 pinot-plugins/pinot-input-format/pinot-orc/pom.xml| 6 ++
 pinot-plugins/pinot-input-format/pinot-parquet/pom.xml| 5 ++---
 pinot-plugins/pinot-input-format/pinot-thrift/pom.xml | 5 ++---
 pinot-plugins/pinot-input-format/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-0.9/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-2.0/pom.xml  | 6 ++
 pinot-plugins/pinot-stream-ingestion/pinot-kafka-base/pom.xml | 6 ++
 pinot-plugins/pinot-stream-ingestion/pom.xml  | 6 ++
 pinot-plugins/pom.xml | 8 +++-
 pinot-server/pom.xml  | 5 ++---
 pinot-spi/pom.xml | 5 ++---
 pinot-tools/pom.xml   | 5 ++---
 pom.xml   | 7 +++
 42 files changed, 89 insertions(+), 149 deletions(-)

diff --git a/pinot-broker/pom.xml b/pinot-broker/pom.xml
index d9e48e1..2fd885d 100644
--- a/pinot-broker/pom.xml
+++ b/pinot-broker/pom.xml
@@ -19,13 +19,12 @@
     under the License.
 
 -->
-<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <modelVersion>4.0.0</modelVersion>
   <parent>
     <artifactId>pinot</artifactId>
     <groupId>org.apache.pinot</groupId>
-    <version>${revision}${sha1}</version>
+    <version>0.4.0</version>
     <relativePath>..</relativePath>
   </parent>
   <artifactId>pinot-broker</artifactId>
diff --git a/pinot-clients/pinot-java-client/pom.xml b/pinot-clients/pinot-java-client/pom.xml
index 6a98e3d..615c5e9 100644
--- a/pinot-clients/pinot-java-client/pom.xml
+++ b/pinot-clients/pinot-java-client/pom.xml
@@ -19,13 +19,12 @@
     under the License.
 
 -->
-<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
-         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0

[incubator-pinot] branch release-0.4.0-rc created (now a694481)

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch release-0.4.0-rc
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at a694481  [maven-release-plugin] prepare release release-0.4.0-rc4

This branch includes the following new commits:

 new a694481  [maven-release-plugin] prepare release release-0.4.0-rc4

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[GitHub] [incubator-pinot] snleee commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


snleee commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432895436



##
File path: pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
    */
   private boolean isNoDictionaryColumn(Set<String> noDictionaryColumns, Set<String> invertedIndexColumns,
       Set<String> textIndexColumns, FieldSpec fieldSpec, String column) {
-    return textIndexColumns.contains(column) || (noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-        && !invertedIndexColumns.contains(column));
+    if (textIndexColumns.contains(column)) {
+      // text column is no dictionary currently
+      return true;
+    }
+    FieldSpec.DataType dataType = fieldSpec.getDataType();
+    if (noDictionaryColumns.contains(column)) {
+      // Earlier we didn't support noDict in consuming segments for STRING and BYTES columns.
+      // So even if the user had the column in noDictionaryColumns set in table config, we still
+      // created dictionary in consuming segments.
+      // Later on we added this support. There is a particular impact of this change on the use cases
+      // that have set noDict on their STRING dimension columns for other performance
+      // reasons and also want metricsAggregation. These use cases don't get to
+      // aggregateMetrics because the new implementation is able to honor their table config setting
+      // of noDict on STRING/BYTES. Without metrics aggregation, memory pressure increases.
+      // So to continue aggregating metrics for such cases, we will create dictionary even
+      // if the column is part of noDictionary set from table config
+      if (fieldSpec instanceof DimensionFieldSpec && _aggregateMetrics && (dataType == FieldSpec.DataType.STRING ||

Review comment:
   What was the original behavior before your recent change? Did we 
explicitly check `STRING` and `BYTES` types also?

##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentColumnarIndexCreator.java
##
@@ -193,9 +194,10 @@ public void init(SegmentGeneratorConfig 
segmentCreationSpec, SegmentIndexCreatio
 getColumnCompressionType(segmentCreationSpec, fieldSpec);
 
 // Initialize forward index creator
+boolean deriveNumChunksForVarByteRawIndex = 
shouldDeriveNumChunksForRawIndex(columnName, 
segmentCreationSpec.getColumnProperties());

Review comment:
   `deriveNumDocsPerChunk` sounds more accurate? (applicable to all related 
config & variable names)





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


siddharthteotia commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432898453



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentColumnarIndexCreator.java
##
@@ -213,6 +215,14 @@ public void init(SegmentGeneratorConfig 
segmentCreationSpec, SegmentIndexCreatio
 }
   }
 
+  public static boolean shouldDeriveNumDocsPerChunk(String columnName, 
Map<String, Map<String, String>> columnProperties) {
+if (columnProperties != null) {
+  Map properties = columnProperties.get(columnName);
+  return properties != null && 
Boolean.parseBoolean(properties.get(FieldConfig.DERIVE_NUM_DOCS_PER_CHUNK_RAW_INDEX_KEY));

Review comment:
   > Defaults to false, right?
   
   Yes
   
   > Can we derive it automatically? (e.g. if column is text index then we 
derive it from metadata) Or, do you see this being useful in other cases as 
well?
   
   We could. Even for columns with text indexes, I don't think we should use it 
by default, now that we have seen the potential negative impact related to the 
access pattern. Yes, most likely this will be used for columns with a text index, 
but only if the average column value size is very large (around 1-2MB), since 
that is the case where the chunk size and the compressed chunk size (2 * raw) 
exceed 2GB and deriving the numDocsPerChunk becomes useful.
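
The arithmetic behind this reply can be sketched as follows. This is only an illustration, not Pinot's implementation: the class name, the 1MB target chunk size, and the reading of the 2GB limit as the largest size addressable with a Java `int` are all assumptions made for the sketch. Only the default of 1000 docs per chunk and the "2 * raw" factor come from the thread.

```java
final class ChunkSizeSketch {
  // Fixed value that the commit message for this change gives as the old default.
  static final int DEFAULT_NUM_DOCS_PER_CHUNK = 1000;

  // Hypothetical target for a raw chunk, chosen only for this sketch.
  static final long TARGET_CHUNK_BYTES = 1024L * 1024L;

  // Worst-case compression buffer, using the "2 * raw" factor from the comment above.
  static long compressedChunkBytes(int numDocsPerChunk, long valueBytes) {
    return 2L * numDocsPerChunk * valueBytes;
  }

  // Assumption: the 2GB limit is the largest size addressable with a Java int.
  static boolean exceedsIntRange(int numDocsPerChunk, long valueBytes) {
    return compressedChunkBytes(numDocsPerChunk, valueBytes) > Integer.MAX_VALUE;
  }

  // Picks docs per chunk so one raw chunk stays near the target, with a floor of 1.
  static int deriveNumDocsPerChunk(long valueBytes) {
    long perDoc = Math.max(valueBytes, 1L);
    return (int) Math.max(TARGET_CHUNK_BYTES / perDoc, 1L);
  }

  public static void main(String[] args) {
    long twoMb = 2L * 1024L * 1024L;
    // 2MB values with 1000 docs per chunk overflow the int range.
    System.out.println(exceedsIntRange(DEFAULT_NUM_DOCS_PER_CHUNK, twoMb));
    // 100-byte values with 1000 docs per chunk do not.
    System.out.println(exceedsIntRange(DEFAULT_NUM_DOCS_PER_CHUNK, 100L));
    // Derived docs per chunk for large and small values.
    System.out.println(deriveNumDocsPerChunk(twoMb));
    System.out.println(deriveNumDocsPerChunk(100L));
  }
}
```

Under these assumptions, deriving helps only for very large values. For small values it packs far more than 1000 docs into a chunk, which is the access-pattern cost mentioned above.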








[GitHub] [incubator-pinot] snleee commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


snleee commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432898475



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentColumnarIndexCreator.java
##
@@ -213,6 +215,14 @@ public void init(SegmentGeneratorConfig 
segmentCreationSpec, SegmentIndexCreatio
 }
   }
 
+  public static boolean shouldDeriveNumDocsPerChunk(String columnName, 
Map<String, Map<String, String>> columnProperties) {
+if (columnProperties != null) {
+  Map properties = columnProperties.get(columnName);
+  return properties != null && 
Boolean.parseBoolean(properties.get(FieldConfig.DERIVE_NUM_DOCS_PER_CHUNK_RAW_INDEX_KEY));

Review comment:
   @mcvsubbu deriving was the cause of our production issue.








[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


mcvsubbu commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432905837



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentColumnarIndexCreator.java
##
@@ -213,6 +215,14 @@ public void init(SegmentGeneratorConfig 
segmentCreationSpec, SegmentIndexCreatio
 }
   }
 
+  public static boolean shouldDeriveNumDocsPerChunk(String columnName, 
Map<String, Map<String, String>> columnProperties) {
+if (columnProperties != null) {
+  Map properties = columnProperties.get(columnName);
+  return properties != null && 
Boolean.parseBoolean(properties.get(FieldConfig.DERIVE_NUM_DOCS_PER_CHUNK_RAW_INDEX_KEY));

Review comment:
   The reason I ask is that once we introduce a config, it is hard to 
remove or deprecate. If we make it the default for text columns, we can always 
introduce a config later to adjust. In both offline and realtime cases, we know 
the average column size (or can compute it easily) at segment generation 
time, so it seems to me that this can be done automatically without introducing 
a configuration. I would propose to NOT introduce a config at this time.








[GitHub] [incubator-pinot] fx19880617 opened a new pull request #5471: Update Superset image build

2020-05-30 Thread GitBox


fx19880617 opened a new pull request #5471:
URL: https://github.com/apache/incubator-pinot/pull/5471


   ## Description
   Update Superset docker image build file
   - Upgrade pinotdb version to 0.2.5
   - Allow customized superset repo
   
   ## Release Notes
   No
   






[incubator-pinot] branch upgrade_superset_docker_image_script created (now 3834da1)

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch upgrade_superset_docker_image_script
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


  at 3834da1  Update Superset image build

This branch includes the following new commits:

 new 3834da1  Update Superset image build

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[incubator-pinot] 01/01: Update Superset image build

2020-05-30 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a commit to branch upgrade_superset_docker_image_script
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 3834da1da5e94f361f7f97dd553be94946331d0f
Author: Xiang Fu 
AuthorDate: Sat May 30 16:22:28 2020 -0700

Update Superset image build
---
 docker/images/pinot-superset/Dockerfile  |  3 ++-
 docker/images/pinot-superset/bin/superset-init   | 13 +
 docker/images/pinot-superset/requirements-db.txt |  2 +-
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/docker/images/pinot-superset/Dockerfile 
b/docker/images/pinot-superset/Dockerfile
index 837b905..bc9c2e6 100644
--- a/docker/images/pinot-superset/Dockerfile
+++ b/docker/images/pinot-superset/Dockerfile
@@ -33,11 +33,12 @@ RUN mkdir /app \
 
 # Superset version to build
 ARG SUPERSET_VERSION=master
+ARG SUPERSET_REPO=https://github.com/apache/incubator-superset
 ENV SUPERSET_SRC=/app/superset-src
 
 # Download source
 WORKDIR ${SUPERSET_SRC}
-RUN wget -qO /tmp/superset.tar.gz 
https://github.com/apache/incubator-superset/archive/${SUPERSET_VERSION}.tar.gz 
\
+RUN wget -qO /tmp/superset.tar.gz 
${SUPERSET_REPO}/archive/${SUPERSET_VERSION}.tar.gz \
   && tar xzf /tmp/superset.tar.gz -C ${SUPERSET_SRC} --strip-components=1
 
 # First, we just wanna install requirements, which will allow us to utilize 
the cache
diff --git a/docker/images/pinot-superset/bin/superset-init 
b/docker/images/pinot-superset/bin/superset-init
new file mode 100644
index 0000000..48208d7
--- /dev/null
+++ b/docker/images/pinot-superset/bin/superset-init
@@ -0,0 +1,13 @@
+#!/bin/bash
+
+set -e
+
+# Create an admin user
+FLASK_APP=superset flask fab create-admin $@
+
+# Initialize the database
+superset db upgrade
+
+# Create default roles and permissions
+superset init
+
diff --git a/docker/images/pinot-superset/requirements-db.txt 
b/docker/images/pinot-superset/requirements-db.txt
index e778804..c123e63 100644
--- a/docker/images/pinot-superset/requirements-db.txt
+++ b/docker/images/pinot-superset/requirements-db.txt
@@ -20,4 +20,4 @@ gevent==1.4.0
 psycopg2-binary==2.7.5
 pyhive==0.6.1
 redis==3.2.1
-pinotdb==0.2.4
+pinotdb==0.2.5





[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


mcvsubbu commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432897362



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
*/
   private boolean isNoDictionaryColumn(Set noDictionaryColumns, 
Set invertedIndexColumns,
   Set textIndexColumns, FieldSpec fieldSpec, String column) {
-return textIndexColumns.contains(column) || 
(noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-&& !invertedIndexColumns.contains(column));
+if (textIndexColumns.contains(column)) {
+  // text column is no dictionary currently
+  return true;
+}
+FieldSpec.DataType dataType = fieldSpec.getDataType();
+if (noDictionaryColumns.contains(column)) {
+  // Earlier we didn't support noDict in consuming segments for STRING and 
BYTES columns.
+  // So even if the user had the column in noDictionaryColumns set in 
table config, we still
+  // created dictionary in consuming segments.
+  // Later on we added this support. There is a particular impact of this 
change on the use cases
+  // that have set noDict on their STRING dimension columns for other 
performance
+  // reasons and also want metricsAggregation. These use cases don't get to
+  // aggregateMetrics because the new implementation is able to honor 
their table config setting
+  // of noDict on STRING/BYTES. Without metrics aggregation, memory 
pressure increases.
+  // So to continue aggregating metrics for such cases, we will create 
dictionary even
+  // if the column is part of noDictionary set from table config
+  if (fieldSpec instanceof DimensionFieldSpec && _aggregateMetrics && 
(dataType == FieldSpec.DataType.STRING ||

Review comment:
   Should the name of the method be changed to 
`shouldCreateDictionaryForColumn()`?
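
The decision described in the code comment above can be sketched in isolation. This is a simplified illustration under stated assumptions, not the code from the pull request: the hunk quoted here is truncated, so the `BYTES` half of the type check is taken from the comment text ("STRING/BYTES"), and the fallback to the original single-value and inverted-index condition is assumed. The class name, the local `DataType` enum, and the flattened boolean parameters are hypothetical stand-ins for Pinot's `FieldSpec`.

```java
import java.util.Set;

final class DictionaryDecisionSketch {
  // Local stand-in for FieldSpec.DataType, defined only for this sketch.
  enum DataType { INT, LONG, FLOAT, DOUBLE, STRING, BYTES }

  // Returns true when the column should be stored without a dictionary.
  static boolean isNoDictionaryColumn(Set<String> noDictionaryColumns, Set<String> invertedIndexColumns,
      String column, DataType dataType, boolean isDimension, boolean isSingleValue,
      boolean aggregateMetrics) {
    if (!noDictionaryColumns.contains(column)) {
      return false;
    }
    boolean stringOrBytes = dataType == DataType.STRING || dataType == DataType.BYTES;
    if (isDimension && aggregateMetrics && stringOrBytes) {
      // Keep the dictionary so that metrics aggregation continues to work.
      return false;
    }
    // Assumed fallback: the condition from the removed lines of the hunk.
    return isSingleValue && !invertedIndexColumns.contains(column);
  }

  public static void main(String[] args) {
    Set<String> noDict = Set.of("dim");
    Set<String> inverted = Set.of();
    // noDict STRING dimension with metrics aggregation: dictionary is still created.
    System.out.println(isNoDictionaryColumn(noDict, inverted, "dim", DataType.STRING, true, true, true));
    // Same column without metrics aggregation: the noDict setting is honored.
    System.out.println(isNoDictionaryColumn(noDict, inverted, "dim", DataType.STRING, true, true, false));
  }
}
```

Because the method returns `false` exactly when a dictionary will be created, the proposed name `shouldCreateDictionaryForColumn()` would correspond to the negation of this result.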








[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


mcvsubbu commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432897422



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentColumnarIndexCreator.java
##
@@ -213,6 +215,14 @@ public void init(SegmentGeneratorConfig 
segmentCreationSpec, SegmentIndexCreatio
 }
   }
 
+  public static boolean shouldDeriveNumDocsPerChunk(String columnName, 
Map<String, Map<String, String>> columnProperties) {
+if (columnProperties != null) {
+  Map properties = columnProperties.get(columnName);
+  return properties != null && 
Boolean.parseBoolean(properties.get(FieldConfig.DERIVE_NUM_DOCS_PER_CHUNK_RAW_INDEX_KEY));

Review comment:
   Can we derive it automatically? (e.g. if column is text index then we 
derive it from metadata) Or, do you see this being useful in other cases as 
well?








[GitHub] [incubator-pinot] siddharthteotia commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


siddharthteotia commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432898231



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
##
@@ -330,8 +330,32 @@ public long getLatestIngestionTimestamp() {
*/
   private boolean isNoDictionaryColumn(Set noDictionaryColumns, 
Set invertedIndexColumns,
   Set textIndexColumns, FieldSpec fieldSpec, String column) {
-return textIndexColumns.contains(column) || 
(noDictionaryColumns.contains(column) && fieldSpec.isSingleValueField()
-&& !invertedIndexColumns.contains(column));
+if (textIndexColumns.contains(column)) {

Review comment:
   Actually, it is not needed anymore. I already do the validation upfront 
in TableConfig (already in master), so we can remove it. Per-column config is 
enough.








[GitHub] [incubator-pinot] snleee commented on a change in pull request #5470: Derive numDocsPerChunk for var byte raw index from metadata only if config is enabled.

2020-05-30 Thread GitBox


snleee commented on a change in pull request #5470:
URL: https://github.com/apache/incubator-pinot/pull/5470#discussion_r432898475



##
File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/creator/impl/SegmentColumnarIndexCreator.java
##
@@ -213,6 +215,14 @@ public void init(SegmentGeneratorConfig 
segmentCreationSpec, SegmentIndexCreatio
 }
   }
 
+  public static boolean shouldDeriveNumDocsPerChunk(String columnName, 
Map<String, Map<String, String>> columnProperties) {
+if (columnProperties != null) {
+  Map properties = columnProperties.get(columnName);
+  return properties != null && 
Boolean.parseBoolean(properties.get(FieldConfig.DERIVE_NUM_DOCS_PER_CHUNK_RAW_INDEX_KEY));

Review comment:
   @mcvsubbu deriving was the cause of our production issue. So, we need a 
way to control when we want to derive or use the default value.
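
The per-column switch under discussion can be exercised on its own. This sketch restates the method from the hunk above, with a few labeled assumptions. The generic types are reconstructed as `Map<String, Map<String, String>>`. The key string `"deriveNumDocsPerChunk"` is a hypothetical placeholder, because the value of `FieldConfig.DERIVE_NUM_DOCS_PER_CHUNK_RAW_INDEX_KEY` is not shown in this thread. The final `return false` is assumed for the part of the method that the hunk cuts off. The sketch also shows why the flag defaults to false: `Boolean.parseBoolean(null)` returns `false`, so a column with no such property keeps the default docs per chunk.

```java
import java.util.HashMap;
import java.util.Map;

final class DeriveFlagSketch {
  // Hypothetical key for this sketch; the real constant's value is not shown in the thread.
  static final String DERIVE_KEY = "deriveNumDocsPerChunk";

  static boolean shouldDeriveNumDocsPerChunk(String columnName,
      Map<String, Map<String, String>> columnProperties) {
    if (columnProperties != null) {
      Map<String, String> properties = columnProperties.get(columnName);
      // parseBoolean(null) is false, so a missing property means "do not derive".
      return properties != null && Boolean.parseBoolean(properties.get(DERIVE_KEY));
    }
    // Assumed fallback when no column properties are configured at all.
    return false;
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> columnProperties = new HashMap<>();
    Map<String, String> textColumn = new HashMap<>();
    textColumn.put(DERIVE_KEY, "true");
    columnProperties.put("text_col", textColumn);
    columnProperties.put("plain_col", new HashMap<>());

    System.out.println(shouldDeriveNumDocsPerChunk("text_col", columnProperties));
    System.out.println(shouldDeriveNumDocsPerChunk("plain_col", columnProperties));
    System.out.println(shouldDeriveNumDocsPerChunk("unknown_col", columnProperties));
  }
}
```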








[incubator-pinot] 01/02: Two changes:

2020-05-30 Thread siddteotia
This is an automated email from the ASF dual-hosted git repository.

siddteotia pushed a commit to branch hotfix-0530
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 9e35e157270e1d63ca9333e13f13951df30867ff
Author: Siddharth Teotia 
AuthorDate: Sat May 30 14:14:38 2020 -0700

Two changes:

(1) PR https://github.com/apache/incubator-pinot/pull/5256
added support for deriving num docs per chunk for the var byte
raw index creator from column length. This was specifically
done as part of supporting text blobs. Use cases that
don't want this feature and are high QPS see a negative
impact, since the size of a chunk increases (earlier the value
of numDocsPerChunk was hardcoded to 1000) and, based on the
access pattern, we might end up uncompressing a bigger chunk to get values
for a set of docIds. We have made this change configurable,
so the default behaviour is the same as before (1000 docs per chunk).

(2) PR https://github.com/apache/incubator-pinot/pull/4791
added support for noDict for STRING/BYTES in consuming segments.
There is a particular impact of this change on the use cases
that have set noDict on their STRING dimension columns for other performance
reasons and also want metricsAggregation. These use cases don't get
aggregateMetrics because the new implementation is able to honor their
table config setting of noDict on STRING/BYTES. Without metrics aggregation,
memory pressure increases. So to continue aggregating metrics for such cases,
we will create a dictionary even if the column is part of the noDictionary set
from table config.
---
 .../generator/SegmentGeneratorConfig.java  | 15 +++
 .../indexsegment/mutable/MutableSegmentImpl.java   | 29 +++---
 .../creator/impl/SegmentColumnarIndexCreator.java  | 22 ++--
 .../fwd/SingleValueVarByteRawIndexCreator.java | 10 +++-
 .../defaultcolumn/BaseDefaultColumnHandler.java|  3 ++-
 .../apache/pinot/spi/config/table/FieldConfig.java |  1 +
 6 files changed, 72 insertions(+), 8 deletions(-)

diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
index 59531fe..9af9c16 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/generator/SegmentGeneratorConfig.java
@@ -102,6 +102,9 @@ public class SegmentGeneratorConfig {
   private boolean _skipTimeValueCheck = false;
   private boolean _nullHandlingEnabled = false;
 
+  // constructed from FieldConfig
+  private Map<String, Map<String, String>> _columnProperties = new HashMap<>();
+
   @Deprecated
   public SegmentGeneratorConfig() {
   }
@@ -174,12 +177,24 @@ public class SegmentGeneratorConfig {
 _invertedIndexCreationColumns = 
indexingConfig.getInvertedIndexColumns();
   }
 
+  List fieldConfigList = tableConfig.getFieldConfigList();
+  if (fieldConfigList != null) {
+for (FieldConfig fieldConfig : fieldConfigList) {
+  _columnProperties.put(fieldConfig.getName(), 
fieldConfig.getProperties());
+}
+  }
+
   extractTextIndexColumnsFromTableConfig(tableConfig);
 
   _nullHandlingEnabled = indexingConfig.isNullHandlingEnabled();
 }
   }
 
+  @Nonnull
+  public Map<String, Map<String, String>> getColumnProperties() {
+return _columnProperties;
+  }
+
   /**
* Set time column details using the given time column
*/
diff --git 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
index 07e5ec9..0e24664 100644
--- 
a/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
+++ 
b/pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
@@ -219,7 +219,7 @@ public class MutableSegmentImpl implements MutableSegment {
   FieldSpec.DataType dataType = fieldSpec.getDataType();
   boolean isFixedWidthColumn = dataType.isFixedWidth();
   int forwardIndexColumnSize = -1;
-  if (isNoDictionaryColumn(noDictionaryColumns, invertedIndexColumns, 
textIndexColumns, fieldSpec, column)) {
+  if (isNoDictionaryColumn(noDictionaryColumns, invertedIndexColumns, 
fieldSpec, column)) {
 // no dictionary
 // each forward index entry will be equal to size of data for that row
 // For INT, LONG, FLOAT, DOUBLE it is equal to the number of fixed 
bytes used to store the value,
@@ -329,9 +329,30 @@ public class MutableSegmentImpl implements MutableSegment {
* @return true if column is no-dictionary, false if dictionary encoded
*/
   private boolean isNoDictionaryColumn(Set noDictionaryColumns, 
Set invertedIndexColumns,
-  Set 

[incubator-pinot] branch hotfix-0530 updated (54cac4a -> 5580af7)

2020-05-30 Thread siddteotia
This is an automated email from the ASF dual-hosted git repository.

siddteotia pushed a change to branch hotfix-0530
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


 discard 54cac4a  Two changes:
 new 9e35e15  Two changes:
 new 5580af7  Remove master branch restriction (#5467)

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (54cac4a)
\
 N -- N -- N   refs/heads/hotfix-0530 (5580af7)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .travis.yml |  8 
 .../pinot/core/indexsegment/mutable/MutableSegmentImpl.java | 13 +
 .../segment/creator/impl/SegmentColumnarIndexCreator.java   | 12 ++--
 .../creator/impl/fwd/SingleValueVarByteRawIndexCreator.java |  4 ++--
 .../loader/defaultcolumn/BaseDefaultColumnHandler.java  |  3 ++-
 .../java/org/apache/pinot/spi/config/table/FieldConfig.java |  2 +-
 6 files changed, 20 insertions(+), 22 deletions(-)





[incubator-pinot] 02/02: Remove master branch restriction (#5467)

2020-05-30 Thread siddteotia
This is an automated email from the ASF dual-hosted git repository.

siddteotia pushed a commit to branch hotfix-0530
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git

commit 5580af7f253d66a3421971e1400c1bb919a9f115
Author: Jialiang Li 
AuthorDate: Fri May 29 09:05:25 2020 -0700

Remove master branch restriction (#5467)

Co-authored-by: Jack Li(Analytics Engineering) 
---
 .travis.yml | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 84dfceb..76eb09b 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -25,14 +25,14 @@ addons:
 install:
   - ./.travis/.travis_install.sh
 
-branches:
-  only:
-- master
+#branches:
+#  only:
+#- master
 
 stages:
   - test
   - name: deploy
-if: branch = master
+#if: branch = master
 
 jobs:
   include:

