[GitHub] [incubator-pinot] akshayrai commented on a change in pull request #4960: [TE] Use upper and lower bounds as predicted for threshold algorithm

2020-01-06 Thread GitBox
akshayrai commented on a change in pull request #4960: [TE] Use upper and lower 
bounds as predicted for threshold algorithm
URL: https://github.com/apache/incubator-pinot/pull/4960#discussion_r363620810
 
 

 ##
 File path: 
thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/detection/components/ThresholdRuleDetector.java
 ##
 @@ -101,15 +100,24 @@ public TimeSeries computePredictedTimeSeries(MetricSlice 
slice) {
 InputData data =
 this.dataFetcher.fetchData(new 
InputDataSpec().withTimeseriesSlices(Collections.singletonList(slice)));
 DataFrame df = data.getTimeseries().get(slice);
-return TimeSeries.fromDataFrame(constructThresholdBoundaries(df));
+return TimeSeries.fromDataFrame(constructBaselineAndBoundaries(df));
   }
 
-  private DataFrame constructThresholdBoundaries(DataFrame df) {
+  /**
+   * Populate the dataframe with upper/lower boundaries and baseline
+   */
+  private DataFrame constructBaselineAndBoundaries(DataFrame df) {
+// Set default baseline as the actual value
+df.addSeries(COL_VALUE, df.getDoubles(COL_CURRENT));
 if (!Double.isNaN(this.min)) {
   df.addSeries(COL_LOWER_BOUND, DoubleSeries.fillValues(df.size(), 
this.min));
+  // set baseline value as the lower bound when actual value across below 
the mark
+  df.mapInPlace(DoubleSeries.MAX, COL_VALUE, COL_LOWER_BOUND, COL_VALUE);
 
 Review comment:
   ```Why we want to change the baseline value?```
   I updated the description. Pls take a look there. This is addressing user 
ticket APA-11365. This PR is not proposing any changes to upper/lower bounds. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #4963: Package rename pinot-plugin modules (part of #4941).

2020-01-06 Thread GitBox
kishoreg commented on a change in pull request #4963: Package rename 
pinot-plugin modules (part of #4941).
URL: https://github.com/apache/incubator-pinot/pull/4963#discussion_r363617632
 
 

 ##
 File path: 
pinot-plugins/pinot-batch-ingestion/v0_deprecated/pinot-hadoop/src/main/java/org/apache/pinot/hadoop/job/HadoopSegmentPreprocessingJob.java
 ##
 @@ -60,9 +60,9 @@
 import org.apache.pinot.hadoop.job.partitioners.GenericPartitioner;
 import org.apache.pinot.hadoop.job.reducers.SegmentPreprocessingReducer;
 import org.apache.pinot.hadoop.utils.PinotHadoopJobPreparationHelper;
-import org.apache.pinot.ingestion.common.ControllerRestApi;
-import org.apache.pinot.ingestion.common.JobConfigConstants;
-import org.apache.pinot.ingestion.jobs.SegmentPreprocessingJob;
+import org.apache.pinot.plugin.ingestion.batch.common.ControllerRestApi;
 
 Review comment:
   It might be better to not re-organize packages in v0_deprecated


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #4963: Package rename pinot-plugin modules (part of #4941).

2020-01-06 Thread GitBox
kishoreg commented on a change in pull request #4963: Package rename 
pinot-plugin modules (part of #4941).
URL: https://github.com/apache/incubator-pinot/pull/4963#discussion_r363616977
 
 

 ##
 File path: 
pinot-plugins/pinot-batch-ingestion/pinot-batch-ingestion-base/src/main/java/org/apache/pinot/plugin/ingestion/batch/standalone/SegmentGenerationJobRunner.java
 ##
 @@ -36,14 +36,14 @@
 import org.apache.commons.configuration.MapConfiguration;
 import org.apache.commons.io.FileUtils;
 import org.apache.commons.io.IOUtils;
+import org.apache.pinot.plugin.ingestion.batch.common.Constants;
 
 Review comment:
   we should ideally move these to spi. I will do this in another PR


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] kishoreg commented on a change in pull request #4963: Package rename pinot-plugin modules (part of #4941).

2020-01-06 Thread GitBox
kishoreg commented on a change in pull request #4963: Package rename 
pinot-plugin modules (part of #4941).
URL: https://github.com/apache/incubator-pinot/pull/4963#discussion_r363617882
 
 

 ##
 File path: 
pinot-spi/src/main/java/org/apache/pinot/spi/data/readers/RecordReaderFactory.java
 ##
 @@ -33,18 +33,18 @@
   private static final Map 
DEFAULT_RECORD_READER_CONFIG_CLASS_MAP = new HashMap<>();
 
   // TODO: This could be removed once we have dynamic loading plugins supports.
 
 Review comment:
   @fx19880617 can we remove this now?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] codecov-io commented on issue #4963: Package rename pinot-plugin modules (part of #4941).

2020-01-06 Thread GitBox
codecov-io commented on issue #4963: Package rename pinot-plugin modules (part 
of #4941).
URL: https://github.com/apache/incubator-pinot/pull/4963#issuecomment-571461556
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4963?src=pr=h1) 
Report
   > Merging 
[#4963](https://codecov.io/gh/apache/incubator-pinot/pull/4963?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-pinot/commit/247af5d600ed806e3a1088ee8500e3596483e62a?src=pr=desc)
 will **increase** coverage by `21.25%`.
   > The diff coverage is `0%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-pinot/pull/4963/graphs/tree.svg?width=650=4ibza2ugkz=150=pr)](https://codecov.io/gh/apache/incubator-pinot/pull/4963?src=pr=tree)
   
   ```diff
   @@  Coverage Diff  @@
   ## master#4963   +/-   ##
   =
   + Coverage 36.31%   57.56%   +21.25% 
   - Complexity0   12   +12 
   =
 Files  1159 1159   
 Lines 6183761837   
 Branches   9118 9118   
   =
   + Hits  2245435595+13141 
   + Misses3737523585-13790 
   - Partials   2008 2657  +649
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-pinot/pull/4963?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[.../ingestion/batch/jobs/SegmentPreprocessingJob.java](https://codecov.io/gh/apache/incubator-pinot/pull/4963/diff?src=pr=tree#diff-cGlub3QtcGx1Z2lucy9waW5vdC1iYXRjaC1pbmdlc3Rpb24vdjBfZGVwcmVjYXRlZC9waW5vdC1pbmdlc3Rpb24tY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9wbHVnaW4vaW5nZXN0aW9uL2JhdGNoL2pvYnMvU2VnbWVudFByZXByb2Nlc3NpbmdKb2IuamF2YQ==)
 | `66.66% <ø> (ø)` | `0 <0> (?)` | |
   | 
[...plugin/ingestion/batch/jobs/SegmentTarPushJob.java](https://codecov.io/gh/apache/incubator-pinot/pull/4963/diff?src=pr=tree#diff-cGlub3QtcGx1Z2lucy9waW5vdC1iYXRjaC1pbmdlc3Rpb24vdjBfZGVwcmVjYXRlZC9waW5vdC1pbmdlc3Rpb24tY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9wbHVnaW4vaW5nZXN0aW9uL2JhdGNoL2pvYnMvU2VnbWVudFRhclB1c2hKb2IuamF2YQ==)
 | `89.47% <ø> (ø)` | `0 <0> (?)` | |
   | 
[...n/inputformat/thrift/ThriftRecordReaderConfig.java](https://codecov.io/gh/apache/incubator-pinot/pull/4963/diff?src=pr=tree#diff-cGlub3QtcGx1Z2lucy9waW5vdC1pbnB1dC1mb3JtYXQvcGlub3QtdGhyaWZ0L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9wbHVnaW4vaW5wdXRmb3JtYXQvdGhyaWZ0L1RocmlmdFJlY29yZFJlYWRlckNvbmZpZy5qYXZh)
 | `80% <ø> (ø)` | `0 <0> (?)` | |
   | 
[...t/plugin/stream/AvroRecordToPinotRowGenerator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4963/diff?src=pr=tree#diff-cGlub3QtcGx1Z2lucy9waW5vdC1zdHJlYW0taW5nZXN0aW9uL3Bpbm90LWthZmthLWJhc2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3BsdWdpbi9zdHJlYW0vQXZyb1JlY29yZFRvUGlub3RSb3dHZW5lcmF0b3IuamF2YQ==)
 | `91.66% <ø> (+91.66%)` | `0 <0> (ø)` | :arrow_down: |
   | 
[.../pinot/plugin/inputformat/avro/AvroSchemaUtil.java](https://codecov.io/gh/apache/incubator-pinot/pull/4963/diff?src=pr=tree#diff-cGlub3QtcGx1Z2lucy9waW5vdC1pbnB1dC1mb3JtYXQvcGlub3QtYXZyby9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcGx1Z2luL2lucHV0Zm9ybWF0L2F2cm8vQXZyb1NjaGVtYVV0aWwuamF2YQ==)
 | `22.22% <ø> (ø)` | `0 <0> (?)` | |
   | 
[...inot/plugin/inputformat/avro/AvroRecordReader.java](https://codecov.io/gh/apache/incubator-pinot/pull/4963/diff?src=pr=tree#diff-cGlub3QtcGx1Z2lucy9waW5vdC1pbnB1dC1mb3JtYXQvcGlub3QtYXZyby9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcGx1Z2luL2lucHV0Zm9ybWF0L2F2cm8vQXZyb1JlY29yZFJlYWRlci5qYXZh)
 | `89.65% <ø> (ø)` | `0 <0> (?)` | |
   | 
[...plugin/ingestion/batch/jobs/SegmentUriPushJob.java](https://codecov.io/gh/apache/incubator-pinot/pull/4963/diff?src=pr=tree#diff-cGlub3QtcGx1Z2lucy9waW5vdC1iYXRjaC1pbmdlc3Rpb24vdjBfZGVwcmVjYXRlZC9waW5vdC1pbmdlc3Rpb24tY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9wbHVnaW4vaW5nZXN0aW9uL2JhdGNoL2pvYnMvU2VnbWVudFVyaVB1c2hKb2IuamF2YQ==)
 | `0% <ø> (ø)` | `0 <0> (?)` | |
   | 
[...gin/ingestion/batch/common/JobConfigConstants.java](https://codecov.io/gh/apache/incubator-pinot/pull/4963/diff?src=pr=tree#diff-cGlub3QtcGx1Z2lucy9waW5vdC1iYXRjaC1pbmdlc3Rpb24vdjBfZGVwcmVjYXRlZC9waW5vdC1pbmdlc3Rpb24tY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9wbHVnaW4vaW5nZXN0aW9uL2JhdGNoL2NvbW1vbi9Kb2JDb25maWdDb25zdGFudHMuamF2YQ==)
 | `0% <ø> (ø)` | `0 <0> (?)` | |
   | 
[...inot/plugin/inputformat/json/JSONRecordReader.java](https://codecov.io/gh/apache/incubator-pinot/pull/4963/diff?src=pr=tree#diff-cGlub3QtcGx1Z2lucy9waW5vdC1pbnB1dC1mb3JtYXQvcGlub3QtanNvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvcGx1Z2luL2lucHV0Zm9ybWF0L2pzb24vSlNPTlJlY29yZFJlYWRlci5qYXZh)
 | `90% <ø> (ø)` | `12 <0> (?)` | |
   | 

[GitHub] [incubator-pinot] haibow commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
haibow commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363563130
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/index/SegmentMetadataImpl.java
 ##
 @@ -43,19 +43,19 @@
 import org.apache.commons.configuration.ConfigurationException;
 import org.apache.commons.configuration.PropertiesConfiguration;
 import org.apache.commons.lang.StringEscapeUtils;
-import org.apache.pinot.spi.data.MetricFieldSpec;
-import org.apache.pinot.spi.data.Schema;
 import org.apache.pinot.common.metadata.segment.RealtimeSegmentZKMetadata;
 import org.apache.pinot.common.segment.SegmentMetadata;
 import org.apache.pinot.common.segment.StarTreeMetadata;
-import org.apache.pinot.spi.utils.JsonUtils;
-import org.apache.pinot.spi.utils.TimeUtils;
 import org.apache.pinot.core.indexsegment.generator.SegmentVersion;
 import org.apache.pinot.core.segment.creator.impl.V1Constants;
 import org.apache.pinot.core.segment.creator.impl.V1Constants.MetadataKeys;
 import org.apache.pinot.core.segment.store.SegmentDirectoryPaths;
 import org.apache.pinot.core.startree.v2.StarTreeV2Constants;
 import org.apache.pinot.core.startree.v2.StarTreeV2Metadata;
+import org.apache.pinot.spi.data.MetricFieldSpec;
 
 Review comment:
   The file is auto formatted - had the wrong import order (after spi 
refactor). I guess I happened to format it when reading it


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] kishoreg commented on issue #4957: Rename stream plugin pkgs

2020-01-06 Thread GitBox
kishoreg commented on issue #4957: Rename stream plugin pkgs
URL: https://github.com/apache/incubator-pinot/pull/4957#issuecomment-571408989
 
 
   Along with mentioning it in the release notes, can we add a hashmap that 
maps old class to the new class in PluginManager to make this seamless? 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] haibow commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
haibow commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363563130
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/segment/index/SegmentMetadataImpl.java
 ##
 @@ -43,19 +43,19 @@
 import org.apache.commons.configuration.ConfigurationException;
 import org.apache.commons.configuration.PropertiesConfiguration;
 import org.apache.commons.lang.StringEscapeUtils;
-import org.apache.pinot.spi.data.MetricFieldSpec;
-import org.apache.pinot.spi.data.Schema;
 import org.apache.pinot.common.metadata.segment.RealtimeSegmentZKMetadata;
 import org.apache.pinot.common.segment.SegmentMetadata;
 import org.apache.pinot.common.segment.StarTreeMetadata;
-import org.apache.pinot.spi.utils.JsonUtils;
-import org.apache.pinot.spi.utils.TimeUtils;
 import org.apache.pinot.core.indexsegment.generator.SegmentVersion;
 import org.apache.pinot.core.segment.creator.impl.V1Constants;
 import org.apache.pinot.core.segment.creator.impl.V1Constants.MetadataKeys;
 import org.apache.pinot.core.segment.store.SegmentDirectoryPaths;
 import org.apache.pinot.core.startree.v2.StarTreeV2Constants;
 import org.apache.pinot.core.startree.v2.StarTreeV2Metadata;
+import org.apache.pinot.spi.data.MetricFieldSpec;
 
 Review comment:
   The file is auto formatted - had the wrong import order (after spi refactor)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] haibow commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
haibow commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363562951
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/pruner/SegmentPrunerService.java
 ##
 @@ -52,7 +52,7 @@ public SegmentPrunerService(SegmentPrunerConfig config) {
   public boolean prune(IndexSegment segment, ServerQueryRequest queryRequest) {
 for (SegmentPruner segmentPruner : _segmentPruners) {
   if (segmentPruner.prune(segment, queryRequest)) {
-LOGGER.debug("Pruned segment: {}", segment.getSegmentName());
+LOGGER.debug("{} pruned segment: {}", 
segmentPruner.getClass().getName(), segment.getSegmentName());
 
 Review comment:
   I'm trying to log the class name for each SegmentPruner class being called, 
not the main SegmentPrunerService class


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] haibow commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
haibow commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363562153
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
 ##
 @@ -80,9 +84,9 @@
   private final long _startTimeMillis = System.currentTimeMillis();
 
   private final String _segmentName;
-  private final Schema _schema;
+  private final Schema _originalSchema;
   private final int _capacity;
-  private final SegmentMetadata _segmentMetadata;
+  private final SegmentMetadata _originalSegmentMetadata;
 
 Review comment:
   same above, segmentMetadata is immutable, just to be super clear it does not 
reflect the newly added columns.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] haibow commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
haibow commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363562045
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
 ##
 @@ -80,9 +84,9 @@
   private final long _startTimeMillis = System.currentTimeMillis();
 
   private final String _segmentName;
-  private final Schema _schema;
+  private final Schema _originalSchema;
 
 Review comment:
   Would like to avoid any potential confusion. If a schema update happens on 
the consuming segment, segmentMetadata and schema are still the original ones, 
without including the newly added columns


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] haibow commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
haibow commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363561268
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/IndexSegment.java
 ##
 @@ -57,6 +58,13 @@
*/
   Set getPhysicalColumnNames();
 
+  /**
+   * Returns all columns for the "select *" query
+   *
+   * @return Set of column names
+   */
+  Set getColumnNamesForSelectStar();
 
 Review comment:
   For the consuming segment, `select *` should return physical columns + newly 
added columns. Since the newly added columns are provided by the virtual column 
provider, I feel it a bit weird to include them in getPhysicalColumnNames 
method. 
   
   Basically the columns are
   1. physical column
   2. virtual column
2.1 built-in virtual columns (docId, segmentName, hostName)
2.2 newly added columns in the consuming segment
   
   `getColumnNames` should return 1+2.1+2.2
   `getPhysicalColumnNames` should return 1
   `getColumnNamesForSelectStar` should return 1+2.2


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
mcvsubbu commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363555321
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/io/reader/impl/SameMultiValueInvertedIndex.java
 ##
 @@ -0,0 +1,65 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.core.io.reader.impl;
+
+import java.io.IOException;
+import org.apache.pinot.common.utils.Pairs;
+import org.apache.pinot.core.io.reader.BaseSingleColumnMultiValueReader;
+import org.apache.pinot.core.io.reader.impl.v1.FixedBitMultiValueReader;
+
+
+/**
+ * Reader for the multi-value column with the same value
+ */
+public class SameMultiValueInvertedIndex extends 
BaseSingleColumnMultiValueReader implements 
SortedIndexMultiValueReader {
 
 Review comment:
   Let us think of a better name for this


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
mcvsubbu commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363552150
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/IndexSegment.java
 ##
 @@ -57,6 +58,13 @@
*/
   Set getPhysicalColumnNames();
 
+  /**
+   * Returns all columns for the "select *" query
+   *
+   * @return Set of column names
+   */
+  Set getColumnNamesForSelectStar();
 
 Review comment:
   We have two APIs, one named getColumnNames() and the other named 
getPhysicalColumnNames().  select * should decide which one to use? Do we need 
a third one? Why can't getPhysicalColumnNames API return the right set of 
columns from the consuming segment?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
mcvsubbu commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363553655
 
 

 ##
 File path: 
pinot-server/src/main/java/org/apache/pinot/server/starter/helix/HelixInstanceDataManager.java
 ##
 @@ -194,7 +198,17 @@ private void reloadSegment(String tableNameWithType, 
SegmentMetadata segmentMeta
 
 File indexDir = segmentMetadata.getIndexDir();
 if (indexDir == null) {
-  LOGGER.info("Skip reloading REALTIME consuming segment: {} in table: 
{}", segmentName, tableNameWithType);
+  if (!_instanceDataManagerConfig.shouldReloadConsumingSegment()) {
+LOGGER.info("Skip reloading REALTIME consuming segment: {} in table: 
{}", segmentName, tableNameWithType);
+return;
+  }
+  LOGGER.info("Try reloading REALTIME consuming segment: {} in table: {}", 
segmentName, tableNameWithType);
+  SegmentMetadataImpl segmentMetadataImpl = (SegmentMetadataImpl) 
segmentMetadata;
 
 Review comment:
   Just call the `reload` API on the segment, and let the segment decide if it 
is to be reloaded (i.e. depending on the config), and also how it should do it.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
mcvsubbu commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363553172
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
 ##
 @@ -80,9 +84,9 @@
   private final long _startTimeMillis = System.currentTimeMillis();
 
   private final String _segmentName;
-  private final Schema _schema;
+  private final Schema _originalSchema;
   private final int _capacity;
-  private final SegmentMetadata _segmentMetadata;
+  private final SegmentMetadata _originalSegmentMetadata;
 
 Review comment:
   Same here. Why rename?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
mcvsubbu commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363553071
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
 ##
 @@ -80,9 +84,9 @@
   private final long _startTimeMillis = System.currentTimeMillis();
 
   private final String _segmentName;
-  private final Schema _schema;
+  private final Schema _originalSchema;
 
 Review comment:
   Given that we dont have a member called `_newSchema`, why rename this? Can 
we keep it as `_schema`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
mcvsubbu commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363556801
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/query/pruner/SegmentPrunerService.java
 ##
 @@ -52,7 +52,7 @@ public SegmentPrunerService(SegmentPrunerConfig config) {
   public boolean prune(IndexSegment segment, ServerQueryRequest queryRequest) {
 for (SegmentPruner segmentPruner : _segmentPruners) {
   if (segmentPruner.prune(segment, queryRequest)) {
-LOGGER.debug("Pruned segment: {}", segment.getSegmentName());
+LOGGER.debug("{} pruned segment: {}", 
segmentPruner.getClass().getName(), segment.getSegmentName());
 
 Review comment:
   LOGGER  can be configured to log the class name, line number, etc. so we 
should not be logging class name


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
mcvsubbu commented on a change in pull request #4954: Support schema evolution 
for consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#discussion_r363554485
 
 

 ##
 File path: 
pinot-core/src/main/java/org/apache/pinot/core/indexsegment/mutable/MutableSegmentImpl.java
 ##
 @@ -68,6 +71,7 @@
 
 
 public class MutableSegmentImpl implements MutableSegment {
+  private static final Logger LOGGER = 
LoggerFactory.getLogger(MutableSegmentImpl.class);
 
 Review comment:
   remove this. We already have `_logger`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mayankshriv opened a new pull request #4963: Package rename pinot-plugin modules (part of #4941).

2020-01-06 Thread GitBox
mayankshriv opened a new pull request #4963: Package rename pinot-plugin 
modules (part of #4941).
URL: https://github.com/apache/incubator-pinot/pull/4963
 
 
   Renamed packages for various pinot-plugins as follows.
   
   1. pinot-input-format: The input format plugins will be under the
   package org.apache.pinot.plugin.inputformat. (where
is avro, csv, json).
   
   2. pinot-file-system: The file system plugins will be under the package
   org.apache.pinot.plugin.filesystem. (where  is adls, hdfs, gcs,
   etc).
   
   3. pinot-batch-ingestion: The batch ingestion plugins will be under the
   package org.apache.pinot.plugin.ingestion.batch.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on issue #4952: Adding new Controller APIs for retrieving and setting tag for an instance

2020-01-06 Thread GitBox
mcvsubbu commented on issue #4952: Adding new Controller APIs for retrieving 
and setting tag for an instance
URL: https://github.com/apache/incubator-pinot/pull/4952#issuecomment-571380809
 
 
   @snleee  right now we have a use case where we need to tag hosts in helix 
that do not conform to any "tenant" scheme that you have mentioned (the tag 
override scenario I mentioned). In that case, we will still need to invoke the 
helix API, right? Or, am I missing something?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee edited a comment on issue #4952: Adding new Controller APIs for retrieving and setting tag for an instance

2020-01-06 Thread GitBox
snleee edited a comment on issue #4952: Adding new Controller APIs for 
retrieving and setting tag for an instance
URL: https://github.com/apache/incubator-pinot/pull/4952#issuecomment-571359606
 
 
   @icefury71 Current implementation adds a generic API that can add arbitrary 
`tags` on `instance` (there's no check on `newTag`). I think that this is why 
@mcvsubbu mentioned that we are probably adding a very similar API that Helix 
already provides. We currently use Helix's tag mostly for tenant assignment but 
it's possible that we will use these tags for other features (e.g. multiple 
tags for multiple features for an instance).  In that case, this API can start 
to become complex since it will have to include all the validations for 
different tag usages.
   
   Instead, I think that it's better to provide very specific API on `/tenants` 
and limit its functionality to add/remove instances to/from a tenant and not to 
touch other tags. 
   
   I can think of something like the following:
   ```
   List the instances that belong to a tenant (this already exists)
   GET /tenant/{tenantName}
   
   Add an instance to a tenant
   POST /tenant/{tenantName}/{instanceName}?type=offline/realtime/broker
   
   Delete an instance from a tenant
   DELETE /tenant/{tenantName}/{instanceName}
   ```
   
   @mcvsubbu How do you think on this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee edited a comment on issue #4952: Adding new Controller APIs for retrieving and setting tag for an instance

2020-01-06 Thread GitBox
snleee edited a comment on issue #4952: Adding new Controller APIs for 
retrieving and setting tag for an instance
URL: https://github.com/apache/incubator-pinot/pull/4952#issuecomment-571359606
 
 
   @icefury71 Current implementation adds a generic API that can add arbitrary 
`tags` on `instance` (there's no check on `newTag`). I think that this is why 
@mcvsubbu mentioned that we are probably adding a very similar API that Helix 
already provides. We currently use Helix's tag mostly for tenant assignment but 
it's possible that we will use these tags for other features (e.g. multiple 
tags for multiple features for an instance).  In that case, this API can start 
to become complex since it will have to include all the validations for 
different tag usages.
   
   Instead, I think that it's better to provide very specific API on `/tenants` 
and limit its functionality to add/remove instances to/from a tenant and not to 
touch other tags.
   
   I can think of something like the following:
   ```
   List the instances that belong to a tenant (this already exists)
   GET /tenant/{tenantName}
   
   Add an instance to a tenant
   POST /tenant/{tenantName}/{instanceName}?type=offline/realtime/broker
   
   Delete an instance from a tenant
   DELETE /tenant/{tenantName}/{instanceName}
   ```
   
   @mcvsubbu How do you think on this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on issue #4957: Rename stream plugin pkgs

2020-01-06 Thread GitBox
mcvsubbu commented on issue #4957: Rename stream plugin pkgs
URL: https://github.com/apache/incubator-pinot/pull/4957#issuecomment-571369256
 
 
   > > > Will this package name change causing existing realtime tables fail to 
start consuming?
   > > > I've also updated the quickstart and examples in #4956
   > > 
   > > 
   > > Not sure why you ask, but yes, one may need to change config files to 
point to the right class, or load the right class names. I will look through 
other files to see if there are places where we hard-code classpath names
   > 
   > What I mean is that if users upgrade to current master, then all their 
existing tables will stop consuming after the upgrade as we specify the stream 
factory class in table config, which is changed after this.
   
   That is true, and must be noted in the release notes.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee edited a comment on issue #4952: Adding new Controller APIs for retrieving and setting tag for an instance

2020-01-06 Thread GitBox
snleee edited a comment on issue #4952: Adding new Controller APIs for 
retrieving and setting tag for an instance
URL: https://github.com/apache/incubator-pinot/pull/4952#issuecomment-571359606
 
 
   @icefury71 Current implementation adds a generic API that can add arbitrary 
`tags` on `instance` (there's no check on `newTag`). I think that this is why 
@mcvsubbu mentioned that we are probably adding a very similar API that Helix 
already provides. We currently use Helix's tag mostly for tenant assignment but 
it's possible that we will use these tags for other features (e.g. multiple 
tags for multiple features for an instance).  In that case, this API can start 
to become complex since it will have to include all the validations for 
different tag usages.
   
   Instead, I think that it's better to provide very specific API on `/tenants` 
and limit its functionality to add/remove instances to/from a tenant and not to 
touch other tags.
   
   I can think of something like the following:
   ```
   List the instances that belong to a tenant (this already exists)
   GET /tenant/{tenantName}
   
   Add an instance to a tenant
   POST /tenant/{tenantName}/{instanceName}?type=offline/realtime/broker
   
   Delete an instance from a tenant
   DELETE /tenant/{tenantName}/{instanceName}
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] codecov-io edited a comment on issue #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
codecov-io edited a comment on issue #4954: Support schema evolution for 
consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#issuecomment-570396099
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=h1) 
Report
   > Merging 
[#4954](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-pinot/commit/247af5d600ed806e3a1088ee8500e3596483e62a?src=pr=desc)
 will **increase** coverage by `21.25%`.
   > The diff coverage is `72.77%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-pinot/pull/4954/graphs/tree.svg?width=650=4ibza2ugkz=150=pr)](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=tree)
   
   ```diff
   @@  Coverage Diff  @@
   ## master#4954   +/-   ##
   =
   + Coverage 36.31%   57.56%   +21.25% 
   - Complexity0   12   +12 
   =
 Files  1159 1166+7 
 Lines 6183761934   +97 
 Branches   9118 9131   +13 
   =
   + Hits  2245435655+13201 
   + Misses3737523614-13761 
   - Partials   2008 2665  +657
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...e/io/reader/impl/IntSingleValueDataFileReader.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9pby9yZWFkZXIvaW1wbC9JbnRTaW5nbGVWYWx1ZURhdGFGaWxlUmVhZGVyLmphdmE=)
 | `0% <ø> (ø)` | `0 <0> (?)` | |
   | 
[.../pinot/core/segment/index/SegmentMetadataImpl.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L1NlZ21lbnRNZXRhZGF0YUltcGwuamF2YQ==)
 | `80.97% <ø> (+18.62%)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...inot/tools/admin/command/CreateSegmentCommand.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtdG9vbHMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3Rvb2xzL2FkbWluL2NvbW1hbmQvQ3JlYXRlU2VnbWVudENvbW1hbmQuamF2YQ==)
 | `0% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...org/apache/pinot/common/utils/CommonConstants.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvQ29tbW9uQ29uc3RhbnRzLmphdmE=)
 | `34% <ø> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[.../pinot/tools/scan/query/SegmentQueryProcessor.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtdG9vbHMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3Rvb2xzL3NjYW4vcXVlcnkvU2VnbWVudFF1ZXJ5UHJvY2Vzc29yLmphdmE=)
 | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...filter/SortedInvertedIndexBasedFilterOperator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvU29ydGVkSW52ZXJ0ZWRJbmRleEJhc2VkRmlsdGVyT3BlcmF0b3IuamF2YQ==)
 | `78% <100%> (ø)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...ment/virtualcolumn/DocIdVirtualColumnProvider.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3ZpcnR1YWxjb2x1bW4vRG9jSWRWaXJ0dWFsQ29sdW1uUHJvdmlkZXIuamF2YQ==)
 | `65.21% <100%> (+24.19%)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...indexsegment/immutable/ImmutableSegmentLoader.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9pbmRleHNlZ21lbnQvaW1tdXRhYmxlL0ltbXV0YWJsZVNlZ21lbnRMb2FkZXIuamF2YQ==)
 | `91.83% <100%> (+1.83%)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...re/segment/index/data/source/ColumnDataSource.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L2RhdGEvc291cmNlL0NvbHVtbkRhdGFTb3VyY2UuamF2YQ==)
 | `95.83% <100%> (+4.52%)` | `0 <0> (ø)` | :arrow_down: |
   | 
[...starter/helix/DefaultHelixStarterServerConfig.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3Qtc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zZXJ2ZXIvc3RhcnRlci9oZWxpeC9EZWZhdWx0SGVsaXhTdGFydGVyU2VydmVyQ29uZmlnLmphdmE=)
 | `96.42% <100%> (+0.27%)` | `0 <0> (ø)` | :arrow_down: |
   | ... and [631 
more](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 

[GitHub] [incubator-pinot] snleee commented on issue #4952: Adding new Controller APIs for retrieving and setting tag for an instance

2020-01-06 Thread GitBox
snleee commented on issue #4952: Adding new Controller APIs for retrieving and 
setting tag for an instance
URL: https://github.com/apache/incubator-pinot/pull/4952#issuecomment-571359606
 
 
   @icefury71 Current implementation adds a generic API that can add arbitrary 
`tags` on `instance` (there's no check on `newTag`). I think that this is why 
@mcvsubbu mentioned that we are probably adding the same API that Helix already 
provides. We currently use Helix's tag mostly for tenant assignment but it's 
possible that we will use these tags for other features (e.g. multiple tags for 
multiple features for an instance).  In that case, this API can start to become 
complex since it will have to include all the validations for different tag 
usages.
   
   Instead, I think that it's better to provide very specific API on `/tenants` 
and limit its functionality to add/remove instances to/from a tenant and not to 
touch other tags.
   
   I can think of something like the following:
   ```
   List the instances that belong to a tenant (this already exists)
   GET /tenant/{tenantName}
   
   Add an instance to a tenant
   POST /tenant/{tenantName}/{instanceName}?type=offline/realtime/broker
   
   Delete an instance from a tenant
   DELETE /tenant/{tenantName}/{instanceName}
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] codecov-io edited a comment on issue #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
codecov-io edited a comment on issue #4954: Support schema evolution for 
consuming segments
URL: https://github.com/apache/incubator-pinot/pull/4954#issuecomment-570396099
 
 
   # 
[Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=h1) 
Report
   > Merging 
[#4954](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/incubator-pinot/commit/247af5d600ed806e3a1088ee8500e3596483e62a?src=pr=desc)
 will **increase** coverage by `0.35%`.
   > The diff coverage is `65.34%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/incubator-pinot/pull/4954/graphs/tree.svg?width=650=4ibza2ugkz=150=pr)](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=tree)
   
   ```diff
   @@Coverage Diff @@
   ##   master#4954  +/-   ##
   ==
   + Coverage   36.31%   36.66%   +0.35% 
   ==
 Files1159 1166   +7 
 Lines   6183761934  +97 
 Branches 9118 9131  +13 
   ==
   + Hits2245422706 +252 
   + Misses  3737537154 -221 
   - Partials 2008 2074  +66
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=tree) | 
Coverage Δ | |
   |---|---|---|
   | 
[...e/io/reader/impl/IntSingleValueDataFileReader.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9pby9yZWFkZXIvaW1wbC9JbnRTaW5nbGVWYWx1ZURhdGFGaWxlUmVhZGVyLmphdmE=)
 | `0% <ø> (ø)` | |
   | 
[.../pinot/core/segment/index/SegmentMetadataImpl.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L1NlZ21lbnRNZXRhZGF0YUltcGwuamF2YQ==)
 | `62.34% <ø> (ø)` | :arrow_up: |
   | 
[...inot/tools/admin/command/CreateSegmentCommand.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtdG9vbHMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3Rvb2xzL2FkbWluL2NvbW1hbmQvQ3JlYXRlU2VnbWVudENvbW1hbmQuamF2YQ==)
 | `0% <ø> (ø)` | :arrow_up: |
   | 
[...org/apache/pinot/common/utils/CommonConstants.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9jb21tb24vdXRpbHMvQ29tbW9uQ29uc3RhbnRzLmphdmE=)
 | `34% <ø> (ø)` | :arrow_up: |
   | 
[.../pinot/tools/scan/query/SegmentQueryProcessor.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtdG9vbHMvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL3Bpbm90L3Rvb2xzL3NjYW4vcXVlcnkvU2VnbWVudFF1ZXJ5UHJvY2Vzc29yLmphdmE=)
 | `0% <0%> (ø)` | :arrow_up: |
   | 
[...filter/SortedInvertedIndexBasedFilterOperator.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9vcGVyYXRvci9maWx0ZXIvU29ydGVkSW52ZXJ0ZWRJbmRleEJhc2VkRmlsdGVyT3BlcmF0b3IuamF2YQ==)
 | `78% <100%> (ø)` | :arrow_up: |
   | 
[...ment/virtualcolumn/DocIdVirtualColumnProvider.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L3ZpcnR1YWxjb2x1bW4vRG9jSWRWaXJ0dWFsQ29sdW1uUHJvdmlkZXIuamF2YQ==)
 | `52.17% <100%> (+11.14%)` | :arrow_up: |
   | 
[...indexsegment/immutable/ImmutableSegmentLoader.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9pbmRleHNlZ21lbnQvaW1tdXRhYmxlL0ltbXV0YWJsZVNlZ21lbnRMb2FkZXIuamF2YQ==)
 | `89.79% <100%> (-0.21%)` | :arrow_down: |
   | 
[...re/segment/index/data/source/ColumnDataSource.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3QtY29yZS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvcGlub3QvY29yZS9zZWdtZW50L2luZGV4L2RhdGEvc291cmNlL0NvbHVtbkRhdGFTb3VyY2UuamF2YQ==)
 | `91.66% <100%> (+0.36%)` | :arrow_up: |
   | 
[...starter/helix/DefaultHelixStarterServerConfig.java](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree#diff-cGlub3Qtc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9waW5vdC9zZXJ2ZXIvc3RhcnRlci9oZWxpeC9EZWZhdWx0SGVsaXhTdGFydGVyU2VydmVyQ29uZmlnLmphdmE=)
 | `96.42% <100%> (+0.27%)` | :arrow_up: |
   | ... and [87 
more](https://codecov.io/gh/apache/incubator-pinot/pull/4954/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute  (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/4954?src=pr=footer).
 

[GitHub] [incubator-pinot] npawar commented on issue #4962: Create a new query endpoint for sql

2020-01-06 Thread GitBox
npawar commented on issue #4962: Create a new query endpoint for sql
URL: 
https://github.com/apache/incubator-pinot/issues/4962#issuecomment-571349043
 
 
   > +1 to adding a new endpoint.
   > Should pql endpoint still support responseFormat and groupByMode of sql? 
Why?
   
   responseFormat, maybe not. But groupByMode, I think it should. If we don't 
support that, no one can use `order by` unless they are on sql. It might take 
users long before they get to sql endpoint, or until it's even ready.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] haibow opened a new pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
haibow opened a new pull request #4954: Support schema evolution for consuming 
segments
URL: https://github.com/apache/incubator-pinot/pull/4954
 
 
   Implement the design in 
https://github.com/apache/incubator-pinot/issues/4225#issuecomment-548173120
   
   Support querying newly added columns in the consuming segment (OFF by 
default to keep default behavior the same). Currently after updating the schema 
for a realtime table and calling reload on the table, consuming segments would 
not be reloaded with the new schema, which would cause consuming segments to be 
dropped at query time due to schema inconsistency. 
   
   In this diff, we use the virtual column provider to provide default null 
values for the newly added columns in the consuming segment (until it is 
committed), to address the schema inconsistency issue. There is still a delay 
in indexing and querying the actual values in the newly added columns (up to a 
few hours, depending on the segment commit thresholds). We will revisit this 
later.
   
   This functionality is off by default for now to keep the default behavior 
the same.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] haibow closed pull request #4954: Support schema evolution for consuming segments

2020-01-06 Thread GitBox
haibow closed pull request #4954: Support schema evolution for consuming 
segments
URL: https://github.com/apache/incubator-pinot/pull/4954
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] fx19880617 commented on issue #4957: Rename stream plugin pkgs

2020-01-06 Thread GitBox
fx19880617 commented on issue #4957: Rename stream plugin pkgs
URL: https://github.com/apache/incubator-pinot/pull/4957#issuecomment-571344239
 
 
   > > Will this package name change causing existing realtime tables fail to 
start consuming?
   > > I've also updated the quickstart and examples in #4956
   > 
   > Not sure why you ask, but yes, one may need to change config files to 
point to the right class, or load the right class names. I will look through 
other files to see if there are places where we hard-code classpath names
   
   What I mean is that if users upgrade to current master, then all their 
existing tables will stop consuming after the upgrade as we specify the stream 
factory class in table config, which is changed after this.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee removed a comment on issue #4952: Adding new Controller APIs for retrieving and setting tag for an instance

2020-01-06 Thread GitBox
snleee removed a comment on issue #4952: Adding new Controller APIs for 
retrieving and setting tag for an instance
URL: https://github.com/apache/incubator-pinot/pull/4952#issuecomment-571338493
 
 
   @mcvsubbu I agree with @icefury71. As you mentioned, we are depending on 
helix's tagging feature for our tenant implementation; however, adding/removing 
tenant is Pinot specific operation. Our user should be able to control Pinot 
fully using admin APIs provided by us.
   
   +1 on the fact that this API will make sure that tenant tags are in the 
correct form while using Helix API directly won't do any check.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee commented on issue #4952: Adding new Controller APIs for retrieving and setting tag for an instance

2020-01-06 Thread GitBox
snleee commented on issue #4952: Adding new Controller APIs for retrieving and 
setting tag for an instance
URL: https://github.com/apache/incubator-pinot/pull/4952#issuecomment-571338493
 
 
   @mcvsubbu I agree with @icefury71. We are depending on helix's tagging 
feature for our tenant implementation and adding/removing tenant is Pinot 
specific operation. Our user should be able to control Pinot fully using admin 
APIs provided by us.
   
   +1 on the fact that this API will make sure that tenant tags are in the 
correct form while using Helix API directly won't do any check.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] snleee edited a comment on issue #4952: Adding new Controller APIs for retrieving and setting tag for an instance

2020-01-06 Thread GitBox
snleee edited a comment on issue #4952: Adding new Controller APIs for 
retrieving and setting tag for an instance
URL: https://github.com/apache/incubator-pinot/pull/4952#issuecomment-571338493
 
 
   @mcvsubbu I agree with @icefury71. As you mentioned, we are depending on 
helix's tagging feature for our tenant implementation; however, adding/removing 
tenant is Pinot specific operation. Our user should be able to control Pinot 
fully using admin APIs provided by us.
   
   +1 on the fact that this API will make sure that tenant tags are in the 
correct form while using Helix API directly won't do any check.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] icefury71 commented on issue #4952: Adding new Controller APIs for retrieving and setting tag for an instance

2020-01-06 Thread GitBox
icefury71 commented on issue #4952: Adding new Controller APIs for retrieving 
and setting tag for an instance
URL: https://github.com/apache/incubator-pinot/pull/4952#issuecomment-571335765
 
 
   > Helix provides commands to add/remove/list tags, so the only value added 
in making this a part of pinot is to make sure that the tags correspond to the 
conventions used/allowed in pinot. Even this is not completely predictable, 
because we can use pretty much any tag name in 
[tagOverrideConfig](https://github.com/apache/incubator-pinot/blob/master/pinot-common/src/main/java/org/apache/pinot/common/config/TagOverrideConfig.java).
 I suggest that we refer the admins to the helix commands to do this operation 
instead of adding code in pinot. We can add documentation on specific helix 
commands in pinot docs.
   
   Thanks for your comments @mcvsubbu . I believe this API is still useful for 
the following reasons:
   1) Adding a "tenant" as a tag is a very Pinot specific operation. We may not 
be able to do pre-requisite checks if we do this through helix (eg: retagging 
from one tenant to another).
   2) Having one admin API endpoint is very convenient and allows for the 
caller (eg: automated cluster/setup tenant, node replacement etc).
   
   Lemme know if you have any more questions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[incubator-pinot] branch master updated (9edad92 -> 247af5d)

2020-01-06 Thread xiangfu
This is an automated email from the ASF dual-hosted git repository.

xiangfu pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git.


from 9edad92  Make plugin manager to load plugins based on environment 
variables (#4956)
 add 247af5d  Add schemaFile as option in AddTableCommand (#4959)

No new revisions were added by this update.

Summary of changes:
 docker/images/pinot/README.md  |  9 +--
 docs/getting_started.rst   | 20 ++
 docs/pinot_hadoop.rst  |  2 +-
 kubernetes/helm/pinot-realtime-quickstart.yml  | 33 ++---
 .../skaffold/gke/pinot-realtime-quickstart.yml |  9 +--
 .../tests/ChaosMonkeyIntegrationTest.java  |  4 +-
 .../v0_deprecated/pinot-ingestion-common/pom.xml   |  1 -
 .../pinot-kafka-base/pom.xml   |  5 --
 .../org/apache/pinot/tools/HybridQuickstart.java   |  2 -
 .../java/org/apache/pinot/tools/Quickstart.java|  2 -
 .../org/apache/pinot/tools/RealtimeQuickStart.java |  2 -
 .../pinot/tools/admin/PinotAdministrator.java  |  6 +-
 .../pinot/tools/admin/command/AddTableCommand.java | 83 +++---
 .../tools/admin/command/GenerateDataCommand.java   |  2 +-
 .../tools/admin/command/QuickstartRunner.java  | 11 +--
 .../tools/query/comparison/ClusterStarter.java |  3 +-
 16 files changed, 85 insertions(+), 109 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] fx19880617 merged pull request #4959: Add schemaFile as option in AddTableCommand

2020-01-06 Thread GitBox
fx19880617 merged pull request #4959: Add schemaFile as option in 
AddTableCommand
URL: https://github.com/apache/incubator-pinot/pull/4959
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] fx19880617 commented on a change in pull request #4959: Add schemaFile as option in AddTableCommand

2020-01-06 Thread GitBox
fx19880617 commented on a change in pull request #4959: Add schemaFile as 
option in AddTableCommand
URL: https://github.com/apache/incubator-pinot/pull/4959#discussion_r363488701
 
 

 ##
 File path: 
pinot-tools/src/main/java/org/apache/pinot/tools/admin/command/AddTableCommand.java
 ##
 @@ -102,33 +111,55 @@ public AddTableCommand setExecute(boolean exec) {
 return this;
   }
 
-  public boolean execute(JsonNode node)
-  throws IOException {
-if (_controllerHost == null) {
-  _controllerHost = NetUtil.getHostAddress();
+  public void uploadSchema()
+  throws Exception {
+File schemaFile;
+Schema schema;
+try {
+  schemaFile = new File(_schemaFile);
+  schema = Schema.fromFile(schemaFile);
+} catch (Exception e) {
+  LOGGER.error("Got exception while reading Pinot schema from file: [" + 
_schemaFile + "]");
+  throw e;
 }
-_controllerAddress = "http://; + _controllerHost + ":" + _controllerPort;
-
-if (!_exec) {
-  LOGGER.warn("Dry Running Command: " + toString());
-  LOGGER.warn("Use the -exec option to actually execute the command.");
-  return true;
+try (FileUploadDownloadClient fileUploadDownloadClient = new 
FileUploadDownloadClient()) {
+  fileUploadDownloadClient.addSchema(
+  FileUploadDownloadClient.getUploadSchemaHttpURI(_controllerHost, 
Integer.parseInt(_controllerPort)),
 
 Review comment:
   Dry run shouldn't upload schema. This check is at line 146.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] fx19880617 commented on a change in pull request #4959: Add schemaFile as option in AddTableCommand

2020-01-06 Thread GitBox
fx19880617 commented on a change in pull request #4959: Add schemaFile as 
option in AddTableCommand
URL: https://github.com/apache/incubator-pinot/pull/4959#discussion_r363488701
 
 

 ##
 File path: 
pinot-tools/src/main/java/org/apache/pinot/tools/admin/command/AddTableCommand.java
 ##
 @@ -102,33 +111,55 @@ public AddTableCommand setExecute(boolean exec) {
 return this;
   }
 
-  public boolean execute(JsonNode node)
-  throws IOException {
-if (_controllerHost == null) {
-  _controllerHost = NetUtil.getHostAddress();
+  public void uploadSchema()
+  throws Exception {
+File schemaFile;
+Schema schema;
+try {
+  schemaFile = new File(_schemaFile);
+  schema = Schema.fromFile(schemaFile);
+} catch (Exception e) {
+  LOGGER.error("Got exception while reading Pinot schema from file: [" + 
_schemaFile + "]");
+  throw e;
 }
-_controllerAddress = "http://; + _controllerHost + ":" + _controllerPort;
-
-if (!_exec) {
-  LOGGER.warn("Dry Running Command: " + toString());
-  LOGGER.warn("Use the -exec option to actually execute the command.");
-  return true;
+try (FileUploadDownloadClient fileUploadDownloadClient = new 
FileUploadDownloadClient()) {
+  fileUploadDownloadClient.addSchema(
+  FileUploadDownloadClient.getUploadSchemaHttpURI(_controllerHost, 
Integer.parseInt(_controllerPort)),
 
 Review comment:
   Dry run shouldn't upload schema. The check is at line 146


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on issue #4962: Create a new query endpoint for sql

2020-01-06 Thread GitBox
mcvsubbu commented on issue #4962: Create a new query endpoint for sql
URL: 
https://github.com/apache/incubator-pinot/issues/4962#issuecomment-571310841
 
 
   +1 to adding a new endpoint.
   Should pql endpoint still support responseFormat and groupByMode of sql? Why?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] xiaohui-sun commented on a change in pull request #4960: [TE] Use upper and lower bounds as predicted for threshold algorithm

2020-01-06 Thread GitBox
xiaohui-sun commented on a change in pull request #4960: [TE] Use upper and 
lower bounds as predicted for threshold algorithm
URL: https://github.com/apache/incubator-pinot/pull/4960#discussion_r363473207
 
 

 ##
 File path: 
thirdeye/thirdeye-pinot/src/main/java/org/apache/pinot/thirdeye/detection/components/ThresholdRuleDetector.java
 ##
 @@ -101,15 +100,24 @@ public TimeSeries computePredictedTimeSeries(MetricSlice 
slice) {
 InputData data =
 this.dataFetcher.fetchData(new 
InputDataSpec().withTimeseriesSlices(Collections.singletonList(slice)));
 DataFrame df = data.getTimeseries().get(slice);
-return TimeSeries.fromDataFrame(constructThresholdBoundaries(df));
+return TimeSeries.fromDataFrame(constructBaselineAndBoundaries(df));
   }
 
-  private DataFrame constructThresholdBoundaries(DataFrame df) {
+  /**
+   * Populate the dataframe with upper/lower boundaries and baseline
+   */
+  private DataFrame constructBaselineAndBoundaries(DataFrame df) {
+// Set default baseline as the actual value
+df.addSeries(COL_VALUE, df.getDoubles(COL_CURRENT));
 if (!Double.isNaN(this.min)) {
   df.addSeries(COL_LOWER_BOUND, DoubleSeries.fillValues(df.size(), 
this.min));
+  // set baseline value as the lower bound when actual value across below 
the mark
+  df.mapInPlace(DoubleSeries.MAX, COL_VALUE, COL_LOWER_BOUND, COL_VALUE);
 
 Review comment:
   Why we want to change the baseline value? 
   If it is threshold algorithm we should not show upper/lower bounds. 
   Please take a look at this doc: 
https://docs.google.com/document/d/12bE6e_fIbk3pp0oPCrOHOPFid4NEMguZ-dF3IVjF3xQ/edit


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] npawar edited a comment on issue #4962: Create a new query endpoint for sql

2020-01-06 Thread GitBox
npawar edited a comment on issue #4962: Create a new query endpoint for sql
URL: 
https://github.com/apache/incubator-pinot/issues/4962#issuecomment-571272652
 
 
   Existing endpoints are:
   1. /query
   GET 
   ```
   @GET
   @Produces(MediaType.APPLICATION_JSON)
   @Path("query")
   public String processQueryGet(
   @ApiParam(value = "Query", required = true) @QueryParam("bql") 
String query,
   @ApiParam(value = "Trace enabled") @QueryParam(TRACE) String 
traceEnabled,
   @ApiParam(value = "Debug options") @QueryParam(DEBUG_OPTIONS) String 
debugOptions) {
   ```
   This is a strictly PQL endpoint. A JsonNode is created in this method, which 
sets the query into a "pql" fieldname. There's no scope yet to add any query 
options, which is necessary for sql
   
   2. /query
   POST
   ```
   @POST
   @Path("query")
   public String processQueryPost(String payload) {
   ```
   This endpoint takes a payload. As a result, this one works for pql as well 
as sql. This one also lets you set the query options for pql.
   
   
   We have 2 options to take it from here to introduce the sql endpoint
   1. Add a completely new set of endpoints for `/sql`
   a) For the GET, this means creating the JsonNode correctly with `sql`, and 
setting all the sql options
   b) For the POST, this means validating the payload has `sql`. Adding the sql 
query options to the payload.
   
   2. Modify existing endpoints to be able to handle both modes
   a) For the GET, this means adding a new query param, to indicate this is a 
sql/pql query, and then accordingly creating the JsonNode
   b) For the POST, this means checking the payload for sql/pql, and then 
adding the sql query options if sql.
   
   I like option 1. It lets us keep pql and sql as obviously separate 
endpoints, and make the coding clean, avoiding a lot of if else.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] npawar commented on issue #4962: Create a new query endpoint for sql

2020-01-06 Thread GitBox
npawar commented on issue #4962: Create a new query endpoint for sql
URL: 
https://github.com/apache/incubator-pinot/issues/4962#issuecomment-571272652
 
 
   Existing endpoints are:
   1. /query
   GET 
   ```
   @GET
   @Produces(MediaType.APPLICATION_JSON)
   @Path("query")
   public String processQueryGet(
   @ApiParam(value = "Query", required = true) @QueryParam("bql") 
String query,
   @ApiParam(value = "Trace enabled") @QueryParam(TRACE) String 
traceEnabled,
   @ApiParam(value = "Debug options") @QueryParam(DEBUG_OPTIONS) String 
debugOptions) {
   ```
   This is a strictly PQL endpoint. A JsonNode is created in this method, which 
sets the query into a "pql" fieldname. There's no scope yet to add any query 
options, which is necessary for sql
   
   2. /query
   POST
   ```
   @POST
   @Path("query")
   public String processQueryPost(String payload) {
   ```
   This endpoint takes a payload. As a result, this one works for pql as well 
as sql. This one also lets you set the query options for pql.
   
   
   We have 2 options to take it from here to introduce the sql endpoint
   1. Add a completely new set of endpoints for "/sql"
   a) For the GET, this means creating the JsonNode correctly with "sql", and 
setting all the sql options
   b) For the POST, this means validating the payload has "sql". Adding the sql 
query options to the payload.
   
   2. Modify existing endpoints to be able to handle both modes
   a) For the GET, this means adding a new query param, to indicate this is a 
sql/pql query, and then accordingly creating the JsonNode
   b) For the POST, this means checking the payload for sql/pql, and then 
adding the sql query options if sql.
   
   I like option 1. It lets us keep pql and sql as obviously separate 
endpoints, and make the coding clean, avoiding a lot of if else.  


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] npawar opened a new issue #4962: Create a new query endpoint for sql

2020-01-06 Thread GitBox
npawar opened a new issue #4962: Create a new query endpoint for sql
URL: https://github.com/apache/incubator-pinot/issues/4962
 
 
   Add a new endpoint to the pinot broker, which will be the sql query 
endpoint. Adding a new endpoint will make it easier for migration from pql to 
sql. 
   The sql endpoint will fix the query type to sql, the response format to sql, 
and the group by mode to sql. There will be no scope for configuring these.
   The pql endpoint will continue to support options responseFormat=sql and 
groupByMode=sql.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on issue #4957: Rename stream plugin pkgs

2020-01-06 Thread GitBox
mcvsubbu commented on issue #4957: Rename stream plugin pkgs
URL: https://github.com/apache/incubator-pinot/pull/4957#issuecomment-571220982
 
 
   > Will this package name change causing existing realtime tables fail to 
start consuming?
   > I've also updated the quickstart and examples in #4956
   
   Not sure why you ask, but yes, one may need to change config files to point 
to the right class, or load the right class names. I will look through other 
files to see if there are places where we hard-code classpath names


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org



[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #4959: Add schemaFile as option in AddTableCommand

2020-01-06 Thread GitBox
mcvsubbu commented on a change in pull request #4959: Add schemaFile as option 
in AddTableCommand
URL: https://github.com/apache/incubator-pinot/pull/4959#discussion_r363383402
 
 

 ##
 File path: 
pinot-tools/src/main/java/org/apache/pinot/tools/admin/command/AddTableCommand.java
 ##
 @@ -102,33 +111,55 @@ public AddTableCommand setExecute(boolean exec) {
 return this;
   }
 
-  public boolean execute(JsonNode node)
-  throws IOException {
-if (_controllerHost == null) {
-  _controllerHost = NetUtil.getHostAddress();
+  public void uploadSchema()
+  throws Exception {
+File schemaFile;
+Schema schema;
+try {
+  schemaFile = new File(_schemaFile);
+  schema = Schema.fromFile(schemaFile);
+} catch (Exception e) {
+  LOGGER.error("Got exception while reading Pinot schema from file: [" + 
_schemaFile + "]");
+  throw e;
 }
-_controllerAddress = "http://; + _controllerHost + ":" + _controllerPort;
-
-if (!_exec) {
-  LOGGER.warn("Dry Running Command: " + toString());
-  LOGGER.warn("Use the -exec option to actually execute the command.");
-  return true;
+try (FileUploadDownloadClient fileUploadDownloadClient = new 
FileUploadDownloadClient()) {
+  fileUploadDownloadClient.addSchema(
+  FileUploadDownloadClient.getUploadSchemaHttpURI(_controllerHost, 
Integer.parseInt(_controllerPort)),
 
 Review comment:
   Do we still want to upload schema if `exec` is not true?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org