[GitHub] [incubator-pinot] codecov-io edited a comment on pull request #6120: Add FST Index using lucene lib to speedup regexp queries

2020-10-14 Thread GitBox
codecov-io edited a comment on pull request #6120: URL: https://github.com/apache/incubator-pinot/pull/6120#issuecomment-705891235 # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6120?src=pr=h1) Report > Merging

[GitHub] [incubator-pinot] yupeng9 commented on issue #5509: Derived columns

2020-10-14 Thread GitBox
yupeng9 commented on issue #5509: URL: https://github.com/apache/incubator-pinot/issues/5509#issuecomment-708860324 +1 to derived over derived. It's useful for defining common expressions. Also, from impl it's just a topological sort
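
The "topological sort" remark can be illustrated with a minimal sketch. It is not Pinot code: the function name `order_derived_columns` and the dependency-map input are assumptions made for this example. It orders columns so that each derived column is computed after the columns it reads, using Kahn's algorithm, and rejects cycles.

```python
from collections import deque


def order_derived_columns(dependencies):
    """Return columns ordered so every column comes after its inputs.

    `dependencies` (hypothetical input shape) maps each derived column to
    the columns it reads. Columns that only appear as inputs are treated
    as source columns.
    """
    # Collect every column, including source columns with no entry of their own.
    columns = set(dependencies)
    for inputs in dependencies.values():
        columns.update(inputs)

    # Number of not-yet-emitted inputs per column, and the reverse edges.
    remaining = {c: len(set(dependencies.get(c, ()))) for c in columns}
    dependents = {c: [] for c in columns}
    for column, inputs in dependencies.items():
        for source in set(inputs):
            dependents[source].append(column)

    # Kahn's algorithm: repeatedly emit columns whose inputs are all emitted.
    ready = deque(sorted(c for c, n in remaining.items() if n == 0))
    ordered = []
    while ready:
        column = ready.popleft()
        ordered.append(column)
        for dependent in sorted(dependents[column]):
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                ready.append(dependent)

    # Any column left over is part of a dependency cycle.
    if len(ordered) != len(columns):
        raise ValueError("cyclic dependency among derived columns")
    return ordered
```

For the case raised in this thread (x in the source data, y = f(x), z = f(y)), calling `order_derived_columns({"y": ["x"], "z": ["y"]})` yields x, then y, then z.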

[GitHub] [incubator-pinot] npawar commented on issue #5509: Derived columns

2020-10-14 Thread GitBox
npawar commented on issue #5509: URL: https://github.com/apache/incubator-pinot/issues/5509#issuecomment-708710753 FYI @kishoreg @Jackie-Jiang This is an automated message from the Apache Git Service. To respond to the

[GitHub] [incubator-pinot] codecov-io edited a comment on pull request #6136: add query runner support for query file resampling

2020-10-14 Thread GitBox
codecov-io edited a comment on pull request #6136: URL: https://github.com/apache/incubator-pinot/pull/6136#issuecomment-707985309 # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6136?src=pr=h1) Report > Merging

[GitHub] [incubator-pinot] yupeng9 commented on issue #5509: Derived columns

2020-10-14 Thread GitBox
yupeng9 commented on issue #5509: URL: https://github.com/apache/incubator-pinot/issues/5509#issuecomment-708861168 > Should the flag "derived" be part of the FieldSpec, or part of the TableConfig -> IngestionConfig -> TransformConfig? > > It makes more sense in the fieldSpec. But

[GitHub] [incubator-pinot] npawar commented on a change in pull request #6113: Adding the upsert support to real-time ingestion and query

2020-10-14 Thread GitBox
npawar commented on a change in pull request #6113: URL: https://github.com/apache/incubator-pinot/pull/6113#discussion_r504983362 ## File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java ## @@ -378,7 +463,12 @@ public

[GitHub] [incubator-pinot] npawar commented on issue #5509: Derived columns

2020-10-14 Thread GitBox
npawar commented on issue #5509: URL: https://github.com/apache/incubator-pinot/issues/5509#issuecomment-708715214 I think we should support derived on top of derived.

[GitHub] [incubator-pinot] npawar commented on issue #5509: Derived columns

2020-10-14 Thread GitBox
npawar commented on issue #5509: URL: https://github.com/apache/incubator-pinot/issues/5509#issuecomment-708719006 Should the flag "derived" be part of the FieldSpec, or part of the TableConfig -> IngestionConfig -> TransformConfig? It makes more sense in the fieldSpec. But we end up

[GitHub] [incubator-pinot] Jackie-Jiang opened a new pull request #6147: Add OnHeapGuavaBloomFilterReader

2020-10-14 Thread GitBox
Jackie-Jiang opened a new pull request #6147: URL: https://github.com/apache/incubator-pinot/pull/6147 ## Description Add the on-heap version of the guava bloom filter reader Add 2 new fields into the `BloomFilterConfig`: - maxSizeInBytes: if configured, limit the max size of the

[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #5509: Derived columns

2020-10-14 Thread GitBox
Jackie-Jiang commented on issue #5509: URL: https://github.com/apache/incubator-pinot/issues/5509#issuecomment-708713833 Do we want to support derived field on top of derived field? E.g. we have x in the source data, and we want to add y = f(x) and z = f(y)

[incubator-pinot] branch query-runner-sampling-mode updated (338af06 -> 0546110)

2020-10-14 Thread apucher
This is an automated email from the ASF dual-hosted git repository. apucher pushed a change to branch query-runner-sampling-mode in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. discard 338af06 track exception counts discard 8093212 review fixes discard b635bac add

[GitHub] [incubator-pinot] bryantachen commented on a change in pull request #6101: [TE] web-api - endpoint for getting performance metrics of detection …

2020-10-14 Thread GitBox
bryantachen commented on a change in pull request #6101: URL: https://github.com/apache/incubator-pinot/pull/6101#discussion_r505133609 ## File path: thirdeye/thirdeye-spi/src/main/java/org/apache/pinot/thirdeye/detection/performance/PerformanceMetrics.java ## @@ -0,0 +1,136

[GitHub] [incubator-pinot] pradeepgv42 commented on pull request #6120: Add FST Index using lucene lib to speedup regexp queries

2020-10-14 Thread GitBox
pradeepgv42 commented on pull request #6120: URL: https://github.com/apache/incubator-pinot/pull/6120#issuecomment-708660577 @Jackie-Jiang QQ about this comment: if I'm seeing the history correctly, you added it

[GitHub] [incubator-pinot] npawar commented on issue #5509: Derived columns

2020-10-14 Thread GitBox
npawar commented on issue #5509: URL: https://github.com/apache/incubator-pinot/issues/5509#issuecomment-708706434 **Challenges** Although this seems exactly like transform functions, there are some differences because of which we cannot handle this solely as regular transform

[GitHub] [incubator-pinot] venkatvghub closed pull request #6131: F master fix

2020-10-14 Thread GitBox
venkatvghub closed pull request #6131: URL: https://github.com/apache/incubator-pinot/pull/6131

[GitHub] [incubator-pinot] venkatvghub commented on pull request #6131: F master fix

2020-10-14 Thread GitBox
venkatvghub commented on pull request #6131: URL: https://github.com/apache/incubator-pinot/pull/6131#issuecomment-708265518 This was filed in the wrong place. Closing this.

[GitHub] [incubator-pinot] timsants commented on a change in pull request #6046: Deep Extraction Support for ORC, Thrift, and ProtoBuf Records

2020-10-14 Thread GitBox
timsants commented on a change in pull request #6046: URL: https://github.com/apache/incubator-pinot/pull/6046#discussion_r504422005 ## File path: pinot-plugins/pinot-input-format/pinot-orc/src/main/java/org/apache/pinot/plugin/inputformat/orc/ORCRecordReader.java ## @@

[GitHub] [incubator-pinot] timsants commented on a change in pull request #6046: Deep Extraction Support for ORC, Thrift, and ProtoBuf Records

2020-10-14 Thread GitBox
timsants commented on a change in pull request #6046: URL: https://github.com/apache/incubator-pinot/pull/6046#discussion_r504421904 ## File path: pinot-plugins/pinot-input-format/pinot-orc/src/main/java/org/apache/pinot/plugin/inputformat/orc/ORCRecordReader.java ## @@

[GitHub] [incubator-pinot] timsants closed pull request #6046: Deep Extraction Support for ORC, Thrift, and ProtoBuf Records

2020-10-14 Thread GitBox
timsants closed pull request #6046: URL: https://github.com/apache/incubator-pinot/pull/6046

[GitHub] [incubator-pinot] timsants commented on pull request #6046: Deep Extraction Support for ORC, Thrift, and ProtoBuf Records

2020-10-14 Thread GitBox
timsants commented on pull request #6046: URL: https://github.com/apache/incubator-pinot/pull/6046#issuecomment-708178631 > shouldn't `CSVRecordExtractor` also extend `BaseRecordExtractor` abstract class instead of implementing the `RecordExtractor` interface?

[GitHub] [incubator-pinot] kvanjana opened a new issue #6142: Email Alert setup with fromAddr as company ID and Backend Log files

2020-10-14 Thread GitBox
kvanjana opened a new issue #6142: URL: https://github.com/apache/incubator-pinot/issues/6142 Dear Team, I would like to ask three questions. 1. **Change the "fromAddress" of alert email:** I read in many documents that “# Sender of the alert. Please avoid changing this

[GitHub] [incubator-pinot] mcvsubbu commented on issue #5753: Built-in jobs to move segments of hybrid tables from Realtime Servers to Offline Servers

2020-10-14 Thread GitBox
mcvsubbu commented on issue #5753: URL: https://github.com/apache/incubator-pinot/issues/5753#issuecomment-708496354 I just realized that if we have multiple data centers, this technique will not produce the same results across the data centers. Something worth noting.

[GitHub] [incubator-pinot] kishoreg edited a comment on issue #5753: Built-in jobs to move segments of hybrid tables from Realtime Servers to Offline Servers

2020-10-14 Thread GitBox
kishoreg edited a comment on issue #5753: URL: https://github.com/apache/incubator-pinot/issues/5753#issuecomment-708503218 Why do you say that? As long as you give enough buffer time for the events from previous time period to flow in, it should be ok right?

[GitHub] [incubator-pinot] kishoreg commented on issue #5753: Built-in jobs to move segments of hybrid tables from Realtime Servers to Offline Servers

2020-10-14 Thread GitBox
kishoreg commented on issue #5753: URL: https://github.com/apache/incubator-pinot/issues/5753#issuecomment-708503218 Why do you say that? As long as you give enough buffer, it should be ok right?

[GitHub] [incubator-pinot] lgo opened a new issue #6145: Querying a partitioned column for a non-existent value times out

2020-10-14 Thread GitBox
lgo opened a new issue #6145: URL: https://github.com/apache/incubator-pinot/issues/6145 With a partition column, say `type`, if we uploaded segments containing `type=foo` but had the following query over a non-existent value: SELECT * FROM table WHERE type = 'bar'

[GitHub] [incubator-pinot] lgo opened a new issue #6143: Invalid query results when querying with non-existent column

2020-10-14 Thread GitBox
lgo opened a new issue #6143: URL: https://github.com/apache/incubator-pinot/issues/6143 While testing some results, I accidentally used the wrong name to refer to a column. When querying, rather than raising an error the query actually returned but with `0` results, such as the following

[GitHub] [incubator-pinot] lgo opened a new issue #6144: Querying an indexed field for `distinct` values is slow

2020-10-14 Thread GitBox
lgo opened a new issue #6144: URL: https://github.com/apache/incubator-pinot/issues/6144 While building a query intended to pull out all values for a dimension, the query was slow and timing out. ```sql select type from adjustment group by type ``` Meanwhile, a query

[GitHub] [incubator-pinot] lgo opened a new issue #6146: Low maximum limit for batch jobSpec pushParallelism

2020-10-14 Thread GitBox
lgo opened a new issue #6146: URL: https://github.com/apache/incubator-pinot/issues/6146 On batch jobs processing lots of segments for a table, they often run into Zookeeper conflicts when updating idealState. This causes contention on updates, slowing down everything. To resolve that the

[GitHub] [incubator-pinot] codecov-io edited a comment on pull request #6139: Remove tyrus dependencies in pinot-tools module

2020-10-14 Thread GitBox
codecov-io edited a comment on pull request #6139: URL: https://github.com/apache/incubator-pinot/pull/6139#issuecomment-708062404 # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6139?src=pr=h1) Report > Merging

[GitHub] [incubator-pinot] codecov-io edited a comment on pull request #6066: Detect the behavior when column name mismatches in the query

2020-10-14 Thread GitBox
codecov-io edited a comment on pull request #6066: URL: https://github.com/apache/incubator-pinot/pull/6066#issuecomment-707443440 # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6066?src=pr=h1) Report > Merging

[GitHub] [incubator-pinot] mcvsubbu commented on issue #5753: Built-in jobs to move segments of hybrid tables from Realtime Servers to Offline Servers

2020-10-14 Thread GitBox
mcvsubbu commented on issue #5753: URL: https://github.com/apache/incubator-pinot/issues/5753#issuecomment-708520151 I mis-worded it. The results will be the same, but the segments in each data center may not be the same, right? I am not sure if the m to n segment reduction and time

[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6094: Implement the segment merge task generator

2020-10-14 Thread GitBox
mcvsubbu commented on a change in pull request #6094: URL: https://github.com/apache/incubator-pinot/pull/6094#discussion_r504846446 ## File path: pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/minion/generator/SegmentMergeRollupTaskGenerator.java ## @@

[incubator-pinot] branch throw-exception-when-column-mismatch updated (1c01c8b -> fb4cb73)

2020-10-14 Thread jlli
This is an automated email from the ASF dual-hosted git repository. jlli pushed a change to branch throw-exception-when-column-mismatch in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. discard 1c01c8b Address PR comments discard 25eccab Add warn level message and emit

[GitHub] [incubator-pinot] fx19880617 edited a comment on issue #6146: Low maximum limit for batch jobSpec pushParallelism

2020-10-14 Thread GitBox
fx19880617 edited a comment on issue #6146: URL: https://github.com/apache/incubator-pinot/issues/6146#issuecomment-708558434 This is because we upload segments to all the controller hosts, and the idealState update requests coming from all controllers will cause the slowness and update

[GitHub] [incubator-pinot] fx19880617 commented on issue #6146: Low maximum limit for batch jobSpec pushParallelism

2020-10-14 Thread GitBox
fx19880617 commented on issue #6146: URL: https://github.com/apache/incubator-pinot/issues/6146#issuecomment-708558434 I feel this is because we upload segments to all the controller hosts, and the idealState update requests coming from all controllers will cause the slowness and update

[GitHub] [incubator-pinot] fx19880617 edited a comment on issue #6146: Low maximum limit for batch jobSpec pushParallelism

2020-10-14 Thread GitBox
fx19880617 edited a comment on issue #6146: URL: https://github.com/apache/incubator-pinot/issues/6146#issuecomment-708558434 I feel this is because we upload segments to all the controller hosts, and the idealState update requests coming from all controllers will cause the slowness and

[GitHub] [incubator-pinot] mcvsubbu commented on issue #6143: Invalid query results when querying with non-existent column

2020-10-14 Thread GitBox
mcvsubbu commented on issue #6143: URL: https://github.com/apache/incubator-pinot/issues/6143#issuecomment-708559618 I think there was a PR that @jackjlli had sometime before? It is best if we enforce it as a part of SQL, and let PQL be. We are in the process of migrating from PQL to

[GitHub] [incubator-pinot] jackjlli commented on issue #6143: Invalid query results when querying with non-existent column

2020-10-14 Thread GitBox
jackjlli commented on issue #6143: URL: https://github.com/apache/incubator-pinot/issues/6143#issuecomment-708562143 Yes, I have a PR to detect the column mismatch in the query: https://github.com/apache/incubator-pinot/pull/6066 We'll first monitor how many existing use cases are

[GitHub] [incubator-pinot] joey-stripe commented on issue #6146: Low maximum limit for batch jobSpec pushParallelism

2020-10-14 Thread GitBox
joey-stripe commented on issue #6146: URL: https://github.com/apache/incubator-pinot/issues/6146#issuecomment-708570008 Here was the stripped down jobSpec we are using ```yaml executionFrameworkSpec: name: spark segmentMetadataPushJobRunnerClassName:

[GitHub] [incubator-pinot] yupeng9 commented on pull request #6141: add query option of disabling upsert during query

2020-10-14 Thread GitBox
yupeng9 commented on pull request #6141: URL: https://github.com/apache/incubator-pinot/pull/6141#issuecomment-708525828 > `disableUpsert` is a bit confusing. Initially it led me to believe that it is actually going to disable upsert operation, which led me to question why the read path

[GitHub] [incubator-pinot] mayankshriv commented on a change in pull request #6134: Adding a flag in PinotServiceManager to allow health check includes all components health check

2020-10-14 Thread GitBox
mayankshriv commented on a change in pull request #6134: URL: https://github.com/apache/incubator-pinot/pull/6134#discussion_r504825339 ## File path: pinot-tools/src/main/java/org/apache/pinot/tools/admin/command/StartServiceManagerCommand.java ## @@ -69,6 +69,8 @@ private

[GitHub] [incubator-pinot] mayankshriv commented on a change in pull request #6127: Rewrite possible array aggregation functions to one level

2020-10-14 Thread GitBox
mayankshriv commented on a change in pull request #6127: URL: https://github.com/apache/incubator-pinot/pull/6127#discussion_r504830479 ## File path: pinot-common/src/main/java/org/apache/pinot/sql/parsers/CalciteSqlParser.java ## @@ -343,6 +346,9 @@ private static void

[GitHub] [incubator-pinot] mcvsubbu commented on a change in pull request #6113: Adding the upsert support to real-time ingestion and query

2020-10-14 Thread GitBox
mcvsubbu commented on a change in pull request #6113: URL: https://github.com/apache/incubator-pinot/pull/6113#discussion_r504839631 ## File path: pinot-core/src/main/java/org/apache/pinot/core/data/manager/realtime/RealtimeTableDataManager.java ## @@ -378,7 +463,12 @@ public

[incubator-pinot] branch fix-rsvp-meetup updated (0ec26b8 -> b432f15)

2020-10-14 Thread jlli
This is an automated email from the ASF dual-hosted git repository. jlli pushed a change to branch fix-rsvp-meetup in repository https://gitbox.apache.org/repos/asf/incubator-pinot.git. discard 0ec26b8 Remove tyrus dependencies add b432f15 Remove tyrus dependencies This update added

[GitHub] [incubator-pinot] fx19880617 commented on issue #6143: Invalid query results when querying with non-existent column

2020-10-14 Thread GitBox
fx19880617 commented on issue #6143: URL: https://github.com/apache/incubator-pinot/issues/6143#issuecomment-708548825 I feel we should do column validation by default, and add a flag to relax this behavior for legacy offline tables which don't enforce it. Thoughts? @mayankshriv
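
The proposal in this comment (validate columns by default, with a flag to relax the check) might look roughly like the sketch below. It is only an illustration, not Pinot's implementation: the function name `validate_query_columns` and the `strict` flag are hypothetical, and the column lists are assumed to be extracted from the query and the table schema elsewhere.

```python
def validate_query_columns(query_columns, schema_columns, strict=True):
    """Return the query columns that are missing from the schema.

    With strict=True (the assumed default), an unknown column raises an
    error instead of letting the query silently return no rows. With
    strict=False, the unknown columns are only reported to the caller.
    """
    known = set(schema_columns)
    missing = sorted(c for c in set(query_columns) if c not in known)
    if missing and strict:
        raise ValueError("unknown columns in query: " + ", ".join(missing))
    return missing
```

A legacy table could pass `strict=False` and log the returned names, while new tables keep the default and fail fast on a misspelled column.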

[GitHub] [incubator-pinot] snleee commented on pull request #6094: Implement the segment merge task generator

2020-10-14 Thread GitBox
snleee commented on pull request #6094: URL: https://github.com/apache/incubator-pinot/pull/6094#issuecomment-708555891 #2715 https://docs.google.com/document/d/1-AKCfXNXdoNjFIvJ87wjWwFM_38gS0NCwFrIYjYsqp8/edit#heading=h.3ajrnu1jdp9u

[GitHub] [incubator-pinot] mcvsubbu commented on issue #6146: Low maximum limit for batch jobSpec pushParallelism

2020-10-14 Thread GitBox
mcvsubbu commented on issue #6146: URL: https://github.com/apache/incubator-pinot/issues/6146#issuecomment-708561221 Not sure what you mean by rethink. A metadata only or URI push is a cheaper operation, so there is less likelihood of contention. We can also make the backoff be in

[GitHub] [incubator-pinot] joey-stripe edited a comment on issue #6146: Low maximum limit for batch jobSpec pushParallelism

2020-10-14 Thread GitBox
joey-stripe edited a comment on issue #6146: URL: https://github.com/apache/incubator-pinot/issues/6146#issuecomment-708570008 Here was the stripped down jobSpec we are using for reference ```yaml executionFrameworkSpec: name: spark segmentMetadataPushJobRunnerClassName:

[GitHub] [incubator-pinot] codecov-io edited a comment on pull request #6141: add query option of disabling upsert during query

2020-10-14 Thread GitBox
codecov-io edited a comment on pull request #6141: URL: https://github.com/apache/incubator-pinot/pull/6141#issuecomment-708169596 # [Codecov](https://codecov.io/gh/apache/incubator-pinot/pull/6141?src=pr=h1) Report > Merging
