[GitHub] [druid] gianm commented on pull request #10267: DruidInputSource: Fix issues in column projection, timestamp handling.

2020-12-03 Thread GitBox


gianm commented on pull request #10267:
URL: https://github.com/apache/druid/pull/10267#issuecomment-738621133


   > I didn't see the ClassCastException in the log for the failed batch test 
though.
   
   I finally ran this locally and was able to extract the full error. I think 
it's happening because `__time` is getting added to the dimensions list of an 
input row. (This makes a string dimension named `__time` get created, which 
violates all sorts of assumptions.)
   
   We'll need to prevent that from happening somehow. I'll look into it.
   
   ```
   Caused by: java.lang.ClassCastException: 
org.apache.druid.segment.incremental.IncrementalIndexRowHolder cannot be cast 
to org.apache.druid.segment.DimensionSelector
   at 
org.apache.druid.segment.StringDimensionIndexer.convertUnsortedValuesToSorted(StringDimensionIndexer.java:793)
 ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.incremental.IncrementalIndexRowIterator.lambda$makeRowPointer$0(IncrementalIndexRowIterator.java:78)
 ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) 
~[?:1.8.0_232]
   at java.util.Iterator.forEachRemaining(Iterator.java:116) 
~[?:1.8.0_232]
   at 
java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
 ~[?:1.8.0_232]
   at 
java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) 
~[?:1.8.0_232]
   at 
java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) 
~[?:1.8.0_232]
   at 
java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:546) 
~[?:1.8.0_232]
   at 
java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
 ~[?:1.8.0_232]
   at 
java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:505) 
~[?:1.8.0_232]
   at 
org.apache.druid.segment.incremental.IncrementalIndexRowIterator.makeRowPointer(IncrementalIndexRowIterator.java:80)
 ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.incremental.IncrementalIndexRowIterator.(IncrementalIndexRowIterator.java:54)
 ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.incremental.IncrementalIndexAdapter.getRows(IncrementalIndexAdapter.java:162)
 ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.IndexMergerV9.makeMergedTimeAndDimsIterator(IndexMergerV9.java:1106)
 ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:255) 
~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:999) 
~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.IndexMergerV9.persist(IndexMergerV9.java:864) 
~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.IndexMergerV9.persist(IndexMergerV9.java:833) 
~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.persistHydrant(AppenderatorImpl.java:1334)
 ~[druid-server-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.access$100(AppenderatorImpl.java:102)
 ~[druid-server-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   at 
org.apache.druid.segment.realtime.appenderator.AppenderatorImpl$2.call(AppenderatorImpl.java:547)
 ~[druid-server-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson commented on a change in pull request #10631: Fixes and tests related to the Indexer process.

2020-12-03 Thread GitBox


jihoonson commented on a change in pull request #10631:
URL: https://github.com/apache/druid/pull/10631#discussion_r535866173



##
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/common/task/Task.java
##
@@ -160,8 +160,11 @@ default int getPriority()
   /**
* Execute preflight actions for a task. This can be used to acquire locks, 
check preconditions, and so on. The
* actions must be idempotent, since this method may be executed multiple 
times. This typically runs on the
-   * coordinator. If this method throws an exception, the task should be 
considered a failure.
-   * 
+   * Overlord. If this method throws an exception, the task should be 
considered a failure.
+   *
+   * This method will not necessarily be executed before {@link 
#run(TaskToolbox)}, since this task readiness check
+   * may be done on a different machine from the one that actually runs the 
task.

Review comment:
   Oh, I see. I think it happens only in indexers because peon calls 
`isReady()` before it runs its task.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm commented on a change in pull request #10631: Fixes and tests related to the Indexer process.

2020-12-03 Thread GitBox


gianm commented on a change in pull request #10631:
URL: https://github.com/apache/druid/pull/10631#discussion_r535855400



##
File path: integration-tests/docker/environment-configs/coordinator
##
@@ -32,7 +32,11 @@ druid_coordinator_startDelay=PT5S
 druid_manager_lookups_hostUpdateTimeout=PT30S
 druid_manager_lookups_period=1
 druid_manager_lookups_threadPoolSize=2
+druid_manager_config_pollDuration=PT10S
+druid_manager_rules_pollDuration=PT10S
+druid_manager_segments_pollDuration=PT2S
 druid_auth_basic_common_cacheDirectory=/tmp/authCache/coordinator
 druid_auth_unsecuredPaths=["/druid/coordinator/v1/loadqueue"]
 druid_server_https_crlPath=/tls/revocations.crl
-druid_coordinator_period_indexingPeriod=PT18S
\ No newline at end of file
+druid_coordinator_period_indexingPeriod=PT18S
+druid_coordinator_period=PT1S

Review comment:
   Ha ha 
   
   I think it's OK — there are going to be very few segments in the integration 
tests, so a coordinator run should be able to finish very quickly. This change 
was meant to speed up the tests.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm commented on a change in pull request #10631: Fixes and tests related to the Indexer process.

2020-12-03 Thread GitBox


gianm commented on a change in pull request #10631:
URL: https://github.com/apache/druid/pull/10631#discussion_r535855400



##
File path: integration-tests/docker/environment-configs/coordinator
##
@@ -32,7 +32,11 @@ druid_coordinator_startDelay=PT5S
 druid_manager_lookups_hostUpdateTimeout=PT30S
 druid_manager_lookups_period=1
 druid_manager_lookups_threadPoolSize=2
+druid_manager_config_pollDuration=PT10S
+druid_manager_rules_pollDuration=PT10S
+druid_manager_segments_pollDuration=PT2S
 druid_auth_basic_common_cacheDirectory=/tmp/authCache/coordinator
 druid_auth_unsecuredPaths=["/druid/coordinator/v1/loadqueue"]
 druid_server_https_crlPath=/tls/revocations.crl
-druid_coordinator_period_indexingPeriod=PT18S
\ No newline at end of file
+druid_coordinator_period_indexingPeriod=PT18S
+druid_coordinator_period=PT1S

Review comment:
   Ha ha 
   
   I think it's OK — there are going to be very few segments in the integration 
tests, so a coordinator run should be able to finish very quickly.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm commented on pull request #10631: Fixes and tests related to the Indexer process.

2020-12-03 Thread GitBox


gianm commented on pull request #10631:
URL: https://github.com/apache/druid/pull/10631#issuecomment-738583131


   > #10538 and #10258 will be fixed by this PR. Perhaps #9820 as well.
   
   I agree, it should fix all of those. I commented in those patches.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm edited a comment on pull request #10631: Fixes and tests related to the Indexer process.

2020-12-03 Thread GitBox


gianm edited a comment on pull request #10631:
URL: https://github.com/apache/druid/pull/10631#issuecomment-738583131


   > #10538 and #10258 will be fixed by this PR. Perhaps #9820 as well.
   
   I agree, it should fix all of those. I commented in those issues.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm commented on issue #10258: Regression: Indexer nodes dont announce realtime segments in 0.19

2020-12-03 Thread GitBox


gianm commented on issue #10258:
URL: https://github.com/apache/druid/issues/10258#issuecomment-738583006


   #10631 should fix this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm commented on issue #10538: can not read realtime data when using indexer process instead of middleManager peon process to ingest metrics from kafka

2020-12-03 Thread GitBox


gianm commented on issue #10538:
URL: https://github.com/apache/druid/issues/10538#issuecomment-738582968


   #10631 should fix this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm commented on issue #9820: "taskLockHelper is not initialized yet"

2020-12-03 Thread GitBox


gianm commented on issue #9820:
URL: https://github.com/apache/druid/issues/9820#issuecomment-738582813


   #10631 should fix this.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm commented on a change in pull request #10631: Fixes and tests related to the Indexer process.

2020-12-03 Thread GitBox


gianm commented on a change in pull request #10631:
URL: https://github.com/apache/druid/pull/10631#discussion_r535854619



##
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/common/task/Task.java
##
@@ -160,8 +160,11 @@ default int getPriority()
   /**
* Execute preflight actions for a task. This can be used to acquire locks, 
check preconditions, and so on. The
* actions must be idempotent, since this method may be executed multiple 
times. This typically runs on the
-   * coordinator. If this method throws an exception, the task should be 
considered a failure.
-   * 
+   * Overlord. If this method throws an exception, the task should be 
considered a failure.
+   *
+   * This method will not necessarily be executed before {@link 
#run(TaskToolbox)}, since this task readiness check
+   * may be done on a different machine from the one that actually runs the 
task.

Review comment:
   It won't necessarily be called on that same Task object (it might be 
called on a different instance that represents the same actual task).





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson commented on a change in pull request #10631: Fixes and tests related to the Indexer process.

2020-12-03 Thread GitBox


jihoonson commented on a change in pull request #10631:
URL: https://github.com/apache/druid/pull/10631#discussion_r535850965



##
File path: integration-tests/docker/environment-configs/coordinator
##
@@ -32,7 +32,11 @@ druid_coordinator_startDelay=PT5S
 druid_manager_lookups_hostUpdateTimeout=PT30S
 druid_manager_lookups_period=1
 druid_manager_lookups_threadPoolSize=2
+druid_manager_config_pollDuration=PT10S
+druid_manager_rules_pollDuration=PT10S
+druid_manager_segments_pollDuration=PT2S
 druid_auth_basic_common_cacheDirectory=/tmp/authCache/coordinator
 druid_auth_unsecuredPaths=["/druid/coordinator/v1/loadqueue"]
 druid_server_https_crlPath=/tls/revocations.crl
-druid_coordinator_period_indexingPeriod=PT18S
\ No newline at end of file
+druid_coordinator_period_indexingPeriod=PT18S
+druid_coordinator_period=PT1S

Review comment:
   Is this too aggressive? :sweat_smile: 

##
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/common/task/Task.java
##
@@ -160,8 +160,11 @@ default int getPriority()
   /**
* Execute preflight actions for a task. This can be used to acquire locks, 
check preconditions, and so on. The
* actions must be idempotent, since this method may be executed multiple 
times. This typically runs on the
-   * coordinator. If this method throws an exception, the task should be 
considered a failure.
-   * 
+   * Overlord. If this method throws an exception, the task should be 
considered a failure.
+   *
+   * This method will not necessarily be executed before {@link 
#run(TaskToolbox)}, since this task readiness check
+   * may be done on a different machine from the one that actually runs the 
task.

Review comment:
   Hmm, how can this happen? I thought `Task.isReady()` is always called 
before `Task.run()` because a task can be scheduled only when `Task.isReady()` 
returns true. `Task.run()` will be called after the task is scheduled in some 
indexer or middleManager.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] maytasm edited a comment on pull request #10625: Add option for maxBytesInMemory to be percentage of JVM memory

2020-12-03 Thread GitBox


maytasm edited a comment on pull request #10625:
URL: https://github.com/apache/druid/pull/10625#issuecomment-738578884


   Travis failure is because of processing module test failing due to code 
coverage. Note that all tests passed.
   The code coverage is missing on a single line changed in 
OnheapIncrementalIndex.java. I don't think it's worth-wide or possible to test 
so we should just ignore the code coverage.
   
   ```
   
--
   org/apache/druid/segment/incremental/OnheapIncrementalIndex.java
   
--
   463   }
   464   
   465   @Override
   466   public long getMaxJvmMemory()
   467   {
   468 F | L   return 
JvmUtils.getRuntimeInfo().getMaxHeapSizeBytes();
   
--
   Diff coverage statistics:
   
--
   | lines  |branches|   functions|   path
   
--
   |   0% (0/1) | 100% (0/0) | 100% (1/1) | 
org/apache/druid/segment/incremental/OnheapIncrementalIndex.java
   
--
   Total diff coverage:
- lines: 0% (0/1)
- branches: 100% (0/0)
- functions: 100% (1/1)
   ERROR: Insufficient line coverage of 0% (0/1). Required 50%.
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] maytasm commented on pull request #10625: Add option for maxBytesInMemory to be percentage of JVM memory

2020-12-03 Thread GitBox


maytasm commented on pull request #10625:
URL: https://github.com/apache/druid/pull/10625#issuecomment-738578884


   Travis failure is because of processing module test failing due to code 
coverage. Note that all tests passed.
   The code coverage is missing on a single line changed in 
OnheapIncrementalIndex.java. I don't think it's worth-wide or possible to test 
so we should just ignore the code coverage.
   
   
`--
   org/apache/druid/segment/incremental/OnheapIncrementalIndex.java
   
--
   463   }
   464   
   465   @Override
   466   public long getMaxJvmMemory()
   467   {
   468 F | L   return 
JvmUtils.getRuntimeInfo().getMaxHeapSizeBytes();
   
--
   Diff coverage statistics:
   
--
   | lines  |branches|   functions|   path
   
--
   |   0% (0/1) | 100% (0/0) | 100% (1/1) | 
org/apache/druid/segment/incremental/OnheapIncrementalIndex.java
   
--
   Total diff coverage:
- lines: 0% (0/1)
- branches: 100% (0/0)
- functions: 100% (1/1)
   ERROR: Insufficient line coverage of 0% (0/1). Required 50%.`



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 closed pull request #10624: Grouping id 2

2020-12-03 Thread GitBox


abhishekagarwal87 closed pull request #10624:
URL: https://github.com/apache/druid/pull/10624


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson commented on pull request #10631: Fixes and tests related to the Indexer process.

2020-12-03 Thread GitBox


jihoonson commented on pull request #10631:
URL: https://github.com/apache/druid/pull/10631#issuecomment-738570769


   https://github.com/apache/druid/issues/10538 and 
https://github.com/apache/druid/issues/10258 will be fixed by this PR. Perhaps 
https://github.com/apache/druid/issues/9820 as well.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm opened a new pull request #10631: Fixes and tests related to the Indexer process.

2020-12-03 Thread GitBox


gianm opened a new pull request #10631:
URL: https://github.com/apache/druid/pull/10631


   Three bugs fixed:
   
   1) Indexers would not announce themselves as segment servers if they
  did not have storage locations defined. This used to work, but was
  broken in #9971. Fixed this by adding an "isSegmentServer" method
  to ServerType and updating SegmentLoadDropHandler to always announce
  if this method returns true.
   
   2) Certain batch task types were written in a way that assumed "isReady"
  would be called before "run", which is not guaranteed. In particular,
  they relied on it in order to initialize "taskLockHelper". Fixed this
  by updating AbstractBatchIndexTask to ensure "isReady" is called
  before "run" for these tasks.
   
   3) UnifiedIndexerAppenderatorsManager did not properly handle complex
  datasources. Introduced DataSourceAnalysis in order to fix this.
   
   Test changes:
   
   1) Add a new "docker-compose.cli-indexer.yml" config that spins up an
  Indexer instead of a MiddleManager.
   
   2) Introduce a "USE_INDEXER" environment variable that determines if
  docker-compose will start up an Indexer or a MiddleManager.
   
   3) Duplicate all the jdk8 tests and run them in both MiddleManager and
  Indexer mode.
   
   4) Various adjustments to encourage fail-fast errors in the Docker
  build scripts.
   
   5) Various adjustments to speed up integration tests and reduce memory
  usage.
   
   6) Add another Mac-specific approach to determining a machine's own IP.
  This was useful on my development machine.
   
   7) Update segment-count check in ITCompactionTaskTest to eliminate a
  race condition (it was looking for 6 segments, which only exist
  together briefly, until the older 4 are marked unused).
   
   Javadoc updates:
   
   1) AbstractBatchIndexTask: Added javadocs to determineLockGranularityXXX
  that make it clear when taskLockHelper will be initialized as a side
  effect. (Related to the second bug above.)
   
   2) Task: Clarified that "isReady" is not guaranteed to be called before
  "run". It was already implied, but now it's explicit.
   
   3) ZkCoordinator: Clarified deprecation message.
   
   4) DataSegmentServerAnnouncer: Clarified deprecation message.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] gianm opened a new pull request #10630: Scan query: More accurate error message when segment per time chunk limit is exceeded.

2020-12-03 Thread GitBox


gianm opened a new pull request #10630:
URL: https://github.com/apache/druid/pull/10630


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] zhangyue19921010 commented on pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better

2020-12-03 Thread GitBox


zhangyue19921010 commented on pull request #10551:
URL: https://github.com/apache/druid/pull/10551#issuecomment-738549172


   @FrankChen021 @nishantmonu51 @a2l007 @himanshug @pphust Thanks for your 
review!
   And @himanshug Thanks for your merge!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid] branch master updated: Move common configurations to TuningConfig (#10478)

2020-12-03 Thread cwylie
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
 new 52d46ce  Move common configurations to TuningConfig (#10478)
52d46ce is described below

commit 52d46cebc31026b8dd39c7d9fb82c62bd77965fb
Author: Liran Funaro 
AuthorDate: Fri Dec 4 04:13:32 2020 +0200

Move common configurations to TuningConfig (#10478)

* Move common methods that are used in HadoopTuningConfig and in 
AppenderatorConfig to TuningConfig
* Rename rowFlushBoundary in HadoopTuningConfig to maxRowsInMemory to match 
TuningConfig API
---
 .../MaterializedViewSupervisorSpec.java|  4 ++--
 .../apache/druid/indexer/HadoopTuningConfig.java   | 25 --
 .../apache/druid/indexer/IndexGeneratorJob.java|  2 +-
 .../druid/indexer/HadoopTuningConfigTest.java  |  2 +-
 .../druid/segment/indexing/TuningConfig.java   | 13 +++
 .../realtime/appenderator/AppenderatorConfig.java  | 13 ---
 6 files changed, 31 insertions(+), 28 deletions(-)

diff --git 
a/extensions-contrib/materialized-view-maintenance/src/main/java/org/apache/druid/indexing/materializedview/MaterializedViewSupervisorSpec.java
 
b/extensions-contrib/materialized-view-maintenance/src/main/java/org/apache/druid/indexing/materializedview/MaterializedViewSupervisorSpec.java
index 5388388..db63a73 100644
--- 
a/extensions-contrib/materialized-view-maintenance/src/main/java/org/apache/druid/indexing/materializedview/MaterializedViewSupervisorSpec.java
+++ 
b/extensions-contrib/materialized-view-maintenance/src/main/java/org/apache/druid/indexing/materializedview/MaterializedViewSupervisorSpec.java
@@ -182,7 +182,7 @@ public class MaterializedViewSupervisorSpec implements 
SupervisorSpec
 tuningConfig.getIndexSpec(),
 tuningConfig.getIndexSpecForIntermediatePersists(),
 tuningConfig.getAppendableIndexSpec(),
-tuningConfig.getRowFlushBoundary(),
+tuningConfig.getMaxRowsInMemory(),
 tuningConfig.getMaxBytesInMemory(),
 tuningConfig.isLeaveIntermediate(),
 tuningConfig.isCleanupOnFailure(),
@@ -191,7 +191,7 @@ public class MaterializedViewSupervisorSpec implements 
SupervisorSpec
 tuningConfig.getJobProperties(),
 tuningConfig.isCombineText(),
 tuningConfig.getUseCombiner(),
-tuningConfig.getRowFlushBoundary(),
+tuningConfig.getMaxRowsInMemory(),
 tuningConfig.getBuildV9Directly(),
 tuningConfig.getNumBackgroundPersistThreads(),
 tuningConfig.isForceExtendableShardSpecs(),
diff --git 
a/indexing-hadoop/src/main/java/org/apache/druid/indexer/HadoopTuningConfig.java
 
b/indexing-hadoop/src/main/java/org/apache/druid/indexer/HadoopTuningConfig.java
index f1a8cc8..0f29f6a 100644
--- 
a/indexing-hadoop/src/main/java/org/apache/druid/indexer/HadoopTuningConfig.java
+++ 
b/indexing-hadoop/src/main/java/org/apache/druid/indexer/HadoopTuningConfig.java
@@ -44,7 +44,6 @@ public class HadoopTuningConfig implements TuningConfig
   private static final DimensionBasedPartitionsSpec DEFAULT_PARTITIONS_SPEC = 
HashedPartitionsSpec.defaultSpec();
   private static final Map> DEFAULT_SHARD_SPECS = 
ImmutableMap.of();
   private static final IndexSpec DEFAULT_INDEX_SPEC = new IndexSpec();
-  private static final int DEFAULT_ROW_FLUSH_BOUNDARY = 
TuningConfig.DEFAULT_MAX_ROWS_IN_MEMORY;
   private static final boolean DEFAULT_USE_COMBINER = false;
   private static final int DEFAULT_NUM_BACKGROUND_PERSIST_THREADS = 0;
 
@@ -58,7 +57,7 @@ public class HadoopTuningConfig implements TuningConfig
 DEFAULT_INDEX_SPEC,
 DEFAULT_INDEX_SPEC,
 DEFAULT_APPENDABLE_INDEX,
-DEFAULT_ROW_FLUSH_BOUNDARY,
+DEFAULT_MAX_ROWS_IN_MEMORY,
 0L,
 false,
 true,
@@ -86,7 +85,7 @@ public class HadoopTuningConfig implements TuningConfig
   private final IndexSpec indexSpec;
   private final IndexSpec indexSpecForIntermediatePersists;
   private final AppendableIndexSpec appendableIndexSpec;
-  private final int rowFlushBoundary;
+  private final int maxRowsInMemory;
   private final long maxBytesInMemory;
   private final boolean leaveIntermediate;
   private final boolean cleanupOnFailure;
@@ -141,8 +140,8 @@ public class HadoopTuningConfig implements TuningConfig
 this.indexSpec = indexSpec == null ? DEFAULT_INDEX_SPEC : indexSpec;
 this.indexSpecForIntermediatePersists = indexSpecForIntermediatePersists 
== null ?
 this.indexSpec : 
indexSpecForIntermediatePersists;
-this.rowFlushBoundary = maxRowsInMemory == null ? maxRowsInMemoryCOMPAT == 
null
-  ? 
DEFAULT_ROW_FLUSH_BOUNDARY
+this.maxRowsInMemory = maxRowsInMemory == null ? maxRowsInMemoryCOMPAT == 
null
+  

[GitHub] [druid] clintropolis merged pull request #10478: Move common configurations to TuningConfig

2020-12-03 Thread GitBox


clintropolis merged pull request #10478:
URL: https://github.com/apache/druid/pull/10478


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] clintropolis commented on pull request #10478: Move common configurations to TuningConfig

2020-12-03 Thread GitBox


clintropolis commented on pull request #10478:
URL: https://github.com/apache/druid/pull/10478#issuecomment-738505130


   I don't seem to be able to retry the failing CI which looks like it was a 
flaky test.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid] branch master updated (ae6c43d -> 229b5f3)

2020-12-03 Thread himanshug
This is an automated email from the ASF dual-hosted git repository.

himanshug pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from ae6c43d  Add an integration test for HTTP inputSource (#10620)
 add 229b5f3  Remove hard limitation that druid(after 0.15.0) only can 
consume Kafka version 0.11.x or better (#10551)

No new revisions were added by this update.

Summary of changes:
 .../development/extensions-core/kafka-ingestion.md |  6 ++--
 .../druid/indexing/kafka/KafkaConsumerConfigs.java |  1 -
 .../druid/indexing/kafka/KafkaIndexTask.java   |  1 +
 .../druid/indexing/kafka/KafkaRecordSupplier.java  |  1 +
 .../druid/indexing/kafka/KafkaIndexTaskTest.java   | 41 ++
 5 files changed, 47 insertions(+), 3 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] himanshug merged pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better

2020-12-03 Thread GitBox


himanshug merged pull request #10551:
URL: https://github.com/apache/druid/pull/10551


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] lgtm-com[bot] commented on pull request #10628: map-string-string dimension column contrib extension

2020-12-03 Thread GitBox


lgtm-com[bot] commented on pull request #10628:
URL: https://github.com/apache/druid/pull/10628#issuecomment-738488519


   This pull request **introduces 2 alerts** when merging 
f306b31f55e358c4c3f579ae05dd8f830147c3ee into 
ae6c43de71f9983564b440f8db851e5e7f1ae16d - [view on 
LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-eeba6ddea50cbe3b00bada4c9dacc3a2a2d5da4e)
   
   **new alerts:**
   
   * 1 for Result of multiplication cast to wider type
   * 1 for Potential input resource leak



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] himanshug commented on pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better

2020-12-03 Thread GitBox


himanshug commented on pull request #10551:
URL: https://github.com/apache/druid/pull/10551#issuecomment-738482887


   +1 after the build



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] himanshug commented on a change in pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better

2020-12-03 Thread GitBox


himanshug commented on a change in pull request #10551:
URL: https://github.com/apache/druid/pull/10551#discussion_r535759359



##
File path: 
extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaConsumerConfigs.java
##
@@ -31,14 +31,14 @@
 public class KafkaConsumerConfigs
 {
 
-  public static Map getConsumerProperties()
+  public static Map getConsumerProperties(Map 
customerConsumerProperties)

Review comment:
   > we need to ensure that the deserialize between new IoConfig and old 
IoConfig is completely consistent and smooth
   
   
   well, my main concern was the `consumeTransactionally` param which has been 
removed . Though I fail to see why simply adding the default to 
`KafkaSupervisorIOConfig.java` would not have same behavior functionally (or 
for backward compatibility as well)... but I am not so worried about the 
specifics at this point as it is easy to change later if needed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] techdocsmith opened a new pull request #10629: update syntax for golbal cached uri lookups

2020-12-03 Thread GitBox


techdocsmith opened a new pull request #10629:
URL: https://github.com/apache/druid/pull/10629


   Fixes an issue with the syntax for columns in cached global lookups
   
   ### Description
   
   Updates the docs to be inline with the syntax displayed in the Druid UI.
   
   
   This PR has:
   - [x] been self-reviewed.
  - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   
   
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson commented on pull request #10543: Avatica protobuf

2020-12-03 Thread GitBox


jihoonson commented on pull request #10543:
URL: https://github.com/apache/druid/pull/10543#issuecomment-738466273


   @lkm besides my comments above, would you please fix the conflicts?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson commented on a change in pull request #10543: Avatica protobuf

2020-12-03 Thread GitBox


jihoonson commented on a change in pull request #10543:
URL: https://github.com/apache/druid/pull/10543#discussion_r535741400



##
File path: 
sql/src/main/java/org/apache/druid/sql/avatica/DruidAvaticaProtobufHandler.java
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.avatica;
+
+import com.google.inject.Inject;
+import org.apache.calcite.avatica.remote.LocalService;
+import org.apache.calcite.avatica.remote.Service;
+import org.apache.calcite.avatica.server.AvaticaProtobufHandler;
+import org.apache.druid.guice.annotations.Self;
+import org.apache.druid.server.DruidNode;
+import org.eclipse.jetty.server.Request;
+
+import javax.servlet.ServletException;
+import javax.servlet.http.HttpServletRequest;
+import javax.servlet.http.HttpServletResponse;
+import java.io.IOException;
+
+public class DruidAvaticaProtobufHandler extends AvaticaProtobufHandler
+{
+  public static final String AVATICA_PATH = "/druid/v2/sql/avatica-pb/";

Review comment:
   I see. It sounds reasonable to me.

##
File path: 
server/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java
##
@@ -119,6 +124,7 @@ private static void handleException(HttpServletResponse 
response, ObjectMapper o
   private final RequestLogger requestLogger;
   private final GenericQueryMetricsFactory queryMetricsFactory;
   private final AuthenticatorMapper authenticatorMapper;
+  private final transient ProtobufTranslation protobufTranslation;

Review comment:
   Hmm, I didn't know that the servlet is serializable. However, [it seems 
that Jetty doesn't support serializing servlets 
anyway](https://github.com/eclipse/jetty.project/issues/5169). I would suggest 
to not add it since it can be misleading as other variables are not transient.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] himanshug commented on issue #10626: Whether HDFS Dependencies Can Be Removed in the Future Development of Druid?

2020-12-03 Thread GitBox


himanshug commented on issue #10626:
URL: https://github.com/apache/druid/issues/10626#issuecomment-738463834


   curious about what/how-much resources are being consumed (assuming you don't 
add `hdfs-storage` to extensions list anyways) ?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] himanshug opened a new pull request #10628: map-string-string dimension column contrib extension

2020-12-03 Thread GitBox


himanshug opened a new pull request #10628:
URL: https://github.com/apache/druid/pull/10628


   ### Description
   
   Introduces a "mapStringString" complex type dimension column. All the code 
lives in an isolated contrib extension and implements relevant extensible Druid 
core interfaces e.g. ComplexColumn, ComplexMetricSerde, DimensionHandler, 
DimensionIndexer, DimensionMerger etc etc.
   
   It allows ingestion of `Map` column in users input data as a 
single `dimension` column . At query time, user can access individual keys in 
the map as if they were simple `string` dimension columns (see 
`MapStringStringKeyVirtualColumn.java` ) or column can be accessed as records 
of `Map` type but that does not have much use in Druid query 
layer for now aside from `select` query.
   
   User would have to implement their own `InputRowParser` to supply  
`Map` records for ingestion , we used one that was very specific 
to our needs.
   
   This extension also serves as an example of how users can create their own 
dimension columns for the specific needs and also [indirectly] tests the 
dimension extensibility support introduced in #10277
   
   Future: Now that `VirtualColumn` interface has been enhanced to support 
vectorization, maybe the virtual column implementation code in this extension 
could also be improved to support that for [potentially] better performance.
   
   
   
   This PR has:
   - [x] been self-reviewed.
  - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [x] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [x] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [x] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [x] been tested in a test Druid cluster.
   
   
   
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson commented on issue #10626: Whether HDFS Dependencies Can Be Removed in the Future Development of Druid?

2020-12-03 Thread GitBox


jihoonson commented on issue #10626:
URL: https://github.com/apache/druid/issues/10626#issuecomment-738460145


   This has been my wish for a long time. Ideally, HDFS dependencies should be 
included only when you include Hadoop-related extensions. However, I don't 
think it could be done easily since it will require restructuring and 
refactorying lots of code. I don't think it could be done anytime soon. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid] branch master updated: Add an integration test for HTTP inputSource (#10620)

2020-12-03 Thread jihoonson
This is an automated email from the ASF dual-hosted git repository.

jihoonson pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
 new ae6c43d  Add an integration test for HTTP inputSource (#10620)
ae6c43d is described below

commit ae6c43de71f9983564b440f8db851e5e7f1ae16d
Author: Jihoon Son 
AuthorDate: Thu Dec 3 15:51:56 2020 -0800

Add an integration test for HTTP inputSource (#10620)
---
 .../druid/tests/indexer/ITHttpInputSourceTest.java | 53 
 .../wikipedia_http_inputsource_queries.json| 98 ++
 .../indexer/wikipedia_http_inputsource_task.json   | 90 
 3 files changed, 241 insertions(+)

diff --git 
a/integration-tests/src/test/java/org/apache/druid/tests/indexer/ITHttpInputSourceTest.java
 
b/integration-tests/src/test/java/org/apache/druid/tests/indexer/ITHttpInputSourceTest.java
new file mode 100644
index 000..69d1eea
--- /dev/null
+++ 
b/integration-tests/src/test/java/org/apache/druid/tests/indexer/ITHttpInputSourceTest.java
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.indexer;
+
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.io.Closeable;
+import java.io.IOException;
+import java.util.UUID;
+
+@Test(groups = TestNGGroup.INPUT_SOURCE)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITHttpInputSourceTest extends AbstractITBatchIndexTest
+{
+  private static final String INDEX_TASK = 
"/indexer/wikipedia_http_inputsource_task.json";
+  private static final String INDEX_QUERIES_RESOURCE = 
"/indexer/wikipedia_http_inputsource_queries.json";
+
+  @Test
+  public void doTest() throws IOException
+  {
+final String indexDatasource = "wikipedia_http_inputsource_test_" + 
UUID.randomUUID();
+try (final Closeable ignored1 = unloader(indexDatasource + 
config.getExtraDatasourceNameSuffix())) {
+  doIndexTest(
+  indexDatasource,
+  INDEX_TASK,
+  INDEX_QUERIES_RESOURCE,
+  false,
+  true,
+  true
+  );
+}
+  }
+}
diff --git 
a/integration-tests/src/test/resources/indexer/wikipedia_http_inputsource_queries.json
 
b/integration-tests/src/test/resources/indexer/wikipedia_http_inputsource_queries.json
new file mode 100644
index 000..11496c2
--- /dev/null
+++ 
b/integration-tests/src/test/resources/indexer/wikipedia_http_inputsource_queries.json
@@ -0,0 +1,98 @@
+[
+{
+"description": "timeseries, 1 agg, all",
+"query":{
+"queryType" : "timeBoundary",
+"dataSource": "%%DATASOURCE%%"
+},
+"expectedResults":[
+{
+"timestamp" : "2016-06-27T00:00:11.000Z",
+"result" : {
+"minTime" : "2016-06-27T00:00:11.000Z",
+"maxTime" : "2016-06-27T21:31:02.000Z"
+}
+}
+]
+},
+{
+"description": "timeseries, datasketch aggs, all",
+"query":{
+"queryType" : "timeseries",
+"dataSource": "%%DATASOURCE%%",
+"granularity":"day",
+"intervals":[
+"2016-06-27/P1D"
+],
+"filter":null,
+"aggregations":[
+{
+"type": "HLLSketchMerge",
+"name": "approxCountHLL",
+"fieldName": "HLLSketchBuild",
+"lgK": 12,
+"tgtHllType": "HLL_4",
+"round": true
+},
+{
+"type":"thetaSketch",
+"name":"approxCountTheta",
+"fieldName":"thetaSketch",
+"size":16384,
+"shouldFinalize":true,
+"isInputThetaSketch":false,
+"errorBoundsStdDev":null
+},
+{
+

[GitHub] [druid] jihoonson merged pull request #10620: Add an integration test for HTTP inputSource

2020-12-03 Thread GitBox


jihoonson merged pull request #10620:
URL: https://github.com/apache/druid/pull/10620


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson commented on pull request #10624: Grouping id 2

2020-12-03 Thread GitBox


jihoonson commented on pull request #10624:
URL: https://github.com/apache/druid/pull/10624#issuecomment-738449988


   @abhishekagarwal87 thanks for trying it out. As I commented in 
https://github.com/apache/druid/pull/10518#discussion_r535718040, I prefer the 
original approach to this alternative.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson commented on a change in pull request #10518: Add grouping_id function

2020-12-03 Thread GitBox


jihoonson commented on a change in pull request #10518:
URL: https://github.com/apache/druid/pull/10518#discussion_r535700672



##
File path: 
processing/src/main/java/org/apache/druid/query/aggregation/GroupingAggregatorFactory.java
##
@@ -0,0 +1,282 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+import org.apache.druid.annotations.EverythingIsNonnullByDefault;
+import org.apache.druid.query.aggregation.constant.LongConstantAggregator;
+import 
org.apache.druid.query.aggregation.constant.LongConstantBufferAggregator;
+import 
org.apache.druid.query.aggregation.constant.LongConstantVectorAggregator;
+import org.apache.druid.query.cache.CacheKeyBuilder;
+import org.apache.druid.segment.ColumnInspector;
+import org.apache.druid.segment.ColumnSelectorFactory;
+import org.apache.druid.segment.column.ValueType;
+import org.apache.druid.segment.vector.VectorColumnSelectorFactory;
+import org.apache.druid.utils.CollectionUtils;
+
+import javax.annotation.Nullable;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.List;
+import java.util.Objects;
+import java.util.Set;
+
+@EverythingIsNonnullByDefault
+public class GroupingAggregatorFactory extends AggregatorFactory

Review comment:
   Please add some javadoc. It would be nice to include but not necessarily 
limited to the followings.
   - This factory is for computing `grouping` function.
   - This factory can create `LongConstant*Aggregators`. Unlike other 
aggregators, these `LongConstant*Aggregators` created by this factory does 
nothing with computing `grouping` function. Instead, they are used just to hold 
the positions of `grouping` results in the `ResultRow`.
   - The actual computation of `grouping` function is done before processing 
each subtotal. The result of `LongConstant*Aggregators` is _not_ used but 
replaced with the precomputed result when iterating the result of subtotal 
computation. See `RowBasedGrouperHelper.makeGrouperIterator()` for more details.
   - There could be some different approach to implement the same 
functionality. We chose this approach because it seems more stable and less 
complex than others. See 
https://github.com/apache/druid/pull/10518#discussion_r532941216 for more 
details.

##
File path: 
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/RowBasedGrouperHelper.java
##
@@ -576,7 +594,12 @@ private static ValueExtractFunction 
makeValueExtractFunction(
   // Add aggregations.
   final int resultRowAggregatorStart = 
query.getResultRowAggregatorStart();
   for (int i = 0; i < entry.getValues().length; i++) {
-resultRow.set(resultRowAggregatorStart + i, entry.getValues()[i]);
+if (dimsToInclude != null && groupingAggregatorsBitSet.get(i)) {
+  resultRow.set(resultRowAggregatorStart + i, 
groupingAggregatorValues[i]);

Review comment:
   Please add some comment about what is happening here.

##
File path: 
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/RowBasedGrouperHelper.java
##
@@ -576,7 +594,12 @@ private static ValueExtractFunction 
makeValueExtractFunction(
   // Add aggregations.
   final int resultRowAggregatorStart = 
query.getResultRowAggregatorStart();
   for (int i = 0; i < entry.getValues().length; i++) {
-resultRow.set(resultRowAggregatorStart + i, entry.getValues()[i]);
+if (dimsToInclude != null && groupingAggregatorsBitSet.get(i)) {

Review comment:
   I talked to @abhishekagarwal87 offline. My biggest concern is that the 
special handling for `GroupingAggregatorFactory` seems pretty magical because 
the behaviour of the factory and its aggregators is different from others. What 
I suggested above is making it more special but less magical, because I think 
it's less confusing. @abhishekagarwal87's concern is mostly around the 
complexity of 

[druid] branch master updated (813e187 -> 2cd017b)

2020-12-03 Thread jihoonson
This is an automated email from the ASF dual-hosted git repository.

jihoonson pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from 813e187  make dimension column extensible with COMPLEX type (#10277)
 add 2cd017b  Fix the config initialization on container restart (#10458)

No new revisions were added by this update.

Summary of changes:
 distribution/docker/druid.sh | 1 +
 1 file changed, 1 insertion(+)


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson merged pull request #10458: Fix the config initialization on container restart

2020-12-03 Thread GitBox


jihoonson merged pull request #10458:
URL: https://github.com/apache/druid/pull/10458


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] jihoonson commented on pull request #10458: Fix the config initialization on container restart

2020-12-03 Thread GitBox


jihoonson commented on pull request #10458:
URL: https://github.com/apache/druid/pull/10458#issuecomment-738275612


   The failed Travis CI seems an old one which was replaced by the new one 
based on GitHub action. The new CI passed, so I'm merging this PR.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] senthilkv commented on pull request #10495: Added Request log updates for status change on cooridnator / overlord…

2020-12-03 Thread GitBox


senthilkv commented on pull request #10495:
URL: https://github.com/apache/druid/pull/10495#issuecomment-738215212


   @himanshug Sure will look into it



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10624: Grouping id 2

2020-12-03 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10624:
URL: https://github.com/apache/druid/pull/10624#discussion_r535466228



##
File path: 
processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java
##
@@ -233,7 +235,7 @@ public GroupByQueryQueryToolChest(
 return groupByStrategy.processSubtotalsSpec(
 query,
 resource,
-groupByStrategy.processSubqueryResult(subquery, query, resource, 
finalizingResults, false)
+groupByStrategy.processSubqueryResult(subquery, 
withoutSubtotals(query), resource, finalizingResults, false)

Review comment:
   some context is in here but not helpful - 
https://github.com/apache/druid/pull/5280/files#r181885318
   I am concerned that modifying the query cause some problem down the line. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid] branch master updated (c94be8a -> 813e187)

2020-12-03 Thread himanshug
This is an automated email from the ASF dual-hosted git repository.

himanshug pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from c94be8a  Revert "Update google client libraries (#10536)" (#10599)
 add 813e187  make dimension column extensible with COMPLEX type (#10277)

No new revisions were added by this update.

Summary of changes:
 ...tSegment.java => DimensionHandlerProvider.java} |   9 +-
 .../druid/segment/DimensionHandlerUtils.java   |  31 ++
 .../apache/druid/segment/column/ColumnBuilder.java |   6 ++
 .../druid/segment/column/ColumnCapabilities.java   |   7 ++
 .../segment/column/ColumnCapabilitiesImpl.java |  25 +
 .../segment/incremental/IncrementalIndex.java  |  13 ++-
 .../segment/serde/ComplexColumnPartSerde.java  |   2 +-
 .../druid/segment/serde/ComplexMetricSerde.java|  16 
 .../query/aggregation/AggregationTestHelper.java   | 106 ++---
 ...ilsTest.java => DimensionHandlerUtilsTest.java} |  36 ---
 .../java/org/apache/druid/segment/TestIndex.java   |   6 +-
 .../druid/segment/column/ColumnBuilderTest.java|  16 ++--
 ...ColumnSupportedComplexColumnSerializerTest.java |   2 +-
 13 files changed, 219 insertions(+), 56 deletions(-)
 copy processing/src/main/java/org/apache/druid/segment/{AbstractSegment.java 
=> DimensionHandlerProvider.java} (76%)
 copy processing/src/test/java/org/apache/druid/segment/{IntListUtilsTest.java 
=> DimensionHandlerUtilsTest.java} (56%)
 copy 
extensions-contrib/ambari-metrics-emitter/src/test/java/org/apache/druid/emitter/ambari/metrics/DruidToWhiteListBasedConverterTest.java
 => 
processing/src/test/java/org/apache/druid/segment/column/ColumnBuilderTest.java 
(72%)


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] himanshug merged pull request #10277: make dimension column extensible with COMPLEX type

2020-12-03 Thread GitBox


himanshug merged pull request #10277:
URL: https://github.com/apache/druid/pull/10277


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] suneet-s commented on pull request #10599: Revert "Update google client libraries (#10536)"

2020-12-03 Thread GitBox


suneet-s commented on pull request #10599:
URL: https://github.com/apache/druid/pull/10599#issuecomment-738125916


   Thanks @nishantmonu51 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] belugabehr commented on pull request #10627: Remove JODA dependency from druid-basic-security package

2020-12-03 Thread GitBox


belugabehr commented on pull request #10627:
URL: https://github.com/apache/druid/pull/10627#issuecomment-738097479


   For now, I had to add some new APIs to the `druid-core` library to support 
JDK Time and JODA Time concurrently.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] belugabehr opened a new pull request #10627: Remove JODA dependency from druid-basic-security package

2020-12-03 Thread GitBox


belugabehr opened a new pull request #10627:
URL: https://github.com/apache/druid/pull/10627


   
   
   Fixes #8225
   
   
   
   
   
   ### Description
   
   Remove JODA time from Druid, one package at a time since it would be way to 
unwieldy to review in one shot.
   
   
   
   
   
   
   
   
   
   
   
   This PR has:
   - [x] been self-reviewed.
  - [ ] using the [concurrency 
checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md)
 (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked 
related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in 
[licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code 
wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths, 
ensuring the threshold for [code 
coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md)
 is met.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] belugabehr commented on issue #8225: Remove JODA Time dependency

2020-12-03 Thread GitBox


belugabehr commented on issue #8225:
URL: https://github.com/apache/druid/issues/8225#issuecomment-738093734


   Re-open ?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] zhangyue19921010 commented on a change in pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better

2020-12-03 Thread GitBox


zhangyue19921010 commented on a change in pull request #10551:
URL: https://github.com/apache/druid/pull/10551#discussion_r535297518



##
File path: 
extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaConsumerConfigs.java
##
@@ -31,14 +31,14 @@
 public class KafkaConsumerConfigs
 {
 
-  public static Map getConsumerProperties()
+  public static Map getConsumerProperties(Map 
customerConsumerProperties)

Review comment:
   Hi @himanshug Thanks for your review! 
   I agree with `it would likely be simpler to leave this alone, remove the 
line props.put("isolation.level".. altogether`.  But I think maybe it is not 
very appropriate to modify 
`https://github.com/apache/druid/blob/master/extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/supervisor/KafkaSupervisorIOConfig.java#L78`
 
   
   I have tried to do changes in  `KafkaSupervisorIOConfig.java` as you said 
(https://github.com/apache/druid/pull/10551/commits/cc59d8bbe0175faa269a4bdc06a24f13f6dd220c#)
   ```
   this.consumerProperties = Preconditions.checkNotNull(consumerProperties, 
"consumerProperties");
   consumerProperties.putIfAbsent("isolation.level", "read_committed");
   Preconditions.checkNotNull(
   ```
   and functional testing is fine while UT is failed at 
https://github.com/apache/druid/blob/ac6d703814ccb5b258c586b63e0bc33d669e0f57/extensions-core/kafka-indexing-service/src/test/java/org/apache/druid/indexing/kafka/KafkaIOConfigTest.java#L363
 and 
https://github.com/apache/druid/blob/ac6d703814ccb5b258c586b63e0bc33d669e0f57/extensions-core/kafka-indexing-service/src/test/java/org/apache/druid/indexing/kafka/KafkaIOConfigTest.java#L332
   
   These failure means changes in KafkaSupervisorIOConfig may cause 
compatibility issues when deserialize from old IoConfig to new IoConfig or 
deserialize from new IoConfig to old IoConfig. In other words, we need to 
ensure that the deserialize between new IoConfig and old IoConfig is completely 
consistent and smooth.
   
   So I switched to a safer way in the latest commit.
   All the changes is tested on Dev cluster.
   PTAL :)
   
   





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[druid] branch master updated (7e95228 -> c94be8a)

2020-12-03 Thread nishant
This is an automated email from the ASF dual-hosted git repository.

nishant pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.


from 7e95228  introduce DynamicConfigProvider interface and make kafka 
consumer props extensible (#10309)
 add c94be8a  Revert "Update google client libraries (#10536)" (#10599)

No new revisions were added by this update.

Summary of changes:
 licenses.yaml | 10 +-
 pom.xml   | 11 +--
 2 files changed, 10 insertions(+), 11 deletions(-)


-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] nishantmonu51 commented on pull request #10599: Revert "Update google client libraries (#10536)"

2020-12-03 Thread GitBox


nishantmonu51 commented on pull request #10599:
URL: https://github.com/apache/druid/pull/10599#issuecomment-738039094


   @suneet-s : lets merge this change for now and i will create a new PR with 
the changes again. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] nishantmonu51 merged pull request #10599: Revert "Update google client libraries (#10536)"

2020-12-03 Thread GitBox


nishantmonu51 merged pull request #10599:
URL: https://github.com/apache/druid/pull/10599


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query

2020-12-03 Thread GitBox


kroeders commented on a change in pull request #10428:
URL: https://github.com/apache/druid/pull/10428#discussion_r535256895



##
File path: 
server/src/main/java/org/apache/druid/client/CachingClusteredClient.java
##
@@ -812,7 +812,7 @@ String computeResultLevelCachingEtag(
   Hasher hasher = Hashing.sha1().newHasher();
   boolean hasOnlyHistoricalSegments = true;
   for (SegmentServerSelector p : segments) {
-if (!p.getServer().pick().getServer().isSegmentReplicationTarget()) {
+if 
(!p.getServer().pick(null).getServer().isSegmentReplicationTarget()) {

Review comment:
   this was the only functional place that the parameterless pick() was 
called that I mentioned in the other comment @jihoonson , the Query is 
available inside of SpecificQueryRunnable where this is called, so I could 
alternatively pass the query into computeResultLevelCachingEtag and add a null 
check on this line, what do you think?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] kroeders commented on pull request #10428: allow server selection to be aware of query

2020-12-03 Thread GitBox


kroeders commented on pull request #10428:
URL: https://github.com/apache/druid/pull/10428#issuecomment-738015529


   thanks for the feedback and taking the time to look @jihoonson , I 
appreciate it! I updated the branch per the comments and left a few notes that 
I found while making the changes.  



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query

2020-12-03 Thread GitBox


kroeders commented on a change in pull request #10428:
URL: https://github.com/apache/druid/pull/10428#discussion_r535252310



##
File path: 
server/src/main/java/org/apache/druid/client/selector/TierSelectorStrategy.java
##
@@ -47,6 +49,22 @@
   List pick(
   Int2ObjectRBTreeMap> prioritizedServers,
   DataSegment segment,
-  int numServersToPick
-  );
+  int numServersToPick);
+
+  @Nullable
+  default  QueryableDruidServer pick(Query query,
+  Int2ObjectRBTreeMap> prioritizedServers,
+  DataSegment segment)
+  {
+return pick(prioritizedServers, segment);

Review comment:
   This was done for backward compatibility - I was trying to make this 
change as minimal as possible on the first cut. We don't have any custom 
implementations here and I haven't seen any on a cursory look, but I don't want 
to presume. I think it's cleaner to remove the ones that don't take Query and 
having an interface of all defaults that call each other makes the abstract 
methods less obvious. What do you think? I'm happy to defer to your judgement. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query

2020-12-03 Thread GitBox


kroeders commented on a change in pull request #10428:
URL: https://github.com/apache/druid/pull/10428#discussion_r535246679



##
File path: 
server/src/main/java/org/apache/druid/client/selector/TierSelectorStrategy.java
##
@@ -47,6 +49,22 @@
   List pick(

Review comment:
   I marked the non-query ones as deprecated throughout





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query

2020-12-03 Thread GitBox


kroeders commented on a change in pull request #10428:
URL: https://github.com/apache/druid/pull/10428#discussion_r535246123



##
File path: 
server/src/main/java/org/apache/druid/client/selector/ServerSelector.java
##
@@ -162,12 +163,20 @@ public boolean isEmpty()
   @Nullable
   @Override
   public QueryableDruidServer pick()
+  {
+if (!historicalServers.isEmpty()) {
+  return strategy.pick(historicalServers, segment.get());
+}
+return strategy.pick(realtimeServers, segment.get());
+  }
+
+  public  QueryableDruidServer pick(Query query)

Review comment:
   I removed it, the method was called a couple of times in tests (which 
seems OK) and in the etag creation part, which seems OK to me. It looks a 
little funny to pass in null as a single parameter, but I could argue either 
way vs having a method that isn't used elsewhere. 

##
File path: 
server/src/main/java/org/apache/druid/client/selector/TierSelectorStrategy.java
##
@@ -47,6 +49,22 @@
   List pick(
   Int2ObjectRBTreeMap> prioritizedServers,
   DataSegment segment,
-  int numServersToPick
-  );
+  int numServersToPick);
+
+  @Nullable
+  default  QueryableDruidServer pick(Query query,

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query

2020-12-03 Thread GitBox


kroeders commented on a change in pull request #10428:
URL: https://github.com/apache/druid/pull/10428#discussion_r535244203



##
File path: 
server/src/main/java/org/apache/druid/client/selector/TierSelectorStrategy.java
##
@@ -47,6 +49,22 @@
   List pick(
   Int2ObjectRBTreeMap> prioritizedServers,
   DataSegment segment,
-  int numServersToPick
-  );
+  int numServersToPick);
+
+  @Nullable
+  default  QueryableDruidServer pick(Query query,
+  Int2ObjectRBTreeMap> prioritizedServers,
+  DataSegment segment)
+  {
+return pick(prioritizedServers, segment);
+  }
+
+  default  List pick(
+  Query query,

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query

2020-12-03 Thread GitBox


kroeders commented on a change in pull request #10428:
URL: https://github.com/apache/druid/pull/10428#discussion_r535243969



##
File path: 
server/src/main/java/org/apache/druid/client/selector/ServerSelectorStrategy.java
##
@@ -33,6 +35,17 @@
 })
 public interface ServerSelectorStrategy
 {
+  default  QueryableDruidServer pick(Query query, 
Set servers, DataSegment segment)

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] lkm commented on a change in pull request #10543: Avatica protobuf

2020-12-03 Thread GitBox


lkm commented on a change in pull request #10543:
URL: https://github.com/apache/druid/pull/10543#discussion_r535196727



##
File path: 
server/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java
##
@@ -88,6 +92,7 @@
   private static final String OBJECTMAPPER_ATTRIBUTE = 
"org.apache.druid.proxy.objectMapper";
 
   private static final int CANCELLATION_TIMEOUT_MILLIS = 500;
+  private static final long serialVersionUID = 2026655693637894524L;

Review comment:
   It's because I'm using the transient keyword below and the class is 
serializable 
([documentation](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/io/Serializable.html))
 and basically to satisfy LGMT





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] lkm commented on a change in pull request #10543: Avatica protobuf

2020-12-03 Thread GitBox


lkm commented on a change in pull request #10543:
URL: https://github.com/apache/druid/pull/10543#discussion_r535195705



##
File path: 
server/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java
##
@@ -119,6 +124,7 @@ private static void handleException(HttpServletResponse 
response, ObjectMapper o
   private final RequestLogger requestLogger;
   private final GenericQueryMetricsFactory queryMetricsFactory;
   private final AuthenticatorMapper authenticatorMapper;
+  private final transient ProtobufTranslation protobufTranslation;

Review comment:
   The servlet is serializable and the ProtobufTranslation isn't, so this 
is required





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] Mrxiashu opened a new issue #10626: Whether HDFS Dependencies Can Be Removed in the Future Development of Druid?

2020-12-03 Thread GitBox


Mrxiashu opened a new issue #10626:
URL: https://github.com/apache/druid/issues/10626


   In the production environment, HDFS occupies some resources. If HDFS 
features (such as data ingestion) are not used, is HDFS optional in future 
design?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] andrewbridge commented on issue #10609: Custom aggregation type name not registering correctly

2020-12-03 Thread GitBox


andrewbridge commented on issue #10609:
URL: https://github.com/apache/druid/issues/10609#issuecomment-737925029


   Okay, great. Does this mean I have to create a child class off of 
SimpleModule for every extension or can I use this annotations elsewhere?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] lkm commented on a change in pull request #10543: Avatica protobuf

2020-12-03 Thread GitBox


lkm commented on a change in pull request #10543:
URL: https://github.com/apache/druid/pull/10543#discussion_r535164826



##
File path: 
server/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java
##
@@ -449,6 +463,95 @@ static String getAvaticaConnectionId(Map 
requestMap)
 return (String) connectionIdObj;
   }
 
+  static String getAvaticaProtobufConnectionId(Service.Request request)
+  {
+if (request instanceof Service.CatalogsRequest) {

Review comment:
   Agreed, it's a bit fragile, however, that, I'm afraid, is the nature of 
protobufs. I'm not sure how likely they are to add new request types without 
significant overhaul to Calcite though





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] andrewbridge commented on issue #10611: Custom, multi-column aggregation best practice

2020-12-03 Thread GitBox


andrewbridge commented on issue #10611:
URL: https://github.com/apache/druid/issues/10611#issuecomment-737898313


   To follow up, I've copied examples of aggregators which do not appear to use 
an associated `Serde` class:
   
   ```java
   @Override
 public List getJacksonModules()
 {
   return ImmutableList.of(
   new SimpleModule(getClass().getSimpleName())
   .registerSubtypes(
   new NamedType(DoubleMinOfAggregatorFactory.class, 
DoubleMinOfAggregatorFactory.TYPE_NAME)
   ).addSerializer(
   DoubleMinOfCollector.class,
   DoubleMinOfCollector.Serializer.INSTANCE
   )
   );
 }
   ```
   
   This allows my tests to pass as expected. Are there any issues with using 
this pattern over the `Serde` class pattern? Or am I entirely misunderstanding 
and the two are unrelated?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] lkm commented on a change in pull request #10543: Avatica protobuf

2020-12-03 Thread GitBox


lkm commented on a change in pull request #10543:
URL: https://github.com/apache/druid/pull/10543#discussion_r535163258



##
File path: 
sql/src/main/java/org/apache/druid/sql/avatica/DruidAvaticaProtobufHandler.java
##
@@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.sql.avatica;
+
+import com.google.inject.Inject;
+import org.apache.calcite.avatica.remote.LocalService;
+import org.apache.calcite.avatica.remote.Service;
+import org.apache.calcite.avatica.server.AvaticaProtobufHandler;
+import org.apache.druid.guice.annotations.Self;
+import org.apache.druid.server.DruidNode;
+import org.eclipse.jetty.server.Request;
+
+import javax.servlet.ServletException;
+import javax.servlet.http.HttpServletRequest;
+import javax.servlet.http.HttpServletResponse;
+import java.io.IOException;
+
+public class DruidAvaticaProtobufHandler extends AvaticaProtobufHandler
+{
+  public static final String AVATICA_PATH = "/druid/v2/sql/avatica-pb/";

Review comment:
   The problem is the drivers. If we want to support JDBC and golang 
drivers, I'm not sure they would be happy with always appending a query param 
to the url. The other option I can think of that may work is to auto-detect it 
when the first letter of the post is `[` or `{` (ie. json), that just seemed a 
bit dirty to me





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] lgtm-com[bot] commented on pull request #10625: Add option for maxBytesInMemory to be percentage of JVM memory

2020-12-03 Thread GitBox


lgtm-com[bot] commented on pull request #10625:
URL: https://github.com/apache/druid/pull/10625#issuecomment-737895233


   This pull request **introduces 1 alert** when merging 
339a0ad0d077445808a025aa76ad343c81f6b5ee into 
7e9522870f019db444b4b6eb5cb3945d6545cf7a - [view on 
LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-bef2480c4f13b1545ea4b350f29596bd113742f6)
   
   **new alerts:**
   
   * 1 for Useless comparison test



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10624: Grouping id 2

2020-12-03 Thread GitBox


abhishekagarwal87 commented on a change in pull request #10624:
URL: https://github.com/apache/druid/pull/10624#discussion_r535019924



##
File path: 
processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java
##
@@ -233,7 +235,7 @@ public GroupByQueryQueryToolChest(
 return groupByStrategy.processSubtotalsSpec(
 query,
 resource,
-groupByStrategy.processSubqueryResult(subquery, query, resource, 
finalizingResults, false)
+groupByStrategy.processSubqueryResult(subquery, 
withoutSubtotals(query), resource, finalizingResults, false)

Review comment:
   this needs more thinking. why weren't we passing 
`query.withSubtotalsSpec(null)` instead of `query`





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org



[GitHub] [druid] abhishekagarwal87 opened a new pull request #10624: Grouping id 2

2020-12-03 Thread GitBox


abhishekagarwal87 opened a new pull request #10624:
URL: https://github.com/apache/druid/pull/10624


   A different way to do #10518. This avoids the grouping aggregation on 
historicals by re-working the query that is sent to historicals. Historicals 
don't get grouping aggregation at all. However, now you have a mismatch in 
result row signatures that needs to be fixed up in a few places. 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org