[GitHub] [druid] gianm commented on pull request #10267: DruidInputSource: Fix issues in column projection, timestamp handling.
gianm commented on pull request #10267:
URL: https://github.com/apache/druid/pull/10267#issuecomment-738621133

> I didn't see the ClassCastException in the log for the failed batch test though.

I finally ran this locally and was able to extract the full error. I think it's happening because `__time` is getting added to the dimensions list of an input row. (This creates a string dimension named `__time`, which violates all sorts of assumptions.) We'll need to prevent that from happening somehow. I'll look into it.

```
Caused by: java.lang.ClassCastException: org.apache.druid.segment.incremental.IncrementalIndexRowHolder cannot be cast to org.apache.druid.segment.DimensionSelector
	at org.apache.druid.segment.StringDimensionIndexer.convertUnsortedValuesToSorted(StringDimensionIndexer.java:793) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.incremental.IncrementalIndexRowIterator.lambda$makeRowPointer$0(IncrementalIndexRowIterator.java:78) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_232]
	at java.util.Iterator.forEachRemaining(Iterator.java:116) ~[?:1.8.0_232]
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) ~[?:1.8.0_232]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_232]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) ~[?:1.8.0_232]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:546) ~[?:1.8.0_232]
	at java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260) ~[?:1.8.0_232]
	at java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:505) ~[?:1.8.0_232]
	at org.apache.druid.segment.incremental.IncrementalIndexRowIterator.makeRowPointer(IncrementalIndexRowIterator.java:80) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.incremental.IncrementalIndexRowIterator.<init>(IncrementalIndexRowIterator.java:54) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.incremental.IncrementalIndexAdapter.getRows(IncrementalIndexAdapter.java:162) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.IndexMergerV9.makeMergedTimeAndDimsIterator(IndexMergerV9.java:1106) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:255) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:999) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.IndexMergerV9.persist(IndexMergerV9.java:864) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.IndexMergerV9.persist(IndexMergerV9.java:833) ~[druid-processing-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.persistHydrant(AppenderatorImpl.java:1334) ~[druid-server-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.access$100(AppenderatorImpl.java:102) ~[druid-server-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
	at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl$2.call(AppenderatorImpl.java:547) ~[druid-server-0.21.0-SNAPSHOT.jar:0.21.0-SNAPSHOT]
```

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
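To illustrate the kind of guard that would prevent this (purely a hypothetical sketch, not Druid's actual fix; the class and method names are invented), the reserved `__time` column can be filtered out of any requested dimension list before an input row is built:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: drop reserved column names from a requested
// dimension list so no string dimension named "__time" is ever created.
public class DimensionFilterSketch
{
  static final String TIME_COLUMN = "__time";

  static List<String> withoutTimeColumn(List<String> requestedDimensions)
  {
    return requestedDimensions.stream()
                              .filter(dim -> !TIME_COLUMN.equals(dim))
                              .collect(Collectors.toList());
  }

  public static void main(String[] args)
  {
    // "__time" is silently excluded; real dimensions pass through.
    List<String> dims = withoutTimeColumn(Arrays.asList("__time", "page", "user"));
    System.out.println(dims); // prints [page, user]
  }
}
```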
[GitHub] [druid] jihoonson commented on a change in pull request #10631: Fixes and tests related to the Indexer process.
jihoonson commented on a change in pull request #10631:
URL: https://github.com/apache/druid/pull/10631#discussion_r535866173

## File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/Task.java

@@ -160,8 +160,11 @@ default int getPriority()
   /**
    * Execute preflight actions for a task. This can be used to acquire locks, check preconditions, and so on. The
    * actions must be idempotent, since this method may be executed multiple times. This typically runs on the
-   * coordinator. If this method throws an exception, the task should be considered a failure.
-   *
+   * Overlord. If this method throws an exception, the task should be considered a failure.
+   *
+   * This method will not necessarily be executed before {@link #run(TaskToolbox)}, since this task readiness check
+   * may be done on a different machine from the one that actually runs the task.

Review comment: Oh, I see. I think it happens only in Indexers, because the Peon calls `isReady()` before it runs its task.
[GitHub] [druid] gianm commented on a change in pull request #10631: Fixes and tests related to the Indexer process.
gianm commented on a change in pull request #10631:
URL: https://github.com/apache/druid/pull/10631#discussion_r535855400

## File path: integration-tests/docker/environment-configs/coordinator

@@ -32,7 +32,11 @@
 druid_coordinator_startDelay=PT5S
 druid_manager_lookups_hostUpdateTimeout=PT30S
 druid_manager_lookups_period=1
 druid_manager_lookups_threadPoolSize=2
+druid_manager_config_pollDuration=PT10S
+druid_manager_rules_pollDuration=PT10S
+druid_manager_segments_pollDuration=PT2S
 druid_auth_basic_common_cacheDirectory=/tmp/authCache/coordinator
 druid_auth_unsecuredPaths=["/druid/coordinator/v1/loadqueue"]
 druid_server_https_crlPath=/tls/revocations.crl
-druid_coordinator_period_indexingPeriod=PT18S
\ No newline at end of file
+druid_coordinator_period_indexingPeriod=PT18S
+druid_coordinator_period=PT1S

Review comment: Ha ha, I think it's OK — there are going to be very few segments in the integration tests, so a coordinator run should be able to finish very quickly. This change was meant to speed up the tests.
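The `PT…` values in configs like these are ISO-8601 durations; as a quick sanity check, they parse with the standard `java.time.Duration`:

```java
import java.time.Duration;

// Parse the ISO-8601 duration literals used in the coordinator config above.
public class PollDurationCheck
{
  public static void main(String[] args)
  {
    System.out.println(Duration.parse("PT2S").getSeconds());  // prints 2
    System.out.println(Duration.parse("PT10S").getSeconds()); // prints 10
    System.out.println(Duration.parse("PT18S").toMillis());   // prints 18000
  }
}
```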
[GitHub] [druid] gianm commented on pull request #10631: Fixes and tests related to the Indexer process.
gianm commented on pull request #10631:
URL: https://github.com/apache/druid/pull/10631#issuecomment-738583131

> #10538 and #10258 will be fixed by this PR. Perhaps #9820 as well.

I agree, it should fix all of those. I commented in those patches.
[GitHub] [druid] gianm edited a comment on pull request #10631: Fixes and tests related to the Indexer process.
gianm edited a comment on pull request #10631:
URL: https://github.com/apache/druid/pull/10631#issuecomment-738583131

> #10538 and #10258 will be fixed by this PR. Perhaps #9820 as well.

I agree, it should fix all of those. I commented in those issues.
[GitHub] [druid] gianm commented on issue #10258: Regression: Indexer nodes dont announce realtime segments in 0.19
gianm commented on issue #10258:
URL: https://github.com/apache/druid/issues/10258#issuecomment-738583006

#10631 should fix this.
[GitHub] [druid] gianm commented on issue #10538: can not read realtime data when using indexer process instead of middleManager peon process to ingest metrics from kafka
gianm commented on issue #10538:
URL: https://github.com/apache/druid/issues/10538#issuecomment-738582968

#10631 should fix this.
[GitHub] [druid] gianm commented on issue #9820: "taskLockHelper is not initialized yet"
gianm commented on issue #9820:
URL: https://github.com/apache/druid/issues/9820#issuecomment-738582813

#10631 should fix this.
[GitHub] [druid] gianm commented on a change in pull request #10631: Fixes and tests related to the Indexer process.
gianm commented on a change in pull request #10631:
URL: https://github.com/apache/druid/pull/10631#discussion_r535854619

## File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/Task.java

@@ -160,8 +160,11 @@ default int getPriority()
   /**
    * Execute preflight actions for a task. This can be used to acquire locks, check preconditions, and so on. The
    * actions must be idempotent, since this method may be executed multiple times. This typically runs on the
-   * coordinator. If this method throws an exception, the task should be considered a failure.
-   *
+   * Overlord. If this method throws an exception, the task should be considered a failure.
+   *
+   * This method will not necessarily be executed before {@link #run(TaskToolbox)}, since this task readiness check
+   * may be done on a different machine from the one that actually runs the task.

Review comment: It won't necessarily be called on that same Task object (it might be called on a different instance that represents the same actual task).
[GitHub] [druid] jihoonson commented on a change in pull request #10631: Fixes and tests related to the Indexer process.
jihoonson commented on a change in pull request #10631:
URL: https://github.com/apache/druid/pull/10631#discussion_r535850965

## File path: integration-tests/docker/environment-configs/coordinator

@@ -32,7 +32,11 @@
 druid_coordinator_startDelay=PT5S
 druid_manager_lookups_hostUpdateTimeout=PT30S
 druid_manager_lookups_period=1
 druid_manager_lookups_threadPoolSize=2
+druid_manager_config_pollDuration=PT10S
+druid_manager_rules_pollDuration=PT10S
+druid_manager_segments_pollDuration=PT2S
 druid_auth_basic_common_cacheDirectory=/tmp/authCache/coordinator
 druid_auth_unsecuredPaths=["/druid/coordinator/v1/loadqueue"]
 druid_server_https_crlPath=/tls/revocations.crl
-druid_coordinator_period_indexingPeriod=PT18S
\ No newline at end of file
+druid_coordinator_period_indexingPeriod=PT18S
+druid_coordinator_period=PT1S

Review comment: Is this too aggressive? :sweat_smile:

## File path: indexing-service/src/main/java/org/apache/druid/indexing/common/task/Task.java

@@ -160,8 +160,11 @@ default int getPriority()
   /**
    * Execute preflight actions for a task. This can be used to acquire locks, check preconditions, and so on. The
    * actions must be idempotent, since this method may be executed multiple times. This typically runs on the
-   * coordinator. If this method throws an exception, the task should be considered a failure.
-   *
+   * Overlord. If this method throws an exception, the task should be considered a failure.
+   *
+   * This method will not necessarily be executed before {@link #run(TaskToolbox)}, since this task readiness check
+   * may be done on a different machine from the one that actually runs the task.

Review comment: Hmm, how can this happen? I thought `Task.isReady()` is always called before `Task.run()`, because a task can be scheduled only when `Task.isReady()` returns true. `Task.run()` will be called after the task is scheduled on some Indexer or MiddleManager.
[GitHub] [druid] maytasm edited a comment on pull request #10625: Add option for maxBytesInMemory to be percentage of JVM memory
maytasm edited a comment on pull request #10625:
URL: https://github.com/apache/druid/pull/10625#issuecomment-738578884

The Travis failure is the processing-module test job failing its code-coverage check. Note that all tests passed. Coverage is missing on a single changed line in OnheapIncrementalIndex.java. I don't think it's worthwhile (or possible) to test, so we should just ignore the code coverage.

```
-- org/apache/druid/segment/incremental/OnheapIncrementalIndex.java --
463        }
464
465        @Override
466        public long getMaxJvmMemory()
467        {
468  F | L   return JvmUtils.getRuntimeInfo().getMaxHeapSizeBytes();
--
Diff coverage statistics:
--
|   lines    |  branches  | functions  |   path
--
|  0% (0/1)  | 100% (0/0) | 100% (1/1) | org/apache/druid/segment/incremental/OnheapIncrementalIndex.java
--
Total diff coverage:
  - lines: 0% (0/1)
  - branches: 100% (0/0)
  - functions: 100% (1/1)

ERROR: Insufficient line coverage of 0% (0/1). Required 50%.
```
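For context on what that uncovered line supports: the PR lets `maxBytesInMemory` be specified as a fraction of the JVM heap. A minimal sketch of that interpretation (the method name and the in-(0,1) convention here are assumptions for illustration, not the PR's exact code):

```java
// Hypothetical sketch: treat a maxBytesInMemory value strictly between
// 0 and 1 as a fraction of the JVM's max heap; any other value is taken
// as an absolute byte count.
public class MaxBytesSketch
{
  static long effectiveMaxBytesInMemory(double configuredValue, long maxJvmMemory)
  {
    if (configuredValue > 0 && configuredValue < 1) {
      return (long) (configuredValue * maxJvmMemory);
    }
    return (long) configuredValue;
  }

  public static void main(String[] args)
  {
    // 0.25 of an 8 GiB heap -> 2 GiB
    System.out.println(effectiveMaxBytesInMemory(0.25, 8L << 30)); // prints 2147483648
    // An absolute value passes through unchanged.
    System.out.println(effectiveMaxBytesInMemory(500_000_000, 8L << 30)); // prints 500000000
  }
}
```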
[GitHub] [druid] abhishekagarwal87 closed pull request #10624: Grouping id 2
abhishekagarwal87 closed pull request #10624:
URL: https://github.com/apache/druid/pull/10624
[GitHub] [druid] jihoonson commented on pull request #10631: Fixes and tests related to the Indexer process.
jihoonson commented on pull request #10631:
URL: https://github.com/apache/druid/pull/10631#issuecomment-738570769

https://github.com/apache/druid/issues/10538 and https://github.com/apache/druid/issues/10258 will be fixed by this PR. Perhaps https://github.com/apache/druid/issues/9820 as well.
[GitHub] [druid] gianm opened a new pull request #10631: Fixes and tests related to the Indexer process.
gianm opened a new pull request #10631:
URL: https://github.com/apache/druid/pull/10631

Three bugs fixed:

1) Indexers would not announce themselves as segment servers if they did not have storage locations defined. This used to work, but was broken in #9971. Fixed by adding an "isSegmentServer" method to ServerType and updating SegmentLoadDropHandler to always announce if this method returns true.
2) Certain batch task types were written in a way that assumed "isReady" would be called before "run", which is not guaranteed. In particular, they relied on it in order to initialize "taskLockHelper". Fixed by updating AbstractBatchIndexTask to ensure "isReady" is called before "run" for these tasks.
3) UnifiedIndexerAppenderatorsManager did not properly handle complex datasources. Introduced DataSourceAnalysis in order to fix this.

Test changes:

1) Add a new "docker-compose.cli-indexer.yml" config that spins up an Indexer instead of a MiddleManager.
2) Introduce a "USE_INDEXER" environment variable that determines whether docker-compose will start an Indexer or a MiddleManager.
3) Duplicate all the jdk8 tests and run them in both MiddleManager and Indexer mode.
4) Various adjustments to encourage fail-fast errors in the Docker build scripts.
5) Various adjustments to speed up integration tests and reduce memory usage.
6) Add another Mac-specific approach to determining a machine's own IP. This was useful on my development machine.
7) Update the segment-count check in ITCompactionTaskTest to eliminate a race condition (it was looking for 6 segments, which exist together only briefly, until the older 4 are marked unused).

Javadoc updates:

1) AbstractBatchIndexTask: Added javadocs to determineLockGranularityXXX that make it clear when taskLockHelper will be initialized as a side effect. (Related to the second bug above.)
2) Task: Clarified that "isReady" is not guaranteed to be called before "run". It was already implied, but now it's explicit.
3) ZkCoordinator: Clarified deprecation message.
4) DataSegmentServerAnnouncer: Clarified deprecation message.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
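As a minimal sketch of the second bug fix (hypothetical code, not Druid's actual AbstractBatchIndexTask; all names here are invented), run() can defensively perform the readiness check when it has not already happened on this instance, so state initialized by isReady() is always set up before the task body executes:

```java
// Hypothetical sketch: guarantee the isReady() side effects (here, a
// stand-in for taskLockHelper) happen before the task body runs, even
// when the scheduler never called isReady() on this instance.
public class BatchTaskSketch
{
  private boolean readinessChecked = false;
  private Object taskLockHelper = null; // stands in for the real helper

  public boolean isReady()
  {
    taskLockHelper = new Object(); // side effect: initialize the helper
    readinessChecked = true;
    return true;
  }

  public String run()
  {
    if (!readinessChecked && !isReady()) {
      return "FAILED"; // readiness check failed
    }
    // taskLockHelper is guaranteed to be initialized here.
    return taskLockHelper != null ? "SUCCESS" : "FAILED";
  }

  public static void main(String[] args)
  {
    // Calling run() without a prior isReady() still succeeds.
    System.out.println(new BatchTaskSketch().run()); // prints SUCCESS
  }
}
```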
[GitHub] [druid] gianm opened a new pull request #10630: Scan query: More accurate error message when segment per time chunk limit is exceeded.
gianm opened a new pull request #10630:
URL: https://github.com/apache/druid/pull/10630
[GitHub] [druid] zhangyue19921010 commented on pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better
zhangyue19921010 commented on pull request #10551:
URL: https://github.com/apache/druid/pull/10551#issuecomment-738549172

@FrankChen021 @nishantmonu51 @a2l007 @himanshug @pphust Thanks for your review! And @himanshug, thanks for the merge!
[druid] branch master updated: Move common configurations to TuningConfig (#10478)
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid.git

The following commit(s) were added to refs/heads/master by this push:
     new 52d46ce  Move common configurations to TuningConfig (#10478)

52d46ce is described below

commit 52d46cebc31026b8dd39c7d9fb82c62bd77965fb
Author: Liran Funaro
AuthorDate: Fri Dec 4 04:13:32 2020 +0200

    Move common configurations to TuningConfig (#10478)

    * Move common methods that are used in HadoopTuningConfig and in AppenderatorConfig to TuningConfig
    * Rename rowFlushBoundary in HadoopTuningConfig to maxRowsInMemory to match TuningConfig API
---
 .../MaterializedViewSupervisorSpec.java            |  4 ++--
 .../apache/druid/indexer/HadoopTuningConfig.java   | 25 --
 .../apache/druid/indexer/IndexGeneratorJob.java    |  2 +-
 .../druid/indexer/HadoopTuningConfigTest.java      |  2 +-
 .../druid/segment/indexing/TuningConfig.java       | 13 +++
 .../realtime/appenderator/AppenderatorConfig.java  | 13 ---
 6 files changed, 31 insertions(+), 28 deletions(-)

diff --git a/extensions-contrib/materialized-view-maintenance/src/main/java/org/apache/druid/indexing/materializedview/MaterializedViewSupervisorSpec.java b/extensions-contrib/materialized-view-maintenance/src/main/java/org/apache/druid/indexing/materializedview/MaterializedViewSupervisorSpec.java
index 5388388..db63a73 100644
--- a/extensions-contrib/materialized-view-maintenance/src/main/java/org/apache/druid/indexing/materializedview/MaterializedViewSupervisorSpec.java
+++ b/extensions-contrib/materialized-view-maintenance/src/main/java/org/apache/druid/indexing/materializedview/MaterializedViewSupervisorSpec.java
@@ -182,7 +182,7 @@ public class MaterializedViewSupervisorSpec implements SupervisorSpec
         tuningConfig.getIndexSpec(),
         tuningConfig.getIndexSpecForIntermediatePersists(),
         tuningConfig.getAppendableIndexSpec(),
-        tuningConfig.getRowFlushBoundary(),
+        tuningConfig.getMaxRowsInMemory(),
         tuningConfig.getMaxBytesInMemory(),
         tuningConfig.isLeaveIntermediate(),
         tuningConfig.isCleanupOnFailure(),
@@ -191,7 +191,7 @@ public class MaterializedViewSupervisorSpec implements SupervisorSpec
         tuningConfig.getJobProperties(),
         tuningConfig.isCombineText(),
         tuningConfig.getUseCombiner(),
-        tuningConfig.getRowFlushBoundary(),
+        tuningConfig.getMaxRowsInMemory(),
         tuningConfig.getBuildV9Directly(),
         tuningConfig.getNumBackgroundPersistThreads(),
         tuningConfig.isForceExtendableShardSpecs(),
diff --git a/indexing-hadoop/src/main/java/org/apache/druid/indexer/HadoopTuningConfig.java b/indexing-hadoop/src/main/java/org/apache/druid/indexer/HadoopTuningConfig.java
index f1a8cc8..0f29f6a 100644
--- a/indexing-hadoop/src/main/java/org/apache/druid/indexer/HadoopTuningConfig.java
+++ b/indexing-hadoop/src/main/java/org/apache/druid/indexer/HadoopTuningConfig.java
@@ -44,7 +44,6 @@ public class HadoopTuningConfig implements TuningConfig
   private static final DimensionBasedPartitionsSpec DEFAULT_PARTITIONS_SPEC = HashedPartitionsSpec.defaultSpec();
   private static final Map> DEFAULT_SHARD_SPECS = ImmutableMap.of();
   private static final IndexSpec DEFAULT_INDEX_SPEC = new IndexSpec();
-  private static final int DEFAULT_ROW_FLUSH_BOUNDARY = TuningConfig.DEFAULT_MAX_ROWS_IN_MEMORY;
   private static final boolean DEFAULT_USE_COMBINER = false;
   private static final int DEFAULT_NUM_BACKGROUND_PERSIST_THREADS = 0;
@@ -58,7 +57,7 @@ public class HadoopTuningConfig implements TuningConfig
         DEFAULT_INDEX_SPEC,
         DEFAULT_INDEX_SPEC,
         DEFAULT_APPENDABLE_INDEX,
-        DEFAULT_ROW_FLUSH_BOUNDARY,
+        DEFAULT_MAX_ROWS_IN_MEMORY,
         0L,
         false,
         true,
@@ -86,7 +85,7 @@ public class HadoopTuningConfig implements TuningConfig
   private final IndexSpec indexSpec;
   private final IndexSpec indexSpecForIntermediatePersists;
   private final AppendableIndexSpec appendableIndexSpec;
-  private final int rowFlushBoundary;
+  private final int maxRowsInMemory;
   private final long maxBytesInMemory;
   private final boolean leaveIntermediate;
   private final boolean cleanupOnFailure;
@@ -141,8 +140,8 @@ public class HadoopTuningConfig implements TuningConfig
     this.indexSpec = indexSpec == null ? DEFAULT_INDEX_SPEC : indexSpec;
     this.indexSpecForIntermediatePersists = indexSpecForIntermediatePersists == null ?
                                             this.indexSpec : indexSpecForIntermediatePersists;
-    this.rowFlushBoundary = maxRowsInMemory == null ? maxRowsInMemoryCOMPAT == null
-                                                    ? DEFAULT_ROW_FLUSH_BOUNDARY
+    this.maxRowsInMemory = maxRowsInMemory == null ? maxRowsInMemoryCOMPAT == null
+
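The rename above keeps backward compatibility via a null-coalescing chain over the nullable JSON fields. The pattern, with illustrative names (this is a sketch of the approach, not the exact Druid code), is roughly:

```java
// Sketch of the backward-compatibility pattern in the diff above: prefer
// the new maxRowsInMemory name, fall back to the legacy value, then to
// the default. Integer boxing mirrors the nullable JSON config fields.
public class ConfigFallbackSketch
{
  static final int DEFAULT_MAX_ROWS_IN_MEMORY = 1_000_000; // illustrative default

  static int resolveMaxRowsInMemory(Integer maxRowsInMemory, Integer legacyRowFlushBoundary)
  {
    if (maxRowsInMemory != null) {
      return maxRowsInMemory; // new name wins when present
    }
    return legacyRowFlushBoundary != null ? legacyRowFlushBoundary : DEFAULT_MAX_ROWS_IN_MEMORY;
  }

  public static void main(String[] args)
  {
    System.out.println(resolveMaxRowsInMemory(null, 75000));  // prints 75000
    System.out.println(resolveMaxRowsInMemory(50000, 75000)); // prints 50000
    System.out.println(resolveMaxRowsInMemory(null, null));   // prints 1000000
  }
}
```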
[GitHub] [druid] clintropolis merged pull request #10478: Move common configurations to TuningConfig
clintropolis merged pull request #10478:
URL: https://github.com/apache/druid/pull/10478
[GitHub] [druid] clintropolis commented on pull request #10478: Move common configurations to TuningConfig
clintropolis commented on pull request #10478:
URL: https://github.com/apache/druid/pull/10478#issuecomment-738505130

I don't seem to be able to retry the failing CI, which looks like a flaky test.
[druid] branch master updated (ae6c43d -> 229b5f3)
himanshug pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git.

    from ae6c43d  Add an integration test for HTTP inputSource (#10620)
     add 229b5f3  Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better (#10551)

No new revisions were added by this update.

Summary of changes:
 .../development/extensions-core/kafka-ingestion.md |  6 ++--
 .../druid/indexing/kafka/KafkaConsumerConfigs.java |  1 -
 .../druid/indexing/kafka/KafkaIndexTask.java       |  1 +
 .../druid/indexing/kafka/KafkaRecordSupplier.java  |  1 +
 .../druid/indexing/kafka/KafkaIndexTaskTest.java   | 41 ++
 5 files changed, 47 insertions(+), 3 deletions(-)
[GitHub] [druid] himanshug merged pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better
himanshug merged pull request #10551:
URL: https://github.com/apache/druid/pull/10551
[GitHub] [druid] lgtm-com[bot] commented on pull request #10628: map-string-string dimension column contrib extension
lgtm-com[bot] commented on pull request #10628:
URL: https://github.com/apache/druid/pull/10628#issuecomment-738488519

This pull request **introduces 2 alerts** when merging f306b31f55e358c4c3f579ae05dd8f830147c3ee into ae6c43de71f9983564b440f8db851e5e7f1ae16d - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-eeba6ddea50cbe3b00bada4c9dacc3a2a2d5da4e)

**new alerts:**

* 1 for Result of multiplication cast to wider type
* 1 for Potential input resource leak
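For readers unfamiliar with the first alert category ("Result of multiplication cast to wider type"), here is a generic illustration of the bug class (not the flagged Druid code, which is not shown in this message):

```java
// The alert fires when an int-by-int multiplication is computed in int
// (where it can overflow) and only the overflowed result is widened.
public class LgtmAlertExamples
{
  static long daysToMillisWrong(int days)
  {
    // Bug: days * 24 * 60 * 60 * 1000 is evaluated entirely in int,
    // so it overflows before the cast to long.
    return (long) (days * 24 * 60 * 60 * 1000);
  }

  static long daysToMillisRight(int days)
  {
    // Fix: promote to long before multiplying.
    return days * 24L * 60 * 60 * 1000;
  }

  public static void main(String[] args)
  {
    System.out.println(daysToMillisWrong(30)); // prints -1702967296 (overflowed)
    System.out.println(daysToMillisRight(30)); // prints 2592000000
  }
}
```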
[GitHub] [druid] himanshug commented on pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better
himanshug commented on pull request #10551:
URL: https://github.com/apache/druid/pull/10551#issuecomment-738482887

+1 after the build
[GitHub] [druid] himanshug commented on a change in pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better
himanshug commented on a change in pull request #10551:
URL: https://github.com/apache/druid/pull/10551#discussion_r535759359

## File path: extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaConsumerConfigs.java

    @@ -31,14 +31,14 @@ public class KafkaConsumerConfigs
     {
    -  public static Map getConsumerProperties()
    +  public static Map getConsumerProperties(Map customerConsumerProperties)

Review comment:

> we need to ensure that the deserialization between the new IoConfig and the old IoConfig is completely consistent and smooth

Well, my main concern was the `consumeTransactionally` param, which has been removed. Though I fail to see why simply adding the default to `KafkaSupervisorIOConfig.java` would not have the same behavior functionally (or for backward compatibility as well)... but I am not so worried about the specifics at this point, as it is easy to change later if needed.
[GitHub] [druid] techdocsmith opened a new pull request #10629: update syntax for global cached uri lookups
techdocsmith opened a new pull request #10629:
URL: https://github.com/apache/druid/pull/10629

Fixes an issue with the syntax for columns in cached global lookups.

### Description

Updates the docs to be in line with the syntax displayed in the Druid UI.

This PR has:
- [x] been self-reviewed.
- [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
[GitHub] [druid] jihoonson commented on pull request #10543: Avatica protobuf
jihoonson commented on pull request #10543:
URL: https://github.com/apache/druid/pull/10543#issuecomment-738466273

@lkm besides my comments above, would you please fix the conflicts?
[GitHub] [druid] jihoonson commented on a change in pull request #10543: Avatica protobuf
jihoonson commented on a change in pull request #10543:
URL: https://github.com/apache/druid/pull/10543#discussion_r535741400

## File path: sql/src/main/java/org/apache/druid/sql/avatica/DruidAvaticaProtobufHandler.java

    @@ -0,0 +1,62 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +package org.apache.druid.sql.avatica;
    +
    +import com.google.inject.Inject;
    +import org.apache.calcite.avatica.remote.LocalService;
    +import org.apache.calcite.avatica.remote.Service;
    +import org.apache.calcite.avatica.server.AvaticaProtobufHandler;
    +import org.apache.druid.guice.annotations.Self;
    +import org.apache.druid.server.DruidNode;
    +import org.eclipse.jetty.server.Request;
    +
    +import javax.servlet.ServletException;
    +import javax.servlet.http.HttpServletRequest;
    +import javax.servlet.http.HttpServletResponse;
    +import java.io.IOException;
    +
    +public class DruidAvaticaProtobufHandler extends AvaticaProtobufHandler
    +{
    +  public static final String AVATICA_PATH = "/druid/v2/sql/avatica-pb/";

Review comment:
I see. It sounds reasonable to me.
## File path: server/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java

    @@ -119,6 +124,7 @@ private static void handleException(HttpServletResponse response, ObjectMapper o
       private final RequestLogger requestLogger;
       private final GenericQueryMetricsFactory queryMetricsFactory;
       private final AuthenticatorMapper authenticatorMapper;
    +  private final transient ProtobufTranslation protobufTranslation;

Review comment:
Hmm, I didn't know that the servlet is serializable. However, [it seems that Jetty doesn't support serializing servlets anyway](https://github.com/eclipse/jetty.project/issues/5169). I would suggest not adding it, since it can be misleading when the other variables are not transient.
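For context on the `transient` discussion above: the keyword only has an effect if the object is actually serialized with Java serialization, in which case transient fields are skipped on write and come back as their default value (`null` for references). A minimal standalone example (not Druid code) of that behavior:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo
{
  static class Holder implements Serializable
  {
    String kept = "kept";
    // Skipped by Java serialization; deserializes as null.
    transient String skipped = "skipped";
  }

  // Serialize and immediately deserialize the object.
  static Holder roundTrip(Holder h) throws Exception
  {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
      out.writeObject(h);
    }
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
      return (Holder) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception
  {
    Holder copy = roundTrip(new Holder());
    System.out.println(copy.kept + " / " + copy.skipped); // prints "kept / null"
  }
}
```

So if the container never serializes the servlet, as the linked Jetty issue indicates, the modifier is dead weight, which is the reason given for leaving it off.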
[GitHub] [druid] himanshug commented on issue #10626: Whether HDFS Dependencies Can Be Removed in the Future Development of Druid?
himanshug commented on issue #10626:
URL: https://github.com/apache/druid/issues/10626#issuecomment-738463834

Curious about what / how many resources are being consumed (assuming you don't add `hdfs-storage` to the extensions list anyway)?
[GitHub] [druid] himanshug opened a new pull request #10628: map-string-string dimension column contrib extension
himanshug opened a new pull request #10628:
URL: https://github.com/apache/druid/pull/10628

### Description

Introduces a "mapStringString" complex-type dimension column. All the code lives in an isolated contrib extension and implements the relevant extensible Druid core interfaces, e.g. ComplexColumn, ComplexMetricSerde, DimensionHandler, DimensionIndexer, DimensionMerger, etc.

It allows ingestion of a `Map<String, String>` column in a user's input data as a single `dimension` column. At query time, the user can access individual keys in the map as if they were simple `string` dimension columns (see `MapStringStringKeyVirtualColumn.java`), or the column can be accessed as records of `Map<String, String>` type, though that does not have much use in the Druid query layer for now aside from the `select` query. Users would have to implement their own `InputRowParser` to supply `Map<String, String>` records for ingestion; we used one that was very specific to our needs.

This extension also serves as an example of how users can create their own dimension columns for their specific needs, and it also [indirectly] tests the dimension extensibility support introduced in #10277.

Future: Now that the `VirtualColumn` interface has been enhanced to support vectorization, the virtual column implementation code in this extension could also be improved to support that for [potentially] better performance.

This PR has:
- [x] been self-reviewed.
- [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [x] been tested in a test Druid cluster.
[GitHub] [druid] jihoonson commented on issue #10626: Whether HDFS Dependencies Can Be Removed in the Future Development of Druid?
jihoonson commented on issue #10626:
URL: https://github.com/apache/druid/issues/10626#issuecomment-738460145

This has been my wish for a long time. Ideally, HDFS dependencies should be included only when you include Hadoop-related extensions. However, I don't think it could be done easily, since it will require restructuring and refactoring lots of code. I don't think it could be done anytime soon.
[druid] branch master updated: Add an integration test for HTTP inputSource (#10620)
This is an automated email from the ASF dual-hosted git repository.

jihoonson pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git

The following commit(s) were added to refs/heads/master by this push:
     new ae6c43d  Add an integration test for HTTP inputSource (#10620)
ae6c43d is described below

commit ae6c43de71f9983564b440f8db851e5e7f1ae16d
Author: Jihoon Son
AuthorDate: Thu Dec 3 15:51:56 2020 -0800

    Add an integration test for HTTP inputSource (#10620)
---
 .../druid/tests/indexer/ITHttpInputSourceTest.java | 53 
 .../wikipedia_http_inputsource_queries.json        | 98 ++
 .../indexer/wikipedia_http_inputsource_task.json   | 90 
 3 files changed, 241 insertions(+)

diff --git a/integration-tests/src/test/java/org/apache/druid/tests/indexer/ITHttpInputSourceTest.java b/integration-tests/src/test/java/org/apache/druid/tests/indexer/ITHttpInputSourceTest.java
new file mode 100644
index 000..69d1eea
--- /dev/null
+++ b/integration-tests/src/test/java/org/apache/druid/tests/indexer/ITHttpInputSourceTest.java
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.tests.indexer;
+
+import org.apache.druid.testing.guice.DruidTestModuleFactory;
+import org.apache.druid.tests.TestNGGroup;
+import org.testng.annotations.Guice;
+import org.testng.annotations.Test;
+
+import java.io.Closeable;
+import java.io.IOException;
+import java.util.UUID;
+
+@Test(groups = TestNGGroup.INPUT_SOURCE)
+@Guice(moduleFactory = DruidTestModuleFactory.class)
+public class ITHttpInputSourceTest extends AbstractITBatchIndexTest
+{
+  private static final String INDEX_TASK = "/indexer/wikipedia_http_inputsource_task.json";
+  private static final String INDEX_QUERIES_RESOURCE = "/indexer/wikipedia_http_inputsource_queries.json";
+
+  @Test
+  public void doTest() throws IOException
+  {
+    final String indexDatasource = "wikipedia_http_inputsource_test_" + UUID.randomUUID();
+    try (final Closeable ignored1 = unloader(indexDatasource + config.getExtraDatasourceNameSuffix())) {
+      doIndexTest(
+          indexDatasource,
+          INDEX_TASK,
+          INDEX_QUERIES_RESOURCE,
+          false,
+          true,
+          true
+      );
+    }
+  }
+}

diff --git a/integration-tests/src/test/resources/indexer/wikipedia_http_inputsource_queries.json b/integration-tests/src/test/resources/indexer/wikipedia_http_inputsource_queries.json
new file mode 100644
index 000..11496c2
--- /dev/null
+++ b/integration-tests/src/test/resources/indexer/wikipedia_http_inputsource_queries.json
@@ -0,0 +1,98 @@
+[
+  {
+    "description": "timeseries, 1 agg, all",
+    "query":{
+      "queryType" : "timeBoundary",
+      "dataSource": "%%DATASOURCE%%"
+    },
+    "expectedResults":[
+      {
+        "timestamp" : "2016-06-27T00:00:11.000Z",
+        "result" : {
+          "minTime" : "2016-06-27T00:00:11.000Z",
+          "maxTime" : "2016-06-27T21:31:02.000Z"
+        }
+      }
+    ]
+  },
+  {
+    "description": "timeseries, datasketch aggs, all",
+    "query":{
+      "queryType" : "timeseries",
+      "dataSource": "%%DATASOURCE%%",
+      "granularity":"day",
+      "intervals":[
+        "2016-06-27/P1D"
+      ],
+      "filter":null,
+      "aggregations":[
+        {
+          "type": "HLLSketchMerge",
+          "name": "approxCountHLL",
+          "fieldName": "HLLSketchBuild",
+          "lgK": 12,
+          "tgtHllType": "HLL_4",
+          "round": true
+        },
+        {
+          "type":"thetaSketch",
+          "name":"approxCountTheta",
+          "fieldName":"thetaSketch",
+          "size":16384,
+          "shouldFinalize":true,
+          "isInputThetaSketch":false,
+          "errorBoundsStdDev":null
+        },
+        {
[GitHub] [druid] jihoonson merged pull request #10620: Add an integration test for HTTP inputSource
jihoonson merged pull request #10620:
URL: https://github.com/apache/druid/pull/10620
[GitHub] [druid] jihoonson commented on pull request #10624: Grouping id 2
jihoonson commented on pull request #10624:
URL: https://github.com/apache/druid/pull/10624#issuecomment-738449988

@abhishekagarwal87 thanks for trying it out. As I commented in https://github.com/apache/druid/pull/10518#discussion_r535718040, I prefer the original approach to this alternative.
[GitHub] [druid] jihoonson commented on a change in pull request #10518: Add grouping_id function
jihoonson commented on a change in pull request #10518:
URL: https://github.com/apache/druid/pull/10518#discussion_r535700672

## File path: processing/src/main/java/org/apache/druid/query/aggregation/GroupingAggregatorFactory.java

    @@ -0,0 +1,282 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +package org.apache.druid.query.aggregation;
    +
    +import com.fasterxml.jackson.annotation.JsonCreator;
    +import com.fasterxml.jackson.annotation.JsonProperty;
    +import com.google.common.annotations.VisibleForTesting;
    +import com.google.common.base.Preconditions;
    +import org.apache.druid.annotations.EverythingIsNonnullByDefault;
    +import org.apache.druid.query.aggregation.constant.LongConstantAggregator;
    +import org.apache.druid.query.aggregation.constant.LongConstantBufferAggregator;
    +import org.apache.druid.query.aggregation.constant.LongConstantVectorAggregator;
    +import org.apache.druid.query.cache.CacheKeyBuilder;
    +import org.apache.druid.segment.ColumnInspector;
    +import org.apache.druid.segment.ColumnSelectorFactory;
    +import org.apache.druid.segment.column.ValueType;
    +import org.apache.druid.segment.vector.VectorColumnSelectorFactory;
    +import org.apache.druid.utils.CollectionUtils;
    +
    +import javax.annotation.Nullable;
    +import java.util.Collections;
    +import java.util.Comparator;
    +import java.util.List;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +@EverythingIsNonnullByDefault
    +public class GroupingAggregatorFactory extends AggregatorFactory

Review comment:
Please add some javadoc. It would be nice to include, but not necessarily be limited to, the following:

- This factory is for computing the `grouping` function.
- This factory can create `LongConstant*Aggregators`. Unlike other aggregators, the `LongConstant*Aggregators` created by this factory do nothing to compute the `grouping` function. Instead, they are used just to hold the positions of `grouping` results in the `ResultRow`.
- The actual computation of the `grouping` function is done before processing each subtotal. The result of the `LongConstant*Aggregators` is _not_ used, but is replaced with the precomputed result when iterating over the result of the subtotal computation. See `RowBasedGrouperHelper.makeGrouperIterator()` for more details.
- There could be different approaches to implementing the same functionality. We chose this approach because it seems more stable and less complex than the others. See https://github.com/apache/druid/pull/10518#discussion_r532941216 for more details.

## File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/RowBasedGrouperHelper.java

    @@ -576,7 +594,12 @@ private static ValueExtractFunction makeValueExtractFunction(
       // Add aggregations.
       final int resultRowAggregatorStart = query.getResultRowAggregatorStart();
       for (int i = 0; i < entry.getValues().length; i++) {
    -    resultRow.set(resultRowAggregatorStart + i, entry.getValues()[i]);
    +    if (dimsToInclude != null && groupingAggregatorsBitSet.get(i)) {
    +      resultRow.set(resultRowAggregatorStart + i, groupingAggregatorValues[i]);

Review comment:
Please add some comment about what is happening here.

## File path: processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/RowBasedGrouperHelper.java

    @@ -576,7 +594,12 @@ private static ValueExtractFunction makeValueExtractFunction(
       // Add aggregations.
       final int resultRowAggregatorStart = query.getResultRowAggregatorStart();
       for (int i = 0; i < entry.getValues().length; i++) {
    -    resultRow.set(resultRowAggregatorStart + i, entry.getValues()[i]);
    +    if (dimsToInclude != null && groupingAggregatorsBitSet.get(i)) {

Review comment:
I talked to @abhishekagarwal87 offline. My biggest concern is that the special handling for `GroupingAggregatorFactory` seems pretty magical, because the behaviour of the factory and its aggregators is different from the others. What I suggested above makes it more special but less magical, which I think is less confusing. @abhishekagarwal87's concern is mostly around the complexity of
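For readers unfamiliar with the `grouping` function being reviewed here, its value follows SQL's GROUPING_ID semantics: one bit per listed dimension, set to 1 when that dimension is rolled up (absent from the current grouping set). A small standalone sketch of that computation (assumed semantics only, not the factory's actual code):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GroupingIdSketch
{
  // Builds the grouping id for one subtotal spec: iterate the grouping()
  // arguments in order, shifting in a 1-bit for each dimension that is
  // NOT part of the current grouping set.
  static long groupingId(List<String> groupingArgs, Set<String> currentGroupingSet)
  {
    long id = 0;
    for (String dim : groupingArgs) {
      id <<= 1;
      if (!currentGroupingSet.contains(dim)) {
        id |= 1;
      }
    }
    return id;
  }

  public static void main(String[] args)
  {
    List<String> dims = Arrays.asList("a", "b");
    System.out.println(groupingId(dims, new HashSet<>(Arrays.asList("a", "b")))); // 0
    System.out.println(groupingId(dims, Collections.singleton("a")));             // 1 (b rolled up)
    System.out.println(groupingId(dims, Collections.emptySet()));                 // 3 (both rolled up)
  }
}
```

This is the value the review describes as being precomputed per subtotal and then written over the constant aggregator's placeholder position in the `ResultRow`.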
[druid] branch master updated (813e187 -> 2cd017b)
This is an automated email from the ASF dual-hosted git repository.

jihoonson pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.

    from 813e187  make dimension column extensible with COMPLEX type (#10277)
     add 2cd017b  Fix the config initialization on container restart (#10458)

No new revisions were added by this update.

Summary of changes:
 distribution/docker/druid.sh | 1 +
 1 file changed, 1 insertion(+)
[GitHub] [druid] jihoonson merged pull request #10458: Fix the config initialization on container restart
jihoonson merged pull request #10458:
URL: https://github.com/apache/druid/pull/10458
[GitHub] [druid] jihoonson commented on pull request #10458: Fix the config initialization on container restart
jihoonson commented on pull request #10458:
URL: https://github.com/apache/druid/pull/10458#issuecomment-738275612

The failed Travis CI job seems to be an old one that was replaced by the new GitHub Actions-based CI. The new CI passed, so I'm merging this PR.
[GitHub] [druid] senthilkv commented on pull request #10495: Added Request log updates for status change on cooridnator / overlord…
senthilkv commented on pull request #10495:
URL: https://github.com/apache/druid/pull/10495#issuecomment-738215212

@himanshug Sure, will look into it.
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10624: Grouping id 2
abhishekagarwal87 commented on a change in pull request #10624:
URL: https://github.com/apache/druid/pull/10624#discussion_r535466228

## File path: processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java

    @@ -233,7 +235,7 @@ public GroupByQueryQueryToolChest(
       return groupByStrategy.processSubtotalsSpec(
           query,
           resource,
    -      groupByStrategy.processSubqueryResult(subquery, query, resource, finalizingResults, false)
    +      groupByStrategy.processSubqueryResult(subquery, withoutSubtotals(query), resource, finalizingResults, false)

Review comment:
Some context is here, though it isn't very helpful: https://github.com/apache/druid/pull/5280/files#r181885318

I am concerned that modifying the query may cause some problems down the line.
[druid] branch master updated (c94be8a -> 813e187)
This is an automated email from the ASF dual-hosted git repository.

himanshug pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git.

    from c94be8a  Revert "Update google client libraries (#10536)" (#10599)
     add 813e187  make dimension column extensible with COMPLEX type (#10277)

No new revisions were added by this update.

Summary of changes:
 ...tSegment.java => DimensionHandlerProvider.java} |   9 +-
 .../druid/segment/DimensionHandlerUtils.java       |  31 ++
 .../apache/druid/segment/column/ColumnBuilder.java |   6 ++
 .../druid/segment/column/ColumnCapabilities.java   |   7 ++
 .../segment/column/ColumnCapabilitiesImpl.java     |  25 +
 .../segment/incremental/IncrementalIndex.java      |  13 ++-
 .../segment/serde/ComplexColumnPartSerde.java      |   2 +-
 .../druid/segment/serde/ComplexMetricSerde.java    |  16 
 .../query/aggregation/AggregationTestHelper.java   | 106 ++---
 ...ilsTest.java => DimensionHandlerUtilsTest.java} |  36 ---
 .../java/org/apache/druid/segment/TestIndex.java   |   6 +-
 .../druid/segment/column/ColumnBuilderTest.java    |  16 ++--
 ...ColumnSupportedComplexColumnSerializerTest.java |   2 +-
 13 files changed, 219 insertions(+), 56 deletions(-)
 copy processing/src/main/java/org/apache/druid/segment/{AbstractSegment.java => DimensionHandlerProvider.java} (76%)
 copy processing/src/test/java/org/apache/druid/segment/{IntListUtilsTest.java => DimensionHandlerUtilsTest.java} (56%)
 copy extensions-contrib/ambari-metrics-emitter/src/test/java/org/apache/druid/emitter/ambari/metrics/DruidToWhiteListBasedConverterTest.java => processing/src/test/java/org/apache/druid/segment/column/ColumnBuilderTest.java (72%)
[GitHub] [druid] himanshug merged pull request #10277: make dimension column extensible with COMPLEX type
himanshug merged pull request #10277:
URL: https://github.com/apache/druid/pull/10277
[GitHub] [druid] suneet-s commented on pull request #10599: Revert "Update google client libraries (#10536)"
suneet-s commented on pull request #10599:
URL: https://github.com/apache/druid/pull/10599#issuecomment-738125916

Thanks @nishantmonu51
[GitHub] [druid] belugabehr commented on pull request #10627: Remove JODA dependency from druid-basic-security package
belugabehr commented on pull request #10627:
URL: https://github.com/apache/druid/pull/10627#issuecomment-738097479

For now, I had to add some new APIs to the `druid-core` library to support JDK Time and JODA Time concurrently.
[GitHub] [druid] belugabehr opened a new pull request #10627: Remove JODA dependency from druid-basic-security package
belugabehr opened a new pull request #10627:
URL: https://github.com/apache/druid/pull/10627

Fixes #8225

### Description

Remove JODA Time from Druid, one package at a time, since it would be way too unwieldy to review in one shot.

This PR has:
- [x] been self-reviewed.
- [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
[GitHub] [druid] belugabehr commented on issue #8225: Remove JODA Time dependency
belugabehr commented on issue #8225:
URL: https://github.com/apache/druid/issues/8225#issuecomment-738093734

Re-open?
[GitHub] [druid] zhangyue19921010 commented on a change in pull request #10551: Remove hard limitation that druid(after 0.15.0) only can consume Kafka version 0.11.x or better
zhangyue19921010 commented on a change in pull request #10551: URL: https://github.com/apache/druid/pull/10551#discussion_r535297518 ## File path: extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/KafkaConsumerConfigs.java ## @@ -31,14 +31,14 @@ public class KafkaConsumerConfigs { - public static Map getConsumerProperties() + public static Map getConsumerProperties(Map customerConsumerProperties) Review comment: Hi @himanshug Thanks for your review! I agree that `it would likely be simpler to leave this alone, remove the line props.put("isolation.level".. altogether`. But I think it may not be appropriate to modify `https://github.com/apache/druid/blob/master/extensions-core/kafka-indexing-service/src/main/java/org/apache/druid/indexing/kafka/supervisor/KafkaSupervisorIOConfig.java#L78` I tried making the changes in `KafkaSupervisorIOConfig.java` as you suggested (https://github.com/apache/druid/pull/10551/commits/cc59d8bbe0175faa269a4bdc06a24f13f6dd220c#)
```
this.consumerProperties = Preconditions.checkNotNull(consumerProperties, "consumerProperties");
consumerProperties.putIfAbsent("isolation.level", "read_committed");
Preconditions.checkNotNull(
```
and functional testing passed, but UTs failed at https://github.com/apache/druid/blob/ac6d703814ccb5b258c586b63e0bc33d669e0f57/extensions-core/kafka-indexing-service/src/test/java/org/apache/druid/indexing/kafka/KafkaIOConfigTest.java#L363 and https://github.com/apache/druid/blob/ac6d703814ccb5b258c586b63e0bc33d669e0f57/extensions-core/kafka-indexing-service/src/test/java/org/apache/druid/indexing/kafka/KafkaIOConfigTest.java#L332 These failures mean changes in KafkaSupervisorIOConfig may cause compatibility issues when deserializing from an old IoConfig to a new IoConfig, or from a new IoConfig to an old one. In other words, we need to ensure that deserialization between the new and old IoConfig is completely consistent and smooth.
So I switched to a safer way in the latest commit. All the changes are tested on a dev cluster. PTAL :)
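[Editor's note] The "safer way" discussed above relies on `Map.putIfAbsent` only filling the key when the user did not supply one. A standalone sketch of those semantics; the map below stands in for the supervisor's consumer properties, and only the Kafka key and value names are real:

```java
import java.util.HashMap;
import java.util.Map;

public class ConsumerDefaultsSketch
{
  // Apply a default isolation level without clobbering a user-supplied value.
  static Map<String, Object> withDefaults(Map<String, Object> userProps)
  {
    Map<String, Object> props = new HashMap<>(userProps);
    props.putIfAbsent("isolation.level", "read_committed");
    return props;
  }

  public static void main(String[] args)
  {
    Map<String, Object> user = new HashMap<>();
    user.put("isolation.level", "read_uncommitted");
    // The user's value wins; the default only fills in a missing key.
    System.out.println(withDefaults(user).get("isolation.level"));            // read_uncommitted
    System.out.println(withDefaults(new HashMap<>()).get("isolation.level")); // read_committed
  }
}
```

Because the default is applied to a copy at read time rather than stored in the serialized config, round-tripping an old `IoConfig` through JSON stays byte-for-byte stable, which is what the failing unit tests were checking.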
[druid] branch master updated (7e95228 -> c94be8a)
This is an automated email from the ASF dual-hosted git repository. nishant pushed a change to branch master in repository https://gitbox.apache.org/repos/asf/druid.git. from 7e95228 introduce DynamicConfigProvider interface and make kafka consumer props extensible (#10309) add c94be8a Revert "Update google client libraries (#10536)" (#10599) No new revisions were added by this update. Summary of changes: licenses.yaml | 10 +- pom.xml | 11 +-- 2 files changed, 10 insertions(+), 11 deletions(-)
[GitHub] [druid] nishantmonu51 commented on pull request #10599: Revert "Update google client libraries (#10536)"
nishantmonu51 commented on pull request #10599: URL: https://github.com/apache/druid/pull/10599#issuecomment-738039094 @suneet-s: let's merge this change for now and I will create a new PR with the changes again.
[GitHub] [druid] nishantmonu51 merged pull request #10599: Revert "Update google client libraries (#10536)"
nishantmonu51 merged pull request #10599: URL: https://github.com/apache/druid/pull/10599
[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query
kroeders commented on a change in pull request #10428: URL: https://github.com/apache/druid/pull/10428#discussion_r535256895 ## File path: server/src/main/java/org/apache/druid/client/CachingClusteredClient.java ## @@ -812,7 +812,7 @@ String computeResultLevelCachingEtag( Hasher hasher = Hashing.sha1().newHasher(); boolean hasOnlyHistoricalSegments = true; for (SegmentServerSelector p : segments) { -if (!p.getServer().pick().getServer().isSegmentReplicationTarget()) { +if (!p.getServer().pick(null).getServer().isSegmentReplicationTarget()) { Review comment: This was the only functional place where the parameterless pick() was called, as I mentioned in the other comment. @jihoonson, the Query is available inside SpecificQueryRunnable where this is called, so I could alternatively pass the query into computeResultLevelCachingEtag and add a null check on this line. What do you think?
[GitHub] [druid] kroeders commented on pull request #10428: allow server selection to be aware of query
kroeders commented on pull request #10428: URL: https://github.com/apache/druid/pull/10428#issuecomment-738015529 Thanks for the feedback and for taking the time to look, @jihoonson, I appreciate it! I updated the branch per the comments and left a few notes on things I found while making the changes.
[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query
kroeders commented on a change in pull request #10428: URL: https://github.com/apache/druid/pull/10428#discussion_r535252310 ## File path: server/src/main/java/org/apache/druid/client/selector/TierSelectorStrategy.java ## @@ -47,6 +49,22 @@ List pick( Int2ObjectRBTreeMap> prioritizedServers, DataSegment segment, - int numServersToPick - ); + int numServersToPick); + + @Nullable + default QueryableDruidServer pick(Query query, + Int2ObjectRBTreeMap> prioritizedServers, + DataSegment segment) + { +return pick(prioritizedServers, segment); Review comment: This was done for backward compatibility; I was trying to make this change as minimal as possible on the first cut. We don't have any custom implementations here, and I haven't seen any on a cursory look, but I don't want to presume. I think it's cleaner to remove the ones that don't take Query, and having an interface of all defaults that call each other makes the abstract methods less obvious. What do you think? I'm happy to defer to your judgement.
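[Editor's note] The backward-compatibility pattern debated here, adding the new query-aware overload as a `default` method that delegates to the original abstract one, can be sketched in isolation. All names below are illustrative stand-ins, not Druid's actual interfaces:

```java
public class DefaultMethodCompat
{
  interface SelectorStrategy
  {
    // The original abstract method: existing custom implementations provide this.
    String pick(String segment);

    // New query-aware overload added as a default, so pre-existing
    // implementations keep compiling; it ignores the query and delegates.
    default String pick(Object query, String segment)
    {
      return pick(segment);
    }
  }

  public static void main(String[] args)
  {
    // A "legacy" implementation that knows nothing about queries.
    SelectorStrategy legacy = segment -> "server-for-" + segment;
    System.out.println(legacy.pick("seg1"));       // server-for-seg1
    System.out.println(legacy.pick(null, "seg1")); // server-for-seg1, via the default
  }
}
```

The trade-off kroeders describes is visible here: the delegation keeps third-party implementations working, but once every method has a default body, nothing signals to an implementer which method they are actually expected to override.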
[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query
kroeders commented on a change in pull request #10428: URL: https://github.com/apache/druid/pull/10428#discussion_r535246679 ## File path: server/src/main/java/org/apache/druid/client/selector/TierSelectorStrategy.java ## @@ -47,6 +49,22 @@ List pick( Review comment: I marked the non-query ones as deprecated throughout
[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query
kroeders commented on a change in pull request #10428: URL: https://github.com/apache/druid/pull/10428#discussion_r535246123 ## File path: server/src/main/java/org/apache/druid/client/selector/ServerSelector.java ## @@ -162,12 +163,20 @@ public boolean isEmpty() @Nullable @Override public QueryableDruidServer pick() + { +if (!historicalServers.isEmpty()) { + return strategy.pick(historicalServers, segment.get()); +} +return strategy.pick(realtimeServers, segment.get()); + } + + public QueryableDruidServer pick(Query query) Review comment: I removed it; the method was called a couple of times in tests (which seems OK) and in the etag creation part, which also seems OK to me. It looks a little funny to pass in null as a single parameter, but I could argue either way vs. having a method that isn't used elsewhere. ## File path: server/src/main/java/org/apache/druid/client/selector/TierSelectorStrategy.java ## @@ -47,6 +49,22 @@ List pick( Int2ObjectRBTreeMap> prioritizedServers, DataSegment segment, - int numServersToPick - ); + int numServersToPick); + + @Nullable + default QueryableDruidServer pick(Query query, Review comment: done
[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query
kroeders commented on a change in pull request #10428: URL: https://github.com/apache/druid/pull/10428#discussion_r535244203 ## File path: server/src/main/java/org/apache/druid/client/selector/TierSelectorStrategy.java ## @@ -47,6 +49,22 @@ List pick( Int2ObjectRBTreeMap> prioritizedServers, DataSegment segment, - int numServersToPick - ); + int numServersToPick); + + @Nullable + default QueryableDruidServer pick(Query query, + Int2ObjectRBTreeMap> prioritizedServers, + DataSegment segment) + { +return pick(prioritizedServers, segment); + } + + default List pick( + Query query, Review comment: done
[GitHub] [druid] kroeders commented on a change in pull request #10428: allow server selection to be aware of query
kroeders commented on a change in pull request #10428: URL: https://github.com/apache/druid/pull/10428#discussion_r535243969 ## File path: server/src/main/java/org/apache/druid/client/selector/ServerSelectorStrategy.java ## @@ -33,6 +35,17 @@ }) public interface ServerSelectorStrategy { + default QueryableDruidServer pick(Query query, Set servers, DataSegment segment) Review comment: done
[GitHub] [druid] lkm commented on a change in pull request #10543: Avatica protobuf
lkm commented on a change in pull request #10543: URL: https://github.com/apache/druid/pull/10543#discussion_r535196727 ## File path: server/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java ## @@ -88,6 +92,7 @@ private static final String OBJECTMAPPER_ATTRIBUTE = "org.apache.druid.proxy.objectMapper"; private static final int CANCELLATION_TIMEOUT_MILLIS = 500; + private static final long serialVersionUID = 2026655693637894524L; Review comment: It's because I'm using the transient keyword below and the class is serializable ([documentation](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/io/Serializable.html)), and basically to satisfy LGTM
[GitHub] [druid] lkm commented on a change in pull request #10543: Avatica protobuf
lkm commented on a change in pull request #10543: URL: https://github.com/apache/druid/pull/10543#discussion_r535195705 ## File path: server/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java ## @@ -119,6 +124,7 @@ private static void handleException(HttpServletResponse response, ObjectMapper o private final RequestLogger requestLogger; private final GenericQueryMetricsFactory queryMetricsFactory; private final AuthenticatorMapper authenticatorMapper; + private final transient ProtobufTranslation protobufTranslation; Review comment: The servlet is serializable and the ProtobufTranslation isn't, so this is required
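[Editor's note] The interplay of `Serializable`, `serialVersionUID`, and a `transient` field holding a non-serializable collaborator, as discussed in the two comments above, can be shown with a toy class. All names are illustrative; `Translator` stands in for something like `ProtobufTranslation`:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientFieldSketch
{
  // A helper that does NOT implement Serializable.
  static class Translator
  {
    String translate(String s) { return s.toUpperCase(); }
  }

  static class Servlet implements Serializable
  {
    private static final long serialVersionUID = 1L;

    // transient: skipped during serialization; without it, writing this
    // object would throw NotSerializableException for the Translator field.
    private transient Translator translator = new Translator();

    String handle(String req)
    {
      if (translator == null) {
        translator = new Translator(); // re-create after deserialization
      }
      return translator.translate(req);
    }
  }

  public static void main(String[] args) throws Exception
  {
    // Round-trip through Java serialization; the transient field comes back null.
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    new ObjectOutputStream(bytes).writeObject(new Servlet());
    Servlet copy = (Servlet) new ObjectInputStream(
        new ByteArrayInputStream(bytes.toByteArray())).readObject();
    System.out.println(copy.handle("ok")); // OK
  }
}
```

The explicit `serialVersionUID` is what the LGTM alert mentioned earlier asks for: it pins the class's serialization identity so that adding fields later does not spuriously break deserialization of old bytes.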
[GitHub] [druid] Mrxiashu opened a new issue #10626: Whether HDFS Dependencies Can Be Removed in the Future Development of Druid?
Mrxiashu opened a new issue #10626: URL: https://github.com/apache/druid/issues/10626 In a production environment, HDFS occupies some resources. If HDFS features (such as data ingestion) are not used, could HDFS be made optional in a future design?
[GitHub] [druid] andrewbridge commented on issue #10609: Custom aggregation type name not registering correctly
andrewbridge commented on issue #10609: URL: https://github.com/apache/druid/issues/10609#issuecomment-737925029 Okay, great. Does this mean I have to create a child class of SimpleModule for every extension, or can I use these annotations elsewhere?
[GitHub] [druid] lkm commented on a change in pull request #10543: Avatica protobuf
lkm commented on a change in pull request #10543: URL: https://github.com/apache/druid/pull/10543#discussion_r535164826 ## File path: server/src/main/java/org/apache/druid/server/AsyncQueryForwardingServlet.java ## @@ -449,6 +463,95 @@ static String getAvaticaConnectionId(Map requestMap) return (String) connectionIdObj; } + static String getAvaticaProtobufConnectionId(Service.Request request) + { +if (request instanceof Service.CatalogsRequest) { Review comment: Agreed, it's a bit fragile, however, that, I'm afraid, is the nature of protobufs. I'm not sure how likely they are to add new request types without significant overhaul to Calcite though
[GitHub] [druid] andrewbridge commented on issue #10611: Custom, multi-column aggregation best practice
andrewbridge commented on issue #10611: URL: https://github.com/apache/druid/issues/10611#issuecomment-737898313 To follow up, I've copied examples of aggregators which do not appear to use an associated `Serde` class:
```java
@Override
public List getJacksonModules()
{
  return ImmutableList.of(
      new SimpleModule(getClass().getSimpleName())
          .registerSubtypes(
              new NamedType(DoubleMinOfAggregatorFactory.class, DoubleMinOfAggregatorFactory.TYPE_NAME)
          )
          .addSerializer(
              DoubleMinOfCollector.class,
              DoubleMinOfCollector.Serializer.INSTANCE
          )
  );
}
```
This allows my tests to pass as expected. Are there any issues with using this pattern over the `Serde` class pattern? Or am I entirely misunderstanding and the two are unrelated?
[GitHub] [druid] lkm commented on a change in pull request #10543: Avatica protobuf
lkm commented on a change in pull request #10543: URL: https://github.com/apache/druid/pull/10543#discussion_r535163258 ## File path: sql/src/main/java/org/apache/druid/sql/avatica/DruidAvaticaProtobufHandler.java ## @@ -0,0 +1,62 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.sql.avatica; + +import com.google.inject.Inject; +import org.apache.calcite.avatica.remote.LocalService; +import org.apache.calcite.avatica.remote.Service; +import org.apache.calcite.avatica.server.AvaticaProtobufHandler; +import org.apache.druid.guice.annotations.Self; +import org.apache.druid.server.DruidNode; +import org.eclipse.jetty.server.Request; + +import javax.servlet.ServletException; +import javax.servlet.http.HttpServletRequest; +import javax.servlet.http.HttpServletResponse; +import java.io.IOException; + +public class DruidAvaticaProtobufHandler extends AvaticaProtobufHandler +{ + public static final String AVATICA_PATH = "/druid/v2/sql/avatica-pb/"; Review comment: The problem is the drivers. If we want to support JDBC and golang drivers, I'm not sure they would be happy with always appending a query param to the url. 
The other option I can think of that may work is to auto-detect it based on whether the first character of the POST body is `[` or `{` (i.e. JSON); that just seemed a bit dirty to me.
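[Editor's note] The auto-detect idea lkm floats above, sniffing whether a POST body is JSON by its first byte, could look roughly like the following. This is only a sketch of the heuristic being debated, not code from the PR:

```java
public class WireFormatSniff
{
  // JSON Avatica requests begin with '{' or '[', while a protobuf frame
  // effectively never does, so the first byte of the body distinguishes them.
  // A real servlet would read this byte from the request's input stream.
  static boolean looksLikeJson(byte[] body)
  {
    if (body.length == 0) {
      return false;
    }
    return body[0] == '{' || body[0] == '[';
  }

  public static void main(String[] args)
  {
    System.out.println(looksLikeJson("{\"request\":\"openConnection\"}".getBytes())); // true
    System.out.println(looksLikeJson(new byte[]{0x0a, 0x05}));                        // false
  }
}
```

The "dirty" part is that this couples routing to an undocumented property of the wire formats; a dedicated path like `/druid/v2/sql/avatica-pb/` keeps the contract explicit, which is presumably why the PR went that way.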
[GitHub] [druid] lgtm-com[bot] commented on pull request #10625: Add option for maxBytesInMemory to be percentage of JVM memory
lgtm-com[bot] commented on pull request #10625: URL: https://github.com/apache/druid/pull/10625#issuecomment-737895233 This pull request **introduces 1 alert** when merging 339a0ad0d077445808a025aa76ad343c81f6b5ee into 7e9522870f019db444b4b6eb5cb3945d6545cf7a - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-bef2480c4f13b1545ea4b350f29596bd113742f6) **new alerts:** * 1 for Useless comparison test
[GitHub] [druid] abhishekagarwal87 commented on a change in pull request #10624: Grouping id 2
abhishekagarwal87 commented on a change in pull request #10624: URL: https://github.com/apache/druid/pull/10624#discussion_r535019924 ## File path: processing/src/main/java/org/apache/druid/query/groupby/GroupByQueryQueryToolChest.java ## @@ -233,7 +235,7 @@ public GroupByQueryQueryToolChest( return groupByStrategy.processSubtotalsSpec( query, resource, -groupByStrategy.processSubqueryResult(subquery, query, resource, finalizingResults, false) +groupByStrategy.processSubqueryResult(subquery, withoutSubtotals(query), resource, finalizingResults, false) Review comment: This needs more thinking: why weren't we passing `query.withSubtotalsSpec(null)` instead of `query`?
[GitHub] [druid] abhishekagarwal87 opened a new pull request #10624: Grouping id 2
abhishekagarwal87 opened a new pull request #10624: URL: https://github.com/apache/druid/pull/10624 A different way to do #10518. This avoids the grouping aggregation on historicals by re-working the query that is sent to historicals. Historicals don't get grouping aggregation at all. However, now you have a mismatch in result row signatures that needs to be fixed up in a few places.