Re: [PR] Adding unnest tests for join precedence when using multiple joins tog… (druid)
LakshSingla commented on code in PR #14969: URL: https://github.com/apache/druid/pull/14969#discussion_r1323987372 ## sql/src/test/java/org/apache/druid/sql/calcite/CalciteArraysQueryTest.java: ## @@ -4807,4 +4808,187 @@ public void testUnnestWithGroupByHavingWithWhereOnUnnestCol() ) ); } + + @Test + public void testUnnestWithCommaButHigherPrecedenceJoinLaterAlongWithAlias() + { +// CROSS JOIN has a higher precedence over COMMA JOIN +// CALCITE would interpret this as +// (foo t1) COMMA JOIN (UNNEST(t1.dim3) CROSS JOIN foo t2) +// while validating the right, parser does not understand t1.dim3 +// will throw a table not found validation error +expectedException.expect(DruidException.class); +expectedException.expectMessage("Table 't1' not found (line [2], column [34])"); +testQuery( +"select c1 from \n" ++ "druid.foo t1, unnest(mv_to_array(t1.dim3)) as u1(c1) CROSS JOIN druid.foo t2 \n", +QUERY_CONTEXT_UNNEST, +ImmutableList.of(), +ImmutableList.of() +); + } + + @Test + public void testUnnestWithCrossJoinEarlierToEnforcePrecedence() + { +testQuery( +"select c1 from \n" ++ "druid.foo CROSS JOIN unnest(mv_to_array(dim3)) as u1(c1) CROSS JOIN druid.foo t2 \n", +QUERY_CONTEXT_UNNEST, +ImmutableList.of( +Druids.newScanQueryBuilder() + .dataSource( + join( + new QueryDataSource( + Druids.newScanQueryBuilder() +.dataSource( +UnnestDataSource.create( +new TableDataSource(CalciteTests.DATASOURCE1), + expressionVirtualColumn("j0.unnest", "\"dim3\"", ColumnType.STRING), +null +) +) + .intervals(querySegmentSpec(Filtration.eternity())) + .resultFormat(ScanQuery.ResultFormat.RESULT_FORMAT_COMPACTED_LIST) +.legacy(false) +.columns( +"__time", +"cnt", +"dim1", +"dim2", +"dim3", +"j0.unnest", +"m1", +"m2", +"unique_dim1" +) +.context(QUERY_CONTEXT_UNNEST) +.build() + ), + new QueryDataSource( + Druids.newScanQueryBuilder() +.dataSource(new TableDataSource(CalciteTests.DATASOURCE1)) + .intervals(querySegmentSpec(Filtration.eternity())) + 
.resultFormat(ScanQuery.ResultFormat.RESULT_FORMAT_COMPACTED_LIST) +.legacy(false) +.context(QUERY_CONTEXT_UNNEST) +.columns("__time", "cnt", "dim1", "dim2", "dim3", "m1", "m2", "unique_dim1") +.build() + ), + "_j0.", + "1", + JoinType.INNER, + null + ) + ) + .intervals(querySegmentSpec(Filtration.eternity())) + .resultFormat(ScanQuery.ResultFormat.RESULT_FORMAT_COMPACTED_LIST) + .legacy(false) + .columns("j0.unnest") + .context(QUERY_CONTEXT_UNNEST) + .build() +), +NullHandling.sqlCompatible() ? +ImmutableList.of( +new Object[]{"a"}, +new Object[]{"a"}, +new Object[]{"a"}, +new Object[]{"a"}, +new Object[]{"a"}, +new Object[]{"a"},

Review Comment: Refactoring comment - We should use a loop or a method that adds multiple elements at once. One concise approach could be something like:
```java
ImmutableList.Builder<Object[]> resultsBuilder = ImmutableList.builder();
resultsBuilder.addAll(Collections.nCopies(6, new Object[]{"a"}));
resultsBuilder.addAll(Collections.nCopies(12, new Object[]{"b"}));
resultsBuilder.addAll(Collections.nCopies(12, new Object[]{NullHandling.defaultStringValue()}));
```
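The reviewer's suggestion can be sketched as a small self-contained helper using `java.util.Collections.nCopies` (a plain `List` stands in for Guava's `ImmutableList` here, and the row counts are the ones from the review thread's example):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ExpectedRowsSketch
{
    // Build the expected result rows without repeating new Object[]{...} once per row.
    // The counts (6 "a" rows, 12 "b" rows, 12 default-value rows) mirror the
    // review thread's example; nCopies(n, x) yields n references to the same row.
    public static List<Object[]> expectedRows(String defaultStringValue)
    {
        List<Object[]> rows = new ArrayList<>();
        rows.addAll(Collections.nCopies(6, new Object[]{"a"}));
        rows.addAll(Collections.nCopies(12, new Object[]{"b"}));
        rows.addAll(Collections.nCopies(12, new Object[]{defaultStringValue}));
        return rows;
    }

    public static void main(String[] args)
    {
        System.out.println(expectedRows(null).size()); // prints 30
    }
}
```

Sharing one array instance per group is fine here because the rows are only read when comparing against actual query results.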
[I] Invalid segment path when using HDFS as the intermediate deepstore (druid)
yuanlihan opened a new issue, #14972: URL: https://github.com/apache/druid/issues/14972

### Affected Version

Druid 27

### Description

The cluster is deployed on k8s with the following features enabled:
1. The MM-less feature
2. HDFS as the deep store
3. `druid_processing_intermediaryData_storage_type: "deepstore"`

Steps to reproduce the problem:
1. Submit an `index_parallel` type batch task
2. Set `maxNumConcurrentSubTasks` > 1

Error logs
```
Caused by: java.lang.IllegalArgumentException: Pathname /user/druid/k8s/druid27-test/segments/shuffle-data/index_parallel_session_analysis_test_bkdhhedd_2023-09-08T03:18:21.121Z/2023-08-29T16:00:00.000Z/2023-08-29T17:00:00.000Z/0 from hdfs://R2/user/druid/k8s/druid27-test/segments/shuffle-data/index_parallel_session_analysis_test_bkdhhedd_2023-09-08T03:18:21.121Z/2023-08-29T16:00:00.000Z/2023-08-29T17:00:00.000Z/0 is not a valid DFS filename.
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:266) ~[?:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1558) ~[?:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1555) ~[?:?]
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[?:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1572) ~[?:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1547) ~[?:?]
	at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2404) ~[?:?]
	at org.apache.druid.storage.hdfs.HdfsDataSegmentPusher.pushToPath(HdfsDataSegmentPusher.java:154) ~[?:?]
	at org.apache.druid.indexing.worker.shuffle.DeepStorageIntermediaryDataManager.addSegment(DeepStorageIntermediaryDataManager.java:78) ~[druid-indexing-service-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at org.apache.druid.indexing.worker.shuffle.ShuffleDataSegmentPusher.push(ShuffleDataSegmentPusher.java:66) ~[druid-indexing-service-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$mergeAndPush$4(AppenderatorImpl.java:956) ~[druid-server-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:129) ~[druid-processing-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:81) ~[druid-processing-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:163) ~[druid-processing-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:153) ~[druid-processing-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:952) ~[druid-server-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.lambda$push$1(AppenderatorImpl.java:784) ~[druid-server-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) ~[guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861) ~[guava-16.0.1.jar:?]
	... 3 more
```
```
2023-09-08T07:07:32,491 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Exception while running task[AbstractTask{id='partial_index_generic_merge_session_analysis_test_lmhgjiib_2023-09-08T07:07:16.638Z', groupId='index_parallel_session_analysis_test_mohoklnj_2023-09-08T07:03:43.955Z', taskResource=TaskResource{availabilityGroup='partial_index_generic_merge_session_analysis_test_lmhgjiib_2023-09-08T07:07:16.638Z', requiredCapacity=1}, dataSource='session_analysis_test', context={forceTimeChunkLock=true, useLineageBasedSegmentAllocation=true}}]
java.io.IOException: org.apache.druid.segment.loading.SegmentLoadingException: Do not know how to handle file type at [hdfs://R2/user/druid/k8s/druid27-test/segments/shuffle-data/index_parallel_session_analysis_test_mohoklnj_2023-09-08T07_03_43.955Z/20230830T06.000Z_20230830T07.000Z/0/partial_index_generate_session_analysis_test_fgbidneh_2023-09-08T07_04_41.368Z]
	at org.apache.druid.indexing.common.task.batch.parallel.DeepStorageShuffleClient.fetchSegmentFile(DeepStorageShuffleClient.java:56) ~[druid-indexing-service-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at org.apache.druid.indexing.common.task.batch.parallel.DeepStorageShuffleClient.fetchSegmentFile(DeepStorageShuffleClient.java:33) ~[druid-indexing-service-27.0.0-SNAPSHOT.jar:27.0.0-SNAPSHOT]
	at
```
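The first stack trace fails inside `DistributedFileSystem.getPathName`, which rejects path components containing `:` (a colon is not a legal character in an HDFS filename), while the paths in the second log show task-ID timestamps written with `_` in place of `:`. A minimal sketch of the kind of component sanitization that would make such a timestamped path HDFS-safe (illustrative only — `sanitizeForHdfs` is a hypothetical helper, not Druid's actual code):

```java
public class HdfsPathSketch
{
    // HDFS rejects path components that contain ':' (hence the
    // "is not a valid DFS filename" error above), so ISO timestamps like
    // 2023-09-08T03:18:21.121Z cannot appear verbatim in a segment path.
    // Replacing ':' with '_' yields the style seen in the second log.
    public static String sanitizeForHdfs(String pathComponent)
    {
        return pathComponent.replace(':', '_');
    }

    public static void main(String[] args)
    {
        System.out.println(sanitizeForHdfs("2023-09-08T03:18:21.121Z"));
        // prints 2023-09-08T03_18_21.121Z
    }
}
```

The bug report suggests the shuffle-data path is built from the unsanitized interval and task ID in at least one code path, which is why only some of the logged paths carry the underscore form.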
[druid] branch master updated: use mmap for nested column value to dictionary id lookup for more chill heap usage during serialization (#14919)
This is an automated email from the ASF dual-hosted git repository. cwylie pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid.git The following commit(s) were added to refs/heads/master by this push: new 23b78c0f95 use mmap for nested column value to dictionary id lookup for more chill heap usage during serialization (#14919) 23b78c0f95 is described below commit 23b78c0f955860d51add780178ef8a4526b334db Author: Clint Wylie AuthorDate: Tue Sep 12 21:01:18 2023 -0700 use mmap for nested column value to dictionary id lookup for more chill heap usage during serialization (#14919) --- .../apache/druid/segment/AutoTypeColumnMerger.java | 3 +- .../druid/segment/NestedDataColumnMergerV4.java| 13 +- .../segment/column/StringEncodingStrategies.java | 37 +++ .../druid/segment/data/FixedIndexedWriter.java | 12 +- .../segment/data/FrontCodedIndexedWriter.java | 1 + .../druid/segment/data/GenericIndexedWriter.java | 12 +- .../druid/segment/nested/DictionaryIdLookup.java | 330 - .../nested/NestedCommonFormatColumnSerializer.java | 8 + .../segment/nested/NestedDataColumnSerializer.java | 45 ++- .../nested/NestedDataColumnSerializerV4.java | 15 +- .../segment/nested/NestedDataColumnSupplier.java | 73 + .../segment/nested/NestedDataColumnSupplierV4.java | 112 +-- .../nested/ScalarDoubleColumnSerializer.java | 24 +- .../segment/nested/ScalarLongColumnSerializer.java | 24 +- .../ScalarNestedCommonFormatColumnSerializer.java | 8 +- .../nested/ScalarStringColumnAndIndexSupplier.java | 43 +-- .../nested/ScalarStringColumnSerializer.java | 27 +- .../nested/VariantColumnAndIndexSupplier.java | 51 +--- .../segment/nested/VariantColumnSerializer.java| 44 ++- .../serde/DictionaryEncodedColumnPartSerde.java| 40 +-- .../org/apache/druid/query/DoubleStorageTest.java | 3 +- .../druid/segment/data/FixedIndexedTest.java | 29 ++ 22 files changed, 560 insertions(+), 394 deletions(-) diff --git 
a/processing/src/main/java/org/apache/druid/segment/AutoTypeColumnMerger.java b/processing/src/main/java/org/apache/druid/segment/AutoTypeColumnMerger.java index 7aadc5cd53..7c978f63fe 100644 --- a/processing/src/main/java/org/apache/druid/segment/AutoTypeColumnMerger.java +++ b/processing/src/main/java/org/apache/druid/segment/AutoTypeColumnMerger.java @@ -228,13 +228,12 @@ public class AutoTypeColumnMerger implements DimensionMergerV9 } else { // all the bells and whistles logicalType = ColumnType.NESTED_DATA; -final NestedDataColumnSerializer defaultSerializer = new NestedDataColumnSerializer( +serializer = new NestedDataColumnSerializer( name, indexSpec, segmentWriteOutMedium, closer ); -serializer = defaultSerializer; } serializer.openDictionaryWriter(); diff --git a/processing/src/main/java/org/apache/druid/segment/NestedDataColumnMergerV4.java b/processing/src/main/java/org/apache/druid/segment/NestedDataColumnMergerV4.java index 9a0dc139fb..6006dbdab2 100644 --- a/processing/src/main/java/org/apache/druid/segment/NestedDataColumnMergerV4.java +++ b/processing/src/main/java/org/apache/druid/segment/NestedDataColumnMergerV4.java @@ -57,7 +57,7 @@ public class NestedDataColumnMergerV4 implements DimensionMergerV9 private final Closer closer; private ColumnDescriptor.Builder descriptorBuilder; - private GenericColumnSerializer serializer; + private NestedDataColumnSerializerV4 serializer; public NestedDataColumnMergerV4( String name, @@ -111,13 +111,12 @@ public class NestedDataColumnMergerV4 implements DimensionMergerV9 descriptorBuilder = new ColumnDescriptor.Builder(); - final NestedDataColumnSerializerV4 defaultSerializer = new NestedDataColumnSerializerV4( + serializer = new NestedDataColumnSerializerV4( name, indexSpec, segmentWriteOutMedium, closer ); - serializer = defaultSerializer; final ComplexColumnPartSerde partSerde = ComplexColumnPartSerde.serializerBuilder() .withTypeName(NestedDataComplexTypeSerde.TYPE_NAME) @@ -127,14 +126,14 @@ public class 
NestedDataColumnMergerV4 implements DimensionMergerV9 .setHasMultipleValues(false) .addSerde(partSerde); - defaultSerializer.open(); - defaultSerializer.serializeFields(mergedFields); + serializer.open(); + serializer.serializeFields(mergedFields); int stringCardinality; int longCardinality; int doubleCardinality; if (numMergeIndex == 1) { -defaultSerializer.serializeDictionaries( +serializer.serializeDictionaries( sortedLookup.getSortedStrings(),
Re: [PR] use mmap for nested column value to dictionary id lookup for more chill heap usage during serialization (druid)
clintropolis merged PR #14919: URL: https://github.com/apache/druid/pull/14919

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org
Re: [PR] Simplify DruidCoordinatorConfig and binding of metadata cleanup duties (druid)
kfaraz commented on PR #14891: URL: https://github.com/apache/druid/pull/14891#issuecomment-1716891053

Thanks for the review, @clintropolis !!
[druid] branch master updated: Simplify DruidCoordinatorConfig and binding of metadata cleanup duties (#14891)
This is an automated email from the ASF dual-hosted git repository.

kfaraz pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid.git

The following commit(s) were added to refs/heads/master by this push:
     new 286eecad7c Simplify DruidCoordinatorConfig and binding of metadata cleanup duties (#14891)
286eecad7c is described below

commit 286eecad7cfe02369157073fd4915a7a0b480ec1
Author: Kashif Faraz
AuthorDate: Wed Sep 13 09:06:57 2023 +0530

    Simplify DruidCoordinatorConfig and binding of metadata cleanup duties (#14891)

    Changes:
    - Move the following configs from `CliCoordinator` to `DruidCoordinatorConfig`:
      - `druid.coordinator.kill.on`
      - `druid.coordinator.kill.pendingSegments.on`
      - `druid.coordinator.kill.supervisors.on`
      - `druid.coordinator.kill.rules.on`
      - `druid.coordinator.kill.audit.on`
      - `druid.coordinator.kill.datasource.on`
      - `druid.coordinator.kill.compaction.on`
    - In the Coordinator style used by historical management duties, always instantiate all the metadata cleanup duties but execute them only if enabled. In the existing code, they are instantiated only when enabled, by using optional binding with Guice.
    - Add a wrapper `MetadataManager` which contains handles to all the different metadata managers for rules, supervisors, segments, etc.
    - Add a `CoordinatorConfigManager` to simplify read and update of coordinator configs
    - Remove persistence-related methods from `CoordinatorCompactionConfig` and `CoordinatorDynamicConfig` as these are config classes.
- Remove annotations `@CoordinatorIndexingServiceDuty`, `@CoordinatorMetadataStoreManagementDuty` --- .../apache/druid/common/config/ConfigManager.java | 27 +++-- ...TestAuditManager.java => NoopAuditManager.java} | 2 +- .../druid/common/config/ConfigManagerTest.java | 4 +- .../CoordinatorIndexingServiceDuty.java| 36 -- .../CoordinatorMetadataStoreManagementDuty.java| 36 -- .../coordinator/CoordinatorCompactionConfig.java | 36 -- .../coordinator/CoordinatorConfigManager.java | 130 + .../coordinator/CoordinatorDynamicConfig.java | 18 --- .../druid/server/coordinator/DruidCoordinator.java | 102 +++- .../server/coordinator/DruidCoordinatorConfig.java | 28 + .../druid/server/coordinator/MetadataManager.java | 107 + .../server/coordinator/duty/KillAuditLog.java | 8 +- .../coordinator/duty/KillCompactionConfig.java | 66 ++- .../coordinator/duty/KillDatasourceMetadata.java | 3 +- .../druid/server/coordinator/duty/KillRules.java | 3 +- .../coordinator/duty/KillStalePendingSegments.java | 2 - .../server/coordinator/duty/KillSupervisors.java | 3 +- .../duty/KillSupervisorsCustomDuty.java| 1 + .../coordinator/duty/KillUnusedSegments.java | 3 - .../coordinator/duty/MetadataCleanupDuty.java | 21 +++- .../http/CoordinatorCompactionConfigsResource.java | 117 ++- .../http/CoordinatorDynamicConfigsResource.java| 13 +-- .../server/coordinator/DruidCoordinatorTest.java | 46 +++- .../coordinator/TestDruidCoordinatorConfig.java| 42 +++ .../server/coordinator/duty/KillAuditLogTest.java | 8 +- .../coordinator/duty/KillCompactionConfigTest.java | 33 +++--- .../simulate/CoordinatorSimulationBuilder.java | 21 ++-- .../CoordinatorCompactionConfigsResourceTest.java | 97 ++- .../java/org/apache/druid/cli/CliCoordinator.java | 74 +--- 29 files changed, 593 insertions(+), 494 deletions(-) diff --git a/processing/src/main/java/org/apache/druid/common/config/ConfigManager.java b/processing/src/main/java/org/apache/druid/common/config/ConfigManager.java index 56cbcd5eba..a9be4c6bf2 100644 --- 
a/processing/src/main/java/org/apache/druid/common/config/ConfigManager.java +++ b/processing/src/main/java/org/apache/druid/common/config/ConfigManager.java @@ -180,9 +180,9 @@ public class ConfigManager { if (newObject == null || !started) { if (newObject == null) { -return SetResult.fail(new IllegalAccessException("input obj is null"), false); +return SetResult.failure(new IllegalAccessException("input obj is null")); } else { -return SetResult.fail(new IllegalStateException("configManager is not started yet"), false); +return SetResult.failure(new IllegalStateException("configManager is not started yet")); } } @@ -197,7 +197,8 @@ public class ConfigManager MetadataCASUpdate metadataCASUpdate = createMetadataCASUpdate(key, oldValue, newBytes); boolean success = dbConnector.compareAndSwap(ImmutableList.of(metadataCASUpdate)); if (!success) { -return SetResult.fail(new IllegalStateException("Config value has changed"), true); +// Retry
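The `SetResult.fail(exception, retryable)` to `SetResult.failure(exception)` change in the diff above separates plain failures from the retryable compare-and-swap case (the `// Retry` path when the stored config value changed concurrently). A hypothetical sketch of such a result type — the names and shape here are assumptions modeled on the diff, not Druid's actual class:

```java
// Hypothetical result type distinguishing retryable from fatal config-update
// failures, loosely modeled on the SetResult changes in the diff above.
public class SetResultSketch
{
    private final Exception exception;
    private final boolean retryable;

    private SetResultSketch(Exception exception, boolean retryable)
    {
        this.exception = exception;
        this.retryable = retryable;
    }

    public static SetResultSketch ok()
    {
        return new SetResultSketch(null, false);
    }

    // Fatal: e.g. a null input object, or the manager not being started yet.
    public static SetResultSketch failure(Exception e)
    {
        return new SetResultSketch(e, false);
    }

    // Retryable: e.g. the compare-and-swap lost a race because the stored
    // config value changed between the read and the write.
    public static SetResultSketch retryableFailure(Exception e)
    {
        return new SetResultSketch(e, true);
    }

    public boolean isOk()
    {
        return exception == null;
    }

    public boolean isRetryable()
    {
        return retryable;
    }
}
```

Encoding retryability in the factory method rather than a boolean argument makes call sites self-describing, which is the readability win the diff is after.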
Re: [PR] Simplify DruidCoordinatorConfig and binding of metadata cleanup duties (druid)
kfaraz merged PR #14891: URL: https://github.com/apache/druid/pull/14891
[PR] Fix the created property in DOAP RDF file (druid)
asdf2014 opened a new pull request, #14971: URL: https://github.com/apache/druid/pull/14971

### Description

In the context of a DOAP RDF file, the `created` property typically refers to the date when the project was created. I just checked the first commit:
```bash
$ git show `git rev-list --max-parents=0 HEAD` -s --format=%cI
```
```bash
2012-10-23T12:08:07-07:00
```
Therefore, it would be better to use `2012-10-23` instead of `2023-09-08` here.

This PR has:
- [x] been self-reviewed.
   - [x] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [x] added documentation for new or modified features or behaviors.
- [x] a release note entry in the PR description.
- [x] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [x] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [x] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [x] added integration tests.
- [x] been tested in a test Druid cluster.
[druid] branch master updated: Add DOAP file for Druid (#14954)
This is an automated email from the ASF dual-hosted git repository.

suneet pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid.git

The following commit(s) were added to refs/heads/master by this push:
     new 6371721e17 Add DOAP file for Druid (#14954)
6371721e17 is described below

commit 6371721e17ef01b824ccd65adff855f0960a08c8
Author: Suneet Saldanha
AuthorDate: Tue Sep 12 17:40:21 2023 -0700

    Add DOAP file for Druid (#14954)

    The DOAP file is a standard RDF file that describes a project's metadata. The Apache Software Foundation uses DOAP files for projects to keep project listing information at https://projects.apache.org/projects.html

    This change just introduces basic information about the project. Future changes can add more information like each release that goes out. The descriptions were pulled from the website and the README in this repo.
---
 doap_Druid.rdf | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/doap_Druid.rdf b/doap_Druid.rdf
new file mode 100644
index 00..bb95885517
--- /dev/null
+++ b/doap_Druid.rdf
@@ -0,0 +1,42 @@
+<?xml version="1.0"?>
+<rdf:RDF xmlns="http://usefulinc.com/ns/doap#"
+         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
+         xmlns:asfext="http://projects.apache.org/ns/asfext#"
+         xmlns:foaf="http://xmlns.com/foaf/0.1/">
+
+  <Project rdf:about="https://druid.apache.org/">
+    <created>2023-09-08</created>
+    <license rdf:resource="https://spdx.org/licenses/Apache-2.0" />
+    <name>Apache Druid</name>
+    <homepage rdf:resource="https://druid.apache.org/" />
+    <asfext:pmc rdf:resource="https://druid.apache.org" />
+    <shortdesc>High performance, real-time analytics database for sub-second queries at scale and under load.</shortdesc>
+    <description>Apache Druid is a high performance real-time analytics database. Druid's main value add is to reduce time to insight and action.
+
+Apache Druid is designed for workflows where fast queries and ingest really matter. Druid excels at powering UIs, running operational (ad-hoc) queries, or handling high concurrency. Consider Druid as an open source alternative to data warehouses for a variety of use cases.</description>
+    <bug-database rdf:resource="https://github.com/apache/druid/issues" />
+    <mailing-list rdf:resource="https://lists.apache.org/list.html?d...@druid.apache.org" />
+    <download-page rdf:resource="https://druid.apache.org/downloads/" />
+    <programming-language>Java</programming-language>
+    <category rdf:resource="https://projects.apache.org/category/big-data" />
+    <category rdf:resource="https://projects.apache.org/category/database" />
+  </Project>
+</rdf:RDF>
Re: [PR] Add DOAP file for Druid (druid)
suneet-s merged PR #14954: URL: https://github.com/apache/druid/pull/14954
Re: [PR] Fix bug in TimedShutoffInputSourceReader where a timeout does not close all registered closeable (druid)
github-actions[bot] closed pull request #10467: Fix bug in TimedShutoffInputSourceReader where a timeout does not close all registered closeable URL: https://github.com/apache/druid/pull/10467
Re: [PR] Fix bug in TimedShutoffInputSourceReader where a timeout does not close all registered closeable (druid)
github-actions[bot] commented on PR #10467: URL: https://github.com/apache/druid/pull/10467#issuecomment-1716748865

This pull request/issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.
Re: [I] Druid 0.18 Audit Logging (druid)
github-actions[bot] closed issue #9875: Druid 0.18 Audit Logging URL: https://github.com/apache/druid/issues/9875
Re: [I] Druid 0.18 Audit Logging (druid)
github-actions[bot] commented on issue #9875: URL: https://github.com/apache/druid/issues/9875#issuecomment-1716748512

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.
[PR] Fix the uncaught exceptions when materializing results as frames (druid)
LakshSingla opened a new pull request, #14970: URL: https://github.com/apache/druid/pull/14970

### Description

When materializing the results as frames, we defer the creation of the frames in `ScanQueryQueryToolChest`, which passes through the catch-all block reserved for catching cases when we don't have the complete row signature in the query (and falls back to the old code). This PR aims to resolve that by adding the frame generation code to the try-catch block we have at the outer level.

# Key changed/added classes in this PR
* `ClientQuerySegmentWalker`

This PR has:
- [ ] been self-reviewed.
   - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
Re: [PR] Fix for schema mismatch to go down using the non vectorize path till … (druid)
somu-imply commented on PR #14924: URL: https://github.com/apache/druid/pull/14924#issuecomment-1716514800

Addressed the review comments
[druid] branch master updated: longer compatibility window for nested column format v4 (#14955)
This is an automated email from the ASF dual-hosted git repository. cwylie pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid.git The following commit(s) were added to refs/heads/master by this push: new 891f0a3fe9 longer compatibility window for nested column format v4 (#14955) 891f0a3fe9 is described below commit 891f0a3fe9942878ec6bab6ebb3f02e33a043a11 Author: Clint Wylie AuthorDate: Tue Sep 12 14:07:53 2023 -0700 longer compatibility window for nested column format v4 (#14955) changes: * add back nested column v4 serializers * 'json' schema by default still uses the newer 'nested common format' used by 'auto', but now has an optional 'formatVersion' property which can be specified to override format versions on native ingest jobs * add system config to specify default column format stuff, 'druid.indexing.formats', and property 'druid.indexing.formats.nestedColumnFormatVersion' to specify system level preferred nested column format for friendly rolling upgrades from versions which do not support the newer 'nested common format' used by 'auto' --- .../overlord/sampler/SamplerResponseTest.java | 2 +- .../druid/data/input/impl/DimensionSchema.java | 3 +- .../org/apache/druid/guice/NestedDataModule.java | 85 ++- .../druid/segment/DefaultColumnFormatConfig.java | 87 +++ .../druid/segment/NestedDataColumnHandlerV4.java | 105 .../druid/segment/NestedDataColumnIndexerV4.java | 501 + .../druid/segment/NestedDataColumnMergerV4.java| 215 .../druid/segment/NestedDataColumnSchema.java | 129 + .../incremental/IncrementalIndexAdapter.java | 13 + .../IncrementalIndexStorageAdapter.java| 9 + .../nested/NestedDataColumnSerializerV4.java | 396 + .../apache/druid/guice/NestedDataModuleTest.java | 113 .../apache/druid/query/NestedDataTestUtils.java| 30 + .../druid/query/scan/NestedDataScanQueryTest.java | 30 + .../segment/DefaultColumnFormatsConfigTest.java| 58 ++ .../segment/NestedDataColumnIndexerV4Test.java | 612 + 
.../druid/segment/NestedDataColumnSchemaTest.java | 101 .../nested/NestedDataColumnSupplierV4Test.java | 305 -- .../org/apache/druid/guice/StorageNodeModule.java | 2 + .../java/org/apache/druid/cli/DumpSegmentTest.java | 92 ++-- services/src/test/resources/nested-test-aggs.json | 6 - .../src/test/resources/nested-test-parser.json | 20 - .../druid/sql/calcite/util/SqlTestFramework.java | 2 + 23 files changed, 2788 insertions(+), 128 deletions(-) diff --git a/indexing-service/src/test/java/org/apache/druid/indexing/overlord/sampler/SamplerResponseTest.java b/indexing-service/src/test/java/org/apache/druid/indexing/overlord/sampler/SamplerResponseTest.java index 4fd1090d17..b8693b86dc 100644 --- a/indexing-service/src/test/java/org/apache/druid/indexing/overlord/sampler/SamplerResponseTest.java +++ b/indexing-service/src/test/java/org/apache/druid/indexing/overlord/sampler/SamplerResponseTest.java @@ -71,7 +71,7 @@ public class SamplerResponseTest data ) ); -String expected = "{\"numRowsRead\":1123,\"numRowsIndexed\":1112,\"logicalDimensions\":[{\"type\":\"string\",\"name\":\"dim1\",\"multiValueHandling\":\"SORTED_ARRAY\",\"createBitmapIndex\":true}],\"physicalDimensions\":[{\"type\":\"json\",\"name\":\"dim1\",\"multiValueHandling\":\"SORTED_ARRAY\",\"createBitmapIndex\":true}],\"logicalSegmentSchema\":[{\"name\":\"__time\",\"type\":\"LONG\"},{\"name\":\"dim1\",\"type\":\"STRING\"},{\"name\":\"met1\",\"type\":\"LONG\"}],\"data\":[{\"inpu [...] +String expected = "{\"numRowsRead\":1123,\"numRowsIndexed\":1112,\"logicalDimensions\":[{\"type\":\"string\",\"name\":\"dim1\",\"multiValueHandling\":\"SORTED_ARRAY\",\"createBitmapIndex\":true}],\"physicalDimensions\":[{\"type\":\"auto\",\"name\":\"dim1\",\"multiValueHandling\":\"SORTED_ARRAY\",\"createBitmapIndex\":true}],\"logicalSegmentSchema\":[{\"name\":\"__time\",\"type\":\"LONG\"},{\"name\":\"dim1\",\"type\":\"STRING\"},{\"name\":\"met1\",\"type\":\"LONG\"}],\"data\":[{\"inpu [...] 
Assert.assertEquals(expected, out); } diff --git a/processing/src/main/java/org/apache/druid/data/input/impl/DimensionSchema.java b/processing/src/main/java/org/apache/druid/data/input/impl/DimensionSchema.java index 379c8c21ac..9b692ecb7c 100644 --- a/processing/src/main/java/org/apache/druid/data/input/impl/DimensionSchema.java +++ b/processing/src/main/java/org/apache/druid/data/input/impl/DimensionSchema.java @@ -32,6 +32,7 @@ import org.apache.druid.java.util.emitter.EmittingLogger; import org.apache.druid.segment.AutoTypeColumnSchema; import org.apache.druid.segment.DimensionHandler; import org.apache.druid.segment.DimensionHandlerUtils; +import org.apache.druid.segment.NestedDataColumnSchema; import
Re: [PR] longer compatibility window for nested column format v4 (druid)
clintropolis merged PR #14955: URL: https://github.com/apache/druid/pull/14955 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
Re: [PR] longer compatibility window for nested column format v4 (druid)
clintropolis commented on code in PR #14955: URL: https://github.com/apache/druid/pull/14955#discussion_r1323574477 ## processing/src/main/java/org/apache/druid/segment/DefaultColumnFormatConfig.java: ## @@ -21,12 +21,24 @@ import com.fasterxml.jackson.annotation.JsonCreator; import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.druid.error.DruidException; import javax.annotation.Nullable; import java.util.Objects; public class DefaultColumnFormatConfig { + public static void validateNestedFormatVersion(@Nullable Integer formatVersion) + { +if (formatVersion != null) { + if (formatVersion < 4 || formatVersion > 5) { +throw DruidException.forPersona(DruidException.Persona.USER) +.ofCategory(DruidException.Category.INVALID_INPUT) +.build("Unsupported nested column format version[%s]", formatVersion); Review Comment: i think im going to skip this for now to not churn through CI again, and also seems like we should find some programmatic way to list what versions are supported so the error messaging doesn't become stale if we add another version. Also the versions are still kind of confusing so idk how I feel about exposing them too much, 4 makes sense, but 5 is a virtual version since 5 is really version 0 of the new nested common format introduced by 'auto'.
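The reviewer's idea of deriving the supported-version list programmatically, so the error text cannot go stale when a version is added, can be sketched roughly as follows. The `SUPPORTED` set, the exception type, and the message format here are illustrative stand-ins, not Druid's actual `DruidException` API:

```java
import java.util.Set;
import java.util.TreeSet;

public class NestedFormatVersionCheck
{
  // Hypothetical single source of truth for supported versions; the error message
  // below is built from this set, so adding a version cannot leave the message stale.
  static final Set<Integer> SUPPORTED = new TreeSet<>(Set.of(4, 5));

  static void validate(Integer formatVersion)
  {
    if (formatVersion != null && !SUPPORTED.contains(formatVersion)) {
      throw new IllegalArgumentException(
          "Unsupported nested column format version[" + formatVersion + "], supported versions are " + SUPPORTED
      );
    }
  }

  public static void main(String[] args)
  {
    validate(null); // null means "use the system default", so it is accepted
    validate(4);
    validate(5);
    try {
      validate(3);
    }
    catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

This keeps the validation logic and the user-facing message in one place, which is the maintainability concern raised in the comment.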
Re: [PR] Count distinct returned incorrect results without useApproximateCountDistinct (druid)
clintropolis merged PR #14748: URL: https://github.com/apache/druid/pull/14748
[druid] branch master updated: Count distinct returned incorrect results without useApproximateCountDistinct (#14748)
This is an automated email from the ASF dual-hosted git repository.

cwylie pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid.git

The following commit(s) were added to refs/heads/master by this push:
     new 5d16d0edf0 Count distinct returned incorrect results without useApproximateCountDistinct (#14748)
5d16d0edf0 is described below

commit 5d16d0edf088ac2c4fdec7f4296d2d9f1263dc7d
Author: Zoltan Haindrich
AuthorDate: Tue Sep 12 22:57:54 2023 +0200

    Count distinct returned incorrect results without useApproximateCountDistinct (#14748)

    * fix grouping engine handling of summaries when result set is empty
---
 .../java/util/common/granularity/Granularity.java  |  12 +-
 .../druid/java/util/common/guava/Sequences.java    |   1 +
 .../apache/druid/query/groupby/GroupingEngine.java |  60
 .../query/groupby/GroupByQueryRunnerTest.java      | 151
 .../druid/sql/calcite/CalciteSelectQueryTest.java  | 157 +
 5 files changed, 380 insertions(+), 1 deletion(-)

diff --git a/processing/src/main/java/org/apache/druid/java/util/common/granularity/Granularity.java b/processing/src/main/java/org/apache/druid/java/util/common/granularity/Granularity.java
index ca307886a7..572467f7c7 100644
--- a/processing/src/main/java/org/apache/druid/java/util/common/granularity/Granularity.java
+++ b/processing/src/main/java/org/apache/druid/java/util/common/granularity/Granularity.java
@@ -41,7 +41,7 @@ import java.util.regex.Pattern;
 public abstract class Granularity implements Cacheable
 {
-  public static Comparator IS_FINER_THAN = new Comparator()
+  public static final Comparator IS_FINER_THAN = new Comparator()
   {
     @Override
@@ -215,6 +215,16 @@ public abstract class Granularity implements Cacheable
     return vals;
   }

+  /**
+   * Decides whether this granularity is finer than the other granularity
+   *
+   * @return true if this {@link Granularity} is finer than the passed one
+   */
+  public boolean isFinerThan(Granularity g)
+  {
+    return IS_FINER_THAN.compare(this, g) < 0;
+  }
+
+  /**
+   *
Return an iterable of granular buckets that overlap a particular interval. * diff --git a/processing/src/main/java/org/apache/druid/java/util/common/guava/Sequences.java b/processing/src/main/java/org/apache/druid/java/util/common/guava/Sequences.java index 3dc44cb063..9f8169434a 100644 --- a/processing/src/main/java/org/apache/druid/java/util/common/guava/Sequences.java +++ b/processing/src/main/java/org/apache/druid/java/util/common/guava/Sequences.java @@ -49,6 +49,7 @@ public class Sequences return (Sequence) EMPTY_SEQUENCE; } + @SafeVarargs public static Sequence concat(Sequence... sequences) { return concat(Arrays.asList(sequences)); diff --git a/processing/src/main/java/org/apache/druid/query/groupby/GroupingEngine.java b/processing/src/main/java/org/apache/druid/query/groupby/GroupingEngine.java index 112f6ea25e..b79c4358a3 100644 --- a/processing/src/main/java/org/apache/druid/query/groupby/GroupingEngine.java +++ b/processing/src/main/java/org/apache/druid/query/groupby/GroupingEngine.java @@ -36,6 +36,7 @@ import org.apache.druid.guice.annotations.Smile; import org.apache.druid.java.util.common.Intervals; import org.apache.druid.java.util.common.StringUtils; import org.apache.druid.java.util.common.collect.Utils; +import org.apache.druid.java.util.common.granularity.Granularities; import org.apache.druid.java.util.common.granularity.Granularity; import org.apache.druid.java.util.common.guava.LazySequence; import org.apache.druid.java.util.common.guava.Sequence; @@ -66,16 +67,20 @@ import org.apache.druid.query.groupby.orderby.NoopLimitSpec; import org.apache.druid.query.spec.MultipleIntervalSegmentSpec; import org.apache.druid.segment.StorageAdapter; import org.apache.druid.segment.VirtualColumns; +import org.apache.druid.segment.join.filter.AllNullColumnSelectorFactory; import org.apache.druid.utils.CloseableUtils; import javax.annotation.Nullable; import java.nio.ByteBuffer; import java.util.ArrayList; +import java.util.Collections; import 
java.util.Comparator; import java.util.HashSet; +import java.util.Iterator; import java.util.List; import java.util.Map; import java.util.Set; +import java.util.concurrent.atomic.AtomicBoolean; import java.util.function.BinaryOperator; import java.util.stream.Collectors; @@ -436,6 +441,8 @@ public class GroupingEngine */ public Sequence applyPostProcessing(Sequence results, GroupByQuery query) { +results = wrapSummaryRowIfNeeded(query, results); + // Don't apply limit here for inner results, that will be pushed down to the BufferHashGrouper if (query.context().getBoolean(CTX_KEY_OUTERMOST, true)) { return query.postProcess(results); @@ -726,4 +733,57 @@ public class GroupingEngine return aggsAndPostAggs; } + + /** + * Wraps the sequence around
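The comparator-backed `isFinerThan` helper added in this commit follows a common pattern: define one canonical `Comparator` and express the boolean predicate through it, so the two can never disagree. A minimal sketch of that contract, using a hypothetical `Gran` enum in place of Druid's duration-based `Granularity`:

```java
import java.util.Comparator;

public class FinerThanDemo
{
  // Hypothetical miniature of Druid's Granularity: real granularities compare by
  // bucket duration, but ordinal order is enough to illustrate the contract.
  enum Gran { SECOND, MINUTE, HOUR, DAY }

  // Mirrors the patch: a comparator in which "finer" granularities sort first.
  static final Comparator<Gran> IS_FINER_THAN = Comparator.comparingInt(g -> g.ordinal());

  // The instance helper from the patch, expressed over the shared comparator:
  // strictly finer means compare(...) < 0, so equal granularities are not "finer".
  static boolean isFinerThan(Gran a, Gran b)
  {
    return IS_FINER_THAN.compare(a, b) < 0;
  }

  public static void main(String[] args)
  {
    System.out.println(isFinerThan(Gran.SECOND, Gran.DAY));
    System.out.println(isFinerThan(Gran.DAY, Gran.HOUR));
    System.out.println(isFinerThan(Gran.HOUR, Gran.HOUR));
  }
}
```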
Re: [PR] Exposing optional replaceMissingValueWith in lookup function and macros (druid)
clintropolis commented on code in PR #14956: URL: https://github.com/apache/druid/pull/14956#discussion_r1323471660 ## sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/QueryLookupOperatorConversion.java: ## @@ -91,4 +103,18 @@ public DruidExpression toDruidExpression( } ); } + + private String getReplaceMissingValueWith( + final List inputExpressions, + final PlannerContext plannerContext + ) + { +if (inputExpressions.size() > 2) { + final Expr missingValExpr = plannerContext.parseExpression(inputExpressions.get(2).getExpression()); + if (missingValExpr.isLiteral()) { +return missingValExpr.getLiteralValue().toString(); Review Comment: can this argument be null? (add a test with the argument as NULL if it parses correctly, ignore if not)
Re: [PR] Fix for schema mismatch to go down using the non vectorize path till … (druid)
clintropolis commented on code in PR #14924: URL: https://github.com/apache/druid/pull/14924#discussion_r1323478479 ## processing/src/main/java/org/apache/druid/query/aggregation/any/NumericNilVectorAggregator.java: ## @@ -69,6 +69,11 @@ public static NumericNilVectorAggregator longNilVectorAggregator() @Nullable private final Object returnValue; + public static NumericNilVectorAggregator of(Object returnValue) Review Comment: side note: I wonder if we should rename `NumericNilVectorAggregator` if we are going to be using it for non-numbers, maybe just `NilVectorAggregator`. I guess this doesn't need to be done in this PR, but it does seem kind of funny exposing it for other purposes ## processing/src/main/java/org/apache/druid/query/aggregation/first/DoubleFirstAggregatorFactory.java: ## @@ -149,7 +150,7 @@ public VectorAggregator factorizeVector( timeColumn); return new DoubleFirstVectorAggregator(timeSelector, valueSelector); } -return NumericNilVectorAggregator.doubleNilVectorAggregator(); +return NumericNilVectorAggregator.of(new SerializablePair<>(0L, NullHandling.defaultDoubleValue())); Review Comment: this could be saved as a static variable instead of making a new one each time since its not going to change over the lifetime of the service, same for all the other impls
Re: [PR] Exposing optional replaceMissingValueWith in lookup function and macros (druid)
clintropolis commented on code in PR #14956: URL: https://github.com/apache/druid/pull/14956#discussion_r1323467957 ## processing/src/test/java/org/apache/druid/query/expression/LookupExprMacroTest.java: ## @@ -0,0 +1,95 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ + +package org.apache.druid.query.expression; + +import com.google.common.collect.Lists; +import org.apache.commons.compress.utils.Sets; +import org.apache.druid.math.expr.Expr; +import org.apache.druid.math.expr.ExprEval; +import org.apache.druid.query.lookup.LookupExtractorFactoryContainer; +import org.apache.druid.query.lookup.LookupExtractorFactoryContainerProvider; +import org.junit.Assert; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.Optional; +import java.util.Set; +import java.util.stream.Collectors; + +public class LookupExprMacroTest extends MacroTestBase +{ + public LookupExprMacroTest() + { +super( +new LookupExprMacro(new LookupExtractorFactoryContainerProvider() +{ + @Override + public Set getAllLookupNames() + { +return Sets.newHashSet("test_lookup"); + } + + @Override + public Optional get(String lookupName) + { +return Optional.empty(); + } +}) +); + } + + @Test + public void testTooFewArgs() + { +expectException(IllegalArgumentException.class, "Function[lookup] requires 2 to 3 arguments"); Review Comment: i suppose if the range is 1 we could change this messaging to be like `2 or 3` but we don't have to worry about that in this PR ## processing/src/main/java/org/apache/druid/query/expression/LookupExprMacro.java: ## @@ -116,4 +126,14 @@ public void decorateCacheKeyBuilder(CacheKeyBuilder builder) return new LookupExpr(arg); } + + private String getReplaceMissingValueWith(final List args) + { +if (args.size() > 2) { + final Expr missingValExpr = args.get(2); + validationHelperCheckArgIsLiteral(missingValExpr, "third argument"); + return String.valueOf(missingValExpr.getLiteralValue()); Review Comment: hmm, sorry I didn't notice this earlier, I think this should use `Evals.asString` so that if the value is `null` it doesn't become `"null"` as in 
https://github.com/apache/druid/pull/14956/files#diff-989c5e92f71c63c48629a62588a78bec198184ff896791058509176dffe91338R54 ## sql/src/main/java/org/apache/druid/sql/calcite/expression/builtin/QueryLookupOperatorConversion.java: ## @@ -91,4 +103,18 @@ public DruidExpression toDruidExpression( } ); } + + private String getReplaceMissingValueWith( + final List inputExpressions, + final PlannerContext plannerContext + ) + { +if (inputExpressions.size() > 2) { + final Expr missingValExpr = plannerContext.parseExpression(inputExpressions.get(2).getExpression()); + if (missingValExpr.isLiteral()) { +return missingValExpr.getLiteralValue().toString(); + } Review Comment: this should maybe have a similar validation to the native layer to make sure that the argument doesn't silently become null in the event that the expression is null
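The pitfall clintropolis points at is that `String.valueOf(null)` and `Object#toString()` turn a SQL `NULL` literal into the four-character string "null". A null-preserving helper in the spirit of the suggested `Evals.asString` behaves like this (the `asString` method below is a hypothetical stand-in, not Druid's implementation):

```java
public class AsStringDemo
{
  // Hypothetical stand-in for Druid's Evals.asString: a null value stays null
  // instead of becoming the literal string "null".
  static String asString(Object o)
  {
    return o == null ? null : o.toString();
  }

  public static void main(String[] args)
  {
    Object literal = null; // models a parsed NULL literal in the third lookup argument

    // The buggy path: String.valueOf(null) produces the string "null".
    System.out.println("valueOf: " + String.valueOf(literal));

    // The safe path: null is preserved, so downstream null checks still work.
    System.out.println("asString is null: " + (asString(literal) == null));
  }
}
```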
Re: [I] LDAP Auth fails with special character in password using druid-basic-security (druid)
trtg commented on issue #12379: URL: https://github.com/apache/druid/issues/12379#issuecomment-1716277631 @abhishekagarwal87 I'm dealing with a druid instance that I don't manage (i.e. I can't change any backend configuration), all I have access to is the graphical username/password dialog that appears when I login to druid, so I can't really change the way in which I provide the password. Where exactly were you proposing the usage of druid.escalator.internalClientPassword as a JSON blob?
Re: [PR] Use task manager for task payload (druid)
churromorales commented on PR #14887: URL: https://github.com/apache/druid/pull/14887#issuecomment-1716148715 Why not just check the size of the task.json: if it's larger than MAX_SIZE, do it in deep storage; if it's smaller, just use the env? I agree 100% that Druid in general has too many configuration options, which makes it hard to remember everything you need to include. When creating this feature, the whole goal I had in mind was to have this work with as few configuration options as possible.
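The size-based routing proposed here needs no new configuration knob; a rough sketch, where `MAX_ENV_SIZE_BYTES` and the route names are hypothetical placeholders for whatever limit the runtime environment actually imposes:

```java
import java.nio.charset.StandardCharsets;

public class TaskPayloadRouting
{
  // Hypothetical threshold; the real limit would come from the environment's
  // maximum variable size, not from a Druid constant.
  static final int MAX_ENV_SIZE_BYTES = 128 * 1024;

  // Route the serialized task spec: small payloads travel in an environment
  // variable, oversized ones fall back to deep storage.
  static String routeFor(String taskJson)
  {
    int size = taskJson.getBytes(StandardCharsets.UTF_8).length;
    return size > MAX_ENV_SIZE_BYTES ? "deep-storage" : "env";
  }

  public static void main(String[] args)
  {
    System.out.println(routeFor("{\"type\":\"index_parallel\"}")); // tiny spec
    System.out.println(routeFor("x".repeat(200 * 1024)));          // oversized spec
  }
}
```

The appeal of this design is that the decision is made per task from observable data, so operators never have to configure it.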
Re: [PR] Exposing optional replaceMissingValueWith in lookup function and macros (druid)
pranavbhole commented on code in PR #14956: URL: https://github.com/apache/druid/pull/14956#discussion_r1323312706 ## docs/querying/lookups.md: ## @@ -62,6 +62,12 @@ SELECT FROM sales GROUP BY 1 ``` +The lookup function also accepts the 3rd argument called `replaceMissingValueWith` as a constant string. If your value is missing a lookup for the queried key, the lookup function returns the result value from `replaceMissingValueWith` +For example: +``` +LOOKUP(store, 'store_to_country', 'NA') +``` +If value is missing from `store_to_country` lookup for given key 'store' then it will return `NA`. Review Comment: fixed
Re: [PR] Exposing optional replaceMissingValueWith in lookup function and macros (druid)
pranavbhole commented on code in PR #14956: URL: https://github.com/apache/druid/pull/14956#discussion_r1323313568 ## processing/src/main/java/org/apache/druid/query/expression/LookupExprMacro.java: ## @@ -116,4 +126,15 @@ public void decorateCacheKeyBuilder(CacheKeyBuilder builder) return new LookupExpr(arg); } + + private String getReplaceMissingValueWith(final List args) + { +if (args.size() > 2) { + final Expr missingValExpr = args.get(2); + if (missingValExpr.isLiteral() && missingValExpr.getLiteralValue() != null) { +return missingValExpr.getLiteralValue().toString(); + } +} +return null; Review Comment: added validationHelperCheckArgIsLiteral check ## processing/src/main/java/org/apache/druid/query/expression/LookupExprMacro.java: ## @@ -54,10 +54,11 @@ public String name() @Override public Expr apply(final List args) { -validationHelperCheckArgumentCount(args, 2); +validationHelperCheckMinArgumentCount(args, 2); Review Comment: done
Re: [PR] Update Memcached Client to elasticache-java-cluster-client (druid)
pagrawal10 commented on code in PR #14827: URL: https://github.com/apache/druid/pull/14827#discussion_r1323310539 ## website/static/css/custom.css: ## @@ -99,15 +99,3 @@ article iframe { margin-right: auto; max-width: 100%; } -.getAPI { - color: #0073e6; - font-weight: bold; -} -.postAPI { - color: #00bf7d; - font-weight: bold; -} -.deleteAPI { - color: #f49200; - font-weight: bold; -} Review Comment: This was a result of running `npm run link-lint && npm run spellcheck` locally. Will revert this.
Re: [PR] Update Memcached Client to elasticache-java-cluster-client (druid)
pagrawal10 commented on code in PR #14827: URL: https://github.com/apache/druid/pull/14827#discussion_r1323308839 ## server/src/main/java/org/apache/druid/client/cache/MemcachedCustomConnectionFactoryBuilder.java: ## @@ -213,6 +204,45 @@ public long getAuthWaitTime() { return authWaitTime; } + + @Override + public SSLContext getSSLContext() + { +return sslContext == null ? super.getSSLContext() : sslContext; + } + + @Override + public String getHostnameForTlsVerification() + { +return hostnameForTlsVerification == null ? super.getHostnameForTlsVerification() : hostnameForTlsVerification; + } + @Override + public ClientMode getClientMode() + { +return clientMode == null ? super.getClientMode() : clientMode; + } + + @Override + public boolean skipTlsHostnameVerification() + { +return skipTlsHostnameVerification; + } + + @Override + public String toString() + { +// MURMUR_128 cannot be cast to DefaultHashAlgorithm +return "Failure Mode: " + getFailureMode().name() + ", Hash Algorithm: " ++ getHashAlg().toString() + " Max Reconnect Delay: " ++ getMaxReconnectDelay() + ", Max Op Timeout: " + getOperationTimeout() ++ ", Op Queue Length: " + getOpQueueLen() + ", Op Max Queue Block Time" ++ getOpQueueMaxBlockTime() + ", Max Timeout Exception Threshold: " ++ getTimeoutExceptionThreshold() + ", Read Buffer Size: " ++ getReadBufSize() + ", Transcoder: " + getDefaultTranscoder() ++ ", Operation Factory: " + getOperationFactory() + " isDaemon: " ++ isDaemon() + ", Optimized: " + shouldOptimize() + ", Using Nagle: " ++ useNagleAlgorithm() + ", KeepAlive: " + getKeepAlive() + ", SSLContext: " + getSSLContext() + ", ConnectionFactory: " + getName(); + } Review Comment: `MURMUR3_128` is not part of DefaultHashAlgorithm enum. So, that would give a ClassCastException in the original function.
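The `ClassCastException` discussed above is the generic Java hazard of down-casting an interface-typed value to one concrete implementation when others exist. A self-contained model of the failure (the `HashAlgorithm` hierarchy below is a toy stand-in for the memcached client's types, not its real API):

```java
public class HashAlgCastDemo
{
  // Toy model: the hash-algorithm type is an interface with more than one
  // enum implementation, like MURMUR3_128 living outside DefaultHashAlgorithm.
  interface HashAlgorithm {}

  enum DefaultHashAlgorithm implements HashAlgorithm { NATIVE_HASH }

  enum OtherHashAlgorithm implements HashAlgorithm { MURMUR3_128 }

  public static void main(String[] args)
  {
    HashAlgorithm alg = OtherHashAlgorithm.MURMUR3_128;
    try {
      // Legal at compile time (DefaultHashAlgorithm implements HashAlgorithm),
      // but fails at runtime because alg is an OtherHashAlgorithm constant.
      DefaultHashAlgorithm cast = (DefaultHashAlgorithm) alg;
      System.out.println(cast);
    }
    catch (ClassCastException e) {
      System.out.println("ClassCastException as expected");
    }
  }
}
```

This is why the overridden `toString()` above calls `getHashAlg().toString()` instead of casting to the enum type.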
Re: [PR] Update Memcached Client to elasticache-java-cluster-client (druid)
xvrl commented on code in PR #14827: URL: https://github.com/apache/druid/pull/14827#discussion_r1323278951 ## server/src/main/java/org/apache/druid/client/cache/MemcachedCustomConnectionFactoryBuilder.java: ## @@ -213,6 +204,45 @@ public long getAuthWaitTime() { return authWaitTime; } + + @Override + public SSLContext getSSLContext() + { +return sslContext == null ? super.getSSLContext() : sslContext; + } + + @Override + public String getHostnameForTlsVerification() + { +return hostnameForTlsVerification == null ? super.getHostnameForTlsVerification() : hostnameForTlsVerification; + } + @Override + public ClientMode getClientMode() + { +return clientMode == null ? super.getClientMode() : clientMode; + } + + @Override + public boolean skipTlsHostnameVerification() + { +return skipTlsHostnameVerification; + } + + @Override + public String toString() + { +// MURMUR_128 cannot be cast to DefaultHashAlgorithm +return "Failure Mode: " + getFailureMode().name() + ", Hash Algorithm: " ++ getHashAlg().toString() + " Max Reconnect Delay: " ++ getMaxReconnectDelay() + ", Max Op Timeout: " + getOperationTimeout() ++ ", Op Queue Length: " + getOpQueueLen() + ", Op Max Queue Block Time" ++ getOpQueueMaxBlockTime() + ", Max Timeout Exception Threshold: " ++ getTimeoutExceptionThreshold() + ", Read Buffer Size: " ++ getReadBufSize() + ", Transcoder: " + getDefaultTranscoder() ++ ", Operation Factory: " + getOperationFactory() + " isDaemon: " ++ isDaemon() + ", Optimized: " + shouldOptimize() + ", Using Nagle: " ++ useNagleAlgorithm() + ", KeepAlive: " + getKeepAlive() + ", SSLContext: " + getSSLContext() + ", ConnectionFactory: " + getName(); + } Review Comment: any reason we are overriding this?
Re: [PR] Update Memcached Client to elasticache-java-cluster-client (druid)
xvrl commented on code in PR #14827: URL: https://github.com/apache/druid/pull/14827#discussion_r1323275736 ## website/static/css/custom.css: ## @@ -99,15 +99,3 @@ article iframe { margin-right: auto; max-width: 100%; } -.getAPI { - color: #0073e6; - font-weight: bold; -} -.postAPI { - color: #00bf7d; - font-weight: bold; -} -.deleteAPI { - color: #f49200; - font-weight: bold; -} Review Comment: why are we changing this?
[PR] Adding unnest tests for join precedence when using multiple joins tog… (druid)
somu-imply opened a new pull request, #14969: URL: https://github.com/apache/druid/pull/14969

…ether with unnest

This PR has:
- [ ] been self-reviewed.
   - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md)
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
[I] Supporting "other" JVM Scripting languages (druid)
nmondal opened a new issue, #14968: URL: https://github.com/apache/druid/issues/14968

### Description
JSR-223 is pretty interesting, and allows a lot of scripting languages to be run inside the JVM. The current code almost has it; we just have to make it generic to accept not only JS but also Groovy, JRuby, Clojure, or whatever works. Copying the JavaScript-specific code and refactoring it would do fine. Naturally, the druid classpath must have the scripting engine.

### Motivation
https://en.wikipedia.org/wiki/Scripting_for_the_Java_Platform
- Same advantage as having JS
- More so because of being absolutely polyglot
- A disadvantage might be performance for dynamic languages, but we can also use Kotlin, Scala, or Java as scripts; moreover, the option to generate a compiled version of scripts does wonders for performance, since it compiles to JVM bytecode anyway.
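The generic JSR-223 entry point the issue refers to is `javax.script.ScriptEngineManager`: engines are discovered from the classpath and looked up by name, so swapping JS for Groovy or JRuby is a one-string change. A minimal sketch (whether a "js" engine is actually present depends on the JDK and the classpath, so the code handles its absence):

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

public class Jsr223Demo
{
  public static void main(String[] args) throws Exception
  {
    ScriptEngineManager manager = new ScriptEngineManager();

    // Enumerate whatever engines the classpath provides: Graal.js, Groovy, JRuby, etc.
    for (ScriptEngineFactory factory : manager.getEngineFactories()) {
      System.out.println("engine: " + factory.getEngineName());
    }

    // Lookup by name is the only language-specific piece; "groovy" or "ruby"
    // would work the same way if those engine jars were on the classpath.
    ScriptEngine engine = manager.getEngineByName("js");
    if (engine == null) {
      // Recent JDKs ship no JS engine by default.
      System.out.println("no 'js' engine available");
    } else {
      System.out.println(engine.eval("1 + 2"));
    }
  }
}
```

Because every JSR-223 engine implements the same `ScriptEngine` interface, a generalized Druid extension would only need the engine name as configuration.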
Re: [I] LDAP Auth fails with special character in password using druid-basic-security (druid)
schmichri commented on issue #12379: URL: https://github.com/apache/druid/issues/12379#issuecomment-1715857628 @abhishekagarwal87 I never answered back in the past (sorry for that), but I remember I changed the password. I can confirm the behavior @trtg describes (seeing the logs). I also remember that the password with the `:` in it was my personal password, not the internalClientPassword nor the initialAdminPassword. As far as I understood it, it wasn't a configuration issue but rather an issue with the escaping of basic auth itself, because it looked like the internal URLs followed the pattern `http://:@host:port/` and did not use a basic authorization header. A `:` in URL-based basic auth would be misinterpreted. Let me know if I can provide more information. (this time I will answer)
Re: [I] LDAP Auth fails with special character in password using druid-basic-security (druid)
abhishekagarwal87 commented on issue #12379: URL: https://github.com/apache/druid/issues/12379#issuecomment-1715828518 Are you providing the password the way I recommended in my previous comment? Did that not work?
Re: [I] Filtered Aggregator at ingestion time don't work (druid)
ShehanIshanka commented on issue #10293: URL: https://github.com/apache/druid/issues/10293#issuecomment-1715771583 Is this fixed in any latest release?
Re: [I] LDAP Auth fails with special character in password using druid-basic-security (druid)
trtg commented on issue #12379: URL: https://github.com/apache/druid/issues/12379#issuecomment-1715615089 Confirming that Druid 26.0.0 still does not support the use of the colon character in an LDAP password. I suspect it is doing some inappropriate escaping when sending the password to the LDAP server: the password sent from the Druid UI password dialog to the Druid backend looks fine, but the Druid logs show the LDAP server replying that the password is invalid. I confirmed that this error appears when the password contains a colon character and is resolved when it does not. Has anyone taken a look at this? @abhishekagarwal87
Re: [I] After formatting PostgreSQL SQL, only the comment text is kept and the /**/ markers are stripped (druid)
FrankChen021 commented on issue #14960: URL: https://github.com/apache/druid/issues/14960#issuecomment-1715608372 I think the user posted in the wrong repo. This issue should be posted to the alibaba druid repo instead.
Re: [PR] Upload docker and druid service logs as artifacts on GitHub Actions IT run failure (druid)
abhishekagarwal87 commented on code in PR #14967: URL: https://github.com/apache/druid/pull/14967#discussion_r1322900314 ## .github/workflows/reusable-revised-its.yml: ## @@ -133,3 +133,33 @@ jobs: - name: Run IT run: ${{ inputs.script }} + + - name: Collect docker logs on failure +if: ${{ failure() && steps.run-it.conclusion == 'failure' }} +uses: jwalton/gh-docker-logs@v2 +with: + dest: './docker-logs' + + - name: Tar docker logs +if: ${{ failure() && steps.run-it.conclusion == 'failure' }} +run: tar cvzf ./docker-logs.tgz ./docker-logs + + - name: Upload docker logs to GitHub +if: ${{ failure() && steps.run-it.conclusion == 'failure' }} +uses: actions/upload-artifact@master +with: + name: IT-${{ inputs.it }} docker logs (Compile=jdk${{ inputs.build_jdk }}, Run=jdk${{ inputs.runtime_jdk }}, Indexer=${{ inputs.use_indexer }}, Mysql=${{ inputs.mysql_driver }}) + path: docker-logs.tgz + + - name: Collect service logs on failure +if: ${{ failure() && steps.run-it.conclusion == 'failure' }} +run: | + docker cp middlemanager:/shared/logs/ ./service-logs Review Comment: can you add a comment that middlemanager is just randomly picked up? Also, can we not just copy the logs from a local location?
[PR] Upload docker and druid service logs as artifacts on GitHub Actions IT run failure (druid)
tejaswini-imply opened a new pull request, #14967: URL: https://github.com/apache/druid/pull/14967 With this PR, docker and druid service logs are uploaded as artifacts onto GitHub when an IT job fails so that we can later download them for investigation. https://github.com/apache/druid/assets/96047043/8a3f1140-e23e-4a0b-bfd4-50d01ae17d53
Re: [I] Calls to /druid/v2/sql intermittently failing from Management UI (druid)
abhishekagarwal87 commented on issue #14965: URL: https://github.com/apache/druid/issues/14965#issuecomment-1715485390 These columns were added recently. It is possible that one of the broker nodes is running an older version despite the upgrade. Do confirm that the broker throwing these errors is indeed running on 27.
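A toy simulation (hypothetical, not Druid code) of why a single not-yet-upgraded broker behind a round-robin router would produce exactly the every-other-request failure pattern reported in the issue: the router alternates between two brokers, and only the old one rejects queries referencing the new `sys` table columns.

```java
import java.util.List;

public class RoundRobinSketch {
    // true = broker on 27.0.0 (knows the new columns), false = old broker.
    static final List<Boolean> brokers = List.of(true, false);
    static int next = 0;

    // Returns whether the query succeeds on the broker picked this round.
    static boolean query() {
        boolean upgraded = brokers.get(next);
        next = (next + 1) % brokers.size();
        return upgraded; // old broker -> "Column 'start_time' not found"
    }

    public static void main(String[] args) {
        for (int i = 0; i < 4; i++) {
            System.out.println(query() ? "200 OK" : "400 Plan validation failed");
        }
        // With two brokers, the responses alternate: exactly every other
        // request fails, matching the screenshot in the issue.
    }
}
```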
Re: [PR] Clean up stale locks if segment allocation fails (druid)
AmatyaAvadhanula commented on code in PR #14966: URL: https://github.com/apache/druid/pull/14966#discussion_r1322795601 ## indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskLockbox.java: ## @@ -494,10 +494,11 @@ public List allocateSegments( allocateSegmentIds(dataSource, interval, skipSegmentLineageCheck, holderList.getPending()); holderList.getPending().forEach(holder -> acquireTaskLock(holder, false)); } - holderList.getPending().forEach(holder -> addTaskAndPersistLocks(holder, isTimeChunkLock)); } finally { + holderList.failHoldersForCompletedTasks(this); + holderList.clearStaleLocks(this); Review Comment: failHoldersForCompletedTasks may help prevent retries of pending requests after the task itself has stopped even when there is no exception. clearStaleLocks is not needed but I don't think it can cause any harm.
Re: [PR] Clean up stale locks if segment allocation fails (druid)
abhishekagarwal87 commented on code in PR #14966: URL: https://github.com/apache/druid/pull/14966#discussion_r1322728660 ## indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskLockbox.java: ## @@ -494,10 +494,11 @@ public List allocateSegments( allocateSegmentIds(dataSource, interval, skipSegmentLineageCheck, holderList.getPending()); holderList.getPending().forEach(holder -> acquireTaskLock(holder, false)); } - holderList.getPending().forEach(holder -> addTaskAndPersistLocks(holder, isTimeChunkLock)); } finally { + holderList.failHoldersForCompletedTasks(this); + holderList.clearStaleLocks(this); Review Comment: are these to be done even if there is no exception?
[PR] Clean up stale locks if segment allocation fails (druid)
AmatyaAvadhanula opened a new pull request, #14966: URL: https://github.com/apache/druid/pull/14966 When locks are acquired prior to a failed segment allocation request, stale locks can be left behind, causing subsequent tasks to wait until an Overlord restart occurs. This PR aims to fix this issue. ### Description In batch segment allocation, when task locks have been acquired for requests but segments could not be allocated for them, the locks must be released. This PR also tries to stop retries for segment allocation after the task has completed. Release note Fixes a bug where segment allocation failure could leave behind stale locks that block subsequent tasks. # Key changed/added classes in this PR * `TaskLockbox` This PR has: - [x] been self-reviewed. - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.) - [ ] added documentation for new or modified features or behaviors. - [ ] a release note entry in the PR description. - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links. - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/dev/license.md) - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader. - [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for [code coverage](https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met. - [ ] added integration tests. - [x] been tested in a test Druid cluster.
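A minimal sketch of the cleanup pattern this PR applies (hypothetical names, not the actual `TaskLockbox` API): locks taken for an allocation attempt are tracked, and a finally block releases the ones that were never persisted, so a failed allocation cannot strand locks until an Overlord restart.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class AllocationSketch {
    final List<ReentrantLock> acquired = new ArrayList<>();

    // Acquire a lock for the request, then try to allocate; on failure the
    // finally block releases everything this attempt took.
    boolean allocate(boolean allocationSucceeds) {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        acquired.add(lock);
        boolean persisted = false;
        try {
            if (!allocationSucceeds) {
                throw new IllegalStateException("segment allocation failed");
            }
            persisted = true; // successful locks stay held/persisted
            return true;
        } finally {
            if (!persisted) {
                // analogous to clearing stale locks: runs whether or not an
                // exception propagated, releasing locks of failed requests
                for (ReentrantLock l : acquired) {
                    l.unlock();
                }
                acquired.clear();
            }
        }
    }
}
```

The key property, and the answer to the review question above, is that the finally block runs on both the success and failure paths, so cleanup cannot be skipped by an exception thrown mid-allocation.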
[I] Calls to /druid/v2/sql intermittently failing from Management UI (druid)
jakcodex opened a new issue, #14965: URL: https://github.com/apache/druid/issues/14965 ### Affected Version 27.0.0 ### Description Apache Druid is deployed in a highly available cluster in AWS using Amazon Linux 2023 x86_64 AMI consisting of three-node master-with-zk cluster and an assortment of data/query nodes using MySQL-based metadata and S3 deep storage. The cluster presently has no segments or configured tasks. Management UI running on a 27.0.0 server experiences intermittent failures for requests to `/druid/v2/sql`. The Services page makes a request with the following payload: ```json { "query": "SELECT\n \"server\" AS \"service\",\n \"server_type\" AS \"service_type\",\n \"tier\",\n \"host\",\n \"plaintext_port\",\n \"tls_port\",\n \"curr_size\",\n \"max_size\",\n \"is_leader\",\n \"start_time\"\nFROM sys.servers\nORDER BY\n (\nCASE \"server_type\"\n WHEN 'coordinator' THEN 8\nWHEN 'overlord' THEN 7\nWHEN 'router' THEN 6\nWHEN 'broker' THEN 5\nWHEN 'historical' THEN 4\nWHEN 'indexer' THEN 3\nWHEN 'middle_manager' THEN 2\nWHEN 'peon' THEN 1\nELSE 0\n END\n ) DESC,\n \"service\" DESC" } ``` Which sometimes get the error response: `400 Bad Request` ```json { "error": "Plan validation failed", "errorMessage": "org.apache.calcite.runtime.CalciteContextException: From line 11, column 3 to line 11, column 14: Column 'start_time' not found in any table", "errorClass": "org.apache.calcite.tools.ValidationException", "host": null } ``` Unified Console home page makes the following request: ```json { "query": "SELECT\n COUNT(*) AS \"active\",\n COUNT(*) FILTER (WHERE is_available = 1) AS \"cached_on_historical\",\n COUNT(*) FILTER (WHERE is_available = 0 AND replication_factor > 0) AS \"unavailable\",\n COUNT(*) FILTER (WHERE is_realtime = 1) AS \"realtime\"\nFROM sys.segments\nWHERE is_active = 1" } ``` Which sometimes get the error response: `400 Bad Request` ```json { "error": "Plan validation failed", "errorMessage": 
"org.apache.calcite.runtime.CalciteContextException: From line 4, column 47 to line 4, column 64: Column 'replication_factor' not found in any table", "errorClass": "org.apache.calcite.tools.ValidationException", "host": null } ``` Strangely, when watching the automatic refresh requests on the Services page, the error occurs exactly every other request consistently. ![image](https://github.com/apache/druid/assets/28816370/225c8af7-c0d9-49d7-bbf8-926d68aba75e) Errors are being logged in the corresponding broker's log file. ``` 2023-09-12T07:00:42,096 WARN [sql[d2038c2c-bf10-4305-93e2-679badd03e51]] org.apache.druid.sql.http.SqlResource - Exception while processing sqlQueryId[d2038c2c-bf10-4305-93e2-679badd03e51] (SqlPlanningException{msg=org.apache.calcite.runtime.CalciteContextException: From line 11, column 3 to line 11, column 14: Column 'start_time' not found in any table, code=Plan validation failed, class=org.apache.calcite.tools.ValidationException, host=null}) ``` ``` 2023-09-12T06:57:32,797 WARN [sql[0317f126-692e-4a5e-ad76-ad422a63891a]] org.apache.druid.sql.http.SqlResource - Exception while processing sqlQueryId[0317f126-692e-4a5e-ad76-ad422a63891a] (SqlPlanningException{msg=org.apache.calcite.runtime.CalciteContextException: From line 4, column 47 to line 4, column 64: Column 'replication_factor' not found in any table, code=Plan validation failed, class=org.apache.calcite.tools.ValidationException, host=null}) ``` With all cluster nodes installed with 26.0.0 this issue is not present. With all cluster nodes installed to 27.0.0 the issue is present. If you then downgrade a query/router node to 26.0.0 the issue is no longer present on that specific node.
Re: [PR] Simplify DruidCoordinatorConfig and binding of metadata cleanup duties (druid)
clintropolis commented on code in PR #14891: URL: https://github.com/apache/druid/pull/14891#discussion_r1322530601 ## services/src/main/java/org/apache/druid/cli/CliCoordinator.java: ## @@ -248,14 +240,7 @@ public void configure(Binder binder) LifecycleModule.register(binder, Server.class); LifecycleModule.register(binder, DataSourcesResource.class); -// Binding for Set of indexing service coordinator Duty -final ConditionalMultibind conditionalIndexingServiceDutyMultibind = ConditionalMultibind.create( -properties, -binder, -CoordinatorDuty.class, -CoordinatorIndexingServiceDuty.class -); Review Comment: oops, forgot about `CoordinatorCustomDuty`, carry on
Re: [PR] Simplify DruidCoordinatorConfig and binding of metadata cleanup duties (druid)
clintropolis commented on code in PR #14891: URL: https://github.com/apache/druid/pull/14891#discussion_r1322395091 ## services/src/main/java/org/apache/druid/cli/CliCoordinator.java: ## @@ -248,14 +240,7 @@ public void configure(Binder binder) LifecycleModule.register(binder, Server.class); LifecycleModule.register(binder, DataSourcesResource.class); -// Binding for Set of indexing service coordinator Duty -final ConditionalMultibind conditionalIndexingServiceDutyMultibind = ConditionalMultibind.create( -properties, -binder, -CoordinatorDuty.class, -CoordinatorIndexingServiceDuty.class -); Review Comment: does this mean extensions can no longer provide custom coordinator duties? If so this might require a dev list thread to see if anyone is using this functionality. It does make sense to update the built-in ones to be handled in a more direct manner, but I wonder if there is some way we could leave this stuff in place to retain this functionality as well.
Re: [I] Delay reporting MSQ ingest success until segments are loaded (druid)
LakshSingla closed issue #13770: Delay reporting MSQ ingest success until segments are loaded URL: https://github.com/apache/druid/issues/13770
Re: [I] Delay reporting MSQ ingest success until segments are loaded (druid)
LakshSingla commented on issue #13770: URL: https://github.com/apache/druid/issues/13770#issuecomment-1715054711 I think https://github.com/apache/druid/pull/14322 completes all the requirements mentioned in the PR. cc: @adarshsanjeev
Re: [PR] longer compatibility window for nested column format v4 (druid)
abhishekagarwal87 commented on code in PR #14955: URL: https://github.com/apache/druid/pull/14955#discussion_r1322466938 ## processing/src/main/java/org/apache/druid/segment/DefaultColumnFormatConfig.java: ## @@ -21,12 +21,24 @@ import com.fasterxml.jackson.annotation.JsonCreator; import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.druid.error.DruidException; import javax.annotation.Nullable; import java.util.Objects; public class DefaultColumnFormatConfig { + public static void validateNestedFormatVersion(@Nullable Integer formatVersion) + { +if (formatVersion != null) { + if (formatVersion < 4 || formatVersion > 5) { +throw DruidException.forPersona(DruidException.Persona.USER) +.ofCategory(DruidException.Category.INVALID_INPUT) +.build("Unsupported nested column format version[%s]", formatVersion); Review Comment: Nit - can you add in the error message that supported values are 4 and 5? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
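A sketch of the validation with the reviewer's nit applied, so the error message names the supported versions. `IllegalArgumentException` stands in for the `DruidException` builder so the example is self-contained; the bounds mirror the diff above.

```java
public class NestedFormatVersionCheck {
    // null means "no override configured", which is always accepted.
    static void validateNestedFormatVersion(Integer formatVersion) {
        if (formatVersion != null && (formatVersion < 4 || formatVersion > 5)) {
            // The message now tells the user what IS supported, not just
            // what failed -- the reviewer's suggestion.
            throw new IllegalArgumentException(
                "Unsupported nested column format version[" + formatVersion
                + "]; supported versions are [4] and [5]");
        }
    }

    public static void main(String[] args) {
        validateNestedFormatVersion(null); // accepted: no override
        validateNestedFormatVersion(4);    // accepted
        validateNestedFormatVersion(5);    // accepted
        try {
            validateNestedFormatVersion(3);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```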
Re: [I] MariaDB as metadata storage: facing ClassNotFoundException: com.mysql.jdbc.exceptions.MySQLTransientException (druid)
LakshSingla closed issue #14936: MariaDB as metadata storage: facing ClassNotFoundException: com.mysql.jdbc.exceptions.MySQLTransientException URL: https://github.com/apache/druid/issues/14936
Re: [I] MariaDB as metadata storage: facing ClassNotFoundException: com.mysql.jdbc.exceptions.MySQLTransientException (druid)
LakshSingla commented on issue #14936: URL: https://github.com/apache/druid/issues/14936#issuecomment-1715037250 This is fixed by https://github.com/apache/druid/pull/12205. Can you please upgrade to a more recent version and check if the error persists? On a related note, the version of Druid in use is quite old, and there have been significant improvements made since then; upgrading to a recent release will both get rid of the error and let you take advantage of those improvements.