Re: [PR] only create `sqlSegmentsMetadataManager` once to speed up test (druid)
TestBoost commented on PR #15557: URL: https://github.com/apache/druid/pull/15557#issuecomment-1925151449

Thank you very much for pointing this out! We have not found any other tests like this in the server module or in other Druid modules so far. There are, however, other test classes where tests that use the same variables/resources could be split into separate test classes to make them faster; I am not sure whether that is suitable. I have also applied the changes you suggested in this pull request.
Re: [PR] only create used and unused segments once to make the test faster (druid)
TestBoost commented on PR #15533: URL: https://github.com/apache/druid/pull/15533#issuecomment-1925146292

Sorry for the late reply; I just came back from holiday. I tried to make the changes you suggested. However, I found that another test in this class, `testRetrieveUnusedSegmentsActionWithNowUsedLastUpdatedTime`, also needs `expectedUnusedSegments`. We would have to create `expectedUsedSegments` in `testRetrieveUsedSegmentsAction`, and create `expectedUnusedSegments` in both `testRetrieveUnusedSegmentsAction` and `testRetrieveUnusedSegmentsActionWithNowUsedLastUpdatedTime`. This makes the test even slower than the original: the original test takes 3.583 s, after this commit it takes 2.939 s, and after making `expectedUsedSegments` and `expectedUnusedSegments` local variables it takes 4.761 s.
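As an aside, the trade-off measured above is the classic one between class-level and method-level test fixtures. Below is a minimal JUnit 4 sketch of that trade-off; the class name and the createSegments helper are hypothetical stand-ins, not code from the Druid test suite.

    import java.util.ArrayList;
    import java.util.List;
    import org.junit.Assert;
    import org.junit.BeforeClass;
    import org.junit.Test;

    public class SharedFixtureTest
    {
      private static List<String> expectedUsedSegments;
      private static List<String> expectedUnusedSegments;

      // Stand-in for the expensive fixture-creation work.
      private static List<String> createSegments(boolean used)
      {
        List<String> segments = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
          segments.add((used ? "used-" : "unused-") + i);
        }
        return segments;
      }

      @BeforeClass
      public static void setUpClass()
      {
        // Built once for the whole class; tests that only read them stay cheap.
        expectedUsedSegments = createSegments(true);
        expectedUnusedSegments = createSegments(false);
      }

      @Test
      public void retrievesUsedSegments()
      {
        Assert.assertEquals(1000, expectedUsedSegments.size());
      }

      @Test
      public void retrievesUnusedSegments()
      {
        Assert.assertEquals(1000, expectedUnusedSegments.size());
      }
    }

Moving the two lists into each test method would re-run the creation work once per test, which is consistent with the slowdown reported above (4.761 s versus 2.939 s).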
Re: [PR] Addressing shapeshifting issues with window functions (druid)
krishnat2 commented on PR #15807: URL: https://github.com/apache/druid/pull/15807#issuecomment-1924945519

Thanks to @somu-imply and @abhishekagarwal87 for a quick fix! Really appreciate it.
Re: [PR] Add configuration for max S3 connections (druid)
github-actions[bot] closed pull request #13697: Add configuration for max S3 connections URL: https://github.com/apache/druid/pull/13697
Re: [I] Race condition when killing kafka ingestion tasks with replication > 1 (druid)
github-actions[bot] commented on issue #13705: URL: https://github.com/apache/druid/issues/13705#issuecomment-1924931106

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the d...@druid.apache.org list. Thank you for your contributions.
Re: [I] when installing druid on k8s (druid)
github-actions[bot] commented on issue #13702: URL: https://github.com/apache/druid/issues/13702#issuecomment-1924931078

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the d...@druid.apache.org list. Thank you for your contributions.
Re: [I] [FR] Jupysql integration (druid)
github-actions[bot] commented on issue #13700: URL: https://github.com/apache/druid/issues/13700#issuecomment-1924931060

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the d...@druid.apache.org list. Thank you for your contributions.
Re: [PR] Add configuration for max S3 connections (druid)
github-actions[bot] commented on PR #13697: URL: https://github.com/apache/druid/pull/13697#issuecomment-1924931050

This pull request/issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.
[PR] Bump io.grpc:grpc-netty-shaded from 1.57.2 to 1.61.1 (druid)
dependabot[bot] opened a new pull request, #15829: URL: https://github.com/apache/druid/pull/15829

Bumps io.grpc:grpc-netty-shaded (https://github.com/grpc/grpc-java) from 1.57.2 to 1.61.1.

Release notes, sourced from io.grpc:grpc-netty-shaded's releases (https://github.com/grpc/grpc-java/releases):

v1.61.1

Bug Fixes
- xds: Fix a bug in WeightedRoundRobinLoadBalancer policy that could raise NullPointerException and further cause channel panic when picking a subchannel. This bug can only be triggered when connection can not be established and the channel reports TRANSIENT_FAILURE state. (#10868, https://redirect.github.com/grpc/grpc-java/issues/10868)

v1.61.0

API Changes
- Remove unused experimental API ManagedChannelBuilder.enableFullStreamDecompression (#10744)
- api: Deprecate LoadBalancer.EMPTY_PICKER added in 1.58.0 in favor of FixedResultPicker (860b5cb1f)

New Features
- binder: Experimental support for asynchronous security policies (#10566)

Improvements
- core: reduce CompositeReadableBuffer allocation (#3279)
- core: Improve error message clarity when a channel leak is detected (201893f5e)
- util: use shared index across round_robin pickers (dca89b25b). This makes its implementation more similar to weighted_round_robin.
- xds: Implement ADS stream flow control mechanism (#10674). This limits the maximum memory consumed if the control plane sends updates more rapidly than they can be processed.

Bug Fixes
- core: Check outbound maximum message size for the compressed size in addition to the already-checked uncompressed size (#10739). Fixed the status code to be RESOURCE_EXHAUSTED instead of UNKNOWN.
- util: Fix NPE when multiple addresses are in an address group for petiole load balancer policies (#10769)
- util: Disable publishing of fixtures (8ac43dd81). The Gradle test fixtures are for use by grpc-java's internal tests.
- okhttp: Ignore known conscrypt socket close issue (#10812). This stops an exception from being thrown when a known Conscrypt synchronization issue happens.

Dependencies
- Drop support for Bazel 5 (55a9c012c). Bazel 7 is available, and Protobuf has already dropped support for Bazel 5.
- Change many compile deps to runtime deps (d6830d7f9). This reduces the transitive classes leaked into the compile classpath. In particular, grpc-core (io.grpc.internal) will be less frequently included transitively at compile time.
- Upgrade dependencies (c985797d9): Protobuf to 3.25.1, auto-value-annotations to 1.10.4, error_prone_annotations to 2.23.0, proto-google-common-protos to 2.29.0, google-cloud-logging to 3.15.14, guava to 32.1.3-android, okio to 3.4.0

Acknowledgements: @Gordiychuk, @jroper, @jyane, @ulfjack

v1.60.2

Bug Fixes
- xds: Fix a bug in WeightedRoundRobinLoadBalancer policy that could raise NullPointerException and further cause channel panic when picking a subchannel. This bug can only be triggered when connection can not be established and the channel reports TRANSIENT_FAILURE state. (#10868)

v1.60.1

Bug Fixes
- util: Fix NPE when multiple addresses in an address group for petiole load balancer policies (#10770)

... (truncated)

Commits
- dfff9a9 Bump version to 1.61.1
- df1bb36 Update README etc to reference 1.61.1
- 1abdaf3 xds: fix NPE in wrr in TF state (#10875)
- 529d0ab Bump version to 1.61.1-SNAPSHOT
- f06abeb Bump version to 1.61.0
- 7700510 Update README protoc references to 3.25.1
- c639b81 Update README etc to reference 1.61.0
Re: [PR] Bump io.grpc:grpc-netty-shaded from 1.57.2 to 1.61.0 (druid)
dependabot[bot] commented on PR #15678: URL: https://github.com/apache/druid/pull/15678#issuecomment-1924929767

Superseded by #15829.
(druid) branch dependabot/maven/io.grpc-grpc-netty-shaded-1.61.0 deleted (was bb43982ca22)
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a change to branch dependabot/maven/io.grpc-grpc-netty-shaded-1.61.0 in repository https://gitbox.apache.org/repos/asf/druid.git

 was bb43982ca22 Bump io.grpc:grpc-netty-shaded from 1.57.2 to 1.61.0

The revisions that were on this branch are still contained in other references; therefore, this change does not discard any commits from the repository.
(druid) branch dependabot/maven/io.grpc-grpc-netty-shaded-1.61.1 created (now f0e4192a3ef)
This is an automated email from the ASF dual-hosted git repository. github-bot pushed a change to branch dependabot/maven/io.grpc-grpc-netty-shaded-1.61.1 in repository https://gitbox.apache.org/repos/asf/druid.git

 at f0e4192a3ef Bump io.grpc:grpc-netty-shaded from 1.57.2 to 1.61.1

No new revisions were added by this update.
Re: [PR] Bump io.grpc:grpc-netty-shaded from 1.57.2 to 1.61.0 (druid)
dependabot[bot] closed pull request #15678: Bump io.grpc:grpc-netty-shaded from 1.57.2 to 1.61.0 URL: https://github.com/apache/druid/pull/15678
Re: [PR] Sql Single Value Aggregator for scalar queries (druid)
somu-imply commented on code in PR #15700: URL: https://github.com/apache/druid/pull/15700#discussion_r1476870634

## processing/src/main/java/org/apache/druid/query/aggregation/SingleValueAggregatorFactory.java:
## @@ -0,0 +1,206 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.query.aggregation;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonInclude;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import com.google.common.base.Preconditions;
+import org.apache.druid.error.DruidException;
+import org.apache.druid.segment.ColumnSelectorFactory;
+import org.apache.druid.segment.ColumnValueSelector;
+import org.apache.druid.segment.column.ColumnCapabilities;
+import org.apache.druid.segment.column.ColumnType;
+import org.apache.druid.segment.column.ValueType;
+
+import javax.annotation.Nullable;
+import java.util.Collections;
+import java.util.Comparator;
+import java.util.List;
+import java.util.Objects;
+
+/**
+ * This AggregatorFactory is meant to wrap the subquery used as an expression into a single value
+ * and is expected to throw an exception when the subquery results in more than one row
+ *
+ * This consumes columnType as well along with name and fieldName to pass it on to underlying
+ * {@link SingleValueBufferAggregator} to work with different ColumnTypes
+ */
+@JsonTypeName("singleValue")
+public class SingleValueAggregatorFactory extends AggregatorFactory
+{
+  @JsonProperty
+  @JsonInclude(JsonInclude.Include.NON_NULL)
+  private final String name;
+
+  @JsonProperty
+  @JsonInclude(JsonInclude.Include.NON_NULL)
+  private final String fieldName;
+
+  @JsonProperty
+  @JsonInclude(JsonInclude.Include.NON_NULL)
+  private final ColumnType columnType;
+
+  public static final int DEFAULT_MAX_STRING_SIZE = 1024;
+
+  @JsonCreator
+  public SingleValueAggregatorFactory(
+      @JsonProperty("name") String name,
+      @JsonProperty("fieldName") final String fieldName,
+      @JsonProperty("columnType") final ColumnType columnType
+  )
+  {
+    Preconditions.checkNotNull(name, "Must have a valid, non-null aggregator name");
+    Preconditions.checkNotNull(fieldName, "Must have a valid, non-null fieldName");
+
+    this.name = name;
+    this.fieldName = fieldName;
+    this.columnType = columnType;
+  }
+
+  @Override
+  public Aggregator factorize(ColumnSelectorFactory metricFactory)
+  {
+    ColumnValueSelector selector = metricFactory.makeColumnValueSelector(fieldName);
+    return new SingleValueAggregator(selector);
+  }
+
+  @Override
+  public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory)
+  {
+    ColumnValueSelector selector = metricFactory.makeColumnValueSelector(fieldName);
+    ColumnCapabilities columnCapabilities = metricFactory.getColumnCapabilities(fieldName);
+    ColumnType columnType = new ColumnType(columnCapabilities.getType(), null, null);
+    return new SingleValueBufferAggregator(selector, columnType);
+  }
+
+  @Override
+  public Comparator getComparator()
+  {
+    throw DruidException.defensive("Single Value Aggregator would not have more than one row to compare");
+  }
+
+  @Override
+  @Nullable
+  public Object combine(@Nullable Object lhs, @Nullable Object rhs)

Review Comment:
IMO this would never be called as there's nothing to group. This runs on the broker level on the result of the subquery. We should throw an exception in `getCombiningFactory` as well, because that is also not needed.
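A minimal sketch of the follow-up the reviewer suggests, assuming both methods keep using DruidException.defensive for paths that should be unreachable; this illustrates the suggestion rather than the merged implementation, and the messages are invented. The fragment belongs inside SingleValueAggregatorFactory:

      @Override
      @Nullable
      public Object combine(@Nullable Object lhs, @Nullable Object rhs)
      {
        // The scalar subquery yields at most one row, so there is never a
        // second value to merge.
        throw DruidException.defensive("Single-value aggregator has nothing to combine");
      }

      @Override
      public AggregatorFactory getCombiningFactory()
      {
        // Likewise, no combining factory should ever be requested for this aggregator.
        throw DruidException.defensive("Single-value aggregator does not support a combining factory");
      }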
Re: [PR] Extension to read and ingest Delta Lake tables (druid)
vkorukanti commented on code in PR #15755: URL: https://github.com/apache/druid/pull/15755#discussion_r1476750537

## extensions-contrib/druid-deltalake-extensions/src/main/java/org/apache/druid/delta/input/DeltaInputSourceReader.java:
## @@ -0,0 +1,138 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.delta.input;
+
+import io.delta.kernel.data.FilteredColumnarBatch;
+import io.delta.kernel.data.Row;
+import org.apache.druid.data.input.InputRow;
+import org.apache.druid.data.input.InputRowListPlusRawValues;
+import org.apache.druid.data.input.InputRowSchema;
+import org.apache.druid.data.input.InputSourceReader;
+import org.apache.druid.data.input.InputStats;
+import org.apache.druid.java.util.common.parsers.CloseableIterator;
+
+import java.io.IOException;
+import java.util.NoSuchElementException;
+
+/**
+ * A reader for the Delta Lake input source. It initializes an iterator {@link DeltaInputSourceIterator}
+ * for a subset of Delta records given by {@link FilteredColumnarBatch} and schema {@link InputRowSchema}.
+ */
+public class DeltaInputSourceReader implements InputSourceReader
+{
+  private final io.delta.kernel.utils.CloseableIterator<FilteredColumnarBatch> filteredColumnarBatchCloseableIterator;
+  private final InputRowSchema inputRowSchema;
+
+  public DeltaInputSourceReader(
+      io.delta.kernel.utils.CloseableIterator<FilteredColumnarBatch> filteredColumnarBatchCloseableIterator,
+      InputRowSchema inputRowSchema
+  )
+  {
+    this.filteredColumnarBatchCloseableIterator = filteredColumnarBatchCloseableIterator;
+    this.inputRowSchema = inputRowSchema;
+  }
+
+  @Override
+  public CloseableIterator<InputRow> read()
+  {
+    return new DeltaInputSourceIterator(filteredColumnarBatchCloseableIterator, inputRowSchema);
+  }
+
+  @Override
+  public CloseableIterator<InputRow> read(InputStats inputStats)
+  {
+    return new DeltaInputSourceIterator(filteredColumnarBatchCloseableIterator, inputRowSchema);
+  }
+
+  @Override
+  public CloseableIterator<InputRowListPlusRawValues> sample()
+  {
+    CloseableIterator<InputRow> inner = read();
+    return new CloseableIterator<InputRowListPlusRawValues>()
+    {
+      @Override
+      public void close() throws IOException
+      {
+        inner.close();
+      }
+
+      @Override
+      public boolean hasNext()
+      {
+        return inner.hasNext();
+      }
+
+      @Override
+      public InputRowListPlusRawValues next()
+      {
+        DeltaInputRow deltaInputRow = (DeltaInputRow) inner.next();
+        return InputRowListPlusRawValues.of(deltaInputRow, deltaInputRow.getRawRowAsMap());
+      }
+    };
+  }
+
+  private static class DeltaInputSourceIterator implements CloseableIterator<InputRow>
+  {
+    private final io.delta.kernel.utils.CloseableIterator<FilteredColumnarBatch> filteredColumnarBatchCloseableIterator;
+
+    private io.delta.kernel.utils.CloseableIterator<Row> currentBatch = null;
+    private final InputRowSchema inputRowSchema;
+
+    public DeltaInputSourceIterator(
+        io.delta.kernel.utils.CloseableIterator<FilteredColumnarBatch> filteredColumnarBatchCloseableIterator,
+        InputRowSchema inputRowSchema
+    )
+    {
+      this.filteredColumnarBatchCloseableIterator = filteredColumnarBatchCloseableIterator;
+      this.inputRowSchema = inputRowSchema;
+    }
+
+    @Override
+    public boolean hasNext()
+    {
+      while (currentBatch == null || !currentBatch.hasNext()) {
+        if (!filteredColumnarBatchCloseableIterator.hasNext()) {
+          return false; // No more batches or records to read!
+        }
+        currentBatch = filteredColumnarBatchCloseableIterator.next().getRows();

Review Comment:
One question: do we want to check whether `currentBatch` can return `hasNext` as true? It could contain zero rows due to the selection vector.
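For context, the while-loop shown above already copes with the zero-row case the reviewer raises: a batch whose selection vector filters out every row yields an empty row iterator, fails the hasNext() check, and the loop advances to the next batch. A compact, self-contained sketch of that control flow, using plain java.util iterators in place of the Delta Kernel types:

    import java.util.Arrays;
    import java.util.Collections;
    import java.util.Iterator;
    import java.util.List;

    public class BatchSkipDemo
    {
      public static void main(String[] args)
      {
        // Three "batches"; the middle one is empty, as if its selection
        // vector had filtered out every row.
        Iterator<List<Integer>> batches = Arrays.asList(
            Arrays.asList(1, 2),
            Collections.<Integer>emptyList(),
            Arrays.asList(3)
        ).iterator();

        Iterator<Integer> currentBatch = null;
        while (currentBatch == null || !currentBatch.hasNext()) {
          if (!batches.hasNext()) {
            System.out.println("no rows at all");
            return;
          }
          // Empty batches fail the hasNext() check above, so the loop keeps advancing.
          currentBatch = batches.next().iterator();
        }
        System.out.println("first row: " + currentBatch.next()); // prints "first row: 1"
      }
    }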
[PR] Handling cast boolean to integer (druid)
somu-imply opened a new pull request, #15828: URL: https://github.com/apache/druid/pull/15828

Previously, queries like `SELECT CAST(TRUE AS INTEGER)` would throw errors. This PR aims to support them by allowing lenient conformance.

This PR has:
- [ ] been self-reviewed.
   - [ ] using the concurrency checklist (https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] a release note entry in the PR description.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in licenses.yaml (https://github.com/apache/druid/blob/master/dev/license.md)
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage (https://github.com/apache/druid/blob/master/dev/code-review/code-coverage.md) is met.
- [ ] added integration tests.
- [ ] been tested in a test Druid cluster.
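For readers unfamiliar with the change, the query shape it enables looks like the following; the result shown is the conventional lenient-conformance mapping (TRUE to 1, FALSE to 0) and is illustrative rather than taken from the PR's tests.

    SELECT CAST(TRUE AS INTEGER);  -- expected to return 1 instead of a validation error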
Re: [PR] Addressing shapeshifting issues with window functions (druid)
somu-imply commented on PR #15807: URL: https://github.com/apache/druid/pull/15807#issuecomment-1924571571

Although the comment is there, I don't see any loss of generality in calling the super's default method for these Segments. Updated the occurrences.
Re: [PR] Updating and removing unnecessary white spaces on use case page [druid-website-src]
2bethere commented on PR #457: URL: https://github.com/apache/druid-website-src/pull/457#issuecomment-1924285125

Looks like this now (screenshot): https://github.com/apache/druid-website-src/assets/1326022/062c83c4-aa2e-4544-a743-45741e7b0e6e
Re: [PR] Updating and removing unnecessary white spaces on use case page [druid-website-src]
2bethere commented on PR #457: URL: https://github.com/apache/druid-website-src/pull/457#issuecomment-1924268383

Can do, will modify.
(druid) branch 29.0.0 updated (d27c38b59b1 -> f1c29f6904e)
This is an automated email from the ASF dual-hosted git repository. abhishekrb pushed a change to branch 29.0.0 in repository https://gitbox.apache.org/repos/asf/druid.git

 from d27c38b59b1 [Backport] docs: concurrent append copyedits (#15821)
  new b44bd34762a Add range filtering support for iceberg ingestion (#15782)
  new f1c29f6904e Update input-sources.md for fixing the warehouse path example under S3 (#15823)

The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference.

Summary of changes:
 docs/ingestion/input-sources.md                    | 13 +++-
 .../druid-iceberg-extensions/pom.xml               |  4 --
 .../apache/druid/iceberg/filter/IcebergFilter.java |  3 +-
 ...IntervalFilter.java => IcebergRangeFilter.java} | 69 +++---
 .../iceberg/filter/IcebergRangeFilterTest.java     | 68
 website/.spelling                                  |  2 +
 6 files changed, 119 insertions(+), 40 deletions(-)
 copy extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/filter/{IcebergIntervalFilter.java => IcebergRangeFilter.java} (51%)
 create mode 100644 extensions-contrib/druid-iceberg-extensions/src/test/java/org/apache/druid/iceberg/filter/IcebergRangeFilterTest.java
(druid) 02/02: Update input-sources.md for fixing the warehouse path example under S3 (#15823)
This is an automated email from the ASF dual-hosted git repository. abhishekrb pushed a commit to branch 29.0.0 in repository https://gitbox.apache.org/repos/asf/druid.git

commit f1c29f6904eb972379c2f3ecc611b4e2a87e7254
Author: Aru Raghuwanshi <35107623+aruraghuwan...@users.noreply.github.com>
AuthorDate: Thu Feb 1 23:32:05 2024 -0800

    Update input-sources.md for fixing the warehouse path example under S3 (#15823)
---
 docs/ingestion/input-sources.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/ingestion/input-sources.md b/docs/ingestion/input-sources.md
index 0adc001cd23..f34dbbfe713 100644
--- a/docs/ingestion/input-sources.md
+++ b/docs/ingestion/input-sources.md
@@ -889,7 +889,7 @@ The following is a sample spec for a S3 warehouse source:
         "namespace": "iceberg_namespace",
         "icebergCatalog": {
           "type": "hive",
-          "warehousePath": "hdfs://warehouse/path",
+          "warehousePath": "s3://warehouse/path",
           "catalogUri": "thrift://hive-metastore.x.com:8970",
           "catalogProperties": {
             "hive.metastore.connect.retries": "1",
(druid) 01/02: Add range filtering support for iceberg ingestion (#15782)
This is an automated email from the ASF dual-hosted git repository. abhishekrb pushed a commit to branch 29.0.0 in repository https://gitbox.apache.org/repos/asf/druid.git

commit b44bd34762a380fb04235e48600e8c265ebc5e65
Author: Atul Mohan
AuthorDate: Thu Feb 1 23:32:30 2024 -0800

    Add range filtering support for iceberg ingestion (#15782)

    * Add range filtering support for iceberg ingestion
    * Docs formatting
    * Spelling
---
 docs/ingestion/input-sources.md                    | 11 +++
 .../druid-iceberg-extensions/pom.xml               |  4 -
 .../apache/druid/iceberg/filter/IcebergFilter.java |  3 +-
 .../druid/iceberg/filter/IcebergRangeFilter.java   | 93 ++
 .../iceberg/filter/IcebergRangeFilterTest.java     | 68
 website/.spelling                                  |  2 +
 6 files changed, 176 insertions(+), 5 deletions(-)

diff --git a/docs/ingestion/input-sources.md b/docs/ingestion/input-sources.md
index d4693e7925a..0adc001cd23 100644
--- a/docs/ingestion/input-sources.md
+++ b/docs/ingestion/input-sources.md
@@ -1013,6 +1013,17 @@ This input source provides the following filters: `and`, `equals`, `interval`, a
 |type|Set this value to `not`.|yes|
 |filter|The iceberg filter on which logical NOT is applied|yes|
 
+`range` Filter:
+
+|Property|Description|Default|Required|
+|---|---|---|---|
+|type|Set this value to `range`.|None|yes|
+|filterColumn|The column name from the iceberg table schema based on which range filtering needs to happen.|None|yes|
+|lower|Lower bound value to match.|None|no. At least one of `lower` or `upper` must not be null.|
+|upper|Upper bound value to match.|None|no. At least one of `lower` or `upper` must not be null.|
+|lowerOpen|Boolean indicating if lower bound is open in the interval of values defined by the range (">" instead of ">=").|false|no|
+|upperOpen|Boolean indicating if upper bound is open on the interval of values defined by range ("<" instead of "<=").|false|no|
+
 ## Delta Lake input source
 
 :::info

diff --git a/extensions-contrib/druid-iceberg-extensions/pom.xml b/extensions-contrib/druid-iceberg-extensions/pom.xml
index 577d56978bc..d5be8235d4a 100644
--- a/extensions-contrib/druid-iceberg-extensions/pom.xml
+++ b/extensions-contrib/druid-iceberg-extensions/pom.xml
@@ -206,10 +206,6 @@
       org.slf4j
       slf4j-reload4j
-
-      com.google.re2j
-      re2j
-
       com.google.code.gson
       gson

diff --git a/extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/filter/IcebergFilter.java b/extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/filter/IcebergFilter.java
index 10c07cdbe24..cff8797d60b 100644
--- a/extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/filter/IcebergFilter.java
+++ b/extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/filter/IcebergFilter.java
@@ -33,7 +33,8 @@ import org.apache.iceberg.expressions.Expression;
     @JsonSubTypes.Type(name = "equals", value = IcebergEqualsFilter.class),
     @JsonSubTypes.Type(name = "and", value = IcebergAndFilter.class),
     @JsonSubTypes.Type(name = "not", value = IcebergNotFilter.class),
-    @JsonSubTypes.Type(name = "or", value = IcebergOrFilter.class)
+    @JsonSubTypes.Type(name = "or", value = IcebergOrFilter.class),
+    @JsonSubTypes.Type(name = "range", value = IcebergRangeFilter.class)
 })
 public interface IcebergFilter
 {

diff --git a/extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/filter/IcebergRangeFilter.java b/extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/filter/IcebergRangeFilter.java
new file mode 100644
index 000..2e3f42dbba9
--- /dev/null
+++ b/extensions-contrib/druid-iceberg-extensions/src/main/java/org/apache/druid/iceberg/filter/IcebergRangeFilter.java
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.druid.iceberg.filter;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import
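To make the new filter concrete, a `range` filter object of the kind the docs table above describes might look like this; the column name and bound values are invented for illustration.

    {
      "type": "range",
      "filterColumn": "event_time",
      "lower": "2024-01-01T00:00:00Z",
      "upper": "2024-02-01T00:00:00Z",
      "lowerOpen": false,
      "upperOpen": true
    }

With upperOpen set to true, this matches rows where event_time >= lower and event_time < upper.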
Re: [PR] [Backport] Add range filtering support for Iceberg ingestion and fixup typo in example (druid)
abhishekrb19 merged PR #15824: URL: https://github.com/apache/druid/pull/15824
Re: [PR] CVE Fix: Update json-path version (druid)
xvrl commented on code in PR #15772: URL: https://github.com/apache/druid/pull/15772#discussion_r1476301395

## extensions-core/orc-extensions/src/test/java/org/apache/druid/data/input/orc/OrcReaderTest.java:
## @@ -314,8 +314,8 @@ public void testJsonPathFunctions() throws IOException
     //deviation of [7,8,9] is 1/3, stddev is sqrt(1/3), approximately 0.8165
     Assert.assertEquals(0.8165, Double.parseDouble(Iterables.getOnlyElement(row.getDimension("stddev"))), 0.0001);
 
-    //append is not supported
-    Assert.assertEquals(Collections.emptyList(), row.getDimension("append"));
+    //Support for append has been added in json-path-2.9.0
+    //Assert.assertEquals(Collections.emptyList(), row.getDimension("append"));

Review Comment:
Understood, but there are two issues here:
1. As a principle, we don't leave commented code unless there's a very good reason.
2. This is potentially a breaking change. Any json-path expression using append today would have silently been ignored, but would now cause an error.

My suggestion would be to:
1. update the test to check that we do get the exception
2. make a note in the PR description that we should call this out in the release notes as a potential issue to watch out for when upgrading.
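A rough sketch of suggestion 1, assuming JUnit 4.13+'s Assert.assertThrows is available in the test module; the exception type is a placeholder to be narrowed to whatever json-path 2.9.0 actually throws, and depending on when the reader evaluates the path, the lambda may need to wrap the row read rather than the dimension access.

    // Replace the commented-out assertion with an explicit expectation of failure.
    Assert.assertThrows(
        Exception.class,                  // placeholder; narrow to the real exception type
        () -> row.getDimension("append")
    );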
(druid) branch master updated: Add QueryLifecycle#authorize for grpc-query-extension (#15816)
This is an automated email from the ASF dual-hosted git repository. abhishek pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid.git

The following commit(s) were added to refs/heads/master by this push:
     new de959e513d1 Add QueryLifecycle#authorize for grpc-query-extension (#15816)
de959e513d1 is described below

commit de959e513d120348dfbfcf3e13d265b200a7db6c
Author: Rishabh Singh <6513075+findingr...@users.noreply.github.com>
AuthorDate: Fri Feb 2 21:49:57 2024 +0530

    Add QueryLifecycle#authorize for grpc-query-extension (#15816)

    Proposal #13469
    Original PR #14024

    A new method is being added in the QueryLifecycle class to authorise a query based on the authentication result. This method is required since we authenticate the query by intercepting it in the grpc extension and pass down the authentication result.
---
 .../org/apache/druid/server/QueryLifecycle.java    | 31 +
 .../apache/druid/server/QueryLifecycleTest.java    | 74 --
 2 files changed, 85 insertions(+), 20 deletions(-)

diff --git a/server/src/main/java/org/apache/druid/server/QueryLifecycle.java b/server/src/main/java/org/apache/druid/server/QueryLifecycle.java
index 0f46e5da4d9..e0bb9875240 100644
--- a/server/src/main/java/org/apache/druid/server/QueryLifecycle.java
+++ b/server/src/main/java/org/apache/druid/server/QueryLifecycle.java
@@ -242,6 +242,37 @@ public class QueryLifecycle
     );
   }
 
+  /**
+   * Authorize the query using the authentication result.
+   * Will return an Access object denoting whether the query is authorized or not.
+   * This method is to be used by the grpc-query-extension.
+   *
+   * @param authenticationResult authentication result indicating identity of the requester
+   * @return authorization result of requester
+   */
+  public Access authorize(AuthenticationResult authenticationResult)
+  {
+    transition(State.INITIALIZED, State.AUTHORIZING);
+    final Iterable<ResourceAction> resourcesToAuthorize = Iterables.concat(
+        Iterables.transform(
+            baseQuery.getDataSource().getTableNames(),
+            AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR
+        ),
+        Iterables.transform(
+            authConfig.contextKeysToAuthorize(userContextKeys),
+            contextParam -> new ResourceAction(new Resource(contextParam, ResourceType.QUERY_CONTEXT), Action.WRITE)
+        )
+    );
+    return doAuthorize(
+        authenticationResult,
+        AuthorizationUtils.authorizeAllResourceActions(
+            authenticationResult,
+            resourcesToAuthorize,
+            authorizerMapper
+        )
+    );
+  }
+
   private void preAuthorized(final AuthenticationResult authenticationResult, final Access access)
   {
     // gotta transition those states, even if we are already authorized

diff --git a/server/src/test/java/org/apache/druid/server/QueryLifecycleTest.java b/server/src/test/java/org/apache/druid/server/QueryLifecycleTest.java
index 578661ea768..a2691297d0b 100644
--- a/server/src/test/java/org/apache/druid/server/QueryLifecycleTest.java
+++ b/server/src/test/java/org/apache/druid/server/QueryLifecycleTest.java
@@ -188,15 +188,15 @@ public class QueryLifecycleTest
     EasyMock.expect(authenticationResult.getIdentity()).andReturn(IDENTITY).anyTimes();
     EasyMock.expect(authenticationResult.getAuthorizerName()).andReturn(AUTHORIZER).anyTimes();
     EasyMock.expect(authorizer.authorize(authenticationResult, new Resource(DATASOURCE, ResourceType.DATASOURCE), Action.READ))
-            .andReturn(Access.OK);
+            .andReturn(Access.OK).times(2);
     EasyMock.expect(authorizer.authorize(authenticationResult, new Resource("foo", ResourceType.QUERY_CONTEXT), Action.WRITE))
-            .andReturn(Access.OK);
+            .andReturn(Access.OK).times(2);
     EasyMock.expect(authorizer.authorize(authenticationResult, new Resource("baz", ResourceType.QUERY_CONTEXT), Action.WRITE))
-            .andReturn(Access.OK);
+            .andReturn(Access.OK).times(2);
 
     EasyMock.expect(toolChestWarehouse.getToolChest(EasyMock.anyObject()))
             .andReturn(toolChest)
-            .once();
+            .times(2);
 
     replayAll();
 
@@ -223,6 +223,10 @@ public class QueryLifecycleTest
     );
 
     Assert.assertTrue(lifecycle.authorize(mockRequest()).isAllowed());
+
+    lifecycle = createLifecycle(authConfig);
+    lifecycle.initialize(query);
+    Assert.assertTrue(lifecycle.authorize(authenticationResult).isAllowed());
   }
 
   @Test
@@ -232,13 +236,15 @@ public class QueryLifecycleTest
     EasyMock.expect(authenticationResult.getIdentity()).andReturn(IDENTITY).anyTimes();
     EasyMock.expect(authenticationResult.getAuthorizerName()).andReturn(AUTHORIZER).anyTimes();
     EasyMock.expect(authorizer.authorize(authenticationResult, new Resource(DATASOURCE, ResourceType.DATASOURCE), Action.READ))
-            .andReturn(Access.OK);
+
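A sketch of how a caller such as the grpc-query-extension might use the new method, following the pattern in the test above; the surrounding setup (obtaining the factory, the query, and the authentication result) is elided, and the error handling shown is illustrative.

    // Authorize a query using an authentication result obtained by the
    // extension's own interceptor, rather than an HttpServletRequest.
    QueryLifecycle lifecycle = queryLifecycleFactory.factorize();
    lifecycle.initialize(query);
    Access access = lifecycle.authorize(authenticationResult);
    if (!access.isAllowed()) {
      throw new ForbiddenException(access.toString());
    }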
Re: [PR] Add QueryLifecycle#authorize for grpc-query-extension (druid)
abhishekagarwal87 merged PR #15816: URL: https://github.com/apache/druid/pull/15816
Re: [PR] [Backport] Strict window frame checks (druid)
kgyrtkirk closed pull request #15826: [Backport] Strict window frame checks URL: https://github.com/apache/druid/pull/15826 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[PR] [Backport] Strict window frame checks (druid)
kgyrtkirk opened a new pull request, #15827: URL: https://github.com/apache/druid/pull/15827 Backport of #15746 to 29.0.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[PR] [Backport] Strict window frame checks (druid)
kgyrtkirk opened a new pull request, #15826: URL: https://github.com/apache/druid/pull/15826 Backport of #15746 to 28.0.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
(druid) branch master updated: Strict window frame checks (#15746)
This is an automated email from the ASF dual-hosted git repository. abhishek pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/druid.git The following commit(s) were added to refs/heads/master by this push: new 8f5b7522c72 Strict window frame checks (#15746) 8f5b7522c72 is described below commit 8f5b7522c72f33afdd36c7b70e4de85f455d65a1 Author: Zoltan Haindrich AuthorDate: Fri Feb 2 11:51:53 2024 +0100 Strict window frame checks (#15746) introduce checks to ensure that window frame is supported added check to ensure that no expressions are set as bounds added logic to detect following/following like cases - described in Window function fails to demarcate if 2 following are used #15739 currently RANGE frames are only supported correctly if both endpoints are unbounded or current row Offset based window range support #15767 added windowingStrictValidation context key to provide a way to override the check --- .../java/org/apache/druid/query/QueryContext.java | 10 +++ .../java/org/apache/druid/query/QueryContexts.java | 3 + .../org/apache/druid/query/QueryContextsTest.java | 9 ++ .../sql/calcite/planner/DruidSqlValidator.java | 99 +- .../druid/sql/calcite/planner/PlannerContext.java | 2 +- .../druid/sql/calcite/BaseCalciteQueryTest.java| 10 +++ .../apache/druid/sql/calcite/CalciteQueryTest.java | 46 ++ .../druid/sql/calcite/CalciteWindowQueryTest.java | 7 +- .../druid/sql/calcite/DrillWindowQueryTest.java| 2 +- .../calcite/expression/ExpressionTestHelper.java | 4 +- 10 files changed, 185 insertions(+), 7 deletions(-) diff --git a/processing/src/main/java/org/apache/druid/query/QueryContext.java b/processing/src/main/java/org/apache/druid/query/QueryContext.java index 6fa73f0b65c..3364512052b 100644 --- a/processing/src/main/java/org/apache/druid/query/QueryContext.java +++ b/processing/src/main/java/org/apache/druid/query/QueryContext.java @@ -27,6 +27,7 @@ import org.apache.druid.query.QueryContexts.Vectorize; import org.apache.druid.segment.QueryableIndexStorageAdapter; import javax.annotation.Nullable; + import java.io.IOException; import java.util.Collections; import java.util.Map; @@ -582,6 +583,15 @@ public class QueryContext ); } + public boolean isWindowingStrictValidation() + { +return getBoolean( +QueryContexts.WINDOWING_STRICT_VALIDATION, +QueryContexts.DEFAULT_WINDOWING_STRICT_VALIDATION +); + } + + public String getBrokerServiceName() { return getString(QueryContexts.BROKER_SERVICE_NAME); diff --git a/processing/src/main/java/org/apache/druid/query/QueryContexts.java b/processing/src/main/java/org/apache/druid/query/QueryContexts.java index 7d39edfde66..0359b13e2f0 100644 --- a/processing/src/main/java/org/apache/druid/query/QueryContexts.java +++ b/processing/src/main/java/org/apache/druid/query/QueryContexts.java @@ -85,6 +85,8 @@ public class QueryContexts public static final String SERIALIZE_DATE_TIME_AS_LONG_INNER_KEY = "serializeDateTimeAsLongInner"; public static final String UNCOVERED_INTERVALS_LIMIT_KEY = "uncoveredIntervalsLimit"; public static final String MIN_TOP_N_THRESHOLD = "minTopNThreshold"; + public static final String WINDOWING_STRICT_VALIDATION = "windowingStrictValidation"; + // SQL query context keys public static final String CTX_SQL_QUERY_ID = BaseQuery.SQL_QUERY_ID; @@ -117,6 +119,7 @@ public class QueryContexts public static final boolean DEFAULT_ENABLE_DEBUG = false; public static final int DEFAULT_IN_SUB_QUERY_THRESHOLD = Integer.MAX_VALUE; public static final boolean DEFAULT_ENABLE_TIME_BOUNDARY_PLANNING = false; + 
public static final boolean DEFAULT_WINDOWING_STRICT_VALIDATION = true; @SuppressWarnings("unused") // Used by Jackson serialization public enum Vectorize diff --git a/processing/src/test/java/org/apache/druid/query/QueryContextsTest.java b/processing/src/test/java/org/apache/druid/query/QueryContextsTest.java index 38b5384ded9..09496ece9f2 100644 --- a/processing/src/test/java/org/apache/druid/query/QueryContextsTest.java +++ b/processing/src/test/java/org/apache/druid/query/QueryContextsTest.java @@ -151,6 +151,15 @@ public class QueryContextsTest ); } + @Test + public void testDefaultWindowingStrictValidation() + { +Assert.assertEquals( +QueryContexts.DEFAULT_WINDOWING_STRICT_VALIDATION, +QueryContext.empty().isWindowingStrictValidation() +); + } + @Test public void testGetEnableJoinLeftScanDirect() { diff --git a/sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidSqlValidator.java b/sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidSqlValidator.java index 34f3c7410f3..390ddf96baf 100644 --- a/sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidSqlValidator.java +++ b/sql/src/main/java/org/apache/druid/sql/calcite/planner/DruidSqlValidator.java
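As a quick illustration of the new knob from the caller's side — the key name and its default come straight from the diff above, and `QueryContext.of`/`QueryContext.empty` are existing factories on the class being patched:

```java
import com.google.common.collect.ImmutableMap;
import org.apache.druid.query.QueryContext;

public class WindowingValidationExample
{
  public static void main(String[] args)
  {
    // Strict window-frame validation is on by default
    // (DEFAULT_WINDOWING_STRICT_VALIDATION = true).
    System.out.println(QueryContext.empty().isWindowingStrictValidation()); // true

    // Setting the new "windowingStrictValidation" context key to false lets a
    // query opt out of the added frame checks.
    QueryContext relaxed = QueryContext.of(
        ImmutableMap.<String, Object>of("windowingStrictValidation", false)
    );
    System.out.println(relaxed.isWindowingStrictValidation()); // false
  }
}
```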
Re: [PR] Strict window frame checks (druid)
abhishekagarwal87 merged PR #15746: URL: https://github.com/apache/druid/pull/15746 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
Re: [PR] fix: handle BOOKMARK events in kubernetes pod discovery (druid)
lkm commented on code in PR #15819: URL: https://github.com/apache/druid/pull/15819#discussion_r1475854162 ## extensions-core/kubernetes-extensions/src/main/java/org/apache/druid/k8s/discovery/DefaultK8sApiClient.java: ## @@ -131,7 +131,7 @@ public boolean hasNext() throws SocketTimeoutException try { while (watch.hasNext()) { Watch.Response item = watch.next(); - if (item != null && item.type != null) { + if (item != null && item.type != null && !item.type.equals(WatchResult.BOOKMARK)) { Review Comment: We're actually not totally ignoring the result: we update the resourceVersion, which helps reduce load on the Kubernetes API, so it does make sense to receive those events. For details, see https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks and https://github.com/apache/druid/blob/2e46a980245bc1f2694dd21af9425855e78632f9/extensions-core/kubernetes-extensions/src/main/java/org/apache/druid/k8s/discovery/K8sDruidNodeDiscoveryProvider.java#L282 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
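A rough sketch of the behavior described in that comment — consume BOOKMARK events to refresh the stored resourceVersion without surfacing them as discovery changes. This is not the extension's actual code; the class, field, and method names are made up, and only `Watch.Response` and its `type`/`object` fields come from the Kubernetes Java client:

```java
import io.kubernetes.client.openapi.models.V1Pod;
import io.kubernetes.client.util.Watch;

import java.util.function.Consumer;

// Illustrative sketch only; the real logic lives in DefaultK8sApiClient and
// K8sDruidNodeDiscoveryProvider.
public class BookmarkAwarePodWatcher
{
  private String lastResourceVersion;

  public void drain(Watch<V1Pod> watch, Consumer<Watch.Response<V1Pod>> discoveryHandler)
  {
    for (Watch.Response<V1Pod> item : watch) {
      if (item == null || item.type == null) {
        continue;
      }
      if ("BOOKMARK".equals(item.type)) {
        // A bookmark carries no pod change, only a fresh resourceVersion; storing
        // it lets the next re-watch resume from a recent point (less load on the
        // Kubernetes API) without emitting a spurious node add/remove event.
        lastResourceVersion = item.object.getMetadata().getResourceVersion();
        continue;
      }
      discoveryHandler.accept(item);
    }
  }
}
```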
Re: [PR] Add export capabilities to MSQ with SQL syntax (druid)
adarshsanjeev commented on code in PR #15689: URL: https://github.com/apache/druid/pull/15689#discussion_r1475852597 ## docs/multi-stage-query/reference.md: ## @@ -90,6 +93,66 @@ can precede the column list: `EXTEND (timestamp VARCHAR...)`. For more information, see [Read external data with EXTERN](concepts.md#read-external-data-with-extern). + `EXTERN` to export to a destination + +`EXTERN` can be used to specify a destination, where the data needs to be exported. +This variation of EXTERN requires one argument, the details of the destination as specified below. +This variation additionally requires an `AS` clause to specify the format of the exported rows. + +INSERT statements and REPLACE statements are both supported with an `EXTERN` destination. +Only `CSV` format is supported at the moment. +Please note that partitioning (`PARTITIONED BY`) and clustering (`CLUSTERED BY`) is not currently supported with export statements. + +Export statements support the context parameter `rowsPerPage` for the number of rows in each exported file. The default value +is 100,000. + +INSERT statements append the results to the existing files at the destination. +```sql +INSERT INTO + EXTERN() +AS CSV +SELECT + +FROM +``` + +REPLACE statements have an additional OVERWRITE clause. As partitioning is not yet supported, only `OVERWRITE ALL` +is allowed. REPLACE deletes any currently existing files at the specified directory, and creates new files with the results of the query. + + +```sql +REPLACE INTO + EXTERN() +AS CSV +OVERWRITE ALL +SELECT + +FROM +``` + +Exporting is currently supported for Amazon S3 storage. This can be done passing the function `S3()` as an argument to the `EXTERN` function. The `druid-s3-extensions` should be loaded. Review Comment: Going through the currently used syntax, it would make sense to use `FN(x=>'a')` for export as well, so this should . I've also added a line about using single quotes to the docs. The functions would not allow named parameters. I've added this information to the docs as well. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
Re: [PR] Add export capabilities to MSQ with SQL syntax (druid)
adarshsanjeev commented on code in PR #15689: URL: https://github.com/apache/druid/pull/15689#discussion_r1475840275 ## docs/multi-stage-query/reference.md: ## @@ -90,6 +93,66 @@ can precede the column list: `EXTEND (timestamp VARCHAR...)`. For more information, see [Read external data with EXTERN](concepts.md#read-external-data-with-extern). + `EXTERN` to export to a destination + +`EXTERN` can be used to specify a destination, where the data needs to be exported. +This variation of EXTERN requires one argument, the details of the destination as specified below. +This variation additionally requires an `AS` clause to specify the format of the exported rows. + +INSERT statements and REPLACE statements are both supported with an `EXTERN` destination. +Only `CSV` format is supported at the moment. +Please note that partitioning (`PARTITIONED BY`) and clustering (`CLUSTERED BY`) is not currently supported with export statements. + +Export statements support the context parameter `rowsPerPage` for the number of rows in each exported file. The default value +is 100,000. + +INSERT statements append the results to the existing files at the destination. +```sql +INSERT INTO + EXTERN() +AS CSV +SELECT + +FROM +``` + +REPLACE statements have an additional OVERWRITE clause. As partitioning is not yet supported, only `OVERWRITE ALL` +is allowed. REPLACE deletes any currently existing files at the specified directory, and creates new files with the results of the query. + + +```sql +REPLACE INTO + EXTERN() +AS CSV +OVERWRITE ALL +SELECT + +FROM +``` + +Exporting is currently supported for Amazon S3 storage. This can be done passing the function `S3()` as an argument to the `EXTERN` function. The `druid-s3-extensions` should be loaded. + +```sql +INSERT INTO + EXTERN(S3(bucket=<...>, prefix=<...>, tempDir=<...>)) Review Comment: Changed the example here -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
Re: [PR] Sql Single Value Aggregator for scalar queries (druid)
LakshSingla commented on code in PR #15700: URL: https://github.com/apache/druid/pull/15700#discussion_r1475661119 ## processing/src/main/java/org/apache/druid/query/aggregation/SingleValueAggregatorFactory.java: ## @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.query.aggregation; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonInclude; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; +import com.google.common.base.Preconditions; +import org.apache.druid.error.DruidException; +import org.apache.druid.segment.ColumnSelectorFactory; +import org.apache.druid.segment.ColumnValueSelector; +import org.apache.druid.segment.column.ColumnCapabilities; +import org.apache.druid.segment.column.ColumnType; +import org.apache.druid.segment.column.ValueType; + +import javax.annotation.Nullable; +import java.util.Collections; +import java.util.Comparator; +import java.util.List; +import java.util.Objects; + +/** + * This AggregatorFactory is meant to wrap the subquery used as an expression into a single value + * and is expected to throw an exception when the subquery results in more than one row + * + * + * This consumes columnType as well along with name and fieldName to pass it on to underlying + * {@link SingleValueBufferAggregator} to work with different ColumnTypes + */ +@JsonTypeName("singleValue") +public class SingleValueAggregatorFactory extends AggregatorFactory +{ + + @JsonProperty + @JsonInclude(JsonInclude.Include.NON_NULL) + private final String name; + + @JsonProperty + @JsonInclude(JsonInclude.Include.NON_NULL) + private final String fieldName; + + @JsonProperty + @JsonInclude(JsonInclude.Include.NON_NULL) + private final ColumnType columnType; + + public static final int DEFAULT_MAX_STRING_SIZE = 1024; + + @JsonCreator + public SingleValueAggregatorFactory( + @JsonProperty("name") String name, + @JsonProperty("fieldName") final String fieldName, + @JsonProperty("columnType") final ColumnType columnType + ) + { +Preconditions.checkNotNull(name, "Must have a valid, non-null aggregator name"); +Preconditions.checkNotNull(fieldName, "Must have a valid, non-null fieldName"); + +this.name = name; +this.fieldName = fieldName; +this.columnType = columnType; + } + + @Override + public Aggregator factorize(ColumnSelectorFactory metricFactory) + { +ColumnValueSelector selector = metricFactory.makeColumnValueSelector(fieldName); +return new SingleValueAggregator(selector); + } + + @Override + public BufferAggregator factorizeBuffered(ColumnSelectorFactory metricFactory) + { +ColumnValueSelector selector = metricFactory.makeColumnValueSelector(fieldName); +ColumnCapabilities 
columnCapabilities = metricFactory.getColumnCapabilities(fieldName); +ColumnType columnType = new ColumnType(columnCapabilities.getType(), null, null); +return new SingleValueBufferAggregator(selector, columnType); + } + + @Override + public Comparator getComparator() + { +throw DruidException.defensive("Single Value Aggregator would not have more than one row to compare"); + } + + @Override + @Nullable + public Object combine(@Nullable Object lhs, @Nullable Object rhs) Review Comment: What would happen if `lhs` represents something empty (i.e. it hasn't encountered a single row) while `rhs` has encountered a single row? In that case `rhs` should be returned. Similarly for the cases where `lhs` represents a single row while `rhs` is empty, or where both are empty. I am not sure if this will ever be called if one of the sides represents "something empty", hence the confusion. This is also at odds with `getCombiningFactory`, since that doesn't throw upon being called and returns a correct implementation; I assumed that both should have the same semantics. The calling patterns might be such that it's okay to not implement `.combine()` and throw an exception here, and if so, I think there should be a comment explaining why. -- This is
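To make the question concrete, here is a hedged sketch of the "empty side yields the other side" semantics being asked about. This is not the PR's implementation (which throws defensively), and it uses null as a stand-in for "has not seen a row" — a simplification, since in a real aggregator null is itself a legal row value and would need a distinct marker:

```java
import javax.annotation.Nullable;

public class SingleValueCombineSketch
{
  // Illustrative combine(): an empty side yields the other side; two non-empty
  // sides mean the scalar subquery produced more than one row, which is an error.
  @Nullable
  public static Object combine(@Nullable Object lhs, @Nullable Object rhs)
  {
    if (lhs == null) {
      return rhs; // lhs saw no rows: whatever rhs holds wins
    }
    if (rhs == null) {
      return lhs; // rhs saw no rows: keep lhs
    }
    // Both sides already hold a value: more than one row reached the aggregator.
    throw new IllegalStateException("Subquery expression returned more than one row");
  }
}
```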
[PR] [Backport] Resolve CVE issues (druid)
LakshSingla opened a new pull request, #15825: URL: https://github.com/apache/druid/pull/15825 Backport of #15814 to 29.0.0. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
Re: [PR] Add export capabilities to MSQ with SQL syntax (druid)
adarshsanjeev commented on code in PR #15689: URL: https://github.com/apache/druid/pull/15689#discussion_r1475764572 ## extensions-core/multi-stage-query/src/main/java/org/apache/druid/msq/exec/ControllerImpl.java: ## @@ -1872,6 +1871,44 @@ private static QueryDefinition makeQueryDefinition( } else { return queryDef; } +} else if (querySpec.getDestination() instanceof ExportMSQDestination) { + final ExportMSQDestination exportMSQDestination = (ExportMSQDestination) querySpec.getDestination(); + final StorageConnectorProvider storageConnectorProvider = exportMSQDestination.getStorageConnectorProvider(); + + final ResultFormat resultFormat = exportMSQDestination.getResultFormat(); + + // If the statement is a 'REPLACE' statement, delete the existing files at the destination. + if (exportMSQDestination.getReplaceTimeChunks() != null) { +if (Intervals.ONLY_ETERNITY.equals(exportMSQDestination.getReplaceTimeChunks())) { + StorageConnector storageConnector = storageConnectorProvider.get(); + try { +storageConnector.deleteRecursively(""); Review Comment: Instead of removing the delete call, we can add a directory like `druid-export/` to the path, and delete it for REPLACE. This way, we would not delete any other files unless the user stores them in a directory with the same name. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
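A minimal sketch of that suggestion, under stated assumptions: the `druid-export` directory name is hypothetical (it is only proposed in the discussion), while `deleteRecursively` is the existing `StorageConnector` call quoted in the diff above.

```java
import java.io.IOException;

import org.apache.druid.storage.StorageConnector;

public class ExportReplaceSketch
{
  // Hypothetical fixed sub-directory for exports, per the suggestion above.
  static final String EXPORT_SUBDIR = "druid-export";

  // For OVERWRITE ALL, clear only the export sub-directory rather than the whole
  // configured prefix, so unrelated user files at the destination survive a REPLACE.
  static void clearPreviousExport(StorageConnector connector) throws IOException
  {
    connector.deleteRecursively(EXPORT_SUBDIR);
  }
}
```

The trade-off the comment notes still holds: files a user deliberately places in a directory with the same name would be deleted, but anything outside it is untouched, unlike `deleteRecursively("")` on the whole destination.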