[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r279959099 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -159,14 +162,61 @@ public Response getDatabaseSegments( } final Stream metadataSegments = dataSourceStream.flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Iterable authorizedSegments = findAuthorizedSegmentWithOvershadowedStatus( + req, + druidDataSources, + metadataSegments + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { + + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = AuthorizationUtils.filterAuthorizedResources( + req, + metadataSegments::iterator, + raGenerator, + authorizerMapper + ); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + private Iterable findAuthorizedSegmentWithOvershadowedStatus( + HttpServletRequest req, + Collection druidDataSources, + Stream metadataSegments + ) + { +// It's fine to add all overshadowed segments to a single collection because only +// a small fraction of the segments in the cluster are expected to be overshadowed, +// so building this collection shouldn't generate a lot of garbage. +final Set overshadowedSegments = new HashSet<>(); +for (ImmutableDruidDataSource dataSource : druidDataSources) { + overshadowedSegments.addAll(ImmutableDruidDataSource.determineOvershadowedSegments(dataSource.getSegments())); Review comment: @leventov I am working on #7571. Agree the current API is not most efficient and I acknowledge your concern. While I am not sure what's the most appropriate way to avoid recalculating overshadowed segments yet, I looked at the suggested changes and I have some questions, which I have asked in #7571. Could we agree to discuss the design there, I think it'll make it easier for you and me and others to review those changes as this PR is getting crowded, and we may miss some parts about the new changes as they get mixed up with existing changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r279509466 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -159,14 +162,61 @@ public Response getDatabaseSegments( } final Stream metadataSegments = dataSourceStream.flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Iterable authorizedSegments = findAuthorizedSegmentWithOvershadowedStatus( + req, + druidDataSources, + metadataSegments + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { + + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = AuthorizationUtils.filterAuthorizedResources( + req, + metadataSegments::iterator, + raGenerator, + authorizerMapper + ); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + private Iterable findAuthorizedSegmentWithOvershadowedStatus( + HttpServletRequest req, + Collection druidDataSources, + Stream metadataSegments + ) + { +// It's fine to add all overshadowed segments to a single collection because only +// a small fraction of the segments in the cluster are expected to be overshadowed, +// so building this collection shouldn't generate a lot of garbage. +final Set overshadowedSegments = new HashSet<>(); +for (ImmutableDruidDataSource dataSource : druidDataSources) { + overshadowedSegments.addAll(ImmutableDruidDataSource.determineOvershadowedSegments(dataSource.getSegments())); Review comment: Thanks for your suggestion, but this code change is not contained in this method and would affect other places in the code, some of which are not part of original PR( eg coordinator balancing logic). I would prefer to do this change separately as it suggests changing `DataSegment` object and adding new interfaces, so there may be more follow-up discussions. Created #7571 to address this comment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r279509422 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -134,40 +138,41 @@ public void stop() private void poll() { log.info("polling published segments from coordinator"); -final JsonParserIterator metadataSegments = getMetadataSegments( +final JsonParserIterator metadataSegments = getMetadataSegments( coordinatorDruidLeaderClient, jsonMapper, responseHandler, segmentWatcherConfig.getWatchedDataSources() ); -final DateTime timestamp = DateTimes.nowUtc(); +final ImmutableSortedSet.Builder builder = ImmutableSortedSet.naturalOrder(); while (metadataSegments.hasNext()) { - final DataSegment interned = DataSegmentInterner.intern(metadataSegments.next()); - // timestamp is used to filter deleted segments - publishedSegments.put(interned, timestamp); + final SegmentWithOvershadowedStatus segment = metadataSegments.next(); + final DataSegment interned = DataSegmentInterner.intern(segment.getDataSegment()); + final SegmentWithOvershadowedStatus segmentWithOvershadowedStatus = new SegmentWithOvershadowedStatus( + interned, + segment.isOvershadowed() + ); + builder.add(segmentWithOvershadowedStatus); +} +// build operation can be expensive for high cardinality segments, so calling it outside "lock" +final ImmutableSortedSet immutableSortedSet = builder.build(); +synchronized (lock) { + publishedSegments = immutableSortedSet; } -// filter the segments from cache whose timestamp is not equal to latest timestamp stored, -// since the presence of a segment with an earlier timestamp indicates that -// "that" segment is not returned by coordinator in latest poll, so it's -// likely deleted and therefore we remove it from publishedSegments -// Since segments are not atomically replaced because it can cause high -// memory footprint due to large number of published segments, so -// we are incrementally removing deleted segments from the map -// This means publishedSegments will be eventually consistent with -// the segments in coordinator -publishedSegments.entrySet().removeIf(e -> e.getValue() != timestamp); cachePopulated.set(true); } - public Iterator getPublishedSegments() + public Iterator getPublishedSegments() { if (isCacheEnabled) { Preconditions.checkState( lifecycleLock.awaitStarted(1, TimeUnit.MILLISECONDS) && cachePopulated.get(), "hold on, still syncing published segments" ); - return publishedSegments.keySet().iterator(); + synchronized (lock) { Review comment: removed lock and made `publishedSegments` volatile This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r279509452 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -73,8 +72,14 @@ private final BrokerSegmentWatcherConfig segmentWatcherConfig; private final boolean isCacheEnabled; + private final Object lock = new Object(); + /** + * Use {@link ImmutableSortedSet} so that the order of segments is deterministic and + * sys.segments queries return the segments in sorted order based on segmentId + */ @Nullable Review comment: ok, good to know about this annotation. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r279509440 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -134,40 +138,41 @@ public void stop() private void poll() { log.info("polling published segments from coordinator"); -final JsonParserIterator metadataSegments = getMetadataSegments( +final JsonParserIterator metadataSegments = getMetadataSegments( coordinatorDruidLeaderClient, jsonMapper, responseHandler, segmentWatcherConfig.getWatchedDataSources() ); -final DateTime timestamp = DateTimes.nowUtc(); +final ImmutableSortedSet.Builder builder = ImmutableSortedSet.naturalOrder(); while (metadataSegments.hasNext()) { - final DataSegment interned = DataSegmentInterner.intern(metadataSegments.next()); - // timestamp is used to filter deleted segments - publishedSegments.put(interned, timestamp); + final SegmentWithOvershadowedStatus segment = metadataSegments.next(); + final DataSegment interned = DataSegmentInterner.intern(segment.getDataSegment()); + final SegmentWithOvershadowedStatus segmentWithOvershadowedStatus = new SegmentWithOvershadowedStatus( + interned, + segment.isOvershadowed() + ); + builder.add(segmentWithOvershadowedStatus); +} +// build operation can be expensive for high cardinality segments, so calling it outside "lock" +final ImmutableSortedSet immutableSortedSet = builder.build(); +synchronized (lock) { + publishedSegments = immutableSortedSet; } -// filter the segments from cache whose timestamp is not equal to latest timestamp stored, -// since the presence of a segment with an earlier timestamp indicates that -// "that" segment is not returned by coordinator in latest poll, so it's -// likely deleted and therefore we remove it from publishedSegments -// Since segments are not atomically replaced because it can cause high -// memory footprint due to large number of published segments, so -// we are incrementally removing deleted segments from the map -// This means publishedSegments will be eventually consistent with -// the segments in coordinator -publishedSegments.entrySet().removeIf(e -> e.getValue() != timestamp); cachePopulated.set(true); } - public Iterator getPublishedSegments() + public Iterator getPublishedSegments() { if (isCacheEnabled) { Preconditions.checkState( lifecycleLock.awaitStarted(1, TimeUnit.MILLISECONDS) && cachePopulated.get(), Review comment: replaced `AtomicBoolean` with `CountDownLatch` here, no exception on wait. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r279128049 ## File path: core/src/main/java/org/apache/druid/timeline/VersionedIntervalTimeline.java ## @@ -109,6 +109,15 @@ public static void addSegments( ); } + public static Map> buildTimelines(Iterable segments) Review comment: you're right, it should not belong here, moved it to `ImmutableDruidDataSource` as that's the only place it's used from. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r278630044 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -143,10 +146,16 @@ private void poll() final DateTime timestamp = DateTimes.nowUtc(); while (metadataSegments.hasNext()) { - final DataSegment interned = DataSegmentInterner.intern(metadataSegments.next()); + final SegmentWithOvershadowedStatus segment = metadataSegments.next(); + final DataSegment interned = DataSegmentInterner.intern(segment.getDataSegment()); + final SegmentWithOvershadowedStatus segmentWithOvershadowedStatus = new SegmentWithOvershadowedStatus( + interned, + segment.isOvershadowed() + ); // timestamp is used to filter deleted segments - publishedSegments.put(interned, timestamp); Review comment: I used `ImmutableSortedSet`, now the cache gets replaced atomically. @gianm @leventov let me know if you see any issues with this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r278270044 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -195,7 +206,7 @@ private void poll() sb.append("datasources=").append(ds).append("&"); } sb.setLength(sb.length() - 1); - query = "/druid/coordinator/v1/metadata/segments?" + sb; + query = "/druid/coordinator/v1/metadata/segments?includeOvershadowedStatus&" + sb; Review comment: yes, i think so, this will be only used if there are non empty `watchedDataSources` set (let's say it contains a datasource name "dummy") then the URL would look like `/druid/coordinator/v1/metadata/segments?includeOvershadowedStatus=dummy` etc. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r278264821 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -73,8 +72,12 @@ private final BrokerSegmentWatcherConfig segmentWatcherConfig; private final boolean isCacheEnabled; + /** + * Use {@link ConcurrentSkipListMap} so that the order of segments is deterministic and sys.segments queries return the segments in sorted order based on segmentId Review comment: fixed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r278264363 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -159,14 +162,57 @@ public Response getDatabaseSegments( } final Stream metadataSegments = dataSourceStream.flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Iterable authorizedSegments = findAuthorizedSegmentWithOvershadowedStatus( + req, + druidDataSources, + metadataSegments + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = AuthorizationUtils.filterAuthorizedResources( + req, + metadataSegments::iterator, + raGenerator, + authorizerMapper + ); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } -final Response.ResponseBuilder builder = Response.status(Response.Status.OK); -return builder.entity(authorizedSegments).build(); + private Iterable findAuthorizedSegmentWithOvershadowedStatus( + HttpServletRequest req, + Collection druidDataSources, + Stream metadataSegments + ) + { +final Set overshadowedSegments = new HashSet<>(); Review comment: sure, added the comment This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r278254246 ## File path: core/src/main/java/org/apache/druid/timeline/SegmentWithOvershadowedStatus.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.timeline; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +/** + * DataSegment object plus the overshadowed status for the segment. An immutable object. + * + * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} of the DataSegment object. Review comment: fixed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r278262296 ## File path: server/src/main/java/org/apache/druid/server/coordinator/helper/DruidCoordinatorRuleRunner.java ## @@ -141,13 +139,8 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) private Set determineOvershadowedSegments(DruidCoordinatorRuntimeParams params) Review comment: ok, made `determineOvershadowedSegments` static in `ImmutableDruidDataSource` and removed the one from here This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r278250875 ## File path: core/src/main/java/org/apache/druid/timeline/SegmentWithOvershadowedStatus.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.timeline; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +/** + * DataSegment object plus the overshadowed status for the segment. An immutable object. + * + * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} of the DataSegment object. + */ +public class SegmentWithOvershadowedStatus implements Comparable Review comment: thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277800641 ## File path: core/src/main/java/org/apache/druid/timeline/SegmentWithOvershadowedStatus.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.timeline; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +/** + * DataSegment object plus the overshadowed status for the segment. An immutable object. + * + * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} of the DataSegment object. + */ +public class SegmentWithOvershadowedStatus implements Comparable Review comment: sounds like a good idea to add an annotation for all internal API's in a separate PR This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277480172 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource())); Review comment: manually changed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277472998 ## File path: server/src/main/java/org/apache/druid/server/coordinator/helper/DruidCoordinatorRuleRunner.java ## @@ -141,13 +139,8 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) private Set determineOvershadowedSegments(DruidCoordinatorRuntimeParams params) Review comment: yeah, it's similar, except input args and return type. I think `getFullyOvershadowedSegments` would be useful as it's public and can be used by anyone with a datasource object, i don't see how i can use that one in here though. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277468653 ## File path: server/src/main/java/org/apache/druid/client/ImmutableDruidDataSource.java ## @@ -109,6 +112,27 @@ public long getTotalSizeOfSegments() return totalSizeOfSegments; } + /** + * This method finds the fully overshadowed segments in this datasource + * + * @return set of overshadowed segments + */ + public Set getFullyOvershadowedSegments() Review comment: renamed to `determineOvershadowedSegments` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277470980 ## File path: server/src/main/java/org/apache/druid/client/ImmutableDruidDataSource.java ## @@ -109,6 +112,27 @@ public long getTotalSizeOfSegments() return totalSizeOfSegments; } + /** + * This method finds the fully overshadowed segments in this datasource + * + * @return set of overshadowed segments + */ + public Set getFullyOvershadowedSegments() + { +final Collection segments = this.getSegments(); +final Map> timelines = VersionedIntervalTimeline.buildTimelines( +segments); Review comment: not sure if I am missing something, I again reimported `druid_intellij_formatting.xml`, and this is what I get if I reformat code. I tried to add a manual line break and reformat, see if it looks ok now? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277480339 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource())); -final StreamingOutput stream = outputStream -> { - final JsonFactory jsonFactory = jsonMapper.getFactory(); - try (final JsonGenerator jsonGenerator = jsonFactory.createGenerator(outputStream)) { -jsonGenerator.writeStartArray(); -for (DataSegment ds : authorizedSegments) { - jsonGenerator.writeObject(ds); - jsonGenerator.flush(); -} -jsonGenerator.writeEndArray(); - } -}; + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources( + req, + segmentsWithOvershadowedStatus::iterator, + raGenerator, + authorizerMapper + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { -Response.ResponseBuilder builder = Response.status(Response.Status.OK); -return builder.entity(stream).build(); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); Review comment: manually changed This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277479797 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -155,7 +158,8 @@ public Response getDatabaseSegmentDataSource(@PathParam("dataSourceName") final @Produces(MediaType.APPLICATION_JSON) public Response getDatabaseSegments( Review comment: I looked at this, not sure best way to split it, took out parts of finding and authorizing `SegmentWithOvershadowedStatus` into a helper method. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277473875 ## File path: server/src/main/java/org/apache/druid/server/coordinator/helper/DruidCoordinatorRuleRunner.java ## @@ -141,13 +139,8 @@ public DruidCoordinatorRuntimeParams run(DruidCoordinatorRuntimeParams params) private Set determineOvershadowedSegments(DruidCoordinatorRuntimeParams params) { -Map> timelines = new HashMap<>(); -for (DataSegment segment : params.getAvailableSegments()) { - timelines - .computeIfAbsent(segment.getDataSource(), dataSource -> new VersionedIntervalTimeline<>(Ordering.natural())) - .add(segment.getInterval(), segment.getVersion(), segment.getShardSpec().createChunk(segment)); -} - +final Map> timelines = VersionedIntervalTimeline.buildTimelines( +params.getAvailableSegments()); Review comment: same here, reformat doesn't change the formatting This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277468164 ## File path: server/src/main/java/org/apache/druid/client/ImmutableDruidDataSource.java ## @@ -109,6 +112,27 @@ public long getTotalSizeOfSegments() return totalSizeOfSegments; } + /** + * This method finds the fully overshadowed segments in this datasource Review comment: renamed "fully overshadowed" -> "overshadowed" This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277149896 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -143,10 +146,16 @@ private void poll() final DateTime timestamp = DateTimes.nowUtc(); while (metadataSegments.hasNext()) { - final DataSegment interned = DataSegmentInterner.intern(metadataSegments.next()); + final SegmentWithOvershadowedStatus segment = metadataSegments.next(); + final DataSegment interned = DataSegmentInterner.intern(segment.getDataSegment()); + final SegmentWithOvershadowedStatus segmentWithOvershadowedStatus = new SegmentWithOvershadowedStatus( + interned, + segment.isOvershadowed() + ); // timestamp is used to filter deleted segments - publishedSegments.put(interned, timestamp); Review comment: okay, i added changes to do (2) now as we discussed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277051747 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -195,7 +204,7 @@ private void poll() sb.append("datasources=").append(ds).append("&"); } sb.setLength(sb.length() - 1); - query = "/druid/coordinator/v1/metadata/segments?" + sb; + query = "/druid/coordinator/v1/metadata/segments?includeOvershadowedStatus?" + sb; Review comment: yes, didn't realize i have two `?` , will change This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r277093924 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -143,10 +146,16 @@ private void poll() final DateTime timestamp = DateTimes.nowUtc(); while (metadataSegments.hasNext()) { - final DataSegment interned = DataSegmentInterner.intern(metadataSegments.next()); + final SegmentWithOvershadowedStatus segment = metadataSegments.next(); + final DataSegment interned = DataSegmentInterner.intern(segment.getDataSegment()); + final SegmentWithOvershadowedStatus segmentWithOvershadowedStatus = new SegmentWithOvershadowedStatus( + interned, + segment.isOvershadowed() + ); // timestamp is used to filter deleted segments - publishedSegments.put(interned, timestamp); Review comment: omg, thanks for catching this issue, actually I feel like going with option (1), i didn't do it initially because of memory bump it'd cause like you said, but that would have avoided such bugs and updates the cache atomically. I tried option (2) as well, one thing it would cause is, the return type of `getPublishedSegments()` changes, so `CachedSegmentInfo` cannot be private to this class. The `getPublishedSegments()` might need to be split into 2 methods `getPublishedSegments()` and `getCachedPublishedSegments()` or create yet another wrapper class and return that. Another potential ugliness it might have is clients need to know if cache is enabled and call the right method to get published segments. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r276529355 ## File path: core/src/main/java/org/apache/druid/timeline/SegmentWithOvershadowedStatus.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.timeline; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +/** + * DataSegment object plus the overshadowed status for the segment. An immutable object. + * + * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} of the DataSegment object. + */ +public class SegmentWithOvershadowedStatus implements Comparable Review comment: If I deconstruct the `DataSegment` object, then we might save few bytes on a reference to DataSegment, but then the memory savings in broker where interned DataSegment is used would be lost. (Getting rid of interning or not is another issue, which should be addressed outside of this PR). If the concern is bytes sent over the network, then moving to smile format instead of json can provide considerable reduction in size of bytes transferred, which I plan to do in a follow-up PR later. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275961821 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource())); -final StreamingOutput stream = outputStream -> { - final JsonFactory jsonFactory = jsonMapper.getFactory(); - try (final JsonGenerator jsonGenerator = jsonFactory.createGenerator(outputStream)) { -jsonGenerator.writeStartArray(); -for (DataSegment ds : authorizedSegments) { - jsonGenerator.writeObject(ds); - jsonGenerator.flush(); -} -jsonGenerator.writeEndArray(); - } -}; + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources( + req, + segmentsWithOvershadowedStatus::iterator, + raGenerator, + authorizerMapper + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { -Response.ResponseBuilder builder = Response.status(Response.Status.OK); -return builder.entity(stream).build(); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } + + /** + * find fully overshadowed segments + * + * @param druidDataSources + * + * @return set of overshadowed segments + */ + private Set findOvershadowedSegments(Collection druidDataSources) Review comment: thanks, moved this method to `ImmutableDruidDataSource` as an instance method. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275961821 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource())); -final StreamingOutput stream = outputStream -> { - final JsonFactory jsonFactory = jsonMapper.getFactory(); - try (final JsonGenerator jsonGenerator = jsonFactory.createGenerator(outputStream)) { -jsonGenerator.writeStartArray(); -for (DataSegment ds : authorizedSegments) { - jsonGenerator.writeObject(ds); - jsonGenerator.flush(); -} -jsonGenerator.writeEndArray(); - } -}; + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources( + req, + segmentsWithOvershadowedStatus::iterator, + raGenerator, + authorizerMapper + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { -Response.ResponseBuilder builder = Response.status(Response.Status.OK); -return builder.entity(stream).build(); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } + + /** + * find fully overshadowed segments + * + * @param druidDataSources + * + * @return set of overshadowed segments + */ + private Set findOvershadowedSegments(Collection druidDataSources) Review comment: thanks, moved this method to `ImmutableDruidDataSource` as a instance method. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275073458 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource())); -final StreamingOutput stream = outputStream -> { - final JsonFactory jsonFactory = jsonMapper.getFactory(); - try (final JsonGenerator jsonGenerator = jsonFactory.createGenerator(outputStream)) { -jsonGenerator.writeStartArray(); -for (DataSegment ds : authorizedSegments) { - jsonGenerator.writeObject(ds); - jsonGenerator.flush(); -} -jsonGenerator.writeEndArray(); - } -}; + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources( + req, + segmentsWithOvershadowedStatus::iterator, + raGenerator, + authorizerMapper + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { -Response.ResponseBuilder builder = Response.status(Response.Status.OK); -return builder.entity(stream).build(); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } + + /** + * find fully overshadowed segments Review comment: sure, changed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275090562 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SystemSchema.java ## @@ -295,6 +302,7 @@ public TableType getJdbcTableType() val.getValue().isPublished(), val.getValue().isAvailable(), val.getValue().isRealtime(), + AVAILABLE_IS_OVERSHADOWED_VALUE, Review comment: added comment This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275090369 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SystemSchema.java ## @@ -250,23 +256,24 @@ public TableType getJdbcTableType() isRealtime = partialSegmentData.isRealtime(); } return new Object[]{ - val.getId(), - val.getDataSource(), - val.getInterval().getStart().toString(), - val.getInterval().getEnd().toString(), - val.getSize(), - val.getVersion(), - Long.valueOf(val.getShardSpec().getPartitionNum()), + segment.getId(), + segment.getDataSource(), + segment.getInterval().getStart().toString(), + segment.getInterval().getEnd().toString(), + segment.getSize(), + segment.getVersion(), + Long.valueOf(segment.getShardSpec().getPartitionNum()), numReplicas, numRows, - 1L, //is_published is true for published segments + PUBLISHED_IS_PUBLISHED_VALUE, //is_published is true for published segments isAvailable, isRealtime, + val.isOvershadowed() ? 1L : 0L, Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275090186 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/SystemSchema.java ## @@ -92,6 +93,9 @@ private static final String SERVER_SEGMENTS_TABLE = "server_segments"; private static final String TASKS_TABLE = "tasks"; + private static final long AVAILABLE_IS_OVERSHADOWED_VALUE = 0L; Review comment: ok, yeah, i was having hard time coming up with good constant names for these. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275076806 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource())); -final StreamingOutput stream = outputStream -> { - final JsonFactory jsonFactory = jsonMapper.getFactory(); - try (final JsonGenerator jsonGenerator = jsonFactory.createGenerator(outputStream)) { -jsonGenerator.writeStartArray(); -for (DataSegment ds : authorizedSegments) { - jsonGenerator.writeObject(ds); - jsonGenerator.flush(); -} -jsonGenerator.writeEndArray(); - } -}; + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources( + req, + segmentsWithOvershadowedStatus::iterator, + raGenerator, + authorizerMapper + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { -Response.ResponseBuilder builder = Response.status(Response.Status.OK); -return builder.entity(stream).build(); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } + + /** + * find fully overshadowed segments + * + * @param druidDataSources + * + * @return set of overshadowed segments + */ + private Set findOvershadowedSegments(Collection druidDataSources) Review comment: If this methods belongs here, yeah I also thought about it, thought of adding it to `VersionedIntervalTimeline`, but since that's in core package, there was dependency issue to take `ImmutableDruidDataSource` as argument. May be can work around that by passing `DataSegment` object instead, if it makes sense to move this to `VersionedIntervalTimeline` and if it's going to be used by other code. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275074252 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource())); -final StreamingOutput stream = outputStream -> { - final JsonFactory jsonFactory = jsonMapper.getFactory(); - try (final JsonGenerator jsonGenerator = jsonFactory.createGenerator(outputStream)) { -jsonGenerator.writeStartArray(); -for (DataSegment ds : authorizedSegments) { - jsonGenerator.writeObject(ds); - jsonGenerator.flush(); -} -jsonGenerator.writeEndArray(); - } -}; + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources( + req, + segmentsWithOvershadowedStatus::iterator, + raGenerator, + authorizerMapper + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { -Response.ResponseBuilder builder = Response.status(Response.Status.OK); -return builder.entity(stream).build(); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } + + /** + * find fully overshadowed segments + * + * @param druidDataSources + * + * @return set of overshadowed segments + */ + private Set findOvershadowedSegments(Collection druidDataSources) Review comment: you mean the `VersionedIntervalTimeline#findOvershadowed()`, I think that one finds partially overshadowed segments as well. This method only looks for fully overshadowed segments. Also that method returns a `TimelineObjectHolder`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275063339 ## File path: core/src/main/java/org/apache/druid/timeline/SegmentWithOvershadowedStatus.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.timeline; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +/** + * DataSegment object plus the overshadowed status for the segment. An immutable object. + * + * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} of the DataSegment object. + */ +public class SegmentWithOvershadowedStatus implements Comparable Review comment: Yes, in fact I started with `extends DataSegment`, but in order to call the super(), I had to pass the `DataSegment` reference to `SegmentWithOvershadowedStatus`, so that I get the properties for the super constructor call, something like this ``` @JsonCreator public SegmentWithOvershadowedStatus( @JsonProperty("dataSegment") DataSegment segment, @JsonProperty("overshadowed") boolean overshadowed ) { super( segment.getDataSource(), segment.getInterval(), segment.getVersion(), segment.getLoadSpec(), segment.getDimensions(), segment.getMetrics(), segment.getShardSpec(), segment.getBinaryVersion(), segment.getSize() ); this.dataSegment = segment; this.overshadowed = overshadowed; } ``` which didn't seem correct to me, as I am both extending the class and passing the reference to same in sub-class and decided to just keep `DataSegment` as member of this class. Is there a better way of doing this ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275089539 ## File path: sql/src/main/java/org/apache/druid/sql/calcite/schema/MetadataSegmentView.java ## @@ -73,8 +73,10 @@ private final BrokerSegmentWatcherConfig segmentWatcherConfig; private final boolean isCacheEnabled; + // Use ConcurrentSkipListMap so that the order of segments is deterministic and Review comment: changed to javadocs This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275088852 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource())); -final StreamingOutput stream = outputStream -> { - final JsonFactory jsonFactory = jsonMapper.getFactory(); - try (final JsonGenerator jsonGenerator = jsonFactory.createGenerator(outputStream)) { -jsonGenerator.writeStartArray(); -for (DataSegment ds : authorizedSegments) { - jsonGenerator.writeObject(ds); - jsonGenerator.flush(); -} -jsonGenerator.writeEndArray(); - } -}; + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources( + req, + segmentsWithOvershadowedStatus::iterator, + raGenerator, + authorizerMapper + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { -Response.ResponseBuilder builder = Response.status(Response.Status.OK); -return builder.entity(stream).build(); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } + + /** + * find fully overshadowed segments + * + * @param druidDataSources + * + * @return set of overshadowed segments + */ + private Set findOvershadowedSegments(Collection druidDataSources) + { +final Stream segmentStream = druidDataSources +.stream() +.flatMap(t -> t.getSegments().stream()); +final Set usedSegments = segmentStream.collect(Collectors.toSet()); +final Map> timelines = new HashMap<>(); Review comment: yeah, extracted it to `VersionedIntervalTimeline`, added static method `buildTimelines()` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275088705 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); -final Iterable authorizedSegments = -AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSegment().getDataSource())); -final StreamingOutput stream = outputStream -> { - final JsonFactory jsonFactory = jsonMapper.getFactory(); - try (final JsonGenerator jsonGenerator = jsonFactory.createGenerator(outputStream)) { -jsonGenerator.writeStartArray(); -for (DataSegment ds : authorizedSegments) { - jsonGenerator.writeObject(ds); - jsonGenerator.flush(); -} -jsonGenerator.writeEndArray(); - } -}; + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources( + req, + segmentsWithOvershadowedStatus::iterator, + raGenerator, + authorizerMapper + ); + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} else { -Response.ResponseBuilder builder = Response.status(Response.Status.OK); -return builder.entity(stream).build(); + final Function> raGenerator = segment -> Collections.singletonList( + AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); + + final Iterable authorizedSegments = + AuthorizationUtils.filterAuthorizedResources(req, metadataSegments::iterator, raGenerator, authorizerMapper); + + Response.ResponseBuilder builder = Response.status(Response.Status.OK); + return builder.entity(authorizedSegments).build(); +} + } + + /** + * find fully overshadowed segments + * + * @param druidDataSources + * + * @return set of overshadowed segments + */ + private Set findOvershadowedSegments(Collection druidDataSources) + { +final Stream segmentStream = druidDataSources Review comment: it was used for building timelines, and is still used to pass to the new method i created in `VersionedIntervalTimeline` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r275066881 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -171,26 +175,67 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments + .map( + segment -> new SegmentWithOvershadowedStatus( + segment, + overshadowedSegments.contains(segment.getId()) + )).collect(Collectors.toList()).stream(); Review comment: fixed formatting and that part is actually not needed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r274644465 ## File path: core/src/main/java/org/apache/druid/timeline/SegmentWithOvershadowedStatus.java ## @@ -0,0 +1,89 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.druid.timeline; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +/** + * DataSegment object plus the overshadowed status for the segment. An immutable object. + * + * SegmentWithOvershadowedStatus's {@link #compareTo} method considers only the {@link SegmentId} of the DataSegment object. + */ +public class SegmentWithOvershadowedStatus implements Comparable +{ + private final boolean isOvershadowed; Review comment: done This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r274653345 ## File path: server/src/main/java/org/apache/druid/server/http/MetadataResource.java ## @@ -164,26 +168,68 @@ public Response getDatabaseSegments( .stream() .flatMap(t -> t.getSegments().stream()); -final Function> raGenerator = segment -> Collections.singletonList( - AuthorizationUtils.DATASOURCE_READ_RA_GENERATOR.apply(segment.getDataSource())); +if (includeOvershadowedStatus != null) { + final Set overshadowedSegments = findOvershadowedSegments(druidDataSources); + //transform DataSegment to SegmentWithOvershadowedStatus objects + final Stream segmentsWithOvershadowedStatus = metadataSegments.map(segment -> { +if (overshadowedSegments.contains(segment.getId())) { Review comment: yes, that looks nicer, thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org
[GitHub] [incubator-druid] surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table
surekhasaharan commented on a change in pull request #7425: Add is_overshadowed column to sys.segments table URL: https://github.com/apache/incubator-druid/pull/7425#discussion_r274645284 ## File path: docs/content/querying/sql.md ## @@ -609,6 +609,7 @@ Note that a segment can be served by more than one stream ingestion tasks or His |is_published|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 represents this segment has been published to the metadata store with `used=1`| |is_available|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is currently being served by any process(Historical or realtime)| |is_realtime|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is being served on any type of realtime tasks| +|is_overshadowed|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is published and is overshadowed by some other published segments. Currently, is_overshadowed is always false for unpublished segments, although this may change in the future. You can filter for segments that "should be published" by filtering for `is_published = 1 AND is_overshadowed = 0`. Segments can briefly be both published and overshadowed if they were recently replaced, but have not been unpublished yet. Review comment: yes, i think it's important to mention that, changed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org