jihoonson commented on a change in pull request #10125:
URL: https://github.com/apache/druid/pull/10125#discussion_r449254333
##########
File path:
server/src/main/java/org/apache/druid/client/CachingClusteredClient.java
##########
@@ -401,11 +407,16 @@ private ClusterQueryResult(Sequence<T> sequence, int
numQueryServers)
}
}
- private Set<SegmentServerSelector>
computeSegmentsToQuery(TimelineLookup<String, ServerSelector> timeline)
+ private Set<SegmentServerSelector> computeSegmentsToQuery(
+ TimelineLookup<String, ServerSelector> timeline,
+ boolean specificSegments
+ )
{
+ final java.util.function.Function<Interval,
List<TimelineObjectHolder<String, ServerSelector>>> lookupFn
+ = specificSegments ? timeline::lookupWithIncompletePartitions :
timeline::lookup;
final List<TimelineObjectHolder<String, ServerSelector>> serversLookup =
toolChest.filterSegments(
query,
- intervals.stream().flatMap(i ->
timeline.lookup(i).stream()).collect(Collectors.toList())
+ intervals.stream().flatMap(i ->
lookupFn.apply(i).stream()).collect(Collectors.toList())
Review comment:
> Hmm, although, `getQueryRunnerForSegments` does create a modified
timeline that only has the relevant segments, so this might be okay as long as
we can assume none of them overshadow the others. Which should be a fair
assumption for a query that originally started out as being by intervals. It
seems a bit sketchy, though, and would be nice if we could avoid the
unnecessary creation of this timeline.
I agree. That would be cleaner and even more efficient since we can avoid
unnecessary timeline build and lookup on it. However, since we are about to
release our next version, I would like to avoid such big refactoring for now
though. It will be more risky and potentially delay the release.
> But if we stick with this approach — will this properly retain information
about the relevant SegmentDescriptors' intervals, though? These could be
narrower than the segments' full intervals.
It does truncate the interval, but the query server can handle it. The query
runner for specific segments in `ServerManager` or `AppenderatorImpl` uses
`TimelineLookup.findEntry()` to find the `PartitionHolder` containing the
segments to query. In this `findEntry()`, an entry matches if its interval
contains the given interval.
```java
public @Nullable PartitionHolder<ObjectType> findEntry(Interval interval,
VersionType version)
{
lock.readLock().lock();
try {
for (Entry<Interval, TreeMap<VersionType, TimelineEntry>> entry :
allTimelineEntries.entrySet()) {
if (entry.getKey().equals(interval) ||
entry.getKey().contains(interval)) {
TimelineEntry foundEntry = entry.getValue().get(version);
if (foundEntry != null) {
return foundEntry.getPartitionHolder().asImmutable();
}
}
}
return null;
}
finally {
lock.readLock().unlock();
}
}
```
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]