capistrant commented on code in PR #18922:
URL: https://github.com/apache/druid/pull/18922#discussion_r2737963920
##########
multi-stage-query/src/test/java/org/apache/druid/msq/input/table/IndexerTableInputSpecSlicerTest.java:
##########
Review Comment:
For the case when `TableInputSpec` is instantiated with non-null intervals
**and** non-null segments - For segments who are not in the interval, I
understand it that they will **not** end up in the produced "slice", correct?
Is there need for a UT of that? `SEGMENT4` with an interval that doesn't
overlap with "2000/2001" and confirm that if we provide the interval with
`SEGMENT4` in segments list, `SEGMENT4` is not in the output?
##########
multi-stage-query/src/main/java/org/apache/druid/msq/input/table/TableInputSpec.java:
##########
@@ -58,6 +63,8 @@ public class TableInputSpec implements InputSpec
* meaning that when this spec is sliced and read, the
returned {@link LoadableSegment}
* from {@link PhysicalInputSlice#getLoadableSegments()}
are clipped to these intervals using
* {@link LoadableSegment#descriptor()}.
+ * @param segments specific segments to read, or null to read all
segments in the intervals. If provided,
Review Comment:
does there need to be documentation around what happens with non-null
intervals` and non-null `segments` and some/all of those segments not
overlapping with anything in `intervals`? I think that is the most interesting
part of adding this support for specific segments, ant potentially the most
confusing
##########
multi-stage-query/src/main/java/org/apache/druid/msq/indexing/IndexerTableInputSpecSlicer.java:
##########
@@ -157,24 +161,25 @@ private Set<DataSegmentWithInterval>
getPrunedSegmentSet(final TableInputSpec ta
{
final TimelineLookup<String, DataSegment> timeline =
getTimeline(tableInputSpec.getDataSource(),
tableInputSpec.getIntervals());
+ final Predicate<SegmentDescriptor> segmentFilter =
tableInputSpec.getSegments() != null
+ ?
Set.copyOf(tableInputSpec.getSegments())::contains
+ :
Predicates.alwaysTrue();
if (timeline == null) {
return Collections.emptySet();
} else {
+ // A segment can overlap with multiple search intervals, or even outside
search intervals.
+ // The same segment can appear multiple times or 0 time, but each is
also bounded within the overlapped search interval
Review Comment:
I don't think I fully follow this comment.
> or even outside search intervals
A segment outside search intervals. Is that referring to a segment that is
in `tableInputSpec.getSegments()` but ends up not overlapping any search
intervals and thus not get found?
> The same segment can appear multiple times or 0 time
Same idea regarding the `0 time` comment. Is that just saying that even
though a segment was in `tableInputSpec.getSegments()`, it may not be in the
final set?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]