capistrant commented on code in PR #18922:
URL: https://github.com/apache/druid/pull/18922#discussion_r2737963920


##########
multi-stage-query/src/test/java/org/apache/druid/msq/input/table/IndexerTableInputSpecSlicerTest.java:
##########


Review Comment:
   For the case when `TableInputSpec` is instantiated with non-null intervals 
**and** non-null segments - For segments who are not in the interval, I 
understand it that they will **not** end up in the produced "slice", correct? 
Is there need for a UT of that? `SEGMENT4` with an interval that doesn't 
overlap with "2000/2001" and confirm that if we provide the interval with 
`SEGMENT4` in segments list, `SEGMENT4` is not in the output?



##########
multi-stage-query/src/main/java/org/apache/druid/msq/input/table/TableInputSpec.java:
##########
@@ -58,6 +63,8 @@ public class TableInputSpec implements InputSpec
    *                     meaning that when this spec is sliced and read, the 
returned {@link LoadableSegment}
    *                     from {@link PhysicalInputSlice#getLoadableSegments()} 
are clipped to these intervals using
    *                     {@link LoadableSegment#descriptor()}.
+   * @param segments     specific segments to read, or null to read all 
segments in the intervals. If provided,

Review Comment:
   does there need to be documentation around what happens with non-null 
intervals` and non-null `segments` and some/all of those segments not 
overlapping with anything in `intervals`? I think that is the most interesting 
part of adding this support for specific segments, ant potentially the most 
confusing



##########
multi-stage-query/src/main/java/org/apache/druid/msq/indexing/IndexerTableInputSpecSlicer.java:
##########
@@ -157,24 +161,25 @@ private Set<DataSegmentWithInterval> 
getPrunedSegmentSet(final TableInputSpec ta
   {
     final TimelineLookup<String, DataSegment> timeline =
         getTimeline(tableInputSpec.getDataSource(), 
tableInputSpec.getIntervals());
+    final Predicate<SegmentDescriptor> segmentFilter = 
tableInputSpec.getSegments() != null
+                                                       ? 
Set.copyOf(tableInputSpec.getSegments())::contains
+                                                       : 
Predicates.alwaysTrue();
 
     if (timeline == null) {
       return Collections.emptySet();
     } else {
+      // A segment can overlap with multiple search intervals, or even outside 
search intervals.
+      // The same segment can appear multiple times or 0 time, but each is 
also bounded within the overlapped search interval

Review Comment:
   I don't think I fully follow this comment.
   
   > or even outside search intervals
   
   A segment outside search intervals. Is that referring to a segment that is 
in `tableInputSpec.getSegments()` but ends up not overlapping any search 
intervals and thus not get found?
   
   > The same segment can appear multiple times or 0 time
   
   Same idea regarding the `0 time` comment. Is that just saying that even 
though a segment was in `tableInputSpec.getSegments()`, it may not be in the 
final set?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to