gianm opened a new issue #6177: TimeBoundary, DataSourceMetadata filterSegments 
flawed with overlapping segments
URL: https://github.com/apache/incubator-druid/issues/6177
 
 
   The "interval" on a LogicalSegment corresponds to the timeline object holder 
interval, so when segments overlap it covers only the queryable 
(non-overshadowed) part of a segment, not the segment's full interval. This 
means that the code in filterSegments of both timeBoundary and 
dataSourceMetadata is flawed. It looks like this in both of them:
   
   ```java
       final T min = query.isMaxTime() ? null : segments.get(0);
       final T max = query.isMinTime() ? null : segments.get(segments.size() - 1);
   
       return Lists.newArrayList(
           Iterables.filter(
               segments,
               input -> (min != null && input.getInterval().overlaps(min.getInterval())) ||
                        (max != null && input.getInterval().overlaps(max.getInterval()))
           )
       );
   ```
   
   But because of how "interval" works on the LogicalSegments, 
`input.getInterval()` for two different LogicalSegments will never overlap. 
This causes erroneous results in cases like this:
   
   - Segment A is 2017/2018 (YEAR granularity) and has data up through May 2017.
   - Segment B is 2017-08-01/2017-08-02 (DAY granularity) and has data for 
August 1 2017.
   
   In this case, three LogicalSegments are returned:
   
   - 2017/2017-08-01
   - 2017-08-01/2017-08-02
   - 2017-08-02/2018
   
   For a max-time query, the filterSegments methods on these two query types 
will only inspect the last LogicalSegment (2017-08-02/2018). That holder is 
backed by Segment A, which has no data past May 2017, so they miss the 
_actual_ max time, which is in the 2017-08-01/2017-08-02 holder.
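
The scenario above can be reproduced with a small standalone sketch. It is not the Druid code: `Interval` here is a hypothetical stand-in for the LogicalSegment interval, using half-open `[start, end)` semantics on the assumption that abutting intervals do not count as overlapping, and `filterForMaxTime` only mirrors the quoted filter for the max-time case (where `min` is null).

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

public class FilterSegmentsSketch {
    // Hypothetical stand-in for a LogicalSegment's interval, half-open: [start, end).
    static final class Interval {
        final LocalDate start;
        final LocalDate end;

        Interval(LocalDate start, LocalDate end) {
            this.start = start;
            this.end = end;
        }

        // Assumption: abutting intervals do not overlap.
        boolean overlaps(Interval other) {
            return start.isBefore(other.end) && other.start.isBefore(end);
        }
    }

    // Mirrors the quoted filter for a max-time query (min == null, max == last holder).
    static List<Interval> filterForMaxTime(List<Interval> segments) {
        Interval max = segments.get(segments.size() - 1);
        List<Interval> kept = new ArrayList<>();
        for (Interval input : segments) {
            if (input.overlaps(max)) {
                kept.add(input);
            }
        }
        return kept;
    }

    // The three timeline holders from the example: A (head), B, A (tail).
    static List<Interval> exampleHolders() {
        List<Interval> holders = new ArrayList<>();
        holders.add(new Interval(LocalDate.of(2017, 1, 1), LocalDate.of(2017, 8, 1)));
        holders.add(new Interval(LocalDate.of(2017, 8, 1), LocalDate.of(2017, 8, 2)));
        holders.add(new Interval(LocalDate.of(2017, 8, 2), LocalDate.of(2018, 1, 1)));
        return holders;
    }

    public static void main(String[] args) {
        List<Interval> holders = exampleHolders();
        List<Interval> kept = filterForMaxTime(holders);
        System.out.println("kept " + kept.size() + " of " + holders.size()
            + " holders, first kept starts " + kept.get(0).start);
    }
}
```

Because the holders are disjoint, only the last one survives the filter, and the 2017-08-01/2017-08-02 holder that actually contains the latest data is dropped.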
