yuanlihan commented on a change in pull request #9905:
URL: https://github.com/apache/druid/pull/9905#discussion_r435733738



##########
File path: 
indexing-service/src/main/java/org/apache/druid/indexing/common/task/CompactionTask.java
##########
@@ -710,20 +728,8 @@ private static DimensionsSpec 
createDimensionsSpec(List<Pair<QueryableIndex, Dat
     // Dimensions are extracted from the recent segments to olders because 
recent segments are likely to be queried more
     // frequently, and thus the performance should be optimized for recent 
ones rather than old ones.
 
-    // timelineSegments are sorted in order of interval, but we do a sanity 
check here.
-    final Comparator<Interval> intervalComparator = 
Comparators.intervalsByStartThenEnd();
-    for (int i = 0; i < queryableIndices.size() - 1; i++) {
-      final Interval shouldBeSmaller = 
queryableIndices.get(i).lhs.getDataInterval();
-      final Interval shouldBeLarger = queryableIndices.get(i + 
1).lhs.getDataInterval();
-      Preconditions.checkState(
-          intervalComparator.compare(shouldBeSmaller, shouldBeLarger) <= 0,
-          "QueryableIndexes are not sorted! Interval[%s] of segment[%s] is 
laster than interval[%s] of segment[%s]",
-          shouldBeSmaller,
-          queryableIndices.get(i).rhs.getId(),
-          shouldBeLarger,
-          queryableIndices.get(i + 1).rhs.getId()
-      );
-    }
+    // sort timelineSegments in order of interval.
+    queryableIndices.sort((o1, o2) -> 
Comparators.intervalsByStartThenEnd().compare(o1.rhs.getInterval(), 
o2.rhs.getInterval()));

Review comment:
       Hi @jihoonson 
   
   Here I intended to fix the error in step 5 as described 
[here](https://github.com/apache/druid/issues/9904).
   
   When compacting the two segments
   
   - `inline_data_2020-05-18T00:00:00.000Z_2020-05-19T00:00:00.000Z_version_1`
   - `inline_data_2020-05-18T20:00:00.000Z_2020-05-18T21:00:00.000Z_version_2`
   
   there will be three timelineObjectHolders
   
   - `2020-05-18T00:00:00.000Z/2020-05-18T20:00:00.000Z` backed by segment 
`inline_data_2020-05-18T00:00:00.000Z_2020-05-19T00:00:00.000Z_version_1`
   - `2020-05-18T20:00:00.000Z/2020-05-18T21:00:00.000Z` backed by segment 
`inline_data_2020-05-18T20:00:00.000Z_2020-05-18T21:00:00.000Z_version_2`
   - `2020-05-18T21:00:00.000Z/2020-05-19T00:00:00.000Z` backed by segment 
`inline_data_2020-05-18T00:00:00.000Z_2020-05-19T00:00:00.000Z_version_1`
   
   And the last two pairs of `queryableIndices` will fail to pass the previous 
sanity check. Also I found that now timeline already has unit tests about 
sorting timelineObjectHolders and the sanity check will pass in most case. So I 
decided to fix the error by re-sorting them anyway. And I added a link in the 
comment. Does it make sense?
   

##########
File path: 
indexing-service/src/test/java/org/apache/druid/indexing/common/task/CompactionTaskTest.java
##########
@@ -191,28 +194,33 @@ public static void setupClass()
     MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-05-01/2017-06-01"), new 
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
     MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-06-01/2017-07-01"), new 
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
 
+    MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-06-01/2017-06-02"), new 
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
+    MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-06-15/2017-06-16"), new 
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
+    MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-06-30/2017-07-01"), new 
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
+
     DIMENSIONS = new HashMap<>();
     AGGREGATORS = new ArrayList<>();
 
     DIMENSIONS.put(ColumnHolder.TIME_COLUMN_NAME, new 
LongDimensionSchema(ColumnHolder.TIME_COLUMN_NAME));
     DIMENSIONS.put(TIMESTAMP_COLUMN, new 
LongDimensionSchema(TIMESTAMP_COLUMN));
-    for (int i = 0; i < SEGMENT_INTERVALS.size(); i++) {
+    int num = 6;

Review comment:
       Naming updated




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to