yuanlihan commented on a change in pull request #9905:
URL: https://github.com/apache/druid/pull/9905#discussion_r435733738
##########
File path:
indexing-service/src/main/java/org/apache/druid/indexing/common/task/CompactionTask.java
##########
@@ -710,20 +728,8 @@ private static DimensionsSpec
createDimensionsSpec(List<Pair<QueryableIndex, Dat
// Dimensions are extracted from the recent segments to olders because
recent segments are likely to be queried more
// frequently, and thus the performance should be optimized for recent
ones rather than old ones.
- // timelineSegments are sorted in order of interval, but we do a sanity
check here.
- final Comparator<Interval> intervalComparator =
Comparators.intervalsByStartThenEnd();
- for (int i = 0; i < queryableIndices.size() - 1; i++) {
- final Interval shouldBeSmaller =
queryableIndices.get(i).lhs.getDataInterval();
- final Interval shouldBeLarger = queryableIndices.get(i +
1).lhs.getDataInterval();
- Preconditions.checkState(
- intervalComparator.compare(shouldBeSmaller, shouldBeLarger) <= 0,
- "QueryableIndexes are not sorted! Interval[%s] of segment[%s] is
laster than interval[%s] of segment[%s]",
- shouldBeSmaller,
- queryableIndices.get(i).rhs.getId(),
- shouldBeLarger,
- queryableIndices.get(i + 1).rhs.getId()
- );
- }
+ // sort timelineSegments in order of interval.
+ queryableIndices.sort((o1, o2) ->
Comparators.intervalsByStartThenEnd().compare(o1.rhs.getInterval(),
o2.rhs.getInterval()));
Review comment:
Hi @jihoonson
Here I intended to fix the error in step 5 as described
[here](https://github.com/apache/druid/issues/9904).
When compacting the two segments
- `inline_data_2020-05-18T00:00:00.000Z_2020-05-19T00:00:00.000Z_version_1`
- `inline_data_2020-05-18T20:00:00.000Z_2020-05-18T21:00:00.000Z_version_2`
there will be three timelineObjectHolders
- `2020-05-18T00:00:00.000Z/2020-05-18T20:00:00.000Z` backed by segment
`inline_data_2020-05-18T00:00:00.000Z_2020-05-19T00:00:00.000Z_version_1`
- `2020-05-18T20:00:00.000Z/2020-05-18T21:00:00.000Z` backed by segment
`inline_data_2020-05-18T20:00:00.000Z_2020-05-18T21:00:00.000Z_version_2`
- `2020-05-18T21:00:00.000Z/2020-05-19T00:00:00.000Z` backed by segment
`inline_data_2020-05-18T00:00:00.000Z_2020-05-19T00:00:00.000Z_version_1`
And the last two pairs of `queryableIndices` will fail to pass the previous
sanity check. Also I found that now timeline already has unit tests about
sorting timelineObjectHolders and the sanity check will pass in most case. So I
decided to fix the error by re-sorting them anyway. And I added a link in the
comment. Does it make sense?
##########
File path:
indexing-service/src/test/java/org/apache/druid/indexing/common/task/CompactionTaskTest.java
##########
@@ -191,28 +194,33 @@ public static void setupClass()
MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-05-01/2017-06-01"), new
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-06-01/2017-07-01"), new
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
+ MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-06-01/2017-06-02"), new
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
+ MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-06-15/2017-06-16"), new
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
+ MIXED_TYPE_COLUMN_MAP.put(Intervals.of("2017-06-30/2017-07-01"), new
DoubleDimensionSchema(MIXED_TYPE_COLUMN));
+
DIMENSIONS = new HashMap<>();
AGGREGATORS = new ArrayList<>();
DIMENSIONS.put(ColumnHolder.TIME_COLUMN_NAME, new
LongDimensionSchema(ColumnHolder.TIME_COLUMN_NAME));
DIMENSIONS.put(TIMESTAMP_COLUMN, new
LongDimensionSchema(TIMESTAMP_COLUMN));
- for (int i = 0; i < SEGMENT_INTERVALS.size(); i++) {
+ int num = 6;
Review comment:
Naming updated
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]