maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r490000904
##########
File path:
server/src/main/java/org/apache/druid/server/coordinator/duty/CompactionSegmentIterator.java
##########
@@ -19,22 +19,45 @@
package org.apache.druid.server.coordinator.duty;
-import it.unimi.dsi.fastutil.objects.Object2LongOpenHashMap;
+import org.apache.druid.server.coordinator.CompactionStatistics;
import org.apache.druid.timeline.DataSegment;
import java.util.Iterator;
import java.util.List;
+import java.util.Map;
/**
* Segments in the lists which are the elements of this iterator are sorted
according to the natural segment order
* (see {@link DataSegment#compareTo}).
*/
public interface CompactionSegmentIterator extends Iterator<List<DataSegment>>
Review comment:
we can only know the skipped and compacted segments when we are
iterating the segment timeline (doSegmentsNeedCompaction method in
NewestSegmentFirstIterator). This code path isn’t called by the next(). It is
computed and stored as a QueueEntry into the queue. The next() method simply
just poll the QueueEntry from the queue. This means that the skipped and
compacted stats must be in the QueueEntry when the next() call polls the queue.
We also do not store an entry into the queue if the List<DataSegment> is empty
(no segments to compact). This does not fit well with having to try include the
skipped and compacted stats into the QueueEntry. For example, we still need to
store and aggregates the skipped and compacted stats when the actual return
segment is an empty List (no more segments for the datasource).
tbh I think it is best to keep it this way. Also, since this (the current
way or modifying the CompactionSegmentIterator) does not change the API, we can
come back and modify it later. Also since NewestSegmentFirstIterator is the
only CompactionSegmentIterator right now, the cost for changing isn't too high.
Maybe a good time to re-visit this is if we are going to add a new iterator and
see what will work best when we have multiple implementations of
CompactionSegmentIterator
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]