maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r490000904
##########
File path:
server/src/main/java/org/apache/druid/server/coordinator/duty/CompactionSegmentIterator.java
##########
@@ -19,22 +19,45 @@
package org.apache.druid.server.coordinator.duty;
-import it.unimi.dsi.fastutil.objects.Object2LongOpenHashMap;
+import org.apache.druid.server.coordinator.CompactionStatistics;
import org.apache.druid.timeline.DataSegment;
import java.util.Iterator;
import java.util.List;
+import java.util.Map;
/**
* Segments in the lists which are the elements of this iterator are sorted
according to the natural segment order
* (see {@link DataSegment#compareTo}).
*/
public interface CompactionSegmentIterator extends Iterator<List<DataSegment>>
Review comment:
we can only know the skipped and compacted segments when we are
iterating the segment timeline. This code path isn’t called by the next(). It
is computed and stored as a QueueEntry into the queue. next() simply just poll
the QueueEntry from the queue. This means that the skipped and compacted stats
must be in the QueueEntry when the next() call polls the queue. We also do not
store an entry into the queue if the List<DataSegment> is empty (no segments to
compact). This does not fit well with having to try include the skipped and
compacted stats into the QueueEntry. For example, we still need to store and
aggregates the skipped and compacted stats when the actual return segment is an
empty List (no more segments for the datasource).
tbh I think it is best to keep it this way. Also, since this (the current
way or modifying the CompactionSegmentIterator) does not change the API, we can
come back and modify it later. Also since NewestSegmentFirstIterator is the
only CompactionSegmentIterator right now, the cost for changing isn't too high.
Maybe a good time to re-visit this is if we are going to add a new iterator and
see what will work best when we have multiple implementations of
CompactionSegmentIterator
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]