maytasm commented on a change in pull request #10371:
URL: https://github.com/apache/druid/pull/10371#discussion_r490000904



##########
File path: 
server/src/main/java/org/apache/druid/server/coordinator/duty/CompactionSegmentIterator.java
##########
@@ -19,22 +19,45 @@
 
 package org.apache.druid.server.coordinator.duty;
 
-import it.unimi.dsi.fastutil.objects.Object2LongOpenHashMap;
+import org.apache.druid.server.coordinator.CompactionStatistics;
 import org.apache.druid.timeline.DataSegment;
 
 import java.util.Iterator;
 import java.util.List;
+import java.util.Map;
 
 /**
  * Segments in the lists which are the elements of this iterator are sorted 
according to the natural segment order
  * (see {@link DataSegment#compareTo}).
  */
 public interface CompactionSegmentIterator extends Iterator<List<DataSegment>>

Review comment:
       we can only know the skipped and compacted segments when we are 
iterating the segment timeline (doSegmentsNeedCompaction method in 
NewestSegmentFirstIterator). This code path isn’t called by the next(). It is 
computed and stored as a QueueEntry into the queue before the next() is called. 
The next() method simply just poll the QueueEntry from the queue. This means 
that the skipped and compacted stats must be in the QueueEntry when the next() 
call polls the queue. We also do not store an entry into the queue if the 
List<DataSegment> is empty (no segments to compact). This does not fit well 
with having to try include the skipped and compacted stats into the QueueEntry. 
For example, we still need to store and aggregates the skipped and compacted 
stats when the actual return segment is an empty List (no more segments for the 
datasource).
   
   tbh I think it is best to keep it this way. Also, since this (the current 
way or modifying the CompactionSegmentIterator) does not change the API, we can 
come back and modify it later. Also since NewestSegmentFirstIterator is the 
only CompactionSegmentIterator right now, the cost for changing isn't too high. 
Maybe a good time to re-visit this is if we are going to add a new iterator and 
see what will work best when we have multiple implementations of 
CompactionSegmentIterator




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to