jihoonson commented on issue #9712:
URL: https://github.com/apache/druid/issues/9712#issuecomment-619165839


   @yuanlihan your assessment is correct! Yes, the coordinator should be 
able to run both minor and major compaction: minor compaction for 
recent data and major compaction for old data. As you mentioned, minor 
compaction should be able to run on a subset of the segments in a time chunk 
instead of grabbing all of them. 
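   To make the minor/major split concrete, here is a minimal sketch of how a planner could pick a compaction type per time chunk based on the chunk's age. This is an illustration only, not Druid's coordinator code; the `minorWindow` knob and the `choose` method are hypothetical names.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch (not actual Druid coordinator code): choose a
// compaction type per time chunk based on how recent the chunk is.
public class CompactionPlanner {
    enum CompactionType { MINOR, MAJOR }

    // Recent chunks may still receive late data, so only merge a subset of
    // their segments (minor); older, settled chunks get a full rewrite (major).
    static CompactionType choose(Instant chunkEnd, Duration minorWindow, Instant now) {
        return chunkEnd.isAfter(now.minus(minorWindow))
            ? CompactionType.MINOR
            : CompactionType.MAJOR;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2020-04-27T00:00:00Z");
        // A chunk ending yesterday falls inside a 7-day minor window.
        System.out.println(choose(Instant.parse("2020-04-26T00:00:00Z"), Duration.ofDays(7), now)); // MINOR
        // A chunk from January is old enough for a full rewrite.
        System.out.println(choose(Instant.parse("2020-01-01T00:00:00Z"), Duration.ofDays(7), now)); // MAJOR
    }
}
```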
   
   > It would be better to only do minor compaction for the first M small 
segments and the tail N small segments.
   
   This sounds nice, but I'm not sure how we can do it. The auto compaction 
algorithm used to use segment size as the trigger for compaction, but this caused 
a bunch of bugs since the segment size after compaction can still be small 
depending on your configuration, such as maxRowsPerSegment. Also, a parallel task will 
create at least one small segment in most cases since the last subtask will 
likely be assigned a small number of segments. As a result, we changed the algorithm 
to be stateful in https://github.com/apache/druid/pull/8573. Do you have a good 
idea?
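   A small arithmetic example of why size is a poor trigger: splitting rows into segments of at most maxRowsPerSegment almost always leaves a small tail segment, so a "segment is small" check would re-select data that was just compacted. This is an illustration under assumed numbers, not Druid code.

```java
import java.util.Arrays;

// Illustration (assumed numbers, not Druid code): dividing rows into segments
// capped at maxRowsPerSegment usually leaves one small tail segment, so
// segment size alone cannot tell "needs compaction" from "already compacted".
public class TailSegmentExample {
    static int[] segmentSizes(long totalRows, int maxRowsPerSegment) {
        int full = (int) (totalRows / maxRowsPerSegment);   // fully packed segments
        int tail = (int) (totalRows % maxRowsPerSegment);   // leftover rows
        int n = full + (tail > 0 ? 1 : 0);
        int[] sizes = new int[n];
        Arrays.fill(sizes, 0, full, maxRowsPerSegment);
        if (tail > 0) sizes[n - 1] = tail;
        return sizes;
    }

    public static void main(String[] args) {
        // 10.3M rows with maxRowsPerSegment = 5M -> two full segments plus a
        // 300K-row tail that is "small" even immediately after compaction.
        System.out.println(Arrays.toString(segmentSizes(10_300_000L, 5_000_000)));
        // prints [5000000, 5000000, 300000]
    }
}
```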
   
   > Also found that minor compaction tasks will always fail if the partitionId 
is not consecutive.
   > 
   > > WARN [TaskQueue-Manager] org.apache.druid.indexing.overlord.TaskQueue - 
Exception thrown during isReady for task: 
compact_ds_name_pfjceoge_2020-04-20T03:25:14.299Z
   > > org.apache.druid.java.util.common.ISE: Can't compact segments of 
non-consecutive rootPartition range
   
   Oh yeah, this is a known issue. I just opened 
https://github.com/apache/druid/issues/9768 to track it.
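   For readers hitting this error: the failure is inferred from the message above to be a validation that the selected segments' root partition IDs form a contiguous range, since the compacted result replaces that exact range. A minimal sketch of that kind of check (an assumption based on the error message, not the actual Druid validation code):

```java
// Sketch of the kind of check suggested by the error message
// "Can't compact segments of non-consecutive rootPartition range".
// This is an assumption for illustration, not Druid's actual code.
public class RootPartitionCheck {
    // Expects root partition IDs sorted ascending; returns false on any gap.
    static boolean isConsecutive(int[] sortedRootPartitionIds) {
        for (int i = 1; i < sortedRootPartitionIds.length; i++) {
            if (sortedRootPartitionIds[i] != sortedRootPartitionIds[i - 1] + 1) {
                return false; // gap in the range -> task's isReady would fail
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isConsecutive(new int[]{0, 1, 2})); // true
        System.out.println(isConsecutive(new int[]{0, 2, 3})); // false: 1 is missing
    }
}
```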


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
