jihoonson commented on issue #9768:
URL: https://github.com/apache/druid/issues/9768#issuecomment-904200393


   Assuming that we keep the current segment ID allocation protocol, which 
monotonically increases the partition ID on task failures, the problem we want 
to solve is this: given a missing partitionId, how do we know whether the 
segment with that ID really doesn't exist or is being created by some other 
task? One way to do this is to modify the compaction task as follows.
   
   1) When some missing partitionIds are found, the compaction task tries to 
lock them using the regular locking mechanism.
   2) If the locking succeeds, the compaction task can safely assume that those 
partitionIds will never be used since there is no ingestion task creating 
segments of those partitionIds. In this case, the compaction task can simply 
ignore those missing partitionIds and compact the given segments all together.
   3) If the locking fails, there should be some ingestion task creating the 
segments of those partitionIds. In this case, the compaction task can split the 
input segments into multiple groups where each group has only consecutive 
partitionIds, and compact each group separately. The segments that are being 
created by other tasks can be compacted later by another compaction task.
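
   The three steps above can be sketched roughly as follows. This is only an 
illustration in Python, not Druid code: `missing_partition_ids`, 
`split_consecutive`, `plan_compaction` and the `try_lock` callback are all 
hypothetical names. It also assumes that "missing" means a gap between the 
smallest and largest partitionId given to the task, and that the lock attempt 
is all-or-nothing over the missing partitionIds.

```python
from typing import Callable, Iterable, List


def missing_partition_ids(present: Iterable[int]) -> List[int]:
    """Return the IDs absent between the smallest and largest present ID."""
    ids = sorted(set(present))
    if not ids:
        return []
    have = set(ids)
    return [i for i in range(ids[0], ids[-1] + 1) if i not in have]


def split_consecutive(present: Iterable[int]) -> List[List[int]]:
    """Group IDs so that each group contains only consecutive IDs."""
    groups: List[List[int]] = []
    for pid in sorted(set(present)):
        if groups and pid == groups[-1][-1] + 1:
            groups[-1].append(pid)
        else:
            groups.append([pid])
    return groups


def plan_compaction(
    present: Iterable[int],
    try_lock: Callable[[List[int]], bool],
) -> List[List[int]]:
    """Return the groups of partition IDs to compact together.

    try_lock is a hypothetical stand-in for the regular locking mechanism:
    it receives the missing IDs and returns True only if all were locked.
    """
    ids = sorted(set(present))
    missing = missing_partition_ids(ids)   # step 1: find the gaps
    if not missing or try_lock(missing):   # step 2: nobody is writing them
        return [ids] if ids else []
    return split_consecutive(ids)          # step 3: compact around the gaps


# Partition IDs 2 and 5 are missing here.
print(plan_compaction([0, 1, 3, 4, 6], lambda missing: True))
print(plan_compaction([0, 1, 3, 4, 6], lambda missing: False))
```

   With the lock granted, the plan is a single group containing every given 
segment; with the lock refused, it is one group per run of consecutive 
partitionIds, and the partitionIds still being written are left for a later 
compaction task.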


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


