FrankChen021 commented on PR #18939:
URL: https://github.com/apache/druid/pull/18939#issuecomment-3787886505

   So, in the future, there will be no 'compaction' term?
   
   However, I have a different view. IMO, compaction and re-indexing should 
be separated from each other; they should serve completely different purposes.
   
   Compaction should only merge small segments, without any schema changes 
(query granularity, segment granularity). It should run eagerly and 
aggressively, especially for Kafka ingestion, to reduce the number of 
segments. There are many problems and limitations around this feature that 
have not been solved. For example, compaction currently operates on a whole 
interval; if the interval contains many segments, the job takes a very long 
time to finish (and sometimes cannot realistically complete). This kind of 
compaction was originally named 'major compaction', while 'minor compaction' 
is meant to let us compact a given set of segments. But it is buggy now: even 
if we give it only 2 segments, for example, the task still fetches all 
segments in that interval. Another problem is that minor compaction only 
accepts consecutive segments. These problems are described in #9712, #9768, 
and #9571.
   However, these problems have not been solved, and we have long suffered 
from a large number of small segments.
   If we apply the term 're-indexing' to this use case, I think the term 
does not reflect the feature and only introduces confusion.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
