gaodayue commented on issue #2016: [Proposal] Limit the memory usage of 
Compaction
URL: 
https://github.com/apache/incubator-doris/issues/2016#issuecomment-545427345
 
 
   > what I want to do has no effect for current load process. It will be done 
before we add this rowset to StorageEngine. If we found there are too many 
number of rowsets, we can try to merge some of them to a bigger rowset. 
Actually we can do it for all load operation, because it will improve our read 
performance.
   
   I think the motivation for compaction within a rowset is to reduce the 
number of overlapped segments and improve query performance. However, when the 
number of segments is large, a single round of compaction would consume too 
much memory. So we need to decide how many segments to compact at a time based 
on the estimated RowBlock size and the available memory.
   
   My question is: the previous compaction strategy generates a rowset with a 
new version (e.g. 0-6) to replace the input rowsets with overlapping versions, 
but when compacting within a rowset, you end up with two rowsets that have the 
same version. Is that a problem? And if it is, how do you plan to solve it?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
