morningman opened a new issue #1951: Limit the memory usage of Loading process
URL: https://github.com/apache/incubator-doris/issues/1951
 
 
   ## Motivation
   
   The current load framework uses a memtable to receive incoming load data,
   and flushes it to disk when it reaches a size limit (100MB by default).
   Each tablet has its own memtable. If a Backend hosts many tablets and the
   load data is distributed evenly across them, total memory consumption can
   be very large, because no memtable is flushed until it reaches 100MB.
   For example, with 100 tablets on a Backend, peak memory consumption can
   reach 10GB (100 * 100MB), which may cause the process to be killed by the
   system OOM killer.
   
   ## How to resolve
   
   There will be two levels of memory limit for loading: one at the loading
   process level and one at the Backend level.
   
   1. Loading process-level limit
   
   Users can set a memory limit for each loading process. This limit caps the
   memory consumption of that loading process on a single Backend.
   
   In the implementation, a MemTracker tracks the memory consumption of a
   loading process on a Backend. When consumption reaches the limit, the
   process finds its largest memtable and flushes it to disk to reduce memory
   consumption.
   
   2. Backend-level limit 
   
   Each Backend should have a total loading memory limit, which caps the
   combined memory consumption of all loading processes on that Backend. When
   this limit is reached, the Load Mgr picks the loading process with the
   largest memory consumption and tries to flush its memtable.
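
The two levels described above can be sketched as follows. This is a minimal
illustration, not the Backend implementation: `LoadProcess`, `LoadManager`,
and every method name are hypothetical, sizes are plain integers (think MB),
and the existing per-memtable 100MB flush is left out to keep the example
short. The sketch assumes that "try to flush its memtable" at the Backend
level means flushing the largest memtable of the chosen process.

```python
class LoadProcess:
    """Hypothetical stand-in for one loading process on a Backend."""

    def __init__(self, name, mem_limit):
        self.name = name
        self.mem_limit = mem_limit
        self.memtables = {}   # tablet id -> buffered size
        self.flushed = []     # (tablet id, size) pairs, in flush order

    @property
    def consumption(self):
        # Plays the role of the MemTracker: total buffered size.
        return sum(self.memtables.values())

    def flush_largest(self):
        """Flush the largest memtable; return the size that was freed."""
        if not self.memtables:
            return 0
        tablet_id = max(self.memtables, key=self.memtables.get)
        size = self.memtables.pop(tablet_id)
        self.flushed.append((tablet_id, size))
        return size

    def write(self, tablet_id, size):
        self.memtables[tablet_id] = self.memtables.get(tablet_id, 0) + size
        # Level 1: process-level limit.
        while self.memtables and self.consumption >= self.mem_limit:
            self.flush_largest()


class LoadManager:
    """Hypothetical stand-in for the Load Mgr on one Backend."""

    def __init__(self, backend_limit):
        self.backend_limit = backend_limit
        self.processes = []

    def register(self, process):
        self.processes.append(process)

    @property
    def consumption(self):
        return sum(p.consumption for p in self.processes)

    def write(self, process, tablet_id, size):
        process.write(tablet_id, size)
        # Level 2: Backend-level limit across all loading processes.
        while self.consumption >= self.backend_limit:
            victim = max(self.processes, key=lambda p: p.consumption)
            if victim.flush_largest() == 0:
                break  # nothing left to flush


mgr = LoadManager(backend_limit=150)
load_a = LoadProcess("a", mem_limit=100)
load_b = LoadProcess("b", mem_limit=100)
mgr.register(load_a)
mgr.register(load_b)
mgr.write(load_a, 1, 60)
mgr.write(load_a, 2, 30)
mgr.write(load_b, 1, 70)   # Backend total hits 160, so load "a" is asked to flush
```

In this run neither process reaches its own limit, but the third write pushes
the Backend total over 150, so the manager flushes the largest memtable of
the process holding the most memory.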
   
   ## Configuration
   
   For the loading process-level limit, the default is 2GB, and users can set
   it via a session variable or a load property.
   For the Backend-level limit, the default is 60% of physical memory, capped
   at a maximum of 20GB.
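
As a rough illustration of these defaults, the Backend-level limit could be
derived as below. The constant and function names are hypothetical, and the
sketch assumes that the 20GB figure is an upper cap on the 60% value and that
1GB means 1024^3 bytes.

```python
GB = 1024 ** 3

# Defaults as stated in this proposal; the names are illustrative only.
DEFAULT_PROCESS_LOAD_MEM_LIMIT = 2 * GB
BACKEND_LOAD_MEM_PERCENT = 60
BACKEND_LOAD_MEM_CAP = 20 * GB


def backend_load_mem_limit(physical_mem_bytes):
    """Return 60% of physical memory, but never more than 20GB."""
    by_percent = physical_mem_bytes * BACKEND_LOAD_MEM_PERCENT // 100
    return min(by_percent, BACKEND_LOAD_MEM_CAP)


small_host = backend_load_mem_limit(10 * GB)   # 60% applies
large_host = backend_load_mem_limit(64 * GB)   # cap applies
```

Under these assumptions, a host with 10GB of physical memory gets a 6GB
limit, while a 64GB host is held to the 20GB cap.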
   

