Hi Maeda,

Thank you for your analysis. It helps me a lot, because I did not succeed in reproducing the hang on my machine. Now I see the problem and I'm working on it.

Thanks again,
Valérie
Hi Valérie,

Valerie Clement wrote:
> Hi Maeda,
> Could you try the following patch ?
> It should improve things, but I still think of a better correction.

I think the fundamental problem is a deadlock between reclaiming pages because the memory limit has been reached and allocating a new page in ext3.

To enforce its page write ordering, ext3 page-out may sleep until a journal handle is closed, and a new page may have to be allocated in order to close that handle. Therefore, if the new page allocation has to sleep because the memory limit has been reached, a deadlock occurs with the current memory controller logic, which waits indefinitely while the class is at its memory limit.

In my case, kswapd is sleeping in start_this_handle, which is called via ext3_ordered_writepage, and cc1 is waiting in blk_congestion_wait, which is called by __alloc_pages via ext3_create. ext3_create has opened the journal handle that kswapd seems to be waiting for. While the memory limit is reached, __alloc_pages never succeeds until some pages are reclaimed, but kswapd cannot reclaim a page until __alloc_pages succeeds.

I can't come up with a clever way to prevent this deadlock. However, it should be possible to detect that kswapd has been sleeping too long after shrink_ckrmzone is called, and then to allow memory allocation at a very low rate regardless of the memory limit. It may work.

Thanks,
MAEDA Naoaki

_______________________________________________
ckrm-tech mailing list
https://lists.sourceforge.net/lists/listinfo/ckrm-tech
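
As a rough illustration of the fallback suggested above (detect that the reclaimer has slept too long, then let allocations through at a very low rate despite the limit), here is a toy Python model. It is not kernel code and not the CKRM implementation: the class `LimitedZone`, its methods, and the tick-based thresholds are all hypothetical names and values chosen only for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LimitedZone:
    """Toy model of a memory class with a hard page limit (hypothetical)."""
    limit: int                      # maximum pages the class may hold
    used: int = 0                   # pages currently charged to the class
    stall_threshold: int = 5        # ticks the reclaimer may sleep before we relax
    overdraft_interval: int = 3     # at most one over-limit page per this many ticks
    stalled_since: Optional[int] = None   # tick at which the reclaimer went to sleep
    last_overdraft: Optional[int] = None  # tick of the last over-limit allocation

    def note_reclaim_sleep(self, now: int) -> None:
        """Record that the reclaimer started sleeping (e.g. on a journal handle)."""
        if self.stalled_since is None:
            self.stalled_since = now

    def note_reclaim_progress(self, freed: int) -> None:
        """Record that pages were reclaimed, which ends the stall."""
        self.used = max(0, self.used - freed)
        self.stalled_since = None

    def try_alloc(self, now: int) -> bool:
        """Try to charge one page; return True if the allocation may proceed."""
        if self.used < self.limit:
            self.used += 1
            return True
        # At the limit: normally the allocator would have to wait for reclaim.
        stalled_too_long = (
            self.stalled_since is not None
            and now - self.stalled_since >= self.stall_threshold
        )
        rate_ok = (
            self.last_overdraft is None
            or now - self.last_overdraft >= self.overdraft_interval
        )
        if stalled_too_long and rate_ok:
            # Let one page through so the handle holder can finish and
            # unblock the reclaimer, but only at a very low rate.
            self.used += 1
            self.last_overdraft = now
            return True
        return False
```

The point of the rate limit in this sketch is that the overdraft only needs to be large enough for the allocating task to close its journal handle; once reclaim makes progress again, the model returns to strict enforcement of the limit.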