Pavel Emelianov wrote:
> Balbir Singh wrote:
> 
> [snip]
> 
>>> And what about a hard limit - how would you fail in page fault in
>>> case of limit hit? SIGKILL/SEGV is not an option - in this case we
>>> should run synchronous reclamation. This is done in beancounter
>>> patches v6 we've sent recently.
>>>
>> I thought about running synchronous reclamation, but then did not follow
>> that approach, I was not sure if calling the reclaim routines from the
>> page fault context is a good thing to do. It's worth trying out, since
> 
> Each page fault potentially calls reclamation by allocating
> required page with __GFP_IO | __GFP_FS bits set. Synchronous
> reclamation in page fault is really normal.

True. I don't know what I was thinking, thanks for making me think
straight.
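
To make the flow concrete, here is a toy Python model of a hard limit
enforced with synchronous reclaim in the fault path. It is only a
sketch, not kernel code: Container, charge(), reclaim() and
handle_fault() are hypothetical names made up for illustration, and
"reclaim" here simply drops the oldest charged page.

```python
class Container:
    """Toy model of a memory container with a hard page limit (hypothetical)."""

    def __init__(self, limit):
        self.limit = limit      # hard limit, in pages
        self.pages = []         # pages charged to this container, oldest first

    def reclaim(self, nr):
        """Synchronously free up to nr of the oldest pages; return count freed."""
        freed = min(nr, len(self.pages))
        del self.pages[:freed]
        return freed

    def charge(self, page):
        """Charge one page, reclaiming synchronously if the limit is hit."""
        if len(self.pages) >= self.limit:
            if self.reclaim(1) == 0:
                return False    # nothing reclaimable: the fault would fail
        self.pages.append(page)
        return True


def handle_fault(container, page):
    """Fault path: returns True if the page could be charged and mapped."""
    return container.charge(page)
```

The point of the sketch is that hitting the limit does not kill the
task; the fault only fails when reclaim cannot free anything.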

> 
> [snip]
> 
>>> Please correct me if I'm wrong, but does this reclamation work like
>>> "run over all the zones' lists searching for page whose controller
>>> is sc->container" ?
>>>
>> Yeah, that's correct. The code can also reclaim memory from all
>> over-the-limit
> 
> OK. What if I have a container with 100 pages limit in a 4Gb
> (~ million of pages) machine and this group starts reclaiming
> its pages. In case this group uses its pages heavily they will
> be at the beginning of an LRU list and reclamation code would
> have to scan through all (million) pages before it finds proper
> ones. This is not optimal!
> 

Yes, that's possible. The trade-off is between the cost of traversing
the global list while reclaiming and the complexity of task migration.
If we keep a per-container list of pages, then during task migration
we have to move the task's pages from the old container's list to the
new container's list.
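
A rough way to see both sides of this trade-off is the toy Python model
below. It is a sketch under simplifying assumptions, not kernel code:
the LRU is a plain sequence with the coldest page at index 0, and the
helpers scan_global_lru(), scan_container_list() and migrate_task() are
hypothetical names introduced only for this illustration.

```python
def scan_global_lru(lru, owner, target, nr_to_reclaim):
    """Walk the global LRU from the cold end; return (reclaimed, scanned)."""
    reclaimed, scanned = [], 0
    for page in lru:                    # index 0 is the coldest page
        if len(reclaimed) == nr_to_reclaim:
            break
        scanned += 1
        if owner[page] == target:       # only the target's pages qualify
            reclaimed.append(page)
    return reclaimed, scanned


def scan_container_list(pages, nr_to_reclaim):
    """Reclaim from a per-container list; every page scanned is a hit."""
    reclaimed = pages[:nr_to_reclaim]
    return reclaimed, len(reclaimed)


def migrate_task(per_container, page_task, task, src, dst):
    """Move every page of `task` from src's list to dst's; return pages moved."""
    moving = [p for p in per_container[src] if page_task[p] == task]
    per_container[src] = [p for p in per_container[src] if page_task[p] != task]
    per_container[dst].extend(moving)
    return len(moving)
```

In this model, with a 1,000,000-page LRU whose hottest 100 pages belong
to the small container, reclaiming 10 of them through scan_global_lru()
scans 999,910 entries, while scan_container_list() scans 10. The price
of the per-container list shows up in migrate_task(), which has to
touch every page of the migrating task.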

>> containers (by passing SC_OVERLIMIT_ALL). The idea behind using such a scheme
>> is to ensure that the global LRU list is not broken.
> 
> isolate_lru_pages() helps in this. As far as I remember this
> was introduced to reduce lru lock contention and keep lru
> lists integrity.
> 
> In beancounters patches this is used to shrink BC's pages.

I'll look at isolate_lru_pages() to see if the reclaim can be optimized.

Thanks for your feedback,


-- 

        Balbir Singh,
        Linux Technology Center,
        IBM Software Labs

_______________________________________________
ckrm-tech mailing list
https://lists.sourceforge.net/lists/listinfo/ckrm-tech
