On Mon, 2 Jun 2014, David Rientjes wrote:

> mem_cgroup_force_empty_list() can iterate a large number of pages on an lru
> and mem_cgroup_move_parent() doesn't return an errno unless certain
> criteria, none of which indicate that the iteration may be taking too long,
> is met.
> 
> We have encountered the following stack trace many times indicating
> "need_resched set for > 51000020 ns (51 ticks) without schedule", for example:
> 
>       scheduler_tick()
>       <timer irq>
>       mem_cgroup_move_account+0x4d/0x1d5
>       mem_cgroup_move_parent+0x8d/0x109
>       mem_cgroup_reparent_charges+0x149/0x2ba
>       mem_cgroup_css_offline+0xeb/0x11b
>       cgroup_offline_fn+0x68/0x16b
>       process_one_work+0x129/0x350
> 
> If this iteration is taking too long, indicated by need_resched(), then 
> periodically schedule and continue from where we last left off.
> 
> Signed-off-by: David Rientjes <rient...@google.com>
> ---
>  mm/memcontrol.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4764,6 +4764,7 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
>       do {
>               struct page_cgroup *pc;
>               struct page *page;
> +             int ret;
>  
>               spin_lock_irqsave(&zone->lru_lock, flags);
>               if (list_empty(list)) {
> @@ -4781,8 +4782,13 @@ static void mem_cgroup_force_empty_list(struct mem_cgroup *memcg,
>  
>               pc = lookup_page_cgroup(page);
>  
> -             if (mem_cgroup_move_parent(page, pc, memcg)) {
> -                     /* found lock contention or "pc" is obsolete. */
> +             ret = mem_cgroup_move_parent(page, pc, memcg);
> +             if (ret || need_resched()) {
> +                     /*
> +                      * Couldn't grab the page reference, isolate the page,
> +                      * there was a pc mismatch, or we simply need to
> +                      * schedule because this is taking too long.
> +                      */
>                       busy = page;
>                       cond_resched();
>               } else

Why not just move that cond_resched() down below the if/else?
cond_resched() already checks whether a reschedule is needed, so there is
no need to test need_resched() separately, and a page that was moved
successfully is not busy.
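
For illustration only, here is a rough userspace model of the loop shape
suggested above: the reschedule point sits after the if/else, so it is
reached on every iteration and only a failed move marks the page busy.
Everything named model_* is a hypothetical stand-in, not a kernel API, and
the body of the else branch (clearing busy) is an assumption, since the
quoted hunk ends at "} else".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page on the lru; not a kernel structure. */
struct model_page {
	bool move_fails;	/* pretend mem_cgroup_move_parent() fails */
};

/* Counts how often the cond_resched() stand-in was reached. */
static int resched_points;

static void model_cond_resched(void)
{
	resched_points++;
}

/* Stand-in for mem_cgroup_move_parent(): nonzero means the move failed. */
static int model_move_parent(const struct model_page *page)
{
	return page->move_fails ? -1 : 0;
}

/*
 * One pass over n pages. Returns how many pages were marked busy.
 * The reschedule point is unconditional, so no separate need_resched()
 * test is required and a successful move never sets busy.
 */
static size_t model_force_empty_pass(const struct model_page *pages, size_t n)
{
	const struct model_page *busy = NULL;
	size_t busy_count = 0;
	size_t i;

	assert(pages != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (model_move_parent(&pages[i])) {
			/* the move failed, so this page really is busy */
			busy = &pages[i];
			busy_count++;
		} else {
			busy = NULL;	/* assumed body of the else branch */
		}
		model_cond_resched();	/* moved below the if/else */
	}

	(void)busy;	/* the real loop would consult busy on the next pass */
	return busy_count;
}
```

With three pages of which only the second fails to move, the model marks
one page busy and reaches the reschedule point three times.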

Hugh