Re: Re: [PATCH v2] kvfree_rcu: Release page cache under memory pressure
On Sat, Jan 30, 2021 at 06:47:31AM +, Zhang, Qiang wrote:
> From: Uladzislau Rezki
> Sent: January 29, 2021 22:19
> To: Zhang, Qiang
> Cc: ure...@gmail.com; paul...@kernel.org; j...@joelfernandes.org;
>     r...@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH v2] kvfree_rcu: Release page cache under memory pressure
>
> [Please note: This e-mail is from an EXTERNAL e-mail address]
>
> > On Fri, Jan 29, 2021 at 04:04:42PM +0800, qiang.zh...@windriver.com wrote:
> > > From: Zqiang
> > >
> > > Add an operation to free each per-cpu krcp's page cache when the
> > > system is under memory pressure.
> > >
> > > Signed-off-by: Zqiang
> > > ---
> > >  kernel/rcu/tree.c | 25 +++++++++++++++++++++++++
> > >  1 file changed, 25 insertions(+)
> > >
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index c1ae1e52f638..ec098910d80b 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -3571,17 +3571,40 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> > >  }
> > >  EXPORT_SYMBOL_GPL(kvfree_call_rcu);
> > >
> > > +static int free_krc_page_cache(struct kfree_rcu_cpu *krcp)
> > > +{
> > > +        unsigned long flags;
> > > +        struct kvfree_rcu_bulk_data *bnode;
> > > +        int i;
> > > +
> > > +        for (i = 0; i < rcu_min_cached_objs; i++) {
> > > +                raw_spin_lock_irqsave(&krcp->lock, flags);
> >
> > I am not sure why we should disable IRQs. I think it can be
> > avoided.
>
> Suppose that in a multi-CPU system kfree_rcu_shrink_scan() is running on
> CPU2, and the traversal has just reached CPU2 and called
> free_krc_page_cache(). If IRQs are not disabled, an interrupt may occur
> on CPU2 after CPU2's krcp->lock has been acquired. If the interrupt or
> softirq handler then calls kvfree_rcu(), that function acquires the same
> krcp->lock and a deadlock results. The same applies in a single-CPU
> scenario.

Right. Deadlock scenario. It went away from my head while writing that :)

Thanks!

--
Vlad Rezki
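For readers following along, the deadlock the two are agreeing on can be made concrete with a short kernel-style sketch. This is illustrative only, not code from the thread: the _broken/_fixed function names are invented, while raw_spin_lock()/raw_spin_lock_irqsave() and the per-CPU krcp->lock usage mirror kernel/rcu/tree.c.

        /*
         * Illustrative sketch only -- not from the patch.
         * Task context on CPU2 (the shrinker path), draining the page
         * cache with IRQs left enabled:
         */
        static int free_krc_page_cache_broken(struct kfree_rcu_cpu *krcp)
        {
                raw_spin_lock(&krcp->lock);   /* lock held, IRQs still on */

                /*
                 * If an interrupt fires on this CPU right here and its
                 * handler (or a softirq it raises) calls kvfree_rcu(),
                 * the handler enters kvfree_call_rcu(), which acquires
                 * this same per-CPU krcp->lock. The lock is already held
                 * by the interrupted task and raw spinlocks never sleep,
                 * so the CPU spins forever: a self-deadlock.
                 */

                raw_spin_unlock(&krcp->lock);
                return 0;
        }

        /*
         * What the patch actually does: keep local IRQs off across the
         * critical section, so kvfree_rcu() from IRQ context on this CPU
         * cannot interleave with the lock held.
         */
        static int free_krc_page_cache_fixed(struct kfree_rcu_cpu *krcp)
        {
                unsigned long flags;

                raw_spin_lock_irqsave(&krcp->lock, flags);
                /* ... touch krcp's page cache safely here ... */
                raw_spin_unlock_irqrestore(&krcp->lock, flags);
                return 0;
        }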
Re: [PATCH v2] kvfree_rcu: Release page cache under memory pressure
From: Uladzislau Rezki
Sent: January 29, 2021 22:19
To: Zhang, Qiang
Cc: ure...@gmail.com; paul...@kernel.org; j...@joelfernandes.org;
    r...@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] kvfree_rcu: Release page cache under memory pressure

[Please note: This e-mail is from an EXTERNAL e-mail address]

> On Fri, Jan 29, 2021 at 04:04:42PM +0800, qiang.zh...@windriver.com wrote:
> > From: Zqiang
> >
> > Add an operation to free each per-cpu krcp's page cache when the
> > system is under memory pressure.
> >
> > Signed-off-by: Zqiang
> > ---
> >  kernel/rcu/tree.c | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> >
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index c1ae1e52f638..ec098910d80b 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3571,17 +3571,40 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
> >  }
> >  EXPORT_SYMBOL_GPL(kvfree_call_rcu);
> >
> > +static int free_krc_page_cache(struct kfree_rcu_cpu *krcp)
> > +{
> > +        unsigned long flags;
> > +        struct kvfree_rcu_bulk_data *bnode;
> > +        int i;
> > +
> > +        for (i = 0; i < rcu_min_cached_objs; i++) {
> > +                raw_spin_lock_irqsave(&krcp->lock, flags);
>
> I am not sure why we should disable IRQs. I think it can be
> avoided.

Suppose that in a multi-CPU system kfree_rcu_shrink_scan() is running on
CPU2, and the traversal has just reached CPU2 and called
free_krc_page_cache(). If IRQs are not disabled, an interrupt may occur on
CPU2 after CPU2's krcp->lock has been acquired. If the interrupt or softirq
handler then calls kvfree_rcu(), that function acquires the same krcp->lock
and a deadlock results. The same applies in a single-CPU scenario.

> > +                bnode = get_cached_bnode(krcp);
> > +                raw_spin_unlock_irqrestore(&krcp->lock, flags);
> > +                if (!bnode)
> > +                        break;
> > +                free_page((unsigned long)bnode);
> > +        }
> > +
> > +        return i;
> > +}
>
> Also, I forgot to add this in my previous comment on this patch. Can we
> access the page cache once and then do the drain work? I mean, if we had
> 100 objects in the cache we would need to take krcp->lock 100 times.
>
> What about something like below:
>
> static int free_krc_page_cache(struct kfree_rcu_cpu *krcp)
> {
>         struct llist_node *page_list, *pos, *n;
>         int freed = 0;
>
>         raw_spin_lock(&krcp->lock);
>         page_list = llist_del_all(&krcp->bkvcache);
>         krcp->nr_bkv_objs = 0;
>         raw_spin_unlock(&krcp->lock);
>
>         llist_for_each_safe(pos, n, page_list) {
>                 free_page((unsigned long) pos);
>                 freed++;
>         }
>
>         return freed;
> }

This change looks better.

Thanks,
Qiang

> > +
> >  static unsigned long
> >  kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> >  {
> >          int cpu;
> >          unsigned long count = 0;
> > +        unsigned long flags;
> >
> >          /* Snapshot count of all CPUs */
> >          for_each_possible_cpu(cpu) {
> >                  struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
> >
> >                  count += READ_ONCE(krcp->count);
> > +
> > +                raw_spin_lock_irqsave(&krcp->lock, flags);
> > +                count += krcp->nr_bkv_objs;
> > +                raw_spin_unlock_irqrestore(&krcp->lock, flags);
>
> Should we disable irqs?
>
> >          }
> >
> >          return count;
> > @@ -3598,6 +3621,8 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> >                  struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
> >
> >                  count = krcp->count;
> > +                count += free_krc_page_cache(krcp);
> > +
> >                  raw_spin_lock_irqsave(&krcp->lock, flags);
> >                  if (krcp->monitor_todo)
> >                          kfree_rcu_drain_unlock(krcp, flags);
> > --
> > 2.17.1
>
> Thanks!
>
> --
> Vlad Rezki
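Vlad's sketch above uses plain raw_spin_lock(); folding in the deadlock concern raised earlier in this very thread presumably gives the variant below. This is an inference about how the two review points combine, not code posted in the thread; llist_del_all() and llist_for_each_safe() are the kernel's lock-free list helpers already used for the bkvcache.

        /*
         * Sketch combining both review points: drain the cache with a
         * single lock acquisition, and keep IRQs disabled while the
         * lock is held.
         */
        static int free_krc_page_cache(struct kfree_rcu_cpu *krcp)
        {
                unsigned long flags;
                struct llist_node *page_list, *pos, *n;
                int freed = 0;

                /* Detach the whole cache in one short critical section... */
                raw_spin_lock_irqsave(&krcp->lock, flags);
                page_list = llist_del_all(&krcp->bkvcache);
                krcp->nr_bkv_objs = 0;
                raw_spin_unlock_irqrestore(&krcp->lock, flags);

                /* ...then free the detached pages with the lock dropped. */
                llist_for_each_safe(pos, n, page_list) {
                        free_page((unsigned long)pos);
                        freed++;
                }

                return freed;
        }

The design point is the standard detach-then-free pattern: the lock protects only the pointer swap, so the potentially long free_page() loop runs outside the critical section, with IRQs enabled again.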
[PATCH v2] kvfree_rcu: Release page cache under memory pressure
From: Zqiang

Add an operation to free each per-cpu krcp's page cache when the
system is under memory pressure.

Signed-off-by: Zqiang
---
 kernel/rcu/tree.c | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c1ae1e52f638..ec098910d80b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3571,17 +3571,40 @@ void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
 }
 EXPORT_SYMBOL_GPL(kvfree_call_rcu);
 
+static int free_krc_page_cache(struct kfree_rcu_cpu *krcp)
+{
+        unsigned long flags;
+        struct kvfree_rcu_bulk_data *bnode;
+        int i;
+
+        for (i = 0; i < rcu_min_cached_objs; i++) {
+                raw_spin_lock_irqsave(&krcp->lock, flags);
+                bnode = get_cached_bnode(krcp);
+                raw_spin_unlock_irqrestore(&krcp->lock, flags);
+                if (!bnode)
+                        break;
+                free_page((unsigned long)bnode);
+        }
+
+        return i;
+}
+
 static unsigned long
 kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
 {
         int cpu;
         unsigned long count = 0;
+        unsigned long flags;
 
         /* Snapshot count of all CPUs */
         for_each_possible_cpu(cpu) {
                 struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
 
                 count += READ_ONCE(krcp->count);
+
+                raw_spin_lock_irqsave(&krcp->lock, flags);
+                count += krcp->nr_bkv_objs;
+                raw_spin_unlock_irqrestore(&krcp->lock, flags);
         }
 
         return count;
@@ -3598,6 +3621,8 @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
                 struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
 
                 count = krcp->count;
+                count += free_krc_page_cache(krcp);
+
                 raw_spin_lock_irqsave(&krcp->lock, flags);
                 if (krcp->monitor_todo)
                         kfree_rcu_drain_unlock(krcp, flags);
-- 
2.17.1
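For orientation, kfree_rcu_shrink_count() and kfree_rcu_shrink_scan() are the two callbacks of the shrinker that kernel/rcu/tree.c registers with the MM subsystem: under memory pressure the MM core calls count_objects() to size the reclaimable pool and scan_objects() to reclaim from it. The registration in kernels of this vintage looks roughly like the following (recalled from the tree.c of that era, so treat it as a sketch rather than the exact source):

        /* The shrinker ties the two callbacks patched above to the MM core. */
        static struct shrinker kfree_rcu_shrinker = {
                .count_objects = kfree_rcu_shrink_count,
                .scan_objects = kfree_rcu_shrink_scan,
                .batch = 0,
                .seeks = DEFAULT_SEEKS,
        };

        /* ...registered once during boot, in kfree_rcu_batch_init(): */
        register_shrinker(&kfree_rcu_shrinker);

The patch extends both callbacks: count_objects() now also reports the cached pages (krcp->nr_bkv_objs) as reclaimable, and scan_objects() frees them via free_krc_page_cache() before draining the pending kvfree_rcu() batches.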