Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-19 Thread Michal Hocko
On Thu 19-09-19 15:46:11, Lin Feng wrote: > > > On 9/19/19 11:49, Matthew Wilcox wrote: > > On Thu, Sep 19, 2019 at 10:33:10AM +0800, Lin Feng wrote: > > > On 9/18/19 20:33, Michal Hocko wrote: > > > > I absolutely agree here. From your changelog it is also not clear what is > > > > the

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-19 Thread Lin Feng
On 9/19/19 11:49, Matthew Wilcox wrote: On Thu, Sep 19, 2019 at 10:33:10AM +0800, Lin Feng wrote: On 9/18/19 20:33, Michal Hocko wrote: I absolutely agree here. From your changelog it is also not clear what is the underlying problem. Both congestion_wait and wait_iff_congested should wake up

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Matthew Wilcox
On Thu, Sep 19, 2019 at 10:33:10AM +0800, Lin Feng wrote: > On 9/18/19 20:33, Michal Hocko wrote: > > I absolutely agree here. From your changelog it is also not clear what is > > the underlying problem. Both congestion_wait and wait_iff_congested > > should wake up early if the congestion is

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
On 9/18/19 20:33, Michal Hocko wrote: +mm_reclaim_congestion_wait_jiffies +== + +This control is used to define how long kernel will wait/sleep while +system memory is under pressure and memory reclaim is relatively active. +Lower values will decrease the kernel wait/sleep time. +

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Lin Feng
Hi, On 9/18/19 19:38, Matthew Wilcox wrote: On Wed, Sep 18, 2019 at 11:21:04AM +0800, Lin Feng wrote: Adding a new tunable is not the right solution. The right way is to make Linux auto-tune itself to avoid the problem. For example, bdi_writeback contains an estimated write bandwidth

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Michal Hocko
On Tue 17-09-19 05:06:46, Matthew Wilcox wrote: > On Tue, Sep 17, 2019 at 07:58:24PM +0800, Lin Feng wrote: [...] > > +mm_reclaim_congestion_wait_jiffies > > +== > > + > > +This control is used to define how long kernel will wait/sleep while > > +system memory is under pressure and memory

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-18 Thread Matthew Wilcox
On Wed, Sep 18, 2019 at 11:21:04AM +0800, Lin Feng wrote: > > Adding a new tunable is not the right solution. The right way is > > to make Linux auto-tune itself to avoid the problem. For example, > > bdi_writeback contains an estimated write bandwidth (calculated by the > > memory management

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-17 Thread Lin Feng
On 9/17/19 20:06, Matthew Wilcox wrote: On Tue, Sep 17, 2019 at 07:58:24PM +0800, Lin Feng wrote: In direct and background(kswapd) pages reclaim paths both may fall into calling msleep(100) or congestion_wait(HZ/10) or wait_iff_congested(HZ/10) while under IO pressure, and the sleep length

[PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-17 Thread Lin Feng
This sysctl is named mm_reclaim_congestion_wait_jiffies and defaults to HZ/10, which leaves the existing behavior unchanged. It is in units of jiffies and can be set in the range [1, 100], so refer to CONFIG_HZ before tuning. In direct and background(kswapd) pages reclaim paths both may fall into calling
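Because the proposed value is expressed in jiffies, the same number means a different wall-clock wait depending on CONFIG_HZ. The following user-space sketch illustrates that arithmetic only; it is not code from the patch, and the helper names (`default_wait_jiffies`, `clamp_wait_jiffies`, `wait_jiffies_to_ms`, `print_wait_table`) and the sample HZ values are assumptions chosen for illustration.

```c
#include <stdio.h>

/* Bounds quoted in the patch description: value in jiffies, range [1, 100]. */
#define WAIT_JIFFIES_MIN 1u
#define WAIT_JIFFIES_MAX 100u

/* Default described in the posting: HZ/10 jiffies (hypothetical helper). */
unsigned int default_wait_jiffies(unsigned int hz)
{
    return hz / 10u;
}

/* Clamp a requested value into the documented [1, 100] range. */
unsigned int clamp_wait_jiffies(unsigned int jiffies)
{
    if (jiffies < WAIT_JIFFIES_MIN)
        return WAIT_JIFFIES_MIN;
    if (jiffies > WAIT_JIFFIES_MAX)
        return WAIT_JIFFIES_MAX;
    return jiffies;
}

/* Convert a jiffies count to milliseconds for a given tick rate (hz > 0). */
unsigned int wait_jiffies_to_ms(unsigned int jiffies, unsigned int hz)
{
    return jiffies * 1000u / hz;
}

/* Print the default, minimum, and maximum wait for a few sample HZ values. */
void print_wait_table(void)
{
    const unsigned int hz_values[] = { 100u, 250u, 1000u }; /* assumed samples */
    size_t i;

    for (i = 0; i < sizeof(hz_values) / sizeof(hz_values[0]); i++) {
        unsigned int hz = hz_values[i];

        printf("HZ=%u default=%u ms min=%u ms max=%u ms\n",
               hz,
               wait_jiffies_to_ms(default_wait_jiffies(hz), hz),
               wait_jiffies_to_ms(WAIT_JIFFIES_MIN, hz),
               wait_jiffies_to_ms(WAIT_JIFFIES_MAX, hz));
    }
}
```

Calling `print_wait_table()` from a `main` function shows that the HZ/10 default is 100 ms at each sampled tick rate, while the [1, 100] range spans 10 ms to 1000 ms at HZ=100 but only 1 ms to 100 ms at HZ=1000.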

Re: [PATCH] [RFC] vmscan.c: add a sysctl entry for controlling memory reclaim IO congestion_wait length

2019-09-17 Thread Matthew Wilcox
On Tue, Sep 17, 2019 at 07:58:24PM +0800, Lin Feng wrote: > In direct and background(kswapd) pages reclaim paths both may fall into > calling msleep(100) or congestion_wait(HZ/10) or wait_iff_congested(HZ/10) > while under IO pressure, and the sleep length is hard-coded and the latter > two will