Re: [PATCH 0/7 v4] move update blocked load outside newidle_balance

2021-02-25 Thread Valentin Schneider
On 25/02/21 09:05, Vincent Guittot wrote:
>> One last thing for patch 7: mayhaps we could do a tad better to avoid
>> duplicate updates going through a heapful of leaf cfs rqs, see
>>
>>   http://lore.kernel.org/r/jhj4kiht7oh.mog...@arm.com
>
> rq->last_blocked_load_update_tick is there only to filter duplicate
> update during _nohz_idle_balance but not for other normal LB.
>

Right, update_nohz_stats() is now only used in _nohz_idle_balance()...
IIUC the pattern being covered here would be a CPUX getting kicked for nohz
stats/balance, then some CPUX-N (so it gets picked by find_new_ilb())
entering idle and getting kick in turn (less than a jiffy apart). 



Re: [PATCH 0/7 v4] move update blocked load outside newidle_balance

2021-02-25 Thread Vincent Guittot
Hi Valentin,

On Wed, 24 Feb 2021 at 19:46, Valentin Schneider
 wrote:
>
> On 24/02/21 14:30, Vincent Guittot wrote:
> > Joel reported long preempt and irq off sequence in newidle_balance because
> > of a large number of CPU cgroups in use and having to be updated. This
> > patchset moves the update outside newidle_imblance. This enables to early
> > abort during the updates in case of pending irq as an example.
> >
> > Instead of kicking a normal ILB that will wakes up CPU which is already
> > idle, patch 6 triggers the update of statistics in the idle thread of
> > the CPU before selecting and entering an idle state.
> >
> > Changes on v4:
> > - Add a dedicated bit for updating blocked load when entering idle.
> >   This simplifies the management of concurrency with kick_ilb.
> >
>
> I believe that solves the issues vs nohz balance.
>
> One last thing for patch 7: mayhaps we could do a tad better to avoid
> duplicate updates going through a heapful of leaf cfs rqs, see
>
>   http://lore.kernel.org/r/jhj4kiht7oh.mog...@arm.com

rq->last_blocked_load_update_tick is there only to filter duplicate
update during _nohz_idle_balance but not for other normal LB.

>
>
> Otherwise, feel free to add to the lot:
>
>   Reviewed-by: Valentin Schneider 
>


Re: [PATCH 0/7 v4] move update blocked load outside newidle_balance

2021-02-24 Thread Valentin Schneider
On 24/02/21 14:30, Vincent Guittot wrote:
> Joel reported long preempt and irq off sequence in newidle_balance because
> of a large number of CPU cgroups in use and having to be updated. This
> patchset moves the update outside newidle_imblance. This enables to early
> abort during the updates in case of pending irq as an example.
>
> Instead of kicking a normal ILB that will wakes up CPU which is already
> idle, patch 6 triggers the update of statistics in the idle thread of
> the CPU before selecting and entering an idle state.
>
> Changes on v4:
> - Add a dedicated bit for updating blocked load when entering idle.
>   This simplifies the management of concurrency with kick_ilb.
>

I believe that solves the issues vs nohz balance.

One last thing for patch 7: mayhaps we could do a tad better to avoid
duplicate updates going through a heapful of leaf cfs rqs, see

  http://lore.kernel.org/r/jhj4kiht7oh.mog...@arm.com


Otherwise, feel free to add to the lot:

  Reviewed-by: Valentin Schneider 



Re: [PATCH 0/7 v4] move update blocked load outside newidle_balance

2021-02-24 Thread Peter Zijlstra
On Wed, Feb 24, 2021 at 06:51:01PM +0100, Vincent Guittot wrote:

> > OK, shall I add something like:
> >
> > This reduces the IRQ latency from O(nr_cgroups * nr_nohz_cpus) to
> > O(nr_cgroups).
> >
> > To the changelog of patch #1 ?
> 
> Yes, good point. This will clarify the range of improvement

OK, done.


Re: [PATCH 0/7 v4] move update blocked load outside newidle_balance

2021-02-24 Thread Vincent Guittot
On Wed, 24 Feb 2021 at 18:41, Peter Zijlstra  wrote:
>
> On Wed, Feb 24, 2021 at 04:57:15PM +0100, Vincent Guittot wrote:
> > On Wed, 24 Feb 2021 at 16:54, Peter Zijlstra  wrote:
> > >
> > > On Wed, Feb 24, 2021 at 02:30:00PM +0100, Vincent Guittot wrote:
> > > > Joel reported long preempt and irq off sequence in newidle_balance 
> > > > because
> > > > of a large number of CPU cgroups in use and having to be updated. This
> > > > patchset moves the update outside newidle_imblance. This enables to 
> > > > early
> > > > abort during the updates in case of pending irq as an example.
> > > >
> > > > Instead of kicking a normal ILB that will wakes up CPU which is already
> > > > idle, patch 6 triggers the update of statistics in the idle thread of
> > > > the CPU before selecting and entering an idle state.
> > >
> > > I'm confused... update_blocked_averages(), which calls
> > > __update_blocked_fair(), which is the one doing the cgroup iteration
> > > thing, runs with rq->lock held, and thus will have IRQs disabled any
> > > which way around we turn this thing.
> > >
> > > Or is the problem that we called nohz_idle_balance(), which does
> > > update_nohz_stats() -> update_blocked_averages() for evey NOHZ cpu from
> > > newidle balance, such that we get NR_NOHZ_CPUS * NR_CGROUPS IRQ latency?
> > > Which is now reduced to just NR_CGROUPS ?
> >
> > Yes we can now abort between each cpu update
>
> OK, shall I add something like:
>
> This reduces the IRQ latency from O(nr_cgroups * nr_nohz_cpus) to
> O(nr_cgroups).
>
> To the changelog of patch #1 ?

Yes, good point. This will clarify the range of improvement


Re: [PATCH 0/7 v4] move update blocked load outside newidle_balance

2021-02-24 Thread Peter Zijlstra
On Wed, Feb 24, 2021 at 04:57:15PM +0100, Vincent Guittot wrote:
> On Wed, 24 Feb 2021 at 16:54, Peter Zijlstra  wrote:
> >
> > On Wed, Feb 24, 2021 at 02:30:00PM +0100, Vincent Guittot wrote:
> > > Joel reported long preempt and irq off sequence in newidle_balance because
> > > of a large number of CPU cgroups in use and having to be updated. This
> > > patchset moves the update outside newidle_imblance. This enables to early
> > > abort during the updates in case of pending irq as an example.
> > >
> > > Instead of kicking a normal ILB that will wakes up CPU which is already
> > > idle, patch 6 triggers the update of statistics in the idle thread of
> > > the CPU before selecting and entering an idle state.
> >
> > I'm confused... update_blocked_averages(), which calls
> > __update_blocked_fair(), which is the one doing the cgroup iteration
> > thing, runs with rq->lock held, and thus will have IRQs disabled any
> > which way around we turn this thing.
> >
> > Or is the problem that we called nohz_idle_balance(), which does
> > update_nohz_stats() -> update_blocked_averages() for evey NOHZ cpu from
> > newidle balance, such that we get NR_NOHZ_CPUS * NR_CGROUPS IRQ latency?
> > Which is now reduced to just NR_CGROUPS ?
> 
> Yes we can now abort between each cpu update

OK, shall I add something like:

This reduces the IRQ latency from O(nr_cgroups * nr_nohz_cpus) to
O(nr_cgroups).

To the changelog of patch #1 ?


Re: [PATCH 0/7 v4] move update blocked load outside newidle_balance

2021-02-24 Thread Vincent Guittot
On Wed, 24 Feb 2021 at 16:54, Peter Zijlstra  wrote:
>
> On Wed, Feb 24, 2021 at 02:30:00PM +0100, Vincent Guittot wrote:
> > Joel reported long preempt and irq off sequence in newidle_balance because
> > of a large number of CPU cgroups in use and having to be updated. This
> > patchset moves the update outside newidle_imblance. This enables to early
> > abort during the updates in case of pending irq as an example.
> >
> > Instead of kicking a normal ILB that will wakes up CPU which is already
> > idle, patch 6 triggers the update of statistics in the idle thread of
> > the CPU before selecting and entering an idle state.
>
> I'm confused... update_blocked_averages(), which calls
> __update_blocked_fair(), which is the one doing the cgroup iteration
> thing, runs with rq->lock held, and thus will have IRQs disabled any
> which way around we turn this thing.
>
> Or is the problem that we called nohz_idle_balance(), which does
> update_nohz_stats() -> update_blocked_averages() for evey NOHZ cpu from
> newidle balance, such that we get NR_NOHZ_CPUS * NR_CGROUPS IRQ latency?
> Which is now reduced to just NR_CGROUPS ?

Yes we can now abort between each cpu update


Re: [PATCH 0/7 v4] move update blocked load outside newidle_balance

2021-02-24 Thread Peter Zijlstra
On Wed, Feb 24, 2021 at 02:30:00PM +0100, Vincent Guittot wrote:
> Joel reported long preempt and irq off sequence in newidle_balance because
> of a large number of CPU cgroups in use and having to be updated. This
> patchset moves the update outside newidle_imblance. This enables to early
> abort during the updates in case of pending irq as an example.
> 
> Instead of kicking a normal ILB that will wakes up CPU which is already
> idle, patch 6 triggers the update of statistics in the idle thread of
> the CPU before selecting and entering an idle state.

I'm confused... update_blocked_averages(), which calls
__update_blocked_fair(), which is the one doing the cgroup iteration
thing, runs with rq->lock held, and thus will have IRQs disabled any
which way around we turn this thing.

Or is the problem that we called nohz_idle_balance(), which does
update_nohz_stats() -> update_blocked_averages() for evey NOHZ cpu from
newidle balance, such that we get NR_NOHZ_CPUS * NR_CGROUPS IRQ latency?
Which is now reduced to just NR_CGROUPS ?


[PATCH 0/7 v4] move update blocked load outside newidle_balance

2021-02-24 Thread Vincent Guittot
Joel reported long preempt and irq off sequence in newidle_balance because
of a large number of CPU cgroups in use and having to be updated. This
patchset moves the update outside newidle_imblance. This enables to early
abort during the updates in case of pending irq as an example.

Instead of kicking a normal ILB that will wakes up CPU which is already
idle, patch 6 triggers the update of statistics in the idle thread of
the CPU before selecting and entering an idle state.

Changes on v4:
- Add a dedicated bit for updating blocked load when entering idle.
  This simplifies the management of concurrency with kick_ilb.

Changes on v3:
- Fixed a compilation error for !CONFIG_SMP && CONFIG_NO_HZ_COMMON
  reported by kernel test robot 
- Took advantage of this new version to add a short desciption for
  nohz_run_idle_balance

Changes on v2:
- Fixed some typos and updated some comments
- Added more cleanup
- Changed to way to trigger ILB in idle thread context to remove a possible
  race condition between the normal softirq ILB and this new mecanism. The
  cpu can already be set in idle_cpus_mask because even if the cpu is added
  later when entering idle, it might not have been removed yet from previous
  idle phase.
  
Vincent Guittot (7):
  sched/fair: remove update of blocked load from newidle_balance
  sched/fair: remove unused return of _nohz_idle_balance
  sched/fair: remove unused parameter of update_nohz_stats
  sched/fair: merge for each idle cpu loop of ILB
  sched/fair: reorder newidle_balance pulled_task tests
  sched/fair: trigger the update of blocked load on newly idle cpu
  sched/fair: reduce the window for duplicated update

 kernel/sched/core.c  |   2 +-
 kernel/sched/fair.c  | 118 +--
 kernel/sched/idle.c  |   6 +++
 kernel/sched/sched.h |   7 +++
 4 files changed, 60 insertions(+), 73 deletions(-)

-- 
2.17.1