Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Jens Axboe
On 08/08/2017 12:33 PM, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 18:50 +0200, Mike Galbraith wrote:
>> On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
>>>
>>> Should these go back farther than 4.12?  Looks like they apply cleanly
>>> to 4.9, didn't look older than that...
>>
>> I met prerequisites at 4.11...
> 
> FWIW, I took/modified 2d0364c8c1a9.  Dunno if the suspend regression
> exists in 4.11 though, without which you'll likely want nothing.

It does exist; the only change here is that we default people to
scsi-mq in 4.12+. Honestly, nobody complained since we've had scsi-mq,
so I could go either way on whether we really need the changes in
earlier versions or not.

-- 
Jens Axboe



Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Mike Galbraith
On Tue, 2017-08-08 at 18:50 +0200, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
> > 
> > Should these go back farther than 4.12?  Looks like they apply cleanly
> > to 4.9, didn't look older than that...
> 
> I met prerequisites at 4.11...

FWIW, I took/modified 2d0364c8c1a9.  Dunno if the suspend regression
exists in 4.11 though, without which you'll likely want nothing.

-Mike



Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Mike Galbraith
On Tue, 2017-08-08 at 09:44 -0700, Greg KH wrote:
> 
> Should these go back farther than 4.12?  Looks like they apply cleanly
> to 4.9, didn't look older than that...

I met prerequisites at 4.11, but I wasn't patching anything remotely
resembling virgin source.

-Mike


Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Oleksandr Natalenko
Greg,

this is 765e40b675a9566459ddcb8358ad16f3b8344bbe.

On Tuesday, 8 August 2017 18:43:33 CEST Greg KH wrote:
> On Tue, Aug 08, 2017 at 06:36:01PM +0200, Oleksandr Natalenko wrote:
> > Could you queue "block: disable runtime-pm for blk-mq" too please? It is
> > also related to suspend-resume freezes that were observed by multiple
> > users.
> What is the git commit id of that patch?
> 
> thanks,
> 
> greg k-h




Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Greg KH
On Tue, Aug 08, 2017 at 06:34:01PM +0200, Mike Galbraith wrote:
> On Tue, 2017-08-08 at 09:22 -0700, Greg KH wrote:
> > On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> > > Hello Mike et al.
> > > 
> > > On Sunday, 30 July 2017 7:12:31 CEST Mike Galbraith wrote:
> > > > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > > > stable fixed it.
> > > 
> > > My build already includes v4.12.4.
> > > 
> > > > If not, I'd find these two commits irresistible.
> > > > 
> > > > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> > > > 4b855ad37194f blk-mq: Create hctx for each present CPU
> > > 
> > > I've applied these 2 commits, and cannot reproduce the issue anymore.
> > > Looks like a perfect hit, thanks!
> > > 
> > > > 'course applying random upstream bits does come with some risk, trying
> > > > a kernel already containing them has less "entertainment" potential. 
> > > 
> > > Should you consider applying them to v4.12.x stable series? CC'ing Greg
> > > just in case.
> > 
> > I can queue these up if I get an ack from the developers/maintainers
> > that it is ok to do so...
> > 
> > {hint}
> 
> {hint++}
> 
> Those commits take Steven Rostedt's hotplug stress script runtime down
> from 4 _minutes_ down to 7 seconds for my RT tree, so I'm rather hoping
> you hear an "ACK" too.

Oh, nice!

Should these go back farther than 4.12?  Looks like they apply cleanly
to 4.9, didn't look older than that...

thanks,

greg k-h


Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Greg KH
On Tue, Aug 08, 2017 at 06:36:01PM +0200, Oleksandr Natalenko wrote:
> Could you queue "block: disable runtime-pm for blk-mq" too please? It is also 
> related to suspend-resume freezes that were observed by multiple users.

What is the git commit id of that patch?

thanks,

greg k-h


Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Oleksandr Natalenko
Could you queue "block: disable runtime-pm for blk-mq" too please? It is also 
related to suspend-resume freezes that were observed by multiple users.

Thanks.

On Tuesday, 8 August 2017 18:33:29 CEST Jens Axboe wrote:
> On 08/08/2017 10:22 AM, Greg KH wrote:
> > On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> >> Hello Mike et al.
> >> 
> >> On Sunday, 30 July 2017 7:12:31 CEST Mike Galbraith wrote:
> >>> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> >>> stable fixed it.
> >> 
> >> My build already includes v4.12.4.
> >> 
> >>> If not, I'd find these two commits irresistible.
> >>> 
> >>> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> >>> 4b855ad37194f blk-mq: Create hctx for each present CPU
> >> 
> >> I've applied these 2 commits, and cannot reproduce the issue anymore.
> >> Looks like a perfect hit, thanks!
> >> 
> >>> 'course applying random upstream bits does come with some risk, trying
> >>> a kernel already containing them has less "entertainment" potential.
> >> 
> >> Should you consider applying them to v4.12.x stable series? CC'ing Greg
> >> just in case.
> > 
> > I can queue these up if I get an ack from the developers/maintainers
> > that it is ok to do so...
> 
> You can add those two commits to stable.




Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Mike Galbraith
On Tue, 2017-08-08 at 09:22 -0700, Greg KH wrote:
> On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> > Hello Mike et al.
> > 
> > On Sunday, 30 July 2017 7:12:31 CEST Mike Galbraith wrote:
> > > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > > stable fixed it.
> > 
> > My build already includes v4.12.4.
> > 
> > > If not, I'd find these two commits irresistible.
> > > 
> > > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> > > 4b855ad37194f blk-mq: Create hctx for each present CPU
> > 
> > I've applied these 2 commits, and cannot reproduce the issue anymore. Looks 
> > like a perfect hit, thanks!
> > 
> > > 'course applying random upstream bits does come with some risk, trying
> > > a kernel already containing them has less "entertainment" potential. 
> > 
> > Should you consider applying them to v4.12.x stable series? CC'ing Greg
> > just in case.
> 
> I can queue these up if I get an ack from the developers/maintainers
> that it is ok to do so...
> 
> {hint}

{hint++}

Those commits take Steven Rostedt's hotplug stress script runtime down
from 4 _minutes_ down to 7 seconds for my RT tree, so I'm rather hoping
you hear an "ACK" too.

-Mike


Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Jens Axboe
On 08/08/2017 10:22 AM, Greg KH wrote:
> On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
>> Hello Mike et al.
>>
>> On Sunday, 30 July 2017 7:12:31 CEST Mike Galbraith wrote:
>>> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
>>> stable fixed it.
>>
>> My build already includes v4.12.4.
>>
>>> If not, I'd find these two commits irresistible.
>>>
>>> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
>>> 4b855ad37194f blk-mq: Create hctx for each present CPU
>>
>> I've applied these 2 commits, and cannot reproduce the issue anymore. Looks 
>> like a perfect hit, thanks!
>>
>>> 'course applying random upstream bits does come with some risk, trying
>>> a kernel already containing them has less "entertainment" potential. 
>>
>> Should you consider applying them to v4.12.x stable series? CC'ing Greg just 
>> in case.
> 
> I can queue these up if I get an ack from the developers/maintainers
> that it is ok to do so...

You can add those two commits to stable.

-- 
Jens Axboe



Re: blk-mq breaks suspend even with runtime PM patch

2017-08-08 Thread Greg KH
On Sun, Jul 30, 2017 at 03:50:15PM +0200, Oleksandr Natalenko wrote:
> Hello Mike et al.
> 
> On Sunday, 30 July 2017 7:12:31 CEST Mike Galbraith wrote:
> > FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> > stable fixed it.
> 
> My build already includes v4.12.4.
> 
> > If not, I'd find these two commits irresistible.
> > 
> > 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> > 4b855ad37194f blk-mq: Create hctx for each present CPU
> 
> I've applied these 2 commits, and cannot reproduce the issue anymore. Looks 
> like a perfect hit, thanks!
> 
> > 'course applying random upstream bits does come with some risk, trying
> > a kernel already containing them has less "entertainment" potential. 
> 
> Should you consider applying them to v4.12.x stable series? CC'ing Greg just 
> in case.

I can queue these up if I get an ack from the developers/maintainers
that it is ok to do so...

{hint}

thanks,

greg k-h


Re: blk-mq breaks suspend even with runtime PM patch

2017-07-30 Thread Oleksandr Natalenko
Hello Mike et al.

On Sunday, 30 July 2017 7:12:31 CEST Mike Galbraith wrote:
> FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
> stable fixed it.

My build already includes v4.12.4.

> If not, I'd find these two commits irresistible.
> 
> 5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
> 4b855ad37194f blk-mq: Create hctx for each present CPU

I've applied these 2 commits, and cannot reproduce the issue anymore. Looks 
like a perfect hit, thanks!

> 'course applying random upstream bits does come with some risk, trying
> a kernel already containing them has less "entertainment" potential. 

Should you consider applying them to v4.12.x stable series? CC'ing Greg just 
in case.


Re: blk-mq breaks suspend even with runtime PM patch

2017-07-29 Thread Mike Galbraith
On Sat, 2017-07-29 at 17:27 +0200, Oleksandr Natalenko wrote:
> Hello Jens, Christoph.
> 
> Unfortunately, even with "block: disable runtime-pm for blk-mq" patch applied 
> blk-mq breaks suspend to RAM for me. It is reproducible on my laptop as well 
> as in a VM.
> 
> I use complex disk layout involving MD, LUKS and LVM, and managed to get
> these warnings from VM via serial console when suspend fails:
> 
> ===
> [  245.516573] INFO: task kworker/0:1:49 blocked for more than 120 seconds.
> [  245.520025]   Not tainted 4.12.0-pf4 #1

FWIW, first thing I'd do is update that 4.12.0 to 4.12.4, and see if
stable fixed it.  If not, I'd find these two commits irresistible.

5f042e7cbd9eb blk-mq: Include all present CPUs in the default queue mapping
4b855ad37194f blk-mq: Create hctx for each present CPU

'course applying random upstream bits does come with some risk, trying
a kernel already containing them has less "entertainment" potential. 

-Mike


Re: blk-mq breaks suspend even with runtime PM patch

2017-07-29 Thread Oleksandr Natalenko
Recompiled kernel with lockdep enabled gives me this:

===
[  368.655051] Showing all locks held in the system:
[  368.656387] 1 lock held by khungtaskd/37:
[  368.657171]  #0:  (tasklist_lock){.+.+..}, at: [] debug_show_all_locks+0x3d/0x1a0
[  368.658725] 1 lock held by md0_raid10/458:
[  368.659455]  #0:  (>reconfig_mutex){+.+.+.}, at: [] md_check_recovery+0xaf/0x4d0 [md_mod]
[  368.661403] 3 locks held by btrfs-transacti/550:
[  368.662754]  #0:  (_info->transaction_kthread_mutex){+.+...}, at: [] transaction_kthread+0x69/0x1c0 [btrfs]
[  368.664797]  #1:  (_info->reloc_mutex){+.+...}, at: [] btrfs_commit_transaction+0x2e1/0x9b0 [btrfs]
[  368.69]  #2:  (_info->tree_log_mutex){+.+...}, at: [] btrfs_commit_transaction+0x351/0x9b0 [btrfs]
[  368.668644] 4 locks held by kworker/0:2/888:
[  368.669384]  #0:  ("events"){.+.+.+}, at: [] process_one_work+0x1fb/0x6e0
[  368.670916]  #1:  ((shepherd).work){+.+...}, at: [] process_one_work+0x1fb/0x6e0
[  368.672592]  #2:  (cpu_hotplug.dep_map){++}, at: [] get_online_cpus.part.14+0x5/0x50
[  368.674742]  #3:  (cpu_hotplug.lock){+.+.+.}, at: [] get_online_cpus.part.14+0x3a/0x50
[  368.677494] 10 locks held by systemd-sleep/889:
[  368.678650]  #0:  (sb_writers#5){.+.+.+}, at: [] vfs_write+0x17b/0x1a0
[  368.680483]  #1:  (>mutex){+.+.+.}, at: [] kernfs_fop_write+0x123/0x1e0
[  368.682412]  #2:  (s_active#257){.+.+.+}, at: [] kernfs_fop_write+0x12c/0x1e0
[  368.684440]  #3:  (autosleep_lock){+.+.+.}, at: [] pm_autosleep_lock+0x17/0x20
[  368.686707]  #4:  (pm_mutex){+.+.+.}, at: [] pm_suspend+0x88/0x490
[  368.688086]  #5:  (acpi_scan_lock){+.+.+.}, at: [] acpi_scan_lock_acquire+0x17/0x20
[  368.690213]  #6:  (cpu_add_remove_lock){+.+.+.}, at: [] freeze_secondary_cpus+0x30/0x3c0
[  368.692016]  #7:  (cpu_hotplug.dep_map){++}, at: [] cpu_hotplug_begin+0x5/0xe0
[  368.694347]  #8:  (cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x83/0xe0
[  368.696010]  #9:  (all_q_mutex){+.+...}, at: [] blk_mq_queue_reinit_work+0x1a/0x110
[  368.698624] 
[  368.698990] =
[  368.698990]
===

Deadlock with CPU hotplug?
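
One way to read the dump above is to group its entries by lock name and look
for locks that show up under more than one task. What follows is a minimal
Python sketch, not something posted in the thread: locks_by_task, shared_locks
and both regular expressions are hypothetical helpers, and the sample lines are
copied from the dump with wrapped lines joined and the stripped addresses left
as "[]". A lock listed for a task may be one the task is still waiting for, so
a shared name is only a hint, not proof of a deadlock.

```python
import re
from collections import defaultdict

# A few lines taken from the lock dump above (assumption: wrapped lines joined).
SAMPLE = """\
[  368.668644] 4 locks held by kworker/0:2/888:
[  368.672592]  #2:  (cpu_hotplug.dep_map){++}, at: [] get_online_cpus.part.14+0x5/0x50
[  368.674742]  #3:  (cpu_hotplug.lock){+.+.+.}, at: [] get_online_cpus.part.14+0x3a/0x50
[  368.677494] 10 locks held by systemd-sleep/889:
[  368.686707]  #4:  (pm_mutex){+.+.+.}, at: [] pm_suspend+0x88/0x490
[  368.694347]  #8:  (cpu_hotplug.lock){+.+.+.}, at: [] cpu_hotplug_begin+0x83/0xe0
[  368.696010]  #9:  (all_q_mutex){+.+...}, at: [] blk_mq_queue_reinit_work+0x1a/0x110
"""

# "N lock(s) held by <comm>/<pid>:" starts a new task section.
TASK_RE = re.compile(r"\d+ locks? held by (.+)/(\d+):\s*$")
# "#N:  (<lock name>){<usage flags>}" is one lock entry within a section.
LOCK_RE = re.compile(r"#\d+:\s+\((.+?)\)\{")


def locks_by_task(dump):
    """Map each lock name to the list of tasks it is listed under."""
    holders = defaultdict(list)
    task = None
    for line in dump.splitlines():
        m = TASK_RE.search(line)
        if m:
            task = f"{m.group(1)}/{m.group(2)}"
            continue
        m = LOCK_RE.search(line)
        if m and task is not None:
            holders[m.group(1)].append(task)
    return dict(holders)


def shared_locks(dump):
    """Keep only the locks that are listed under more than one task."""
    return {name: tasks for name, tasks in locks_by_task(dump).items()
            if len(tasks) > 1}


if __name__ == "__main__":
    for name, tasks in shared_locks(SAMPLE).items():
        print(name, "->", ", ".join(tasks))
```

On this sample the only lock listed under two tasks is cpu_hotplug.lock, which
appears for both kworker/0:2/888 and systemd-sleep/889.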

On Saturday, 29 July 2017 17:27:41 CEST Oleksandr Natalenko wrote:
> Hello Jens, Christoph.
> 
> Unfortunately, even with "block: disable runtime-pm for blk-mq" patch
> applied blk-mq breaks suspend to RAM for me. It is reproducible on my
> laptop as well as in a VM.
> 
> I use complex disk layout involving MD, LUKS and LVM, and managed to get
> these warnings from VM via serial console when suspend fails:
> 
> ===
> [  245.516573] INFO: task kworker/0:1:49 blocked for more than 120 seconds.
> [  245.520025]   Not tainted 4.12.0-pf4 #1
> [  245.521836] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  245.525612] kworker/0:1 D049  2 0x
> [  245.527515] Workqueue: events vmstat_shepherd
> [  245.528685] Call Trace:
> [  245.529296]  __schedule+0x459/0xe40
> [  245.530115]  ? kvm_clock_read+0x25/0x40
> [  245.531003]  ? ktime_get+0x40/0xa0
> [  245.531819]  schedule+0x3d/0xb0
> [  245.532542]  ? schedule+0x3d/0xb0
> [  245.533299]  schedule_preempt_disabled+0x15/0x20
> [  245.534367]  __mutex_lock.isra.5+0x295/0x530
> [  245.535351]  __mutex_lock_slowpath+0x13/0x20
> [  245.536362]  ? __mutex_lock_slowpath+0x13/0x20
> [  245.537334]  mutex_lock+0x25/0x30
> [  245.538118]  get_online_cpus.part.14+0x15/0x30
> [  245.539588]  get_online_cpus+0x20/0x30
> [  245.540560]  vmstat_shepherd+0x21/0xc0
> [  245.541538]  process_one_work+0x1de/0x430
> [  245.542364]  worker_thread+0x47/0x3f0
> [  245.543042]  kthread+0x125/0x140
> [  245.543649]  ? process_one_work+0x430/0x430
> [  245.544417]  ? kthread_create_on_node+0x70/0x70
> [  245.545737]  ret_from_fork+0x25/0x30
> [  245.546490] INFO: task md0_raid10:459 blocked for more than 120 seconds.
> [  245.547668]   Not tainted 4.12.0-pf4 #1
> [  245.548769] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  245.550133] md0_raid10  D0   459  2 0x
> [  245.551092] Call Trace:
> [  245.551539]  __schedule+0x459/0xe40
> [  245.552163]  schedule+0x3d/0xb0
> [  245.552728]  ? schedule+0x3d/0xb0
> [  245.553344]  md_super_wait+0x6e/0xa0 [md_mod]
> [  245.554118]  ? wake_bit_function+0x60/0x60
> [  245.554854]  md_update_sb.part.60+0x3df/0x840 [md_mod]
> [  245.555771]  md_check_recovery+0x215/0x4b0 [md_mod]
> [  245.556732]  raid10d+0x62/0x13c0 [raid10]
> [  245.557456]  ? schedule+0x3d/0xb0
> [  245.558169]  ? schedule+0x3d/0xb0
> [  245.558803]  ? schedule_timeout+0x21f/0x330
> [  245.559593]  md_thread+0x120/0x160 [md_mod]
> [  245.560380]  ? md_thread+0x120/0x160 [md_mod]
> [  245.561202]  ? wake_bit_function+0x60/0x60
> [  245.561975]  kthread+0x125/0x140
> [  245.562601]  ? find_pers+0x70/0x70 [md_mod]
> [  245.563394]  ? kthread_create_on_node+0x70/0x70
> [  245.564516]  ret_from_fork+0x25/0x30
> [  245.565669] INFO: task dmcrypt_write:487 

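The hung-task reports quoted above all start with the same header line, so the
blocked tasks can be listed mechanically. A minimal Python sketch under that
assumption follows; hung_tasks and its regular expression are hypothetical and
not part of any kernel tooling. The last report in the quote is cut off after
"dmcrypt_write:487", so the sketch deliberately leaves it unmatched.

```python
import re

# Header lines copied from the quoted report; the last one is truncated in the archive.
LOG = """\
> [  245.516573] INFO: task kworker/0:1:49 blocked for more than 120 seconds.
> [  245.546490] INFO: task md0_raid10:459 blocked for more than 120 seconds.
> [  245.565669] INFO: task dmcrypt_write:487
"""

# "<comm>:<pid> blocked for more than <N> seconds." The greedy (.+) keeps
# colons that belong to the task name, as in "kworker/0:1".
HUNG_RE = re.compile(r"INFO: task (.+):(\d+) blocked for more than (\d+) seconds\.")


def hung_tasks(log):
    """Return (task name, pid, timeout in seconds) for each complete report header."""
    found = []
    for line in log.splitlines():
        m = HUNG_RE.search(line)
        if m:
            found.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return found


if __name__ == "__main__":
    for name, pid, secs in hung_tasks(LOG):
        print(f"{name} (pid {pid}) blocked for more than {secs}s")
```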