Re: System not booting since dm changes? (was Linux 4.20-rc1)

2018-11-06 Thread Michael Ellerman
Mike Snitzer  writes:
> On Mon, Nov 05 2018 at  5:25am -0500,
> Michael Ellerman  wrote:
>
>> Linus Torvalds  writes:
>> ...
>> > Mike Snitzer (1):
>> > device mapper updates
>> 
>> Hi Mike,
>> 
>> Replying here because I can't find the device-mapper pull or the patch
>> in question on LKML. I guess I should be subscribed to dm-devel.
>> 
>> We have a box that doesn't boot any more, bisect points at one of:
>> 
>>   cef6f55a9fb4 Mike Snitzer   dm table: require that request-based DM be layered on blk-mq devices
>>   953923c09fe8 Mike Snitzer   dm: rename DM_TYPE_MQ_REQUEST_BASED to DM_TYPE_REQUEST_BASED
>>   6a23e05c2fe3 Jens Axboe     dm: remove legacy request-based IO path
>> 
>> 
>> It's a Power8 system running Rawhide, it does have multipath, but I'm
>> told it was setup by the Fedora installer, ie. nothing fancy.
>> 
>> The symptom is the system can't find its root filesystem and drops into
>> the initramfs shell. The dmesg includes a bunch of errors like below:
>> 
>>   [   43.263460] localhost multipathd[1344]: sdb: fail to get serial
>>   [   43.268762] localhost multipathd[1344]: mpatha: failed in domap for addition of new path sdb
>>   [   43.268762] localhost multipathd[1344]: uevent trigger error
>>   [   43.282065] localhost kernel: device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
> ...
>>
>> Any ideas what's going wrong here?
>
> "table load rejected: not all devices are blk-mq request-stackable"
> speaks to the fact that you aren't using blk-mq for scsi (aka scsi-mq).
>
> You need to use scsi_mod.use_blk_mq=Y on the kernel commandline (or set
> CONFIG_SCSI_MQ_DEFAULT in your kernel config)

Thanks.

Looks like CONFIG_SCSI_MQ_DEFAULT is default y, so new configs should
pick that up by default. We must have had an old .config that didn't get
that update.

cheers
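
A stale .config like the one described above can be spotted before booting. The following is only a minimal sketch: the helper name `scsi_mq_default_state` is hypothetical, and it assumes the usual .config conventions (`CONFIG_FOO=y` for an enabled option, `# CONFIG_FOO is not set` for a disabled one).

```python
def scsi_mq_default_state(config_text: str) -> str:
    """Classify CONFIG_SCSI_MQ_DEFAULT in the text of a kernel .config.

    Returns "enabled", "disabled", or "missing". The helper name and the
    three return strings are illustrative choices, not kernel terminology.
    """
    option = "CONFIG_SCSI_MQ_DEFAULT"
    state = "missing"
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"{option}=y":
            state = "enabled"
        elif line == f"# {option} is not set":
            state = "disabled"
    return state


# Example with an inline fragment; a real check would read the .config file.
sample = "CONFIG_SCSI=y\n# CONFIG_SCSI_MQ_DEFAULT is not set\n"
print(scsi_mq_default_state(sample))
```

A "disabled" or "missing" result on a 4.20-rc1 tree would match the situation in this thread, where scsi_mod.use_blk_mq=Y had to be passed on the command line instead.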


Re: System not booting since dm changes? (was Linux 4.20-rc1)

2018-11-05 Thread Jens Axboe
On 11/5/18 7:35 AM, Satheesh Rajendran wrote:
> On Mon, Nov 05, 2018 at 08:51:57AM -0500, Mike Snitzer wrote:
>> On Mon, Nov 05 2018 at  5:25am -0500,
>> Michael Ellerman  wrote:
>>
>>> Linus Torvalds  writes:
>>> ...
>>>> Mike Snitzer (1):
>>>> device mapper updates
>>>
>>> Hi Mike,
>>>
>>> Replying here because I can't find the device-mapper pull or the patch
>>> in question on LKML. I guess I should be subscribed to dm-devel.
>>>
>>> We have a box that doesn't boot any more, bisect points at one of:
>>>
>>>   cef6f55a9fb4 Mike Snitzer   dm table: require that request-based DM be layered on blk-mq devices
>>>   953923c09fe8 Mike Snitzer   dm: rename DM_TYPE_MQ_REQUEST_BASED to DM_TYPE_REQUEST_BASED
>>>   6a23e05c2fe3 Jens Axboe     dm: remove legacy request-based IO path
>>>
>>>
>>> It's a Power8 system running Rawhide, it does have multipath, but I'm
>>> told it was setup by the Fedora installer, ie. nothing fancy.
>>>
>>> The symptom is the system can't find its root filesystem and drops into
>>> the initramfs shell. The dmesg includes a bunch of errors like below:
>>>
>>>   [   43.263460] localhost multipathd[1344]: sdb: fail to get serial
>>>   [   43.268762] localhost multipathd[1344]: mpatha: failed in domap for addition of new path sdb
>>>   [   43.268762] localhost multipathd[1344]: uevent trigger error
>>>   [   43.282065] localhost kernel: device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
>> ...
>>>
>>> Any ideas what's going wrong here?
>>
>> "table load rejected: not all devices are blk-mq request-stackable"
>> speaks to the fact that you aren't using blk-mq for scsi (aka scsi-mq).
>>
>> You need to use scsi_mod.use_blk_mq=Y on the kernel commandline (or set
>> CONFIG_SCSI_MQ_DEFAULT in your kernel config)
> 
> Thanks Mike! The above solution worked and the system boots fine now :-)

This quirk will go away for the next kernel, fwiw, since the non-mq
path for SCSI will be dropped as well.

-- 
Jens Axboe



Re: System not booting since dm changes? (was Linux 4.20-rc1)

2018-11-05 Thread Satheesh Rajendran
On Mon, Nov 05, 2018 at 08:51:57AM -0500, Mike Snitzer wrote:
> On Mon, Nov 05 2018 at  5:25am -0500,
> Michael Ellerman  wrote:
> 
> > Linus Torvalds  writes:
> > ...
> > > Mike Snitzer (1):
> > > device mapper updates
> > 
> > Hi Mike,
> > 
> > Replying here because I can't find the device-mapper pull or the patch
> > in question on LKML. I guess I should be subscribed to dm-devel.
> > 
> > We have a box that doesn't boot any more, bisect points at one of:
> > 
> >   cef6f55a9fb4 Mike Snitzer   dm table: require that request-based DM be layered on blk-mq devices
> >   953923c09fe8 Mike Snitzer   dm: rename DM_TYPE_MQ_REQUEST_BASED to DM_TYPE_REQUEST_BASED
> >   6a23e05c2fe3 Jens Axboe     dm: remove legacy request-based IO path
> > 
> > 
> > It's a Power8 system running Rawhide, it does have multipath, but I'm
> > told it was setup by the Fedora installer, ie. nothing fancy.
> > 
> > The symptom is the system can't find its root filesystem and drops into
> > the initramfs shell. The dmesg includes a bunch of errors like below:
> > 
> >   [   43.263460] localhost multipathd[1344]: sdb: fail to get serial
> >   [   43.268762] localhost multipathd[1344]: mpatha: failed in domap for addition of new path sdb
> >   [   43.268762] localhost multipathd[1344]: uevent trigger error
> >   [   43.282065] localhost kernel: device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
> ...
> >
> > Any ideas what's going wrong here?
> 
> "table load rejected: not all devices are blk-mq request-stackable"
> speaks to the fact that you aren't using blk-mq for scsi (aka scsi-mq).
> 
> You need to use scsi_mod.use_blk_mq=Y on the kernel commandline (or set
> CONFIG_SCSI_MQ_DEFAULT in your kernel config)

Thanks Mike! The above solution worked and the system boots fine now :-)

# uname -r
4.20.0-rc1+
# cat /proc/cmdline 
root=/dev/mapper/fedora_ltc--test--ci2-root ro rd.lvm.lv=fedora_ltc-test-ci2/root rd.lvm.lv=fedora_ltc-test-ci2/swap scsi_mod.use_blk_mq=Y

CONFIG_SCSI_MQ_DEFAULT was not set in my kernel config; I will set it in
future runs.

Thanks Michael!

Regards,
-Satheesh.

> 
> Mike
> 



Re: System not booting since dm changes? (was Linux 4.20-rc1)

2018-11-05 Thread Mike Snitzer
On Mon, Nov 05 2018 at  5:25am -0500,
Michael Ellerman  wrote:

> Linus Torvalds  writes:
> ...
> > Mike Snitzer (1):
> > device mapper updates
> 
> Hi Mike,
> 
> Replying here because I can't find the device-mapper pull or the patch
> in question on LKML. I guess I should be subscribed to dm-devel.
> 
> We have a box that doesn't boot any more, bisect points at one of:
> 
>   cef6f55a9fb4 Mike Snitzer   dm table: require that request-based DM be layered on blk-mq devices
>   953923c09fe8 Mike Snitzer   dm: rename DM_TYPE_MQ_REQUEST_BASED to DM_TYPE_REQUEST_BASED
>   6a23e05c2fe3 Jens Axboe     dm: remove legacy request-based IO path
> 
> 
> It's a Power8 system running Rawhide, it does have multipath, but I'm
> told it was setup by the Fedora installer, ie. nothing fancy.
> 
> The symptom is the system can't find its root filesystem and drops into
> the initramfs shell. The dmesg includes a bunch of errors like below:
> 
>   [   43.263460] localhost multipathd[1344]: sdb: fail to get serial
>   [   43.268762] localhost multipathd[1344]: mpatha: failed in domap for addition of new path sdb
>   [   43.268762] localhost multipathd[1344]: uevent trigger error
>   [   43.282065] localhost kernel: device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
...
>
> Any ideas what's going wrong here?

"table load rejected: not all devices are blk-mq request-stackable"
speaks to the fact that you aren't using blk-mq for scsi (aka scsi-mq).

You need to use scsi_mod.use_blk_mq=Y on the kernel commandline (or set
CONFIG_SCSI_MQ_DEFAULT in your kernel config)

Mike
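
To confirm that a given boot actually carries the suggested parameter, the kernel command line can be scanned for it. This is a rough sketch under stated assumptions: the function name `cmdline_enables_scsi_mq` is hypothetical, and the set of accepted "true" spellings (y, 1, on, case-insensitive) is a simplification of how the kernel parses boolean module parameters.

```python
def cmdline_enables_scsi_mq(cmdline: str) -> bool:
    """Return True if the command line explicitly turns on scsi_mod.use_blk_mq.

    False only means the parameter was absent or set to a non-true value;
    the kernel may still default to blk-mq via CONFIG_SCSI_MQ_DEFAULT.
    """
    enabled = False
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        if sep and name == "scsi_mod.use_blk_mq":
            # Later occurrences override earlier ones in this sketch.
            enabled = value.lower() in ("y", "1", "on")
    return enabled


# The string would normally come from /proc/cmdline; this one is shortened.
sample = "root=/dev/mapper/fedora_ltc--test--ci2-root ro scsi_mod.use_blk_mq=Y"
print(cmdline_enables_scsi_mq(sample))
```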


System not booting since dm changes? (was Linux 4.20-rc1)

2018-11-05 Thread Michael Ellerman
Linus Torvalds  writes:
...
> Mike Snitzer (1):
> device mapper updates

Hi Mike,

Replying here because I can't find the device-mapper pull or the patch
in question on LKML. I guess I should be subscribed to dm-devel.

We have a box that doesn't boot any more, bisect points at one of:

  cef6f55a9fb4 Mike Snitzer   dm table: require that request-based DM be layered on blk-mq devices
  953923c09fe8 Mike Snitzer   dm: rename DM_TYPE_MQ_REQUEST_BASED to DM_TYPE_REQUEST_BASED
  6a23e05c2fe3 Jens Axboe     dm: remove legacy request-based IO path


It's a Power8 system running Rawhide, it does have multipath, but I'm
told it was setup by the Fedora installer, ie. nothing fancy.

The symptom is the system can't find its root filesystem and drops into
the initramfs shell. The dmesg includes a bunch of errors like below:

  [   43.263460] localhost multipathd[1344]: sdb: fail to get serial
  [   43.268762] localhost multipathd[1344]: mpatha: failed in domap for addition of new path sdb
  [   43.268762] localhost multipathd[1344]: uevent trigger error
  [   43.282065] localhost kernel: device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
  [   43.282096] localhost kernel: device-mapper: table: unable to determine table type
  [   43.275898] localhost multipathd[1344]: sdd: fail to get serial
  [   43.282597] localhost multipathd[1344]: mpatha: failed in domap for addition of new path sdd
  [   43.282642] localhost multipathd[1344]: uevent trigger error
  [   43.286540] localhost multipathd[1344]: sdc: fail to get serial
  [   43.296366] localhost kernel: device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
  [   43.296392] localhost kernel: device-mapper: table: unable to determine table type
  [   43.292218] localhost multipathd[1344]: mpathb: failed in domap for addition of new path sdc
  [   43.292218] localhost multipathd[1344]: uevent trigger error
  [   43.306193] localhost kernel: device-mapper: table: table load rejected: not all devices are blk-mq request-stackable
  [   43.306212] localhost kernel: device-mapper: table: unable to determine table type
  [  150.523303] localhost dracut-initqueue[1325]: Warning: dracut-initqueue timeout - starting timeout scripts


There's more info here if you want it:
  https://github.com/linuxppc/linux/issues/203


Any ideas what's going wrong here?

cheers