On Mon, Aug 29, 2016 at 11:16 AM, Bart Van Assche
<bart.vanass...@sandisk.com> wrote:
> On 08/14/2016 10:21 AM, James Bottomley wrote:
>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
>> index d3e852a..222771d 100644
>> --- a/drivers/scsi/sd.c
>> +++ b/drivers/scsi/sd.c
>> @@ -3000,7 +3000,13 @@ static void sd_probe_async(void *data, async_cookie_t 
>> cookie)
>>       }
>>
>>       blk_pm_runtime_init(sdp->request_queue, dev);
>> -     device_add_disk(dev, gd);
>> +     /*
>> +      * previously the parent of the gendisk was the scsi device.  It
>> +      * was moved to fix lifetime rules, so now we install a symlink
>> +      * to the new location of the block class directory
>> +      */
>> +     device_add_disk(&sdkp->dev, gd);
>> +     WARN_ON(sysfs_add_link_to_group(&dev->kobj, "block", &sdkp->dev.kobj, 
>> "block"));
>>       if (sdkp->capacity)
>>               sd_dif_config_host(sdkp);
>>
>> @@ -3142,6 +3148,7 @@ static int sd_remove(struct device *dev)
>>
>>       async_synchronize_full_domain(&scsi_sd_pm_domain);
>>       async_synchronize_full_domain(&scsi_sd_probe_domain);
>> +     sysfs_remove_link(&dev->kobj, "block");
>>       device_del(&sdkp->dev);
>>       del_gendisk(sdkp->disk);
>>       sd_shutdown(dev);
>
> Hello James,
>
> Sorry that it took so long before I could test this patch and
> the previous patch that was posted in this e-mail thread. But I
> did so earlier this morning. What I see is that the following
> warning message is reported frequently:
>
> WARNING: CPU: 1 PID: 136 at drivers/scsi/sd.c:3009 sd_probe_async+0x1ce/0x1e0
> Modules linked in: ib_srp libcrc32c scsi_transport_srp target_core_pscsi 
> target_core_file ib_srpt target_core_iblock target_core_mod brd dm_multipath 
> scsi_dh_rdac scsi_dh_emc scsi_dh_alua netconsole xt_CHECKSUM iptable_mangle 
> ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat 
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT 
> xt_tcpudp tun ebtable_filter ebtables ip6table_filter ip6_tables 
> iptable_filter ip_tables x_tables af_packet bridge stp llc iscsi_ibft 
> iscsi_boot_sysfs ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad msr rdma_cm 
> configfs ib_cm iw_cm mlx4_ib ib_core sb_edac edac_core x86_pkg_temp_thermal 
> intel_powerclamp coretemp kvm_intel kvm dm_mod irqbypass ipmi_ssif 
> crct10dif_pclmul ipmi_devintf crc32_pclmul ghash_clmulni_intel aesni_intel 
> iTCO_wdt aes_x86_64 lrw iTCO_vendor_support dcdbas gf128mul tg3 mlx4_core 
> glue_helper ablk_helper ptp cryptd fjes ipmi_si pcspkr pps_core libphy 
> ipmi_msghandler mei_me tpm_tis tpm_tis_core mei tpm lpc_ich shpchp mfd_core 
> wmi button hid_generic usbhid mgag200 i2c_algo_bit drm_kms_helper syscopyarea 
> sysfillrect sysimgblt fb_sys_fops ttm drm sr_mod cdrom crc32c_intel ehci_pci 
> ehci_hcd usbcore usb_common sg [last unloaded: scsi_transport_srp]
> CPU: 1 PID: 136 Comm: kworker/u64:8 Tainted: G        W       4.8.0-rc4-dbg+ 
> #2
> Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.0.2 11/17/2014
> Workqueue: events_unbound async_run_entry_fn
> Call Trace:
>  [<ffffffff813247c5>] dump_stack+0x68/0x93
>  [<ffffffff81062f96>] __warn+0xc6/0xe0
>  [<ffffffff81063068>] warn_slowpath_null+0x18/0x20
>  [<ffffffff8148145e>] sd_probe_async+0x1ce/0x1e0
>  [<ffffffff81089d64>] async_run_entry_fn+0x34/0x140
>  [<ffffffff81080355>] process_one_work+0x1f5/0x690
>  [<ffffffff81080839>] worker_thread+0x49/0x490
>  [<ffffffff8108701a>] kthread+0xea/0x100
>  [<ffffffff8162d3bf>] ret_from_fork+0x1f/0x40
>
> The test I ran is available at https://github.com/bvanassche/srp-test.

I tried running this, but it seems I'm failing to configure my test
environment correctly [1], but I'm worried that this "re-parenting the
scsi-disk" approach, even if the above warning is addressed, may not
be backwards compatible.  We now have an ordering difference where the
link to the "block" attribute group is established after the disk's
KOBJ_ADD event which seems in the same class of problems that Fam
Zheng's patchset [2] is trying to solve.

Bart, if you can help me get the test case running I can take a look
at finishing off the disk_devt approach I proposed earlier.

[1]: https://gist.github.com/djbw/1c72526c90d1ea8fe2a05dcfbfc73dda
[2]: https://lkml.org/lkml/2016/8/17/63
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to