On Fri, Aug 12, 2016 at 5:29 PM, Dan Williams <dan.j.willi...@intel.com> wrote:
> On Fri, Aug 12, 2016 at 5:17 PM, James Bottomley
> <james.bottom...@hansenpartnership.com> wrote:
>> On Fri, 2016-08-12 at 14:29 -0700, Dan Williams wrote:
>>> Before spending effort trying to flush the destruction of old bdi
>>> instances before new ones are registered, is it rather time to
>>> complete the conversion of sd to only use dynamically allocated devt?
>>
>> Do we have to go that far?  Surely your fix is extensible: the only
>> reason it doesn't work for us is that the gendisk holds the parent
>> without a reference, so we can free the SCSI device before its child
>> gendisk (good job no-one actually uses gendisk->parent after we've
>> released it ...).  If we fix that it would mean SCSI can't release the
>> sdev until after the queue is dead and the bdi namespace released, so
>> isn't something like this the easy fix?
>>
>> James
>>
>> ---
>>
>> diff --git a/block/genhd.c b/block/genhd.c
>> index fcd6d4f..54ae4ae 100644
>> --- a/block/genhd.c
>> +++ b/block/genhd.c
>> @@ -514,7 +514,7 @@ static void register_disk(struct device *parent, struct 
>> gendisk *disk)
>>         struct hd_struct *part;
>>         int err;
>>
>> -       ddev->parent = parent;
>> +       ddev->parent = get_device(parent);
>>
>>         dev_set_name(ddev, "%s", disk->disk_name);
>>
>> @@ -1144,6 +1144,7 @@ static void disk_release(struct device *dev)
>>         hd_free_part(&disk->part0);
>>         if (disk->queue)
>>                 blk_put_queue(disk->queue);
>> +       put_device(dev->parent);
>>         kfree(disk);
>>  }
>>  struct class block_class = {
>
> Looks ok at first glance to me.
>
> We do hold a reference on the parent device, but it gets dropped at
> device_unregister() time and this moves it out to the final put.
> However, this does leave static devt block-device-drivers that
> register a disk without a parent device susceptible to the race... I
> think those exist given all the drivers still using add_disk() after
> commit 52c44d93c26f "block: remove ->driverfs_dev".

So I tried the attached and it makes the libnvdimm unit tests start
crashing.  A couple crash logs attached.  Not yet sure what assumption
is getting violated, but how about that conversion of scsi to use
dynamic devt? ;-)
[   30.149817] =========================
[   30.150729] [ BUG: held lock freed! ]
[   30.151644] 4.8.0-rc1+ #19 Tainted: G           O   
[   30.152717] -------------------------
[   30.153635] lt-libndctl/863 is freeing memory 
ffff880310b7e128-ffff880310b7e927, with a lock still held there!
[   30.155719]  (&dev->mutex){......}, at: [<ffffffff815f931d>] 
device_release_driver+0x1d/0x40
[   30.157927] 6 locks held by lt-libndctl/863:
[   30.158917]  #0:  (sb_writers#7){.+.+.+}, at: [<ffffffff8128c537>] 
__sb_start_write+0xb7/0xf0
[   30.161267]  #1:  (&of->mutex){+.+.+.}, at: [<ffffffff8131c191>] 
kernfs_fop_write+0x101/0x1c0
[   30.163545]  #2:  (s_active#3){++++.+}, at: [<ffffffff8131c199>] 
kernfs_fop_write+0x109/0x1c0
[   30.165893]  #3:  (&dev->mutex){......}, at: [<ffffffff815f7a3f>] 
unbind_store+0xff/0x160
[   30.168124]  #4:  (&dev->mutex){......}, at: [<ffffffff815f931d>] 
device_release_driver+0x1d/0x40
[   30.170435]  #5:  (&dev->mutex){......}, at: [<ffffffff815f931d>] 
device_release_driver+0x1d/0x40
[   30.172755] 
[   30.172755] stack backtrace:
[   30.174156] CPU: 0 PID: 863 Comm: lt-libndctl Tainted: G           O    
4.8.0-rc1+ #19
[   30.175988] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[   30.178260]  0000000000000086 0000000099c8ecbb ffff8803394538d0 
ffffffff814b4f23
[   30.180325]  ffff8803380c4a90 ffff880310b7e128 ffff880339453908 
ffffffff81107653
[   30.182389]  ffff880310b7e128 0000000000000282 ffff88033b002940 
ffffea000c42de00
[   30.184456] Call Trace:
[   30.185232]  [<ffffffff814b4f23>] dump_stack+0x85/0xc2
[   30.186328]  [<ffffffff81107653>] debug_check_no_locks_freed+0x153/0x160
[   30.187616]  [<ffffffffa003fb1e>] ? namespace_pmem_release+0x2e/0x40 
[libnvdimm]
[   30.189391]  [<ffffffff8125e039>] kfree+0xc9/0x270
[   30.190445]  [<ffffffffa003fb1e>] namespace_pmem_release+0x2e/0x40 
[libnvdimm]
[   30.192197]  [<ffffffff815f4042>] device_release+0x32/0x90
[   30.193333]  [<ffffffff814b7a1a>] kobject_release+0x6a/0x170
[   30.194494]  [<ffffffff814b78e7>] kobject_put+0x27/0x50
[   30.195602]  [<ffffffff815f4377>] put_device+0x17/0x20
[   30.196696]  [<ffffffff81493290>] disk_release+0xc0/0x100
[   30.197828]  [<ffffffff815f4042>] device_release+0x32/0x90
[   30.198966]  [<ffffffff814b7a1a>] kobject_release+0x6a/0x170
[   30.200126]  [<ffffffff814b78e7>] kobject_put+0x27/0x50
[   30.201230]  [<ffffffff815f4377>] put_device+0x17/0x20
[   30.202328]  [<ffffffff81214e0a>] bdi_unregister+0x20a/0x290
[   30.203482]  [<ffffffff81107289>] ? trace_hardirqs_on_caller+0x129/0x1b0
[   30.204769]  [<ffffffff8147e2ac>] blk_cleanup_queue+0x18c/0x280
[   30.205957]  [<ffffffffa00a117e>] pmem_release_queue+0xe/0x10 [nd_pmem]
[   30.207228]  [<ffffffff815fd49f>] devm_action_release+0xf/0x20
[   30.208407]  [<ffffffff815fdf79>] release_nodes+0x129/0x230
[   30.209556]  [<ffffffff815fe2ec>] devres_release_all+0x3c/0x60
[   30.210736]  [<ffffffff815f9249>] __device_release_driver+0xa9/0x160
[   30.211979]  [<ffffffff815f9325>] device_release_driver+0x25/0x40
[   30.213187]  [<ffffffff815f88a4>] bus_remove_device+0x114/0x190
[   30.214377]  [<ffffffff815f4978>] device_del+0x148/0x270
[   30.215490]  [<ffffffff815f4380>] ? put_device+0x20/0x20
[   30.216609]  [<ffffffffa003f770>] ? nd_region_remove+0x80/0x80 [libnvdimm]
[   30.217914]  [<ffffffff815f4aba>] device_unregister+0x1a/0x60
[   30.219085]  [<ffffffffa003b828>] nd_device_unregister+0x48/0x50 [libnvdimm]
[   30.220412]  [<ffffffffa003f780>] child_unregister+0x10/0x20 [libnvdimm]
[   30.221696]  [<ffffffff815f4470>] device_for_each_child+0x50/0x90
[   30.222910]  [<ffffffffa003f718>] nd_region_remove+0x28/0x80 [libnvdimm]
[   30.224195]  [<ffffffffa003bbe1>] nvdimm_bus_remove+0x41/0xa0 [libnvdimm]
[   30.225488]  [<ffffffff815f9241>] __device_release_driver+0xa1/0x160
[   30.226727]  [<ffffffff815f9325>] device_release_driver+0x25/0x40
[   30.227947]  [<ffffffff815f7a4f>] unbind_store+0x10f/0x160
[   30.229081]  [<ffffffff815f6fa5>] drv_attr_store+0x25/0x30
[   30.230215]  [<ffffffff8131cee5>] sysfs_kf_write+0x45/0x60
[   30.231352]  [<ffffffff8131c1c5>] kernfs_fop_write+0x135/0x1c0
[   30.232530]  [<ffffffff81288977>] __vfs_write+0x37/0x160
[   30.233643]  [<ffffffff81102f47>] ? update_fast_ctr+0x17/0x30
[   30.234814]  [<ffffffff81102fd9>] ? percpu_down_read+0x49/0x90
[   30.235988]  [<ffffffff8128c537>] ? __sb_start_write+0xb7/0xf0
[   30.237169]  [<ffffffff8128c537>] ? __sb_start_write+0xb7/0xf0
[   30.238344]  [<ffffffff812890d8>] vfs_write+0xb8/0x1b0
[   30.239444]  [<ffffffff8128a568>] SyS_write+0x58/0xc0
[   30.240534]  [<ffffffff8199837c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[   30.693773] BUG: unable to handle kernel paging request at 0000000040070088

[..]


[   30.695296] IP: [<ffffffff819886ae>] klist_put+0xe/0x90
[   30.696501] PGD 3383e2067 PUD 0 
[   30.697532] Oops: 0000 [#1] SMP
[   30.698403] Dumping ftrace buffer:
[   30.699306]    (ftrace buffer empty)
[   30.700228] Modules linked in: nd_blk(O) nfit_test(O) ip6t_rpfilter 
ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp 
llc ip6table_mangle ip6table_raw ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 
nf_nat_ipv6 ip6table_security iptable_mangle iptable_raw iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
iptable_security ebtable_filter ebtables ip6table_filter ip6_tables dax_pmem(O) 
nd_pmem(O) nd_btt(O) dax(O) nd_e820(O) nfit(O) tpm_tis libnvdimm(O) 
tpm_tis_core tpm serio_raw nfit_test_iomap(O)
[   30.712449] CPU: 0 PID: 863 Comm: lt-libndctl Tainted: G           O    
4.8.0-rc1+ #19
[   30.714325] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[   30.716650] task: ffff8803380c4240 task.stack: ffff880339450000
[   30.717868] RIP: 0010:[<ffffffff819886ae>]  [<ffffffff819886ae>] 
klist_put+0xe/0x90
[   30.719774] RSP: 0018:ffff880339453b40  EFLAGS: 00010246
[   30.720917] RAX: 0000000000000000 RBX: ffff8803380c4240 RCX: ffff8803380c4240
[   30.722277] RDX: ffffffff819887dd RSI: 0000000000000001 RDI: 0000000040070088
[   30.723647] RBP: ffff880339453b60 R08: ffffffff81f4eba0 R09: 0000000000000000
[   30.725026] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000040070088
[   30.726395] R13: ffffffffa004d3e0 R14: 0000000000000001 R15: fffffffffffffff2
[   30.727763] FS:  00007f253e15f840(0000) GS:ffff88033fc00000(0000) 
knlGS:0000000000000000
[   30.729641] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   30.730842] CR2: 0000000040070088 CR3: 00000003395d3000 CR4: 00000000000006f0
[   30.732212] Stack:
[   30.732941]  ffff8803380c4240 0000000040070088 ffffffffa004d3e0 
ffff880310b79050
[   30.735047]  ffff880339453bc8 ffffffff819887ea ffff8803104ac038 
ffff880324ba2848
[   30.737156]  ffffffff81f4eba0 ffffffff81f4eba0 0000000040070088 
ffff8803380c4240
[   30.739249] Call Trace:
[   30.740029]  [<ffffffff819887ea>] klist_remove+0x7a/0xe0
[   30.741166]  [<ffffffff815f9291>] __device_release_driver+0xf1/0x160
[   30.742434]  [<ffffffff815f9325>] device_release_driver+0x25/0x40
[   30.743668]  [<ffffffff815f88a4>] bus_remove_device+0x114/0x190
[   30.744892]  [<ffffffff815f4978>] device_del+0x148/0x270
[   30.746038]  [<ffffffff815f4380>] ? put_device+0x20/0x20
[   30.747185]  [<ffffffffa003f770>] ? nd_region_remove+0x80/0x80 [libnvdimm]
[   30.748520]  [<ffffffff815f4aba>] device_unregister+0x1a/0x60
[   30.749715]  [<ffffffffa003b828>] nd_device_unregister+0x48/0x50 [libnvdimm]
[   30.751088]  [<ffffffffa003f780>] child_unregister+0x10/0x20 [libnvdimm]
[   30.752401]  [<ffffffff815f4470>] device_for_each_child+0x50/0x90
[   30.753650]  [<ffffffffa003f718>] nd_region_remove+0x28/0x80 [libnvdimm]
[   30.754979]  [<ffffffffa003bbe1>] nvdimm_bus_remove+0x41/0xa0 [libnvdimm]
[   30.756307]  [<ffffffff815f9241>] __device_release_driver+0xa1/0x160
[   30.757583]  [<ffffffff815f9325>] device_release_driver+0x25/0x40
[   30.758828]  [<ffffffff815f7a4f>] unbind_store+0x10f/0x160
[   30.759986]  [<ffffffff815f6fa5>] drv_attr_store+0x25/0x30
[   30.761145]  [<ffffffff8131cee5>] sysfs_kf_write+0x45/0x60
[   30.762299]  [<ffffffff8131c1c5>] kernfs_fop_write+0x135/0x1c0
[   30.763502]  [<ffffffff81288977>] __vfs_write+0x37/0x160
[   30.764641]  [<ffffffff81102f47>] ? update_fast_ctr+0x17/0x30
[   30.765845]  [<ffffffff81102fd9>] ? percpu_down_read+0x49/0x90
[   30.767055]  [<ffffffff8128c537>] ? __sb_start_write+0xb7/0xf0
[   30.768265]  [<ffffffff8128c537>] ? __sb_start_write+0xb7/0xf0
[   30.769477]  [<ffffffff812890d8>] vfs_write+0xb8/0x1b0
[   30.770592]  [<ffffffff8128a568>] SyS_write+0x58/0xc0
[   30.771697]  [<ffffffff8199837c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[   30.772979] Code: f5 55 5d 00 01 e8 b3 6e 72 ff e9 63 ff ff ff e8 69 f5 00 
00 e9 72 ff ff ff 0f 1f 40 00 55 48 89 e5 41 56 41 55 41 54 53 41 89 f6 <4c> 8b 
27 48 89 fb 49 83 e4 fe 4c 89 e7 4d 8b 6c 24 50 e8 3b f2 
[   30.781430] RIP  [<ffffffff819886ae>] klist_put+0xe/0x90
[   30.782642]  RSP <ffff880339453b40>
[   30.783550] CR2: 0000000040070088
[   30.784495] ---[ end trace 0dcefe91828fba52 ]---
[   30.786917] Kernel panic - not syncing: Fatal exception
[   30.788400] Dumping ftrace buffer:
[   30.789305]    (ftrace buffer empty)
[   30.790225] Kernel Offset: disabled
[   30.791137] Rebooting in 5 seconds..
[   34.255986] general protection fault: 0000 [#1] SMP
[   34.257026] Dumping ftrace buffer:
[   34.257859]    (ftrace buffer empty)
[   34.258713] Modules linked in: nd_blk(O) nfit_test(O) ip6t_rpfilter 
ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp 
llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle 
ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw 
ebtable_filter ebtables ip6table_filter ip6_tables nd_pmem(O) dax_pmem(O) 
dax(O) nd_btt(O) serio_raw nd_e820(O) nfit(O) libnvdimm(O) tpm_tis tpm_tis_core 
tpm nfit_test_iomap(O)
[   34.269688] CPU: 1 PID: 871 Comm: lt-libndctl Tainted: G           O    
4.8.0-rc1+ #19
[   34.271364] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
[   34.273450] task: ffff880338b64240 task.stack: ffff880338a80000
[   34.274564] RIP: 0010:[<ffffffff819888ae>]  [<ffffffff819888ae>] 
klist_next+0x5e/0x120
[   34.276327] RSP: 0018:ffff880338a83c90  EFLAGS: 00010217
[   34.277371] RAX: ffff880313384278 RBX: 6b6b6b6b6b6b6b63 RCX: c2482a36b45bcd4d
[   34.278614] RDX: 0000000000000001 RSI: 0000000082f7649c RDI: ffff880313384248
[   34.279864] RBP: ffff880338a83cb8 R08: 0000000000000000 R09: 0000000000000000
[   34.281114] R10: 0000000000000000 R11: 0000000000003436 R12: ffff880338a83cc8
[   34.282356] R13: 0000000000000000 R14: ffff880313384a80 R15: 0000000000000000
[   34.283593] FS:  00007fa1a7e06840(0000) GS:ffff88033fd00000(0000) 
knlGS:0000000000000000
[   34.285306] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   34.286404] CR2: 0000561d5afc9208 CR3: 000000033891c000 CR4: 00000000000006e0
[   34.287654] Stack:
[   34.288339]  0000000000000000 0000000000000000 ffffffffa002e770 
ffffffffa017d100
[   34.290258]  fffffffffffffff2 ffff880338a83cf8 ffffffff815f447b 
ffff880313384248
[   34.301588]  0000000000000000 0000000070757190 ffff880312c49868 
ffff880312c49868
[   34.303488] Call Trace:
[   34.304208]  [<ffffffffa002e770>] ? nd_region_remove+0x80/0x80 [libnvdimm]
[   34.305428]  [<ffffffff815f447b>] device_for_each_child+0x5b/0x90
[   34.306566]  [<ffffffffa002e718>] nd_region_remove+0x28/0x80 [libnvdimm]
[   34.307776]  [<ffffffffa002abe1>] nvdimm_bus_remove+0x41/0xa0 [libnvdimm]
[   34.308983]  [<ffffffff815f9241>] __device_release_driver+0xa1/0x160
[   34.310148]  [<ffffffff815f9325>] device_release_driver+0x25/0x40
[   34.311273]  [<ffffffff815f7a4f>] unbind_store+0x10f/0x160
[   34.312330]  [<ffffffff815f6fa5>] drv_attr_store+0x25/0x30
[   34.313391]  [<ffffffff8131cee5>] sysfs_kf_write+0x45/0x60
[   34.314451]  [<ffffffff8131c1c5>] kernfs_fop_write+0x135/0x1c0
[   34.315557]  [<ffffffff81288977>] __vfs_write+0x37/0x160
[   34.316610]  [<ffffffff81102f47>] ? update_fast_ctr+0x17/0x30
[   34.317704]  [<ffffffff81102fd9>] ? percpu_down_read+0x49/0x90
[   34.318805]  [<ffffffff8128c537>] ? __sb_start_write+0xb7/0xf0
[   34.319906]  [<ffffffff8128c537>] ? __sb_start_write+0xb7/0xf0
[   34.321010]  [<ffffffff812890d8>] vfs_write+0xb8/0x1b0
[   34.322029]  [<ffffffff8128a568>] SyS_write+0x58/0xc0
[   34.323039]  [<ffffffff8199837c>] entry_SYSCALL_64_fastpath+0x1f/0xbd
[   34.324208] Code: 8d 58 f8 f0 41 83 6e 18 01 74 60 49 8b 3c 24 45 31 ed 48 
8d 47 30 45 31 ff 49 c7 44 24 08 00 00 00 00 48 39 c3 0f 84 b4 00 00 00 <f6> 03 
01 75 70 b8 01 00 00 00 f0 0f c1 43 18 83 c0 01 83 f8 01 
[   34.331684] RIP  [<ffffffff819888ae>] klist_next+0x5e/0x120
[   34.332798]  RSP <ffff880338a83c90>
[   34.333659] ---[ end trace db64022aaca42534 ]---

Attachment: patch
Description: Binary data

Reply via email to