On Mon, 2009-10-12 at 11:39 -0700, Joe Eykholt wrote:
> This is with the patch I just submitted on top of fcoe-next.
> I believe it's unrelated to my patch.
>
> On removing the module or deleting an instance (I'm not sure which), the
> system crashes. I think the problem is that fc_exch_release() happens
> too late, after the exchange manager is freed, so that the second arg
> to mempool_free, the pool pointer, is 6b6b6b6b6b6b6b6b, which is the
> slab allocator's free poison value.
>
> Symptoms with other allocators will probably be different, but I find the
> slab allocator with CONFIG_DEBUG_SLAB handy for finding things like this.
>
> We need to do a sync cancel somewhere before removing the exchange manager.
fc_exch_reset() needs to call cancel_delayed_work_sync() instead of
cancel_delayed_work(). Until now we could not call the _sync variant
there, because fc_exch_reset() is called with the lport lock held and
the _sync call would have ended up acquiring the lport lock again,
causing a deadlock. The lport exch resp handler now checks for
-FC_EX_CLOSED without acquiring the lport lock, so it should be safe to
call cancel_delayed_work_sync() in fc_exch_reset(). Let me try this fix.
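
To illustrate the ordering involved, here is a rough userspace model. It
is not libfc code and not the actual patch: a pthread stands in for the
delayed work item, pthread_join() plays the role of
cancel_delayed_work_sync(), and every name in it (struct pool, struct
exch, exch_alloc, exch_cancel_sync, teardown_demo) is made up for the
sketch. The only point is that the timeout handler's release must have
finished before the pool it returns the exchange to is freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <unistd.h>

/* Stand-in for the exchange manager and its mempool (hypothetical). */
struct pool {
	atomic_int outstanding;	/* exchanges allocated but not yet freed */
};

/* Stand-in for an exchange (hypothetical, not struct fc_exch). */
struct exch {
	struct pool *pool;	/* pool the exchange is returned to */
	atomic_int refcnt;	/* one hold per user, plus one for the timer */
	pthread_t timer;	/* models the delayed work item */
};

static struct exch *exch_alloc(struct pool *p)
{
	struct exch *ep = calloc(1, sizeof(*ep));

	assert(ep);
	ep->pool = p;
	atomic_init(&ep->refcnt, 1);
	atomic_fetch_add(&p->outstanding, 1);
	return ep;
}

/* Drop one hold; the last one returns the exchange to its pool. */
static void exch_release(struct exch *ep)
{
	if (atomic_fetch_sub(&ep->refcnt, 1) == 1) {
		/* Touches ep->pool, so the pool must still be alive here. */
		atomic_fetch_sub(&ep->pool->outstanding, 1);
		free(ep);
	}
}

/* Timeout handler: runs on another thread and drops the timer's hold. */
static void *exch_timeout(void *arg)
{
	usleep(1000);		/* pretend the handler is busy */
	exch_release(arg);
	return NULL;
}

static void exch_timer_start(struct exch *ep)
{
	int ret;

	atomic_fetch_add(&ep->refcnt, 1);	/* hold for the timer */
	ret = pthread_create(&ep->timer, NULL, exch_timeout, ep);
	assert(ret == 0);
}

/* Synchronous cancel: returns only after the handler has finished. */
static void exch_cancel_sync(struct exch *ep)
{
	/* Caller must not hold any lock the handler takes, or this hangs. */
	pthread_join(ep->timer, NULL);
}

/* Tear down in the safe order; returns the exchanges still outstanding. */
int teardown_demo(void)
{
	struct pool *p = calloc(1, sizeof(*p));
	struct exch *ep;
	int left;

	assert(p);
	atomic_init(&p->outstanding, 0);
	ep = exch_alloc(p);
	exch_timer_start(ep);

	exch_cancel_sync(ep);	/* 1. wait for the timeout handler */
	exch_release(ep);	/* 2. drop the caller's hold */
	left = atomic_load(&p->outstanding);
	free(p);		/* 3. only now free the pool */
	return left;
}
```

Swapping steps 1 and 3 in teardown_demo() would let exch_release() run
against a freed pool, which is the same shape as the mempool_free()
crash in the trace below.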
Vasu
>
> Below is the log leading up to the crash.
>
> Joe
>
>
> [ 816.144406] device eth0 left promiscuous mode
> [ 816.149371] host42: rport 820000: Remove port
> [ 816.153950] host42: rport 820000: Port entered LOGO state from Ready state
> [ 816.161230] host42: rport 820000: Delete port
> [ 816.165894] host42: rport 706e1: Remove port
> [ 816.165900] host42: rport 820000: work event 3
> [ 816.175136] host42: rport 706e1: Port entered LOGO state from Ready state
> [ 816.182379] host42: rport 706e1: Delete port
> [ 816.186744] host42: rport 706d2: Remove port
> [ 816.191119] host42: rport 706d2: Port entered LOGO state from Ready state
> [ 816.198035] host42: rport 706d2: Delete port
> [ 816.202414] host42: rport 702ef: Remove port
> [ 816.206795] host42: rport 702ef: Port entered LOGO state from Ready state
> [ 816.213716] host42: rport 702ef: Delete port
> [ 816.218093] host42: rport 702dc: Remove port
> [ 816.222474] host42: rport 702dc: Port entered LOGO state from Ready state
> [ 816.229396] host42: rport 702dc: Delete port
> [ 816.233764] host42: rport 700e8: Remove port
> [ 816.238129] host42: rport 700e8: Port entered LOGO state from Ready state
> [ 816.245040] host42: rport 700e8: Delete port
> [ 816.249415] host42: rport 700e2: Remove port
> [ 816.253781] host42: rport 700e2: Port entered LOGO state from Ready state
> [ 816.260699] host42: rport 700e2: Delete port
> [ 816.265071] host42: rport 700e1: Remove port
> [ 816.269452] host42: rport 700e1: Port entered LOGO state from Ready state
> [ 816.276364] host42: rport 700e1: Delete port
> [ 816.280730] host42: rport 700dc: Remove port
> [ 816.285121] host42: rport 700dc: Port entered LOGO state from Ready state
> [ 816.292022] host42: rport 700dc: Delete port
> [ 816.296459] host42: rport 820000: Received a LOGO response closed
> [ 816.302868] host42: rport 820000: Received a LOGO response, but in state
> Delete
> [ 816.310639] host42: rport 706e1: work event 3
> [ 816.315359] host42: rport 706e1: Received a LOGO response closed
> [ 816.321750] host42: rport 706e1: Received a LOGO response, but in state
> Delete
> [ 816.329170] host42: rport 706d2: work event 3
> [ 816.333672] host42: rport 706d2: Received a LOGO response closed
> [ 816.339791] host42: rport 706d2: Received a LOGO response, but in state
> Delete
> [ 816.347211] host42: rport 702ef: work event 3
> [ 816.351700] host42: rport 702ef: Received a LOGO response closed
> [ 816.357810] host42: rport 702ef: Received a LOGO response, but in state
> Delete
> [ 816.365234] host42: rport 702dc: work event 3
> [ 816.369736] host42: rport 702dc: Received a LOGO response closed
> [ 816.375867] host42: rport 702dc: Received a LOGO response, but in state
> Delete
> [ 816.383285] host42: rport 700e8: work event 3
> [ 816.387797] host42: rport 700e8: Received a LOGO response closed
> [ 816.393900] host42: rport 700e8: Received a LOGO response, but in state
> Delete
> [ 816.401269] host42: rport 700e2: work event 3
> [ 816.405752] host42: rport 700e2: Received a LOGO response closed
> [ 816.411878] host42: rport 700e2: Received a LOGO response, but in state
> Delete
> [ 816.419242] host42: rport 700e1: work event 3
> [ 816.423711] host42: rport 700e1: Received a LOGO response closed
> [ 816.429830] host42: rport 700e1: Received a LOGO response, but in state
> Delete
> [ 816.437192] host42: rport 700dc: work event 3
> [ 816.441677] host42: rport 700dc: Received a LOGO response closed
> [ 816.447804] host42: rport 700dc: Received a LOGO response, but in state
> Delete
> [ 816.455210] host42: rport fffffc: Remove port
> [ 816.459789] host42: rport fffffc: Port entered LOGO state from Ready state
> [ 816.466723] host42: rport fffffc: Delete port
> [ 816.471356] host42: rport fffffc: work event 3
> [ 816.475826] host42: rport fffffc: callback ev 3
> [ 816.480367] host42: lport 6a0000: Received a 3 event for port (fffffc)
> [ 816.486919] host42: rport fffffc: Received a LOGO response closed
> [ 816.493037] host42: rport fffffc: Received a LOGO response, but in state
> Delete
> [ 816.500407] host42: lport 6a0000: Entered LOGO state from Ready state
> [ 816.506985] host42: lport 6a0000: Received a LOGO response closed
> [ 816.752910] host43: lport 0: Entered LOGO state from FLOGI state
> [ 816.760009] host43: lport 0: Received a FLOGI response closed
> [ 816.766483] host43: lport 0: Received a LOGO response closed
> [ 816.773585] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC
> [ 816.779051] last sysfs file:
> /sys/class/net/eth4/host42/rport-42:0-12/target42:0:8/42:0:8:0/state
> [ 816.787777] CPU 1
> [ 816.789283] Modules linked in: fcoe(-) libfcoe libfc autofs4 nfs lockd
> nfs_acl auth_rpcgss sunrpc
> iptable_filter ip_tables x_tables loop dm_multipath uinput e1000e
> scsi_transport_fc ixgbe ide_cd_mod cdrom
> i2c_i801 i2c_core tg3 libphy shpchp sg pcspkr ppdev mdio serio_raw parport_pc
> rtc_cmos rtc_core rtc_lib
> parport button dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod
> ata_piix libata sd_mod scsi_mod ext3
> jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: libfc]
> [ 816.827990] Pid: 16, comm: events/1 Not tainted 2.6.32-rc3-n1 #1 X7DB8
> [ 816.836152] RIP: 0010:[<ffffffff8109f08a>] [<ffffffff8109f08a>]
> mempool_free+0x15/0x77
> [ 816.841698] RSP: 0018:ffff88013f09bd30 EFLAGS: 00010286
> [ 816.849811] RAX: 0000000000000000 RBX: 6b6b6b6b6b6b6b6b RCX:
> ffff88013f07c006
> [ 816.858207] RDX: 0000000000000001 RSI: 6b6b6b6b6b6b6b6b RDI:
> ffff8801372eceb8
> [ 816.858207] RBP: ffff88013f09bd40 R08: ffff8801372ecf18 R09:
> 0000000000000000
> [ 816.858207] R10: ffff88013f09bd40 R11: ffff8801372ecef8 R12:
> ffff8801372eceb8
> [ 816.858207] R13: ffff8801372ecf20 R14: ffff8801372ecf18 R15:
> ffff8801372ecee0
> [ 816.858207] FS: 0000000000000000(0000) GS:ffff880028280000(0000)
> knlGS:0000000000000000
> [ 816.858207] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [ 816.858207] CR2: 00007fee93e40098 CR3: 0000000135761000 CR4:
> 00000000000006e0
> [ 816.858207] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [ 816.858207] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [ 816.858207] Process events/1 (pid: 16, threadinfo ffff88013f09a000, task
> ffff88013f07c048)
> [ 816.858207] Stack:
> [ 816.858207] ffff8801372eceb8 ffff880123a220e8 ffff88013f09bd60
> ffffffffa05c42de
> [ 816.858207] <0> ffff88013f09bed8 ffff8801372eceb8 ffff88013f09bdc0
> ffffffffa05c56c6
> [ 816.858207] <0> 0000000000000000 ffffffff81055429 ffff8801372ecf18
> 0000000000000000
> [ 816.858207] Call Trace:
> [ 816.858207] [<ffffffffa05c42de>] fc_exch_release+0x5e/0x63 [libfc]
> [ 816.858207] [<ffffffffa05c56c6>] fc_exch_timeout+0x305/0x314 [libfc]
> [ 816.858207] [<ffffffff81055429>] ? worker_thread+0x1d5/0x332
> [ 816.858207] [<ffffffff81055480>] worker_thread+0x22c/0x332
> [ 816.858207] [<ffffffff81055429>] ? worker_thread+0x1d5/0x332
> [ 816.858207] [<ffffffffa05c53c1>] ? fc_exch_timeout+0x0/0x314 [libfc]
> [ 816.858207] [<ffffffff810594b0>] ? autoremove_wake_function+0x0/0x38
> [ 816.858207] [<ffffffff81066900>] ? trace_hardirqs_on+0xd/0xf
> [ 816.858207] [<ffffffff81055254>] ? worker_thread+0x0/0x332
> [ 816.858207] [<ffffffff81059170>] kthread+0x7d/0x85
> [ 816.858207] [<ffffffff8100caba>] child_rip+0xa/0x20
> [ 816.858207] [<ffffffff8100c47c>] ? restore_args+0x0/0x30
> [ 816.858207] [<ffffffff810590f3>] ? kthread+0x0/0x85
> [ 816.858207] [<ffffffff8100cab0>] ? child_rip+0x0/0x20
> [ 816.858207] Code: c9 c3 55 48 89 f0 48 89 fe 48 89 c7 48 89 e5 e8 68 e9 02
> 00 c9 c3 55 48 85 ff 48 89 e5 41
> 54 49 89 fc 53 48 89 f3 74 60 0f ae f0 <8b> 46 34 3b 46 30 7d 4b 48 89 f7 e8
> ea b5 27 00 8b 4b 34 3b 4b
> [ 816.858207] RIP [<ffffffff8109f08a>] mempool_free+0x15/0x77
> [ 816.858207] RSP <ffff88013f09bd30>
> [ 817.069295] ---[ end trace 31bf59194b827ac4 ]---
> _______________________________________________
> devel mailing list
> [email protected]
> http://www.open-fcoe.org/mailman/listinfo/devel