> > This oops was reported to me recently:
> > PID: 5176   TASK: ffff880215274100  CPU: 0   COMMAND: "fc_rport_eq"
> > 0 [ffff880218c65760] machine_kexec at ffffffff81031d3b
> > 1 [ffff880218c657c0] crash_kexec at ffffffff810b8e92
> > 2 [ffff880218c65890] oops_end at ffffffff814ef890
> > 3 [ffff880218c658c0] no_context at ffffffff8104226b
> > 4 [ffff880218c65910] __bad_area_nosemaphore at ffffffff810424f5
> > 5 [ffff880218c65960] bad_area_nosemaphore at ffffffff810425c3
> > 6 [ffff880218c65970] __do_page_fault at ffffffff81042c9d
> > 7 [ffff880218c65a90] do_page_fault at ffffffff814f186e
> > 8 [ffff880218c65ac0] page_fault at ffffffff814eec25
> > 9 [ffff880218c65bb8] fc_fcp_complete_locked at ffffffffa02ed739 [libfc]
> > 10 [ffff880218c65c08] fc_fcp_retry_cmd at ffffffffa02ed86f [libfc]
> > 11 [ffff880218c65c28] fc_fcp_recv at ffffffffa02eed3f [libfc]
> > 12 [ffff880218c65d28] fc_exch_mgr_reset at ffffffffa02e2373 [libfc]
> > 13 [ffff880218c65db8] fc_rport_work at ffffffffa02e9f10 [libfc]
> > 14 [ffff880218c65e38] worker_thread at ffffffff8108b250
> > 15 [ffff880218c65ee8] kthread at ffffffff81090806
> > 16 [ffff880218c65f48] kernel_thread at ffffffff8100c10a
> >
> > It results from two contexts that try to manipulate the same fcoe_exch_pool
> > without synchronizing themselves:
> >
> > 1) The fcoe event_work workqueue which calls
> > fc_rport_work
> >  fc_exch_mgr_reset
> >    fc_exch_pool_reset
> >
> > 2) The FCOE transport destroy path, which schedules a destroy_work workqueue,
> > calling:
> > fcoe_destroy_work
> >  fcoe_if_destroy
> >   fc_exch_mgr_free
> >    fc_exch_mgr_del
> >     fc_exch_mgr_destroy
> >
> > The pool_reset path holds the pool look, but no references to the pool manager
> > kobject, while exch_mgr_destroy path drops what is ostensibly the last
> > reference to the pool manager kobject (causing its freeing), while not holding
> > the pool lock.
> >
> > The attached patch has been confirmed to prevent the panic.
> >
> > Signed-off-by: Neil Horman <nhor...@tuxdriver.com>
> > CC: Robert Love <robert.w.l...@intel.com>
> 
> Thanks, Neil.
> 
> yi
Neil, I fixed the issues below while applying your patch. The same
changes will be folded into your original patch and its description
when I pull this into open-fcoe later.

    1. added the missing ';' after kref_get() to fix a compile error
    2. added a declaration of fc_exch_mgr_destroy() to fix a compile error
    3. fixed the typo 'look' -> 'lock' in the patch description
    4. added a 'libfc' prefix to the patch title
    
 -yi
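
Fixes 1 and 2 above can be illustrated with a small userspace sketch of
the get/put bracket that the patch adds. This is not the libfc code:
struct ref, ref_get(), ref_put(), struct mgr, mgr_reset() and
mgr_destroy() are hypothetical stand-ins for struct kref, kref_get(),
kref_put(), the exchange manager, fc_exch_mgr_reset() and
fc_exch_mgr_destroy(). The counter here is a plain int, whereas the
kernel's kref is atomic, so the sketch only shows the call pattern and
why the release callback needs a declaration before its first use.

```c
/* Userspace sketch only; all names are hypothetical stand-ins. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ref { int count; };              /* stand-in for struct kref */

static void ref_get(struct ref *r) { r->count++; }

/* Calls release() and returns 1 when the last reference is dropped. */
static int ref_put(struct ref *r, void (*release)(struct ref *))
{
	if (--r->count == 0) {
		release(r);
		return 1;
	}
	return 0;
}

struct mgr {
	struct ref ref;                 /* must stay the first member */
	int pool_resets;                /* counts reset passes */
};

static int mgrs_destroyed;

/*
 * Forward declaration (compare fix 2): the release callback is
 * defined further down but is passed to ref_put() in mgr_reset().
 */
static void mgr_destroy(struct ref *r);

/*
 * Mirrors the patched loop body: pin the manager, work on it, unpin.
 * Returns 1 if this put dropped the last reference.
 */
static int mgr_reset(struct mgr *mp)
{
	assert(mp->ref.count > 0);      /* caller must still hold a ref */
	ref_get(&mp->ref);              /* note the ';' (compare fix 1) */
	mp->pool_resets++;              /* stands in for the pool reset */
	return ref_put(&mp->ref, mgr_destroy);
}

static void mgr_destroy(struct ref *r)
{
	/* ref is the first member, so the cast recovers struct mgr */
	struct mgr *mp = (struct mgr *)r;

	mgrs_destroyed++;
	free(mp);
}

/* Allocates a manager holding one initial reference. */
static struct mgr *mgr_alloc(void)
{
	struct mgr *mp = calloc(1, sizeof(*mp));

	if (mp)
		mp->ref.count = 1;
	return mp;
}
```

In this sketch, a reset on a manager that still has its initial
reference leaves the count unchanged afterwards, and the manager is
only freed once that initial reference is put as well.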

> 
> 
> > ---
> >  drivers/scsi/libfc/fc_exch.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/scsi/libfc/fc_exch.c b/drivers/scsi/libfc/fc_exch.c
> > index 8325489..d7c8100 100644
> > --- a/drivers/scsi/libfc/fc_exch.c
> > +++ b/drivers/scsi/libfc/fc_exch.c
> > @@ -1815,10 +1815,12 @@ void fc_exch_mgr_reset(struct fc_lport *lport, u32 sid, u32 did)
> >     unsigned int cpu;
> >
> >     list_for_each_entry(ema, &lport->ema_list, ema_list) {
> > +           kref_get(&ema->mp->kref)
> >             for_each_possible_cpu(cpu)
> >                     fc_exch_pool_reset(lport,
> >                                        per_cpu_ptr(ema->mp->pool, cpu),
> >                                        sid, did);
> > +           kref_put(&ema->mp->kref, fc_exch_mgr_destroy);
> >     }
> >  }
> >  EXPORT_SYMBOL(fc_exch_mgr_reset);
> > --
> > 1.7.6.4
> >
> > _______________________________________________
> > devel mailing list
> > devel@open-fcoe.org
> > https://lists.open-fcoe.org/mailman/listinfo/devel