On Fri, Oct 21, 2011 at 05:47:50PM -0700, Zou, Yi wrote:
> > > This oops was reported to me recently:
> > > PID: 5176   TASK: ffff880215274100  CPU: 0   COMMAND: "fc_rport_eq"
> > > 0 [ffff880218c65760] machine_kexec at ffffffff81031d3b
> > > 1 [ffff880218c657c0] crash_kexec at ffffffff810b8e92
> > > 2 [ffff880218c65890] oops_end at ffffffff814ef890
> > > 3 [ffff880218c658c0] no_context at ffffffff8104226b
> > > 4 [ffff880218c65910] __bad_area_nosemaphore at ffffffff810424f5
> > > 5 [ffff880218c65960] bad_area_nosemaphore at ffffffff810425c3
> > > 6 [ffff880218c65970] __do_page_fault at ffffffff81042c9d
> > > 7 [ffff880218c65a90] do_page_fault at ffffffff814f186e
> > > 8 [ffff880218c65ac0] page_fault at ffffffff814eec25
> > > 9 [ffff880218c65bb8] fc_fcp_complete_locked at ffffffffa02ed739 [libfc]
> > > 10 [ffff880218c65c08] fc_fcp_retry_cmd at ffffffffa02ed86f [libfc]
> > > 11 [ffff880218c65c28] fc_fcp_recv at ffffffffa02eed3f [libfc]
> > > 12 [ffff880218c65d28] fc_exch_mgr_reset at ffffffffa02e2373 [libfc]
> > > 13 [ffff880218c65db8] fc_rport_work at ffffffffa02e9f10 [libfc]
> > > 14 [ffff880218c65e38] worker_thread at ffffffff8108b250
> > > 15 [ffff880218c65ee8] kthread at ffffffff81090806
> > > 16 [ffff880218c65f48] kernel_thread at ffffffff8100c10a
> > >
> > > It results from two contexts that try to manipulate the same
> > > fcoe_exch_pool
> > > without synchronizing themselves:
> > >
> > > 1) The fcoe event_work workqueue which calls
> > > fc_rport_work
> > >  fc_exch_mgr_reset
> > >    fc_exch_pool_reset
> > >
> > > 2) The FCOE transport destroy path, which schedules a destroy_work
> > > workqueue,
> > > calling:
> > > fcoe_destroy_work
> > >  fcoe_if_destroy
> > >   fc_exch_mgr_free
> > >    fc_exch_mgr_del
> > >     fc_exch_mgr_destroy
> > >
> > > The pool_reset path holds the pool look, but no references to the pool
> > > manager
> > > kobject, while exch_mgr_destroy path drops what is ostensibly the last
> > > reference to the pool manager kobject (causing its freeing), while not
> > > holding
> > > the pool lock.
> > >
> > > The attached patch has been confirmed to prevent the panic.
> > >
> > > Signed-off-by: Neil Horman <nhor...@tuxdriver.com>
> > > CC: Robert Love <robert.w.l...@intel.com>
> > 
> > Thanks, Neil.
> > 
> > yi
> Neil, I have fixed the issues below while applying; the following
> will be updated to your original patch description when I pull
> this in later to open-fcoe.
> 
>     1. added missing ';' at kref_get() to fix compiling error
>     2. added the declaration of fc_exch_mgr_destroy() to fix compiling error
>     3. fixed one typo of 'look' to 'lock' in patch description
>     4. added a prefix of libfc in patch title
>     
>  -yi
> 
Thank you, and I apologize. I had those fixed locally, but neglected to amend my
commit prior to running git-send-email.
Neil

> 
_______________________________________________
devel mailing list
devel@open-fcoe.org
https://lists.open-fcoe.org/mailman/listinfo/devel
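
For readers following the race described in the patch description above, here
is a minimal userspace sketch of the general idea: a path that works on the
pool under the pool lock also holds its own reference on the owning manager, so
a concurrent final put from the destroy path cannot free the manager underneath
it. This is not the patch from this thread and not libfc code. Every name in it
(demo_mgr, demo_mgr_get, demo_mgr_put, demo_pool_reset) is hypothetical, the
atomic counter merely stands in for the kernel's kref, and it assumes the
caller already knows the manager is alive when it takes the extra reference.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for an exchange manager; not the libfc layout. */
struct demo_mgr {
    atomic_int refcount;        /* plays the role of a kref */
    pthread_mutex_t pool_lock;  /* plays the role of the pool lock */
    int pool_entries;           /* state that the reset path clears */
    int *destroyed;             /* set to 1 on release, for demonstration only */
};

/* Allocate a manager holding one initial reference. */
static struct demo_mgr *demo_mgr_alloc(int entries, int *destroyed)
{
    struct demo_mgr *mp = malloc(sizeof(*mp));

    if (!mp)
        return NULL;
    atomic_init(&mp->refcount, 1);
    pthread_mutex_init(&mp->pool_lock, NULL);
    mp->pool_entries = entries;
    mp->destroyed = destroyed;
    *destroyed = 0;
    return mp;
}

/* Take a reference; the caller must already know mp is alive. */
static void demo_mgr_get(struct demo_mgr *mp)
{
    atomic_fetch_add(&mp->refcount, 1);
}

/* Drop a reference; whoever drops the last one frees the manager. */
static void demo_mgr_put(struct demo_mgr *mp)
{
    if (atomic_fetch_sub(&mp->refcount, 1) == 1) {
        pthread_mutex_destroy(&mp->pool_lock);
        *mp->destroyed = 1;
        free(mp);
    }
}

/* Reset path: pin the manager for as long as the pool is being touched. */
static void demo_pool_reset(struct demo_mgr *mp)
{
    demo_mgr_get(mp);
    pthread_mutex_lock(&mp->pool_lock);
    mp->pool_entries = 0;
    pthread_mutex_unlock(&mp->pool_lock);
    demo_mgr_put(mp);
}
```

With this shape, if the destroy path drops its reference while a reset still
holds one, the free is deferred until the reset path's own put. The pool lock
alone does not give that guarantee, because it says nothing about the lifetime
of the object that contains it.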
