On Mon, 2011-02-21 at 19:08 +0100, Bart Van Assche wrote:
> On Mon, Feb 21, 2011 at 11:48 AM, Nicholas A. Bellinger
> <[email protected]> wrote:
> > Using a completion queue here with possibility of the extra context
> > switch is IMHO an unnecessary overhead.  The main issue here is that
> > target core needs to be able to clear the struct se_cmd descriptor after
> > invoking the struct target_core_fabric_ops->send_data_in() callback, but
> > before the call to srpt_handle_send_comp() -> release of the fabric
> > dependent descriptor containing our struct se_cmd can happen.
> >
> > I prefer something along the lines of the patch below for
> > lio-core-2.6.git/tcm_ib_srpt to follow what other TCM v4 HW fabric
> > modules are currently doing wrt to TFO->check_stop_free() usage.  This
> > patch has been compile tested only at this point, but I will be testing
> > this on IB hardware in the upcoming week.  Please review.
> 
> Hello Nicholas,
> 
> Thanks for the review. Regarding the two issues you mentioned:
> - As far as I understood it is safe to send back data to the initiator
> concurrently with the target core processing new commands ?

Yes, > 1 TCQ depth is not the issue.  This issue at hand is how to clear
the struct se_cmd in transport_cmd_check_stop() when using the optional
TFO->check_stop_free() call after TFO->queue_data_in() has been invoked.

> In that
> case no new counters are necessary - the existing kref in
> srpt_send_ioctx is sufficient. The patch that re-enables concurrent
> SCSI READs can be found here:
> https://github.com/bvanassche/srpt/commit/cd8d5e9a6501f5b3a4315c261e945c201c4ce0ae.
> Please let me know if anyone wants to see that patch posted on the
> list. Also, please note that your patch contained the same race as my
> original patch: directly invoking srpt_unmap_sg_to_ib_sge() from the
> check_stop_free() callback makes that call race with the ongoing RDMA
> write. To Roland: I haven't found a way to invoke
> srpt_put_send_ioctx() without using an atomic counter.

Ok, so this sounds like srpt_unmap_sg_to_ib_sge() should be getting
called from the IB completion queue context then..

>  If you see an
> approach to avoid that atomic counter, suggestions are welcome.
> - Regarding making the check_stop_free() callback a NOOP for task
> management commands: are you sure that that's a good idea ? I have
> tried to do that, and the following appeared in the kernel log during
> module unload:
> 
> CORE_HBA[0] - Detached HBA from Generic Target Core
> =============================================================================
> BUG se_tmr_cache: Objects remaining on kmem_cache_close()
> -----------------------------------------------------------------------------
> 
> INFO: Slab 0xffffea0000119c00 objects=16 used=1 fp=0xffff880005080100
> flags=0x1000000000000080
> Pid: 18117, comm: rmmod Not tainted 2.6.38-rc5-dbg+ #4
> Call Trace:
>  [<ffffffff81123f76>] ? slab_err+0x96/0xb0
>  [<ffffffff81126da4>] ? __slab_alloc+0x124/0x540
>  [<ffffffff811277c8>] ? __kmalloc+0xf8/0x1b0
>  [<ffffffff81081a0d>] ? trace_hardirqs_off+0xd/0x10
>  [<ffffffff811277e8>] ? __kmalloc+0x118/0x1b0
>  [<ffffffff81128cf7>] ? kmem_cache_destroy+0x177/0x3f0
>  [<ffffffffa02e53c6>] ? release_se_global+0x36/0xb0 [target_core_mod]
>  [<ffffffffa02d10ab>] ? target_core_exit_configfs+0x17b/0x190 
> [target_core_mod]
>  [<ffffffff8109340f>] ? sys_delete_module+0x16f/0x260
>  [<ffffffff81002fdc>] ? sysret_check+0x27/0x62
>  [<ffffffff81087a8d>] ? trace_hardirqs_on_caller+0x14d/0x190
>  [<ffffffff81002fab>] ? system_call_fastpath+0x16/0x1b
> INFO: Object 0xffff880005080000 @offset=0
> INFO: Allocated in core_tmr_alloc_req+0x33/0x80 [target_core_mod]
> age=2442 cpu=1 pid=18088
> SLUB se_tmr_cache: kmem_cache_destroy called for cache that still has objects.
> Pid: 18117, comm: rmmod Not tainted 2.6.38-rc5-dbg+ #4
> Call Trace:
>  [<ffffffff81128ee4>] ? kmem_cache_destroy+0x364/0x3f0
>  [<ffffffffa02e53c6>] ? release_se_global+0x36/0xb0 [target_core_mod]
>  [<ffffffffa02d10ab>] ? target_core_exit_configfs+0x17b/0x190 
> [target_core_mod]
>  [<ffffffff8109340f>] ? sys_delete_module+0x16f/0x260
>  [<ffffffff81002fdc>] ? sysret_check+0x27/0x62
>  [<ffffffff81087a8d>] ? trace_hardirqs_on_caller+0x14d/0x190
>  [<ffffffff81002fab>] ? system_call_fastpath+0x16/0x1b
> 

Hmmm, I don't recall seeing this with TCM_Loop with sg_reset, but I will
double check with the lastest for-39 WIP fabric module code using
TFO->check_stop_free().

Thanks!

--nab


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to