On 07/14/2011 01:33 AM, Eddie Wai wrote:
> 
> On Mon, 2011-07-11 at 13:02 -0700, Mike Christie wrote:
>> On 07/11/2011 01:14 PM, Eddie Wai wrote:
>>> A kernel panic was observed when passing the sc->request->cpu = -1 to
>>> retrieve the per_cpu variable pointer:
>>>  #0 [ffff880011203960] machine_kexec at ffffffff81022bc3
>>>  #1 [ffff8800112039b0] crash_kexec at ffffffff81088630
>>>  #2 [ffff880011203a80] __die at ffffffff8139ea20
>>>  #3 [ffff880011203aa0] no_context at ffffffff8102f3a7
>>>  #4 [ffff880011203ae0] __bad_area_nosemaphore at ffffffff8102f665
>>>  #5 [ffff880011203ba0] retint_signal at ffffffff8139dd1f
>>>  #6 [ffff880011203cc8] bnx2i_indicate_kcqe at ffffffffa03dc4f2
>>>  #7 [ffff880011203da8] service_kcqes at ffffffffa03cb04f
>>>  #8 [ffff880011203e68] cnic_service_bnx2x_kcq at ffffffffa03cb14a
>>>  #9 [ffff880011203e88] cnic_service_bnx2x_bh at ffffffffa03cb1b3
>>>
>>> The problem lies in the sg_io (and perhaps sg_scsi_ioctl) call to
>>> blk_get_request->get_request/wait->blk_alloc_request->blk_rq_init which
>>> re-initializes the request->cpu to -1.  There is no assignment for cpu from
>>> that to the request_fn call to low level drivers.
>>>
>>> When this happens, the sc->request->cpu will be using the init value of
>>> -1.  This will create a kernel panic when it hits bnx2i because the code
>>> refers it to get the per_cpu variables ptr.
>>>
>>> This change is to put in a guard against that and also for cases when
>>> CONFIG_SMP/BIO_CPU_AFFINE is not enabled.  In those cases, the cpu
>>> affinitization code would not get run in __make_request either; hence
>>> the request->cpu will remain a -1 also.
>>>
>>
>>
>>>
>>> diff --git a/drivers/scsi/bnx2i/bnx2i_iscsi.c 
>>> b/drivers/scsi/bnx2i/bnx2i_iscsi.c
>>> index 5c55a75..622383d 100644
>>> --- a/drivers/scsi/bnx2i/bnx2i_iscsi.c
>>> +++ b/drivers/scsi/bnx2i/bnx2i_iscsi.c
>>> @@ -1225,6 +1225,10 @@ static int bnx2i_task_xmit(struct iscsi_task *task)
>>>     if (!sc)
>>>             return bnx2i_mtask_xmit(conn, task);
>>>  
>>> +   if (!blk_rq_cpu_valid(sc->request)) {
>>> +           sc->request->cpu = get_cpu();
>>> +           put_cpu();
>>> +   }
>>
>>
>> If I understand you right, then I think this needs to get fixed in the
>> block or scsi layer instead of each LLD.
> 
> Absolutely, but this bnx2i fix is still applicable alongside the fixes I'm
> proposing in the block layer below.
> 
> I think the whole idea behind the tracking of the blk req->cpu is so that the 
> blk completion
> can be fired off from the same CPU to take advantage of the CPU's llc.  
> However, this is only being
> done when the queue is defined with the QUEUE_FLAG_SAME_COMP queue_flag 
> enabled.  In the case
> when the queue is defined without this enforced, it would then be up to the 
> blk completion
> code to complete the blk request with the current CPU of the thread.
> 
> The same analogy should apply to the iSCSI LLD for cmd completion as well.  
> So if the
> sc->request->cpu is left at -1, the LLD should then decide how it wants the 
> cmd completion
> to take place.  For all the other cases, the request->cpu id should be used 
> instead.  
> For bnx2i, if the blk layer didn't set the request->cpu, we would want to 
> align and complete
> the cmd against the task_xmit issuer's CPU id unconditionally; hence the 
> explicit get_cpu call.

Oh yeah, wrt the code we have today (so I mean ignoring my rant in the
other mail :)), I think what you are doing in your patch is sort of ok.
I am not sure if we want to be touching the request->cpu field in a LLD
though. I do not think LLDs should be modifying a block layer struct
like that probably.

-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com.
To unsubscribe from this group, send email to 
open-iscsi+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.

Reply via email to