On 08/17/2009 10:00 AM, Hannes Reinecke wrote:
> Mike Christie wrote:
>> On 08/14/2009 07:01 AM, Hannes Reinecke wrote:
>>> Hi Mike,
>>> finally I've found the mysterious I/O stall I've been chasing
>>> for the last two months.
>>> Problem is iscsi_conn_queue_work(); we're just calling
>>> queue_work() without checking the return value.
>>> However, queue_work() will be effectively a no-op when
>>> there is some work already running, so we'll be missing
>>> quite a few invocations (on my NetApp target I've been
>>> counting up to 120000 missed invocations ...).
>> Huh, I think I am misunderstanding you or I must be misreading workqueue
>> code or probably both :(
>> When you saying running do you mean the work fn is running or that the
>> work is queued (WORK_STRUCT_PENDING bit is set but the work fn is not
>> yet called)?
>> I thought this happens when queue_work returns 1:
>> run_workqueue
>>      work_clear_pending
>>      f(work)
>> so if we called queue_work while iscsi_data_xmit was running then
>> queue_work's pending test:
>>           if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) {
>>                   BUG_ON(!list_empty(&work->entry));
>>                   __queue_work(wq_per_cpu(wq, cpu), work);
>> would actually queue the work to be run. So I thought we could actually
>> be getting extra calls to our work function, because we could call
>> queue_work right after the WORK_STRUCT_PENDING bit cleared but the work
>> fn is not yet running.
>> If the queue_work returns 0, then I thought iscsi_data_xmit is not yet
>> running, so when it does run it would handle the task added to the list
>> by the caller of queue_work (we do list_add then queue_work so it should
>> be on the list when the work function would run if the work is pending
>> when we call queue_work).
> Yeah, you are right. My assumptions were too simplistic.
> (Read: next time I should read the code more thoroughly ...)
> However, I still have the feeling there still is potential
> for an I/O stall; this iscsi_check_cmdsn_window_closed() check
> simply doesn't feel safe.

The netapp target you are using has a fixed window for your test right? 
You could just set can_queue to that window, and see if the problem occurs.

>> Hey, for the list_add+queue_work sequence do I need to add some sort of
>> barrier?
> Not sure; might've as queue_work is potentially run on another CPU.
>>> In addition to that, iscsi_check_cmdsn_window_closed()
>>> doesn't actually check if the cmdsn window has been closed
>>> (in the sense that we're not allowed to send any PDUs),
>>> but rather that the new PDU _to be queued_ will be rejected.
>>> There might still be PDUs in the queue, waiting to be
>>> transferred.
>>> So if we're hitting queuecommand hard enough we might be running into this
>>> scenario:
>>> - iscsi_data_xmit:
>>>      Transfers last Data-out PDU in requeue list; all queues are empty.
>>>      ->   xmit_task()
>>>         ->   unlock session
>>> - queuecommand()
>>>       being called repeatedly until iscsi_check_cmdsn triggers, giving
>>>       xmit thread no chance to grab the session lock
>>>       ->   iscsi_conn_queue_work() does nothing as work is already pending
>> Yeah, so I thought here because iscsi_data_xmit is running we hit this:
>> run_workqueue
>>      work_clear_pending
>> //WORK_STRUCT_PENDING is now cleared so queue_work will queue the work
>> (queue_work returns 1).
>>      f(work)
> But we still will be stuck eg if sendpage() returns a retryable
> error other than -EAGAIN.
> If such a thing exists. Will have to check.

I was asking on your other patch where you modified the queue_work in 
iscsi_update_cmdsn, if when the sendpage happens if 
iscsi_sw_tcp_write_space should always be called when space finally 
opens up. If so then iscsi_sw_tcp_write_space will call 
iscsi_conn_queue_work and that should start us up again.

If iscsi_sw_tcp_write_space does not always get called then we have to 
add some code to iscsi_complete_command to wake up the xmit thread when 
a command still needs to be sent, and we have to modify iscsi_data_xmit 
to do a delayed queue_work when there are no commands in flight.

You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to open-iscsi@googlegroups.com
To unsubscribe from this group, send email to 
For more options, visit this group at http://groups.google.com/group/open-iscsi

Reply via email to