Mike Christie wrote:
> On 08/14/2009 07:01 AM, Hannes Reinecke wrote:
>> Hi Mike,
>>
>> finally I've found the mysterious I/O stall I've been chasing
>> for the last two months.
>>
>> The problem is iscsi_conn_queue_work(); we're just calling
>> queue_work() without checking the return value.
>>
>> However, queue_work() will effectively be a no-op when
>> there is some work already running, so we'll be missing
>> quite a few invocations (on my NetApp target I've counted
>> up to 120000 missed invocations ...).
>>
>
> Huh, I think I am misunderstanding you, or I must be misreading the
> workqueue code, or probably both :(
>
> When you say running, do you mean the work fn is running, or that the
> work is queued (the WORK_STRUCT_PENDING bit is set but the work fn is
> not yet called)?
>
> I thought this happens when queue_work returns 1:
>
>   run_workqueue
>     work_clear_pending
>     f(work)
>
> so if we called queue_work while iscsi_data_xmit was running, then
> queue_work's pending test:
>
>   if (!test_and_set_bit(WORK_STRUCT_PENDING, work_data_bits(work))) {
>           BUG_ON(!list_empty(&work->entry));
>           __queue_work(wq_per_cpu(wq, cpu), work);
>
> would actually queue the work to be run. So I thought we could be
> getting extra calls to our work function, because we could call
> queue_work right after the WORK_STRUCT_PENDING bit is cleared but
> before the work fn starts running.
>
> If queue_work returns 0, then I thought iscsi_data_xmit is not yet
> running, so when it does run it will handle the task added to the
> list by the caller of queue_work (we do list_add then queue_work, so
> the task is already on the list when the work function runs if the
> work is pending when we call queue_work).
>

Yeah, you are right. My assumptions were too simplistic.
(Read: next time I should read the code more thoroughly ...)
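To make sure I now have the semantics straight, here is a toy version
of the pattern you describe (hypothetical names, not the actual
libiscsi code): if schedule_work() returns 0 the work is still
pending, so the task we just added will be picked up when the work fn
eventually runs; if it returns 1 we get a fresh run.

  #include <linux/kernel.h>
  #include <linux/list.h>
  #include <linux/spinlock.h>
  #include <linux/workqueue.h>

  /* hypothetical context; lock, queue and work are assumed to be set
   * up elsewhere with spin_lock_init(), INIT_LIST_HEAD() and
   * INIT_WORK(&ctx->work, xmit_work_fn) */
  struct xmit_ctx {
          spinlock_t lock;
          struct list_head queue;         /* tasks waiting to be sent */
          struct work_struct work;
  };

  struct xmit_task {
          struct list_head entry;
          /* ... payload ... */
  };

  /* producer side, e.g. queuecommand */
  static void queue_task(struct xmit_ctx *ctx, struct xmit_task *task)
  {
          spin_lock_bh(&ctx->lock);
          list_add_tail(&task->entry, &ctx->queue);
          spin_unlock_bh(&ctx->lock);

          /*
           * Returns 0 iff WORK_STRUCT_PENDING was still set, i.e.
           * xmit_work_fn() has not started yet; when it starts it
           * will find the task we just added.  Returns 1 if the work
           * was (re-)queued, possibly while a previous run was still
           * executing.  Either way no task is lost.
           */
          schedule_work(&ctx->work);
  }

  /* consumer side: drains everything queued before and during a run */
  static void xmit_work_fn(struct work_struct *work)
  {
          struct xmit_ctx *ctx = container_of(work, struct xmit_ctx, work);
          struct xmit_task *task;

          spin_lock_bh(&ctx->lock);
          while (!list_empty(&ctx->queue)) {
                  task = list_entry(ctx->queue.next, struct xmit_task,
                                    entry);
                  list_del_init(&task->entry);
                  spin_unlock_bh(&ctx->lock);
                  /* ... transmit the task here; may block ... */
                  spin_lock_bh(&ctx->lock);
          }
          spin_unlock_bh(&ctx->lock);
  }

In this toy version it is the spinlock that orders the list accesses;
whether the real list_add+queue_work sequence needs an extra barrier
I'm less sure about (see below).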
However, I still have the feeling there is potential for an I/O stall;
this iscsi_check_cmdsn_window_closed() check simply doesn't feel safe.

> Hey, for the list_add+queue_work sequence do I need to add some sort
> of barrier?
>

Not sure; we might have to, as queue_work is potentially run on
another CPU.

>> In addition to that, iscsi_check_cmdsn_window_closed()
>> doesn't actually check whether the cmdsn window has been closed
>> (in the sense that we're not allowed to send any PDUs),
>> but rather whether a new PDU _to be queued_ would be rejected.
>> There might still be PDUs in the queue, waiting to be
>> transferred.
>>
>> So if we're hitting queuecommand hard enough we might run into this
>> scenario:
>>
>> - iscsi_data_xmit:
>>   transfers the last Data-Out PDU on the requeue list; all queues
>>   are empty.
>>   -> xmit_task()
>>   -> unlock session
>>
>> - queuecommand():
>>   called repeatedly until iscsi_check_cmdsn_window_closed() triggers,
>>   giving the xmit thread no chance to grab the session lock
>>   -> iscsi_conn_queue_work() does nothing, as the work is already
>>      pending
>
> Yeah, so I thought here, because iscsi_data_xmit is running, we hit
> this:
>
>   run_workqueue
>     work_clear_pending
>
>     // WORK_STRUCT_PENDING is now cleared, so queue_work will queue
>     // the work (queue_work returns 1).
>
>     f(work)
>

But we could still get stuck, e.g. if sendpage() returns a retryable
error other than -EAGAIN. If such a thing exists. Will have to check.

Anyway, I hope to have some more details tomorrow; I'm running
failover tests against a NetApp Filer.

Cheers,

Hannes
--
Dr. Hannes Reinecke                   zSeries & Storage
h...@suse.de                          +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
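PS: For reference, my reading of the window check, paraphrased from
memory rather than copied from libiscsi (so treat the helper name and
the exact fields as approximate): it tests the CmdSN a _new_ PDU would
get against MaxCmdSN, and never looks at the PDUs already sitting on
the xmit queues.

  #include <linux/types.h>
  #include <linux/errno.h>
  #include <scsi/libiscsi.h>

  /* RFC 1982 serial number arithmetic, n1 <= n2 modulo 2^32; one
   * common way to implement the comparison */
  static int sna_lte(u32 n1, u32 n2)
  {
          return n1 == n2 || (s32)(n1 - n2) < 0;
  }

  /* paraphrased sketch of iscsi_check_cmdsn_window_closed() */
  static int cmdsn_window_closed(struct iscsi_session *session)
  {
          /*
           * queued_cmdsn is (as I read the struct) the sequence
           * number the next queued PDU would be assigned; PDUs
           * already on the queues are not considered at all.
           */
          if (!sna_lte(session->queued_cmdsn, session->max_cmdsn))
                  return -ENOSPC;
          return 0;
  }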