On 11/5/2014 6:57 AM, Elliott, Robert (Server Storage) wrote:


> -----Original Message-----
> From: Sagi Grimberg [mailto:sa...@dev.mellanox.co.il]
> Sent: Tuesday, November 04, 2014 6:15 AM
> To: Bart Van Assche; Elliott, Robert (Server Storage); Christoph Hellwig
> Cc: Jens Axboe; Sagi Grimberg; Sebastian Parschauer; Ming Lei;
> linux-s...@vger.kernel.org; linux-rdma
> Subject: Re: [PATCH v2 12/12] IB/srp: Add multichannel support

> ...
>> I think that Rob and I are not talking about the same issue. If
>> only a single core is servicing interrupts, it is indeed expected
>> that it will spend 100% of its time in hard-irq; that's acceptable,
>> since it is being pounded with completions all the time.
>>
>> However, I'm referring to a condition where SRP spends an unbounded
>> amount of time servicing a single interrupt (a while loop around
>> ib_poll_cq that never drains), which leads to a hard lockup.
>>
>> This *can* happen, and I do believe that with an optimized IO path
>> it is even more likely to.

> If the IB completions/interrupts are only for IOs submitted on this
> CPU, then the CQ will eventually drain, because this CPU is not
> submitting anything new while stuck in the loop.

They're not (or not necessarily). I'm talking about the case where the
IOs being completed were submitted from another CPU. This creates a
cycle: the submitter keeps generating completions on CPU X while the
completer keeps freeing up room for more submissions on CPU Y. That
cycle can never end while the completer stays in hard-irq context.
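
To make that concrete, here is a minimal sketch of the problematic
pattern (not the actual ib_srp.c code; handle_completion() is a
hypothetical helper):

static void bad_comp_handler(struct ib_cq *cq, void *cq_context)
{
        struct ib_wc wc;

        /* Re-arm so the next completion raises another event. */
        ib_req_notify_cq(cq, IB_CQ_NEXT_COMP);

        /*
         * No budget: as long as some other CPU keeps posting new work,
         * the CQ never drains and this handler, running in hard-irq
         * context, never returns.
         */
        while (ib_poll_cq(cq, 1, &wc) > 0)
                handle_completion(&wc);         /* hypothetical helper */
}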


> This can become bursty, though - submit a lot of IOs, then be busy
> completing all of them while not submitting more, resulting in the
> queue depth bouncing from 0 to high to 0 to high.  I've seen that
> with both the hpsa and mpt3sas drivers.  The fio options
> iodepth_batch, iodepth_batch_complete, and iodepth_low can amplify
> or reduce that effect (using libaio).
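
For reference, a fio job along these lines (the device, block size, and
depths are made up) maximizes that bouncing with libaio:

[bursty]
filename=/dev/sdX
ioengine=libaio
direct=1
rw=randread
bs=4k
iodepth=64
; submit 64 IOs in one batch
iodepth_batch=64
; reap at least 64 events at a time
iodepth_batch_complete=64
; do not refill until the queue is almost drained
iodepth_low=1

The defaults (iodepth_batch=1, iodepth_low equal to iodepth) have the
opposite effect: fio keeps the queue topped up and the load stays
smooth.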


blk-iopoll (or some other form of completion budgeting) should take
care of that.
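
A minimal sketch of what that budgeting could look like with the
blk-iopoll API; the per-channel struct, the weight, and
handle_completion() are assumptions for illustration, not the actual
SRP patches:

#include <linux/blk-iopoll.h>
#include <rdma/ib_verbs.h>

#define MY_IOPOLL_WEIGHT        32      /* max completions per poll run */

/* Hypothetical per-channel context. */
struct my_ch {
        struct ib_cq            *cq;
        struct blk_iopoll       iopoll;
};

/* Softirq context: reap at most @budget completions, then yield. */
static int my_iopoll_fn(struct blk_iopoll *iop, int budget)
{
        struct my_ch *ch = container_of(iop, struct my_ch, iopoll);
        struct ib_wc wc;
        int done = 0;

        while (done < budget && ib_poll_cq(ch->cq, 1, &wc) > 0) {
                handle_completion(&wc);         /* hypothetical helper */
                done++;
        }

        if (done < budget) {
                /*
                 * CQ drained: stop polling and re-arm the interrupt.
                 * Real code must re-check the CQ after re-arming to
                 * close the race with completions that arrive here.
                 */
                blk_iopoll_complete(iop);
                ib_req_notify_cq(ch->cq, IB_CQ_NEXT_COMP);
        }
        return done;
}

/* Hard-irq context: do no real work, just kick the softirq poller. */
static void my_comp_handler(struct ib_cq *cq, void *cq_context)
{
        struct my_ch *ch = cq_context;

        if (!blk_iopoll_sched_prep(&ch->iopoll))
                blk_iopoll_sched(&ch->iopoll);
}

/*
 * At channel setup:
 *      blk_iopoll_init(&ch->iopoll, MY_IOPOLL_WEIGHT, my_iopoll_fn);
 *      blk_iopoll_enable(&ch->iopoll);
 */

If the poller keeps exhausting its budget, the softirq machinery can
defer further processing (eventually to ksoftirqd) instead of starving
the CPU in hard-irq.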

Sagi.