Possible issue in commit 659743b02c41 ("libiscsi: Reduce locking contention in fast path")

2015-11-05 Thread Guilherme G. Piccoli

Hello Shlomo and Or,

I'm Guilherme Piccoli from LTC/IBM. First of all, sorry to bother you.


We are running some tests with iSCSI and we found an issue that is
possibly caused by commit 659743b02c41 ("libiscsi: Reduce locking
contention in fast path").


After about an hour of testing against a hardware target (using the
fio benchmark tool), we got a kernel oops; the following link is a
pastebin of the error message (we got many of these messages, since our
system has multiple cores): http://codepad.org/KS2C9Jjt


With some debugging, we found the exact point of the crash: a NULL
pointer read, with sc == NULL when evaluating sc->device->lun at
libiscsi.c:369. But as you can see in the error messages, a list issue
may be what leads to this NULL pointer situation.
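
As an illustration of the failure mode only, here is a minimal user-space
sketch of the dereference chain. It is not the libiscsi code: the
demo_device, demo_cmnd and demo_task structs and the demo_task_lun()
helper are hypothetical stand-ins, and the guard is shown as a way to
observe the condition, not as a proposed fix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's struct definitions. */
struct demo_device { unsigned long lun; };
struct demo_cmnd   { struct demo_device *device; };
struct demo_task   { struct demo_cmnd *sc; };

/*
 * Store the LUN and return 0, or return -1 if the task has no command
 * (or no device) attached. Without the guard, task->sc->device->lun
 * would read through a NULL sc, which is the pattern described above.
 */
int demo_task_lun(const struct demo_task *task, unsigned long *lun)
{
    assert(task != NULL && lun != NULL);  /* caller contract for this sketch */

    if (task->sc == NULL || task->sc->device == NULL)
        return -1;

    *lun = task->sc->device->lun;
    return 0;
}
```

A guard like this would only hide the symptom; if the task's sc is being
cleared by another path while the task is still in use, the underlying
ordering or locking problem remains.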


After reverting the aforementioned commit, the issue is gone and we can
run the benchmark many times without a single failure. The issue is hard
to reproduce; we were only able to reproduce it in a high-bandwidth
environment (10Gb network) with our hardware target (IBM FlashSystem
840). Note that on the initiator side we're using software iSCSI
(iscsi_tcp/libiscsi_tcp).



We'd really appreciate it if you could give us some direction to help us
figure out what's going on: what path might have been taken to reach
that NULL pointer read? It's hard to debug since I'm no expert in iSCSI,
so any clues or suggestions you can provide would be very helpful.


If you need any additional information, please let me know and I'll be
glad to provide it. Again, sorry to bother you.


Thanks in advance,



Guilherme

--
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to open-iscsi+unsubscr...@googlegroups.com.
To post to this group, send email to open-iscsi@googlegroups.com.
Visit this group at http://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.


Antw: Re: Possible issue in commit 659743b02c41 ("libiscsi: Reduce locking contention in fast path")

2015-11-05 Thread Ulrich Windl
>>> Chris Leech wrote on 06.11.2015 at 01:56 in message
<20151106005608.ga18...@straylight.hirudinean.org>:

[...]
> Am I missing something, or is splitting a linked list across two locks a
> major failing of this change?
[...]

Could you explain your question again for those who are not deep in the code?
How do you lock, and what do you do between the locks?

Regards,
Ulrich
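
For readers not deep in the code, the general hazard behind the quoted
question can be sketched in user space. This is not libiscsi code and
makes no claim about what the commit actually does: demo_node, demo_list
and the demo_* functions are hypothetical, and the "two locks" appear
only as comments. The sketch replays, deterministically, the interleaving
that becomes possible when two paths modify one list while holding
different locks, so neither excludes the other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list; not the kernel's list_head. */
struct demo_node { int id; struct demo_node *next; };
struct demo_list { struct demo_node *head; };

/* Step 1 of a push: snapshot the current head. */
struct demo_node *demo_push_read(const struct demo_list *l)
{
    return l->head;
}

/* Step 2 of a push: link the node in front of the snapshot, publish it. */
void demo_push_write(struct demo_list *l, struct demo_node *n,
                     struct demo_node *snapshot)
{
    n->next = snapshot;
    l->head = n;
}

/* Count the nodes reachable from the head. */
size_t demo_len(const struct demo_list *l)
{
    size_t count = 0;
    for (const struct demo_node *p = l->head; p != NULL; p = p->next)
        count++;
    return count;
}

/*
 * Path A holds lock X, path B holds lock Y. The locks do not exclude
 * each other, so both snapshots can be taken before either write lands.
 */
size_t demo_racy_pushes(struct demo_list *l, struct demo_node *a,
                        struct demo_node *b)
{
    assert(a != b);
    struct demo_node *seen_by_a = demo_push_read(l);  /* A, under lock X */
    struct demo_node *seen_by_b = demo_push_read(l);  /* B, under lock Y */
    demo_push_write(l, a, seen_by_a);
    demo_push_write(l, b, seen_by_b);  /* overwrites A's push */
    return demo_len(l);
}

/* The same pushes when a single lock serializes them. */
size_t demo_serialized_pushes(struct demo_list *l, struct demo_node *a,
                              struct demo_node *b)
{
    assert(a != b);
    demo_push_write(l, a, demo_push_read(l));
    demo_push_write(l, b, demo_push_read(l));
    return demo_len(l);
}
```

Starting from an empty list, the racy replay leaves only node b reachable
while node a is silently dropped; the serialized version keeps both. A
lost or half-linked entry of this kind is one way a later consumer could
end up reading a NULL or stale pointer.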

