Vladislav Bolkhovitin, on 09/10/2011 05:44 PM wrote:
> Vladislav Bolkhovitin, on 09/08/2011 09:55 PM wrote:
>> Mike Christie, on 09/02/2011 12:15 PM wrote:
>>> On 09/01/2011 10:04 PM, Vladislav Bolkhovitin wrote:
>>>> Hi,
>>>>
>>>> I've done some tests and looks like open-iscsi doesn't support full
>>>> duplex speed on bidirectional data transfers from a single drive.
>>>>
>>>> My test is simple: 2 dd's doing big transfers in parallel over a 1 GbE
>>>> link from a ramdisk or nullio iSCSI device. One dd is reading and the
>>>> other one is writing. I'm watching throughput using vmstat. When either
>>>> dd is working alone, I get full single-direction link utilization
>>>> (~120 MB/s) in both directions, but when both transfers work in
>>>> parallel, the throughput of each of them immediately drops by a factor
>>>> of 2 to 55-60 MB/s (the sum is the same 120 MB/s).
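
[The test boils down to something like the following; the device name and
sizes are only illustrative:

    # writer: push data to the iSCSI-imported disk
    dd if=/dev/zero of=/dev/sdb bs=1M count=10000 oflag=direct &
    # reader: pull data from the same disk at the same time
    dd if=/dev/sdb of=/dev/null bs=1M count=10000 iflag=direct &
    # watch per-direction throughput while both run
    vmstat 1
]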
>>>>
>>>> For sure, I tested the bidirectional capability of a single TCP
>>>> connection, and it does provide a nearly 2x throughput increase
>>>> (~200 MB/s).
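
[One way to reproduce the raw TCP check is a plain netcat pair pushing
/dev/zero in both directions over a single connection; flags vary between
netcat flavors, so this is only a sketch:

    # on the target box: accept a connection, stream zeros back,
    # discard whatever arrives
    nc -l -p 5000 < /dev/zero > /dev/null
    # on the initiator box: stream zeros out and discard what comes back,
    # all over the same TCP connection
    nc 192.168.1.10 5000 < /dev/zero > /dev/null
    # watch both directions via the NIC counters, e.g. /proc/net/dev
]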
>>>>
>>>> Interestingly, doing the other-direction transfer from the same device
>>>> imported from another iSCSI target provides the expected full-duplex 2x
>>>> aggregate throughput increase.
>>>>
>>>> I tried several iSCSI targets, and I'm pretty confident that iSCSI-SCST
>>>> is capable of providing full-duplex transfers, but from a quick look at
>>>> the open-iscsi code I can't see a serialization point in it. It looks
>>>> like open-iscsi receives and sends data in different threads (the
>>>> requesting process and the per-connection iscsi_q_X workqueue
>>>> respectively), so it should be capable of full duplex.
>>>
>>> Yeah, we send from the iscsi_q workqueue and receive from the network
>>> softirq if the net driver supports NAPI.
>>>>
>>>> Does anyone have an idea what the serialization point preventing
>>>> full-duplex speed could be?
>>>
>>> Did you do any lock profiling, and does the session->lock look like the
>>> problem? It is taken in both the receive and xmit paths and also in the
>>> queuecommand path.
>>
>> Just did it. /proc/lock_stat says that there is no significant contention
>> on session->lock.
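
[For the record, the lock contention check with CONFIG_LOCK_STAT enabled is
roughly:

    echo 0 > /proc/lock_stat               # clear old statistics
    echo 1 > /proc/sys/kernel/lock_stat    # start collecting
    # ... run the bidirectional dd test ...
    echo 0 > /proc/sys/kernel/lock_stat    # stop collecting
    grep session /proc/lock_stat           # look at the session->lock lines
]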
>>
>> On the other hand, session->lock is a spinlock, so if it were the
>> serialization point, we would see big CPU consumption on the initiator.
>> But we have plenty of CPU time to spare there.
>>
>> So, there must be some other serialization point.
> 
> Update: using sg_dd with blk_sgio=1 (SG_IO) instead of dd, I was able to
> achieve a bidirectional speed of 92 MB/s in each direction.
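
[With sg_dd the equivalent test looks something like this; blk_sgio=1 makes
sg_dd drive the block device through SG_IO ioctls instead of normal
read()/write(), and the sizes and device name are only examples:

    # reader and writer over SG_IO, in parallel
    sg_dd if=/dev/sdb of=/dev/null bs=512 bpt=2048 count=20000000 blk_sgio=1 &
    sg_dd if=/dev/zero of=/dev/sdb bs=512 bpt=2048 count=20000000 blk_sgio=1 &
]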
> 
> Thus, the iSCSI stack works well, as expected, and the serialization point
> must be somewhere higher in the block stack. Both buffered and direct dd
> demonstrate the same serialized behavior described above.

...even when the corresponding iSCSI device is formatted with ext4 and the
two dd's work on 2 _separate_ files.
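
I.e. something like this (mount point and sizes are just an example, and
file2 is assumed to have been written beforehand):

    mkdir -p /mnt/iscsi
    mkfs.ext4 /dev/sdb && mount /dev/sdb /mnt/iscsi
    # writer and reader on two separate files of the same filesystem
    dd if=/dev/zero of=/mnt/iscsi/file1 bs=1M count=5000 oflag=direct &
    dd if=/mnt/iscsi/file2 of=/dev/null bs=1M count=5000 iflag=direct &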

In other words, it appears that user-space applications not smart enough to
use the sg interface have no way to use the full-duplex capability of the
link; no tricks will give them double throughput.

I tried with Fibre Channel and see the same behavior, with the only
difference that if the same device from the same target is imported as
2 LUNs (i.e. as multipath), both of those LUNs can work bidirectionally.
With iSCSI you need to import the device from 2 separate iSCSI targets to
achieve that.
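
With open-iscsi that workaround looks roughly like the following (the portal
address and target names are made up for illustration), and each resulting
session then carries one direction of the traffic:

    iscsiadm -m discovery -t sendtargets -p 192.168.1.10
    iscsiadm -m node -T iqn.2011-09.example:disk-a -p 192.168.1.10 --login
    iscsiadm -m node -T iqn.2011-09.example:disk-b -p 192.168.1.10 --login
    # the same backing store now shows up as two SCSI disks,
    # e.g. /dev/sdb and /dev/sdc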

Vlad
