On 07/23/2009 07:01 PM, Mike Christie wrote:
> Boaz Harrosh wrote:
>>> I think I can replicate this problem now too. It was by accident. I am 
>>> using an EQL target remotely (I am in the middle of the US and the target 
>>> is on the west coast so there is a good deal of space between us and the 
>>> connection is slow) and I am seeing the problem where the network layer 
>>> is just not taking any more data so eventually something times out (if I 
>>> turn off nops then scsi command timer fires and if I also increase that 
>>> to 10 minutes then the EQL target will actually send me a nop and I 
>>> cannot send that because the network layer just keeps returning -EAGAIN). 
>>> Are you still seeing that problem? Basically sendpage/sendmsg just keeps 
>>> returning -EAGAIN. We even get woken up by iscsi_sw_tcp_write_space, but 
>>> the sk_wmem_queued and sk_sndbuf values are basically stuck and so no 
>>> space ever opens up for some reason (I attached the debug patch I am using).
>>>
>> In the past I've used tgt with open-iscsi over an internet connection
>> from Israel to the US and it did work: a simple mount of ext3 and some
>> reads and writes. But I've never put it under heavy load. Do you see this
>> problem only under heavy load, or will a single long dd cause it?
>>
> 
> Heavy load. The target has a window of 128 commands. I am setting the 
> shost->can_queue to 1024 and the scsi_device queue_depth to 256. I then 
> run multiple:
> 
> disktest -PT -T130 -h1 -K256 -B256k -ID /dev/sdb  -D 0:100 &
> disktest -PT -T130 -h1 -K256 -B256k -ID /dev/sdb  -D 0:100 &
> disktest -PT -T130 -h1 -K256 -B256k -ID /dev/sdb  -D 0:100 &
> 
> If I watch shost->host_busy and turn on debugging I can see the driver running 128 
> commands. With this the writes just die. sendpage/sendmsg just start 
> returning -EAGAIN. I turned off nops and set the scsi cmnd timer to 
> 1,200. And so we can go 1,200 seconds just getting -EAGAIN. The strange 
> thing is that the network is fine on the recv side. The target is 
> sending me a nop-in as a ping, and we are reading that in fine.
> 
> Also if I just do READ tests:
> 
> disktest -PT -T130 -h1 -K256 -B256k -ID /dev/sdb &
> disktest -PT -T130 -h1 -K256 -B256k -ID /dev/sdb &
> disktest -PT -T130 -h1 -K256 -B256k -ID /dev/sdb &
> 
> it works just fine. It is slow as heck. I get less than 1 MB/s 
> throughput, but we always make forward progress.
> 
> 
> If I just set the scsi_device queue_depth to 1 and run that write 
> workload it works just fine.
> 

What if you set scsi_device queue_depth to 126, or can_queue to 126,
with the CommandSN window left at 128?
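
To make the arithmetic behind that question concrete, here is a rough
sketch of which limit ends up capping the commands in flight. It assumes
one scsi_device and one session per host, and that the target's window is
MaxCmdSN - ExpCmdSN + 1 in 32-bit serial arithmetic as RFC 3720 describes.
The function names are made up for illustration and are not open-iscsi
code.

```python
# Illustrative only: helper names are hypothetical, not open-iscsi APIs.

def cmdsn_window(exp_cmdsn, max_cmdsn):
    """Commands the target currently accepts, using 32-bit wraparound.

    A result of 0 means the window is closed (MaxCmdSN == ExpCmdSN - 1).
    """
    return (max_cmdsn - exp_cmdsn + 1) % 2**32


def max_in_flight(can_queue, queue_depth, window):
    """Smallest of the host, device and session limits.

    Assumes a single device and a single session on the host.
    """
    return min(can_queue, queue_depth, window)


# Settings from the failing write test: the 128-command window is the cap.
print(max_in_flight(can_queue=1024, queue_depth=256, window=128))

# Either suggested change keeps the SCSI layer just under the window.
print(max_in_flight(can_queue=1024, queue_depth=126, window=128))
print(max_in_flight(can_queue=126, queue_depth=256, window=128))
```

With the original settings the window is the binding limit, which matches
the 128 running commands seen in the debug output. With either limit set
to 126, the SCSI layer becomes the cap and the window is never completely
filled.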

Boaz
