Matthew Kent wrote:
> On Mon, 2009-04-13 at 15:44 -0500, Mike Christie wrote:
>> Matthew Kent wrote:
>>> Can anyone suggest a timeout I might be hitting or a setting I'm
>>> missing?
>>>
>>> The run down:
>>>
>>> - EqualLogic target
>>> - CentOS 5.2 client
>> You will want to upgrade that to 5.3 when you can. The iscsi code in 
>> there fixes a bug where the initiator dropped the session when it should 
>> not.
>>
> 
> Will do, probably Wednesday night and we'll see if this goes away. I'll
> be sure to follow up for the archives.
> 
>>> - xfs > lvm > iscsi
>>>
>>> During a period of high load the EqualLogic decides to load balance:
>>>
>>>  INFO  4/13/09  12:08:29 AM  eql3    iSCSI session to target
>>> '20.20.20.31:3260,
>>> iqn.2001-05.com.equallogic:0-8a0906-b7f6d3801-2b2000d0f5347d9a-foo' from
>>> initiator '20.20.20.92:51274, iqn.1994-05.com.redhat:a62ba20db72' was
>>> closed.   Load balancing request was received on the array.  
>>
>> So is this what you get in the EQL log when it decides to load balance 
>> the initiator and send us to a different portal?
>>
> 
> Yes, a straight copy from event log in the java web interface.
> 
>>>  INFO  4/13/09  12:08:31 AM  eql3    iSCSI login to target
>>> '20.20.20.32:3260,
>>> iqn.2001-05.com.equallogic:0-8a0906-b7f6d3801-2b2000d0f5347d9a-foo' from
>>> initiator '20.20.20.92:44805, iqn.1994-05.com.redhat:a62ba20db72'
>>> successful, using standard frame length.  
>>>
>>> On the client I see:
>>>
>>> Apr 13 00:08:29 moo kernel: [4576850.161324] sd 5:0:0:0: SCSI error:
>>> return code = 0x00020000
>>>
>>> Apr 13 00:08:29 moo kernel: [4576850.161330] end_request: I/O error, dev
>>> sdc, sector 113287552
>>>
>>> Apr 13 00:08:32 moo kernel: [4576852.470879] I/O error in filesystem
>>> ("dm-10") meta-data dev dm-10 block 0x6c0a000
>> Are you using dm-multipath over iscsi? Does this load balance issue 
>> affect all the paths at the same time? What is your multipath 
>> no_path_retry value? I think you might want to set that higher to avoid 
>> the FS from getting IO errors at this time if all paths are affected at 
>> the same time.
>>
> 
> Not using multipath on this one.
> 

Do you have xfs on sdc or is there something like LVM or RAID on top of sdc?

That is really strange then. 0x00020000 is DID_BUS_BUSY. The iscsi 
initiator layer would return this when the target does its load 
balancing. The initiator does this to ask the scsi layer to retry the IO. 
If dm-multipath is used then the IO is failed to the multipath layer 
right away. If dm-multipath is not used then we get 5 retries, so we 
should not see the error if there was only the one rebalancing at the 
time. If there was a bunch of load rebalancing within a couple of 
minutes then it makes sense.
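
For anyone reading this in the archives, here is a rough sketch of how a 
return code like the one above splits into its bytes. The host byte sits 
in bits 16-23 of the result, and the DID_* values are the ones from the 
kernel's SCSI headers. The function names are made up for illustration 
and are not part of open-iscsi or any kernel interface.

```python
# Host-byte (DID_*) codes as defined in the Linux kernel's SCSI headers.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
    0x0A: "DID_PASSTHROUGH",
    0x0B: "DID_SOFT_ERROR",
    0x0C: "DID_IMM_RETRY",
    0x0D: "DID_REQUEUE",
}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result into its four bytes (hypothetical helper)."""
    return {
        "status_byte": result & 0xFF,          # SCSI status from the device
        "msg_byte": (result >> 8) & 0xFF,      # message byte
        "host_byte": (result >> 16) & 0xFF,    # set by the low-level driver
        "driver_byte": (result >> 24) & 0xFF,  # set by the mid-layer
    }


def host_byte_name(result):
    """Return the DID_* name for the host byte (hypothetical helper)."""
    host = (result >> 16) & 0xFF
    return HOST_BYTE_NAMES.get(host, "unknown (0x%02x)" % host)


if __name__ == "__main__":
    # The code from the "SCSI error: return code = 0x00020000" log line.
    print(host_byte_name(0x00020000))
```

With the value from the log, only the host byte is non-zero, which is why 
the error points at the transport/initiator side and not at a SCSI status 
returned by the disk itself.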
