Re: iscsid: Kernel reported iSCSI connection 1:0 error (1020) state (3)

Mike Christie Fri, 14 May 2010 09:44:47 -0700

On 05/14/2010 04:24 AM, 立凡 王 wrote:

Sorry, I did not answer the multiple initiators question.
The initiatorname of the previous machine is different from that of the
present machine, and I shut down the iscsi service on the previous machine.
I also logged in to the HP storage and removed the records of other hosts.
But the 1020 errors still occur.
They seem to be related to file size.
When the backup reaches video files or other big files, the errors
show up again.


Machine A shows the following:
27118/980330/980330-13.wmv
27118/980330/980330-14.wmv
27118/980330/980330-2.wmv
27118/980330/980330-3.wmv
27118/980330/980330-4.wmv
27118/980330/980330-5.wmv
27118/980330/980330-6.wmv
27118/980330/980330-7.wmv
27118/980330/980330-8.wmv
27118/980330/980330-9.wmv
27118/980413/980413-1.WMV
27118/980413/980413-10.WMV
27118/980413/980413-11.WMV
27118/980413/980413-12.WMV
27118/980413/980413-2.WMV
27118/980413/980413-3.WMV
27118/980413/980413-4.WMV
27118/980413/980413-5.WMV
27118/980413/980413-6.WMV
27118/980413/980413-7.WMV
27118/980413/980413-8.WMV
27118/980413/980413-9.WMV
27118/980420/980420-1.wmv
27118/980420/980420-10.wmv
27118/980420/980420-11.wmv
27118/980420/980420-12.wmv
27118/980420/980420-13.wmv
27118/980420/980420-2.wmv
27118/980420/980420-3.wmv
27118/980420/980420-4.wmv
27118/980420/980420-5.wmv
27118/980420/980420-6.wmv
27118/980420/980420-7.wmv
27118/980420/980420-8.wmv
27118/980420/980420-9.wmv

The log file of machine B for the same period:
http://people.chu.edu.tw/~b8902110/temp/messages23

Could you rerun that test, but before you start copying files, turn on iscsi eh debugging by doing:


echo 1 > /sys/module/libiscsi/parameters/debug_libiscsi_eh
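
If you want to script that step, here is a minimal sketch. The function names and the optional sysfs root argument are made up for illustration (the argument only exists so you can try the helper against a scratch directory). It refuses to write if the parameter file is not there, which I would expect if libiscsi is not loaded or the kernel does not have this parameter. The second helper counts the "aborting sc" lines discussed further down in this mail.

```shell
#!/bin/sh
# enable_eh_debug [SYSFS_ROOT]
# Turn on libiscsi error-handler debugging. SYSFS_ROOT is a hypothetical
# override for testing; it defaults to the real /sys.
enable_eh_debug() {
    param="${1:-/sys}/module/libiscsi/parameters/debug_libiscsi_eh"
    if [ ! -w "$param" ]; then
        echo "cannot write $param (module not loaded, no such parameter, or not root?)" >&2
        return 1
    fi
    echo 1 > "$param"
}

# count_aborts [LOGFILE]
# Print how many lines in LOGFILE contain "aborting sc".
# Defaults to /var/log/messages, the log named in this thread.
count_aborts() {
    grep -c 'aborting sc' "${1:-/var/log/messages}"
}
```

Run enable_eh_debug as root before starting the copy, then count_aborts after the 1020 error shows up again.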

I think you might be hitting a problem where you are sending too much IO, an IO takes too long, and the scsi eh runs and is waiting on the target, but the target is waiting on the initiator and then ends up dropping the session.


**If** with debug_libiscsi_eh on you see messages like

aborting sc 0x1234432

in /var/log/messages while running your test, then to avoid falling in there you need to:



1. Increase the scsi cmd timeout (N is in seconds).

echo N > /sys/block/sdX/device/timeout

(you can also create a udev rule to increase this. See section 8.1.2.1 in /usr/share/doc/iscsi-initiator-utils-6.2.0.870/README)

2. Reduce node.session.cmds_max and node.session.queue_depth. Do not just set it to half and report back. Try setting it really low, like set node.session.queue_depth to 1 or 2 or 4, and see if that changes anything. Let me know what worked and did not too btw.
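
The two steps above could be scripted roughly like this. It is only a sketch: the function names, the optional sysfs root argument, and the target/portal values in the usage line are placeholders, not anything from your setup. The second helper only prints the iscsiadm node-record updates so you can review them before running them. As far as I know, updated node settings are picked up the next time the session logs in, so log out and back in after changing them.

```shell
#!/bin/sh
# set_scsi_timeout DEV SECONDS [SYSFS_ROOT]
# Write a new SCSI command timeout for one disk, e.g. "sdb".
# SYSFS_ROOT is a hypothetical override for testing; default is /sys.
set_scsi_timeout() {
    f="${3:-/sys}/block/$1/device/timeout"
    if [ ! -w "$f" ]; then
        echo "cannot write $f" >&2
        return 1
    fi
    echo "$2" > "$f"
}

# print_queue_cmds TARGET PORTAL QUEUE_DEPTH CMDS_MAX
# Print (do not run) the iscsiadm commands that update the node record.
# No range checking is done on the values; that is left to iscsiadm.
print_queue_cmds() {
    base="iscsiadm -m node -T $1 -p $2 -o update"
    echo "$base -n node.session.queue_depth -v $3"
    echo "$base -n node.session.cmds_max -v $4"
}
```

For example, print_queue_cmds iqn.2010-05.example:t1 192.0.2.10:3260 4 8 prints the two update commands for a queue depth of 4 and cmds_max of 8.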





Could you also tell me what version of the kernel you are using?

