> 
> From the iSCSI Target side, posting relevant pieces of
> /var/adm/messages, during the times when timeouts are reported.
> 
On the iSCSI Target there is nothing at all in /var/adm/messages.

> From the iSCSI Initiator side, please describe what type of systems,
> software, configuration and errors you are seeing.
> 
It is a Virtual Iron environment. As far as I can tell it is built on a SLES64 
kernel and runs the Xen hypervisor. VI correctly discovers the target and 
establishes a session to Solaris. 
It then successfully creates an LVM physical volume and volume group, but when 
I try to clone a virtual disk to the new storage, problems appear. Here is a 
snippet from the log:
 
2008-10-21 19:13:02,175 [de (10.2.2.246)] DEBUG  - <28>Oct 21 19:14:02 iscsid: 
Nop-out timedout after 15 seconds on connection 5:0 state (3). Dropping session.
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <6>Oct 21 19:14:09 kernel: 
sd 9:0:0:0: SCSI error: return code = 0x00020000
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
end_request: I/O error, dev sdb, sector 548376
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
device-mapper: dm-multipath: Failing path 8:16.
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
printk: 1637 messages suppressed.
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68500
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68501
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68502
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68503
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68504
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68505
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68506
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68507
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,175 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68508
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <3>Oct 21 19:14:09 kernel: 
Buffer I/O error on device dm-27, logical block 68509
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
lost page write due to I/O error on dm-27
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <6>Oct 21 19:14:09 kernel: 
sd 9:0:0:0: SCSI error: return code = 0x00020000
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
end_request: I/O error, dev sdb, sector 549400
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <6>Oct 21 19:14:09 kernel: 
sd 9:0:0:0: SCSI error: return code = 0x00020000
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
end_request: I/O error, dev sdb, sector 550424
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <6>Oct 21 19:14:09 kernel: 
sd 9:0:0:0: SCSI error: return code = 0x00020000
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
end_request: I/O error, dev sdb, sector 551448
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <6>Oct 21 19:14:09 kernel: 
sd 9:0:0:0: SCSI error: return code = 0x00020000
2008-10-21 19:13:07,176 [de (10.2.2.246)] DEBUG  - <4>Oct 21 19:14:09 kernel: 
end_request: I/O error, dev sdb, sector 551456
2008-10-21 19:13:12,179 [de (10.2.2.246)] DEBUG  - <28>Oct 21 19:14:12 iscsid: 
connection5:0 is operational after recovery (2 attempts)
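
For reference, the return code 0x00020000 in the log above can be split into its 
component bytes. This is a minimal sketch, assuming the usual Linux layout of the 
SCSI result word (status, message, host and driver bytes, lowest byte first) and 
the host byte names from the kernel's SCSI headers; the function name 
decode_scsi_result is made up for illustration, and the name table should be 
checked against the kernel actually in use.

```python
# Assumed layout of the Linux SCSI result word:
#   bits 0-7 status, 8-15 message, 16-23 host, 24-31 driver.
# Host byte names are taken from the Linux SCSI headers (partial list);
# verify them against the kernel version on the initiator.
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result word into its four byte fields."""
    return {
        "status": result & 0xFF,
        "msg": (result >> 8) & 0xFF,
        "host": (result >> 16) & 0xFF,
        "driver": (result >> 24) & 0xFF,
    }


fields = decode_scsi_result(0x00020000)
print(fields, HOST_BYTE_NAMES.get(fields["host"], "unknown"))
```

Under those assumptions only the host byte is non-zero (0x02), so the error was 
raised by the initiator-side transport layer, not returned as a SCSI status 
from the target.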

DTrace iscsisnoop.d didn't show anything strange either; here is the part from 
when the NOP timeout occurred:
  3  172.16.33.15               data-receive   131072 1610612754  -
  3  172.16.33.15               scsi-response       0 1610612752  -
  3  172.16.33.15               data-receive   131072 1610612756  -
  3  172.16.33.15               scsi-response       0 1610612754  -
  3  172.16.33.15               data-receive   131072 1610612757  -
  3  172.16.33.15               scsi-response       0 1610612756  -
  3  172.16.33.15               data-receive   131072 1610612764  -
  3  172.16.33.15               scsi-response       0 1610612757  -
  3  172.16.33.15               data-receive   131072 1610612763  -
  3  172.16.33.15               scsi-response       0 1610612764  -
  3  172.16.33.15               data-receive   131072 1610612769  -
  3  172.16.33.15               scsi-response       0 1610612763  -
  3  172.16.33.15               data-receive   131072 1610612786  -
  0  172.16.33.15               scsi-response       0 1610612769  -
  0  172.16.33.15               scsi-response       0 1610612786  -
  3  172.16.33.15               scsi-command   131072 1610612773  write(10)
  2  172.16.33.15               login-command     493 1610615296  -
  2  172.16.33.15               login-response    368 1610615296  -
  2  172.16.33.15               nop-receive         0 1879050765  -
...
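
One way to read a trace like this is to pair each scsi-command with a 
scsi-response carrying the same task tag. The following is a rough sketch, 
assuming the columns are CPU, remote IP, event, bytes, ITT and SCSI op (the 
column meanings are an assumption, and unanswered_commands is a hypothetical 
helper name). The excerpt is partial, so a missing response may simply lie 
outside the captured window.

```python
# Last lines of the trace excerpt, copied from above.
TRACE = """\
  3  172.16.33.15               data-receive   131072 1610612769  -
  3  172.16.33.15               scsi-response       0 1610612763  -
  3  172.16.33.15               data-receive   131072 1610612786  -
  0  172.16.33.15               scsi-response       0 1610612769  -
  0  172.16.33.15               scsi-response       0 1610612786  -
  3  172.16.33.15               scsi-command   131072 1610612773  write(10)
  2  172.16.33.15               login-command     493 1610615296  -
  2  172.16.33.15               login-response    368 1610615296  -
  2  172.16.33.15               nop-receive         0 1879050765  -
"""


def unanswered_commands(trace):
    """Return {ITT: op} for scsi-command events with no scsi-response."""
    commands, responses = {}, set()
    for line in trace.splitlines():
        parts = line.split()
        if len(parts) < 6:
            continue  # skip "..." and other non-event lines
        # Assumed column order: cpu, ip, event, bytes, itt, op
        event, itt, op = parts[2], parts[4], parts[5]
        if event == "scsi-command":
            commands[itt] = op
        elif event == "scsi-response":
            responses.add(itt)
    return {itt: op for itt, op in commands.items() if itt not in responses}


print(unanswered_commands(TRACE))
```

On this excerpt the only command left without a response is the write(10) with 
ITT 1610612773, issued just before the new login.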

Any ideas what could be wrong and where to look further?

Thanks,
Eugene
--
This message posted from opensolaris.org
