[zfs-discuss] Problem: Disconnected command timeout for Target X

2012-07-17 Thread Roberto Scudeller
Hi all,

I'm running OpenSolaris snv_134 on a Supermicro motherboard with LSI
controllers and 20 SATA disks, with ZFS in a RAID-10 configuration. This
zfs_storage is mounted over NFS.
I'm not an OpenSolaris specialist. Which commands show hardware
information, like 'lshw' on Linux but for OpenSolaris?

The storage stopped working. It still responds to ping, but SSH and NFS are
down. When I open the console, it shows these messages:

Jul  2 13:00:27 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:00:27 storage    Disconnected command timeout for Target 4
Jul  2 13:01:28 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:01:28 storage    Disconnected command timeout for Target 3
Jul  2 13:02:28 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:02:28 storage    Disconnected command timeout for Target 2
Jul  2 13:03:29 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:03:29 storage    Disconnected command timeout for Target 1
Jul  2 13:04:29 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:04:29 storage    Disconnected command timeout for Target 0
Jul  2 13:05:40 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:05:40 storage    Disconnected command timeout for Target 6
Jul  2 13:06:40 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:06:40 storage    Disconnected command timeout for Target 5

Any ideas? Could anyone help me?

-- 
Roberto Scudeller
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Problem: Disconnected command timeout for Target X

2012-07-17 Thread Bob Friesenhahn

On Tue, 17 Jul 2012, Roberto Scudeller wrote:


> Hi all,
>
> I'm running OpenSolaris snv_134 on a Supermicro motherboard with LSI
> controllers and 20 SATA disks, with ZFS in a RAID-10 configuration. This
> zfs_storage is mounted over NFS.
> I'm not an OpenSolaris specialist. Which commands show hardware
> information, like 'lshw' on Linux but for OpenSolaris?


cfgadm, prtconf, prtpicl, prtdiag

zpool status

fmadm faulty

It sounds like you may have a broken cable, or a power supply failure
affecting some of the disks.


Bob



--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Problem: Disconnected command timeout for Target X

2012-07-17 Thread Roberto Scudeller
Hi Bob,

Thanks for the answers.

How do I test your theory?

In this case I'm using ordinary SATA 2 disks, not nearline SAS (NL-SAS) or SAS.
Do you think the SATA disks are the problem?

Cheers,


-- 
Roberto Scudeller


Re: [zfs-discuss] Problem: Disconnected command timeout for Target X

2012-07-17 Thread Bob Friesenhahn

On Tue, 17 Jul 2012, Roberto Scudeller wrote:


> Hi Bob,
>
> Thanks for the answers.
>
> How do I test your theory?


I would use 'dd' to see if it is possible to transfer data from one of 
the problem devices.  Gain physical access to the system and check the 
signal and power cables to these devices closely.


Use 'iostat -xe' to see what error counts have accumulated.  Also 
'iostat -E'.
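
For anyone who would rather script the read test, here is a minimal Python sketch of what the 'dd' check amounts to: read a fixed amount of data sequentially from a device and report whether an I/O error comes back. The function name read_test, the chunk size and count, and the example device path are assumptions made for illustration and do not come from this thread. It also assumes a Python 3 interpreter is available on the host, which may not be the case on a stock snv_134 install. Plain 'dd' reading from the raw device into /dev/null does the same job.

```python
import time


def read_test(path, chunk_size=1024 * 1024, max_chunks=100):
    """Sequentially read up to max_chunks chunks from path.

    Returns (bytes_read, elapsed_seconds, error), where error is None
    on success or the OSError raised by the failing open/read.
    """
    bytes_read = 0
    start = time.monotonic()
    try:
        # buffering=0 gives unbuffered reads, so each read() call goes
        # to the device rather than to a Python-side buffer.
        with open(path, "rb", buffering=0) as dev:
            for _ in range(max_chunks):
                data = dev.read(chunk_size)
                if not data:  # end of file or device
                    break
                bytes_read += len(data)
    except OSError as exc:
        return bytes_read, time.monotonic() - start, exc
    return bytes_read, time.monotonic() - start, None


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python3 read_test.py /dev/rdsk/c0t4d0s0
    for device in sys.argv[1:]:
        count, seconds, error = read_test(device)
        status = "OK" if error is None else "FAILED: %s" % error
        print("%s: %d bytes in %.1fs %s" % (device, count, seconds, status))
```

A device that returns an error here, or reads far more slowly than its neighbours, is a candidate for the cable and power checks described above. Note that a read against a hung device may simply block, so this sketch does not replace watching the console for new timeout messages.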



> In this case I'm using ordinary SATA 2 disks, not nearline SAS (NL-SAS) or SAS.
> Do you think the SATA disks are the problem?


There have been reports of congestion leading to timeouts and resets 
when SATA disks are on expanders.  There have also been reports that 
one failing disk can cause problems when on expanders.  Regardless, if 
this system has been previously operating fine for some time, these 
errors would indicate a change in the hardware shared by all these 
devices.
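
The pattern behind that last point can be read straight out of the console messages in the original post. The following is a rough Python sketch, not a diagnostic tool: it lists which controller instance the warnings name, which targets timed out and in what order, and how many seconds separate consecutive timeouts. The function name summarize and the regular expressions are assumptions made for illustration. The log text is copied from the original post with the wrapped path lines rejoined.

```python
import re
from datetime import datetime

# Console messages copied from the original post (wrapped lines rejoined).
CONSOLE_LOG = """\
Jul  2 13:00:27 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:00:27 storage    Disconnected command timeout for Target 4
Jul  2 13:01:28 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:01:28 storage    Disconnected command timeout for Target 3
Jul  2 13:02:28 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:02:28 storage    Disconnected command timeout for Target 2
Jul  2 13:03:29 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:03:29 storage    Disconnected command timeout for Target 1
Jul  2 13:04:29 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:04:29 storage    Disconnected command timeout for Target 0
Jul  2 13:05:40 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:05:40 storage    Disconnected command timeout for Target 6
Jul  2 13:06:40 storage scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340a@3/pci1000,3140@0 (mpt2):
Jul  2 13:06:40 storage    Disconnected command timeout for Target 5
"""

# Driver instance in parentheses at the end of a WARNING line, e.g. "(mpt2):".
CONTROLLER_RE = re.compile(r"\((\w+)\):\s*$")
# Time of day and target number on a timeout line.
TIMEOUT_RE = re.compile(
    r"(\d\d:\d\d:\d\d).*Disconnected command timeout for Target (\d+)"
)


def summarize(log_text):
    """Return (controllers, targets, gaps) for the timeout messages.

    controllers: set of driver instances named in the WARNING lines
    targets:     target numbers in the order they timed out
    gaps:        seconds between consecutive timeout messages
    Only the time of day is compared, so all messages are assumed to
    fall on the same day, as they do in this log.
    """
    controllers = set()
    events = []  # (time of day, target number)
    for line in log_text.splitlines():
        ctrl = CONTROLLER_RE.search(line)
        if ctrl:
            controllers.add(ctrl.group(1))
        hit = TIMEOUT_RE.search(line)
        if hit:
            when = datetime.strptime(hit.group(1), "%H:%M:%S")
            events.append((when, int(hit.group(2))))
    targets = [target for _, target in events]
    gaps = [
        int((later - earlier).total_seconds())
        for (earlier, _), (later, _) in zip(events, events[1:])
    ]
    return controllers, targets, gaps


controllers, targets, gaps = summarize(CONSOLE_LOG)
print("controllers:", sorted(controllers))
print("targets in order:", targets)
print("seconds between timeouts:", gaps)
```

On the messages from this thread it reports one controller instance (mpt2), seven different targets (4, 3, 2, 1, 0, 6, 5), and gaps of 60 to 71 seconds between timeouts. Seven targets failing one after another behind the same controller fits the reading that something they share has changed, rather than seven disks failing independently.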


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/