Re: Linux 2.6.24 sata_promise SATA300TX4 problems

2008-01-27 Thread Peter Favrholdt

Hi Mikael,

Thanks!

It works perfectly at 1.5Gbps :-)

I think that because it fails at 3Gbps it might eventually fail at 
1.5Gbps also... And the error handling was not robust (in my setup with 
these drives at 3Gbps).


If I can help investigate further, I'm happy to do that. I have a spare 
controller card so I could try swapping them.  Do you have any 
suggestions for what I should try next?


Best regards,

Peter

Mikael Pettersson wrote:

Peter Favrholdt writes:
  If it is not too much of a hassle, could you please make a 1.5Gbps patch 
  for 2.6.24 for me to try out? If it solves the problem (without me ever 
  touching the cables) we know for sure it is speed-related and not due to 
  kernel version.


No problem. I had intended to drop that patch after 2.6.24-rc8 as it
ought to be obsolete, but then again it might not be. It's available here:
http://user.it.uu.se/~mikpe/linux/patches/2.6/patch-sata_promise-limit-sataii-to-1.5Gbps-2.6.24


-
To unsubscribe from this list: send the line unsubscribe linux-ide in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Linux 2.6.24 sata_promise SATA300TX4 problems

2008-01-26 Thread Mikael Pettersson
Peter Favrholdt writes:
  Hi Mikael & list,
  
  I have previously reported problems with my setup:
  
  SATA300TX4 + 4 Seagate Barracuda ES 500GB
  
  I just tested with 2.6.24. After copying approx 25GB of each drive using
    dd if=/dev/sd[abcd] of=/dev/null bs=1M
  sda failed with the following message:
  
  [ 1060.069489] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns }
  [ 1060.069498] ata1.00: cmd 25/00:00:90:2c:e6/00:02:01:00:00/e0 tag 0 dma 262144 in
  [ 1060.069501]  res 40/00:28:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
  
  I have included lspci and dmesg output below.
  
  My system is rock solid using 2.6.21-rc2 with Mikael Pettersson's 1.5Gbps
  patch.
...
  [ 1060.069478] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x138 action 0x2 frozen
  [ 1060.069489] ata1: SError: { 10B8B Dispar BadCRC TrStaTrns }
  [ 1060.069498] ata1.00: cmd 25/00:00:90:2c:e6/00:02:01:00:00/e0 tag 0 dma 262144 in
  [ 1060.069501]  res 40/00:28:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
  [ 1060.069505] ata1.00: status: { DRDY }
  [ 1065.437567] ata1: port is slow to respond, please be patient (Status 0xff)
  [ 1070.114210] ata1: device not ready (errno=-16), forcing hardreset
  [ 1070.114219] ata1: hard resetting link
  [ 1076.320932] ata1: port is slow to respond, please be patient (Status 0xff)
  [ 1080.158924] ata1: COMRESET failed (errno=-16)
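
For readers following the log, the "cmd" field can be unpacked with a small
sketch like the one below. It assumes the usual libata layout of that field
(command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device)
and 512-byte sectors; the function name parse_taskfile is made up for this
illustration and is not part of any kernel or library API.

```python
def parse_taskfile(cmd_field):
    """Decode a libata-style 'cmd' field (48-bit LBA layout assumed).

    Assumed layout:
      command/feature:nsect:lbal:lbam:lbah/
      hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
    """
    groups = cmd_field.split("/")
    if len(groups) != 4:
        raise ValueError("expected four '/'-separated groups")

    command = int(groups[0], 16)
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in groups[1].split(":"))
    hob = [int(b, 16) for b in groups[2].split(":")]
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = hob
    device = int(groups[3], 16)

    # The "hob" (high order byte) registers carry the upper halves.
    # A sector count of 0 has a special meaning in ATA and is not handled here.
    sectors = (hob_nsect << 8) | nsect
    lba = ((hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
           | (lbah << 16) | (lbam << 8) | lbal)

    return {
        "command": command,
        "device": device,
        "sectors": sectors,
        "bytes": sectors * 512,  # assumes 512-byte logical sectors
        "lba": lba,
    }


info = parse_taskfile("25/00:00:90:2c:e6/00:02:01:00:00/e0")
print(hex(info["command"]), info["sectors"], info["bytes"], hex(info["lba"]))
```

Under these assumptions the failed command is opcode 0x25 (READ DMA EXT)
for 512 sectors, i.e. 262144 bytes, which agrees with the "dma 262144 in"
part of the same log line.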

Mysterious. What you have there is a transmission error between the
controller and the disk, which is bad in and of itself, but then there's
a sequence of COMRESETs that fail to bring the port or disk back to life.
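
To make the "transmission error" reading concrete, here is a minimal sketch
that maps SError register bits to the short names libata prints. The bit
positions follow the SATA SError register layout as I understand it and the
names are the ones I recall from libata, so treat both as assumptions to be
checked against the kernel source. The value 0x01380000 is also an
assumption: the quoted log prints SErr as 0x138, but the four flags it lists
correspond to bits 19, 20, 21 and 24.

```python
# Assumed mapping of SError bit positions to libata's short flag names.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",      # 10b-to-8b decode error on the link
    20: "Dispar",     # running disparity error
    21: "BadCRC",     # CRC error on a received frame
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",  # transport state transition error
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all known flags set in an SError value."""
    return [name for bit, name in sorted(SERROR_BITS.items())
            if value & (1 << bit)]


# Hypothetical full register value matching the flags shown in the log.
print(decode_serror(0x01380000))
```

All four flags in this case sit in the link and transport layers, which
fits a problem on the wire rather than in the command logic.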

The original error is not a driver error but something caused by your
system, be it a dodgy cable, a poorly seated cable, or electrical
interference. But the failed COMRESETs are a concern as I've seen them
in other reports as well.

Me worried ...

So going back to 2.6.21-rc2 makes the system stable again? Can you do some
more testing to see at what point the system becomes less stable? I.e.,
2.6.21-rcI, 2.6.22, 2.6.22-rcJ, 2.6.23, or 2.6.24-rcJ?

FWIW, I just completed some testing of a 300 TX4 card with kernel 2.6.24,
including dd:s, fscks, mkfs:s, and copying about 400GB of data from one drive
(Samsung) to another (Seagate 7200.10) on that card, and I cannot seem to
break it.


Re: Linux 2.6.24 sata_promise SATA300TX4 problems

2008-01-26 Thread Peter Favrholdt

Hi Mikael,

Thanks for your reply :-)

Mikael Pettersson wrote:

  Mysterious. What you have there is a transmission error between the
  controller and the disk, which is bad in and of itself, but then there's
  a sequence of COMRESETs that fail to bring the port or disk back to life.

  The original error is not a driver error but something caused by your
  system, be it a dodgy cable, a poorly seated cable, or electrical
  interference. But the failed COMRESETs are a concern as I've seen them
  in other reports as well.


Maybe I should try switching cables (again). Or it could be a 
motherboard issue (NFORCE2)?



  Me worried ...

  So going back to 2.6.21-rc2 makes the system stable again? Can you do some
  more testing to see at what point the system becomes less stable? I.e.,
  2.6.21-rcI, 2.6.22, 2.6.22-rcJ, 2.6.23, or 2.6.24-rcJ?


I believe the important part is your 1.5Gbps patch which I applied to
2.6.21-rc2. Maybe the system is stable because the transmission error
does not show up at that speed, in which case it has nothing to do
with the kernel version. I'm quite sure the problem is there using
2.6.21-rc2 at 3Gbps.



  FWIW, I just completed some testing of a 300 TX4 card with kernel 2.6.24,
  including dd:s, fscks, mkfs:s, and copying about 400GB of data from one
  drive (Samsung) to another (Seagate 7200.10) on that card, and I cannot
  seem to break it.


I believe it only happens if I stress all four drives simultaneously. So
maybe the transmission error is somehow related to the overall stress of
the PCI bus/card/chip/whatever?
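
A rough way to reproduce "all four drives at once" without juggling four dd
processes is sketched below. It is only an illustration: the function names
read_device and read_all are invented here, the device paths in the usage
note are the ones from this thread, and the 1 MiB chunk size mirrors the
bs=1M used with dd.

```python
import threading


def read_device(path, chunk_size, totals, errors):
    """Read one device or file sequentially until EOF or the first error."""
    total = 0
    try:
        with open(path, "rb", buffering=0) as dev:
            while True:
                chunk = dev.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
    except OSError as exc:
        # Record the failure instead of letting the thread die silently.
        errors[path] = str(exc)
    totals[path] = total


def read_all(paths, chunk_size=1024 * 1024):
    """Read all paths concurrently, one thread per path.

    Returns (totals, errors): bytes read per path, and error text per
    path for any reader that hit an I/O error.
    """
    totals, errors = {}, {}
    threads = [
        threading.Thread(target=read_device,
                         args=(path, chunk_size, totals, errors))
        for path in paths
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals, errors
```

Run as root with something like
read_all(["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]) and compare
against reading the same drives one at a time; if only the concurrent run
produces errors, that would support the "overall stress" theory.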


If it is not too much of a hassle, could you please make a 1.5Gbps patch 
for 2.6.24 for me to try out? If it solves the problem (without me ever 
touching the cables) we know for sure it is speed-related and not due to 
kernel version.


Still strange that the COMRESETs do not help though (but maybe it is
the drive that locks up?) :-/


Best regards,

Peter


Re: Linux 2.6.24 sata_promise SATA300TX4 problems

2008-01-26 Thread Mikael Pettersson
Peter Favrholdt writes:
  Hi Mikael,
  
  Thanks for your reply :-)
  
  Mikael Pettersson wrote:
   Mysterious. What you have there is a transmission error between the
   controller and the disk, which is bad in and of itself, but then there's
   a sequence of COMRESETs that fail to bring the port or disk back to life.
   
   The original error is not a driver error but something caused by your
   system, be it a dodgy cable, a poorly seated cable, or electrical
   interference. But the failed COMRESETs are a concern as I've seen them
   in other reports as well.
  
  Maybe I should try switching cables (again). Or it could be a 
  motherboard issue (NFORCE2)?
  
   Me worried ...
   
   So going back to 2.6.21-rc2 makes the system stable again? Can you do some
   more testing to see at what point the system becomes less stable? I.e.,
   2.6.21-rcI, 2.6.22, 2.6.22-rcJ, 2.6.23, or 2.6.24-rcJ?
  
  I believe the important part is your 1.5Gbps patch which I applied to
  2.6.21-rc2. Maybe the system is stable because the transmission error
  does not show up at that speed, in which case it has nothing to do
  with the kernel version. I'm quite sure the problem is there using
  2.6.21-rc2 at 3Gbps.
  
   FWIW, I just completed some testing of a 300 TX4 card with kernel 2.6.24,
   including dd:s, fscks, mkfs:s, and copying about 400GB of data from one
   drive (Samsung) to another (Seagate 7200.10) on that card, and I cannot
   seem to break it.
  
  I believe it only happens if I stress all four drives simultaneously. So
  maybe the transmission error is somehow related to the overall stress of 
  the PCI bus/card/chip/whatever?
  
  If it is not too much of a hassle, could you please make a 1.5Gbps patch 
  for 2.6.24 for me to try out? If it solves the problem (without me ever 
  touching the cables) we know for sure it is speed-related and not due to 
  kernel version.

No problem. I had intended to drop that patch after 2.6.24-rc8 as it
ought to be obsolete, but then again it might not be. It's available here:
http://user.it.uu.se/~mikpe/linux/patches/2.6/patch-sata_promise-limit-sataii-to-1.5Gbps-2.6.24