Re: SATA problems

2007-09-13 Thread Tejun Heo
Andrew Morton wrote:
> On Thu, 30 Aug 2007 09:24:18 + Nigel Kukard <[EMAIL PROTECTED]> wrote:
> 
>> Hrmmm,
>>
>>>>  > 
>>>>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>>>>  >  > > 0x0001c807
>>>>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>>>>  >  > > 0x0001c807
>>>>  > 
>>>>  > Unrelated to the other error, but I've been meaning to ask for a while..
>>>>  > If this is 'abnormal', why does every SATA box I've seen do it?
>>>>
>>>> *crickets*

It's removed (finally).  :-)

-- 
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: SATA problems

2007-09-10 Thread Andrew Morton
On Thu, 30 Aug 2007 09:24:18 + Nigel Kukard <[EMAIL PROTECTED]> wrote:

> Hrmmm,
> 
> >>  > 
> >>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
> >>  >  > > 0x0001c807
> >>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
> >>  >  > > 0x0001c807
> >>  > 
> >>  > Unrelated to the other error, but I've been meaning to ask for a while..
> >>  > If this is 'abnormal', why does every SATA box I've seen do it?
> >>
> >> *crickets*

chirp, chirp.

> >> Should we check for this case explicitly, and not print this?
> >>
> >>   
> >> 
> > After I get the above errors, my entire SATA bus crashes and I need to
> > hard reset the box ... not sure we can just ignore the errors?
> >
> >   
> 
> Appears even with the patch provided a few months ago I'm getting
> freezes. Replaced the HDD & all cables, same errors ... especially
> whilst doing heavy IO.
> 
> Can anyone shed some light?
> 

I think I was told last week that copying the appropriate mailing list will
at least prevent chirping, so let's try that.

Original thread here: http://lkml.org/lkml/2007/6/14/154

> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
> 131072 in
>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ATA: abnormal status 0x7F on port 0x0001c807
> ATA: abnormal status 0x7F on port 0x0001c807
> ata2.00: configured for UDMA/133
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
> 131072 in
>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ATA: abnormal status 0x7F on port 0x0001c807
> ATA: abnormal status 0x7F on port 0x0001c807
> ata2.00: configured for UDMA/133
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
> 131072 in
>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ATA: abnormal status 0x7F on port 0x0001c807
> ATA: abnormal status 0x7F on port 0x0001c807
> ata2.00: configured for UDMA/133
> ata2: EH complete
> ata2.00: limiting speed to UDMA/100:PIO4
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
> 131072 in
>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ATA: abnormal status 0x7F on port 0x0001c807
> ATA: abnormal status 0x7F on port 0x0001c807
> ata2.00: configured for UDMA/100
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
> 131072 in
>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ATA: abnormal status 0x7F on port 0x0001c807
> ATA: abnormal status 0x7F on port 0x0001c807
> ata2.00: configured for UDMA/100
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
> 131072 in
>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ATA: abnormal status 0x7F on port 0x0001c807
> ATA: abnormal status 0x7F on port 0x0001c807
> ata2.00: configured for UDMA/100
> sd 3:0:0:0: SCSI error: return code = 0x0802
> sda: Current [descriptor]: sense key=0xb
> ASC=0x0 ASCQ=0x0
> Descriptor sense data with sense descriptors (in hex):
> 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
> 00 00 00 00
> end_request: I/O error, dev sda, sector 30132639
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd c8/00:00:9f:ca:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
> 131072 in
>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ATA: abnormal status 0x7F on port 0x0001c807
> ATA: abnormal status 0x7F on port 0x0001c807
> ata2.00: configured for UDMA/100
> ata2: EH complete
> ata2.00: limiting speed to UDMA/33:PIO4
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd c8/00:00:9f:ca:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
> 131072 in
>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ATA: abnormal status 0x7F on port 0x0001c807
> ATA: abnormal status 0x7F on port 0x0001c807
> ata2.00: configured for UDMA/33
> ata2: EH complete
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata2.00: cmd c8/00:00:9f:ca:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
> 131072 in
>  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> ata2: soft resetting port
> ATA: abnormal status 0x7F on port 0x0001c807
> ATA: 


Re: SATA problems

2007-08-30 Thread Nigel Kukard
Hrmmm,

>>  > 
>>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>>  >  > > 0x0001c807
>>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>>  >  > > 0x0001c807
>>  > 
>>  > Unrelated to the other error, but I've been meaning to ask for a while..
>>  > If this is 'abnormal', why does every SATA box I've seen do it?
>>
>> *crickets*
>>
>> Should we check for this case explicitly, and not print this?
>>
>>   
>> 
> After I get the above errors, my entire SATA bus crashes and I need to
> hard reset the box ... not sure we can just ignore the errors?
>
>   

Appears even with the patch provided a few months ago I'm getting
freezes. Replaced the HDD & all cables, same errors ... especially
whilst doing heavy IO.

Can anyone shed some light?


ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/133
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/133
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/133
ata2: EH complete
ata2.00: limiting speed to UDMA/100:PIO4
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/100
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/100
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/100
sd 3:0:0:0: SCSI error: return code = 0x0802
sda: Current [descriptor]: sense key=0xb
ASC=0x0 ASCQ=0x0
Descriptor sense data with sense descriptors (in hex):
72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
00 00 00 00
end_request: I/O error, dev sda, sector 30132639
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:ca:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/100
ata2: EH complete
ata2.00: limiting speed to UDMA/33:PIO4
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:ca:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/33
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:ca:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/33
ata2: EH complete
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata2.00: cmd c8/00:00:9f:ca:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data
131072 in
 res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata2: soft resetting port
ATA: abnormal status 0x7F on port 0x0001c807
ATA: abnormal status 0x7F on port 0x0001c807
ata2.00: configured for UDMA/33
ata2: EH complete
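
The "cmd" lines in the log above appear to carry the ATA taskfile registers, so the failing LBA can be recovered from them. The sketch below is only an illustration: the field order (command/feature:nsect:lbal:lbam:lbah/five HOB bytes/device) is an assumption about the log format, and lba28_from_cmd is a hypothetical helper, not a libata function. The assumption is at least consistent with the log, since the decoded value matches the sector reported by end_request.

```python
import re

def lba28_from_cmd(line):
    """Extract the 28-bit LBA from a libata-style 'cmd' log line.

    Assumed field order (not taken from the thread):
    command/feature:nsect:lbal:lbam:lbah/<five HOB bytes>/device
    """
    m = re.search(
        r"cmd\s+([0-9a-f]{2})/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})/"
        r"(?:[0-9a-f]{2}:){4}[0-9a-f]{2}/([0-9a-f]{2})",
        line,
    )
    if m is None:
        raise ValueError("not a recognisable cmd line")
    _feature, _nsect, lbal, lbam, lbah = (int(b, 16) for b in m.group(2).split(":"))
    device = int(m.group(3), 16)
    # LBA28: the low nibble of the device register holds bits 24-27.
    return ((device & 0x0F) << 24) | (lbah << 16) | (lbam << 8) | lbal

line = "ata2.00: cmd c8/00:00:9f:c9:cb/00:00:00:00:00/e1 tag 0 cdb 0x0 data"
print(lba28_from_cmd(line))  # 30132639
```

The result, 30132639, is the same sector that end_request reports in the I/O error, which suggests every retry in this log is the same read timing out.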





Re: SATA problems

2007-06-18 Thread Nigel Kukard

>  > 
>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>  >  > > 0x0001c807
>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>  >  > > 0x0001c807
>  > 
>  > Unrelated to the other error, but I've been meaning to ask for a while..
>  > If this is 'abnormal', why does every SATA box I've seen do it?
>
> *crickets*
>
> Should we check for this case explicitly, and not print this?
>
>   
After I get the above errors, my entire SATA bus crashes and I need to
hard reset the box ... not sure we can just ignore the errors?





Re: SATA problems

2007-06-18 Thread Dave Jones
On Thu, Jun 14, 2007 at 02:28:54PM -0400, Dave Jones wrote:
 > On Thu, Jun 14, 2007 at 12:21:49PM -0400, Jeff Garzik wrote:
 > 
 >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
 >  > > 0x0001c807
 >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
 >  > > 0x0001c807
 > 
 > Unrelated to the other error, but I've been meaning to ask for a while..
 > If this is 'abnormal', why does every SATA box I've seen do it?

*crickets*

Should we check for this case explicitly, and not print this?

Dave

-- 
http://www.codemonkey.org.uk
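
For readers wondering what 0x7F actually says, here is a minimal sketch that splits an ATA status byte into its flag bits. The bit names are the classic ATA status register names (later revisions of the standard rename or retire some of them), and decode_ata_status is a hypothetical helper, not kernel code.

```python
# Classic ATA status register bits, most significant first.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete (renamed in later ATA revisions)
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete in later revisions)
    (0x02, "IDX"),   # index (obsolete in later revisions)
    (0x01, "ERR"),   # error
]

def decode_ata_status(status):
    """Return the names of all flag bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]

print(decode_ata_status(0x7F))
# ['DRDY', 'DF', 'DSC', 'DRQ', 'CORR', 'IDX', 'ERR']
```

So 0x7F is every bit except BSY set at once, including DRQ, DF and ERR together. That is not a combination a healthy idle drive would be expected to report.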


Re: SATA problems

2007-06-18 Thread Nigel Kukard
Hi Jeff,

Ok ... second part of my problem. Where should I look in trying to debug
the below problem...

Regards
Nigel

Jun 18 07:59:56 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 07:59:56 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 07:59:56 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 07:59:56 nigel-m2v kernel: ata2: soft resetting port
Jun 18 07:59:56 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 07:59:56 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 07:59:56 nigel-m2v kernel: ata2.00: configured for UDMA/133
Jun 18 07:59:56 nigel-m2v kernel: ata2: EH complete
Jun 18 08:00:26 nigel-m2v kernel: rtc: lost 7740 interrupts
Jun 18 08:00:26 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:00:26 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:00:26 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:00:26 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:00:26 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:00:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:00:27 nigel-m2v kernel: ata2.00: configured for UDMA/133
Jun 18 08:00:27 nigel-m2v kernel: ata2: EH complete
Jun 18 08:00:57 nigel-m2v kernel: rtc: lost 7741 interrupts
Jun 18 08:00:57 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:00:57 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:00:57 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:00:57 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:00:57 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:00:57 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:00:57 nigel-m2v kernel: ata2.00: configured for UDMA/133
Jun 18 08:00:57 nigel-m2v kernel: ata2: EH complete
Jun 18 08:01:27 nigel-m2v kernel: rtc: lost 7740 interrupts
Jun 18 08:01:27 nigel-m2v kernel: ata2.00: limiting speed to UDMA/100:PIO4
Jun 18 08:01:27 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:01:27 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:01:27 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:01:27 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:01:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:01:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:01:27 nigel-m2v kernel: ata2.00: configured for UDMA/100
Jun 18 08:01:27 nigel-m2v kernel: ata2: EH complete
Jun 18 08:01:57 nigel-m2v kernel: rtc: lost 7740 interrupts
Jun 18 08:01:57 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:01:57 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:01:57 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:01:57 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:01:57 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:01:57 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:01:57 nigel-m2v kernel: ata2.00: configured for UDMA/100
Jun 18 08:01:57 nigel-m2v kernel: ata2: EH complete
Jun 18 08:02:27 nigel-m2v kernel: rtc: lost 7741 interrupts
Jun 18 08:02:27 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:02:27 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:02:27 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:02:27 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:02:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:02:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:02:28 nigel-m2v kernel: ata2.00: configured for UDMA/100
Jun 18 08:02:28 nigel-m2v kernel: sd 3:0:0:0: SCSI error: return code =
0x0802
Jun 18 08:02:28 nigel-m2v kernel: sda: Current [descriptor]: sense key=0xb
Jun 18 08:02:28 nigel-m2v kernel: ASC=0x0 ASCQ=0x0
Jun 18 08:02:28 nigel-m2v kernel: Descriptor sense data with sense
descriptors (in hex):
Jun 18 08:02:28 nigel-m2v kernel: 72 0b 00 00 00 00 00 0c 00 0a
80 00 00 00 00 00
Jun 18 08:02:28 nigel-m2v kernel: 00 00 00 00
Jun 18 08:02:28 nigel-m2v kernel: end_request: I/O error, dev sda,
sector 141077439
Jun 18 08:02:28 nigel-m2v kernel: Buffer I/O error on device sda1,
logical block 17634672
Jun 18 08:02:28 nigel-m2v kernel: 
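
The last two errors in the log above (disk sector 141077439 and sda1 logical block 17634672) look like the same location seen from two layers. The sketch below checks that arithmetic. Both the partition start (sector 63, the classic DOS layout) and the 4096-byte block size are assumptions, not facts stated in the thread; they are simply the values that make the two numbers agree.

```python
SECTOR_SIZE = 512      # bytes per disk sector
BLOCK_SIZE = 4096      # assumed block size for the sda1 buffer error
PARTITION_START = 63   # assumed first sector of sda1 (classic DOS layout)

def logical_block(disk_sector, part_start=PARTITION_START,
                  sector_size=SECTOR_SIZE, block_size=BLOCK_SIZE):
    """Map an absolute disk sector to a block number inside the partition."""
    return (disk_sector - part_start) * sector_size // block_size

print(logical_block(141077439))  # 17634672
```

Under those assumptions the two messages describe one failed 4096-byte write, not two separate problems.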

Re: SATA problems

2007-06-18 Thread Nigel Kukard
Hi Jeff,

Ok ... second part of my problem. Where should I look in trying to debug
the below problem...

Regards
Nigel

Jun 18 07:59:56 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 07:59:56 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 07:59:56 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 07:59:56 nigel-m2v kernel: ata2: soft resetting port
Jun 18 07:59:56 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 07:59:56 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 07:59:56 nigel-m2v kernel: ata2.00: configured for UDMA/133
Jun 18 07:59:56 nigel-m2v kernel: ata2: EH complete
Jun 18 08:00:26 nigel-m2v kernel: rtc: lost 7740 interrupts
Jun 18 08:00:26 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:00:26 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:00:26 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:00:26 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:00:26 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:00:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:00:27 nigel-m2v kernel: ata2.00: configured for UDMA/133
Jun 18 08:00:27 nigel-m2v kernel: ata2: EH complete
Jun 18 08:00:57 nigel-m2v kernel: rtc: lost 7741 interrupts
Jun 18 08:00:57 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:00:57 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:00:57 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:00:57 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:00:57 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:00:57 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:00:57 nigel-m2v kernel: ata2.00: configured for UDMA/133
Jun 18 08:00:57 nigel-m2v kernel: ata2: EH complete
Jun 18 08:01:27 nigel-m2v kernel: rtc: lost 7740 interrupts
Jun 18 08:01:27 nigel-m2v kernel: ata2.00: limiting speed to UDMA/100:PIO4
Jun 18 08:01:27 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:01:27 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:01:27 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:01:27 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:01:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:01:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:01:27 nigel-m2v kernel: ata2.00: configured for UDMA/100
Jun 18 08:01:27 nigel-m2v kernel: ata2: EH complete
Jun 18 08:01:57 nigel-m2v kernel: rtc: lost 7740 interrupts
Jun 18 08:01:57 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:01:57 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:01:57 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:01:57 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:01:57 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:01:57 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:01:57 nigel-m2v kernel: ata2.00: configured for UDMA/100
Jun 18 08:01:57 nigel-m2v kernel: ata2: EH complete
Jun 18 08:02:27 nigel-m2v kernel: rtc: lost 7741 interrupts
Jun 18 08:02:27 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
SErr 0x0 action 0x2 frozen
Jun 18 08:02:27 nigel-m2v kernel: ata2.00: cmd
ca/00:08:bf:ab:68/00:00:00:00:00/e8 tag 0 cdb 0x0 data 4096 out
Jun 18 08:02:27 nigel-m2v kernel:  res
40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Jun 18 08:02:27 nigel-m2v kernel: ata2: soft resetting port
Jun 18 08:02:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:02:27 nigel-m2v kernel: ATA: abnormal status 0x7F on port
0x0001c807
Jun 18 08:02:28 nigel-m2v kernel: ata2.00: configured for UDMA/100
Jun 18 08:02:28 nigel-m2v kernel: sd 3:0:0:0: SCSI error: return code =
0x0802
Jun 18 08:02:28 nigel-m2v kernel: sda: Current [descriptor]: sense key=0xb
Jun 18 08:02:28 nigel-m2v kernel: ASC=0x0 ASCQ=0x0
Jun 18 08:02:28 nigel-m2v kernel: Descriptor sense data with sense
descriptors (in hex):
Jun 18 08:02:28 nigel-m2v kernel: 72 0b 00 00 00 00 00 0c 00 0a
80 00 00 00 00 00
Jun 18 08:02:28 nigel-m2v kernel: 00 00 00 00
Jun 18 08:02:28 nigel-m2v kernel: end_request: I/O error, dev sda,
sector 141077439
Jun 18 08:02:28 nigel-m2v kernel: Buffer I/O error on device sda1,
logical block 17634672
Jun 18 08:02:28 nigel-m2v kernel: 
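
As a reading aid, the "cmd" line in the log above can be decoded by hand. The sketch below assumes the twelve hex bytes are printed in the order command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, and that 0xca is a 28-bit LBA write. parse_taskfile and lba28 are illustrative names, not kernel functions.

```python
def parse_taskfile(field: str) -> dict:
    """Split a libata-style 'cmd' field into named register bytes.

    Assumed layout (not stated in the thread):
    command/feature:nsect:lbal:lbam:lbah/hob_*:.../device
    """
    command, low, high, device = field.split("/")
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    hob = [int(b, 16) for b in high.split(":")]
    return {
        "command": int(command, 16),
        "feature": feature,
        "nsect": nsect,
        "lbal": lbal,
        "lbam": lbam,
        "lbah": lbah,
        "hob": hob,
        "device": int(device, 16),
    }


def lba28(tf: dict) -> int:
    """Rebuild a 28-bit LBA: low nibble of device, then lbah, lbam, lbal."""
    return ((tf["device"] & 0x0F) << 24) | (tf["lbah"] << 16) | (tf["lbam"] << 8) | tf["lbal"]


if __name__ == "__main__":
    tf = parse_taskfile("ca/00:08:bf:ab:68/00:00:00:00:00/e8")
    print(hex(tf["command"]), lba28(tf), tf["nsect"] * 512)
```

Under those assumptions the timed-out command decodes to LBA 141077439 and 8 sectors (4096 bytes), which agrees with the "sector 141077439" and "data 4096" fields printed in the same log.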

Re: SATA problems

2007-06-18 Thread Dave Jones
On Thu, Jun 14, 2007 at 02:28:54PM -0400, Dave Jones wrote:
 > On Thu, Jun 14, 2007 at 12:21:49PM -0400, Jeff Garzik wrote:
 > 
 >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
 >  > > 0x0001c807
 >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
 >  > > 0x0001c807
 > 
 > Unrelated to the other error, but I've been meaning to ask for a while..
 > If this is 'abnormal', why does every SATA box I've seen do it?

*crickets*

Should we check for this case explicitly, and not print this?

Dave

-- 
http://www.codemonkey.org.uk
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: SATA problems

2007-06-18 Thread Nigel Kukard

   
>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>  >  > > 0x0001c807
>  >  > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>  >  > > 0x0001c807
>  > 
>  > Unrelated to the other error, but I've been meaning to ask for a while..
>  > If this is 'abnormal', why does every SATA box I've seen do it?
>
> *crickets*
>
> Should we check for this case explicitly, and not print this?

   
After I get the above errors, my entire SATA bus crashes and I need to
hard reset the box ... not sure we can just ignore the errors?





Re: SATA problems

2007-06-14 Thread Dave Jones
On Thu, Jun 14, 2007 at 12:21:49PM -0400, Jeff Garzik wrote:

 > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
 > > 0x0001c807
 > > Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
 > > 0x0001c807

Unrelated to the other error, but I've been meaning to ask for a while..
If this is 'abnormal', why does every SATA box I've seen do it?
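
For readers wondering what 0x7F actually encodes, here is a minimal sketch that splits the byte into the classic ATA status register bits. The bit names are an assumption taken from the traditional register layout (names for bits 4, 2 and 1 vary between spec revisions), and decode_status is an illustrative helper, not a libata function.

```python
# Classic ATA status register bits, most significant first (assumed layout).
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete / service
    (0x08, "DRQ"),   # data request
    (0x04, "CORR"),  # corrected data (obsolete)
    (0x02, "IDX"),   # index (obsolete)
    (0x01, "ERR"),   # error
]


def decode_status(status: int) -> list:
    """Return the names of all bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


if __name__ == "__main__":
    print(decode_status(0x7F))
```

With this layout 0x7F is every bit except BSY, so fault, data-request and error would all be asserted together. That is not a combination a healthy idle drive is expected to report, which may be why the message calls it abnormal; the thread itself does not say what the port is really doing at that moment.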

Dave

-- 
http://www.codemonkey.org.uk


Re: SATA problems

2007-06-14 Thread Jeff Garzik

Nigel Kukard wrote:
>>> I'm stumped trying to track down the below intermittent problem.
>>>
>>> I've confirmed this problem on 2.6.19, 2.6.20 and 2.6.21.
>>>
>>> Jun 14 07:55:52 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
>>> SErr 0x0 action 0x2 frozen
>>> Jun 14 07:55:52 nigel-m2v kernel: ata2.00: cmd
>>> ca/00:18:87:e7:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 12288 out
>>> Jun 14 07:55:52 nigel-m2v kernel:  res
>>> 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>>> Jun 14 07:55:52 nigel-m2v kernel: ata2: soft resetting port
>>> Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>>> 0x0001c807
>>> Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>>> 0x0001c807
>>> Jun 14 07:56:22 nigel-m2v kernel: ata2.00: qc timeout (cmd 0xef)
>>> Jun 14 07:56:22 nigel-m2v kernel: ata2.00: failed to set xfermode
>>> (err_mask=0x4)
>>
>> Try 2.6.22-rc4-gitX...
>
> Is there a patch in particular I can maybe apply? I see you made a
> couple of commits ... my problem is this is also happening on one of my
> production boxes which has a few other patches applied, I'm a bit scared
> of conflicts ... I don't really want to break anything by upgrading the
> kernel.


The two most relevant git commits:

commit 51b94d2a5a90d4800e74d7348bcde098a28f4fb3
Author: Tejun Heo <[EMAIL PROTECTED]>
Date:   Fri Jun 8 13:46:55 2007 -0700

    sata_promise: use TF interface for polling NODATA commands

commit 464cf177df7727efcc5506322fc5d0c8b896f545
Author: Tejun Heo <[EMAIL PROTECTED]>
Date:   Sun May 27 15:10:40 2007 +0200

    libata: always use polling SETXFER

If you have a git tree local to you, "git-diff-tree -p $COMMIT" will 
extract a patch, otherwise click "raw" after surfing to 
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=$COMMIT
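
The two routes described above can be sketched in a few lines. This is only an illustration: the helper names gitweb_url and extract_patch are made up, the URL template is copied verbatim from this message and may no longer resolve, and the local route assumes a clone that already contains the commit.

```python
import subprocess
from string import Template

# URL pattern quoted in the message; $COMMIT is replaced by the full hash.
GITWEB = Template(
    "http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;"
    "a=commitdiff;h=$COMMIT"
)

# The two commits named in the message.
COMMITS = {
    "sata_promise: use TF interface for polling NODATA commands":
        "51b94d2a5a90d4800e74d7348bcde098a28f4fb3",
    "libata: always use polling SETXFER":
        "464cf177df7727efcc5506322fc5d0c8b896f545",
}


def gitweb_url(commit: str) -> str:
    """Fill the commit hash into the gitweb commitdiff URL."""
    return GITWEB.substitute(COMMIT=commit)


def extract_patch(commit: str, repo: str = ".") -> str:
    """Run 'git diff-tree -p <commit>' in a local clone and return the patch text."""
    result = subprocess.run(
        ["git", "diff-tree", "-p", commit],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    for subject, sha in COMMITS.items():
        print(subject)
        print("  " + gitweb_url(sha))
```

The resulting patch can then be test-applied to a patched production tree before committing to anything, which addresses the worry about conflicts without a full kernel upgrade.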


Regards,

Jeff




Re: SATA problems

2007-06-14 Thread Nigel Kukard


>> I'm stumped trying to track down the below intermittent problem.
>>
>> I've confirmed this problem on 2.6.19, 2.6.20 and 2.6.21.
>>
>> Any help greatly appreciated!
>>
>> Regards
>> Nigel
>>
>>
>> Jun 14 07:55:52 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
>> SErr 0x0 action 0x2 frozen
>> Jun 14 07:55:52 nigel-m2v kernel: ata2.00: cmd
>> ca/00:18:87:e7:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 12288 out
>> Jun 14 07:55:52 nigel-m2v kernel:  res
>> 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> Jun 14 07:55:52 nigel-m2v kernel: ata2: soft resetting port
>> Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>> 0x0001c807
>> Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
>> 0x0001c807
>> Jun 14 07:56:22 nigel-m2v kernel: ata2.00: qc timeout (cmd 0xef)
>> Jun 14 07:56:22 nigel-m2v kernel: ata2.00: failed to set xfermode
>> (err_mask=0x4)
> 
> Try 2.6.22-rc4-gitX...
> 
> Jeff

Hi Jeff,

Is there a patch in particular I can maybe apply? I see you made a
couple of commits ... my problem is this is also happening on one of my
production boxes which has a few other patches applied, I'm a bit scared
of conflicts ... I don't really want to break anything by upgrading the
kernel.


Kind Regards
Nigel






Re: SATA problems

2007-06-14 Thread Jeff Garzik

Nigel Kukard wrote:
> I'm stumped trying to track down the below intermittent problem.
>
> I've confirmed this problem on 2.6.19, 2.6.20 and 2.6.21.
>
> Any help greatly appreciated!
>
> Regards
> Nigel
>
>
> Jun 14 07:55:52 nigel-m2v kernel: ata2.00: exception Emask 0x0 SAct 0x0
> SErr 0x0 action 0x2 frozen
> Jun 14 07:55:52 nigel-m2v kernel: ata2.00: cmd
> ca/00:18:87:e7:00/00:00:00:00:00/e0 tag 0 cdb 0x0 data 12288 out
> Jun 14 07:55:52 nigel-m2v kernel:  res
> 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jun 14 07:55:52 nigel-m2v kernel: ata2: soft resetting port
> Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
> 0x0001c807
> Jun 14 07:55:52 nigel-m2v kernel: ATA: abnormal status 0x7F on port
> 0x0001c807
> Jun 14 07:56:22 nigel-m2v kernel: ata2.00: qc timeout (cmd 0xef)
> Jun 14 07:56:22 nigel-m2v kernel: ata2.00: failed to set xfermode
> (err_mask=0x4)


Try 2.6.22-rc4-gitX...

Jeff





Re: SATA problems in 2.6.20.3

2007-03-17 Thread Robert Hancock

Christian wrote:
> I'm seeing the same here since a few days. Before it worked great (even with
> NCQ). I've been getting those messages since 2.6.21-rc3-mm1 and with the
> latest Ubuntu feisty kernel (2.6.20-11-generic #2 SMP Thu Mar 15 03:43:56 UTC
> 2007 x86_64 GNU/Linux)
>
> System is Athlon64 X2, Nforce4, 3x Samsung SATA II NCQ discs.


Can you try 2.6.21-rc4? There was a change that went in between rc3 and 
rc4 to revert a previous change which seemed to be problematic.


As far as 2.6.20, I'm somewhat tempted to submit this patch to -stable:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5e5c74a5e11d1e2a99d03132cc6c4455016db6c2

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: SATA problems in 2.6.20.3

2007-03-17 Thread Christian
I'm seeing the same here since a few days. Before it worked great (even with 
NCQ). I've been getting those messages since 2.6.21-rc3-mm1 and with the 
latest Ubuntu feisty kernel (2.6.20-11-generic #2 SMP Thu Mar 15 03:43:56 UTC 
2007 x86_64 GNU/Linux)

System is Athlon64 X2, Nforce4, 3x Samsung SATA II NCQ discs.

[10802.844891] ata1: soft resetting port
[10802.922845] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[10802.966231] ata1.00: Host Protected Area detected:
[10802.966232]  current size: 781422768 sectors (400 GB)
[10802.966233]  native size: -1349283664 sectors (18446743382 GB)
[10802.966237] ata1.00: configured for UDMA/133
[10802.966265] ata1: EH complete
[10817.958196] ata1: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 
0x1501000 status 0x400
[10817.958201] ata1: CPB 0: ctl_flags 0x1f, resp_flags 0x0
[10817.958203] ata1: CPB 1: ctl_flags 0x1f, resp_flags 0x0
[10817.958205] ata1: CPB 2: ctl_flags 0x1f, resp_flags 0x0
[10817.958206] ata1: CPB 3: ctl_flags 0x1f, resp_flags 0x0
[10817.958208] ata1: CPB 4: ctl_flags 0x1f, resp_flags 0x0
[10817.958210] ata1: CPB 5: ctl_flags 0x1f, resp_flags 0x0
[10817.958211] ata1: CPB 6: ctl_flags 0x1f, resp_flags 0x0
[10817.958213] ata1: CPB 7: ctl_flags 0x1f, resp_flags 0x0
[10817.958215] ata1: CPB 8: ctl_flags 0x1f, resp_flags 0x0
[10817.958216] ata1: CPB 9: ctl_flags 0x1f, resp_flags 0x0
[10817.958218] ata1: CPB 10: ctl_flags 0x1f, resp_flags 0x0
[10817.958220] ata1: CPB 11: ctl_flags 0x1f, resp_flags 0x0
[10817.958222] ata1: CPB 12: ctl_flags 0x1f, resp_flags 0x0
[10817.958224] ata1: CPB 13: ctl_flags 0x1f, resp_flags 0x0
[10817.958225] ata1: CPB 14: ctl_flags 0x1f, resp_flags 0x0
[10817.958227] ata1: CPB 15: ctl_flags 0x1f, resp_flags 0x0
[10817.958229] ata1: CPB 16: ctl_flags 0x1f, resp_flags 0x0
[10817.958231] ata1: CPB 17: ctl_flags 0x1f, resp_flags 0x0
[10817.958233] ata1: CPB 18: ctl_flags 0x1f, resp_flags 0x0
[10817.958235] ata1: CPB 19: ctl_flags 0x1f, resp_flags 0x0
[10817.958236] ata1: CPB 20: ctl_flags 0x1f, resp_flags 0x0
[10817.958238] ata1: CPB 21: ctl_flags 0x1f, resp_flags 0x0
[10817.958240] ata1: CPB 22: ctl_flags 0x1f, resp_flags 0x0
[10817.958242] ata1: CPB 23: ctl_flags 0x1f, resp_flags 0x0
[10817.958244] ata1: CPB 24: ctl_flags 0x1f, resp_flags 0x0
[10817.958245] ata1: CPB 25: ctl_flags 0x1f, resp_flags 0x0
[10817.958247] ata1: CPB 26: ctl_flags 0x1f, resp_flags 0x0
[10817.958249] ata1: CPB 27: ctl_flags 0x1f, resp_flags 0x0
[10817.958250] ata1: CPB 28: ctl_flags 0x1f, resp_flags 0x0
[10817.958252] ata1: CPB 29: ctl_flags 0x1f, resp_flags 0x0
[10817.958254] ata1: CPB 30: ctl_flags 0x1f, resp_flags 0x0
[10817.958256] ata1: Resetting port
[10817.958262] ata1.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 
0x2 frozen
[10817.958267] ata1.00: cmd 61/00:00:c7:6b:46/02:00:02:00:00/40 tag 0 cdb 0x0 
data 262144 out
[10817.958268]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958272] ata1.00: cmd 61/00:08:c7:6d:46/02:00:02:00:00/40 tag 1 cdb 0x0 
data 262144 out
[10817.958274]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958278] ata1.00: cmd 61/00:10:c7:6f:46/04:00:02:00:00/40 tag 2 cdb 0x0 
data 524288 out
[10817.958279]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958283] ata1.00: cmd 61/00:18:c7:73:46/02:00:02:00:00/40 tag 3 cdb 0x0 
data 262144 out
[10817.958285]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958289] ata1.00: cmd 61/00:20:c7:75:46/04:00:02:00:00/40 tag 4 cdb 0x0 
data 524288 out
[10817.958290]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958294] ata1.00: cmd 61/00:28:c7:79:46/02:00:02:00:00/40 tag 5 cdb 0x0 
data 262144 out
[10817.958296]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958300] ata1.00: cmd 61/00:30:c7:7b:46/02:00:02:00:00/40 tag 6 cdb 0x0 
data 262144 out
[10817.958301]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958305] ata1.00: cmd 61/08:38:c7:7f:46/00:00:02:00:00/40 tag 7 cdb 0x0 
data 4096 out
[10817.958307]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958311] ata1.00: cmd 61/00:40:c7:7d:46/02:00:02:00:00/40 tag 8 cdb 0x0 
data 262144 out
[10817.958312]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958316] ata1.00: cmd 61/00:48:cf:7f:46/02:00:02:00:00/40 tag 9 cdb 0x0 
data 262144 out
[10817.958317]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958322] ata1.00: cmd 61/00:50:cf:81:46/02:00:02:00:00/40 tag 10 cdb 0x0 
data 262144 out
[10817.958323]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958327] ata1.00: cmd 61/00:58:cf:83:46/02:00:02:00:00/40 tag 11 cdb 0x0 
data 262144 out
[10817.958328]  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 
(timeout)
[10817.958333] ata1.00: cmd 61/00:60:cf:85:46/02:00:02:00:00/40 tag 12 cdb 
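
The queued commands in this dump can be cross-checked against the "tag" and "data" fields the kernel prints next to them. The sketch below assumes the same byte order as the kernel's cmd line (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and the usual NCQ convention for command 0x61, where the sector count sits in the feature bytes and the tag in the upper five bits of nsect. decode_ncq and active_tags are illustrative names only.

```python
def decode_ncq(field: str) -> dict:
    """Decode one 'cmd 61/...' field from the log (assumed NCQ layout)."""
    command, low, high, device = field.split("/")
    feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in high.split(":")
    )
    # NCQ: sector count in feature (low) and hob_feature (high).
    # A count of 0 would mean 65536 sectors; that case is not handled here.
    sectors = (hob_feature << 8) | feature
    lba = (
        (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        | (lbah << 16) | (lbam << 8) | lbal
    )
    return {"tag": nsect >> 3, "sectors": sectors, "bytes": sectors * 512, "lba": lba}


def active_tags(sact: int) -> list:
    """List the tag numbers whose bits are set in an SAct value."""
    return [tag for tag in range(32) if sact & (1 << tag)]


if __name__ == "__main__":
    print(decode_ncq("61/08:38:c7:7f:46/00:00:02:00:00/40"))
    print(active_tags(0x7FFF))
```

Under these assumptions the 61/08:38:... entry decodes to tag 7 and 4096 bytes, as printed in the log, and SAct 0x7fff corresponds to tags 0 through 14, meaning fifteen writes were outstanding when the port froze.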

Re: SATA problems in 2.6.20.3

2007-03-16 Thread Charles Shannon Hendrix
On Fri, 16 Mar 2007 17:44:25 -0600
Robert Hancock <[EMAIL PROTECTED]> wrote:

> Charles Shannon Hendrix wrote:
> > I normally run a modified 2.6.19 kernel and it works great.
> > 
> > I recently tried 2.6.20 and had severe SATA problems with it.
> > 
> > Yesterday I tried 2.6.20.3, and the problems are still there.
> 
> Can you try 2.6.21-rc and see if the problem is fixed in those kernels?

OK.

sata_nv.adma=0 lets me run 2.6.20.3 for now.

I'll test 2.6.21-rc tomorrow some time.
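
To confirm that the workaround actually took effect after a reboot, something like the following could be used. It is a sketch under one assumption not stated in this thread: that the running kernel exposes the parameter as a readable file under /sys/module/sata_nv/parameters/. read_module_param is a hypothetical helper, and it returns None when the file is absent (module built differently, not loaded, or parameter not exported).

```python
from pathlib import Path
from typing import Optional


def read_module_param(module: str, param: str,
                      sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return the current value of a module parameter, or None if unreadable."""
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    value = read_module_param("sata_nv", "adma")
    if value is None:
        print("sata_nv adma parameter not visible in sysfs")
    else:
        print("sata_nv adma =", value)
```

Boolean parameters are typically reported as Y or N rather than 1 or 0, so a value of N would indicate ADMA is disabled.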




-- 
shannon   | Work for something because it is good, not just because 
  | it stands a chance to succeed. 
  |-- Vaclav Havel


Re: SATA problems in 2.6.20.3

2007-03-16 Thread Alistair John Strachan
On Friday 16 March 2007 23:44, you wrote:
> Charles Shannon Hendrix wrote:
> > I normally run a modified 2.6.19 kernel and it works great.
> >
> > I recently tried 2.6.20 and had severe SATA problems with it.
> >
> > Yesterday I tried 2.6.20.3, and the problems are still there.
>
> Can you try 2.6.21-rc and see if the problem is fixed in those kernels?

-rc4 specifically, it's the first one that's worked for me (possibly related).

(BTW Robert, the sata_nv shadow registers patch has been fine here with a 
patched -rc3 for just over a week now.)

-- 
Cheers,
Alistair.

Final year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


Re: SATA problems in 2.6.20.3

2007-03-16 Thread Robert Hancock

Charles Shannon Hendrix wrote:
> I normally run a modified 2.6.19 kernel and it works great.
>
> I recently tried 2.6.20 and had severe SATA problems with it.
>
> Yesterday I tried 2.6.20.3, and the problems are still there.


Can you try 2.6.21-rc and see if the problem is fixed in those kernels?

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove "nospam" from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: SATA problems in 2.6.20.3

2007-03-16 Thread Charles Shannon Hendrix
On Fri, 16 Mar 2007 11:58:21 -0400
Jeff Garzik <[EMAIL PROTECTED]> wrote:

> Charles Shannon Hendrix wrote:
> > I normally run a modified 2.6.19 kernel and it works great.
> > 
> > I recently tried 2.6.20 and had severe SATA problems with it.
> > 
> > Yesterday I tried 2.6.20.3, and the problems are still there.
> 
> Setting the module parameter 'adma' to zero fixes this, yes?

Seems to.

NCQ would be nice of course, but this is usable.

 
-- 
The strength of the Constitution lies entirely in the determination of
each citizen to defend it.  Only if every single citizen feels duty
bound to do his share in this defense are the constitutional rights
secure. -- Albert Einstein


Re: SATA problems in 2.6.20.3

2007-03-16 Thread Jeff Garzik

Charles Shannon Hendrix wrote:
> I normally run a modified 2.6.19 kernel and it works great.
>
> I recently tried 2.6.20 and had severe SATA problems with it.
>
> Yesterday I tried 2.6.20.3, and the problems are still there.


Setting the module parameter 'adma' to zero fixes this, yes?

Jeff





SATA problems in 2.6.20.3

2007-03-16 Thread Charles Shannon Hendrix

I normally run a modified 2.6.19 kernel and it works great.

I recently tried 2.6.20 and had severe SATA problems with it.

Yesterday I tried 2.6.20.3, and the problems are still there.

Relevant /var/log/messages entries:

Mar 14 20:45:11 daydream kernel: ata3: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400
Mar 14 20:45:11 daydream kernel: ata3: CPB 0: ctl_flags 0x1f, resp_flags 0x0
Mar 14 20:45:11 daydream kernel: ata3: CPB 1: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 2: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 3: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 4: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 5: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 6: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 7: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 8: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 9: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 10: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 11: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 12: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 13: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 14: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 15: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 16: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 17: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 18: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 19: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 20: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 21: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 22: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 23: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 24: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 25: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 26: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: CPB 27: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: CPB 28: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: CPB 29: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: CPB 30: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: Resetting port
Mar 14 20:45:13 daydream kernel: ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Mar 14 20:45:13 daydream kernel: ata3.00: cmd 61/40:00:ed:39:b1/00:00:04:00:00/40 tag 0 cdb 0x0 data 32768 out
Mar 14 20:45:13 daydream kernel:  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Mar 14 20:45:13 daydream kernel: ata3: soft resetting port
Mar 14 20:45:13 daydream kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Mar 14 20:45:13 daydream kernel: ata3.00: configured for UDMA/133
Mar 14 20:45:13 daydream kernel: ata3: EH complete
Mar 14 20:45:13 daydream kernel: SCSI device sdc: 156301488 512-byte hdwr sectors (80026 MB)
Mar 14 20:45:13 daydream kernel: sdc: Write Protect is off
Mar 14 20:45:13 daydream kernel: sdc: Mode Sense: 00 3a 00 00
Mar 14 20:45:14 daydream kernel: SCSI device sdc: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 14 20:49:12 daydream kernel: ata3: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400
Mar 14 20:49:12 daydream kernel: ata3: CPB 0: ctl_flags 0x1f, resp_flags 0x0
Mar 14 20:49:12 daydream kernel: ata3: CPB 0: ctl_flags 0x1f, resp_flags 0x0
Mar 14 20:49:12 daydream kernel: ata3: CPB 1: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 2: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 3: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 4: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 5: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 6: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 7: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 8: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 9: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 10: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 11: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:18 daydream kernel: ata3: CPB 12: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:23 daydream kernel: ata3: CPB 13: ctl_flags 0x1f, resp_flags

SATA problems in 2.6.20.3

2007-03-16 Thread Charles Shannon Hendrix

I normally run a modified 2.6.19 kernel and it works great.

I recently tried 2.6.20 and had severe SATA problems with it.

Yesterday I tried 2.6.20.3, and the problems are still there.

Relevant /var/log/messages entries:

Mar 14 20:45:11 daydream kernel: ata3: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400
Mar 14 20:45:11 daydream kernel: ata3: CPB 0: ctl_flags 0x1f, resp_flags 0x0
Mar 14 20:45:11 daydream kernel: ata3: CPB 1: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 2: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 3: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 4: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 5: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 6: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 7: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 8: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 9: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 10: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 11: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:11 daydream kernel: ata3: CPB 12: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 13: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 14: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 15: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 16: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 17: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 18: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 19: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 20: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 21: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 22: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 23: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 24: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 25: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:12 daydream kernel: ata3: CPB 26: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: CPB 27: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: CPB 28: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: CPB 29: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: CPB 30: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:45:13 daydream kernel: ata3: Resetting port
Mar 14 20:45:13 daydream kernel: ata3.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x2 frozen
Mar 14 20:45:13 daydream kernel: ata3.00: cmd 61/40:00:ed:39:b1/00:00:04:00:00/40 tag 0 cdb 0x0 data 32768 out
Mar 14 20:45:13 daydream kernel:  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Mar 14 20:45:13 daydream kernel: ata3: soft resetting port
Mar 14 20:45:13 daydream kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Mar 14 20:45:13 daydream kernel: ata3.00: configured for UDMA/133
Mar 14 20:45:13 daydream kernel: ata3: EH complete
Mar 14 20:45:13 daydream kernel: SCSI device sdc: 156301488 512-byte hdwr sectors (80026 MB)
Mar 14 20:45:13 daydream kernel: sdc: Write Protect is off
Mar 14 20:45:13 daydream kernel: sdc: Mode Sense: 00 3a 00 00
Mar 14 20:45:14 daydream kernel: SCSI device sdc: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Mar 14 20:49:12 daydream kernel: ata3: EH in ADMA mode, notifier 0x0 notifier_error 0x0 gen_ctl 0x1501000 status 0x400
Mar 14 20:49:12 daydream kernel: ata3: CPB 0: ctl_flags 0x1f, resp_flags 0x0
Mar 14 20:49:12 daydream kernel: ata3: CPB 0: ctl_flags 0x1f, resp_flags 0x0
Mar 14 20:49:12 daydream kernel: ata3: CPB 1: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 2: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 3: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 4: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 5: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 6: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 7: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 8: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 9: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 10: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:12 daydream kernel: ata3: CPB 11: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:18 daydream kernel: ata3: CPB 12: ctl_flags 0x1f, resp_flags 0x1
Mar 14 20:49:23 daydream kernel: ata3: CPB 13: ctl_flags 0x1f, resp_flags
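
For readers decoding the log, here is a minimal C sketch of how the timed-out "cmd" line and the SStatus value can be unpacked. It assumes the register order libata uses in its error report (command/feature:nsect:lbal:lbam:lbah/hob registers/device) and the usual SStatus layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8). The struct and helper names are hypothetical, and the final device byte, which the wrapped log shows split by a space, is taken to be 0x40.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Taskfile registers in the order they appear on the "cmd" line. */
struct taskfile {
	uint8_t command, feature, nsect, lbal, lbam, lbah;
	uint8_t hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah;
	uint8_t device;
};

/* Values from the timed-out command in the log (device byte assumed 0x40). */
static const struct taskfile log_tf = {
	0x61, 0x40, 0x00, 0xed, 0x39, 0xb1,
	0x00, 0x00, 0x04, 0x00, 0x00,
	0x40,
};

/* 48-bit LBA: the "hob" (high order) bytes sit above the low-order bytes. */
static uint64_t tf_lba48(const struct taskfile *tf)
{
	assert(tf != NULL);
	return ((uint64_t)tf->hob_lbah << 40) | ((uint64_t)tf->hob_lbam << 32) |
	       ((uint64_t)tf->hob_lbal << 24) | ((uint64_t)tf->lbah << 16) |
	       ((uint64_t)tf->lbam << 8) | (uint64_t)tf->lbal;
}

/* For FPDMA (NCQ) commands the sector count travels in the feature
 * registers; a count of 0 stands for 65536 sectors. */
static uint32_t tf_ncq_sectors(const struct taskfile *tf)
{
	uint32_t n;

	assert(tf != NULL);
	n = ((uint32_t)tf->hob_feature << 8) | tf->feature;
	return n ? n : 65536u;
}

/* For NCQ commands the tag is carried in bits 7:3 of the sector count
 * register. */
static unsigned int tf_ncq_tag(const struct taskfile *tf)
{
	assert(tf != NULL);
	return tf->nsect >> 3;
}

/* SStatus fields. */
static unsigned int sstatus_det(uint32_t s) { return s & 0xf; }
static unsigned int sstatus_spd(uint32_t s) { return (s >> 4) & 0xf; }
static unsigned int sstatus_ipm(uint32_t s) { return (s >> 8) & 0xf; }
```

Under these assumptions, command 0x61 (WRITE FPDMA QUEUED) is an NCQ write of 64 sectors, which is 32768 bytes and agrees with "data 32768 out" in the log; the LBA works out to 0x04b139ed, and the tag to 0. SStatus 0x113 decodes to DET 3 (device present, link established), SPD 1 (1.5 Gbps) and IPM 1 (active).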

Re: SATA problems in 2.6.20.3

2007-03-16 Thread Jeff Garzik

Charles Shannon Hendrix wrote:
> I normally run a modified 2.6.19 kernel and it works great.
>
> I recently tried 2.6.20 and had severe SATA problems with it.
>
> Yesterday I tried 2.6.20.3, and the problems are still there.


Setting the module parameter 'adma' to zero fixes this, yes?

Jeff



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: SATA problems in 2.6.20.3

2007-03-16 Thread Charles Shannon Hendrix
On Fri, 16 Mar 2007 11:58:21 -0400
Jeff Garzik [EMAIL PROTECTED] wrote:

> Charles Shannon Hendrix wrote:
> > I normally run a modified 2.6.19 kernel and it works great.
> >
> > I recently tried 2.6.20 and had severe SATA problems with it.
> >
> > Yesterday I tried 2.6.20.3, and the problems are still there.
>
> Setting the module parameter 'adma' to zero fixes this, yes?

Seems to.

NCQ would be nice of course, but this is usable.

 
-- 
The strength of the Constitution lies entirely in the determination of
each citizen to defend it.  Only if every single citizen feels duty
bound to do his share in this defense are the constitutional rights
secure. -- Albert Einstein


Re: SATA problems in 2.6.20.3

2007-03-16 Thread Robert Hancock

Charles Shannon Hendrix wrote:
> I normally run a modified 2.6.19 kernel and it works great.
>
> I recently tried 2.6.20 and had severe SATA problems with it.
>
> Yesterday I tried 2.6.20.3, and the problems are still there.


Can you try 2.6.21-rc and see if the problem is fixed in those kernels?

--
Robert Hancock  Saskatoon, SK, Canada
To email, remove nospam from [EMAIL PROTECTED]
Home Page: http://www.roberthancock.com/



Re: SATA problems in 2.6.20.3

2007-03-16 Thread Alistair John Strachan
On Friday 16 March 2007 23:44, you wrote:
> Charles Shannon Hendrix wrote:
> > I normally run a modified 2.6.19 kernel and it works great.
> >
> > I recently tried 2.6.20 and had severe SATA problems with it.
> >
> > Yesterday I tried 2.6.20.3, and the problems are still there.
>
> Can you try 2.6.21-rc and see if the problem is fixed in those kernels?

-rc4 specifically, it's the first one that's worked for me (possibly related).

(BTW Robert, the sata_nv shadow registers patch has been fine here with a 
patched -rc3 for just over a week now.)

-- 
Cheers,
Alistair.

Final year Computer Science undergraduate.
1F2 55 South Clerk Street, Edinburgh, UK.


Re: SATA problems in 2.6.20.3

2007-03-16 Thread Charles Shannon Hendrix
On Fri, 16 Mar 2007 17:44:25 -0600
Robert Hancock [EMAIL PROTECTED] wrote:

> Charles Shannon Hendrix wrote:
> > I normally run a modified 2.6.19 kernel and it works great.
> >
> > I recently tried 2.6.20 and had severe SATA problems with it.
> >
> > Yesterday I tried 2.6.20.3, and the problems are still there.
>
> Can you try 2.6.21-rc and see if the problem is fixed in those kernels?

OK.

sata_nv.adma=0 lets me run 2.6.20.3 for now.

I'll test 2.6.21-rcwhatever tomorrow some time.




-- 
shannon   | Work for something because it is good, not just because 
  | it stands a chance to succeed. 
  |-- Vaclav Havel


Re: SATA problems

2007-02-21 Thread Pablo Sebastian Greco

Tejun Heo wrote:
> Pablo Sebastian Greco wrote:
>> Tejun Heo wrote:
>>> * Pablo, the bug you saw was bad interaction between blacklisted NCQ
>>> device and dynamic queue depth adjustment.  Patches are submitted to fix
>>> the problem.  Just drop the blacklist patch.  Your drives should work
>>> fine in NCQ mode.  My gut feeling is that your problem is power related
>>> from the beginning.
>>
>> I had the same problems with a new Power Supply, Now everything is ok
>> with the old Power Supply and the new drives.
>
> So, it was bad drives?  Are you using the same model or different ones?
> NCQ works okay now?

All I can say is that it is working now. Other things changed with the new
drives: they run at 1.5Gbps instead of 3Gbps, and they don't use NCQ (I'm
reattaching a full dmesg).

Also, I've found this firmware upgrade
(http://www.samsung.com/Products/HardDiskDrive/support/faqs/faqs_20060414_246673.htm)
for the old drives, but I couldn't confirm whether it should be applied
because the server is in Brazil and I live in Argentina. I won't be there
until April to test.


Thanks.
Pablo.
Linux version 2.6.19-1.2895.fc6 ([EMAIL PROTECTED]) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)) #1 SMP Wed Jan 10 18:50:56 EST 2007
Command line: ro root=LABEL=/
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009ec00 (usable)
 BIOS-e820: 0009ec00 - 0010 (reserved)
 BIOS-e820: 0010 - df938000 (usable)
 BIOS-e820: df938000 - df9d2000 (ACPI NVS)
 BIOS-e820: df9d2000 - dfa42000 (usable)
 BIOS-e820: dfa42000 - dfa9a000 (reserved)
 BIOS-e820: dfa9a000 - dfab8000 (usable)
 BIOS-e820: dfab8000 - dfb1a000 (ACPI NVS)
 BIOS-e820: dfb1a000 - dfb2c000 (usable)
 BIOS-e820: dfb2c000 - dfb3a000 (ACPI data)
 BIOS-e820: dfb3a000 - dfc0 (usable)
 BIOS-e820: ffc0 - ffc0c000 (reserved)
 BIOS-e820: 0001 - 00012000 (usable)
Entering add_active_range(0, 0, 158) 0 entries of 3200 used
Entering add_active_range(0, 256, 915768) 1 entries of 3200 used
Entering add_active_range(0, 915922, 916034) 2 entries of 3200 used
Entering add_active_range(0, 916122, 916152) 3 entries of 3200 used
Entering add_active_range(0, 916250, 916268) 4 entries of 3200 used
Entering add_active_range(0, 916282, 916480) 5 entries of 3200 used
Entering add_active_range(0, 1048576, 1179648) 6 entries of 3200 used
end_pfn_map = 1179648
DMI 2.4 present.
ACPI: RSDP (v002 INTEL ) @ 0x000f0350
ACPI: XSDT (v001 INTEL  S5000VSA 0x INTL 0x0113) @ 0xdfb39120
ACPI: FADT (v003 INTEL  S5000VSA 0x INTL 0x0113) @ 0xdfb36000
ACPI: MADT (v001 INTEL  S5000VSA 0x INTL 0x0113) @ 0xdfb35000
ACPI: SPCR (v001 INTEL  S5000VSA 0x INTL 0x0113) @ 0xdfb2f000
ACPI: HPET (v001 INTEL  S5000VSA 0x0001 INTL 0x0113) @ 0xdfb2e000
ACPI: MCFG (v001 INTEL  S5000VSA 0x0001 INTL 0x0113) @ 0xdfb2d000
ACPI: SSDT (v002 INTEL  S5000VSA 0x4000 INTL 0x0113) @ 0xdfb2c000
ACPI: DSDT (v002 INTEL  S5000VSA 0x0008 INTL 0x0113) @ 0x
No NUMA configuration found
Faking a node at -00012000
Entering add_active_range(0, 0, 158) 0 entries of 3200 used
Entering add_active_range(0, 256, 915768) 1 entries of 3200 used
Entering add_active_range(0, 915922, 916034) 2 entries of 3200 used
Entering add_active_range(0, 916122, 916152) 3 entries of 3200 used
Entering add_active_range(0, 916250, 916268) 4 entries of 3200 used
Entering add_active_range(0, 916282, 916480) 5 entries of 3200 used
Entering add_active_range(0, 1048576, 1179648) 6 entries of 3200 used
Bootmem setup node 0 -00012000
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1179648
early_node_map[7] active PFN ranges
    0:        0 ->      158
    0:      256 ->   915768
    0:   915922 ->   916034
    0:   916122 ->   916152
    0:   916250 ->   916268
    0:   916282 ->   916480
    0:  1048576 ->  1179648
On node 0 totalpages: 1047100
  DMA zone: 64 pages used for memmap
  DMA zone: 1450 pages reserved
  DMA zone: 2484 pages, LIFO batch:0
  DMA32 zone: 16320 pages used for memmap
  DMA32 zone: 895710 pages, LIFO batch:31
  Normal zone: 2048 pages used for memmap
  Normal zone: 129024 pages, LIFO batch:31
ACPI: PM-Timer IO Port: 0x408
ACPI: Local APIC address 0xfee0
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 (Bootup-CPU)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x02] enabled)
Processor #2
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
Processor #3
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x84] disabled)
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x85] disabled)
ACPI: LAPIC 

RE: SATA problems

2007-02-20 Thread Paul Rolland
Hi Marcus,

Could you give more details? I'm stuck with a boot problem on an
Asus P5W that also includes a JMicron behind a SATA port on ICH8, and
the kernel boot times out when probing it. I've been trying various
kernels, including some patches from Tejun (thanks!), but no luck to date.

Mind sharing your .config so that I can check if I missed something obvious?

Regards,
Paul

Paul Rolland, rol(at)as2917.net
ex-AS2917 Network administrator and Peering Coordinator

--

Please no HTML, I'm not a browser - Pas d'HTML, je ne suis pas un navigateur 
"Some people dream of success... while others wake up and work hard at it" 

"I worry about my child and the Internet all the time, even though she's too 
young to have logged on yet. Here's what I worry about. I worry that 10 or 15 
years from now, she will come to me and say 'Daddy, where were you when they 
took freedom of the press away from the Internet?'"
--Mike Godwin, Electronic Frontier Foundation 
  

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Marcus Haebler
> Sent: Wednesday, February 21, 2007 7:24 AM
> To: Tejun Heo
> Cc: Pablo Sebastian Greco; linux-kernel@vger.kernel.org
> Subject: Re: SATA problems
> 
> Tejun,
> 
> I checked out the kernel 2.6.19 to 2.6.20 Changelog. Seems like you
> fixed a problem with the JMB363. The Asus P5B-Deluxe I am using has a
> JMB363 - besides an Intel ICH8R - with the SATA ports set to AHCI as
> well. Looks like that might have been the source of the problem in
> 2.6.19.
> 
> Thanks,
> 
> Marcus
> 
> On 2/21/07, Marcus Haebler <[EMAIL PROTECTED]> wrote:
> > Tejun,
> >
> > thanks. In preparation of your patch I installed a vanilla 2.6.20.1
> > kernel on my FC6
> > system. Amazingly the problem went away with the vanilla(!) 
> kernel and NCQ
> > is enabled at boot time (queue_depth is 31). I guess I should have
> > tried that kernel
> > earlier.
> >
> > The patches you sent earlier apply w/o problems against the 2.6.20.1
> > vanilla kernel
> > which is expected. I will test drive those patches tomorrow.
> >
> > BTW thanks for saving me the 'cat' on the 3 patches. ;)
> >
> > Thanks,
> >
> > Marcus


Re: SATA problems

2007-02-20 Thread Marcus Haebler

Tejun,

I checked out the kernel 2.6.19 to 2.6.20 Changelog. Seems like you
fixed a problem with the JMB363. The Asus P5B-Deluxe I am using has a
JMB363 - besides an Intel ICH8R - with the SATA ports set to AHCI as
well. Looks like that might have been the source of the problem in
2.6.19.

Thanks,

Marcus

On 2/21/07, Marcus Haebler <[EMAIL PROTECTED]> wrote:

> Tejun,
>
> thanks. In preparation of your patch I installed a vanilla 2.6.20.1
> kernel on my FC6 system. Amazingly the problem went away with the
> vanilla(!) kernel and NCQ is enabled at boot time (queue_depth is 31).
> I guess I should have tried that kernel earlier.
>
> The patches you sent earlier apply w/o problems against the 2.6.20.1
> vanilla kernel which is expected. I will test drive those patches
> tomorrow.
>
> BTW thanks for saving me the 'cat' on the 3 patches. ;)
>
> Thanks,
>
> Marcus



Re: SATA problems

2007-02-20 Thread Marcus Haebler

Tejun,

thanks. In preparation of your patch I installed a vanilla 2.6.20.1
kernel on my FC6 system. Amazingly the problem went away with the
vanilla(!) kernel and NCQ is enabled at boot time (queue_depth is 31).
I guess I should have tried that kernel earlier.

The patches you sent earlier apply w/o problems against the 2.6.20.1
vanilla kernel which is expected. I will test drive those patches
tomorrow.
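
The queue depth mentioned above can be checked from user space. The following is a minimal C sketch, assuming the depth is exposed as a decimal number in /sys/block/<disk>/device/queue_depth; the helper names are hypothetical and the "depth above 1 means queuing is active" rule is a simplification.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse the text of a queue_depth file; returns -1 on malformed input. */
static int parse_queue_depth(const char *text)
{
	char *end;
	long v;

	assert(text != NULL);
	v = strtol(text, &end, 10);
	if (end == text || v < 1 || v > INT_MAX)
		return -1;
	/* Allow a trailing newline or spaces, nothing else. */
	while (*end == '\n' || *end == ' ')
		end++;
	return *end == '\0' ? (int)v : -1;
}

/* Read the queue depth of a disk such as "sda"; returns -1 if the sysfs
 * file cannot be opened or parsed. */
static int read_queue_depth(const char *disk)
{
	char path[128], buf[32];
	FILE *f;
	int depth = -1;

	assert(disk != NULL);
	snprintf(path, sizeof(path), "/sys/block/%s/device/queue_depth", disk);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		depth = parse_queue_depth(buf);
	fclose(f);
	return depth;
}

/* A depth above 1 suggests commands are being queued to the drive. */
static int looks_like_ncq(int depth)
{
	return depth > 1;
}
```

A caller would pass the disk name, for example read_queue_depth("sda"), and treat a result of 31 as NCQ being active; the combined patch in this thread caps the depth at ATA_MAX_QUEUE - 1.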

BTW thanks for saving me the 'cat' on the 3 patches. ;)

Thanks,

Marcus

On 2/20/07, Tejun Heo <[EMAIL PROTECTED]> wrote:

> Marcus Haebler wrote:
> > thanks for the patches! I am on an Intel P965/ICH8R.
>
> I see.  That can happen too.  There was a race window in which an
> in-flight r/w command that had left the SCSI midlayer but was still
> pending in libata got executed in the wrong mode.  If possible, please
> verify that it doesn't happen with the patches applied.  I'm attaching a
> combined patch against v2.6.20.
>
> Thanks.
>
> --
> tejun



Re: SATA problems

2007-02-20 Thread Tejun Heo
Marcus Haebler wrote:
> thanks for the patches! I am on an Intel P965/ICH8R.

I see.  That can happen too.  There was a race window in which an
in-flight r/w command that had left the SCSI midlayer but was still
pending in libata got executed in the wrong mode.  If possible, please
verify that it doesn't happen with the patches applied.  I'm attaching a
combined patch against v2.6.20.

Thanks.

-- 
tejun
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 667acd2..348cc02 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -308,9 +308,7 @@ int ata_build_rw_tf(struct ata_taskfile *tf, struct 
ata_device *dev,
tf->flags |= ATA_TFLAG_ISADDR | ATA_TFLAG_DEVICE;
tf->flags |= tf_flags;
 
-   if ((dev->flags & (ATA_DFLAG_PIO | ATA_DFLAG_NCQ_OFF |
-  ATA_DFLAG_NCQ)) == ATA_DFLAG_NCQ &&
-   likely(tag != ATA_TAG_INTERNAL)) {
+   if (ata_ncq_enabled(dev) && likely(tag != ATA_TAG_INTERNAL)) {
/* yay, NCQ */
if (!lba_48_ok(block, n_block))
return -ERANGE;
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 73902d3..ebb9185 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -945,29 +945,32 @@ int ata_scsi_change_queue_depth(struct scsi_device *sdev, 
int queue_depth)
struct ata_port *ap = ata_shost_to_port(sdev->host);
struct ata_device *dev;
unsigned long flags;
-   int max_depth;
 
-   if (queue_depth < 1)
+   if (queue_depth < 1 || queue_depth == sdev->queue_depth)
return sdev->queue_depth;
 
dev = ata_scsi_find_dev(ap, sdev);
if (!dev || !ata_dev_enabled(dev))
return sdev->queue_depth;
 
-   max_depth = min(sdev->host->can_queue, ata_id_queue_depth(dev->id));
-   max_depth = min(ATA_MAX_QUEUE - 1, max_depth);
-   if (queue_depth > max_depth)
-   queue_depth = max_depth;
-
-   scsi_adjust_queue_depth(sdev, MSG_SIMPLE_TAG, queue_depth);
-
+   /* NCQ enabled? */
spin_lock_irqsave(ap->lock, flags);
-   if (queue_depth > 1)
-   dev->flags &= ~ATA_DFLAG_NCQ_OFF;
-   else
+   dev->flags &= ~ATA_DFLAG_NCQ_OFF;
+   if (queue_depth == 1 || !ata_ncq_enabled(dev)) {
dev->flags |= ATA_DFLAG_NCQ_OFF;
+   queue_depth = 1;
+   }
spin_unlock_irqrestore(ap->lock, flags);
 
+   /* limit and apply queue depth */
+   queue_depth = min(queue_depth, sdev->host->can_queue);
+   queue_depth = min(queue_depth, ata_id_queue_depth(dev->id));
+   queue_depth = min(queue_depth, ATA_MAX_QUEUE - 1);
+
+   if (sdev->queue_depth == queue_depth)
+   return -EINVAL;
+
+   scsi_adjust_queue_depth(sdev, MSG_SIMPLE_TAG, queue_depth);
return queue_depth;
 }
 
@@ -1454,11 +1457,9 @@ static void ata_scsi_qc_complete(struct ata_queued_cmd 
*qc)
 static int ata_scmd_need_defer(struct ata_device *dev, int is_io)
 {
struct ata_port *ap = dev->ap;
+   int is_ncq = is_io && ata_ncq_enabled(dev);
 
-   if (!(dev->flags & ATA_DFLAG_NCQ))
-   return 0;
-
-   if (is_io) {
+   if (is_ncq) {
if (!ata_tag_valid(ap->active_tag))
return 0;
} else {
diff --git a/include/linux/libata.h b/include/linux/libata.h
index 91bb8ce..4e4e365 100644
--- a/include/linux/libata.h
+++ b/include/linux/libata.h
@@ -1035,6 +1035,21 @@ static inline u8 ata_chk_status(struct ata_port *ap)
 	return ap->ops->check_status(ap);
 }
 
+/**
+ * ata_ncq_enabled - Test whether NCQ is enabled
+ * @dev: ATA device to test for
+ *
+ * LOCKING:
+ * spin_lock_irqsave(host lock)
+ *
+ * RETURNS:
+ * 1 if NCQ is enabled for @dev, 0 otherwise.
+ */
+static inline int ata_ncq_enabled(struct ata_device *dev)
+{
+	return (dev->flags & (ATA_DFLAG_PIO | ATA_DFLAG_NCQ_OFF |
+			      ATA_DFLAG_NCQ)) == ATA_DFLAG_NCQ;
+}
 
 /**
  * ata_pause - Flush writes and pause 400 nanoseconds.
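
For anyone reading the patch without a kernel tree at hand, the masked comparison in ata_ncq_enabled() can be tried in user space. The following is only a minimal sketch: the bit values and the DFLAG_* / ncq_enabled() names are made up for illustration and are not the real ATA_DFLAG_* definitions from include/linux/libata.h. It shows that the test is true only when the NCQ bit is set and both the PIO and NCQ_OFF bits are clear.

```c
#include <stdio.h>

/*
 * Hypothetical bit values, chosen only for this example.  The real
 * ATA_DFLAG_* constants are defined in include/linux/libata.h.
 */
#define DFLAG_NCQ	(1u << 0)	/* device is capable of NCQ */
#define DFLAG_PIO	(1u << 1)	/* device is restricted to PIO */
#define DFLAG_NCQ_OFF	(1u << 2)	/* NCQ has been switched off */

/*
 * Same shape as the helper added by the patch: mask out the three
 * relevant bits and require that exactly the NCQ bit remains.
 */
static int ncq_enabled(unsigned int flags)
{
	return (flags & (DFLAG_PIO | DFLAG_NCQ_OFF | DFLAG_NCQ)) == DFLAG_NCQ;
}

/* Print the result for one flag combination. */
static void show(const char *label, unsigned int flags)
{
	printf("%-20s -> %d\n", label, ncq_enabled(flags));
}
```

Calling show("NCQ", DFLAG_NCQ) reports 1, while show("NCQ|NCQ_OFF", DFLAG_NCQ | DFLAG_NCQ_OFF) and show("NCQ|PIO", DFLAG_NCQ | DFLAG_PIO) report 0. A single comparison covers all three conditions, which is presumably why the patch replaces the open-coded checks with one helper.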


Re: SATA problems

2007-02-20 Thread Tejun Heo
Pablo Sebastian Greco wrote:
> Tejun Heo wrote:
>> * Pablo, the bug you saw was bad interaction between blacklisted NCQ
>> device and dynamic queue depth adjustment.  Patches are submitted to fix
>> the problem.  Just drop the blacklist patch.  Your drives should work
>> fine in NCQ mode.  My gut feeling is that your problem is power related
>> from the beginning.
>>   
> I had the same problems with a new Power Supply, Now everything is ok
> with the old Power Supply and the new drives.

So, was it bad drives?  Are you using the same model or different ones?
Does NCQ work okay now?

-- 
tejun


Re: SATA problems

2007-02-20 Thread Marcus Haebler

Tejun,

thanks for the patches! I am on an Intel P965/ICH8R.

Best,

Marcus

On 2/20/07, Tejun Heo <[EMAIL PROTECTED]> wrote:
> * Pablo, the bug you saw was bad interaction between blacklisted NCQ
> device and dynamic queue depth adjustment.  Patches are submitted to fix
> the problem.  Just drop the blacklist patch.  Your drives should work
> fine in NCQ mode.  My gut feeling is that your problem is power related
> from the beginning.
>
> * Marcus, you're on via's ahci controller, right?  The problem you saw
> was bad interaction between blacklisted NCQ _controller_ and dynamic
> queue depth adjustment.  Patches submitted.
>
> Thanks.
>
> --
> tejun



Re: SATA problems

2007-02-20 Thread Pablo Sebastian Greco

Tejun Heo wrote:
> * Pablo, the bug you saw was bad interaction between blacklisted NCQ
> device and dynamic queue depth adjustment.  Patches are submitted to fix
> the problem.  Just drop the blacklist patch.  Your drives should work
> fine in NCQ mode.  My gut feeling is that your problem is power related
> from the beginning.
>
> * Marcus, you're on via's ahci controller, right?  The problem you saw
> was bad interaction between blacklisted NCQ _controller_ and dynamic
> queue depth adjustment.  Patches submitted.
>
> Thanks.

I had the same problems with a new Power Supply, Now everything is ok
with the old Power Supply and the new drives.


Pablo.


Re: SATA problems

2007-02-20 Thread Tejun Heo
* Pablo, the bug you saw was bad interaction between blacklisted NCQ
device and dynamic queue depth adjustment.  Patches are submitted to fix
the problem.  Just drop the blacklist patch.  Your drives should work
fine in NCQ mode.  My gut feeling is that your problem is power related
from the beginning.

* Marcus, you're on via's ahci controller, right?  The problem you saw
was bad interaction between blacklisted NCQ _controller_ and dynamic
queue depth adjustment.  Patches submitted.

Thanks.

-- 
tejun
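
The queue depth handling that the submitted patches introduce can be summarised in a small user-space sketch. This is an illustration under stated assumptions, not the kernel code: clamp_queue_depth(), min_int() and MAX_QUEUE are hypothetical names, MAX_QUEUE = 32 is assumed to match ATA_MAX_QUEUE, and the locking, the flag update and the -EINVAL return of the real function are left out. It shows the order of operations: a device or controller that cannot use NCQ is forced to depth 1 first, and only then is the request limited by the host, the device and the libata maximum.

```c
#include <stdbool.h>

#define MAX_QUEUE 32	/* assumed to match ATA_MAX_QUEUE */

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Return the queue depth that would be applied for a request.
 *
 * requested      - depth written by the user (e.g. via sysfs)
 * current        - depth currently in effect
 * host_can_queue - limit imposed by the host controller
 * dev_depth      - depth advertised by the drive
 * ncq_usable     - false when NCQ is blacklisted or otherwise unusable
 */
static int clamp_queue_depth(int requested, int current, int host_can_queue,
			     int dev_depth, bool ncq_usable)
{
	/* Invalid or no-op requests leave the current depth untouched. */
	if (requested < 1 || requested == current)
		return current;

	/* Without usable NCQ only one command may be outstanding. */
	if (requested == 1 || !ncq_usable)
		requested = 1;

	/* Apply the host, device and libata limits in turn. */
	requested = min_int(requested, host_can_queue);
	requested = min_int(requested, dev_depth);
	requested = min_int(requested, MAX_QUEUE - 1);

	return requested;
}
```

With these assumptions, asking for 31 on an NCQ-capable drive yields 31, while the same request on a blacklisted drive or controller yields 1 instead of switching the device into a mode it cannot handle.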


Re: SATA problems

2007-02-20 Thread Marcus Haebler

Tejun,

thanks. In preparation of your patch I installed a vanilla 2.6.20.1 kernel
on my FC6 system. Amazingly the problem went away with the vanilla(!) kernel
and NCQ is enabled at boot time (queue_depth is 31). I guess I should have
tried that kernel earlier.

The patches you sent earlier apply w/o problems against the 2.6.20.1 vanilla
kernel which is expected. I will test drive those patches tomorrow.

BTW thanks for saving me the 'cat' on the 3 patches. ;)

Thanks,

Marcus

On 2/20/07, Tejun Heo <[EMAIL PROTECTED]> wrote:
> Marcus Haebler wrote:
>> thanks for the patches! I am on an Intel P965/ICH8R.
>
> I see.  That can happen too.  There was a race window where in-flight
> r/w command which left SCSI midlayer but pending on libata gets executed
> in the wrong mode.  If possible, please verify that it doesn't happen
> with the patches applied.  I'm attaching combined patch against v2.6.20.
>
> Thanks.
>
> --
> tejun



Re: SATA problems

2007-02-20 Thread Marcus Haebler

Tejun,

I checked out the kernel 2.6.19 to 2.6.20 Changelog. Seems like you
fixed a problem with the JMB363. The Asus P5B-Deluxe I am using has a
JMB363 - besides an Intel ICH8R - with the SATA ports set to AHCI as
well. Looks like that might have been the source of the problem in
2.6.19.

Thanks,

Marcus

On 2/21/07, Marcus Haebler <[EMAIL PROTECTED]> wrote:
> Tejun,
>
> thanks. In preparation of your patch I installed a vanilla 2.6.20.1 kernel
> on my FC6 system. Amazingly the problem went away with the vanilla(!) kernel
> and NCQ is enabled at boot time (queue_depth is 31). I guess I should have
> tried that kernel earlier.
>
> The patches you sent earlier apply w/o problems against the 2.6.20.1 vanilla
> kernel which is expected. I will test drive those patches tomorrow.
>
> BTW thanks for saving me the 'cat' on the 3 patches. ;)
>
> Thanks,
>
> Marcus



RE: SATA problems

2007-02-20 Thread Paul Rolland
Hi Marcus,

Could you give more details? I'm stuck with a boot problem on an
Asus P5W that also includes a JMicron behind a SATA port on ICH8, and
the kernel boot times out when probing it... I've been trying various
kernels, including some patches from Tejun (thanks!), but no luck to date...

Mind sharing your .config so that I can check if I missed something obvious?

Regards,
Paul

Paul Rolland, rol(at)as2917.net
ex-AS2917 Network administrator and Peering Coordinator

--

Please no HTML, I'm not a browser - Pas d'HTML, je ne suis pas un navigateur 
Some people dream of success... while others wake up and work hard at it 

I worry about my child and the Internet all the time, even though she's too 
young to have logged on yet. Here's what I worry about. I worry that 10 or 15 
years from now, she will come to me and say 'Daddy, where were you when they 
took freedom of the press away from the Internet?'
--Mike Godwin, Electronic Frontier Foundation 
  

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of
> Marcus Haebler
> Sent: Wednesday, February 21, 2007 7:24 AM
> To: Tejun Heo
> Cc: Pablo Sebastian Greco; linux-kernel@vger.kernel.org
> Subject: Re: SATA problems
>
> Tejun,
>
> I checked out the kernel 2.6.19 to 2.6.20 Changelog. Seems like you
> fixed a problem with the JMB363. The Asus P5B-Deluxe I am using has a
> JMB363 - besides an Intel ICH8R - with the SATA ports set to AHCI as
> well. Looks like that might have been the source of the problem in
> 2.6.19.
>
> Thanks,
>
> Marcus
>
> On 2/21/07, Marcus Haebler <[EMAIL PROTECTED]> wrote:
>> Tejun,
>>
>> thanks. In preparation of your patch I installed a vanilla 2.6.20.1
>> kernel on my FC6 system. Amazingly the problem went away with the
>> vanilla(!) kernel and NCQ is enabled at boot time (queue_depth is 31).
>> I guess I should have tried that kernel earlier.
>>
>> The patches you sent earlier apply w/o problems against the 2.6.20.1
>> vanilla kernel which is expected. I will test drive those patches
>> tomorrow.
>>
>> BTW thanks for saving me the 'cat' on the 3 patches. ;)
>>
>> Thanks,
>>
>> Marcus


Re: SATA problems

2007-02-17 Thread Pablo Sebastian Greco

Marcus Haebler wrote:

> I opened a bug report (228979) on bugzilla.redhat.com on this one because
> I have the same issue under FC6 2.6.19-1.2895. Here is the link:
>
>    https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228979
>
> Do you have any more updates on this problem? Is there a way I can help
> by providing debug data?
>
> Thanks,
>
> Marcus
>
> On 1/23/07, Tejun Heo <[EMAIL PROTECTED]> wrote:
>> Pablo Sebastian Greco wrote:
>>> Well, it took me a few days,  but I think I'm ready to report back. One
>>> of the drives was failing, and it stopped after rewiring power supply so
>>> the last problem seems to be corrected.
>>> OTOH, your blacklist seems to be needed too, now I'm running FC6
>>> distribution kernel 2.6.19-1.2895.fc6 (2.6.19.2 + some patches by
>>> fedora) and setting
>>> echo 1 >/sys/block/sdX/device/queue_depth
>>> on all the SAMSUNG drives (sdb, sdc and sdd)
>>> The second I type
>>> echo 31 >/sys/block/sdX/device/queue_depth
>>> on any of the drives I get these messages
>>>
>>> Jan 23 12:36:30 squid kernel: BUG: warning: (ap->ops->error_handler &&
>>> ata_tag_valid(ap->active_tag)) at
>>> drivers/ata/libata-core.c:4602/ata_qc_issue() (Not tainted)
>>
>> This is kernel bug that needs fixing.  I'll investigate.
>>
>> --
>> tejun

On my side, all the problems disappeared on all the kernels after
changing all 3 drives to non-NCQ drives. I was going crazy.


New dmesg attached

Pablo.
Linux version 2.6.19-1.2895.fc6 ([EMAIL PROTECTED]) (gcc version 4.1.1 20070105 (Red Hat 4.1.1-51)) #1 SMP Wed Jan 10 18:50:56 EST 2007
Command line: ro root=LABEL=/
BIOS-provided physical RAM map:
 BIOS-e820:  - 0009ec00 (usable)
 BIOS-e820: 0009ec00 - 0010 (reserved)
 BIOS-e820: 0010 - df938000 (usable)
 BIOS-e820: df938000 - df9d2000 (ACPI NVS)
 BIOS-e820: df9d2000 - dfa42000 (usable)
 BIOS-e820: dfa42000 - dfa9a000 (reserved)
 BIOS-e820: dfa9a000 - dfab8000 (usable)
 BIOS-e820: dfab8000 - dfb1a000 (ACPI NVS)
 BIOS-e820: dfb1a000 - dfb2c000 (usable)
 BIOS-e820: dfb2c000 - dfb3a000 (ACPI data)
 BIOS-e820: dfb3a000 - dfc0 (usable)
 BIOS-e820: ffc0 - ffc0c000 (reserved)
 BIOS-e820: 0001 - 00012000 (usable)
Entering add_active_range(0, 0, 158) 0 entries of 3200 used
Entering add_active_range(0, 256, 915768) 1 entries of 3200 used
Entering add_active_range(0, 915922, 916034) 2 entries of 3200 used
Entering add_active_range(0, 916122, 916152) 3 entries of 3200 used
Entering add_active_range(0, 916250, 916268) 4 entries of 3200 used
Entering add_active_range(0, 916282, 916480) 5 entries of 3200 used
Entering add_active_range(0, 1048576, 1179648) 6 entries of 3200 used
end_pfn_map = 1179648
DMI 2.4 present.
ACPI: RSDP (v002 INTEL ) @ 0x000f0350
ACPI: XSDT (v001 INTEL  S5000VSA 0x INTL 0x0113) @ 0xdfb39120
ACPI: FADT (v003 INTEL  S5000VSA 0x INTL 0x0113) @ 0xdfb36000
ACPI: MADT (v001 INTEL  S5000VSA 0x INTL 0x0113) @ 0xdfb35000
ACPI: SPCR (v001 INTEL  S5000VSA 0x INTL 0x0113) @ 0xdfb2f000
ACPI: HPET (v001 INTEL  S5000VSA 0x0001 INTL 0x0113) @ 0xdfb2e000
ACPI: MCFG (v001 INTEL  S5000VSA 0x0001 INTL 0x0113) @ 0xdfb2d000
ACPI: SSDT (v002 INTEL  S5000VSA 0x4000 INTL 0x0113) @ 0xdfb2c000
ACPI: DSDT (v002 INTEL  S5000VSA 0x0008 INTL 0x0113) @ 0x
No NUMA configuration found
Faking a node at -00012000
Entering add_active_range(0, 0, 158) 0 entries of 3200 used
Entering add_active_range(0, 256, 915768) 1 entries of 3200 used
Entering add_active_range(0, 915922, 916034) 2 entries of 3200 used
Entering add_active_range(0, 916122, 916152) 3 entries of 3200 used
Entering add_active_range(0, 916250, 916268) 4 entries of 3200 used
Entering add_active_range(0, 916282, 916480) 5 entries of 3200 used
Entering add_active_range(0, 1048576, 1179648) 6 entries of 3200 used
Bootmem setup node 0 -00012000
Zone PFN ranges:
  DMA             0 ->     4096
  DMA32        4096 ->  1048576
  Normal    1048576 ->  1179648
early_node_map[7] active PFN ranges
    0:        0 ->      158
    0:      256 ->   915768
    0:   915922 ->   916034
    0:   916122 ->   916152
    0:   916250 ->   916268
    0:   916282 ->   916480
    0:  1048576 ->  1179648
On node 0 totalpages: 1047100
  DMA zone: 64 pages used for memmap
  DMA zone: 1450 pages reserved
  DMA zone: 2484 pages, LIFO batch:0
  DMA32 zone: 16320 pages used for memmap
  DMA32 zone: 895710 pages, LIFO batch:31
  Normal zone: 2048 pages used for memmap

Re: SATA problems

2007-02-17 Thread Marcus Haebler

I opened a bug report (228979) on bugzilla.redhat.com on this one because
I have the same issue under FC6 2.6.19-1.2895. Here is the link:

   https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=228979

Do you have any more updates on this problem? Is there a way I can help
by providing debug data?

Thanks,

Marcus

On 1/23/07, Tejun Heo <[EMAIL PROTECTED]> wrote:

> Pablo Sebastian Greco wrote:
>> Well, it took me a few days,  but I think I'm ready to report back. One
>> of the drives was failing, and it stopped after rewiring power supply so
>> the last problem seems to be corrected.
>> OTOH, your blacklist seems to be needed too, now I'm running FC6
>> distribution kernel 2.6.19-1.2895.fc6 (2.6.19.2 + some patches by
>> fedora) and setting
>> echo 1 >/sys/block/sdX/device/queue_depth
>> on all the SAMSUNG drives (sdb, sdc and sdd)
>> The second I type
>> echo 31 >/sys/block/sdX/device/queue_depth
>> on any of the drives I get these messages
>>
>> Jan 23 12:36:30 squid kernel: BUG: warning: (ap->ops->error_handler &&
>> ata_tag_valid(ap->active_tag)) at
>> drivers/ata/libata-core.c:4602/ata_qc_issue() (Not tainted)
>
> This is kernel bug that needs fixing.  I'll investigate.
>
> --
> tejun


Re: SATA problems

2007-01-23 Thread Tejun Heo
Pablo Sebastian Greco wrote:
> Well, it took me a few days,  but I think I'm ready to report back. One
> of the drives was failing, and it stopped after rewiring power supply so
> the last problem seems to be corrected.
> OTOH, your blacklist seems to be needed too, now I'm running FC6
> distribution kernel 2.6.19-1.2895.fc6 (2.6.19.2 + some patches by
> fedora) and setting
> echo 1 >/sys/block/sdX/device/queue_depth
> on all the SAMSUNG drives (sdb, sdc and sdd)
> The second I type
> echo 31 >/sys/block/sdX/device/queue_depth
> on any of the drives I get these messages
> 
> Jan 23 12:36:30 squid kernel: BUG: warning: (ap->ops->error_handler &&
> ata_tag_valid(ap->active_tag)) at
> drivers/ata/libata-core.c:4602/ata_qc_issue() (Not tainted)

This is a kernel bug that needs fixing.  I'll investigate.

-- 
tejun


Re: SATA problems

2007-01-23 Thread Pablo Sebastian Greco

Tejun Heo wrote:
> Hello, Pablo.
>
> Please apply common hardware debugging method.  You know, swap drives.
> Use separate power supply for disks, swap cables, etc...
>
> It seems more like a hardware problem at this point.
>
> Thanks.

  
Well, it took me a few days, but I think I'm ready to report back. One
of the drives was failing, and it stopped after rewiring the power supply,
so the last problem seems to be corrected.
OTOH, your blacklist seems to be needed too. I'm now running the FC6
distribution kernel 2.6.19-1.2895.fc6 (2.6.19.2 + some patches by
Fedora) and setting

echo 1 >/sys/block/sdX/device/queue_depth
on all the SAMSUNG drives (sdb, sdc and sdd).
The second I type
echo 31 >/sys/block/sdX/device/queue_depth
on any of the drives, I get these messages:

Jan 23 12:36:30 squid kernel: BUG: warning: (ap->ops->error_handler && ata_tag_valid(ap->active_tag)) at drivers/ata/libata-core.c:4602/ata_qc_issue() (Not tainted)
Jan 23 12:36:30 squid kernel:
Jan 23 12:36:30 squid kernel: Call Trace:
Jan 23 12:36:30 squid kernel:  [<8026999a>] show_trace+0x34/0x47
Jan 23 12:36:30 squid kernel:  [<802699bf>] dump_stack+0x12/0x17
Jan 23 12:36:30 squid kernel:  [<88092d50>] :libata:ata_qc_issue+0x61/0x551
Jan 23 12:36:30 squid kernel:  [<88097bc8>] :libata:ata_scsi_translate+0xd1/0x11a
Jan 23 12:36:30 squid kernel:  [<88098b87>] :libata:ata_scsi_queuecmd+0x103/0x122
Jan 23 12:36:30 squid kernel:  [<8805cbc1>] :scsi_mod:scsi_dispatch_cmd+0x27c/0x30d
Jan 23 12:36:30 squid kernel:  [<88061dbe>] :scsi_mod:scsi_request_fn+0x2ca/0x395
Jan 23 12:36:30 squid kernel:  [<8033844e>] elv_insert+0x15a/0x226
Jan 23 12:36:30 squid kernel:  [<8020bcc2>] __make_request+0x439/0x487
Jan 23 12:36:30 squid kernel:  [<8021bf12>] generic_make_request+0x207/0x21e
Jan 23 12:36:30 squid kernel:  [<80232f7d>] submit_bio+0xee/0xf7
Jan 23 12:36:30 squid kernel:  [<8021a4f0>] submit_bh+0x130/0x150
Jan 23 12:36:30 squid kernel:  [<80217187>] ll_rw_block+0x9d/0xc0
Jan 23 12:36:30 squid kernel:  [<881adf63>] :reiserfs:search_by_key+0x13d/0xce7
Jan 23 12:36:30 squid kernel:  [<881aee54>] :reiserfs:search_for_position_by_key+0x34/0x2ad
Jan 23 12:36:30 squid kernel:  [<8819bd48>] :reiserfs:_get_block_create_0+0x86/0x544
Jan 23 12:36:30 squid kernel:  [<8819d508>] :reiserfs:reiserfs_get_block+0xcd/0xfdd
Jan 23 12:36:30 squid kernel:  [<80228c34>] do_mpage_readpage+0x16d/0x4b0
Jan 23 12:36:30 squid kernel:  [<802388df>] mpage_readpages+0xb3/0x146
Jan 23 12:36:30 squid kernel:  [<80212a81>] __do_page_cache_readahead+0x119/0x209
Jan 23 12:36:30 squid kernel:  [<80231fed>] blockable_page_cache_readahead+0x56/0xb5
Jan 23 12:36:30 squid kernel:  [<80213b7a>] page_cache_readahead+0xd6/0x1af
Jan 23 12:36:30 squid kernel:  [<8020be39>] do_generic_mapping_read+0x129/0x40b
Jan 23 12:36:30 squid kernel:  [<80216a02>] generic_file_aio_read+0x15f/0x1b1
Jan 23 12:36:30 squid kernel:  [<8020c92b>] do_sync_read+0xc9/0x10c
Jan 23 12:36:30 squid kernel:  [<8020b226>] vfs_read+0xcb/0x170
Jan 23 12:36:30 squid kernel:  [<80211731>] sys_read+0x45/0x6e
Jan 23 12:36:30 squid kernel:  [<8025c11e>] system_call+0x7e/0x83
Jan 23 12:36:30 squid kernel:  [<00359ccbfb80>]
Jan 23 12:36:30 squid kernel:

Thanks for everything.
Pablo.


Re: SATA problems

2007-01-09 Thread Tejun Heo
Hello, Pablo.

Please apply common hardware debugging method.  You know, swap drives.
Use separate power supply for disks, swap cables, etc...

It seems more like a hardware problem at this point.

Thanks.

-- 
tejun


Re: SATA problems

2007-01-09 Thread Pablo Sebastian Greco

Pablo Sebastian Greco wrote:

Tejun Heo wrote:

Pablo Sebastian Greco wrote:
 

After an uptime of  13:34 under heavy load and no errors, I'm pretty
sure your patch is correct. Is there a way to backport this to 
2.6.18.x?



I forgot this (even though I implemented it) but you can turn off NCQ by
doing the following.

# echo 1 > /sys/block/sdX/device/queue_depth

Can you put the seagate drive under load to verify that it's the samsung
drive's problem not the controller's?

 

Just an off topic question, does anyone know why I get so uneven IRQ
handling on 2.6.19-20 and almost perfect on 2.6.20-rc2-mm1?



I dunno.  You have much better chance of getting a useful answer by
asking it on a separate thread with proper subject line.  People usually
screen threads by subject.  There are just too many messages in LKML for
anyone to follow all the messages.

Thanks.

  

Guess I spoke too soon :(
Today I found this
Jan  8 04:01:40 squid kernel: ata2.00: exception Emask 0x0 SAct 0x0 
SErr 0x0 action 0x2 frozen
Jan  8 04:01:40 squid kernel: ata2.00: cmd 
25/00:08:49:ee:e8/00:00:16:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jan  8 04:01:40 squid kernel:  res 
40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)

Jan  8 04:01:40 squid kernel: ata2: soft resetting port
Jan  8 04:01:40 squid kernel: ata2: softreset failed (port busy but 
CLO unavailable)

Jan  8 04:01:40 squid kernel: ata2: softreset failed, retrying in 5 secs
Jan  8 04:01:45 squid kernel: ata2: hard resetting port
Jan  8 04:01:53 squid kernel: ata2: port is slow to respond, please be 
patient (Status 0x80)
Jan  8 04:02:16 squid kernel: ata2: port failed to respond (30 secs, 
Status 0x80)

Jan  8 04:02:16 squid kernel: ata2: COMRESET failed (device not ready)
Jan  8 04:02:16 squid kernel: ata2: hardreset failed, retrying in 5 secs
Jan  8 04:02:21 squid kernel: ata2: hard resetting port
Jan  8 04:02:21 squid kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 
SControl 300)

Jan  8 04:02:21 squid kernel: ata2.00: configured for UDMA/133
Jan  8 04:02:21 squid kernel: ata2: EH complete
Jan  8 04:02:21 squid kernel: SCSI device sdb: 488397168 512-byte hdwr 
sectors (250059 MB)

Jan  8 04:02:21 squid kernel: sdb: Write Protect is off
Jan  8 04:02:21 squid kernel: SCSI device sdb: write cache: enabled, 
read cache: enabled, doesn't support DPO or FUA

#uptime
10:10:12 up 3 days, 22:48,  1 user,  load average: 0.22, 0.19, 0.18
4 am is the lowest load ever, so I don't get it.
I've found two differences with older errors
   SAct is now 0x0 when before was 0x7fff
   And the cmd/res used to be really long, now it's just one command
About heavy loading the seagate, I've tested, as suggested on the other 
thread, dd if=<drive> of=/dev/null
for all 4 drives simultaneously, on top of usual load, and all was 
perfect with current kernel (2.6.20-rc3 + blacklist).

Don't know what to do to help

Thanks.
Pablo.


And now this :( , still  running rc3+blacklist without rebooting

Jan  9 05:30:36 squid kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan  9 05:30:36 squid kernel: ata2.00: cmd c8/00:08:87:83:85/00:00:00:00:00/e2 tag 0 cdb 0x0 data 4096 in
Jan  9 05:30:36 squid kernel:  res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan  9 05:30:36 squid kernel: ata2: soft resetting port
Jan  9 05:30:36 squid kernel: ata2: softreset failed (port busy but CLO unavailable)
Jan  9 05:30:36 squid kernel: ata2: softreset failed, retrying in 5 secs
Jan  9 05:30:41 squid kernel: ata2: hard resetting port
Jan  9 05:30:49 squid kernel: ata2: port is slow to respond, please be patient (Status 0x80)
Jan  9 05:31:12 squid kernel: ata2: port failed to respond (30 secs, Status 0x80)
Jan  9 05:31:12 squid kernel: ata2: COMRESET failed (device not ready)
Jan  9 05:31:12 squid kernel: ata2: hardreset failed, retrying in 5 secs
Jan  9 05:31:17 squid kernel: ata2: hard resetting port
Jan  9 05:31:17 squid kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan  9 05:31:17 squid kernel: ata2.00: configured for UDMA/133
Jan  9 05:31:17 squid kernel: ata2: EH complete
Jan  9 05:31:17 squid kernel: SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB)
Jan  9 05:31:17 squid kernel: sdb: Write Protect is off
Jan  9 05:31:17 squid kernel: SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan  9 05:32:17 squid kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan  9 05:32:17 squid kernel: ata2.00: cmd c8/00:08:37:ac:04/00:00:00:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jan  9 05:32:17 squid kernel:  res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan  9 05:32:18 squid kernel: ata2: soft resetting port
Jan  9 05:32:18 



Re: SATA problems

2007-01-08 Thread Pablo Sebastian Greco

Tejun Heo wrote:
> Pablo Sebastian Greco wrote:
>> After an uptime of  13:34 under heavy load and no errors, I'm pretty
>> sure your patch is correct. Is there a way to backport this to 2.6.18.x?
>
> I forgot this (even though I implemented it) but you can turn off NCQ by
> doing the following.
>
> # echo 1 > /sys/block/sdX/device/queue_depth
>
> Can you put the seagate drive under load to verify that it's the samsung
> drive's problem not the controller's?
>
>> Just an off topic question, does anyone know why I get so uneven IRQ
>> handling on 2.6.19-20 and almost perfect on 2.6.20-rc2-mm1?
>
> I dunno.  You have much better chance of getting a useful answer by
> asking it on a separate thread with proper subject line.  People usually
> screen threads by subject.  There are just too many messages in LKML for
> anyone to follow all the messages.
>
> Thanks.

  

Guess I spoke too soon :(
Today I found this
Jan  8 04:01:40 squid kernel: ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jan  8 04:01:40 squid kernel: ata2.00: cmd 25/00:08:49:ee:e8/00:00:16:00:00/e0 tag 0 cdb 0x0 data 4096 in
Jan  8 04:01:40 squid kernel:  res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan  8 04:01:40 squid kernel: ata2: soft resetting port
Jan  8 04:01:40 squid kernel: ata2: softreset failed (port busy but CLO unavailable)
Jan  8 04:01:40 squid kernel: ata2: softreset failed, retrying in 5 secs
Jan  8 04:01:45 squid kernel: ata2: hard resetting port
Jan  8 04:01:53 squid kernel: ata2: port is slow to respond, please be patient (Status 0x80)
Jan  8 04:02:16 squid kernel: ata2: port failed to respond (30 secs, Status 0x80)
Jan  8 04:02:16 squid kernel: ata2: COMRESET failed (device not ready)
Jan  8 04:02:16 squid kernel: ata2: hardreset failed, retrying in 5 secs
Jan  8 04:02:21 squid kernel: ata2: hard resetting port
Jan  8 04:02:21 squid kernel: ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan  8 04:02:21 squid kernel: ata2.00: configured for UDMA/133
Jan  8 04:02:21 squid kernel: ata2: EH complete
Jan  8 04:02:21 squid kernel: SCSI device sdb: 488397168 512-byte hdwr sectors (250059 MB)
Jan  8 04:02:21 squid kernel: sdb: Write Protect is off
Jan  8 04:02:21 squid kernel: SCSI device sdb: write cache: enabled, read cache: enabled, doesn't support DPO or FUA

#uptime
10:10:12 up 3 days, 22:48,  1 user,  load average: 0.22, 0.19, 0.18
4 am is the lowest load ever, so I don't get it.
I've found two differences with older errors
   SAct is now 0x0 when before was 0x7fff
   And the cmd/res used to be really long, now it's just one command
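
A side note on reading those two fields. SAct is a bitmask with one bit per outstanding NCQ tag, so 0x0 means no queued commands were in flight, which fits queue_depth being 1. The sketch below counts the set bits and splits a cmd dump into its fields. It assumes the dump is laid out as command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device, which is how I recall libata's error handler printing it; the function names are made up for the example, and the address math only applies to 48-bit commands such as 0x25 (READ DMA EXT).

```shell
# Sketch only: helper names are hypothetical, field order is an assumption.

# Count the set bits in an SAct value, i.e. how many NCQ tags were active.
count_active_tags() {
    val=$(( $1 ))
    n=0
    while [ "$val" -ne 0 ]; do
        n=$(( n + (val & 1) ))
        val=$(( val >> 1 ))
    done
    echo "$n"
}

# Split a "cmd" taskfile dump and print the LBA48 address and transfer size.
# Only meaningful for 48-bit commands; 28-bit commands such as 0xc8 keep the
# top address bits in the device register instead.
decode_taskfile_lba48() {
    old_ifs=$IFS
    IFS='/:'
    set -- $1          # 12 hex bytes: $1=command ... ${12}=device
    IFS=$old_ifs
    lba=$(( (0x${11} << 40) | (0x${10} << 32) | (0x$9 << 24) ))
    lba=$(( lba | (0x$6 << 16) | (0x$5 << 8) | 0x$4 ))
    nsect=$(( (0x$8 << 8) | 0x$3 ))
    echo "command=0x$1 lba=$lba sectors=$nsect bytes=$(( nsect * 512 ))"
}
```

Run against the Jan 8 line, `decode_taskfile_lba48 25/00:08:49:ee:e8/00:00:16:00:00/e0` reports 8 sectors, i.e. 4096 bytes, which matches the "data 4096 in" in the log, and `count_active_tags 0x7fff` gives 15 for the older value as quoted.
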
About heavy loading the seagate, I've tested, as suggested on the other 
thread, dd if=<drive> of=/dev/null
for all 4 drives simultaneously, on top of usual load, and all was 
perfect with current kernel (2.6.20-rc3 + blacklist).
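
The simultaneous read test can be scripted roughly like this. It is only a sketch: read_all is a made-up helper, the block size is arbitrary, and the device names in the example comment are placeholders for whatever the four drives are called on a given box.

```shell
# Sketch only: read every given drive (or file) in parallel and discard the
# data, then report failure if any of the dd processes failed.
read_all() {
    pids=""
    for src in "$@"; do
        # dd's transfer statistics go to stderr; they are dropped here, the
        # interesting errors end up in the kernel log anyway.
        dd if="$src" of=/dev/null bs=1048576 2>/dev/null &
        pids="$pids $!"
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return $status
}

# Example (as root, placeholder device names):
# read_all /dev/sda /dev/sdb /dev/sdc /dev/sdd
```

Watching the kernel log in another terminal while this runs shows whether any port throws a timeout under load.
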

Don't know what to do to help

Thanks.
Pablo.


Re: SATA problems

2007-01-07 Thread Jeff Garzik

Tejun Heo wrote:
> Pablo Sebastian Greco wrote:
>> After an uptime of  13:34 under heavy load and no errors, I'm pretty
>> sure your patch is correct. Is there a way to backport this to 2.6.18.x?
>
> I forgot this (even though I implemented it) but you can turn off NCQ by
> doing the following.
>
> # echo 1 > /sys/block/sdX/device/queue_depth


Thanks, I had forgotten this, too :)

Added to the libata FAQ: http://linux-ata.org/faq.html

Jeff




Re: SATA problems

2007-01-07 Thread Tejun Heo
Pablo Sebastian Greco wrote:
> After an uptime of  13:34 under heavy load and no errors, I'm pretty
> sure your patch is correct. Is there a way to backport this to 2.6.18.x?

I forgot this (even though I implemented it) but you can turn off NCQ by
doing the following.

# echo 1 > /sys/block/sdX/device/queue_depth

Can you put the seagate drive under load to verify that it's the samsung
drive's problem not the controller's?
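
To apply the setting to several drives at once, something like the following should do. It is a rough sketch: set_queue_depth and the SYSFS_ROOT override are invented for the example (the override only exists so the function can be tried against a scratch directory), and writing the real sysfs node needs root.

```shell
# Sketch only: set the queue depth for each named drive.
# SYSFS_ROOT is a hypothetical override; left unset, the real /sys is used.
set_queue_depth() {
    depth=$1
    shift
    for dev in "$@"; do
        node="${SYSFS_ROOT:-/sys}/block/$dev/device/queue_depth"
        if [ -w "$node" ]; then
            echo "$depth" > "$node"
            echo "$dev: queue_depth=$(cat "$node")"
        else
            echo "$dev: $node not writable, skipped" >&2
        fi
    done
}

# Example (as root): a depth of 1 turns NCQ off for these drives.
# set_queue_depth 1 sdb sdc sdd
```

The value is not kept across reboots, so it has to be reapplied from an init script if the workaround is meant to stick.
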

> Just an off topic question, does anyone know why I get so uneven IRQ
> handling on 2.6.19-20 and almost perfect on 2.6.20-rc2-mm1?

I dunno.  You have much better chance of getting a useful answer by
asking it on a separate thread with proper subject line.  People usually
screen threads by subject.  There are just too many messages in LKML for
anyone to follow all the messages.

Thanks.

-- 
tejun


Re: SATA problems

2007-01-04 Thread Pablo Sebastian Greco

Pablo Sebastian Greco wrote:
> Tejun Heo wrote:
>> Pablo Sebastian Greco wrote:
>>> By crash I mean the whole system going down, having to reset the entire
>>> machine.
>>> I'm sending you 4 files:
>>> dmesg: current boot dmesg, just a boot, because no errors appeared after
>>> last crash, since the server is out of production right now (errors
>>> usually appear under heavy load, and this primarily a transparent proxy
>>> for about 1000 simultaneous users)
>>> lspci: the way you asked for it
>>> messages and messages.1: files where you can see old boots and crashes
>>> (even a soft lockup).
>>> If there is anything else I can do, let me know. If you need direct
>>> access to the server, I can arrange that too.
>>
>> Can you try 2.6.20-rc3 and see if 'CLO not available' message goes away
>> (please post boot dmesg)?
>>
>> The crash/lock is because filesystem code does not cope with IO errors
>> very well.  I can't tell why timeouts are occurring in the first place.
>> It seems that only samsung drives are affected (sda2, 3, 4).  Hmmm...
>> Please apply the attached patch to 2.6.20-rc3 and test it.
>>
>> Thanks.
>
> Here's boot dmesg with 2.6.20-rc3 + blacklist. And you are right about
> only affecting samsung drives, but since only those drives get all the
> heavy load, couldn't tell exactly.
> I'm putting the server in production right now, so I think in a few
> hours I'll have more info.
>
> Thanks.
> Pablo.

After an uptime of  13:34 under heavy load and no errors, I'm pretty 
sure your patch is correct. Is there a way to backport this to 2.6.18.x?
Just an off topic question, does anyone know why I get so uneven IRQ 
handling on 2.6.19-20 and almost perfect on 2.6.20-rc2-mm1?


Thanks for everything.
Pablo.


Re: SATA problems

2007-01-04 Thread Pablo Sebastian Greco

Tejun Heo wrote:
> Pablo Sebastian Greco wrote:
>> By crash I mean the whole system going down, having to reset the entire
>> machine.
>> I'm sending you 4 files:
>> dmesg: current boot dmesg, just a boot, because no errors appeared after
>> last crash, since the server is out of production right now (errors
>> usually appear under heavy load, and this primarily a transparent proxy
>> for about 1000 simultaneous users)
>> lspci: the way you asked for it
>> messages and messages.1: files where you can see old boots and crashes
>> (even a soft lockup).
>> If there is anything else I can do, let me know. If you need direct
>> access to the server, I can arrange that too.
>
> Can you try 2.6.20-rc3 and see if 'CLO not available' message goes away
> (please post boot dmesg)?
>
> The crash/lock is because filesystem code does not cope with IO errors
> very well.  I can't tell why timeouts are occurring in the first place.
> It seems that only samsung drives are affected (sda2, 3, 4).  Hmmm...
> Please apply the attached patch to 2.6.20-rc3 and test it.
>
> Thanks.

  
Here's boot dmesg with 2.6.20-rc3 + blacklist. And you are right about 
only affecting samsung drives, but since only those drives get all the 
heavy load, couldn't tell exactly.
I'm putting the server in production right now, so I think in a few 
hours I'll have more info.


Thanks.
Pablo.


dmesg.bz2
Description: Binary data


Re: SATA problems

2007-01-03 Thread Tejun Heo
Pablo Sebastian Greco wrote:
> By crash I mean the whole system going down, having to reset the entire
> machine.
> I'm sending you 4 files:
> dmesg: current boot dmesg, just a boot, because no errors appeared after
> last crash, since the server is out of production right now (errors
> usually appear under heavy load, and this is primarily a transparent proxy
> for about 1000 simultaneous users)
> lspci: the way you asked for it
> messages and messages.1: files where you can see old boots and crashes
> (even a soft lockup).
> If there is anything else I can do, let me know. If you need direct
> access to the server, I can arrange that too.

Can you try 2.6.20-rc3 and see if 'CLO not available' message goes away
(please post boot dmesg)?

The crash/lock is because filesystem code does not cope with IO errors
very well.  I can't tell why timeouts are occurring in the first place.
It seems that only Samsung drives are affected (sda2, 3, 4).  Hmmm...
Please apply the attached patch to 2.6.20-rc3 and test it.

Thanks.

-- 
tejun
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 0d51d13..f8cf349 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -3327,6 +3327,8 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
 	/* NCQ is slow */
 	{ "WDC WD740ADFD-00",	NULL,		ATA_HORKAGE_NONCQ },
 
+	{ "SAMSUNG SP2504C",	NULL,		ATA_HORKAGE_NONCQ },
+
 	/* Devices with NCQ limits */
 
 	/* End Marker */
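
For readers unfamiliar with the kind of table the patch above extends, here
is a minimal user-space sketch of a model-string lookup that returns quirk
flags. It is not the kernel's code: the demo_* names, the flag value and the
exact-match string comparison are assumptions made for illustration. Only
the two model strings and the NULL firmware field are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag value; the real constant is defined in the kernel headers. */
#define DEMO_HORKAGE_NONCQ (1u << 2)

struct demo_blacklist_entry {
    const char *model;     /* model string reported by the drive */
    const char *firmware;  /* NULL is treated as "any firmware revision" */
    unsigned int horkage;  /* quirk flags to apply on a match */
};

static const struct demo_blacklist_entry demo_blacklist[] = {
    { "WDC WD740ADFD-00", NULL, DEMO_HORKAGE_NONCQ },
    { "SAMSUNG SP2504C",  NULL, DEMO_HORKAGE_NONCQ },
    { NULL, NULL, 0 }      /* end marker */
};

/* Return the quirk flags for a drive, or 0 if it is not listed. */
static unsigned int demo_lookup_horkage(const char *model, const char *firmware)
{
    const struct demo_blacklist_entry *e;

    assert(model != NULL);
    for (e = demo_blacklist; e->model != NULL; e++) {
        if (strcmp(e->model, model) != 0)
            continue;
        if (e->firmware == NULL)
            return e->horkage;  /* entry applies to every revision */
        if (firmware != NULL && strcmp(e->firmware, firmware) == 0)
            return e->horkage;
    }
    return 0;
}
```

In this sketch a caller would skip enabling NCQ whenever
demo_lookup_horkage(model, fw) & DEMO_HORKAGE_NONCQ is non-zero, which is
the effect the one-line patch is meant to have for the SP2504C.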


Re: SATA problems

2007-01-03 Thread Tejun Heo
Pablo Sebastian Greco wrote:
> First of all, thanks for everything, and my apologies if I'm doing
> anything wrong; this is my first lkml mail, but I've read all the FAQ,
> so it should be OK.
> This is the machine with the problem:
> 
> Intel ServerBoard S5000VSA
> Dual Core Xeon 2.66 (Intel(R) Xeon(TM) CPU 2.66GHz stepping 04)
> 4G Kingston
> 1 Seagate 80G sata (ST380211AS) (sda)
> 3 Samsung 250G sata (SAMSUNG SP2504C) (sdb,c,d)
> 
> Installed distribution is FC6 x86_64
> 
> I've been getting these messages with distribution and vanilla kernels
> 
> Jan  1 16:29:08 squid kernel: ata4.00: exception Emask 0x0 SAct
> 0x7fff SErr 0x0 action 0x2 frozen
> Jan  1 16:29:08 squid kernel: ata4.00: cmd
> 61/60:00:c9:6d:8e/00:00:0e:00:00/40 tag 0 cdb 0x0 data 49152 out
> Jan  1 16:29:08 squid kernel:  res
> 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
> Jan  1 16:29:08 squid kernel: ata4.00: cmd
> 60/08:08:f7:7d:56/00:00:0e:00:00/40 tag 1 cdb 0x0 data 4096 in
> Jan  1 16:29:08 squid kernel:  res
> 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> <snip>
> Jan  1 16:29:08 squid kernel: ata4: soft resetting port
> Jan  1 16:29:08 squid kernel: ata4: softreset failed (port busy but CLO
> unavailable)
> Jan  1 16:29:08 squid kernel: ata4: softreset failed, retrying in 5 secs
> Jan  1 16:29:13 squid kernel: ata4: hard resetting port
> Jan  1 16:29:21 squid kernel: ata4: port is slow to respond, please be
> patient (Status 0x80)
> Jan  1 16:29:43 squid kernel: ata4: port failed to respond (30 secs,
> Status 0x80)
> Jan  1 16:29:43 squid kernel: ata4: COMRESET failed (device not ready)
> Jan  1 16:29:43 squid kernel: ata4: hardreset failed, retrying in 5 secs
> Jan  1 16:29:48 squid kernel: ata4: hard resetting port
> Jan  1 16:29:49 squid kernel: ata4: SATA link up 3.0 Gbps (SStatus 123
> SControl 300)
> Jan  1 16:29:49 squid kernel: ata4.00: configured for UDMA/133
> Jan  1 16:29:49 squid kernel: ata4: EH complete
> Jan  1 16:29:49 squid kernel: SCSI device sdd: 488397168 512-byte hdwr
> sectors (250059 MB)
> Jan  1 16:29:49 squid kernel: sdd: Write Protect is off
> Jan  1 16:29:49 squid kernel: SCSI device sdd: write cache: enabled,
> read cache: enabled, doesn't support DPO or FUA
> 
> lots of them, and eventually crashing the system.
> Tested from the FC6 2.6.18 kernel to vanilla 2.6.20-rc2-mm1. Old kernels
> just crash; newer ones log these messages and then crash.
> I don't want to flood this mail with useless info, so please tell
> me what to send and I'll do it (dmesg, smartctl... you name it).
> BTW, memtest ran for about 2 days without errors, and
> badblocks on all 4 drives returned nothing. Reallocated_Sector_Ct
> raw_value was 0 on all 4 drives.

Please post full dmesg and the result of 'lspci -nnvvv'.  And what do
you mean by 'crash'?
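
The "SStatus 123" value in the quoted log can be split into its fields. A
minimal sketch, assuming the standard SATA SStatus layout (DET in bits 3:0,
SPD in bits 7:4, IPM in bits 11:8) and assuming the log prints the register
in hex. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct demo_sstatus {
    unsigned det;  /* device detection state */
    unsigned spd;  /* negotiated link speed generation */
    unsigned ipm;  /* interface power management state */
};

/* Split a raw SStatus value into its three 4-bit fields. */
static struct demo_sstatus demo_decode_sstatus(uint32_t sstatus)
{
    struct demo_sstatus s;

    s.det = sstatus & 0xf;
    s.spd = (sstatus >> 4) & 0xf;
    s.ipm = (sstatus >> 8) & 0xf;
    return s;
}

/* Map the SPD field to a link speed in Mbps; 0 means "not recognised here". */
static unsigned demo_link_speed_mbps(unsigned spd)
{
    assert(spd <= 0xf);
    switch (spd) {
    case 1:
        return 1500;  /* Gen1, 1.5 Gbps */
    case 2:
        return 3000;  /* Gen2, 3.0 Gbps */
    default:
        return 0;
    }
}
```

Decoding 0x123 this way gives DET=3 (device present, link established),
SPD=2 and IPM=1 (active), which agrees with the "SATA link up 3.0 Gbps"
text on the same log line.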

-- 
tejun


SATA problems

2007-01-02 Thread Pablo Sebastian Greco
First of all, thanks for everything, and my apologies if I'm doing
anything wrong; this is my first lkml mail, but I've read all the FAQ,
so it should be OK.

This is the machine with the problem:

Intel ServerBoard S5000VSA
Dual Core Xeon 2.66 (Intel(R) Xeon(TM) CPU 2.66GHz stepping 04)
4G Kingston
1 Seagate 80G sata (ST380211AS) (sda)
3 Samsung 250G sata (SAMSUNG SP2504C) (sdb,c,d)

Installed distribution is FC6 x86_64

I've been getting these messages with distribution and vanilla kernels

Jan  1 16:29:08 squid kernel: ata4.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x2 frozen
Jan  1 16:29:08 squid kernel: ata4.00: cmd 61/60:00:c9:6d:8e/00:00:0e:00:00/40 tag 0 cdb 0x0 data 49152 out
Jan  1 16:29:08 squid kernel:  res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Jan  1 16:29:08 squid kernel: ata4.00: cmd 60/08:08:f7:7d:56/00:00:0e:00:00/40 tag 1 cdb 0x0 data 4096 in
Jan  1 16:29:08 squid kernel:  res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
<snip>
Jan  1 16:29:08 squid kernel: ata4: soft resetting port
Jan  1 16:29:08 squid kernel: ata4: softreset failed (port busy but CLO unavailable)
Jan  1 16:29:08 squid kernel: ata4: softreset failed, retrying in 5 secs
Jan  1 16:29:13 squid kernel: ata4: hard resetting port
Jan  1 16:29:21 squid kernel: ata4: port is slow to respond, please be patient (Status 0x80)
Jan  1 16:29:43 squid kernel: ata4: port failed to respond (30 secs, Status 0x80)
Jan  1 16:29:43 squid kernel: ata4: COMRESET failed (device not ready)
Jan  1 16:29:43 squid kernel: ata4: hardreset failed, retrying in 5 secs
Jan  1 16:29:48 squid kernel: ata4: hard resetting port
Jan  1 16:29:49 squid kernel: ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan  1 16:29:49 squid kernel: ata4.00: configured for UDMA/133
Jan  1 16:29:49 squid kernel: ata4: EH complete
Jan  1 16:29:49 squid kernel: SCSI device sdd: 488397168 512-byte hdwr sectors (250059 MB)
Jan  1 16:29:49 squid kernel: sdd: Write Protect is off
Jan  1 16:29:49 squid kernel: SCSI device sdd: write cache: enabled, read cache: enabled, doesn't support DPO or FUA


lots of them, and eventually crashing the system.
Tested from the FC6 2.6.18 kernel to vanilla 2.6.20-rc2-mm1. Old kernels
just crash; newer ones log these messages and then crash.
I don't want to flood this mail with useless info, so please tell
me what to send and I'll do it (dmesg, smartctl... you name it).
BTW, memtest ran for about 2 days without errors, and
badblocks on all 4 drives returned nothing. Reallocated_Sector_Ct
raw_value was 0 on all 4 drives.
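
The "cmd" fields in the log above can be decoded by hand. Below is a minimal
sketch, assuming the field order is command/feature:nsect:lbal:lbam:lbah,
then the five high-order ("hob") bytes in the same order, then the device
byte, and assuming the usual NCQ convention that the sector count travels in
the feature bytes while the tag sits in bits 7:3 of the sector-count byte.
The demo_* names are hypothetical and nothing here is taken from libata
itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct demo_taskfile {
    unsigned command, feature, nsect, lbal, lbam, lbah;
    unsigned hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah;
    unsigned device;
};

/* Parse "cc/ff:nn:ll:mm:hh/ff:nn:ll:mm:hh/dd"; 0 on success, -1 on error. */
static int demo_parse_taskfile(const char *text, struct demo_taskfile *tf)
{
    int n;

    assert(text != NULL && tf != NULL);
    n = sscanf(text, "%2x/%2x:%2x:%2x:%2x:%2x/%2x:%2x:%2x:%2x:%2x/%2x",
               &tf->command, &tf->feature, &tf->nsect,
               &tf->lbal, &tf->lbam, &tf->lbah,
               &tf->hob_feature, &tf->hob_nsect,
               &tf->hob_lbal, &tf->hob_lbam, &tf->hob_lbah,
               &tf->device);
    return n == 12 ? 0 : -1;
}

/* Assemble the 48-bit LBA from the low and high-order byte triples. */
static uint64_t demo_lba48(const struct demo_taskfile *tf)
{
    return ((uint64_t)tf->hob_lbah << 40) | ((uint64_t)tf->hob_lbam << 32) |
           ((uint64_t)tf->hob_lbal << 24) | ((uint64_t)tf->lbah << 16) |
           ((uint64_t)tf->lbam << 8) | (uint64_t)tf->lbal;
}

/* NCQ commands: sector count is carried in the feature bytes. */
static unsigned demo_ncq_sectors(const struct demo_taskfile *tf)
{
    return (tf->hob_feature << 8) | tf->feature;
}

/* NCQ commands: the tag is carried in bits 7:3 of the sector-count byte. */
static unsigned demo_ncq_tag(const struct demo_taskfile *tf)
{
    return tf->nsect >> 3;
}
```

Under those assumptions the first command (0x61, a queued write) decodes to
96 sectors, i.e. 96 * 512 = 49152 bytes, with tag 0, and the second (0x60, a
queued read) to 8 sectors, i.e. 4096 bytes, with tag 1. Both agree with the
"data ... out/in" and "tag" values printed on the same lines.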


Thanks in advance.
Pablo.

