Gabor Gombas wrote:
On Mon, Jan 07, 2008 at 06:10:29PM -0600, Robert Hancock wrote:
Gabor, I just noticed you said that it worked OK in 2.6.20, yet 2.6.22
fails. 2.6.20 had ADMA support as well, so I wonder what change started
causing the problem. Would it be possible for you to do a git
On Mon, Jan 07, 2008 at 06:10:29PM -0600, Robert Hancock wrote:
> Gabor, I just noticed you said that it worked OK in 2.6.20, yet 2.6.22
> fails. 2.6.20 had ADMA support as well, so I wonder what change started
> causing the problem. Would it be possible for you to do a git bisect (or
> at
On Mon, Jan 07, 2008 at 06:10:29PM -0600, Robert Hancock wrote:
Gabor, I just noticed you said that it worked OK in 2.6.20, yet 2.6.22
fails. 2.6.20 had ADMA support as well, so I wonder what change started
causing the problem. Would it be possible for you to do a git bisect (or
at
Allen Martin wrote:
Dunno about the NVidia version.
Theirs works rather differently - the GO bit is there, but there's
another append register which is used to tell the controller
that a new
tag has been added to the CPB list.
The only thing we currently use the GO bit for is to switch
On Thu, 2008-01-03 at 19:43 -0600, Robert Hancock wrote:
> Benjamin Herrenschmidt wrote:
> >> Another thing about the PacDigi core: one has to be very careful
> >> to avoid sequential accesses to sequential PCI locations when
> >> programming the chip -- it cannot handle merged register writes.
Benjamin Herrenschmidt wrote:
Another thing about the PacDigi core: one has to be very careful
to avoid sequential accesses to sequential PCI locations when
programming the chip -- it cannot handle merged register writes.
So for any group of sequentially laid out registers, the code has
to
> > Dunno about the NVidia version.
>
> Theirs works rather differently - the GO bit is there, but there's
> another append register which is used to tell the controller
> that a new
> tag has been added to the CPB list.
>
> The only thing we currently use the GO bit for is to switch
>
> Another thing about the PacDigi core: one has to be very careful
> to avoid sequential accesses to sequential PCI locations when
> programming the chip -- it cannot handle merged register writes.
>
> So for any group of sequentially laid out registers, the code has
> to ensure it never writes
Mark Lord wrote:
Robert Hancock wrote:
Mark Lord wrote:
Robert Hancock wrote:
..
From some of the traces I took previously (posted on LKML as "sata_nv ADMA
controller lockup investigation" way back in Feb 07), what seems to occur is that
when the second command is issued very rapidly
Robert Hancock wrote:
Mark Lord wrote:
Robert Hancock wrote:
..
From some of the traces I took previously (posted on LKML as
"sata_nv ADMA controller lockup investigation" way back in Feb 07),
what seems to occur is that when the second command is issued very
rapidly (within less than 20
Robert Hancock wrote:
Mark Lord wrote:
Robert Hancock wrote:
..
From some of the traces I took previously (posted on LKML as
sata_nv ADMA controller lockup investigation way back in Feb 07),
what seems to occur is that when the second command is issued very
rapidly (within less than 20
Mark Lord wrote:
Robert Hancock wrote:
Mark Lord wrote:
Robert Hancock wrote:
..
From some of the traces I took previously (posted on LKML as sata_nv ADMA
controller lockup investigation way back in Feb 07), what seems to occur is that
when the second command is issued very rapidly (within
Another thing about the PacDigi core: one has to be very careful
to avoid sequential accesses to sequential PCI locations when
programming the chip -- it cannot handle merged register writes.
So for any group of sequentially laid out registers, the code has
to ensure it never writes two
Dunno about the NVidia version.
Theirs works rather differently - the GO bit is there, but there's
another append register which is used to tell the controller
that a new
tag has been added to the CPB list.
The only thing we currently use the GO bit for is to switch
between ADMA
On Thu, 2008-01-03 at 19:43 -0600, Robert Hancock wrote:
Benjamin Herrenschmidt wrote:
Another thing about the PacDigi core: one has to be very careful
to avoid sequential accesses to sequential PCI locations when
programming the chip -- it cannot handle merged register writes.
So for
Mark Lord wrote:
Robert Hancock wrote:
..
From some of the traces I took previously (posted on LKML as "sata_nv
ADMA controller lockup investigation" way back in Feb 07), what seems
to occur is that when the second command is issued very rapidly
(within less than 20 microseconds, or
Robert Hancock wrote:
..
From some of the traces I took previously (posted on LKML as "sata_nv
ADMA controller lockup investigation" way back in Feb 07), what seems to
occur is that when the second command is issued very rapidly (within
less than 20 microseconds, or potentially longer) after
Robert Hancock wrote:
What we're doing to enter legacy mode is essentially:
-wait until ADMA status indicates IDLE bit set (max wait of 1 microsecond)
-clear GO bit in control register
-wait until status indicates LEGACY bit set (max wait of 1 microsecond)
and to enter ADMA mode:
-set GO bit
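The legacy-mode entry steps listed above can be sketched roughly as follows. This is only a simulation under stated assumptions, not the sata_nv code: the bit positions, the `mock_port` structure and the function names are made up for illustration, a poll count stands in for the 1 microsecond limit, and giving up when a wait times out is an assumption, since the excerpt does not say what happens in that case. The ADMA-mode entry sequence is cut off in the excerpt after "set GO bit", so it is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define CTL_GO      (1u << 0)
#define STAT_IDLE   (1u << 0)
#define STAT_LEGACY (1u << 1)

/* Simulated port; a real driver would do MMIO reads and writes instead. */
struct mock_port {
    uint32_t ctl;
    uint32_t status;
    int legacy_latency; /* status polls that pass before LEGACY shows up */
};

/* Simulated status read: LEGACY appears some polls after GO is cleared. */
static uint32_t read_status(struct mock_port *p)
{
    if (!(p->ctl & CTL_GO)) {
        if (p->legacy_latency > 0)
            p->legacy_latency--;
        else
            p->status |= STAT_LEGACY;
    }
    return p->status;
}

/* Bounded wait: poll at most max_polls times for the given status bit. */
static bool poll_status(struct mock_port *p, uint32_t bit, int max_polls)
{
    assert(max_polls > 0);
    for (int i = 0; i < max_polls; i++) {
        if (read_status(p) & bit)
            return true;
    }
    return false;
}

/* Wait for IDLE, clear GO, then wait for LEGACY. Returns false on timeout. */
static bool enter_legacy_mode(struct mock_port *p, int max_polls)
{
    if (!poll_status(p, STAT_IDLE, max_polls))
        return false;            /* assumption: give up if never idle */
    p->ctl &= ~CTL_GO;
    return poll_status(p, STAT_LEGACY, max_polls);
}
```

Calling `enter_legacy_mode()` on a port that starts with GO set and IDLE reported succeeds once the simulated hardware raises LEGACY within the poll budget. With a latency larger than the budget it returns false, which is roughly the situation the thread is trying to track down.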
Tejun Heo wrote:
Robert Hancock wrote:
Jeff Garzik wrote:
Tejun Heo wrote:
Thanks a lot for the detailed explanation. Nvidia ppl, any ideas?
FLUSH is used regularly. We really need to fix this.
I reiterate my opinion :) ... We should remove ADMA support from
sata_nv. It's only in a
Allen Martin wrote:
The software definitely provides that guarantee for all NCQ-capable
controllers.
Well if that's not it, it must be some problem entering ADMA legacy
mode. Here's what the Windows driver does:
ADMACtrl.aGO = 0
ADMACtrl.aEIEN = 0
poll {
until ADMAStatus.aLGCY = 1 ||
> The software definitely provides that guarantee for all NCQ-capable
> controllers.
>
Well if that's not it, it must be some problem entering ADMA legacy
mode. Here's what the Windows driver does:
ADMACtrl.aGO = 0
ADMACtrl.aEIEN = 0
poll {
until ADMAStatus.aLGCY = 1 || timeout
}
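For comparison, the quoted Windows driver sequence could be written out in C along these lines. This is a sketch that covers only the lines visible in the excerpt. The field names follow the pseudocode, but the bitfield layout, the `adma_regs` structure, the simulated status read and the poll budget are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Field names follow the quoted pseudocode; widths and layout are assumed. */
struct adma_ctrl   { unsigned aGO : 1; unsigned aEIEN : 1; };
struct adma_status { unsigned aLGCY : 1; };

struct adma_regs {
    struct adma_ctrl ctrl;
    struct adma_status status;
    int polls_until_legacy; /* simulation knob; negative means never */
};

/* Simulated status read: aLGCY is raised some polls after aGO is cleared. */
static struct adma_status read_adma_status(struct adma_regs *r)
{
    if (!r->ctrl.aGO && r->polls_until_legacy >= 0) {
        if (r->polls_until_legacy == 0)
            r->status.aLGCY = 1;
        else
            r->polls_until_legacy--;
    }
    return r->status;
}

/* Clear aGO and aEIEN, then poll until aLGCY is set or the budget runs out. */
static bool windows_style_enter_legacy(struct adma_regs *r, int max_polls)
{
    r->ctrl.aGO = 0;
    r->ctrl.aEIEN = 0;
    for (int i = 0; i < max_polls; i++) {
        if (read_adma_status(r).aLGCY)
            return true;
    }
    return false; /* timeout */
}
```

Note that, as quoted, this sequence clears aGO right away and also clears aEIEN, with no wait before the write to the control register.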
Allen Martin wrote:
The question I had for NVIDIA regarding this that I never got
answered
was, is there any reason why we would need a delay when switching
between NCQ and non-NCQ commands on ADMA, and if not, is
there any known
cause that could cause the controller to get into this
> The question I had for NVIDIA regarding this that I never got
> answered
> was, is there any reason why we would need a delay when switching
> between NCQ and non-NCQ commands on ADMA, and if not, is
> there any known
> cause that could cause the controller to get into this seemingly
>
The question I had for NVIDIA regarding this that I never got
answered
was, is there any reason why we would need a delay when switching
between NCQ and non-NCQ commands on ADMA, and if not, is
there any known
cause that could cause the controller to get into this seemingly
locked-up
The software definitely provides that guarantee for all NCQ-capable
controllers.
Well if that's not it, it must be some problem entering ADMA legacy
mode. Here's what the Windows driver does:
ADMACtrl.aGO = 0
ADMACtrl.aEIEN = 0
poll {
until ADMAStatus.aLGCY = 1 || timeout
}
Robert Hancock wrote:
..
From some of the traces I took previously (posted on LKML as sata_nv
ADMA controller lockup investigation way back in Feb 07), what seems to
occur is that when the second command is issued very rapidly (within
less than 20 microseconds, or potentially longer) after
Mark Lord wrote:
Robert Hancock wrote:
..
From some of the traces I took previously (posted on LKML as sata_nv
ADMA controller lockup investigation way back in Feb 07), what seems
to occur is that when the second command is issued very rapidly
(within less than 20 microseconds, or
Robert Hancock wrote:
> Jeff Garzik wrote:
>> Tejun Heo wrote:
>>> Thanks a lot for the detailed explanation. Nvidia ppl, any ideas?
>>> FLUSH is used regularly. We really need to fix this.
>>
>>
>> I reiterate my opinion :) ... We should remove ADMA support from
>> sata_nv. It's only in a
Jeff Garzik wrote:
Tejun Heo wrote:
Thanks a lot for the detailed explanation. Nvidia ppl, any ideas?
FLUSH is used regularly. We really need to fix this.
I reiterate my opinion :) ... We should remove ADMA support from
sata_nv. It's only in a few chips, it's not appearing in any new
Tejun Heo wrote:
Thanks a lot for the detailed explanation. Nvidia ppl, any ideas?
FLUSH is used regularly. We really need to fix this.
I reiterate my opinion :) ... We should remove ADMA support from
sata_nv. It's only in a few chips, it's not appearing in any new chips,
and nasty
Robert Hancock wrote:
>> This is kind of a longstanding problem which has been partially worked
>> around, but it seems not entirely. This is what I had diagnosed some
>> time ago:
>>
>> "recently, some issues cropped up with command timeouts when a cache
>> flush command was immediately followed
Robert Hancock wrote:
Tejun Heo wrote:
[cc'ing Robert Hancock and NVidia people]
Whole thread can be read from the following URL.
http://thread.gmane.org/gmane.linux.ide/21710
In a nutshell, with ADMA enabled, FLUSH_EXT occasionally times out. I
first suspected faulty disk (reallocation
Tejun Heo wrote:
[cc'ing Robert Hancock and NVidia people]
Whole thread can be read from the following URL.
http://thread.gmane.org/gmane.linux.ide/21710
In a nutshell, with ADMA enabled, FLUSH_EXT occasionally times out. I
first suspected faulty disk (reallocation failure on flush) but
[cc'ing Robert Hancock and NVidia people]
Whole thread can be read from the following URL.
http://thread.gmane.org/gmane.linux.ide/21710
In a nutshell, with ADMA enabled, FLUSH_EXT occasionally times out. I
first suspected faulty disk (reallocation failure on flush) but SMART
reports nothing
Hi,
Just FYI I've tried to enable ADMA again (now running 2.6.24-rc6) but
the bug is still present:
Jan 1 16:11:21 host kernel: ata7: EH in ADMA mode, notifier 0x0 notifier_error
0x0 gen_ctl 0x1501000 status 0x400 next cpb count 0x0 next cpb idx 0x0
Jan 1 16:11:21 host kernel: ata7: CPB 0:
Robert Hancock wrote:
This is kind of a longstanding problem which has been partially worked
around, but it seems not entirely. This is what I had diagnosed some
time ago:
recently, some issues cropped up with command timeouts when a cache
flush command was immediately followed by an NCQ
Robert Hancock wrote:
Jeff Garzik wrote:
Tejun Heo wrote:
Thanks a lot for the detailed explanation. Nvidia ppl, any ideas?
FLUSH is used regularly. We really need to fix this.
I reiterate my opinion :) ... We should remove ADMA support from
sata_nv. It's only in a few chips, it's
Gabor Gombas wrote:
> On Tue, Aug 14, 2007 at 06:30:28PM +0900, Tejun Heo wrote:
> > Hmmm... That's timeout on cache flush, indicative of failing disk.
> > Please post the result of 'smartctl -a /dev/sdc'.
>
> Ok, so something is fishy in 2.6.22 wrt. SMART.
See http://lkml.org/lkml/2007/7/8/198
Hi,
On Tue, Aug 14, 2007 at 06:30:28PM +0900, Tejun Heo wrote:
> Hmmm... That's timeout on cache flush, indicative of failing disk.
> Please post the result of 'smartctl -a /dev/sdc'.
Ok, so something is fishy in 2.6.22 wrt. SMART.
First, booting back to 2.6.20.5 I confirmed that SMART works
Hi,
On Tue, Aug 14, 2007 at 06:30:28PM +0900, Tejun Heo wrote:
Hmmm... That's timeout on cache flush, indicative of failing disk.
Please post the result of 'smartctl -a /dev/sdc'.
Ok, so something is fishy in 2.6.22 wrt. SMART.
First, booting back to 2.6.20.5 I confirmed that SMART works
Gabor Gombas wrote:
On Tue, Aug 14, 2007 at 06:30:28PM +0900, Tejun Heo wrote:
Hmmm... That's timeout on cache flush, indicative of failing disk.
Please post the result of 'smartctl -a /dev/sdc'.
Ok, so something is fishy in 2.6.22 wrt. SMART.
See http://lkml.org/lkml/2007/7/8/198
-jim
On Tue, Aug 14, 2007 at 06:30:28PM +0900, Tejun Heo wrote:
> Hmmm... That's timeout on cache flush, indicative of failing disk.
> Please post the result of 'smartctl -a /dev/sdc'.
Will do when I get home. Note however that this only occurs in ADMA
mode. It never occurred with 2.6.20 and it never
Gabor Gombas wrote:
> Hi,
>
> Since I have upgraded to 2.6.22.1 from 2.6.20 I have problems with
> Samsung disks. Sometimes the disks stall for about half a minute and
> then I have these messages in the logs:
>
> Aug 6 20:10:11 twister kernel: ata7: EH in ADMA mode, notifier 0x0
>
Gabor Gombas wrote:
Hi,
Since I have upgraded to 2.6.22.1 from 2.6.20 I have problems with
Samsung disks. Sometimes the disks stall for about half a minute and
then I have these messages in the logs:
Aug 6 20:10:11 twister kernel: ata7: EH in ADMA mode, notifier 0x0
notifier_error 0x0
On Tue, Aug 14, 2007 at 06:30:28PM +0900, Tejun Heo wrote:
Hmmm... That's timeout on cache flush, indicative of failing disk.
Please post the result of 'smartctl -a /dev/sdc'.
Will do when I get home. Note however that this only occurs in ADMA
mode. It never occurred with 2.6.20 and it never