Re: Bug#419482: Kernel 2.6.18 - ALI15X3 driver too optmistic about UDMA

2007-09-08 Thread Alan Cox
>     if (m5229_revision <= 0x20) {
>         return 0;
>     } else if ((m5229_revision < 0xC2) &&

So 0xC1 takes this path

> Looking back at the equivalent code in 2.4.27 (the previous kernel
> this machine ran), that's rather different:
>
>     if (m5229_revision < 0xC1) {    /* According to ALi */
>         return 0;
>     } else if ((m5229_revision < 0xC2) &&

And 0xC1 takes the same path.

> So it would seem there has been a regression here - the assumption now
> is that versions between 0x20 and 0xC1 can use UDMA fine unless there
> is a WDC drive attached, but the old code wouldn't try UDMA at all on
> chips older than rev C1.

There are no versions between 0x21 and 0xC0.
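
For anyone who wants to check this by hand, the two revision tests can be compared in a small standalone sketch. It models only the branch built without CONFIG_WDC_ALI15X3; `struct drive_info`, `drive_blocks_udma()` and `kernels_disagree()` are hypothetical helpers invented for the illustration, not kernel code.

```c
#include <string.h>

/* Hypothetical stand-in for the fields the check reads; not ide_drive_t. */
struct drive_info {
    const char *model;  /* identify string, e.g. "SAMSUNG SP0411N" */
    int is_disk;        /* nonzero when the media type is ide_disk */
};

/* Drive-specific clause shared by both kernels (CONFIG_WDC_ALI15X3 unset). */
static int drive_blocks_udma(int chip_is_1543c_e, const struct drive_info *d)
{
    return (chip_is_1543c_e && strstr(d->model, "WDC ") != NULL) ||
           !d->is_disk;
}

/* Revision logic as quoted from 2.6.18. */
static int can_ultra_2_6_18(unsigned rev, int chip_is_1543c_e,
                            const struct drive_info *d)
{
    if (rev <= 0x20)
        return 0;
    if (rev < 0xC2 && drive_blocks_udma(chip_is_1543c_e, d))
        return 0;
    return 1;
}

/* Revision logic as quoted from 2.4.27. */
static int can_ultra_2_4_27(unsigned rev, int chip_is_1543c_e,
                            const struct drive_info *d)
{
    if (rev < 0xC1)
        return 0;
    if (rev < 0xC2 && drive_blocks_udma(chip_is_1543c_e, d))
        return 0;
    return 1;
}

/* Nonzero when the two kernels give different answers for this input. */
static int kernels_disagree(unsigned rev, int chip_is_1543c_e,
                            const struct drive_info *d)
{
    return can_ultra_2_6_18(rev, chip_is_1543c_e, d) !=
           can_ultra_2_4_27(rev, chip_is_1543c_e, d);
}
```

In this model a non-WDC disk on a rev 0xC1 chip gets the same answer from both versions; the two only differ for revisions 0x21 through 0xC0.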

> I have the machine out and ready to experiment with if any more
> details are needed to help solve this problem.

Interesting report as we've had essentially no corruption reports
equivalent to this on common architectures for a long time and the
hardware is in a huge number of PC systems. Also UDMA transfers are CRC
protected by hardware at each end.
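
As a rough illustration of why a CRC on each burst makes silent corruption on the cable unlikely, here is a minimal CRC-16 sketch. It uses the common CCITT polynomial with the seed passed in as a parameter; the exact seed and bit ordering used by ATA hardware should be taken from the ATA specification, and `transfer_corrupted()` is a hypothetical helper, not anything in the driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-16, polynomial x^16 + x^12 + x^5 + 1 (0x1021), MSB first.
 * Illustrative only: not claimed to match what ATA host and device
 * hardware actually compute.
 */
static uint16_t crc16_ccitt(const uint8_t *data, size_t len, uint16_t seed)
{
    uint16_t crc = seed;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Nonzero if the receiver's CRC no longer matches the sender's. */
static int transfer_corrupted(const uint8_t *sent, const uint8_t *received,
                              size_t len)
{
    return crc16_ccitt(sent, len, 0xFFFF) !=
           crc16_ccitt(received, len, 0xFFFF);
}
```

A check like this only covers the data as it crosses the cable. Damage introduced after the check, for example in the host's DMA path or by a byte swap, would not be caught by it.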

That makes me wonder if you have a platform or endian bug, or indeed if
your firmware isn't setting up the whole chipset as required by the ALi
chipset and BIOS documentation (which unfortunately is under NDA).
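
One cheap way to tell an endian problem from random corruption is to look at the signature bytes of the corrupted sector 0. The sketch below assumes an MS-DOS style partition table (signature bytes 0x55 then 0xAA at offsets 510 and 511); `classify_sector0()` and the enum are hypothetical names made up for the example.

```c
#include <stdint.h>

enum sig_state { SIG_OK, SIG_BYTE_SWAPPED, SIG_OTHER };

/*
 * Classify the last two bytes of a 512-byte sector 0, assuming an MS-DOS
 * style partition table whose signature is 0x55 followed by 0xAA.
 */
static enum sig_state classify_sector0(const uint8_t sector[512])
{
    if (sector[510] == 0x55 && sector[511] == 0xAA)
        return SIG_OK;
    if (sector[510] == 0xAA && sector[511] == 0x55)
        return SIG_BYTE_SWAPPED;   /* consistent with a 16-bit byte swap */
    return SIG_OTHER;              /* some other kind of damage */
}
```

A swapped signature would point towards byte ordering in the data path; anything else suggests looking at chipset setup or timing instead.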

Alan
-
To unsubscribe from this list: send the line unsubscribe linux-ide in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug#419482: Kernel 2.6.18 - ALI15X3 driver too optmistic about UDMA

2007-09-06 Thread Steve McIntyre
Hi,

I reported this problem initially to the Debian BTS back in April, but
I've not had a chance to follow up on it since - the moment I got the
system up and running again, it needed to keep running as a build
daemon for us. Now I've got some downtime to allow me to delve
further...

On the cats machines that we use for arm buildd work, it seems the
kernel is too aggressive in enabling UDMA support for the onboard IDE
chip:

  ALi Corporation M5229 IDE (rev c1)

aka

00:11.0 IDE interface: ALi Corporation M5229 IDE (rev c1) (prog-if fa)
        Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B-
        Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (500ns min, 1000ns max)
        Interrupt: pin A routed to IRQ 31
        Region 4: I/O ports at 10e0 [size=16]

So much so, that simple data transfers are corrupted - the very first
read from a disk to find the partition table in sector 0 is corrupted
and the machine fails to boot. I've debugged through this to verify
the problem, and for these machines for now I have built a custom
kernel with UDMA disabled altogether. It seems that the code in
ali15x3_can_ultra() is to blame:

static u8 ali15x3_can_ultra (ide_drive_t *drive)
{
#ifndef CONFIG_WDC_ALI15X3
        struct hd_driveid *id   = drive->id;
#endif /* CONFIG_WDC_ALI15X3 */

        return 0;

        if (m5229_revision <= 0x20) {
                return 0;
        } else if ((m5229_revision < 0xC2) &&
#ifndef CONFIG_WDC_ALI15X3
                   ((chip_is_1543c_e && strstr(id->model, "WDC ")) ||
                    (drive->media!=ide_disk))) {
#else /* CONFIG_WDC_ALI15X3 */
                   (drive->media!=ide_disk)) {
#endif /* CONFIG_WDC_ALI15X3 */
                return 0;
        } else {
                return 1;
        }
}

Looking back at the equivalent code in 2.4.27 (the previous kernel
this machine ran), that's rather different:

static u8 ali15x3_can_ultra (ide_drive_t *drive)
{
#ifndef CONFIG_WDC_ALI15X3
        struct hd_driveid *id   = drive->id;
#endif /* CONFIG_WDC_ALI15X3 */

        if (m5229_revision < 0xC1) {    /* According to ALi */
                return 0;
        } else if ((m5229_revision < 0xC2) &&
#ifndef CONFIG_WDC_ALI15X3
                   ((chip_is_1543c_e && strstr(id->model, "WDC ")) ||
                    (drive->media!=ide_disk))) {
#else /* CONFIG_WDC_ALI15X3 */
                   (drive->media!=ide_disk)) {
#endif /* CONFIG_WDC_ALI15X3 */
                return 0;
        } else {
                return 1;
        }
}

So it would seem there has been a regression here - the assumption now
is that versions between 0x20 and 0xC1 can use UDMA fine unless there
is a WDC drive attached, but the old code wouldn't try UDMA at all on
chips older than rev C1.

In case it's relevant, I have a Samsung drive attached:

hda: SAMSUNG SP0411N, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
Probing IDE interface ide1...
hda: max request size: 128KiB
hda: 78242976 sectors (40060 MB) w/2048KiB Cache, CHS=16383/255/63, (U)DMA
hda: cache flushes supported
 hda: hda1 hda2

I have the machine out and ready to experiment with if any more
details are needed to help solve this problem.

-- 
Steve McIntyre, Cambridge, UK.                [EMAIL PROTECTED]
Every time you use Tcl, God kills a kitten. -- Malcolm Ray
