Re: [OT] Major Clock Drift

2001-02-11 Thread Peter Horton

On Sun, Feb 11, 2001 at 12:06:14PM +0100, Pavel Machek wrote:
> 
> > > > > >I've discovered that heavy use of vesafb can be a major source of clock
> > > > > >drift on my system, especially if I don't specify "ypan" or "ywrap". On my
> > > > >
> > > > > This is extremely interesting. What version of ntp are you using?
> > > >
> > > > Is vesafb one of the drivers which blocks interrupts for (many) tens
> > > > of milliseconds?
> > > 
> > > Vesafb is happy to block interrupts for half a second.
> > 
> > And has this been observed to cause clock drift?
> 
> Yes. I've seen time running 3 times slower. Just do cat /etc/termcap
> with loaded PCI bus. Yesterday I lost 20 minutes during 2 hours -- I
> have been using USB (load PCI) and framebuffer.
>   Pavel

Is it not possible to save and check the TSC in the timer interrupt to
correct for dropped interrupts? (Obviously only on machines that have a
TSC ...)
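
A minimal sketch of the idea, not kernel code: if the handler saves the TSC
on every tick, the cycle count elapsed since the previous tick tells it how
many ticks went missing while interrupts were blocked. The function name and
the notion of a pre-calibrated cycles_per_tick value are assumptions made
for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: estimate how many timer ticks were dropped between
 * two timer interrupts.
 *
 *   tsc_prev        TSC value saved at the previous timer interrupt
 *   tsc_now         TSC value read in the current timer interrupt
 *   cycles_per_tick calibrated TSC cycles per timer tick (assumed known)
 */
uint64_t lost_ticks(uint64_t tsc_prev, uint64_t tsc_now,
                    uint64_t cycles_per_tick)
{
    uint64_t elapsed, ticks;

    if (cycles_per_tick == 0)
        return 0;                   /* not calibrated: apply no correction */

    elapsed = tsc_now - tsc_prev;   /* unsigned math survives a wraparound */

    /* Round to nearest so ordinary jitter is not counted as a lost tick. */
    ticks = (elapsed + cycles_per_tick / 2) / cycles_per_tick;

    /* One tick is the interrupt being handled now; the rest were dropped. */
    return ticks > 1 ? ticks - 1 : 0;
}
```

As an illustration, an 850MHz CPU with a 100Hz timer has 8500000 cycles per
tick, so half a second with interrupts blocked would show up as 49 lost
ticks that the handler could add back to the clock.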

P.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



Re: Easy Way to FS-corruption

2001-02-11 Thread Peter Horton

On Sat, Feb 10, 2001 at 03:24:06PM +0100, Tim Krieglstein wrote:
> 
> I found a way which seems to lead to an "easy" way of fs-corruption:
> Install two sound-cards, use the newest ALSA-Drivers 0.5.10b 
> (the standard sound drivers don't work too well with sf) and

[snip]

This could be that bus master DMA caching problem that showed up on my
KT133 A7V (see previous threads re: 'VIA silent disk corruption'). You
could try more conservative BIOS chipset settings (my problem went away
with "normal" rather than "optimal" BIOS settings). In the end Asus
released a BIOS update for the A7V that seems to have fixed it
permanently.

P.



Re: Linux 2.4.1ac9

2001-02-10 Thread Peter Horton

On Fri, Feb 09, 2001 at 11:01:13PM +0000, Alan Cox wrote:
> > I've noticed that -ac9 comes with the "Disable PCI-Master-Read-Caching
> > on VIA" patch that Peter Horton posted a while back. I don't know
> > whether it was applied in Linus' or your tree first, but is it
> > actually verified to fix anything?
> 
> Not yet. As the story becomes clear it can either be dropped or pushed
> on
> 

It should be dropped I think ...

Different folks found that changing different settings fixed it for
them, so it looks like some kind of internal race in the North bridge
where changing the timings in any way makes it harder to reproduce.

The updated BIOS from Asus definitely fixes it for me, and "PCI Master
Read Caching" is *enabled*. There are quite a few differences in the
setup of the North bridge from the previous BIOS to this one, and I
assume the changes were suggested by VIA.

If there are other people out there who still have this problem we can
probably come up with a patch for the kernel, but isolating which of the
settings are important would be a long job.

Shame VIA won't help :-(

P.



Re: PS/2 Mouse/Keyboard conflict and lockup

2001-02-07 Thread Peter Horton

On Thu, Feb 08, 2001 at 03:35:00AM +0100, Udo A. Steinberg wrote:
> 
> I'm not sure whether this is related to the ominous ps/2 mouse bug
> you have been chasing, but this problem is 100% reproducible and
> very annoying.
> 
> After upgrading my Asus A7V Bios from 1003 to 1005D, gpm no longer
> receives any mouse events and the mouse doesn't work in text
> consoles. Once I kill gpm and restart gpm -t ps2 the keyboard
> locks up.
> 
> Logging in remotely and looking at dmesg revealed the following:
> 
> keyboard: Timeout - AT keyboard not present?
> keyboard: unrecognized scancode (70) - ignored
> 
> If I don't kill and restart gpm, but start X, the mouse works
> perfectly, but only in X.
> 

Similar problems here after my upgrade to 1005D. Linux somehow kills
the keyboard if I start the box without a PS/2 mouse connected. I have
another machine (these are both 2.4.1) which is a much older K6-233, and
it too kills the keyboard if no mouse is present. Keyboard works at LILO
prompt but is dead by the time I get to login. GPM doesn't work for me
either.

P.



Re: VIA silent disk corruption - patch

2001-02-07 Thread Peter Horton

On Tue, Feb 06, 2001 at 05:01:46PM +0100, Udo A. Steinberg wrote:
> Dale Farnsworth wrote:
> > 
> > However, if I enable the BIOS parameter "I/O Recovery Time", I can still
> > enable read caching without seeing any data corruption.
> > The latest BIOS revision (1005C) enables "I/O Recovery Time" by default
> > where the previous revision I had (1004D) did not.
> 
> Interesting stuff.
> 
> Asus, Germany released 1005D today. It's available from
> ftp://ftp.asuscom.de/pub/ASUSCOM/BIOS/Socket_A/VIA_Chipset/Apollo_KT133/A7V/1005D.zip
> 
> No comments about what they changed and/or fixed.
> 

Good news here, looks like the new BIOS fixes it (1005D). I've run a
heavy test for at least 10 hours without a single blip. The BIOS is set
for "optimal". Hoorah!

Here's the North bridge diff for anyone who can't get a BIOS update :-)

P.

--- bad.pci Sun Feb  4 22:29:22 2001
+++ new.pci Wed Feb  7 23:11:28 2001
@@ -1,7 +1,7 @@
 00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02)
     Subsystem: Asustek Computer, Inc.: Unknown device 8033
     Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
-    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR+
+    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
     Latency: 0 set
     Region 0: Memory at e4000000 (32-bit, prefetchable) [size=64M]
     Capabilities: [a0] AGP version 2.0
@@ -10,13 +10,13 @@
     Capabilities: [c0] Power Management version 2
         Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
         Status: D0 PME-Enable- DSel=0 DScale=0 PME-
-00: 06 11 05 03 06 00 10 a2 02 00 00 06 00 00 00 00
+00: 06 11 05 03 06 00 10 22 02 00 00 06 00 00 00 00
 10: 08 00 00 e4 00 00 00 00 00 00 00 00 00 00 00 00
 20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-50: 17 a4 6b b4 4f 81 08 08 80 00 04 08 08 08 08 08
-60: 03 ff 00 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
+50: 17 a4 6b b4 07 28 08 08 80 00 04 08 08 08 08 08
+60: 03 ff 55 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
 70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 00
 80: 0f 40 00 00 c0 00 00 00 02 00 00 00 00 00 00 00
 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
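
The Status change in the diff above can be cross-checked against the raw
bytes: the Status register is the 16-bit little-endian word at config offset
0x06, so "10 a2" reads as 0xa210 and "10 22" as 0x2210. The sketch below
decodes the two flags involved; the bit positions come from the standard PCI
Status register layout, and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Bit positions in the PCI Status register (config offset 0x06). */
#define PCI_STATUS_REC_MASTER_ABORT 0x2000u /* shown by lspci as <MAbort */
#define PCI_STATUS_DETECTED_PARITY  0x8000u /* shown by lspci as <PERR   */

/* Assemble the little-endian Status word from dump bytes 06 and 07. */
uint16_t pci_status_word(uint8_t byte06, uint8_t byte07)
{
    return (uint16_t)(byte06 | ((uint16_t)byte07 << 8));
}

/* Nonzero if the given flag is set in the Status word. */
int pci_status_has(uint16_t status, uint16_t flag)
{
    return (status & flag) != 0;
}
```

With the old BIOS the word 0xa210 has the detected-parity-error bit set
(PERR+); with 1005D the word 0x2210 has it clear (PERR-), while the
received-master-abort bit stays set in both.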



Re: VIA silent disk corruption - patch

2001-02-06 Thread Peter Horton

On Tue, Feb 06, 2001 at 08:52:23AM -0700, Dale Farnsworth wrote:
> 
> In article <[EMAIL PROTECTED]>,
> Peter Horton <[EMAIL PROTECTED]> wrote:
> > +  *  VIA VT8363 host bridge has broken feature 'PCI Master Read
> > +  *  Caching'. It caches more than is good for it, sometimes
> > +  *  serving the bus master with stale data. Some BIOSes enable
> > +  *  it by default, so we disable it.
> 
> Another data point:
> 
> I have an ASUS A7V motherboard with via vt82c686a and Promise pdc20265
> IDE controllers.  I noticed disk data corruption when I enabled DMA. 
> The corrupted data was 4K bytes long on 4K byte boundaries and occurred
> about once for every couple of gigabytes copied via cpio.
> I saw this corruption when the disks were connected to the pdc20265
> as well as to the 686a.
> 
> I also noticed that turning off read caching eliminated the corruption.
> 
> However, if I enable the BIOS parameter "I/O Recovery Time", I can still
> enable read caching without seeing any data corruption.
> The latest BIOS revision (1005C) enables "I/O Recovery Time" by default
> where the previous revision I had (1004D) did not.
> 

I still get corruption with "I/O Recovery Time" enabled :-(

I don't get corruption with the BIOS "normal" settings (1004D).

I might update my BIOS to the latest BIOS in case it changes any other
settings.

P.



VIA silent disk corruption - patch

2001-02-05 Thread Peter Horton

Okay, looks like this fixes it (for me anyways).

Thanks to Mark Hahn and Andre for their help with this problem.

P.


--- linux-2.4.1/arch/i386/kernel/pci-pc.c   Thu Jun 22 15:17:16 2000
+++ linux-2.4.1-bm-fix/arch/i386/kernel/pci-pc.c   Mon Feb  5 18:37:35 2001
@@ -924,6 +924,22 @@
    pcibios_max_latency = 32;
 }
 
+static void __init pci_fixup_vt8363(struct pci_dev *d)
+{
+   /*
+    *  VIA VT8363 host bridge has broken feature 'PCI Master Read
+    *  Caching'. It caches more than is good for it, sometimes
+    *  serving the bus master with stale data. Some BIOSes enable
+    *  it by default, so we disable it.
+    */
+   u8 tmp;
+   pci_read_config_byte(d, 0x70, &tmp);
+   if(tmp & 4) {
+       printk("PCI: Bus master read caching disabled\n");
+       pci_write_config_byte(d, 0x70, tmp & ~4);
+   }
+}
+
 struct pci_fixup pcibios_fixups[] = {
    { PCI_FIXUP_HEADER, PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82451NX, pci_fixup_i450nx },
    { PCI_FIXUP_HEADER, PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82454GX, pci_fixup_i450gx },
@@ -936,6 +952,7 @@
    { PCI_FIXUP_HEADER, PCI_ANY_ID, PCI_ANY_ID, pci_fixup_ide_bases },
    { PCI_FIXUP_HEADER, PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_5597, pci_fixup_latency },
    { PCI_FIXUP_HEADER, PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_5598, pci_fixup_latency },
+   { PCI_FIXUP_HEADER, PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8363_0, pci_fixup_vt8363 },
    { 0 }
 };
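
The register arithmetic in the patch can be tried in user space. The sketch
below only models the bit manipulation on a plain byte; it does not touch
PCI config space, and the macro and function names are invented for
illustration. The value 0xde is the byte at offset 0x70 in the "optimal"
lspci dumps posted in this thread, and 0xd8 is the same byte in the "normal"
dump.

```c
#include <stdint.h>

#define VT8363_REG_70          0x70  /* config offset the patch reads and writes */
#define VT8363_READ_CACHE_BIT  0x04  /* bit the patch clears (hypothetical name) */

/* Nonzero if the patch would modify the register at all. */
int vt8363_needs_fixup(uint8_t reg70)
{
    return (reg70 & VT8363_READ_CACHE_BIT) != 0;
}

/* Value the patch would write back to offset 0x70. */
uint8_t vt8363_clear_read_caching(uint8_t reg70)
{
    return (uint8_t)(reg70 & ~VT8363_READ_CACHE_BIT);
}
```

Applied to 0xde this yields 0xda, which still differs from the "normal"
value 0xd8 in bit 1, so the patch does not reproduce the "normal" setting of
this register exactly.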
 



VIA silent disk corruption - likely fix

2001-02-05 Thread Peter Horton

I've found the cause of silent disk corruption on my A7V motherboard,
and it might affect all boards with the same North bridge (KT133 etc).

For some reason the IDE controller(s) was sometimes picking up stale
data during bus master DMA to the drive. Assuming that there was no bug
in the CPU it had to be the North bridge that was caching the stuff when
it shouldn't have been. I assume the problem would also apply to other
bus masters (SCSI, NIC etc).

Scanning the motherboard manual showed up a chipset setting "PCI master
read caching" which I suspect is the culprit. According to the manual
this defaults to "on" for Athlons and "off" for Durons (obviously other
BIOSes / MB might treat this setting differently). Unfortunately my BIOS
does not allow me to change this setting independently [1], I only have
the choice of running the machine in "normal" or "optimal" configuration
to alter this setting ("optimal" is the default).

In "normal" mode my machine is rock solid and I see no corruption,
however "normal" mode also changes a lot of other settings (AGP speed,
DRAM interleave etc). Anyone experiencing such corruption should look
for a BIOS setting which disables this "feature".

If anyone out there has a BIOS which allows them to change just this one
setting, could they diff the "lspci -vvxxx" output with the setting off
and then on, so we can isolate which host bridge bit(s) control this
feature? Maybe we can then add it to 'pci_quirks' and reduce the number
of VIA corruption reports.

P.

[1] the BIOS appears to let you change the option but it defaults the
option the moment you leave the "advanced settings" screen :-(



VIA silent disk corruption - bad news

2001-02-05 Thread Peter Horton

The patch doesn't work for me. Maybe I need to disable some more of
those North bridge features :-(

Oh bum. Back to testing with "normal" ...

P.

-  CORRUPTING SETUP  -

00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02)
    Subsystem: Asustek Computer, Inc.: Unknown device 8033
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR+
    Latency: 0 set
    Region 0: Memory at e4000000 (32-bit, prefetchable) [size=64M]
    Capabilities: [a0] AGP version 2.0
        Status: RQ=31 SBA+ 64bit- FW- Rate=421
        Command: RQ=0 SBA- AGP- 64bit- FW- Rate=
    Capabilities: [c0] Power Management version 2
        Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 05 03 06 00 10 a2 02 00 00 06 00 00 00 00
10: 08 00 00 e4 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 17 a4 6b b4 4f 81 08 08 80 00 04 08 08 08 08 08
60: 03 ff 00 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 00
80: 0f 40 00 00 c0 00 00 00 02 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 02 c0 20 00 07 02 00 1f 00 00 00 00 6e 02 04 00
b0: 59 ec 80 b5 32 33 28 00 00 00 00 00 00 00 00 00
c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 0e 22 00 00 00 00 00 91 06

-  DIFF FOR NON-CORRUPTING SETUP  -

@@ -5,7 +5,7 @@
     Latency: 0 set
     Region 0: Memory at e4000000 (32-bit, prefetchable) [size=64M]
     Capabilities: [a0] AGP version 2.0
-        Status: RQ=31 SBA+ 64bit- FW- Rate=421
+        Status: RQ=31 SBA+ 64bit- FW- Rate=21
         Command: RQ=0 SBA- AGP- 64bit- FW- Rate=
     Capabilities: [c0] Power Management version 2
         Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
@@ -15,12 +15,12 @@
 20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-50: 17 a4 6b b4 4f 81 08 08 80 00 04 08 08 08 08 08
-60: 03 ff 00 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
-70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 00
+50: 17 a4 6b b4 06 81 08 08 80 00 04 08 08 08 08 08
+60: 03 ff 00 a0 50 e4 e4 00 40 78 86 0f 08 3f 00 00
+70: d8 80 cc 0c 0e a1 d2 00 01 b4 01 02 00 00 00 00
 80: 0f 40 00 00 c0 00 00 00 02 00 00 00 00 00 00 00
 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-a0: 02 c0 20 00 07 02 00 1f 00 00 00 00 6e 02 04 00
+a0: 02 c0 20 00 03 02 00 1f 00 00 00 00 6e 02 00 00
 b0: 59 ec 80 b5 32 33 28 00 00 00 00 00 00 00 00 00
 c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
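
To narrow down which bits differ rather than just which rows, the two hex
dumps can be compared byte by byte. The sketch below is an illustration
only: the function name is made up, and the two arrays are the 0x70 rows
copied from the corrupting dump and the non-corrupting diff above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Row 0x70 of the host bridge config space, taken from the dumps above. */
const uint8_t row70_optimal[16] = {
    0xde, 0x80, 0xcc, 0x0c, 0x0e, 0xa1, 0xd2, 0x00,
    0x01, 0xb4, 0x11, 0x02, 0x00, 0x00, 0x00, 0x00
};
const uint8_t row70_normal[16] = {
    0xd8, 0x80, 0xcc, 0x0c, 0x0e, 0xa1, 0xd2, 0x00,
    0x01, 0xb4, 0x01, 0x02, 0x00, 0x00, 0x00, 0x00
};

/*
 * Print every offset whose value changed between two config-space
 * snapshots, with the XOR mask of the flipped bits. 'base' is the config
 * offset of a[0]. Returns the number of differing bytes.
 */
size_t cfg_diff(const uint8_t *a, const uint8_t *b, size_t base, size_t len)
{
    size_t i, changed = 0;

    for (i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            printf("%02zx: %02x -> %02x (bits %02x)\n", base + i,
                   (unsigned)a[i], (unsigned)b[i], (unsigned)(a[i] ^ b[i]));
            changed++;
        }
    }
    return changed;
}
```

Calling cfg_diff(row70_optimal, row70_normal, 0x70, 16) reports two changed
bytes, at offsets 0x70 and 0x7a.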



Re: 2.4.1-pre8 losing pages

2001-01-28 Thread Peter Horton

On Fri, Jan 26, 2001 at 07:48:05PM +0000, Russell King wrote:
> Peter Horton writes:
> > The corruption is dependent on having a swapped on swap partition. If I
> > "swapoff" the corruption goes away, but it comes back when I "swapon"
> > again. I feel this is a kernel bug, but as I'm the only person out here who's
> > seeing it I'm at a loss ...
> 
> The reason I ask is that on an ARM box running plain 2.4.0 with swap
> enabled I get rather a lot of SEGVs.  Turn swap off, and I don't see
> any.
> 
> It sounds like it may be related.
> 

Okay, scratch that. It does still happen when there's no swap, but for
some reason it happens a lot less often. It looks timing related: it only
fails when using 7200rpm drives, not older 5400rpm ones (even though they
too are using UDMA33). I've ruled out the filing system, the IDE
controller, the drives and the RAM, so that leaves the kernel or the
CPU. I'll try to beg/borrow/steal another CPU and try that. I can
compile kernels / run X whilst the test is running without a problem, so
it looks like it's the bulk write that's the problem.

P.



Re: 2.4.1-pre8 losing pages

2001-01-26 Thread Peter Horton

On Fri, Jan 26, 2001 at 09:24:12AM +0000, Peter Horton wrote:
> On Fri, Jan 26, 2001 at 03:20:33AM +0100, Xuan Baldauf wrote:
> > 
> > Peter Horton wrote:
> > 
> > > I'm experiencing repeatable corruption whilst writing large volumes of
> > > data to disk. Kernel version is 2.4.1-pre8, on an 850MHz AMD Athlon on an
> > > ASUS A7V (VIA KT133 chipset) motherboard 128M RAM (tested with 'memtest86'
> > > for 10 hours).
> > >
> > 
> 
> ... this is the kinda output I get on most runs :-
> 
>Linux mole-rat 2.4.1-pre10 #1 Fri Jan 26 08:48:55 GMT 2001 i686 unknown
>...
>aa6a64589748321899bab2b66f71427f  testt
>aa6a64589748321899bab2b66f71427f  testu
>aa6a64589748321899bab2b66f71427f  testv
>9dde1bed276e32a1f9af98c87ab05978  testw
>aa6a64589748321899bab2b66f71427f  testx
>aa6a64589748321899bab2b66f71427f  testy
>aa6a64589748321899bab2b66f71427f  testz
>mole-rat:~# cmp testw testx
>testw testx differ: char 110862337, line 433772
>mole-rat:~# cmp -i $(( 110862336 + 4096 )) testw testx
>mole-rat:~# echo $(( 110862336 % 4096 ))
>0
> 
> > 
> > I cannot reproduce your behaviour in 2.4.1-pre9.
> > 
> 

The corruption depends on having a swap partition enabled. If I
"swapoff" the corruption goes away, but it comes back when I "swapon"
again. I feel this is a kernel bug, but as I'm the only person out here
who's seeing it I'm at a loss ...

P.



Re: 2.4.1-pre8 losing pages

2001-01-26 Thread Peter Horton

On Fri, Jan 26, 2001 at 03:20:33AM +0100, Xuan Baldauf wrote:
> 
> Peter Horton wrote:
> 
> > I'm experiencing repeatable corruption whilst writing large volumes of
> > data to disk. Kernel version is 2.4.1-pre8, on an 850MHz AMD Athlon on an
> > ASUS A7V (VIA KT133 chipset) motherboard 128M RAM (tested with 'memtest86'
> > for 10 hours).
> >
> 
> So what output does following bash script produce?
> 

Well this is the script I've been testing with ...

   #!/bin/bash -x
   set -e
   uname -a
   rm -f test test[a-z]
   dd if=/dev/urandom of=test bs=1024k count=128
   for I in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
       cp test test$I
   done
   md5sum test*

... this is the kinda output I get on most runs :-

   Linux mole-rat 2.4.1-pre10 #1 Fri Jan 26 08:48:55 GMT 2001 i686 unknown
   ...
   aa6a64589748321899bab2b66f71427f  testt
   aa6a64589748321899bab2b66f71427f  testu
   aa6a64589748321899bab2b66f71427f  testv
   9dde1bed276e32a1f9af98c87ab05978  testw
   aa6a64589748321899bab2b66f71427f  testx
   aa6a64589748321899bab2b66f71427f  testy
   aa6a64589748321899bab2b66f71427f  testz
   mole-rat:~# cmp testw testx
   testw testx differ: char 110862337, line 433772
   mole-rat:~# cmp -i $(( 110862336 + 4096 )) testw testx
   mole-rat:~# echo $(( 110862336 % 4096 ))
   0
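
The cmp runs above show that the damage starts on a 4096-byte boundary
(cmp counts from 1, so char 110862337 is byte offset 110862336, the start
of block 27066) and that the files agree again 4096 bytes later. A rough
sketch of the same check done block by block, assuming both copies are
already in memory; the function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES 4096u

/*
 * Count the 4096-byte blocks that differ between two equally sized
 * buffers. The index of the first differing block is stored in *first_bad
 * (left untouched if the buffers match or first_bad is NULL).
 */
size_t count_bad_pages(const unsigned char *a, const unsigned char *b,
                       size_t len, size_t *first_bad)
{
    size_t off, bad = 0;

    for (off = 0; off < len; off += PAGE_BYTES) {
        /* The last block may be shorter than a full page. */
        size_t n = len - off < PAGE_BYTES ? len - off : PAGE_BYTES;

        if (memcmp(a + off, b + off, n) != 0) {
            if (bad == 0 && first_bad)
                *first_bad = off / PAGE_BYTES;
            bad++;
        }
    }
    return bad;
}
```

A result of exactly one bad block with everything else identical would
match the pattern seen here: a single whole page of wrong data rather than
scattered bit errors.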

> 
> I cannot reproduce your behaviour in 2.4.1-pre9.
> 

No, I can't find anybody else who can either. Maybe I've got a dodgy CPU
:-(

P.



Re: 2.4.1-pre8 losing pages

2001-01-26 Thread Peter Horton

On Fri, Jan 26, 2001 at 09:24:12AM +, Peter Horton wrote:
> On Fri, Jan 26, 2001 at 03:20:33AM +0100, Xuan Baldauf wrote:
> > 
> > Peter Horton wrote:
> > 
> > > I'm experiencing repeatable corruption whilst writing large volumes of
> > > data to disk. Kernel version is 2.4.1-pre8, on an 850MHz AMD Athlon on an
> > > ASUS A7V (VIA KT133 chipset) motherboard 128M RAM (tested with 'memtest86'
> > > for 10 hours).
> > >
> > 
> 
> ... this is the kinda output I get on most runs :-
> 
>    Linux mole-rat 2.4.1-pre10 #1 Fri Jan 26 08:48:55 GMT 2001 i686 unknown
>    ...
>    aa6a64589748321899bab2b66f71427f  testt
>    aa6a64589748321899bab2b66f71427f  testu
>    aa6a64589748321899bab2b66f71427f  testv
>    9dde1bed276e32a1f9af98c87ab05978  testw
>    aa6a64589748321899bab2b66f71427f  testx
>    aa6a64589748321899bab2b66f71427f  testy
>    aa6a64589748321899bab2b66f71427f  testz
>    mole-rat:~# cmp testw testx
>    testw testx differ: char 110862337, line 433772
>    mole-rat:~# cmp -i $(( 110862336 + 4096 )) testw testx
>    mole-rat:~# echo $(( 110862336 % 4096 ))
>    0
> 
> > 
> > I cannot reproduce your behaviour in 2.4.1-pre9.
> > 
> 

The corruption depends on having an active swap partition. If I
"swapoff" the corruption goes away, but it comes back when I "swapon"
again. I feel this is a kernel bug, but as I'm the only person out here who's
seeing it I'm at a loss ...
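
The swap on/off comparison could be scripted roughly as follows. This is only a sketch: `count_bad_copies` is a hypothetical helper, it assumes bash, GNU dd and md5sum, and it follows the earlier test script (one random file, N copies, then checksum everything) while counting the mismatches instead of printing the sums.

```shell
#!/bin/bash
# count_bad_copies DIR SIZE_MB COPIES
# Hypothetical helper: write SIZE_MB of random data to DIR/test, copy it
# COPIES times, then checksum every copy and print how many differ from
# the original.
count_bad_copies() {
    local dir=$1 size_mb=$2 copies=$3 ref bad=0 i
    dd if=/dev/urandom of="$dir/test" bs=1024k count="$size_mb" 2>/dev/null
    # Make all the copies first, as the original script does, so that the
    # checksum pass only starts after the whole set has been written.
    for ((i = 0; i < copies; i++)); do
        cp "$dir/test" "$dir/test$i"
    done
    ref=$(md5sum < "$dir/test")
    for ((i = 0; i < copies; i++)); do
        [ "$(md5sum < "$dir/test$i")" = "$ref" ] || bad=$((bad + 1))
    done
    echo "$bad"
}

# Example (as root, on the affected box):
#   swapoff -a; count_bad_copies /tmp 128 26
#   swapon -a;  count_bad_copies /tmp 128 26
```

A non-zero count only with swap enabled would match the behaviour described above.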

P.



2.4.1-pre8 losing pages

2001-01-25 Thread Peter Horton

I'm experiencing repeatable corruption whilst writing large volumes of
data to disk. Kernel version is 2.4.1-pre8, on an 850MHz AMD Athlon on an
ASUS A7V (VIA KT133 chipset) motherboard 128M RAM (tested with 'memtest86'
for 10 hours).

First, I realised that fsck was noticing small corruptions on my ext2
volume. My first suspect was the much discussed VIA IDE controller. As a
test I created a 128M file from "urandom" and copied it to twenty six
files. When I MD5 the files one or two of them are usually corrupt. The
damage usually occurs in the 24th copy (though not always). Inspecting
the files shows a single 4K block (aligned on a 4K boundary) that is
completely different from what it should be. The kernel logs no errors
whilst writing the corrupt files.

I've repeated the test on the other on-board IDE controller (Promise), a
different hard disk, and on reiserfs. I see the corruption in all cases.

I tried building the kernel for "Pentium-Classic", and I tried a few older
kernels (2.4.0-test5 and 2.4.0-test12), still bad (all kernels built with
GCC 2.95.2 - Debian potato).

I really could do with some help as to where to look next :-). I did try and
come up with a test to see whether bad data is written or whether the
damaged piece is just not written, but if I alter the testing procedure
too much the problem seems to go away. It seems to just lose a single page
under one very specific circumstance.

P.

( configs attached )


 info.tar.gz


Re: Via apollo KX133 ide bug in 2.4.x

2001-01-22 Thread Peter Horton

On Sun, Jan 21, 2001 at 12:40:30PM +0100, Vojtech Pavlik wrote:
> On Sat, Jan 20, 2001 at 04:32:36PM -0500, safemode wrote:
> > Peter Horton wrote:
> > 
> > > On Thu, Jan 20, 2000 at 08:38:12AM +, Peter Horton wrote:
> > > >
> > > > I think I'm suffering the same thing on my new Asus A7V. Yesterday I got a
> > > > single "error in bitmap, remounting read only" type error, and today I got
> > > > some files in /tmp that returned I/O error when stat()ed. I do have DMA
> > > > enabled, but only UDMA33. I've done several kernel compiles with no
> > > > problems at all so looks like something is on the edge. Think I might go
> > > > back to 2.2.x for a bit and see what happens, or maybe just remove the VIA
> > > > driver :-((.
> > > >
> > >
> > > I apologise for following up my own E-mail, but there is something I'm
> > > missing here (maybe a whole lot of something). Anyone know how come we're
> > > seeing silent corruption ... I thought this UDMA stuff was all checksummed
> > > ? If the error is outside the data I assume the driver would notice ?
> > >
> > > P.
> > 
> > The thing is, even with UDMA disabled in the kernel, I still see the corruption
> > with 2.4.x (release) and above.  Anything written while using the kernel is
> > corrupted.   Much of the stuff will read fine (files) ... but I believe
> > directories get the IO error immediately and some files do also.  Everything is
> > seen as corrupted when you fsck a partition where this kernel has been run and
> > created files on.   This is a silent corruption without any errors reported and
> > I've only tested it on ext2.  You cannot create FS's with these kernels (at
> > least on the VIA chipsets) since they too are corrupted (note, only tested ext2
> > fs).   I did disable UDMA everywhere and still saw it happen, this problem is
> > not present in older 2.4.0-test kernels so it's something in the late
> > pre-release stage and into the release stage.
> 
> Do you have the via driver compiled in? If yes, try without, if no, try
> with it ...
> 

Okay, I bit the bullet and rebuilt the kernel with the VIA driver back in.
As a test I created one 128M file from /dev/urandom and copied it 26
times. Out of the 26 copies one was damaged. The damage was just one page
(eight sectors), aligned on a page boundary. The damaged section bore no
resemblance at all to what it should have been. Is it just a coincidence
that it looks like an incorrect page got written out ?
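
One way to look into that question is to take the damaged page and check whether the same 4K of data appears anywhere in the reference file. A match would suggest a page from the wrong offset was written; no match would point to data from somewhere else entirely. A slow sketch (it runs dd once per page), with hypothetical helper names `page_sum` and `find_page`, assuming 4K pages, bash, dd and md5sum:

```shell
#!/bin/bash
# page_sum FILE PAGE
# Hypothetical helper: print the MD5 of one 4K page (zero-based index).
page_sum() {
    dd if="$1" bs=4096 skip="$2" count=1 2>/dev/null | md5sum | cut -d' ' -f1
}

# find_page BADFILE PAGE REF
# Hypothetical helper: print the index of every 4K page in REF whose
# contents equal page PAGE of BADFILE.
find_page() {
    local want size n i
    want=$(page_sum "$1" "$2")
    size=$(wc -c < "$3")
    n=$(( (size + 4095) / 4096 ))
    for ((i = 0; i < n; i++)); do
        if [ "$(page_sum "$3" "$i")" = "$want" ]; then
            echo "$i"
        fi
    done
}
```

With the file names from the test, `find_page testw PAGE test` (PAGE being the damaged page's index) prints nothing if the bad page does not come from the source file at all.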

P.

PS - just to rule out other factors I ran memtest86 on this box for 10
hours with no error. It's not an overclock either.



Re: Via apollo KX133 ide bug in 2.4.x

2001-01-20 Thread Peter Horton

On Thu, Jan 20, 2000 at 08:38:12AM +, Peter Horton wrote:
> 
> I think I'm suffering the same thing on my new Asus A7V. Yesterday I got a
> single "error in bitmap, remounting read only" type error, and today I got
> some files in /tmp that returned I/O error when stat()ed. I do have DMA
> enabled, but only UDMA33. I've done several kernel compiles with no
> problems at all so looks like something is on the edge. Think I might go
> back to 2.2.x for a bit and see what happens, or maybe just remove the VIA
> driver :-((.
> 

I apologise for following up my own E-mail, but there is something I'm
missing here (maybe a whole lot of something). Anyone know how come we're
seeing silent corruption ... I thought this UDMA stuff was all checksummed
? If the error is outside the data I assume the driver would notice ?


P.



Re: Via apollo KX133 ide bug in 2.4.x

2001-01-20 Thread Peter Horton

On Fri, Jan 19, 2001 at 07:33:21PM -0500, safemode wrote:
> I'm sorry I can't be more descriptive than that, but there aren't any
> errors ever displayed.  What happened was after about a day of uptime, I
> began seeing IO errors when trying to access files.  I realized that the
> IO errors occurred on any file I had created.  I rebooted since the
> computer became impossible to use and fsck removed everything that I had
> created since upgrading to the release kernel.  This is all on ext2fs.
> I tried making bootdisks but they all showed up as being bad.  I tried
> copying files to another ext2fs but upon fsck, they too were all removed
> due to corruption.  These ext2fs' were not created by the release
> kernel.  I had to go back to 2.4.0-test11 before the kernel would write
> to the fs correctly.  For the record, I disabled DMA in the kernel and
> i'm compiling for athlon using gcc 2.95.3.  I saw the same thing happen
> though when I booted for a kernel compiled for Pentium 2.  Since
> reverting back to 2.4.0-test11, however, no FS corruption has been
> observed.  Anyone have any idea what this is about?  i'm compiling with
> the same options between kernels but 2.4.x (release and newer) do not
> seem to be able to write to the ext2fs correctly.  Could this be because
> it was formatted by a 2.2.x kernel?   Anyone using this chipset I would
> caution to have backups ready when using it with 2.4.x, as I lost
> hundreds of files to it.  Also, no errors were reported anywhere,  IO
> errors when trying to stat dirs just started appearing after a couple
> days uptime ...then they would occur whenever you wrote to the FS.  Even
> after a reboot.  If you need any extra info about kernel options and
> computer config, just ask.
> 

I think I'm suffering the same thing on my new Asus A7V. Yesterday I got a
single "error in bitmap, remounting read only" type error, and today I got
some files in /tmp that returned I/O error when stat()ed. I do have DMA
enabled, but only UDMA33. I've done several kernel compiles with no
problems at all so looks like something is on the edge. Think I might go
back to 2.2.x for a bit and see what happens, or maybe just remove the VIA
driver :-((.

P.

I've attached lspci -vxxx output, and kernel config, in case anyone is
investigating.

/dev/hda is Seagate ST330621A.

----------VIA BusMastering IDE Configuration----------------
Driver Version:                     2.1e
South Bridge:                       VIA vt82c686a rev 0x22
Command register:                   0x7
Latency timer:                      32
PCI clock:                          33MHz
Master Read  Cycle IRDY:            0ws
Master Write Cycle IRDY:            0ws
FIFO Output Data 1/2 Clock Advance: off
BM IDE Status Register Read Retry:  on
Max DRDY Pulse Width:               No limit
-----------------------Primary IDE-------Secondary IDE------
Read DMA FIFO flush:           on                  on
End Sect. FIFO flush:          on                  on
Prefetch Buffer:               on                  on
Post Write Buffer:             on                  on
FIFO size:                      8                   8
Threshold Prim.:              1/2                 1/2
Bytes Per Sector:             512                 512
Both channels togth:          yes                 yes
-------------------drive0----drive1----drive2----drive3-----
BMDMA enabled:        yes        no        no        no
Transfer Mode:       UDMA   DMA/PIO   DMA/PIO   DMA/PIO
Address Setup:       30ns     120ns      30ns     120ns
Active Pulse:        90ns     330ns      90ns     330ns
Recovery Time:       30ns     270ns      30ns     270ns
Cycle Time:          60ns     600ns     120ns     600ns
Transfer Rate:   33.0MB/s   3.3MB/s  16.5MB/s   3.3MB/s

 lspci+config.gz


2.4.1-pre8 and Athlon

2001-01-17 Thread Peter Horton


Building 2.4.1-pre8 for K7 gives 'current' undefined errors in the headers
included from init/main.c. Looks like something included from asm/string.h
is missing an include. The problems go away if I remove
CONFIG_X86_USE3DNOW=y from the config.
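
For anyone wanting to try the same workaround, the edit can be sketched as below. `disable_config_option` is a hypothetical helper and GNU sed is assumed. Whether the kernel's configuration tools later switch the option back on (it may be derived from the processor type) is not covered here and should be checked.

```shell
#!/bin/bash
# disable_config_option CONFIG_FILE OPTION
# Hypothetical helper: rewrite "OPTION=..." as the "# OPTION is not set"
# form that kernel .config files use for disabled options.
# Assumes GNU sed (for -i).
disable_config_option() {
    sed -i "s/^$2=.*/# $2 is not set/" "$1"
}

# Example:
#   disable_config_option .config CONFIG_X86_USE3DNOW
```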

P.

-- 
P. Horton
Software Engineer
http://www.colonel-panic.com
Linux 2.4.0



Re: D-LINK DFE-530-TX

2000-12-07 Thread Peter Horton

On Wed, Dec 06, 2000 at 07:44:02PM -0500, Mike A. Harris wrote:
> Which ethernet module works with this card?  2.2.17 kernel
> 

If the PCI device ID is 3065 then it's via-rhine, but not supported by the
driver in the kernel. Get updated via-rhine from Donald Becker's site
http://www.scyld.com/network.

Even the DFE-530-TX driver for NT downloaded from D-Link's site doesn't know
about this chip yet ... though changing the device ID in the .INF file seemed
to make it work ... shrug.
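
A quick way to check the device ID before hunting for a driver, sketched under a few assumptions: `has_pci_id` is a hypothetical helper, it reads `lspci -n` output on stdin, and 1106 is VIA's PCI vendor ID.

```shell
#!/bin/bash
# has_pci_id VENDOR:DEVICE
# Hypothetical helper: succeed if the `lspci -n` output on stdin lists a
# device with the given numeric vendor:device pair (hex, case-insensitive).
has_pci_id() {
    grep -qi " $1"
}

# Example:
#   lspci -n | has_pci_id 1106:3065 && echo "device ID 3065 found"
```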

HTH

P.

-- 
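+------------------------------------+
|            Peter Horton            |
+------------------------------------+
|    http://www.colonel-panic.com    |
|   http://www.berserk.demon.co.uk   |
|         Linux 2.4.0-test11         |
+------------------------------------+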


