Re: [OT] Major Clock Drift
On Sun, Feb 11, 2001 at 12:06:14PM +0100, Pavel Machek wrote:
> > > > > > I've discovered that heavy use of vesafb can be a major source of clock
> > > > > > drift on my system, especially if I don't specify "ypan" or "ywrap". On my
> > > > >
> > > > > This is extremely interesting. What version of ntp are you using?
> > > >
> > > > Is vesafb one of the drivers which blocks interrupts for (many) tens
> > > > of milliseconds?
> > >
> > > Vesafb is happy to block interrupts for half a second.
> >
> > And has this been observed to cause clock drift?
>
> Yes. I've seen time running 3 times slower. Just do cat /etc/termcap
> with loaded PCI bus. Yesterday I lost 20 minutes during 2 hours -- I
> have been using USB (load PCI) and framebuffer.
> Pavel

Is it not possible to save/check the TSC in the timer interrupt to correct for dropped interrupts? (Obviously only on machines that have a TSC ...)

P.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
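To make the TSC suggestion a bit more concrete, here is a minimal user-space sketch of the arithmetic only; it is not kernel code. It assumes the timer interrupt handler saves the TSC on every tick and has a calibrated cycles-per-tick value. The name lost_ticks, its parameters and the example numbers (850 MHz CPU, HZ=100) are assumptions for illustration.

```c
#include <stdint.h>

/*
 * Sketch only: estimate how many timer ticks were missed between two
 * timer interrupts by comparing TSC readings. All names are hypothetical.
 *
 * tsc_prev, tsc_now: TSC values sampled in consecutive timer interrupts
 * cycles_per_tick:   calibrated TSC cycles per timer tick (must be > 0)
 */
static uint64_t lost_ticks(uint64_t tsc_prev, uint64_t tsc_now,
                           uint64_t cycles_per_tick)
{
    /* Unsigned subtraction stays correct if the counter wraps once. */
    uint64_t elapsed = tsc_now - tsc_prev;

    /* Round to the nearest whole tick to tolerate interrupt jitter. */
    uint64_t ticks = (elapsed + cycles_per_tick / 2) / cycles_per_tick;

    /* One tick is the normal case; anything beyond that was dropped. */
    return ticks > 1 ? ticks - 1 : 0;
}
```

With 8,500,000 cycles per tick (850 MHz at HZ=100), a half-second stall of 425,000,000 cycles spans 50 ticks, so the handler would have to credit 49 extra ticks.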
Re: Easy Way to FS-corruption
On Sat, Feb 10, 2001 at 03:24:06PM +0100, Tim Krieglstein wrote:
>
> I found a way which seems to lead to an "easy" way of fs-corruption:
> Install two sound-cards, use the newest ALSA-Drivers 0.5.10b
> (the standard sound drivers don't work to good with sf) and

[snip]

This could be that bus master DMA caching problem that showed up on my KT133 A7V (see previous threads re: 'VIA silent disk corruption'). You could try more conservative BIOS chipset settings (my problem went away with "normal" rather than "optimal" BIOS settings). In the end Asus released a BIOS update for the A7V that seems to have fixed it permanently.

P.
Re: Linux 2.4.1ac9
On Fri, Feb 09, 2001 at 11:01:13PM +, Alan Cox wrote:
> > I've noticed that -ac9 comes with the "Disable PCI-Master-Read-Caching
> > on VIA" patch that Peter Horton posted a while back. I don't know
> > whether it was applied in Linus' or your tree first, but is it
> > actually verified to fix anything?
>
> Not yet. As the story becomes clear it can either be dropped or pushed
> on
>

It should be dropped I think ...

Different folks found that changing different settings fixed it for them, so it looks like some kind of internal race in the North bridge where changing the timings in any way makes it harder to reproduce.

The updated BIOS from Asus definitely fixes it for me, and "PCI Master Read Caching" is *enabled*. There are quite a few differences in the setup of the North bridge from the previous BIOS to this one, and I assume the changes were suggested by VIA.

If there are other people out there who still have this problem we can probably come up with a patch for the kernel, but isolating which of the settings are important would be a long job.

Shame VIA won't help :-(

P.
Re: PS/2 Mouse/Keyboard conflict and lockup
On Thu, Feb 08, 2001 at 03:35:00AM +0100, Udo A. Steinberg wrote:
>
> I'm not sure whether this is related to the ominous ps/2 mouse bug
> you have been chasing, but this problem is 100% reproducible and
> very annoying.
>
> After upgrading my Asus A7V Bios from 1003 to 1005D, gpm no longer
> receives any mouse events and the mouse doesn't work in text
> consoles. Once I kill gpm and restart gpm -t ps2 the keyboard
> locks up.
>
> Logging in remotely and looking at dmesg revealed the following:
>
> keyboard: Timeout - AT keyboard not present?
> keyboard: unrecognized scancode (70) - ignored
>
> If I don't kill and restart gpm, but start X, the mouse works
> perfectly, but only in X.
>

Similar problems here after my upgrade to 1005D. Linux somehow kills the keyboard if I start the box without a PS/2 mouse connected. I have another machine (these are both 2.4.1) which is a much older K6-233, and it too kills the keyboard if no mouse is present. Keyboard works at LILO prompt but is dead by the time I get to login. GPM doesn't work for me either.

P.
Re: VIA silent disk corruption - patch
On Tue, Feb 06, 2001 at 05:01:46PM +0100, Udo A. Steinberg wrote:
> Dale Farnsworth wrote:
> >
> > However, if I enable the BIOS parameter "I/O Recovery Time", I can still
> > enable read caching without seeing any data corruption.
> > The latest BIOS revision (1005C) enables "I/O Recovery Time" by default
> > where the previous revision I had (1004D) did not.
>
> Interesting stuff.
>
> Asus, Germany released 1005D today. It's available from
> ftp://ftp.asuscom.de/pub/ASUSCOM/BIOS/Socket_A/VIA_Chipset/Apollo_KT133/A7V/1005D.zip
>
> No comments about what they changed and/or fixed.
>

Good news here, looks like the new BIOS fixes it (1005D). I've run a heavy test for at least 10 hours without a single blip. The BIOS is set for "optimal". Hoorah!

Here's the North bridge diff for anyone who can't get a BIOS update :-)

P.

--- bad.pci  Sun Feb 4 22:29:22 2001
+++ new.pci  Wed Feb 7 23:11:28 2001
@@ -1,7 +1,7 @@
 00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02)
     Subsystem: Asustek Computer, Inc.: Unknown device 8033
     Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
-    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR+
+    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR-
     Latency: 0 set
     Region 0: Memory at e400 (32-bit, prefetchable) [size=64M]
     Capabilities: [a0] AGP version 2.0
@@ -10,13 +10,13 @@
     Capabilities: [c0] Power Management version 2
         Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
         Status: D0 PME-Enable- DSel=0 DScale=0 PME-
-00: 06 11 05 03 06 00 10 a2 02 00 00 06 00 00 00 00
+00: 06 11 05 03 06 00 10 22 02 00 00 06 00 00 00 00
 10: 08 00 00 e4 00 00 00 00 00 00 00 00 00 00 00 00
 20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-50: 17 a4 6b b4 4f 81 08 08 80 00 04 08 08 08 08 08
-60: 03 ff 00 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
+50: 17 a4 6b b4 07 28 08 08 80 00 04 08 08 08 08 08
+60: 03 ff 55 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
 70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 00
 80: 0f 40 00 00 c0 00 00 00 02 00 00 00 00 00 00 00
 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Re: VIA silent disk corruption - patch
On Tue, Feb 06, 2001 at 08:52:23AM -0700, Dale Farnsworth wrote:
>
> In article <[EMAIL PROTECTED]>,
> Peter Horton <[EMAIL PROTECTED]> wrote:
> > + * VIA VT8363 host bridge has broken feature 'PCI Master Read
> > + * Caching'. It caches more than is good for it, sometimes
> > + * serving the bus master with stale data. Some BIOSes enable
> > + * it by default, so we disable it.
>
> Another data point:
>
> I have an ASUS A7V motherboard with via vt82c686a and Promise pdc20265
> IDE controllers. I noticed disk data corruption when I enabled DMA.
> The corrupted data was 4K bytes long on 4K byte boundaries and occurred
> about once for every couple of gigabytes copied via cpio.
> I saw this corruption when the disks were connected to the pdc20265
> as well as to the 686a.
>
> I also noticed that turning off read caching eliminated the corruption.
>
> However, if I enable the BIOS parameter "I/O Recovery Time", I can still
> enable read caching without seeing any data corruption.
> The latest BIOS revision (1005C) enables "I/O Recovery Time" by default
> where the previous revision I had (1004D) did not.
>

I still get corruption with "I/O Recovery Time" enabled :-(

I don't get corruption with the BIOS "normal" settings (1004D).

I might update my BIOS to the latest BIOS in case it changes any other settings.

P.
VIA silent disk corruption - patch
Okay, looks like this fixes it (for me anyways). Thanks to Mark Hahn and Andre for their help with this problem.

P.

--- linux-2.4.1/arch/i386/kernel/pci-pc.c	Thu Jun 22 15:17:16 2000
+++ linux-2.4.1-bm-fix/arch/i386/kernel/pci-pc.c	Mon Feb 5 18:37:35 2001
@@ -924,6 +924,22 @@
 	pcibios_max_latency = 32;
 }
 
+static void __init pci_fixup_vt8363(struct pci_dev *d)
+{
+	/*
+	 * VIA VT8363 host bridge has broken feature 'PCI Master Read
+	 * Caching'. It caches more than is good for it, sometimes
+	 * serving the bus master with stale data. Some BIOSes enable
+	 * it by default, so we disable it.
+	 */
+	u8 tmp;
+	pci_read_config_byte(d, 0x70, &tmp);
+	if(tmp & 4) {
+		printk("PCI: Bus master read caching disabled\n");
+		pci_write_config_byte(d, 0x70, tmp & ~4);
+	}
+}
+
 struct pci_fixup pcibios_fixups[] = {
 	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82451NX, pci_fixup_i450nx },
 	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_82454GX, pci_fixup_i450gx },
@@ -936,6 +952,7 @@
 	{ PCI_FIXUP_HEADER, PCI_ANY_ID, PCI_ANY_ID, pci_fixup_ide_bases },
 	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_5597, pci_fixup_latency },
 	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_SI, PCI_DEVICE_ID_SI_5598, pci_fixup_latency },
+	{ PCI_FIXUP_HEADER, PCI_VENDOR_ID_VIA, PCI_DEVICE_ID_VIA_8363_0, pci_fixup_vt8363 },
 	{ 0 }
 };
VIA silent disk corruption - likely fix
I've found the cause of silent disk corruption on my A7V motherboard, and it might affect all boards with the same North bridge (KT133 etc).

For some reason the IDE controller(s) was sometimes picking up stale data during bus master DMA to the drive. Assuming that there was no bug in the CPU, it had to be the North bridge that was caching the stuff when it shouldn't have been. I assume the problem would also apply to other bus masters (SCSI, NIC etc).

Scanning the motherboard manual showed up a chipset setting "PCI master read caching" which I suspect is the culprit. According to the manual this defaults to "on" for Athlons and "off" for Durons (obviously other BIOSes / MB might treat this setting differently).

Unfortunately my BIOS does not allow me to change this setting independently [1]. I only have the choice of running the machine in "normal" or "optimal" configuration to alter this setting ("optimal" is the default). In "normal" mode my machine is rock solid and I see no corruption; however, "normal" mode also changes a lot of other settings (AGP speed, DRAM interleave etc).

Anyone experiencing such corruption should look for a BIOS setting which disables this "feature".

If anyone out there has a BIOS which allows them to change just this one setting, can they diff the "lspci -vvxxx" output with the setting off and then on so we can isolate which host bridge bit(s) control this feature? Maybe we can then add it to 'pci_quirks' and reduce the number of VIA corruption reports.

P.

[1] the BIOS appears to let you change the option but it defaults the option the moment you leave the "advanced settings" screen :-(
VIA silent disk corruption - bad news
The patch doesn't work for me. Maybe I need to disable some more of those North bridge features :-(

Oh bum. Back to testing with "normal" ...

P.

- CORRUPTING SETUP -

00:00.0 Host bridge: VIA Technologies, Inc.: Unknown device 0305 (rev 02)
    Subsystem: Asustek Computer, Inc.: Unknown device 8033
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
    Status: Cap+ 66Mhz- UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort+ >SERR- <PERR+
    Latency: 0 set
    Region 0: Memory at e400 (32-bit, prefetchable) [size=64M]
    Capabilities: [a0] AGP version 2.0
        Status: RQ=31 SBA+ 64bit- FW- Rate=421
        Command: RQ=0 SBA- AGP- 64bit- FW- Rate=
    Capabilities: [c0] Power Management version 2
        Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
00: 06 11 05 03 06 00 10 a2 02 00 00 06 00 00 00 00
10: 08 00 00 e4 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 17 a4 6b b4 4f 81 08 08 80 00 04 08 08 08 08 08
60: 03 ff 00 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 00
80: 0f 40 00 00 c0 00 00 00 02 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 02 c0 20 00 07 02 00 1f 00 00 00 00 6e 02 04 00
b0: 59 ec 80 b5 32 33 28 00 00 00 00 00 00 00 00 00
c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 0e 22 00 00 00 00 00 91 06

- DIFF FOR NON-CORRUPTING SETUP -

@@ -5,7 +5,7 @@
     Latency: 0 set
     Region 0: Memory at e400 (32-bit, prefetchable) [size=64M]
     Capabilities: [a0] AGP version 2.0
-        Status: RQ=31 SBA+ 64bit- FW- Rate=421
+        Status: RQ=31 SBA+ 64bit- FW- Rate=21
         Command: RQ=0 SBA- AGP- 64bit- FW- Rate=
     Capabilities: [c0] Power Management version 2
         Flags: PMEClk- AuxPwr- DSI- D1- D2- PME-
@@ -15,12 +15,12 @@
 20: 00 00 00 00 00 00 00 00 00 00 00 00 43 10 33 80
 30: 00 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00
 40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-50: 17 a4 6b b4 4f 81 08 08 80 00 04 08 08 08 08 08
-60: 03 ff 00 a0 52 e5 e5 00 44 7c 86 0f 08 3f 00 00
-70: de 80 cc 0c 0e a1 d2 00 01 b4 11 02 00 00 00 00
+50: 17 a4 6b b4 06 81 08 08 80 00 04 08 08 08 08 08
+60: 03 ff 00 a0 50 e4 e4 00 40 78 86 0f 08 3f 00 00
+70: d8 80 cc 0c 0e a1 d2 00 01 b4 01 02 00 00 00 00
 80: 0f 40 00 00 c0 00 00 00 02 00 00 00 00 00 00 00
 90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
-a0: 02 c0 20 00 07 02 00 1f 00 00 00 00 6e 02 04 00
+a0: 02 c0 20 00 03 02 00 1f 00 00 00 00 6e 02 00 00
 b0: 59 ec 80 b5 32 33 28 00 00 00 00 00 00 00 00 00
 c0: 01 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
 d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
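For anyone wanting to turn a pair of config-space dumps like these into a list of candidate bits, a rough sketch follows. It simply XORs the two dumps byte by byte; diff_config and its parameters are made-up names for illustration, and the caller is assumed to have already parsed the hex rows into byte arrays.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Sketch only: report which bits differ between two PCI config-space
 * dumps of the same length. Names are hypothetical.
 *
 * base is the config-space offset of element 0, used for printing.
 * Returns the number of bytes that differ.
 */
static size_t diff_config(const uint8_t *bad, const uint8_t *good,
                          size_t len, size_t base)
{
    size_t changed = 0;

    for (size_t i = 0; i < len; i++) {
        uint8_t mask = bad[i] ^ good[i];   /* set bits = bits that changed */

        if (mask) {
            printf("%02zx: %02x -> %02x (changed bits %02x)\n",
                   base + i, (unsigned)bad[i], (unsigned)good[i],
                   (unsigned)mask);
            changed++;
        }
    }
    return changed;
}
```

Fed the two 70: rows above with base 0x70, this reports offset 70 going from de to d8 (changed bits 06) and offset 7a going from 11 to 01 (changed bits 10). The patch only clears bit 2 (value 4) of register 0x70, whereas the "normal" setup has bits 1 and 2 both clear, which may or may not be significant.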
Re: 2.4.1-pre8 losing pages
On Fri, Jan 26, 2001 at 07:48:05PM +, Russell King wrote:
> Peter Horton writes:
> > The corruption is dependent on having a swapped on swap partition. If I
> > "swapoff" the corruption goes away, but it comes back when I "swapon"
> > again. I feel this a kernel bug, but as I'm the only person out here who's
> > seeing it I'm at a loss ...
>
> The reason I ask is that on an ARM box running plain 2.4.0 with swap
> enabled I get rather a lot of SEGVs. Turn swap off, and I don't see
> any.
>
> It sounds like it may be related.
>

Okay, scratch that. It does still happen when there's no swap, but for some reason it happens a lot less often.

Looks like it's timing related, it only fails when using 7200rpm drives, not older 5400rpm ones (even though they too are using UDMA33).

I've ruled out the filing system, the IDE controller, the drives and the RAM, so that leaves the kernel or the CPU - I'll try and beg/borrow/steal another CPU and try that.

I can compile kernels / run X whilst the test is running without a problem so it looks like it's the bulk write that's the problem.

P.
Re: 2.4.1-pre8 losing pages
On Fri, Jan 26, 2001 at 09:24:12AM +, Peter Horton wrote:
> On Fri, Jan 26, 2001 at 03:20:33AM +0100, Xuan Baldauf wrote:
> >
> > Peter Horton wrote:
> >
> > > I'm experiencing repeatable corruption whilst writing large volumes of
> > > data to disk. Kernel version is 2.4.1-pre8, on an 850MHz AMD Athlon on an
> > > ASUS A7V (VIA KT133 chipset) motherboard 128M RAM (tested with 'memtest86'
> > > for 10 hours).
> > >
> >
>
> ... this is the kinda output I get on most runs :-
>
>    Linux mole-rat 2.4.1-pre10 #1 Fri Jan 26 08:48:55 GMT 2001 i686 unknown
>    ...
>    aa6a64589748321899bab2b66f71427f testt
>    aa6a64589748321899bab2b66f71427f testu
>    aa6a64589748321899bab2b66f71427f testv
>    9dde1bed276e32a1f9af98c87ab05978 testw
>    aa6a64589748321899bab2b66f71427f testx
>    aa6a64589748321899bab2b66f71427f testy
>    aa6a64589748321899bab2b66f71427f testz
>    mole-rat:~# cmp testw testx
>    testw testx differ: char 110862337, line 433772
>    mole-rat:~# cmp -i $(( 110862336 + 4096 )) testw testx
>    mole-rat:~# echo $(( 110862336 % 4096 ))
>    0
>
> >
> > I cannot reproduce your behaviour in 2.4.1-pre9.
> >

The corruption is dependent on having an active swap partition. If I "swapoff" the corruption goes away, but it comes back when I "swapon" again. I feel this is a kernel bug, but as I'm the only person out here who's seeing it I'm at a loss ...

P.
Re: 2.4.1-pre8 losing pages
On Fri, Jan 26, 2001 at 03:20:33AM +0100, Xuan Baldauf wrote:
>
> Peter Horton wrote:
>
> > I'm experiencing repeatable corruption whilst writing large volumes of
> > data to disk. Kernel version is 2.4.1-pre8, on an 850MHz AMD Athlon on an
> > ASUS A7V (VIA KT133 chipset) motherboard 128M RAM (tested with 'memtest86'
> > for 10 hours).
> >
>
> So what output does following bash script produce?
>

Well this is the script I've been testing with ...

   #!/bin/bash -x

   set -e

   uname -a

   rm -f test test[a-z]
   dd if=/dev/urandom of=test bs=1024k count=128

   for I in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
       cp test test$I
   done

   md5sum test*

... this is the kinda output I get on most runs :-

   Linux mole-rat 2.4.1-pre10 #1 Fri Jan 26 08:48:55 GMT 2001 i686 unknown
   ...
   aa6a64589748321899bab2b66f71427f testt
   aa6a64589748321899bab2b66f71427f testu
   aa6a64589748321899bab2b66f71427f testv
   9dde1bed276e32a1f9af98c87ab05978 testw
   aa6a64589748321899bab2b66f71427f testx
   aa6a64589748321899bab2b66f71427f testy
   aa6a64589748321899bab2b66f71427f testz
   mole-rat:~# cmp testw testx
   testw testx differ: char 110862337, line 433772
   mole-rat:~# cmp -i $(( 110862336 + 4096 )) testw testx
   mole-rat:~# echo $(( 110862336 % 4096 ))
   0

>
> I cannot reproduce your behaviour in 2.4.1-pre9.
>

No, I can't find anybody else who can either. Maybe I've got a dodgy CPU :-(

P.
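The cmp checks above show that the damage starts at byte offset 110862336 (cmp counts from 1), which is exactly block 27066 when counting in 4096-byte units, and that everything after that one block matches again. A rough sketch of the same check done block by block is below; count_bad_blocks and BLOCK_SIZE are illustrative names, and it assumes both files have already been read (or mapped) into memory.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 4096u

/*
 * Sketch only: count the 4K-aligned blocks that differ between two
 * buffers of equal length. Names are hypothetical.
 *
 * If first_bad is non-NULL it receives the index (not byte offset) of
 * the first differing block. A trailing partial block is ignored.
 */
static size_t count_bad_blocks(const unsigned char *a, const unsigned char *b,
                               size_t len, size_t *first_bad)
{
    size_t bad = 0;

    for (size_t off = 0; off + BLOCK_SIZE <= len; off += BLOCK_SIZE) {
        if (memcmp(a + off, b + off, BLOCK_SIZE) != 0) {
            if (bad == 0 && first_bad)
                *first_bad = off / BLOCK_SIZE;
            bad++;
        }
    }
    return bad;
}
```

A result of exactly one bad block per corrupt copy would match what the cmp runs suggest: a single page going astray rather than scattered bit errors.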
2.4.1-pre8 losing pages
I'm experiencing repeatable corruption whilst writing large volumes of data
to disk. Kernel version is 2.4.1-pre8, on an 850MHz AMD Athlon on an ASUS
A7V (VIA KT133 chipset) motherboard 128M RAM (tested with 'memtest86' for 10
hours).

First, I realised that fsck was noticing small corruptions on my ext2
volume. My first suspect was the much discussed VIA IDE controller. As a
test I created a 128M file from "urandom" and copied it to twenty six files.
When I MD5 the files one or two of them are usually corrupt. The damage
usually occurs in the 24th copy (though not always). Inspecting the files
shows a single 4K block (aligned on a 4K boundary) that is completely
different from what it should be. The kernel logs no errors whilst writing
the corrupt files.

I've repeated the test on the other on-board IDE controller (Promise), a
different hard disk, and on reiserfs. I see the corruption in all cases.

I tried building the kernel for "Pentium-Classic", and I tried a few older
kernels (2.4.0-test5 and 2.4.0-test12), still bad (all kernels built with
GCC 2.95.2 - Debian potato).

I really could do with some help as to where to look next :-).

I did try and come up with a test to see whether bad data is written or
whether the damaged piece is just not written, but if I alter the testing
procedure too much the problem seems to go away. It seems to just lose a
single page under one very specific circumstance.

P.

( configs attached )
info.tar.gz
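One way to confirm the "single 4K block on a 4K boundary" observation without eyeballing hex dumps is to bucket the differing bytes by page. The sketch below is only an illustration: the function name differing_pages is invented here, and it assumes a cmp that supports -l (one line per differing byte, with a 1-based byte number first) plus awk and sort.

```bash
#!/bin/bash
# Hypothetical helper (not from the thread): list every page in which two
# files differ, with the number of differing bytes in that page.
# Usage: differing_pages GOOD_FILE BAD_FILE [PAGE_SIZE]
differing_pages() {
    local a=$1 b=$2 ps=${3:-4096}
    # "cmp -l" prints "<1-based byte> <octal value in a> <octal value in b>"
    cmp -l "$a" "$b" 2>/dev/null |
        awk -v ps="$ps" '
            { page = int(($1 - 1) / ps); count[page]++ }
            END { for (p in count) print p, count[p] }' |
        sort -n
}
```

Run as "differing_pages test testw", a single output line would mean the damage is confined to one page. For random data nearly all 4096 bytes of a replaced page should differ, so a count close to 4096 points at a whole wrong page rather than a few flipped bits.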
Re: Via apollo KX133 ide bug in 2.4.x
On Sun, Jan 21, 2001 at 12:40:30PM +0100, Vojtech Pavlik wrote:
> On Sat, Jan 20, 2001 at 04:32:36PM -0500, safemode wrote:
> > Peter Horton wrote:
> >
> > > On Thu, Jan 20, 2000 at 08:38:12AM +, Peter Horton wrote:
> > > >
> > > > I think I'm suffering the same thing on my new Asus A7V. Yesterday I got a
> > > > single "error in bitmap, remounting read only" type error, and today I got
> > > > some files in /tmp that returned I/O error when stat()ed. I do have DMA
> > > > enabled, but only UDMA33. I've done several kernel compiles with no
> > > > problems at all so looks like something is on the edge. Think I might go
> > > > back to 2.2.x for a bit and see what happens, or maybe just remove the VIA
> > > > driver :-((.
> > > >
> > >
> > > I apologise for following up my own E-mail, but there is something I'm
> > > missing here (maybe a whole lot of something). Anyone know how come we're
> > > seeing silent corruption ... I thought this UDMA stuff was all checksummed
> > > ? If the error is outside the data I assume the driver would notice ?
> > >
> > > P.
> >
> > The thing is, even with UDMA disabled in the kernel, I still see the corruption
> > with 2.4.x (release) and above. Anything written while using the kernel is
> > corrupted. Much of the stuff will read fine (files) ... but I believe
> > directories get the IO error immediately and some files do also. Everything is
> > seen as corrupted when you fsck a partition where this kernel has been run and
> > created files on. This is a silent corruption without any errors reported and
> > I've only tested it on ext2. You cannot create FS's with these kernels (at
> > least on the VIA chipsets) since they too are corrupted (note, only tested ext2
> > fs). I did disable UDMA everywhere and still saw it happen, this problem is
> > not present in older 2.4.0-test kernels so it's something in the late
> > pre-release stage and into the release stage.
>
> Do you have the via driver compiled in? If yes, try without, if no, try
> with it ...
>

Okay, I bit the bullet and rebuilt the kernel with the VIA driver back in.

As a test I created one 128M file from /dev/urandom and copied it 26 times.
Out of the 26 copies one was damaged. The damage was just one page (eight
sectors), aligned on a page boundary. The damaged section bore no
resemblance at all to what it should have been.

Is it just a coincidence that it looks like an incorrect page got written
out ?

P.

PS - just to rule out other factors I ran memtest86 on this box for 10 hours
with no error. It's not an overclock either.
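The "incorrect page got written out" question can be checked for one case: whether the damaged page is really some other page of the same source file that landed in the wrong place. The sketch below is only one possible check, not something anyone in the thread ran. The function name find_page_source is invented, it assumes dd, md5sum, wc and cut are available, and it is slow on a 128M file because it starts dd and md5sum once per page.

```bash
#!/bin/bash
# Hypothetical helper (not from the thread): report which pages of ORIG,
# if any, have the same contents as page PAGE of the damaged copy BAD.
# Usage: find_page_source BAD PAGE ORIG [PAGE_SIZE]
find_page_source() {
    local bad=$1 page=$2 orig=$3 ps=${4:-4096}
    local want sum total i=0
    # Checksum of the suspect page in the damaged copy.
    want=$(dd if="$bad" bs="$ps" skip="$page" count=1 2>/dev/null |
           md5sum | cut -d' ' -f1)
    # Number of pages in the original, rounding a partial page up.
    total=$(( ($(wc -c < "$orig") + ps - 1) / ps ))
    while [ "$i" -lt "$total" ]; do
        sum=$(dd if="$orig" bs="$ps" skip="$i" count=1 2>/dev/null |
              md5sum | cut -d' ' -f1)
        [ "$sum" = "$want" ] && echo "$i"
        i=$(( i + 1 ))
    done
}
```

If this prints a page index, the bad page is a misplaced copy of that page from the test file. If it prints nothing, the data came from somewhere else (another file, swap, or stale memory), which this sketch cannot identify.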
Re: Via apollo KX133 ide bug in 2.4.x
On Thu, Jan 20, 2000 at 08:38:12AM +, Peter Horton wrote:
>
> I think I'm suffering the same thing on my new Asus A7V. Yesterday I got a
> single "error in bitmap, remounting read only" type error, and today I got
> some files in /tmp that returned I/O error when stat()ed. I do have DMA
> enabled, but only UDMA33. I've done several kernel compiles with no
> problems at all so looks like something is on the edge. Think I might go
> back to 2.2.x for a bit and see what happens, or maybe just remove the VIA
> driver :-((.
>

I apologise for following up my own E-mail, but there is something I'm
missing here (maybe a whole lot of something). Anyone know how come we're
seeing silent corruption ... I thought this UDMA stuff was all checksummed ?
If the error is outside the data I assume the driver would notice ?

P.
Re: Via apollo KX133 ide bug in 2.4.x
On Fri, Jan 19, 2001 at 07:33:21PM -0500, safemode wrote:
> I'm sorry I can't be more descriptive than that, but there aren't any
> errors ever displayed. What happened was after about a day of uptime, I
> began seeing IO errors when trying to access files. I realized that the
> IO errors occurred on any file I had created. I rebooted since the
> computer became impossible to use and fsck removed everything that I had
> created since upgrading to the release kernel. This is all on ext2fs.
> I tried making bootdisks but they all showed up as being bad. I tried
> copying files to another ext2fs but upon fsck, they too were all removed
> due to corruption. These ext2fs' were not created by the release
> kernel. I had to go back to 2.4.0-test11 before the kernel would write
> to the fs correctly. For the record, I disabled DMA in the kernel and
> I'm compiling for athlon using gcc 2.95.3. I saw the same thing happen
> though when I booted a kernel compiled for Pentium 2. Since
> reverting back to 2.4.0-test11, however, no FS corruption has been
> observed. Anyone have any idea what this is about? I'm compiling with
> the same options between kernels but 2.4.x (release and newer) do not
> seem to be able to write to the ext2fs correctly. Could this be because
> it was formatted by a 2.2.x kernel? Anyone using this chipset I would
> caution to have backups ready when using it with 2.4.x, as I lost
> hundreds of files to it. Also, no errors were reported anywhere, IO
> errors when trying to stat dirs just started appearing after a couple
> days uptime ... then they would occur whenever you wrote to the FS. Even
> after a reboot. If you need any extra info about kernel options and
> computer config, just ask.
>

I think I'm suffering the same thing on my new Asus A7V. Yesterday I got a
single "error in bitmap, remounting read only" type error, and today I got
some files in /tmp that returned I/O error when stat()ed. I do have DMA
enabled, but only UDMA33. I've done several kernel compiles with no
problems at all so looks like something is on the edge. Think I might go
back to 2.2.x for a bit and see what happens, or maybe just remove the VIA
driver :-((.

P.

I've attached lspci -vxxx output, and kernel config, in case anyone is
investigating. /dev/hda is Seagate ST330621A.

----------VIA BusMastering IDE Configuration----------------
Driver Version:                     2.1e
South Bridge:                       VIA vt82c686a rev 0x22
Command register:                   0x7
Latency timer:                      32
PCI clock:                          33MHz
Master Read  Cycle IRDY:            0ws
Master Write Cycle IRDY:            0ws
FIFO Output Data 1/2 Clock Advance: off
BM IDE Status Register Read Retry:  on
Max DRDY Pulse Width:               No limit
-----------------------Primary IDE-------Secondary IDE------
Read DMA FIFO flush:           on                  on
End Sect. FIFO flush:          on                  on
Prefetch Buffer:               on                  on
Post Write Buffer:             on                  on
FIFO size:                      8                   8
Threshold Prim.:              1/2                 1/2
Bytes Per Sector:             512                 512
Both channels togth:          yes                 yes
-------------------drive0----drive1----drive2----drive3-----
BMDMA enabled:        yes        no        no        no
Transfer Mode:       UDMA   DMA/PIO   DMA/PIO   DMA/PIO
Address Setup:       30ns     120ns      30ns     120ns
Active Pulse:        90ns     330ns      90ns     330ns
Recovery Time:       30ns     270ns      30ns     270ns
Cycle Time:          60ns     600ns     120ns     600ns
Transfer Rate:   33.0MB/s   3.3MB/s  16.5MB/s   3.3MB/s

lspci+config.gz
2.4.1-pre8 and Athlon
Building 2.4.1-pre8 for K7 gives 'current' undefined errors in the headers
included from init/main.c. Looks like something included from asm/string.h
is missing an include. The problems go away if I remove
CONFIG_X86_USE3DNOW=y from the config.

P.

--
P. Horton
Software Engineer
http://www.colonel-panic.com
Linux 2.4.0
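For anyone wanting to check whether their own tree is affected, a quick test of the config file is enough. This is only a sketch: the function name config_enabled is invented here, and it assumes the usual .config layout where an enabled option appears as "OPTION=y" (or "=m") at the start of a line.

```bash
#!/bin/bash
# Hypothetical helper (not from the thread): succeed if a kernel config
# option is set to y or m in the given config file (default: ./.config).
config_enabled() {
    local option=$1 config=${2:-.config}
    grep -q "^${option}=[ym]\$" "$config"
}
```

Typical use from the top of a kernel source tree would be "config_enabled CONFIG_X86_USE3DNOW && echo 3DNow string ops enabled".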
Re: D-LINK DFE-530-TX
On Wed, Dec 06, 2000 at 07:44:02PM -0500, Mike A. Harris wrote:
> Which ethernet module works with this card? 2.2.17 kernel
>

If the PCI device ID is 3065 then it's via-rhine, but not supported by the
driver in the kernel. Get updated via-rhine from Donald Becker's site
http://www.scyld.com/network.

Even the DFE-530-TX driver for NT downloaded from D-Link's site doesn't know
about this chip yet ... though changing the device ID in the .INF file
seemed to make it work ... shrug.

HTH

P.

--
+--------------------------------+
|          Peter Horton          |
+--------------------------------+
|  http://www.colonel-panic.com  |
| http://www.berserk.demon.co.uk |
|       Linux 2.4.0-test11       |
+--------------------------------+
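Checking the device ID can be scripted against the numeric listing from "lspci -n", which prints vendor:device pairs in hex. A minimal sketch, with two assumptions that are not from the thread: the function name has_pci_id is invented, and the vendor ID 1106 (VIA) is added here for illustration; only the device ID 3065 comes from the message above.

```bash
#!/bin/bash
# Hypothetical helper (not from the thread): read "lspci -n" style output
# on stdin and succeed if a device with the given vendor and device ID
# (hex, without 0x) is listed.
# Usage: lspci -n | has_pci_id VENDOR DEVICE
has_pci_id() {
    grep -qiw -- "$1:$2"
}
```

For example, "lspci -n | has_pci_id 1106 3065 && echo needs the updated via-rhine" would flag the chip discussed here.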