Re: Possible critical VIA vt82c686a chip bug (private question)
Vojtech Pavlik wrote:
>
> I'm *not* sure. It just looks like a reasonable explanation. It doesn't
> happen on Intel chips and older VIA chips, it only happens on new VIA
> chips, and the code is the same all the time. Also, it happens both with
> 2.2 and 2.4 kernels ...
>
> --
> Vojtech Pavlik
> SuSE Labs
>

Do you have a method guaranteed to reproduce this? I have a newer VIA
chipset and haven't (yet) observed this problem.

Host bridge: VIA Technologies, Inc. VT8371 [KX133] (rev 2).
PCI bridge: VIA Technologies, Inc. VT8371 [KX133 AGP] (rev 0).
ISA bridge: VIA Technologies, Inc. VT82C686 [Apollo Super South] (rev 34).
IDE interface: VIA Technologies, Inc. Bus Master IDE (rev 16).
Bridge: VIA Technologies, Inc. VT82C686 [Apollo Super ACPI] (rev 48).
Multimedia audio controller: VIA Technologies, Inc. AC97 Audio Controller (rev 32).

===
-- TimO
==++==

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: Possible critical VIA vt82c686a chip bug (private question)
On Fri, Oct 27, 2000 at 02:04:58PM +0200, [EMAIL PROTECTED] wrote:

> > Interesting. If it's caused by SCSI as well (might be), then it's not
> > caused by heavy IDE activity but rather than that it could be heavy
> > BusMastering activity instead (The IDE chip does BM as well).
> >
> > I'm still wondering if it could be a Linux kernel bug (bad/concurrent
> > accesses to the i8253 registers), this has to be checked.
> >
>
> How sure are you that the chip is actually buggy? I ran into something
> similar a while ago, when I mixed the two arguments to an outb in a driver,
> and ended up writing MYPORT into the timer instead of 0x40 into MYPORT.

I'm *not* sure. It just looks like a reasonable explanation. It doesn't
happen on Intel chips and older VIA chips, it only happens on new VIA
chips, and the code is the same all the time. Also, it happens both with
2.2 and 2.4 kernels ...

--
Vojtech Pavlik
SuSE Labs
Re: Possible critical VIA vt82c686a chip bug (private question)
On 26 Oct, Vojtech Pavlik wrote:
> On Thu, Oct 26, 2000 at 04:42:31PM +0200, Yoann Vandoorselaere wrote:
>
>> > On Thu, Oct 26, 2000 at 04:20:43PM +0200, Yoann Vandoorselaere wrote:
>> >
>> > > ...
>> > >
>> > > Have you any idea what the relation is between time and this chip?
>> > >
>> > > Also, I've been experiencing the problem for several months on my
>> > > workstation and I never could find where it was coming from...
>> > > How did you find it?
>> >
>> > Well, it integrates both the i8253 PIT and the vt82c586 IDE controller.
>> >
>> > I first located that the wrong time was coming from gettimeofday() and not
>> > from the other sources of time the kernel provides. And then I was
>> > tracking the problem (which actually is an underflow - the chip bug
>> > causes some time offset variables to go negative - 0x microseconds
>> > is about 1:20 hours). And this way I got to the spot where the patch
>> > cures the problem.
>>
>> Ok, here is what I experienced:
>>
>> First, what is strange is that:
>> - I'm using SCSI
>> - I just have an IDE disk for mp3.
>> The IDE subsystem is never used heavily...
>>
>> I've experienced the problem after some time of
>> heavy SCSI I/O: my screen under X was going black (like with DPMS).
>> When I was moving the mouse, the image was coming back
>> for < 1 second, then black screen...
>>
>> The only fix was to kill X and then reboot.
>>
>> Anyway, thanks for your explanation...
>> I'll send feedback on this patch ASAP.
>
> Interesting. If it's caused by SCSI as well (might be), then it's not
> caused by heavy IDE activity but rather than that it could be heavy
> BusMastering activity instead (The IDE chip does BM as well).
>
> I'm still wondering if it could be a Linux kernel bug (bad/concurrent
> accesses to the i8253 registers), this has to be checked.
>

How sure are you that the chip is actually buggy?
I ran into something similar a while ago, when I mixed up the two arguments
to an outb in a driver, and ended up writing MYPORT into the timer instead
of 0x40 into MYPORT.

Bart

--
Bart Hartgers - TUE Eindhoven
Get my GPG key at http://etpmod.phys.tue.nl/bart/pubkey.gpg
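The underflow mentioned in the quoted text can be illustrated with a small model. This is only a sketch under stated assumptions, not the kernel's code: it assumes a 100 Hz tick with a reload value of 11932, that the time offset within a tick is derived from the down-counting timer value, and that the result ends up in a 32-bit unsigned variable. All names (LATCH, usec_since_tick, as_u32) are made up for the illustration.

```python
# Toy model of a time-offset underflow; every name and width here is an assumption.
LATCH = 11932            # assumed reload value for a 100 Hz tick
USEC_PER_TICK = 10_000   # 100 Hz tick -> 10 ms per tick
MASK32 = 0xFFFFFFFF      # assumption: offsets live in 32-bit unsigned variables


def usec_since_tick(count: int) -> int:
    """Microseconds elapsed in the current tick, from a down-counting timer value."""
    elapsed = (LATCH - 1) - count   # goes negative when count exceeds LATCH - 1
    return (elapsed * USEC_PER_TICK) // LATCH


def as_u32(value: int) -> int:
    """Reinterpret a (possibly negative) integer as a 32-bit unsigned value."""
    return value & MASK32


good = usec_since_tick(6000)    # count inside the expected 0..LATCH-1 range
bad = usec_since_tick(60000)    # count far above LATCH, as after a bogus reload
print(good, bad, as_u32(bad), as_u32(bad) / 60e6)  # last value is in minutes
```

With a count inside the expected range the offset stays within one tick. With a count above the reload value the offset is negative, and once wrapped to 32 bits it corresponds to a jump of more than an hour, which is the kind of error gettimeofday() would then report.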
Re: Possible critical VIA vt82c686a chip bug (private question)
On Fri, Oct 27, 2000 at 01:16:34PM +0200, Yoann Vandoorselaere wrote:

> > Which part of the chipset you mean? The PIT (programmable interval
> > timer)? That one is standard since XT times. The rest of the ISA bridge?
> > Maybe, but that's mostly BIOS work and shouldn't impact the PIT
> > under sane conditions.
>
> What is strange is that a number of people seem to be hit by this
> problem... And if VIA hasn't corrected it, it's probably because
> they are not aware of it...
>
> I think that if such a problem occurred under Windows
> (thinking of the Windows user base), VIA would already be in touch.

It can't happen under Windows, because the Windows timer runs at 18 Hz
(timer programmed to 65535), while Linux uses 100 Hz (timer programmed to
approx 11920), so when the timer unprograms itself to 65535 due to the bug,
only Linux notices it; Windows can't.

--
Vojtech Pavlik
SuSE Labs
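The arithmetic behind the 18 Hz and 100 Hz figures can be checked with a short sketch. It assumes the standard PC PIT input clock of 1,193,182 Hz, which is not stated in the thread, and the function names are made up for the illustration.

```python
# Sketch of the PIT tick-rate arithmetic; names are illustrative, not kernel code.
PIT_INPUT_HZ = 1_193_182  # standard PC PIT input clock (an assumption here)


def tick_rate_hz(divisor: int) -> float:
    """Interrupt rate produced by a given PIT reload value."""
    if not 1 <= divisor <= 65536:
        raise ValueError("divisor must be in 1..65536")
    return PIT_INPUT_HZ / divisor


def divisor_for(hz: int) -> int:
    """Reload value, rounded to nearest, giving roughly `hz` ticks per second."""
    return (PIT_INPUT_HZ + hz // 2) // hz


linux_divisor = divisor_for(100)             # near the "approx 11920" quoted above
slow_rate = tick_rate_hz(65535)              # rate after the reported fallback value
print(linux_divisor, round(tick_rate_hz(linux_divisor), 3))
print(round(slow_rate, 2), round(slow_rate / 100, 3))
```

A system that expects 100 ticks per second but receives about 18 sees less than a fifth of its timer interrupts, while a system that already programs the largest divisor sees no change at all.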
Re: Possible critical VIA vt82c686a chip bug (private question)
Vojtech Pavlik <[EMAIL PROTECTED]> writes:

> On Fri, Oct 27, 2000 at 12:58:12PM +0200, Yoann Vandoorselaere wrote:
>
> > > > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > > > i8259 accesses in the kernel that lack the necessary spinlock, even when
> > > > > they're not probably the cause of the problem we see here.
> > > >
> > > > BTW what about trying to modify your work-around code to make it
> > > > attempt to read the timer again? This way we could test whether it was
> > > > a race condition during timer read or really timer jumping to a bogus
> > > > value.
> > >
> > > Actually if I don't reprogram the timer (and just ignore the value for
> > > example), the work-around code keeps being called again and again very
> > > often (between 1x/minute to 100x/second) after the first failure, even
> > > when the system is idle.
> > >
> > > When reprogramming, next failure happens only after stressing the system
> > > again.
> > >
> > > So it's not just a race, the impact of the failure on the chip is
> > > permanent and stays till it's reprogrammed.
> >
> > Are you sure there is not an error in the way the
> > chipset is programmed ?
>
> Which part of the chipset you mean? The PIT (programmable interval
> timer)? That one is standard since XT times. The rest of the ISA bridge?
> Maybe, but that's mostly BIOS work and shouldn't impact the PIT
> under sane conditions.

What is strange is that a number of people seem to be hit by this
problem... And if VIA hasn't corrected it, it's probably because
they are not aware of it...

I think that if such a problem occurred under Windows
(thinking of the Windows user base), VIA would already be in touch.

--
-- Yoann http://www.mandrakesoft.com/~yoann/
Tiniest "mesures unities?"
- length : millimeter
- volume : milliliter
- intelligence : military man
Re: Possible critical VIA vt82c686a chip bug (private question)
On Fri, Oct 27, 2000 at 12:58:12PM +0200, Yoann Vandoorselaere wrote:

> > > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > > i8259 accesses in the kernel that lack the necessary spinlock, even when
> > > > they're not probably the cause of the problem we see here.
> > >
> > > BTW what about trying to modify your work-around code to make it
> > > attempt to read the timer again? This way we could test whether it was
> > > a race condition during timer read or really timer jumping to a bogus
> > > value.
> >
> > Actually if I don't reprogram the timer (and just ignore the value for
> > example), the work-around code keeps being called again and again very
> > often (between 1x/minute to 100x/second) after the first failure, even
> > when the system is idle.
> >
> > When reprogramming, next failure happens only after stressing the system
> > again.
> >
> > So it's not just a race, the impact of the failure on the chip is
> > permanent and stays till it's reprogrammed.
>
> Are you sure there is not an error in the way the
> chipset is programmed ?

Which part of the chipset you mean? The PIT (programmable interval
timer)? That one is standard since XT times. The rest of the ISA bridge?
Maybe, but that's mostly BIOS work and shouldn't impact the PIT
under sane conditions.

--
Vojtech Pavlik
SuSE Labs
Re: Possible critical VIA vt82c686a chip bug (private question)
Vojtech Pavlik <[EMAIL PROTECTED]> writes:

> On Fri, Oct 27, 2000 at 12:02:20PM +0200, Martin Mares wrote:
>
> > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > i8259 accesses in the kernel that lack the necessary spinlock, even when
> > > they're not probably the cause of the problem we see here.
> >
> > BTW what about trying to modify your work-around code to make it
> > attempt to read the timer again? This way we could test whether it was
> > a race condition during timer read or really timer jumping to a bogus
> > value.
>
> Actually if I don't reprogram the timer (and just ignore the value for
> example), the work-around code keeps being called again and again very
> often (between 1x/minute to 100x/second) after the first failure, even
> when the system is idle.
>
> When reprogramming, next failure happens only after stressing the system
> again.
>
> So it's not just a race, the impact of the failure on the chip is
> permanent and stays till it's reprogrammed.

Are you sure there is not an error in the way the
chipset is programmed ?

--
-- Yoann http://www.mandrakesoft.com/~yoann/
"Programming is a race between programmers, who try and make more and more
idiot-proof software, and universe, which produces more and more remarkable
idiots. Until now, universe leads the race" -- R. Cook
Re: Possible critical VIA vt82c686a chip bug (private question)
On Fri, Oct 27, 2000 at 12:02:20PM +0200, Martin Mares wrote:

> > So this is not our problem here. Anyway I guess it's time to hunt for
> > i8259 accesses in the kernel that lack the necessary spinlock, even when
> > they're not probably the cause of the problem we see here.
>
> BTW what about trying to modify your work-around code to make it
> attempt to read the timer again? This way we could test whether it was
> a race condition during timer read or really timer jumping to a bogus
> value.

Actually if I don't reprogram the timer (and just ignore the value for
example), the work-around code keeps being called again and again very
often (between 1x/minute to 100x/second) after the first failure, even
when the system is idle.

When reprogramming, next failure happens only after stressing the system
again.

So it's not just a race, the impact of the failure on the chip is
permanent and stays till it's reprogrammed.

--
Vojtech Pavlik
SuSE Labs
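The difference between ignoring a bad reading and reprogramming the timer can be sketched with a toy model. This is not the work-around patch itself, which is not shown in this thread. FakePit, its glitch() method, the step size and the reload value 11932 are all assumptions made for the illustration.

```python
# Toy model only: FakePit and its behaviour are assumptions, not real hardware.
LATCH = 11932  # assumed reload value for a 100 Hz tick


class FakePit:
    """Down-counter that wraps at its reload value; each read advances it."""

    def __init__(self, reload: int = LATCH, step: int = 1000) -> None:
        self.reload = reload
        self.step = step
        self.count = reload - 1

    def read_count(self) -> int:
        self.count = (self.count - self.step) % self.reload
        return self.count

    def program(self, reload: int) -> None:
        self.reload = reload
        self.count = reload - 1

    def glitch(self) -> None:
        # Models the suspected fault: the reload value jumps to 65535.
        self.program(65535)


def count_bad_reads(pit: FakePit, reads: int, reprogram: bool) -> int:
    """Count readings above LATCH - 1, optionally reprogramming when one is seen."""
    bad = 0
    for _ in range(reads):
        if pit.read_count() > LATCH - 1:
            bad += 1
            if reprogram:
                pit.program(LATCH)
    return bad


ignored, fixed, healthy = FakePit(), FakePit(), FakePit()
ignored.glitch()
fixed.glitch()
bad_when_ignored = count_bad_reads(ignored, 200, reprogram=False)
bad_when_fixed = count_bad_reads(fixed, 200, reprogram=True)
bad_when_healthy = count_bad_reads(healthy, 200, reprogram=False)
print(bad_when_ignored, bad_when_fixed, bad_when_healthy)
```

In this model a persistent change of the reload value keeps producing out-of-range readings until the timer is reprogrammed, whereas a one-off race during a read would show up once and then vanish on the next read. That is the distinction the re-read test suggested above is meant to expose.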
Re: Possible critical VIA vt82c686a chip bug (private question)
Hi!

> So this is not our problem here. Anyway I guess it's time to hunt for
> i8259 accesses in the kernel that lack the necessary spinlock, even when
> they're not probably the cause of the problem we see here.

BTW what about trying to modify your work-around code to make it
attempt to read the timer again? This way we could test whether it was
a race condition during timer read or really timer jumping to a bogus
value.

Have a nice fortnight
--
Martin `MJ' Mares <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> http://atrey.karlin.mff.cuni.cz/~mj/
"This line is umop apisdn."
Re: Possible critical VIA vt82c686a chip bug (private question)
On Thu, Oct 26, 2000 at 11:24:38PM +0200, Yoann Vandoorselaere wrote:

> Vojtech Pavlik <[EMAIL PROTECTED]> writes:
>
> > On Thu, Oct 26, 2000 at 11:05:04PM +0200, Yoann Vandoorselaere wrote:
> >
> > > yop, I've done:
> > >
> > > make -j10 World
> > > in the xfree tree and simultaneously:
> > >
> > > while true; do make dep && make clean && make bzImage; done
> > > in the kernel tree
> >
> > Now it'd be nice to verify that the problem also happens when the system
> > is not running out of memory (which -j10 quite causes I think) ...
>
> Nope, my system was loaded, but was usable
> (at least until the problem occurred)...

Good to know.

--
Vojtech Pavlik
SuSE Labs
Re: Possible critical VIA vt82c686a chip bug (private question)
Vojtech Pavlik <[EMAIL PROTECTED]> writes:

> On Thu, Oct 26, 2000 at 11:05:04PM +0200, Yoann Vandoorselaere wrote:
>
> > yop, I've done:
> >
> > make -j10 World
> > in the xfree tree and simultaneously:
> >
> > while true; do make dep && make clean && make bzImage; done
> > in the kernel tree
>
> Now it'd be nice to verify that the problem also happens when the system
> is not running out of memory (which -j10 quite causes I think) ...

Nope, my system was loaded, but was usable
(at least until the problem occurred)...

Athlon 750 with 128 MB of RAM and 103 MB of swap.

--
-- Yoann http://www.mandrakesoft.com/~yoann/
An engineer from NVidia, while asking him to release cards specs said :
"Actually, we do write our drivers without documentation."
Re: Possible critical VIA vt82c686a chip bug (private question)
On Thu, Oct 26, 2000 at 11:05:04PM +0200, Yoann Vandoorselaere wrote:

> yop, I've done:
>
> make -j10 World
> in the xfree tree and simultaneously:
>
> while true; do make dep && make clean && make bzImage; done
> in the kernel tree

Now it'd be nice to verify that the problem also happens when the system
is not running out of memory (which -j10 quite causes I think) ...

--
Vojtech Pavlik
SuSE Labs
Re: Possible critical VIA vt82c686a chip bug (private question)
Vojtech Pavlik <[EMAIL PROTECTED]> writes:

> On Thu, Oct 26, 2000 at 10:11:54PM +0200, Yoann Vandoorselaere wrote:
>
> > > > > > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > > > > > to the timer. It writes 0 to the control-word for timer 0. This
> > > > > > does the following:
> > > > [Snipped...]
> > > > >
> > > > > Well, at least on 2.4.0-test9, the above timing code is #ifed to
> > > > > DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> > > > > include/linux/ide.h.
> > > > >
> > > > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > > > i8259 accesses in the kernel that lack the necessary spinlock, even when
> > > > > they're not probably the cause of the problem we see here.
> > > >
> > > > Okay, good.
> > >
> > > Ok, here is a list of places within the kernel that access the PIT
> > > timer, plus the method of locking (i386 arch only):
> >
> > [...]
> >
> > Ok, I just tested if the problem was always present without
> > the IDE subsystem...
> >
> > The answer is it is not... so it isn't an IDE problem.
>
> Uh, guess too many negations. You wanted to say that the problem was
> present even when you disabled the IDE subsystem, right?

yop

> So now it seems that possibly enough PCI traffic / busmastering traffic
> can cause the problem ...

yop, I've done:

make -j10 World
in the xfree tree and simultaneously:

while true; do make dep && make clean && make bzImage; done
in the kernel tree

--
-- Yoann http://www.mandrakesoft.com/~yoann/
An engineer from NVidia, while asking him to release cards specs said :
"Actually, we do write our drivers without documentation."
Re: Possible critical VIA vt82c686a chip bug (private question)
On Thu, Oct 26, 2000 at 10:11:54PM +0200, Yoann Vandoorselaere wrote:

> > > > > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > > > > to the timer. It writes 0 to the control-word for timer 0. This
> > > > > does the following:
> > > [Snipped...]
> > > >
> > > > Well, at least on 2.4.0-test9, the above timing code is #ifed to
> > > > DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> > > > include/linux/ide.h.
> > > >
> > > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > > i8259 accesses in the kernel that lack the necessary spinlock, even when
> > > > they're not probably the cause of the problem we see here.
> > >
> > > Okay, good.
> >
> > Ok, here is a list of places within the kernel that access the PIT
> > timer, plus the method of locking (i386 arch only):
>
> [...]
>
> Ok, I just tested if the problem was always present without
> the IDE subsystem...
>
> The answer is it is not... so it isn't an IDE problem.

Uh, guess too many negations. You wanted to say that the problem was
present even when you disabled the IDE subsystem, right?

So now it seems that possibly enough PCI traffic / busmastering traffic
can cause the problem ...

--
Vojtech Pavlik
SuSE Labs
Re: Possible critical VIA vt82c686a chip bug (private question)
Vojtech Pavlik <[EMAIL PROTECTED]> writes:

> On Thu, Oct 26, 2000 at 01:42:29PM -0400, Richard B. Johnson wrote:
>
> > > > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > > > to the timer. It writes 0 to the control-word for timer 0. This
> > > > does the following:
> > [Snipped...]
> > >
> > > Well, at least on 2.4.0-test9, the above timing code is #ifed to
> > > DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> > > include/linux/ide.h.
> > >
> > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > i8259 accesses in the kernel that lack the necessary spinlock, even when
> > > they're not probably the cause of the problem we see here.
> >
> > Okay, good.
>
> Ok, here is a list of places within the kernel that access the PIT
> timer, plus the method of locking (i386 arch only):

[...]

Ok, I just tested if the problem was always present without
the IDE subsystem...

The answer is it is not... so it isn't an IDE problem.

--
-- Yoann http://www.mandrakesoft.com/~yoann/
An engineer from NVidia, while asking him to release cards specs said :
"Actually, we do write our drivers without documentation."
Re: Possible critical VIA vt82c686a chip bug (private question)
On Thu, Oct 26, 2000 at 01:42:29PM -0400, Richard B. Johnson wrote:

> > > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > > to the timer. It writes 0 to the control-word for timer 0. This
> > > does the following:
> [Snipped...]
> >
> > Well, at least on 2.4.0-test9, the above timing code is #ifed to
> > DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> > include/linux/ide.h.
> >
> > So this is not our problem here. Anyway I guess it's time to hunt for
> > i8259 accesses in the kernel that lack the necessary spinlock, even when
> > they're not probably the cause of the problem we see here.
>
> Okay, good.

Ok, here is a list of places within the kernel that access the PIT
timer, plus the method of locking (i386 arch only):

Usage:                                          Lock method:
arch/i386/kernel/time.c:170:                    spin_lock()
arch/i386/kernel/time.c:491:                    spin_lock()
arch/i386/kernel/time.c:575:                    none (init)
arch/i386/kernel/i8259.c:491:                   none (init)
arch/i386/kernel/apm.c:871:                     cli()
arch/i386/kernel/apic.c:398:                    spin_lock_irqsave()
drivers/char/vt.c:121:                          cli()
drivers/char/ftape/lowlevel/ftape-calibr.c:80:  cli()
drivers/char/ftape/lowlevel/ftape-calibr.c:99:  cli()
drivers/char/joystick/analog.c:142:             cli() __cli()
drivers/char/joystick/gameport.c:66:            cli()
drivers/ide/hd.c:137:                           cli()
drivers/ide/ide.c:206:                          __cli()

I guess we'll need to fix this. While races here are not likely (the
most likely is a beep by vt.c at a wrong moment), they're possible.

However, these don't seem to be the cause of the problem we see here
anyway.

--
Vojtech Pavlik
SuSE Labs
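To illustrate why mixed locking around the PIT matters, here is a minimal userspace sketch. It is a toy model, not kernel code: `struct sim_pit` and the `sim_*` helpers are hypothetical names, and the model assumes 8254-style behaviour (a latch command is ignored while a latched value is still pending, and a per-counter flip-flop alternates low and high byte reads). It interleaves two "latch, read low, read high" sequences by hand, the way two unserialized callers could.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of PIT counter 0 in low-byte/high-byte access mode. */
struct sim_pit {
    uint16_t count;   /* live counter value */
    uint16_t latch;   /* value captured by the latch command */
    int latched;      /* non-zero while a latched value is pending */
    int hi_next;      /* read flip-flop: 0 = low byte next, 1 = high byte next */
};

/* Models writing the counter-latch command to the control port. */
static void sim_latch(struct sim_pit *p)
{
    assert(p != 0);
    if (!p->latched) {      /* a second latch before the read ends is ignored */
        p->latch = p->count;
        p->latched = 1;
    }
}

/* Models one byte read from the counter's data port. */
static uint8_t sim_read(struct sim_pit *p)
{
    uint16_t src = p->latched ? p->latch : p->count;
    uint8_t byte = p->hi_next ? (uint8_t)(src >> 8) : (uint8_t)(src & 0xff);

    if (p->hi_next)
        p->latched = 0;     /* reading the high byte completes the sequence */
    p->hi_next = !p->hi_next;
    return byte;
}

/* Correct sequence: latch, low byte, high byte, with nothing in between. */
static uint16_t sim_read_count(struct sim_pit *p)
{
    uint8_t lo, hi;

    sim_latch(p);
    lo = sim_read(p);
    hi = sim_read(p);
    return (uint16_t)(lo | (hi << 8));
}

/* Reader B cuts in after reader A has read only the low byte.
 * Returns A's result and stores B's result in *b_out. */
static uint16_t sim_interleaved(struct sim_pit *p, uint16_t *b_out)
{
    uint8_t a_lo, a_hi, b_lo, b_hi;

    sim_latch(p);           /* A latches */
    a_lo = sim_read(p);     /* A reads the low byte of its latch */
    sim_latch(p);           /* B latches: ignored, A's latch is pending */
    b_lo = sim_read(p);     /* B gets the HIGH byte of A's latched value */
    b_hi = sim_read(p);     /* latch consumed: B gets low byte of live count */
    a_hi = sim_read(p);     /* A gets high byte of live count, not its latch */

    *b_out = (uint16_t)(b_lo | (b_hi << 8));
    return (uint16_t)(a_lo | (a_hi << 8));
}
```

With the counter held at 0x1234, the uninterrupted sequence returns 0x1234, while in the interleaved case reader B ends up with its bytes swapped. In this model, any single lock shared by every caller (rather than a mix of spin_lock() and cli()) would rule the interleaving out.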
Re: Possible critical VIA vt82c686a chip bug (private question)
On Thu, 26 Oct 2000, Vojtech Pavlik wrote:

> On Thu, Oct 26, 2000 at 12:04:21PM -0400, Richard B. Johnson wrote:
>
> > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > to the timer. It writes 0 to the control-word for timer 0. This
> > does the following:

[Snipped...]

>
> Well, at least on 2.4.0-test9, the above timing code is #ifed to
> DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> include/linux/ide.h.
>
> So this is not our problem here. Anyway I guess it's time to hunt for
> i8259 accesses in the kernel that lack the necessary spinlock, even when
> they're not probably the cause of the problem we see here.

Okay, good.

Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.
Re: Possible critical VIA vt82c686a chip bug (private question)
On Thu, Oct 26, 2000 at 12:04:21PM -0400, Richard B. Johnson wrote:

> ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> to the timer. It writes 0 to the control-word for timer 0. This
> does the following:
>
> o Selects timer 0.
> o Latches the timer.
> o Selects mode 0.
> o Programs it to a 16 bit counter.
>
> The result is a latched (stopped) counter. Bits 5 and 4 should have been
> selected. Then you read bits 0-7 from 0x40, followed by bits 8-15 from
> the same port.
>
> Also, there is no spin-lock protecting access to these ports. If anybody
> else is mucking with the timer, all bets are off.

Well, at least on 2.4.0-test9, the above timing code is #ifed to
DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
include/linux/ide.h.

So this is not our problem here. Anyway I guess it's time to hunt for
i8259 accesses in the kernel that lack the necessary spinlock, even when
they're not probably the cause of the problem we see here.

--
Vojtech Pavlik
SuSE Labs
Re: Possible critical VIA vt82c686a chip bug (private question)
On 26 Oct 2000, Yoann Vandoorselaere wrote:

> Vojtech Pavlik <[EMAIL PROTECTED]> writes:

[Snipped...]

../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
to the timer. It writes 0 to the control-word for timer 0. This
does the following:

o Selects timer 0.
o Latches the timer.
o Selects mode 0.
o Programs it to a 16 bit counter.

The result is a latched (stopped) counter. Bits 5 and 4 should have been
selected. Then you read bits 0-7 from 0x40, followed by bits 8-15 from
the same port.

Also, there is no spin-lock protecting access to these ports. If anybody
else is mucking with the timer, all bets are off.

Cheers,
Dick Johnson

Penguin : Linux version 2.2.17 on an i686 machine (801.18 BogoMips).

"Memory is like gasoline. You use it up when you are running. Of
course you get it all back when you reboot..."; Actual explanation
obtained from the Micro$oft help desk.
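The bit fields discussed in the message above can be checked with a small decoder. This is a minimal sketch based on the usual i8253/i8254 control-word layout (counter select in bits 7-6, access in bits 5-4, mode in bits 3-1, BCD in bit 0); `struct pit_ctrl`, `pit_decode`, `pit_access_name` and `pit_describe` are hypothetical names for illustration, not kernel functions. Under that layout, access bits 00 form the counter-latch command, for which the datasheet treats the lower four bits as don't-care.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Fields of an i8253/i8254 control word, as written to port 0x43. */
struct pit_ctrl {
    unsigned counter;  /* bits 7-6: counter select */
    unsigned access;   /* bits 5-4: 0 latch, 1 low byte, 2 high byte, 3 low then high */
    unsigned mode;     /* bits 3-1: raw operating-mode field */
    unsigned bcd;      /* bit 0: 1 = BCD counting, 0 = binary */
};

/* Split a control byte into its four fields. */
static struct pit_ctrl pit_decode(uint8_t cw)
{
    struct pit_ctrl c;

    c.counter = (cw >> 6) & 0x3;
    c.access  = (cw >> 4) & 0x3;
    c.mode    = (cw >> 1) & 0x7;
    c.bcd     = cw & 0x1;
    return c;
}

/* Human-readable name for the access field. */
static const char *pit_access_name(unsigned access)
{
    static const char *const names[] = {
        "latch", "low byte only", "high byte only", "low byte then high byte"
    };

    assert(access < 4);
    return names[access];
}

/* Format a one-line description of a control byte into buf. */
static int pit_describe(uint8_t cw, char *buf, size_t len)
{
    struct pit_ctrl c = pit_decode(cw);

    return snprintf(buf, len, "counter %u, %s, mode field %u, %s",
                    c.counter, pit_access_name(c.access), c.mode,
                    c.bcd ? "BCD" : "binary");
}
```

Decoding 0x00 this way gives counter 0 with the access field set to "latch", which matches the first two bullets in the message. For comparison, 0x34 decodes to counter 0, low byte then high byte, mode field 2, binary.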
Re: Possible critical VIA vt82c686a chip bug (private question)
Vojtech Pavlik <[EMAIL PROTECTED]> writes:

> On Thu, Oct 26, 2000 at 04:42:31PM +0200, Yoann Vandoorselaere wrote:
>
> > > On Thu, Oct 26, 2000 at 04:20:43PM +0200, Yoann Vandoorselaere wrote:
> > >
> > > > ...
> > > >
> > > > Have you any idea what is the relation between time and this chip ?
> > > >
> > > > Also, I'm experiencing the problem for several months on my
> > > > workstation and I never could find where it was coming from...
> > > > how did you do ?
> > >
> > > Well, it integrates both the i8253 PIT and the vt82c586 IDE controller.
> > >
> > > I first located the wrong time was coming from gettimeofday() and not
> > > from the other sources of time the kernel provides. And then I was
> > > tracking the problem (which actually is an underflow - the chip bug
> > > causes some time offset variables go negative - 0x microseconds
> > > is about 1:20 hours). And this way I got to the spot where the patch
> > > cures the problem.
> >
> > Ok, here is what I experienced :
> >
> > First what is strange is that :
> > - I'm using SCSI
> > - I just have an IDE disk for mp3.
> > The IDE subsystem is never used heavily...
> >
> > I've experienced the problem after some time of
> > heavy scsi IO, my screen under X was going black (like with dpms)
> > When I was moving the mouse, the image was coming back
> > for < 1 seconds, then black screen...
> >
> > The only fix was to kill X then to reboot.
> >
> > Anyway, thanks for your explanation...
> > I'll do a feedback for this patch ASAP.
>
> Interesting. If it's caused by SCSI as well (might be), then it's not
> caused by heavy IDE activity but rather than that it could be heavy
> BusMastering activity instead (The IDE chip does BM as well).
>
> I'm still wondering if it could be a Linux kernel bug (bad/concurrent
> accesses to the i8253 registers), this has to be checked.

An easy way to verify the problem is to start 'dbench 128',
I'm gonna do that with and without IDE subsystem to see what happens.

--
Yoann
http://www.mandrakesoft.com/~yoann/
I worry about my child and the Internet all the time, even though she's
too young to have logged on yet. Here's what I worry about. I worry that
10 or 15 years from now, she will come to me and say 'Daddy, where were
you when they took freedom of the press away from the Internet?'
-- Mike Godwin
Re: Possible critical VIA vt82c686a chip bug (private question)
On Thu, Oct 26, 2000 at 04:42:31PM +0200, Yoann Vandoorselaere wrote:

> > On Thu, Oct 26, 2000 at 04:20:43PM +0200, Yoann Vandoorselaere wrote:
> >
> > > ...
> > >
> > > Have you any idea what is the relation between time and this chip ?
> > >
> > > Also, I'm experiencing the problem for several months on my
> > > workstation and I never could find where it was coming from...
> > > how did you do ?
> >
> > Well, it integrates both the i8253 PIT and the vt82c586 IDE controller.
> >
> > I first located the wrong time was coming from gettimeofday() and not
> > from the other sources of time the kernel provides. And then I was
> > tracking the problem (which actually is an underflow - the chip bug
> > causes some time offset variables go negative - 0x microseconds
> > is about 1:20 hours). And this way I got to the spot where the patch
> > cures the problem.
>
> Ok, here is what I experienced :
>
> First what is strange is that :
> - I'm using SCSI
> - I just have an IDE disk for mp3.
> The IDE subsystem is never used heavily...
>
> I've experienced the problem after some time of
> heavy scsi IO, my screen under X was going black (like with dpms)
> When I was moving the mouse, the image was coming back
> for < 1 seconds, then black screen...
>
> The only fix was to kill X then to reboot.
>
> Anyway, thanks for your explanation...
> I'll do a feedback for this patch ASAP.

Interesting. If it's caused by SCSI as well (might be), then it's not
caused by heavy IDE activity but rather than that it could be heavy
BusMastering activity instead (The IDE chip does BM as well).

I'm still wondering if it could be a Linux kernel bug (bad/concurrent
accesses to the i8253 registers), this has to be checked.

--
Vojtech Pavlik
SuSE Labs
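The underflow mentioned in the quoted explanation can be illustrated with a few lines of arithmetic. This is a sketch under a stated assumption: that the offset is a 32-bit microsecond quantity that ends up interpreted as unsigned. The thread does not say which variable is involved, and the hex figure in the message is truncated, so `usec_as_unsigned` and `usec_to_minutes` are hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a signed microsecond offset as an unsigned 32-bit value.
 * The conversion is well defined in C: the result is reduced modulo 2^32. */
static uint32_t usec_as_unsigned(int32_t offset_us)
{
    return (uint32_t)offset_us;
}

/* Whole minutes contained in a microsecond count (truncating). */
static uint32_t usec_to_minutes(uint32_t us)
{
    return us / 1000000u / 60u;
}
```

Under that assumption, an offset of -1 microsecond becomes 4294967295 microseconds, which is 4294 whole seconds or 71 whole minutes: a jump of a little over an hour, the same order of magnitude as the one described in the message.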
Re: Possible critical VIA vt82c686a chip bug (private question)
Vojtech Pavlik <[EMAIL PROTECTED]> writes:

> On Thu, Oct 26, 2000 at 10:11:54PM +0200, Yoann Vandoorselaere wrote:
>
> > > > > > ../drivers/block/ide.c, line 162, on version 2.2.17 does bad things
> > > > > > to the timer. It writes 0 to the control-word for timer 0. This
> > > > > > does the following:
> > > > [Snipped...]
> > > > >
> > > > > Well, at least on 2.4.0-test9, the above timing code is #ifed to
> > > > > DISK_RECOVERY_TIME > 0, which in turn is #defined to 0 in
> > > > > include/linux/ide.h.
> > > > >
> > > > > So this is not our problem here. Anyway I guess it's time to hunt for
> > > > > i8259 accesses in the kernel that lack the necessary spinlock, even when
> > > > > they're not probably the cause of the problem we see here.
> > > >
> > > > Okay, good.
> > >
> > > Ok, here is a list of places within the kernel that access the PIT
> > > timer, plus the method of locking (i386 arch only):
> >
> > [...]
> >
> > Ok, I just tested if the problem was always present without
> > the IDE subsystem...
> >
> > The answer is it is not... so it isn't an IDE problem.
>
> Uh, guess too many negations. You wanted to say that the problem was
> present even when you disabled the IDE subsystem, right?

yop

> So now it seems that possibly enough PCI traffic / busmastering traffic
> can cause the problem ...

yop, I've done :
make -j10 World
in the xfree tree and simultaneously :
while true; do make dep make clean make bzImage; done
in the kernel tree

--
Yoann
http://www.mandrakesoft.com/~yoann/
An engineer from NVidia, while asking him to release cards specs said :
"Actually, we do write our drivers without documentation."
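While a load like the one described above is running, a small watcher can flag the moment the clock misbehaves. The following is a minimal userspace sketch, not something posted in the thread: `classify_step`, `watch_clock` and the threshold parameter `max_gap_us` are hypothetical names, and the only system call used is the standard gettimeofday().

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

enum step_kind { STEP_OK, STEP_BACKWARD, STEP_FORWARD_JUMP };

/* Convert a struct timeval to microseconds since the epoch. */
static int64_t tv_to_us(const struct timeval *tv)
{
    return (int64_t)tv->tv_sec * 1000000 + tv->tv_usec;
}

/* Classify the step between two consecutive clock samples. */
static enum step_kind classify_step(int64_t prev_us, int64_t cur_us,
                                    int64_t max_gap_us)
{
    if (cur_us < prev_us)
        return STEP_BACKWARD;
    if (cur_us - prev_us > max_gap_us)
        return STEP_FORWARD_JUMP;
    return STEP_OK;
}

/* Sample gettimeofday() `samples` times back to back and return the
 * number of suspicious steps seen between consecutive samples. */
static long watch_clock(long samples, int64_t max_gap_us)
{
    struct timeval tv;
    int64_t prev, cur;
    long bad = 0;

    assert(max_gap_us > 0);
    gettimeofday(&tv, 0);
    prev = tv_to_us(&tv);
    while (samples-- > 0) {
        gettimeofday(&tv, 0);
        cur = tv_to_us(&tv);
        if (classify_step(prev, cur, max_gap_us) != STEP_OK)
            bad++;
        prev = cur;
    }
    return bad;
}
```

Because the watcher itself can be descheduled for a long time on a heavily loaded machine, the threshold should be generous (tens of seconds rather than milliseconds); a jump of more than an hour would still stand out clearly.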
Re: Possible critical VIA vt82c686a chip bug (private question)
On Thu, Oct 26, 2000 at 11:05:04PM +0200, Yoann Vandoorselaere wrote:

> yop, I've done :
> make -j10 World
> in the xfree tree and simultaneously :
> while true; do make dep make clean make bzImage; done
> in the kernel tree

Now it'd be nice to verify that the problem also happens when the system
is not running out of memory (which -j10 quite causes I think) ...

--
Vojtech Pavlik
SuSE Labs
Re: Possible critical VIA vt82c686a chip bug (private question)
Vojtech Pavlik <[EMAIL PROTECTED]> writes:

> On Thu, Oct 26, 2000 at 11:05:04PM +0200, Yoann Vandoorselaere wrote:
>
> > yop, I've done :
> > make -j10 World
> > in the xfree tree and simultaneously :
> > while true; do make dep make clean make bzImage; done
> > in the kernel tree
>
> Now it'd be nice to verify that the problem also happens when the system
> is not running out of memory (which -j10 quite causes I think) ...

Nope, my system was loaded, but was usable
(at least until the problem occurred)...

Athlon 750 with 128mb of ram and 103mb of swap.

--
Yoann
http://www.mandrakesoft.com/~yoann/
An engineer from NVidia, while asking him to release cards specs said :
"Actually, we do write our drivers without documentation."
Re: Possible critical VIA vt82c686a chip bug (private question)
On Thu, Oct 26, 2000 at 11:24:38PM +0200, Yoann Vandoorselaere wrote:

> Vojtech Pavlik <[EMAIL PROTECTED]> writes:
>
> > On Thu, Oct 26, 2000 at 11:05:04PM +0200, Yoann Vandoorselaere wrote:
> >
> > > yop, I've done :
> > > make -j10 World
> > > in the xfree tree and simultaneously :
> > > while true; do make dep make clean make bzImage; done
> > > in the kernel tree
> >
> > Now it'd be nice to verify that the problem also happens when the system
> > is not running out of memory (which -j10 quite causes I think) ...
>
> Nope, my system was loaded, but was usable
> (at least until the problem occurred)...

Good to know.

--
Vojtech Pavlik
SuSE Labs