Memory corruption in CURRENT

2002-08-22 Thread Martin Blapp
Hi all, I suspect all the SIG4 and SIG11 problems we see are due to memory corruption in CURRENT. In the latter case, the affected file looks like: case HASH('^', 'e'): case HASH('^', 'i'): case HASH('^' 'o'): \xc0 case HASH('^', 'u'): %case HADH('`', \xc0A'): ^@ase HASH('`', 'E

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
On Thu, Aug 22, 2002 at 09:43:45AM +0200, Martin Blapp wrote: I suspect all the SIG4 and SIG11 problems we see are due to memory corruption in CURRENT. In the latter case, the affected file looks like: case HASH('^', 'e'): case HASH('^', 'i'): case HASH('^' 'o'): \xc0 case

Re: Memory corruption in CURRENT

2002-08-22 Thread Soeren Schmidt
It seems Martin Blapp wrote: Hi all, I suspect all the SIG4 and SIG11 problems we see are due to memory corruption in CURRENT. The file is correct after a reboot, so the corruption was limited to the copy cached in RAM. That's memory corruption. I'm also not able anymore to make

Re: Memory corruption in CURRENT

2002-08-22 Thread Martin Blapp
Hi Soeren, However, this kind of problem in most cases spells bad HW to me, i.e. subspec RAM, poor power supply, badly cooled CPU, overclocking etc etc... That's what I thought too. I now have three different systems which all show this: 1) PIV 1.6 GHz, Intel B845DG board, 1GB Kingston RAM, 2)

Re: Memory corruption in CURRENT

2002-08-22 Thread Soeren Schmidt
It seems Martin Blapp wrote: Hi Soeren, However, this kind of problem in most cases spells bad HW to me, i.e. subspec RAM, poor power supply, badly cooled CPU, overclocking etc etc... That's what I thought too. I now have three different systems which all show this: 1) PIV 1.6 GHz,

Re: Memory corruption in CURRENT

2002-08-22 Thread Udo Schweigert
On Thu, Aug 22, 2002 at 11:23:46 +0200, Martin Blapp wrote: That's what I thought too. I now have three different systems which all show this: 1) PIV 1.6 GHz, Intel B845DG board, 1GB Kingston RAM, 2) PIV 2 GHz, Intel B845DG board, 1GB Kingston ECC RAM, 3) PIV 2.26 GHz, Asus P4B533 board with

Re: Memory corruption in CURRENT

2002-08-22 Thread Don Lewis
On 22 Aug, Mark Santcroos wrote: On Thu, Aug 22, 2002 at 09:43:45AM +0200, Martin Blapp wrote: That's memory corruption. I'm also not able anymore to make 10 buildworlds (without -j, since that triggers panics in the pmap code). By the way, I've been experiencing this for about 4-5 months. All

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Soeren Schmidt wrote: It seems Martin Blapp wrote: I suspect all the SIG4 and SIG11 problems we see are due to memory corruption in CURRENT. The file is correct after a reboot, so the corruption was limited to the copy cached in RAM. That's memory corruption. I'm also not able

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
On Thu, Aug 22, 2002 at 11:30:50AM +0200, Udo Schweigert wrote: Only a little addition from me: I had the same problems on -stable and they only disappeared after compiling the kernel without debugging. I had the impression that it has to do with the size of the kernel (but this of course

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Martin Blapp wrote: I now have three different systems which all show this: 1) PIV 1.6 GHz, Intel B845DG board, 1GB Kingston RAM, 2) PIV 2 GHz, Intel B845DG board, 1GB Kingston ECC RAM, 3) PIV 2.26 GHz, Asus P4B533 board with I845 chipset, 1GB no-name RAM. All are running CURRENT. I also replaced

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
Hi Martin, Since you have known about this problem for longer, have you already tried to make it more reproducible or to narrow it down? If not, we really should try to; that will be the first step in fixing it. Mark -- Mark Santcroos RIPE Network Coordination Centre

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
On Thu, Aug 22, 2002 at 02:38:25AM -0700, Terry Lambert wrote: Alternatively, rather than those options, try losing 512M of the RAM... I note they are all 1G boxes. No, mine is 256MB. Mark -- Mark Santcroos RIPE Network Coordination Centre

Re: Memory corruption in CURRENT

2002-08-22 Thread Don Lewis
On 22 Aug, Soeren Schmidt wrote: However, this kind of problem in most cases spells bad HW to me, i.e. subspec RAM, poor power supply, badly cooled CPU, overclocking etc etc... My motherboard chipset supports ECC RAM and I have ECC RAM installed. I upgraded to an expensive Antec power supply

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Mark Santcroos wrote: On Thu, Aug 22, 2002 at 02:38:25AM -0700, Terry Lambert wrote: Alternatively, rather than those options, try losing 512M of the RAM... I note they are all 1G boxes. No, mine is 256MB. Correction: all of his were 1G, and should be halved. *You*, on the other hand,

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
On Thu, Aug 22, 2002 at 02:33:57AM -0700, Terry Lambert wrote: options DISABLE_PSE options DISABLE_PG_G Coming up next in this theater :-) btw, how does the report that using the other compiler fixed everything for KT fit in? -- Mark Santcroos RIPE

Re: Memory corruption in CURRENT

2002-08-22 Thread KT Sin
Hi, this is what I did to the system's cc/gcc. I built the released version of gcc 3.1.1 from ports (with much pain, of course). passion:/usr/bin[514]# ls -l cc* gcc* lrwxr-xr-x 1 root wheel 20 Aug 12 21:54 cc -> /usr/local/bin/gcc31 -r-xr-xr-x 2 root wheel 135616 Aug 12 21:52 cc.sav lrwxr-xr-x

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Mark Santcroos wrote: On Thu, Aug 22, 2002 at 02:33:57AM -0700, Terry Lambert wrote: options DISABLE_PSE options DISABLE_PG_G Coming up next in this theater :-) btw, how does the report that using the other compiler fixed everything for KT fit in? Coincidentally. It's

Re: Memory corruption in CURRENT

2002-08-22 Thread Don Lewis
On 22 Aug, Terry Lambert wrote: Alternatively, rather than those options, try losing 512M of the RAM... I note they are all 1G boxes. When I first put this system together several months ago, I only installed the first 512M of RAM and the problem was much worse. I only had about a 50% chance

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
On Thu, Aug 22, 2002 at 03:17:03AM -0700, Terry Lambert wrote: Mark Santcroos wrote: On Thu, Aug 22, 2002 at 02:33:57AM -0700, Terry Lambert wrote: options DISABLE_PSE options DISABLE_PG_G Coming up next in this theater :-) btw, how does the report that using the

Re: Memory corruption in CURRENT

2002-08-22 Thread Martin Blapp
Hi, options DISABLE_PSE options DISABLE_PG_G Just added them. I'll now build 20 buildworlds with those enabled. Martin
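
For readers following along, the two options quoted in this message would normally sit one per line in a custom i386 kernel configuration file. The fragment below is only a sketch: the file name MYKERNEL is hypothetical, the comments are an editorial gloss rather than anything stated in the thread, and only the two option lines themselves come from the messages.

```
# Fragment of a custom kernel config file (the name MYKERNEL is hypothetical).
# Only the two "options" lines are taken from the thread.
options 	DISABLE_PSE	# do not use large (4MB) page mappings
options 	DISABLE_PG_G	# do not use global page mappings
```

The kernel has to be rebuilt and booted with these options before repeating the buildworld runs, so that any change in behaviour can be attributed to them.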

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Don Lewis wrote: At the moment I'm running a set of buildworlds with an August 6th kernel, just to verify the problem that I'm seeing isn't something new. When I'm done with that, I'll reduce the RAM from 1G to 512M and try again. I'll also try the DISABLE_PSE and DISABLE_PG_G options.

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Mark Santcroos wrote: On Thu, Aug 22, 2002 at 03:17:03AM -0700, Terry Lambert wrote: Mark Santcroos wrote: On Thu, Aug 22, 2002 at 02:33:57AM -0700, Terry Lambert wrote: options DISABLE_PSE options DISABLE_PG_G Coming up next in this theater :-) btw, how does

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Martin Blapp wrote: options DISABLE_PSE options DISABLE_PG_G Just added them. I'll now build 20 buildworlds with those enabled. Let the list know if it does anything. If Soren could also test, that would give a sample size. If it's a 3-for-3 workaround, then I probably need

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
On Thu, Aug 22, 2002 at 04:23:46AM -0700, Terry Lambert wrote: Ugh! Wait until it seems to work for a statistically significant sample size, and for more than one person before calling it happy! Also, I'm not sure looking at the code whether or not the PG_G is truly significant, or just

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
On Thu, Aug 22, 2002 at 04:31:02AM -0700, Terry Lambert wrote: If it's a 3-for-3 workaround, then I probably need to take the discussion offline with Peter Wemm, and come up with a permanent fix. There was something with non-disclosure, am I right? -- Mark Santcroos

Re: Memory corruption in CURRENT

2002-08-22 Thread Soeren Schmidt
It seems Terry Lambert wrote: Martin Blapp wrote: options DISABLE_PSE options DISABLE_PG_G Just added them. I'll now build 20 buildworlds with those enabled. Let the list know if it does anything. If Soren could also test, that would give a sample size. Sure, but I

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
Hi, Can you revert back to the system compiler and also compile your kernel with these options and do some buildworlds again? Thanks Mark On Thu, Aug 22, 2002 at 01:41:13PM +0200, Soeren Schmidt wrote: It seems Terry Lambert wrote: Martin Blapp wrote: options DISABLE_PSE

Re: Memory corruption in CURRENT

2002-08-22 Thread Soeren Schmidt
It seems Mark Santcroos wrote: Hi, Can you revert back to the system compiler and also compile your kernel with these options and do some buildworlds again? I already use the system compiler... On Thu, Aug 22, 2002 at 01:41:13PM +0200, Soeren Schmidt wrote: Sure, but I don't have the

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
On Thu, Aug 22, 2002 at 01:55:42PM +0200, Soeren Schmidt wrote: Can you revert back to the system compiler and also compile your kernel with these options and do some buildworlds again? I already use the system compiler... That's why the message was addressed to kt ;-) -- Mark Santcroos

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Mark Santcroos wrote: On Thu, Aug 22, 2002 at 04:31:02AM -0700, Terry Lambert wrote: If it's a 3-for-3 workaround, then I probably need to take the discussion offline with Peter Wemm, and come up with a permanent fix. There was something with non-disclosure, am I right? No. If it's

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Mark Santcroos wrote: On Thu, Aug 22, 2002 at 01:41:13PM +0200, Soeren Schmidt wrote: Sure, but I don't have the problem :) I can buildworld for days on my (heavily overclocked btw) Athlon with no problems at all... Can you revert back to the system compiler and also compile your kernel

Re: Memory corruption in CURRENT

2002-08-22 Thread Mark Santcroos
On Thu, Aug 22, 2002 at 05:21:54AM -0700, Terry Lambert wrote: Mark Santcroos wrote: On Thu, Aug 22, 2002 at 01:41:13PM +0200, Soeren Schmidt wrote: Sure, but I don't have the problem :) I can buildworld for days on my (heavily overclocked btw) Athlon with no problems at all... Can

Re: Memory corruption in CURRENT

2002-08-22 Thread Terry Lambert
Mark Santcroos wrote: I was just giving a slight report, not yelling halleluja yet ;-) It's doing the 2nd buildworld now. Do you also want me to try to split up the disabling of the two options? No. Me saying to use both options was just me being lazy about spending 2 days re-documenting

Re: Memory corruption in CURRENT

2002-08-22 Thread Bosko Milekic
We have seen weird problems regarding the pmap PG_G related stuff (well sort of, it has to do with PSE and PG_G) on ppro and pII chips (apparently, this is not the case with at least Xeons) but what happened, for the record, was this: We would enable PSE and switch the pde corresponding to the

Re: Memory corruption in CURRENT

2002-08-22 Thread Martin Blapp
Hi Terry, options DISABLE_PSE options DISABLE_PG_G I'm now at buildworld IV; since I compiled these options in, the bug has not shown up again. Another side effect: before that I could not even run make -j 10 buildworld, which ended with a page fault somewhere in pmap... This