Re: Solved: CURRENT and P-IV problems
On 21 Aug, Don Lewis wrote:
> On 21 Aug, Martin Blapp wrote:
> > Hi,
> >
> > Try to compile the entire system on another box, then install it on the CURRENT target box, and try again!
> >
> > By the way, after 6 rounds, I now see SIG4 and SIG11 too :-/ Too bad - so it's definitely data corruption in CURRENT.
> >
> > Asus Board P4B533-V, P-IV 2.26 GHz, 1GB DDR 2100 RAM.
>
> No sign of any problems here with last night's -current. Gigabyte GA7-DX+, Athlon XP 1900+, 1GB PC2100 ECC DRAM, SCSI disk, NFS client. I'm not running any sound hardware or the Xserver, and I'm accessing the host via ssh instead of the console. The kernel is GENERIC + SMBus. It's on buildworld #7 at the moment:

I guess I spoke too soon. I encountered errors on buildworld iterations 8 and 10.

cc -O -pipe -DIN_GCC -DHAVE_CONFIG_H -DPREFIX=\/usr\ -I/usr/obj/usr/src/gnu/usr.bin/cc/cc_int/../cc_tools -I/usr/src/gnu/usr.bin/cc/cc_int/../cc_tools -I/usr/src/gnu/usr.bin/cc/cc_int/../../../../contrib/gcc -I/usr/src/gnu/usr.bin/cc/cc_int/../../../../contrib/gcc/config -DHAVE_CONFIG_H -DTARGET_NAME=\i386-undermydesk-freebsd\ -DIN_GCC -c /usr/src/contrib/gcc/reload1.c -o reload1.o
/usr/src/contrib/gcc/reload1.c: In function `emit_reload_insns':
/usr/src/contrib/gcc/reload1.c:7339: internal error: Segmentation fault
Please submit a full bug report, with preprocessed source if appropriate.
See URL:http://www.gnu.org/software/gcc/bugs.html for instructions.
*** Error code 1

Stop in /usr/src/gnu/usr.bin/cc/cc_int.

c++ -O -pipe -DHAVE_STDLIB_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DIRENT_H=1 -DHAVE_LIMITS_H=1 -DHAVE_STRING_H=1 -DHAVE_STRINGS_H=1 -DHAVE_MATH_H=1 -DRET_TYPE_SRAND_IS_VOID=1 -DHAVE_SYS_NERR=1 -DHAVE_SYS_ERRLIST=1 -DHAVE_CC_LIMITS_H=1 -DRETSIGTYPE=void -DHAVE_STRUCT_EXCEPTION=1 -DHAVE_GETPAGESIZE=1 -DHAVE_MMAP=1 -DHAVE_FMOD=1 -DHAVE_STRTOL=1 -DHAVE_GETCWD=1 -DHAVE_STRERROR=1 -DHAVE_PUTENV=1 -DHAVE_RENAME=1 -DHAVE_MKSTEMP=1 -DHAVE_STRCASECMP=1 -DHAVE_STRNCASECMP=1 -DHAVE_STRSEP=1 -DHAVE_STRDUP=1 -DSYS_SIGLIST_DECLARED=1 -I/usr/src/gnu/usr.bin/groff/src/utils/addftinfo/../../../../../../contrib/groff/src/include -I/usr/src/gnu/usr.bin/groff/src/utils/addftinfo/../../../src/include -fno-rtti -fno-exceptions -c /usr/src/contrib/groff/src/utils/addftinfo/guess.cc
/usr/src/contrib/groff/src/utils/addftinfo/guess.cc:446:20: missing terminating ' character
/usr/src/contrib/groff/src/utils/addftinfo/guess.cc:447:1: warning: null character(s) ignored
*** Error code 1

Stop in /usr/src/gnu/usr.bin/groff/src/utils/addftinfo.

In the latter case, the affected file looks like:

case HASH('^', 'e'):
case HASH('^', 'i'):
case HASH('^' 'o'): \xc0
case HASH('^', 'u'):
%case HADH('`', \xc0A'):
^@ase HASH('`', 'E'):
case HASH('`', 'I'):
case HASH('`', 'O'):
case HASH('`', 'U'):

The file is correct after a reboot, so the corruption was limited to the copy cached in RAM.

To Unsubscribe: send mail to [EMAIL PROTECTED] with unsubscribe freebsd-current in the body of the message
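One way to characterise corruption like the sample quoted in this message is to compare a damaged line byte by byte against what it should have contained. The following is a minimal Python sketch, not something used in the thread. The helper name `diff_bytes` is hypothetical, and `good_line` is only the presumed original text, inferred from the intact neighbouring `case HASH(...)` lines.

```python
def diff_bytes(good: bytes, bad: bytes):
    """Return (offset, good_byte, bad_byte, xor) for every differing position."""
    if len(good) != len(bad):
        raise ValueError("inputs must have the same length")
    return [(i, g, b, g ^ b)
            for i, (g, b) in enumerate(zip(good, bad))
            if g != b]


# Presumed original line (an assumption) and the damaged line as quoted.
good_line = b"case HASH('`', 'A'):"
bad_line = b"case HADH('`', \xc0A'):"

for offset, g, b, x in diff_bytes(good_line, bad_line):
    # The XOR shows which bits changed; a single-bit flip would give a power of two.
    print(f"offset {offset}: 0x{g:02x} -> 0x{b:02x} (xor 0x{x:02x})")
```

Under that assumption, neither changed byte differs from its original by a single bit, so this sample does not look like an isolated one-bit flip.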
Solved: CURRENT and P-IV problems
Hi,

As reported, Brian and I did see SIG4 and SIG11 during make buildworlds. I've replaced everything two or three times, and the problem persisted. I also tried three motherboards, but all of the same type: Intel BD843BG with DDR 266 RAM (2100).

Just out of interest, I've now replaced this board with an Asus P4B533-V. All segfaults and illegal instructions are gone now. So it seems to be specific to the Intel board.

A BIOS update did not help, and neither did changing the timing settings. The default settings produce these errors. It happens rarely on STABLE, often on CURRENT.

What issue could this be with the Intel-manufactured board? Is it a design issue, or could it still be a FreeBSD bug? Both boards use the same i845 chipset and the same RAM.

Can anyone who experienced those coredumps send me an exact list of the chipsets used?

Martin

Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
--
ImproWare AG, UNIXSP ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP: finger -l [EMAIL PROTECTED]
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
--
Re: Solved: CURRENT and P-IV problems
On Wed, Aug 21, 2002 at 02:48:38PM +0200, Martin Blapp wrote:
> Hi,
>
> As reported, Brian and I did see SIG4 and SIG11 during make buildworlds. I've replaced everything two or three times, and the problem persisted. I also tried three motherboards, but all of the same type: Intel BD843BG with DDR 266 RAM (2100).
>
> Just out of interest, I've now replaced this board with an Asus P4B533-V. All segfaults and illegal instructions are gone now. So it seems to be specific to the Intel board.
>
> A BIOS update did not help, and neither did changing the timing settings. The default settings produce these errors. It happens rarely on STABLE, often on CURRENT.
>
> What issue could this be with the Intel-manufactured board? Is it a design issue, or could it still be a FreeBSD bug? Both boards use the same i845 chipset and the same RAM.
>
> Can anyone who experienced those coredumps send me an exact list of the chipsets used?

Hi Martin,

I have a P4 mobile in my laptop (Dell C640) and also had this behaviour for a certain -current window last week (the time I got the laptop). Now it is gone. I'm sorry, I don't have an exact date/commit.

It has the Intel Mobile 845MP chipset.

Mark

--
Mark Santcroos    RIPE Network Coordination Centre
http://www.ripe.net/home/mark/    New Projects Group/TTM
Re: Solved: CURRENT and P-IV problems
Hi,

> I have a P4 mobile in my laptop and also had this behaviour for a certain -current window last week (the time I got the laptop). (Dell C640) Now it is gone. I'm sorry, I don't have an exact date/commit.

Try to do some worlds in a row (5-10) and you will see if it survives. The problem here was that it worked sometimes and sometimes not, but I could never finish 10 worlds in a row. Can you try that?

> It has the Intel Mobile 845MP chipset.

Ah, well, mine is here:

http://www.intel.com/design/motherbd/bg/index.htm?iid=ipp_dlc_deskmb+p4pmb_D845BG;

It's a normal Intel 845 chipset.

Martin
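The repeated-build test suggested in this message can be scripted so that each round leaves its own log and the run stops at the first failure. The following is a minimal Python sketch, not something used in the thread. `run_rounds`, its parameters and the log-file naming are hypothetical, and the demonstration runs a harmless command in a temporary directory rather than an actual buildworld.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_rounds(cmd, rounds, log_dir, prefix="buildworld"):
    """Run cmd up to `rounds` times; return how many rounds completed cleanly."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    for n in range(1, rounds + 1):
        log_path = log_dir / f"{prefix}-{n}"
        # Keep one log per round so a failing round can be inspected afterwards.
        with open(log_path, "wb") as log:
            result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
        if result.returncode != 0:
            print(f"round {n} failed with status {result.returncode}; see {log_path}")
            return n - 1
    return rounds


# Harmless demonstration: a command that always succeeds, run three times.
with tempfile.TemporaryDirectory() as tmp:
    print(run_rounds([sys.executable, "-c", "pass"], 3, tmp))
```

For a real test, substitute the local buildworld command line for the demonstration command and pick a log directory that survives a crash.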
Re: Solved: CURRENT and P-IV problems
On Wed, Aug 21, 2002 at 03:32:44PM +0200, Martin Blapp wrote:
> > I have a P4 mobile in my laptop and also had this behaviour for a certain -current window last week (the time I got the laptop). (Dell C640) Now it is gone. I'm sorry, I don't have an exact date/commit.
>
> Try to do some worlds in a row (5-10) and you will see if it survives. The problem here was that it worked sometimes and sometimes not, but I could never finish 10 worlds in a row. Can you try that?

Eek! The first one already bailed out! Talk about bad luck! Signal 10 this time.

Now that I come to think of it, I didn't do many buildworlds, mostly buildkernels.

Mark
Re: Solved: CURRENT and P-IV problems
On Wed, Aug 21, 2002 at 04:07:14PM +0200, Mark Santcroos wrote:
> On Wed, Aug 21, 2002 at 03:32:44PM +0200, Martin Blapp wrote:
> > > I have a P4 mobile in my laptop and also had this behaviour for a certain -current window last week (the time I got the laptop). (Dell C640) Now it is gone. I'm sorry, I don't have an exact date/commit.
> >
> > Try to do some worlds in a row (5-10) and you will see if it survives. The problem here was that it worked sometimes and sometimes not, but I could never finish 10 worlds in a row. Can you try that?
>
> Eek! The first one already bailed out! Talk about bad luck! Signal 10 this time.

The 2nd one fails at exactly the same point (libutil), also with a signal 10. That can't be coincidence.

The 3rd ended somewhere else (games/rogue), but now with signal 4.

Doing the 4th now.

Mark
Re: Solved: CURRENT and P-IV problems
On Wed, Aug 21, 2002 at 04:45:54PM +0200, Mark Santcroos wrote:
> > > Can you try that?
> >
> > Eek! The first one already bailed out! Talk about bad luck! Signal 10 this time.
>
> The 2nd one fails at exactly the same point (libutil), also with a signal 10. That can't be coincidence.
>
> The 3rd ended somewhere else (games/rogue), but now with signal 4.
>
> Doing the 4th now.

Which ends with signal 11 in usr.sbin/devinfo/ ...

Mark
Re: Solved: CURRENT and P-IV problems
Hi,

> > The 2nd one fails at exactly the same point (libutil), also with a signal 10. That can't be coincidence.
> >
> > The 3rd ended somewhere else (games/rogue), but now with signal 4.
> >
> > Doing the 4th now.
>
> Which ends with signal 11 in usr.sbin/devinfo/ ...
>
> Mark

Might this be the memory corruption other users (Alfred, David) are seeing?

I'll start a new run of 10 builds here and see where it ends.

Martin
Re: Solved: CURRENT and P-IV problems
Hi,

I believe this is caused by the pre-release version of gcc in the system. I started seeing this problem one week after I upgraded my hardware to Pentium 4 in May.

Two weeks ago, I built the final release version of gcc 3.1.1 from ports and used that to compile the kernel and userland. All the strange signals have disappeared since then.

Good luck,

kt

On Wed, Aug 21, 2002 at 05:27:24PM +0200, Martin Blapp wrote:
> Hi,
>
> > > The 2nd one fails at exactly the same point (libutil), also with a signal 10. That can't be coincidence.
> > >
> > > The 3rd ended somewhere else (games/rogue), but now with signal 4.
> > >
> > > Doing the 4th now.
> >
> > Which ends with signal 11 in usr.sbin/devinfo/ ...
> >
> > Mark
>
> Might this be the memory corruption other users (Alfred, David) are seeing?
>
> I'll start a new run of 10 builds here and see where it ends.
>
> Martin
Re: Solved: CURRENT and P-IV problems
On Wed, 21 Aug 2002 14:48:38 +0200 (CEST) Martin Blapp [EMAIL PROTECTED] wrote:
> As reported, Brian and I did see SIG4 and SIG11 during make buildworlds. I've replaced everything two or three times, and the problem persisted. I also tried three motherboards, but all of the same type: Intel BD843BG with DDR 266 RAM (2100).
>
> Can anyone who experienced those coredumps send me an exact list of the chipsets used?

Are you interested in a pciconf -v -l?

Mainboard: Intel D845BGL Sockel478 bulk WA/WL
System: 4.6-REL

Mostly SIG4, very few SIG11 (both in buildworlds, but not in every buildworld).

Bye,
Alexander.

--
I believe the technical term is Oops!
http://www.Leidinger.net    Alexander @ Leidinger.net
GPG fingerprint = C518 BC70 E67F 143F BE91 3365 79E2 9C60 B006 3FE7
Re: Solved: CURRENT and P-IV problems
On Wed, Aug 21, 2002 at 11:38:20PM +0800, KT Sin wrote:
> Hi,
>
> I believe this is caused by the pre-release version of gcc in the system. I started seeing this problem one week after I upgraded my hardware to Pentium 4 in May.
>
> Two weeks ago, I built the final release version of gcc 3.1.1 from ports and used that to compile the kernel and userland. All the strange signals have disappeared since then.

Thanks for the hint, I'm building /usr/ports/lang/gcc31/ now.

Mark
Re: Solved: CURRENT and P-IV problems
On Wed, Aug 21, 2002 at 08:05:17PM +0200, Mark Santcroos wrote:
> On Wed, Aug 21, 2002 at 11:38:20PM +0800, KT Sin wrote:
> > Hi,
> >
> > I believe this is caused by the pre-release version of gcc in the system. I started seeing this problem one week after I upgraded my hardware to Pentium 4 in May.
> >
> > Two weeks ago, I built the final release version of gcc 3.1.1 from ports and used that to compile the kernel and userland. All the strange signals have disappeared since then.
>
> Thanks for the hint, I'm building /usr/ports/lang/gcc31/ now.

/usr/local/bin/gcc31 -fpic -DPIC -O -pipe -D_IEEE_LIBM -D_ARCH_INDIRECT=i387_ -c /usr/src/lib/msun/src/s_nextafter.c -o s_nextafter.So
Illegal instruction (core dumped)
*** Error code 132

Stop in /usr/src.
*** Error code 1

That didn't work. I will now try the other gccs from ports.

Mark
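The "Error code 132" in the build output above is consistent with the usual shell convention of reporting 128 plus the signal number when a command is killed by a signal, which gives signal 4 and matches the "Illegal instruction" message. The following is a minimal Python sketch of that decoding, not something from the thread. The function name `decode_status` is hypothetical, and the table hard-codes only the FreeBSD numbers for the three signals mentioned in this discussion.

```python
# FreeBSD signal numbers for the three signals reported in this thread.
FREEBSD_SIGNALS = {4: "SIGILL", 10: "SIGBUS", 11: "SIGSEGV"}


def decode_status(code):
    """Map a shell-style exit status (128 + signal) to a signal name, if any."""
    if code > 128:
        signum = code - 128
        return FREEBSD_SIGNALS.get(signum, f"signal {signum}")
    # Statuses up to 128 are ordinary exit codes, not signals.
    return None


print(decode_status(132))
```

The table is hard-coded because signal numbering differs between systems; signal 10, for instance, is SIGBUS on FreeBSD but not on Linux.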
Re: Solved: CURRENT and P-IV problems
Hi,

Try to compile the entire system on another box, then install it on the CURRENT target box, and try again!

By the way, after 6 rounds, I now see SIG4 and SIG11 too :-/ Too bad - so it's definitely data corruption in CURRENT.

Asus Board P4B533-V, P-IV 2.26 GHz, 1GB DDR 2100 RAM.

Martin
Re: Solved: CURRENT and P-IV problems
Hi,

Please try to continue from where it broke by repeating make. Otherwise, please get a precompiled port package.

kt

On Wed, Aug 21, 2002 at 10:26:37PM +0200, Mark Santcroos wrote:
> On Wed, Aug 21, 2002 at 08:05:17PM +0200, Mark Santcroos wrote:
> > On Wed, Aug 21, 2002 at 11:38:20PM +0800, KT Sin wrote:
> > > Hi,
> > >
> > > I believe this is caused by the pre-release version of gcc in the system. I started seeing this problem one week after I upgraded my hardware to Pentium 4 in May.
> > >
> > > Two weeks ago, I built the final release version of gcc 3.1.1 from ports and used that to compile the kernel and userland. All the strange signals have disappeared since then.
> >
> > Thanks for the hint, I'm building /usr/ports/lang/gcc31/ now.
>
> /usr/local/bin/gcc31 -fpic -DPIC -O -pipe -D_IEEE_LIBM -D_ARCH_INDIRECT=i387_ -c /usr/src/lib/msun/src/s_nextafter.c -o s_nextafter.So
> Illegal instruction (core dumped)
> *** Error code 132
>
> Stop in /usr/src.
> *** Error code 1
>
> That didn't work. I will now try the other gccs from ports.
>
> Mark
Re: Solved: CURRENT and P-IV problems
On 21 Aug, Martin Blapp wrote:
> Hi,
>
> Try to compile the entire system on another box, then install it on the CURRENT target box, and try again!
>
> By the way, after 6 rounds, I now see SIG4 and SIG11 too :-/ Too bad - so it's definitely data corruption in CURRENT.
>
> Asus Board P4B533-V, P-IV 2.26 GHz, 1GB DDR 2100 RAM.

No sign of any problems here with last night's -current. Gigabyte GA7-DX+, Athlon XP 1900+, 1GB PC2100 ECC DRAM, SCSI disk, NFS client. I'm not running any sound hardware or the Xserver, and I'm accessing the host via ssh instead of the console. The kernel is GENERIC + SMBus. It's on buildworld #7 at the moment:

-rw-r--r-- 1 root wheel 5321510 Aug 21 14:43 /var/tmp/buildworld24-1
-rw-r--r-- 1 root wheel 5321510 Aug 21 15:38 /var/tmp/buildworld24-2
-rw-r--r-- 1 root wheel 5321510 Aug 21 16:33 /var/tmp/buildworld24-3
-rw-r--r-- 1 root wheel 5321510 Aug 21 17:28 /var/tmp/buildworld24-4
-rw-r--r-- 1 root wheel 5321510 Aug 21 18:23 /var/tmp/buildworld24-5
-rw-r--r-- 1 root wheel 5321510 Aug 21 19:18 /var/tmp/buildworld24-6
-rw-r--r-- 1 root wheel 826096 Aug 21 19:28 /var/tmp/buildworld24-7

I was having filesystem corruption problems a couple of months ago, but haven't seen any of these problems in ages. The only outstanding problem is a lock order reversal in the pipe code that is triggered by the OpenOffice port build. Witness complains, but the build completes successfully.

Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #34: Wed Aug 21 01:36:34 PDT 2002
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERICSMB
Preloaded elf kernel /boot/kernel/kernel at 0xc0603000.
Preloaded elf module /boot/kernel/acpi.ko at 0xc06030a8.
Timecounter i8254 frequency 1193182 Hz
Timecounter TSC frequency 1608231091 Hz
CPU: AMD Athlon(tm) XP 1900+ (1608.23-MHz 686-class CPU)
Origin = AuthenticAMD Id = 0x662 Stepping = 2
Features=0x383fbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
AMD Features=0xc048MP,AMIE,DSP,3DNow!
real memory = 1073676288 (1048512K bytes)
avail memory = 1035575296 (1011304K bytes)
Pentium Pro MTRR support enabled
Using $PIR table, 11 entries at 0xc00fdc30
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: GBTAWRDACPI on motherboard
acpi0: power button is handled as a fixed feature programming model.
Timecounter ACPI-fast frequency 3579545 Hz
acpi_timer0: 24-bit timer at 3.579545MHz port 0x4008-0x400b on acpi0
acpi_cpu0: CPU on acpi0
acpi_button0: Power Button on acpi0
acpi_button1: Sleep Button on acpi0
acpi_pcib0: Host-PCI bridge port 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
pci0: PCI bus on acpi_pcib0
agp0: AMD 761 host to AGP bridge port 0xc000-0xc003 mem 0xef02-0xef020fff,0xe800-0xebff at device 0.0 on pci0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
pci1: display, VGA at device 5.0 (no driver attached)
isab0: PCI-ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
atapci0: VIA 82C686 ATA100 controller port 0xc400-0xc40f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: VIA 83C572 USB controller port 0xc800-0xc81f irq 5 at device 7.2 on pci0
usb0: VIA 83C572 USB controller on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub0: port error, restarting port 1
uhub0: port error, giving up port 1
uhub0: port error, restarting port 2
uhub0: port error, giving up port 2
uhci1: VIA 83C572 USB controller port 0xcc00-0xcc1f irq 5 at device 7.3 on pci0
usb1: VIA 83C572 USB controller on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub1: port error, restarting port 1
uhub1: port error, giving up port 1
uhub1: port error, restarting port 2
uhub1: port error, giving up port 2
viapropm0: SMBus I/O base at 0x5000
viapropm0: VIA VT82C686A Power Management Unit port 0x5000-0x500f at device 7.4 on pci0
viapropm0: SMBus revision code 0x40
smb0: SMBus generic I/O on smbus0
fxp0: Intel Pro 10/100B/100+ Ethernet port 0xe000-0xe03f mem 0xef00-0xef01,0xef021000-0xef021fff irq 10 at device 10.0 on pci0
fxp0: Ethernet address 00:02:b3:5c:8b:82
inphy0: i82555 10/100 media interface on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc_pci0: Adaptec 19160B Ultra160 SCSI adapter port 0xe400-0xe4ff mem 0xef022000-0xef022fff irq 11 at device 12.0 on pci0
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
fdc0: enhanced floppy controller (i82077, NE72065 or clone) port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1440-KB 3.5 drive on fdc0 drive 0
sio0 port 0x3f8-0x3ff irq 4
Re: CURRENT and P-IV problems
Hi all,

I can tell now for sure that all SIG11 and SIG4 problems are gone with make buildworld if I compile make(1), rm(1) and mkdir(1) here with -g -ggdb.

If I don't do that, make world stops after 4 - 30 seconds.

So it could definitely be some optimizer bug in our gcc. And this bug seems to be present in gcc 2.95.4 as well as in the new gcc 3.1.

> The VAX and Windows debuggers are famous for making pointer errors disappear when you compile /debug. GDB is better at not doing this, but isn't perfect.
>
> Compiling with and without debug will yield different code. -g makes binaries bigger, and prevents some optimizations, even if you aren't telling the compiler to optimize.
>
> Does a strip -g'ed version of the -g compiled binary have the same problem?

No. This still works fine. I can compile with -g -ggdb and then strip the binary, and it still works fine.

> Also, an objdump -p comparison of the two might be informative; there were a number of problems in Alpha-land when the compiler assumptions changed because of the new binutils. This might be a similar problem to the ld.so problems there, only with the ELF loader code.

With -g -ggdb:

Program Header:
LOAD off 0x vaddr 0x08048000 paddr 0x08048000 align 2**12
filesz 0x0004e9ed memsz 0x0004e9ed flags r-x
LOAD off 0x0004ea00 vaddr 0x08097a00 paddr 0x08097a00 align 2**12
filesz 0x1598 memsz 0x00010d70 flags rw-
NOTE off 0x0094 vaddr 0x08048094 paddr 0x08048094 align 2**2
filesz 0x0018 memsz 0x0018 flags r--

The problematic version here on the PIV 2 GHz:

# objdump -p /bin/rm

/bin/rm: file format elf32-i386

Program Header:
LOAD off 0x vaddr 0x08048000 paddr 0x08048000 align 2**12
filesz 0x0004e56d memsz 0x0004e56d flags r-x
LOAD off 0x0004e580 vaddr 0x08097580 paddr 0x08097580 align 2**12
filesz 0x1598 memsz 0x00010d70 flags rw-
NOTE off 0x0094 vaddr 0x08048094 paddr 0x08048094 align 2**2
filesz 0x0018 memsz 0x0018 flags r--
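The two program-header listings in this message can be compared numerically. The following is a minimal Python sketch, not something from the thread. The dictionary keys and the helper `header_deltas` are hypothetical names, and only values that appear intact in both listings are used.

```python
# Values copied from the two objdump -p listings in the message.
debug_build = {
    "text_filesz": 0x0004E9ED,
    "data_off": 0x0004EA00,
    "data_vaddr": 0x08097A00,
}
failing_build = {
    "text_filesz": 0x0004E56D,
    "data_off": 0x0004E580,
    "data_vaddr": 0x08097580,
}


def header_deltas(a, b):
    """Return the field-by-field difference a - b."""
    return {key: a[key] - b[key] for key in a}


for field, delta in header_deltas(debug_build, failing_build).items():
    print(f"{field}: {delta:#x} ({delta} bytes)")
```

All three differences come out the same: the first LOAD segment of the -g -ggdb build is 0x480 bytes larger, and the second segment is shifted by the same amount. That fits the remark quoted in this message that compiling with and without debug yields different code, though the numbers alone do not show what changed.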
Re: CURRENT and P-IV problems
Hi,

I'm seeing very strange effects here now. I upgraded to the newest CURRENT yesterday, installed on the PIV machine over NFS.

Now rm(1) and make(1) coredump with sig 10. So I thought it would be a good idea to recompile them with -g -ggdb and retry.

Now the strange part: the coredumps are gone. OK, I did not use -pipe then. I will now try to use -pipe, -g and -ggdb all together.

How the fuck can this have an effect on these coredumps???

Martin
Re: CURRENT and P-IV problems
Martin Blapp wrote:
> Now rm(1) and make(1) coredump with sig 10. So I thought it would be a good idea to recompile them with -g -ggdb and retry.
>
> Now the strange part: the coredumps are gone. OK, I did not use -pipe then. I will now try to use -pipe, -g and -ggdb all together.
>
> How the fuck can this have an effect on these coredumps???

The VAX and Windows debuggers are famous for making pointer errors disappear when you compile /debug. GDB is better at not doing this, but isn't perfect.

Compiling with and without debug will yield different code. -g makes binaries bigger, and prevents some optimizations, even if you aren't telling the compiler to optimize.

Does a strip -g'ed version of the -g compiled binary have the same problem?

Also, an objdump -p comparison of the two might be informative; there were a number of problems in Alpha-land when the compiler assumptions changed because of the new binutils. This might be a similar problem to the ld.so problems there, only with the ELF loader code.

Without more investigation by you, though, all you are going to get is educated guesses.

-- Terry
Re: CURRENT and P-IV problems
> On Sat, May 04, 2002 at 09:26:33PM +0100, Brian Somers wrote:
> > Try disabling -pipe when building the compiler. This seems to make things more stable here (CFLAGS=-O in /etc/make.conf) - as if building the kernel with -pipe sometimes produces a kernel that subsequently murders the compiler with sig11/sig4 all the time.
>
> If so, then we have a bug in our pipe ('|', not 'gcc -pipe') implementation.

That would seem to be the case - assuming my hypothesis is correct - which is far from being a sure thing at this point.

If things stay stable for the next week or so, I'll set the machine up to start doing parallel builds and see where we go from there...

--
Brian [EMAIL PROTECTED][EMAIL PROTECTED]
http://www.freebsd-services.com/    brian@[uk.]FreeBSD.org
Don't _EVER_ lose your sense of humour !    brian@[uk.]OpenBSD.org
Re: CURRENT and P-IV problems
Hi all,

The problem must have been introduced after April 3. I have a kernel.old from that date which runs perfectly.

Maybe this can help to track the bug down.

Martin
Re: CURRENT and P-IV problems
Hi,

I have to take back what I said about a kernel from April 3 running fine. It happens there too, but less often than on recent -current.

The build lives for 5 minutes instead of 30 seconds. Then I get a SIG4 as usual and cc crashes.

Does anybody have an idea which date I can switch back to in order to get rid of my problem?

Martin (still clueless)
CURRENT and P-IV problems
Hi all,

I'm experiencing very strange problems here at the moment with a new server. Buildworld survives about 30 seconds; the errors are SIG4 (90%) and SIG11 (10%). And I cannot compile any important programs :-/

I've exchanged all relevant parts:

- Power Supply: 300W, for PIV with additional CPU supply
- CPU (PIV, 2 GHz, 512K cache)
- RAM with ECC correction
- Board (Intel D845BG)
- SCSI Card (it also happens on ATA)

We have these boards running fine here.

And now to the strange part: it does not happen with STABLE.

This leads me to believe that this is a CURRENT problem.

I'm really, really clueless.

Martin
Re: CURRENT and P-IV problems
I can tell now for sure that it happens on CURRENT only. I replaced the disk with a STABLE one, same model, and have completed a make buildworld -j 20 successfully.

With the CURRENT disk (in this case SCSI, but it also happens on ATA), a buildworld dumps core after 10 - 30 seconds.

Martin
Re: CURRENT and P-IV problems
Hi,

Try disabling -pipe when building the compiler. This seems to make things more stable here (CFLAGS=-O in /etc/make.conf) - as if building the kernel with -pipe sometimes produces a kernel that subsequently murders the compiler with sig11/sig4 all the time.

This is just marginally more than theory at the moment though.

You may need to bootstrap a new kernel by building it on another machine to get things running again.

> Hi all,
>
> I'm experiencing very strange problems here at the moment with a new server. Buildworld survives about 30 seconds; the errors are SIG4 (90%) and SIG11 (10%). And I cannot compile any important programs :-/
>
> I've exchanged all relevant parts:
>
> - Power Supply: 300W, for PIV with additional CPU supply
> - CPU (PIV, 2 GHz, 512K cache)
> - RAM with ECC correction
> - Board (Intel D845BG)
> - SCSI Card (it also happens on ATA)
>
> We have these boards running fine here.
>
> And now to the strange part: it does not happen with STABLE.
>
> This leads me to believe that this is a CURRENT problem.
>
> I'm really, really clueless.
>
> Martin

--
Brian [EMAIL PROTECTED][EMAIL PROTECTED]
http://www.freebsd-services.com/    brian@[uk.]FreeBSD.org
Don't _EVER_ lose your sense of humour !    brian@[uk.]OpenBSD.org
Re: CURRENT and P-IV problems
On Sat, 2002-05-04 at 13:54, Martin Blapp wrote:
> Hi all,
>
> I'm experiencing very strange problems here at the moment with a new server. Buildworld survives about 30 seconds; the errors are SIG4 (90%) and SIG11 (10%). And I cannot compile any important programs :-/
>
> I've exchanged all relevant parts:
>
> - Power Supply: 300W, for PIV with additional CPU supply
> - CPU (PIV, 2 GHz, 512K cache)
> - RAM with ECC correction
> - Board (Intel D845BG)
> - SCSI Card (it also happens on ATA)

Sorry for the "me too" but...well...me too. I don't get very far through *any* build process before I get Signal 4 errors. I also don't have any optimizations set in /etc/make.conf. The machine that is giving me problems also has a PIV (1.6 GHz) processor. Could this issue be specific to Pentium IVs (I know it's a long shot...)?

> We have these boards running fine here.
>
> And now to the strange part: it does not happen with STABLE.

Again, ditto. This does not happen to me with -STABLE either.

> This leads me to believe that this is a CURRENT problem.

I saw some other folks reporting that they were seeing the same thing on their machines and I was just wondering: could those machines also be P-IVs? Or maybe there is another common thread...?

I started with DP1 and have since been able to update my sources and build/installworld successfully, but only through sheer persistence. The problem still exists.

-Scott
Re: CURRENT and P-IV problems
On Sat, May 04, 2002 at 09:26:33PM +0100, Brian Somers wrote:
> Try disabling -pipe when building the compiler. This seems to make things more stable here (CFLAGS=-O in /etc/make.conf) - as if building the kernel with -pipe sometimes produces a kernel that subsequently murders the compiler with sig11/sig4 all the time.

If so, then we have a bug in our pipe ('|', not 'gcc -pipe') implementation.
Re: CURRENT and P-IV problems
On Sat, 4 May 2002, David O'Brien wrote:
> On Sat, May 04, 2002 at 09:26:33PM +0100, Brian Somers wrote:
> > Try disabling -pipe when building the compiler. This seems to make things more stable here (CFLAGS=-O in /etc/make.conf) - as if building the kernel with -pipe sometimes produces a kernel that subsequently murders the compiler with sig11/sig4 all the time.
>
> If so, then we have a bug in our pipe ('|', not 'gcc -pipe') implementation.

I have seen signs of a generic pipe bug in vi: vi's i/o buffer for pipes is sometimes invalid (kern/sys_pipe.c:pipe_build_write_buffer() gets an error faulting it in). This doesn't usually cause signals; it just confuses vi.

Bruce
Re: CURRENT and P-IV problems
Apparently, On Sun, May 05, 2002 at 10:44:44AM +1000, Bruce Evans said words to the effect of;
> On Sat, 4 May 2002, David O'Brien wrote:
> > On Sat, May 04, 2002 at 09:26:33PM +0100, Brian Somers wrote:
> > > Try disabling -pipe when building the compiler. This seems to make
> > > things more stable here (CFLAGS=-O in /etc/make.conf) - as if
> > > building the kernel with -pipe sometimes produces a kernel that
> > > subsequently murders the compiler with sig11/sig4 all the time.
> >
> > If so, then we have a bug in our pipe ('|', not 'gcc -pipe')
> > implementation.
>
> I have seen signs of a generic pipe bug in vi: vi's i/o buffer for
> pipes is sometimes invalid (kern/sys_pipe.c:pipe_build_write_buffer()
> gets an error faulting it in). This doesn't usually cause signals; it
> just confuses vi.

Can you try backing out rev 1.104 of kern/sys_pipe.c?

Jake
Re: CURRENT and P-IV problems
On Sat, 4 May 2002, Jake Burkholder wrote:
> Apparently, On Sun, May 05, 2002 at 10:44:44AM +1000, Bruce Evans said words to the effect of;
> > I have seen signs of a generic pipe bug in vi: vi's i/o buffer for
> > pipes is sometimes invalid (kern/sys_pipe.c:pipe_build_write_buffer()
> > gets an error faulting it in). This doesn't usually cause signals; it
> > just confuses vi.
>
> Can you try backing out rev 1.104 of kern/sys_pipe.c?

I first noticed vi getting confused in the same way (but not the faultin
failure) long before rev.1.104 (in late Jan or early Feb. this year).

Bruce
Re: CURRENT and P-IV problems
On Sun, 5 May 2002, Bruce Evans wrote:
> On Sat, 4 May 2002, Jake Burkholder wrote:
> > Apparently, On Sun, May 05, 2002 at 10:44:44AM +1000, Bruce Evans said words to the effect of;
> > > I have seen signs of a generic pipe bug in vi: vi's i/o buffer for
> > > pipes is sometimes invalid
> > > (kern/sys_pipe.c:pipe_build_write_buffer() gets an error faulting
> > > it in). This doesn't usually cause signals; it just confuses vi.
> >
> > Can you try backing out rev 1.104 of kern/sys_pipe.c?
>
> I first noticed vi getting confused in the same way (but not the
> faultin failure) long before rev.1.104 (in late Jan or early Feb. this
> year).

Anyway, the failure is in the vm_fault_quick() call which is one line
before the pmap_*extract() line that was changed in rev.1.104. (I have
some debugging code that traps if the i386 fubyte() fails, but fubyte()
fails so rarely that I had forgotten about it. It didn't trap back in
Jan/Feb, but I may have been running a plain current kernel without the
debugging code.)

Bruce
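A minimal sketch of a user-space check for the kind of pipe corruption suspected in this subthread: a child process writes a known byte pattern into a pipe and the parent compares a SHA-256 digest of what it reads against the expected one. It is written in Python for brevity. The names `pattern_chunks` and `pipe_roundtrip_ok` and the 64 KB chunk size are hypothetical choices, not anything from the thread, and whether writes of that size exercise the pipe_build_write_buffer() path on a given kernel is an assumption. A passing run does not rule out a bug that only shows up under buildworld load.

```python
import hashlib
import os


def pattern_chunks(total_bytes, chunk):
    """Yield deterministic blocks; each block is filled with its own index."""
    index = 0
    remaining = total_bytes
    while remaining > 0:
        n = min(chunk, remaining)
        yield (index.to_bytes(8, "big") * (n // 8 + 1))[:n]
        index += 1
        remaining -= n


def pipe_roundtrip_ok(total_bytes=1 << 20, chunk=64 * 1024):
    """Return True if the data read from a pipe matches what was written."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: write the pattern, then exit without returning to the caller.
        status = 1
        try:
            os.close(read_fd)
            for block in pattern_chunks(total_bytes, chunk):
                offset = 0
                while offset < len(block):
                    # os.write may accept fewer bytes than requested.
                    offset += os.write(write_fd, block[offset:])
            os.close(write_fd)
            status = 0
        finally:
            os._exit(status)

    # Parent: read until EOF, hashing everything that arrives.
    os.close(write_fd)
    got = hashlib.sha256()
    received = 0
    while True:
        data = os.read(read_fd, chunk)
        if not data:
            break
        got.update(data)
        received += len(data)
    os.close(read_fd)
    _, wait_status = os.waitpid(pid, 0)

    # Recompute the expected digest from the same deterministic pattern.
    want = hashlib.sha256()
    for block in pattern_chunks(total_bytes, chunk):
        want.update(block)
    return (wait_status == 0 and received == total_bytes
            and got.digest() == want.digest())
```

Calling `pipe_roundtrip_ok()` in a loop while a build is running would be one way to look for intermittent failures; varying `chunk` covers both small and large writes.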
Re: CURRENT and P-IV problems
On Sat, May 04, 2002 at 21:54:08 +0200, Martin Blapp wrote:
> Hi all,
>
> I am experiencing very strange problems at the moment with a new
> server. Buildworld survives about 30 seconds; the errors are SIG4 (90%)
> and SIG11 (10%). And I cannot compile any important programs :-/
>
> I've exchanged all relevant parts:
>
> - Power supply: 300W, for PIV with additional CPU supply
> - CPU (PIV, 2 GHz, 512K cache)
> - RAM with ECC correction
> - Board (Intel D845BG)
> - SCSI card (it also happens on ATA)
>
> We have these boards running fine here. And now to the strange part.
> It does not happen with STABLE. This leads me to believe that this is
> a CURRENT problem.

Unfortunately not. For the past week these types of errors (cc -pipe
exiting with some signal) have also occurred on -stable for me with a
P-IV 1.7 GHz. I tried to figure out which commit to RELENG_4 introduced
the problems and found that they began with the ipfilter update to
v3.4.27. But this does not seem to be the real cause, as a workaround I
found (dropping -pipe from CFLAGS) has nothing to do with ipfilter.

Here are some symptoms from my /var/log/messages:

pid 92638 (cc), uid 0: exited on signal 4 (core dumped)
pid 44494 (cc), uid 0: exited on signal 10 (core dumped)
pid 23068 (sed), uid 0: exited on signal 11 (core dumped)
pid 19046 (egrep), uid 0: exited on signal 4 (core dumped)
pid 28452 (sed), uid 0: exited on signal 11 (core dumped)
pid 65784 (cpp0), uid 0: exited on signal 4 (core dumped)
pid 61931 (sed), uid 0: exited on signal 10 (core dumped)
pid 80953 (cc), uid 0: exited on signal 10 (core dumped)
pid 32562 (cc), uid 0: exited on signal 10 (core dumped)
pid 12812 (sed), uid 0: exited on signal 11 (core dumped)
pid 36423 (cc), uid 0: exited on signal 10 (core dumped)
pid 87631 (cc), uid 0: exited on signal 4 (core dumped)
pid 58087 (sed), uid 0: exited on signal 11 (core dumped)

Again: there are no hardware problems (memory, cooling, etc.) here; it
is only related to some kernels. Booting a sufficiently old kernel makes
the system run flawlessly. It would be nice if this could be resolved
before -RELEASE (but it seems to be a difficult bug to track down).

Best regards

Udo Schweigert
-- 
Udo Schweigert, Siemens AG   | Voice : +49 89 636 42170
CT IC 3, Siemens CERT        | Fax   : +49 89 636 41166
D-81730 Muenchen / Germany   | email : [EMAIL PROTECTED]
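To see at a glance which programs die with which signals, log lines like the ones quoted from /var/log/messages can be tallied. A minimal Python sketch, assuming the `pid N (prog), uid N: exited on signal N` format shown in that message; the function name `tally_signals` is hypothetical, and the signal names follow the FreeBSD numbering (4 = SIGILL, 10 = SIGBUS, 11 = SIGSEGV).

```python
import re
from collections import Counter

# FreeBSD signal numbers; note that 10 is SIGBUS on BSD, unlike on Linux.
FREEBSD_SIGNALS = {4: "SIGILL", 10: "SIGBUS", 11: "SIGSEGV"}

# Matches kernel log lines of the form quoted in the message.
EXIT_RE = re.compile(r"pid \d+ \(([^)]+)\), uid \d+: exited on signal (\d+)")

SAMPLE = """\
pid 92638 (cc), uid 0: exited on signal 4 (core dumped)
pid 44494 (cc), uid 0: exited on signal 10 (core dumped)
pid 23068 (sed), uid 0: exited on signal 11 (core dumped)
pid 19046 (egrep), uid 0: exited on signal 4 (core dumped)
pid 28452 (sed), uid 0: exited on signal 11 (core dumped)
pid 65784 (cpp0), uid 0: exited on signal 4 (core dumped)
pid 61931 (sed), uid 0: exited on signal 10 (core dumped)
pid 80953 (cc), uid 0: exited on signal 10 (core dumped)
pid 32562 (cc), uid 0: exited on signal 10 (core dumped)
pid 12812 (sed), uid 0: exited on signal 11 (core dumped)
pid 36423 (cc), uid 0: exited on signal 10 (core dumped)
pid 87631 (cc), uid 0: exited on signal 4 (core dumped)
pid 58087 (sed), uid 0: exited on signal 11 (core dumped)
"""


def tally_signals(log_text):
    """Count abnormal exits per signal number and per program name."""
    by_signal = Counter()
    by_program = Counter()
    for prog, signo in EXIT_RE.findall(log_text):
        by_signal[int(signo)] += 1
        by_program[prog] += 1
    return by_signal, by_program


if __name__ == "__main__":
    by_signal, by_program = tally_signals(SAMPLE)
    for signo, count in by_signal.most_common():
        print(FREEBSD_SIGNALS.get(signo, "signal %d" % signo), count)
    for prog, count in by_program.most_common():
        print(prog, count)
```

For the 13 lines quoted, this counts 5 exits on signal 10, 4 on signal 4 and 4 on signal 11, spread over cc (6), sed (5), egrep (1) and cpp0 (1), so the crashes are not confined to the compiler.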