Re: Solved: CURRENT and P-IV problems

2002-08-22 Thread Don Lewis

On 21 Aug, Don Lewis wrote:
 On 21 Aug, Martin Blapp wrote:
 
 Hi,
 
 Try compiling the entire system on another box, then install it
 on the CURRENT target box and try again!
 
 By the way, after 6 rounds, I now see SIG4 and SIG11 too :-/
 Too bad - so it's definitely data corruption in CURRENT.
 
 Asus Board P4B533-V, P-IV 2.26GHz, 1GB DDR 2100 RAM.
 
 No sign of any problems here with last night's -current.  Gigabyte
 GA7-DX+, Athlon XP 1900+, 1GB PC2100 ECC DRAM, SCSI disk, NFS client.
 I'm not running any sound hardware or the Xserver, and I'm accessing the
 host via ssh instead of the console.  The kernel is GENERIC + SMBus.
 It's on buildworld #7 at the moment:

I guess I spoke too soon.  I encountered errors on buildworld iterations
8 and 10.

cc -O -pipe  -DIN_GCC -DHAVE_CONFIG_H
  -DPREFIX=\/usr\ -I/usr/obj/usr/src/gnu/usr.bin/cc/cc_int/../cc_tools
  -I/usr/src/gnu/usr.bin/cc/cc_int/../cc_tools
  -I/usr/src/gnu/usr.bin/cc/cc_int/../../../../contrib/gcc
  -I/usr/src/gnu/usr.bin/cc/cc_int/../../../../contrib/gcc/config
  -DHAVE_CONFIG_H -DTARGET_NAME=\i386-undermydesk-freebsd\ -DIN_GCC
  -c /usr/src/contrib/gcc/reload1.c -o reload1.o
/usr/src/contrib/gcc/reload1.c: In function `emit_reload_insns':
/usr/src/contrib/gcc/reload1.c:7339: internal error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See URL:http://www.gnu.org/software/gcc/bugs.html for instructions.
*** Error code 1

Stop in /usr/src/gnu/usr.bin/cc/cc_int.


c++  -O -pipe  -DHAVE_STDLIB_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DIRENT_H=1
  -DHAVE_LIMITS_H=1 -DHAVE_STRING_H=1 -DHAVE_STRINGS_H=1 -DHAVE_MATH_H=1
  -DRET_TYPE_SRAND_IS_VOID=1 -DHAVE_SYS_NERR=1 -DHAVE_SYS_ERRLIST=1
  -DHAVE_CC_LIMITS_H=1 -DRETSIGTYPE=void -DHAVE_STRUCT_EXCEPTION=1
  -DHAVE_GETPAGESIZE=1 -DHAVE_MMAP=1 -DHAVE_FMOD=1 -DHAVE_STRTOL=1
  -DHAVE_GETCWD=1 -DHAVE_STRERROR=1 -DHAVE_PUTENV=1 -DHAVE_RENAME=1
  -DHAVE_MKSTEMP=1 -DHAVE_STRCASECMP=1 -DHAVE_STRNCASECMP=1 -DHAVE_STRSEP=1
  -DHAVE_STRDUP=1 -DSYS_SIGLIST_DECLARED=1
  -I/usr/src/gnu/usr.bin/groff/src/utils/addftinfo/../../../../../../contrib/groff/src/include
  -I/usr/src/gnu/usr.bin/groff/src/utils/addftinfo/../../../src/include
  -fno-rtti -fno-exceptions
  -c /usr/src/contrib/groff/src/utils/addftinfo/guess.cc
/usr/src/contrib/groff/src/utils/addftinfo/guess.cc:446:20: missing terminating ' character
/usr/src/contrib/groff/src/utils/addftinfo/guess.cc:447:1: warning: null character(s) ignored
*** Error code 1

Stop in /usr/src/gnu/usr.bin/groff/src/utils/addftinfo.

In the latter case, the affected file looks like:

  case HASH('^', 'e'):
  case HASH('^', 'i'):
  case HASH('^'  'o'):
\xc0 case HASH('^', 'u'):
 %case HADH('`', \xc0A'):
  ^@ase HASH('`', 'E'):
  case HASH('`', 'I'):
  case HASH('`', 'O'):
  case HASH('`', 'U'):

The file is correct after a reboot, so the corruption was limited to the
copy cached in RAM.
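
For anyone wanting to characterise this kind of damage, here is a minimal
Python sketch. It compares a known-good line with the corrupted copy byte by
byte and reports how many bits differ at each position. The helper name
diff_bytes is hypothetical, and the "good" line is an assumption reconstructed
from the neighbouring case labels, not taken from the real guess.cc.

```python
def diff_bytes(good: bytes, bad: bytes):
    """Return (offset, good_byte, bad_byte, flipped_bits) per differing byte.

    Only the common prefix length is compared (zip stops at the shorter input).
    """
    return [
        (i, g, b, bin(g ^ b).count("1"))
        for i, (g, b) in enumerate(zip(good, bad))
        if g != b
    ]


# Assumed clean line versus the damaged line quoted above (leading spaces dropped).
good = b"case HASH('^', 'o'):"
bad = b"case HASH('^'  'o'):"

for offset, g, b, bits in diff_bytes(good, bad):
    print(f"offset {offset}: {g:#04x} -> {b:#04x} ({bits} bit(s) flipped)")
```

A result with one or two flipped bits per byte would be compatible with bad
memory; it does not by itself rule out a software cause.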



To Unsubscribe: send mail to [EMAIL PROTECTED]
with unsubscribe freebsd-current in the body of the message



Solved: CURRENT and P-IV problems

2002-08-21 Thread Martin Blapp


Hi,

As reported, Brian and I did see SIG4 and SIG11 during make buildworlds.

I've replaced everything two or three times, but the problem persisted.
I also tried three motherboards, all of the same type:

Intel D845BG with DDR 266 RAM (2100).

Out of interest, I've now replaced this board with an Asus P4B533-V.
All segfaults and illegal instructions are gone now.

So it seems to be specific to the Intel board. A BIOS update did not
help, and neither did changing the timing settings. The default settings
produce these errors. It happens rarely on STABLE, often on CURRENT.

What issue could this be with the Intel-manufactured board? Is it
a design issue, or could it still be a FreeBSD bug?

Both boards use the same i845 chipset and the same RAM.

Can anyone who experienced those coredumps send me an exact list
of the chipsets used?

Martin

Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
--
ImproWare AG, UNIXSP  ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP: finger -l [EMAIL PROTECTED]
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
--





Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Mark Santcroos

On Wed, Aug 21, 2002 at 02:48:38PM +0200, Martin Blapp wrote:
 
 Hi,
 
 As reported, Brian and I did see SIG4 and SIG11 during make buildworlds.
 
 I've replaced everything two or three times, but the problem persisted.
 I also tried three motherboards, all of the same type:
 
 Intel D845BG with DDR 266 RAM (2100).
 
 Out of interest, I've now replaced this board with an Asus P4B533-V.
 All segfaults and illegal instructions are gone now.
 
 So it seems to be specific to the Intel board. A BIOS update did not
 help, and neither did changing the timing settings. The default settings
 produce these errors. It happens rarely on STABLE, often on CURRENT.
 
 What issue could this be with the Intel-manufactured board? Is it
 a design issue, or could it still be a FreeBSD bug?
 
 Both boards use the same i845 chipset and the same RAM.
 
 Can anyone who experienced those coredumps send me an exact list
 of the chipsets used?

Hi Martin,

I have a P4 mobile in my laptop and also had this behaviour for a certain
-current window last week (the time I got the laptop).  (Dell C640)
Now it is gone. I'm sorry, I don't have an exact date/commit.

It has the Intel Mobile 845MP chipset.

Mark

-- 
Mark Santcroos  RIPE Network Coordination Centre
http://www.ripe.net/home/mark/  New Projects Group/TTM





Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Martin Blapp


Hi,

 I have a P4 mobile in my laptop and also had this behaviour for a certain
 -current window last week (the time I got the laptop).  (Dell C640)
 Now it is gone. I'm sorry, I don't have an exact date/commit.

Try running several buildworlds in a row (5-10) and you will see whether
it survives.

The problem here was that it sometimes worked and sometimes did not. But
I could never finish 10 worlds in a row.

Can you try that ?
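
A minimal sketch of such a repeated-build loop in Python, assuming nothing
beyond the standard library. The function name run_repeated is hypothetical,
and the command shown is a harmless stand-in: for the real test one would
pass something like ["make", "buildworld"] from /usr/src.

```python
import subprocess
import sys


def run_repeated(cmd, rounds):
    """Run cmd up to `rounds` times and return the exit status of each run.

    Stops at the first non-zero status. On POSIX a negative status means the
    child itself was killed by that signal number.
    """
    statuses = []
    for _ in range(rounds):
        result = subprocess.run(cmd)
        statuses.append(result.returncode)
        if result.returncode != 0:
            break
    return statuses


# Stand-in command that always succeeds; replace with the real build command.
demo_cmd = [sys.executable, "-c", "pass"]
print(run_repeated(demo_cmd, 3))
```

Stopping at the first failure keeps the object tree of the failed round around
for inspection.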


 It has the Intel Mobile 845MP chipset.

Ah, well mine is here:

http://www.intel.com/design/motherbd/bg/index.htm?iid=ipp_dlc_deskmb+p4pmb_D845BG;

It's a normal Intel 845 chipset.

Martin





Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Mark Santcroos

On Wed, Aug 21, 2002 at 03:32:44PM +0200, Martin Blapp wrote:
  I have a P4 mobile in my laptop and also had this behaviour for a certain
  -current window last week (the time I got the laptop).  (Dell C640)
  Now it is gone. I'm sorry, I don't have an exact date/commit.
 
 Try to do some worlds in a row (5-10) and you will see if it survives.
 
 The problem here was that it did work sometimes, sometimes not. But
 I could never finish 10 worlds in a row.
 
 Can you try that ?

Eek! First one already bailed out! Talking about bad luck!
Signal 10 this time.

Now that I come to think of it, I didn't do many buildworlds, mostly
buildkernels.

Mark

-- 
Mark Santcroos  RIPE Network Coordination Centre
http://www.ripe.net/home/mark/  New Projects Group/TTM




Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Mark Santcroos

On Wed, Aug 21, 2002 at 04:07:14PM +0200, Mark Santcroos wrote:
 On Wed, Aug 21, 2002 at 03:32:44PM +0200, Martin Blapp wrote:
   I have a P4 mobile in my laptop and also had this behaviour for a certain
   -current window last week (the time I got the laptop).  (Dell C640)
   Now it is gone. I'm sorry, I don't have an exact date/commit.
  
  Try to do some worlds in a row (5-10) and you will see if it survives.
  
  The problem here was that it did work sometimes, sometimes not. But
  I could never finish 10 worlds in a row.
  
  Can you try that ?
 
 Eek! First one already bailed out! Talking about bad luck!
 Signal 10 this time.

The 2nd one fails at exactly the same point; that can't be coincidence.
Also with a signal 10. (libutil)

The 3rd ended somewhere else (games/rogue), but now with signal 4.

Doing the 4th now.

Mark


-- 
Mark Santcroos  RIPE Network Coordination Centre
http://www.ripe.net/home/mark/  New Projects Group/TTM




Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Mark Santcroos

On Wed, Aug 21, 2002 at 04:45:54PM +0200, Mark Santcroos wrote:
   Can you try that ?
  
  Eek! First one already bailed out! Talking about bad luck!
  Signal 10 this time.
 
 The 2nd one fails at exactly the same point; that can't be coincidence.
 Also with a signal 10. (libutil)
 
 The 3rd ended somewhere else (games/rogue), but now with signal 4.
 
 Doing the 4th now.

Which ends with signal 11 in usr.sbin/devinfo/ ...

Mark

-- 
Mark Santcroos  RIPE Network Coordination Centre
http://www.ripe.net/home/mark/  New Projects Group/TTM




Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Martin Blapp


Hi,

  The 2nd one fails at exactly the same point; that can't be coincidence.
  Also with a signal 10. (libutil)
 
  The 3rd ended somewhere else (games/rogue), but now with signal 4.
 
  Doing the 4th now.

 Which ends with signal 11 in usr.sbin/devinfo/ ...

 Mark

Could this be the memory corruption other users are seeing (Alfred, David)?

I'll start a new run of 10 builds here and see where it
ends.

Martin





Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread KT Sin

Hi

I believe this is caused by the pre-released version of gcc in the system. 

I started seeing this problem one week after I upgraded my hardware to
Pentium 4 in May. Two weeks ago, I built the final release version of
gcc 3.1.1 in the ports and used that to compile the kernel and userland.

All the strange signals have disappeared since then.

Good luck,
kt



On Wed, Aug 21, 2002 at 05:27:24PM +0200, Martin Blapp wrote:
 
 Hi,
 
   The 2nd one fails at exactly the same point; that can't be coincidence.
   Also with a signal 10. (libutil)
  
   The 3rd ended somewhere else (games/rogue), but now with signal 4.
  
   Doing the 4th now.
 
  Which ends with signal 11 in usr.sbin/devinfo/ ...
 
  Mark
 
 Could this be the memory corruption other users are seeing (Alfred, David)?
 
 I'll start a new run of 10 builds here and see where it
 ends.
 
 Martin
 
 
 




Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Alexander Leidinger

On Wed, 21 Aug 2002 14:48:38 +0200 (CEST) Martin Blapp [EMAIL PROTECTED]
wrote:

 As reported, Brian and I did see SIG4 and SIG11 during make
 buildworlds.
 
 I've replaced everything two or three times, but the problem persisted.
 I also tried three motherboards, all of the same type:
 
 Intel D845BG with DDR 266 RAM (2100).

 Can anyone who experienced those coredumps send me an exact list
 of the chipsets used?

Are you interested in a pciconf -v -l?

Mainboard: Intel D845BGL Socket 478 bulk WA/WL

System: 4.6-REL

Mostly SIG4, very few SIG11 (both in buildworlds, but not in every buildworld).

Bye,
Alexander.

-- 
   I believe the technical term is Oops!

http://www.Leidinger.net   Alexander @ Leidinger.net
  GPG fingerprint = C518 BC70 E67F 143F BE91  3365 79E2 9C60 B006 3FE7




Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Mark Santcroos

On Wed, Aug 21, 2002 at 11:38:20PM +0800, KT Sin wrote:
 Hi
 
 I believe this is caused by the pre-released version of gcc in the system. 
 
 I started seeing this problem one week after I upgraded my hardware to
 Pentium 4 in May. Two weeks ago, I built the final release version of
 gcc 3.1.1 in the ports and used that to compile the kernel and userlands.
 
 All the strange signals have disappeared since then.

Thanks for the hint, I'm building /usr/ports/lang/gcc31/ now.

Mark

-- 
Mark Santcroos  RIPE Network Coordination Centre
http://www.ripe.net/home/mark/  New Projects Group/TTM




Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Mark Santcroos

On Wed, Aug 21, 2002 at 08:05:17PM +0200, Mark Santcroos wrote:
 On Wed, Aug 21, 2002 at 11:38:20PM +0800, KT Sin wrote:
  Hi
  
  I believe this is caused by the pre-released version of gcc in the system. 
  
  I started seeing this problem one week after I upgraded my hardware to
  Pentium 4 in May. Two weeks ago, I built the final release version of
  gcc 3.1.1 in the ports and used that to compile the kernel and userlands.
  
  All the strange signals have disappeared since then.
 
 Thanks for the hint, I'm building /usr/ports/lang/gcc31/ now.

/usr/local/bin/gcc31 -fpic -DPIC -O -pipe  -D_IEEE_LIBM -D_ARCH_INDIRECT=i387_  -c 
/usr/src/lib/msun/src/s_nextafter.c -o s_nextafter.So
Illegal instruction (core dumped)
*** Error code 132

Stop in /usr/src.
*** Error code 1


That didn't work. I will now try the other gcc's from ports.
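
The "Error code 132" above lines up with the "Illegal instruction" message if
one assumes the common shell convention of reporting 128 plus the signal
number for a command killed by a signal. A small Python sketch of that
decoding follows; decode_status is a hypothetical helper, and the table only
covers the FreeBSD signal numbers mentioned in this thread.

```python
# FreeBSD signal numbers for the signals reported in this thread.
FREEBSD_SIGNALS = {4: "SIGILL", 10: "SIGBUS", 11: "SIGSEGV"}


def decode_status(code):
    """Interpret a shell-style exit status, assuming 128 + N means signal N."""
    if code > 128:
        signum = code - 128
        return FREEBSD_SIGNALS.get(signum, f"signal {signum}")
    return f"exit {code}"


print(decode_status(132))
```

Under that assumption 132 decodes to signal 4, the same SIG4 seen in the
earlier buildworld failures.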

Mark

-- 
Mark Santcroos  RIPE Network Coordination Centre
http://www.ripe.net/home/mark/  New Projects Group/TTM




Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Martin Blapp


Hi,

Try compiling the entire system on another box, then install it
on the CURRENT target box and try again!

By the way, after 6 rounds, I now see SIG4 and SIG11 too :-/
Too bad - so it's definitely data corruption in CURRENT.

Asus Board P4B533-V, P-IV 2.26GHz, 1GB DDR 2100 RAM.

Martin

Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
--
ImproWare AG, UNIXSP  ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP: finger -l [EMAIL PROTECTED]
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
--





Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread KT Sin

Hi

Please try to continue from where it broke by repeating make. Otherwise, 
please get a precompiled port package.

kt

On Wed, Aug 21, 2002 at 10:26:37PM +0200, Mark Santcroos wrote:
 On Wed, Aug 21, 2002 at 08:05:17PM +0200, Mark Santcroos wrote:
  On Wed, Aug 21, 2002 at 11:38:20PM +0800, KT Sin wrote:
   Hi
   
   I believe this is caused by the pre-released version of gcc in the system. 
   
   I started seeing this problem one week after I upgraded my hardware to
   Pentium 4 in May. Two weeks ago, I built the final release version of
   gcc 3.1.1 in the ports and used that to compile the kernel and userlands.
   
   All the strange signals have disappeared since then.
  
  Thanks for the hint, I'm building /usr/ports/lang/gcc31/ now.
 
 /usr/local/bin/gcc31 -fpic -DPIC -O -pipe  -D_IEEE_LIBM -D_ARCH_INDIRECT=i387_  -c 
/usr/src/lib/msun/src/s_nextafter.c -o s_nextafter.So
 Illegal instruction (core dumped)
 *** Error code 132
 
 Stop in /usr/src.
 *** Error code 1
 
 
 That didn't work. I will now try the other gcc's from ports.
 
 Mark
 
 -- 
 Mark Santcroos  RIPE Network Coordination Centre
 http://www.ripe.net/home/mark/  New Projects Group/TTM
 
 




Re: Solved: CURRENT and P-IV problems

2002-08-21 Thread Don Lewis

On 21 Aug, Martin Blapp wrote:
 
 Hi,
 
 Try compiling the entire system on another box, then install it
 on the CURRENT target box and try again!
 
 By the way, after 6 rounds, I now see SIG4 and SIG11 too :-/
 Too bad - so it's definitely data corruption in CURRENT.
 
 Asus Board P4B533-V, P-IV 2.26GHz, 1GB DDR 2100 RAM.

No sign of any problems here with last night's -current.  Gigabyte
GA7-DX+, Athlon XP 1900+, 1GB PC2100 ECC DRAM, SCSI disk, NFS client.
I'm not running any sound hardware or the Xserver, and I'm accessing the
host via ssh instead of the console.  The kernel is GENERIC + SMBus.
It's on buildworld #7 at the moment:

-rw-r--r--  1 root  wheel  5321510 Aug 21 14:43 /var/tmp/buildworld24-1
-rw-r--r--  1 root  wheel  5321510 Aug 21 15:38 /var/tmp/buildworld24-2
-rw-r--r--  1 root  wheel  5321510 Aug 21 16:33 /var/tmp/buildworld24-3
-rw-r--r--  1 root  wheel  5321510 Aug 21 17:28 /var/tmp/buildworld24-4
-rw-r--r--  1 root  wheel  5321510 Aug 21 18:23 /var/tmp/buildworld24-5
-rw-r--r--  1 root  wheel  5321510 Aug 21 19:18 /var/tmp/buildworld24-6
-rw-r--r--  1 root  wheel   826096 Aug 21 19:28 /var/tmp/buildworld24-7
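
The listing above suggests a cheap sanity check: the six completed runs
produced logs of identical size, so a log of a different size marks a run that
is incomplete or that diverged. A Python sketch with the sizes copied from the
listing follows. The helper name odd_logs is hypothetical, and identical log
sizes across runs are an observation from this listing, not a guarantee.

```python
from collections import Counter


def odd_logs(sizes):
    """Return the names whose size differs from the most common size."""
    common_size, _ = Counter(sizes.values()).most_common(1)[0]
    return [name for name, size in sizes.items() if size != common_size]


# Log sizes in bytes, as shown by the ls output above.
sizes = {
    "buildworld24-1": 5321510,
    "buildworld24-2": 5321510,
    "buildworld24-3": 5321510,
    "buildworld24-4": 5321510,
    "buildworld24-5": 5321510,
    "buildworld24-6": 5321510,
    "buildworld24-7": 826096,
}

print(odd_logs(sizes))
```

Here the only outlier is run 7, which was still in progress when the listing
was taken.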


I was having filesystem corruption problems a couple months ago, but
haven't seen any of these problems in ages.

The only outstanding problem is a lock order reversal in the pipe code
that is triggered by the OpenOffice port build.  Witness complains, but
the build completes successfully.


Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-CURRENT #34: Wed Aug 21 01:36:34 PDT 2002
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERICSMB
Preloaded elf kernel /boot/kernel/kernel at 0xc0603000.
Preloaded elf module /boot/kernel/acpi.ko at 0xc06030a8.
Timecounter i8254  frequency 1193182 Hz
Timecounter TSC  frequency 1608231091 Hz
CPU: AMD Athlon(tm) XP 1900+ (1608.23-MHz 686-class CPU)
  Origin = AuthenticAMD  Id = 0x662  Stepping = 2
  
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
  AMD Features=0xc048<MP,AMIE,DSP,3DNow!>
real memory  = 1073676288 (1048512K bytes)
avail memory = 1035575296 (1011304K bytes)
Pentium Pro MTRR support enabled
Using $PIR table, 11 entries at 0xc00fdc30
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: GBTAWRDACPI on motherboard
acpi0: power button is handled as a fixed feature programming model.
Timecounter ACPI-fast  frequency 3579545 Hz
acpi_timer0: 24-bit timer at 3.579545MHz port 0x4008-0x400b on acpi0
acpi_cpu0: CPU on acpi0
acpi_button0: Power Button on acpi0
acpi_button1: Sleep Button on acpi0
acpi_pcib0: Host-PCI bridge port 
0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
pci0: PCI bus on acpi_pcib0
agp0: AMD 761 host to AGP bridge port 0xc000-0xc003 mem 
0xef02-0xef020fff,0xe800-0xebff at device 0.0 on pci0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
pci1: display, VGA at device 5.0 (no driver attached)
isab0: PCI-ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
atapci0: VIA 82C686 ATA100 controller port 0xc400-0xc40f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: VIA 83C572 USB controller port 0xc800-0xc81f irq 5 at device 7.2 on pci0
usb0: VIA 83C572 USB controller on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhub0: port error, restarting port 1
uhub0: port error, giving up port 1
uhub0: port error, restarting port 2
uhub0: port error, giving up port 2
uhci1: VIA 83C572 USB controller port 0xcc00-0xcc1f irq 5 at device 7.3 on pci0
usb1: VIA 83C572 USB controller on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhub1: port error, restarting port 1
uhub1: port error, giving up port 1
uhub1: port error, restarting port 2
uhub1: port error, giving up port 2
viapropm0: SMBus I/O base at 0x5000
viapropm0: VIA VT82C686A Power Management Unit port 0x5000-0x500f at device 7.4 on 
pci0
viapropm0: SMBus revision code 0x40
smb0: SMBus generic I/O on smbus0
fxp0: Intel Pro 10/100B/100+ Ethernet port 0xe000-0xe03f mem 
0xef00-0xef01,0xef021000-0xef021fff irq 10 at device 10.0 on pci0
fxp0: Ethernet address 00:02:b3:5c:8b:82
inphy0: i82555 10/100 media interface on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc_pci0: Adaptec 19160B Ultra160 SCSI adapter port 0xe400-0xe4ff mem 
0xef022000-0xef022fff irq 11 at device 12.0 on pci0
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
fdc0: enhanced floppy controller (i82077, NE72065 or clone) port 0x3f7,0x3f0-0x3f5 
irq 6 drq 2 on acpi0
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1440-KB 3.5 drive on fdc0 drive 0
sio0 port 0x3f8-0x3ff irq 4 

Re: CURRENT and P-IV problems

2002-05-30 Thread Martin Blapp


Hi all,

I can now say for sure that all SIG11 and SIG4 problems with
make buildworld are gone if I compile

make(1)
rm(1)
mkdir(1)

with -g -ggdb here.

If I don't do that, make world stops after 4 - 30 seconds. So it
could definitely be some optimizer bug in our gcc. And this bug
seems to be present in gcc 2.95.4 as well as in the new gcc 3.1.

 The VAX and Windows debuggers are famous for making pointer
 errors disappear when you compile /debug.  GDB is better at
 not doing this, but isn't perfect.  Compiling with and without
 debug will yield different code.

 -g makes binaries bigger, and prevents some optimizations,
 even if you aren't telling the compiler to optimize.

 Does a strip -g'ed version of the -g compiled binary have the
 same problem?

No. This still works fine. I can compile with -g -ggdb and then
strip the binary and it still works fine.


 Also, an objdump -p comparison of the two might be informative;
 there were a number of problems in Alpha-land when the compiler
 assumptions changed because of the new binutils.  This might be a
 similar problem to the ld.so problems there, only with the ELF
 loader code.

With -g -ggdb

Program Header:
LOAD off0x vaddr 0x08048000 paddr 0x08048000 align 2**12
 filesz 0x0004e9ed memsz 0x0004e9ed flags r-x
LOAD off0x0004ea00 vaddr 0x08097a00 paddr 0x08097a00 align 2**12
 filesz 0x1598 memsz 0x00010d70 flags rw-
NOTE off0x0094 vaddr 0x08048094 paddr 0x08048094 align 2**2
 filesz 0x0018 memsz 0x0018 flags r--

The problematic version here on PIV 2Ghz:

# objdump -p /bin/rm

/bin/rm: file format elf32-i386

Program Header:
LOAD off0x vaddr 0x08048000 paddr 0x08048000 align 2**12
 filesz 0x0004e56d memsz 0x0004e56d flags r-x
LOAD off0x0004e580 vaddr 0x08097580 paddr 0x08097580 align 2**12
 filesz 0x1598 memsz 0x00010d70 flags rw-
NOTE off0x0094 vaddr 0x08048094 paddr 0x08048094 align 2**2
 filesz 0x0018 memsz 0x0018 flags r--
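
To put a number on the difference between the two dumps, a minimal Python
sketch can pull the first filesz field out of each and subtract them. The
helper name first_filesz is hypothetical; the two input strings are the text
segment lines exactly as quoted above.

```python
import re


def first_filesz(objdump_text):
    """Return the first 'filesz' value found in objdump -p output, as an int."""
    match = re.search(r"filesz\s+0x([0-9a-fA-F]+)", objdump_text)
    if match is None:
        raise ValueError("no filesz field found")
    return int(match.group(1), 16)


# Text segment lines from the two dumps above.
debug_build = "filesz 0x0004e9ed memsz 0x0004e9ed flags r-x"
failing_build = "filesz 0x0004e56d memsz 0x0004e56d flags r-x"

delta = first_filesz(debug_build) - first_filesz(failing_build)
print(hex(delta), delta)
```

The text segment of the -g -ggdb build is 0x480 (1152) bytes larger, and the
second LOAD segment is shifted by the same amount, so the generated code
differs and not only the debug sections.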





Re: CURRENT and P-IV problems

2002-05-14 Thread Martin Blapp


Hi,

I'm seeing very strange effects here now. I upgraded to the
newest CURRENT yesterday, installed on the PIV machine
over NFS.

Now rm(1) and make(1) coredump with sig 10. So I thought it
would be a good idea to recompile them with -g -ggdb and
retry.

Now the strange part. The coredumps are gone. OK, I did not
use -pipe then. I will now try to use -pipe and -g and -ggdb
all together.

How the fuck can this have an effect on these coredumps ???

Martin

Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
--
ImproWare AG, UNIXSP  ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
--







Re: CURRENT and P-IV problems

2002-05-14 Thread Terry Lambert

Martin Blapp wrote:
 Now rm(1) and make(1) coredump with sig 10. So I thought it
 would be a good idea to recompile them with -g -ggdb and
 retry.
 
 Now the strange part. The coredumps are gone. OK, I did not
 use -pipe then. I will now try to use -pipe and -g and -ggdb
 all together.
 
 How the fuck can this have an effect on these coredumps ???

The VAX and Windows debuggers are famous for making pointer
errors disappear when you compile /debug.  GDB is better at
not doing this, but isn't perfect.  Compiling with and without
debug will yield different code.

-g makes binaries bigger, and prevents some optimizations,
even if you aren't telling the compiler to optimize.

Does a strip -g'ed version of the -g compiled binary have the
same problem?

Also, an objdump -p comparison of the two might be informative;
there were a number of problems in Alpha-land when the compiler
assumptions changed because of the new binutils.  This might be a
similar problem to the ld.so problems there, only with the ELF
loader code.

Without more investigation by you, though, all you are going to
get is educated guesses.

-- Terry




Re: CURRENT and P-IV problems

2002-05-07 Thread Brian Somers

 On Sat, May 04, 2002 at 09:26:33PM +0100, Brian Somers wrote:
  Try disabling -pipe when building the compiler.  This seems to make 
  things more stable here (CFLAGS=-O in /etc/make.conf) - as if 
  building the kernel with -pipe sometimes produces a kernel that 
  subsequently murders the compiler with sig11/sig4 all the time.
 
 If so, then we have a bug in our pipe ('|', not 'gcc -pipe')
 implementation.

That would seem to be the case - assuming my hypothesis is correct - 
which is far from being a sure thing at this point.

If things stay stable for the next week or so, I'll set the machine 
up to start doing parallel builds and see where we go from there...
-- 
Brian [EMAIL PROTECTED][EMAIL PROTECTED]
  http://www.freebsd-services.com/brian@[uk.]FreeBSD.org
Don't _EVER_ lose your sense of humour !  brian@[uk.]OpenBSD.org






Re: CURRENT and P-IV problems

2002-05-05 Thread Martin Blapp


Hi all,

The problem must have been introduced after April 3.
I have a kernel.old from that date which runs perfectly.
Maybe this can help to track the bug down.

Martin

Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
--
ImproWare AG, UNIXSP  ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
--






Re: CURRENT and P-IV problems

2002-05-05 Thread Martin Blapp


Hi,

I have to take back my claim that a kernel from April 3 runs fine.
It happens there too, but less often than on recent CURRENT.

The build lives for 5 minutes instead of 30 seconds. Then
I get a SIG4 as usual and cc crashes.

Does anybody have an idea which date I can switch back to
in order to have my problem solved?

Martin (still clueless)

Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
--
ImproWare AG, UNIXSP  ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
--






CURRENT and P-IV problems

2002-05-04 Thread Martin Blapp


Hi all,

I am experiencing very strange problems here at the moment with
a new server.

Buildworld survives about 30 seconds; the errors are SIG4 (90%)
and SIG11 (10%). And I cannot compile any important programs :-/

I've exchanged all relevant parts:

- Power Supply: 300W, for PIV with additional CPU supply
- CPU (PIV, 2GHz, 512K cache)
- RAM with ECC correction
- Board (Intel D845BG)
- SCSI Card. (it also happens on ATA)

We have these boards running fine here. And now to the strange part.
It does not happen with STABLE.

This leads me to believe that this is a CURRENT problem.

I'm really, really clueless.

Martin

Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
--
ImproWare AG, UNIXSP  ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
--





Re: CURRENT and P-IV problems

2002-05-04 Thread Martin Blapp


I can now say for sure that it happens on CURRENT only.

I replaced the disk with a STABLE one, same model, and have
completed a make buildworld -j 20 successfully.

With the CURRENT disk (in this case SCSI, but it also happens on ATA),
buildworld dumps core after 10 - 30 seconds.

Martin

Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
--
ImproWare AG, UNIXSP  ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
Phone: +41 061 826 93 00: +41 61 826 93 01
PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
--





Re: CURRENT and P-IV problems

2002-05-04 Thread Brian Somers

Hi,

Try disabling -pipe when building the compiler.  This seems to make 
things more stable here (CFLAGS=-O in /etc/make.conf) - as if 
building the kernel with -pipe sometimes produces a kernel that 
subsequently murders the compiler with sig11/sig4 all the time.

This is just marginally more than theory at the moment though.

You may need to bootstrap a new kernel by building it on another 
machine to get things running again.

 Hi all,
 
 I am experiencing very strange problems here at the moment with
 a new server.
 
 Buildworld survives about 30 seconds; the errors are SIG4 (90%)
 and SIG11 (10%). And I cannot compile any important programs :-/
 
 I've exchanged all relevant parts:
 
 - Power Supply:   300W, for PIV with additional CPU supply
 - CPU (PIV, 2Ghz, 512K cache)
 - Ram with ECC correction
 - Board (Intel D845BG)
 - SCSI Card. (it happens also on ATA)
 
 We have these boards running fine here. And now to the strange part.
 It does not happen with STABLE.
 
 This leads me to believe that this is a CURRENT problem.
 
 I'm really, really clueless.
 
 Martin
 
 Martin Blapp, [EMAIL PROTECTED] [EMAIL PROTECTED]
 --
 ImproWare AG, UNIXSP  ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
 Phone: +41 061 826 93 00: +41 61 826 93 01
 PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
 --
 

-- 
Brian [EMAIL PROTECTED][EMAIL PROTECTED]
  http://www.freebsd-services.com/brian@[uk.]FreeBSD.org
Don't _EVER_ lose your sense of humour !  brian@[uk.]OpenBSD.org






Re: CURRENT and P-IV problems

2002-05-04 Thread Scott R.

On Sat, 2002-05-04 at 13:54, Martin Blapp wrote: 
 
 Hi all,
 
 I am experiencing very strange problems here at the moment with
 a new server.
 
 Buildworld survives about 30 seconds; the errors are SIG4 (90%)
 and SIG11 (10%). And I cannot compile any important programs :-/
 
 I've exchanged all relevant parts:
 
 - Power Supply:   300W, for PIV with additional CPU supply
 - CPU (PIV, 2Ghz, 512K cache)
 - Ram with ECC correction
 - Board (Intel D845BG)
 - SCSI Card. (it happens also on ATA)

Sorry for the me too but...well...me too.  I don't get very far
through *any* build process before I get Signal 4 errors.  I also don't
have any optimizations set in /etc/make.conf.  The machine that is
giving me problems also has a PIV (1.6 GHz) processor.  Could this issue
be specific to Pentium IVs (I know it's a long shot...)?

 We have these boards running fine here. And now to the strange part.
 It does not happen with STABLE.

Again, ditto.  This does not happen to me with -STABLE either. 

 This leads me to believe that this is a CURRENT problem.

I saw some other folks reporting that they were seeing the same thing on
their machines and I was just wondering:  could those machines also be
P-IVs?  Or maybe there is another common thread...?  I started with DP1
and have since been able to update my sources and build/installworld
successfully but only through sheer persistence.  The problem still
exists.

-Scott





Re: CURRENT and P-IV problems

2002-05-04 Thread David O'Brien

On Sat, May 04, 2002 at 09:26:33PM +0100, Brian Somers wrote:
 Try disabling -pipe when building the compiler.  This seems to make 
 things more stable here (CFLAGS=-O in /etc/make.conf) - as if 
 building the kernel with -pipe sometimes produces a kernel that 
 subsequently murders the compiler with sig11/sig4 all the time.

If so, then we have a bug in our pipe ('|', not 'gcc -pipe')
implementation.




Re: CURRENT and P-IV problems

2002-05-04 Thread Bruce Evans

On Sat, 4 May 2002, David O'Brien wrote:

 On Sat, May 04, 2002 at 09:26:33PM +0100, Brian Somers wrote:
  Try disabling -pipe when building the compiler.  This seems to make
  things more stable here (CFLAGS=-O in /etc/make.conf) - as if
  building the kernel with -pipe sometimes produces a kernel that
  subsequently murders the compiler with sig11/sig4 all the time.

 If so, then we have a bug in our pipe ('|', not 'gcc -pipe')
 implementation.

I have seen signs of a generic pipe bug in vi: vi's i/o buffer for
pipes is sometimes invalid (kern/sys_pipe.c:pipe_build_write_buffer()
gets an error faulting it in).  This doesn't usually cause signals;
it just confuses vi.
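
One rough way to probe for the kind of pipe data corruption suspected above is to push a known byte pattern through a pipe and compare digests on both ends. The following is only a minimal sketch: the function name `pipe_roundtrip_ok` and its parameters are hypothetical, the sizes are arbitrary, and it exercises the ordinary pipe read/write path only, so it may well not trigger the `pipe_build_write_buffer()` fault described here.

```python
import hashlib
import os
import threading


def pipe_roundtrip_ok(total_bytes=1 << 20, chunk=4096):
    """Send a known byte pattern through a pipe and compare SHA-256 digests.

    Hypothetical helper; total_bytes and chunk are arbitrary assumptions.
    """
    read_fd, write_fd = os.pipe()
    # Repeating 0..255 pattern, long enough to slice one chunk from.
    pattern = bytes(range(256)) * (chunk // 256 + 1)
    sent = hashlib.sha256()
    received = hashlib.sha256()

    def writer():
        remaining = total_bytes
        try:
            while remaining > 0:
                block = pattern[:min(chunk, remaining)]
                sent.update(block)
                view = memoryview(block)
                while len(view) > 0:
                    # os.write may accept fewer bytes than offered.
                    written = os.write(write_fd, view)
                    view = view[written:]
                remaining -= len(block)
        finally:
            os.close(write_fd)  # signals EOF to the reader

    thread = threading.Thread(target=writer)
    thread.start()
    try:
        while True:
            data = os.read(read_fd, chunk)
            if not data:
                break
            received.update(data)
    finally:
        os.close(read_fd)
        thread.join()
    return sent.digest() == received.digest()


if __name__ == "__main__":
    print("pipe round trip intact:", pipe_roundtrip_ok())
```

Running it in a loop while a build is in progress would be one way to see whether the digests ever differ; a mismatch would point at the pipe path rather than at the compiler.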

Bruce





Re: CURRENT and P-IV problems

2002-05-04 Thread Jake Burkholder

Apparently, On Sun, May 05, 2002 at 10:44:44AM +1000,
Bruce Evans said words to the effect of;

 On Sat, 4 May 2002, David O'Brien wrote:
 
  On Sat, May 04, 2002 at 09:26:33PM +0100, Brian Somers wrote:
   Try disabling -pipe when building the compiler.  This seems to make
   things more stable here (CFLAGS=-O in /etc/make.conf) - as if
   building the kernel with -pipe sometimes produces a kernel that
   subsequently murders the compiler with sig11/sig4 all the time.
 
  If so, then we have a bug in our pipe ('|', not 'gcc -pipe')
  implementation.
 
 I have seen signs of a generic pipe bug in vi: vi's i/o buffer for
 pipes is sometimes invalid (kern/sys_pipe.c:pipe_build_write_buffer()
 gets an error faulting it in).  This doesn't usually cause signals;
 it just confuses vi.

Can you try backing out rev 1.104 of kern/sys_pipe.c?

Jake




Re: CURRENT and P-IV problems

2002-05-04 Thread Bruce Evans

On Sat, 4 May 2002, Jake Burkholder wrote:

 Apparently, On Sun, May 05, 2002 at 10:44:44AM +1000,
   Bruce Evans said words to the effect of;
 
  I have seen signs of a generic pipe bug in vi: vi's i/o buffer for
  pipes is sometimes invalid (kern/sys_pipe.c:pipe_build_write_buffer()
  gets an error faulting it in).  This doesn't usually cause signals;
  it just confuses vi.

 Can you try backing out rev 1.104 of kern/sys_pipe.c?

I first noticed vi getting confused in the same way (but not the faultin
failure) long before rev.1.104 (in late Jan or early Feb. this year).

Bruce





Re: CURRENT and P-IV problems

2002-05-04 Thread Bruce Evans

On Sun, 5 May 2002, Bruce Evans wrote:

 On Sat, 4 May 2002, Jake Burkholder wrote:

  Apparently, On Sun, May 05, 2002 at 10:44:44AM +1000,
  Bruce Evans said words to the effect of;
  
   I have seen signs of a generic pipe bug in vi: vi's i/o buffer for
   pipes is sometimes invalid (kern/sys_pipe.c:pipe_build_write_buffer()
   gets an error faulting it in).  This doesn't usually cause signals;
   it just confuses vi.
 
  Can you try backing out rev 1.104 of kern/sys_pipe.c?

 I first noticed vi getting confused in the same way (but not the faultin
 failure) long before rev.1.104 (in late Jan or early Feb. this year).

Anyway, the failure is in the vm_fault_quick() call which is one line
before the pmap_*extract() line that was changed in rev.1.104.  (I have
some debugging code that traps if the i386 fubyte() fails, but fubyte()
fails so rarely that I had forgotten about it.  It didn't trap back in
Jan/Feb, but I may have been running a plain current kernel without the
debugging code.)

Bruce





Re: CURRENT and P-IV problems

2002-05-04 Thread Udo Schweigert

On Sat, May 04, 2002 at 21:54:08 +0200, Martin Blapp wrote:
 Hi all,
 
 I am experiencing very strange problems here at the moment with
 a new server.
 
 Buildworld survives for about 30 seconds; the errors are SIG4 (90%)
 and SIG11 (10%). And I cannot compile any important programs :-/
 
 I've exchanged all relevant parts:
 
 - Power Supply:   300W, for PIV with additional CPU supply
 - CPU (PIV, 2 GHz, 512K cache)
 - Ram with ECC correction
 - Board (Intel D845BG)
 - SCSI Card. (it happens also on ATA)
 
 We have these boards running fine here. And now to the strange part.
 It does not happen with STABLE.
 
 This leads me to believe that this is a CURRENT problem.
 

Unfortunately not. For about a week now, these types of errors (cc -pipe
exiting with some signal) have also been occurring on -stable for me with a
P-IV 1.7 GHz. I tried to figure out which commit to RELENG_4 introduced the
problems and found that they began with the ipfilter update to v3.4.27. But
this does not seem to be the real cause, as a workaround I found (dropping
-pipe from CFLAGS) has nothing to do with ipfilter.

Here are some symptoms from my /var/log/messages:
pid 92638 (cc), uid 0: exited on signal 4 (core dumped)
pid 44494 (cc), uid 0: exited on signal 10 (core dumped)
pid 23068 (sed), uid 0: exited on signal 11 (core dumped)
pid 19046 (egrep), uid 0: exited on signal 4 (core dumped)
pid 28452 (sed), uid 0: exited on signal 11 (core dumped)
pid 65784 (cpp0), uid 0: exited on signal 4 (core dumped)
pid 61931 (sed), uid 0: exited on signal 10 (core dumped)
pid 80953 (cc), uid 0: exited on signal 10 (core dumped)
pid 32562 (cc), uid 0: exited on signal 10 (core dumped)
pid 12812 (sed), uid 0: exited on signal 11 (core dumped)
pid 36423 (cc), uid 0: exited on signal 10 (core dumped)
pid 87631 (cc), uid 0: exited on signal 4 (core dumped)
pid 58087 (sed), uid 0: exited on signal 11 (core dumped)
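
To see at a glance which programs die with which signal, log lines like those above can be tallied with a short script. This is a minimal sketch, not part of the original report: the names `tally_signals`, `LINE_RE` and `SAMPLE` are hypothetical, and the signal names assume FreeBSD numbering (4 = SIGILL, 10 = SIGBUS, 11 = SIGSEGV).

```python
import re
from collections import Counter

# FreeBSD signal numbers seen in the log (SIGBUS is 10 on FreeBSD).
FREEBSD_SIGNALS = {4: "SIGILL", 10: "SIGBUS", 11: "SIGSEGV"}

# Matches kernel messages such as:
#   pid 92638 (cc), uid 0: exited on signal 4 (core dumped)
LINE_RE = re.compile(r"pid \d+ \(([^)]+)\), uid \d+: exited on signal (\d+)")


def tally_signals(lines):
    """Count (program, signal number) pairs in 'exited on signal' lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# The excerpt quoted in the message, copied verbatim.
SAMPLE = """\
pid 92638 (cc), uid 0: exited on signal 4 (core dumped)
pid 44494 (cc), uid 0: exited on signal 10 (core dumped)
pid 23068 (sed), uid 0: exited on signal 11 (core dumped)
pid 19046 (egrep), uid 0: exited on signal 4 (core dumped)
pid 28452 (sed), uid 0: exited on signal 11 (core dumped)
pid 65784 (cpp0), uid 0: exited on signal 4 (core dumped)
pid 61931 (sed), uid 0: exited on signal 10 (core dumped)
pid 80953 (cc), uid 0: exited on signal 10 (core dumped)
pid 32562 (cc), uid 0: exited on signal 10 (core dumped)
pid 12812 (sed), uid 0: exited on signal 11 (core dumped)
pid 36423 (cc), uid 0: exited on signal 10 (core dumped)
pid 87631 (cc), uid 0: exited on signal 4 (core dumped)
pid 58087 (sed), uid 0: exited on signal 11 (core dumped)
""".splitlines()

if __name__ == "__main__":
    for (prog, signo), n in sorted(tally_signals(SAMPLE).items()):
        name = FREEBSD_SIGNALS.get(signo, "signal %d" % signo)
        print("%-6s %-8s %d" % (prog, name, n))
```

The spread across cc, cpp0, sed and egrep is what such a tally makes visible: the crashes are not confined to the compiler.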

Again: there are no hardware problems (memory, cooling, etc.) here; it is
only related to some kernels. Booting a kernel that is old enough makes
the system run flawlessly.

Would be nice if that could be resolved before -RELEASE (but it seems to be
a difficult bug to track down).

Best regards

Udo Schweigert
--
Udo Schweigert, Siemens AG   | Voice  : +49 89 636 42170
CT IC 3, Siemens CERT        | Fax    : +49 89 636 41166
D-81730 Muenchen / Germany   | email  : [EMAIL PROTECTED]
