Re: crash with -current
On 01/11/15(Sun) 22:43, Sonic wrote:
> Upgraded -current earlier today, and system has crashed:

Do you have an idea how to reproduce this crash?  Which program are you
running that uses bpf?

> =
> Nov 1 22:23:55 stargate /bsd: uvm_fault(0x818f9920,
> 0xfff7818adf60, 0, 1) -> e
> Nov 1 22:23:55 stargate /bsd: fatal page fault in supervisor mode
> Nov 1 22:23:55 stargate /bsd: trap type 6 code 0 rip 81329e69
> cs 8 rflags 10286 cr2 fff7818adf60 cpl 7 rsp 8000221df76
> 0
> Nov 1 22:23:55 stargate /bsd: panic: trap type 6, code=0, pc=81329e69
> Nov 1 22:23:55 stargate /bsd: Starting stack trace...
> Nov 1 22:23:55 stargate /bsd: panic() at panic+0x10b
> Nov 1 22:23:55 stargate /bsd: trap() at trap+0x7b8
> Nov 1 22:23:55 stargate /bsd: --- trap (number 6) ---
> Nov 1 22:23:55 stargate /bsd: trap() at trap+0x709
> Nov 1 22:23:55 stargate /bsd: --- trap (number 4) ---
> Nov 1 22:23:55 stargate /bsd: trap() at trap+0x709
> Nov 1 22:23:55 stargate /bsd: --- trap (number 4) ---
> Nov 1 22:23:55 stargate /bsd: bpf_filter() at bpf_filter+0x19b
> Nov 1 22:23:55 stargate /bsd: _bpf_mtap() at _bpf_mtap+0xf4
> Nov 1 22:23:55 stargate /bsd: bpf_mtap_ether() at bpf_mtap_ether+0x39
> Nov 1 22:23:55 stargate /bsd: em_start() at em_start+0xd6
> Nov 1 22:23:55 stargate /bsd: nettxintr() at nettxintr+0x52
> Nov 1 22:23:55 stargate /bsd: softintr_dispatch() at softintr_dispatch+0x8b
> Nov 1 22:23:55 stargate /bsd: Xsoftnet() at Xsoftnet+0x1f
> Nov 1 22:23:55 stargate /bsd: --- interrupt ---
> Nov 1 22:23:55 stargate /bsd: end of kernel
> Nov 1 22:23:55 stargate /bsd: end trace frame: 0xff8132b494, count: 246
> Nov 1 22:23:55 stargate /bsd: 0x282:
> Nov 1 22:23:55 stargate /bsd: End of stack trace.
Re: crash with -current
Sonic wrote:
> On Mon, Nov 2, 2015 at 12:19 PM, Martin Pieuchot wrote:
> > Do you have an idea how to reproduce this crash?  Which program are you
> > running that uses bpf?
>
> Not using bpf at all (that I know of), just a straightforward firewall
> - pf, dhcpd, unbound.
>
> The nasty little "em0: watchdog timeout -- resetting", reported on
> 10-4-15 bug is also back - triggered by using a bittorrent client on
> one of my desktops. The above reported crash happened while streaming
> some Netflix. I'm inclined to suspect the em driver is hosed again.

Yeah, my gut reaction was that this was something to do with the recent
em changes rather than BPF. I don't think the BPF code has changed
recently.

BPF is hooked into network interfaces pretty directly, so if some packet
data was being freed too early by a race condition, this seems like a
conceivable way we'd find out.

My speculative 2¢.
Re: crash with -current
On Mon, Nov 2, 2015 at 12:19 PM, Martin Pieuchot wrote:
> Do you have an idea how to reproduce this crash?  Which program are you
> running that uses bpf?

Not using bpf at all (that I know of), just a straightforward firewall
- pf, dhcpd, unbound.

The nasty little "em0: watchdog timeout -- resetting" bug, reported on
10-4-15, is also back - triggered by using a bittorrent client on
one of my desktops. The above reported crash happened while streaming
some Netflix. I'm inclined to suspect the em driver is hosed again.

Chris
Re: crash with -current
On 2015-11-02, Michael McConville wrote:
> My speculative 2¢.

See, even the USA needs more than ASCII :-)
Re: crash with -current
On 11/2/15 1:31 PM, Stuart Henderson wrote:
> On 2015-11-02, Michael McConville wrote:
>> My speculative 2¢.
>
> See, even the USA needs more than ASCII :-)

Maybe not. $0.02 :-))
Re: crash with -current
On Mon, Nov 02, 2015 at 12:34:11PM -0500, Sonic wrote:
> On Mon, Nov 2, 2015 at 12:19 PM, Martin Pieuchot wrote:
> > Do you have an idea how to reproduce this crash?  Which program are you
> > running that uses bpf?
>
> Not using bpf at all (that I know of), just a straightforward firewall
> - pf, dhcpd, unbound.

dhcpd uses bpf

>
> The nasty little "em0: watchdog timeout -- resetting", reported on
> 10-4-15 bug is also back - triggered by using a bittorrent client on
> one of my desktops. The above reported crash happened while streaming
> some Netflix. I'm inclined to suspect the em driver is hosed again.
>
> Chris
crash with -current
Upgraded -current earlier today, and system has crashed:

=
Nov 1 22:23:55 stargate /bsd: uvm_fault(0x818f9920, 0xfff7818adf60, 0, 1) -> e
Nov 1 22:23:55 stargate /bsd: fatal page fault in supervisor mode
Nov 1 22:23:55 stargate /bsd: trap type 6 code 0 rip 81329e69 cs 8 rflags 10286 cr2 fff7818adf60 cpl 7 rsp 8000221df76 0
Nov 1 22:23:55 stargate /bsd: panic: trap type 6, code=0, pc=81329e69
Nov 1 22:23:55 stargate /bsd: Starting stack trace...
Nov 1 22:23:55 stargate /bsd: panic() at panic+0x10b
Nov 1 22:23:55 stargate /bsd: trap() at trap+0x7b8
Nov 1 22:23:55 stargate /bsd: --- trap (number 6) ---
Nov 1 22:23:55 stargate /bsd: trap() at trap+0x709
Nov 1 22:23:55 stargate /bsd: --- trap (number 4) ---
Nov 1 22:23:55 stargate /bsd: trap() at trap+0x709
Nov 1 22:23:55 stargate /bsd: --- trap (number 4) ---
Nov 1 22:23:55 stargate /bsd: bpf_filter() at bpf_filter+0x19b
Nov 1 22:23:55 stargate /bsd: _bpf_mtap() at _bpf_mtap+0xf4
Nov 1 22:23:55 stargate /bsd: bpf_mtap_ether() at bpf_mtap_ether+0x39
Nov 1 22:23:55 stargate /bsd: em_start() at em_start+0xd6
Nov 1 22:23:55 stargate /bsd: nettxintr() at nettxintr+0x52
Nov 1 22:23:55 stargate /bsd: softintr_dispatch() at softintr_dispatch+0x8b
Nov 1 22:23:55 stargate /bsd: Xsoftnet() at Xsoftnet+0x1f
Nov 1 22:23:55 stargate /bsd: --- interrupt ---
Nov 1 22:23:55 stargate /bsd: end of kernel
Nov 1 22:23:55 stargate /bsd: end trace frame: 0xff8132b494, count: 246
Nov 1 22:23:55 stargate /bsd: 0x282:
Nov 1 22:23:55 stargate /bsd: End of stack trace.
=

# dmesg
OpenBSD 5.8-current (GENERIC.MP) #8: Sun Nov 1 13:10:37 EST 2015
r...@stargate.grizzly.bear:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4277665792 (4079MB)
avail mem = 4143894528 (3951MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0x9f000 (19 entries)
bios0: vendor American Megatrends Inc. version "1.2b" date 07/19/13
bios0: Supermicro X7SPA-HF
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP APIC MCFG OEMB HPET EINJ BERT ERST HEST
acpi0: wakeup devices P0P1(S4) PS2K(S4) PS2M(S4) USB0(S4) USB1(S4) USB2(S4) USB5(S4) EUSB(S4) USB3(S4) USB4(S4) USB6(S4) USBE(S4) P0P4(S4) P0P5(S4) P0P6(S4) P0P7(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Atom(TM) CPU D525 @ 1.80GHz, 1800.23 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,NXE,LONG,LAHF,PERF,SENSOR
cpu0: 512KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 200MHz
cpu0: mwait min=64, max=64, C-substates=0.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Atom(TM) CPU D525 @ 1.80GHz, 1800.01 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,NXE,LONG,LAHF,PERF,SENSOR
cpu1: 512KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 3 pa 0xfec0, version 20, 24 pins
ioapic0: misconfigured as apic 1, remapped to apid 3
acpimcfg0 at acpi0 addr 0xe000, bus 0-255
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 4 (P0P1)
acpiprt2 at acpi0: bus 1 (P0P4)
acpiprt3 at acpi0: bus 2 (P0P8)
acpiprt4 at acpi0: bus 3 (P0P9)
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: PWRB
ipmi at mainbus0 not configured
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Pineview DMI" rev 0x02
uhci0 at pci0 dev 26 function 0 "Intel 82801I USB" rev 0x02: apic 3 int 16
uhci1 at pci0 dev 26 function 1 "Intel 82801I USB" rev 0x02: apic 3 int 21
uhci2 at pci0 dev 26 function 2 "Intel 82801I USB" rev 0x02: apic 3 int 19
ehci0 at pci0 dev 26 function 7 "Intel 82801I USB" rev 0x02: apic 3 int 18
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb0 at pci0 dev 28 function 0 "Intel 82801I PCIE" rev 0x02: msi
pci1 at ppb0 bus 1
ppb1 at pci0 dev 28 function 4 "Intel 82801I PCIE" rev 0x02: msi
pci2 at ppb1 bus 2
em0 at pci2 dev 0 function 0 "Intel 82574L" rev 0x00: msi, address 00:25:90:92:d4:f8
ppb2 at pci0 dev 28 function 5 "Intel 82801I PCIE" rev 0x02: msi
pci3 at ppb2 bus 3
em1 at pci3 dev 0 function 0 "Intel 82574L" rev 0x00: msi, address 00:25:90:92:d4:f9
uhci3 at pci0 dev 29 function 0 "Intel 82801I USB" rev 0x02: apic 3 int 23
uhci4 at pci0 dev 29 function 1 "Intel 82801I USB" rev 0x02: apic 3 int 19
uhci5 at pci0 dev 29 function 2 "Intel 82801I USB" rev 0x02: apic 3 int 18
ehci1
Re: Sun Netra X1 still get dc0: watchdog timeout and crash on current
> Date: Fri, 04 Mar 2011 07:40:59 -0500
> From: Daniel Ouellet dan...@presscom.net
>
> I get pretty much symmetric traffic of ~43.5Mbps in each direction
> using tcpbench as before. Then after maybe one minute, both sides will
> cut off, as opposed to before when only one side was doing that. It
> will then stay off for a few seconds and come back up on both sides at
> the same time, just like it went down at the same time, and EACH time
> this happens I will get a watchdog timeout. I will get it on either
> network port like this:
>
> # dc0: watchdog timeout
> dc0: watchdog timeout
> dc1: watchdog timeout
> dc0: watchdog timeout
> dc0: watchdog timeout
> dc0: watchdog timeout
> dc0: watchdog timeout
> dc0: watchdog timeout
> dc0: watchdog timeout
> dc1: watchdog timeout
> dc0: watchdog timeout

Yes. The chip still craps out. My diff doesn't address that. But at
least the machine won't panic anymore, and the network interface will
recover after the timeout. It's committed now.

I'll see if I can figure out why the chip shits itself. May send you
more stuff to test.

Cheers,

Mark
Re: Sun Netra X1 still get dc0: watchdog timeout and crash on current
On 3/5/11 1:49 PM, Mark Kettenis wrote:
> > Date: Fri, 04 Mar 2011 07:40:59 -0500
> > From: Daniel Ouellet dan...@presscom.net
> >
> > I get pretty much symmetric traffic of ~43.5Mbps in each direction
> > using tcpbench as before. Then after maybe one minute, both sides will
> > cut off, as opposed to before when only one side was doing that. It
> > will then stay off for a few seconds and come back up on both sides at
> > the same time, just like it went down at the same time, and EACH time
> > this happens I will get a watchdog timeout. I will get it on either
> > network port like this:
> >
> > # dc0: watchdog timeout
> > dc0: watchdog timeout
> > dc1: watchdog timeout
> > dc0: watchdog timeout
> > dc0: watchdog timeout
> > dc0: watchdog timeout
> > dc0: watchdog timeout
> > dc0: watchdog timeout
> > dc0: watchdog timeout
> > dc1: watchdog timeout
> > dc0: watchdog timeout
>
> Yes. The chip still craps out. My diff doesn't address that. But at
> least the machine won't panic anymore, and the network interface will
> recover after the timeout. It's committed now.
>
> I'll see if I can figure out why the chip shits itself. May send you
> more stuff to test.

Thanks, I saw that commit.

Anytime you need me to test more, I am all over it! (:

Best,

Daniel
Re: Sun Netra X1 still get dc0: watchdog timeout and crash on current
On 3/3/11 3:28 PM, Mark Kettenis wrote:
> Daniel,
>
> Can you try this diff? It won't get rid of the watchdog timeout, but
> hopefully it will prevent the DMA error.

Hi Mark,

Here are the results for this. Sorry it took me so long to get to test
it.

It does make a difference and I will test more today, but so far here is
what I have after 30+ minutes of testing different things.

No crash yet. (But I will let it run for a few hours to be sure.)

The connections behave differently now.

I get pretty much symmetric traffic of ~43.5Mbps in each direction
using tcpbench as before. Then after maybe one minute, both sides will
cut off, as opposed to before when only one side was doing that. It will
then stay off for a few seconds and come back up on both sides at the
same time, just like it went down at the same time, and EACH time this
happens I will get a watchdog timeout. I will get it on either network
port like this:

# dc0: watchdog timeout
dc0: watchdog timeout
dc1: watchdog timeout
dc0: watchdog timeout
dc0: watchdog timeout
dc0: watchdog timeout
dc0: watchdog timeout
dc0: watchdog timeout
dc0: watchdog timeout
dc1: watchdog timeout
dc0: watchdog timeout

More frequently on dc0 than dc1. Why, I do not know. Maybe it is just
which one I start first, but anyway, those are the results so far.

Each time it comes back up it will go back to the ~42.5Mbps, be stable
for a while, then go down on both sides at once, etc., and each time the
watchdog message shows up when you lose the traffic going through the
router.

But still no crash yet.

There is one thing I would love to understand in tcpbench, however, to
make this better. I thought that one could actually preset the size of
the packets sent? I see the -B and -S options, but I keep reading the
man page and I just don't get it. I thought maybe -S was the size of the
packet, but then if I set it above 1500 it shouldn't work, as the MTU is
1500, yet it does work.

I would love to get more details on that and to know how to set the size
of the packets if possible, as I do not get the meaning of the man page
for the -B and -S options as is, nor their real impact on the stream
itself, if any.

I will send you more feedback today at some point after I get some much
needed sleep.

It looks to me, based on what I see, that if the watchdog timeout wasn't
there, most likely there wouldn't be any cut in the stream going across
the router, as the timing of the cuts and the console output is way too
consistent.

Best, and thanks again for your work on this. I very much appreciate it
big time!

Daniel
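On the -B/-S question in the message above: as far as I can tell from tcpbench(1), -B sets the size of the read/write buffer and -S the size of the socket buffer, so neither one sets a packet size. With TCP the stack cuts the byte stream into segments that fit the path, which would explain why a value above 1500 still works. A rough sketch of that arithmetic follows. It assumes IPv4 and TCP headers of 20 bytes each with no options, and the function and macro names are made up for illustration; a real stack may coalesce or split writes differently.

```c
#include <stddef.h>

/* Assumption: IPv4 + TCP with no options, 20 + 20 bytes of headers per packet. */
#define SKETCH_HEADER_BYTES 40

/* Largest TCP payload that fits in one packet for a given MTU. */
static size_t
mss_for_mtu(size_t mtu)
{
	return mtu - SKETCH_HEADER_BYTES;
}

/* Minimum number of segments needed to carry write_bytes of payload. */
static size_t
segments_for_write(size_t write_bytes, size_t mtu)
{
	size_t mss = mss_for_mtu(mtu);

	/* Round up: a partial last segment still costs one packet. */
	return (write_bytes + mss - 1) / mss;
}
```

Under those assumptions a 1500 byte MTU leaves 1460 bytes of payload per segment, so a single large write is simply spread over many packets rather than rejected.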
Re: Sun Netra X1 still get dc0: watchdog timeout and crash on current
On 3/4/11 7:40 AM, Daniel Ouellet wrote:
> On 3/3/11 3:28 PM, Mark Kettenis wrote:
> > Daniel,
> >
> > Can you try this diff? It won't get rid of the watchdog timeout, but
> > hopefully it will prevent the DMA error.

Following up on my previous email: I have let it run 12 hours so far
with as much traffic as I could send both ways, with two other servers
running tcpbench, and not a crash whatsoever.

The interrupt load still runs at 90% pretty much constantly, other than
when the watchdog timeout shows up; then there are no interrupts at all,
since obviously there is no traffic to process.

So far I still see this watchdog only when I do duplex traffic. I do not
see it when the traffic is one way, or when it is very slow in one
direction and full speed in the other.

Your diff sure helps. Not sure if that's the best solution, but it sure
removes the crash.

Hope this helps some.

Daniel
Re: Sun Netra X1 still get dc0: watchdog timeout and crash on current
Daniel,

Can you try this diff? It won't get rid of the watchdog timeout, but
hopefully it will prevent the DMA error.

Index: dc.c
===================================================================
RCS file: /cvs/src/sys/dev/ic/dc.c,v
retrieving revision 1.116
diff -u -p -r1.116 dc.c
--- dc.c	5 Aug 2010 07:57:04 -0000	1.116
+++ dc.c	3 Mar 2011 20:22:12 -0000
@@ -3059,6 +3059,7 @@ void
 dc_stop(struct dc_softc *sc, int softonly)
 {
 	struct ifnet *ifp;
+	u_int32_t isr;
 	int i;
 
 	ifp = &sc->sc_arpcom.ac_if;
@@ -3070,6 +3071,28 @@ dc_stop(struct dc_softc *sc, int softonl
 
 	if (!softonly) {
 		DC_CLRBIT(sc, DC_NETCFG, (DC_NETCFG_RX_ON|DC_NETCFG_TX_ON));
+
+		for (i = 0; i < DC_TIMEOUT; i++) {
+			isr = CSR_READ_4(sc, DC_ISR);
+			if ((isr & DC_ISR_TX_IDLE ||
+			    (isr & DC_ISR_TX_STATE) == DC_TXSTATE_RESET) &&
+			    (isr & DC_ISR_RX_STATE) == DC_RXSTATE_STOPPED)
+				break;
+			DELAY(10);
+		}
+
+		if (i == DC_TIMEOUT) {
+			if (!((isr & DC_ISR_TX_IDLE) ||
+			    (isr & DC_ISR_TX_STATE) == DC_TXSTATE_RESET) &&
+			    !DC_IS_ASIX(sc) && !DC_IS_DAVICOM(sc))
+				printf("%s: failed to force tx to idle state\n",
+				    sc->sc_dev.dv_xname);
+			if (!((isr & DC_ISR_RX_STATE) == DC_RXSTATE_STOPPED) &&
+			    !DC_HAS_BROKEN_RXSTATE(sc))
+				printf("%s: failed to force rx to idle state\n",
+				    sc->sc_dev.dv_xname);
+		}
+
 		CSR_WRITE_4(sc, DC_IMR, 0x00000000);
 		CSR_WRITE_4(sc, DC_TXADDR, 0x00000000);
 		CSR_WRITE_4(sc, DC_RXADDR, 0x00000000);
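For readers following along, the diff above turns off the receiver and transmitter and then polls the status register, for at most DC_TIMEOUT iterations, until the chip reports both engines stopped, before the remaining registers are cleared. The same bounded-poll pattern can be sketched in userland C. This is only an illustration under stated assumptions: the status bits, the timeout value and the fake_chip stand-in are hypothetical and are not the dc(4) register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bits; the real DC_ISR_* values live in the driver headers. */
#define SKETCH_TX_IDLE		0x1u
#define SKETCH_RX_STOPPED	0x2u
#define SKETCH_TIMEOUT		1000

/* Hypothetical stand-in for a device that goes idle after some number of reads. */
struct fake_chip {
	int reads_until_idle;
	int reads_done;
};

/* Simulated register read: reports busy until enough reads have happened. */
static uint32_t
fake_read_status(struct fake_chip *chip)
{
	chip->reads_done++;
	if (chip->reads_done > chip->reads_until_idle)
		return SKETCH_TX_IDLE | SKETCH_RX_STOPPED;
	return 0;
}

/*
 * Poll until both engines report stopped, or give up after
 * SKETCH_TIMEOUT reads. Returns true if the chip went idle in time.
 */
static bool
wait_for_idle(struct fake_chip *chip)
{
	uint32_t status;
	int i;

	for (i = 0; i < SKETCH_TIMEOUT; i++) {
		status = fake_read_status(chip);
		if ((status & SKETCH_TX_IDLE) && (status & SKETCH_RX_STOPPED))
			break;
		/* A real driver would pause here, e.g. DELAY(10). */
	}
	return i < SKETCH_TIMEOUT;
}
```

The point of the bound is that a wedged chip cannot hang the caller forever; the caller can log the failure and carry on, which is what the two printf() calls in the diff do.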
Sun Netra X1 still get dc0: watchdog timeout and crash on current.
Hi,

I was testing routing with this box to see what could be done. When it
is used in full duplex mode, routing tcpbench traffic between servers on
either side with this box as the router in between, I get this crash
after a watchdog timeout.

It only happens when I process traffic in both directions. If I do one
side, all good; the other side, all good. But with both directions at
the same time, I get 94Mb on one side, 54Mb on the other, and then a
crash with a timeout that drops me into ddb.

All related information is below: trace, ps and dmesg.

Thanks

Daniel

# dc0: watchdog timeout
panic: psycho0: uncorrectable DMA error AFAR 6ed72050 (pa=0 tte=0/6ce94012) AFSR 21ff4000
kdb breakpoint at 1462280
Stopped at Debugger+0x4: nop
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb>
ddb> trace
psycho_ue(4f99e00, 6, e0017428, 40009db9c40, 14284e0, ) at psycho_ue+0x7c
sparc_interrupt(40009da8f50, 0, 40010b14050, 80006ce94012, 20, deafbeef) at sparc_interrupt+0x2a0
m_extfree(40009da8f50, 4000a07c0f0, 60014000, ff00, 1918, deafbeef) at m_extfree+0x7c
m_free_unlocked(40009da8f50, 60014000, 2000, 4000a07c0f0, 11f8ce0, 18d0) at m_free_unlocked+0x4c
m_freem(40009da8f50, 2000, 400010a9d00, 0, 800, 2) at m_freem+0x20
dc_stop(4000109a000, 3, 2000, 4000a07c150, 20, 40) at dc_stop+0xd0
dc_init(4000109a000, 6, 40001081900, 1e00, 20, a) at dc_init+0x20
dc_tx_underrun(4000109a000, 73ff, ff, 8000, ff00, 18d0) at dc_tx_underrun+0x1ec
dc_intr(4000109a000, 1, e0017b50, 40009db9c40, 1081200, ) at dc_intr+0x198
sparc_interrupt(400014f4050, 40001081e00, 40001081900, 7e0, 20, a) at sparc_interrupt+0x2a0
nettxintr(4000109a000, 3fff, ff, ff00, 1918, 1910) at nettxintr+0x6c
netintr(e0017ec8, 0, e0017ec8, 8000, 11f8ce0, 18d0) at netintr+0xd0
sparc_interrupt(e0017ec8, 0, e0017ec8, 0, 11f8ce0, 180f3d8) at sparc_interrupt+0x2a0
sparc_interrupt(188a030, e0018000, 153a150, 1886148, 0, 0) at sparc_interrupt+0x2a0
sched_idle(e0018000, 4000a070400, 153a758, 4000efb8000, 40010bba470, 1c09d30) at sched_idle+0x140
proc_trampoline(0, 0, 0, 0, 0, 0) at proc_trampoline+0x4
ddb>
ddb> ps
   PID   PPID   PGRP    UID  S       FLAGS  WAIT       COMMAND
 17579  24574  17579      0  3     0x44180  poll       systat
 24574  26610  24574      0  3      0x4080  pause      ksh
 26610  29211  26610      0  3      0x4180  select     sshd
  2817      1   2817      0  3     0x40180  select     sendmail
 22433      1  22433      0  3      0x4080  ttyin      ksh
 16463      1  16463      0  3        0x80  select     cron
  1739      1   1739      0  3       0x180  select     inetd
 29211      1  29211      0  3        0x80  select     sshd
 13002  31220  26791     83  3       0x180  poll       ntpd
 31220  26791  26791     83  3       0x180  poll       ntpd
 26791      1  26791      0  3        0x80  poll       ntpd
  9509  15123  15123     74  3       0x180  bpf        pflogd
 15123      1  15123      0  3        0x80  netio      pflogd
 30009  12451  12451     73  3       0x180  poll       syslogd
 12451      1  12451      0  3        0x88  netio      syslogd
    13      0      0      0  3    0x100200  aiodoned   aiodoned
    12      0      0      0  3    0x100200  syncer     update
    11      0      0      0  3    0x100200  cleaner    cleaner
    10      0      0      0  3    0x100200  reaper     reaper
     9      0      0      0  3    0x100200  pgdaemon   pagedaemon
     8      0      0      0  3    0x100200  bored      crypto
     7      0      0      0  3    0x100200  pftm       pfpurge
     6      0      0      0  3    0x100200  usbtsk     usbtask
     5      0      0      0  3    0x100200  usbatsk    usbatsk
     4      0      0      0  3    0x100200  bored      syswq
    *3      0      0      0  7  0x40100200             idle0
     2      0      0      0  3    0x100200  kmalloc    kmthread
     1      0      1      0  3      0x4080  wait       init
     0     -1      0      0  3     0x80200  scheduler  swapper
ddb>

OpenBSD 4.9 (GENERIC) #253: Tue Mar 1 17:27:32 MST 2011
dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC
real mem = 1073741824 (1024MB)
avail mem = 1043521536 (995MB)
mainbus0 at root: Sun Netra X1 (UltraSPARC-IIe 500MHz)
cpu0 at mainbus0: SUNW,UltraSPARC-IIe (rev 1.4) @ 500 MHz
cpu0: physical 16K instruction (32 b/l), 16K data (32 b/l), 256K external (64 b/l)
psycho0 at mainbus0: SUNW,sabre, impl 0, version 0, ign 7c0
psycho0: bus range 0-0, PCI bus 0
psycho0: dvma map 6000-7fff
pci0 at psycho0
ebus0 at pci0 dev 7 function 0 Acer Labs M1533 ISA rev 0x00
dma at ebus0 addr 0- ivec 0x2a not configured
rtc0 at ebus0 addr 70-71: m5819
power0 at ebus0 addr 2000-2007 ivec 0x23
lom0 at ebus0 addr 8010-8011 ivec 0x2a: LOMlite2 rev 3.10
com0 at ebus0 addr 3f8-3ff
Kernel crash in -current of yesterday
As I have had some lockups on this PC, someone suggested that I upgrade
to -current. I downloaded the current snapshot from a couple of hours
ago and did an upgrade.

At the following reboot the system crashed!

It seems that there were two problems. First, there was the following
blue text:

spec_open_clone(): cloning device (23, 0) for pid 28994
spec_open_clone(): new minor for cloned device is 1

A couple of lines later in the rc output there was the following blue
text (warning: I copied it by hand):

uvm_fault(0xfe8013033c18, 0x0, 0, 1) -> e
kernel: page fault trap, code=0
stopped at ffs_sync_vnode +0x25: testb $0xf,0x20(%rax)

And here is the output of trace (again, copied by hand):

ffs_sync_vnode() AT ffs_sync_vnode + 0x25
ufs_mount_foreach_vnode() AT ufs_mount_foreach_vnode + 0x32
ffs_sync() AT ffs_sync + 0x74
sys_sync() AT sys_sync + 0x97
syscall() AT syscall + 0x225
--- syscall (number 36) ---
end of kernel
end of trace frame: 0x503941a8, count: -5

Bye.

--
Federico Giannici
http://www.neomedia.it