Re: 5.8 freezes on Shuttle DS87, anybody else?

2015-12-01 Thread Harald Dunkel
I migrated this OpenBSD setup to a five-year-old network
appliance. It has been running for more than a week without
problems.

This means I don't have a test setup to chase the problem
anymore.

Regards
Harri



Re: 5.8 freezes on Shuttle DS87, anybody else?

2015-11-23 Thread Harald Dunkel
On 11/16/2015 04:28 PM, Harald Dunkel wrote:
> 
> See attachment. Hope this helps.
> 
> Regards
> Harri
> 
Obviously attachments are not working. Here you go.

Hope this helps
Harri
-
login:

OpenBSD/amd64 (redgate.red.aixigo.de) (tty00)

login:

Stopped at  Debugger+0x9:   leave

ddb{0}> trace
Debugger() at Debugger+0x9
comintr() at comintr+0x253
intr_handler() at intr_handler+0x67
Xintr_ioapic_edge4() at Xintr_ioapic_edge4+0xc9
--- interrupt ---
x86_bus_space_io_write_2() at x86_bus_space_io_write_2+0xf
re_intr() at re_intr+0xbd
intr_handler() at intr_handler+0x67
Xintr_ioapic_edge22() at Xintr_ioapic_edge22+0xc9
--- interrupt ---
Xsoftclock() at Xsoftclock+0x15
--- interrupt ---
end trace frame: 0x0, count: -9
0x8:
ddb{0}> ps
   PID   PPID   PGRP  UID  S       FLAGS  WAIT       COMMAND
 23791  27180   5661    0  3        0x83  ttyin      less
 27180   4667  27180    0  3        0x83  wait       bash
  4667   6058   4667    0  3        0x8b  pause      ksh
  6058  11179   6058    0  3        0x92  select     sshd
 26887      1      1    0  3        0x82  ttyopn     getty
  9959      1   9959    0  3        0x83  ttyin      getty
   167      1    167    0  3        0x83  ttyin      getty
 14289      1  14289    0  3        0x83  ttyin      getty
 23518      1  23518    0  3        0x83  ttyin      getty
 26304      1  26304    0  3        0x83  ttyin      getty
 27872      1  27872    0  3        0x83  ttyin      getty
 31785      1  31785    0  3        0x80  poll       cron
 28768      1  28768    0  3        0x80  kqread     apmd
 18511      1    948  631  3        0x90  poll       dnsmasq
 17033      1  17033   99  3        0x90  poll       sndiod
  6479  12666  12666   95  3        0x90  kqread     smtpd
 24124  12666  12666   95  3        0x90  kqread     smtpd
 27005  12666  12666   95  3        0x90  kqread     smtpd
  8097  12666  12666   95  3        0x90  kqread     smtpd
 11978  12666  12666   95  3        0x90  kqread     smtpd
 19689  12666  12666  103  3        0x90  kqread     smtpd
 12666      1  12666    0  3        0x80  kqread     smtpd
 11179      1  11179    0  3        0x80  select     sshd
 24589  24236   8654   83  3        0x90  poll       ntpd
 24236   8654   8654   83  3        0x90  poll       ntpd
  8654      1   8654    0  3        0x80  poll       ntpd
 27871  21455  21455   74  3        0x90  bpf        pflogd
 21455      1  21455    0  3        0x80  netio      pflogd
 18631    750    750   73  3        0x90  kqread     syslogd
   750      1    750    0  3        0x80  netio      syslogd
 16413      0      0    0  3     0x14200  pgzero     zerothread
  7272      0      0    0  3     0x14200  aiodoned   aiodoned
 31812      0      0    0  3     0x14200  syncer     update
 30554      0      0    0  3     0x14200  cleaner    cleaner
 12801      0      0    0  3     0x14200  reaper     reaper
 23371      0      0    0  3     0x14200  pgdaemon   pagedaemon
  8472      0      0    0  3     0x14200  bored      crypto
  8667      0      0    0  3     0x14200  pftm       pfpurge
 11541      0      0    0  3     0x14200  usbtsk     usbtask
  9707      0      0    0  3     0x14200  usbatsk    usbatsk
 22037      0      0    0  3     0x14200  bored      intelrel
 26221      0      0    0  3  0x40014200  acpi0      acpi0
 17359      0      0    0  7  0x40014200             idle3
 12468      0      0    0  7  0x40014200             idle2
  9787      0      0    0  7  0x40014200             idle1
 32157      0      0    0  3     0x14200  bored      sensors
 15878      0      0    0  2     0x14200             softnet
  6342      0      0    0  3     0x14200  bored      systqmp
 28852      0      0    0  3     0x14200  bored      systq
*23231      0      0    0  7  0x40014200             idle0
     1      0      1    0  3        0x82  wait       init
     0     -1      0    0  3     0x10200  scheduler  swapper
ddb{0}> show registers
rdi     0x3f8
rsi     0
rbp     0x800032d33a38
rbx     0xf9
rdx     0x3f8
rcx     0x8188c640  cpu_info_primary
rax     0
r8      0x1
r9      0
r10     0x40
r11     0x81340170  x86_bus_space_mem_read_4
r12     0x8023b110
r13     0x8023b000
r14     0x801ce5c0
r15     0x3f8
rip     0x81343b09  Debugger+0x9
cs      0x8
rflags  0x286
rsp  

Re: 5.8 freezes on Shuttle DS87, anybody else?

2015-11-16 Thread Harald Dunkel
On 11/12/2015 10:22 AM, Stuart Henderson wrote:
> On 2015-11-11, Harald Dunkel  wrote:
>> Hi folks,
>>
>> below you can find the trace and ps for the frozen system,
>> as well as the output of dmesg.
>>
>> Hope this helps. Please mail if I can help to track down this
>> problem.
> 
> Trace for other CPUs might help (ddb{0} shows that you are on cpu 0;
> "mach ddbcpu 1" etc switches to another one). Also the line marked '*'
> in ps output (indicating the currently-running process) from other
> CPUs.
> 

See attachment. Hope this helps.

Regards
Harri

[demime 1.01d removed an attachment of type text/x-log which had a name of 
ddb.log]



Re: 5.8 freezes on Shuttle DS87, anybody else?

2015-11-12 Thread Stuart Henderson
On 2015-11-11, Harald Dunkel  wrote:
> Hi folks,
>
> below you can find the trace and ps for the frozen system,
> as well as the output of dmesg.
>
> Hope this helps. Please mail if I can help to track down this
> problem.

Trace for other CPUs might help (ddb{0} shows that you are on cpu 0;
"mach ddbcpu 1" etc switches to another one). Also the line marked '*'
in ps output (indicating the currently-running process) from other
CPUs.
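
A minimal sketch of collecting those '*' lines from a saved ddb
session, assuming the console output was captured to a text log. The
function name running_per_cpu and the SAMPLE excerpt are illustrative
and not part of ddb; the sample rows are shortened from the ps output
posted in this thread, with the column spacing normalized.

```python
import re

# A ddb prompt looks like "ddb{0}> ps"; the number in braces is the CPU.
PROMPT = re.compile(r"^ddb\{(\d+)\}>\s*(\S.*)?$")


def running_per_cpu(log_text):
    """Map each ddb CPU number to the '*'-marked ps line seen at its prompt."""
    current_cpu = None
    in_ps = False
    result = {}
    for line in log_text.splitlines():
        m = PROMPT.match(line)
        if m:
            current_cpu = int(m.group(1))
            # Only lines that follow a "ps" command are ps output.
            in_ps = (m.group(2) or "").strip() == "ps"
            continue
        if in_ps and line.startswith("*"):
            result[current_cpu] = line[1:].strip()
    return result


# Hypothetical excerpt of a captured session (assumption: plain-text log).
SAMPLE = """\
ddb{0}> ps
   PID   PPID   PGRP  UID  S       FLAGS  WAIT       COMMAND
 28852      0      0    0  3     0x14200  bored      systq
*23231      0      0    0  7  0x40014200             idle0
     1      0      1    0  3        0x82  wait       init
"""

if __name__ == "__main__":
    for cpu, line in running_per_cpu(SAMPLE).items():
        print(f"cpu {cpu}: {line}")
```

A log that also contains "mach ddbcpu 1" followed by another ps would
yield one entry per CPU, keyed by the number in the ddb{N} prompt.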



Re: 5.8 freezes on Shuttle DS87, anybody else?

2015-11-11 Thread Harald Dunkel
Hi folks,

below you can find the trace and ps for the frozen system,
as well as the output of dmesg.

Hope this helps. Please mail if I can help to track down this
problem.


Many thanx
Harri
-
OpenBSD/amd64 (redgate.red.aixigo.de) (tty00)

login: Stopped at  Debugger+0x9:   leave
ddb{0}> trace
Debugger() at Debugger+0x9
comintr() at comintr+0x253
intr_handler() at intr_handler+0x67
Xintr_ioapic_edge4() at Xintr_ioapic_edge4+0xc9
--- interrupt ---
x86_bus_space_io_write_2() at x86_bus_space_io_write_2+0xf
re_intr() at re_intr+0xbd
intr_handler() at intr_handler+0x67
Xintr_ioapic_edge22() at Xintr_ioapic_edge22+0xc9
--- interrupt ---
Xsoftclock() at Xsoftclock+0x15
--- interrupt ---
end trace frame: 0x0, count: -9
0x8:
ddb{0}> ps
   PID   PPID   PGRP  UID  S       FLAGS  WAIT       COMMAND
 30454      1  30454    0  3        0x83  ttyin      getty
 10874      1      1    0  3        0x82  ttyopn     getty
 28447      1  28447    0  3        0x83  ttyin      getty
  5725      1   5725    0  3        0x83  ttyin      getty
  3173      1   3173    0  3        0x83  ttyin      getty
 29633      1  29633    0  3        0x83  ttyin      getty
 19476      1  19476    0  3        0x83  ttyin      getty
 24969      1  24969    0  3        0x80  poll       cron
 14110      1  14110    0  3        0x80  kqread     apmd
  8681      1   4009  631  3        0x90  poll       dnsmasq
 26045      1  26045   99  3        0x90  poll       sndiod
 18870    522    522   95  3        0x90  kqread     smtpd
   473    522    522   95  3        0x90  kqread     smtpd
  6553    522    522   95  3        0x90  kqread     smtpd
 14104    522    522   95  3        0x90  kqread     smtpd
 29084    522    522   95  3        0x90  kqread     smtpd
 24592    522    522  103  3        0x90  kqread     smtpd
   522      1    522    0  3        0x80  kqread     smtpd
 14125      1  14125    0  3        0x80  select     sshd
 10581  32102  19391   83  3        0x90  poll       ntpd
 32102  19391  19391   83  3        0x90  poll       ntpd
 19391      1  19391    0  3        0x80  poll       ntpd
   127   5292   5292   74  3        0x90  bpf        pflogd
  5292      1   5292    0  3        0x80  netio      pflogd
  1933   1100   1100   73  3        0x90  kqread     syslogd
  1100      1   1100    0  3        0x80  netio      syslogd
   172      0      0    0  3     0x14200  pgzero     zerothread
 23588      0      0    0  3     0x14200  aiodoned   aiodoned
  4889      0      0    0  3     0x14200  syncer     update
 23401      0      0    0  3     0x14200  cleaner    cleaner
  7897      0      0    0  3     0x14200  reaper     reaper
 32283      0      0    0  3     0x14200  pgdaemon   pagedaemon
   319      0      0    0  3     0x14200  bored      crypto
  2315      0      0    0  3     0x14200  pftm       pfpurge
 24227      0      0    0  3     0x14200  usbtsk     usbtask
 31176      0      0    0  3     0x14200  usbatsk    usbatsk
 22851      0      0    0  3     0x14200  bored      intelrel
  9237      0      0    0  3  0x40014200  acpi0      acpi0
   859      0      0    0  7  0x40014200             idle3
  8843      0      0    0  7  0x40014200             idle2
 22722      0      0    0  7  0x40014200             idle1
 21595      0      0    0  3     0x14200  bored      sensors
 32532      0      0    0  2     0x14200             softnet
 17651      0      0    0  3     0x14200  bored      systqmp
 26185      0      0    0  3     0x14200  bored      systq
*32437      0      0    0  7  0x40014200             idle0
     1      0      1    0  3        0x82  wait       init
     0     -1      0    0  3     0x10200  scheduler  swapper
-
OpenBSD 5.8-stable (GENERIC.DEBUG) #0: Thu Oct 29 08:21:11 CET 2015
r...@redgate.red.aixigo.de:/usr/src/sys/arch/amd64/compile/GENERIC.DEBUG
real mem = 4161720320 (3968MB)
avail mem = 4031696896 (3844MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xec1e0 (76 entries)
bios0: vendor American Megatrends Inc. version "1.00" date 08/21/2014
bios0: Shuttle Inc. DS87D
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT SLIC SSDT SSDT MCFG HPET SSDT SSDT
acpi0: wakeup devices PXSX(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) 
PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) RP06(S4) PXSX(S4) RP07(S4) 
PXSX(S4) RP08(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz, 3492.38 MHz
cpu0: 

Re: 5.8 freezes on Shuttle DS87, anybody else?

2015-11-02 Thread Harald Dunkel
Hi Stuart,

On 10/29/15 10:06, Stuart Henderson wrote:
> 
> You'll need ddb.console=1 in sysctl.conf and reboot if you don't have
> it already (it needs changing before securelevel is set). 

Check:

diff --git a/sysctl.conf b/sysctl.conf
index 0722eac..ff5f0d4 100644
--- a/sysctl.conf
+++ b/sysctl.conf
@@ -26,8 +26,8 @@ net.inet6.ip6.forwarding=1 # 1=Permit forwarding (routing) of IPv6 packets
 #net.inet.carp.preempt=1   # 1=Enable carp(4) preemption
 #net.inet.carp.log=3   # log level of carp(4) info, default 2
 #net.pipex.enable=1 # 1=Enable pipex(4) for npppd(8)
-#ddb.panic=0   # 0=Do not drop into ddb on a kernel panic
-#ddb.console=1 # 1=Permit entry of ddb from the console
+ddb.panic=1 # 0=Do not drop into ddb on a kernel panic
+ddb.console=1  # 1=Permit entry of ddb from the console
 #fs.posix.setuid=0 # 0=Traditional BSD chown() semantics
 #vm.swapencrypt.enable=0   # 0=Do not encrypt pages that go to swap
 #vfs.nfs.iothreads=4   # Number of nfsio kernel threads

> Test it before
> the system hangs otherwise you won't be able to distinguish between BREAK
> not working and the OS not being able to enter DDB.

Check: "~~b" does the trick. It's an Opengear CM4116 terminal server. The
second '~' is needed for ssh.

> 
> A trick if your console server can't send BREAK: set the speed to 300 baud
> and send ^A. (BREAK is just the tx line being held at 0 for longer than
> it would take to send a normal frame).
> 
>> Since both hosts are affected and since 5.7 was fine
>> (AFAICR) it appears to me to be a software issue. I just
>> wonder if anybody experienced the same problem?
> 
> What software do the systems run?
> 

It is a gateway between an internal network and the internet.
It does packet filtering, and it runs dnsmasq on the internal
side.

> Are they pingable when they hang?
> 

No, it's dead. Only one host is active; the other is a cold
spare I used to make sure it's not faulty hardware.

>> [demime 1.01d removed an attachment of type text/x-log which had a name of 
>> dmesg.log]
> 
> This list doesn't accept attachments; please send it in-line.
> 

See below. Hope this helps.
Harri
-
OpenBSD 5.8-stable (GENERIC.DEBUG) #0: Thu Oct 29 08:21:11 CET 2015
r...@redgate.red.aixigo.de:/usr/src/sys/arch/amd64/compile/GENERIC.DEBUG
real mem = 4161720320 (3968MB)
avail mem = 4031696896 (3844MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xec1e0 (76 entries)
bios0: vendor American Megatrends Inc. version "1.00" date 08/21/2014
bios0: Shuttle Inc. DS87D
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT SLIC SSDT SSDT MCFG HPET SSDT SSDT
acpi0: wakeup devices PXSX(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) 
PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) RP06(S4) PXSX(S4) RP07(S4) 
PXSX(S4) RP08(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz, 3492.38 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz, 3491.92 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz, 3491.92 MHz
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 1, core 0, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i3-4150 CPU @ 3.50GHz, 3491.92 MHz

Re: 5.8 freezes on Shuttle DS87, anybody else?

2015-11-02 Thread Harald Dunkel
PS: Would you recommend any special options or flags
for GENERIC.DEBUG, besides

makeoptions DEBUG="-g"

?

Regards
Harri



Re: 5.8 freezes on Shuttle DS87, anybody else?

2015-10-29 Thread Stuart Henderson
On 2015-10-29, Harald Dunkel  wrote:
> Hi folks,
>
> I had several system freezes of our 2 Shuttle DS87 hosts
> running 5.8. Sometimes the host is up for a week without
> problems, but I have also seen 3 freezes on one day.
>
> The serial console doesn't give a hint about what goes
> wrong. I have built 5.8 with -g now to create a crash
> dump for a bug report on the next failure (hoping this
> break thing works with my console server).

You'll need ddb.console=1 in sysctl.conf and reboot if you don't have
it already (it needs changing before securelevel is set). Test it before
the system hangs otherwise you won't be able to distinguish between BREAK
not working and the OS not being able to enter DDB.
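
A minimal sketch of checking that the setting is actually active (not
commented out) in a copy of sysctl.conf, assuming the usual name=value
lines with '#' comments. The helper names and SAMPLE_CONF are
illustrative; this only inspects the file text and says nothing about
the running kernel.

```python
def active_sysctl_settings(conf_text):
    """Return the name=value pairs that are not commented out."""
    settings = {}
    for raw in conf_text.splitlines():
        # Drop everything after '#', so fully commented lines become empty.
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def ddb_console_enabled(conf_text):
    """True if ddb.console=1 appears as an active setting."""
    return active_sysctl_settings(conf_text).get("ddb.console") == "1"


# Hypothetical excerpt; on a real system the text would be read from
# /etc/sysctl.conf instead.
SAMPLE_CONF = """\
#ddb.panic=0            # 0=Do not drop into ddb on a kernel panic
ddb.console=1           # 1=Permit entry of ddb from the console
"""

if __name__ == "__main__":
    print("ddb.console enabled:", ddb_console_enabled(SAMPLE_CONF))
```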

A trick if your console server can't send BREAK: set the speed to 300 baud
and send ^A. (BREAK is just the tx line being held at 0 for longer than
it would take to send a normal frame).
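
The arithmetic behind this trick can be sketched as follows, assuming
8N1 framing and a console running at 9600 or 115200 baud (neither is
stated in this thread). The function names are illustrative, and the
test "low for longer than one receiver frame" is a simplification of
what a real UART checks.

```python
def frame_time(baud, data_bits=8, parity_bits=0, stop_bits=1):
    """Seconds needed to send one character frame, including the start bit."""
    return (1 + data_bits + parity_bits + stop_bits) / baud


def longest_low_run(byte, data_bits=8):
    """Longest run of consecutive low bit times in the frame for `byte`.

    The start bit is low, data bits are sent least-significant bit first,
    and the stop bit is high.
    """
    bits = [0] + [(byte >> i) & 1 for i in range(data_bits)] + [1]
    longest = run = 0
    for bit in bits:
        run = run + 1 if bit == 0 else 0
        longest = max(longest, run)
    return longest


def looks_like_break(byte, send_baud, recv_baud):
    """True if the line stays low longer than one frame at the receiver."""
    low_time = longest_low_run(byte) / send_baud
    return low_time > frame_time(recv_baud)


if __name__ == "__main__":
    ctrl_a = 0x01  # ^A
    for recv_baud in (9600, 115200):  # assumed console speeds
        low_ms = longest_low_run(ctrl_a) / 300 * 1000
        frame_ms = frame_time(recv_baud) * 1000
        print(f"receiver at {recv_baud}: low for {low_ms:.2f} ms, "
              f"frame is {frame_ms:.3f} ms, "
              f"break={looks_like_break(ctrl_a, 300, recv_baud)}")
```

Sent at the receiver's own speed, the same byte keeps the line low for
less than one frame, which is why the sender has to drop to 300 baud.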

> Since both hosts are affected and since 5.7 was fine
> (AFAICR) it appears to me to be a software issue. I just
> wonder if anybody experienced the same problem?

What software do the systems run?

Are they pingable when they hang?

> [demime 1.01d removed an attachment of type text/x-log which had a name of 
> dmesg.log]

This list doesn't accept attachments; please send it in-line.