Re: snapshots total freeze (linux emulation)
hmm, on Fri, Jan 18, 2013 at 05:07:13AM +0100, frantisek holop said that
hmm, on Tue, Jan 15, 2013 at 02:42:55PM -0800, Philip Guenther said that
On Fri, Dec 28, 2012 at 1:07 PM, Philip Guenther guent...@gmail.com wrote:
On Fri, Dec 28, 2012 at 8:57 AM, frantisek holop min...@obiit.org wrote:
...
savecore came on and i have in the logs:

Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic assertion wp->wp_new_futex == f failed: file ../../../../compat/linux/linux_futex.c, line 568

Excellent. The next question is whether that's the only bug that you're hitting, or if there's something else going on that should also be debugged. As for that particular failed assertion, it would be interesting to know what the actual value of wp->wp_new_futex was (if it was NULL, then I have a guess as to the bug; if it wasn't NULL, then uh, good luck!)

The fix for this has been committed, at least if it's the wp->wp_new_futex==NULL case.

installing snapshot...
thanks for looking into this

sad to say, panic'd again, seems related. as i am writing this in the ecuadorian jungle, there is an actual bug visible on pics 3, 4.

screenshots:
obiit.org/f/panic2_1.jpg
obiit.org/f/panic2_2.jpg
obiit.org/f/panic2_3.jpg
obiit.org/f/panic2_4.jpg

OpenBSD 5.2-current (GENERIC.MP) #20: Mon Jan 21 18:04:07 MST 2013
    t...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: Intel(R) Atom(TM) CPU N570 @ 1.66GHz (GenuineIntel 686-class) 1.67 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,LONG,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,LAHF,PERF
real mem = 1061818368 (1012MB)
avail mem = 1033461760 (985MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 03/31/10, SMBIOS rev. 2.6 @ 0xeb0f0 (53 entries)
bios0: vendor LENOVO version 50CN12WW date 04/22/2011
bios0: LENOVO 20109
acpi0 at bios0: rev 3
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP APIC MCFG SLIC HPET
acpi0: wakeup devices P0P8(S4) PS2K(S3) PS2M(S3) EUSB(S3) P0PA(S4) P0PB(S4) P0PC(S4) P0P9(S3) USB0(S3) USB1(S3) USB2(S3) USB3(S3) PWRB(S3) SLPB(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 167MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Atom(TM) CPU N570 @ 1.66GHz (GenuineIntel 686-class) 1.68 GHz
cpu1: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,LONG,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,LAHF,PERF
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Atom(TM) CPU N570 @ 1.66GHz (GenuineIntel 686-class) 1.68 GHz
cpu2: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,LONG,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,LAHF,PERF
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Atom(TM) CPU N570 @ 1.66GHz (GenuineIntel 686-class) 1.68 GHz
cpu3: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,LONG,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,LAHF,PERF
ioapic0 at mainbus0: apid 4 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0 addr 0xf000, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 3 (P0P8)
acpiprt2 at acpi0: bus 2 (P0PA)
acpiprt3 at acpi0: bus -1 (P0PB)
acpiprt4 at acpi0: bus -1 (P0PC)
acpiprt5 at acpi0: bus 1 (P0P9)
acpiec0 at acpi0
acpicpu0 at acpi0:, C3, C2, C1, PSS
acpicpu1 at acpi0:, C3, C2, C1, PSS
acpicpu2 at acpi0:, C3, C2, C1, PSS
acpicpu3 at acpi0:, C3, C2, C1, PSS
acpibtn0 at acpi0: PWRB
acpibtn1 at acpi0: SLPB
acpibtn2 at acpi0: LID_
acpiac0 at acpi0: AC unit online
acpibat0 at acpi0: BAT1 model LNV-L10C6Y12 serial 004706 type LiIon oem CPT-ES3
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD02
bios0: ROM list: 0xc/0xda00! 0xce000/0x1000
cpu0: Enhanced SpeedStep 1680 MHz: speeds: 1667, 1334, 1000 MHz
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 Intel Pineview DMI rev 0x02
vga1 at pci0 dev 2 function 0 Intel Pineview Video rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
intagp0 at vga1
agp0 at intagp0: aperture at 0xd000, size 0x1000
inteldrm0 at vga1: apic 4 int 16
drm0 at inteldrm0
Intel Pineview Video rev 0x02 at pci0 dev 2 function 1 not configured
azalia0 at pci0 dev 27 function 0 Intel 82801GB HD Audio rev 0x02: msi
azalia0: codecs: Realtek ALC269
audio0 at azalia0
ppb0 at pci0 dev 28 function 0 Intel 82801GB PCIE rev 0x02: apic 4 int 16
pci1 at ppb0 bus 1
re0 at pci1 dev 0 function 0 Realtek
Re: snapshots total freeze (linux emulation)
hmm, on Sun, Feb 03, 2013 at 02:47:06AM +0100, frantisek holop said that
sad to say, panic'd again, seems related.

forgot to add: nice panic, because this was at shutdown..

-f
--
fact: fourteen out of every ten people like chocolate.
Re: snapshots total freeze (linux emulation)
hmm, on Tue, Jan 15, 2013 at 02:42:55PM -0800, Philip Guenther said that
On Fri, Dec 28, 2012 at 1:07 PM, Philip Guenther guent...@gmail.com wrote:
On Fri, Dec 28, 2012 at 8:57 AM, frantisek holop min...@obiit.org wrote:
...
savecore came on and i have in the logs:

Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic assertion wp->wp_new_futex == f failed: file ../../../../compat/linux/linux_futex.c, line 568

Excellent. The next question is whether that's the only bug that you're hitting, or if there's something else going on that should also be debugged. As for that particular failed assertion, it would be interesting to know what the actual value of wp->wp_new_futex was (if it was NULL, then I have a guess as to the bug; if it wasn't NULL, then uh, good luck!)

The fix for this has been committed, at least if it's the wp->wp_new_futex==NULL case.

installing snapshot...
thanks for looking into this

-f
--
drink till she's cute. stop before you get married.
Re: snapshots total freeze (linux emulation)
On Fri, Dec 28, 2012 at 1:07 PM, Philip Guenther guent...@gmail.com wrote:
On Fri, Dec 28, 2012 at 8:57 AM, frantisek holop min...@obiit.org wrote:
...
savecore came on and i have in the logs:

Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic assertion wp->wp_new_futex == f failed: file ../../../../compat/linux/linux_futex.c, line 568

Excellent. The next question is whether that's the only bug that you're hitting, or if there's something else going on that should also be debugged. As for that particular failed assertion, it would be interesting to know what the actual value of wp->wp_new_futex was (if it was NULL, then I have a guess as to the bug; if it wasn't NULL, then uh, good luck!)

The fix for this has been committed, at least if it's the wp->wp_new_futex==NULL case.

Philip Guenther
Re: snapshots total freeze
On 12/25/2012 07:05 PM, frantisek holop wrote:
hi there,

since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

anybody else is seeing something similar?

-f

Just my 0.5 cents... I also had some sporadic hangs on various 5.2 snapshots some time ago on my home PC (not a notebook). I found the reason once I opened the case: it was full of dust. So I blew and vacuumed the dust out, and it now works fine. I suspect the same could apply to notebooks that have vents (mine does).

--
With best regards,
Gregory Edigarov
Re: snapshots total freeze
Hi,

On 25.12.2012 20:28, epsilon wrote:
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

anybody else is seeing something similar?

Not really the same, but maybe comparable. I am unsure, but let's see:

Since upgrade to 5.2 my gateway box freezes in about one out of four times I boot it (it's switched off over night). It freezes somewhere after starting network daemons and starting local daemons. I tried to disable services I do not essentially need or to substitute them with other solutions. So far no findings here.

But this box runs no X. I have connected a keyboard and a monitor and I am able to switch between the virtual terminals but no reaction there. If I simply hit return, nothing happens. No login possible. ICMP pings are replied to, but I cannot SSH into the box. Connections are NOT rejected, they just time out. Same with all other TCP connections.

After a while the fan accelerates. It looks like the CPU is working very hard. Unfortunately this is really the only reaction this box gives me. But better than nothing.

I can confirm the problem here. We run some 5.2 VMs on ESXi 5.1, and when a VM freezes the CPU runs at 100% but the system is completely frozen. No disk I/O, no keyboard, but kernel networking seems to be ok.

We hit this problem on a physical server after upgrading to 5.2 too.

Sometimes the problem occurs during boot as described above, but most of the freezes are seen after a short uptime on systems with heavy disk I/O.

We had two VMs which crashed every 2 - 4 days after upgrading to 5.2. After a long investigation, my assumption is that a problem with the UVM / FFS softdep bufs causes the freeze. I saw a high, increasing number of busymap bufs in systat. After a simple sync the number drops down to 0 - 2 and increases again ...

For the past 9 days I have run sync every 5 minutes, and both systems did *not* freeze again.

- Joerg

--
OSN Online Service Nuernberg GmbH, Bucher Str. 78, 90408 Nuernberg
Tel: +49 911 39905-0 - Fax: +49 911 39905-55 - http://www.osn.de
HRB 15022 Nuernberg, USt-Id: DE189301263, GF: Joerg Goltermann
Re: snapshots total freeze
Joerg Goltermann wrote:
I can confirm the problem here. We run some 5.2 VMs on ESXi 5.1, and when a VM freezes the CPU runs at 100% but the system is completely frozen. No disk I/O, no keyboard, but kernel networking seems to be ok.

We had two VMs which crashed every 2 - 4 days after upgrading to 5.2. After a long investigation, my assumption is that a problem with the UVM / FFS softdep bufs causes the freeze. I saw a high, increasing number of busymap bufs in systat. After a simple sync the number drops down to 0 - 2 and increases again ...

We're running a 5.2 VM on ESXi 5.0 and have been experiencing somewhat similar problems since upgrading from 5.1. No total freezes, the keyboard is still working, but a reboot is needed to get the machine back into a usable state.

I noticed the problems when the network wasn't working anymore. After bringing the interfaces down/up and flushing the routes, it looked good again, but I then noticed that the clock had hung completely. ntpd -s wouldn't work and the clock didn't count any further either. It was just stuck. No log or kernel messages, unfortunately.

The problem has happened twice since upgrading in mid November, about every 2 weeks.

We have softdep enabled too. I'll check the systat output when I'm back at work and let you know if I see the same.

Regards
Andre
Re: snapshots total freeze
Hi,

On Fri, Dec 28, 2012 at 12:01:37PM +0100, Joerg Goltermann wrote:
...
We hit this problem on a physical server after upgrading to 5.2 too.
...
For the past 9 days I have run sync every 5 minutes, and both systems did *not* freeze again.

Thanks for the hint. I will set up a cron job for this.

- Eps
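
The periodic sync workaround discussed above could be set up roughly as follows. This is only a sketch: the helper name `sync_cron_entry`, the `/bin/sync` path, and the use of root's crontab are assumptions, not something stated in the thread.

```sh
#!/bin/sh
# Sketch only: print a crontab entry that runs sync every N minutes.
# sync_cron_entry is a hypothetical helper; /bin/sync is an assumed path.
sync_cron_entry() {
    minutes=${1:-5}
    printf '*/%s * * * * /bin/sync\n' "$minutes"
}

# Print the entry for a 5-minute interval; paste it into root's
# crontab with `crontab -e`.
sync_cron_entry 5
```

As Joerg describes it, this is a workaround that kept the busymap buf count down on his systems, not a fix for the underlying problem.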
Re: snapshots total freeze (linux emulation)
hmm, on Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop said that
since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

so let me correct that: no visible panic..

i stayed up overnight as well for a change, and it froze/panicked right in front of my eyes.

i started a 'boot crash', but i am not sure if it finished correctly. it was sitting there forever with the disk led on, so in the end i just power cycled it.

savecore came on and i have in the logs:

Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic assertion wp->wp_new_futex == f failed: file ../../../../compat/linux/linux_futex.c, line 568
Dec 28 00:25:25 amaaq savecore: /var/crash/bounds: No such file or directory
Dec 28 00:25:25 amaaq savecore: writing core to /var/crash/bsd.0.core
Dec 28 00:26:10 amaaq savecore: writing kernel to /var/crash/bsd.0

as the only program i run in linux emulation is opera...

thanks for all the tips how to catch this.

-f
--
i'm not nearly as think as you confused i am.
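
The savecore log lines above name a kernel/core pair, /var/crash/bsd.0 and /var/crash/bsd.0.core. After several panics it can help to pick out the newest pair before handing it to the tools described in crash(8). A minimal sketch, assuming the bsd.N / bsd.N.core naming continues with increasing N; the helper name `latest_dump` is hypothetical:

```sh
#!/bin/sh
# Sketch: print the highest-numbered "bsd.N bsd.N.core" pair in a crash directory.
# latest_dump is a hypothetical helper; the naming scheme is inferred from the log above.
latest_dump() {
    dir=${1:-/var/crash}
    best=
    for core in "$dir"/bsd.*.core; do
        [ -e "$core" ] || continue          # glob matched nothing
        n=${core##*/bsd.}                   # strip directory and "bsd." prefix
        n=${n%.core}                        # strip ".core" suffix
        case $n in
            ''|*[!0-9]*) continue ;;        # skip anything that is not a plain number
        esac
        if [ -z "$best" ] || [ "$n" -gt "$best" ]; then
            best=$n
        fi
    done
    # Returns non-zero when no dump was found.
    [ -n "$best" ] && printf '%s/bsd.%s %s/bsd.%s.core\n' "$dir" "$best" "$dir" "$best"
}

# Usage: latest_dump            (looks in /var/crash)
#        latest_dump /some/dir
```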
Re: snapshots total freeze (linux emulation)
On Fri, Dec 28, 2012 at 8:57 AM, frantisek holop min...@obiit.org wrote:
...
i started a 'boot crash', but i am not sure if it was finished correctly, it was sitting there forever with the disk led on, so in the end i just power cycled it.

It depends on how much memory you have and how fast your disk is. With 4GB of mem on my laptop, it feels like it takes 5 minutes, though I haven't actually timed it. If I'm *trying* to get a crash dump (as part of debugging something particularly complicated, say), then I'll usually tell boot to lie and only use 128MB, via machine memory =128M, before boot bsd.test. (Note: if you do that, you must *also* do that when booting after the crash dump, so that savecore can find the dump in the swap partition.)

savecore came on and i have in the logs:

Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic assertion wp->wp_new_futex == f failed: file ../../../../compat/linux/linux_futex.c, line 568

Excellent. The next question is whether that's the only bug that you're hitting, or if there's something else going on that should also be debugged. As for that particular failed assertion, it would be interesting to know what the actual value of wp->wp_new_futex was (if it was NULL, then I have a guess as to the bug; if it wasn't NULL, then uh, good luck!)

Philip Guenther
Re: snapshots total freeze
This makes me think of a problem I had with an Eeepc a while back. I never reported it because I tracked it down to a faulty SSD that would just hang the machine when accessing certain sectors.

The way I verified this was to run

dd if=/dev/rwd0c of=/dev/null bs=1m

(with the appropriate device node for your drive) and see if it completes. Do this for all your mounted drives and you may find the culprit.

2012/12/25 Philip Guenther guent...@gmail.com
On Tue, Dec 25, 2012 at 2:57 PM, frantisek holop min...@obiit.org wrote:
hmm, on Tue, Dec 25, 2012 at 11:31:43PM +0100, Marc Espie said that
On Tue, Dec 25, 2012 at 11:23:06PM +0100, frantisek holop wrote:
(difficult to believe no people see this, every notebook i had since 2008 could not shutdown cleanly 50-70% of the time)

I don't know what you do with your machines, or what specific hw you have that causes this. My machines shutdown gracefully most of the time.

I'll second that. When mine hang it's because I have a bug in whatever diff I'm working on...

(the dmesgs of all my current and previous notebooks can be found in the misc archive). thinkpad, ideapad, eeepc, all of them showed this behaviour. i use these machines for simple daily use. browsing, some development work, etc. as everyday as it gets for unix users.

So start eliminating differences. Does it hang over night if you never log in? If not, then it's something you're running that does it. So log in and log out and see whether it's some daemon started by your .xsession (dbus? that gkrellm thing that caused a 1+ load?) that causes it. If it does hang even if you don't log in, then start checking off the system daemons. What if X is never started? Heck, boot to single user and leave it there over night.

You were seeing ACPI taking more and more memory before; is that still happening? Is there a correlation between that and the hangs?

Make a hypothesis (it's caused by something in my .xsession), come up with a way to test it (see if it happens if I don't log in) then do so. You have a problem; do SCIENCE on it.

there are normally 2 ways of unclean shutdown: either syncing now and never done, or (on this ideapad) simply black screen, presumably still part of X, and never going to the console. i made this script to minimize damage:

$ cat bin/ha.sh
#!/bin/sh
sudo sync
sudo mount -u -r /data
sudo halt -p

Does the hang in sync happen if you're not running any processes as you? Does it only happen if certain filesystems are mounted? What are the prerequisites for it hanging?

Since this doesn't seem to affect developers, you're going to have to do the science.

Philip Guenther

--
La brigade SnW veut vous recruter - http://www.brigadesnw.ca
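
The whole-disk read test suggested at the top of this message (dd from the raw device to /dev/null) could be looped over several disks along these lines. This is a sketch under assumptions: the helper names are hypothetical, and the device path is built from the /dev/rwd0c pattern in the message, where partition "c" of the raw device covers the whole disk.

```sh
#!/bin/sh
# Sketch: read each named disk end to end through its raw whole-disk device.
# raw_whole_disk and read_test are hypothetical helpers, not existing tools.

raw_whole_disk() {
    # wd0 -> /dev/rwd0c (naming assumed from the example in the message)
    printf '/dev/r%sc' "$1"
}

read_test() {
    for disk in "$@"; do
        dev=$(raw_whole_disk "$disk")
        echo "reading $dev ..."
        if dd if="$dev" of=/dev/null bs=1m; then
            echo "$disk: read completed"
        else
            echo "$disk: dd reported an error"
        fi
    done
}

# Usage (as root): sh readtest.sh wd0 sd0
if [ "$#" -gt 0 ]; then
    read_test "$@"
fi
```

If the machine hangs partway through one disk, as the faulty SSD in the message did, the script never prints a result for that disk, which is itself the signal.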
Re: snapshots total freeze
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
hi there,

since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

Sounds like you're in X while the kernel panics. The system is probably in ddb but you cannot see the ddb prompt since you're in X.

Can you leave the screen switched to the console (Ctrl+Alt+F1) over night and check whether a ddb prompt shows up in the morning?
Re: snapshots total freeze
Stefan Sperling s...@openbsd.org writes:
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
hi there,

since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

Sounds like you're in X while the kernel panics. The system is probably in ddb but you cannot see the ddb prompt since you're in X.

Can you leave the screen switched to the console (Ctrl+Alt+F1) over night and check whether a ddb prompt shows up in the morning?

Tip: if your system only freezes while you have X on screen, you can generally go back to ttyC0 and, even if the display isn't refreshed, blindly type at the ddb prompt. In that case, crash(8) will help.

--
Jérémie Courrèges-Anglas
GPG Key fingerprint: 61DB D9A0 00A4 67CF 2A90 8961 6191 8FBF 06A1 1494
Re: snapshots total freeze
On Tue, 25 Dec 2012 18:17:27 +0100 Stefan Sperling s...@openbsd.org wrote:
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
hi there,

since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

Sounds like you're in X while the kernel panics. The system is probably in ddb but you cannot see the ddb prompt since you're in X.

Can you leave the screen switched to the console (Ctrl+Alt+F1) over night and check whether a ddb prompt shows up in the morning?

If you also have an ntfs partition mounted, try switching to the console and running

# ksh /etc/daily

as root, and see whether it exits normally. I posted about the same error a couple of days ago, because X froze every night at 0:30am.

The command `boot dump` and crash(8) will help, as mentioned, if the kernel panics. And I would just umount any ntfs partition overnight.

Best,
Sebastian.

--
Sebastian Neuper pha...@gmx.de
Re: snapshots total freeze
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

anybody else is seeing something similar?

Not really the same, but maybe comparable. I am unsure, but let's see:

Since upgrade to 5.2 my gateway box freezes in about one out of four times I boot it (it's switched off over night). It freezes somewhere after starting network daemons and starting local daemons. I tried to disable services I do not essentially need or to substitute them with other solutions. So far no findings here.

But this box runs no X. I have connected a keyboard and a monitor and I am able to switch between the virtual terminals but no reaction there. If I simply hit return, nothing happens. No login possible. ICMP pings are replied to, but I cannot SSH into the box. Connections are NOT rejected, they just time out. Same with all other TCP connections.

After a while the fan accelerates. It looks like the CPU is working very hard. Unfortunately this is really the only reaction this box gives me. But better than nothing.

nothing in the logs, no panic, nothing.

Yes! Even the named startup logging is missing from /var/log/messages. The freeze always appears somewhere after named starts (see above). It looks like syslogd did not have the time to write the file. The last thing I got in /var/log/messages is:

... /bsd: root on wd0a ...

After rebooting the hard way, the only thing I get is sometimes (not always) /var/lost+found/* files (/var is a separate partition).

For some weeks I used more recent hardware. The only difference: the fan is louder. So I am sticking with this one to minimize the harm done to me (OK - it has other NICs, and a SATA drive).

Maybe I misinterpret things, but to me it looks like the kernel is still running, but all userland activities are completely dead/blocked/locked/looping/whatever.

IPv6 is disabled on all NICs.

Just saying ...

Greetings
E.

$ cat rc.conf.local
sendmail_flags=-L sm-mta -C/etc/mail/sendmail.cf -bd -q30m
named_flags=
httpd_flags=-DSSL -u
ftpproxy_flags=
tftpd_flags=-4 -l xx.xx.xx.xx /tftpboot
ifstated_flags=
dhcpd_flags=xl0

$ dmesg
OpenBSD 5.2 (GENERIC) #278: Wed Aug 1 10:04:16 MDT 2012
    dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel Pentium III (GenuineIntel 686-class) 732 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PSE36,MMX,FXSR,SSE
real mem = 266727424 (254MB)
avail mem = 251506688 (239MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 08/25/00, BIOS32 rev. 0 @ 0xe7300, SMBIOS rev. 2.3 @ 0xf8dc6 (47 entries)
bios0: vendor Compaq version 686P2 v2.04 date 08/25/2000
bios0: Compaq Deskpro
apm0 at bios0: Power Management spec V1.2
acpi at bios0 function 0x0 not configured
pcibios0 at bios0: rev 2.1 @ 0xe7300/0x8d00
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xf6260/208 (11 entries)
pcibios0: PCI Interrupt Router at 000:31:0 (Intel 82801BA LPC rev 0x00)
pcibios0: PCI bus #2 is the last bus
bios0: ROM list: 0xc/0xa000 0xca000/0x800 0xca800/0xd800! 0xe/0x1!
cpu0 at mainbus0: (uniprocessor)
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 Intel 82815 Host rev 0x02
vga1 at pci0 dev 2 function 0 Intel 82815 Video rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
intagp0 at vga1
agp0 at intagp0: aperture at 0x4400, size 0x400
ppb0 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0x02
pci1 at ppb0 bus 2
rl0 at pci1 dev 4 function 0 Realtek 8139 rev 0x10: irq 5, address 00:08:a1:57:08:83
rlphy0 at rl0 phy 0: RTL internal PHY
fxp0 at pci1 dev 8 function 0 Intel 82562 rev 0x01, i82562: irq 10, address 00:02:a5:2b:0f:43
inphy0 at fxp0 phy 1: i82562EM 10/100 PHY, rev. 0
xl0 at pci1 dev 9 function 0 3Com 3c905C 100Base-TX rev 0x78: irq 11, address 00:04:76:26:b5:0f
exphy0 at xl0 phy 24: 3Com internal media interface
ichpcib0 at pci0 dev 31 function 0 Intel 82801BA LPC rev 0x02: 24-bit timer at 3579545Hz
pciide0 at pci0 dev 31 function 1 Intel 82801BA IDE rev 0x02: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
atapiscsi0 at pciide0 channel 0 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: COMPAQ, CD-ROM CR-594, YC0G ATAPI 5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, DMA mode 2
wd0 at pciide0 channel 1 drive 0: MAXTOR STM380215A
wd0: 16-sector PIO, LBA48, 76319MB, 156301488 sectors
wd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 5
uhci0 at pci0 dev 31 function 4 Intel 82801BA USB rev 0x02: irq 10
auich0 at pci0 dev 31 function 5 Intel 82801BA AC97 rev 0x02: irq 5, ICH2 AC97
ac97: codec id 0x41445360 (Analog Devices AD1885)
ac97: codec features headphone, Analog Devices Phat
Re: snapshots total freeze
On Tue, Dec 25, 2012 at 9:05 AM, frantisek holop min...@obiit.org wrote:
since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

anybody else is seeing something similar?

I'll second the recommendation to try ctrl-alt-F1 followed by 'boot crash'. (If you don't have swap and /var space for a crash dump, then at least try blind typing 'trace' and 'ps' and then 'boot reboot'; if that can actually reboot, then you'll see the panic message, back trace, and ps info in the output of dmesg.)

The other thought is that you describe this as a change from before; what was the last snapshot you installed that you can be sure *did not* exhibit this behavior? e.g., did the machine last overnight when running 5.1? 5.2-release? An October snapshot?

Philip Guenther
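
If the blind 'trace', 'ps', 'boot reboot' sequence does get the machine to reboot, the relevant part of the dmesg output can be saved before it scrolls away. A small sketch; the helper name `panic_context` and the output path are hypothetical, and it simply assumes the word "panic" appears on the first interesting line:

```sh
#!/bin/sh
# Sketch: print everything from the first line mentioning "panic" to the end of input.
# panic_context is a hypothetical helper; it reads dmesg-style text on stdin.
panic_context() {
    sed -n '/panic/,$p'
}

# Usage after the reboot:
#   dmesg | panic_context > /tmp/panic-dmesg.txt
```

The saved text will also include the messages of the new boot that follow the panic; trimming those by hand is left to the reader.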
Re: snapshots total freeze
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
hi there,

since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

anybody else is seeing something similar?

-f

My notebook would freeze up regularly, especially after I had opened enough applications. This was somewhat annoying. :)

Turned out that my hard drive was on the edge of failure. I am guessing that needing to swap to the hard drive after free memory filled up caused the failures. I could be wrong.

I suggest that you carefully check the hard drive and memory. Might not have anything to do with the snapshot. Never rule out hardware.

Chris Bennett
Re: snapshots total freeze
hmm, on Tue, Dec 25, 2012 at 06:17:27PM +0100, Stefan Sperling said that
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
hi there,

since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

Sounds like you're in X while the kernel panics. The system is probably in ddb but you cannot see the ddb prompt since you're in X.

Can you leave the screen switched to the console (Ctrl+Alt+F1) over night and check whether a ddb prompt shows up in the morning?

sure, i will try.

-f
--
if you can't see black, white has no meaning
Re: snapshots total freeze
hmm, on Tue, Dec 25, 2012 at 03:49:00PM -0600, Chris Bennett said that
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
hi there,

since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

anybody else is seeing something similar?

-f

My notebook would freeze up regularly, especially after I had opened enough applications. This was somewhat annoying. :)

Turned out that my hard drive was on the edge of failure. I am guessing that needing to swap to the hard drive after free memory filled up caused the failures. I could be wrong.

a dying disk tends to make quite a lot of noise in the logs.. i see no signs of that so far. and i have lots of fsck's because of unclean shutdowns. basically nowadays every clean shutdown is cause for celebration.

(difficult to believe no people see this, every notebook i had since 2008 could not shutdown cleanly 50-70% of the time)

as to the memory, i have a usb stick with linux that has memtest86 as boot option. i can leave that overnight as well.

hardware is always suspect, but it's so difficult to test...

-f
--
sorry guys, this is not heavy metal.
Re: snapshots total freeze
On Tue, Dec 25, 2012 at 11:23:06PM +0100, frantisek holop wrote: (difficult to believe no people see this, every notebook i had since 2008 could not shutdown cleanly 50-70% of the time) I don't know what you do with your machines, or what specific hw you have that causes this. My machines shutdown gracefully most of the time.
Re: snapshots total freeze
hmm, on Tue, Dec 25, 2012 at 01:23:41PM -0800, Philip Guenther said that
On Tue, Dec 25, 2012 at 9:05 AM, frantisek holop min...@obiit.org wrote:
since a couple of snapshots back i can quite reliably freeze my openbsd notebook simply by leaving it on overnight. the desktop is there, all the open windows are there, but it has become a painting... nothing in the logs, no panic, nothing.

anybody else is seeing something similar?

I'll second the recommendation to try ctrl-alt-F1 followed by 'boot crash'. (If you don't have swap and /var space for a crash dump, then at least try blind typing 'trace' and 'ps' and then 'boot reboot'; if that can actually reboot, then you'll see the panic message, back trace, and ps info in the output of dmesg.)

i always try these. no results. keyboard seems to be totally dead (i also have machdep.kbdreset=1 and it has never worked in case of a freeze).

The other thought is that you describe this as a change from before; what was the last snapshot you installed that you can be sure *did not* exhibit this behavior? e.g., did the machine last overnight when running 5.1? 5.2-release? An October snapshot?

leaving my notebook on overnight did not result in a hard freeze before. it is basically idling, not even connected to the network. a couple of programs left running as they were. basically a poor man's hibernate, as this notebook cannot wake up after zzz..

to define that "before", i am sorry, i cannot. i remember a distinct freeze in november when i left the computer alone for only a couple of hours. coming back to it, it was totally unresponsive and i could only turn it off. this time i said overnight, but of course the freeze could have happened anytime before i woke up. i just wrote a heartbeat script that spams syslog...

i have also stopped using systat entirely, as it could lock up each and every snapshot for the last year. starting it specifically outside X in the console would lock up the machine so badly that only turning it off helped. no ddb, no ctrl-alt-del, just the blinking cursor behind systat.

so frustrating not to be able to track down any of this. apparently it is not happening to others either. so yes, too bad.

-f
--
no sense being pessimistic. it wouldn't work anyway.
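
The heartbeat script mentioned above is not shown in the thread. A minimal sketch of what such a script might look like, using logger(1) to write a numbered line to syslog at a fixed interval; the function names, the "heartbeat" tag, the message format, and the 60-second default are all assumptions:

```sh
#!/bin/sh
# Sketch of a syslog heartbeat; not the script from the message above.
# heartbeat_msg and run_heartbeat are hypothetical names.

heartbeat_msg() {
    # $1 = running counter
    printf 'heartbeat n=%s' "$1"
}

run_heartbeat() {
    interval=${1:-60}
    n=0
    while :; do
        n=$((n + 1))
        logger -t heartbeat "$(heartbeat_msg "$n")"
        sleep "$interval"
    done
}

# Start the loop only when asked to, e.g.: sh heartbeat.sh run 60
if [ "${1:-}" = "run" ]; then
    run_heartbeat "${2:-60}"
fi
```

After a freeze and power cycle, the timestamp of the last heartbeat line in /var/log/messages gives an approximate upper bound on when the machine stopped, assuming syslogd managed to write it out.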
Re: snapshots total freeze
hmm, on Tue, Dec 25, 2012 at 11:31:43PM +0100, Marc Espie said that
On Tue, Dec 25, 2012 at 11:23:06PM +0100, frantisek holop wrote:
(difficult to believe no people see this, every notebook i had since 2008 could not shutdown cleanly 50-70% of the time)

I don't know what you do with your machines, or what specific hw you have that causes this. My machines shutdown gracefully most of the time.

(the dmesgs of all my current and previous notebooks can be found in the misc archive). thinkpad, ideapad, eeepc, all of them showed this behaviour. i use these machines for simple daily use. browsing, some development work, etc. as everyday as it gets for unix users.

there are normally 2 ways of unclean shutdown: either syncing now and never done, or (on this ideapad) simply black screen, presumably still part of X, and never going to the console. i made this script to minimize damage:

$ cat bin/ha.sh
#!/bin/sh
sudo sync
sudo mount -u -r /data
sudo halt -p

-f
--
runtime error. should i walk? (y/n)
Re: snapshots total freeze
On Tue, Dec 25, 2012 at 2:57 PM, frantisek holop min...@obiit.org wrote:
hmm, on Tue, Dec 25, 2012 at 11:31:43PM +0100, Marc Espie said that
On Tue, Dec 25, 2012 at 11:23:06PM +0100, frantisek holop wrote:
(difficult to believe no people see this, every notebook i had since 2008 could not shutdown cleanly 50-70% of the time)

I don't know what you do with your machines, or what specific hw you have that causes this. My machines shutdown gracefully most of the time.

I'll second that. When mine hang it's because I have a bug in whatever diff I'm working on...

(the dmesgs of all my current and previous notebooks can be found in the misc archive). thinkpad, ideapad, eeepc, all of them showed this behaviour. i use these machines for simple daily use. browsing, some development work, etc. as everyday as it gets for unix users.

So start eliminating differences. Does it hang over night if you never log in? If not, then it's something you're running that does it. So log in and log out and see whether it's some daemon started by your .xsession (dbus? that gkrellm thing that caused a 1+ load?) that causes it. If it does hang even if you don't log in, then start checking off the system daemons. What if X is never started? Heck, boot to single user and leave it there over night.

You were seeing ACPI taking more and more memory before; is that still happening? Is there a correlation between that and the hangs?

Make a hypothesis (it's caused by something in my .xsession), come up with a way to test it (see if it happens if I don't log in) then do so. You have a problem; do SCIENCE on it.

there are normally 2 ways of unclean shutdown: either syncing now and never done, or (on this ideapad) simply black screen, presumably still part of X, and never going to the console. i made this script to minimize damage:

$ cat bin/ha.sh
#!/bin/sh
sudo sync
sudo mount -u -r /data
sudo halt -p

Does the hang in sync happen if you're not running any processes as you? Does it only happen if certain filesystems are mounted? What are the prerequisites for it hanging?

Since this doesn't seem to affect developers, you're going to have to do the science.

Philip Guenther