Re: snapshots total freeze (linux emulation)

2013-02-02 Thread frantisek holop
hmm, on Fri, Jan 18, 2013 at 05:07:13AM +0100, frantisek holop said that
 hmm, on Tue, Jan 15, 2013 at 02:42:55PM -0800, Philip Guenther said that
  On Fri, Dec 28, 2012 at 1:07 PM, Philip Guenther guent...@gmail.com wrote:
   On Fri, Dec 28, 2012 at 8:57 AM, frantisek holop min...@obiit.org wrote:
  ...
   savecore came on and i have in the logs:
  
   Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic 
   assertion wp->wp_new_futex == f failed: file 
   ../../../../compat/linux/linux_futex.c, line 568
  
   Excellent.  The next question is whether that's the only bug that
   you're hitting, or if there's something else going on that should also
   be debugged.
  
   As for that particular failed assertion, it would be interesting to
   know what the actual value of wp->wp_new_futex was (if it was NULL,
   then I have a guess as to the bug; if it wasn't NULL, then uh, good
   luck!)
  
  The fix for this has been committed, at least if it's the
  wp->wp_new_futex==NULL case.
 
 installing snapshot...
 thanks for looking into this

sad to say, panic'd again, seems related.
as i am writing this in the ecuadorian jungle,
there is an actual bug on pics 3, 4 visible.

screenshots:
obiit.org/f/panic2_1.jpg
obiit.org/f/panic2_2.jpg
obiit.org/f/panic2_3.jpg
obiit.org/f/panic2_4.jpg




OpenBSD 5.2-current (GENERIC.MP) #20: Mon Jan 21 18:04:07 MST 2013
t...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: Intel(R) Atom(TM) CPU N570 @ 1.66GHz (GenuineIntel 686-class) 1.67 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,LONG,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,LAHF,PERF
real mem  = 1061818368 (1012MB)
avail mem = 1033461760 (985MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 03/31/10, SMBIOS rev. 2.6 @ 0xeb0f0 (53 entries)
bios0: vendor LENOVO version 50CN12WW date 04/22/2011
bios0: LENOVO 20109
acpi0 at bios0: rev 3
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP APIC MCFG SLIC HPET
acpi0: wakeup devices P0P8(S4) PS2K(S3) PS2M(S3) EUSB(S3) P0PA(S4) P0PB(S4) P0PC(S4) P0P9(S3) USB0(S3) USB1(S3) USB2(S3) USB3(S3) PWRB(S3) SLPB(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 167MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Atom(TM) CPU N570 @ 1.66GHz (GenuineIntel 686-class) 1.68 GHz
cpu1: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,LONG,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,LAHF,PERF
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Atom(TM) CPU N570 @ 1.66GHz (GenuineIntel 686-class) 1.68 GHz
cpu2: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,LONG,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,LAHF,PERF
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Atom(TM) CPU N570 @ 1.66GHz (GenuineIntel 686-class) 1.68 GHz
cpu3: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,LONG,SSE3,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,LAHF,PERF
ioapic0 at mainbus0: apid 4 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0 addr 0xf000, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 3 (P0P8)
acpiprt2 at acpi0: bus 2 (P0PA)
acpiprt3 at acpi0: bus -1 (P0PB)
acpiprt4 at acpi0: bus -1 (P0PC)
acpiprt5 at acpi0: bus 1 (P0P9)
acpiec0 at acpi0
acpicpu0 at acpi0:, C3, C2, C1, PSS
acpicpu1 at acpi0:, C3, C2, C1, PSS
acpicpu2 at acpi0:, C3, C2, C1, PSS
acpicpu3 at acpi0:, C3, C2, C1, PSS
acpibtn0 at acpi0: PWRB
acpibtn1 at acpi0: SLPB
acpibtn2 at acpi0: LID_
acpiac0 at acpi0: AC unit online
acpibat0 at acpi0: BAT1 model LNV-L10C6Y12 serial 004706 type LiIon oem CPT-ES3
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD02
bios0: ROM list: 0xc/0xda00! 0xce000/0x1000
cpu0: Enhanced SpeedStep 1680 MHz: speeds: 1667, 1334, 1000 MHz
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 Intel Pineview DMI rev 0x02
vga1 at pci0 dev 2 function 0 Intel Pineview Video rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
intagp0 at vga1
agp0 at intagp0: aperture at 0xd000, size 0x1000
inteldrm0 at vga1: apic 4 int 16
drm0 at inteldrm0
Intel Pineview Video rev 0x02 at pci0 dev 2 function 1 not configured
azalia0 at pci0 dev 27 function 0 Intel 82801GB HD Audio rev 0x02: msi
azalia0: codecs: Realtek ALC269
audio0 at azalia0
ppb0 at pci0 dev 28 function 0 Intel 82801GB PCIE rev 0x02: apic 4 int 16
pci1 at ppb0 bus 1
re0 at pci1 dev 0 function 0 Realtek 

Re: snapshots total freeze (linux emulation)

2013-02-02 Thread frantisek holop
hmm, on Sun, Feb 03, 2013 at 02:47:06AM +0100, frantisek holop said that
 sad to say, panic'd again, seems related.

forgot to add: nice panic, because this was at shutdown..

-f 
-- 
fact: fourteen out of every ten people like chocolate.



Re: snapshots total freeze (linux emulation)

2013-01-17 Thread frantisek holop
hmm, on Tue, Jan 15, 2013 at 02:42:55PM -0800, Philip Guenther said that
 On Fri, Dec 28, 2012 at 1:07 PM, Philip Guenther guent...@gmail.com wrote:
  On Fri, Dec 28, 2012 at 8:57 AM, frantisek holop min...@obiit.org wrote:
 ...
  savecore came on and i have in the logs:
 
  Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic 
  assertion wp->wp_new_futex == f failed: file 
  ../../../../compat/linux/linux_futex.c, line 568
 
  Excellent.  The next question is whether that's the only bug that
  you're hitting, or if there's something else going on that should also
  be debugged.
 
  As for that particular failed assertion, it would be interesting to
  know what the actual value of wp->wp_new_futex was (if it was NULL,
  then I have a guess as to the bug; if it wasn't NULL, then uh, good
  luck!)
 
 The fix for this has been committed, at least if it's the
 wp->wp_new_futex==NULL case.

installing snapshot...
thanks for looking into this

-f
-- 
drink till she's cute.  stop before you get married.



Re: snapshots total freeze (linux emulation)

2013-01-15 Thread Philip Guenther
On Fri, Dec 28, 2012 at 1:07 PM, Philip Guenther guent...@gmail.com wrote:
 On Fri, Dec 28, 2012 at 8:57 AM, frantisek holop min...@obiit.org wrote:
...
 savecore came on and i have in the logs:

 Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic 
 assertion wp->wp_new_futex == f failed: file 
 ../../../../compat/linux/linux_futex.c, line 568

 Excellent.  The next question is whether that's the only bug that
 you're hitting, or if there's something else going on that should also
 be debugged.

 As for that particular failed assertion, it would be interesting to
 know what the actual value of wp->wp_new_futex was (if it was NULL,
 then I have a guess as to the bug; if it wasn't NULL, then uh, good
 luck!)

The fix for this has been committed, at least if it's the
wp->wp_new_futex==NULL case.


Philip Guenther



Re: snapshots total freeze

2013-01-02 Thread Gregory Edigarov

On 12/25/2012 07:05 PM, frantisek holop wrote:

hi there,

since a couple of snapshots back i can quite reliably
freeze my openbsd notebook simply by leaving it on
overnight.  the desktop is there, all the open windows
are there, but it has become a painting...
nothing in the logs, no panic, nothing.
anybody else is seeing something similar?

-f

Just my 0.5 cents...
I also had some sporadic hangs on various 5.2 snapshots some time ago on 
my home PC (not a notebook).
I found the reason once I opened the case: it was full of 
dust. I blew and vacuumed the dust out, and it now works fine.
I suspect the same cause applies to notebooks, if the notebook has 
vents (mine has).



--
With best regards,
 Gregory Edigarov



Re: snapshots total freeze

2012-12-28 Thread Joerg Goltermann

Hi,

On 25.12.2012 20:28, epsilon wrote:

On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:

since a couple of snapshots back i can quite reliably
freeze my openbsd notebook simply by leaving it on
overnight.  the desktop is there, all the open windows
are there, but it has become a painting...
nothing in the logs, no panic, nothing.
anybody else is seeing something similar?


Not really the same, but maybe comparable. I am unsure, but let's
see:

Since upgrade to 5.2 my gateway box freezes in about one out of four
times I boot it (it's switched off over night). It freezes somewhere
after starting network daemons and starting local daemons. I
tried to disable services I do not essentially need or to substitute
them with other solutions. So far no findings here.

But this box runs no X. I have connected a keyboard and a monitor and
I am able to switch between the virtual terminals but no reaction
there. If I simply hit return, nothing happens. No login possible.

ICMP pings are replied to, but I cannot SSH into the box. Connections are
NOT rejected, they just time out. Same with all other TCP connections.

After a while the fan accelerates. It looks like the CPU is working
very hard. Unfortunately this is really the only reaction this box
gives me. But better than nothing.



I can confirm the problem here. We run some 5.2 VMs on ESXi 5.1
and if the VM freezes the CPU runs on 100% but the system is
completely frozen. No disk IO, no keyboard, but kernel networking
seems to be ok.

We hit this problem on a physical server after upgrading to 5.2 too.

Sometimes the problem occurs during boot as described above, but
most of the freezes are seen after a short uptime on systems with
heavy disk IO.

We had two VMs which crashed every 2 - 4 days after upgrading to 5.2.
After a long time of investigation I had the assumption that a problem
with the UVM / FFS softdep bufs causes the freeze. I saw a high,
increasing number of busymap bufs in systat. After a simple sync the
number drops down to 0 - 2 and increases again ...

For the past 9 days I have been running sync every 5 minutes, and both
systems have *not* frozen again.
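
The periodic sync described above could be driven from cron rather than by hand. A minimal sketch, assuming root's crontab and that sync lives at /bin/sync; the helper name sync_cron_line is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: print a crontab entry that runs sync every N minutes.
# The /bin/sync path is an assumption; check it on your system.
sync_cron_line() {
    # $1 = interval in minutes
    printf '*/%s * * * * /bin/sync\n' "$1"
}

# Usage (as root), appending to the existing crontab:
#   (crontab -l; sync_cron_line 5) | crontab -
```

This only works around the symptom (the growing number of busy bufs); it does not explain why they accumulate.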

 - Joerg


--
OSN Online Service Nuernberg GmbH, Bucher Str. 78, 90408 Nuernberg
Tel: +49 911 39905-0 - Fax: +49 911 39905-55 - http://www.osn.de
HRB 15022 Nuernberg, USt-Id: DE189301263, GF: Joerg Goltermann



Re: snapshots total freeze

2012-12-28 Thread André Stöbe
Joerg Goltermann wrote:
 I can confirm the problem here. We run some 5.2 VMs on ESXi 5.1
 and if the VM freezes the CPU runs on 100% but the system is
 completely frozen. No disk IO, no keyboard, but kernel networking
 seems to be ok.
 
 We had two VMs which crashed every 2 - 4 days after upgrading to 5.2.
 After a long time of investigation I had the assumption that a problem
 with the UVM / FFS softdep bufs causes the freeze. I saw a high,
 increasing number of busymap bufs in systat. After a simple sync the
 number drops down to 0 - 2 and increases again ...

We're running a 5.2 VM on ESXi 5.0 and experiencing somewhat similar problems 
since upgrading from 5.1. No total freezes, the keyboard is still working, but 
a reboot is needed to get the machine back in a usable state.

I noticed the problems when the network wasn't working anymore. After bringing 
the interfaces down/up and flushing the routes, it looked good again, but I then 
noticed that the clock had hung completely. ntpd -s wouldn't work and the clock 
didn't advance any further either. It was just stuck. No log or kernel messages, 
unfortunately. The problem has happened twice since upgrading in mid-November, 
roughly every 2 weeks.

We have softdep enabled too, I'll check the systat output when I'm back at work 
and let you know if I see the same.

Regards
Andre



Re: snapshots total freeze

2012-12-28 Thread epsilon
Hi,

On Fri, Dec 28, 2012 at 12:01:37PM +0100, Joerg Goltermann wrote:
 ...
 We hit this problem on a physical server after upgrading to 5.2 too.
 ...
 For the past 9 days I have been running sync every 5 minutes, and both
 systems have *not* frozen again.

Thanks for the hint. I will cronjob this.

 - Eps



Re: snapshots total freeze (linux emulation)

2012-12-28 Thread frantisek holop
hmm, on Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop said that
 since a couple of snapshots back i can quite reliably
 freeze my openbsd notebook simply by leaving it on
 overnight.  the desktop is there, all the open windows
 are there, but it has become a painting...
 nothing in the logs, no panic, nothing.

so let me correct that: no visible panic..

i stayed up overnight as well for a change,
and it froze/panicked right in front of my eyes.

i started a 'boot crash', but i am not sure
if it was finished correctly, it was sitting there
forever with the disk led on, so in the end
i just power cycled it.  savecore came on
and i have in the logs:

Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic assertion wp->wp_new_futex == f failed: file ../../../../compat/linux/linux_futex.c, line 568
Dec 28 00:25:25 amaaq savecore: /var/crash/bounds: No such file or directory
Dec 28 00:25:25 amaaq savecore: writing core to /var/crash/bsd.0.core
Dec 28 00:26:10 amaaq savecore: writing kernel to /var/crash/bsd.0

as the only program i run in linux emulation is opera...

thanks for all the tips how to catch this.

-f
-- 
i'm not nearly as think as you confused i am.



Re: snapshots total freeze (linux emulation)

2012-12-28 Thread Philip Guenther
On Fri, Dec 28, 2012 at 8:57 AM, frantisek holop min...@obiit.org wrote:
...
 i started a 'boot crash', but i am not sure
 if it was finished correctly, it was sitting there
 forever with the disk led on, so in the end
 i just power cycled it.

It depends on how much memory you have and how fast your disk is.
With 4GB of mem on my laptop, it feels like it takes 5 minutes, though
I haven't actually timed it.  If I'm *trying* to get a crash dump (as
part of debugging something particularly complicated, say), then I'll
usually tell boot to lie and only use 128MB, via "machine memory
=128M", before "boot bsd.test".

(Note: if you do that, you must *also* do that when booting after the
crash dump, so that savecore can find the dump in the swap partition.)
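
As a rough illustration of why limiting memory helps: dump time scales with memory size divided by disk write speed. A back-of-the-envelope sketch, where the 16 MB/s write speed is purely an assumed figure and dump_seconds is a made-up helper name:

```shell
#!/bin/sh
# Rough dump-time estimate: memory size divided by sustained write speed.
# Ignores seek and setup overhead; integer arithmetic only.
dump_seconds() {
    # $1 = memory in MB, $2 = disk write speed in MB/s (assumed, not measured)
    echo $(( $1 / $2 ))
}

dump_seconds 4096 16   # full 4GB of memory
dump_seconds 128 16    # boot limited to 128MB
```

With those assumed numbers the full dump takes a few minutes, while the 128MB dump finishes in seconds.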


 savecore came on and i have in the logs:

 Dec 28 00:25:25 amaaq savecore: reboot after panic: kernel diagnostic 
 assertion wp->wp_new_futex == f failed: file 
 ../../../../compat/linux/linux_futex.c, line 568

Excellent.  The next question is whether that's the only bug that
you're hitting, or if there's something else going on that should also
be debugged.

As for that particular failed assertion, it would be interesting to
know what the actual value of wp->wp_new_futex was (if it was NULL,
then I have a guess as to the bug; if it wasn't NULL, then uh, good
luck!)


Philip Guenther



Re: snapshots total freeze

2012-12-27 Thread Arnaud Bergeron
This makes me think about a problem I had with an Eeepc a while back.  I
never reported it because I tracked that down to a faulty SSD that would
just hang the machine when accessing certain sectors.

The way I verified this was to run dd if=/dev/rwd0c of=/dev/null bs=1m
(with the appropriate device node for your drive) and see if it completes.
Do this for all your mounted drives and you may find the culprit.
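
The read test above can be looped over every disk that backs a mounted filesystem. A minimal sketch, assuming OpenBSD-style device names where the raw whole-disk node is /dev/r<disk>c; the function names raw_whole_disk and read_test_all are made up for illustration:

```shell
#!/bin/sh
# Map a mounted partition such as /dev/wd0a to its raw whole-disk node
# (/dev/rwd0c), following the rwd0c naming used in the dd command above.
raw_whole_disk() {
    name=${1#/dev/}        # e.g. wd0a
    disk=${name%[a-p]}     # strip the partition letter, e.g. wd0
    printf '/dev/r%sc\n' "$disk"
}

# Read each backing disk end to end, once per disk.
# A hang or an I/O error here points at the drive.
read_test_all() {
    mount | awk '$1 ~ /^\/dev\// { print $1 }' | while read -r part; do
        raw_whole_disk "$part"
    done | sort -u | while read -r raw; do
        echo "reading $raw"
        dd if="$raw" of=/dev/null bs=1m || echo "FAILED: $raw"
    done
}

# Usage (as root): read_test_all
```

Reading the raw device bypasses the filesystem, so a clean run says nothing about filesystem damage, only about whether every sector can be read.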


2012/12/25 Philip Guenther guent...@gmail.com

 On Tue, Dec 25, 2012 at 2:57 PM, frantisek holop min...@obiit.org wrote:
  hmm, on Tue, Dec 25, 2012 at 11:31:43PM +0100, Marc Espie said that
  On Tue, Dec 25, 2012 at 11:23:06PM +0100, frantisek holop wrote:
   (difficult to believe no people see this, every notebook
   i had since 2008 could not shutdown cleanly 50-70%
   of the time)
 
  I don't know what you do with your machines, or what specific hw you
  have that causes this. My machines shutdown gracefully most of the time.

 I'll second that.  When mine hang it's because I have a bug in
 whatever diff I'm working on...


  (the dmesgs of all my current and previous notebooks can be
  found in the misc archive).  thinkpad, ideapad, eeepc,
  all of them showed this behaviour.  i use these machines
  for simple daily use.  browsing, some development work,
  etc.  as every day as it gets for unix users.

 So start eliminating differences.  Does it hang over night if you
 never login?  If not, then it's something you're running that does it.
  So login and logout and see whether it's some daemon started by your
 .xsession (dbus?  that gkrellm thing that caused a 1+ load?) that
 causes it.  If it does hang even if you don't login, then start
 checking off the system daemons.  What if X is never started?  Heck,
 boot to single user and leave it there over night.

 You were seeing ACPI taking more and more memory before; is that still
 happening?  Is there a correlation between that and the hangs?

 Make a hypothesis (it's caused by something in my .xsession), come
 up with a way to test it (see if it happens if I don't login) then
 do so.  You have a problem; do SCIENCE on it.


  there are normally 2 ways of unclean shutdown:
  either syncing now and never done, or
  (on this ideapad) simply black screen, presumably
  still part of X, and never going to the console.

  i made this script to minimize damage:
 
  $ cat bin/ha.sh
  #!/bin/sh
 
  sudo sync
  sudo mount -u -r /data
  sudo halt -p

 Does the hang in sync happen if you're not running any processes as
 you?  Only happens if certain filesystems are mounted?  What are the
 prerequisites on it hanging?

 Since this doesn't seem to affect developers, you're going to have to
 do the science.


 Philip Guenther




-- 
La brigade SnW veut vous recruter - http://www.brigadesnw.ca



Re: snapshots total freeze

2012-12-25 Thread Stefan Sperling
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
 hi there,
 
 since a couple of snapshots back i can quite reliably
 freeze my openbsd notebook simply by leaving it on
 overnight.  the desktop is there, all the open windows
 are there, but it has become a painting...
 nothing in the logs, no panic, nothing.

Sounds like you're in X while the kernel panics. The system is probably
in ddb but you cannot see the ddb prompt since you're in X.

Can you leave the screen switched to the console (Ctrl+Alt+F1)
over night and check whether a ddb prompt shows up in the morning?



Re: snapshots total freeze

2012-12-25 Thread Jérémie Courrèges-Anglas
Stefan Sperling s...@openbsd.org writes:

 On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
 hi there,
 
 since a couple of snapshots back i can quite reliably
 freeze my openbsd notebook simply by leaving it on
 overnight.  the desktop is there, all the open windows
 are there, but it has become a painting...
 nothing in the logs, no panic, nothing.

 Sounds like you're in X while the kernel panics. The system is probably
 in ddb but you cannot see the ddb prompt since you're in X.

 Can you leave the screen switched to the console (Ctrl+Alt+F1)
 over night and check whether a ddb prompt shows up in the morning?

Tip: if your system only freezes while you have X on screen, you can
generally go back to ttyC0 and even if the display isn't refreshed,
blindly type at the ddb prompt.  In that case, crash(8) will help.

-- 
Jérémie Courrèges-Anglas
GPG Key fingerprint: 61DB D9A0 00A4 67CF 2A90  8961 6191 8FBF 06A1 1494



Re: snapshots total freeze

2012-12-25 Thread Sebastian Neuper
On Tue, 25 Dec 2012 18:17:27 +0100
Stefan Sperling s...@openbsd.org wrote:

 On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
  hi there,
  
  since a couple of snapshots back i can quite reliably
  freeze my openbsd notebook simply by leaving it on
  overnight.  the desktop is there, all the open windows
  are there, but it has become a painting...
  nothing in the logs, no panic, nothing.
 
 Sounds like you're in X while the kernel panics. The system is probably
 in ddb but you cannot see the ddb prompt since you're in X.
 
 Can you leave the screen switched to the console (Ctrl+Alt+F1)
 over night and check whether a ddb prompt shows up in the morning?
 

If you also have an NTFS partition mounted, then just try to
switch to the console and simply run
# ksh /etc/daily 
as root, and see whether it exits normally.

I posted a similar error a couple of days ago, 
because I got a frozen X every night at 0:30am.

The `boot dump` command and crash(8) will help, as
mentioned, if the kernel panics.

And I would just umount any ntfs partition overnight.

Best, Sebastian.


-- 
Sebastian Neuper pha...@gmx.de



Re: snapshots total freeze

2012-12-25 Thread epsilon
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
 since a couple of snapshots back i can quite reliably
 freeze my openbsd notebook simply by leaving it on
 overnight.  the desktop is there, all the open windows
 are there, but it has become a painting...
 nothing in the logs, no panic, nothing.
 anybody else is seeing something similar?

Not really the same, but maybe comparable. I am unsure, but let's
see:

Since upgrade to 5.2 my gateway box freezes in about one out of four
times I boot it (it's switched off over night). It freezes somewhere
after starting network daemons and starting local daemons. I
tried to disable services I do not essentially need or to substitute
them with other solutions. So far no findings here.

But this box runs no X. I have connected a keyboard and a monitor and
I am able to switch between the virtual terminals but no reaction
there. If I simply hit return, nothing happens. No login possible.

ICMP pings are replied to, but I cannot SSH into the box. Connections are
NOT rejected, they just time out. Same with all other TCP connections.

After a while the fan accelerates. It looks like the CPU is working
very hard. Unfortunately this is really the only reaction this box
gives me. But better than nothing.

 nothing in the logs, no panic, nothing.

Yes! Even the named startup logging is missing from /var/log/messages.
The freeze always appears somewhere after named starts (see above).
It looks like syslogd did not have the time to write the file.
The last thing I got in /var/log/messages is:

... /bsd: root on wd0a ...

After rebooting the hard way, the only thing I got are sometimes (not
always) /var/lost+found/* files (/var is a separate partition).

For some weeks I used more recent hardware. The only difference was:
the fan is louder. So I am sticking with this one to minimize the harm
done to me (OK - it has other NICs, and a SATA drive).

Maybe I misinterpret things, but for me it looks like the kernel is
still running, but all userland activities are completely
dead/blocked/locked/looping/whatever.

IPv6 is disabled on all NICs. Just saying ...

Greetings
  E.

$ cat rc.conf.local
sendmail_flags=-L sm-mta -C/etc/mail/sendmail.cf -bd -q30m
named_flags=
httpd_flags=-DSSL -u
ftpproxy_flags=
tftpd_flags=-4 -l xx.xx.xx.xx /tftpboot
ifstated_flags=
dhcpd_flags=xl0

$ dmesg
OpenBSD 5.2 (GENERIC) #278: Wed Aug  1 10:04:16 MDT 2012
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel Pentium III (GenuineIntel 686-class) 732 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PSE36,MMX,FXSR,SSE
real mem  = 266727424 (254MB)
avail mem = 251506688 (239MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 08/25/00, BIOS32 rev. 0 @ 0xe7300, SMBIOS rev. 2.3 @ 0xf8dc6 (47 entries)
bios0: vendor Compaq version 686P2 v2.04 date 08/25/2000
bios0: Compaq Deskpro
apm0 at bios0: Power Management spec V1.2
acpi at bios0 function 0x0 not configured
pcibios0 at bios0: rev 2.1 @ 0xe7300/0x8d00
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xf6260/208 (11 entries)
pcibios0: PCI Interrupt Router at 000:31:0 (Intel 82801BA LPC rev 0x00)
pcibios0: PCI bus #2 is the last bus
bios0: ROM list: 0xc/0xa000 0xca000/0x800 0xca800/0xd800! 0xe/0x1!
cpu0 at mainbus0: (uniprocessor)
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 Intel 82815 Host rev 0x02
vga1 at pci0 dev 2 function 0 Intel 82815 Video rev 0x02
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
intagp0 at vga1
agp0 at intagp0: aperture at 0x4400, size 0x400
ppb0 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0x02
pci1 at ppb0 bus 2
rl0 at pci1 dev 4 function 0 Realtek 8139 rev 0x10: irq 5, address 00:08:a1:57:08:83
rlphy0 at rl0 phy 0: RTL internal PHY
fxp0 at pci1 dev 8 function 0 Intel 82562 rev 0x01, i82562: irq 10, address 00:02:a5:2b:0f:43
inphy0 at fxp0 phy 1: i82562EM 10/100 PHY, rev. 0
xl0 at pci1 dev 9 function 0 3Com 3c905C 100Base-TX rev 0x78: irq 11, address 00:04:76:26:b5:0f
exphy0 at xl0 phy 24: 3Com internal media interface
ichpcib0 at pci0 dev 31 function 0 Intel 82801BA LPC rev 0x02: 24-bit timer at 3579545Hz
pciide0 at pci0 dev 31 function 1 Intel 82801BA IDE rev 0x02: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
atapiscsi0 at pciide0 channel 0 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: COMPAQ, CD-ROM CR-594, YC0G ATAPI 5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, DMA mode 2
wd0 at pciide0 channel 1 drive 0: MAXTOR STM380215A
wd0: 16-sector PIO, LBA48, 76319MB, 156301488 sectors
wd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 5
uhci0 at pci0 dev 31 function 4 Intel 82801BA USB rev 0x02: irq 10
auich0 at pci0 dev 31 function 5 Intel 82801BA AC97 rev 0x02: irq 5, ICH2 AC97
ac97: codec id 0x41445360 (Analog Devices AD1885)
ac97: codec features headphone, Analog Devices Phat 

Re: snapshots total freeze

2012-12-25 Thread Philip Guenther
On Tue, Dec 25, 2012 at 9:05 AM, frantisek holop min...@obiit.org wrote:
 since a couple of snapshots back i can quite reliably
 freeze my openbsd notebook simply by leaving it on
 overnight.  the desktop is there, all the open windows
 are there, but it has become a painting...
 nothing in the logs, no panic, nothing.
 anybody else is seeing something similar?

I'll second the "try doing ctrl-alt-F1 followed by 'boot crash'"
recommendation.  (If you don't have swap and /var space for a crash
dump, then at least try blind typing 'trace' and 'ps' and then 'boot
reboot'; if that can actually reboot, then you'll see the panic
message, back trace, and ps info in the output of dmesg.)

The other thought is that you describe this as a change from before;
what was the last snapshot you installed that you can be sure *did
not* exhibit this behavior?  e.g., did the machine last overnight when
running 5.1?  5.2-release?  An October snapshot?


Philip Guenther



Re: snapshots total freeze

2012-12-25 Thread Chris Bennett
On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
 hi there,
 
 since a couple of snapshots back i can quite reliably
 freeze my openbsd notebook simply by leaving it on
 overnight.  the desktop is there, all the open windows
 are there, but it has become a painting...
 nothing in the logs, no panic, nothing.
 anybody else is seeing something similar?
 
 -f
 

My notebook would freeze up regularly, especially after I had
opened enough applications.

This was somewhat annoying. :)

Turned out that my hard drive was at the edge of failure.
I am guessing that needing to swap to the hard drive after free
memory filled up caused the failures. I could be wrong.

I suggest that you carefully check the hard drive and memory.
Might not have anything to do with the snapshot.

Never rule out hardware.

Chris Bennett



Re: snapshots total freeze

2012-12-25 Thread frantisek holop
hmm, on Tue, Dec 25, 2012 at 06:17:27PM +0100, Stefan Sperling said that
 On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
  hi there,
  
  since a couple of snapshots back i can quite reliably
  freeze my openbsd notebook simply by leaving it on
  overnight.  the desktop is there, all the open windows
  are there, but it has become a painting...
  nothing in the logs, no panic, nothing.
 
 Sounds like you're in X while the kernel panics. The system is probably
 in ddb but you cannot see the ddb prompt since you're in X.
 
 Can you leave the screen switched to the console (Ctrl+Alt+F1)
 over night and check whether a ddb prompt shows up in the morning?

sure, i will try.

-f
-- 
if you can't see black, white has no meaning



Re: snapshots total freeze

2012-12-25 Thread frantisek holop
hmm, on Tue, Dec 25, 2012 at 03:49:00PM -0600, Chris Bennett said that
 On Tue, Dec 25, 2012 at 06:05:10PM +0100, frantisek holop wrote:
  hi there,
  
  since a couple of snapshots back i can quite reliably
  freeze my openbsd notebook simply by leaving it on
  overnight.  the desktop is there, all the open windows
  are there, but it has become a painting...
  nothing in the logs, no panic, nothing.
  anybody else is seeing something similar?
  
  -f
  
 
 My notebook would freeze up regularly, especially after I had
 opened enough applications.
 
 This was somewhat annoying. :)
 
 Turned out that my hard drive was at the edge of failure.
 I am guessing that needing to swap to the hard drive after free
 memory filled up caused the failures. I could be wrong.

a dying disk tends to make quite a lot of noise in the logs..
i see no signs of that so far.  and i have lots of
fsck's because of unclean shutdowns. basically
nowadays every clean shutdown is cause for celebration.
(difficult to believe no people see this, every notebook
i had since 2008 could not shutdown cleanly 50-70%
of the time)

as to the memory, i have a usb stick with linux that has
memtest86 as boot option. i can leave that overnight as well.

hardware is always suspect, but it's so difficult to test...

-f
-- 
sorry guys, this is not heavy metal.



Re: snapshots total freeze

2012-12-25 Thread Marc Espie
On Tue, Dec 25, 2012 at 11:23:06PM +0100, frantisek holop wrote:
 (difficult to believe no people see this, every notebook
 i had since 2008 could not shutdown cleanly 50-70%
 of the time)

I don't know what you do with your machines, or what specific hw you
have that causes this. My machines shutdown gracefully most of the time.



Re: snapshots total freeze

2012-12-25 Thread frantisek holop
hmm, on Tue, Dec 25, 2012 at 01:23:41PM -0800, Philip Guenther said that
 On Tue, Dec 25, 2012 at 9:05 AM, frantisek holop min...@obiit.org wrote:
  since a couple of snapshots back i can quite reliably
  freeze my openbsd notebook simply by leaving it on
  overnight.  the desktop is there, all the open windows
  are there, but it has become a painting...
  nothing in the logs, no panic, nothing.
  anybody else is seeing something similar?
 
 I'll second the "try doing ctrl-alt-F1 followed by 'boot crash'"
 recommendation.  (If you don't have swap and /var space for a crash
 dump, then at least try blind typing 'trace' and 'ps' and then 'boot
 reboot'; if that can actually reboot, then you'll see the panic
 message, back trace, and ps info in the output of dmesg.)

i always try these.  no results.
keyboard seems to be totally dead (i also have
machdep.kbdreset=1 and it has never worked in
case of a freeze).

 The other thought is that you describe this as a change from before;
 what was the last snapshot you installed that you can be sure *did
 not* exhibit this behavior?  e.g., did the machine last overnight when
 running 5.1?  5.2-release?  An October snapshot?

leaving my notebook on overnight did not result in a hard freeze before.
it is basically idling, not even connected to the network.  a couple
of programs left running as they were.  basically a poor man's
hibernate as this notebook cannot wake up after zzz..

to define that "before", i am sorry, i cannot.  i remember
in november a distinct freeze when i left the computer alone
only for a couple of hours.  coming back to it, it was totally
unresponsive and i could only turn it off.  this time i said
overnight but of course the freeze could have happened
anytime before i woke up.  i just wrote a heartbeat script
that spams syslog...
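
(the heartbeat script itself was not posted.  a minimal sketch of
what such a script might look like, assuming logger(1) is used to
write to syslog; the function name, the "heartbeat" tag, and the
interval/count/logger arguments are all illustrative, not the
original script.)

```shell
#!/bin/sh
# Hypothetical heartbeat: write a timestamped line to syslog at a fixed
# interval, so the last entry in the log brackets the time of a freeze.
# usage: heartbeat [interval_seconds] [count] [log_command]
heartbeat() {
	hb_interval=${1:-60}   # seconds between entries
	hb_count=${2:-0}       # 0 means run until killed
	hb_log=${3:-logger}    # swappable so the loop can be tried without syslog
	hb_i=0
	while [ "$hb_count" -eq 0 ] || [ "$hb_i" -lt "$hb_count" ]; do
		"$hb_log" -t heartbeat "alive $(date '+%Y-%m-%d %H:%M:%S')"
		hb_i=$((hb_i + 1))
		sleep "$hb_interval"
	done
}
```

(started with something like "heartbeat 60 &" before leaving the
machine alone; after the forced power-off, the last "heartbeat"
line in the log gives a rough time of the freeze.)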

i have also stopped using systat entirely as it could
lock up each and every snapshot for the last year.
starting it specifically outside X in the console
would lock up the machine so bad only turning off helped.
no ddb, no ctrl-alt-del, just the blinking cursor behind
systat.  so frustrating not to be able to trace down
any of this. apparently it is not happening to others either.
so yes, too bad.

-f
-- 
no sense being pessimistic.  it wouldn't work anyway.



Re: snapshots total freeze

2012-12-25 Thread frantisek holop
hmm, on Tue, Dec 25, 2012 at 11:31:43PM +0100, Marc Espie said that
 On Tue, Dec 25, 2012 at 11:23:06PM +0100, frantisek holop wrote:
  (difficult to believe no people see this, every notebook
  i had since 2008 could not shutdown cleanly 50-70%
  of the time)
 
 I don't know what you do with your machines, or what specific hw you
 have that causes this. My machines shutdown gracefully most of the time.

(the dmesgs of all my current and previous notebooks can be
found in the misc archive).  thinkpad, ideapad, eeepc,
all of them showed this behaviour.  i use these machines
for simple daily use.  browsing, some development work,
etc.  as every day as it gets for unix users.

there are normally 2 ways of unclean shutdown:
either syncing now and never done, or
(on this ideapad) simply black screen, presumably
still part of X, and never going to the console.

i made this script to minimize damage:

$ cat bin/ha.sh
#!/bin/sh

sudo sync
sudo mount -u -r /data
sudo halt -p



-f
-- 
runtime error.  should i walk? (y/n)



Re: snapshots total freeze

2012-12-25 Thread Philip Guenther
On Tue, Dec 25, 2012 at 2:57 PM, frantisek holop min...@obiit.org wrote:
 hmm, on Tue, Dec 25, 2012 at 11:31:43PM +0100, Marc Espie said that
 On Tue, Dec 25, 2012 at 11:23:06PM +0100, frantisek holop wrote:
  (difficult to believe no people see this, every notebook
  i had since 2008 could not shutdown cleanly 50-70%
  of the time)

 I don't know what you do with your machines, or what specific hw you
 have that causes this. My machines shutdown gracefully most of the time.

I'll second that.  When mine hang it's because I have a bug in
whatever diff I'm working on...


 (the dmesgs of all my current and previous notebooks can be
 found in the misc archive).  thinkpad, ideapad, eeepc,
 all of them showed this behaviour.  i use these machines
 for simple daily use.  browsing, some development work,
 etc.  as every day as it gets for unix users.

So start eliminating differences.  Does it hang over night if you
never login?  If not, then it's something you're running that does it.
So log in and log out and see whether it's some daemon started by your
.xsession (dbus?  that gkrellm thing that caused a 1+ load?) that
causes it.  If it does hang even if you don't login, then start
checking off the system daemons.  What if X is never started?  Heck,
boot to single user and leave it there over night.

You were seeing ACPI taking more and more memory before; is that still
happening?  Is there a correlation between that and the hangs?
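
One way to look for such a correlation is to record periodic samples
to a file that survives the freeze.  A minimal sketch follows; the
function name, log path, and sampled command are illustrative
assumptions, not something from this thread.

```shell
#!/bin/sh
# Hypothetical sampler: append one timestamped sample of a command's
# output to a log file, then sync so the sample is on disk before any
# later hard freeze.
# usage: sample logfile command [args...]
sample() {
	sample_log=$1
	shift
	{
		printf '== %s ==\n' "$(date '+%Y-%m-%d %H:%M:%S')"
		"$@" 2>&1
	} >> "$sample_log"
	sync
}
```

For example, assuming the ACPI allocations show up as a line in the
output of vmstat -m (an assumption; check the output on your machine):

  while :; do sample /var/tmp/acpi.log sh -c 'vmstat -m | grep -i acpi'; sleep 300; done

Comparing the last samples before each hang shows whether the number
was climbing.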

Make a hypothesis ("it's caused by something in my .xsession"), come
up with a way to test it ("see if it happens if I don't log in"), then
do so.  You have a problem; do SCIENCE on it.


 there are normally 2 ways of unclean shutdown:
 either syncing now and never done, or
 (on this ideapad) simply black screen, presumably
 still part of X, and never going to the console.

 i made this script to minimize damage:

 $ cat bin/ha.sh
 #!/bin/sh

 sudo sync
 sudo mount -u -r /data
 sudo halt -p

Does the hang in sync happen if you're not running any processes as
you?  Only happens if certain filesystems are mounted?  What are the
prerequisites on it hanging?

Since this doesn't seem to affect developers, you're going to have to
do the science.


Philip Guenther