Re: Severe problem with amd64 -current as of June 13

2013-06-16 Thread STeve Andre'

On 06/16/13 01:05, Philip Guenther wrote:

> On Sat, Jun 15, 2013 at 7:19 PM, STeve Andre' and...@msu.edu wrote:
>> On June 13 I updated my tree from anoncvs.usa.openbsd.org, and
>> the new -current failed shortly after running it.
>>
>> Things would get *very* slow, with continuous disk activity.  It was
>> sort of possible to switch between screens, albeit after a few minutes.
>> Processes would be shown as in a disk wait (D).  Eventually the system
>> freezes completely with the continuous disk activity as evidenced by
>> my disk LED.
>
> Hmm, once it gets into this state, can you go to the console, then
> break into ddb (you'll need to add ddb.console=1 to your
> /etc/sysctl.conf and reboot for that to work), and then do show uvm
> and report the results?
>
> Philip Guenther

I will try that, but I doubt I can get to a console.  It's very quick
from slow to hung.  But I'll umount everything I can and see
what transpires.  Thanks.

--STeve Andre'



Re: Severe problem with amd64 -current as of June 13

2013-06-16 Thread STeve Andre'

On 06/16/13 01:05, Philip Guenther wrote:

> On Sat, Jun 15, 2013 at 7:19 PM, STeve Andre' and...@msu.edu wrote:
>> On June 13 I updated my tree from anoncvs.usa.openbsd.org, and
>> the new -current failed shortly after running it.
>>
>> Things would get *very* slow, with continuous disk activity.  It was
>> sort of possible to switch between screens, albeit after a few minutes.
>> Processes would be shown as in a disk wait (D).  Eventually the system
>> freezes completely with the continuous disk activity as evidenced by
>> my disk LED.
>
> Hmm, once it gets into this state, can you go to the console, then
> break into ddb (you'll need to add ddb.console=1 to your
> /etc/sysctl.conf and reboot for that to work), and then do show uvm
> and report the results?
>
> Philip Guenther

Well, that worked.  As soon as it got slow I flipped to the console and got:

current uvm status:
pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
2020005 vm pages: 444603active, 5144inactive, -29733wired, 1340021free
min 10% (25) anon, 10% (25) vnode, 5% (12) vtext
pages0anon, 0vnode, 0vtext
freemin=67333, free-target=89777, inactive-target=0, wired-max=673335
faults=1649483, traps=1771874, intrs=88809, ctxswitch=8541333 fpuswitch=6
softint=200628, syscalls=36792232, kmapent=13
faultcounts:
noram=0, noanon=0 pgwait=0, pgrele=0
ok relocks(total)=52922(53361), anget(retries)=179277(0), amapcopy=208087
neighbor anon/obj pg=20425/253589, gets(lock/unlock)=155928/53361
cases: anon=160496, anoncow=18781, obj=131739, prcopy=23750, przero=1309640
daemon and swap counts: woke=0, revs=0, scans=0, obscans=0, anscans=0
busy=0, freed=0, reactivate=0, deactivate=0
pageouts=0, pending=0, nswget=0
nswapdev=1, nanon=0, nanonneeded=0, nfreeanon=0
swpages=1050240, swpginuse=0, swpgonly=0, paging=0
kernelpointers: objs(kern)=0x81c96c00

--STeve
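
One way to sanity-check a "show uvm" dump like the one above is to parse the page-count line and look for counters that cannot be real quantities. The following is a minimal sketch, not anything posted in the thread: the function names are hypothetical, the input line is copied verbatim from the output above, and the only check made is for negative values (it does not assume the four counters must add up to the total).

```python
import re

# Page-count line copied verbatim from the "show uvm" output above.
UVM_LINE = "2020005 vm pages: 444603active, 5144inactive, -29733wired, 1340021free"
PAGE_SIZE = 4096  # "pagesize=4096" from the same output


def parse_page_counts(line):
    """Return (total, counts) parsed from a 'show uvm' page-count line."""
    head = re.match(r"\s*(-?\d+)\s+vm pages:", line, re.IGNORECASE)
    if head is None:
        raise ValueError("not a 'vm pages' line: %r" % line)
    # Accept both "444603active" and "444603 active".
    pairs = re.findall(r"(-?\d+)\s*(active|inactive|wired|free)\b", line)
    counts = {name: int(value) for value, name in pairs}
    return int(head.group(1)), counts


def negative_counters(counts):
    """Return only the counters whose value is below zero."""
    return {name: n for name, n in counts.items() if n < 0}


if __name__ == "__main__":
    total, counts = parse_page_counts(UVM_LINE)
    print("total pages:", total)
    for name, n in counts.items():
        print("%8s: %8d pages (%.0f MiB)" % (name, n, n * PAGE_SIZE / 2**20))
    print("negative counters:", negative_counters(counts))
```

For the line above, this flags only `wired`. A page count below zero presumably means the counter underflowed somewhere, which may be worth mentioning when reporting the problem; the free count, at 4096 bytes per page, works out to roughly 5.1 GiB.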



Severe problem with amd64 -current as of June 13

2013-06-15 Thread STeve Andre'

   amd64-current seems rather wounded at the moment.  I've been
running -current since June 5th with no problems.  This is a W500
thinkpad.

   On June 13 I updated my tree from anoncvs.usa.openbsd.org, and
the new -current failed shortly after running it.

   Things would get *very* slow, with continuous disk activity.  It was
sort of possible to switch between screens, albeit after a few minutes.
Processes would be shown as in a disk wait (D).  Eventually the system
freezes completely with the continuous disk activity as evidenced by
my disk LED.

   Going back to the June 5th kernel with the June 13th userland seems
to be working.  Looking at the FAQ there are things for -current which
had no relevance to this problem.  I then got a new copy of src and
xenocara with the same results.

   Seeing memmove and similar changes, and a fix for at least one arch
makes me wonder if amd64 has a problem.

   Suggestions?

---dmesg of the 'good' kernel of June 5th
Jun 15 00:28:58 paladin /bsd: OpenBSD 5.3-current (GENERIC.MP) #0: Wed Jun  5 21:33:29 EDT 2013
Jun 15 00:28:58 paladin /bsd: root@paladin:/usr/src/sys/arch/amd64/compile/GENERIC.MP
Jun 15 00:28:58 paladin /bsd: real mem = 8499433472 (8105MB)
Jun 15 00:28:58 paladin /bsd: avail mem = 8265461760 (7882MB)
Jun 15 00:28:58 paladin /bsd: mainbus0 at root
Jun 15 00:28:58 paladin /bsd: bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xe0010 (80 entries)
Jun 15 00:28:58 paladin /bsd: bios0: vendor LENOVO version 6FET92WW (3.22 ) date 12/14/2011
Jun 15 00:28:58 paladin /bsd: bios0: LENOVO 4061CTO
Jun 15 00:28:58 paladin /bsd: acpi0 at bios0: rev 2
Jun 15 00:28:58 paladin /bsd: acpi0: sleep states S0 S3 S4 S5
Jun 15 00:28:58 paladin /bsd: acpi0: tables DSDT FACP SSDT ECDT APIC MCFG HPET SLIC BOOT ASF! SSDT TCPA DMAR SSDT SSDT SSDT
Jun 15 00:28:58 paladin /bsd: acpi0: wakeup devices LID_(S3) SLPB(S3) UART(S3) IGBE(S4) EXP0(S4) EXP1(S4) EXP2(S4) EXP3(S4) EXP4(S4) PCI1(S4) USB0(S3) USB3(S3) USB5(S3) EHC0(S3) EHC1(S3) HDEF(S4)
Jun 15 00:28:58 paladin /bsd: acpitimer0 at acpi0: 3579545 Hz, 24 bits
Jun 15 00:28:58 paladin /bsd: acpiec0 at acpi0
Jun 15 00:28:58 paladin /bsd: acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
Jun 15 00:28:58 paladin /bsd: cpu0 at mainbus0: apid 0 (boot processor)
Jun 15 00:28:58 paladin /bsd: cpu0: Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz, 2793.45 MHz
Jun 15 00:28:58 paladin /bsd: cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,NXE,LONG,LAHF,PERF
Jun 15 00:28:58 paladin /bsd: cpu0: 6MB 64b/line 16-way L2 cache
Jun 15 00:28:58 paladin /bsd: cpu0: smt 0, core 0, package 0
Jun 15 00:28:58 paladin /bsd: cpu0: apic clock running at 266MHz
Jun 15 00:28:58 paladin /bsd: cpu1 at mainbus0: apid 1 (application processor)
Jun 15 00:28:58 paladin /bsd: cpu1: Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz, 2793.00 MHz
Jun 15 00:28:58 paladin /bsd: cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,NXE,LONG,LAHF,PERF
Jun 15 00:28:58 paladin /bsd: cpu1: 6MB 64b/line 16-way L2 cache
Jun 15 00:28:58 paladin /bsd: cpu1: smt 0, core 1, package 0
Jun 15 00:28:58 paladin /bsd: ioapic0 at mainbus0: apid 1 pa 0xfec0, version 20, 24 pins
Jun 15 00:28:58 paladin /bsd: ioapic0: misconfigured as apic 2, remapped to apid 1
Jun 15 00:28:58 paladin /bsd: acpimcfg0 at acpi0 addr 0xe000, bus 0-63
Jun 15 00:28:58 paladin /bsd: acpihpet0 at acpi0: 14318179 Hz
Jun 15 00:28:58 paladin /bsd: acpiprt0 at acpi0: bus 0 (PCI0)
Jun 15 00:28:58 paladin /bsd: acpiprt1 at acpi0: bus 1 (AGP_)
Jun 15 00:28:58 paladin /bsd: acpiprt2 at acpi0: bus 2 (EXP0)
Jun 15 00:28:58 paladin /bsd: acpiprt3 at acpi0: bus 3 (EXP1)
Jun 15 00:28:58 paladin /bsd: acpiprt4 at acpi0: bus -1 (EXP2)
Jun 15 00:28:58 paladin /bsd: acpiprt5 at acpi0: bus 5 (EXP3)
Jun 15 00:28:58 paladin /bsd: acpiprt6 at acpi0: bus 13 (EXP4)
Jun 15 00:28:58 paladin /bsd: acpiprt7 at acpi0: bus 21 (PCI1)
Jun 15 00:28:58 paladin /bsd: acpicpu0 at acpi0: C3, C2, C1, PSS
Jun 15 00:28:58 paladin /bsd: acpicpu1 at acpi0: C3, C2, C1, PSS
Jun 15 00:28:58 paladin /bsd: acpipwrres0 at acpi0: PUBS
Jun 15 00:28:58 paladin /bsd: acpitz0 at acpi0: critical temperature is 127 degC
Jun 15 00:28:58 paladin /bsd: acpitz1 at acpi0: critical temperature is 100 degC
Jun 15 00:28:58 paladin /bsd: acpibtn0 at acpi0: LID_
Jun 15 00:28:58 paladin /bsd: acpibtn1 at acpi0: SLPB
Jun 15 00:28:58 paladin /bsd: acpibat0 at acpi0: BAT0 model 42T4619 serial  5701 type LION oem SANYO
Jun 15 00:28:58 paladin /bsd: acpibat1 at acpi0: BAT1 not present
Jun 15 00:28:58 paladin /bsd: acpiac0 at acpi0: AC unit online
Jun 15 00:28:58 paladin /bsd: acpithinkpad0 at acpi0
Jun 15 00:28:58 paladin /bsd: acpidock0 at acpi0: GDCK not docked (0)
Jun 15 00:28:58 

Re: Severe problem with amd64 -current as of June 13

2013-06-15 Thread STeve Andre'
On 06/16/13 00:23, Amit Kulkarni wrote:

> On Sat, Jun 15, 2013 at 9:19 PM, STeve Andre' and...@msu.edu wrote:
>
>> amd64-current seems rather wounded at the moment.  I've been
>> running -current since June 5th with no problems.  This is a W500
>> thinkpad.
>>
>> On June 13 I updated my tree from anoncvs.usa.openbsd.org, and
>> the new -current failed shortly after running it.
>>
>> Things would get *very* slow, with continuous disk activity.  It was
>> sort of possible to switch between screens, albeit after a few minutes.
>> Processes would be shown as in a disk wait (D).  Eventually the system
>> freezes completely with the continuous disk activity as evidenced by
>> my disk LED.
>>
>> Going back to the June 5th kernel with the June 13th userland seems
>> to be working.  Looking at the FAQ there are things for -current which
>> had no relevance to this problem.  I then got a new copy of src and
>> xenocara with the same results.
>>
>> Seeing memmove and similar changes, and a fix for at least one arch
>> makes me wonder if amd64 has a problem.
>>
>> Suggestions?
>
> i just updated a desktop PC with the 15th June snap and things were
> working fine... and the June 10th packages.

Interesting.  I updated my tree this afternoon, and built a new kernel which
finally had the same problem.  It ran for a while fine, but then got very slow
and finally froze.  This was after 10 or 15 minutes, long enough for me to think
that things were OK.

Running on the June 5th kernel right now and all is well.

I'll try the June 10th snaps, but I think that's missing png?

--STeve Andre'



Re: Severe problem with amd64 -current as of June 13

2013-06-15 Thread Philip Guenther
On Sat, Jun 15, 2013 at 7:19 PM, STeve Andre' and...@msu.edu wrote:
> On June 13 I updated my tree from anoncvs.usa.openbsd.org, and
> the new -current failed shortly after running it.
>
> Things would get *very* slow, with continuous disk activity.  It was
> sort of possible to switch between screens, albeit after a few minutes.
> Processes would be shown as in a disk wait (D).  Eventually the system
> freezes completely with the continuous disk activity as evidenced by
> my disk LED.

Hmm, once it gets into this state, can you go to the console, then
break into ddb (you'll need to add ddb.console=1 to your
/etc/sysctl.conf and reboot for that to work), and then do show uvm
and report the results?


Philip Guenther
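
For reference, the steps described above can be sketched roughly as follows. This is an illustrative outline and not part of the original message: the Ctrl+Alt+Esc key sequence is the one ddb(4) documents for a pckbd(4) console and is an assumption about this particular setup (a serial console would send BREAK instead), and `continue` is included only to show how to leave the debugger if the machine is still usable afterwards.

```
# /etc/sysctl.conf: allow entering ddb from the console (applied at boot)
ddb.console=1

# After rebooting, once the slowdown starts, switch to the console,
# press Ctrl+Alt+Esc (assumed pckbd(4) console), then at the prompt:
ddb> show uvm
ddb> continue
```

The reboot matters because the setting is read from /etc/sysctl.conf during startup; on a running system at the default securelevel the value generally cannot be raised by hand.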