Re: bhyve tty / login problems / panic

2013-03-31 Thread Dan Mack


Here's the latest. Note that I didn't update my release.iso with the
latest changes; at this point I'm still running a slightly earlier
(cd /usr/src/release ; make release) copy from r248935 below. Also, the
kernel that the VM is running I actually built while in uni-processor
mode, which shows how well behaved it is when running non-SMP.

You can see my motherboard type and such from the previous host boot log
(Intel desktop board); let me know if you want any ACPI BIOS setting
details. The firmware from Intel is about a month old, so maybe it is
related.

Hope this helps. Let me know where to upload my release.iso and/or a small
disk image to boot, if needed.

Dan

root@olive:/vms # bhyvectl --vm=coco --get-stats --cpu=0 ; bhyvectl --vm=coco --get-stats --cpu=1
vcpu0
vm exits due to external interrupt  29118
number of times hlt was intercepted 3410
number of times %cr access was intercepted  1
number of times rdmsr was intercepted   0
number of times wrmsr was intercepted   4
number of monitor trap exits  0
number of times pause was intercepted   7967499
vm exits due to interrupt window opening  11701
vm exits due to nmi window opening  0
number of times in/out was intercepted  480563
number of times cpuid was intercepted   122
vm exits due to nested page fault   1490965
number of vm exits for unknown reason   0
number of times astpending at exit  282
number of vm exits handled in userspace 8451625
number of NMIs delivered to vcpu  0
vcpu total runtime  94121692198
number of ticks vcpu was idle   5750
timer interrupts generated by vlapic  9073
vcpu migration across host cpus 52
number of times hlt was ignored 0
total number of vm exits  9983383
vcpu1
vm exits due to external interrupt  73725
number of times hlt was intercepted 4995
number of times %cr access was intercepted  2
number of times rdmsr was intercepted   0
number of times wrmsr was intercepted   4
number of monitor trap exits  0
number of times pause was intercepted   7707742
vm exits due to interrupt window opening  45746
vm exits due to nmi window opening  0
number of times in/out was intercepted  79601
number of times cpuid was intercepted   105
vm exits due to nested page fault   384793
number of vm exits for unknown reason   0
number of times astpending at exit  263
number of vm exits handled in userspace 7792338
number of NMIs delivered to vcpu  0
vcpu total runtime  87399262934
number of ticks vcpu was idle   3940
timer interrupts generated by vlapic  10039
vcpu migration across host cpus 65
number of times hlt was ignored 0
total number of vm exits  8296713
root@olive:/vms # uname -a
FreeBSD olive.example.com 10.0-CURRENT FreeBSD 10.0-CURRENT #8 r248949: Sun Mar 
31 09:42:40 CDT 2013 r...@olive.example.com:/usr/obj/usr/src/sys/MACKGEN  
amd64

root@olive:/vms # top -bHI
last pid:  1417;  load averages:  1.96,  0.74,  0.35  up 0+00:26:00  10:38:13
23 processes:  3 running, 20 sleeping

Mem: 21M Active, 13M Inact, 8166M Wired, 1920K Cache, 54G Free
ARC: 832M Total, 310M MFU, 510M MRU, 8336K Anon, 2364K Header, 2125K Other
Swap:


  PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1390 root      103    0  4120M   139M CPU7    7   1:48 100.00% bhyve{coco vcpu 0}
 1390 root      102    0  4120M   139M CPU11   1   1:44 100.00% bhyve{coco vcpu 1}


And here is the VM boot sequence and panic again, which shows how fast the
clock is running (the crash only took about 2 minutes of wall time, if that):

FreeBSD/amd64 User boot, Revision 1.1
(r...@olive.example.com, Sun Mar 31 09:32:01 CDT 2013)
Loading /boot/defaults/loader.conf 
/boot/kernel/kernel text=0x5a2518 data=0x980e8+0xef8b0 syms=[0x8+0xca980+0x8+0x12bb86]

/
 [FreeBSD ASCII-art boot logo]
 +Welcome to FreeBSD---+
 |  1. Boot Multi User [Enter]       |
 |  2. Boot [S]ingle User            |
 |  3. [Esc]ape to loader prompt     |
 |

Re: bhyve tty / login problems / panic

2013-03-30 Thread Neel Natu
Hi Dan,

On Sat, Mar 30, 2013 at 12:17 PM, Dan Mack  wrote:

> Here you go:
>
>  - uni-processor output:
>
>
> root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=0
> vcpu0
> vm exits due to external interrupt  1330
> number of times hlt was intercepted 676
> number of times %cr access was intercepted  1
> number of times rdmsr was intercepted   0
> number of times wrmsr was intercepted   4
> number of monitor trap exits  0
> number of times pause was intercepted   428499
> vm exits due to interrupt window opening  1231
> vm exits due to nmi window opening  0
> number of times in/out was intercepted  423985
> number of times cpuid was intercepted   22
> vm exits due to nested page fault   7523
> number of vm exits for unknown reason   0
> number of times astpending at exit  38
> number of vm exits handled in userspace 853290
>
> number of NMIs delivered to vcpu  0
> vcpu total runtime  6954314530
> number of ticks vcpu was idle   59572
> vcpu migration across host cpus 29
>
> number of times hlt was ignored 0
> total number of vm exits  863271
>
> And here is when running with vcpu==2:
>
>
> root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=0
> vcpu0
> vm exits due to external interrupt  19234
> number of times hlt was intercepted 1965
> number of times %cr access was intercepted  1
> number of times rdmsr was intercepted   0
> number of times wrmsr was intercepted   4
> number of monitor trap exits  0
> number of times pause was intercepted   3907709
> vm exits due to interrupt window opening  8105
> vm exits due to nmi window opening  0
> number of times in/out was intercepted  457998
> number of times cpuid was intercepted   50
> vm exits due to nested page fault   940909
> number of vm exits for unknown reason   0
> number of times astpending at exit  102
> number of vm exits handled in userspace 4367825
>
> number of NMIs delivered to vcpu  0
> vcpu total runtime  51834598990
> number of ticks vcpu was idle   887
> vcpu migration across host cpus 25
>
> number of times hlt was ignored 0
> total number of vm exits  5335975
>
> root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=1
> vcpu1
> vm exits due to external interrupt  51249
> number of times hlt was intercepted 3676
> number of times %cr access was intercepted  2
> number of times rdmsr was intercepted   0
> number of times wrmsr was intercepted   4
> number of monitor trap exits  0
> number of times pause was intercepted   4667527
> vm exits due to interrupt window opening  32453
> vm exits due to nmi window opening  0
> number of times in/out was intercepted  57278
> number of times cpuid was intercepted   63
> vm exits due to nested page fault   266091
> number of vm exits for unknown reason   0
> number of times astpending at exit  164
> number of vm exits handled in userspace 4728481
>
> number of NMIs delivered to vcpu  0
> vcpu total runtime  53833466076
> number of ticks vcpu was idle   965
> vcpu migration across host cpus 16
>
> number of times hlt was ignored 0
> total number of vm exits  5078343
>
> The panic comes eventually like this:
>
> login: panic: deadlkres: possible deadlock detected for
> 0xfe0090fb5490, blocked for 217798 ticks
>
> cpuid = 0
> Uptime: 255d6h29m56s
> Automatic reboot in 15 seconds - press a key on the console to abort
>
>
Thanks for capturing that.

In the SMP case the aberrant source of VM exits is the nested page table
faults - which means local APIC accesses, since those are handled entirely
in the kernel. They are an order of magnitude more frequent than in the UP
case, even after adjusting for the difference in run times between the two
cases.
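For scale, a rough normalization of the numbers above (a sketch only; it
assumes the "vcpu total runtime" counter is in nanoseconds, which its
magnitude suggests):

awk 'BEGIN {
    # UP  vcpu0:   7523 nested page fault exits over ~6.95 s of runtime
    # SMP vcpu0: 940909 nested page fault exits over ~51.83 s of runtime
    printf "UP:  %6.0f NPF exits/sec\n",   7523 / 6.954
    printf "SMP: %6.0f NPF exits/sec\n", 940909 / 51.835
}'

That works out to roughly 1.1k/sec versus 18k/sec - about a 17x difference.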

Also, the uptime message (255d6h29m56s) indicates something funky
happening with the timekeeping.

I have added one more stat to vmm.ko that counts the number of times a
local apic timer interrupt is generated for each vcpu.
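Once vmm.ko is updated, something like this should pull out just that
counter for each vcpu (a sketch; the label matches the "timer interrupts
generated by vlapic" line that appears in the 2013-03-31 stats earlier in
this thread):

for cpu in 0 1; do
    # print only the vlapic timer stat for this vcpu
    bhyvectl --vm=coco --get-stats --cpu=$cpu | grep -i vlapic
done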

Also, I would like to install your release.iso if you are able to upload
it. I very much doubt it, but I want to rule out those specific bits as the
source of the problem.

Thanks again for helping to debug this.

best
Neel


>
> Hope this helps, and FYI, here's a dmesg from the host:
>
>
> Copyright (c) 1992-2013 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989

Re: bhyve tty / login problems / panic

2013-03-30 Thread Dan Mack

Here you go:

 - uni-processor output:

root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=0
vcpu0
vm exits due to external interrupt  1330
number of times hlt was intercepted 676
number of times %cr access was intercepted  1
number of times rdmsr was intercepted   0
number of times wrmsr was intercepted   4
number of monitor trap exits  0
number of times pause was intercepted   428499
vm exits due to interrupt window opening  1231
vm exits due to nmi window opening  0
number of times in/out was intercepted  423985
number of times cpuid was intercepted   22
vm exits due to nested page fault   7523
number of vm exits for unknown reason   0
number of times astpending at exit  38
number of vm exits handled in userspace 853290
number of NMIs delivered to vcpu  0
vcpu total runtime  6954314530
number of ticks vcpu was idle   59572
vcpu migration across host cpus 29
number of times hlt was ignored 0
total number of vm exits  863271

And here is when running with vcpu==2:

root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=0
vcpu0
vm exits due to external interrupt  19234
number of times hlt was intercepted 1965
number of times %cr access was intercepted  1
number of times rdmsr was intercepted   0
number of times wrmsr was intercepted   4
number of monitor trap exits  0
number of times pause was intercepted   3907709
vm exits due to interrupt window opening  8105
vm exits due to nmi window opening  0
number of times in/out was intercepted  457998
number of times cpuid was intercepted   50
vm exits due to nested page fault   940909
number of vm exits for unknown reason   0
number of times astpending at exit  102
number of vm exits handled in userspace 4367825
number of NMIs delivered to vcpu  0
vcpu total runtime  51834598990
number of ticks vcpu was idle   887
vcpu migration across host cpus 25
number of times hlt was ignored 0
total number of vm exits  5335975
root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=1
vcpu1
vm exits due to external interrupt  51249
number of times hlt was intercepted 3676
number of times %cr access was intercepted  2
number of times rdmsr was intercepted   0
number of times wrmsr was intercepted   4
number of monitor trap exits  0
number of times pause was intercepted   4667527
vm exits due to interrupt window opening  32453
vm exits due to nmi window opening  0
number of times in/out was intercepted  57278
number of times cpuid was intercepted   63
vm exits due to nested page fault   266091
number of vm exits for unknown reason   0
number of times astpending at exit  164
number of vm exits handled in userspace 4728481
number of NMIs delivered to vcpu  0
vcpu total runtime  53833466076
number of ticks vcpu was idle   965
vcpu migration across host cpus 16
number of times hlt was ignored 0
total number of vm exits  5078343

The panic comes eventually like this:

login: panic: deadlkres: possible deadlock detected for 0xfe0090fb5490, 
blocked for 217798 ticks

cpuid = 0
Uptime: 255d6h29m56s
Automatic reboot in 15 seconds - press a key on the console to abort
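A couple of quick sanity checks that can be run inside the guest while it
is still up (a sketch; assuming the stock kern.hz of 1000, 217798 ticks is
roughly 218 seconds of apparent blockage - already more than the couple of
minutes of wall time the guest had been running, which again points at the
runaway clock):

sysctl kern.hz          # ticks per second; converts the "blocked for N ticks" count to seconds
sysctl kern.boottime    # compare against the host clock to see the inflated uptime
date                    # run repeatedly on host and guest to gauge the drift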


Hope this helps, and FYI, here's a dmesg from the host:

Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.0-CURRENT #7 r248935: Sat Mar 30 13:19:43 CDT 2013
r...@olive.example.com:/usr/obj/usr/src/sys/MACKGEN amd64
FreeBSD clang version 3.2 (tags/RELEASE_32/final 170710) 20121221
CPU: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz (3200.19-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x206d7  Family = 0x6  Model = 0x2d 
Stepping = 7


Features=0xbfebfbff

Features2=0x1fbee3bf
  AMD Features=0x2c100800
  AMD Features2=0x1
  TSC: P-state invariant, performance statistics
real memory  = 68719476736 (65536 MB)
avail memory = 63123800064 (60199 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 12 CPUs
FreeBSD/SMP: 1 package(s) x 6 core(s) x 2 SMT threads
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
 cpu4 (AP): APIC ID:  4
 cpu5 (AP): APIC ID:  5
 cpu6 (AP): API

Re: bhyve tty / login problems / panic

2013-03-30 Thread Dan Mack


Yep, no problem.  I saw the checkin and started a rebuild a few minutes 
ago :-)


Thanks,

Dan

On Sat, 30 Mar 2013, Neel Natu wrote:


Hi Dan,

On Fri, Mar 29, 2013 at 3:20 PM, Dan Mack  wrote:




Here you go Neel:

root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=0
vcpu0
vm exits due to external interrupt  45051
number of times hlt was intercepted 2277
number of NMIs delivered to vcpu  0
vcpu total runtime  105001652482
number of ticks vcpu was idle   919
vcpu migration across host cpus 13
number of times hlt was ignored 0
total number of vm exits  9240925
root@olive:~ #
root@olive:~ #
root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=1
vcpu1
vm exits due to external interrupt  149431
number of times hlt was intercepted 4222
number of NMIs delivered to vcpu  0
vcpu total runtime  106876622528
number of ticks vcpu was idle   869
vcpu migration across host cpus 20
number of times hlt was ignored 0
total number of vm exits  8065909



That did not help a whole lot because we were not keeping track of all the
reasons a vcpu could exit.

I have fixed that in r248935:
http://svnweb.freebsd.org/base?view=revision&revision=248935

Do you mind updating your vmm.ko with the change and getting the stats
again?

best
Neel



Dan



On Fri, 29 Mar 2013, Neel Natu wrote:

 Hi Dan,


On Fri, Mar 29, 2013 at 12:38 PM, Dan Mack  wrote:



I ran the procsystime dtrace script on bhyve when it was chewing up all
the CPU in vCPU=2 mode, and this is what I see for about 10s of runtime:

root@olive:/usr/share/dtrace/toolkit # ./procsystime -n bhyve -aT

Tracing... Hit Ctrl-C to end...
dtrace: 158536 dynamic variable drops with non-empty dirty list
dtrace: 207447 dynamic variable drops with non-empty dirty list
dtrace: 189205 dynamic variable drops with non-empty dirty list
dtrace: 164341 dynamic variable drops with non-empty dirty list
dtrace: 246307 dynamic variable drops with non-empty dirty list
dtrace: 187640 dynamic variable drops with non-empty dirty list
dtrace: 214771 dynamic variable drops with non-empty dirty list
dtrace: 221265 dynamic variable drops with non-empty dirty list
^C
dtrace: 243468 dynamic variable drops with non-empty dirty list

Elapsed Times for processes bhyve,

 SYSCALL  TIME (ns)
_umtx_op  18349
  writev 135712
  preadv   16175267
 pwritev   22924378
   ioctl 4353897920
  TOTAL: 4393151626

CPU Times for processes bhyve,

 SYSCALL  TIME (ns)
_umtx_op   8815
  writev 103145
 pwritev   10647023
  preadv   15159976
   ioctl 3943399889
  TOTAL: 3969318848

Syscall Counts for processes bhyve,

 SYSCALL  COUNT
_umtx_op 17
  writev 43
 pwritev639
  preadv   1183
   ioctl 652782
  TOTAL: 654664

Not sure if that helps or not.  Hotkernel shows the CPU in the
kernel`acpi_cpu_c1 call most of the time - is that normal?

Dan

root@olive:/usr/share/dtrace/toolkit # ./hotkernel

Sampling... Hit Ctrl-C to end.
^C
FUNCTIONCOUNT   PCNT
zfs.ko`space_map_sync   1   0.0%
kernel`syscall_thread_exit  1   0.0%
kernel`should_yield 1   0.0%
zfs.ko`lzjb_compress1   0.0%
zfs.ko`dbuf_rele1   0.0%
zfs.ko`dsl_pool_tempreserve_space   1   0.0%

kernel`_bus_dmamap_load_buffer  1   0.0%
kernel`cpu_idleclock1   0.0%
kernel`_rw_runlock_cookie   1   0.0%
kernel`pmap_enter   1   0.0%
kernel`__lockmgr_args   1   0.0%
zfs.ko`vdev_geom_io_start   1   0.0%
kernel`ahci_ch_intr 1   0.0%
kernel`__rw_rlock   1   0.0%
kernel`VOP_UNLOCK_APV   1   0.0%
zfs.ko`arc_buf_thaw 1   0.0%
kernel`atomic_set_int   1   0.0%
kernel`g_io_schedule_down   1   0.0%
kernel`__mtx_lock_flags 1   0.0%
kernel`_callout_stop_safe   1   0.0%
kernel`sched_idletd 1   0.0%
kernel`_sx_slock1   0.0%
vmm.

Re: bhyve tty / login problems / panic

2013-03-30 Thread Neel Natu
Hi Dan,

On Fri, Mar 29, 2013 at 3:20 PM, Dan Mack  wrote:

>
>
> Here you go Neel:
>
> root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=0
> vcpu0
> vm exits due to external interrupt  45051
> number of times hlt was intercepted 2277
> number of NMIs delivered to vcpu  0
> vcpu total runtime  105001652482
> number of ticks vcpu was idle   919
> vcpu migration across host cpus 13
> number of times hlt was ignored 0
> total number of vm exits  9240925
> root@olive:~ #
> root@olive:~ #
> root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=1
> vcpu1
> vm exits due to external interrupt  149431
> number of times hlt was intercepted 4222
> number of NMIs delivered to vcpu  0
> vcpu total runtime  106876622528
> number of ticks vcpu was idle   869
> vcpu migration across host cpus 20
> number of times hlt was ignored 0
> total number of vm exits  8065909
>
>
That did not help a whole lot because we were not keeping track of all the
reasons a vcpu could exit.

I have fixed that in r248935:
http://svnweb.freebsd.org/base?view=revision&revision=248935

Do you mind updating your vmm.ko with the change and getting the stats
again?
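
For reference, rebuilding and reloading just the vmm module should be
enough for this (a sketch; it assumes a stock source tree layout and that
any running guest has been destroyed first, e.g. with
bhyvectl --vm=coco --destroy):

cd /usr/src/sys/modules/vmm
make && make install
kldunload vmm && kldload vmm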

best
Neel


> Dan
>
>
>
> On Fri, 29 Mar 2013, Neel Natu wrote:
>
>  Hi Dan,
>>
>> On Fri, Mar 29, 2013 at 12:38 PM, Dan Mack  wrote:
>>
>>
>>> I ran the procsystime dtrace script on bhyve when it was chewing up all
>>> the CPU in vCPU=2 mode, and this is what I see for about 10s of runtime:
>>>
>>> root@olive:/usr/share/dtrace/toolkit # ./procsystime -n bhyve -aT
>>>
>>> Tracing... Hit Ctrl-C to end...
>>> dtrace: 158536 dynamic variable drops with non-empty dirty list
>>> dtrace: 207447 dynamic variable drops with non-empty dirty list
>>> dtrace: 189205 dynamic variable drops with non-empty dirty list
>>> dtrace: 164341 dynamic variable drops with non-empty dirty list
>>> dtrace: 246307 dynamic variable drops with non-empty dirty list
>>> dtrace: 187640 dynamic variable drops with non-empty dirty list
>>> dtrace: 214771 dynamic variable drops with non-empty dirty list
>>> dtrace: 221265 dynamic variable drops with non-empty dirty list
>>> ^C
>>> dtrace: 243468 dynamic variable drops with non-empty dirty list
>>>
>>> Elapsed Times for processes bhyve,
>>>
>>>  SYSCALL  TIME (ns)
>>> _umtx_op  18349
>>>   writev 135712
>>>   preadv   16175267
>>>  pwritev   22924378
>>>ioctl 4353897920
>>>   TOTAL: 4393151626
>>>
>>> CPU Times for processes bhyve,
>>>
>>>  SYSCALL  TIME (ns)
>>> _umtx_op   8815
>>>   writev 103145
>>>  pwritev   10647023
>>>   preadv   15159976
>>>ioctl 3943399889
>>>   TOTAL: 3969318848
>>>
>>> Syscall Counts for processes bhyve,
>>>
>>>  SYSCALL  COUNT
>>> _umtx_op 17
>>>   writev 43
>>>  pwritev639
>>>   preadv   1183
>>>ioctl 652782
>>>   TOTAL: 654664
>>>
>>> Not sure if that helps or not.  Hotkernel shows the CPU in the
>>> kernel`acpi_cpu_c1 call most of the time - is that normal?
>>>
>>> Dan
>>>
>>> root@olive:/usr/share/dtrace/toolkit # ./hotkernel
>>>
>>> Sampling... Hit Ctrl-C to end.
>>> ^C
>>> FUNCTIONCOUNT   PCNT
>>> zfs.ko`space_map_sync   1   0.0%
>>> kernel`syscall_thread_exit  1   0.0%
>>> kernel`should_yield 1   0.0%
>>> zfs.ko`lzjb_compress1   0.0%
>>> zfs.ko`dbuf_rele1   0.0%
>>> zfs.ko`dsl_pool_tempreserve_space   1   0.0%
>>>
>>> kernel`_bus_dmamap_load_buffer  1   0.0%
>>> kernel`cpu_idleclock1   0.0%
>>> kernel`_rw_runlock_cookie   1   0.0%
>>> kernel`pmap_enter   1   0.0%
>>> kernel`__lockmgr_args   1   0.0%
>>> zfs.ko`vdev_geom_io_start   1   0.0%
>>> kernel`ahci_ch_intr 1   0.0%
>>> kernel`__rw_rlock   1   0.0%
>>> kernel`VOP_UNLOCK_APV   1   0.0%
>>> zfs.ko`arc_buf_thaw 1   0.0%
>>> kernel`atomic_set_int   1   0.0%
>>> kernel`g_io_schedule_down   1   0.0%
>>> kernel`__mtx_lock_flags

Re: bhyve tty / login problems / panic

2013-03-29 Thread Dan Mack



Here you go Neel:

root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=0
vcpu0
vm exits due to external interrupt  45051
number of times hlt was intercepted 2277
number of NMIs delivered to vcpu  0
vcpu total runtime  105001652482
number of ticks vcpu was idle   919
vcpu migration across host cpus 13
number of times hlt was ignored 0
total number of vm exits  9240925
root@olive:~ #
root@olive:~ #
root@olive:~ # bhyvectl --vm=coco --get-stats --cpu=1
vcpu1
vm exits due to external interrupt  149431
number of times hlt was intercepted 4222
number of NMIs delivered to vcpu  0
vcpu total runtime  106876622528
number of ticks vcpu was idle   869
vcpu migration across host cpus 20
number of times hlt was ignored 0
total number of vm exits  8065909

Dan


On Fri, 29 Mar 2013, Neel Natu wrote:


Hi Dan,

On Fri, Mar 29, 2013 at 12:38 PM, Dan Mack  wrote:



I ran the procsystime dtrace script on bhyve when it was chewing up all
the CPU in vCPU=2 mode, and this is what I see for about 10s of runtime:

root@olive:/usr/share/dtrace/toolkit # ./procsystime -n bhyve -aT
Tracing... Hit Ctrl-C to end...
dtrace: 158536 dynamic variable drops with non-empty dirty list
dtrace: 207447 dynamic variable drops with non-empty dirty list
dtrace: 189205 dynamic variable drops with non-empty dirty list
dtrace: 164341 dynamic variable drops with non-empty dirty list
dtrace: 246307 dynamic variable drops with non-empty dirty list
dtrace: 187640 dynamic variable drops with non-empty dirty list
dtrace: 214771 dynamic variable drops with non-empty dirty list
dtrace: 221265 dynamic variable drops with non-empty dirty list
^C
dtrace: 243468 dynamic variable drops with non-empty dirty list

Elapsed Times for processes bhyve,

 SYSCALL  TIME (ns)
_umtx_op  18349
  writev 135712
  preadv   16175267
 pwritev   22924378
   ioctl 4353897920
  TOTAL: 4393151626

CPU Times for processes bhyve,

 SYSCALL  TIME (ns)
_umtx_op   8815
  writev 103145
 pwritev   10647023
  preadv   15159976
   ioctl 3943399889
  TOTAL: 3969318848

Syscall Counts for processes bhyve,

 SYSCALL  COUNT
_umtx_op 17
  writev 43
 pwritev639
  preadv   1183
   ioctl 652782
  TOTAL: 654664

Not sure if that helps or not.  Hotkernel shows the CPU in the
kernel`acpi_cpu_c1 call most of the time - is that normal?

Dan

root@olive:/usr/share/dtrace/toolkit # ./hotkernel
Sampling... Hit Ctrl-C to end.
^C
FUNCTIONCOUNT   PCNT
zfs.ko`space_map_sync   1   0.0%
kernel`syscall_thread_exit  1   0.0%
kernel`should_yield 1   0.0%
zfs.ko`lzjb_compress1   0.0%
zfs.ko`dbuf_rele1   0.0%
zfs.ko`dsl_pool_tempreserve_space   1   0.0%
kernel`_bus_dmamap_load_buffer  1   0.0%
kernel`cpu_idleclock1   0.0%
kernel`_rw_runlock_cookie   1   0.0%
kernel`pmap_enter   1   0.0%
kernel`__lockmgr_args   1   0.0%
zfs.ko`vdev_geom_io_start   1   0.0%
kernel`ahci_ch_intr 1   0.0%
kernel`__rw_rlock   1   0.0%
kernel`VOP_UNLOCK_APV   1   0.0%
zfs.ko`arc_buf_thaw 1   0.0%
kernel`atomic_set_int   1   0.0%
kernel`g_io_schedule_down   1   0.0%
kernel`__mtx_lock_flags 1   0.0%
kernel`_callout_stop_safe   1   0.0%
kernel`sched_idletd 1   0.0%
kernel`_sx_slock1   0.0%
vmm.ko`lapic_set_intr   1   0.0%
kernel`cpusetobj_ffs1   0.0%
zfs.ko`dbuf_hold1   0.0%
vmm.ko`vlapic_intr_accepted 1   0.0%
zfs.ko`sa_attr_op   1   0.0%
zfs.ko`dmu_buf_hold_array_by_dnode  1   0.0%
zfs.ko`taskq_run_safe   1   0.0%
kernel`vm_reserv_f

Re: bhyve tty / login problems / panic

2013-03-29 Thread Neel Natu
Hi Dan,

On Fri, Mar 29, 2013 at 12:38 PM, Dan Mack  wrote:

>
> I ran the procsystime dtrace script on bhyve when it was chewing up all
> the CPU in vCPU=2 mode, and this is what I see for about 10s of runtime:
>
> root@olive:/usr/share/dtrace/toolkit # ./procsystime -n bhyve -aT
> Tracing... Hit Ctrl-C to end...
> dtrace: 158536 dynamic variable drops with non-empty dirty list
> dtrace: 207447 dynamic variable drops with non-empty dirty list
> dtrace: 189205 dynamic variable drops with non-empty dirty list
> dtrace: 164341 dynamic variable drops with non-empty dirty list
> dtrace: 246307 dynamic variable drops with non-empty dirty list
> dtrace: 187640 dynamic variable drops with non-empty dirty list
> dtrace: 214771 dynamic variable drops with non-empty dirty list
> dtrace: 221265 dynamic variable drops with non-empty dirty list
> ^C
> dtrace: 243468 dynamic variable drops with non-empty dirty list
>
> Elapsed Times for processes bhyve,
>
>  SYSCALL  TIME (ns)
> _umtx_op  18349
>   writev 135712
>   preadv   16175267
>  pwritev   22924378
>ioctl 4353897920
>   TOTAL: 4393151626
>
> CPU Times for processes bhyve,
>
>  SYSCALL  TIME (ns)
> _umtx_op   8815
>   writev 103145
>  pwritev   10647023
>   preadv   15159976
>ioctl 3943399889
>   TOTAL: 3969318848
>
> Syscall Counts for processes bhyve,
>
>  SYSCALL  COUNT
> _umtx_op 17
>   writev 43
>  pwritev639
>   preadv   1183
>ioctl 652782
>   TOTAL: 654664
>
> Not sure if that helps or not.  Hotkernel shows the CPU in the
> kernel`acpi_cpu_c1 call most of the time - is that normal?
>
> Dan
>
> root@olive:/usr/share/dtrace/toolkit # ./hotkernel
> Sampling... Hit Ctrl-C to end.
> ^C
> FUNCTIONCOUNT   PCNT
> zfs.ko`space_map_sync   1   0.0%
> kernel`syscall_thread_exit  1   0.0%
> kernel`should_yield 1   0.0%
> zfs.ko`lzjb_compress1   0.0%
> zfs.ko`dbuf_rele1   0.0%
> zfs.ko`dsl_pool_tempreserve_space   1   0.0%
> kernel`_bus_dmamap_load_buffer  1   0.0%
> kernel`cpu_idleclock1   0.0%
> kernel`_rw_runlock_cookie   1   0.0%
> kernel`pmap_enter   1   0.0%
> kernel`__lockmgr_args   1   0.0%
> zfs.ko`vdev_geom_io_start   1   0.0%
> kernel`ahci_ch_intr 1   0.0%
> kernel`__rw_rlock   1   0.0%
> kernel`VOP_UNLOCK_APV   1   0.0%
> zfs.ko`arc_buf_thaw 1   0.0%
> kernel`atomic_set_int   1   0.0%
> kernel`g_io_schedule_down   1   0.0%
> kernel`__mtx_lock_flags 1   0.0%
> kernel`_callout_stop_safe   1   0.0%
> kernel`sched_idletd 1   0.0%
> kernel`_sx_slock1   0.0%
> vmm.ko`lapic_set_intr   1   0.0%
> kernel`cpusetobj_ffs1   0.0%
> zfs.ko`dbuf_hold1   0.0%
> vmm.ko`vlapic_intr_accepted 1   0.0%
> zfs.ko`sa_attr_op   1   0.0%
> zfs.ko`dmu_buf_hold_array_by_dnode  1   0.0%
> zfs.ko`taskq_run_safe   1   0.0%
> kernel`vm_reserv_free_page  1   0.0%
> kernel`_sx_sunlock  1   0.0%
> kernel`vn_rlimit_fsize  1   0.0%
> zfs.ko`vdev_queue_io_to_issue   1   0.0%
> kernel`vn_write 1   0.0%
> kernel`lock_init1   0.0%
> vmm.ko`vmx_resume   1   0.0%
> zfs.ko`zio_gang_tree_free   1   0.0%
> zfs.ko`dbuf_rele_and_unlock 1   0.0%
> kernel`mnt_vnode_next_active1   0.0%
> kernel`vm_page_remove   1   0.0%
> zfs.ko`vdev_queue_deadl

Re: bhyve tty / login problems / panic

2013-03-29 Thread Dan Mack


I ran the procsystime dtrace script on bhyve when it was chewing up all 
the CPU in vCPU=2 mode, and this is what I see for about 10s of runtime:


root@olive:/usr/share/dtrace/toolkit # ./procsystime -n bhyve -aT
Tracing... Hit Ctrl-C to end...
dtrace: 158536 dynamic variable drops with non-empty dirty list
dtrace: 207447 dynamic variable drops with non-empty dirty list
dtrace: 189205 dynamic variable drops with non-empty dirty list
dtrace: 164341 dynamic variable drops with non-empty dirty list
dtrace: 246307 dynamic variable drops with non-empty dirty list
dtrace: 187640 dynamic variable drops with non-empty dirty list
dtrace: 214771 dynamic variable drops with non-empty dirty list
dtrace: 221265 dynamic variable drops with non-empty dirty list
^C
dtrace: 243468 dynamic variable drops with non-empty dirty list

Elapsed Times for processes bhyve,

 SYSCALL  TIME (ns)
_umtx_op  18349
  writev 135712
  preadv   16175267
 pwritev   22924378
   ioctl 4353897920
  TOTAL: 4393151626

CPU Times for processes bhyve,

 SYSCALL  TIME (ns)
_umtx_op   8815
  writev 103145
 pwritev   10647023
  preadv   15159976
   ioctl 3943399889
  TOTAL: 3969318848

Syscall Counts for processes bhyve,

 SYSCALL  COUNT
_umtx_op 17
  writev 43
 pwritev639
  preadv   1183
   ioctl 652782
  TOTAL: 654664

Not sure if that helps or not.  Hotkernel shows the CPU in the
kernel`acpi_cpu_c1 call most of the time - is that normal?


Dan

root@olive:/usr/share/dtrace/toolkit # ./hotkernel
Sampling... Hit Ctrl-C to end.
^C
FUNCTIONCOUNT   PCNT
zfs.ko`space_map_sync   1   0.0%
kernel`syscall_thread_exit  1   0.0%
kernel`should_yield 1   0.0%
zfs.ko`lzjb_compress1   0.0%
zfs.ko`dbuf_rele1   0.0%
zfs.ko`dsl_pool_tempreserve_space   1   0.0%
kernel`_bus_dmamap_load_buffer  1   0.0%
kernel`cpu_idleclock1   0.0%
kernel`_rw_runlock_cookie   1   0.0%
kernel`pmap_enter   1   0.0%
kernel`__lockmgr_args   1   0.0%
zfs.ko`vdev_geom_io_start   1   0.0%
kernel`ahci_ch_intr 1   0.0%
kernel`__rw_rlock   1   0.0%
kernel`VOP_UNLOCK_APV   1   0.0%
zfs.ko`arc_buf_thaw 1   0.0%
kernel`atomic_set_int   1   0.0%
kernel`g_io_schedule_down   1   0.0%
kernel`__mtx_lock_flags 1   0.0%
kernel`_callout_stop_safe   1   0.0%
kernel`sched_idletd 1   0.0%
kernel`_sx_slock1   0.0%
vmm.ko`lapic_set_intr   1   0.0%
kernel`cpusetobj_ffs1   0.0%
zfs.ko`dbuf_hold1   0.0%
vmm.ko`vlapic_intr_accepted 1   0.0%
zfs.ko`sa_attr_op   1   0.0%
zfs.ko`dmu_buf_hold_array_by_dnode  1   0.0%
zfs.ko`taskq_run_safe   1   0.0%
kernel`vm_reserv_free_page  1   0.0%
kernel`_sx_sunlock  1   0.0%
kernel`vn_rlimit_fsize  1   0.0%
zfs.ko`vdev_queue_io_to_issue   1   0.0%
kernel`vn_write 1   0.0%
kernel`lock_init1   0.0%
vmm.ko`vmx_resume   1   0.0%
zfs.ko`zio_gang_tree_free   1   0.0%
zfs.ko`dbuf_rele_and_unlock 1   0.0%
kernel`mnt_vnode_next_active1   0.0%
kernel`vm_page_remove   1   0.0%
zfs.ko`vdev_queue_deadline_compare  1   0.0%
kernel`zone_free_item   1   0.0%
vmm.ko`vmx_launch   1   0.0%
kernel`m_pullup 1  

Re: bhyve tty / login problems / panic

2013-03-29 Thread Dan Mack

On Thu, 28 Mar 2013, Peter Grehan wrote:


Hi Dan,


Any ideas?


Just to confirm: your host system is an i7 3930k, which looks like a 6 core 
system. Should be plenty beefy enough for a 2 vCPU guest.


That's what I thought, 6 cores, 12 threads, 64 GB RAM.

The issue you are seeing looks a bit like what happens when systems are 
heavily oversubscribed and vCPUs can't get enough run time.


I was running it on an idle system; only bhyve was running.


How much spare RAM is on the host when you started the vm ? Also, is there 
much running on the host when the guest is started ?


64GB

Here's some more information (sorry for the formatting):

Case 1:  single vCPU via:
root  64112  1.3  0.0 4219232 25588  1  D+  2:09PM 0:02.47 /usr/sbin/bhyve -c 1 -m 4096 -AI -H -P -g 0 -s 0:0,hostbridge -s 1:0,virtio-net,tap0 -s 2:0,virtio-blk,./diskdev -S 31,uart,stdio coco
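
For comparison, the failing case is presumably the same invocation with
only the vCPU count changed (a sketch inferred from the command above and
from the note elsewhere in this thread that two vCPUs was the default;
every other option is assumed identical):

/usr/sbin/bhyve -c 2 -m 4096 -AI -H -P -g 0 -s 0:0,hostbridge \
    -s 1:0,virtio-net,tap0 -s 2:0,virtio-blk,./diskdev -S 31,uart,stdio coco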

Top shows it idle (note low load, 4GB memory given to guest, still 53GB free 
RAM, nothing else clocking CPU cycles):

root@olive:~ # top -bH
last pid: 64148;  load averages:  0.03,  0.22,  0.22  up 0+16:29:31  14:11:51
24 processes:  1 running, 23 sleeping

Mem: 32M Active, 50M Inact, 8498M Wired, 504K Cache, 53G Free
ARC: 778M Total, 146M MFU, 390M MRU, 16K Anon, 16M Header, 226M Other
Swap:


  PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
64112 root       20    0  4120M 25980K vmidle  2   0:03  0.00% bhyve{coco vcpu 0}
  925 root       20    0 14296K  2044K select 11   0:00  0.00% syslogd
 2695 root       20    0 81452K  6952K select  2   0:00  0.00% sshd
 2697 root       20    0 20224K  4632K ttyin   1   0:00  0.00% csh
64112 root       52    0  4120M 25980K kqread  9   0:00  0.00% bhyve{coco mevent}


So, in case #1 everything is normal.  I was able to build a kernel with
very good performance.

The boot sequence looks like the following:

root@olive:/vms/coco # ./vmrun.sh coco
Launching virtual machine "coco" ...
Consoles: userboot

FreeBSD/amd64 User boot, Revision 1.1
(r...@olive.example.com, Thu Mar 28 21:20:49 CDT 2013)
Loading /boot/defaults/loader.conf 
/boot/kernel/kernel text=0x5a2178 data=0x98088+0xef8b0 syms=[0x8+0xca8d8+0x8+0x12bb08]

/
 [FreeBSD ASCII-art boot logo]
 +Welcome to FreeBSD---+
 |  1. Boot Multi User [Enter]       |
 |  2. Boot [S]ingle User            |
 |  3. [Esc]ape to loader prompt     |
 |  4. Reboot                        |
 |                                   |
 |  Options:                         |
 |  5. Configure Boot [O]ptions...   |
 +-----------------------------------+


Booting...
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.0-CURRENT #0 r248859: Thu Mar 28 19:52:06 CDT 2013
r...@cocopuff.example.com:/sys/amd64/compile/BHYVE amd64
FreeBSD clang version 3.2 (tags/RELEASE_32/final 170710) 20121221
CPU: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz (3199.84-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x206d7  Family = 0x6  Model = 0x2d  Stepping = 
7
  
Features=0x8fa3ab7f
  
Features2=0x83bee217
  AMD Features=0x2c100800
  AMD Features2=0x1
  TSC: P-state invariant
real memory  = 5368709120 (5120 MB)
avail memory = 3957534720 (3774 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: 
random device not loaded; using insecure entropy
ioapic0: Changing APIC ID to 1
ioapic0  irqs 0-16 on motherboard
kbd0 at kbdmux0
acpi0:  on motherboard
atrtc0:  port 0x70-0x71,0x72-0x77 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib0: no PRT entry for 0.31.INTA
virtio_pci0:  port 0x2000-0x201f mem 
0xc000-0xc0001fff at device 1.0 on pci0
vtnet0

Re: bhyve tty / login problems / panic

2013-03-28 Thread Peter Grehan

Hi Dan,


Any ideas?


 Just to confirm: your host system is an i7 3930k, which looks like a 6 
core system. Should be plenty beefy enough for a 2 vCPU guest.


 The issue you are seeing looks a bit like what happens when systems 
are heavily oversubscribed and vCPUs can't get enough run time.


 How much spare RAM is on the host when you started the vm ? Also, is 
there much running on the host when the guest is started ?


later,

Peter.


Re: bhyve tty / login problems / panic

2013-03-28 Thread Dan Mack


Update#1:   I was using the default of 2 virtual CPUs.  When I switched to
1 CPU via vmrun.sh, the system booted clean and let me log in.


Also, when running with two vCPUs, the bhyve process is spinning at 100% 
even when the guest is doing nothing.


When running with one CPU, bhyve is mostly idle when the guest is idle.

Whatever was happening, it was impacting system time in a bad way: all of
the log files were already getting rotated even though the system was
less than 30 minutes old :-)
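
One crude way to quantify how fast the guest clock is racing (a sketch):
run the same loop on the host and in the guest and watch how quickly the
timestamps diverge.

while true; do date '+%H:%M:%S'; sleep 10; done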


Let me know if there is anything I can do to help and if you have any 
ideas on a solution (other than only using uni-processor guests).


Thanks,

Dan

On Thu, 28 Mar 2013, Dan Mack wrote:



I haven't spent too much time debugging this yet, but I thought I'd ask
just in case someone else has seen this:


- fresh 10.x system (248804), standard FreeBSD iso image I made with make release

- latest vmrun.sh release.iso ...
- install went okay, everything seemed normal

- boot splash screen comes up fine, select default and do first boot

- cannot log in ... the first few attempts result in 'login timed out' messages
(but they get emitted as soon as I hit CR)


- eventually I can get in after a few tries, but then the shell immediately
boots me out after emitting the motd :-)


- and just leaving it sit results in panic

- my kernel is mainly just GENERIC with the debugging options yanked out,
along with the raid, scsi, and wifi devices.


Any ideas?


Below is the boot log and what I saw ...
Booting...
GDB: no debug ports present
KDB: debugger backends: ddb
KDB: current backend: ddb
Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
   The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 10.0-CURRENT #4 r248804: Wed Mar 27 19:29:38 CDT 2013
   r...@olive.example.com:/usr/obj/usr/src/sys/MACKGEN amd64
FreeBSD clang version 3.2 (tags/RELEASE_32/final 170710) 20121221
CPU: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz (3199.83-MHz K8-class CPU)
 Origin = "GenuineIntel"  Id = 0x206d7  Family = 0x6  Model = 0x2d Stepping 
= 7


Features=0x8fa3ab7f

Features2=0x83bee217
 AMD Features=0x2c100800
 AMD Features2=0x1
 TSC: P-state invariant
real memory  = 536870912 (512 MB)
avail memory = 482803712 (460 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 2 package(s) x 1 core(s)
cpu0 (BSP): APIC ID:  0
login: /amd64 (cocopuff.example.com) (console)ds.KLOCAL>ic 0 mtu 
1500tation)3fff at device 2.0 on pci0i0


FreeBSD/amd64 (cocopuff.example.com) (console)

login: root
Login timed out after 300 seconds

FreeBSD/amd64 (cocopuff.example.com) (console)

login: root
Login timed out after 300 seconds

FreeBSD/amd64 (cocopuff.example.com) (console)

login: root
Login timed out after 300 seconds

FreeBSD/amd64 (cocopuff.example.com) (console)

login: root
Login timed out after 300 seconds

FreeBSD/amd64 (cocopuff.example.com) (console)

login: root
Password:
Jul 12 07:12:58 cocopuff login: ROOT LOGIN (root) ON console
FreeBSD 10.0-CURRENT (MACKGEN) #4 r248804: Wed Mar 27 19:29:38 CDT 2013

Welcome to FreeBSD!

Before seeking technical support, please use the following resources:

o  Security advisories and updated errata information for all releases are
  at http://www.FreeBSD.org/releases/ - always consult the ERRATA section
  for your release first as it's updated frequently.

o  The Handbook and FAQ documents are at http://www.FreeBSD.org/ and,
  along with the mailing lists, can be searched by going to
  http://www.FreeBSD.org/search/.  If the doc package has been installed
  (or fetched via pkg install lang-freebsd-doc, where lang is the
  2-letter language code, e.g. en), they are also available formatted

If you still have a question or problem, please take the output of
`uname -a', along with any relevant error messages, and email it
as a question to the questi...@freebsd.org mailing list.  If you are
unfamiliar with FreeBSD's directory layout, please refer to the hier(7)
manual page.  If you are not familiar with manual pages, type `man man'.

Edit /etc/motd to change this login announcement.

You have new mail.
You have new mail.
root@cocopuff:~ # auto-logout

FreeBSD/amd64 (cocopuff.macktronics.com) (console)

login: panic: deadlkres: possible deadlock detected for 0xfe001c919920, 
blocked for 303081 ticks


cpuid = 0
KDB: enter: panic
[ thread pid 0 tid 100027 ]
Stopped at  kdb_enter+0x3e: movq$0,kdb_why
db> bt
Tracing pid 0 tid 100027 td 0xfe000358f490
kdb_enter() at kdb_enter+0x3e/frame 0xff80002bcae0
panic() at panic+0x176/frame 0xff80002bcb60
deadlkres() at deadlkres+0x488/frame 0xff80002bcbb0
fork_exit() at fork_exit+0x9a/frame 0xff80002bcbf0
fork_trampoline() at fork_trampoline+0xe/frame 0xff80002bcbf0
--- trap 0, rip = 0, rsp = 0xff80002bccb0, rbp = 0 ---

