Re: -current host, 10.1 client loops

2016-04-12 Thread Poul-Henning Kamp

In message <570c74cf.1080...@freebsd.org>, Peter Grehan writes:


>  I'll try with a bit more CPU oversubscription and see if I can hit 
>this. Thanks for the info - this should help track it down.


For what it's worth, I think the port being compiled was:

/usr/ports/www/p5-Mozilla-CAnss

Another possibly relevant datapoint:

The filesystem being actively written to is UFS+SU+TRIM with no
journal, and I think fsck sees more and worse corruption than it
should if writes happened in the order Kirk intended.  Specifically,
it should never get fsck_ffs into the "please run fsck again" jungle.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
___
freebsd-virtualization@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to 
"freebsd-virtualization-unsubscr...@freebsd.org"


Re: -current host, 10.1 client loops

2016-04-11 Thread Peter Grehan

> CTRL-T in the console works
>
> ...
>
> ==> _cpu_0 <==
> rip[0]  0xc0b14780
> rip[0]  0xc0b14782


 These are the RIPs that show when the system is idle.


> ==> _cpu_2 <==
> rip[2]  0xc100feb0


 This is the TLB shootdown loop:

(kgdb) x/i 0xc100feb0
0xc100feb0 : pause

 I'll try with a bit more CPU oversubscription and see if I can hit 
this. Thanks for the info - this should help track it down.


later,

Peter.


Re: -current host, 10.1 client loops

2016-04-10 Thread Peter Grehan

Hi Poul-Henning,


> With the host now running:
>
> FreeBSD 11.0-CURRENT #0 r297514M
>
> And (another) i386 guest running:
>
> FreeBSD 11.0-CURRENT #3 r297721M
>
> I do not see the problem.
>
> I'm going to try the 10.1/i386 guest over the weekend.


 I've not yet been able to reproduce this. I've tried 2- and 3-vCPU
10.1/i386 guests on an Intel E3-1220 v3 (single and dual virtio-blk 
disks), and a 2 vCPU guest on an AMD Sempron 3850 APU. Buildworlds with 
-j set to the number of vCPUs have completed fine.


 The Intel system was using file-backed images on ZFS, while the AMD 
system was on UFS. The guests were UFS. The console was to stdout in a 
tmux session that was mostly backgrounded.


 Was there anything particular about your buildworld (-j settings, etc.)?

 Also, if the guest looks like it hangs, you can extract the RIPs with
  bhyvectl --get-rip --cpu=0 --vm=
  bhyvectl --get-rip --cpu=1 --vm=

 This might give a hint as to where the guest is spinning.
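
 A minimal sketch of how that sampling could be repeated across all
vCPUs, using only the bhyvectl flags shown above. The function name
sample_rips, the BHYVECTL override variable, and the VM name in the
usage comment are illustrative assumptions, not part of bhyve.

```shell
# Command to invoke; overridable so the loop can be tried without a VM.
BHYVECTL=${BHYVECTL:-bhyvectl}

# sample_rips VM NCPU [ROUNDS] [INTERVAL]
# Print the guest RIP of every vCPU, ROUNDS times, INTERVAL seconds apart.
sample_rips() {
    vm=$1
    ncpu=$2
    rounds=${3:-5}
    interval=${4:-1}

    round=0
    while [ "$round" -lt "$rounds" ]; do
        cpu=0
        while [ "$cpu" -lt "$ncpu" ]; do
            "$BHYVECTL" --get-rip --cpu="$cpu" --vm="$vm"
            cpu=$((cpu + 1))
        done
        sleep "$interval"
        round=$((round + 1))
    done
}

# Hypothetical usage: ten rounds for a two-vCPU guest named "vm0".
#   sample_rips vm0 2 10
```

 RIPs that stay within a small range across rounds may point at where
a vCPU is spinning (or simply idling).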

later,

Peter.



Re: -current host, 10.1 client loops

2016-04-08 Thread Poul-Henning Kamp

In message 
, Olivier Cochard-Labbé writes:

With the host now running:

FreeBSD 11.0-CURRENT #0 r297514M

And (another) i386 guest running:

FreeBSD 11.0-CURRENT #3 r297721M

I do not see the problem.

I'm going to try the 10.1/i386 guest over the weekend.


-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: -current host, 10.1 client loops

2016-03-20 Thread Olivier Cochard-Labbé
On Wed, Mar 16, 2016 at 10:25 PM, Poul-Henning Kamp wrote:

>
> How do I tell that script to use ahci-hd ?
>
>
You just need to replace virtio-blk by ahci-hd in the script, like:

sed 's/virtio-blk/ahci-hd/g' /usr/share/examples/bhyve/vmrun.sh > ~/vmrun-ahci.sh

And run ~/vmrun-ahci.sh in its place.

Re: -current host, 10.1 client loops

2016-03-19 Thread Neel Natu
On Thu, Mar 17, 2016 at 12:38 AM, Poul-Henning Kamp  wrote:
> 
> In message 
> , Olivier Cochard-Labbé writes:
>
>
>>> How do I tell that script to use ahci-hd ?
>>>
>>>
>>You just need to replace virtio-blk by ahci-hd in the script, like:
>
> Ok, it also hung with ahci-hd and two disks...
>
> Just to make sure I have asked about this:  Are there any special
> params needed to run a 10.1-R guest ?
>

https://www.freebsd.org/releases/10.1R/errata.html#open-issues

Can you try to reproduce after setting 'vfs.unmapped_buf_allowed=0' at
the guest loader prompt?

The issue doesn't mention bhyve specifically but given that it's a
simple workaround it should be worth a try.
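
For reference, a sketch of how the workaround could be applied.
vfs.unmapped_buf_allowed is the tunable named above; the loader
commands and the loader.conf line in the comments are the usual ways
of setting a loader tunable, and are an assumption about how it would
be done for this guest rather than something taken from the errata.

```shell
# At the guest's loader prompt, for a single boot:
#   set vfs.unmapped_buf_allowed=0
#   boot
#
# Or persistently, in the guest's /boot/loader.conf:
#   vfs.unmapped_buf_allowed="0"
#
# After boot, check inside the guest that the value took effect:
sysctl vfs.unmapped_buf_allowed
```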

best
Neel

> --
> Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
> p...@freebsd.org | TCP/IP since RFC 956
> FreeBSD committer   | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.


Re: -current host, 10.1 client loops

2016-03-19 Thread Poul-Henning Kamp

In message 
, Neel Natu writes:

I tried an i386 -current guest, and it hung, both cores spinning
during a buildworld.

This should make it pretty easy for somebody else to reproduce.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: -current host, 10.1 client loops

2016-03-19 Thread Poul-Henning Kamp

In message 
, Neel Natu writes:

>https://www.freebsd.org/releases/10.1R/errata.html#open-issues
>
>Can you try to reproduce after setting 'vfs.unmapped_buf_allowed=0' at
>the guest loader prompt?
>
>The issue doesn't mention bhyve specifically but given that it's a
>simple workaround it should be worth a try.

That may have helped; at least I got all the way through a nanobsd
build this time.

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: -current host, 10.1 client loops

2016-03-19 Thread Poul-Henning Kamp

In message 
, Olivier Cochard-Labbé writes:


>> How do I tell that script to use ahci-hd ?
>>
>>
>You just need to replace virtio-blk by ahci-hd in the script, like:

Ok, it also hung with ahci-hd and two disks...

Just to make sure I have asked about this:  Are there any special
params needed to run a 10.1-R guest ?

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: -current host, 10.1 client loops

2016-03-15 Thread Poul-Henning Kamp

In message 
, Neel Natu writes:

>Also, if it is possible to reproduce with a single vcpu then it will
>help when analyzing the output of ktrdump.

Well, on the gut feeling that it might help I gave it a '-c 1', and
that seems to prevent the problem from happening, so I could get my
builds done.

So some kind of synchronization issue, maybe?

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: -current host, 10.1 client loops

2016-03-15 Thread Neel Natu
Hi,

On Tue, Mar 15, 2016 at 7:36 AM, Poul-Henning Kamp  wrote:
> I'm seeing bhyve go into some kind of endless loop while trying to
> compile the gcc port on 10.1 as guest.
>
> In one case CTRL-T on the console kept working, but showed an rm(1)
> process racking up CPU time.
>
> Are there known bogons in current/bhyve, or in using 10.1-R/i386 as
> a guest, that I have not spotted?
>
> How does one debug stuff like this?
>

I usually do the following things to get a sense for what might be going on:

- "top -H" on the host to figure out if guest vcpu threads are spinning
- "top -H" inside the guest if possible
- "bhyvectl --vm vmname --cpu vcpuid --get-stats" on the host

If there is nothing obvious from the above then I will recompile the
host kernel with KTR enabled (vmm.ko has detailed tracing at KTR_GEN).
This is usually very helpful to understand what might be going on.

Also, if it is possible to reproduce with a single vcpu then it will
help when analyzing the output of ktrdump.
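
As a rough sketch of the bhyvectl step above: take the per-vCPU stats
twice and diff them, so only the counters that moved are shown. The
function name stats_delta, the BHYVECTL override variable, and the VM
name in the usage comment are illustrative assumptions; only the
--vm, --cpu and --get-stats flags come from the list above.

```shell
# Command to invoke; overridable so the logic can be tried without a VM.
BHYVECTL=${BHYVECTL:-bhyvectl}

# stats_delta VM VCPU [INTERVAL]
# Snapshot the vCPU stats, wait INTERVAL seconds, snapshot again,
# and print a diff of the two snapshots.
stats_delta() {
    vm=$1
    cpu=$2
    interval=${3:-5}

    before=$(mktemp) || return 1
    after=$(mktemp) || { rm -f "$before"; return 1; }

    "$BHYVECTL" --vm "$vm" --cpu "$cpu" --get-stats > "$before"
    sleep "$interval"
    "$BHYVECTL" --vm "$vm" --cpu "$cpu" --get-stats > "$after"

    # diff exits non-zero when the snapshots differ; that is expected here.
    diff "$before" "$after" || true
    rm -f "$before" "$after"
}

# Hypothetical usage: counters that changed on vCPU 1 of "vm0" over 10 s.
#   stats_delta vm0 1 10
```

Counters that climb quickly while the guest appears hung may hint at
which kind of VM exit the spinning vCPU is taking.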

best
Neel

> Poul-Henning
>
> Bhyve started with:
>
> sh /usr/share/examples/bhyve/vmrun.sh \
> -m 1G \
> -t ${VMN} \
> -d ${P}.root.dd \
> -d ${P}.swap.dd \
> -d ${P}.tami_install.dd \
> -d ${P}.tami_git.dd \
> vm${VMU} || true
>
> Host:
> 11.0-CURRENT #4 r296808: Sun Mar 13 22:39:59 UTC 2016
>
> CPU: AMD Athlon(tm) II X3 455 Processor (3311.18-MHz K8-class CPU)
>   Origin="AuthenticAMD"  Id=0x100f53  Family=0x10  Model=0x5  Stepping=3
>   Features=0x178bfbff
>   Features2=0x802009
>   AMD Features=0xee500800
>   AMD Features2=0x837ff
>   SVM: NP,NRIP,NAsids=64
>   TSC: P-state invariant
> real memory  = 17179869184 (16384 MB)
> avail memory = 16573935616 (15806 MB)
> Event timer "LAPIC" quality 400
> ACPI APIC Table: <090712 APIC1033>
> FreeBSD/SMP: Multiprocessor System Detected: 3 CPUs
> FreeBSD/SMP: 1 package(s) x 3 core(s)
>  cpu0 (BSP): APIC ID:  0
>  cpu1 (AP): APIC ID:  1
>  cpu2 (AP): APIC ID:  2
> ACPI BIOS Warning (bug): 32/64X length mismatch in FADT/Gpe0Block: 64/32 (20150818/tbfadt-649)
>
>
> Guest:
> 10.1-RELEASE i386
>
>
>
>
> --
> Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
> p...@freebsd.org | TCP/IP since RFC 956
> FreeBSD committer   | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.


Re: -current host, 10.1 client loops

2016-03-15 Thread Neel Natu
Hi,

On Tue, Mar 15, 2016 at 2:30 PM, Poul-Henning Kamp  wrote:
> 
> In message 
> 
> , Neel Natu writes:
>
>>Also, if it is possible to reproduce with a single vcpu then it will
>>help when analyzing the output of ktrdump.
>
> Well, on the gut feeling that it might help I gave it a '-c 1', and
> that seems to prevent the problem from happening, so I could get my
> builds done.
>
> So some kind of synchronization issue, maybe?
>

Yes, that might be possible.

A couple of suggestions to narrow it down further:
- try to reproduce with 2 or 4 vcpus (3 cpus is probably not a well
tested configuration)
- try to reproduce with the ahci-hd device emulation instead of virtio-blk.

If this problem is reproducible then I am happy to work with you to
get to the bottom of this.

best
Neel

> --
> Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
> p...@freebsd.org | TCP/IP since RFC 956
> FreeBSD committer   | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.