Re: Bhyve Broken: whose fault (AMD, FreeBSD, ZFS ...?)

2017-07-21 Thread Peter Grehan

On 7/21/17 5:54 PM, Peter Grehan wrote:

remember that the quoted problem is spit out by the guest, not the host.

That said, the 'top' line on the frozen bhyve was:

29380 root  22  20  0  1060M   928M kqread  5 218:32 399.30% bhyve


... indicating that the bhyve had almost all its memory... and the
system also had 500M free when I checked it.


  Are you using a "-j" option to buildworld in the guest? You'd need at 
least 1G for each vCPU when doing parallel builds (and preferably a bit 
more) or you will swap heavily.


 I should add: I'm not saying this will fix the problem, but if it is 
an issue with the guest swapping, giving the guest more RAM is a workaround.
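
The sizing rule above can be sketched as a small calculation. This is only an illustration and not part of the original message: the function name `min_guest_ram_gib` and its parameters are hypothetical, and it assumes the "-j" value equals the number of vCPUs.

```python
def min_guest_ram_gib(vcpus, gib_per_vcpu=1.0, headroom_gib=0.0):
    """Rough lower bound on guest RAM (GiB) for a parallel buildworld.

    Applies the "at least 1G for each vCPU" rule of thumb; headroom_gib
    stands in for the "preferably a bit more" part (assumed value, tune it).
    """
    if vcpus < 1:
        raise ValueError("a guest needs at least one vCPU")
    return vcpus * gib_per_vcpu + headroom_gib


if __name__ == "__main__":
    # Hypothetical check for a 4-vCPU guest.
    print(min_guest_ram_gib(4))
```

For the 4-vCPU guest described in this thread, that gives a floor of 4 GiB, compared with the 1G the guest was started with.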


later,

Peter.
___
freebsd-virtualization@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to 
"freebsd-virtualization-unsubscr...@freebsd.org"


Re: Bhyve Broken: whose fault (AMD, FreeBSD, ZFS ...?)

2017-07-21 Thread Zaphod Beeblebrox
remember that the quoted problem is spit out by the guest, not the host.

That said, the 'top' line on the frozen bhyve was:

29380 root  22  20  0  1060M   928M kqread  5 218:32 399.30% bhyve

... indicating that the bhyve had almost all its memory... and the system
also had 500M free when I checked it.


On Fri, Jul 21, 2017 at 7:03 PM, Peter Grehan wrote:

>> oh ... and ... the console spit out:
>>
>> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 4522, size: 8192
>>
>> (swap is on a separate zVol).
>>
>
>  Ok - you may have hit a separate issue. Is ZFS ARC limited on your setup?
> If bhyve and ZFS (and other consumers) end up fighting for memory,
> everyone loses :( A general rule of thumb is to limit ARC to less than the
> bhyve VM usage + a few additional gig for the base system.
>
>  The FreeBSD default of giving ZFS all memory minus 1GB doesn't work too
> well when running VMs.
>
> later,
>
> Peter.
>


Re: Bhyve Broken: whose fault (AMD, FreeBSD, ZFS ...?)

2017-07-21 Thread Zaphod Beeblebrox
I tried the -p 0:1 -p 1:2 -p 2:3 -p 3:4 bit.  If I may say, it felt a
little "chunky" ... but that could have just been a perception.

Anyways... still hung the guest.

On Fri, Jul 21, 2017 at 6:58 PM, Peter Grehan wrote:

> Hi,
>
>> What should be next steps here?  This is repeatable.  The host is stable
>> (it can makeworld -j32 in about 25 minutes ... so its hardware seems
>> good).  Is this an AMD bug?  Is it bad to use ZFS ZVols?
>>
>
>  ZVols are fine. Is the guest panic a spinlock timeout?
>
>  I believe this is a bug in bhyve/SVM. It appears somewhat related to
> processor speed (I can't repro on a 2.3GHz 8 CPU Opteron 6320, but can hit
> it after 15 mins or so on a Ryzen 1700, with/without SMT).
>
>  Anish and I are currently chasing this and have repros. An experiment you
> could try is to run with the vCPUs pinned i.e. for a 4 vCPU guest, add the
> options "-p 0:1 -p 1:2 -p 2:3 -p 3:4".
>
> later,
>
> Peter.
>


Re: Bhyve Broken: whose fault (AMD, FreeBSD, ZFS ...?)

2017-07-21 Thread Peter Grehan

oh ... and ... the console spit out:

swap_pager: indefinite wait buffer: bufobj: 0, blkno: 4522, size: 8192

(swap is on a separate zVol).


 Ok - you may have hit a separate issue. Is ZFS ARC limited on your 
setup? If bhyve and ZFS (and other consumers) end up fighting for 
memory, everyone loses :( A general rule of thumb is to limit ARC to 
less than the bhyve VM usage + a few additional gig for the base system.


 The FreeBSD default of giving ZFS all memory minus 1GB doesn't work 
too well when running VMs.
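
One possible reading of the rule of thumb above is to cap the ARC so that the memory given to bhyve guests, plus a few GiB for the base system, always stays outside it. The sketch below follows that reading only. The helper names are hypothetical, the 4 GiB base reserve is an assumed stand-in for "a few additional gig", and the result is an upper bound rather than a recommended value. `vfs.zfs.arc_max` is the FreeBSD tunable that limits ARC size.

```python
GIB = 1024 ** 3


def arc_max_bytes(host_ram_gib, vm_ram_gib, base_reserve_gib=4):
    """Upper bound for the ZFS ARC, in bytes.

    Leaves vm_ram_gib for bhyve guests and base_reserve_gib (an assumed
    figure) for the rest of the host.
    """
    limit_gib = host_ram_gib - vm_ram_gib - base_reserve_gib
    if limit_gib <= 0:
        raise ValueError("no memory left for the ARC with these reservations")
    return int(limit_gib * GIB)


def loader_conf_line(limit_bytes):
    """Format the limit as a line for /boot/loader.conf."""
    return 'vfs.zfs.arc_max="{}"'.format(limit_bytes)


if __name__ == "__main__":
    # 32G host as in this thread; a 4 GiB guest is an assumed example size.
    print(loader_conf_line(arc_max_bytes(32, 4)))
```

Keeping the reserve as an explicit parameter makes it easy to leave more room when several guests or other memory-hungry services share the host.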


later,

Peter.


Re: Bhyve Broken: whose fault (AMD, FreeBSD, ZFS ...?)

2017-07-21 Thread Zaphod Beeblebrox
oh ... and ... the console spit out:

swap_pager: indefinite wait buffer: bufobj: 0, blkno: 4522, size: 8192

(swap is on a separate zVol).

On Fri, Jul 21, 2017 at 6:59 PM, Zaphod Beeblebrox wrote:

> ... curiously, top running on the guest reveals (the point at which the
> bhyve wedges):
>
> 88722 root  1  52  0   109M   105M pfault  3   0:03  42.37% llvm-tblgen
> 88687 root  1  52  0   374M   347M pfault  2   0:04  38.24% llvm-tblgen
> 88668 root  1  52  0   236M   225M pfault  0   0:04  35.11% llvm-tblgen
> 88743 root  1  52  0 55460K 26392K pfault  3   0:00  10.23% cc
>
> ... where top on the host just shows bhyve 100% busy on 4 threads.
>
>
> On Fri, Jul 21, 2017 at 6:43 PM, Zaphod Beeblebrox wrote:
>
>> Since I found out that I can't run a Samba directory server in a jail,
>> I've had the setup of a bhyve on my list.  I had toyed with Bhyve 6 or 8
>> months ago, and still had the images, so I zfs cloned one and set about a
>> source upgrade.
>>
>> This ignominiously hung.
>>
>> So... I upgraded the host to 11.1-RC3, and I reinstalled a fresh guest
>> from the 11.1-RC3 install CD.  The guest uses UFS2 on a 40G disk, the
>> server is an AMD 9590 with 32G RAM and a 40T ZFS array.
>>
>> After installation, I started the guest again with 1G ram, 4 processors
>> (of the 8 on the source CPU) and tried a buildworld again.  This time the
>> guest crashed and rebooted.
>>
>> What should be next steps here?  This is repeatable.  The host is stable
>> (it can makeworld -j32 in about 25 minutes ... so its hardware seems
>> good).  Is this an AMD bug?  Is it bad to use ZFS ZVols?
>>
>
>


Re: Bhyve Broken: whose fault (AMD, FreeBSD, ZFS ...?)

2017-07-21 Thread Zaphod Beeblebrox
... curiously, top running on the guest reveals (the point at which the
bhyve wedges):

88722 root  1  52  0   109M   105M pfault  3   0:03  42.37% llvm-tblgen
88687 root  1  52  0   374M   347M pfault  2   0:04  38.24% llvm-tblgen
88668 root  1  52  0   236M   225M pfault  0   0:04  35.11% llvm-tblgen
88743 root  1  52  0 55460K 26392K pfault  3   0:00  10.23% cc

... where top on the host just shows bhyve 100% busy on 4 threads.


On Fri, Jul 21, 2017 at 6:43 PM, Zaphod Beeblebrox wrote:

> Since I found out that I can't run a Samba directory server in a jail,
> I've had the setup of a bhyve on my list.  I had toyed with Bhyve 6 or 8
> months ago, and still had the images, so I zfs cloned one and set about a
> source upgrade.
>
> This ignominiously hung.
>
> So... I upgraded the host to 11.1-RC3, and I reinstalled a fresh guest
> from the 11.1-RC3 install CD.  The guest uses UFS2 on a 40G disk, the
> server is an AMD 9590 with 32G RAM and a 40T ZFS array.
>
> After installation, I started the guest again with 1G ram, 4 processors
> (of the 8 on the source CPU) and tried a buildworld again.  This time the
> guest crashed and rebooted.
>
> What should be next steps here?  This is repeatable.  The host is stable
> (it can makeworld -j32 in about 25 minutes ... so its hardware seems
> good).  Is this an AMD bug?  Is it bad to use ZFS ZVols?
>


Re: Bhyve Broken: whose fault (AMD, FreeBSD, ZFS ...?)

2017-07-21 Thread Peter Grehan

Hi,


What should be next steps here?  This is repeatable.  The host is stable
(it can makeworld -j32 in about 25 minutes ... so its hardware seems
good).  Is this an AMD bug?  Is it bad to use ZFS ZVols?


 ZVols are fine. Is the guest panic a spinlock timeout?

 I believe this is a bug in bhyve/SVM. It appears somewhat related to 
processor speed (I can't repro on a 2.3GHz 8 CPU Opteron 6320, but can 
hit it after 15 mins or so on a Ryzen 1700, with/without SMT).


 Anish and I are currently chasing this and have repros. An experiment 
you could try is to run with the vCPUs pinned i.e. for a 4 vCPU guest, 
add the options "-p 0:1 -p 1:2 -p 2:3 -p 3:4".
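
The pinning options quoted above follow a simple pattern: vCPU n is pinned to host CPU n+1, using bhyve's "-p vcpu:hostcpu" option. A minimal sketch that generates the same string for any vCPU count is below. The function name `pin_options` and the `first_host_cpu` parameter are hypothetical, and the code only builds the argument string; it does not run bhyve.

```python
def pin_options(vcpus, first_host_cpu=1):
    """Build bhyve "-p vcpu:hostcpu" options, one per vCPU.

    vCPU n is pinned to host CPU first_host_cpu + n, which matches the
    pattern in the example from the message above.
    """
    if vcpus < 1:
        raise ValueError("a guest needs at least one vCPU")
    return " ".join(
        "-p {}:{}".format(vcpu, first_host_cpu + vcpu) for vcpu in range(vcpus)
    )


if __name__ == "__main__":
    # Reproduces the options suggested for a 4 vCPU guest.
    print(pin_options(4))
```

Starting at host CPU 1 simply mirrors the example; whether leaving host CPU 0 unpinned matters for this bug is not stated in the thread.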


later,

Peter.