Re: [libvirt] [BUG] mlock support breakage

2017-03-22 Thread Luiz Capitulino
On Wed, 15 Mar 2017 10:11:50 +0100
Andrea Bolognani  wrote:

> On Wed, 2017-03-15 at 08:59 +0100, Jiri Denemark wrote:
> > > > Removing all memory locking limits should be something that
> > > > admins very carefully opt into, because of the potential
> > > > host DoS consequences. Certainly not the default.  
> > > 
> > > There's no opt-in with <locked/>, it is mandatory to increase
> > > the mlock limit. Asking users to do this themselves is only
> > > adding an extra step that's causing breakage right now.  
> > 
> > ... we could consider <locked/> to be the explicit request for
> > setting an infinite memory locking limit and letting users set a lower
> > limit with hard_limit if they want.
> > 
> > There are several other cases in which memory is locked implicitly and
> > we should definitely not set the unlimited default for them.  
> 
> That would still be suboptimal because the risk involved in
> allowing QEMU to allocate unlimited amounts of locked memory
> might not be immediately apparent to the user, but at least
> we would have an explicit opt-in (the presence of the
> <locked/> element) and we would not lose the ability to set
> the limit explicitly to a lower value, so it sounds like a
> decent enough balance.
> 
> Anyone opposed to implementing it this way?

Ping?

Re: [libvirt] [BUG] mlock support breakage

2017-03-17 Thread Luiz Capitulino
On Wed, 15 Mar 2017 14:29:13 -0400
Luiz Capitulino  wrote:

> > ... we could consider <locked/> to be the explicit request for
> > setting an infinite memory locking limit and letting users set a lower
> > limit with hard_limit if they want.  
> 
> That's exactly how I see it! It seems we're in total agreement.
> 
> Now, something has just occurred to me: shouldn't VFIO have
> the same problem? It's the same hard limit that's set.

I took a look at this today. While it's the same mlock limit
that's set and while QEMU's allocations can surpass that limit,
I didn't get a crash when using VFIO. The most likely explanation
is that VFIO only mlocks a small region, although I could not
find where this is done in QEMU.

In that case, VFIO is not affected. This issue is specific to
<locked/>, where the 1GB limit set by libvirt conflicts with
QEMU's memory needs.
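For reference, one way to check what is actually being locked on a
given setup is to look at the QEMU process from the host, e.g. (the
binary name here is just an example, adjust to whatever the guest
runs as):

  # prlimit --memlock --pid $(pidof qemu-system-x86_64)
  # grep VmLck /proc/$(pidof qemu-system-x86_64)/status

If VmLck stays small compared to guest RAM, that would be consistent
with the theory above.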

Re: [libvirt] [BUG] mlock support breakage

2017-03-16 Thread Luiz Capitulino
On Wed, 15 Mar 2017 08:59:34 +0100
Jiri Denemark  wrote:

> On Tue, Mar 14, 2017 at 15:54:25 -0400, Luiz Capitulino wrote:
> > On Tue, 14 Mar 2017 20:28:12 +0100
> > Andrea Bolognani  wrote:
> >   
> > > On Tue, 2017-03-14 at 14:54 -0400, Luiz Capitulino wrote:  
> > > > > It's unfortunate that the current, buggy behavior made
> > > > > it look like you didn't necessarily have to worry about
> > > > > this. If we fix it, existing guests will fail to start
> > > > > right away instead of possibly crashing in the future:
> > > > > while that's going to be very annoying in the short run,
> > > > 
> > > > It breaks existing guests, so it's beyond annoying.  
> 
> Because they were relying on a bug in our code rather than following the
> documentation.

The documentation was strictly followed: it strongly advises
against using <hard_limit>, we did just that, and everything worked
as expected.

I don't think it's a bug that things just worked. That's how
it's supposed to be and that's how it is for VFIO support.

The bug is that it's not working now.

> > > Existing guests are already broken, it's just that the
> > > breakage has not hit them yet ;)  
> > 
> > We should prevent that from happening.  
> 
> The only way we could achieve this is to set the memory locking limit to
> "unlimited" by default; computing the limit is impossible unless the
> result is bigger than the host's memory, which is effectively equivalent
> to "unlimited".

I agree. I also suggested setting the mlock limit to infinity when
libvirt sees <locked/> in the XML.

> > > > > Luiz mentioned the fact that you can't set the memory
> > > > > locking limit to "unlimited" with the current <hard_limit>
> > > > > element: that's something we can, and should, address.
> > > > > With that implemented, the administrator will have full
> > > > > control on the memory limit and will be able to implement
> > > > > the policy that best suits the use case at hand.
> > > > 
> > > > Asking <locked/> users to set <hard_limit> to "unlimited"
> > > > is a much worse solution than automatically setting the
> > > > memory lock limit to infinity in libvirt, for the reasons
> > > > I outlined in my first email.  
> 
> Memory locking limit and memory hard limit are separate things even
> though they can be set by a single value in domain XML. Hard limit is by
> default unlimited and a domain will just be killed if it consumes too
> much memory. However setting memory locking limit to unlimited may cause
> an OOM condition inside the host kernel, which is why it is not and
> shouldn't be the default value. However, ...

Well, I was wondering if the OOM killer (and Linux's memory reclaim
logic) would work the same way (except for swapping pages, of
course) for processes with an infinite mlock limit. If the answer
is yes, then a malicious guest trying a DoS attack would only
be partially successful.

> > > Removing all memory locking limits should be something that
> > > admins very carefully opt into, because of the potential
> > > host DoS consequences. Certainly not the default.  
> > 
> > There's no opt-in with <locked/>, it is mandatory to increase
> > the mlock limit. Asking users to do this themselves is only
> > adding an extra step that's causing breakage right now.  
> 
> ... we could consider <locked/> to be the explicit request for
> setting an infinite memory locking limit and letting users set a lower
> limit with hard_limit if they want.

That's exactly how I see it! It seems we're in total agreement.

Now, something has just occurred to me: shouldn't VFIO have
the same problem? It's the same hard limit that's set.

Re: [libvirt] [BUG] mlock support breakage

2017-03-15 Thread Andrea Bolognani
On Wed, 2017-03-15 at 08:59 +0100, Jiri Denemark wrote:
> > > Removing all memory locking limits should be something that
> > > admins very carefully opt into, because of the potential
> > > host DoS consequences. Certainly not the default.
> > 
> > There's no opt-in with <locked/>, it is mandatory to increase
> > the mlock limit. Asking users to do this themselves is only
> > adding an extra step that's causing breakage right now.
> 
> ... we could consider <locked/> to be the explicit request for
> setting an infinite memory locking limit and letting users set a lower
> limit with hard_limit if they want.
> 
> There are several other cases in which memory is locked implicitly and
> we should definitely not set the unlimited default for them.

That would still be suboptimal because the risk involved in
allowing QEMU to allocate unlimited amounts of locked memory
might not be immediately apparent to the user, but at least
we would have an explicit opt-in (the presence of the
<locked/> element) and we would not lose the ability to set
the limit explicitly to a lower value, so it sounds like a
decent enough balance.

Anyone opposed to implementing it this way?


PS: Luiz, please check your MUA. It's messing with the
recipients in a way that makes me surprised messages
managed to get to the list at all.
-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [libvirt] [BUG] mlock support breakage

2017-03-15 Thread Jiri Denemark
On Tue, Mar 14, 2017 at 15:54:25 -0400, Luiz Capitulino wrote:
> On Tue, 14 Mar 2017 20:28:12 +0100
> Andrea Bolognani  wrote:
> 
> > On Tue, 2017-03-14 at 14:54 -0400, Luiz Capitulino wrote:
> > > > It's unfortunate that the current, buggy behavior made
> > > > it look like you didn't necessarily have to worry about
> > > > this. If we fix it, existing guests will fail to start
> > > > right away instead of possibly crashing in the future:
> > > > while that's going to be very annoying in the short run,  
> > > 
> > > It breaks existing guests, so it's beyond annoying.

Because they were relying on a bug in our code rather than following the
documentation.

> > Existing guests are already broken, it's just that the
> > breakage has not hit them yet ;)
> 
> We should prevent that from happening.

The only way we could achieve this is to set the memory locking limit to
"unlimited" by default; computing the limit is impossible unless the
result is bigger than the host's memory, which is effectively equivalent
to "unlimited".

> > > > Luiz mentioned the fact that you can't set the memory
> > > > locking limit to "unlimited" with the current <hard_limit>
> > > > element: that's something we can, and should, address.
> > > > With that implemented, the administrator will have full
> > > > control on the memory limit and will be able to implement
> > > > the policy that best suits the use case at hand.  
> > > 
> > > Asking <locked/> users to set <hard_limit> to "unlimited"
> > > is a much worse solution than automatically setting the
> > > memory lock limit to infinity in libvirt, for the reasons
> > > I outlined in my first email.

Memory locking limit and memory hard limit are separate things even
though they can be set by a single value in domain XML. Hard limit is by
default unlimited and a domain will just be killed if it consumes too
much memory. However setting memory locking limit to unlimited may cause
an OOM condition inside the host kernel, which is why it is not and
shouldn't be the default value. However, ...

> > Removing all memory locking limits should be something that
> > admins very carefully opt into, because of the potential
> > host DoS consequences. Certainly not the default.
> 
> There's no opt-in with <locked/>, it is mandatory to increase
> the mlock limit. Asking users to do this themselves is only
> adding an extra step that's causing breakage right now.

... we could consider <locked/> to be the explicit request for
setting an infinite memory locking limit and letting users set a lower
limit with hard_limit if they want.

There are several other cases in which memory is locked implicitly and
we should definitely not set the unlimited default for them.
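To make the proposal concrete, a domain opting in would look roughly
like this (a sketch, sizes made up):

  <memoryBacking>
    <locked/>
  </memoryBacking>
  <memtune>
    <hard_limit unit='GiB'>10</hard_limit>
  </memtune>

With only <locked/> present libvirt would raise the memory locking
limit to unlimited; adding <hard_limit> would cap both the locking
and the overall memory limit at the given value.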

Jirka

Re: [libvirt] [BUG] mlock support breakage

2017-03-14 Thread Luiz Capitulino
On Tue, 14 Mar 2017 20:28:12 +0100
Andrea Bolognani  wrote:

> On Tue, 2017-03-14 at 14:54 -0400, Luiz Capitulino wrote:
> > > It's unfortunate that the current, buggy behavior made
> > > it look like you didn't necessarily have to worry about
> > > this. If we fix it, existing guests will fail to start
> > > right away instead of possibly crashing in the future:
> > > while that's going to be very annoying in the short run,  
> > 
> > It breaks existing guests, so it's beyond annoying.  
> 
> Existing guests are already broken, it's just that the
> breakage has not hit them yet ;)

We should prevent that from happening.

> > > Luiz mentioned the fact that you can't set the memory
> > > locking limit to "unlimited" with the current <hard_limit>
> > > element: that's something we can, and should, address.
> > > With that implemented, the administrator will have full
> > > control on the memory limit and will be able to implement
> > > the policy that best suits the use case at hand.  
> > 
> > Asking <locked/> users to set <hard_limit> to "unlimited"
> > is a much worse solution than automatically setting the
> > memory lock limit to infinity in libvirt, for the reasons
> > I outlined in my first email.  
> 
> Removing all memory locking limits should be something that
> admins very carefully opt into, because of the potential
> host DoS consequences. Certainly not the default.

There's no opt-in with <locked/>, it is mandatory to increase
the mlock limit. Asking users to do this themselves is only
adding an extra step that's causing breakage right now.

> That's the same with /etc/security/limits.conf, where the
> default memory locking limit is extremely low (64 KiB) and
> the admin can decide to raise it or even remove it entirely
> if needed.

But that's a bad example; we have to help our users, not
contribute to making their lives miserable.

Users want to use <locked/> without having to guess limits
that we can't figure out ourselves.

Re: [libvirt] [BUG] mlock support breakage

2017-03-14 Thread Andrea Bolognani
On Tue, 2017-03-14 at 14:54 -0400, Luiz Capitulino wrote:
> > It's unfortunate that the current, buggy behavior made
> > it look like you didn't necessarily have to worry about
> > this. If we fix it, existing guests will fail to start
> > right away instead of possibly crashing in the future:
> > while that's going to be very annoying in the short run,
> 
> It breaks existing guests, so it's beyond annoying.

Existing guests are already broken, it's just that the
breakage has not hit them yet ;)

> > Luiz mentioned the fact that you can't set the memory
> > locking limit to "unlimited" with the current <hard_limit>
> > element: that's something we can, and should, address.
> > With that implemented, the administrator will have full
> > control on the memory limit and will be able to implement
> > the policy that best suits the use case at hand.
> 
> Asking <locked/> users to set <hard_limit> to "unlimited"
> is a much worse solution than automatically setting the
> memory lock limit to infinity in libvirt, for the reasons
> I outlined in my first email.

Removing all memory locking limits should be something that
admins very carefully opt into, because of the potential
host DoS consequences. Certainly not the default.

That's the same with /etc/security/limits.conf, where the
default memory locking limit is extremely low (64 KiB) and
the admin can decide to raise it or even remove it entirely
if needed.
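For comparison, raising or removing that limit is a one-line change
there, e.g. (pam_limits syntax, group name made up):

  @kvm    hard    memlock    unlimited

though, as noted elsewhere in this thread, that only applies to PAM
sessions and so doesn't affect QEMU processes spawned by libvirtd.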

> PS: Still, we should have "unlimited" support for <hard_limit>

Definitely.

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [libvirt] [BUG] mlock support breakage

2017-03-14 Thread Luiz Capitulino
On Tue, 14 Mar 2017 19:40:38 +0100
Andrea Bolognani  wrote:

> It's unfortunate that the current, buggy behavior made
> it look like you didn't necessarily have to worry about
> this. If we fix it, existing guests will fail to start
> right away instead of possibly crashing in the future:
> while that's going to be very annoying in the short run,

It breaks existing guests, so it's beyond annoying.

> it's arguably better than misleading people into thinking their guests
> will be fine in the long run while in reality we can't
> provide such a guarantee.
> 
> Luiz mentioned the fact that you can't set the memory
> locking limit to "unlimited" with the current <hard_limit>
> element: that's something we can, and should, address.
> With that implemented, the administrator will have full
> control on the memory limit and will be able to implement
> the policy that best suits the use case at hand.

Asking <locked/> users to set <hard_limit> to "unlimited"
is a much worse solution than automatically setting the
memory lock limit to infinity in libvirt, for the reasons
I outlined in my first email.

PS: Still, we should have "unlimited" support for <hard_limit>

Re: [libvirt] [BUG] mlock support breakage

2017-03-14 Thread Andrea Bolognani
On Tue, 2017-03-14 at 15:55 +0100, Peter Krempa wrote:
> > NB if we did enforce $RAM + $LARGE_NUMBER, then I'd suggest we did
> > set a default hard_limit universally once more, not only set a mlock
> 
> We were already attempting to do this and we reverted it since there's
> no deterministic $LARGE_NUMBER which would work for all cases.
> 
> I can imagine this working if $LARGE_NUMBER is infinity or host
> memory+swap size.
> 
> commit 16bcb3b61675a88bff00317336b9610080c31000
> Author: Michal Privoznik 
> Date:   Fri Aug 9 14:46:54 2013 +0200
> 
> qemu: Drop qemuDomainMemoryLimit
> 
> This function is to guess the correct limit for maximal memory
> usage by qemu for given domain. This can never be guessed
> correctly, not to mention all the pains and sleepless nights this
> code has caused. Once somebody discovers algorithm to solve the
> Halting Problem, we can compute the limit algorithmically. But
> till then, this code should never see the light of the release
> again.

Right.

Even the current accidental limit, guest memory + 1 GiB,
doesn't give any real guarantee about the guests working
in the future: a simple QEMU upgrade could be enough to
push memory usage past the current allowance and cause
them to crash. Peter mentioned that just adding (a lot)
more disks could actually be enough.

Even for the host admin, setting the memory limits
appropriately will always be a balancing act between
making sure the host can swap out guests when pressured
for memory and making sure the guests themselves can
allocate the memory they need. Different use cases will
call for different balances, and since there is no
one-size-fits-all solution we shouldn't pretend that
this complexity doesn't exist and hide it from the
administrator.

It's unfortunate that the current, buggy behavior made
it look like you didn't necessarily have to worry about
this. If we fix it, existing guests will fail to start
right away instead of possibly crashing in the future:
while that's going to be very annoying in the short run,
it's arguably better than misleading people into thinking their guests
will be fine in the long run while in reality we can't
provide such a guarantee.

Luiz mentioned the fact that you can't set the memory
locking limit to "unlimited" with the current <hard_limit>
element: that's something we can, and should, address.
With that implemented, the administrator will have full
control on the memory limit and will be able to implement
the policy that best suits the use case at hand.

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [libvirt] [BUG] mlock support breakage

2017-03-14 Thread Peter Krempa
On Mon, Mar 13, 2017 at 18:16:49 +, Daniel Berrange wrote:
> On Mon, Mar 13, 2017 at 02:08:30PM -0400, Luiz Capitulino wrote:
> > On Mon, 13 Mar 2017 13:53:33 -0400
> > Luiz Capitulino  wrote:
> > 
> > > OK, you're right. I personally don't like that we're putting a random cap
> > > on QEMU memory allocations, but if it's large enough it shouldn't be
> > > a problem (I hope).
> > 
> > The "I hope" part meaning: if we do find legitimate reasons for QEMU's
> > address space to go beyond $LARGE_NUMBER, it will mean guests
> > randomly crashing when using <locked/>.
> 
> NB if we did enforce $RAM + $LARGE_NUMBER, then I'd suggest we did
> set a default hard_limit universally once more, not only set a mlock

We were already attempting to do this and we reverted it since there's
no deterministic $LARGE_NUMBER which would work for all cases.

I can imagine this working if $LARGE_NUMBER is infinity or host
memory+swap size.

commit 16bcb3b61675a88bff00317336b9610080c31000
Author: Michal Privoznik 
Date:   Fri Aug 9 14:46:54 2013 +0200

qemu: Drop qemuDomainMemoryLimit

This function is to guess the correct limit for maximal memory
usage by qemu for given domain. This can never be guessed
correctly, not to mention all the pains and sleepless nights this
code has caused. Once somebody discovers algorithm to solve the
Halting Problem, we can compute the limit algorithmically. But
till then, this code should never see the light of the release
again.



Re: [libvirt] [BUG] mlock support breakage

2017-03-14 Thread Andrea Bolognani
On Mon, 2017-03-13 at 14:24 -0400, Luiz Capitulino wrote:
> > NB if we did enforce $RAM + $LARGE_NUMBER, then I'd suggest we did
> > set a default hard_limit universally once more, not only set a mlock
> > limit when using <locked/>. This would at least ensure we see consistent
> > (bad) behaviour rather than have edge cases that only appeared when
> > <locked/> was present.

Setting <hard_limit> doesn't limit just the amount of memory
the QEMU process is allowed to lock, but also the amount of
memory it's allowed to allocate at all (through cgroups).

Is it fair to assume that QEMU will never *allocate* more
than 2 GiB (or whatever $LARGE_NUMBER we end up picking) in
addition to what's needed to fit the guest memory? I would
certainly hope so, but maybe there are use cases that require
more than that.

Were you thinking about adding <hard_limit> automatically to
all guests where it's not already present, or just setting
the limits silently? The latter sounds like it would be very
opaque to users, so I'm guessing the former.

If we added <hard_limit> to the XML automatically, we would
have the problem of keeping it updated with changes to the
amount of guest memory... Or maybe we could expect users to
remove the <hard_limit> element, to get it regenerated, every
time they change the amount of guest memory?

One nice side-effect of doing this unconditionally is that
we could get rid of some of the special-casing we are doing
for VFIO-using guests, especially saving and restoring the
memory locking limit when host devices are assigned and
unassigned.

> Makes me even more nervous, but I agree with your reasoning.
> 
> Btw, do we have a volunteer to do this work? Andrea?

Since I've already spent a significant amount of time
researching the issue, and wrote the patch that caused it
to rear its ugly head in the first place, I guess it makes
sense for me to be volunteered to fix it ;)

-- 
Andrea Bolognani / Red Hat / Virtualization

Re: [libvirt] [BUG] mlock support breakage

2017-03-13 Thread Luiz Capitulino
On Mon, 13 Mar 2017 18:16:49 +
"Daniel P. Berrange"  wrote:

> On Mon, Mar 13, 2017 at 02:08:30PM -0400, Luiz Capitulino wrote:
> > On Mon, 13 Mar 2017 13:53:33 -0400
> > Luiz Capitulino  wrote:
> >   
> > > OK, you're right. I personally don't like that we're putting a random cap
> > > on QEMU memory allocations, but if it's large enough it shouldn't be
> > > a problem (I hope).  
> > 
> > The "I hope" part meaning: if we do find legitimate reasons for QEMU's
> > address space to go beyond $LARGE_NUMBER, it will mean guests
> > randomly crashing when using <locked/>.
> 
> NB if we did enforce $RAM + $LARGE_NUMBER, then I'd suggest we did
> set a default hard_limit universally once more, not only set a mlock
> limit when using <locked/>. This would at least ensure we see consistent
> (bad) behaviour rather than have edge cases that only appeared when
> <locked/> was present.

Makes me even more nervous, but I agree with your reasoning.

Btw, do we have a volunteer to do this work? Andrea?

> 
> We would need $LARGE_NUMBER to be much more conservative than what we
> used in the past though, to avoid hitting the same problems once again.
> 
> Regards,
> Daniel

Re: [libvirt] [BUG] mlock support breakage

2017-03-13 Thread Daniel P. Berrange
On Mon, Mar 13, 2017 at 02:08:30PM -0400, Luiz Capitulino wrote:
> On Mon, 13 Mar 2017 13:53:33 -0400
> Luiz Capitulino  wrote:
> 
> > OK, you're right. I personally don't like that we're putting a random cap
> > on QEMU memory allocations, but if it's large enough it shouldn't be
> > a problem (I hope).
> 
> The "I hope" part meaning: if we do find legitimate reasons for QEMU's
> address space to go beyond $LARGE_NUMBER, it will mean guests
> randomly crashing when using <locked/>.

NB if we did enforce $RAM + $LARGE_NUMBER, then I'd suggest we did
set a default hard_limit universally once more, not only set a mlock
limit when using <locked/>. This would at least ensure we see consistent
(bad) behaviour rather than have edge cases that only appeared when
<locked/> was present.

We would need $LARGE_NUMBER to be much more conservative than what we
used in the past though, to avoid hitting the same problems once again.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

Re: [libvirt] [BUG] mlock support breakage

2017-03-13 Thread Luiz Capitulino
On Mon, 13 Mar 2017 13:53:33 -0400
Luiz Capitulino  wrote:

> OK, you're right. I personally don't like that we're putting a random cap
> on QEMU memory allocations, but if it's large enough it shouldn't be
> a problem (I hope).

The "I hope" part meaning: if we do find legitimate reasons for QEMU's
address space to go beyond $LARGE_NUMBER, it will mean guests
randomly crashing when using <locked/>.

Re: [libvirt] [BUG] mlock support breakage

2017-03-13 Thread Luiz Capitulino
On Mon, 13 Mar 2017 16:43:46 +
"Daniel P. Berrange"  wrote:

> On Mon, Mar 13, 2017 at 12:35:42PM -0400, Luiz Capitulino wrote:
> > On Mon, 13 Mar 2017 16:08:58 +
> > "Daniel P. Berrange"  wrote:
> >   
> > > >  2. Drop change c2e60ad0e51 and automatically increase memory
> > > > locking limit to infinity when seeing <locked/>
> > > > 
> > > >pros: make all cases work, no more <hard_limit> requirement
> > > > 
> > > >cons: allows guests with <locked/> to lock all memory
> > > >  assigned to them plus QEMU allocations. While this seems
> > > >  undesirable or even a security issue, using <hard_limit>
> > > >  will have the same effect
> > > 
> > > I think this is the only viable approach, given that no one can
> > > provide a way to reliably calculate QEMU peak memory usage.
> > > Unless we want to take guest RAM + $LARGE NUMBER - eg just blindly
> > > assume that 2 GB is enough for QEMU working set, so for an 8 GB
> > > guest, just set  10 GB as the limit.  

I forgot to say that I'm fine with this solution, provided that we
drop the requirement on using <hard_limit> with <locked/> and
revert commit c2e60ad0e51.

> > Better to set it to infinity and be done with it.  
> 
> Not necessarily, no. If we set $RAM + $LARGE_NUMBER, we are still likely
> to be well below the overall physical RAM of a host. IOW, a single
> compromised QEMU would still be restricted in how much it can pin. If we
> set it to infinity then a compromised QEMU can lock all of physical RAM,
> mounting a very effective DoS on the host, as the host can't even swap
> the guest out to recover.

OK, you're right. I personally don't like that we're putting a random cap
on QEMU memory allocations, but if it's large enough it shouldn't be
a problem (I hope).

Re: [libvirt] [BUG] mlock support breakage

2017-03-13 Thread Daniel P. Berrange
On Mon, Mar 13, 2017 at 12:35:42PM -0400, Luiz Capitulino wrote:
> On Mon, 13 Mar 2017 16:08:58 +
> "Daniel P. Berrange"  wrote:
> 
> > >  2. Drop change c2e60ad0e51 and automatically increase memory
> > > locking limit to infinity when seeing <locked/>
> > > 
> > >pros: make all cases work, no more <hard_limit> requirement
> > > 
> > >cons: allows guests with <locked/> to lock all memory
> > >  assigned to them plus QEMU allocations. While this seems
> > >  undesirable or even a security issue, using <hard_limit>
> > >  will have the same effect
> > 
> > I think this is the only viable approach, given that no one can
> > provide a way to reliably calculate QEMU peak memory usage.
> > Unless we want to take guest RAM + $LARGE NUMBER - eg just blindly
> > assume that 2 GB is enough for QEMU working set, so for an 8 GB
> > guest, just set  10 GB as the limit.
> 
> Better to set it to infinity and be done with it.

Not necessarily, no. If we set $RAM + $LARGE_NUMBER, we are still likely
to be well below the overall physical RAM of a host. IOW, a single
compromised QEMU would still be restricted in how much it can pin. If we
set it to infinity then a compromised QEMU can lock all of physical RAM,
mounting a very effective DoS on the host, as the host can't even swap
the guest out to recover.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

Re: [libvirt] [BUG] mlock support breakage

2017-03-13 Thread Luiz Capitulino
On Mon, 13 Mar 2017 16:08:58 +
"Daniel P. Berrange"  wrote:

> On Mon, Mar 13, 2017 at 11:58:24AM -0400, Luiz Capitulino wrote:
> > 
> > Libvirt commit c2e60ad0e51 added a new check to the XML validation
> > logic where XMLs containing <locked/> must also
> > contain <hard_limit>. This causes two breakages where
> > working guests won't start anymore:
> > 
> > 1. Systems where mlock limit was set in /etc/security/limits.conf  
> 
> I'm surprised if that has any effect, unless you were setting it
> against the root user.
> 
> The limits.conf file is loaded by PAM, and when libvirtd spawns
> QEMU, PAM is not involved, so limits.conf will never be activated.
> 
> This is why libvirt provides max_processes/max_files/max_core
> settings in /etc/libvirt/qemu.conf - you can't set those in
> limits.conf and have them work - unless you set them against
> root, so libvirtd itself gets the higher limits which are then
> inherited by QEMU. 

My quick test seemed to work, but I may have declared success
before the guest had time to crash. I'll double check this,
but we agree about the best way to fix it.

> >  2. Drop change c2e60ad0e51 and automatically increase memory
> > locking limit to infinity when seeing <locked/>
> > 
> >pros: make all cases work, no more <hard_limit> requirement
> > 
> >cons: allows guests with <locked/> to lock all memory
> >  assigned to them plus QEMU allocations. While this seems
> >  undesirable or even a security issue, using <hard_limit>
> >  will have the same effect
> 
> I think this is the only viable approach, given that no one can
> provide a way to reliably calculate QEMU peak memory usage.
> Unless we want to take guest RAM + $LARGE NUMBER - eg just blindly
> assume that 2 GB is enough for QEMU working set, so for an 8 GB
> guest, just set  10 GB as the limit.

Better to set it to infinity and be done with it.

> > Lastly,  doesn't belong to , it should
> > be in . I recommend deprecating it from 
> > and adding it where it belongs.  
> 
> We never make these kind of changes in libvirt XML. It is sub-optimal
> location, but it has no functional problem, so there's no functional
> benefit to moving it and clear backcompat downsides.

Fine with me.

Re: [libvirt] [BUG] mlock support breakage

2017-03-13 Thread Daniel P. Berrange
On Mon, Mar 13, 2017 at 11:58:24AM -0400, Luiz Capitulino wrote:
> 
> Libvirt commit c2e60ad0e51 added a new check to the XML validation
> logic where XMLs containing <locked/> must also
> contain <hard_limit>. This causes two breakages where
> working guests won't start anymore:
> 
> 1. Systems where mlock limit was set in /etc/security/limits.conf

I'm surprised if that has any effect, unless you were setting it
against the root user.

The limits.conf file is loaded by PAM, and when libvirtd spawns
QEMU, PAM is not involved, so limits.conf will never be activated.

This is why libvirt provides max_processes/max_files/max_core
settings in /etc/libvirt/qemu.conf - you can't set those in
limits.conf and have them work - unless you set them against
root, so libvirtd itself gets the higher limits which are then
inherited by QEMU. 
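For reference, those settings live in /etc/libvirt/qemu.conf and look
something like the following (values only illustrative, see the
comments in qemu.conf itself for the exact semantics):

  max_processes = 0
  max_files = 0
  max_core = "unlimited"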

> 2. Guests using hugeTLB pages. In this case, guests were relying
>on the fact that libvirt automagically increases mlock
>limit to 1GB

Yep, that's bad - we mustn't break previously working scenarios
like this, even if they were not following documented practice.
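For illustration, the kind of configuration described above is
roughly (a sketch, sizes made up):

  <memory unit='GiB'>8</memory>
  <memoryBacking>
    <hugepages/>
    <locked/>
  </memoryBacking>

i.e. hugepage-backed, locked guest memory with no <hard_limit> set:
it used to start, relying on the limit libvirt raised automatically,
and is now rejected by the new validation check.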

> While it's true that the <locked/> documentation
> says that <hard_limit> is required, this is actually
> an extremely bad request because:
> 
>  A. <hard_limit>'s own documentation strongly recommends
> NOT using it

Yep, hard limit is impossible to calculate reliably since no one
has been able to provide an accurate way to predict QEMU's peak
memory usage. When libvirt previously set hard_limit by default,
we got many bug reports about guest's killed by the OOM killer,
no matter what algorithm we tried.

>  B. <hard_limit> does more than setting memory locking
> limit
> 
>  C. <hard_limit> does not support infinity, so you have
> to guess a limit
> 
>  D. If <hard_limit> is less than 1GB, it will lower
> VFIO's automatic limit of "guest memory + 1GB"
> 
> Here's two possible solutions to fix this all:
> 
>  1. Drop change c2e60ad0e51 and drop automatic increases. Let
> users configure limits in /etc/security/limits.conf
> 
> pros: this is the most correct way to do it, and how
>   it should be done originally IMHO
> 
> cons: will break working VFIO setups, so probably undoable

limits.conf is useless - see above.

>  2. Drop change c2e60ad0e51 and automatically increase memory
> locking limit to infinity when seeing <locked/>
> 
>pros: make all cases work, no more <hard_limit> requirement
> 
>cons: allows guests with <locked/> to lock all memory
>  assigned to them plus QEMU allocations. While this seems
>  undesirable or even a security issue, using <hard_limit>
>  will have the same effect

I think this is the only viable approach, given that no one can
provide a way to reliably calculate QEMU peak memory usage.
Unless we want to take guest RAM + $LARGE NUMBER - eg just blindly
assume that 2 GB is enough for QEMU working set, so for an 8 GB
guest, just set  10 GB as the limit.
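Expressed as domain XML, that heuristic would amount to something
like the following for an 8 GB guest (a sketch of the idea, not an
agreed-on default):

  <memory unit='GiB'>8</memory>
  <memtune>
    <hard_limit unit='GiB'>10</hard_limit>
  </memtune>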

> Lastly,  doesn't belong to , it should
> be in . I recommend deprecating it from 
> and adding it where it belongs.

We never make this kind of change in libvirt XML. It is a sub-optimal
location, but it has no functional problem, so there's no functional
benefit to moving it, and there are clear backcompat downsides.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|
