Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Paolo Bonzini
On 27/11/2017 14:44, Michal Privoznik wrote:
> On 11/27/2017 12:09 PM, Paolo Bonzini wrote:
>> On 24/11/2017 19:13, Michal Privoznik wrote:
>>> On 11/24/2017 06:18 PM, Paolo Bonzini wrote:
 On 24/11/2017 18:07, Michal Privoznik wrote:
> The helper is going to be daemonized (for the same reason
> that qemu process is) => there's no SIGCHLD for libvirtd to receive.

 Couldn't (or shouldn't!) libvirt register itself as a subreaper instead?
>>>
>>> Whether libvirt can't have a
>>> separate thread where nothing waitpid() would run? That could hardly
>>> work - not only would libvirt need to spawn separate thread for every
>>> helper, it would not work across restarts of libvirtd. We would have no
>>> guarantee that PID we recorded is still the same process as it was when
>>> we were dying.
>>
>> Oh, you're right. :(  Even if libvirtd were a subreaper (see prctl(2)
>> manpage), the daemonized qemu-pr-helper would be reparented to init when
>> libvirtd restarts.
>>
>> libvirtd could test whether the helper is alive by connecting to its
>> socket.  If that's not enough, I'll add an event.
> 
> Well, connecting and staying connected? Otherwise mere connect followed
> by immediate disconnect would work only when libvirtd is starting up and
> reconnecting to already running domains. Not if the helper dies some
> time after libvirt started the domain and the helper. [1]

Ok, let's add the event and state command (to QEMU 2.12 only).

Thanks,

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Michal Privoznik
On 11/27/2017 12:09 PM, Paolo Bonzini wrote:
> On 24/11/2017 19:13, Michal Privoznik wrote:
>> On 11/24/2017 06:18 PM, Paolo Bonzini wrote:
>>> On 24/11/2017 18:07, Michal Privoznik wrote:
 The helper is going to be daemonized (for the same reason
 that qemu process is) => there's no SIGCHLD for libvirtd to receive.
>>>
>>> Couldn't (or shouldn't!) libvirt register itself as a subreaper instead?
>>
>> Whether libvirt can't have a
>> separate thread where nothing waitpid() would run? That could hardly
>> work - not only would libvirt need to spawn separate thread for every
>> helper, it would not work across restarts of libvirtd. We would have no
>> guarantee that PID we recorded is still the same process as it was when
>> we were dying.
> 
> Oh, you're right. :(  Even if libvirtd were a subreaper (see prctl(2)
> manpage), the daemonized qemu-pr-helper would be reparented to init when
> libvirtd restarts.
> 
> libvirtd could test whether the helper is alive by connecting to its
> socket.  If that's not enough, I'll add an event.

Well, connecting and staying connected? Otherwise mere connect followed
by immediate disconnect would work only when libvirtd is starting up and
reconnecting to already running domains. Not if the helper dies some
time after libvirt started the domain and the helper. [1]

> 
> Another thing: we have
> 
> 
> 
> which represents "one daemon per QEMU" in privileged libvirtd and
> "well-known socket path" in session libvirtd.  What would the
> live-domain XML look like?  Maybe:
> 
> 
> 
> mode='server'/>
> 
> 
> 
> 
> mode='server'/>
> 

This is a good idea. We can explicitly state if the helper is supposed
to be managed by libvirtd or not. However, I'd prefer not exposing path
in case managed='yes' because it's something that libvirt should manage.
We don't expose monitor path either, for instance.

With managed='yes' libvirt would:
a) spawn daemon on per domain basis
b) generate unique path for each helper it spawns
c) label the socket (so that qemu can access it)
d) put the path onto qemu's cmd line

However, with managed='no' libvirt would:
a) require path to be provided in the XML
b) merely just put the path onto qemu's cmd line

> 
> And then, would it be an error to specify the "wrong" managed setting in
> the domain-creation XML?

For now yes. For now specifying both path and managed='yes' would be
considered an error. It gives us enough room to allow that in the future
should anybody need it.

> Or would it be possible to use a managed='no'
> helper in privileged libvirtd?  I take it as a "no" given your remark below.

You're right. This is going to be forbidden too. Again, we can allow
this if somebody needs it.

1: Also, it's worth nothing that libvirtd would restart the helper iff
managed='yes'.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Paolo Bonzini
On 27/11/2017 14:35, Michal Privoznik wrote:
>>> But can you also test _more_ permissions if you want?  So if QEMU passed
>>> to the helper a file for which it has "lock" but not "ioctl" access,
>>> would it make sense for the helper to let it through?  Together with the
>>> socket MAC, this should be precise enough.
>> Sure, you can check any of the permissions which are defined for the
>> type of object you've got. The goal is to check permissions which
>> correspond to actions you're taking on the object. So if you do
>> something other than just ioctl() on the passed in FD, it would make
>> sense to check further permissions as appropriate.
> Just to make sure I understand correctly: the PD passing is done by qemu
> and not libvirt, right? Technically, we don't open the disks.

Correct.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Michal Privoznik
On 11/27/2017 02:29 PM, Daniel P. Berrange wrote:
> On Mon, Nov 27, 2017 at 02:01:06PM +0100, Paolo Bonzini wrote:
>> On 27/11/2017 13:57, Daniel P. Berrange wrote:
 Got it.  My problem here is that ioctl permission might be too strict.
 One use case for the helper is to bypass the ioctl permission, and only
 grant SCSI passthrough access for the specific case of persistent
 reservation commands.  Would it make sense to also allow e.g. "lock"
 (and would it pass the SELinux policy)?
>>> When I write "...ioctl perm..." I use that just as a short cut to represent
>>> whatever SELinux access vector would be checked if QEMU were to do the SCSI
>>> PR calls itself.  The access vector permission bits are defined in the 
>>> policy
>>> file, and in the auto-generated header file 
>>> /usr/include/selinux/av_permissions.h
>>>
>>> AFAICT, there's only a coarse COMMON_FILE_IOCTL access vector defined, 
>>> nothing
>>> on the level of individual ioctl commands. So, yes, the MAC policy check
>>> would allow other ioctl commands to be run, aside from those required for
>>> persistent reservations. We just have to rely on the PR helper code for
>>> that restriction.
>>
>> But can you also test _more_ permissions if you want?  So if QEMU passed
>> to the helper a file for which it has "lock" but not "ioctl" access,
>> would it make sense for the helper to let it through?  Together with the
>> socket MAC, this should be precise enough.
> 
> Sure, you can check any of the permissions which are defined for the
> type of object you've got. The goal is to check permissions which
> correspond to actions you're taking on the object. So if you do
> something other than just ioctl() on the passed in FD, it would make
> sense to check further permissions as appropriate.

Just to make sure I understand correctly: the PD passing is done by qemu
and not libvirt, right? Technically, we don't open the disks.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Daniel P. Berrange
On Mon, Nov 27, 2017 at 02:01:06PM +0100, Paolo Bonzini wrote:
> On 27/11/2017 13:57, Daniel P. Berrange wrote:
> >> Got it.  My problem here is that ioctl permission might be too strict.
> >> One use case for the helper is to bypass the ioctl permission, and only
> >> grant SCSI passthrough access for the specific case of persistent
> >> reservation commands.  Would it make sense to also allow e.g. "lock"
> >> (and would it pass the SELinux policy)?
> > When I write "...ioctl perm..." I use that just as a short cut to represent
> > whatever SELinux access vector would be checked if QEMU were to do the SCSI
> > PR calls itself.  The access vector permission bits are defined in the 
> > policy
> > file, and in the auto-generated header file 
> > /usr/include/selinux/av_permissions.h
> > 
> > AFAICT, there's only a coarse COMMON_FILE_IOCTL access vector defined, 
> > nothing
> > on the level of individual ioctl commands. So, yes, the MAC policy check
> > would allow other ioctl commands to be run, aside from those required for
> > persistent reservations. We just have to rely on the PR helper code for
> > that restriction.
> 
> But can you also test _more_ permissions if you want?  So if QEMU passed
> to the helper a file for which it has "lock" but not "ioctl" access,
> would it make sense for the helper to let it through?  Together with the
> socket MAC, this should be precise enough.

Sure, you can check any of the permissions which are defined for the
type of object you've got. The goal is to check permissions which
correspond to actions you're taking on the object. So if you do
something other than just ioctl() on the passed in FD, it would make
sense to check further permissions as appropriate.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Paolo Bonzini
On 27/11/2017 13:57, Daniel P. Berrange wrote:
>> Got it.  My problem here is that ioctl permission might be too strict.
>> One use case for the helper is to bypass the ioctl permission, and only
>> grant SCSI passthrough access for the specific case of persistent
>> reservation commands.  Would it make sense to also allow e.g. "lock"
>> (and would it pass the SELinux policy)?
> When I write "...ioctl perm..." I use that just as a short cut to represent
> whatever SELinux access vector would be checked if QEMU were to do the SCSI
> PR calls itself.  The access vector permission bits are defined in the policy
> file, and in the auto-generated header file 
> /usr/include/selinux/av_permissions.h
> 
> AFAICT, there's only a coarse COMMON_FILE_IOCTL access vector defined, nothing
> on the level of individual ioctl commands. So, yes, the MAC policy check
> would allow other ioctl commands to be run, aside from those required for
> persistent reservations. We just have to rely on the PR helper code for
> that restriction.

But can you also test _more_ permissions if you want?  So if QEMU passed
to the helper a file for which it has "lock" but not "ioctl" access,
would it make sense for the helper to let it through?  Together with the
socket MAC, this should be precise enough.

Thanks,

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Paolo Bonzini
On 27/11/2017 13:18, Daniel P. Berrange wrote:
> On Mon, Nov 27, 2017 at 12:51:33PM +0100, Paolo Bonzini wrote:
>> On 27/11/2017 12:37, Daniel P. Berrange wrote:
>>> On Mon, Nov 27, 2017 at 12:13:24PM +0100, Paolo Bonzini wrote:
 Hm, I see what you mean now.  But it would be "just" a qemu-pr-helper
 bug that it trusts the caller to have "ioctl" permissions on the file
 descriptor, wouldn't it?

 And it could be a feature even, since the remote QEMU process also has
 to have "connect" permissions on the qemu-pr-helper socket.  So you
 could give it ioctl access *limited to persistent reservations* by
 granting the appropriate permissions on the socket.
>>>
>>> We can't grant access to the persistent reservation helper's socket on a
>>> per QEMU basis. Permissions are granted on the domain type svirt_t, and
>>> we don't want to invent a new domain type just for having access to the
>>> PR helper.
>>
>> You can do so via DAC and MAC on the socket itself, or is that not enough?
> 
> Yes, you can do MAC on the socket, but that merely gives you protection
> betweeen the host & guest. The goal of sVirt is to give protection
> between VMs, and MAC on the socket alone doesn't guarantee that.
> In the shared-PR helper the PR helper context would not have an
> MAC level applied, so the MAC check is now
> 
>process context == qemu_pr_helper_t:s0
>file context == svirt_image_t:s0,c340,c524
> 
> This is still a MAC check, but it is a weaker check than what sVirt promises,
> because a QEMU with MCS   c340,c524 could pass a file descriptor with a MCS
> level c123,c422.  QEMU would be denied to use this bogus FD, but the PR
> helper would happily allow this bad access.

So the issue could be that QEMU could have received this bogus FD from
an attacker.  Got it.

> So to satisfy sVirt separation of QEMU instances, we need the PR helper to
> be able to validate the MCS levels, as the kernel no longer has enough info
> todo this automatically. 
> 
> In the PR helper this is achieved with something like
> 
> getpeercon(socketfd, &qemucon)  => returns svirt_t:s0,c340,c524
> fgetfilecont(filefd, &diskcon)  => returns svirt_image_t:s0,c340,c524
> 
> Then it would call something like
> 
>security_compute_av(qemucon, diskcon, ...file class..., ...ioctl perm...)
> 
> to get a MAC decision from the kernel. This preserves the sVirt isolation
> of individual QEMU instances.

Got it.  My problem here is that ioctl permission might be too strict.
One use case for the helper is to bypass the ioctl permission, and only
grant SCSI passthrough access for the specific case of persistent
reservation commands.  Would it make sense to also allow e.g. "lock"
(and would it pass the SELinux policy)?

Also, what about AppArmor?

Thanks,

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Daniel P. Berrange
On Mon, Nov 27, 2017 at 01:49:41PM +0100, Paolo Bonzini wrote:
> On 27/11/2017 13:18, Daniel P. Berrange wrote:
> > On Mon, Nov 27, 2017 at 12:51:33PM +0100, Paolo Bonzini wrote:
> >> On 27/11/2017 12:37, Daniel P. Berrange wrote:
> >>> On Mon, Nov 27, 2017 at 12:13:24PM +0100, Paolo Bonzini wrote:
>  Hm, I see what you mean now.  But it would be "just" a qemu-pr-helper
>  bug that it trusts the caller to have "ioctl" permissions on the file
>  descriptor, wouldn't it?
> 
>  And it could be a feature even, since the remote QEMU process also has
>  to have "connect" permissions on the qemu-pr-helper socket.  So you
>  could give it ioctl access *limited to persistent reservations* by
>  granting the appropriate permissions on the socket.
> >>>
> >>> We can't grant access to the persistent reservation helper's socket on a
> >>> per QEMU basis. Permissions are granted on the domain type svirt_t, and
> >>> we don't want to invent a new domain type just for having access to the
> >>> PR helper.
> >>
> >> You can do so via DAC and MAC on the socket itself, or is that not enough?
> > 
> > Yes, you can do MAC on the socket, but that merely gives you protection
> > betweeen the host & guest. The goal of sVirt is to give protection
> > between VMs, and MAC on the socket alone doesn't guarantee that.
> > In the shared-PR helper the PR helper context would not have an
> > MAC level applied, so the MAC check is now
> > 
> >process context == qemu_pr_helper_t:s0
> >file context == svirt_image_t:s0,c340,c524
> > 
> > This is still a MAC check, but it is a weaker check than what sVirt 
> > promises,
> > because a QEMU with MCS   c340,c524 could pass a file descriptor with a MCS
> > level c123,c422.  QEMU would be denied to use this bogus FD, but the PR
> > helper would happily allow this bad access.
> 
> So the issue could be that QEMU could have received this bogus FD from
> an attacker.  Got it.
> 
> > So to satisfy sVirt separation of QEMU instances, we need the PR helper to
> > be able to validate the MCS levels, as the kernel no longer has enough info
> > todo this automatically. 
> > 
> > In the PR helper this is achieved with something like
> > 
> > getpeercon(socketfd, &qemucon)  => returns svirt_t:s0,c340,c524
> > fgetfilecont(filefd, &diskcon)  => returns svirt_image_t:s0,c340,c524
> > 
> > Then it would call something like
> > 
> >security_compute_av(qemucon, diskcon, ...file class..., ...ioctl perm...)
> > 
> > to get a MAC decision from the kernel. This preserves the sVirt isolation
> > of individual QEMU instances.
> 
> Got it.  My problem here is that ioctl permission might be too strict.
> One use case for the helper is to bypass the ioctl permission, and only
> grant SCSI passthrough access for the specific case of persistent
> reservation commands.  Would it make sense to also allow e.g. "lock"
> (and would it pass the SELinux policy)?

When I write "...ioctl perm..." I use that just as a short cut to represent
whatever SELinux access vector would be checked if QEMU were to do the SCSI
PR calls itself.  The access vector permission bits are defined in the policy
file, and in the auto-generated header file 
/usr/include/selinux/av_permissions.h

AFAICT, there's only a coarse COMMON_FILE_IOCTL access vector defined, nothing
on the level of individual ioctl commands. So, yes, the MAC policy check
would allow other ioctl commands to be run, aside from those required for
persistent reservations. We just have to rely on the PR helper code for
that restriction.

> Also, what about AppArmor?

I don't think it has a way to do the delegated permission checks in the
way I describe for SELinux. 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Daniel P. Berrange
On Mon, Nov 27, 2017 at 12:51:33PM +0100, Paolo Bonzini wrote:
> On 27/11/2017 12:37, Daniel P. Berrange wrote:
> > On Mon, Nov 27, 2017 at 12:13:24PM +0100, Paolo Bonzini wrote:
> >> Hm, I see what you mean now.  But it would be "just" a qemu-pr-helper
> >> bug that it trusts the caller to have "ioctl" permissions on the file
> >> descriptor, wouldn't it?
> >>
> >> And it could be a feature even, since the remote QEMU process also has
> >> to have "connect" permissions on the qemu-pr-helper socket.  So you
> >> could give it ioctl access *limited to persistent reservations* by
> >> granting the appropriate permissions on the socket.
> > 
> > We can't grant access to the persistent reservation helper's socket on a
> > per QEMU basis. Permissions are granted on the domain type svirt_t, and
> > we don't want to invent a new domain type just for having access to the
> > PR helper.
> 
> You can do so via DAC and MAC on the socket itself, or is that not enough?

Whatever is done with DAC is irrelevant when considering the MAC policy.

Yes, you can do MAC on the socket, but that merely gives you protection
betweeen the host & guest. The goal of sVirt is to give protection
between VMs, and MAC on the socket alone doesn't guarantee that.

Consider what the kernel would enforce if the SCSI PR was done inside
QEMU. It would check

   allowed(process context, file context, ...file class.., ...ioctl perm...)

where

   process context == svirt_t:s0,c340,c524
   file context == svirt_image_t:s0,c340,c524

In the case of a per-QEMU  SCSI PR, we can get the same effect because
when this is checked by the kernel, the PR helper has a process context
still containing a MCS level, eg

   process context ==  qemu_pr_helper_t:s0,c340,c524
   file context ==  svirt_image_t:s0,c340,c524


In the shared-PR helper though, the PR helper context would not have an
MAC level applied, so the MAC check is now

   process context == qemu_pr_helper_t:s0
   file context == svirt_image_t:s0,c340,c524

This is still a MAC check, but it is a weaker check than what sVirt promises,
because a QEMU with MCS   c340,c524 could pass a file descriptor with a MCS
level c123,c422.  QEMU would be denied to use this bogus FD, but the PR
helper would happily allow this bad access.

So to satisfy sVirt separation of QEMU instances, we need the PR helper to
be able to validate the MCS levels, as the kernel no longer has enough info
todo this automatically. 

In the PR helper this is achieved with something like

getpeercon(socketfd, &qemucon)  => returns svirt_t:s0,c340,c524
fgetfilecont(filefd, &diskcon)  => returns svirt_image_t:s0,c340,c524

Then it would call something like

   security_compute_av(qemucon, diskcon, ...file class..., ...ioctl perm...)

to get a MAC decision from the kernel. This preserves the sVirt isolation
of individual QEMU instances.

> In other words, what are the SELinux best practices when you don't want
> a process to have blanket access to a permission, but you may be fine
> with a subset of those?  If you think of unpriv_sgio=0 as a very simple
> MAC, this is actually the very case that the PR helper is designed for.

It depends on the security goal of the MAC policy. The original SELinux
policy was just protecting the host from QEMUs. For sVirt the goal is
stronger, to have isolation between individual QEMUs

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Paolo Bonzini
On 27/11/2017 12:37, Daniel P. Berrange wrote:
> On Mon, Nov 27, 2017 at 12:13:24PM +0100, Paolo Bonzini wrote:
>> Hm, I see what you mean now.  But it would be "just" a qemu-pr-helper
>> bug that it trusts the caller to have "ioctl" permissions on the file
>> descriptor, wouldn't it?
>>
>> And it could be a feature even, since the remote QEMU process also has
>> to have "connect" permissions on the qemu-pr-helper socket.  So you
>> could give it ioctl access *limited to persistent reservations* by
>> granting the appropriate permissions on the socket.
> 
> We can't grant access to the persistent reservation helper's socket on a
> per QEMU basis. Permissions are granted on the domain type svirt_t, and
> we don't want to invent a new domain type just for having access to the
> PR helper.

You can do so via DAC and MAC on the socket itself, or is that not enough?

In other words, what are the SELinux best practices when you don't want
a process to have blanket access to a permission, but you may be fine
with a subset of those?  If you think of unpriv_sgio=0 as a very simple
MAC, this is actually the very case that the PR helper is designed for.

Thanks,

Paolo

> So if we grant access to a global PR helper, we must have that helper
> do MAC checks. Without it, QEMU has delegated actions it can't do itself
> to a separate process thus escaping its MAC confinement in that area.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Daniel P. Berrange
On Mon, Nov 27, 2017 at 12:13:24PM +0100, Paolo Bonzini wrote:
> On 27/11/2017 11:59, Daniel P. Berrange wrote:
> > On Mon, Nov 27, 2017 at 11:57:56AM +0100, Paolo Bonzini wrote:
> >> On 27/11/2017 10:40, Daniel P. Berrange wrote:
> >>>
> >>> If we had one daemon per QEMU, then we would give the daemon the same
> >>> MCS label as QEMU. The kernel will thus enforce this label matches the
> >>> label on the QEMU process when it connects to the UNIX socket. The kernel
> >>> will also validate the label on the disk file descriptor passed to the
> >>> daemon by QEMU.
> >>>
> >>> If we had one daemon per host, then that daemon will need a generic
> >>> label that lets all QEMUs connect to it. When QEMU passes in a disk
> >>> FD, the daemon will need to query the SELinux context of the remote
> >>> QEMU process, and then perform a userspace ACL check of that against
> >>> the FD that is passed in.
> >>>
> >>> The latter case means the QEMU helper will need to link to libselinux
> >>> and performs checks itself.
> >>
> >> Then it seems much better to use one daemon per QEMU, indeed.
> > 
> > That would rule out using persistent reservations for unprivileged QEMU
> > though, because we still need sVirt protection for that.
> 
> Hm, I see what you mean now.  But it would be "just" a qemu-pr-helper
> bug that it trusts the caller to have "ioctl" permissions on the file
> descriptor, wouldn't it?
> 
> And it could be a feature even, since the remote QEMU process also has
> to have "connect" permissions on the qemu-pr-helper socket.  So you
> could give it ioctl access *limited to persistent reservations* by
> granting the appropriate permissions on the socket.

We can't grant access to the persistent reservation helper's socket on a
per QEMU basis. Permissions are granted on the domain type svirt_t, and
we don't want to invent a new domain type just for having access to the
PR helper.

So if we grant access to a global PR helper, we must have that helper
do MAC checks. Without it, QEMU has delegated actions it can't do itself
to a separate process thus escaping its MAC confinement in that area.

Userspace MAC checks are not very hard to do, so I don't see a significant
burden to adding this support to the helper.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Paolo Bonzini
On 27/11/2017 11:59, Daniel P. Berrange wrote:
> On Mon, Nov 27, 2017 at 11:57:56AM +0100, Paolo Bonzini wrote:
>> On 27/11/2017 10:40, Daniel P. Berrange wrote:
>>>
>>> If we had one daemon per QEMU, then we would give the daemon the same
>>> MCS label as QEMU. The kernel will thus enforce this label matches the
>>> label on the QEMU process when it connects to the UNIX socket. The kernel
>>> will also validate the label on the disk file descriptor passed to the
>>> daemon by QEMU.
>>>
>>> If we had one daemon per host, then that daemon will need a generic
>>> label that lets all QEMUs connect to it. When QEMU passes in a disk
>>> FD, the daemon will need to query the SELinux context of the remote
>>> QEMU process, and then perform a userspace ACL check of that against
>>> the FD that is passed in.
>>>
>>> The latter case means the QEMU helper will need to link to libselinux
>>> and performs checks itself.
>>
>> Then it seems much better to use one daemon per QEMU, indeed.
> 
> That would rule out using persistent reservations for unprivileged QEMU
> though, because we still need sVirt protection for that.

Hm, I see what you mean now.  But it would be "just" a qemu-pr-helper
bug that it trusts the caller to have "ioctl" permissions on the file
descriptor, wouldn't it?

And it could be a feature even, since the remote QEMU process also has
to have "connect" permissions on the qemu-pr-helper socket.  So you
could give it ioctl access *limited to persistent reservations* by
granting the appropriate permissions on the socket.

Thanks,

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Paolo Bonzini
On 24/11/2017 19:13, Michal Privoznik wrote:
> On 11/24/2017 06:18 PM, Paolo Bonzini wrote:
>> On 24/11/2017 18:07, Michal Privoznik wrote:
>>> The helper is going to be daemonized (for the same reason
>>> that qemu process is) => there's no SIGCHLD for libvirtd to receive.
>>
>> Couldn't (or shouldn't!) libvirt register itself as a subreaper instead?
> 
> Whether libvirt can't have a
> separate thread where nothing waitpid() would run? That could hardly
> work - not only would libvirt need to spawn separate thread for every
> helper, it would not work across restarts of libvirtd. We would have no
> guarantee that PID we recorded is still the same process as it was when
> we were dying.

Oh, you're right. :(  Even if libvirtd were a subreaper (see prctl(2)
manpage), the daemonized qemu-pr-helper would be reparented to init when
libvirtd restarts.

libvirtd could test whether the helper is alive by connecting to its
socket.  If that's not enough, I'll add an event.

Another thing: we have



which represents "one daemon per QEMU" in privileged libvirtd and
"well-known socket path" in session libvirtd.  What would the
live-domain XML look like?  Maybe:



  




  


And then, would it be an error to specify the "wrong" managed setting in
the domain-creation XML?  Or would it be possible to use a managed='no'
helper in privileged libvirtd?  I take it as a "no" given your remark below.

Thanks,

Paolo

> If qemu has to connect to the helper's socket the socket must have a
> label that allows that. However, libvirt allows running qemu under
> pretty much any label that users configure. So if there were just one
> daemon per host (which there is not nor it will be), we wouldn't be able
> to make any MCS (or at least that's how I understand it) - thus the only
> thing we could do is just to allow everybody to connect.
> 
> This is more of affirmation that one helper daemon per host is a bad
> idea than anything really.
> 
> Michal
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Daniel P. Berrange
On Mon, Nov 27, 2017 at 11:57:56AM +0100, Paolo Bonzini wrote:
> On 27/11/2017 10:40, Daniel P. Berrange wrote:
> > 
> > If we had one daemon per QEMU, then we would give the daemon the same
> > MCS label as QEMU. The kernel will thus enforce this label matches the
> > label on the QEMU process when it connects to the UNIX socket. The kernel
> > will also validate the label on the disk file descriptor passed to the
> > daemon by QEMU.
> > 
> > If we had one daemon per host, then that daemon will need a generic
> > label that lets all QEMUs connect to it. When QEMU passes in a disk
> > FD, the daemon will need to query the SELinux context of the remote
> > QEMU process, and then perform a userspace ACL check of that against
> > the FD that is passed in.
> > 
> > The latter case means the QEMU helper will need to link to libselinux
> > and performs checks itself.
> 
> Then it seems much better to use one daemon per QEMU, indeed.

That would rule out using persistent reservations for unprivileged QEMU
though, because we still need sVirt protection for that.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Paolo Bonzini
On 27/11/2017 10:40, Daniel P. Berrange wrote:
> 
> If we had one daemon per QEMU, then we would give the daemon the same
> MCS label as QEMU. The kernel will thus enforce this label matches the
> label on the QEMU process when it connects to the UNIX socket. The kernel
> will also validate the label on the disk file descriptor passed to the
> daemon by QEMU.
> 
> If we had one daemon per host, then that daemon will need a generic
> label that lets all QEMUs connect to it. When QEMU passes in a disk
> FD, the daemon will need to query the SELinux context of the remote
> QEMU process, and then perform a userspace ACL check of that against
> the FD that is passed in.
> 
> The latter case means the QEMU helper will need to link to libselinux
> and performs checks itself.

Then it seems much better to use one daemon per QEMU, indeed.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Daniel P. Berrange
On Fri, Nov 24, 2017 at 04:42:00PM +0100, Paolo Bonzini wrote:
> On 24/11/2017 15:52, Daniel P. Berrange wrote:
> >> So what has been suggested so far is:
> >> 
> >>   
> >> 
> >> 
> >> 
> >>   
> 
>  without an inner  element leaves libvirtd with
> the choice of a daemon per QEMU, or a daemon per host in a well-known
> location.  Unprivileged libvirtd would always use the latter; for
> privileged libvirtd it is implementation-defined.  I like it.
> 
>  with an inner  always gives a daemon per host in
> a custom location.  It can be used by either unprivileged or privileged
> libvirtd.
> 
> >> Now, my question is, in the first case - how should libvirt chose the
> >> path? Should it be different for each disk/domain? How is the daemon
> >> started in the first place - should libvirt start it? And when should
> >> libvirt kill it?
> >
> > The core question is one daemon per QEMU, or one daemon per host. I'd be
> > more inclined to have one daemon per QEMU so we always have isolation
> > and thus do't have to worry about a shared daemon being a potential
> > attack point between distinct QEMU's.
> 
> One daemon per QEMU is nicer IMO because it lets us do MCS.  Of course
> one daemon per QEMU can only apply to system libvirtd; session must use
> one daemon per host.
> 
> One daemon per host is easy, because it's just a couple new command-line
> options as far as libvirtd is concerned, but we need to check that it
> works well with MCS.

In both cases we can do MCS, but it would work differently.

If we had one daemon per QEMU, then we would give the daemon the same
MCS label as QEMU. The kernel will thus enforce this label matches the
label on the QEMU process when it connects to the UNIX socket. The kernel
will also validate the label on the disk file descriptor passed to the
daemon by QEMU.

If we had one daemon per host, then that daemon will need a generic
label that lets all QEMUs connect to it. When QEMU passes in a disk
FD, the daemon will need to query the SELinux context of the remote
QEMU process, and then perform a userspace ACL check of that against
the FD that is passed in.

The latter case means the QEMU helper will need to link to libselinux
and performs checks itself.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-27 Thread Peter Krempa
On Mon, Nov 27, 2017 at 07:27:17 +0100, Michal Privoznik wrote:
> On 11/26/2017 09:32 AM, Peter Krempa wrote:
> > On Fri, Nov 24, 2017 at 15:38:54 +0100, Michal Privoznik wrote:
> >> On 08/22/2017 06:27 PM, Paolo Bonzini wrote:

[...]

> >> However, we want to be able to turn it on/off or not mention it at all
> >> on per-disk basis. So what has been suggested so far is:
> >>
> >>   
> >> 
> >> 
> >> 
> >>   
> > 
> > Note that if the reservation is applied to the image and not the disk
> > the element should become child of the  elememt.
> 
> > Especially if
> > there is only a ever so slight chance that it will work with formats
> > supporting backing store (qcow2) and thus there would be a need to
> > configure it per-image.
> 
> The way I understand PRs is that they are format agnostic, so there's
> not (nor there will be) per image configuration. Nevertheless,

In qemu you have two objects representing a disk image (or member of the
backing chain). One is the format driver, and the second is the driver
which accesses the storage itself. So even if it concerns the access to
the physical storage, it may be necessary to track it in the backing
chain and thus under .

Only if we are sure that no member of the backing chain can have
different config, only then it can be made per-disk.

>  concern disk image so I will place the element under
> . Good point.


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-26 Thread Michal Privoznik
On 11/26/2017 09:32 AM, Peter Krempa wrote:
> On Fri, Nov 24, 2017 at 15:38:54 +0100, Michal Privoznik wrote:
>> On 08/22/2017 06:27 PM, Paolo Bonzini wrote:
>>> Hi all,
>>>
>>
>> Sorry for resurrecting old thread but seems like there was no agreement
>> reached.
>>
>> We don't want to expose any paths because the fact that PR helper is a
>> separate binary that uses a UNIX socket to talk to qemu is a
>> implementation detail of qemu. Other HVs may have it differently.
>>
>> However, we want to be able to turn it on/off or not mention it at all
>> on per-disk basis. So what has been suggested so far is:
>>
>>   
>> 
>> 
>> 
>>   
> 
> Note that if the reservation is applied to the image and not the disk
> the element should become child of the  elememt.

> Especially if
> there is only a ever so slight chance that it will work with formats
> supporting backing store (qcow2) and thus there would be a need to
> configure it per-image.

The way I understand PRs is that they are format agnostic, so there's
not (nor there will be) per image configuration. Nevertheless,
 concern disk image so I will place the element under
. Good point.

What is still unclear is what should happen when the helper daemon dies
while domain is running. I was thinking about this during the weekend
and realized that if we go with the event delivered to libvirtd, we also
need a monitor command to query the state (we need to cover case when
event is fired but daemon is not runnning, it has to query the state
once started).

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-26 Thread Peter Krempa
On Fri, Nov 24, 2017 at 15:38:54 +0100, Michal Privoznik wrote:
> On 08/22/2017 06:27 PM, Paolo Bonzini wrote:
> > Hi all,
> > 
> 
> Sorry for resurrecting old thread but seems like there was no agreement
> reached.
> 
> We don't want to expose any paths because the fact that PR helper is a
> separate binary that uses a UNIX socket to talk to qemu is a
> implementation detail of qemu. Other HVs may have it differently.
> 
> However, we want to be able to turn it on/off or not mention it at all
> on per-disk basis. So what has been suggested so far is:
> 
>   
> 
> 
> 
>   

Note that if the reservation is applied to the image and not the disk
the element should become child of the  elememt. Especially if
there is only a ever so slight chance that it will work with formats
supporting backing store (qcow2) and thus there would be a need to
configure it per-image.




signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-24 Thread Michal Privoznik
On 11/24/2017 06:18 PM, Paolo Bonzini wrote:
> On 24/11/2017 18:07, Michal Privoznik wrote:
>> On 11/24/2017 04:42 PM, Paolo Bonzini wrote:
>>> One daemon per QEMU is nicer IMO because it lets us do MCS.  Of course
>>> one daemon per QEMU can only apply to system libvirtd; session must use
>>> one daemon per host.
>>
>> Agreed. One daemon per QEMU it is then.  Just to make sure whether I> 
>> understand correctly - libvirtd would start it *before* exec()-ing qemu
>> (so that qemu can connect) and kill it after qemu dies. But what should
>> happen if the helper dies while qemu is running? Should qemu pause
>> itself, fire up an event on the monitor so that libvirtd can spawn the
>> helper again?
> 
> QEMU currently tries to reconnect five times every second, and then
> fails the I/O command.  This seemed okay to me because PRs are generally
> used in a HA environment where the failure will be broadcast and another
> node will pick up the work.
> 
>> The helper is going to be daemonized (for the same reason
>> that qemu process is) => there's no SIGCHLD for libvirtd to receive.
> 
> Couldn't (or shouldn't!) libvirt register itself as a subreaper instead?

I don't know what you mean by registering. Whether libvirt can't have a
separate thread where nothing waitpid() would run? That could hardly
work - not only would libvirt need to spawn separate thread for every
helper, it would not work across restarts of libvirtd. We would have no
guarantee that PID we recorded is still the same process as it was when
we were dying.

> 
>> Also, I don't really expect anything special when it comes to migration,
>> just want to make sure I'm not mislead.
> 
> No, everything is fine for migration.
> 
>>> One daemon per host is easy, because it's just a couple new command-line
>>> options as far as libvirtd is concerned, but we need to check that it
>>> works well with MCS.
>>
>> It's very likely that it wouldn't. Users can chose arbitrary DAC/SELinux
>> labels for their qemus. In general we will not find any intersection
>> that the helper socket can have.
> 
> I know some of those words. :)  Can you explain to an SELinux layman?

If qemu has to connect to the helper's socket the socket must have a
label that allows that. However, libvirt allows running qemu under
pretty much any label that users configure. So if there were just one
daemon per host (which there is not nor it will be), we wouldn't be able
to make any MCS (or at least that's how I understand it) - thus the only
thing we could do is just to allow everybody to connect.

This is more of affirmation that one helper daemon per host is a bad
idea than anything really.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-24 Thread Paolo Bonzini
On 24/11/2017 18:07, Michal Privoznik wrote:
> On 11/24/2017 04:42 PM, Paolo Bonzini wrote:
>> One daemon per QEMU is nicer IMO because it lets us do MCS.  Of course
>> one daemon per QEMU can only apply to system libvirtd; session must use
>> one daemon per host.
> 
> Agreed. One daemon per QEMU it is then.  Just to make sure whether I> 
> understand correctly - libvirtd would start it *before* exec()-ing qemu
> (so that qemu can connect) and kill it after qemu dies. But what should
> happen if the helper dies while qemu is running? Should qemu pause
> itself, fire up an event on the monitor so that libvirtd can spawn the
> helper again?

QEMU currently tries to reconnect five times every second, and then
fails the I/O command.  This seemed okay to me because PRs are generally
used in a HA environment where the failure will be broadcast and another
node will pick up the work.

> The helper is going to be daemonized (for the same reason
> that qemu process is) => there's no SIGCHLD for libvirtd to receive.

Couldn't (or shouldn't!) libvirt register itself as a subreaper instead?

> Also, I don't really expect anything special when it comes to migration,
> just want to make sure I'm not mislead.

No, everything is fine for migration.

>> One daemon per host is easy, because it's just a couple new command-line
>> options as far as libvirtd is concerned, but we need to check that it
>> works well with MCS.
> 
> It's very likely that it wouldn't. Users can chose arbitrary DAC/SELinux
> labels for their qemus. In general we will not find any intersection
> that the helper socket can have.

I know some of those words. :)  Can you explain to an SELinux layman?

> Yup. In case of a unprivileged libvirtd there'll be a system-wide helper
> daemon that:
> a) will not be managed by libvirtd
> b) has wide permissions so that any user can connect to it
> 
> This basically follows what we have for vhostuser. Except that vhostuser
> is not an implementation detail of qemu :-)

Okay, thanks.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-24 Thread Michal Privoznik
On 11/24/2017 04:42 PM, Paolo Bonzini wrote:
> On 24/11/2017 15:52, Daniel P. Berrange wrote:
>>> So what has been suggested so far is:
>>>
>>>   
>>> 
>>> 
>>> 
>>>   
> 
>  without an inner  element leaves libvirtd with
> the choice of a daemon per QEMU, or a daemon per host in a well-known
> location.  Unprivileged libvirtd would always use the latter; for
> privileged libvirtd it is implementation-defined.  I like it.
> 
>  with an inner  always gives a daemon per host in
> a custom location.  It can be used by either unprivileged or privileged
> libvirtd.
> 
>>> Now, my question is, in the first case - how should libvirt chose the
>>> path? Should it be different for each disk/domain? How is the daemon
>>> started in the first place - should libvirt start it? And when should
>>> libvirt kill it?
>>
>> The core question is one daemon per QEMU, or one daemon per host. I'd be
>> more inclined to have one daemon per QEMU so we always have isolation
>> and thus do't have to worry about a shared daemon being a potential
>> attack point between distinct QEMU's.
> 
> One daemon per QEMU is nicer IMO because it lets us do MCS.  Of course
> one daemon per QEMU can only apply to system libvirtd; session must use
> one daemon per host.

Agreed. One daemon per QEMU it is then. Just to make sure whether I
understand correctly - libvirtd would start it *before* exec()-ing qemu
(so that qemu can connect) and kill it after qemu dies. But what should
happen if the helper dies while qemu is running? Should qemu pause
itself, fire up an event on the monitor so that libvirtd can spawn the
helper again? The helper is going to be daemonized (for the same reason
that qemu process is) => there's no SIGCHLD for libvirtd to receive.

Also, I don't really expect anything special when it comes to migration,
just want to make sure I'm not mislead.

> 
> One daemon per host is easy, because it's just a couple new command-line
> options as far as libvirtd is concerned, but we need to check that it
> works well with MCS.

It's very likely that it wouldn't. Users can chose arbitrary DAC/SELinux
labels for their qemus. In general we will not find any intersection
that the helper socket can have.

> 
>> If one daemon per host, then for privileged libvirtd, we should make sure
>> the daemon ships with a systemd unit file + socket activation file, then
>> we have a well-known cross-distro standardized socket path.
> 
> Ok, then I will send a patch for upstream QEMU that adds the Fedora
> systemd unit files to contrib/.  They are useful anyway.

Yup. In case of a unprivileged libvirtd there'll be a system-wide helper
daemon that:
a) will not be managed by libvirtd
b) has wide permissions so that any user can connect to it

This basically follows what we have for vhostuser. Except that vhostuser
is not an implementation detail of qemu :-)

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-24 Thread Paolo Bonzini
On 24/11/2017 15:52, Daniel P. Berrange wrote:
>> So what has been suggested so far is:
>> 
>>   
>> 
>> 
>> 
>>   

 without an inner  element leaves libvirtd with
the choice of a daemon per QEMU, or a daemon per host in a well-known
location.  Unprivileged libvirtd would always use the latter; for
privileged libvirtd it is implementation-defined.  I like it.

 with an inner  always gives a daemon per host in
a custom location.  It can be used by either unprivileged or privileged
libvirtd.

>> Now, my question is, in the first case - how should libvirt chose the
>> path? Should it be different for each disk/domain? How is the daemon
>> started in the first place - should libvirt start it? And when should
>> libvirt kill it?
>
> The core question is one daemon per QEMU, or one daemon per host. I'd be
> more inclined to have one daemon per QEMU so we always have isolation
> and thus do't have to worry about a shared daemon being a potential
> attack point between distinct QEMU's.

One daemon per QEMU is nicer IMO because it lets us do MCS.  Of course
one daemon per QEMU can only apply to system libvirtd; session must use
one daemon per host.

One daemon per host is easy, because it's just a couple new command-line
options as far as libvirtd is concerned, but we need to check that it
works well with MCS.

> If one daemon per host, then for privileged libvirtd, we should make sure
> the daemon ships with a systemd unit file + socket activation file, then
> we have a well-known cross-distro standardized socket path.

Ok, then I will send a patch for upstream QEMU that adds the Fedora
systemd unit files to contrib/.  They are useful anyway.

Thanks,

Paolo

> If one daemon per QEMU, then we should just put the socket in the VM's
> private dir under /var/run/libvirt/qemu/$GUEST/

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-24 Thread Daniel P. Berrange
On Fri, Nov 24, 2017 at 03:38:54PM +0100, Michal Privoznik wrote:
> On 08/22/2017 06:27 PM, Paolo Bonzini wrote:
> > Hi all,
> > 
> 
> Sorry for resurrecting old thread but seems like there was no agreement
> reached.
> 
> We don't want to expose any paths because the fact that PR helper is a
> separate binary that uses a UNIX socket to talk to qemu is a
> implementation detail of qemu. Other HVs may have it differently.
> 
> However, we want to be able to turn it on/off or not mention it at all
> on per-disk basis. So what has been suggested so far is:
> 
>   
> 
> 
> 
>   
> 
> for privileged qemu, then:
> 
>   
> 
> 
> 
>   
> 
>   
> 
> for unprivileged qemu, or:
> 
>   
> 
> 
> 
>   
> 
> for PR feature turned off (equivalent to leaving it out completely).
> 
> Now, my question is, in the first case - how should libvirt chose the
> path? Should it be different for each disk/domain? How is the daemon
> started in the first place - should libvirt start it? And when should
> libvirt kill it?

The core question is one daemon per QEMU, or one daemon per host. I'd be
more inclined to have one daemon per QEMU so we always have isolation
and thus do't have to worry about a shared daemon being a potential
attack point between distinct QEMU's.

If one daemon per host, then for privileged libvirtd, we should make sure
the daemon ships with a systemd unit file + socket activation file, then
we have a well-known cross-distro standardized socket path.

If one daemon per QEMU, then we should just put the socket in the VM's
private dir under /var/run/libvirt/qemu/$GUEST/.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-11-24 Thread Michal Privoznik
On 08/22/2017 06:27 PM, Paolo Bonzini wrote:
> Hi all,
> 

Sorry for resurrecting old thread but seems like there was no agreement
reached.

We don't want to expose any paths because the fact that PR helper is a
separate binary that uses a UNIX socket to talk to qemu is a
implementation detail of qemu. Other HVs may have it differently.

However, we want to be able to turn it on/off or not mention it at all
on per-disk basis. So what has been suggested so far is:

  



  

for privileged qemu, then:

  



  

  

for unprivileged qemu, or:

  



  

for PR feature turned off (equivalent to leaving it out completely).

Now, my question is, in the first case - how should libvirt chose the
path? Should it be different for each disk/domain? How is the daemon
started in the first place - should libvirt start it? And when should
libvirt kill it?


Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-10-04 Thread Paolo Bonzini
On 04/10/2017 10:28, Daniel P. Berrange wrote:
> On Tue, Oct 03, 2017 at 06:53:56PM +0200, Paolo Bonzini wrote:
>> On 03/10/2017 18:39, Daniel P. Berrange wrote:
>>> On Tue, Oct 03, 2017 at 06:35:03PM +0200, Paolo Bonzini wrote:
 And later on we might have other ways to implement persistent
 reservations in QEMU.  So while I'm not a big fan(*) of the
 driver='helper' moniker, I don't think an attribute is enough.  Maybe
 driver='external'?...
>>>
>>> Yes, if there's a choice of ways to manage reservations, we could
>>> reflect that as  'reservations=passthrough|emulated' or something
>>> like that. 
>>>
>>> I just don't think the concept of a helper program should be visible
>>> in the XML, as it is an impl detail that is totally QEMU specific and
>>> could conceivably change eg not even needed with unpriv_sgio,
>>
>> Not sure about that, the mpathpersist behavior is somewhat magic and I
>> am not really sure we should enable it by default, even with unpriv_sgio.
>>
>>> and if
>>> kernel were enhanced could be usable without needing a helper elsewhere
>>> too.
>>
>> It's an implementation detail for system mode, but not for user mode
>> (where ACLs on the socket are used to allow access to a privileged
>> operation).
>>
>> So:
>>
>>
>>
>> (uses helper from global configuration if not a libiscsi drive) vs.
>>
>>
>>  
>>
> 
> What do you mean by  here ?  If that's the chardev source I would
> really prefer not to have that visible.

Yes, it's the chardev source.  See above: in user mode, ACLs on the
socket are used to allow access to a privileged operation, so the source
has to be there.  It's not the most common mode, but it's the only one
that works for user mode and hardcoding a (possibly distro-specific)
socket path in libvirtd is worse IMO.

Paolo

>>
>> (for user mode) vs.
>>
>>
>>
>> (fails if libiscsi || CAP_SYS_RAWIO || unpriv_sgio)?
> 
> Regards,
> Daniel
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-10-04 Thread Daniel P. Berrange
On Tue, Oct 03, 2017 at 06:53:56PM +0200, Paolo Bonzini wrote:
> On 03/10/2017 18:39, Daniel P. Berrange wrote:
> > On Tue, Oct 03, 2017 at 06:35:03PM +0200, Paolo Bonzini wrote:
> >> And later on we might have other ways to implement persistent
> >> reservations in QEMU.  So while I'm not a big fan(*) of the
> >> driver='helper' moniker, I don't think an attribute is enough.  Maybe
> >> driver='external'?...
> > 
> > Yes, if there's a choice of ways to manage reservations, we could
> > reflect that as  'reservations=passthrough|emulated' or something
> > like that. 
> > 
> > I just don't think the concept of a helper program should be visible
> > in the XML, as it is an impl detail that is totally QEMU specific and
> > could conceivably change eg not even needed with unpriv_sgio,
> 
> Not sure about that, the mpathpersist behavior is somewhat magic and I
> am not really sure we should enable it by default, even with unpriv_sgio.
> 
> > and if
> > kernel were enhanced could be usable without needing a helper elsewhere
> > too.
> 
> It's an implementation detail for system mode, but not for user mode
> (where ACLs on the socket are used to allow access to a privileged
> operation).
> 
> So:
> 
>
> 
> (uses helper from global configuration if not a libiscsi drive) vs.
> 
>
>  
>

What do you mean by  here ?  If that's the chardev source I would
really prefer not to have that visible.

> 
> (for user mode) vs.
> 
>
> 
> (fails if libiscsi || CAP_SYS_RAWIO || unpriv_sgio)?

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-10-03 Thread Paolo Bonzini
On 03/10/2017 18:39, Daniel P. Berrange wrote:
> On Tue, Oct 03, 2017 at 06:35:03PM +0200, Paolo Bonzini wrote:
>> And later on we might have other ways to implement persistent
>> reservations in QEMU.  So while I'm not a big fan(*) of the
>> driver='helper' moniker, I don't think an attribute is enough.  Maybe
>> driver='external'?...
> 
> Yes, if there's a choice of ways to manage reservations, we could
> reflect that as  'reservations=passthrough|emulated' or something
> like that. 
> 
> I just don't think the concept of a helper program should be visible
> in the XML, as it is an impl detail that is totally QEMU specific and
> could conceivably change eg not even needed with unpriv_sgio,

Not sure about that, the mpathpersist behavior is somewhat magic and I
am not really sure we should enable it by default, even with unpriv_sgio.

> and if
> kernel were enhanced could be usable without needing a helper elsewhere
> too.

It's an implementation detail for system mode, but not for user mode
(where ACLs on the socket are used to allow access to a privileged
operation).

So:

   

(uses helper from global configuration if not a libiscsi drive) vs.

   
 
   

(for user mode) vs.

   

(fails if libiscsi || CAP_SYS_RAWIO || unpriv_sgio)?

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-10-03 Thread Daniel P. Berrange
On Tue, Oct 03, 2017 at 06:35:03PM +0200, Paolo Bonzini wrote:
> On 03/10/2017 18:17, Daniel P. Berrange wrote:
> > On Tue, Oct 03, 2017 at 06:07:53PM +0200, Paolo Bonzini wrote:
> >> Yes, but OTOH if libvirtd starts the daemon, nobody cares about the
> >> source type, so perhaps
> >>
> >>
> >>  
> >>
> >>
> >> (mandatory source) vs.
> >>
> >>
> >>  /path/to/qemu-pr-helper
> >>
> >>
> >> (optional path, default from global configuration) vs.
> >>
> >>
> >>
> >> (no helper)?
> > 
> > I tend to think we should not expose the concept of a 'helper' at all in
> > the XML. The fact that QEMU has a separate binary to do this is really an
> > internal implementation detail due to the need for privilege separation/
> > elevation.
> > 
> > Libvirt should just do the right thing running the helper with a suitable
> > configuration when needed, just like we run the TAP device helper when
> > needed.
> 
> I agree we don't need the helper path.  I am not sure about the socket
> path, but maybe that could also be in qemu.conf.  However,
> reservations='off' doesn't always make sense:
> 
> 1) if you have appropriate privileges (via unpriv_sgio on distros that
> have it, or capabilities, or just because you're using the libiscsi
> driver) you will be able to access them anyway;
> 
> 2) if you're on a multipath device, you need to use the helper anyway to
> get the right semantics;
> 
> And later on we might have other ways to implement persistent
> reservations in QEMU.  So while I'm not a big fan(*) of the
> driver='helper' moniker, I don't think an attribute is enough.  Maybe
> driver='external'?...

Yes, if there's a choice of ways to manage reservations, we could
reflect that as  'reservations=passthrough|emulated' or something
like that. 

I just don't think the concept of a helper program should be visible
in the XML, as it is an impl detail that is totally QEMU specific and
could conceivably change eg not even needed with unpriv_sgio, and if
kernel were enhanced could be usable without needing a helper elsewhere
too.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-10-03 Thread Paolo Bonzini
On 03/10/2017 18:17, Daniel P. Berrange wrote:
> On Tue, Oct 03, 2017 at 06:07:53PM +0200, Paolo Bonzini wrote:
>> Yes, but OTOH if libvirtd starts the daemon, nobody cares about the
>> source type, so perhaps
>>
>>
>>  
>>
>>
>> (mandatory source) vs.
>>
>>
>>  /path/to/qemu-pr-helper
>>
>>
>> (optional path, default from global configuration) vs.
>>
>>
>>
>> (no helper)?
> 
> I tend to think we should not expose the concept of a 'helper' at all in
> the XML. The fact that QEMU has a separate binary to do this is really an
> internal implementation detail due to the need for privilege separation/
> elevation.
> 
> Libvirt should just do the right thing running the helper with a suitable
> configuration when needed, just like we run the TAP device helper when
> needed.

I agree we don't need the helper path.  I am not sure about the socket
path, but maybe that could also be in qemu.conf.  However,
reservations='off' doesn't always make sense:

1) if you have appropriate privileges (via unpriv_sgio on distros that
have it, or capabilities, or just because you're using the libiscsi
driver) you will be able to access them anyway;

2) if you're on a multipath device, you need to use the helper anyway to
get the right semantics;

And later on we might have other ways to implement persistent
reservations in QEMU.  So while I'm not a big fan(*) of the
driver='helper' moniker, I don't think an attribute is enough.  Maybe
driver='external'?...

Paolo

(*) I might say, I have some reservations about it

> IOW, just be as simple as an attribute  "reservations=on|off" and we'll
> decide the UNIX socket path, daemon forking, etc ourselves
> 
> Regards,
> Daniel
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-10-03 Thread Paolo Bonzini
On 03/10/2017 17:59, Michal Privoznik wrote:
> Ah, this breaks my design. I guess
> 
>   
> 
>   
> 
>   
> 
> is pure madness, isn't it?

Yes, but OTOH if libvirtd starts the daemon, nobody cares about the
source type, so perhaps

   
 
   

(mandatory source) vs.

   
 /path/to/qemu-pr-helper
   

(optional path, default from global configuration) vs.

   

(no helper)?

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-10-03 Thread Daniel P. Berrange
On Tue, Oct 03, 2017 at 06:07:53PM +0200, Paolo Bonzini wrote:
> On 03/10/2017 17:59, Michal Privoznik wrote:
> > Ah, this breaks my design. I guess
> > 
> >   
> > 
> >   
> > 
> >   
> > 
> > is pure madness, isn't it?
> 
> Yes, but OTOH if libvirtd starts the daemon, nobody cares about the
> source type, so perhaps
> 
>
>  
>
> 
> (mandatory source) vs.
> 
>
>  /path/to/qemu-pr-helper
>
> 
> (optional path, default from global configuration) vs.
> 
>
> 
> (no helper)?

I tend to think we should not expose the concept of a 'helper' at all in
the XML. The fact that QEMU has a separate binary to do this is really an
internal implementation detail due to the need for privilege separation/
elevation.

Libvirt should just do the right thing running the helper with a suitable
configuration when needed, just like we run the TAP device helper when
needed.

IOW, just be as simple as an attribute  "reservations=on|off" and we'll
decide the UNIX socket path, daemon forking, etc ourselves

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-10-03 Thread Michal Privoznik
On 09/10/2017 11:38 AM, Paolo Bonzini wrote:
> On 28/08/2017 13:11, Michal Privoznik wrote:
>> On 08/25/2017 12:41 AM, Paolo Bonzini wrote:
>>> On 22/08/2017 18:27, Paolo Bonzini wrote:
 Hi all,
>>

>>>
>>> The XML to use the helper with a predefined socket could be:
>>>
>>> 
>>>/path/to/unix.socket'
>>> 

This looks okay-ish. But from future proof POV, I'd have the path as an
attribute, or as a separate sub-element of  so that we can stick
more elements into it if we need to.

>>
>> Do we want to/need to expose the path here? I mean, is user expected to
>> do something with it? We don't expose monitor path anywhere but keep it
>> private (of course we store it in so called status XML which is a
>> persistent storage solely for purpose of reloading the internal state
>> when daemon is restarted).
> 
> In this case, yes.  This is for the case of a global daemon.

I'm thinking if we should just use virDomainChrSourceDefPtr for this. I
mean, if you take look at the vhost-user XML format it looks something
like this:

  




  

http://libvirt.org/formatdomain.html#elementVhostuser

So we could have:

  

  

  

The advantage of this is that we can express other connection modes if
the pr-helper is ever going to support them (although, if you need to
pass FDs around, UNIX socket is the only way for you). Then again, I
think it follows what we have elsewhere. Also, if we go this way, we can
forbid any type='' but unix and mode='server'.


> 
>>>
>>> while to use it with a dedicated daemon
>>>
>>> 
>>>/path/to/qemu-pr-helper
>>> 

Ah, this breaks my design. I guess

  

  

  

is pure madness, isn't it?

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Paolo Bonzini
On 11/09/2017 17:53, Daniel P. Berrange wrote:
> On Mon, Sep 11, 2017 at 05:47:28PM +0200, Paolo Bonzini wrote:
>> On 11/09/2017 17:33, Daniel P. Berrange wrote:
>>> On Mon, Sep 11, 2017 at 05:27:20PM +0200, Paolo Bonzini wrote:
 On 11/09/2017 17:23, Daniel P. Berrange wrote:
>> On the other hand, the daemon has CAP_SYS_RAWIO and CAP_SYS_ADMIN, so if
>> you get memory corruption all bets are probably off anyway.
> That's where the benefit of strict selinux labelling comes in. If we had
> strict labelling of the individual paths below the device, then even if
> the daemon got corrupted, the policy would prevent it from doing any
> damage to the system beyond calling ioctl() the individual paths it had
> been granted. It wouldn't be able to access devices associated with
> the host OS mounts, or other non-VM related or non-multipath related
> block devices.

 Sure, but those capabilities let you do a lot of nasty things
 indirectly, even within the constraints of the SELinux policy.

 For example, if you are able to reconfigure device mapper, you can
 convince the kernel to write to any block device---even if you cannot
 open it.  IDWEFAL (I don't write exploits for a living) but I'm sure
 that's just scraping the surface.
>>>
>>> Surely we would not write an SELinux policy that allows this daemon
>>> to reconfigure device mapper.
>>>
>>> IIUC, all this daemon should need is the ability to request persistent
>>> reservations on the individual paths associated with the mpath device.
>>>
>>> Is it not possible to write a SElinux policy which allows that, without
>>> also allowing reconfiguration of device mapper.
>>
>> As far as I know, querying and reconfiguring the device mapper are both
>> done with ioctls on /dev/mapper/control, and both require CAP_SYS_ADMIN.
>>
>> Maybe future versions of Linux could change it to require CAP_SYS_ADMIN
>> only for reconfiguration, so that the PR helper daemon does not require
>> the capability anymore.  However, that would be independent from
>> SELinux, which only controls "ioctl" access without finer-grain choice
>> of which ioctls to allow.
> 
> I don't suppose that the LUNS behind a mpath device end up being
> surface in /sys/block anywhere do they ?

The LUNs actually can be identified via /sys/block/dm-NN/slaves (I
think), but you cannot find out if it's a mpath device in the first
place without a /dev/mapper/control ioctl.

>> I understand that you want to protect in depth, but unfortunately this
>> only works if all layers are aware of SELinux.  Luckily the daemon is
>> much, much smaller than QEMU, and so is the attack surface.
> 
> It would be ok if we think its possible to lock it down later (once any
> needed kernel enhancements are developed), without having to change the
> interaction between QEMU / libvirt / helper daemon.  I'm beginning to
> feel that is ok.

Yes, that's my reasoning as well.  We could (and perhaps should) still
look at MCS to block unwanted connections to the socket, though.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Daniel P. Berrange
On Mon, Sep 11, 2017 at 05:47:28PM +0200, Paolo Bonzini wrote:
> On 11/09/2017 17:33, Daniel P. Berrange wrote:
> > On Mon, Sep 11, 2017 at 05:27:20PM +0200, Paolo Bonzini wrote:
> >> On 11/09/2017 17:23, Daniel P. Berrange wrote:
>  On the other hand, the daemon has CAP_SYS_RAWIO and CAP_SYS_ADMIN, so if
>  you get memory corruption all bets are probably off anyway.
> >>> That's where the benefit of strict selinux labelling comes in. If we had
> >>> strict labelling of the individual paths below the device, then even if
> >>> the daemon got corrupted, the policy would prevent it from doing any
> >>> damage to the system beyond calling ioctl() the individual paths it had
> >>> been granted. It wouldn't be able to access devices associated with
> >>> the host OS mounts, or other non-VM related or non-multipath related
> >>> block devices.
> >>
> >> Sure, but those capabilities let you do a lot of nasty things
> >> indirectly, even within the constraints of the SELinux policy.
> >>
> >> For example, if you are able to reconfigure device mapper, you can
> >> convince the kernel to write to any block device---even if you cannot
> >> open it.  IDWEFAL (I don't write exploits for a living) but I'm sure
> >> that's just scraping the surface.
> > 
> > Surely we would not write an SELinux policy that allows this daemon
> > to reconfigure device mapper.
> > 
> > IIUC, all this daemon should need is the ability to request persistent
> > reservations on the individual paths associated with the mpath device.
> > 
> > Is it not possible to write a SElinux policy which allows that, without
> > also allowing reconfiguration of device mapper.
> 
> As far as I know, querying and reconfiguring the device mapper are both
> done with ioctls on /dev/mapper/control, and both require CAP_SYS_ADMIN.
> 
> Maybe future versions of Linux could change it to require CAP_SYS_ADMIN
> only for reconfiguration, so that the PR helper daemon does not require
> the capability anymore.  However, that would be independent from
> SELinux, which only controls "ioctl" access without finer-grain choice
> of which ioctls to allow.

I don't suppose that the LUNS behind a mpath device end up being
surface in /sys/block anywhere do they ?

> I understand that you want to protect in depth, but unfortunately this
> only works if all layers are aware of SELinux.  Luckily the daemon is
> much, much smaller than QEMU, and so is the attack surface.

It would be ok if we think its possible to lock it down later (once any
needed kernel enhancements are developed), without having to change the
interaction between QEMU / libvirt / helper daemon.  I'm beginning to
feel that is ok.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Paolo Bonzini
On 11/09/2017 17:33, Daniel P. Berrange wrote:
> On Mon, Sep 11, 2017 at 05:27:20PM +0200, Paolo Bonzini wrote:
>> On 11/09/2017 17:23, Daniel P. Berrange wrote:
 On the other hand, the daemon has CAP_SYS_RAWIO and CAP_SYS_ADMIN, so if
 you get memory corruption all bets are probably off anyway.
>>> That's where the benefit of strict selinux labelling comes in. If we had
>>> strict labelling of the individual paths below the device, then even if
>>> the daemon got corrupted, the policy would prevent it from doing any
>>> damage to the system beyond calling ioctl() the individual paths it had
>>> been granted. It wouldn't be able to access devices associated with
>>> the host OS mounts, or other non-VM related or non-multipath related
>>> block devices.
>>
>> Sure, but those capabilities let you do a lot of nasty things
>> indirectly, even within the constraints of the SELinux policy.
>>
>> For example, if you are able to reconfigure device mapper, you can
>> convince the kernel to write to any block device---even if you cannot
>> open it.  IDWEFAL (I don't write exploits for a living) but I'm sure
>> that's just scraping the surface.
> 
> Surely we would not write an SELinux policy that allows this daemon
> to reconfigure device mapper.
> 
> IIUC, all this daemon should need is the ability to request persistent
> reservations on the individual paths associated with the mpath device.
> 
> Is it not possible to write a SElinux policy which allows that, without
> also allowing reconfiguration of device mapper.

As far as I know, querying and reconfiguring the device mapper are both
done with ioctls on /dev/mapper/control, and both require CAP_SYS_ADMIN.

Maybe future versions of Linux could change it to require CAP_SYS_ADMIN
only for reconfiguration, so that the PR helper daemon does not require
the capability anymore.  However, that would be independent from
SELinux, which only controls "ioctl" access without finer-grain choice
of which ioctls to allow.

I understand that you want to protect in depth, but unfortunately this
only works if all layers are aware of SELinux.  Luckily the daemon is
much, much smaller than QEMU, and so is the attack surface.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Paolo Bonzini
On 11/09/2017 17:23, Daniel P. Berrange wrote:
>> On the other hand, the daemon has CAP_SYS_RAWIO and CAP_SYS_ADMIN, so if
>> you get memory corruption all bets are probably off anyway.
> That's where the benefit of strict selinux labelling comes in. If we had
> strict labelling of the individual paths below the device, then even if
> the daemon got corrupted, the policy would prevent it from doing any
> damage to the system beyond calling ioctl() the individual paths it had
> been granted. It wouldn't be able to access devices associated with
> the host OS mounts, or other non-VM related or non-multipath related
> block devices.

Sure, but those capabilities let you do a lot of nasty things
indirectly, even within the constraints of the SELinux policy.

For example, if you are able to reconfigure device mapper, you can
convince the kernel to write to any block device---even if you cannot
open it.  IDWEFAL (I don't write exploits for a living) but I'm sure
that's just scraping the surface.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Daniel P. Berrange
On Mon, Sep 11, 2017 at 05:20:10PM +0200, Paolo Bonzini wrote:
> On 11/09/2017 16:32, Daniel P. Berrange wrote:
> >> This is already handled via SCM_RIGHTS and is part of the design of the
> >> helper daemon.  QEMU cannot even open mpath devices which are not
> >> accessible according to its SELinux category or device cgroup.
> > 
> > Ah so the daemon relies on the fact that the client was not permitted
> > to open another file. So the only FD it can receive from the client is
> > one that was associated with a permitted mpath device.
> > 
> > That would be sufficient to protect against a malicious qemu process
> > trying to explicitly pass it invalid FD data. It wouldn't be sufficient
> > to be safe against a QEMU process that somehow managed to trigger a bug
> > that caused it to corrupt memory and thus trick into opening a different
> > file. For the latter we would need to have some stricter policy about
> > the helper daemon and labelling on files.
> 
> Exactly.  The passed file descriptor acts as a "capability"; the daemon
> goes from there to the paths through fstat on the file descriptor
> followed by /dev/mapper/control APIs (mostly issued by libmpathpersist).
> 
> On the other hand, the daemon has CAP_SYS_RAWIO and CAP_SYS_ADMIN, so if
> you get memory corruption all bets are probably off anyway.

That's where the benefit of strict selinux labelling comes in. If we had
strict labelling of the individual paths below the device, then even if
the daemon got corrupted, the policy would prevent it from doing any
damage to the system beyond calling ioctl() the individual paths it had
been granted. It wouldn't be able to access devices associated with
the host OS mounts, or other non-VM related or non-multipath related
block devices.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Daniel P. Berrange
On Mon, Sep 11, 2017 at 05:27:20PM +0200, Paolo Bonzini wrote:
> On 11/09/2017 17:23, Daniel P. Berrange wrote:
> >> On the other hand, the daemon has CAP_SYS_RAWIO and CAP_SYS_ADMIN, so if
> >> you get memory corruption all bets are probably off anyway.
> > That's where the benefit of strict selinux labelling comes in. If we had
> > strict labelling of the individual paths below the device, then even if
> > the daemon got corrupted, the policy would prevent it from doing any
> > damage to the system beyond calling ioctl() the individual paths it had
> > been granted. It wouldn't be able to access devices associated with
> > the host OS mounts, or other non-VM related or non-multipath related
> > block devices.
> 
> Sure, but those capabilities let you do a lot of nasty things
> indirectly, even within the constraints of the SELinux policy.
> 
> For example, if you are able to reconfigure device mapper, you can
> convince the kernel to write to any block device---even if you cannot
> open it.  IDWEFAL (I don't write exploits for a living) but I'm sure
> that's just scraping the surface.

Surely we would not write an SELinux policy that allows this daemon
to reconfigure device mapper.

IIUC, all this daemon should need is the ability to request persistent
reservations on the individual paths associated with the mpath device.

Is it not possible to write a SElinux policy which allows that, without
also allowing reconfiguration of device mapper.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Paolo Bonzini
On 11/09/2017 16:32, Daniel P. Berrange wrote:
>> This is already handled via SCM_RIGHTS and is part of the design of the
>> helper daemon.  QEMU cannot even open mpath devices which are not
>> accessible according to its SELinux category or device cgroup.
> 
> Ah so the daemon relies on the fact that the client was not permitted
> to open another file. So the only FD it can receive from the client is
> one that was associated with a permitted mpath device.
> 
> That would be sufficient to protect against a malicious qemu process
> trying to explicitly pass it invalid FD data. It wouldn't be sufficient
> to be safe against a QEMU process that somehow managed to trigger a bug
> that caused it to corrupt memory and thus trick into opening a different
> file. For the latter we would need to have some stricter policy about
> the helper daemon and labelling on files.

Exactly.  The passed file descriptor acts as a "capability"; the daemon
goes from there to the paths through fstat on the file descriptor
followed by /dev/mapper/control APIs (mostly issued by libmpathpersist).

On the other hand, the daemon has CAP_SYS_RAWIO and CAP_SYS_ADMIN, so if
you get memory corruption all bets are probably off anyway.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Daniel P. Berrange
On Mon, Sep 11, 2017 at 04:23:27PM +0200, Paolo Bonzini wrote:
> On 11/09/2017 14:09, Daniel P. Berrange wrote:
> > On Mon, Sep 11, 2017 at 01:44:59PM +0200, Paolo Bonzini wrote:
> >> On 11/09/2017 12:46, Daniel P. Berrange wrote:
>  So the helper is a bit different from QEMU with respect to both SELinux
>  MCS labeling and the devices cgroup.  Access limitation comes from only
>  ever operating on devices that have been passed on the socket.  SELinux
>  MCS could be used so that only the "right" QEMU can connect to each
>  instance of the helper, though.
> >>> I wonder how this interacts with SELinux. IIUC if we grant access to
> >>> the multipath device file, the kernel won't normally require us to
> >>> grant access the underlying paths. I/O is just allowed because they
> >>> are a member of the mpath device. Hopefully this applies to the
> >>> persistent reservations too ?
> >>
> >> No, persistent reservations are special; they have to be registered
> >> independently on each path.  (As I said, this was the original reason to
> >> have a separate daemon: implementing it in QEMU would be not just a bad
> >> idea because you need CAP_SYS_ADMIN, it would be impossible for
> >> libvirt-started guests because of sVirt and device cgroups).
> >>
> >> So I think the helper daemon needs to be granted blanket ioctl access to
> >> devices, without using either device cgroups or MCS.
> > 
> > If it was a single helper daemon for all guests, that would be ok to be
> > granted access to all devices. If we did that though, the daemon would
> > have to be SELinux aware. ie, libvirt would have to talk to it and say
> > that svirt_t:s0:c236,c660 is permitted /dev/mpath/foo, and it would
> > have to validate the requests from the socket to QEMU to make sure that
> > QEMU is not requesting access to other mpath devices.
> 
> This is already handled via SCM_RIGHTS and is part of the design of the
> helper daemon.  QEMU cannot even open mpath devices which are not
> accessible according to its SELinux category or device cgroup.

Ah so the daemon relies on the fact that the client was not permitted
to open another file. So the only FD it can receive from the client is
one that was associated with a permitted mpath device.

That would be sufficient to protect against a malicious qemu process
trying to explicitly pass it invalid FD data. It wouldn't be sufficient
to be safe against a QEMU process that somehow managed to trigger a bug
that caused it to corrupt memory and thus trick into opening a different
file. For the latter we would need to have some stricter policy about
the helper daemon and labelling on files.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Paolo Bonzini
On 11/09/2017 14:09, Daniel P. Berrange wrote:
> On Mon, Sep 11, 2017 at 01:44:59PM +0200, Paolo Bonzini wrote:
>> On 11/09/2017 12:46, Daniel P. Berrange wrote:
 So the helper is a bit different from QEMU with respect to both SELinux
 MCS labeling and the devices cgroup.  Access limitation comes from only
 ever operating on devices that have been passed on the socket.  SELinux
 MCS could be used so that only the "right" QEMU can connect to each
 instance of the helper, though.
>>> I wonder how this interacts with SELinux. IIUC if we grant access to
>>> the multipath device file, the kernel won't normally require us to
>>> grant access the underlying paths. I/O is just allowed because they
>>> are a member of the mpath device. Hopefully this applies to the
>>> persistent reservations too ?
>>
>> No, persistent reservations are special; they have to be registered
>> independently on each path.  (As I said, this was the original reason to
>> have a separate daemon: implementing it in QEMU would be not just a bad
>> idea because you need CAP_SYS_ADMIN, it would be impossible for
>> libvirt-started guests because of sVirt and device cgroups).
>>
>> So I think the helper daemon needs to be granted blanket ioctl access to
>> devices, without using either device cgroups or MCS.
> 
> If it was a single helper daemon for all guests, that would be ok to be
> granted access to all devices. If we did that though, the daemon would
> have to be SELinux aware. ie, libvirt would have to talk to it and say
> that svirt_t:s0:c236,c660 is permitted /dev/mpath/foo, and it would
> have to validate the requests from the socket to QEMU to make sure that
> QEMU is not requesting access to other mpath devices.

This is already handled via SCM_RIGHTS and is part of the design of the
helper daemon.  QEMU cannot even open mpath devices which are not
accessible according to its SELinux category or device cgroup.

> If it was a one helper daemon per QEMU instance, then we would want to
> directly confine it to just the underlying devices associated with the
> mpath device allowed for that QEMU instance. This would imply that we
> needed to label the underlying devices with the MCS label to match the
> VM.

This would be nice but not necessary, right?

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Daniel P. Berrange
On Mon, Sep 11, 2017 at 01:44:59PM +0200, Paolo Bonzini wrote:
> On 11/09/2017 12:46, Daniel P. Berrange wrote:
> >> So the helper is a bit different from QEMU with respect to both SELinux
> >> MCS labeling and the devices cgroup.  Access limitation comes from only
> >> ever operating on devices that have been passed on the socket.  SELinux
> >> MCS could be used so that only the "right" QEMU can connect to each
> >> instance of the helper, though.
> > I wonder how this interacts with SELinux. IIUC if we grant access to
> > the multipath device file, the kernel won't normally require us to
> > grant access the underlying paths. I/O is just allowed because they
> > are a member of the mpath device. Hopefully this applies to the
> > persistent reservations too ?
> 
> No, persistent reservations are special; they have to be registered
> independently on each path.  (As I said, this was the original reason to
> have a separate daemon: implementing it in QEMU would be not just a bad
> idea because you need CAP_SYS_ADMIN, it would be impossible for
> libvirt-started guests because of sVirt and device cgroups).
> 
> So I think the helper daemon needs to be granted blanket ioctl access to
> devices, without using either device cgroups or MCS.

If it was a single helper daemon for all guests, that would be ok to be
granted access to all devices. If we did that though, the daemon would
have to be SELinux aware. ie, libvirt would have to talk to it and say
that svirt_t:s0:c236,c660 is permitted /dev/mpath/foo, and it would
have to validate the requests from the socket to QEMU to make sure that
QEMU is not requesting access to other mpath devices.

If it was a one helper daemon per QEMU instance, then we would want to
directly confine it to just the underlying devices associated with the
mpath device allowed for that QEMU instance. This would imply that we
needed to label the underlying devices with the MCS label to match the
VM.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Paolo Bonzini
On 11/09/2017 12:46, Daniel P. Berrange wrote:
>> So the helper is a bit different from QEMU with respect to both SELinux
>> MCS labeling and the devices cgroup.  Access limitation comes from only
>> ever operating on devices that have been passed on the socket.  SELinux
>> MCS could be used so that only the "right" QEMU can connect to each
>> instance of the helper, though.
> I wonder how this interacts with SELinux. IIUC if we grant access to
> the multipath device file, the kernel won't normally require us to
> grant access the underlying paths. I/O is just allowed because they
> are a member of the mpath device. Hopefully this applies to the
> persistent reservations too ?

No, persistent reservations are special; they have to be registered
independently on each path.  (As I said, this was the original reason to
have a separate daemon: implementing it in QEMU would be not just a bad
idea because you need CAP_SYS_ADMIN, it would be impossible for
libvirt-started guests because of sVirt and device cgroups).

So I think the helper daemon needs to be granted blanket ioctl access to
devices, without using either device cgroups or MCS.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-11 Thread Daniel P. Berrange
On Sun, Sep 10, 2017 at 11:52:24AM +0200, Paolo Bonzini wrote:
> On 29/08/2017 14:41, Daniel P. Berrange wrote:
> > On Tue, Aug 22, 2017 at 06:27:40PM +0200, Paolo Bonzini wrote:
> >> Hi all,
> >>
> >> I am adding a new daemon to QEMU, that QEMU can connect to in order to
> >> issue persistent reservation commands.
> > 
> > Can you elaborate on what this daemon does ?
> > 
> > IIUC, by 'persistent reservation' you are referring to SCSI LUN
> > reservations ?  If so, the security model needs to involve more
> > than just policy on the socket that QEMU uses to talk to the
> > PR daemon.  We need to be able to control what device nodes this
> > daemon is permitted to access.
> > 
> > For iSCSI backed disks, the daemon might need to use the libiscsi
> > driver, instead of assuming there is a device node on the local
> > host too.
> 
> libiscsi-backed disks can already issue persistent reservations.  The
> daemon is only needed for physical disks, as an alternative to
> sgio='unfiltered' or (yuck) running QEMU with CAP_SYS_RAWIO.
> 
> > Libvirt has a generic lock manager facility and I've previously
> > though might be able to be extended to acquire SCSI reservations
> > for disks which are backed by SCSI/iSCSI, but never tried to
> > work on this.
> 
> This is an interesting idea, but it is unrelated to this work, which is
> about guests who manage the locking on their own (through peristent
> reservations).

Ah ok, I missed that this was about allowing the guest todo reservations
itself, rather than QEMU using it for disk locking.

> > I'm not sure if want to have QEMU spawning this daemon itself or not.
> 
> Definitely not.  The daemon is not suid root.
> 
> > On the one hand if QEMU spawns the daemon, then it means the daemon
> > inherits the SELinux policy of QEMU by default, so is restricted to
> > only access the devices that QEMU has been granted.
> 
> Libvirt could also give a confined policy to the helper, but there are
> some complications because of multipath.  When passed a multipath
> device, the daemon forwards the PR command to _all_ paths, not just the
> currently active one, similar to the mpathpersist(1) command.  (In fact
> this was the original usecase; support for non-multipath devices was
> just an easy and useful extension.)
>
> So the helper is a bit different from QEMU with respect to both SELinux
> MCS labeling and the devices cgroup.  Access limitation comes from only
> ever operating on devices that have been passed on the socket.  SELinux
> MCS could be used so that only the "right" QEMU can connect to each
> instance of the helper, though.

I wonder how this interacts with SELinux. IIUC if we grant access to
the multipath device file, the kernel won't normally require us to
grant access the underlying paths. I/O is just allowed because they
are a member of the mpath device. Hopefully this applies to the
persistent reservations too ?

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-10 Thread Paolo Bonzini
On 29/08/2017 14:41, Daniel P. Berrange wrote:
> On Tue, Aug 22, 2017 at 06:27:40PM +0200, Paolo Bonzini wrote:
>> Hi all,
>>
>> I am adding a new daemon to QEMU, that QEMU can connect to in order to
>> issue persistent reservation commands.
> 
> Can you elaborate on what this daemon does ?
> 
> IIUC, by 'persistent reservation' you are referring to SCSI LUN
> reservations ?  If so, the security model needs to involve more
> than just policy on the socket that QEMU uses to talk to the
> PR daemon.  We need to be able to control what device nodes this
> daemon is permitted to access.
> 
> For iSCSI backed disks, the daemon might need to use the libiscsi
> driver, instead of assuming there is a device node on the local
> host too.

libiscsi-backed disks can already issue persistent reservations.  The
daemon is only needed for physical disks, as an alternative to
sgio='unfiltered' or (yuck) running QEMU with CAP_SYS_RAWIO.

> Libvirt has a generic lock manager facility and I've previously
> though might be able to be extended to acquire SCSI reservations
> for disks which are backed by SCSI/iSCSI, but never tried to
> work on this.

This is an interesting idea, but it is unrelated to this work, which is
about guests who manage the locking on their own (through peristent
reservations).

> I'm not sure if want to have QEMU spawning this daemon itself or not.

Definitely not.  The daemon is not suid root.

> On the one hand if QEMU spawns the daemon, then it means the daemon
> inherits the SELinux policy of QEMU by default, so is restricted to
> only access the devices that QEMU has been granted.

Libvirt could also give a confined policy to the helper, but there are
some complications because of multipath.  When passed a multipath
device, the daemon forwards the PR command to _all_ paths, not just the
currently active one, similar to the mpathpersist(1) command.  (In fact
this was the original usecase; support for non-multipath devices was
just an easy and useful extension.)

So the helper is a bit different from QEMU with respect to both SELinux
MCS labeling and the devices cgroup.  Access limitation comes from only
ever operating on devices that have been passed on the socket.  SELinux
MCS could be used so that only the "right" QEMU can connect to each
instance of the helper, though.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-10 Thread Paolo Bonzini
On 10/09/2017 11:38, Paolo Bonzini wrote:
> The daemon can then be
> placed in the same devices cgroup and SELinux MCS category as QEMU.

At least regarding the devices cgroup, this is wrong, sorry (the socket
can be given an MCS category to restrict who connects to it, but not the
daemon).  More details in the reply to Daniel's message.

Thanks,

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-09-10 Thread Paolo Bonzini
On 28/08/2017 13:11, Michal Privoznik wrote:
> On 08/25/2017 12:41 AM, Paolo Bonzini wrote:
>> On 22/08/2017 18:27, Paolo Bonzini wrote:
>>> Hi all,
> 
> Hey, sorry for late reply. I was enjoying my PTO by not reading e-mails :-)

Me too!

>>>
>>> I am adding a new daemon to QEMU, that QEMU can connect to in order to
>>> issue persistent reservation commands.
> 
> Persistent reservation of what?

SCSI persistent reservation commands are commands that support multiple
initiators accessing the same device.

>> Thinking more about it, Libvirt could start the daemon on its own,
>> assigning a socket per virtual machine.  SELinux MCS should then just
>> work, because the same category is assigned to the daemon instance and QEMU.
> 
> Whoa, libvirt is not really prepared for qemu spawning processes, or
> having more than one process per qemu domain. But it shouldn't be that
> hard to prepare internal structs for that.

As Daniel remarked, QEMU should not be spawning processes with elevated
privileges.  In fact, it definitely should not be doing that in this
case.  The two usecases are:

1) for user libvirtd, the daemon is started by init and uses a global
socket.  To talk to the daemon, the user must have been given access to
the socket by the administrator via Unix groups or ACLs or whatever.
Once you have access, you can send PRs to whatever device you have
access (unless the administrator hasn't also configured e.g. devices
cgroups for the daemon---but this is outside libvirt's scope).  While
this mode can be used for system libvirtd too, it would be less secure
and it would be hard to make it work with SELinux, so...

2) ... for system libvirtd, a copy of the daemon is started for each VM
by libvirtd, together with QEMU.  This mode cannot be used for user
libvirtd because the helper is _not_ suid root.  The daemon can then be
placed in the same devices cgroup and SELinux MCS category as QEMU.

>> In particular, Libvirt could create the socket itself, label it, and
>> pass it to the daemon through the systemd socket activation protocol
>> (which qemu-pr-helper supports).
> 
> We can pass FDs to qemu (in fact any process). Even if its running. So
> that shouldn't be a problem.

Yeah, the systemd socket activation protocol should be trivial to
support in libvirtd (compared to other changes to use >1 process per VM).

>>
>> The XML to use the helper with a predefined socket could be:
>>
>>  
>> /path/to/unix.socket'
>>  
> 
> Do we want to/need to expose the path here? I mean, is user expected to
> do something with it? We don't expose monitor path anywhere but keep it
> private (of course we store it in so called status XML which is a
> persistent storage solely for purpose of reloading the internal state
> when daemon is restarted).

In this case, yes.  This is for the case of a global daemon.

>>
>> while to use it with a dedicated daemon
>>
>>  
>> /path/to/qemu-pr-helper
>>  
> 
> 
> Ah, so there isn't 1:1 relationship with qemu process and the daemon
> helper. One daemon can serve multiple qemu processes, am I right?

Yes, but it need not be the case.

> Also, how would libvirt know if the daemon helper dies?

The daemon shouldn't die.  Last famous words, sure, but even if the
daemon isn't respawned, persistent reservations commands fail but
everything else works normally.

> I mean, if libvirt is
> to start it again (there are some systemd-less distros), we have to know
> that such situation happened. For instance, we can get an event on the
> monitor to which we start the daemon and pass new FD to its socket to
> qemu? Although, this would mean a significant work on libvirt side in
> case there's 1:N relationship. Because on delivery of the event from two
> domains we have to figure out if the domains are supposed to have their
> own daemons or one shared.

Yeah, this is why for the 1:N model (user libvirtd) I prefer to just
leave it to systemd and systemd-less distros will rely on the last
famous words above.  For the 1:1 model (system libvirtd) libvirtd
_could_ respawn, but it is not big deal if it doesn't.

> Also, what happens when the daemon dies? What's the qemu state at that
> point? Is the guest still running or is it paused (e.g. like on ENOSPC
> error on disks)?

See above.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-08-29 Thread Daniel P. Berrange
On Tue, Aug 22, 2017 at 06:27:40PM +0200, Paolo Bonzini wrote:
> Hi all,
> 
> I am adding a new daemon to QEMU, that QEMU can connect to in order to
> issue persistent reservation commands.

Can you elaborate on what this daemon does ?

IIUC, by 'persistent reservation' you are referring to SCSI LUN
reservations ?  If so, the security model needs to involve more
than just policy on the socket that QEMU uses to talk to the
PR daemon.  We need to be able to control what device nodes this
daemon is permitted to access.

For iSCSI backed disks, the daemon might need to use the libiscsi
driver, instead of assuming there is a device node on the local
host too.

Libvirt has a generic lock manager facility and I've previously
though might be able to be extended to acquire SCSI reservations
for disks which are backed by SCSI/iSCSI, but never tried to
work on this.

> The daemon can only issue the commands on file descriptor that QEMU
> already has.  In addition normal users shouldn't have access to the
> daemon's Unix socket in /run, so the daemon is protected against misuse.
> 
> My question is what is the best way to handle the connection to the
> daemon socket.  Currently, the path to the socket is passed to QEMU on
> the command line:
> 
>  -object pr-manager-helper,id=mgr,path=/run/qemu-pr-helper.sock \
>  -drive if=none,id=hd,driver=raw,filename=/dev/sdb,file.pr-manager=mgr \
>  -device scsi-block,drive=hd
> 
> (the new parts are "-object pr-manager-helper" and "file.pr-manager").

I'm not sure if want to have QEMU spawning this daemon itself or not.

On the one hand if QEMU spawns the daemon, then it means the daemon
inherits the SELinux policy of QEMU by default, so is restricted to
only access the devices that QEMU has been granted. On the other
hand, this means we need to give QEMU permission to execve(), which
is something we've been striving to remove by default.

> I could just make it root:root and pass a file descriptor from libvirt
> to QEMU, but this would make it impossible for QEMU to reconnect to the
> daemon in case someone does a "systemctl restart" or even just kills it
> inadvertently.  The daemon is stateless, so transparent reconnection
> would be a nice feature to have.
> 
> The alternative is to somehow label the daemon socket so that it can be
> accessed by QEMU, but I'm not very well versed in SELinux.
> 
> Any ideas?

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-08-28 Thread Michal Privoznik
On 08/25/2017 12:41 AM, Paolo Bonzini wrote:
> On 22/08/2017 18:27, Paolo Bonzini wrote:
>> Hi all,

Hey, sorry for late reply. I was enjoying my PTO by not reading e-mails :-)

>>
>> I am adding a new daemon to QEMU, that QEMU can connect to in order to
>> issue persistent reservation commands.

Persistent reservation of what?

>>
>> The daemon can only issue the commands on file descriptor that QEMU
>> already has.  In addition normal users shouldn't have access to the
>> daemon's Unix socket in /run, so the daemon is protected against misuse.
>>
>> My question is what is the best way to handle the connection to the
>> daemon socket.  Currently, the path to the socket is passed to QEMU on
>> the command line:
>>
>>  -object pr-manager-helper,id=mgr,path=/run/qemu-pr-helper.sock \
>>  -drive if=none,id=hd,driver=raw,filename=/dev/sdb,file.pr-manager=mgr \
>>  -device scsi-block,drive=hd
>>
>> (the new parts are "-object pr-manager-helper" and "file.pr-manager").
>>
>> I could just make it root:root and pass a file descriptor from libvirt
>> to QEMU, but this would make it impossible for QEMU to reconnect to the
>> daemon in case someone does a "systemctl restart" or even just kills it
>> inadvertently.  The daemon is stateless, so transparent reconnection
>> would be a nice feature to have.
>>
>> The alternative is to somehow label the daemon socket so that it can be
>> accessed by QEMU, but I'm not very well versed in SELinux.
> 
> Thinking more about it, Libvirt could start the daemon on its own,
> assigning a socket per virtual machine.  SELinux MCS should then just
> work, because the same category is assigned to the daemon instance and QEMU.

Whoa, libvirt is not really prepared for qemu spawning processes, or
having more than one process per qemu domain. But it shouldn't be that
hard to prepare internal structs for that.

> 
> In particular, Libvirt could create the socket itself, label it, and
> pass it to the daemon through the systemd socket activation protocol
> (which qemu-pr-helper supports).

We can pass FDs to qemu (in fact any process). Even if its running. So
that shouldn't be a problem.

> 
> The XML to use the helper with a predefined socket could be:
> 
>   
>  /path/to/unix.socket'
>   

Do we want to/need to expose the path here? I mean, is user expected to
do something with it? We don't expose monitor path anywhere but keep it
private (of course we store it in so called status XML which is a
persistent storage solely for purpose of reloading the internal state
when daemon is restarted).

> 
> while to use it with a dedicated daemon
> 
>   
>  /path/to/qemu-pr-helper
>   


Ah, so there isn't 1:1 relationship with qemu process and the daemon
helper. One daemon can serve multiple qemu processes, am I right? Also,
how would libvirt know if the daemon helper dies? I mean, if libvirt is
to start it again (there are some systemd-less distros), we have to know
that such situation happened. For instance, we can get an event on the
monitor to which we start the daemon and pass new FD to its socket to
qemu? Although, this would mean a significant work on libvirt side in
case there's 1:N relationship. Because on delivery of the event from two
domains we have to figure out if the domains are supposed to have their
own daemons or one shared.

Also, what happens when the daemon dies? What's the qemu state at that
point? Is the guest still running or is it paused (e.g. like on ENOSPC
error on disks)?

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] New QEMU daemon for persistent reservations

2017-08-24 Thread Paolo Bonzini
On 22/08/2017 18:27, Paolo Bonzini wrote:
> Hi all,
> 
> I am adding a new daemon to QEMU, that QEMU can connect to in order to
> issue persistent reservation commands.
> 
> The daemon can only issue the commands on file descriptor that QEMU
> already has.  In addition normal users shouldn't have access to the
> daemon's Unix socket in /run, so the daemon is protected against misuse.
> 
> My question is what is the best way to handle the connection to the
> daemon socket.  Currently, the path to the socket is passed to QEMU on
> the command line:
> 
>  -object pr-manager-helper,id=mgr,path=/run/qemu-pr-helper.sock \
>  -drive if=none,id=hd,driver=raw,filename=/dev/sdb,file.pr-manager=mgr \
>  -device scsi-block,drive=hd
> 
> (the new parts are "-object pr-manager-helper" and "file.pr-manager").
> 
> I could just make it root:root and pass a file descriptor from libvirt
> to QEMU, but this would make it impossible for QEMU to reconnect to the
> daemon in case someone does a "systemctl restart" or even just kills it
> inadvertently.  The daemon is stateless, so transparent reconnection
> would be a nice feature to have.
> 
> The alternative is to somehow label the daemon socket so that it can be
> accessed by QEMU, but I'm not very well versed in SELinux.

Thinking more about it, Libvirt could start the daemon on its own,
assigning a socket per virtual machine.  SELinux MCS should then just
work, because the same category is assigned to the daemon instance and QEMU.

In particular, Libvirt could create the socket itself, label it, and
pass it to the daemon through the systemd socket activation protocol
(which qemu-pr-helper supports).

The XML to use the helper with a predefined socket could be:


   /path/to/unix.socket'


while to use it with a dedicated daemon


   /path/to/qemu-pr-helper


An empty  element can use a standard helper program
declared in qemu.conf (default /usr/libexec/qemu-pr-helper).

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list