Re: [Qemu-devel] [PATCH v2] Add argument filters to the seccomp sandbox

2015-09-30 Thread Paul Moore
On Monday, September 28, 2015 11:14:42 PM Namsun Ch'o wrote:
> > My understanding of the config file you proposed is that it would allow
> > the
> > configuration of a whitelist, so changes to the config very could result
> > in
> > *less* strict of a filter, not always more.
> 
> No. Any time an administrator wants a syscall that is not enabled in the
> sandbox, they either don't actually need it, or it's a bug and should be
> fixed. So all the config would do is add argument filters to syscalls which
> are already whitelisted.

If the syscall, without arguments, is already added to the whitelist then 
adding a new libseccomp rule to allow that same syscall with specific 
arguments will have no effect since a broader rule already exists in the 
filter.

> The alternative would be that the syscalls are given no further argument
> filtering. The config could only make the filteres more restrictive, never
> less.

I still don't see how this is the case, but it probably isn't worth arguing 
any further without some patches.
 
> Perhaps there could be a #define somewhere that toggles whether or not a
> syscall argument filter can be created for a syscall which is not in the
> built-in whitelist, otherwise it would throw an error saying that you cannot
> create an argument filter for a syscall that is not permitted.

I would argue you should never be able to add a syscall to the whitelist via a 
config file and/or command line option, but that is my opinion.

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH v2] Add argument filters to the seccomp sandbox

2015-09-28 Thread Paul Moore
On Monday, September 28, 2015 05:34:53 PM Namsun Ch'o wrote:
> > To be clear, I'm not suggesting "--enable-syscalls=foo,bar,...", what I'm
> > suggesting is a decomposition of the current filter list into blocks of
> > syscalls that are needed to enable specific functionality.  For example,
> > if you enable audio support at runtime a set of syscalls will be added to
> > the filter whitelist, if you enable a network device a different set of
> > syscalls will be added to the filter, and so on.
> > 
> > I think having an admin specified filter, either via a command line or
> > configuration file, is a step in the wrong direction.
> 
> How come? I think it is safer than forcing an admin to re-compile everything
> (which just won't happen in an enterprise environment).

For starters, I don't believe that a random administrator understands the QEMU 
code and the potential impact of any changes to such a config file as well as 
the QEMU developers.

> ... If any configuration file only increases the strictness of a syscall, I
> don't see the danger of an admin shooting themselves in the foot. Allowing
> an admin to decrease security would be a problem, but they can do -sandbox
> off anyway.

My understanding of the config file you proposed is that it would allow the 
configuration of a whitelist, so changes to the config very could result in 
*less* strict of a filter, not always more.

Also, while yes, allowing an admin to disable the sandbox via a command line 
switch does disable the seccomp filter, it is obvious that such a step is 
being taken.

> But if the dynamic sandbox is strict enough for each feature, it'd be great.

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH v2] Add argument filters to the seccomp sandbox

2015-09-28 Thread Paul Moore
On Saturday, September 26, 2015 01:06:57 AM Namsun Ch'o wrote:
> > I've suggested this in the past but to my knowledge no has done any work
> > in this direction, including myself. Despite the lack of progress, I still
> > think this is a very worthwhile idea.
> 
> Which is exactly why I think a configuration file would be the best option
> instead of --enable-syscalls=foo,bar,baz. It would allow someone to easily
> customize their policy without needing to create a patch, or wait on QEMU
> developers to do work on it.

To be clear, I'm not suggesting "--enable-syscalls=foo,bar,...", what I'm 
suggesting is a decomposition of the current filter list into blocks of 
syscalls that are needed to enable specific functionality.  For example, if 
you enable audio support at runtime a set of syscalls will be added to the 
filter whitelist, if you enable a network device a different set of syscalls 
will be added to the filter, and so on.

I think having an admin specified filter, either via a command line or 
configuration file, is a step in the wrong direction.

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH v2] Add argument filters to the seccomp sandbox

2015-09-25 Thread Paul Moore
On Friday, September 25, 2015 12:53:04 AM Namsun Ch'o wrote:
> Another idea which would fit in with the security model is to have a dynamic
> sandbox which enables syscalls and syscall filters based on what command
> line or config parameters are passed to QEMU on its first start.

I've suggested this in the past but to my knowledge no has done any work in 
this direction, including myself.  Despite the lack of progress, I still think 
this is a very worthwhile idea.

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH] Add a few argument filters to the seccomp sandbox

2015-09-11 Thread Paul Moore
On Thursday, September 10, 2015 03:48:52 PM Daniel P. Berrange wrote:
> On Wed, Sep 09, 2015 at 09:55:33PM -0400, namn...@safe-mail.net wrote:
> > +
> > +/* shmget */
> > +rc = seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(shmget), 2,
> > +SCMP_A0(SCMP_CMP_EQ, IPC_PRIVATE),
> > +SCMP_A2(SCMP_CMP_EQ, IP_CREAT|0777));
> 
> I'm not familiar with semantics of these seccomp rules, but is this
> saying that the second arg must be exactly equal to IP_CREAT|0777 ?

Yes.

> If the app passes IP_CREAT|0600, would that be permitted instead ?
> The latter is what I see gtk2 source code passing for mode.

It wouldn't match the rule as written above, if it doesn't match any other 
configured rules it would hit the default filter action.

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH for-2.3] Revert seccomp tests that allow it to be used on non-x86 architectures

2015-07-01 Thread Paul Moore
On Wednesday, July 01, 2015 02:07:49 PM Andrew Jones wrote:
 On Tue, Jun 30, 2015 at 01:18:49PM -0400, Paul Moore wrote:
  On Tuesday, June 30, 2015 06:07:40 PM Peter Maydell wrote:
   On 30 June 2015 at 18:01, Paul Moore pmo...@redhat.com wrote:
I'm starting to wonder if the 32-bit ARM build system didn't have
__NR_cacheflush defined in the system headers; that might explain some
of
the behavior.  Could you check your system to see if it has
__NR_cacheflush defined (try /usr/include/asm/unistd.h)?
   
   The constant name is __ARM_NR_cacheflush, not __NR_cacheflush
   (all the ARM-specific syscalls are __ARM_NR_*). See
   http://lxr.free-electrons.com/source/arch/arm/include/uapi/asm/unistd.h#
   L418 
  /me smacks his forehead
  
  Of course it is.  We already work around that in arch-syscall-validate. 
  D'oh!
  
  Good news though, I think we just found the bug ;)
  
  I'm currently trying to put out another fire in a different project; as
  soon as I've got that done I'll fix this.  However, if somebody wants to
  play, I'm always happy to accept patches :)
 
 Sent: https://groups.google.com/forum/#!topic/libseccomp/RD9RTmc2Lxo

Applied, thanks.

 I'll send the patch for qemu to add cacheflush to the whitelist shortly.

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH for-2.3] Revert seccomp tests that allow it to be used on non-x86 architectures

2015-06-30 Thread Paul Moore
On Tuesday, June 30, 2015 10:39:34 AM Andrew Jones wrote:
 On Mon, Jun 29, 2015 at 04:24:55PM -0400, Paul Moore wrote:
  Hmm, so either the kernel is screwing up with the seccomp filter for this
  particular syscall (unlikely) or libseccomp is screwing up the filter
  creation (more likely).  I don't have an ARM system handy at the moment,
  but could you use the seccomp_export_pfc() and seccomp_export_bpf()
  functions to dump the PFC/BPF filter code to a file and send it out?
 
 Attached

Thanks.

 I looked at the pfc file and compared all the syscalls in it vs.  the list
 in qemu-seccomp.c. The pfc file is missing cacheflush, and has an 'UNKNOWN'
 instead.

Yeah, the associated syscall number is showing -10104 which is the pseudo 
syscall number for architectures that don't support cacheflush().  That is 
obviously wrong

 Also, I think there may be another problem with the filter (or pfc)
 generation. Several of the syscalls have weird syscall numbers. For
 example, I would expect mmap to be 90, but instead it's -10181.

That is intentional.  According to the Linux kernel sources mmap() isn't 
defined for 32-bit ARM EABI so you are seeing libseccomp's pseudo syscall 
number for mmap().

 And, since there was something weird, and not related to cacheflush, in the
 arm32 pfc, I decided to check it on my mustang too. The output there gets
 cacheflush for the name instead of UNKNOWN, but has the same weird
 number (-10104) that the midway has. It also has several other weird
 numbers. The output from the mustang is in the attached tarball as well.

I would expect aarch64 to have a pseudo syscall number for cacheflush() as 
that syscall isn't defined for 64-bit ARM.  What I don't understand is why 
libseccomp doesn't recognize cacheflush() in this particular case.

I'm starting to wonder if the 32-bit ARM build system didn't have 
__NR_cacheflush defined in the system headers; that might explain some of the 
behavior.  Could you check your system to see if it has __NR_cacheflush 
defined (try /usr/include/asm/unistd.h)?  If it does, could you try rebuilding 
the libseccomp package on your system and seeing if it resolves the problem?

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH for-2.3] Revert seccomp tests that allow it to be used on non-x86 architectures

2015-06-30 Thread Paul Moore
On Tuesday, June 30, 2015 06:07:40 PM Peter Maydell wrote:
 On 30 June 2015 at 18:01, Paul Moore pmo...@redhat.com wrote:
  I'm starting to wonder if the 32-bit ARM build system didn't have
  __NR_cacheflush defined in the system headers; that might explain some of
  the behavior.  Could you check your system to see if it has
  __NR_cacheflush defined (try /usr/include/asm/unistd.h)?
 
 The constant name is __ARM_NR_cacheflush, not __NR_cacheflush
 (all the ARM-specific syscalls are __ARM_NR_*). See
 http://lxr.free-electrons.com/source/arch/arm/include/uapi/asm/unistd.h#L418

/me smacks his forehead

Of course it is.  We already work around that in arch-syscall-validate.  D'oh!

Good news though, I think we just found the bug ;)

I'm currently trying to put out another fire in a different project; as soon 
as I've got that done I'll fix this.  However, if somebody wants to play, I'm 
always happy to accept patches :)

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH for-2.3] Revert seccomp tests that allow it to be used on non-x86 architectures

2015-06-29 Thread Paul Moore
On Monday, June 29, 2015 09:50:17 AM Andrew Jones wrote:
 On Fri, Jun 26, 2015 at 04:26:22PM -0400, Paul Moore wrote:
  Perhaps a stupid question, but you did verify that it is cacheflush that
  is causing the problem?  The seccomp filter code will emit a message to
  syslog or the audit log, depending on your configuration, with the
  syscall number.
 
   #./tools/scmp_sys_resolver -a arm cacheflush
   983042
   #./tools/scmp_sys_resolver -a arm 983042
 
 I hadn't before (didn't know about the logging). I had determined the
 problem by running qemu in gdb. I just checked now though and confirmed
 it
 
 type=SECCOMP msg=audit(1435563996.731:2032): auid=1001 uid=1001 gid=1001
 ses=157 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
 pid=27059 comm=qemu-system-arm
 exe=/home/drjones/code/qemu/arm-softmmu/qemu-system-arm sig=31
 arch=4028 syscall=983042 compat=0 ip=0xb6b43164 code=0x0
 
 This log was generated even with the above patch applied to qemu.

The only thing that comes to mind quickly is that the cacheflush() call is 
being done by a thread that was created before the seccomp filter was loaded 
into the kernel; although I believe you said you already checked that.

If you are using a recent kernel and libseccomp you can try enabling the 
SCMP_FLTATR_CTL_TSYNC attribute to apply the filter to all running threads in 
the process.

rc = seccomp_attr_set(ctx, SCMP_FLTATR_CTL_TSYNC, 1);
if (rc)
/* error */

... although that may have unintended consequences since threads which were 
never filtered are not getting caught up in the seccomp filter.  Although, the 
current QEMU seccomp filter is so permissive it might not be a real concern.

Anyway, (double) check the thread creation and seccomp_load() ordering.

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH for-2.3] Revert seccomp tests that allow it to be used on non-x86 architectures

2015-06-29 Thread Paul Moore
On Monday, June 29, 2015 07:47:29 PM Andrew Jones wrote:
 On Mon, Jun 29, 2015 at 10:53:14AM -0400, Paul Moore wrote:
  On Monday, June 29, 2015 09:50:17 AM Andrew Jones wrote:
   On Fri, Jun 26, 2015 at 04:26:22PM -0400, Paul Moore wrote:
Perhaps a stupid question, but you did verify that it is cacheflush
that
is causing the problem?  The seccomp filter code will emit a message
to
syslog or the audit log, depending on your configuration, with the
syscall number.

 #./tools/scmp_sys_resolver -a arm cacheflush
 983042
 #./tools/scmp_sys_resolver -a arm 983042
   
   I hadn't before (didn't know about the logging). I had determined the
   problem by running qemu in gdb. I just checked now though and confirmed
   it
   
   type=SECCOMP msg=audit(1435563996.731:2032): auid=1001 uid=1001 gid=1001
   ses=157 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
   pid=27059 comm=qemu-system-arm
   exe=/home/drjones/code/qemu/arm-softmmu/qemu-system-arm sig=31
   arch=4028 syscall=983042 compat=0 ip=0xb6b43164 code=0x0
   
   This log was generated even with the above patch applied to qemu.
  
  The only thing that comes to mind quickly is that the cacheflush() call is
  being done by a thread that was created before the seccomp filter was
  loaded into the kernel; although I believe you said you already checked
  that.

 Nope, I hadn't, but I have now ...

Actually, never mind on that, I was being stupid.  If it was a different 
thread it wouldn't be impacted by the seccomp filter at all ...

 ... So we're calling __clear_cache from the same thread that called
 seccomp_start, and that thread dies the moment it calls the syscall.
 No other threads except id(2) at this time, which appears to be
 something created by __libc_start_main before main() runs.

Hmm, so either the kernel is screwing up with the seccomp filter for this 
particular syscall (unlikely) or libseccomp is screwing up the filter creation 
(more likely).  I don't have an ARM system handy at the moment, but could you 
use the seccomp_export_pfc() and seccomp_export_bpf() functions to dump the 
PFC/BPF filter code to a file and send it out?

  If you are using a recent kernel and libseccomp you can try enabling the
  SCMP_FLTATR_CTL_TSYNC attribute to apply the filter to all running threads
  in the process.
  
  rc = seccomp_attr_set(ctx, SCMP_FLTATR_CTL_TSYNC, 1);
  if (rc)
  
  /* error */
 
 I tried this, but it error'ed out with rc == -95 (EOPNOTSUPP ?)
 My kernel version is 4.0.5-200.fc21.armv7hl+lpae

That should be a recent enough kernel, but perhaps your version of libseccomp 
was built against an older version of the kernel that didn't have the 
necessary support (and it was disabled at compile time)?

-- 
paul moore
security @ redhat




Re: [Qemu-devel] [PATCH for-2.3] Revert seccomp tests that allow it to be used on non-x86 architectures

2015-06-26 Thread Paul Moore
On Friday, June 26, 2015 06:03:18 PM Andrew Jones wrote:
 On Tue, Jun 16, 2015 at 02:16:03PM +0100, Peter Maydell wrote:
  On 16 June 2015 at 14:12, Andrew Jones drjo...@redhat.com wrote:
   Can we now revert this revert, along with bumping the non-x86 arch
   atleast-version to v2.2.1
  
  Probably. I suggest you submit a patch and test it on the
  relevant architectures and seccomp versions.
 
 I don't see any problems with the light testing (booting a guest)
 I've done on my mustang, but AArch64 worked with libseccomp 2.2.0
 too. So I dusted off my Midway (updated to Fedora 21 that has
 libseccomp 2.2.1 packaged), and gave it a try, but unfortunately
 it still doesn't work...
 
 I found that we needed to add another syscall to the whitelist;
 the arm-private 'cacheflush', as it's used by __builtin___clear_cache.
 And, from libseccomp's git history it appears that syscall is known
 
 commit a710a2d246bdc73ba77e3ff5624e790688cc51fd
 Author: Paul Moore pmo...@redhat.com
 Date:   Wed May 6 12:05:45 2015 -0400
 
 arm: add some missing syscalls
 
 Add the following syscalls to the ARM arch/ABI and update the syscall
 validation script.
 
  * breakpoint()
  * cacheflush()
  * usr26()
  * usr32()
  * set_tls()
 
 Reported-by: Purcareata Bogdan b43...@freescale.com
 Signed-off-by: Paul Moore pmo...@redhat.com
 
 
 And also appears to be in 2.2.1
 $ git describe a710a2d246bdc73ba77e3ff5624e790688cc51fd
 v2.2.0-10-ga710a2d246bdc
 
 However the qemu thread that makes that syscall still dies, even
 with this patch
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index f9de0d3390feb..33644a4e3c3d3 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -237,7 +237,8 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = {
  { SCMP_SYS(fadvise64), 240 },
  { SCMP_SYS(inotify_init1), 240 },
  { SCMP_SYS(inotify_add_watch), 240 },
 -{ SCMP_SYS(mbind), 240 }
 +{ SCMP_SYS(mbind), 240 },
 +{ SCMP_SYS(cacheflush), 240 },
  };
 
  int seccomp_start(void)
 
 
 Paul, can you help me figure out what I'm missing?

Perhaps a stupid question, but you did verify that it is cacheflush that is 
causing the problem?  The seccomp filter code will emit a message to syslog or 
the audit log, depending on your configuration, with the syscall number.

 #./tools/scmp_sys_resolver -a arm cacheflush
 983042
 #./tools/scmp_sys_resolver -a arm 983042

-- 
paul moore
security @ redhat




Re: [Qemu-devel] seccomp breakage on arm

2015-04-10 Thread Paul Moore
On Friday, April 10, 2015 05:40:40 PM Andreas Färber wrote:
 My main concern - that you apparently misunderstood - was whether this
 is a QEMU or a libseccomp issue. If it's on libseccomp's side then it's
 less urgent for QEMU and any new configure checks are just candy IMO.

My opinion is that the 32-bit ARM build issue you described is a libseccomp 
bug (see the bug fix I sent a few hours ago); QEMU's usage of libseccomp is 
perfectly reasonable.  That said, the libseccomp bug clearly affected QEMU in 
a bad way.  Making matters worse is that the problem wasn't caught until late 
in the QEMU release process, nobody likes surprises like this.

We've corrected the issue in libseccomp, and it looks like the QEMU folks are 
reverting ARM/seccomp support so I think we've resolved this for the immediate 
future.  Long term I think the libseccomp project can look at adding some 
additional automated tests so these issues will be caught at build/packaging 
time, further, I think the QEMU project can restore ARM/seccomp support in the 
next release and look at its own testing procedures to understand why this 
issue wasn't caught earlier as well.

-- 
paul moore
security @ redhat




Re: [Qemu-devel] seccomp breakage on arm

2015-04-09 Thread Paul Moore
On Thursday, April 09, 2015 10:21:52 AM Eduardo Otubo wrote:
 On Thu, Apr 09, 2015 at 05=01=31AM +0200, Andreas Färber wrote:
  Hello,
  
  I am seeing the following build failure on openSUSE Tumbleweed armv7l
  with --enable-seccomp in v2.3.0-rc2:
  
  [  551s] In file included from qemu-seccomp.c:16:0:
  [  551s] /usr/include/libseccomp/seccomp.h:177:23: error: '__NR_mmap'
  undeclared here (not in a function)
  [  551s]  #define SCMP_SYS(x)  (__NR_##x)
  [  551s]^
  [  551s] qemu-seccomp.c:36:7: note: in expansion of macro 'SCMP_SYS'
  [  551s]  { SCMP_SYS(mmap), 247 },
  [  551s]^
  [  551s] /usr/include/libseccomp/seccomp.h:177:23: error:
  '__NR_getrlimit' undeclared here (not in a function)
  [  551s]  #define SCMP_SYS(x)  (__NR_##x)
  [  551s]^
  [  551s] qemu-seccomp.c:57:7: note: in expansion of macro 'SCMP_SYS'
  [  551s]  { SCMP_SYS(getrlimit), 245 },
  [  551s]^
  [  551s] /home/abuild/rpmbuild/BUILD/qemu-2.3.0-rc2/rules.mak:57: recipe
  for target 'qemu-seccomp.o' failed
  [  551s] make: *** [qemu-seccomp.o] Error 1
  
  Is this a problem with libseccomp 2.2.0 / master and needs to be fixed
  in the library? Or do we need to #ifdef some syscalls in qemu-seccomp.c?
 
 This should be already fixed in the library as mentioned by the
 maintainer in this[0] thread. Adding Paul Moore in CC. There's also a
 bug entry on launchpad[1] for that. I provided the patch (before the
 pull reuqest) requesting for some review and testing but never heard
 back again. Also CC'ing Karl-Philipp Richter (bug owner) for some
 opinions on that as well.
 
 Regards,
 
 [0] http://sourceforge.net/p/libseccomp/mailman/message/32955831/
 [1] https://bugs.launchpad.net/qemu/+bug/1363641

This should be fixed with libseccomp v2.2.0; if you are still seeing problems 
using v2.2.0 let me know.

-- 
paul moore
security @ redhat




Re: [Qemu-devel] seccomp breakage on arm

2015-04-09 Thread Paul Moore
On Thursday, April 09, 2015 02:28:24 PM Andreas Färber wrote:
 Am 09.04.2015 um 11:10 schrieb Paul Moore:
  On Thursday, April 09, 2015 10:21:52 AM Eduardo Otubo wrote:
  On Thu, Apr 09, 2015 at 05=01=31AM +0200, Andreas Färber wrote:
  Hello,
  
  I am seeing the following build failure on openSUSE Tumbleweed armv7l
  with --enable-seccomp in v2.3.0-rc2:
  
  [  551s] In file included from qemu-seccomp.c:16:0:
  [  551s] /usr/include/libseccomp/seccomp.h:177:23: error: '__NR_mmap'
  undeclared here (not in a function)
  [  551s]  #define SCMP_SYS(x)  (__NR_##x)
  [  551s]^
  [  551s] qemu-seccomp.c:36:7: note: in expansion of macro 'SCMP_SYS'
  [  551s]  { SCMP_SYS(mmap), 247 },
  [  551s]^
  [  551s] /usr/include/libseccomp/seccomp.h:177:23: error:
  '__NR_getrlimit' undeclared here (not in a function)
  [  551s]  #define SCMP_SYS(x)  (__NR_##x)
  [  551s]^
  [  551s] qemu-seccomp.c:57:7: note: in expansion of macro 'SCMP_SYS'
  [  551s]  { SCMP_SYS(getrlimit), 245 },
  [  551s]^
  [  551s] /home/abuild/rpmbuild/BUILD/qemu-2.3.0-rc2/rules.mak:57: recipe
  for target 'qemu-seccomp.o' failed
  [  551s] make: *** [qemu-seccomp.o] Error 1
  
  Is this a problem with libseccomp 2.2.0 / master and needs to be fixed
  in the library? Or do we need to #ifdef some syscalls in qemu-seccomp.c?
  
  This should be already fixed in the library as mentioned by the
  maintainer in this[0] thread. Adding Paul Moore in CC. There's also a
  bug entry on launchpad[1] for that. I provided the patch (before the
  pull reuqest) requesting for some review and testing but never heard
  back again. Also CC'ing Karl-Philipp Richter (bug owner) for some
  opinions on that as well.
  
  Regards,
  
  [0] http://sourceforge.net/p/libseccomp/mailman/message/32955831/
  [1] https://bugs.launchpad.net/qemu/+bug/1363641
  
  This should be fixed with libseccomp v2.2.0; if you are still seeing
  problems using v2.2.0 let me know.
 
 This *is* with libseccomp v2.2.0, as mentioned above, and I had checked
 that there were no related changes beyond v2.2.0 on your master branch.

I saw were you were *asking* if this was a problem with libseccomp v2.2.0, not 
stating that you were seeing a problem with v2.2.0; I interpreted your 
comments as running a version of libseccomp  v2.2.0 and you were asking if 
the problem had been fixed before your upgraded your copy of libseccomp.

Regardless, I think I see what the problem is, and if I'm correct it affects 
time, umount, stime, alarm, utime, getrlimit, select, readdir, mmap, 
socketcall, syscall, and ipc.  I'm traveling at the moment so a patch may be a 
bit delayed, but I'll be sure to CC you on the fix in case you are able to do 
some testing.

Thanks for the report, I'm sorry about the initial confusion.

-Paul

-- 
paul moore
security @ redhat




[Qemu-devel] [PATCH] seccomp: add mbind() to the syscall whitelist

2014-12-17 Thread Paul Moore
The memory-backend-ram QOM object utilizes the mbind(2) syscall to
set the policy for a memory range.  Add the syscall to the seccomp
sandbox whitelist.

Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index af6a375..b0c6269 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -235,7 +235,8 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(fallocate), 240 },
 { SCMP_SYS(fadvise64), 240 },
 { SCMP_SYS(inotify_init1), 240 },
-{ SCMP_SYS(inotify_add_watch), 240 }
+{ SCMP_SYS(inotify_add_watch), 240 },
+{ SCMP_SYS(mbind), 240 }
 };
 
 int seccomp_start(void)




Re: [Qemu-devel] [PATCHv3] seccomp: change configure to avoid arm 32 to break

2014-11-07 Thread Paul Moore
On Friday, November 07, 2014 10:05:44 AM Eduardo Otubo wrote:
 Current stable version of libseccomp (2.1.1) only supports i386 and
 x86_64 archs correctly. This patch limits the usage of the syscall
 filter for those archs and updates to the correct last version of
 libseccomp.
 
 This patch also fixes the bug:
 https://bugs.launchpad.net/qemu/+bug/1363641
 
 Signed-off-by: Eduardo Otubo eduardo.ot...@profitbricks.com
 ---
  configure | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

Thanks Eduardo, I'll let you know once I've cut a new release of libseccomp.

Acked-by: Paul Moore pmo...@redhat.com

 diff --git a/configure b/configure
 index 2f17bf3..47048f0 100755
 --- a/configure
 +++ b/configure
 @@ -1823,7 +1823,8 @@ fi
  # libseccomp check
 
  if test $seccomp != no ; then
 -if $pkg_config --atleast-version=2.1.0 libseccomp; then
 +if test $cpu = i386 || test $cpu = x86_64 
 +$pkg_config --atleast-version=2.1.1 libseccomp; then
  libs_softmmu=$libs_softmmu `$pkg_config --libs libseccomp`
  QEMU_CFLAGS=$QEMU_CFLAGS `$pkg_config --cflags libseccomp`
   seccomp=yes

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: change configure to avoid arm 32 to break

2014-11-06 Thread Paul Moore
On Thursday, November 06, 2014 03:49:18 PM Eduardo Otubo wrote:
 Right now seccomp is breaking the compilation of Qemu on armv7l due
 to libsecomp current lack of support for this arch. This problem is
 already fixed on libseccomp upstream but no release date for that is
 scheduled to far. This patch disables support for seccomp on armv7l
 temporarily until libseccomp does a new release. Then I'll remove the
 hack and update libseccomp dependency on configure script.
 
 Related bug: https://bugs.launchpad.net/qemu/+bug/1363641
 
 Signed-off-by: Eduardo Otubo eduardo.ot...@profitbricks.com
 ---
  configure | 20 +++-
  1 file changed, 11 insertions(+), 9 deletions(-)
 
 diff --git a/configure b/configure
 index 2f17bf3..16fd7f5 100755
 --- a/configure
 +++ b/configure
 @@ -1823,15 +1823,17 @@ fi
  # libseccomp check
 
  if test $seccomp != no ; then
 -if $pkg_config --atleast-version=2.1.0 libseccomp; then
 -libs_softmmu=$libs_softmmu `$pkg_config --libs libseccomp`
 -QEMU_CFLAGS=$QEMU_CFLAGS `$pkg_config --cflags libseccomp`
 - seccomp=yes
 -else
 - if test $seccomp = yes; then
 -feature_not_found libseccomp Install libseccomp devel =
 2.1.0 -  fi
 - seccomp=no
 +if test $cpu = i386 || test $cpu = x86_64; then
 +if $pkg_config --atleast-version=2.1.0 libseccomp; then
 +libs_softmmu=$libs_softmmu `$pkg_config --libs libseccomp`
 +QEMU_CFLAGS=$QEMU_CFLAGS `$pkg_config --cflags libseccomp`
 +seccomp=yes
 +else
 +if test $seccomp = yes; then
 +feature_not_found libseccomp Install libseccomp devel =
 2.1.0 +fi
 +seccomp=no
 +fi
  fi
  fi
  ##

Also, note the current release of libseccomp is v2.1.1 which has a number of 
bug fixes on top of v2.1.0.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: change configure to avoid arm 32 to break

2014-11-06 Thread Paul Moore
On Thursday, November 06, 2014 10:24:41 AM Eduardo Otubo wrote:
 Paul, do you have any plans for a new libseccomp release?

Yes, I have plans.  Do I have a date, no. ;)

I was trying to sync up with support for a new ABI, but I believe that was 
pushed to v3.19.  I'll need to check on things, but it may be worth doing a 
new release in the meantime.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: change configure to avoid arm 32 to break

2014-11-06 Thread Paul Moore
On Thursday, November 06, 2014 05:36:04 PM Eduardo Otubo wrote:
 On Thu, Nov 06, 2014 at 11:22:16AM -0500, Paul Moore wrote:
  On Thursday, November 06, 2014 03:49:18 PM Eduardo Otubo wrote:
   Right now seccomp is breaking the compilation of Qemu on armv7l due
   to libsecomp current lack of support for this arch. This problem is
   already fixed on libseccomp upstream but no release date for that is
   scheduled to far. This patch disables support for seccomp on armv7l
   temporarily until libseccomp does a new release. Then I'll remove the
   hack and update libseccomp dependency on configure script.
   
   Related bug: https://bugs.launchpad.net/qemu/+bug/1363641
   
   Signed-off-by: Eduardo Otubo eduardo.ot...@profitbricks.com
   ---
   
configure | 20 +++-
1 file changed, 11 insertions(+), 9 deletions(-)
   
   diff --git a/configure b/configure
   index 2f17bf3..16fd7f5 100755
   --- a/configure
   +++ b/configure
   @@ -1823,15 +1823,17 @@ fi
   
# libseccomp check

if test $seccomp != no ; then
   
   -if $pkg_config --atleast-version=2.1.0 libseccomp; then
   -libs_softmmu=$libs_softmmu `$pkg_config --libs libseccomp`
   -QEMU_CFLAGS=$QEMU_CFLAGS `$pkg_config --cflags libseccomp`
   - seccomp=yes
   -else
   - if test $seccomp = yes; then
   -feature_not_found libseccomp Install libseccomp devel =
   2.1.0 -  fi
   - seccomp=no
   +if test $cpu = i386 || test $cpu = x86_64; then
   +if $pkg_config --atleast-version=2.1.0 libseccomp; then
   +libs_softmmu=$libs_softmmu `$pkg_config --libs
   libseccomp`
   +QEMU_CFLAGS=$QEMU_CFLAGS `$pkg_config --cflags
   libseccomp`
   +seccomp=yes
   +else
   +if test $seccomp = yes; then
   +feature_not_found libseccomp Install libseccomp
   devel = 2.1.0 +fi
   +seccomp=no
   +fi
   
fi

fi
##
  
  Also, note the current release of libseccomp is v2.1.1 which has a number
  of bug fixes on top of v2.1.0.
 
 Does that applies to the distros package version?

Well, I can't speak for all distros, but I always recommend the latest bug-fix 
version for obvious reasons.  While I do control the libseccomp package for 
some distributions, I don't control them all.  I've got enough to worry about, 
I'll let others worry about packaging :)

 I'm running Ubuntu 14.04 and it's still 2.1.0. A regular user would have to
 download and install from scratch in order to build Qemu, then.

I would recommend filing a request for Debian/Ubuntu to package the latest 
libseccomp; v2.1.1 is over a year old at this point.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: change configure to avoid arm 32 to break

2014-11-05 Thread Paul Moore
On Wednesday, November 05, 2014 05:08:20 PM Peter Maydell wrote:
 On 5 November 2014 16:47, Eduardo Otubo wrote:
  Right now seccomp is breaking the compilation of Qemu on armv7l due
  to libsecomp current lack of support for this arch. This problem is
  already fixed on libseccomp upstream but no release date for that is
  scheduled to far. This patch disables support for seccomp on armv7l
  temporarily until libseccomp does a new release. Then I'll remove the
  hack and update libseccomp dependency on configure script.
  
  Related bug: https://bugs.launchpad.net/qemu/+bug/1363641

...

 (How are upstream proposing to fix this anyway? I couldn't
 figure that out from the mailing list thread.)

The problem was that the released version of libseccomp has some holes in 
the internal syscall table for 32-bit ARM with respect to all of the other 
supported architectures.  The current libseccomp upstream has some additional 
tooling and checks to ensure that the different ABI syscall tables are kept in 
sync to prevent something like this from happening in the future.

I'm more than happy to discuss how libseccomp handles the different 
architectures, but that's probably a bit off-topic for this thread.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: change configure to avoid arm 32 to break

2014-11-05 Thread Paul Moore
On Wednesday, November 05, 2014 08:08:06 PM Peter Maydell wrote:
 On 5 November 2014 19:46, Paul Moore pmo...@redhat.com wrote:
  On Wednesday, November 05, 2014 05:08:20 PM Peter Maydell wrote:
  On 5 November 2014 16:47, Eduardo Otubo wrote:
   Right now seccomp is breaking the compilation of Qemu on armv7l due
   to libsecomp current lack of support for this arch. This problem is
   already fixed on libseccomp upstream but no release date for that is
   scheduled to far. This patch disables support for seccomp on armv7l
   temporarily until libseccomp does a new release. Then I'll remove the
   hack and update libseccomp dependency on configure script.
   
   Related bug: https://bugs.launchpad.net/qemu/+bug/1363641
  
  ...
  
  (How are upstream proposing to fix this anyway? I couldn't
  figure that out from the mailing list thread.)
  
  The problem was that the released version of libseccomp has some holes
  in
  the internal syscall table for 32-bit ARM with respect to all of the other
  supported architectures.  The current libseccomp upstream has some
  additional tooling and checks to ensure that the different ABI syscall
  tables are kept in sync to prevent something like this from happening in
  the future.
 
 OK. So should we make QEMU say if x86_64 or i386, require
 seccomp 2.1 or better, else require 2.2 or better?

I would probably just limit QEMU/seccomp to x86_64 and x86.  Once we have the 
new release that fixes everything we can start worrying about versions and 
different ABIs.

 If our current source will build with seccomp 2.2 then that seems like a
 better check to put in our configure script than a simple disabling of
 the functionality on ARM hosts; it means that if distros end up
 with QEMU 2.2 plus seccomp 2.2 the functionality won't be
 unnecessarily disabled. (Please correct me if I have your
 next-release numbering wrong!)

Well, technically we don't have libseccomp v2.2 yet so I can't say for certain 
what it will look like and how it will behave.

  I'm more than happy to discuss how libseccomp handles the different
  architectures, but that's probably a bit off-topic for this thread.
 
 I guess the only thing that matters for us is that there wasn't
 an API break required for the fix.

Nope, the API is solid, just some internal fixes.

-- 
paul moore
security and virtualization @ redhat




[Qemu-devel] [PATCH] seccomp: add semctl() to the syscall whitelist

2014-08-14 Thread Paul Moore
QEMU needs to call semctl() for correct operation.  This particular
problem was identified on shutdown with the following commandline:

 # qemu -sandbox on -monitor stdio \
   -device intel-hda -device hda-duplex -vnc :0

Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index ea8094d..0503764 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -230,7 +230,8 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(timerfd_create), 240 },
 { SCMP_SYS(shmctl), 240 },
 { SCMP_SYS(mlock), 240 },
-{ SCMP_SYS(munlock), 240 }
+{ SCMP_SYS(munlock), 240 },
+{ SCMP_SYS(semctl), 240 }
 };
 
 int seccomp_start(void)




Re: [Qemu-devel] [PULL 00/02] seccomp: adding new syscalls to the whitelist

2014-04-28 Thread Paul Moore
On Sunday, April 27, 2014 11:10:50 AM Paolo Bonzini wrote:
 Il 14/04/2014 16:47, Paul Moore ha scritto:
   Yes. Also the commits don't have your signed-off-by:
   so I can't apply it.
  
  Eduardo?
  
  It is absurd that we have had two fixes held up this long for such silly
  things.
 
 It's not silly, see http://lwn.net/Articles/592503/ for more information.

The contributor agreement isn't silly.  What is silly, and frustrating, are 
these stupid delays between comments/respin/etc.  If you're trying to make 
some point about the contributor agreements, sign-offs, etc. please make it 
elsewhere.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PULL 00/02] seccomp: adding new syscalls to the whitelist

2014-04-14 Thread Paul Moore
On Thursday, April 03, 2014 12:14:21 PM Peter Maydell wrote:
 On 2 April 2014 16:53, Paolo Bonzini pbonz...@redhat.com wrote:
  Il 01/04/2014 15:06, Eduardo Otubo ha scritto:
  Not sure why it didn't get upstream yet.
  Anthony, Peter, could you take a closer look at this?
  
  Peter filters on for you to fetch changes up to and your git didn't
  include it. :)
 
 Yes. Also the commits don't have your signed-off-by:
 so I can't apply it.

Eduardo?

It is absurd that we have had two fixes held up this long for such silly 
things.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PULL 00/02] seccomp: adding new syscalls to the whitelist

2014-03-24 Thread Paul Moore
On Thursday, March 13, 2014 10:42:42 AM Eduardo Otubo wrote:
 The following changes since commit 750036a848ea913ba6343718ffa70da98f7eef6b:
 
   Merge remote-tracking branch 'remotes/afaerber/tags/prep-for-upstream'
 into staging (2014-03-12 17:53:37 +)
 
 are available in the git repository at:
 
   git://github.com/otubo/qemu.git seccomp
 
 Felix Geyer (1):
   seccomp: add timerfd_create and timerfd_settime to the whitelist
 
 Paul Moore (1):
   seccomp: add shmctl(), mlock(), and munlock() to the syscall whitelist
 
  qemu-seccomp.c |7 ++-
  1 files changed, 6 insertions(+), 1 deletions(-)

Notice this still hasn't made it upstream and thought I would check to see 
where things stood ...

-- 
paul moore
security and virtualization @ redhat




[Qemu-devel] [PATCH] seccomp: add getrusage() to the syscall whitelist for Open vSwitch

2014-03-06 Thread Paul Moore
When QEMU is used with Open vSwitch it is common to create netdev
script and downscript scripts that use the ovs-vsctl command to manage
the underlying network devices.  Unfortunately, ovs-vsctl calls the
getrusage() syscall which is not currently present in the QEMU/seccomp
whistelist.

Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index caa926e..86210a4 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -225,7 +225,8 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(fchmod), 240 },
 { SCMP_SYS(shmget), 240 },
 { SCMP_SYS(shmat), 240 },
-{ SCMP_SYS(shmdt), 240 }
+{ SCMP_SYS(shmdt), 240 },
+{ SCMP_SYS(getrusage), 240 }
 };
 
 int seccomp_start(void)




Re: [Qemu-devel] [PATCH] seccomp: add shmctl(), mlock(), and munlock() to the syscall whitelist

2014-03-05 Thread Paul Moore
On Wednesday, March 05, 2014 11:53:58 AM Eduardo Otubo wrote:
 On 03/03/2014 05:41 PM, Paul Moore wrote:
  On Wednesday, February 26, 2014 10:25:01 AM Paul Moore wrote:
  Additional testing reveals that PulseAudio requires shmctl() and the
  mlock()/munlock() syscalls on some systems/configurations.  As before,
  on systems that do require these syscalls, the problem can be seen with
  
  the following command line:
 # qemu -monitor stdio  -sandbox on \
 
-device intel-hda -device hda-duplex
  
  Signed-off-by: Paul Moore pmo...@redhat.com
  ---
  
qemu-seccomp.c |5 -
1 file changed, 4 insertions(+), 1 deletion(-)
  
  diff --git a/qemu-seccomp.c b/qemu-seccomp.c
  index caa926e..3db1e9b 100644
  --- a/qemu-seccomp.c
  +++ b/qemu-seccomp.c
  @@ -225,7 +225,10 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[] = { { SCMP_SYS(fchmod), 240 },
  
{ SCMP_SYS(shmget), 240 },
{ SCMP_SYS(shmat), 240 },
  
  -{ SCMP_SYS(shmdt), 240 }
  +{ SCMP_SYS(shmdt), 240 },
  +{ SCMP_SYS(shmctl), 240 },
  +{ SCMP_SYS(mlock), 240 },
  +{ SCMP_SYS(munlock), 240 }
  
};

int seccomp_start(void)
  
  Bump to bring this back the forefront of everyone's minds.  Can we get
  this merged?
 
 Sorry for the late review, I've been hit by a terrible tendinitis and
 was unable to use a computer for the last week. :(

Hopefully nothing that isn't too unbearable and unmanageable.  Tendinitis can 
be a real pain.

 I ACK this patch and I'll create a pull request by friday EOD if nothing
 else comes up.

Great, thank you.  I've run across one more issue related to Open vSwitch and 
getrusage(); I'll try to get that fixed/tested before Friday, but don't let 
that hold up this fix.

On a somewhat related note, have you done any work on the dynamic, per-feature 
whitelist?  I'm thinking of doing some work to put together a basic framework 
for this, but I didn't want to waste time on a duplicated effort ...

-Paul

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: add shmctl(), mlock(), and munlock() to the syscall whitelist

2014-03-03 Thread Paul Moore
On Wednesday, February 26, 2014 10:25:01 AM Paul Moore wrote:
 Additional testing reveals that PulseAudio requires shmctl() and the
 mlock()/munlock() syscalls on some systems/configurations.  As before,
 on systems that do require these syscalls, the problem can be seen with
 the following command line:
 
   # qemu -monitor stdio  -sandbox on \
  -device intel-hda -device hda-duplex
 
 Signed-off-by: Paul Moore pmo...@redhat.com
 ---
  qemu-seccomp.c |5 -
  1 file changed, 4 insertions(+), 1 deletion(-)
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index caa926e..3db1e9b 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -225,7 +225,10 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(fchmod), 240 },
  { SCMP_SYS(shmget), 240 },
  { SCMP_SYS(shmat), 240 },
 -{ SCMP_SYS(shmdt), 240 }
 +{ SCMP_SYS(shmdt), 240 },
 +{ SCMP_SYS(shmctl), 240 },
 +{ SCMP_SYS(mlock), 240 },
 +{ SCMP_SYS(munlock), 240 }
  };
 
  int seccomp_start(void)

Bump to bring this back the forefront of everyone's minds.  Can we get this 
merged?

-- 
paul moore
security and virtualization @ redhat




[Qemu-devel] [PATCH] seccomp: add shmctl(), mlock(), and munlock() to the syscall whitelist

2014-02-26 Thread Paul Moore
Additional testing reveals that PulseAudio requires shmctl() and the
mlock()/munlock() syscalls on some systems/configurations.  As before,
on systems that do require these syscalls, the problem can be seen with
the following command line:

  # qemu -monitor stdio  -sandbox on \
 -device intel-hda -device hda-duplex

Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index caa926e..3db1e9b 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -225,7 +225,10 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(fchmod), 240 },
 { SCMP_SYS(shmget), 240 },
 { SCMP_SYS(shmat), 240 },
-{ SCMP_SYS(shmdt), 240 }
+{ SCMP_SYS(shmdt), 240 },
+{ SCMP_SYS(shmctl), 240 },
+{ SCMP_SYS(mlock), 240 },
+{ SCMP_SYS(munlock), 240 }
 };
 
 int seccomp_start(void)




Re: [Qemu-devel] [PATCH 0/2] QEMU/seccomp fixes for PulseAudio

2014-01-16 Thread Paul Moore
On Thursday, January 16, 2014 01:52:30 PM Eduardo Otubo wrote:
 On 01/15/2014 05:38 PM, Paul Moore wrote:
  It turns out we need to add some additional syscalls to QEMU to make
  PulseAudio happy.  Two minor patches follow ...
  
  ---
  
  Paul Moore (2):
 seccomp: add mkdir() and fchmod() to the whitelist
 seccomp: add some basic shared memory syscalls to the whitelist

qemu-seccomp.c |7 ++-
1 file changed, 6 insertions(+), 1 deletion(-)
 
 Paul, sorry for the so late review. I was on vacation until yesterday
 and I'm still trying to empty my inbox.

No worries.  Personally, I wouldn't call a one day turnaround as a late 
review, I'd call that pretty good :)

-- 
paul moore
security and virtualization @ redhat




[Qemu-devel] [PATCH 2/2] seccomp: add some basic shared memory syscalls to the whitelist

2014-01-15 Thread Paul Moore
PulseAudio requires the use of shared memory so add shmget(), shmat(),
and shmdt() to the syscall whitelist.

Reported-by: xu...@redhat.com
Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index 89f244f..caa926e 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -222,7 +222,10 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(io_destroy), 241 },
 { SCMP_SYS(arch_prctl), 240 },
 { SCMP_SYS(mkdir), 240 },
-{ SCMP_SYS(fchmod), 240 }
+{ SCMP_SYS(fchmod), 240 },
+{ SCMP_SYS(shmget), 240 },
+{ SCMP_SYS(shmat), 240 },
+{ SCMP_SYS(shmdt), 240 }
 };
 
 int seccomp_start(void)




[Qemu-devel] [PATCH 0/2] QEMU/seccomp fixes for PulseAudio

2014-01-15 Thread Paul Moore
It turns out we need to add some additional syscalls to QEMU to make
PulseAudio happy.  Two minor patches follow ...

---

Paul Moore (2):
  seccomp: add mkdir() and fchmod() to the whitelist
  seccomp: add some basic shared memory syscalls to the whitelist


 qemu-seccomp.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)




[Qemu-devel] [PATCH 1/2] seccomp: add mkdir() and fchmod() to the whitelist

2014-01-15 Thread Paul Moore
The PulseAudio library attempts to do a mkdir(2) and fchmod(2) on
/run/user/UID/pulse which is currently blocked by the syscall
filter; this patch adds the two missing syscalls to the whitelist.
You can reproduce this problem with the following command:

 # qemu -monitor stdio -device intel-hda -device hda-duplex

If watched under strace the following syscalls are shown:

 mkdir(/run/user/0/pulse, 0700)
 fchmod(11, 0700) [NOTE: 11 is the fd for /run/user/0/pulse]

Reported-by: xu...@redhat.com
Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index b7c1253..89f244f 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -220,7 +220,9 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(io_cancel), 241 },
 { SCMP_SYS(io_setup), 241 },
 { SCMP_SYS(io_destroy), 241 },
-{ SCMP_SYS(arch_prctl), 240 }
+{ SCMP_SYS(arch_prctl), 240 },
+{ SCMP_SYS(mkdir), 240 },
+{ SCMP_SYS(fchmod), 240 }
 };
 
 int seccomp_start(void)




[Qemu-devel] [PATCH] seccomp: add mkdir() and fchmod() to the whitelist

2014-01-03 Thread Paul Moore
The PulseAudio library attempts to do a mkdir(2) and fchmod(2) on
/run/user/UID/pulse which is currently blocked by the syscall
filter; this patch adds the two missing syscalls to the whitelist.
You can reproduce this problem with the following command:

 # qemu -monitor stdio -device intel-hda -device hda-duplex

If watched under strace the following syscalls are shown:

 mkdir(/run/user/0/pulse, 0700)
 fchmod(11, 0700) [NOTE: 11 is the fd for /run/user/0/pulse]

Reported-by: xu...@redhat.com
Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index cf07869..bb19306 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -220,7 +220,9 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(io_cancel), 241 },
 { SCMP_SYS(io_setup), 241 },
 { SCMP_SYS(io_destroy), 241 },
-{ SCMP_SYS(arch_prctl), 240 }
+{ SCMP_SYS(arch_prctl), 240 },
+{ SCMP_SYS(mkdir), 240 },
+{ SCMP_SYS(fchmod), 240 }
 };
 
 int seccomp_start(void)




Re: [Qemu-devel] [PATCH] seccomp: add mkdir() and fchmod() to the whitelist

2014-01-03 Thread Paul Moore
On Friday, January 03, 2014 09:24:57 PM Paolo Bonzini wrote:
 Il 03/01/2014 20:58, Paul Moore ha scritto:
  The PulseAudio library attempts to do a mkdir(2) and fchmod(2) on
  /run/user/UID/pulse which is currently blocked by the syscall
  filter; this patch adds the two missing syscalls to the whitelist.
  
  You can reproduce this problem with the following command:
   # qemu -monitor stdio -device intel-hda -device hda-duplex
  
  If watched under strace the following syscalls are shown:
   mkdir(/run/user/0/pulse, 0700)
   fchmod(11, 0700) [NOTE: 11 is the fd for /run/user/0/pulse]
 
 Can fchmod be exploited to violate the sandbox (e.g. to let data escape
 from a VM that ought not to have any way to communicate with the outside
 world)?

Technically, there is the potential for any syscall to be exploited in such a 
way that a malicious guest could gain greater access than desired and do 
something evil with that access.  After all, that was the motivation behind 
seccomp: disable unused syscalls to reduce the chance of an attacker 
exploiting a syscall bug.

The important thing to remember here is that the seccomp code in QEMU is not 
enabling syscalls, it is disabling them.  In other words, a QEMU instance with 
the seccomp functionality enabled, e.g. '-sandbox on', only reduces the number 
of syscalls available to the QEMU process, it never increases or adds 
vulnerable syscalls to the QEMU process.

Granted, yes, there are syscalls in the current whitelist that I wish we could 
disable, but we are still trying to arrive a whitelist that is all 
encompassing (or close to it) with respect to QEMU functionality.  Once we 
have that list in hand (each fix like the one I posted gets us closer) we can 
start looking at selectively shrinking the whitelist*.

* We've talked about this on-list previously and there are several approaches 
here, some include conditionally adding/removing syscalls based on the QEMU 
functionality requested, e.g. command line, different sandbox profiles, e.g. 
standalone vs libvirt, and staged seccomp filters, e.g. a whitelist followed 
by progressively tighter blacklists.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: exit if seccomp_init() fails

2013-12-19 Thread Paul Moore
On Wednesday, December 18, 2013 11:48:11 AM Corey Bryant wrote:
 This fixes a bug where we weren't exiting if seccomp_init() failed.
 
 Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com
 ---
  qemu-seccomp.c | 1 +
  1 file changed, 1 insertion(+)

Acked-by: Paul Moore pmo...@redhat.com
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index cf07869..b7c1253 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -231,6 +231,7 @@ int seccomp_start(void)
 
  ctx = seccomp_init(SCMP_ACT_KILL);
  if (ctx == NULL) {
 +rc = -1;
  goto seccomp_return;
  }

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: -sandbox on won't kill Qemu when option not built in

2013-12-10 Thread Paul Moore
On Tuesday, December 10, 2013 04:48:54 PM Lucas Meneghel Rodrigues wrote:
 On 12/10/2013 01:20 AM, Corey Bryant wrote:
  IMHO the test suite should probe to see if sandbox is working or not,
  and
  just not use the -sandbox on arg if the host doesn't support it.
  
  But I think this could be done on virt-test as well :)
  
  This would make sense.
  
  Although it sounds like Lucas was looking for an error message when
  seccomp kills qemu.  Maybe virt-test could grep the audit log for the
  existence of a type=SECCOMP record within the test's time of
  execution, and issue a message based on that.
 
 It's a valid idea. The problem I see with it is that not every distro
 out there uses SELinux. Not getting into the merits of whether they
 should, ideally it'd be nice to have this working on distros that won't
 use SELinux.

Minor point of clarification, but audit and SELinux and independent subsystems 
in the kernel.

Also, and I don't have a non-audit kernel built at the moment to verify this, 
but on non-audit kernels the audit messages should be sent to syslog so you 
*should* still be able to search for SECCOMP records regardless.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: -sandbox on won't kill Qemu when option not built in

2013-12-09 Thread Paul Moore
On Monday, December 09, 2013 03:51:36 PM Eduardo Otubo wrote:
 On 12/09/2013 03:33 PM, Daniel P. Berrange wrote:
  On Mon, Dec 09, 2013 at 03:20:52PM -0200, Eduardo Otubo wrote:
  This option was requested by virt-test team so they can run tests with
  Qemu and -sandbox on set without breaking whole test if host doesn't
  
  have support for seccomp in kernel. It covers two possibilities:
1) Host kernel support does not support seccomp, but user installed
Qemu

   package with sandbox support: Libseccomp will fail - qemu will fail
   nicely and won't stop execution.

2) Host kernel has support but Qemu package wasn't built with sandbox

   feature. Qemu will fail nicely and won't stop execution.
  
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
  ---
  
vl.c | 10 +++---
1 file changed, 3 insertions(+), 7 deletions(-)

{snip}

  This change is really dubious from a security POV. If the admin requested
  sandboxing and the host or QEMU build cannot support it, then QEMU really
  *must* exit.
 
 I think an admin must know what he's doing. If he requested sandbox but
 without kernel support he need to step back a little and understand what
 he's doing. This patch won't decrease the security level, IMHO.

NACK

For the reasons Daniel already mentioned.  Mistakes happen, a lot, and if the 
user explicitly requests security functionality and we can't provide it we 
need to fail in a manner that doesn't increase the user's risk.

  IMHO the test suite should probe to see if sandbox is working or not, and
  just not use the -sandbox on arg if the host doesn't support it.
 
 But I think this could be done on virt-test as well :)

This would be ideal, but if you must have a fallback mechanism in QEMU proper, 
make it separate from '-sandbox on' so that it doesn't break with the current 
behavior and also makes it is obvious that the functionality is not 
guaranteed, e.g. '-sandbox try' or similar.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: add kill() to the syscall whitelist

2013-11-26 Thread Paul Moore
On Thursday, November 21, 2013 02:40:48 PM Eduardo Otubo wrote:
 On 11/21/2013 01:40 PM, Paul Moore wrote:
  The kill() syscall is triggered with the following command:
# qemu -sandbox on -monitor stdio \

   -device intel-hda -device hda-duplex -vnc :0
  
  The resulting syslog/audit message:
# ausearch -m SECCOMP

time-Wed Nov 20 09:52:08 2013
type=SECCOMP msg=audit(1384912328.482:6656): auid=0 uid=0 gid=0 ses=854

 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=12087
 comm=qemu-kvm sig=31 syscall=62 compat=0 ip=0x7f7a1d2abc67 code=0x0

# scmp_sys_resolver 62
kill
  
  Reported-by: CongLi c...@redhat.com
  Tested-by: CongLi c...@redhat.com
  Signed-off-by: Paul Moore pmo...@redhat.com
  ---
  
qemu-seccomp.c |1 +
1 file changed, 1 insertion(+)
  
  diff --git a/qemu-seccomp.c b/qemu-seccomp.c
  index 69cee44..cf07869 100644
  --- a/qemu-seccomp.c
  +++ b/qemu-seccomp.c
  @@ -114,6 +114,7 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[] = { 
{ SCMP_SYS(write), 244 },
{ SCMP_SYS(fcntl), 243 },
{ SCMP_SYS(tgkill), 242 },
  
  +{ SCMP_SYS(kill), 242 },
  
{ SCMP_SYS(rt_sigaction), 242 },
{ SCMP_SYS(pipe2), 242 },
{ SCMP_SYS(munmap), 242 },
 
 ACK, Reviewed and tested.
 (I'll send a pull request tomorrow EOD)
 
 Reviewed-by: Eduardo Otubo ot...@linux.vnet.ibm.com

Ping?

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH for-1.7] seccomp: setting -sandbox on by default

2013-11-22 Thread Paul Moore
On Friday, November 22, 2013 11:34:41 AM Stefan Hajnoczi wrote:
 IMO this seccomp approach is doomed since QEMU does not practice
 privilege separation.  QEMU is monolithic so it's really hard to create
 a meaningful sets of system calls.

I'm a big fan of decomposing QEMU, but based on previous discussions there 
seems to be a lot of fear from the core QEMU folks around decomposition; 
enough that I'm not sure it is worth the time and effort at this point to 
pursue it.  While I agree that a decomposed QEMU would be able to make better 
use of syscall filtering (and LSM/SELinux protection, and ...) I don't believe 
it means syscall filtering is a complete lost cause with a monolithic QEMU.  
Any improvement you can make, no matter how small, is still and improvement.

 To avoid breaking stuff you need to be too liberal, defeating the purpose of
 seccomp.

Even if you can only disable a few syscalls you are still better off than you 
were before.  Could it be done better, of course it could, but it doesn't mean 
you shouldn't try for some benefit.

 For each QEMU command-line there may be a different set of syscalls that
 should be allowed/forbidden.

I'm not sure if you missed it or not, but I had an email exchange with Eduardo 
on this list about making the syscall whitelist a bit more intelligent and 
dependent on what functionality was enabled for a given QEMU instance.  This 
should help a bit with the problems you are describing.

 The existing approach clearly doesn't support the full range of options
 that users specify on the command-line.

Bugs.  It will get fixed in time with more testing/debugging.  Eduardo is 
working on improving the testing and RH's QA folks are working hard to shake 
out the bugs too.  I just posted another bug fix patch to the whitelist a few 
days ago.

 So I guess the options are:
 
 1. Don't make it the default since it breaks stuff but use it for very
 specific scenarios (e.g. libvirt use cases that have been well tested).

In my opinion, I think it was probably a bit premature to make enable it by 
default, but at some point in the future I think we do need to do this.

 2. Provide a kind of syscall set for various QEMU options and apply the
 union of them at launch.  This still seems fragile but in theory it
 could work.

This is what I was discussing above.  I think this is likely the next big 
improvement.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH for-1.7] seccomp: setting -sandbox on by default

2013-11-22 Thread Paul Moore
On Friday, November 22, 2013 11:39:31 AM Stefan Hajnoczi wrote:
 On Thu, Nov 21, 2013 at 10:48:58AM -0500, Paul Moore wrote:
  I'm always open to suggestions on how to improve the development/debugging
  process, so if you have any ideas please let me know.
 
 The failure mode is terrible:

Glad to see you don't feel strongly about things.

   The following fails to start here (the shell hangs and ps shows QEMU
   is a defunct process)
 
 When a process dies the shell prints a warning.  That alerts the user
 and prompts them to take further steps.
 
 Hanging the shell is a really bad way to fail.  We can't expect users to
 begin searching logs for audit failures.  They probably don't even know
 about audit or seccomp.

First things first, if a normal user hits a seccomp failure I consider it to 
be a bug, and to be honest, I've seen much nastier, much more subtle bugs than 
an inadvertent seccomp death.  In a perfect world only developers would run 
into this problem and I would expect developers to be smart enough to figure 
out what is going on.

Getting past that, if seccomp is configured to kill the process when it 
violates the filter rules there isn't much else we can do; the kernel kills 
the process and then the rest of userspace, e.g. the shell/libvirt/etc., does 
whatever it does.  We have very little control over things.

 Is it possible to produce output when a seccomp violation takes place?

See the earlier comments from Eduardo about his attempts in this area.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH for-1.7] seccomp: setting -sandbox on by default

2013-11-22 Thread Paul Moore
On Friday, November 22, 2013 04:48:41 PM Stefan Hajnoczi wrote:
 On Fri, Nov 22, 2013 at 09:44:42AM -0500, Paul Moore wrote:
  On Friday, November 22, 2013 11:39:31 AM Stefan Hajnoczi wrote:
   On Thu, Nov 21, 2013 at 10:48:58AM -0500, Paul Moore wrote:
I'm always open to suggestions on how to improve the
development/debugging
process, so if you have any ideas please let me know.
   
   The failure mode is terrible:
  Glad to see you don't feel strongly about things.
 
 Sorry for the rant :).  I know you and Eduardo understand the issues and
 have already been working on them.

I can't speak for Eduardo, but no worries on my end; it just wouldn't be an 
Open Source project without a bit of hyperbole now and then would it? ;)

 I hope hearing it from a developer who isn't following seccomp is useful
 though.

Definitely.  I should have said it earlier, but I do appreciate you taking the 
time to comment.

 It shows which issues stick out and hinder usability.  Users will only be
 happy with seccomp when it works silently behind the scenes.

Exactly.  Users don't tolerate bugs and I don't blame them.  After all, at 
some point we are all users too.

 Developers will only be happy with seccomp if it's easy and rewarding to
 support/debug.

Agreed.

As a developer, how do you feel about the audit/syslog based approach I 
mentioned earlier?

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH for-1.7] seccomp: setting -sandbox on by default

2013-11-21 Thread Paul Moore
On Thursday, November 21, 2013 04:14:11 PM Paolo Bonzini wrote:
 Il 30/10/2013 11:04, Stefan Hajnoczi ha scritto:
  On Wed, Oct 23, 2013 at 12:42:34PM -0200, Eduardo Otubo wrote:
  On 10/22/2013 11:00 AM, Anthony Liguori wrote:
  On Tue, Oct 22, 2013 at 12:21 PM, Eduardo Otubo
  
  ot...@linux.vnet.ibm.com wrote:
  Inverting the way sandbox handles arguments, making possible to have no
  argument and still have '-sandbox on' enabled.
  
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
  ---
  
  The option '-sandbox on' is now used by default by virt-test[0] -- it
  has been merged into the 'next' branch and will be available in the
  next release, meaning we have a back support for regression tests if
  anything breaks because of some missing system call not listed in the
  whitelist.
  
  This being said, I think it makes sense to have this option set to 'on'
  by
  default in the next Qemu version. It's been a while since no missing
  syscall is reported and at this point the whitelist seems to be pretty
  mature.
  
  [0] -
  https://github.com/autotest/virt-test/commit/50e1f7d47a94f4c770880cd8e
  c0f18365dcba714 
  This breaks hot_add of a network device that uses a script= argument,
  correct?
  
  If so, this cannot be made default.
  
  Anthony, I believe you're talking about the blacklist feature. This
  is the old whitelist that is already upstream and it does not block
  any network device to be hot plugged.
  
  The following fails to start here (the shell hangs and ps shows QEMU is
  a defunct process):
  
  qemu-system-x86_64 -sandbox on -enable-kvm -m 1024 -cpu host \
  
 -drive if=virtio,cache=none,file=test.img
 
 Easier-to-debug failures are another prerequisite for enabling the
 sandbox by default, I think.

I believe I've posted this information before, but just in case ...

IMHO, it is really not that hard to debug a seccomp failure; the first step is 
to look for the failure in the audit log or syslog.  If you are on a 
Fedora/RHEL based system you are most likely running audit, so finding the 
seccomp failures are quite simple with the 'ausearch' command:

 # ausearch -m SECCOMP
 
 time-Wed Nov 20 09:52:08 2013
 type=SECCOMP msg=audit(1384912328.482:6656): auid=0 uid=0 gid=0 ses=854
  subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=12087
  comm=qemu-kvm sig=31 syscall=62 compat=0 ip=0x7f7a1d2abc67 code=0x0

... if you are using syslog, feel free to use whatever tool you prefer, e.g. 
grep, less, etc.

Once you have the syscall number, syscall=62, in the audit message above, 
you can use the 'scmp_sys_resolver' to resolve the number into a name:

 # scmp_sys_resolver 62
 kill

The 'scmp_sys_resolver' tool is part of the libseccomp-devel package on 
Fedora/RHEL based systems, it may be packaged differently on other 
distributions.

I'm always open to suggestions on how to improve the development/debugging 
process, so if you have any ideas please let me know.

-Paul

-- 
paul moore
security and virtualization @ redhat




[Qemu-devel] [PATCH] seccomp: add kill() to the syscall whitelist

2013-11-21 Thread Paul Moore
The kill() syscall is triggered with the following command:

 # qemu -sandbox on -monitor stdio \
-device intel-hda -device hda-duplex -vnc :0

The resulting syslog/audit message:

 # ausearch -m SECCOMP
 
 time-Wed Nov 20 09:52:08 2013
 type=SECCOMP msg=audit(1384912328.482:6656): auid=0 uid=0 gid=0 ses=854
  subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=12087
  comm=qemu-kvm sig=31 syscall=62 compat=0 ip=0x7f7a1d2abc67 code=0x0
 # scmp_sys_resolver 62
 kill

Reported-by: CongLi c...@redhat.com
Tested-by: CongLi c...@redhat.com
Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index 69cee44..cf07869 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -114,6 +114,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(write), 244 },
 { SCMP_SYS(fcntl), 243 },
 { SCMP_SYS(tgkill), 242 },
+{ SCMP_SYS(kill), 242 },
 { SCMP_SYS(rt_sigaction), 242 },
 { SCMP_SYS(pipe2), 242 },
 { SCMP_SYS(munmap), 242 },




Re: [Qemu-devel] [PATCHv3 1/3] seccomp: adding blacklist support

2013-10-09 Thread Paul Moore
On Tuesday, October 08, 2013 09:42:24 PM Eduardo Otubo wrote:
 v3: The -netdev tap option is checked in the vl.c file during the
 process of the command line argument list. It sets tap_enabled to true
 or false according to the configuration found. Later at the seccomp
 filter installation, this value is checked wheter to install or not this
 feature.

I like the idea of slowly making the QEMU syscall filter dependent on the 
runtime configuration.  With that in mind, I wonder if we should have a more 
general purpose API in include/sysemu/seccomp.h that allows QEMU to indicate 
to the the QEMU/seccomp code that a particular feature is enabled.

Maybe something like this:

  #define SCMP_FEAT_TAP ...

  int seccomp_feature_enable(int feature);

One more comment below.

 Adding a system call blacklist right before the vcpus starts. This
 filter is composed by the system calls that can't be executed after the
 guests are up. This list should be refined as whitelist is, with as much
 testing as we can do using virt-test.
 
 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
 ---
  include/sysemu/seccomp.h |  6 -
  qemu-seccomp.c   | 64
 +++- vl.c |
 21 +++-
  3 files changed, 77 insertions(+), 14 deletions(-)
 
 diff --git a/include/sysemu/seccomp.h b/include/sysemu/seccomp.h
 index 1189fa2..9dc7e52 100644
 --- a/include/sysemu/seccomp.h
 +++ b/include/sysemu/seccomp.h
 @@ -15,8 +15,12 @@
  #ifndef QEMU_SECCOMP_H
  #define QEMU_SECCOMP_H
 
 +#define WHITELIST 0
 +#define BLACKLIST 1

Should these #defines be namespaced in some way, e.g. SCMP_LIST_BLACKLIST?

  #include seccomp.h
  #include qemu/osdep.h
 
 -int seccomp_start(void);
 +int seccomp_start(int list_type);
 +
  #endif


-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv3 3/3] seccomp: general fixes

2013-10-09 Thread Paul Moore
On Tuesday, October 08, 2013 09:42:26 PM Eduardo Otubo wrote:
  1) On qemu-seccomp.c:255, the variable ctx was being used
 uninitialized; now it's initialized with NULL and it's being checked at
 the end of the function.
 
  2) Changed the name of the command line option from enable to
 sandbox for a better understanding from user side.
 
 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
 ---
  qemu-seccomp.c | 4 ++--
  vl.c   | 6 +++---
  2 files changed, 5 insertions(+), 5 deletions(-)
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index 84a42bc..fdd0de3 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -258,7 +258,7 @@ seccomp_return:
  int seccomp_start(int list_type)
  {
  int rc = 0;
 -scmp_filter_ctx ctx;
 +scmp_filter_ctx ctx = NULL;
 
  switch (list_type) {
  case WHITELIST:
 @@ -285,7 +285,7 @@ int seccomp_start(int list_type)
 
  rc = seccomp_load(ctx);
 
 -  seccomp_return:
 +seccomp_return:
  if (ctx)
  seccomp_release(ctx);
  return rc;

Any particular reason these changes weren't folded into patch 1/3?

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PULL 00/01] seccomp: trivial changes

2013-10-01 Thread Paul Moore
On Tuesday, September 24, 2013 03:30:57 PM Eduardo Otubo wrote:
 Anthony,
 
 The following changes since commit f828a4c8faa118e0ebab3e353ac6840f3b2a0318:
 
   Merge remote-tracking branch 'stefanha/tracing' into staging (2013-09-23
 11:53:22 -0500)
 
 are available in the git repository at:
 
   git://github.com/otubo/qemu.git seccomp
 
 Eduardo Otubo (1):
   seccomp: fine tuning whitelist by adding times()
 
  qemu-seccomp.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

Just a follow-up ping to see where things stand on this bugfix ...

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: adding times() to the whitelist

2013-09-23 Thread Paul Moore
On Wednesday, September 04, 2013 10:11:10 AM Paul Moore wrote:
 On Wednesday, September 04, 2013 09:25:08 AM Eduardo Otubo wrote:
  This was causing Qemu process to hang when using -sandbox on.
  
  Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1004175
  
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
 
 Works for me.
 
 Tested-by: Paul Moore pmo...@redhat.com

Eduardo, perhaps you should just merge this into your tree and send a pull 
request?  This fix should also go into -stable.

Acked-by: Paul Moore pmo...@redhat.com

  ---
  
   qemu-seccomp.c |1 +
   1 files changed, 1 insertions(+), 0 deletions(-)
  
  diff --git a/qemu-seccomp.c b/qemu-seccomp.c
  index 37d38f8..69cee44 100644
  --- a/qemu-seccomp.c
  +++ b/qemu-seccomp.c
  @@ -90,6 +90,7 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[] = { { SCMP_SYS(getuid), 245 },
  
   { SCMP_SYS(geteuid), 245 },
   { SCMP_SYS(timer_create), 245 },
  
  +{ SCMP_SYS(times), 245 },
  
   { SCMP_SYS(exit), 245 },
   { SCMP_SYS(clock_gettime), 245 },
   { SCMP_SYS(time), 245 },

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 2/3] seccomp: adding command line support for blacklist

2013-09-18 Thread Paul Moore
On Wednesday, September 18, 2013 04:59:10 PM Daniel P. Berrange wrote:
 On Wed, Sep 18, 2013 at 11:53:09AM -0400, Paul Moore wrote:
  On Wednesday, September 18, 2013 08:38:17 AM Daniel P. Berrange wrote:
   Libvirt does not want to be in the business of creating seccomp syscall
   filters for QEMU. As mentioned before, IMHO that places an unacceptable
   burden on libvirt to know about the syscalls each a particular version
   of QEMU requires for its operation.
  
  At a high level, I don't see how libvirt configuring and installing a
  syscall filter is substantially different from libvirt configuring and
  installing a network filter.
 
 The rules created for a network filter have no bearing or relation to
 internal QEMU implementation details, as you have with syscalls, so
 this isn't really a relevant comparison.

The rules created for a network filter are directly related to the details of 
the guest running inside of QEMU.  From a practical point of view I see both 
network and syscall filtering as being dependent on the guest; the network 
filtering configuration can change as the guest's services change, the syscall 
filtering configuration can change as the QEMU functionality can change.

  Also, and I recognize this is diverting away from a topic most of
  qemu-devel is not interested in, what about libvirt-lxc?  What about all
  of the other virtualization drivers supported by libvirt (granted, not
  all would be candidates for syscall filtering, but you get the idea).
 
 It isn't clear to me that syscall filtering is something that's relevant
 for inclusion in libvirt-lxc. It seems like something that would be used
 by apps running inside LXC containers directly.

For all the same reasons that it makes sense to filter syscalls in QEMU, I 
think it makes sense to filter syscalls in libvirt-lxc.  The fundamental 
concern is that the kernel presents are large attack surface in the way of 
syscalls, and it is extremely likely that any given container does not have a 
legitimate need to call into all of the syscalls the kernel presents to 
userspace; especially if you consider the recent approaches of using 
containers to ship/deploy single applications.

Also, just in case there are some misconceptions floating around, loading a 
syscall filter in libvirt doesn't mean the individual container applications 
can't also load their own filter.  When multiple syscall filters are present 
for a given process, all of the filters are evaluated and the most restrictive 
decision for a given syscall request wins.

 Libvirt has no knowledge of such apps or what rules they might require, so
 can't make any kind of intelligent decision about syscall filtering for LXC.

A perfectly valid point, but I also think of syscall filtering as allowing the 
host administrator the ability to reduce the attack surface of the host 
system/kernel from potentially malicious containers/applications without 
having to rely on these containers/applications to police themselves.

 I really view seccomp as something that apps use directly themselves, not
 something that a 3rd party process applies prior to launching the apps,
 since the latter has far too much administrative burden IMHO.

The seccomp filter functionality is definitely something that apps can use 
themselves, but to limit syscall filtering to just that use case is to miss 
out on other valid uses as well.  As far as the burden is concerned, is 
users/administrators find it too difficult, there is nothing requiring them to 
use it, however, for those who are facing serious security risks in their 
deployments providing syscall filtering in libvirt might be a very welcome 
addition.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 2/3] seccomp: adding command line support for blacklist

2013-09-18 Thread Paul Moore
On Wednesday, September 18, 2013 08:38:17 AM Daniel P. Berrange wrote:
 Libvirt does not want to be in the business of creating seccomp syscall
 filters for QEMU. As mentioned before, IMHO that places an unacceptable
 burden on libvirt to know about the syscalls each a particular version
 of QEMU requires for its operation.

At a high level, I don't see how libvirt configuring and installing a syscall 
filter is substantially different from libvirt configuring and installing a 
network filter.

Also, and I recognize this is diverting away from a topic most of qemu-devel 
is not interested in, what about libvirt-lxc?  What about all of the other 
virtualization drivers supported by libvirt (granted, not all would be 
candidates for syscall filtering, but you get the idea).

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 2/3] seccomp: adding command line support for blacklist

2013-09-18 Thread Paul Moore
On Wednesday, September 18, 2013 05:32:17 PM Daniel P. Berrange wrote:
 On Wed, Sep 18, 2013 at 12:19:44PM -0400, Paul Moore wrote:
  On Wednesday, September 18, 2013 04:59:10 PM Daniel P. Berrange wrote:
   On Wed, Sep 18, 2013 at 11:53:09AM -0400, Paul Moore wrote:
On Wednesday, September 18, 2013 08:38:17 AM Daniel P. Berrange wrote:
 Libvirt does not want to be in the business of creating seccomp
 syscall filters for QEMU. As mentioned before, IMHO that places an
 unacceptable burden on libvirt to know about the syscalls each a
 particular version of QEMU requires for its operation.

At a high level, I don't see how libvirt configuring and installing a
syscall filter is substantially different from libvirt configuring and
installing a network filter.
   
   The rules created for a network filter have no bearing or relation to
   internal QEMU implementation details, as you have with syscalls, so
   this isn't really a relevant comparison.
  
  The rules created for a network filter are directly related to the details
  of the guest running inside of QEMU.  From a practical point of view I
  see both network and syscall filtering as being dependent on the guest;
  the network filtering configuration can change as the guest's services
  change, the syscall filtering configuration can change as the QEMU
  functionality can change.

 You're talking about two very different things here. Seccomp syscall
 filtering affects QEMU itself while network filter affects the guest
 OS apps inside QEMU.

From a security standpoint I'm not entirely convinced the distinction is 
important.

 Network filtering still does not depend on the implementation details of the
 guest OS apps - it depends on the services that those apps are using.

Once again, I'm not entirely sure that worrying about the distinction between 
guest apps/services is important - it is just the guest.

 Thus configuring network filters does not require the admin to have
 knowledge of the apps internal impl details in the way that seccomp does.

Network filters require the admin to have knowledge of what apps/services the 
guest is providing.  Syscall filters require the admin have knowledge of what 
version of QEMU is deployed on the host.  I think it is reasonable to expect 
that the admin has more knowledge, and more control, over the QEMU version 
they are using than they do over what is being run in the hosted guests.

I don't argue that arriving at the correct syscall filter configuration is 
more difficult than a network filter, but I don't see what that means we can't 
offer it as an option for the more savy admins.  Also, the libvirt patches I'm 
currently working on allow the syscall filter to be defined either as a 
whitelist or a blacklist; the blacklist approach should provide a much more 
gradual learning curve ... and in the case of containers, I suspect it might 
also be the better solution.

Also, and I recognize this is diverting away from a topic most of
qemu-devel is not interested in, what about libvirt-lxc?  What about
all of the other virtualization drivers supported by libvirt (granted,
not all would be candidates for syscall filtering, but you get the
idea).
   
   It isn't clear to me that syscall filtering is something that's relevant
   for inclusion in libvirt-lxc. It seems like something that would be used
   by apps running inside LXC containers directly.
  
  For all the same reasons that it makes sense to filter syscalls in QEMU, I
  think it makes sense to filter syscalls in libvirt-lxc.  The fundamental
  concern is that the kernel presents are large attack surface in the way of
  syscalls, and it is extremely likely that any given container does not
  have a legitimate need to call into all of the syscalls the kernel
  presents to userspace; especially if you consider the recent approaches
  of using containers to ship/deploy single applications.
  
  Also, just in case there are some misconceptions floating around, loading
  a syscall filter in libvirt doesn't mean the individual container
  applications can't also load their own filter.  When multiple syscall
  filters are present for a given process, all of the filters are evaluated
  and the most restrictive decision for a given syscall request wins.
  
   Libvirt has no knowledge of such apps or what rules they might require,
   so can't make any kind of intelligent decision about syscall filtering
   for LXC.
 
  A perfectly valid point, but I also think of syscall filtering as allowing
  the host administrator the ability to reduce the attack surface of the
  host system/kernel from potentially malicious containers/applications
  without having to rely on these containers/applications to police
  themselves.
  
   I really view seccomp as something that apps use directly themselves,
   not something that a 3rd party process applies prior to launching the
   apps, since the latter has far too much

Re: [Qemu-devel] [PATCHv2 2/3] seccomp: adding command line support for blacklist

2013-09-17 Thread Paul Moore
On Tuesday, September 17, 2013 02:06:06 PM Daniel P. Berrange wrote:
 On Tue, Sep 17, 2013 at 10:01:23AM -0300, Eduardo Otubo wrote:

  Paul, what exactly are you planning to add to libvirt? I'm not a big
  fan of using qemu command line to pass syscalls for blacklist as
  arguments, but I can't see other way to avoid problems (like -net
  bridge / -net tap) from happening.

At present, and as far as I'm concerned pretty much everything is open for 
discussion, the code works similar to the libvirt network filters.  You create 
a separate XML configuration file which defines the filter and you reference 
that filter from the domain's XML configuration.  When a QEMU/KVM or LXC based 
domain starts it uses libseccomp to create the seccomp filter and then loads 
it into the kernel after the fork but before the domain is exec'd.

There are no command line arguments passed to QEMU.  This work can co-exist 
with the QEMU seccomp filters without problem.

The original goal of this effort wasn't to add libvirt syscall filtering for 
QEMU, but rather for LXC; adding QEMU support just happened to be a trivial 
patch once the LXC support was added.

(I also apologize for the delays, I hit a snag with an existing problem on 
libvirt which stopped work and then some other BZs grabbed my attention...)

 IMHO, if libvirt is enabling seccomp, then making all possible cli
 args work is a non-goal. If there are things which require privileges
 seccomp is blocking, then libvirt should avoid using them. eg by making
 use of FD passing where appropriate to reduce privileges qemu needs.

I agree.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: adding times() to the whitelist

2013-09-13 Thread Paul Moore
On Wednesday, September 04, 2013 10:11:10 AM Paul Moore wrote:
 On Wednesday, September 04, 2013 09:25:08 AM Eduardo Otubo wrote:
  This was causing Qemu process to hang when using -sandbox on.
  
  Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1004175
  
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
 
 Works for me.
 
 Tested-by: Paul Moore pmo...@redhat.com

I fear this patch may have been lost in the maintainer discussion - can we 
merge this fix please?

  ---
  
   qemu-seccomp.c |1 +
   1 files changed, 1 insertions(+), 0 deletions(-)
  
  diff --git a/qemu-seccomp.c b/qemu-seccomp.c
  index 37d38f8..69cee44 100644
  --- a/qemu-seccomp.c
  +++ b/qemu-seccomp.c
  @@ -90,6 +90,7 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[] = { { SCMP_SYS(getuid), 245 },
  
   { SCMP_SYS(geteuid), 245 },
   { SCMP_SYS(timer_create), 245 },
  
  +{ SCMP_SYS(times), 245 },
  
   { SCMP_SYS(exit), 245 },
   { SCMP_SYS(clock_gettime), 245 },
   { SCMP_SYS(time), 245 },

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: adding times() to the whitelist

2013-09-09 Thread Paul Moore
On Monday, September 09, 2013 12:38:12 PM Paolo Bonzini wrote:
 Il 06/09/2013 20:41, Eduardo Otubo ha scritto:
  Hello,
  
  Any chance to get this patch applied?
  
  Thanks!
 
 Paul, perhaps you can add yourself to MAINTAINERS and send a pull request?
 
 Paolo

Out of respect for the work that Eduardo has done, and is continuing to do, 
with the QEMU seccomp filtering, I think Eduardo should be the one to take on 
this role.  If Eduardo declines I'll do ahead and submit a patch adding myself 
to the MAINTAINERS file.

  On 09/04/2013 11:11 AM, Paul Moore wrote:
  On Wednesday, September 04, 2013 09:25:08 AM Eduardo Otubo wrote:
  This was causing Qemu process to hang when using -sandbox on.
  
  Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1004175
  
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
  
  Works for me.
  
  Tested-by: Paul Moore pmo...@redhat.com
  
  ---
  
qemu-seccomp.c |1 +
1 files changed, 1 insertions(+), 0 deletions(-)
  
  diff --git a/qemu-seccomp.c b/qemu-seccomp.c
  index 37d38f8..69cee44 100644
  --- a/qemu-seccomp.c
  +++ b/qemu-seccomp.c
  @@ -90,6 +90,7 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[]
  = { { SCMP_SYS(getuid), 245 },
  
{ SCMP_SYS(geteuid), 245 },
{ SCMP_SYS(timer_create), 245 },
  
  +{ SCMP_SYS(times), 245 },
  
{ SCMP_SYS(exit), 245 },
{ SCMP_SYS(clock_gettime), 245 },
{ SCMP_SYS(time), 245 },

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] seccomp submaintainer? (was Re: [PATCH] seccomp: adding times() to the whitelist)

2013-09-09 Thread Paul Moore
On Monday, September 09, 2013 03:48:09 PM Paolo Bonzini wrote:
 Il 09/09/2013 15:20, Eduardo Otubo ha scritto:
  Out of respect for the work that Eduardo has done, and is
  continuing to do, with the QEMU seccomp filtering, I think Eduardo
  should be the one to take on this role. If Eduardo declines I'll do
  ahead and submit a patch adding myself to the MAINTAINERS file.
  
  If this is ok for everyone, I would be really glad to take this role to
  myself. Paul, thanks for this vote of confidence. Paolo, should I send a
  patch for MAINTAINERS right away?
 
 Ok, I was suggesting Paul because he was the one doing reviews.
 
 Eduardo, that is also okay for me.  However, even as a maintainer please
 do wait for Paul's reviews.  Many areas of QEMU have maintainers that do
 not send their own patches without a review, so this wouldn't be a new
 rule. :)

Okay, with respect to maintainership, I was thinking more along the lines of 
the Linux Kernel where those that do the work get the job; it looks like QEMU 
has a slightly different twist on the idea.  If it makes more sense to the 
QEMU devs you can always add me as a co-maintainer.

Regardless, I do plan on continuing to review/test patches and I don't expect 
that to change in the near future.

 Please wait for Anthony's ack.  I changed the subject and CCed him to
 grab his attention.
 
 Paolo

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] MAINTAINERS: Add myself to MAINTAINERS file

2013-09-09 Thread Paul Moore
On Monday, September 09, 2013 02:04:15 PM Eduardo Otubo wrote:
 Add myself to the MAINTAINERS file. I'll be looking at qemu-seccomp.c
 and include/sysemu/seccomp.h.
 
 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
 ---
 As discussed in previous threads, I'm including myself to the MAINTAINERS
 file so I can take care of the sandbox feature in Qemu.
 
  MAINTAINERS |6 ++
  1 files changed, 6 insertions(+), 0 deletions(-)

Acked-by: Paul Moore pmo...@redhat.com

 diff --git a/MAINTAINERS b/MAINTAINERS
 index d128ed0..09c5148 100644
 --- a/MAINTAINERS
 +++ b/MAINTAINERS
 @@ -766,6 +766,12 @@ M: Blue Swirl blauwir...@gmail.com
  S: Odd Fixes
  F: scripts/checkpatch.pl
 
 +Seccomp
 +M: Eduardo Otubo ot...@linux.vnet.ibm.com
 +S: Supported
 +F: qemu-seccomp.c
 +F: include/sysemu/seccomp.h
 +
  Usermode Emulation
  --
  BSD user

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: adding times() to the whitelist

2013-09-04 Thread Paul Moore
On Wednesday, September 04, 2013 09:25:08 AM Eduardo Otubo wrote:
 This was causing Qemu process to hang when using -sandbox on.
 
 Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1004175
 
 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com

Works for me.

Tested-by: Paul Moore pmo...@redhat.com

 ---
  qemu-seccomp.c |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index 37d38f8..69cee44 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -90,6 +90,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[]
 = { { SCMP_SYS(getuid), 245 },
  { SCMP_SYS(geteuid), 245 },
  { SCMP_SYS(timer_create), 245 },
 +{ SCMP_SYS(times), 245 },
  { SCMP_SYS(exit), 245 },
  { SCMP_SYS(clock_gettime), 245 },
  { SCMP_SYS(time), 245 },

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Paul Moore
On Tuesday, September 03, 2013 02:08:28 PM Corey Bryant wrote:
 On 09/03/2013 02:02 PM, Corey Bryant wrote:
  On 08/30/2013 10:21 AM, Eduardo Otubo wrote:
  On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:
  On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:
  Now there's a second whitelist, right before the vcpu starts. The
  second
  whitelist is the same as the first one, except for exec() and select().
  
  -netdev tap,downscript=/path/to/script requires exec() in the QEMU
  shutdown code path.  Will this work with seccomp?
  
  I actually don't know, but I'll test that as well. Can you run a test
  with this patch and -netdev? I mean, if you're pointing that out you
  might have a scenario already setup, right?
  
  Thanks!
  
  This uses exec() in net/tap.c.
  
  I think if we're going to introduce a sandbox environment that restricts
  existing QEMU behavior, then we have to introduce a new argument to the
  -sandbox option.  So for example, -sandbox on would continue to use
  the whitelist that allows everything in QEMU to work (or at least it
  should :).  And something like -sandbox on,strict=on would use the
  whitelist + blacklist.
  
  If this is acceptable though, then I wonder how we could go about adding
  new syscalls to the blacklist in future QEMU releases without regressing
  -sandbox on,strict=on.
 
 Maybe a better approach is to provide support that allows libvirt to
 define the blacklist and pass it to QEMU?

FYI: I didn't want to mention this until I had some patches ready to post, but 
I'm currently working on adding syscall filtering, via libseccomp, to libvirt.  
I hope to get an initial RFC-quality patch out soon.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-09-03 Thread Paul Moore
On Tuesday, September 03, 2013 05:07:53 PM Eduardo Otubo wrote:
 On 09/03/2013 03:21 PM, Paul Moore wrote:
  On Tuesday, September 03, 2013 02:08:28 PM Corey Bryant wrote:
  On 09/03/2013 02:02 PM, Corey Bryant wrote:
  On 08/30/2013 10:21 AM, Eduardo Otubo wrote:
  On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:
  On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:
  Now there's a second whitelist, right before the vcpu starts. The
  second
  whitelist is the same as the first one, except for exec() and
  select().
  
  -netdev tap,downscript=/path/to/script requires exec() in the QEMU
  shutdown code path.  Will this work with seccomp?
  
  I actually don't know, but I'll test that as well. Can you run a test
  with this patch and -netdev? I mean, if you're pointing that out you
  might have a scenario already setup, right?
  
  Thanks!
  
  This uses exec() in net/tap.c.
  
  I think if we're going to introduce a sandbox environment that restricts
  existing QEMU behavior, then we have to introduce a new argument to the
  -sandbox option.  So for example, -sandbox on would continue to use
  the whitelist that allows everything in QEMU to work (or at least it
  should :).  And something like -sandbox on,strict=on would use the
  whitelist + blacklist.
  
  If this is acceptable though, then I wonder how we could go about adding
  new syscalls to the blacklist in future QEMU releases without regressing
  -sandbox on,strict=on.
  
  Maybe a better approach is to provide support that allows libvirt to
  define the blacklist and pass it to QEMU?
  
  FYI: I didn't want to mention this until I had some patches ready to post,
  but I'm currently working on adding syscall filtering, via libseccomp, to
  libvirt. I hope to get an initial RFC-quality patch out soon.
 
 Paul, if you need any help with Qemu and/or testing, please let me know.
 I would be glad to help :) When you post your RFC to libvirt mailing
 list please add me as CC.

Of course, I appreciate all the help I can get.  We can chat a bit more once 
the patches are posted.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-08-30 Thread Paul Moore
On Friday, August 30, 2013 05:23:45 PM Stefan Hajnoczi wrote:
 On Fri, Aug 30, 2013 at 4:21 PM, Eduardo Otubo ot...@linux.vnet.ibm.com 
wrote:
  On 08/29/2013 05:34 AM, Stefan Hajnoczi wrote:
  On Wed, Aug 28, 2013 at 10:04:32PM -0300, Eduardo Otubo wrote:
  Now there's a second whitelist, right before the vcpu starts. The second
  whitelist is the same as the first one, except for exec() and select().
  
  -netdev tap,downscript=/path/to/script requires exec() in the QEMU
  shutdown code path.  Will this work with seccomp?
  
  I actually don't know, but I'll test that as well. Can you run a test with
  this patch and -netdev? I mean, if you're pointing that out you might have
  a scenario already setup, right?
 
 I'm not having much luck running qemu.git/master with CONFIG_SECCOMP
 on Fedora 19.  The GTK UI opens but I don't see the guest's display.
 
 $ x86_64-softmmu/qemu-system-x86_64
 [...GTK UI opens but QEMU is hung...]
 
 strace shows the process is hung somehow and ps says it's defunct
 although it never exited.
 
 $ sudo cat /proc/5912/stack
 [81061fda] do_exit+0x6ca/0xa20
 [810ef090] __secure_computing+0xe0/0x240
 [8101d722] syscall_trace_enter+0x172/0x230
 [816478c8] tracesys+0x7e/0xe2
 [] 0x
 
 Okay, so seccomp killed the process.
 
 $ sudo cat /proc/5912/syscall
 29 0x0 0x1000 0x380 0x7fffbeb49380 0x0 0x0 0x7fffbeb495b8 0x7f6b72402657
 
 $ git grep '\29\' arch/x86/include/generated/uapi/asm/unistd_64.h
 #define __NR_shmget 29
 
 Now it needs syscall 30.  I guess the whitelist is only designed for a
 specific invocation that you are testing?

For future reference, it doesn't need to be that hard to identify when seccomp 
has killed a process.  If you're running audit go ahead and check the audit 
log:

 # ausearch -m SECCOMP
 
 time-Fri Aug 30 11:37:46 2013
 type=SECCOMP msg=audit(1377877066.414:64): auid=0 uid=0 gid=0 ses=1
 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 pid=3787
 comm=20-live-basic_d sig=31 syscall=2 compat=0 ip=0x3a27ae6570 code=0x0

... and notice the 'syscall' field which in this case happens to be '2'.  If 
you have the 'scmp_sys_resolver' tool installed on your system (libseccomp-
devel = 2.1.0 on Fedora) you can then resolve the syscall number:

 # scmp_sys_resolver 2
 open

It is also worth mentioning that while scmp_sys_resolver resolves syscalls for 
the native architecture by default, it can resolve for any of the 
architectures that libseccomp supports, see the manpage for details 
(currently: x86, x86_64, x32, and arm).

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-08-30 Thread Paul Moore
On Friday, August 30, 2013 11:27:28 AM Eduardo Otubo wrote:
 On 08/29/2013 09:56 AM, Paul Moore wrote:
  On Wednesday, August 28, 2013 10:04:32 PM Eduardo Otubo wrote:
  Now there's a second whitelist, right before the vcpu starts. The second
  whitelist is the same as the first one, except for exec() and select().
  
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
  
  We talked about this in a previous thread, but as a reminder, the kernel's
  seccomp BPF filter works by executing all of the loaded filters for each
  syscall and taking the least permissive action for all of the results.  In
  other words, if one filter returns ALLOW for a given syscall and another
  filter returns KILL, the kernel will select the KILL action for the
  syscall.
  
  With that in mind, I think the best option is to keep the existing
  whitelist and instead of creating a second whitelist, create a second
  *blacklist* that removes the syscalls you don't want to allow anymore,
  e.g. exec() and select().  This approach should be easier to maintain and
  would result in less overhead in the kernel's seccomp evaluator (the
  blacklist filter would be much smaller than a second whitelist filter).
 
 You're correct. I was thinking in a whole other approach, but your point
 makes a lot more sense. As I mentioned on the IRC, I should call
 seccomp_init(SCMP_ACT_ALLOW) and seccomp_rule_add(ctx, SCMP_ACT_KILL,
 list[i].num, 0); is that correct?

Yes, just basically swap the actions.

Also, as an FYI, while I may be in the IRC room, I typically don't actually 
monitor the room unless you direct a comment at me (it starts blinking and 
grabs my attention).

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: adding a second whitelist

2013-08-29 Thread Paul Moore
On Wednesday, August 28, 2013 10:04:32 PM Eduardo Otubo wrote:
 Now there's a second whitelist, right before the vcpu starts. The second
 whitelist is the same as the first one, except for exec() and select().
 
 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com

We talked about this in a previous thread, but as a reminder, the kernel's 
seccomp BPF filter works by executing all of the loaded filters for each 
syscall and taking the least permissive action for all of the results.  In 
other words, if one filter returns ALLOW for a given syscall and another 
filter returns KILL, the kernel will select the KILL action for the syscall.

With that in mind, I think the best option is to keep the existing whitelist 
and instead of creating a second whitelist, create a second *blacklist* that 
removes the syscalls you don't want to allow anymore, e.g. exec() and 
select().  This approach should be easier to maintain and would result in less 
overhead in the kernel's seccomp evaluator (the blacklist filter would be much 
smaller than a second whitelist filter).

-Paul

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: add additional asynchronous I/O syscalls

2013-07-29 Thread Paul Moore
On Wednesday, July 24, 2013 03:02:23 PM Eduardo Otubo wrote:
 On 07/23/2013 10:57 AM, Paul Moore wrote:
  On Monday, July 15, 2013 03:32:01 PM Paul Moore wrote:
  A previous commit, seccomp: add the asynchronous I/O syscalls to the
  whitelist, added several asynchronous I/O syscalls but left out the
  io_submit() and io_cancel() syscalls.  This patch corrects this by
  adding the two missing asynchronous I/O syscalls.
  
  Signed-off-by: Paul Moore pmo...@redhat.com
  
  A gentle nudge so this fix doesn't get forgotten.
 
 Reviewed and tested.
 
 Reviewed-by: Eduardo Otubo ot...@linux.vnet.ibm.com

Any chance of merging this patch?

  ---
  
qemu-seccomp.c |2 ++
1 file changed, 2 insertions(+)
  
  diff --git a/qemu-seccomp.c b/qemu-seccomp.c
  index ca123bf..173d185 100644
  --- a/qemu-seccomp.c
  +++ b/qemu-seccomp.c
  @@ -33,6 +33,7 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[] = { { SCMP_SYS(socketcall), 250 },
  
#endif

{ SCMP_SYS(read), 249 },
  
  +{ SCMP_SYS(io_submit), 249 },
  
{ SCMP_SYS(brk), 248 },
{ SCMP_SYS(clone), 247 },
{ SCMP_SYS(mmap), 247 },
  
  @@ -231,6 +232,7 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[] = { { SCMP_SYS(recvmmsg), 241 },
  
{ SCMP_SYS(prlimit64), 241 },
{ SCMP_SYS(waitid), 241 },
  
  +{ SCMP_SYS(io_cancel), 241 },
  
{ SCMP_SYS(io_setup), 241 },
{ SCMP_SYS(io_destroy), 241 }

};
-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: add arch_prctl() to the syscall whitelist

2013-07-29 Thread Paul Moore
On Wednesday, July 24, 2013 03:01:57 PM Eduardo Otubo wrote:
 On 07/23/2013 10:57 AM, Paul Moore wrote:
  On Thursday, July 18, 2013 09:57:03 AM Paul Moore wrote:
  It appears that even a very simple /etc/qemu-ifup configuration can
  
  require the arch_prctl() syscall, see the example below:
 #!/bin/sh
 /sbin/ifconfig $1 0.0.0.0 up
 /usr/sbin/brctl addif switch $1
  
  Signed-off-by: Paul Moore pmo...@redhat.com
  
  As with the other fix, a gentle nudge so this isn't forgotten.
 
 Reviewed and tested.
 
 Reviewed-by: Eduardo Otubo ot...@linux.vnet.ibm.com

Any chance of merging this patch?

  ---
  
qemu-seccomp.c |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
  
  diff --git a/qemu-seccomp.c b/qemu-seccomp.c
  index 173d185..9e91c73 100644
  --- a/qemu-seccomp.c
  +++ b/qemu-seccomp.c
  @@ -234,7 +234,8 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[] = { { SCMP_SYS(waitid), 241 },
  
{ SCMP_SYS(io_cancel), 241 },
{ SCMP_SYS(io_setup), 241 },
  
  -{ SCMP_SYS(io_destroy), 241 }
  +{ SCMP_SYS(io_destroy), 241 },
  +{ SCMP_SYS(arch_prctl), 240 }
  
};

int seccomp_start(void)
-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: add additional asynchronous I/O syscalls

2013-07-23 Thread Paul Moore
On Monday, July 15, 2013 03:32:01 PM Paul Moore wrote:
 A previous commit, seccomp: add the asynchronous I/O syscalls to the
 whitelist, added several asynchronous I/O syscalls but left out the
 io_submit() and io_cancel() syscalls.  This patch corrects this by
 adding the two missing asynchronous I/O syscalls.
 
 Signed-off-by: Paul Moore pmo...@redhat.com

A gentle nudge so this fix doesn't get forgotten.

 ---
  qemu-seccomp.c |2 ++
  1 file changed, 2 insertions(+)
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index ca123bf..173d185 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -33,6 +33,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[]
 = { { SCMP_SYS(socketcall), 250 },
  #endif
  { SCMP_SYS(read), 249 },
 +{ SCMP_SYS(io_submit), 249 },
  { SCMP_SYS(brk), 248 },
  { SCMP_SYS(clone), 247 },
  { SCMP_SYS(mmap), 247 },
 @@ -231,6 +232,7 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(recvmmsg), 241 },
  { SCMP_SYS(prlimit64), 241 },
  { SCMP_SYS(waitid), 241 },
 +{ SCMP_SYS(io_cancel), 241 },
  { SCMP_SYS(io_setup), 241 },
  { SCMP_SYS(io_destroy), 241 }
  };
-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: add arch_prctl() to the syscall whitelist

2013-07-23 Thread Paul Moore
On Thursday, July 18, 2013 09:57:03 AM Paul Moore wrote:
 It appears that even a very simple /etc/qemu-ifup configuration can
 require the arch_prctl() syscall, see the example below:
 
   #!/bin/sh
   /sbin/ifconfig $1 0.0.0.0 up
   /usr/sbin/brctl addif switch $1
 
 Signed-off-by: Paul Moore pmo...@redhat.com

As with the other fix, a gentle nudge so this isn't forgotten.

 ---
  qemu-seccomp.c |3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index 173d185..9e91c73 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -234,7 +234,8 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(waitid), 241 },
  { SCMP_SYS(io_cancel), 241 },
  { SCMP_SYS(io_setup), 241 },
 -{ SCMP_SYS(io_destroy), 241 }
 +{ SCMP_SYS(io_destroy), 241 },
 +{ SCMP_SYS(arch_prctl), 240 }
  };
 
  int seccomp_start(void)
-- 
paul moore
security and virtualization @ redhat




[Qemu-devel] [PATCH] seccomp: add arch_prctl() to the syscall whitelist

2013-07-18 Thread Paul Moore
It appears that even a very simple /etc/qemu-ifup configuration can
require the arch_prctl() syscall, see the example below:

#!/bin/sh
/sbin/ifconfig $1 0.0.0.0 up
/usr/sbin/brctl addif switch $1

Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index 173d185..9e91c73 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -234,7 +234,8 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(waitid), 241 },
 { SCMP_SYS(io_cancel), 241 },
 { SCMP_SYS(io_setup), 241 },
-{ SCMP_SYS(io_destroy), 241 }
+{ SCMP_SYS(io_destroy), 241 },
+{ SCMP_SYS(arch_prctl), 240 }
 };
 
 int seccomp_start(void)




Re: [Qemu-devel] seccomp: remove unused syscalls - for 1.6

2013-07-18 Thread Paul Moore
On Thursday, July 18, 2013 06:37:15 PM Paolo Bonzini wrote:
 Il 18/07/2013 18:35, Eduardo Otubo ha scritto:
  On 07/18/2013 01:28 PM, Anthony Liguori wrote:
  Eduardo Otubo ot...@linux.vnet.ibm.com writes:
  Hello all,
  
  In this small patch series I basically:
  Cover letter should be marked [PATCH 0/2].  Otherwise it defeats
  filtering.
  
  Would like to see a Reviewed-by from someone before applying this.
  
  I'm running some tests with qemu  xen, I'll post a v3 by the end of
  the day. I'll format the cover letter in the correct way next time.
 
 I feel that, at some point, grep and code review must trump experiments...
 
 Paul, how did you guys handle this in other projects?

To the best of my knowledge QEMU currently stands alone with its complexity 
and use of seccomp filtering.  There are other applications, but they are 
either of the syscall sandboxing type where the users define the filters, or 
the rigid, smaller, well defined filter type.  QEMU is both large and has a 
huge number of options which affect the syscalls used.

At some point it would be nice to develop a mechanism to do some static 
analysis on a binary and its associated libraries to come up with a worst case 
filter (worst case because you might not want all the syscalls that a library 
uses, e.g. glibc).  Unfortunately, we don't have such a tool the moment - it's 
hard enough generating correct filters with a nice architecture agnostic 
manner :)

On the plus side, I think libseccomp is very close to being pretty much 
feature complete (excluding new architectures that may pop up, at present we 
are only x86, x86_64, x32, and ARM) so I'll be able to start turning some 
effort towards better tools and patches for existing applications.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] seccomp: remove unused syscalls - for 1.6

2013-07-18 Thread Paul Moore
On Thursday, July 18, 2013 08:48:10 PM Peter Maydell wrote:
 On 18 July 2013 20:39, Paul Moore pmo...@redhat.com wrote:
  On the plus side, I think libseccomp is very close to being pretty much
  feature complete (excluding new architectures that may pop up, at present
  we are only x86, x86_64, x32, and ARM)
 
 ...AArch64 ? :-)
 

Not yet, just 32-bit ARM EABI.

If you've got a working system and are willing to so some hacking or run some 
tests we could work on it for a future libseccomp release.  An emulated 
AArch64 VM would also work, but that route can be slow/annoying.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] seccomp: remove unused syscalls - for 1.6

2013-07-18 Thread Paul Moore
On Thursday, July 18, 2013 10:31:46 PM Peter Maydell wrote:
 On 18 July 2013 21:05, Paul Moore pmo...@redhat.com wrote:
  On Thursday, July 18, 2013 08:48:10 PM Peter Maydell wrote:
  On 18 July 2013 20:39, Paul Moore pmo...@redhat.com wrote:
   On the plus side, I think libseccomp is very close to being pretty much
   feature complete (excluding new architectures that may pop up, at
   present
   we are only x86, x86_64, x32, and ARM)
  
  ...AArch64 ? :-)
  
  Not yet, just 32-bit ARM EABI.
  
  If you've got a working system and are willing to so some hacking or run
  some tests we could work on it for a future libseccomp release.  An
  emulated AArch64 VM would also work, but that route can be slow/annoying.
 
 Simulators are all we have right now (we're juuust getting to the
 point where hardware is starting to become available). I wasn't
 being serious really, though I'm sure somebody (possibly even
 somebody at Red Hat :-)) will work around to it at some point.

Regardless, consider it a standing offer.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH 1/2] seccomp: no need to check arch in syscall whitelist

2013-07-15 Thread Paul Moore
On Monday, July 15, 2013 02:29:37 PM Eduardo Otubo wrote:
 Since libseccomp 2.0 there's no need to check the architecture type
 anymore.
 
 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
 ---
  qemu-seccomp.c | 13 -
  1 file changed, 13 deletions(-)

Good, this should make long term maintenance easier.  However, you should 
probably update the configure script to require libseccomp-2.0.0 or greater.  
Actually, since this is 1.6 material, I would make it dependent on 
libseccomp-2.1.0 as there are a number of improvements in that release and it 
has been out for a while now.

If you're feeling particularly adventurous, you could even enable the QEMU 
seccomp code for x32/ARM hosts with libseccomp-2.1.0 hosts :)
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index ca123bf..1d5fd71 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -26,12 +26,9 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(timer_gettime), 254 },
  { SCMP_SYS(futex), 253 },
  { SCMP_SYS(select), 252 },
 -#if defined(__x86_64__)
  { SCMP_SYS(recvfrom), 251 },
  { SCMP_SYS(sendto), 250 },
 -#elif defined(__i386__)
  { SCMP_SYS(socketcall), 250 },
 -#endif
  { SCMP_SYS(read), 249 },
  { SCMP_SYS(brk), 248 },
  { SCMP_SYS(clone), 247 },
 @@ -40,7 +37,6 @@ static const struct QemuSeccompSyscall seccomp_whitelist[]
 = { { SCMP_SYS(execve), 245 },
  { SCMP_SYS(open), 245 },
  { SCMP_SYS(ioctl), 245 },
 -#if defined(__x86_64__)
  { SCMP_SYS(socket), 245 },
  { SCMP_SYS(setsockopt), 245 },
  { SCMP_SYS(recvmsg), 245 },
 @@ -51,9 +47,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[]
 = { { SCMP_SYS(bind), 245 },
  { SCMP_SYS(listen), 245 },
  { SCMP_SYS(semget), 245 },
 -#elif defined(__i386__)
  { SCMP_SYS(ipc), 245 },
 -#endif
  { SCMP_SYS(gettimeofday), 245 },
  { SCMP_SYS(readlink), 245 },
  { SCMP_SYS(access), 245 },
 @@ -64,7 +58,6 @@ static const struct QemuSeccompSyscall seccomp_whitelist[]
 = { { SCMP_SYS(statfs), 245 },
  { SCMP_SYS(unlink), 245 },
  { SCMP_SYS(wait4), 245 },
 -#if defined(__i386__)
  { SCMP_SYS(fcntl64), 245 },
  { SCMP_SYS(fstat64), 245 },
  { SCMP_SYS(stat64), 245 },
 @@ -77,7 +70,6 @@ static const struct QemuSeccompSyscall seccomp_whitelist[]
 = { { SCMP_SYS(_llseek), 245 },
  { SCMP_SYS(mmap2), 245 },
  { SCMP_SYS(sigprocmask), 245 },
 -#endif
  { SCMP_SYS(sched_getparam), 245 },
  { SCMP_SYS(sched_getscheduler), 245 },
  { SCMP_SYS(fstat), 245 },
 @@ -145,9 +137,7 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(epoll_create), 242 },
  { SCMP_SYS(epoll_ctl), 242 },
  { SCMP_SYS(epoll_wait), 242 },
 -#if defined(__i386__)
  { SCMP_SYS(waitpid), 242 },
 -#elif defined(__x86_64__)
  { SCMP_SYS(getsockname), 242 },
  { SCMP_SYS(getpeername), 242 },
  { SCMP_SYS(accept4), 242 },
 @@ -159,7 +149,6 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(semtimedop), 241 },
  { SCMP_SYS(epoll_ctl_old), 241 },
  { SCMP_SYS(epoll_wait_old), 241 },
 -#endif
  { SCMP_SYS(epoll_pwait), 241 },
  { SCMP_SYS(epoll_create1), 241 },
  { SCMP_SYS(ppoll), 241 },
 @@ -174,7 +163,6 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(getresuid), 241 },
  { SCMP_SYS(getresgid), 241 },
  { SCMP_SYS(getgroups), 241 },
 -#if defined(__i386__)
  { SCMP_SYS(getresuid32), 241 },
  { SCMP_SYS(getresgid32), 241 },
  { SCMP_SYS(getgroups32), 241 },
 @@ -193,7 +181,6 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(lstat64), 241 },
  { SCMP_SYS(sendfile64), 241 },
  { SCMP_SYS(ugetrlimit), 241 },
 -#endif
  { SCMP_SYS(alarm), 241 },
  { SCMP_SYS(rt_sigsuspend), 241 },
  { SCMP_SYS(rt_sigqueueinfo), 241 },
-- 
paul moore
security and virtualization @ redhat




[Qemu-devel] [PATCH] seccomp: add additional asynchronous I/O syscalls

2013-07-15 Thread Paul Moore
A previous commit, seccomp: add the asynchronous I/O syscalls to the
whitelist, added several asynchronous I/O syscalls but left out the
io_submit() and io_cancel() syscalls.  This patch corrects this by
adding the two missing asynchronous I/O syscalls.

Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index ca123bf..173d185 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -33,6 +33,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] = {
 { SCMP_SYS(socketcall), 250 },
 #endif
 { SCMP_SYS(read), 249 },
+{ SCMP_SYS(io_submit), 249 },
 { SCMP_SYS(brk), 248 },
 { SCMP_SYS(clone), 247 },
 { SCMP_SYS(mmap), 247 },
@@ -231,6 +232,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(recvmmsg), 241 },
 { SCMP_SYS(prlimit64), 241 },
 { SCMP_SYS(waitid), 241 },
+{ SCMP_SYS(io_cancel), 241 },
 { SCMP_SYS(io_setup), 241 },
 { SCMP_SYS(io_destroy), 241 }
 };




Re: [Qemu-devel] [PATCH] seccomp: add the asynchronous I/O syscalls to the whitelist

2013-07-10 Thread Paul Moore
On Wednesday, May 29, 2013 04:30:01 PM Paul Moore wrote:
 In order to enable the asynchronous I/O functionality when using the
 seccomp sandbox we need to add the associated syscalls to the
 whitelist.
 
 Signed-off-by: Paul Moore pmo...@redhat.com
 ---
  qemu-seccomp.c |5 -
  1 file changed, 4 insertions(+), 1 deletion(-)
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index 031da1d..ca123bf 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -87,6 +87,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[]
 = { { SCMP_SYS(stat), 245 },
  { SCMP_SYS(uname), 245 },
  { SCMP_SYS(eventfd2), 245 },
 +{ SCMP_SYS(io_getevents), 245 },
  { SCMP_SYS(dup), 245 },
  { SCMP_SYS(dup2), 245 },
  { SCMP_SYS(dup3), 245 },
 @@ -229,7 +230,9 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(sendmmsg), 241 },
  { SCMP_SYS(recvmmsg), 241 },
  { SCMP_SYS(prlimit64), 241 },
 -{ SCMP_SYS(waitid), 241 }
 +{ SCMP_SYS(waitid), 241 },
 +{ SCMP_SYS(io_setup), 241 },
 +{ SCMP_SYS(io_destroy), 241 }
  };
 
  int seccomp_start(void)

Any reason this patch wasn't pulled in for 1.5.1?

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] seccomp: add the asynchronous I/O syscalls to the whitelist

2013-07-10 Thread Paul Moore
On Wednesday, July 10, 2013 10:02:55 PM Andreas Färber wrote:
 Am 10.07.2013 16:31, schrieb Paul Moore:
  On Wednesday, May 29, 2013 04:30:01 PM Paul Moore wrote:
  In order to enable the asynchronous I/O functionality when using the
  seccomp sandbox we need to add the associated syscalls to the
  whitelist.
  
  Signed-off-by: Paul Moore pmo...@redhat.com
  ---
  
   qemu-seccomp.c |5 -
   1 file changed, 4 insertions(+), 1 deletion(-)
  
  diff --git a/qemu-seccomp.c b/qemu-seccomp.c
  index 031da1d..ca123bf 100644
  --- a/qemu-seccomp.c
  +++ b/qemu-seccomp.c
  @@ -87,6 +87,7 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[] = { { SCMP_SYS(stat), 245 },
  
   { SCMP_SYS(uname), 245 },
   { SCMP_SYS(eventfd2), 245 },
  
  +{ SCMP_SYS(io_getevents), 245 },
  
   { SCMP_SYS(dup), 245 },
   { SCMP_SYS(dup2), 245 },
   { SCMP_SYS(dup3), 245 },
  
  @@ -229,7 +230,9 @@ static const struct QemuSeccompSyscall
  seccomp_whitelist[] = { { SCMP_SYS(sendmmsg), 241 },
  
   { SCMP_SYS(recvmmsg), 241 },
   { SCMP_SYS(prlimit64), 241 },
  
  -{ SCMP_SYS(waitid), 241 }
  +{ SCMP_SYS(waitid), 241 },
  +{ SCMP_SYS(io_setup), 241 },
  +{ SCMP_SYS(io_destroy), 241 }
  
   };
   
   int seccomp_start(void)
  
  Any reason this patch wasn't pulled in for 1.5.1?
 
 Yes: You forget to put a line Cc: qemu-sta...@nongnu.org into the commit
 message nor was it ever CC'ed while on the list. ;)

You learn something new everyday, thanks.

Can I assume this will make it into 1.5.2 now?

-- 
paul moore
security and virtualization @ redhat




[Qemu-devel] [PATCH] seccomp: add the asynchronous I/O syscalls to the whitelist

2013-05-29 Thread Paul Moore
In order to enable the asynchronous I/O functionality when using the
seccomp sandbox we need to add the associated syscalls to the
whitelist.

Signed-off-by: Paul Moore pmo...@redhat.com
---
 qemu-seccomp.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/qemu-seccomp.c b/qemu-seccomp.c
index 031da1d..ca123bf 100644
--- a/qemu-seccomp.c
+++ b/qemu-seccomp.c
@@ -87,6 +87,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] = {
 { SCMP_SYS(stat), 245 },
 { SCMP_SYS(uname), 245 },
 { SCMP_SYS(eventfd2), 245 },
+{ SCMP_SYS(io_getevents), 245 },
 { SCMP_SYS(dup), 245 },
 { SCMP_SYS(dup2), 245 },
 { SCMP_SYS(dup3), 245 },
@@ -229,7 +230,9 @@ static const struct QemuSeccompSyscall seccomp_whitelist[] 
= {
 { SCMP_SYS(sendmmsg), 241 },
 { SCMP_SYS(recvmmsg), 241 },
 { SCMP_SYS(prlimit64), 241 },
-{ SCMP_SYS(waitid), 241 }
+{ SCMP_SYS(waitid), 241 },
+{ SCMP_SYS(io_setup), 241 },
+{ SCMP_SYS(io_destroy), 241 }
 };
 
 int seccomp_start(void)




Re: [Qemu-devel] [RFC] Continuous work on sandboxing

2013-05-01 Thread Paul Moore
On Tuesday, April 30, 2013 04:28:54 PM Corey Bryant wrote:
 Just to be clear, I'm thinking you could launch guests in one of two
 different seccomp sandboxed environments:
 
 1) Using the existing and more permissive whitelist where every QEMU
 feature works:
 
 qemu-kvm -sandbox on,default

In general, I like the comma delimited list of sandbox filters/methods/etc. 
but I'm not sure we need to explicitly specify default, it seems like on 
would be sufficient.  It also preserved compatibility with what we have now.

 2) A more restricted whitelist environment that doesn't allow all QEMU
 features to work.  It would be limited to the whitelist in 1 and it
 would also deny things like execve(), open(), socket(), certain ioctl()
 parameters, and may only allow reads/writes to specifc fds, and/or block
 anything else that could be dangerous:
 
 qemu-kvm -sandbox on,restricted
 
 I'm just throwing these command line options and syscalls out there.
 And maybe it makes more sense for libvirt to pass the syscalls and
 parameters to QEMU so that libvirt can determine the parameters to
 restrict, like fd's the guest is allowed to read/write.
 
 Here's another thread where this was discussed:
 http://www.redhat.com/archives/libvir-list/2013-April/msg01501.html

Seems reasonable to me.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [RFC] Continuous work on sandboxing

2013-04-30 Thread Paul Moore
On Monday, April 29, 2013 05:52:10 PM Corey Bryant wrote:
 On 04/26/2013 05:07 PM, Paul Moore wrote:
  [snip]
  
  3. Debugging and/or learning mode - third party libraries still have the
  problem of interfering in the Qemu's signal mask. According to some
  previous discussions, perhaps patch all external libraries that mass up
  with this mask (spice, for example) is a way to solve it. But not sure
  if it worth the time spent. Would like to hear you guys.
  
  I think patching all the libraries is a losing battle, I think we need to
  pursue alternate debugging techniques.
 
 I agree.  It would be nice to have some sort of learning mode that
 reported all denied syscalls on a single run, but signal handlers
 doesn't seem like the right way.  Maybe we could improve on this
 approach, since it never gained traction: https://lkml.org/lkml/2013/1/7/313
 
 At least we can get a single denied syscall at a time today via the
 audit log that the kernel issues.  Eduardo, you may want to see if
 there's a good place to document that for QEMU so that people know where
 to look.

Lately I've been using the fact that the seccomp BPF filter result generates 
an audit log; it either dumps to syslog or the audit log (depending on your 
configuration) and seems to accomplish most of what we wanted with 
SECCOMP_RET_INFO.

I'm always open to new/better ideas, but this has been working reasonably well 
for me for the past few months.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [RFC] Continuous work on sandboxing

2013-04-29 Thread Paul Moore
On Monday, April 29, 2013 03:39:57 PM Eduardo Otubo wrote:
 On 04/26/2013 06:07 PM, Paul Moore wrote:
  On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
  Also, looking a bit further ahead, it might be interesting to look at
  removing some of the arch dependent stuff in qemu-seccomp.c.  The latest
  version of libseccomp should remove the need for many, if not all, of the
  arch specific #ifdefs and the next version of libseccomp will add support
  for x32 and ARM.

 Tell me more about this. You're saying I can remove the #ifdefs and keep
 the lines like { SCMP_SYS(getresuid32), 241 },  or address these
 syscalls in another way?

Yes.  If you are using libseccomp 2.x you shouldn't need any of the #ifdefs in 
the seccomp_whitelist[] variable like there are at present.  As long as you 
aren't using the *_exact() versions of the libseccomp APIs, which QEMU is not, 
the library will do the right thing for the current architecture: syscalls 
that don't exist will be ignored, those that need translation, e.g. socket() 
on x86, will be translated, and those that do exist normally will be handled 
normally.  Anything else I would consider a bug in libseccomp.

There is more to it if you are generating a seccomp filter to support multiple 
simultaneous architectures, e.g. x86 and x86_64, but that doesn't really apply 
with QEMU.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [RFC] Continuous work on sandboxing

2013-04-26 Thread Paul Moore
On Friday, April 26, 2013 03:39:33 PM Eduardo Otubo wrote:
 Hello folks,
 
 Resuming the sandboxing work, I'd  like to ask for comments on the
 ideias I have:
 
 1. Reduce whitelist to the optimal subset: Run various tests on Qemu
 with different configurations to reduce to the smallest syscall set
 possible; test and send a patch weekly (this is already being performed
 and a patch is on the way)

Is this hooked into a testing framework?  While it is always nice to have 
someone verify the correctness, having a simple tool/testsuite what can run 
through things on a regular basis is even better.

Also, looking a bit further ahead, it might be interesting to look at removing 
some of the arch dependent stuff in qemu-seccomp.c.  The latest version of 
libseccomp should remove the need for many, if not all, of the arch specific 
#ifdefs and the next version of libseccomp will add support for x32 and ARM.

 2. Introduce a second whitelist - the whitelist should be defined in
 libvirt and passed on to qemu or just pre defined in Qemu? Also remove
 execve() and avoid open() and socket() and its parameters ...

If I'm understanding you correctly, I think what you'll want is a second 
*blacklist*.  We talked about this previously; we currently have a single 
whitelist, and considering how seccomp works, you can really only further 
restrict things after you install a whitelist into the kernel (hence the 
blacklist).

 3. Debugging and/or learning mode - third party libraries still have the
 problem of interfering in the Qemu's signal mask. According to some
 previous discussions, perhaps patch all external libraries that mass up
 with this mask (spice, for example) is a way to solve it. But not sure
 if it worth the time spent. Would like to hear you guys.

I think patching all the libraries is a losing battle, I think we need to 
pursue alternate debugging techniques.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv5 1.3] seccomp: adding new syscalls (bugzilla 855162)

2012-11-29 Thread Paul Moore
On Thursday, November 29, 2012 01:56:41 PM Eduardo Otubo wrote:
 According to the bug 855162[0] - there's the need of adding new syscalls
 to the whitelist when using Qemu with Libvirt.
 
 [0] - https://bugzilla.redhat.com/show_bug.cgi?id=855162
 
 Reported-by: Paul Moore pmo...@redhat.com
 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
 Signed-off-by: Corey Bryant cor...@linux.vnet.ibm.com

Works for me on 64-bit x86 F17.

Tested-by: Paul Moore pmo...@redhat.com

 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index 64329a3..2a71d6f 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -26,8 +26,12 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(timer_gettime), 254 },
  { SCMP_SYS(futex), 253 },
  { SCMP_SYS(select), 252 },
 +#if defined(__x86_64__)
  { SCMP_SYS(recvfrom), 251 },
  { SCMP_SYS(sendto), 250 },
 +#elif defined(__i386__)
 +{ SCMP_SYS(socketcall), 250 },
 +#endif
  { SCMP_SYS(read), 249 },
  { SCMP_SYS(brk), 248 },
  { SCMP_SYS(clone), 247 },
 @@ -36,15 +40,30 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(execve), 245 },
  { SCMP_SYS(open), 245 },
  { SCMP_SYS(ioctl), 245 },
 +#if defined(__x86_64__)
 +{ SCMP_SYS(socket), 245 },
 +{ SCMP_SYS(setsockopt), 245 },
  { SCMP_SYS(recvmsg), 245 },
  { SCMP_SYS(sendmsg), 245 },
  { SCMP_SYS(accept), 245 },
  { SCMP_SYS(connect), 245 },
 +{ SCMP_SYS(socketpair), 245 },
 +{ SCMP_SYS(bind), 245 },
 +{ SCMP_SYS(listen), 245 },
 +{ SCMP_SYS(semget), 245 },
 +#elif defined(__i386__)
 +{ SCMP_SYS(ipc), 245 },
 +#endif
  { SCMP_SYS(gettimeofday), 245 },
  { SCMP_SYS(readlink), 245 },
  { SCMP_SYS(access), 245 },
  { SCMP_SYS(prctl), 245 },
  { SCMP_SYS(signalfd), 245 },
 +{ SCMP_SYS(getrlimit), 245 },
 +{ SCMP_SYS(set_tid_address), 245 },
 +{ SCMP_SYS(statfs), 245 },
 +{ SCMP_SYS(unlink), 245 },
 +{ SCMP_SYS(wait4), 245 },
  #if defined(__i386__)
  { SCMP_SYS(fcntl64), 245 },
  { SCMP_SYS(fstat64), 245 },
 @@ -56,30 +75,33 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(sigreturn), 245 },
  { SCMP_SYS(_newselect), 245 },
  { SCMP_SYS(_llseek), 245 },
 -{ SCMP_SYS(mmap2), 245},
 +{ SCMP_SYS(mmap2), 245 },
  { SCMP_SYS(sigprocmask), 245 },
 -#elif defined(__x86_64__)
 -{ SCMP_SYS(sched_getparam), 245},
 -{ SCMP_SYS(sched_getscheduler), 245},
 -{ SCMP_SYS(fstat), 245},
 -{ SCMP_SYS(clock_getres), 245},
 -{ SCMP_SYS(sched_get_priority_min), 245},
 -{ SCMP_SYS(sched_get_priority_max), 245},
 -{ SCMP_SYS(stat), 245},
 -{ SCMP_SYS(socket), 245},
 -{ SCMP_SYS(setsockopt), 245},
 -{ SCMP_SYS(uname), 245},
 -{ SCMP_SYS(semget), 245},
  #endif
 +{ SCMP_SYS(sched_getparam), 245 },
 +{ SCMP_SYS(sched_getscheduler), 245 },
 +{ SCMP_SYS(fstat), 245 },
 +{ SCMP_SYS(clock_getres), 245 },
 +{ SCMP_SYS(sched_get_priority_min), 245 },
 +{ SCMP_SYS(sched_get_priority_max), 245 },
 +{ SCMP_SYS(stat), 245 },
 +{ SCMP_SYS(uname), 245 },
  { SCMP_SYS(eventfd2), 245 },
  { SCMP_SYS(dup), 245 },
 +{ SCMP_SYS(dup2), 245 },
 +{ SCMP_SYS(dup3), 245 },
  { SCMP_SYS(gettid), 245 },
 +{ SCMP_SYS(getgid), 245 },
 +{ SCMP_SYS(getegid), 245 },
 +{ SCMP_SYS(getuid), 245 },
 +{ SCMP_SYS(geteuid), 245 },
  { SCMP_SYS(timer_create), 245 },
  { SCMP_SYS(exit), 245 },
  { SCMP_SYS(clock_gettime), 245 },
  { SCMP_SYS(time), 245 },
  { SCMP_SYS(restart_syscall), 245 },
  { SCMP_SYS(pwrite64), 245 },
 +{ SCMP_SYS(nanosleep), 245 },
  { SCMP_SYS(chown), 245 },
  { SCMP_SYS(openat), 245 },
  { SCMP_SYS(getdents), 245 },
 @@ -93,8 +115,6 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(lseek), 245 },
  { SCMP_SYS(pselect6), 245 },
  { SCMP_SYS(fork), 245 },
 -{ SCMP_SYS(bind), 245 },
 -{ SCMP_SYS(listen), 245 },
  { SCMP_SYS(eventfd), 245 },
  { SCMP_SYS(rt_sigprocmask), 245 },
  { SCMP_SYS(write), 244 },
 @@ -104,10 +124,112 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(pipe2), 242 },
  { SCMP_SYS(munmap), 242 },
  { SCMP_SYS(mremap), 242 },
 +{ SCMP_SYS(fdatasync), 242 },
 +{ SCMP_SYS(close), 242 },
 +{ SCMP_SYS(rt_sigpending), 242 },
 +{ SCMP_SYS(rt_sigtimedwait), 242 },
 +{ SCMP_SYS(readv), 242 },
 +{ SCMP_SYS(writev), 242 },
 +{ SCMP_SYS(preadv), 242 },
 +{ SCMP_SYS(pwritev), 242 },
 +{ SCMP_SYS(setrlimit), 242 },
 +{ SCMP_SYS(ftruncate), 242 },
 +{ SCMP_SYS(lstat), 242 },
 +{ SCMP_SYS(pipe), 242 },
 +{ SCMP_SYS(umask), 242 },
 +{ SCMP_SYS(chdir), 242 },
 +{ SCMP_SYS(setitimer), 242 },
 +{ SCMP_SYS(setsid), 242 },
 +{ SCMP_SYS(poll), 242 },
 +{ SCMP_SYS(epoll_create), 242 },
 +{ SCMP_SYS(epoll_ctl), 242 },
 +{ SCMP_SYS(epoll_wait

Re: [Qemu-devel] [PATCHv3 1/5] seccomp: adding new syscalls (bugzilla 855162)

2012-11-27 Thread Paul Moore
On Tuesday, November 27, 2012 11:11:32 AM Corey Bryant wrote:
 Thanks for the additional details.  They were very useful.  I was able
 to reproduce this when I manually built spice release 0.10.1, but not
 with the Fedora 0.10.1 package.  One difference I noticed is that the
 Fedora version wasn't logging info messages.
 
 Nonetheless, we'll send new patches soon.  It looks like the following
 were missing: epoll_create, epoll_wait, and epoll_ctl

Great, glad to hear my test system wasn't just being stubborn :)

I'll re-test as soon as I see the patches.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv3 1/5] seccomp: adding new syscalls (bugzilla 855162)

2012-11-26 Thread Paul Moore
On Monday, November 26, 2012 11:41:06 AM Corey Bryant wrote:
 On 11/21/2012 10:24 AM, Paul Moore wrote:
  On Wednesday, November 21, 2012 11:20:44 AM Eduardo Otubo wrote:
  Hello folks,
  
  Does anyone had a chance to take a look at this? We would like to get
  this into the 1.3 release.
  
  Thanks again :)
  
  I way a bit delayed due to travel, but I started playing with it a bit
  yesterday afternoon and unfortunately it still doesn't work for me (using
  the same test/reproducer I documented in the RH BZ).  I've tried running
  QEMU both via libvirt and the command line (using a libvirt derived
  command line).
  
  I'm applying the patches to the F17 QEMU 1.2 package; there is some minor
  fixup needed in the configure script but nothing major.
  
  What is further frustrating is that the debug code (patch 5/5) doesn't
  seem to output the problematic syscall.  I wanted to investigate this a
  bit more before responding, but with the holiday approaching
  (Thanksgiving in the US), I'm not sure how much progress I'll be able to
  make for the remainder of this week.  Sorry about that.
  
  If you have any further questions about how, or what, I'm testing, just
  ask.

 Paul, Is your host 32 or 64-bit?

64-bit

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv3 1/5] seccomp: adding new syscalls (bugzilla 855162)

2012-11-26 Thread Paul Moore
On Monday, November 26, 2012 02:59:21 PM Corey Bryant wrote:
 On 11/26/2012 12:08 PM, Paul Moore wrote:
  On Monday, November 26, 2012 11:41:06 AM Corey Bryant wrote:
  On 11/21/2012 10:24 AM, Paul Moore wrote:
  On Wednesday, November 21, 2012 11:20:44 AM Eduardo Otubo wrote:
  Hello folks,
  
  Does anyone had a chance to take a look at this? We would like to get
  this into the 1.3 release.
  
  Thanks again :)
  
  I way a bit delayed due to travel, but I started playing with it a bit
  yesterday afternoon and unfortunately it still doesn't work for me
  (using the same test/reproducer I documented in the RH BZ).  I've tried
  running QEMU both via libvirt and the command line (using a libvirt
  derived command line).
  
  I'm applying the patches to the F17 QEMU 1.2 package; there is some
  minor fixup needed in the configure script but nothing major.
  
  What is further frustrating is that the debug code (patch 5/5) doesn't
  seem to output the problematic syscall.  I wanted to investigate this a
  bit more before responding, but with the holiday approaching
  (Thanksgiving in the US), I'm not sure how much progress I'll be able to
  make for the remainder of this week.  Sorry about that.
  
  If you have any further questions about how, or what, I'm testing, just
  ask.
  
  Paul, Is your host 32 or 64-bit?
  
  64-bit
 
 I'm having trouble recreating this.  I'm running a Fedora 17 64-bit host
 and a Fedora 17 64-bit guest with domain XML that mirrors yours.
 
 Here's the domain XML I'm using and the resulting QEMU command line:
 
 Domain XML:   http://pastebin.com/DWa4RQ1Y
 Command line: http://pastebin.com/2QTWsUhP
 
 I'm running with QEMU commit 8db972cfa469b4e4afd9c65e54e796b83b5ce3a2
 which is 1.2.0 with: (a) just the first patch applied, as well as with
 (b) all of this patch series applied.
 
 Any thoughts on what could be different?

Like I said earlier, I'm running with the F17 QEMU 1.2 package, 
qemu-1.2.0-16.fc18 to be exact, with Eduardo's patches applied on top.

I'm currently testing another set of interim patches from Eduardo that was 
sent off-list for testing (you were CC'd); hopefully that will resolve the 
problem.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv3 1/5] seccomp: adding new syscalls (bugzilla 855162)

2012-11-26 Thread Paul Moore
On Monday, November 26, 2012 03:41:00 PM Paul Moore wrote:
 On Monday, November 26, 2012 02:59:21 PM Corey Bryant wrote:
  On 11/26/2012 12:08 PM, Paul Moore wrote:
   On Monday, November 26, 2012 11:41:06 AM Corey Bryant wrote:
   On 11/21/2012 10:24 AM, Paul Moore wrote:
   On Wednesday, November 21, 2012 11:20:44 AM Eduardo Otubo wrote:
   Hello folks,
   
   Does anyone had a chance to take a look at this? We would like to get
   this into the 1.3 release.
   
   Thanks again :)
   
   I way a bit delayed due to travel, but I started playing with it a bit
   yesterday afternoon and unfortunately it still doesn't work for me
   (using the same test/reproducer I documented in the RH BZ).  I've
   tried running QEMU both via libvirt and the command line (using a
   libvirt derived command line).
   
   I'm applying the patches to the F17 QEMU 1.2 package; there is some
   minor fixup needed in the configure script but nothing major.
   
   What is further frustrating is that the debug code (patch 5/5) doesn't
   seem to output the problematic syscall.  I wanted to investigate this
   a bit more before responding, but with the holiday approaching
   (Thanksgiving in the US), I'm not sure how much progress I'll be able
   to make for the remainder of this week.  Sorry about that.
   
   If you have any further questions about how, or what, I'm testing,
   just ask.
   
   Paul, Is your host 32 or 64-bit?
   
   64-bit
  
  I'm having trouble recreating this.  I'm running a Fedora 17 64-bit host
  and a Fedora 17 64-bit guest with domain XML that mirrors yours.
  
  Here's the domain XML I'm using and the resulting QEMU command line:
  
  Domain XML:   http://pastebin.com/DWa4RQ1Y
  Command line: http://pastebin.com/2QTWsUhP
  
  I'm running with QEMU commit 8db972cfa469b4e4afd9c65e54e796b83b5ce3a2
  which is 1.2.0 with: (a) just the first patch applied, as well as with
  (b) all of this patch series applied.
  
  Any thoughts on what could be different?
 
 Like I said earlier, I'm running with the F17 QEMU 1.2 package,
 qemu-1.2.0-16.fc18 to be exact, with Eduardo's patches applied on top.
 
 I'm currently testing another set of interim patches from Eduardo that was
 sent off-list for testing (you were CC'd); hopefully that will resolve the
 problem.

Unfortunately, the latest patches from Eduardo met with the same fate.  Here 
is more detailed information on my system (HP DL160 G5, F17, 64-bit):

# uname -r
3.6.7-4.fc17.x86_64

[NOTE: standard F17 kernel]

# rpm -qa | grep qemu
qemu-kvm-tools-1.2.0-16.pm5.fc17.x86_64
qemu-common-1.2.0-16.pm5.fc17.x86_64
qemu-kvm-1.2.0-16.pm5.fc17.x86_64
ipxe-roms-qemu-20120328-1.gitaac9718.fc17.noarch
qemu-img-1.2.0-16.pm5.fc17.x86_64
qemu-system-x86-1.2.0-16.pm5.fc17.x86_64

[NOTE: the 'pm5' is my designation indicating the patched version]

# ./qemu_seccomp.sh -sandbox off
char device redirected to /dev/pts/0
do_spice_init: starting 0.10.1
spice_server_add_interface: SPICE_INTERFACE_MIGRATION
spice_server_add_interface: SPICE_INTERFACE_KEYBOARD
spice_server_add_interface: SPICE_INTERFACE_MOUSE
spice_server_add_interface: SPICE_INTERFACE_QXL
red_worker_main: begin
display_channel_create: create display channel
cursor_channel_create: create cursor channel
spice_server_add_interface: SPICE_INTERFACE_PLAYBACK
spice_server_add_interface: SPICE_INTERFACE_RECORD
[NOTE: I hit Ctrl-C at this point]
qemu: terminating on signal 2
spice_server_remove_interface: remove SPICE_INTERFACE_PLAYBACK
spice_server_remove_interface: remove SPICE_INTERFACE_RECORD

# ./qemu_seccomp.sh -sandbox on
char device redirected to /dev/pts/0
do_spice_init: starting 0.10.1
spice_server_add_interface: SPICE_INTERFACE_MIGRATION
spice_server_add_interface: SPICE_INTERFACE_KEYBOARD
spice_server_add_interface: SPICE_INTERFACE_MOUSE
spice_server_add_interface: SPICE_INTERFACE_QXL
red_worker_main: begin
./qemu_seccomp.sh: line 28: 21085 Bad system call /usr/bin/qemu-kvm -S 
-M pc-0.14 -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -name f16-
test-1 -uuid 13c7da9b-a79a-0688-267a-8206136bc8d6 -nodefconfig -nodefaults -
chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/f16-
test-1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control 
-rtc base=utc -no-shutdown -device virtio-serial-pci,id=virtio-
serial0,bus=pci.0,addr=0x5 -device piix3-usb-
uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/libvirt/images/f16-
test-1.img,if=none,id=drive-virtio-disk0,format=raw -device virtio-blk-
pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-
disk0,bootindex=1 -netdev user,id=hostnet0 -device virtio-net-
pci,netdev=hostnet0,id=net0,mac=52:54:00:9a:9d:63,bus=pci.0,addr=0x3 -chardev 
pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev 
spicevmc,id=charchannel0,name=vdagent -device virtserialport,bus=virtio-
serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 -
device usb-tablet,id=input0 -spice port=5900,addr=127.0.0.1

Re: [Qemu-devel] [PATCHv3 1/5] seccomp: adding new syscalls (bugzilla 855162)

2012-11-21 Thread Paul Moore
On Wednesday, November 21, 2012 11:20:44 AM Eduardo Otubo wrote:
 Hello folks,
 
 Does anyone had a chance to take a look at this? We would like to get
 this into the 1.3 release.
 
 Thanks again :)

I way a bit delayed due to travel, but I started playing with it a bit 
yesterday afternoon and unfortunately it still doesn't work for me (using the 
same test/reproducer I documented in the RH BZ).  I've tried running QEMU both 
via libvirt and the command line (using a libvirt derived command line).

I'm applying the patches to the F17 QEMU 1.2 package; there is some minor 
fixup needed in the configure script but nothing major.

What is further frustrating is that the debug code (patch 5/5) doesn't seem to 
output the problematic syscall.  I wanted to investigate this a bit more 
before responding, but with the holiday approaching (Thanksgiving in the US), 
I'm not sure how much progress I'll be able to make for the remainder of this 
week.  Sorry about that.

If you have any further questions about how, or what, I'm testing, just ask.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 3/4] Support for double whitelist filters

2012-11-05 Thread Paul Moore
On Monday, November 05, 2012 09:39:46 AM Corey Bryant wrote:
 On 11/02/2012 06:14 PM, Paul Moore wrote:
  On Friday, November 02, 2012 06:00:29 PM Corey Bryant wrote:
  On 11/02/2012 05:29 PM, Paul Moore wrote:
  On Tuesday, October 23, 2012 03:55:31 AM Eduardo Otubo wrote:
  This patch includes a second whitelist right before the main loop. It's
  a smaller and more restricted whitelist, excluding execve() among many
  others.
  
  v2: * ctx changed to main_loop_ctx
  
* seccomp_on now inside ifdef
* open syscall added to the main_loop whitelist
  
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
  
  Unfortunately qemu.org seems to be down for me today so I can't grab the
  latest repo to review/verify this patch (some of my comments/assumptions
  below may be off) but I'm a little confused, hopefully you guys can help
  me out, read below ...
  
  The first call to seccomp_install_filter() will setup a whitelist for
  the
  syscalls that have been explicitly specified, all others will hit the
  default action TRAP/KILL.  The second call to seccomp_install_filter()
  will add a second whitelist for another set of explicitly specified
  syscalls, all others will hit the default action TRAP/KILL.
  
  That's correct.  The goal was to have a 2nd list that is a subset of the
  1st list, and also not include execve() in the 2nd list.  At this point
  though, since it's late in the release, we've expanded the 2nd list to
  be the same as the 1st with the exception of execve() not being in the
  2nd list.
  
  The problem occurs when the filters are executed in the kernel when a
  syscall is executed.  On each syscall the first filter will be executed
  and the action will either be ALLOW or TRAP/KILL, next the second filter
  will be executed and the action will either be ALLOW or TRAP/KILL; since
  the kernel always takes the most restrictive (lowest integer action
  value) action when multiple filters are specified, I think your double
  whitelist value is going to have some inherent problems.
  
  That's something I hadn't thought of.  But TRAP and KILL won't exist
  together in our whitelists, and our 2nd whitelist is a subset of the
  1st.  So do you think there would still be problems?
  
  It doesn't really matter if the default action is TRAP and/or KILL, the
  point is that if you use a second whitelist after an initial whitelist
  the effective seccomp filter is going to be only the syscalls you
  explicitly allowed in the second whitelist.  When using multiple seccomp
  filters on a process, all filters are executed for each syscall and the
  most restrictive action of all the filters is the action that the kernel
  takes.
  
  Don't get me wrong, I like the idea of progressively restricting QEMU, but
  if you are going to load multiple seccomp filters into the kernel, you
  almost certainly only want the first whitelist filter to be the union of
  all the seccomp filter you intend to load with all subsequent filters
  being blacklists which progressively remove syscalls which are allowed by
  the initial whitelist.
 
 That's what we're doing though.  The first whitelist is a union of all
 subsequent filters.  Of course there's only one subsequent filter at
 this point.  But the idea is to start out with a large whitelist for
 initialization and then tighten it up before the main loop when
 presumably less syscalls are needed.

Okay, that's good ... It still seems a bit odd to me, I think a whitelist 1st 
blacklist 2nd is a more intuitive and efficient solution but that may just be 
me.

 My concern is getting the two whitelists correct.  We keep uncovering
 new syscalls as we test.

Of course, this whole whitelist/blacklist discussion assumes the list of 
allowed syscalls is correct.

  I might suggest an initial, fairly permissive
  whitelist followed by a follow-on blacklist if you want to disable
  certain
  syscalls.
  
  I have to admit I'm nervous about this at this point in QEMU 1.3.  It's
  getting late in the cycle and we'd hoped to get this in earlier.  A more
  permissive whitelist is probably going to be the only way we'll
  successfully turn -sandbox on by default at this point in QEMU 1.3.
  
  Thats fine, I just wanted to point out that I think the multiple whitelist
  approach is going to have some inherent problems.
 
 Are you thinking there will be problems with the current two-whitelist
 approach, or are you thinking there would be problems in the future if
 we continued restricting the QEMU process with further whitelists?  If
 you mean the latter, then I understand your point since QEMU is a single
 process that requires a certain subset of syscalls.

I was originally concerned that you were structuring the whitelists 
incorrectly, but it sounds like that is not the case - that's good.

I'm still concerned that the double whitelist approach may result in bigger 
syscall filters than necessary but until we get a final-ish list there is no 
point worrying

Re: [Qemu-devel] [PATCHv2 1/4] Adding new syscalls (bugzilla 855162)

2012-11-02 Thread Paul Moore
On Friday, November 02, 2012 09:48:55 AM Corey Bryant wrote:
 On 11/01/2012 05:43 PM, Paul Moore wrote:
  On Tuesday, October 23, 2012 03:55:29 AM Eduardo Otubo wrote:
  According to the bug 855162[0] - there's the need of adding new syscalls
  to the whitelist whenn using Qemu with Libvirt.
  
  [0] - https://bugzilla.redhat.com/show_bug.cgi?id=855162
  
  v2: Adding new syscalls to the list: readlink, rt_sigpending, and
  
   rt_sigtimedwait
  
  Reported-by: Paul Moore pmo...@redhat.com
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
  ---
  
qemu-seccomp.c | 13 -
1 file changed, 12 insertions(+), 1 deletion(-)
  
  I had an opportunity to test this patchset on a F17 machine using QEMU 1.2
  and unfortunately it still fails.  I'm using a relatively basic guest
  configuration running F16, the details are documented in the RH BZ that
  Eduardo mentioned in the patch description.
 
 Paul, Here's the latest diff for the whitelist.  We're looking to get
 the patches out in the next few days after a bit more testing.

Okay, thanks for the updated list ... I'm rebuilding QEMU right now and I'll 
report back with the results later today.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 1/4] Adding new syscalls (bugzilla 855162)

2012-11-02 Thread Paul Moore
On Friday, November 02, 2012 12:29:37 AM Eduardo Otubo wrote:
 On Thu, Nov 01, 2012 at 05:43:03PM -0400, Paul Moore wrote:
  On Tuesday, October 23, 2012 03:55:29 AM Eduardo Otubo wrote:
   According to the bug 855162[0] - there's the need of adding new syscalls
   to the whitelist whenn using Qemu with Libvirt.
   
   [0] - https://bugzilla.redhat.com/show_bug.cgi?id=855162
   
   v2: Adding new syscalls to the list: readlink, rt_sigpending, and
   
   rt_sigtimedwait
   
   Reported-by: Paul Moore pmo...@redhat.com
   Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
   ---
   
qemu-seccomp.c | 13 -
1 file changed, 12 insertions(+), 1 deletion(-)
  
  I had an opportunity to test this patchset on a F17 machine using QEMU 1.2
  and unfortunately it still fails.  I'm using a relatively basic guest
  configuration running F16, the details are documented in the RH BZ that
  Eduardo mentioned in the patch description.
  
  Eduardo, I assume you are not able to reproduce this?
 
 Unfortunately no. But we have the v3  patchset coming soon with new
 syscalls and we're hoping to get this fixed. Thanks for the feedback
 Paul!

No problem, thanks for all your work on this patchset.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 1/4] Adding new syscalls (bugzilla 855162)

2012-11-02 Thread Paul Moore
On Friday, November 02, 2012 10:10:02 AM Paul Moore wrote:
 On Friday, November 02, 2012 09:48:55 AM Corey Bryant wrote:
  On 11/01/2012 05:43 PM, Paul Moore wrote:
   On Tuesday, October 23, 2012 03:55:29 AM Eduardo Otubo wrote:
   According to the bug 855162[0] - there's the need of adding new
   syscalls
   to the whitelist whenn using Qemu with Libvirt.
   
   [0] - https://bugzilla.redhat.com/show_bug.cgi?id=855162
   
   v2: Adding new syscalls to the list: readlink, rt_sigpending, and
   
rt_sigtimedwait
   
   Reported-by: Paul Moore pmo...@redhat.com
   Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
   ---
   
 qemu-seccomp.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)
   
   I had an opportunity to test this patchset on a F17 machine using QEMU
   1.2
   and unfortunately it still fails.  I'm using a relatively basic guest
   configuration running F16, the details are documented in the RH BZ that
   Eduardo mentioned in the patch description.
  
  Paul, Here's the latest diff for the whitelist.  We're looking to get
  the patches out in the next few days after a bit more testing.
 
 Okay, thanks for the updated list ... I'm rebuilding QEMU right now and I'll
 report back with the results later today.

Sadly, no luck, it still fails.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 1/4] Adding new syscalls (bugzilla 855162)

2012-11-02 Thread Paul Moore
On Friday, November 02, 2012 10:43:41 AM Corey Bryant wrote:
 On 11/02/2012 10:38 AM, Paul Moore wrote:
  On Friday, November 02, 2012 10:10:02 AM Paul Moore wrote:
  On Friday, November 02, 2012 09:48:55 AM Corey Bryant wrote:
  On 11/01/2012 05:43 PM, Paul Moore wrote:
  On Tuesday, October 23, 2012 03:55:29 AM Eduardo Otubo wrote:
  According to the bug 855162[0] - there's the need of adding new
  syscalls
  to the whitelist whenn using Qemu with Libvirt.
  
  [0] - https://bugzilla.redhat.com/show_bug.cgi?id=855162
  
  v2: Adding new syscalls to the list: readlink, rt_sigpending, and
  
rt_sigtimedwait
  
  Reported-by: Paul Moore pmo...@redhat.com
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
  ---
  
 qemu-seccomp.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)
  
  I had an opportunity to test this patchset on a F17 machine using QEMU
  1.2
  and unfortunately it still fails.  I'm using a relatively basic guest
  configuration running F16, the details are documented in the RH BZ that
  Eduardo mentioned in the patch description.
  
  Paul, Here's the latest diff for the whitelist.  We're looking to get
  the patches out in the next few days after a bit more testing.
  
  Okay, thanks for the updated list ... I'm rebuilding QEMU right now and
  I'll report back with the results later today.
  
  Sadly, no luck, it still fails.
 
 Hmm, let me send you the current patch set off-line, which includes
 debug support to write the failing syscall out.  If you don't mind could
 you try it out?

Sure, no problem.

On a related note, I think it would be a *really* good idea to also submit the 
debug code upstream, just in a disabled state by default.  You could either 
bracket it with #ifdefs or get fancy and allow it at runtime with '-sandbox 
debug' or something similar.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 3/4] Support for double whitelist filters

2012-11-02 Thread Paul Moore
On Tuesday, October 23, 2012 03:55:31 AM Eduardo Otubo wrote:
 This patch includes a second whitelist right before the main loop. It's
 a smaller and more restricted whitelist, excluding execve() among many
 others.
 
 v2: * ctx changed to main_loop_ctx
 * seccomp_on now inside ifdef
 * open syscall added to the main_loop whitelist
 
 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com

Unfortunately qemu.org seems to be down for me today so I can't grab the 
latest repo to review/verify this patch (some of my comments/assumptions below 
may be off) but I'm a little confused, hopefully you guys can help me out, 
read below ...

The first call to seccomp_install_filter() will setup a whitelist for the 
syscalls that have been explicitly specified, all others will hit the default 
action TRAP/KILL.  The second call to seccomp_install_filter() will add a 
second whitelist for another set of explicitly specified syscalls, all others 
will hit the default action TRAP/KILL.

The problem occurs when the filters are executed in the kernel when a syscall 
is executed.  On each syscall the first filter will be executed and the action 
will either be ALLOW or TRAP/KILL, next the second filter will be executed and 
the action will either be ALLOW or TRAP/KILL; since the kernel always takes 
the most restrictive (lowest integer action value) action when multiple 
filters are specified, I think your double whitelist value is going to have 
some inherent problems.  I might suggest an initial, fairly permissive 
whitelist followed by a follow-on blacklist if you want to disable certain 
syscalls.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 3/4] Support for double whitelist filters

2012-11-02 Thread Paul Moore
On Friday, November 02, 2012 06:00:29 PM Corey Bryant wrote:
 On 11/02/2012 05:29 PM, Paul Moore wrote:
  On Tuesday, October 23, 2012 03:55:31 AM Eduardo Otubo wrote:
  This patch includes a second whitelist right before the main loop. It's
  a smaller and more restricted whitelist, excluding execve() among many
  others.
  
  v2: * ctx changed to main_loop_ctx
  
   * seccomp_on now inside ifdef
   * open syscall added to the main_loop whitelist
  
  Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
  
  Unfortunately qemu.org seems to be down for me today so I can't grab the
  latest repo to review/verify this patch (some of my comments/assumptions
  below may be off) but I'm a little confused, hopefully you guys can help
  me out, read below ...
  
  The first call to seccomp_install_filter() will setup a whitelist for the
  syscalls that have been explicitly specified, all others will hit the
  default action TRAP/KILL.  The second call to seccomp_install_filter()
  will add a second whitelist for another set of explicitly specified
  syscalls, all others will hit the default action TRAP/KILL.
 
 That's correct.  The goal was to have a 2nd list that is a subset of the
 1st list, and also not include execve() in the 2nd list.  At this point
 though, since it's late in the release, we've expanded the 2nd list to
 be the same as the 1st with the exception of execve() not being in the
 2nd list.
 
  The problem occurs when the filters are executed in the kernel when a
  syscall is executed.  On each syscall the first filter will be executed
  and the action will either be ALLOW or TRAP/KILL, next the second filter
  will be executed and the action will either be ALLOW or TRAP/KILL; since
  the kernel always takes the most restrictive (lowest integer action
  value) action when multiple filters are specified, I think your double
  whitelist value is going to have some inherent problems.
 
 That's something I hadn't thought of.  But TRAP and KILL won't exist
 together in our whitelists, and our 2nd whitelist is a subset of the
 1st.  So do you think there would still be problems?

It doesn't really matter if the default action is TRAP and/or KILL, the point 
is that if you use a second whitelist after an initial whitelist the effective 
seccomp filter is going to be only the syscalls you explicitly allowed in the 
second whitelist.  When using multiple seccomp filters on a process, all 
filters are executed for each syscall and the most restrictive action of all 
the filters is the action that the kernel takes.

Don't get me wrong, I like the idea of progressively restricting QEMU, but if 
you are going to load multiple seccomp filters into the kernel, you almost 
certainly only want the first whitelist filter to be the union of all the 
seccomp filter you intend to load with all subsequent filters being blacklists 
which progressively remove syscalls which are allowed by the initial 
whitelist.

  I might suggest an initial, fairly permissive
  whitelist followed by a follow-on blacklist if you want to disable certain
  syscalls.
 
 I have to admit I'm nervous about this at this point in QEMU 1.3.  It's
 getting late in the cycle and we'd hoped to get this in earlier.  A more
 permissive whitelist is probably going to be the only way we'll
 successfully turn -sandbox on by default at this point in QEMU 1.3.

Thats fine, I just wanted to point out that I think the multiple whitelist 
approach is going to have some inherent problems.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCHv2 1/4] Adding new syscalls (bugzilla 855162)

2012-11-01 Thread Paul Moore
On Tuesday, October 23, 2012 03:55:29 AM Eduardo Otubo wrote:
 According to the bug 855162[0] - there's the need of adding new syscalls
 to the whitelist whenn using Qemu with Libvirt.
 
 [0] - https://bugzilla.redhat.com/show_bug.cgi?id=855162
 
 v2: Adding new syscalls to the list: readlink, rt_sigpending, and
 rt_sigtimedwait
 
 Reported-by: Paul Moore pmo...@redhat.com
 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
 ---
  qemu-seccomp.c | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

I had an opportunity to test this patchset on a F17 machine using QEMU 1.2 and 
unfortunately it still fails.  I'm using a relatively basic guest 
configuration running F16, the details are documented in the RH BZ that 
Eduardo mentioned in the patch description.

Eduardo, I assume you are not able to reproduce this?

 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index 64329a3..a7b33e2 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -45,6 +45,13 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(access), 245 },
  { SCMP_SYS(prctl), 245 },
  { SCMP_SYS(signalfd), 245 },
 +{ SCMP_SYS(getrlimit), 245 },
 +{ SCMP_SYS(set_tid_address), 245 },
 +{ SCMP_SYS(socketpair), 245 },
 +{ SCMP_SYS(statfs), 245 },
 +{ SCMP_SYS(unlink), 245 },
 +{ SCMP_SYS(wait4), 245 },
 +{ SCMP_SYS(getuid), 245 },
  #if defined(__i386__)
  { SCMP_SYS(fcntl64), 245 },
  { SCMP_SYS(fstat64), 245 },
 @@ -107,7 +114,11 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(getsockname), 242 },
  { SCMP_SYS(getpeername), 242 },
  { SCMP_SYS(fdatasync), 242 },
 -{ SCMP_SYS(close), 242 }
 +{ SCMP_SYS(close), 242 },
 +{ SCMP_SYS(accept4), 242 },
 +{ SCMP_SYS(readlink), 242 },
 +{ SCMP_SYS(rt_sigpending), 242 },
 +{ SCMP_SYS(rt_sigtimedwait), 242 }
  };
 
  int seccomp_start(void)
-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [libseccomp-discuss] [RFC] [PATCHv2 0/2] Sandboxing Qemu guests with Libseccomp

2012-10-29 Thread Paul Moore
On Monday, October 29, 2012 03:32:34 PM Daniel P. Berrange wrote:
 On Mon, Oct 29, 2012 at 11:11:15AM -0400, Corey Bryant wrote:
  Hi Paul,
  
  Do you know when Fedora packages will be available for libseccomp?
 
 They're already reviewed and in Fedora 18
 
   https://bugzilla.redhat.com/show_bug.cgi?id=830992

What Daniel said.  If you're asking about F17, I don't plan on adding 
libseccomp to releases prior to F18.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] New syscalls to the seccomp whitelist

2012-09-26 Thread Paul Moore
On Thursday, September 20, 2012 06:00:59 PM Eduardo Otubo wrote:
 Seccomp syscall whitelist updated after tests running qemu under
 libvirt ...

Hi Eduardo,

I know from our discussions offlist that you have an additional debugging 
patch to help identify missing syscalls, perhaps you could also submit that 
patch too?  I think we would want the debugging patch #ifdef'd out in normal 
use, but I think it might help the QEMU developers.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] New syscalls to the seccomp whitelist

2012-09-26 Thread Paul Moore
On Wednesday, September 26, 2012 01:24:54 PM Eduardo Otubo wrote:
 On Wed, Sep 26, 2012 at 11:14:29AM -0400, Paul Moore wrote:
  On Thursday, September 20, 2012 06:00:59 PM Eduardo Otubo wrote:
   Seccomp syscall whitelist updated after tests running qemu under
   libvirt ...
  
  Hi Eduardo,
  
  I know from our discussions offlist that you have an additional debugging
  patch to help identify missing syscalls, perhaps you could also submit
  that patch too?  I think we would want the debugging patch #ifdef'd out in
  normal use, but I think it might help the QEMU developers.
 
 That's surely a good thing Paul. I'll rebase my patches and send it
 right away.  Thanks!

Great, I think that will be a nice addition.

Also, FWIW, I don't think you need to rebase/resubmit the patch you already 
sent, just port your debugging patch to go on top.

-- 
paul moore
security and virtualization @ redhat




Re: [Qemu-devel] [PATCH] New syscalls to the seccomp whitelist

2012-09-26 Thread Paul Moore
On Friday, September 21, 2012 09:40:33 AM Eduardo Otubo wrote:
 Seccomp syscall whitelist updated after tests running qemu under
 libvirt. Reference to the bug -
 https://bugzilla.redhat.com/show_bug.cgi?id=855162

Unfortunately, this patch still does not work correctly for me using 
F17/libvirt and a F16 guest.  The QEMU instance starts, and the guest appears 
to be running, but I never see any of the BIOS POST messages on the console.

I'm attaching the guest's XML definition file if that helps.

 Signed-off-by: Eduardo Otubo ot...@linux.vnet.ibm.com
 ---
  qemu-seccomp.c | 21 -
  1 file changed, 20 insertions(+), 1 deletion(-)
 
 diff --git a/qemu-seccomp.c b/qemu-seccomp.c
 index 64329a3..4712338 100644
 --- a/qemu-seccomp.c
 +++ b/qemu-seccomp.c
 @@ -70,6 +70,7 @@ static const struct QemuSeccompSyscall seccomp_whitelist[]
 = { { SCMP_SYS(setsockopt), 245},
  { SCMP_SYS(uname), 245},
  { SCMP_SYS(semget), 245},
 +{ SCMP_SYS(accept4), 241 },
  #endif
  { SCMP_SYS(eventfd2), 245 },
  { SCMP_SYS(dup), 245 },
 @@ -107,7 +108,25 @@ static const struct QemuSeccompSyscall
 seccomp_whitelist[] = { { SCMP_SYS(getsockname), 242 },
  { SCMP_SYS(getpeername), 242 },
  { SCMP_SYS(fdatasync), 242 },
 -{ SCMP_SYS(close), 242 }
 +{ SCMP_SYS(close), 242 },
 +{ SCMP_SYS(unlink), 241 },
 +{ SCMP_SYS(statfs), 241 },
 +{ SCMP_SYS(getuid), 241 },
 +{ SCMP_SYS(ftruncate), 241 },
 +{ SCMP_SYS(getegid), 241 },
 +{ SCMP_SYS(geteuid), 241 },
 +{ SCMP_SYS(getgid), 241 },
 +{ SCMP_SYS(getrlimit), 241 },
 +{ SCMP_SYS(set_tid_address), 241 },
 +{ SCMP_SYS(socketpair), 241 },
 +{ SCMP_SYS(fstatfs), 241 },
 +{ SCMP_SYS(epoll_create), 241 },
 +{ SCMP_SYS(epoll_ctl), 241 },
 +{ SCMP_SYS(epoll_wait), 241 },
 +{ SCMP_SYS(pipe), 241 },
 +{ SCMP_SYS(poll), 241 },
 +{ SCMP_SYS(rt_sigpending), 241 },
 +{ SCMP_SYS(rt_sigtimedwait), 241 },
  };
 
  int seccomp_start(void)

-- 
paul moore
security and virtualization @ redhat


f16-test-1.xml
Description: XML document


Re: [Qemu-devel] [PATCH] fips: fix build on !Linux

2012-08-06 Thread Paul Moore
On Friday, August 03, 2012 06:31:38 PM Anthony Liguori wrote:
 Commit 0f66998 makes -enable-fips conditional on Linux hosts but then uses
 it unconditionally in vl.c.
 
 Fix this by moving the fips handling to os-posix.c and adding a condition.

Sorry for not catching this, thanks for the fix.

 Cc: Paul Moore pmo...@redhat.com
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com
 ---
  os-posix.c |5 +
  vl.c   |3 ---
  2 files changed, 5 insertions(+), 3 deletions(-)
 
 diff --git a/os-posix.c b/os-posix.c
 index daf3d6f..79fa228 100644
 --- a/os-posix.c
 +++ b/os-posix.c
 @@ -188,6 +188,11 @@ void os_parse_cmd_args(int index, const char *optarg)
  case QEMU_OPTION_daemonize:
  daemonize = 1;
  break;
 +#if defined(CONFIG_LINUX)
 +case QEMU_OPTION_enablefips:
 +fips_set_state(true);
 +break;
 +#endif
  }
  return;
  }
 diff --git a/vl.c b/vl.c
 index 8cda85f..6d2ce45 100644
 --- a/vl.c
 +++ b/vl.c
 @@ -3199,9 +3199,6 @@ int main(int argc, char **argv, char **envp)
  case QEMU_OPTION_qtest_log:
  qtest_log = optarg;
  break;
 -case QEMU_OPTION_enablefips:
 -fips_set_state(true);
 -break;
  default:
  os_parse_cmd_args(popt-index, optarg);
  }
-- 
paul moore
security and virtualization @ redhat




[Qemu-devel] [PATCH v5] vnc: disable VNC password authentication (security type 2) when in FIPS mode

2012-08-03 Thread Paul Moore
FIPS 140-2 requires disabling certain ciphers, including DES, which is used
by VNC to obscure passwords when they are sent over the network.  The
solution for FIPS users is to disable the use of VNC password auth when the
host system is operating in FIPS compliance mode and the user has specified
'-enable-fips' on the QEMU command line.

This patch causes QEMU to emit a message to stderr when the host system is
running in FIPS mode and a VNC password was specified on the commend line.
If the system is not running in FIPS mode, or is running in FIPS mode but
VNC password authentication was not requested, QEMU operates normally.

Signed-off-by: Paul Moore pmo...@redhat.com

--
Changelog
* v5
- Added the '-enable-fips' command line option
* v4
- Removed the use of syslog
* v3
- Use fgetc() instead of fgets() in fips_enabled
- Only emit a syslog message if the caller tries to use VNC password auth
- Suggest alternative auth methods in the stderr notice
* v2
- Protected syslog with _WIN32
- Protected the guts of fips_enabled() with __linux__
- Converted fips_enabled() and the fips flag from int to bool
*v1
- Initial draft
---
 osdep.c |   29 +
 osdep.h |4 
 qemu-doc.texi   |8 +---
 qemu-options.hx |   11 +++
 ui/vnc.c|   10 ++
 vl.c|4 
 6 files changed, 63 insertions(+), 3 deletions(-)

diff --git a/osdep.c b/osdep.c
index 03817f0..c07faf5 100644
--- a/osdep.c
+++ b/osdep.c
@@ -24,6 +24,7 @@
 #include stdlib.h
 #include stdio.h
 #include stdarg.h
+#include stdbool.h
 #include string.h
 #include errno.h
 #include unistd.h
@@ -48,6 +49,8 @@ extern int madvise(caddr_t, size_t, int);
 #include trace.h
 #include qemu_socket.h
 
+static bool fips_enabled = false;
+
 static const char *qemu_version = QEMU_VERSION;
 
 int socket_set_cork(int fd, int v)
@@ -253,3 +256,29 @@ const char *qemu_get_version(void)
 {
 return qemu_version;
 }
+
+void fips_set_state(bool requested)
+{
+#ifdef __linux__
+if (requested) {
+FILE *fds = fopen(/proc/sys/crypto/fips_enabled, r);
+if (fds != NULL) {
+fips_enabled = (fgetc(fds) == '1');
+fclose(fds);
+}
+}
+#else
+fips_enabled = false;
+#endif /* __linux__ */
+
+#ifdef _FIPS_DEBUG
+fprintf(stderr, FIPS mode %s (requested %s)\n,
+   (fips_enabled ? enabled : disabled),
+   (requested ? enabled : disabled));
+#endif
+}
+
+bool fips_get_state(void)
+{
+return fips_enabled;
+}
diff --git a/osdep.h b/osdep.h
index 1e15a4b..d4b887d 100644
--- a/osdep.h
+++ b/osdep.h
@@ -3,6 +3,7 @@
 
 #include stdarg.h
 #include stddef.h
+#include stdbool.h
 #ifdef __OpenBSD__
 #include sys/types.h
 #include sys/signal.h
@@ -154,4 +155,7 @@ void qemu_set_cloexec(int fd);
 void qemu_set_version(const char *);
 const char *qemu_get_version(void);
 
+void fips_set_state(bool requested);
+bool fips_get_state(void);
+
 #endif
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 84dad19..f482fed 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -1124,9 +1124,11 @@ the protocol limits passwords to 8 characters it should 
not be considered
 to provide high security. The password can be fairly easily brute-forced by
 a client making repeat connections. For this reason, a VNC server using 
password
 authentication should be restricted to only listen on the loopback interface
-or UNIX domain sockets. Password authentication is requested with the 
@code{password}
-option, and then once QEMU is running the password is set with the monitor. 
Until
-the monitor is used to set the password all clients will be rejected.
+or UNIX domain sockets. Password authentication is not supported when operating
+in FIPS 140-2 compliance mode as it requires the use of the DES cipher. 
Password
+authentication is requested with the @code{password} option, and then once QEMU
+is running the password is set with the monitor. Until the monitor is used to
+set the password all clients will be rejected.
 
 @example
 qemu-system-i386 [...OPTIONS...] -vnc :1,password -monitor stdio
diff --git a/qemu-options.hx b/qemu-options.hx
index dc68e15..1f114ad 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -2783,6 +2783,17 @@ DEF(qtest-log, HAS_ARG, QEMU_OPTION_qtest_log,
 -qtest-log LOG  specify tracing options\n,
 QEMU_ARCH_ALL)
 
+#ifdef __linux__
+DEF(enable-fips, 0, QEMU_OPTION_enablefips,
+-enable-fipsenable FIPS 140-2 compliance\n,
+QEMU_ARCH_ALL)
+#endif
+STEXI
+@item -enable-fips
+@findex -enable-fips
+Enable FIPS 140-2 compliance mode.
+ETEXI
+
 HXCOMM This is the last statement. Insert new options before this line!
 STEXI
 @end table
diff --git a/ui/vnc.c b/ui/vnc.c
index cfc61a7..312ad7f 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -32,6 +32,7 @@
 #include acl.h
 #include qemu-objects.h
 #include qmp-commands.h
+#include osdep.h
 
 #define VNC_REFRESH_INTERVAL_BASE 30
 #define VNC_REFRESH_INTERVAL_INC  50
@@ -2875,6 +2876,15 @@ int

Re: [Qemu-devel] [PATCH v4] vnc: disable VNC password authentication (security type 2) when in FIPS mode

2012-07-31 Thread Paul Moore
On Friday, June 08, 2012 05:38:12 PM Paul Moore wrote:
 FIPS 140-2 requires disabling certain ciphers, including DES, which is used
 by VNC to obscure passwords when they are sent over the network.  The
 solution for FIPS users is to disable the use of VNC password auth when the
 host system is operating in FIPS mode.
 
 This patch causes QEMU to emit a message to stderr when the host system is
 running in FIPS mode and a VNC password was specified on the commend line.
 If the system is not running in FIPS mode, or is running in FIPS mode but
 VNC password authentication was not requested, QEMU operates normally.
 
 Signed-off-by: Paul Moore pmo...@redhat.com

Hi Anthony,

Any word on this patch?  Other than Daniel Berrange's reviewed-by tag, the 
discussion of the v4 patch has been quiet and I think we addressed all the 
other remaining issues in the discussion attached to the v2 patch posting.

-Paul

 --
 Changelog
 * v4
 - Removed the use of syslog
 * v3
 - Use fgetc() instead of fgets() in fips_enabled
 - Only emit a syslog message if the caller tries to use VNC password auth
 - Suggest alternative auth methods in the stderr notice
 * v2
 - Protected syslog with _WIN32
 - Protected the guts of fips_enabled() with __linux__
 - Converted fips_enabled() and the fips flag from int to bool
 *v1
 - Initial draft
 ---
  qemu-doc.texi |8 +---
  ui/vnc.c  |   27 +++
  ui/vnc.h  |1 +
  3 files changed, 33 insertions(+), 3 deletions(-)
 
 diff --git a/qemu-doc.texi b/qemu-doc.texi
 index 0af0ff4..fe8d3df 100644
 --- a/qemu-doc.texi
 +++ b/qemu-doc.texi
 @@ -1124,9 +1124,11 @@ the protocol limits passwords to 8 characters it
 should not be considered to provide high security. The password can be
 fairly easily brute-forced by a client making repeat connections. For this
 reason, a VNC server using password authentication should be restricted to
 only listen on the loopback interface -or UNIX domain sockets. Password
 authentication is requested with the @code{password} -option, and then once
 QEMU is running the password is set with the monitor. Until -the monitor is
 used to set the password all clients will be rejected. +or UNIX domain
 sockets. Password authentication is not supported when operating +in FIPS
 140-2 compliance mode as it requires the use of the DES cipher. Password
 +authentication is requested with the @code{password} option, and then once
 QEMU +is running the password is set with the monitor. Until the monitor is
 used to +set the password all clients will be rejected.
 
  @example
  qemu-system-i386 [...OPTIONS...] -vnc :1,password -monitor stdio
 diff --git a/ui/vnc.c b/ui/vnc.c
 index 54bc5ad..4bd816d 100644
 --- a/ui/vnc.c
 +++ b/ui/vnc.c
 @@ -48,6 +48,21 @@ static DisplayChangeListener *dcl;
  static int vnc_cursor_define(VncState *vs);
  static void vnc_release_modifiers(VncState *vs);
 
 +static bool fips_enabled(void)
 +{
 +bool enabled = false;
 +
 +#ifdef __linux__
 +FILE *fds = fopen(/proc/sys/crypto/fips_enabled, r);
 +if (fds != NULL) {
 +enabled = (fgetc(fds) == '1');
 +fclose(fds);
 +}
 +#endif /* __linux__ */
 +
 +return enabled;
 +}
 +
  static void vnc_set_share_mode(VncState *vs, VncShareMode mode)
  {
  #ifdef _VNC_DEBUG
 @@ -2748,6 +2763,9 @@ void vnc_display_init(DisplayState *ds)
  dcl-idle = 1;
  vnc_display = vs;
 
 +vs-fips = fips_enabled();
 +VNC_DEBUG(FIPS mode %s\n, (vs-fips ? enabled : disabled));
 +
  vs-lsock = -1;
 
  vs-ds = ds;
 @@ -2896,6 +2914,15 @@ int vnc_display_open(DisplayState *ds, const char
 *display) while ((options = strchr(options, ','))) {
  options++;
  if (strncmp(options, password, 8) == 0) {
 +if (vs-fips) {
 +fprintf(stderr,
 +VNC password auth disabled due to FIPS mode, 
 +consider using the VeNCrypt or SASL authentication
  +methods as an alternative\n);
 +g_free(vs-display);
 +vs-display = NULL;
 +return -1;
 +}
  password = 1; /* Require password auth */
  } else if (strncmp(options, reverse, 7) == 0) {
  reverse = 1;
 diff --git a/ui/vnc.h b/ui/vnc.h
 index a851ebd..d41631b 100644
 --- a/ui/vnc.h
 +++ b/ui/vnc.h
 @@ -160,6 +160,7 @@ struct VncDisplay
  char *display;
  char *password;
  time_t expires;
 +bool fips;
  int auth;
  bool lossy;
  bool non_adaptive;
-- 
paul moore
security and virtualization @ redhat




  1   2   >