Re: fsl_udc_core: BUG: scheduling while atomic

2011-05-13 Thread Sergej.Stepanov
I would say it is a general problem with using CONFIG_PREEMPT_VOLUNTARY,
not only on Freescale...

On Thursday, 12.05.2011, at 11:30 -0400, Matthew L. Creech wrote:
 On Thu, May 12, 2011 at 4:37 AM,  sergej.stepa...@ids.de wrote:
  Hi Matthew,
 
  you can also get such an oops with SPI.
  For such a problem it helps to compile your kernel with a different
  preemption model:
   - preempt
   - standard
   - !!! but not voluntary preemption !!!
 
 Thanks Sergej, indeed I'm currently using CONFIG_PREEMPT_VOLUNTARY on
 this board.  I'll change it to fix this problem for now.
 
 Do you happen to know whether the Freescale folks intend to fix this?
 If not, it seems like at least some sort of warning is in order.
 
 -- 
 Matthew L. Creech
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


RE: [linuxppc-release] [PATCH 1/2] powerpc, e5500: add networking to defconfig

2011-05-13 Thread Li Yang-R58472

On Thu, 12 May 2011 10:31:08 -0500
Scott Wood scottw...@freescale.com wrote:

 On Thu, 12 May 2011 01:11:03 -0500
 Li Yang-R58472 r58...@freescale.com wrote:

  diff --git a/arch/powerpc/configs/e55xx_smp_defconfig b/arch/powerpc/configs/e55xx_smp_defconfig
  index 9fa1613..f4c5780 100644
  --- a/arch/powerpc/configs/e55xx_smp_defconfig
  +++ b/arch/powerpc/configs/e55xx_smp_defconfig
  @@ -6,10 +6,10 @@ CONFIG_NR_CPUS=2
   CONFIG_EXPERIMENTAL=y
   CONFIG_SYSVIPC=y
   CONFIG_BSD_PROCESS_ACCT=y
  +CONFIG_SPARSE_IRQ=y
 
  Hi Scott,
 
  I remember in previous testing that this option has a negative effect
  on performance.  Do we really need it to be enabled?

 I didn't change this setting, it just moved due to running it through
 savedefconfig.

What was the performance impact?

It adds CPU cycles to the interrupt handling path.  It will cause a
performance drop in benchmarks with a large number of interrupts, such as
IP forwarding.

- Leo



[PATCH] Fix for Pegasos keyboard and mouse

2011-05-13 Thread Gabriel Paubert
[See http://lists.ozlabs.org/pipermail/linuxppc-dev/2010-October/086424.html
and followups. Part of the commit message is directly copied from that.]

Commit 540c6c392f01887dcc96bef0a41e63e6c1334f01 tries to find i8042 IRQs in
the device-tree but doesn't fall back to the old hardcoded 1 and 12 in all
failure cases.

Specifically, the case where the device-tree contains nothing matching
pnpPNP,303 or pnpPNP,f03 doesn't seem to be handled well. It sort of falls
through to the old code, but leaves the IRQs set to 0.

Signed-off-by: Gabriel Paubert paub...@iram.es

---

This fix has only been tested on Pegasos, but to my knowledge it only
affects a Pegasos-specific path (all other firmwares should be able
to find the keyboard through the pnp identifiers).

diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 21f30cb..6c7abbf 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -602,6 +602,10 @@ int check_legacy_ioport(unsigned long base_port)
 		 * name instead */
 		if (!np)
 			np = of_find_node_by_name(NULL, "8042");
+		if (np) {
+			of_i8042_kbd_irq = 1;
+			of_i8042_aux_irq = 12;
+		}
 		break;
 	case FDC_BASE: /* FDC1 */
 		np = of_find_node_by_type(NULL, "fdc");


Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Ingo Molnar

* Peter Zijlstra pet...@infradead.org wrote:

 On Fri, 2011-05-13 at 14:10 +0200, Ingo Molnar wrote:
  err = event_vfs_getname(result);
 
 I really think we should not do this. Events like we have them should be 
 inactive, totally passive entities, only observe but not affect execution 
 (other than the bare minimal time delay introduced by observance).

Well, this patchset already demonstrates that we can use a single event 
callback for a rather useful purpose.

Either it makes sense to do, in which case we should share facilities as much 
as possible, or it makes no sense, in which case we should not merge it at all.

 If you want another entity that is more active, please invent a new name for 
 it and create a new subsystem for them, now you could have these active 
 entities also have an (automatic) passive event side, but that's some detail.

Why should we have two callbacks next to each other:

event_vfs_getname(result);
result = check_event_vfs_getname(result);

if one could do it all?

Thanks,

Ingo


Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Peter Zijlstra
On Fri, 2011-05-13 at 14:26 +0200, Ingo Molnar wrote:
 * Peter Zijlstra pet...@infradead.org wrote:
 
  On Fri, 2011-05-13 at 14:10 +0200, Ingo Molnar wrote:
   err = event_vfs_getname(result);
  
  I really think we should not do this. Events like we have them should be 
  inactive, totally passive entities, only observe but not affect execution 
  (other than the bare minimal time delay introduced by observance).
 
 Well, this patchset already demonstrates that we can use a single event 
 callback for a rather useful purpose.

Can and should are two distinct things.

 Either it makes sense to do, in which case we should share facilities as much 
 as possible, or it makes no sense, in which case we should not merge it at 
 all.

And I'm arguing we should _not_. Observing is radically different from
Affecting, at the very least the two things should have different
permission schemes. We should not confuse these two matters.

  If you want another entity that is more active, please invent a new name
  for it and create a new subsystem for them, now you could have these active
  entities also have an (automatic) passive event side, but that's some
  detail.
 
 Why should we have two callbacks next to each other:
 
   event_vfs_getname(result);
   result = check_event_vfs_getname(result);
 
 if one could do it all?

Did you actually read the bit where I said that check_event_* (although
I still think that name sucks) could imply a matching event_*?


Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Peter Zijlstra
On Fri, 2011-05-13 at 14:39 +0200, Peter Zijlstra wrote:
 
event_vfs_getname(result);
result = check_event_vfs_getname(result); 

Another fundamental difference is how to treat the callback chains for
these two.

Observers won't have a return value and are assumed to never fail,
therefore we can always call every entry on the callback list.

Active things otoh do have a return value, and thus we need to have
semantics that define what to do with that during callback iteration,
when to continue and when to break. Thus for active elements it's
impossible to guarantee all entries will indeed be called.
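
A minimal sketch of the two iteration semantics described above, in
user-space C. The type and function names (observer_fn, active_fn,
call_observers, call_active) and the sample callbacks are hypothetical
and are not taken from the patch set; the active chain here breaks on the
first nonzero result, which is only one of the possible policies.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback types for illustration only. */
typedef void (*observer_fn)(const char *name);
typedef int (*active_fn)(const char *name);

/* Observers have no return value, so every entry is always called. */
void call_observers(const observer_fn *list, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		list[i](name);
}

/*
 * Active callbacks return a value, so iteration needs a policy.  This
 * variant breaks on the first nonzero result; *called reports how many
 * entries actually ran.
 */
int call_active(const active_fn *list, size_t n, const char *name,
		size_t *called)
{
	*called = 0;
	for (size_t i = 0; i < n; i++) {
		int err = list[i](name);

		(*called)++;
		if (err)
			return err;
	}
	return 0;
}

/* Sample callbacks used to exercise the two chains. */
int observed;
void count_event(const char *name) { (void)name; observed++; }
int allow(const char *name) { (void)name; return 0; }
int deny(const char *name) { (void)name; return -EACCES; }
```

With a chain of allow, deny, allow, the third entry never runs, which is
the "cannot guarantee all entries will be called" property.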



Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Ingo Molnar

* Peter Zijlstra pet...@infradead.org wrote:

  Why should we have two callbacks next to each other:
  
  event_vfs_getname(result);
  result = check_event_vfs_getname(result);
  
  if one could do it all?
 
 Did you actually read the bit where I said that check_event_* (although
 I still think that name sucks) could imply a matching event_*?

No, did not notice that - and yes that solves this particular problem.

So given that by your own admission it makes sense to share the facilities at 
the low level, i also argue that it makes sense to share as high up as 
possible.

Are you perhaps arguing for a -observe flag that would make 100% sure that the 
default behavior for events is observe-only? That would make sense indeed.

Otherwise both cases really want to use all the same facilities for event 
discovery, setup, control and potential extraction of events.

Thanks,

Ingo


Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Ingo Molnar

* Peter Zijlstra pet...@infradead.org wrote:

 On Fri, 2011-05-13 at 14:39 +0200, Peter Zijlstra wrote:
  
 event_vfs_getname(result);
 result = check_event_vfs_getname(result); 
 
 Another fundamental difference is how to treat the callback chains for
 these two.
 
 Observers won't have a return value and are assumed to never fail,
 therefore we can always call every entry on the callback list.
 
 Active things otoh do have a return value, and thus we need to have
 semantics that define what to do with that during callback iteration,
 when to continue and when to break. Thus for active elements its
 impossible to guarantee all entries will indeed be called.

I think the sanest semantics is to run all active callbacks as well.

For example if this is used for three stacked security policies - as if 3 LSM 
modules were stacked at once. We'd call all three, and we'd determine that at 
least one failed - and we'd return a failure.

Even if the first one failed already we'd still want to trigger *all* the 
failures, because security policies like to know when they have triggered a 
failure (regardless of other active policies) and want to see that failure 
event (if they are logging such events).

So to me this looks pretty similar to observer callbacks as well, it's the 
natural extension to an observer callback chain.

Observer callbacks are simply constant functions (to the caller), those which 
never return failure and which never modify any of the parameters.

It's as if you argued that there should be separate syscalls/facilities for 
handling readonly files versus handling read/write files.

Thanks,

Ingo


Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Peter Zijlstra
On Fri, 2011-05-13 at 14:54 +0200, Ingo Molnar wrote:
 I think the sanest semantics is to run all active callbacks as well.
 
 For example if this is used for three stacked security policies - as if 3 LSM 
 modules were stacked at once. We'd call all three, and we'd determine that at 
 least one failed - and we'd return a failure. 

But that only works for boolean functions where you can return the
multi-bit-or of the result. What if you need to return the specific
error code?

Also, there's bound to be other cases where people will want to employ
this, look at all the various notifier chain muck we've got, it already
deals with much of this -- simply because users need it.

Then there's the whole indirection argument, if you don't need
indirection, it's often better to not use it, I myself much prefer code
to look like:

   foo1(bar);
   foo2(bar);
   foo3(bar);

Than:

   foo_notifier(bar);

Simply because it's much clearer who all are involved without me having
to grep around to see who registers for foo_notifier and wth they do
with it. It also makes it much harder to sneak in another user, whereas
it's nearly impossible to find new notifier users.

It's also much faster, no extra memory accesses, no indirect function
calls, no other muck.
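
A minimal sketch contrasting the two styles, assuming three trivial
handlers. The bodies of foo1/foo2/foo3, the foo_register() helper and the
fixed-size table are hypothetical, made up for illustration; the kernel's
real notifier chains are more elaborate.

```c
#include <stddef.h>

/* Hypothetical handlers; the bodies are placeholders. */
void foo1(int *bar) { *bar += 1; }
void foo2(int *bar) { *bar *= 2; }
void foo3(int *bar) { *bar -= 3; }

/* Direct style: every participant is visible at the call site. */
void run_direct(int *bar)
{
	foo1(bar);
	foo2(bar);
	foo3(bar);
}

/* Notifier style: participants are registered at run time. */
#define MAX_NOTIFIERS 8
typedef void (*notifier_fn)(int *bar);
static notifier_fn notifiers[MAX_NOTIFIERS];
static size_t nr_notifiers;

int foo_register(notifier_fn fn)
{
	if (nr_notifiers == MAX_NOTIFIERS)
		return -1;
	notifiers[nr_notifiers++] = fn;
	return 0;
}

/* Costs a table walk and one indirect call per registered entry. */
void foo_notifier(int *bar)
{
	for (size_t i = 0; i < nr_notifiers; i++)
		notifiers[i](bar);
}
```

Both paths compute the same result once foo1..foo3 are registered in
order; the difference is that the second form hides who is called behind
the registration table.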




Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Ingo Molnar

* Peter Zijlstra pet...@infradead.org wrote:

 On Fri, 2011-05-13 at 14:54 +0200, Ingo Molnar wrote:
  I think the sanest semantics is to run all active callbacks as well.
  
  For example if this is used for three stacked security policies - as if 3
  LSM modules were stacked at once. We'd call all three, and we'd determine
  that at least one failed - and we'd return a failure.
 
 But that only works for boolean functions where you can return the
 multi-bit-or of the result. What if you need to return the specific
 error code.

Do you mean that one filter returns -EINVAL while the other -EACCES?

Seems like a non-problem to me, we'd return the first nonzero value.
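
A minimal sketch of that rule: call every filter, remember the first
nonzero result, and return it at the end. The names run_all_filters,
filter_fn and the sample filters are hypothetical and only illustrate the
semantics being discussed.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical filter callback type. */
typedef int (*filter_fn)(const char *name);

/*
 * Run every filter even after one has failed, so each policy sees
 * (and can log) its own rejection.  The first nonzero value wins.
 */
int run_all_filters(const filter_fn *filters, size_t n, const char *name)
{
	int first_err = 0;

	for (size_t i = 0; i < n; i++) {
		int err = filters[i](name);

		if (err && !first_err)
			first_err = err;
	}
	return first_err;
}

/* Sample filters; filter_calls counts how many actually ran. */
int filter_calls;
int filter_ok(const char *name) { (void)name; filter_calls++; return 0; }
int filter_einval(const char *name) { (void)name; filter_calls++; return -EINVAL; }
int filter_eacces(const char *name) { (void)name; filter_calls++; return -EACCES; }
```

For the chain ok, einval, eacces all three filters run and the caller
sees -EINVAL.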

 Also, there's bound to be other cases where people will want to employ
 this, look at all the various notifier chain muck we've got, it already
 deals with much of this -- simply because users need it.

Do you mean it would be easy to abuse it? What kind of abuse are you most 
worried about?

 Then there's the whole indirection argument, if you don't need
 indirection, its often better to not use it, I myself much prefer code
 to look like:
 
foo1(bar);
foo2(bar);
foo3(bar);
 
 Than:
 
foo_notifier(bar);
 
 Simply because its much clearer who all are involved without me having
 to grep around to see who registers for foo_notifier and wth they do
 with it. It also makes it much harder to sneak in another user, whereas
 its nearly impossible to find new notifier users.
 
 Its also much faster, no extra memory accesses, no indirect function
 calls, no other muck.

But i suspect this question has been settled, given the fact that even pure 
observer events need and already process a chain of events? Am i missing 
something about your argument?

Thanks,

Ingo


Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Peter Zijlstra
On Fri, 2011-05-13 at 14:49 +0200, Ingo Molnar wrote:
 
 So given that by your own admission it makes sense to share the facilities at 
 the low level, i also argue that it makes sense to share as high up as 
 possible. 

I'm not saying any such thing, I'm saying that it might make sense to
observe active objects and auto-create these observation points. That
doesn't make them similar or make them share anything.




Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Peter Zijlstra
Cut the microblaze list since it's bouncy.

On Fri, 2011-05-13 at 15:18 +0200, Ingo Molnar wrote:
 * Peter Zijlstra pet...@infradead.org wrote:
 
  On Fri, 2011-05-13 at 14:54 +0200, Ingo Molnar wrote:
   I think the sanest semantics is to run all active callbacks as well.
   
   For example if this is used for three stacked security policies - as if 3
   LSM modules were stacked at once. We'd call all three, and we'd determine
   that at least one failed - and we'd return a failure.
  
  But that only works for boolean functions where you can return the
  multi-bit-or of the result. What if you need to return the specific
  error code.
 
 Do you mean that one filter returns -EINVAL while the other -EACCES?
 
 Seems like a non-problem to me, we'd return the first nonzero value.

Assuming the first is -EINVAL, what then is the value in computing the
-EACCES? Sounds like a massive waste of time to me.

  Also, there's bound to be other cases where people will want to employ
  this, look at all the various notifier chain muck we've got, it already
  deals with much of this -- simply because users need it.
 
 Do you mean it would be easy to abuse it? What kind of abuse are you most 
 worried about?

I'm not worried about abuse, I'm saying that going by the existing
notifier pattern always visiting all entries on the callback list is
undesired.

  Then there's the whole indirection argument, if you don't need
  indirection, its often better to not use it, I myself much prefer code
  to look like:
  
 foo1(bar);
 foo2(bar);
 foo3(bar);
  
  Than:
  
 foo_notifier(bar);
  
  Simply because its much clearer who all are involved without me having
  to grep around to see who registers for foo_notifier and wth they do
  with it. It also makes it much harder to sneak in another user, whereas
  its nearly impossible to find new notifier users.
  
  Its also much faster, no extra memory accesses, no indirect function
  calls, no other muck.
 
 But i suspect this question has been settled, given the fact that even pure 
 observer events need and already process a chain of events? Am i missing 
 something about your argument?

I'm saying that there are reasons to not use notifiers, passive or active.

Mostly the whole notifier/indirection muck comes up once you want
modules to make use of the thing, because then you need dynamic
management of the callback list.

(Then again, I'm fairly glad we don't have explicit callbacks in
kernel/cpu.c for all the cpu-hotplug callbacks :-)

Anyway, I oppose the existing events gaining an active role.


Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Ingo Molnar

* Peter Zijlstra pet...@infradead.org wrote:

 Cut the microblaze list since its bouncy.
 
 On Fri, 2011-05-13 at 15:18 +0200, Ingo Molnar wrote:
  * Peter Zijlstra pet...@infradead.org wrote:
  
   On Fri, 2011-05-13 at 14:54 +0200, Ingo Molnar wrote:
    I think the sanest semantics is to run all active callbacks as well.
    
    For example if this is used for three stacked security policies - as if
    3 LSM modules were stacked at once. We'd call all three, and we'd
    determine that at least one failed - and we'd return a failure.
   
   But that only works for boolean functions where you can return the
   multi-bit-or of the result. What if you need to return the specific
   error code.
  
  Do you mean that one filter returns -EINVAL while the other -EACCES?
  
  Seems like a non-problem to me, we'd return the first nonzero value.
 
 Assuming the first is -EINVAL, what then is the value in computing the
 -EACCES? Sounds like a massive waste of time to me.

No, because the common case is no rejection - this is a security mechanism. So 
in the normal case we would execute all 3 anyway, just to determine that all 
return 0.

Are you really worried about the abnormal case of one of them returning an 
error and us calculating all 3 return values?

   Also, there's bound to be other cases where people will want to employ
   this, look at all the various notifier chain muck we've got, it already
   deals with much of this -- simply because users need it.
  
  Do you mean it would be easy to abuse it? What kind of abuse are you most 
  worried about?
 
 I'm not worried about abuse, I'm saying that going by the existing
 notifier pattern always visiting all entries on the callback list is
 undesired.

That is because many notifier chains are used in an 'event consuming' manner - 
they are responding to things like hardware events and are called in an 
interrupt-handler-like fashion most of the time.

   Then there's the whole indirection argument, if you don't need
   indirection, its often better to not use it, I myself much prefer code
   to look like:
   
  foo1(bar);
  foo2(bar);
  foo3(bar);
   
   Than:
   
  foo_notifier(bar);
   
   Simply because its much clearer who all are involved without me having
   to grep around to see who registers for foo_notifier and wth they do
   with it. It also makes it much harder to sneak in another user, whereas
   its nearly impossible to find new notifier users.
   
   Its also much faster, no extra memory accesses, no indirect function
   calls, no other muck.
  
  But i suspect this question has been settled, given the fact that even pure 
  observer events need and already process a chain of events? Am i missing 
  something about your argument?
 
 I'm saying that there's reasons to not use notifiers passive or active.
 
 Mostly the whole notifier/indirection muck comes up once you want
 modules to make use of the thing, because then you need dynamic
 management of the callback list.

But your argument assumes that we'd have a chain of functions to call, like 
regular notifiers.

While the natural model here would be to have a list of registered event 
structs for that point, with different filters but basically the same callback 
mechanism (a call into the filter engine in essence).

Also note that the common case would be no event registered - and we'd 
automatically optimize that case via the existing jump labels optimization.

 (Then again, I'm fairly glad we don't have explicit callbacks in kernel/cpu.c 
 for all the cpu-hotplug callbacks :-)
 
 Anyway, I oppose for the existing events to gain an active role.

Why, if 'being active' is optional and useful?

Thanks,

Ingo


Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Ingo Molnar

* Peter Zijlstra pet...@infradead.org wrote:

 On Fri, 2011-05-13 at 14:49 +0200, Ingo Molnar wrote:
  
  So given that by your own admission it makes sense to share the facilities
  at the low level, i also argue that it makes sense to share as high up as
  possible.
 
 I'm not saying any such thing, I'm saying that it might make sense to
 observe active objects and auto-create these observation points. That
 doesn't make them similar or make them share anything.

Well, they would share the lowest level call site:

result = check_event_vfs_getname(result);

You call it 'auto-generated call site', i call it a shared (single line) call 
site. The same thing as far as the lowest level goes.

Now (the way i understood it) you'd want to stop the sharing right after that. 
I argue that it should go all the way up.

Note: i fully agree that there should be events where filters can have no 
effect whatsoever. For example if this was written as:

check_event_vfs_getname(result);

Then it would have no effect. This is decided by the subsystem developers, 
obviously. So whether an event is 'active' or 'passive' can be enforced at the 
subsystem level as well.

As far as the event facilities go, 'no effect observation' is a special-case of 
'active observation' - just like read-only files are a special case of 
read-write files.

Thanks,

Ingo


Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Eric Paris
[dropping microblaze and roland]

On Fri, 2011-05-13 at 14:10 +0200, Ingo Molnar wrote:
 * James Morris jmor...@namei.org wrote:

 It is a simple and sensible security feature, agreed? It allows most code to 
 run well and link to countless libraries - but no access to other files is 
 allowed.

It's simple enough and sounds reasonable, but you can read all the
discussion about AppArmor to see why many people don't really think it's
the best.  Still, I'll agree it's a lot better than nothing.

 But if i had a VFS event at the fs/namei.c::getname() level, i would have
 access to a central point where the VFS string becomes stable to the kernel
 and can be checked (and denied if necessary).
 
 A sidenote, and not surprisingly, the audit subsystem already has an event 
 callback there:
 
 audit_getname(result);
 
 Unfortunately this audit callback cannot be used for my purposes, because the 
 event is single-purpose for auditd and because it allows no feedback (no 
 deny/accept discretion for the security policy).
 
 But if i had this simple event there:
 
   err = event_vfs_getname(result);

Wow it sounds so easy.  Now let's keep extending your train of thought
until we can actually provide the security provided by SELinux.  What do
we end up with?  We end up with an event hook right next to every LSM
hook.  You know, the LSM hooks were placed where they are for a reason.
Because those were the locations inside the kernel where you actually
have information about the task doing an operation and the objects
(files, sockets, directories, other tasks, etc) they are doing an
operation on.

Honestly all you are talking about is remaking the LSM with 2 sets of
hooks instead of 1.  Why?  It seems much easier that if you want the
language of the filter engine you would just make a new LSM that uses
the filter engine for its policy language rather than the language
created by SELinux or SMACK or name your LSM implementation.

  - unprivileged:  application-definable, allowing the embedding of security
                   policy in *apps* as well, not just the system
 
  - flexible:      can be added/removed runtime unprivileged, and cheaply so
 
  - transparent:   does not impact executing code that meets the policy
 
  - nestable:      it is inherited by child tasks and is fundamentally
                   stackable, multiple policies will have the combined effect
                   and they are transparent to each other. So if a child task
                   within a sandbox adds *more* checks then those add to the
                   already existing set of checks. We only narrow permissions,
                   never extend them.
 
  - generic:       allowing observation and (safe) control of security
                   relevant parameters not just at the system call boundary
                   but at other relevant places of kernel execution as well:
                   which points/callbacks could also be used for other types
                   of event extraction such as perf. It could even be shared
                   with audit ...

I'm not arguing that any of these things are bad things.  What you
describe is a new LSM that uses a discretionary access control model but
with the granularity and flexibility that has traditionally only existed
in the mandatory access control security modules previously implemented
in the kernel.

I won't argue that's a bad idea, there's no reason in my mind that a
process shouldn't be allowed to control its own access decisions in a
more flexible way than rwx bits.  Then again, I certainly don't see a
reason that this syscall hardening patch should be held up while a whole
new concept in computer security is contemplated...

-Eric



Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Eric Paris
[dropping microblaze and roland]

On Fri, 2011-05-13 at 15:18 +0200, Ingo Molnar wrote:
 * Peter Zijlstra pet...@infradead.org wrote:
 
  On Fri, 2011-05-13 at 14:54 +0200, Ingo Molnar wrote:
   I think the sanest semantics is to run all active callbacks as well.
   
   For example if this is used for three stacked security policies - as if 3
   LSM modules were stacked at once. We'd call all three, and we'd determine
   that at least one failed - and we'd return a failure.
  
  But that only works for boolean functions where you can return the
  multi-bit-or of the result. What if you need to return the specific
  error code.
 
 Do you mean that one filter returns -EINVAL while the other -EACCES?
 
 Seems like a non-problem to me, we'd return the first nonzero value.

Sounds so easy!  Why haven't LSMs stacked already?  Because what happens
if one of these hooks did something stateful?  Let's say on open, hook #1
returns EPERM.  Hook #2 allocates memory.  The open is going to fail and
hook #2 is never going to get the close() which should have freed the
allocation.  If you can be completely stateless it's easier, but there's
a reason that stacking security modules is hard.  Serge has tried in the
past and both dhowells and casey schaufler are working on it right now.
Stacking is never as easy as it sounds   :)
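
A minimal user-space sketch of that failure mode. All names here
(hook1_open, hook2_open, hook2_close, stacked_open, live_allocations) are
hypothetical; the second function shows one possible unwinding strategy
as an assumption, not something proposed in this thread.

```c
#include <errno.h>
#include <stdlib.h>

int live_allocations;		/* state currently held by hook #2 */
static void *hook2_state;

/* Hook #1 denies the open. */
int hook1_open(void) { return -EPERM; }

/* Hook #2 is stateful: it allocates on open and expects a close. */
int hook2_open(void)
{
	hook2_state = malloc(16);
	if (!hook2_state)
		return -ENOMEM;
	live_allocations++;
	return 0;
}

void hook2_close(void)
{
	free(hook2_state);
	hook2_state = NULL;
	live_allocations--;
}

/* Run all hooks, return the first error: hook #2's state is stranded. */
int stacked_open(void)
{
	int err1 = hook1_open();
	int err2 = hook2_open();

	return err1 ? err1 : err2;
}

/* Same, but undo hook #2 when the overall open fails. */
int stacked_open_unwind(void)
{
	int err1 = hook1_open();
	int err2 = hook2_open();

	if (err1 && !err2)
		hook2_close();
	return err1 ? err1 : err2;
}
```

After stacked_open() fails, nobody will ever call close() for hook #2, so
its allocation stays live; every stateful hook would need an explicit
undo path like the one in stacked_open_unwind().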

-Eric



Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Peter Zijlstra
On Fri, 2011-05-13 at 11:10 -0400, Eric Paris wrote:
 Then again, I certainly don't see a
 reason that this syscall hardening patch should be held up while a whole
 new concept in computer security is contemplated... 

Which makes me wonder why this syscall hardening stuff is done outside
of LSM? Why isn't it part of the LSM so that say SELinux can have a
syscall bitmask per security context?

Making it part of the LSM also avoids having to add this prctl().
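
A minimal sketch of what a per-context syscall bitmask could look like.
The structure, the helper names and the NR_SYSCALLS_SKETCH limit are all
hypothetical; this does not reflect any existing SELinux or LSM data
structure.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_SYSCALLS_SKETCH 512	/* assumed upper bound, for illustration */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS ((NR_SYSCALLS_SKETCH + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical security context carrying an opt-in syscall mask. */
struct sec_context {
	unsigned long allowed[MASK_WORDS];
};

/* Opt a syscall number in; out-of-range numbers are ignored. */
void allow_syscall(struct sec_context *ctx, unsigned int nr)
{
	if (nr < NR_SYSCALLS_SKETCH)
		ctx->allowed[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

/* Anything not explicitly opted in, including unknown numbers, is denied. */
bool syscall_allowed(const struct sec_context *ctx, unsigned int nr)
{
	if (nr >= NR_SYSCALLS_SKETCH)
		return false;
	return ctx->allowed[nr / BITS_PER_WORD] & (1UL << (nr % BITS_PER_WORD));
}
```

A zero-initialized context denies everything, so syscalls added by newer
kernels stay denied until they are opted in.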




Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Peter Zijlstra
On Fri, 2011-05-13 at 16:57 +0200, Ingo Molnar wrote:
 this is a security mechanism

Who says? And why would you want to unify two separate concepts only to
then limit it to security? That just doesn't make sense.

Either you provide a full on replacement for notifier chain like things
or you don't, only extending trace events in this fashion for security
is like way weird.

Plus see the arguments Eric made about stacking stuff, not only security
schemes will have those problems.


RE: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread David Laight
 ... If you can be completely stateless its easier, but there's
 a reason that stacking security modules is hard.  Serge has tried in
 the past and both dhowells and casey schaufler are working on it right
 now.  Stacking is never as easy as it sounds   :)

For a bad example of trying to allow alternate security models
look at NetBSD's kauth code :-)

NetBSD also had issues where some 'system call trace' code
was being used to (try to) apply security - unfortunately
it worked by looking at the user-space buffers on system
call entry - and a multithreaded program can easily arrange
to update them after the initial check!
For trace/event type activities this wouldn't really matter,
for security policy it does.
(I've not looked directly at these event points in linux)
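
A minimal single-threaded sketch of that check-then-use race. The second
thread is simulated by the caller overwriting the buffer between the
check and the use; the function names and the single-prefix policy are
hypothetical and only illustrate why the check must run on a private
copy.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical policy: only paths under /home/sandbox/ are allowed. */
bool path_allowed(const char *name)
{
	return strncmp(name, "/home/sandbox/", 14) == 0;
}

/*
 * Racy variant: validates the caller-owned buffer in place.  The
 * caller (or another thread) can still change it before it is used.
 */
bool check_in_place(const char *user_buf)
{
	return path_allowed(user_buf);
}

/*
 * Safer variant: copy once into a private buffer, then check the copy.
 * Later writes to user_buf cannot affect what was checked, provided
 * the caller goes on to use kbuf and not user_buf.  len must be > 0.
 */
bool copy_then_check(const char *user_buf, char *kbuf, size_t len)
{
	strncpy(kbuf, user_buf, len - 1);
	kbuf[len - 1] = '\0';
	return path_allowed(kbuf);
}
```

If the buffer is rewritten to "/etc/shadow" after check_in_place()
returns true, the later use sees the forbidden path; with
copy_then_check() the checked string in kbuf is the one that gets used.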

David




Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Ingo Molnar

* James Morris jmor...@namei.org wrote:

 On Thu, 12 May 2011, Ingo Molnar wrote:
  Funnily enough, back then you wrote this:
  
     I'm concerned that we're seeing yet another security scheme being
     designed on the fly, without a well-formed threat model, and without
     taking into account lessons learned from the seemingly endless parade
     of similar, failed schemes.
  
  so when and how did your opinion of this scheme turn from it being an 
  endless parade of failed schemes to it being a well-defined and readily 
  understandable feature? :-)
 
 When it was defined in a way which limited its purpose to reducing the attack 
 surface of the syscall interface.

Let me outline a simple example of a new filter expression based security 
feature that could be implemented outside the narrow system call boundary you 
find acceptable, and please tell what is bad about it.

Say i'm a user-space sandbox developer who wants to enforce that sandboxed code 
should only be allowed to open files in /home/sandbox/, /lib/ and /usr/lib/.

It is a simple and sensible security feature, agreed? It allows most code to 
run well and link to countless libraries - but no access to other files is 
allowed.

I would also like my sandbox app to be able to install this policy without 
having to be root. I do not want the sandbox app to have permission to create 
labels on /lib and /usr/lib and what not.

Firstly, using the filter code i deny the various link creation syscalls so 
that sandboxed code cannot escape for example by creating a symlink to outside 
the permitted VFS namespace. (Note: we opt-in to syscalls, that way new 
syscalls added by new kernels are denied by default. The current symlink 
creation syscalls are not opted in to.)

But the next step, actually checking filenames, poses a big hurdle: i cannot 
implement the filename checking at the sys_open() syscall level in a secure 
way: because the pathname is passed to sys_open() by pointer, and if i check it 
at the generic sys_open() syscall level, another thread in the sandbox might 
modify the underlying filename *after* i've checked it.

But if i had a VFS event at the fs/namei.c::getname() level, i would have 
access to a central point where the VFS string becomes stable to the kernel and 
can be checked (and denied if necessary).
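
To illustrate why checking a stable copy matters, here is a rough user-space
sketch (not kernel code, and not part of any posted patch). It assumes a
hypothetical snapshot_name() helper that plays the role getname() plays in the
kernel: the caller-owned string is copied once into a private buffer, and both
the policy check and the later use look only at that copy. The names
snapshot_name(), check_name(), checked_lookup() and SNAP_MAX are made up for
this example.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SNAP_MAX 256	/* hypothetical limit, stands in for PATH_MAX */

/*
 * Copy the caller-owned string into a private buffer exactly once.
 * After this returns 0, later writes to 'user' cannot affect 'snap'.
 */
static int snapshot_name(const char *user, char *snap, size_t len)
{
	const char *end = memchr(user, '\0', len);
	size_t n;

	if (!end)
		return -ENAMETOOLONG;	/* no NUL within len bytes */
	n = (size_t)(end - user);
	memcpy(snap, user, n);
	snap[n] = '\0';
	return 0;
}

/* Hypothetical policy: only names under /home/sandbox/ are allowed. */
static int check_name(const char *name)
{
	return strncmp(name, "/home/sandbox/", 14) ? -EACCES : 0;
}

/* The check runs on the snapshot, never on the caller's buffer. */
static int checked_lookup(const char *user, char *snap, size_t len)
{
	int err = snapshot_name(user, snap, len);

	return err ? err : check_name(snap);
}
```

If another thread rewrites the caller's buffer after checked_lookup() has
returned, the snapshot that was checked is unchanged, so the check and the use
cannot disagree. A check done directly on the caller's buffer has no such
guarantee.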

A sidenote, and not surprisingly, the audit subsystem already has an event 
callback there:

audit_getname(result);

Unfortunately this audit callback cannot be used for my purposes, because the 
event is single-purpose for auditd and because it allows no feedback (no 
deny/accept discretion for the security policy).

But if i had this simple event there:

err = event_vfs_getname(result);

I could implement this new filename based sandboxing policy, using a filter 
like this installed on the vfs::getname event and inherited by all sandboxed 
tasks (which cannot uninstall the filter, obviously):

  
	if (strstr(name, ".."))
		return -EACCES;

	if (strncmp(name, "/home/sandbox/", 14) &&
	    strncmp(name, "/lib/", 5) &&
	    strncmp(name, "/usr/lib/", 9))
		return -EACCES;

  

  #
  # Note1: Obviously the filter engine would be extended to allow such simple
  #        string match functions.
  #
  # Note2: ".." is disallowed so that sandboxed code cannot escape the
  #        restrictions using "/..".
  #
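
The same policy can be written as a small stand-alone C function for
experimenting in user space. This is only a sketch of the intended logic, not
the filter engine's syntax: sandbox_check_name() and the allowed_prefixes
table are names invented here, and the function assumes it is handed an
already-stable, NUL-terminated absolute path.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Allowed prefixes, taken from the example policy in the text. */
static const char *const allowed_prefixes[] = {
	"/home/sandbox/",
	"/lib/",
	"/usr/lib/",
};

/* Returns 0 if 'name' passes the policy, -EACCES otherwise. */
static int sandbox_check_name(const char *name)
{
	size_t i;
	size_t n = sizeof(allowed_prefixes) / sizeof(allowed_prefixes[0]);

	/* Reject any ".." so the prefix match cannot be bypassed. */
	if (strstr(name, ".."))
		return -EACCES;

	/* Accept as soon as one allowed prefix matches. */
	for (i = 0; i < n; i++) {
		const char *p = allowed_prefixes[i];

		if (!strncmp(name, p, strlen(p)))
			return 0;
	}

	/* No prefix matched: deny. */
	return -EACCES;
}
```

Note that the strstr() test is deliberately crude: it also rejects harmless
names such as "/lib/a..b". Relative paths are denied as well, since none of
them start with an allowed prefix.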

This kind of flexible and dynamic sandboxing would allow a wide range of file 
ops within the sandbox, while still isolating it from files not included in the 
specified VFS namespace.

( Note that there are tons of other examples as well, for useful security
  features that are best done using events outside the syscall boundary. )

The security event filters code tied to seccomp and syscalls at the moment is 
useful, but limited in its future potential.

So i argue that it should go slightly further and should become:

 - unprivileged:  application-definable, allowing the embedding of security
                  policy in *apps* as well, not just the system

 - flexible:      can be added/removed runtime unprivileged, and cheaply so

 - transparent:   does not impact executing code that meets the policy

 - nestable:      it is inherited by child tasks and is fundamentally stackable,
                  multiple policies will have the combined effect and they
                  are transparent to each other. So if a child task within a
                  sandbox adds *more* checks then those add to the already
                  existing set of checks. We only narrow permissions, never
                  extend them.

 - generic:       allowing observation and (safe) control of security relevant
                  parameters not just at the system call boundary but at other
                  relevant places of kernel execution as well: which
                  points/callbacks could also be used for other types of event
                  extraction such
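
The "nestable" property above could be modelled roughly as follows. This is a
user-space sketch under my own assumptions, not a proposed kernel interface:
struct filter, run_filters() and the two example filters are hypothetical
names. Each task's filter points at the one it inherited, and a name is
accepted only if every filter in the chain accepts it, so adding a filter can
only narrow what is permitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int (*filter_fn)(const char *name);

struct filter {
	filter_fn fn;
	const struct filter *parent;	/* inherited from the parent task */
};

/* Every filter in the chain must accept. The first denial wins. */
static int run_filters(const struct filter *f, const char *name)
{
	for (; f; f = f->parent) {
		int err = f->fn(name);

		if (err)
			return err;
	}
	return 0;
}

/* Example filter installed by the sandbox itself. */
static int only_sandbox_home(const char *name)
{
	return strncmp(name, "/home/sandbox/", 14) ? -EACCES : 0;
}

/* Example of an extra check added later by a child task. */
static int no_dotfiles(const char *name)
{
	return strstr(name, "/.") ? -EACCES : 0;
}
```

With a parent chain of { only_sandbox_home } and a child chain of
{ no_dotfiles, then only_sandbox_home }, the child can open strictly fewer
names than the parent, and there is no operation here that removes a filter
from the chain.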

Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Eric Paris
On Fri, 2011-05-13 at 17:23 +0200, Peter Zijlstra wrote:
 On Fri, 2011-05-13 at 11:10 -0400, Eric Paris wrote:
  Then again, I certainly don't see a
  reason that this syscall hardening patch should be held up while a whole
  new concept in computer security is contemplated... 
 
 Which makes me wonder why this syscall hardening stuff is done outside
 of LSM? Why isn't it part of the LSM so that say SELinux can have a
 syscall bitmask per security context?

I could do that, but I like Will's approach better.  From the PoV of
meeting security goals of information flow, data confidentiality,
integrity, least priv, etc limiting on the syscall boundary doesn't make
a lot of sense.  You just don't know enough there to enforce these
things.  These are the types of goals that SELinux and other LSMs have
previously tried to enforce.  From the PoV of making the kernel more
resistant to attacks and making a process more resistant to misbehavior
I think that the syscall boundary is appropriate.  Although I could do
it in SELinux, I don't really want to do it there.

In case people are interested or confused let me give my definition of
two words I've used a bit in these conversations: discretionary and
mandatory.  Any time I talk about a 'discretionary' security decision it
is a security decision that a process imposes upon itself.  Aka the
choice to use seccomp is discretionary.  The choice to mark our own file
u-wx is discretionary.  This isn't the best definition but it's one that
works well in this discussion.  Mandatory security is one enforced by a
global policy.  It's what SELinux is all about.  SELinux doesn't give a
hoot what a process wants to do, it enforces a global policy from the
top down.  You take over a process, well, too bad, you still have no
choice but to follow the mandatory policy.

The LSM does NOT enforce a mandatory access control model, it's just how
it's been used in the past.  Ingo appears to me (please correct me if
I'm wrong) to really be a fan of exposing the flexibility of the LSM to
a discretionary access control model.  That doesn't seem like a bad
idea.  And maybe using the filter engine to define the language to do
this isn't a bad idea either.  But I think that's a 'down the road'
project, not something to hold up a better seccomp.

 Making it part of the LSM also avoids having to add this prctl().

Well, it would mean exposing some new language construct to every LSM
(instead of a single prctl construct) and it would mean anyone wanting
to use the interface would have to rely on the LSM implementing those
hooks the way they need it.  Honestly chrome can already get all of the
benefits of this patch (given a perfectly coded kernel) and a whole lot
more using SELinux, but (surprise surprise) not everyone uses SELinux.
I think it's a good idea to expose a simple interface which will be
widely enough adopted that many userspace applications can rely on it
for hardening.

The existence of the LSM and the fact that there exist multiple
security modules that may or may not be enabled really leads application
developers to be unable to rely on LSM for security.  If linux had a
single security model which everyone could rely on we wouldn't really
have as big of an issue but that's not possible.  So I'm advocating for
this series which will provide a single useful change which applications
can rely upon across distros and platforms to enhance the properties and
abilities of the linux kernel.

-Eric



Re: [PATCH 3/5] v2 seccomp_filters: Enable ftrace-based system call filtering

2011-05-13 Thread Will Drewry
On Fri, May 13, 2011 at 10:55 AM, Eric Paris epa...@redhat.com wrote:
 On Fri, 2011-05-13 at 17:23 +0200, Peter Zijlstra wrote:
  On Fri, 2011-05-13 at 11:10 -0400, Eric Paris wrote:
   Then again, I certainly don't see a
   reason that this syscall hardening patch should be held up while a whole
   new concept in computer security is contemplated...

  Which makes me wonder why this syscall hardening stuff is done outside
  of LSM? Why isn't it part of the LSM so that say SELinux can have a
  syscall bitmask per security context?

 I could do that, but I like Will's approach better.  From the PoV of
 meeting security goals of information flow, data confidentiality,
 integrity, least priv, etc limiting on the syscall boundary doesn't make
 a lot of sense.  You just don't know enough there to enforce these
 things.  These are the types of goals that SELinux and other LSMs have
 previously tried to enforce.  From the PoV of making the kernel more
 resistant to attacks and making a process more resistant to misbehavior
 I think that the syscall boundary is appropriate.  Although I could do
 it in SELinux, I don't really want to do it there.

There's also the problem that there are no hooks per-system call for
LSMs, only logical hooks that sometimes mirror system call names and
are called after user data has been parsed.  If system call enter
hooks, like seccomp's, were added for LSMs, it would allow the LSM
bitmask approach, but it still wouldn't address the issues you raise
below (and I wholeheartedly agree with).
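
For anyone unfamiliar with the bitmask idea, here is a rough user-space sketch
of what a per-task (or per-security-context) syscall allow mask checked at
syscall entry might look like. It is only an illustration under my own
assumptions: struct syscall_mask, NR_SYSCALLS_MAX and the mask_*() helpers are
invented names, and this is not how the posted series implements filtering.

```c
#include <errno.h>
#include <limits.h>
#include <string.h>

#define NR_SYSCALLS_MAX 512	/* hypothetical upper bound */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MASK_WORDS ((NR_SYSCALLS_MAX + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical per-task (or per-security-context) allow mask. */
struct syscall_mask {
	unsigned long bits[MASK_WORDS];
};

/* Start from deny-all, so syscalls must be opted in. */
static void mask_init(struct syscall_mask *m)
{
	memset(m, 0, sizeof(*m));
}

/* Opt in to one syscall number. */
static int mask_allow(struct syscall_mask *m, unsigned int nr)
{
	if (nr >= NR_SYSCALLS_MAX)
		return -EINVAL;
	m->bits[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
	return 0;
}

/* Would run at syscall entry, before any user data is parsed. */
static int mask_check(const struct syscall_mask *m, unsigned int nr)
{
	if (nr >= NR_SYSCALLS_MAX)
		return -EPERM;
	if (m->bits[nr / BITS_PER_WORD] & (1UL << (nr % BITS_PER_WORD)))
		return 0;
	return -EPERM;
}
```

The check only sees the syscall number, which is why it is cheap and also why
it cannot express argument-level policy such as the filename checks discussed
elsewhere in this thread.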

 In case people are interested or confused let me give my definition of
 two words I've used a bit in these conversations: discretionary and
 mandatory.  Any time I talk about a 'discretionary' security decision it
 is a security decision that a process imposes upon itself.  Aka the
 choice to use seccomp is discretionary.  The choice to mark our own file
 u-wx is discretionary.  This isn't the best definition but it's one that
 works well in this discussion.  Mandatory security is one enforced by a
 global policy.  It's what SELinux is all about.  SELinux doesn't give a
 hoot what a process wants to do, it enforces a global policy from the
 top down.  You take over a process, well, too bad, you still have no
 choice but to follow the mandatory policy.

 The LSM does NOT enforce a mandatory access control model, it's just how
 it's been used in the past.  Ingo appears to me (please correct me if
 I'm wrong) to really be a fan of exposing the flexibility of the LSM to
 a discretionary access control model.  That doesn't seem like a bad
 idea.  And maybe using the filter engine to define the language to do
 this isn't a bad idea either.  But I think that's a 'down the road'
 project, not something to hold up a better seccomp.

  Making it part of the LSM also avoids having to add this prctl().

 Well, it would mean exposing some new language construct to every LSM
 (instead of a single prctl construct) and it would mean anyone wanting
 to use the interface would have to rely on the LSM implementing those
 hooks the way they need it.  Honestly chrome can already get all of the
 benefits of this patch (given a perfectly coded kernel) and a whole lot
 more using SELinux, but (surprise surprise) not everyone uses SELinux.
 I think it's a good idea to expose a simple interface which will be
 widely enough adopted that many userspace applications can rely on it
 for hardening.

 The existence of the LSM and the fact that there exist multiple
 security modules that may or may not be enabled really leads application
 developers to be unable to rely on LSM for security.  If linux had a
 single security model which everyone could rely on we wouldn't really
 have as big of an issue but that's not possible.  So I'm advocating for
 this series which will provide a single useful change which applications
 can rely upon across distros and platforms to enhance the properties and
 abilities of the linux kernel.

 -Eric

