Re: RFC(V3): Audit Kernel Container IDs

2018-02-05 Thread Simo Sorce
On Fri, 2018-02-02 at 18:24 -0500, Paul Moore wrote:
> On Fri, Feb 2, 2018 at 5:19 PM, Simo Sorce <s...@redhat.com> wrote:
> > On Fri, 2018-02-02 at 16:24 -0500, Paul Moore wrote:
> > > On Wed, Jan 10, 2018 at 2:00 AM, Richard Guy Briggs <r...@redhat.com> 
> > > wrote:
> > > > On 2018-01-09 11:18, Simo Sorce wrote:
> > > > > On Tue, 2018-01-09 at 07:16 -0500, Richard Guy Briggs wrote:
> 
> ...
> 
> > > > Paul, can you justify this somewhat larger inconvenience for some
> > > > relatively minor convenience on our part?
> > > 
> > > Done in direct response to Simo.
> > 
> > Sorry but your response sounds more like waving away then addressing
> > them, the excuse being: we can't please everyone, so we are going to
> > please no one.
> 
> I obviously disagree with the take on my comments but you're free to
> your opinion.

The I misunderstood your comments, I am not interested in putting words
in your mouth.

> I believe saying we are pleasing no one isn't really fair now is it?

Well, of course you are going to please the audit subsystem, I
understand that. I think there is a problem of expectations. Some
people, me included, hoped to have a way to identify a container with
the help of the kernel.

> Is there any type of audit container ID now?  How would you go about
> associating audit events with containers now?

We do not have a good way, there are some dirty tricks like inferring
the container identity via cgroup names, but that is ... eww.
This is why, given audit has the same need of user space, there was
some hope we could agree on an identifier that could be used by both.
It would make correlating audit logs and other cluster-wide events
simpler. That is all.

>  (spoiler alert: it ain't
> pretty, and there are gaps I don't believe you can cover)  This
> proposal provides a mechanism to do this in a way that isn't tied to
> any one particular concept of a container and is manageable inside the
> kernel.

I like the proposal for the most part, we are just discussing on the
nature of the identifier, which is a minor detail in the end.

> If you have a need to track audit events for containers, I find it
> extremely hard to believe that you are not at least partially pleased
> by the solutions presented here.  It may not be everything on your
> wishlist, but when did you ever get *everything* on your wishlist?

It is true, and I am sorry if I came out demanding or abrasive. It was
not my intention. Of course a u64 that has to be mapped is still better
than nothing. It does cause a lot more work in user space, but it is
not impossible to deal with.

> > > But to be clear Richard, we've talked about this a few times, it's not
> > > a "minor convenience" on our part, it's a pretty big convenience once
> > > we starting having to route audit events and make decisions based on
> > > the audit container ID information.  Audit performance is less than
> > > awesome now, I'm working hard to not make it worse.
> > 
> > Sounds like a security vs performance trade off to me.
> 
> Welcome to software development.  It's generally a pretty terrible
> hobby and/or occupation, but we make up for it with long hours and
> endless frustration.

Tell me more about that, not! ;-)

> > > > u64 vs u128 is easy for us to
> > > > accomodate in terms of scalar comparisons.  It doubles the information
> > > > in every container id field we print in audit records.
> > > 
> > > ... and slows down audit container ID checks.
> > 
> > Are you saying a cmp on a u128 is slower than a comparison on a u64 and
> > this is something that will be noticeable ?
> 
> Do you have a 128 bit system?

no, but all 64bit systems have an instruction that allow you to do
atomic 128 compare and swap (IIRC ?).

>   I don't.  I've got a bunch of 64 bit
> systems, and a couple of 32 bit systems too.  People that use audit
> have a tendency to really hammer on it, to the point that we get
> performance complaints on a not infrequent basis.  I don't know the
> exact number of times we are going to need to check the audit
> container ID, but it's reasonable to think that we'll expose it as a
> filter-able field which adds a few checks, we'll use it for record
> routing so that's a few more, and if we're running multiple audit
> daemons we will probably want to include LSM checks which could result
> in a few more audit container ID checks.  If it was one comparison I
> wouldn't be too worried about it, but the point I'm trying to make is
> that we don't know what the implementation is going to look like yet
> and I suspect this ID is going to be leveraged in several places in
> th

Re: RFC(V3): Audit Kernel Container IDs

2018-02-02 Thread Simo Sorce
On Fri, 2018-02-02 at 16:24 -0500, Paul Moore wrote:
> On Wed, Jan 10, 2018 at 2:00 AM, Richard Guy Briggs <r...@redhat.com> wrote:
> > On 2018-01-09 11:18, Simo Sorce wrote:
> > > On Tue, 2018-01-09 at 07:16 -0500, Richard Guy Briggs wrote:
> > > > Containers are a userspace concept.  The kernel knows nothing of them.
> > > > 
> > > > The Linux audit system needs a way to be able to track the container
> > > > provenance of events and actions.  Audit needs the kernel's help to do
> > > > this.
> > > > 
> > > > Since the concept of a container is entirely a userspace concept, a
> > > > registration from the userspace container orchestration system initiates
> > > > this.  This will define a point in time and a set of resources
> > > > associated with a particular container with an audit container
> > > > identifier.
> > > > 
> > > > The registration is a u64 representing the audit container identifier
> > > > written to a special file in a pseudo filesystem (proc, since PID tree
> > > > already exists) representing a process that will become a parent process
> > > > in that container.  This write might place restrictions on mount
> > > > namespaces required to define a container, or at least careful checking
> > > > of namespaces in the kernel to verify permissions of the orchestrator so
> > > > it can't change its own container ID.  A bind mount of nsfs may be
> > > > necessary in the container orchestrator's mount namespace.  This write
> > > > can only happen once per process.
> > > > 
> > > > Note: The justification for using a u64 is that it minimizes the
> > > > information printed in every audit record, reducing bandwidth and limits
> > > > comparisons to a single u64 which will be faster and less error-prone.
> > > > 
> > > > Require CAP_AUDIT_CONTROL to be able to carry out the registration.  At
> > > > that time, record the target container's user-supplied audit container
> > > > identifier along with a target container's parent process (which may
> > > > become the target container's "init" process) process ID (referenced
> > > > from the initial PID namespace) in a new record AUDIT_CONTAINER with a
> > > > qualifying op=$action field.
> > > > 
> > > > Issue a new auxilliary record AUDIT_CONTAINER_INFO for each valid
> > > > container ID present on an auditable action or event.
> > > > 
> > > > Forked and cloned processes inherit their parent's audit container
> > > > identifier, referenced in the process' task_struct.  Since the audit
> > > > container identifier is inherited rather than written, it can still be
> > > > written once.  This will prevent tampering while allowing nesting.
> > > > (This can be implemented with an internal settable flag upon
> > > > registration that does not get copied across a fork/clone.)
> > > > 
> > > > Mimic setns(2) and return an error if the process has already initiated
> > > > threading or forked since this registration should happen before the
> > > > process execution is started by the orchestrator and hence should not
> > > > yet have any threads or children.  If this is deemed overly restrictive,
> > > > switch all of the target's threads and children to the new containerID.
> > > > 
> > > > Trust the orchestrator to judiciously use and restrict 
> > > > CAP_AUDIT_CONTROL.
> > > > 
> > > > When a container ceases to exist because the last process in that
> > > > container has exited log the fact to balance the registration action.
> > > > (This is likely needed for certification accountability.)
> > > > 
> > > > At this point it appears unnecessary to add a container session
> > > > identifier since this is all tracked from loginuid and sessionid to
> > > > communicate with the container orchestrator to spawn an additional
> > > > session into an existing container which would be logged.  It can be
> > > > added at a later date without breaking API should it be deemed
> > > > necessary.
> > > > 
> > > > The following namespace logging actions are not needed for certification
> > > > purposes at this point, but are helpful for tracking namespace activity.
> > > > These are auxilliary records that are associated with namespace
> > > 

Re: RFC(V3): Audit Kernel Container IDs

2018-01-09 Thread Simo Sorce
_DESTROY] [process exit, unshare(2), setns(2)]
> 
> Issue a new auxilliary record AUDIT_NS_CHANGE listing (opt: op=$action)
> the parent and child namespace IDs for any changes to a process'
> namespaces. [setns(2)]
> Note: It may be possible to combine AUDIT_NS_* record formats and
> distinguish them with an op=$action field depending on the fields
> required for each message type.
> 
> The audit container identifier will need to be reaped from all
> implicated namespaces upon the destruction of a container.
> 
> This namespace information adds supporting information for tracking
> events not attributable to specific processes.
> 
> Changelog:
> 
> (Upstream V3)
> - switch back to u64 (from pmoore, can be expanded to u128 in future if
>   need arises without breaking API.  u32 was originally proposed, up to
>   c36 discussed)
> - write-once, but children inherit audit container identifier and can
>   then still be written once
> - switch to CAP_AUDIT_CONTROL
> - group namespace actions together, auxilliary records to namespace
>   operations.
> 
> (Upstream V2)
> - switch from u64 to u128 UUID
> - switch from "signal" and "trigger" to "register"
> - restrict registration to single process or force all threads and
>   children into same container

I am trying to understand the back and forth on the ID size.

>From an orchestrator POV anything that requires tracking a node
specific ID is not ideal.

Orchestrators tend to span many nodes, and containers tend to have IDs
that are either UUID or have a Hash (like SHA256) as identifier.

The problem here is two-fold:

a) Your auditing requires some mapping to be useful outside of the
system.
If you aggreggate audit logs outside of the system or you want to
correlate the system audit logs with other components dealing with
containers, now you need a place where you provide a mapping from your
audit u64 to the ID a container has in the rest of the system.

b) Now you need a mapping of some sort. The simplest way a container
orchestrator can go about this is to just use the UUID or Hash
representing their view of the container, truncate it to a u64 and use
that for Audit. This means there are some chances there will be a
collision and a duplicate u64 ID will be used by the orchestrator as
the container ID. What happen in that case ?

Simo.

-- 
Simo Sorce
Sr. Principal Software Engineer
Red Hat, Inc

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit


Re: RFC(v2): Audit Kernel Container IDs

2017-10-17 Thread Simo Sorce
On Tue, 2017-10-17 at 07:59 -0700, Casey Schaufler wrote:
> On 10/17/2017 5:31 AM, Simo Sorce wrote:
> > On Mon, 2017-10-16 at 21:42 -0400, Steve Grubb wrote:
> > > On Monday, October 16, 2017 8:33:40 PM EDT Richard Guy Briggs
> > > wrote:
> > > > There is such a thing, but the kernel doesn't know about it
> > > > yet.  This same situation exists for loginuid and sessionid
> > > > which
> > > > are userspace concepts that the kernel tracks for the
> > > > convenience
> > > > of userspace.  As for its name, I'm not particularly picky, so
> > > > if
> > > > you don't like CAP_CONTAINER_* then I'm fine with
> > > > CAP_AUDIT_CONTAINERID.  It really needs to be distinct from
> > > > CAP_AUDIT_WRITE and CAP_AUDIT_CONTROL since we don't want to
> > > > give
> > > > the ability to set a containerID to any process that is able to
> > > > do
> > > > audit logging (such as vsftpd) and similarly we don't want to
> > > > give
> > > > the orchestrator the ability to control the setup of the audit
> > > > daemon.
> > > 
> > > A long time ago, we were debating what should guard against rouge
> > > processes from setting the loginuid. Casey argued that the
> > > ability to
> > > set the loginuid means they have the ability to control the audit
> > > trail. That means that it should be guarded by CAP_AUDIT_CONTROL.
> > > I
> > > think the same logic applies today. 
> > 
> > The difference is that with loginuid you needed to give processes
> > able
> > to audit also the ability to change it. You do not want to tie the
> > ability to change container ids to the ability to audit. You want
> > to be
> > able to do audit stuff (within the container) without allowing it
> > to
> > change the container id.
> 
> Without a *kernel* policy on containerIDs you can't say what
> security policy is being exempted.

The policy has been basically stated earlier.

A way to track a set of processes from a specific point in time
forward. The name used is "container id", but it could be anything.
This marker is mostly used by user space to track process hierarchies
without races, these processes can be very privileged, and must not be
allowed to change the marker themselves when granted the current common
capabilities.

Is this a good enough description ? If not can you clarify your
expectations ?

>  Without that you can't say what capability is (or isn't)
> appropriate.

See if the above is sufficient please.

> You need a reason to have a capability check that makes sense in the
> context of the kernel security policy.

I think the proposal had a reason, we may debate on whether that reason
is good enough.

> Since we don't know what a container is in the kernel,

Please do not fixate on the word container.

>  that's pretty hard. We don't create "fuzzy" capabilities
> based on the trendy application behavior of the moment. If the
> behavior is not related it audit, there's no reason for it, and
> if it is, CAP_AUDIT_CONTROL works just fine. If this doesn't work
> in your application security model I suggest that is where you
> need to make changes.

The authors of the proposal came to the conclusion that kernel
assistance is needed. It would be nice to discuss the merits of it.
If you do not understand why the request has been made it would be more
useful to ask specific questions to understand what and why is the ask.

Pushing back is fine, if you have understood the problem and have valid
arguments against a kernel level solution (and possibly suggestions for
a working user space solution), otherwise you are not adding value to
the discussion. 

Simo.

-- 
Simo Sorce
Sr. Principal Software Engineer
Red Hat, Inc

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit

Re: RFC(v2): Audit Kernel Container IDs

2017-10-17 Thread Simo Sorce
On Mon, 2017-10-16 at 21:42 -0400, Steve Grubb wrote:
> On Monday, October 16, 2017 8:33:40 PM EDT Richard Guy Briggs wrote:

> > There is such a thing, but the kernel doesn't know about it
> > yet.  This same situation exists for loginuid and sessionid which
> > are userspace concepts that the kernel tracks for the convenience
> > of userspace.  As for its name, I'm not particularly picky, so if
> > you don't like CAP_CONTAINER_* then I'm fine with
> > CAP_AUDIT_CONTAINERID.  It really needs to be distinct from
> > CAP_AUDIT_WRITE and CAP_AUDIT_CONTROL since we don't want to give
> > the ability to set a containerID to any process that is able to do
> > audit logging (such as vsftpd) and similarly we don't want to give
> > the orchestrator the ability to control the setup of the audit
> > daemon.
> 
> A long time ago, we were debating what should guard against rouge
> processes from setting the loginuid. Casey argued that the ability to
> set the loginuid means they have the ability to control the audit
> trail. That means that it should be guarded by CAP_AUDIT_CONTROL. I
> think the same logic applies today. 

The difference is that with loginuid you needed to give processes able
to audit also the ability to change it. You do not want to tie the
ability to change container ids to the ability to audit. You want to be
able to do audit stuff (within the container) without allowing it to
change the container id.
Of course if we made container id a write-once property maybe there is
no need for controls at all, but I'm pretty sure there will be
situations where write-once may not be usable in practice.

> The ability to arbitrarily set a container ID means the process has
> the ability to indirectly control the audit trail.

The container Id can be used also for authorization purposes (by other
processes on the host), not just audit, I think this is why a separate
control has been proposed.

Simo.

-- 
Simo Sorce
Sr. Principal Software Engineer
Red Hat, Inc

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit