Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-10 Thread Eric Paris
On Mon, 2014-03-10 at 15:30 -0400, David Miller wrote:
> From: Eric Paris 
> Date: Fri, 07 Mar 2014 17:52:02 -0500
> 
> > The second user Eric patched, audit_send_list(), can grow without bound.
> > The number of skb's is going to be the size of the number of audit rules
> > that root loaded.  We run the list of rules, generate an skb per rule,
> > and add all of them to an skb_buff_head.  We then pass the skb_buff_head
> > to a kthread so that current will be able to read/drain the socket.
> > There really is no limit to how big the skb_buff_head could possibly
> > grow.  This doesn't necessarily absolutely have to be lossless but it
> > can actually quite reasonably be a whole lot of data that needs to get
> > sent.  I know of no way to deliver unbounded lengths of data to the
> > current task via netlink without blocking on more space in the socket.
> > Even if the socket rmem was MAX_INT, how can we deliver more?  The rule
> > size is unbounded.  How do I get an unbounded amount of data onto this
> > side of the socket when I have to generate it all during the request...
> 
> This is what netlink dumps  are for.  It is how we are able to dump
> routing tables with millions of routes to userspace.
> 
> By using normal netlink requests and netlink_unicast() for this, you
> are ignoring an entire mechanism in netlink designed specifically to
> handle this kind of situation.
> 
> Netlink dumps track state and build one or more SKBs (as necessary),
> one by one, to form the reply.  It implements flow control, state
> tracking for iteration, optimized SKB sizing and allocation, etc.

Awesome.  I'll see what I can find!


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-10 Thread David Miller
From: Eric Paris 
Date: Fri, 07 Mar 2014 17:52:02 -0500

> The second user Eric patched, audit_send_list(), can grow without bound.
> The number of skb's is going to be the size of the number of audit rules
> that root loaded.  We run the list of rules, generate an skb per rule,
> and add all of them to an skb_buff_head.  We then pass the skb_buff_head
> to a kthread so that current will be able to read/drain the socket.
> There really is no limit to how big the skb_buff_head could possibly
> grow.  This doesn't necessarily absolutely have to be lossless but it
> can actually quite reasonably be a whole lot of data that needs to get
> sent.  I know of no way to deliver unbounded lengths of data to the
> current task via netlink without blocking on more space in the socket.
> Even if the socket rmem was MAX_INT, how can we deliver more?  The rule
> size is unbounded.  How do I get an unbounded amount of data onto this
> side of the socket when I have to generate it all during the request...

This is what netlink dumps  are for.  It is how we are able to dump
routing tables with millions of routes to userspace.

By using normal netlink requests and netlink_unicast() for this, you
are ignoring an entire mechanism in netlink designed specifically to
handle this kind of situation.

Netlink dumps track state and build one or more SKBs (as necessary),
one by one, to form the reply.  It implements flow control, state
tracking for iteration, optimized SKB sizing and allocation, etc.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-10 Thread David Miller
From: Eric Paris epa...@redhat.com
Date: Fri, 07 Mar 2014 17:52:02 -0500

 The second user Eric patched, audit_send_list(), can grow without bound.
 The number of skb's is going to be the size of the number of audit rules
 that root loaded.  We run the list of rules, generate an skb per rule,
 and add all of them to an skb_buff_head.  We then pass the skb_buff_head
 to a kthread so that current will be able to read/drain the socket.
 There really is no limit to how big the skb_buff_head could possibly
 grow.  This doesn't necessarily absolutely have to be lossless but it
 can actually quite reasonably be a whole lot of data that needs to get
 sent.  I know of no way to deliver unbounded lengths of data to the
 current task via netlink without blocking on more space in the socket.
 Even if the socket rmem was MAX_INT, how can we deliver more?  The rule
 size is unbounded.  How do I get an unbounded amount of data onto this
 side of the socket when I have to generate it all during the request...

This is what netlink dumps  are for.  It is how we are able to dump
routing tables with millions of routes to userspace.

By using normal netlink requests and netlink_unicast() for this, you
are ignoring an entire mechanism in netlink designed specifically to
handle this kind of situation.

Netlink dumps track state and build one or more SKBs (as necessary),
one by one, to form the reply.  It implements flow control, state
tracking for iteration, optimized SKB sizing and allocation, etc.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-10 Thread Eric Paris
On Mon, 2014-03-10 at 15:30 -0400, David Miller wrote:
 From: Eric Paris epa...@redhat.com
 Date: Fri, 07 Mar 2014 17:52:02 -0500
 
  The second user Eric patched, audit_send_list(), can grow without bound.
  The number of skb's is going to be the size of the number of audit rules
  that root loaded.  We run the list of rules, generate an skb per rule,
  and add all of them to an skb_buff_head.  We then pass the skb_buff_head
  to a kthread so that current will be able to read/drain the socket.
  There really is no limit to how big the skb_buff_head could possibly
  grow.  This doesn't necessarily absolutely have to be lossless but it
  can actually quite reasonably be a whole lot of data that needs to get
  sent.  I know of no way to deliver unbounded lengths of data to the
  current task via netlink without blocking on more space in the socket.
  Even if the socket rmem was MAX_INT, how can we deliver more?  The rule
  size is unbounded.  How do I get an unbounded amount of data onto this
  side of the socket when I have to generate it all during the request...
 
 This is what netlink dumps  are for.  It is how we are able to dump
 routing tables with millions of routes to userspace.
 
 By using normal netlink requests and netlink_unicast() for this, you
 are ignoring an entire mechanism in netlink designed specifically to
 handle this kind of situation.
 
 Netlink dumps track state and build one or more SKBs (as necessary),
 one by one, to form the reply.  It implements flow control, state
 tracking for iteration, optimized SKB sizing and allocation, etc.

Awesome.  I'll see what I can find!


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread David Miller
From: Steve Grubb 
Date: Fri, 07 Mar 2014 22:27:28 -0500

> On Friday, March 07, 2014 07:48:01 PM David Miller wrote:
>> From: Eric Paris 
>> Date: Fri, 07 Mar 2014 17:52:02 -0500
>> 
>> > Audit is non-tolerant to failure and loss.
>> 
>> Netlink is not a loss-less transport.
> 
> Perhaps. But in all our testing over the years its been very good.

What I really meant by that was that there is flow control.

You can push as much data reliably over it as you want, but you have
to block when the socket limits are hit.

And I'd say you might as well make the creator of the event do the
blocking rather than making other threads do this.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread Eric Paris
On Fri, 2014-03-07 at 19:48 -0500, David Miller wrote:
> From: Eric Paris 
> Date: Fri, 07 Mar 2014 17:52:02 -0500
> 
> > Audit is non-tolerant to failure and loss.
> 
> Netlink is not a loss-less transport.
I'm happy to accept that (and know it to be true).  How can I better
architect things?  It seems Eric is complaining that when we get a
request for info, we queue that info up, and then use a kthread to make
it available when the task next calls recv.  By using blocking sockets
in the kthread we have no problem with the size of the socket read buf.
If we switch to non-blocking sockets how can we possibly queue up more
than rmem size of data?  (honestly, if userspace used INT_MAX it is
almost certainly overkill for even the largest rulesets, but
theoretically, it's not...)

Is our design somehow wrong?  Flawed?  Mind you it's pretty dumb that we
do basically the same thing in 3 different audit code path, but the
architecture doesn't seem crazy to me.  Then again, I'm not brilliant by
any stretch!

   +--+
   |  |
   |   auditctl (audit tool run by root)  |
   | netlink send netlink receive |
   +--+
  +^
  ||
  v+
  ++++
  | kernel audit generate skbs || send skbs to userspace |
  ++++
  +^
  |++  |
  +--->| send skbs to a kthread |+-+
   ++

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread Steve Grubb
On Friday, March 07, 2014 07:48:01 PM David Miller wrote:
> From: Eric Paris 
> Date: Fri, 07 Mar 2014 17:52:02 -0500
> 
> > Audit is non-tolerant to failure and loss.
> 
> Netlink is not a loss-less transport.

Perhaps. But in all our testing over the years its been very good.

-Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread David Miller
From: Eric Paris 
Date: Fri, 07 Mar 2014 17:52:02 -0500

> Audit is non-tolerant to failure and loss.

Netlink is not a loss-less transport.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread Eric Paris
As usual Eric, your commentary is anything but useful.  However your
technical thoughts are not off the mark.  Can we stick to those?

On Wed, 2014-03-05 at 10:06 -0800, Eric W. Biederman wrote:
> Steve Grubb  writes:
> 
> > On Tuesday, March 04, 2014 07:21:52 PM David Miller wrote:
> >> From: ebied...@xmission.com (Eric W. Biederman)
> >> Date: Tue, 04 Mar 2014 14:41:16 -0800
> >> 
> >> > If we really want the ability to always appened to the queue of skb's
> >> > is to just have a version of netlink_send_skb that ignores the queued
> >> > limits.  Of course an evil program then could force the generation of
> >> > enough audit records to DOS the kernel, but we seem to be in that
> >> > situation now.  Shrug.
> >> 
> >> There is never a valid reason to bypass the socket limits.

Audit does have some pretty crazy/wacky/dumb ways of doing things.  And
they've worked really well.  I'm the first to agree that doesn't make
them right.  But also that I don't know how to do them better.  I'm
happy to try if I know how.  Audit is non-tolerant to failure and loss.
Many users of audit would prefer the system panic() than lose a message.
If someone shows me how to do it better I'll happily admit there are
likely places where what we do is just a 'little' too
strong/foolish/crazy.  Note that ALL users of these functions must have
at least 1 capability (CAP_AUDIT_CONTROL).  So if there is a malicious
app, it is a root malicious app...

The kernel audit has 3 different 'things' that send skbs to userspace.
All of them work a little crazy, but similar crazy.  The current task
calls into the kernel via netlink, and the kernel then builds one or
more skb(s) and passes those skb(s) (via differing mechanisms) to a
kthread which in turn calls netlink_unicast(,,,0) sending the response
back to the current task in 2 of the three cases.  In all cases, since
the timeout is infinite, we assume that the only possible reason this
call to netlink_unicast() will fail is because the other end of the
socket went away.  Simple drawing of 2 of the 3 cases.
 +--+
 |  |
 |   auditctl (audit tool run by root)  |
 | netlink send netlink receive |
 +--+
+^
||
v+
++++
| kernel audit generate skbs || send skbs to userspace |
++++
+^
|++  |
+--->| send skbs to a kthread |+-+
 ++
The most important of the 3 cases and the one that people care the
absolute most about 'things cannot be lost' is the actual audit
messages.  Messages like 'process A just did action B to object C'.
These are handled by means of the current process generating an skb and
passing those to an audit specific queue.  This audit internal queue
depth is controllable by userspace.  If we overflow this queue we may
call panic() (admin choice, obviously non-default).  Again, the kthread
on the other end of that queue assumes that all calls to
netlink_unicast(,,,0) will eventually succeed (unless the receiving task
died).  It is actually imperative that the current process be blocked
until the message is on track to userspace.  Even Eric isn't trying to
change this one case in his patch.  This is the one case where the task
receiving the skb is not (likely) the current task (but could be)

The other two, the ones Eric patched, are much more flexible.  In both
cases a userspace task ask the kernel for a specific piece of
information (by sending a netlink message).  current is going to be the
task draining the netlink queue.  This is the reason the send is being
punted to a kthread.  So current can read from the netlink socket.  In
one case audit_send_reply_thread() the response is small and can't
really grow without bound.  Converting to a nonblocking socket might
well make sense here.

The second user Eric patched, audit_send_list(), can grow without bound.
The number of skb's is going to be the size of the number of audit rules
that root loaded.  We run the list of rules, generate an skb per rule,
and add all of them to an skb_buff_head.  We then pass the skb_buff_head
to a kthread so that current will be able to read/drain the socket.
There really is no limit to how big the skb_buff_head could possibly
grow.  This doesn't necessarily absolutely have to be lossless but it
can actually quite 

Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread Eric Paris
As usual Eric, your commentary is anything but useful.  However your
technical thoughts are not off the mark.  Can we stick to those?

On Wed, 2014-03-05 at 10:06 -0800, Eric W. Biederman wrote:
 Steve Grubb sgr...@redhat.com writes:
 
  On Tuesday, March 04, 2014 07:21:52 PM David Miller wrote:
  From: ebied...@xmission.com (Eric W. Biederman)
  Date: Tue, 04 Mar 2014 14:41:16 -0800
  
   If we really want the ability to always appened to the queue of skb's
   is to just have a version of netlink_send_skb that ignores the queued
   limits.  Of course an evil program then could force the generation of
   enough audit records to DOS the kernel, but we seem to be in that
   situation now.  Shrug.
  
  There is never a valid reason to bypass the socket limits.

Audit does have some pretty crazy/wacky/dumb ways of doing things.  And
they've worked really well.  I'm the first to agree that doesn't make
them right.  But also that I don't know how to do them better.  I'm
happy to try if I know how.  Audit is non-tolerant to failure and loss.
Many users of audit would prefer the system panic() than lose a message.
If someone shows me how to do it better I'll happily admit there are
likely places where what we do is just a 'little' too
strong/foolish/crazy.  Note that ALL users of these functions must have
at least 1 capability (CAP_AUDIT_CONTROL).  So if there is a malicious
app, it is a root malicious app...

The kernel audit has 3 different 'things' that send skbs to userspace.
All of them work a little crazy, but similar crazy.  The current task
calls into the kernel via netlink, and the kernel then builds one or
more skb(s) and passes those skb(s) (via differing mechanisms) to a
kthread which in turn calls netlink_unicast(,,,0) sending the response
back to the current task in 2 of the three cases.  In all cases, since
the timeout is infinite, we assume that the only possible reason this
call to netlink_unicast() will fail is because the other end of the
socket went away.  Simple drawing of 2 of the 3 cases.
 +--+
 |  |
 |   auditctl (audit tool run by root)  |
 | netlink send netlink receive |
 +--+
+^
||
v+
++++
| kernel audit generate skbs || send skbs to userspace |
++++
+^
|++  |
+---| send skbs to a kthread |+-+
 ++
The most important of the 3 cases and the one that people care the
absolute most about 'things cannot be lost' is the actual audit
messages.  Messages like 'process A just did action B to object C'.
These are handled by means of the current process generating an skb and
passing those to an audit specific queue.  This audit internal queue
depth is controllable by userspace.  If we overflow this queue we may
call panic() (admin choice, obviously non-default).  Again, the kthread
on the other end of that queue assumes that all calls to
netlink_unicast(,,,0) will eventually succeed (unless the receiving task
died).  It is actually imperative that the current process be blocked
until the message is on track to userspace.  Even Eric isn't trying to
change this one case in his patch.  This is the one case where the task
receiving the skb is not (likely) the current task (but could be)

The other two, the ones Eric patched, are much more flexible.  In both
cases a userspace task ask the kernel for a specific piece of
information (by sending a netlink message).  current is going to be the
task draining the netlink queue.  This is the reason the send is being
punted to a kthread.  So current can read from the netlink socket.  In
one case audit_send_reply_thread() the response is small and can't
really grow without bound.  Converting to a nonblocking socket might
well make sense here.

The second user Eric patched, audit_send_list(), can grow without bound.
The number of skb's is going to be the size of the number of audit rules
that root loaded.  We run the list of rules, generate an skb per rule,
and add all of them to an skb_buff_head.  We then pass the skb_buff_head
to a kthread so that current will be able to read/drain the socket.
There really is no limit to how big the skb_buff_head could possibly
grow.  This doesn't necessarily absolutely have to be lossless but it
can actually quite reasonably be a whole 

Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread David Miller
From: Eric Paris epa...@redhat.com
Date: Fri, 07 Mar 2014 17:52:02 -0500

 Audit is non-tolerant to failure and loss.

Netlink is not a loss-less transport.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread Steve Grubb
On Friday, March 07, 2014 07:48:01 PM David Miller wrote:
 From: Eric Paris epa...@redhat.com
 Date: Fri, 07 Mar 2014 17:52:02 -0500
 
  Audit is non-tolerant to failure and loss.
 
 Netlink is not a loss-less transport.

Perhaps. But in all our testing over the years its been very good.

-Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread Eric Paris
On Fri, 2014-03-07 at 19:48 -0500, David Miller wrote:
 From: Eric Paris epa...@redhat.com
 Date: Fri, 07 Mar 2014 17:52:02 -0500
 
  Audit is non-tolerant to failure and loss.
 
 Netlink is not a loss-less transport.
I'm happy to accept that (and know it to be true).  How can I better
architect things?  It seems Eric is complaining that when we get a
request for info, we queue that info up, and then use a kthread to make
it available when the task next calls recv.  By using blocking sockets
in the kthread we have no problem with the size of the socket read buf.
If we switch to non-blocking sockets how can we possibly queue up more
than rmem size of data?  (honestly, if userspace used INT_MAX it is
almost certainly overkill for even the largest rulesets, but
theoretically, it's not...)

Is our design somehow wrong?  Flawed?  Mind you it's pretty dumb that we
do basically the same thing in 3 different audit code path, but the
architecture doesn't seem crazy to me.  Then again, I'm not brilliant by
any stretch!

   +--+
   |  |
   |   auditctl (audit tool run by root)  |
   | netlink send netlink receive |
   +--+
  +^
  ||
  v+
  ++++
  | kernel audit generate skbs || send skbs to userspace |
  ++++
  +^
  |++  |
  +---| send skbs to a kthread |+-+
   ++

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-07 Thread David Miller
From: Steve Grubb sgr...@redhat.com
Date: Fri, 07 Mar 2014 22:27:28 -0500

 On Friday, March 07, 2014 07:48:01 PM David Miller wrote:
 From: Eric Paris epa...@redhat.com
 Date: Fri, 07 Mar 2014 17:52:02 -0500
 
  Audit is non-tolerant to failure and loss.
 
 Netlink is not a loss-less transport.
 
 Perhaps. But in all our testing over the years its been very good.

What I really meant by that was that there is flow control.

You can push as much data reliably over it as you want, but you have
to block when the socket limits are hit.

And I'd say you might as well make the creator of the event do the
blocking rather than making other threads do this.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-05 Thread Eric W. Biederman
Steve Grubb  writes:

> On Tuesday, March 04, 2014 07:21:52 PM David Miller wrote:
>> From: ebied...@xmission.com (Eric W. Biederman)
>> Date: Tue, 04 Mar 2014 14:41:16 -0800
>> 
>> > If we really want the ability to always appened to the queue of skb's
>> > is to just have a version of netlink_send_skb that ignores the queued
>> > limits.  Of course an evil program then could force the generation of
>> > enough audit records to DOS the kernel, but we seem to be in that
>> > situation now.  Shrug.
>> 
>> There is never a valid reason to bypass the socket limits.
>> 
>> It protects the system from things going out of control.
>> 
>> Netlink packet sends can fail, and audit should cope with that
>> event instead of trying to bludgeon it into not happening.
>
> I am not sure what exactly is the problem with the code that was being 
> patched. The audit code has been stable and working pretty well for everyone 
> for the last 5-6 years. But lately there has been a lot of kernel code churn 
> and I am concerned that the changes are being made without a complete 
> understanding of the design goals.

The problem is that the audit code in the kernel is not using netlink
correctly and which leads to mistakes when the code is updated because
the code in the kernel is unnecessarily complex, and inefficient.

> The audit system has to be very reliable. It can't lose any event or record. 
> The people that really depend on it would rather have access denied to the 
> system than lose any event. This is the reason it goes to such
> lengths.

But the designers of the code didn't really go to any lengths at all.
There are enormous and cumbersome hacks to avoid fixing the kernel
interfaces to do what audit wants to use the kernel interfaces for.
That is the audit code for transmitting replies is stupid and is causing
serious maintenance issues, by growing hacks upon hacks instead of
dealing with the core issues.

The first approximation of what you want should have been be to set the
netlink socket recvbuf to as large as the can be:
setsockopt(sk, SOL_SOCKET, SO_RCVBUFFORCE, INTMAX);

If/when 2GiB is shown to be not enough people should should have worked
on the netlink layer to allow even larger piles of audit messages to be
in flight, assuming you really do want to bring down the system when the
audit client is just too slow.

> If I understand the patch, it looks like its affecting code that adds, 
> deletes, 
> or lists audit rules. This code is quite old and working well. It didn't at 
> first. We found a lot of problems by writing scripts like:

Working well does not mean implemented well, and for a security feature
not implemented well means broken.  And frankly since I can't see a
single setsockopt to set the received buffer size for audit netlink
sockets I have to stronlgy suspect that people were working around
userpsace bugs with stupid kernel code.

> #!/bin/bash
> while [ 1 ] ;
> do
> echo "Inserting syscalls..."
> sys=1
> while [ $sys -lt 100 ]
> do
> sys=`expr $sys + 1`
> echo "$sys"
> auditctl -a entry,always -S $sys
> done
>
> echo "Listing..."
> auditctl -l
> echo "Deleting..."
> auditctl -D
> done
>
> with another shell running:
>
> #!/bin/bash
> while [ 1 ] ; 
> do
>   echo "Listing..."
>   auditctl -l
> done
>
> and another shell running:
>
> #!/bin/bash
> while [ 1 ] ; 
> do
>   echo "Disabling..."
>   auditctl -e 0
>   echo "Enabling..."
>   auditctl -e 1
> done
>
> You can probably imagine more stress tests. But the proposed code should be 
> well tested similar to this.

Honestly since no one cares enough to maintain the kernel code properly
I really think we should just rip audit out instead trying to present
userspace with the delusion that the code works, and will continue to
work properly.

My apologies for being grumpy.  I have no intention of breaking
userspace (which is why my last patch was an rfc) but at the same time
I care absolutely nothing about audit.  It solves none of my problems
and works at all of the wrong layers to be interesting for me.

I just happened to notice that the code has been broken since 3.14-rc1
and no one understands or cares enough about audit to even accept
trivial kernel patches to fix the bugs.  No offense but I am not
interested in testing code I care nothing about, especially when there
is no one to pick up the patches.  My RFC patch was supposed to be a
word to the wise.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-05 Thread Steve Grubb
On Tuesday, March 04, 2014 07:21:52 PM David Miller wrote:
> From: ebied...@xmission.com (Eric W. Biederman)
> Date: Tue, 04 Mar 2014 14:41:16 -0800
> 
> > If we really want the ability to always appened to the queue of skb's
> > is to just have a version of netlink_send_skb that ignores the queued
> > limits.  Of course an evil program then could force the generation of
> > enough audit records to DOS the kernel, but we seem to be in that
> > situation now.  Shrug.
> 
> There is never a valid reason to bypass the socket limits.
> 
> It protects the system from things going out of control.
> 
> Netlink packet sends can fail, and audit should cope with that
> event instead of trying to bludgeon it into not happening.

I am not sure what exactly is the problem with the code that was being 
patched. The audit code has been stable and working pretty well for everyone 
for the last 5-6 years. But lately there has been a lot of kernel code churn 
and I am concerned that the changes are being made without a complete 
understanding of the design goals.

The audit system has to be very reliable. It can't lose any event or record. 
The people that really depend on it would rather have access denied to the 
system than lose any event. This is the reason it goes to such lengths.

If I understand the patch, it looks like its affecting code that adds, deletes, 
or lists audit rules. This code is quite old and working well. It didn't at 
first. We found a lot of problems by writing scripts like:

#!/bin/bash
while [ 1 ] ;
do
echo "Inserting syscalls..."
sys=1
while [ $sys -lt 100 ]
do
sys=`expr $sys + 1`
echo "$sys"
auditctl -a entry,always -S $sys
done

echo "Listing..."
auditctl -l
echo "Deleting..."
auditctl -D
done

with another shell running:

#!/bin/bash
while [ 1 ] ; 
do
echo "Listing..."
auditctl -l
done

and another shell running:

#!/bin/bash
while [ 1 ] ; 
do
echo "Disabling..."
auditctl -e 0
echo "Enabling..."
auditctl -e 1
done

You can probably imagine more stress tests. But the proposed code should be 
well tested similar to this.

Thanks,
-Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-05 Thread Steve Grubb
On Tuesday, March 04, 2014 07:21:52 PM David Miller wrote:
 From: ebied...@xmission.com (Eric W. Biederman)
 Date: Tue, 04 Mar 2014 14:41:16 -0800
 
  If we really want the ability to always appened to the queue of skb's
  is to just have a version of netlink_send_skb that ignores the queued
  limits.  Of course an evil program then could force the generation of
  enough audit records to DOS the kernel, but we seem to be in that
  situation now.  Shrug.
 
 There is never a valid reason to bypass the socket limits.
 
 It protects the system from things going out of control.
 
 Netlink packet sends can fail, and audit should cope with that
 event instead of trying to bludgeon it into not happening.

I am not sure what exactly is the problem with the code that was being 
patched. The audit code has been stable and working pretty well for everyone 
for the last 5-6 years. But lately there has been a lot of kernel code churn 
and I am concerned that the changes are being made without a complete 
understanding of the design goals.

The audit system has to be very reliable. It can't lose any event or record. 
The people that really depend on it would rather have access denied to the 
system than lose any event. This is the reason it goes to such lengths.

If I understand the patch, it looks like its affecting code that adds, deletes, 
or lists audit rules. This code is quite old and working well. It didn't at 
first. We found a lot of problems by writing scripts like:

#!/bin/bash
while [ 1 ] ;
do
echo Inserting syscalls...
sys=1
while [ $sys -lt 100 ]
do
sys=`expr $sys + 1`
echo $sys
auditctl -a entry,always -S $sys
done

echo Listing...
auditctl -l
echo Deleting...
auditctl -D
done

with another shell running:

#!/bin/bash
while [ 1 ] ; 
do
echo Listing...
auditctl -l
done

and another shell running:

#!/bin/bash
while [ 1 ] ; 
do
echo Disabling...
auditctl -e 0
echo Enabling...
auditctl -e 1
done

You can probably imagine more stress tests. But the proposed code should be 
well tested similar to this.

Thanks,
-Steve
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-05 Thread Eric W. Biederman
Steve Grubb sgr...@redhat.com writes:

 On Tuesday, March 04, 2014 07:21:52 PM David Miller wrote:
 From: ebied...@xmission.com (Eric W. Biederman)
 Date: Tue, 04 Mar 2014 14:41:16 -0800
 
  If we really want the ability to always appened to the queue of skb's
  is to just have a version of netlink_send_skb that ignores the queued
  limits.  Of course an evil program then could force the generation of
  enough audit records to DOS the kernel, but we seem to be in that
  situation now.  Shrug.
 
 There is never a valid reason to bypass the socket limits.
 
 It protects the system from things going out of control.
 
 Netlink packet sends can fail, and audit should cope with that
 event instead of trying to bludgeon it into not happening.

 I am not sure what exactly is the problem with the code that was being 
 patched. The audit code has been stable and working pretty well for everyone 
 for the last 5-6 years. But lately there has been a lot of kernel code churn 
 and I am concerned that the changes are being made without a complete 
 understanding of the design goals.

The problem is that the audit code in the kernel is not using netlink
correctly and which leads to mistakes when the code is updated because
the code in the kernel is unnecessarily complex, and inefficient.

 The audit system has to be very reliable. It can't lose any event or record. 
 The people that really depend on it would rather have access denied to the 
 system than lose any event. This is the reason it goes to such
 lengths.

But the designers of the code didn't really go to any lengths at all.
There are enormous and cumbersome hacks to avoid fixing the kernel
interfaces to do what audit wants to use the kernel interfaces for.
That is the audit code for transmitting replies is stupid and is causing
serious maintenance issues, by growing hacks upon hacks instead of
dealing with the core issues.

The first approximation of what you want should have been be to set the
netlink socket recvbuf to as large as the can be:
setsockopt(sk, SOL_SOCKET, SO_RCVBUFFORCE, INTMAX);

If/when 2GiB is shown to be not enough people should should have worked
on the netlink layer to allow even larger piles of audit messages to be
in flight, assuming you really do want to bring down the system when the
audit client is just too slow.

 If I understand the patch, it looks like its affecting code that adds, 
 deletes, 
 or lists audit rules. This code is quite old and working well. It didn't at 
 first. We found a lot of problems by writing scripts like:

Working well does not mean implemented well, and for a security feature
not implemented well means broken.  And frankly since I can't see a
single setsockopt to set the received buffer size for audit netlink
sockets I have to stronlgy suspect that people were working around
userpsace bugs with stupid kernel code.

 #!/bin/bash
 while [ 1 ] ;
 do
 echo Inserting syscalls...
 sys=1
 while [ $sys -lt 100 ]
 do
 sys=`expr $sys + 1`
 echo $sys
 auditctl -a entry,always -S $sys
 done

 echo Listing...
 auditctl -l
 echo Deleting...
 auditctl -D
 done

 with another shell running:

 #!/bin/bash
 while [ 1 ] ; 
 do
   echo Listing...
   auditctl -l
 done

 and another shell running:

 #!/bin/bash
 while [ 1 ] ; 
 do
   echo Disabling...
   auditctl -e 0
   echo Enabling...
   auditctl -e 1
 done

 You can probably imagine more stress tests. But the proposed code should be 
 well tested similar to this.

Honestly since no one cares enough to maintain the kernel code properly
I really think we should just rip audit out instead trying to present
userspace with the delusion that the code works, and will continue to
work properly.

My apologies for being grumpy.  I have no intention of breaking
userspace (which is why my last patch was an rfc) but at the same time
I care absolutely nothing about audit.  It solves none of my problems
and works at all of the wrong layers to be interesting for me.

I just happened to notice that the code has been broken since 3.14-rc1
and no one understands or cares enough about audit to even accept
trivial kernel patches to fix the bugs.  No offense but I am not
interested in testing code I care nothing about, especially when there
is no one to pick up the patches.  My RFC patch was supposed to be a
word to the wise.

Eric
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread David Miller
From: ebied...@xmission.com (Eric W. Biederman)
Date: Tue, 04 Mar 2014 14:41:16 -0800

> If we really want the ability to always appened to the queue of skb's
> is to just have a version of netlink_send_skb that ignores the queued
> limits.  Of course an evil program then could force the generation of
> enough audit records to DOS the kernel, but we seem to be in that
> situation now.  Shrug.

There is never a valid reason to bypass the socket limits.

It protects the system from things going out of control.

Netlink packet sends can fail, and audit should cope with that
event instead of trying to bludgeon it into not happening.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread Andrew Morton
On Tue, 04 Mar 2014 14:41:16 -0800 ebied...@xmission.com (Eric W. Biederman) 
wrote:

> David Miller  writes:
> 
> > From: Andrew Morton 
> > Date: Tue, 4 Mar 2014 13:30:04 -0800
> >
> >> On Fri, 28 Feb 2014 20:50:19 -0800 ebied...@xmission.com (Eric W. 
> >> Biederman) wrote:
> >> 
> >>> 
> >>> Modify audit_send_reply to directly use a non-blocking send and
> >>> to return an error on failure (if anyone cares).
> >>> 
> >>> Modify audit_list_rules_send to use audit_send_reply and give up
> >>> if we can not send a packet.
> >>> 
> >>> Merge audit_list_rules into iaudit_list_rules_send as the code
> >>> is now sufficiently simple to not justify to callers.
> >>> 
> >>> Kill audit_send_list, audit_send_reply_thread because using
> >>> a separate thread for replies is not needed when sending
> >>> packets syncrhonously.
> >> 
> >> Nothing much seems to be happening here?
> 
> Well you picked up the patch that fixes the worst of the bugs that I was
> complaining about.  Beyond that I don't know what makes sense.

Oh, so I did.  I wasn't planning on merging it myself, hoping that
someone who hasaclue will step in.  Help.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread Eric W. Biederman
David Miller  writes:

> From: Andrew Morton 
> Date: Tue, 4 Mar 2014 13:30:04 -0800
>
>> On Fri, 28 Feb 2014 20:50:19 -0800 ebied...@xmission.com (Eric W. Biederman) 
>> wrote:
>> 
>>> 
>>> Modify audit_send_reply to directly use a non-blocking send and
>>> to return an error on failure (if anyone cares).
>>> 
>>> Modify audit_list_rules_send to use audit_send_reply and give up
>>> if we can not send a packet.
>>> 
>>> Merge audit_list_rules into iaudit_list_rules_send as the code
>>> is now sufficiently simple to not justify to callers.
>>> 
>>> Kill audit_send_list, audit_send_reply_thread because using
>>> a separate thread for replies is not needed when sending
>>> packets syncrhonously.
>> 
>> Nothing much seems to be happening here?

Well you picked up the patch that fixes the worst of the bugs that I was
complaining about.  Beyond that I don't know what makes sense.

>> In an earlier email you said "While reading through 3.14-rc1 I found a
>> pretty siginficant mishandling of network namespaces in the recent
>> audit changes." Were those recent audit changes a post-3.14 thing?  And
>> what is the user-visible effect of the mishandling?
>
> I took a quick look at this patch yesterday, and my only suspicion is
> that threads are created currently by this code purely to cons up a
> blockable context for the netlink code.
>
> Perhaps it wants to avoid any netlink packet drops from being possible.
>
> I'm not so sure.

Looking at the history (see below) the stated reason from David
Woodhouse is to prevent the removal of packets from the packet queues
when we have reached our rcvbuf limit.

The reasoning doesn't make a lick of sense to me and I was hoping the
folks dealing with audit would comment.

If we really want the ability to always appened to the queue of skb's
is to just have a version of netlink_send_skb that ignores the queued
limits.  Of course an evil program then could force the generation of
enough audit records to DOS the kernel, but we seem to be in that
situation now.  Shrug.

> Anyways, some investigation into perhaps figuring out the intentions of
> the original implementation would be nice.

I had looked briefly and missed a few things.  Here is the complete
history in all of it's confusion.

In brief.  The code came in using non-blocking netlink writes.

David Woodhouse made the writes blocking.

Al Viro, and then Eric Paris patch the code to deal with deadlocks
cause by the blocking writes.


commit 4527a30f157fca45c102ea49a2fb34a4502264eb
Author: akpm 
Date:   Mon Apr 12 20:29:12 2004 +

[PATCH] Light-weight Auditing Framework

From: Rik Faith 

This patch provides a low-overhead system-call auditing framework for Linux
that is usable by LSM components (e.g., SELinux).  This is an update of the
patch discussed in this thread:

http://marc.theaimsgroup.com/?t=10781588811=1=2

In brief, it provides for netlink-based logging of audit records that have
been generated in other parts of the kernel (e.g., SELinux) as well as the
ability to audit system calls, either independently (using simple
filtering) or as a compliment to the audit record that another part of the
kernel generated.

The main goals were to provide system call auditing with 1) as low overhead
as possible, and 2) without duplicating functionality that is already
provided by SELinux (and/or other security infrastructures).  This
framework will work "stand-alone", but is not designed to provide, e.g.,
CAPP functionality without another security component in place.

This updated patch includes changes from feedback I have received,
including the ability to compile without CONFIG_NET (and better use of
tabs, so use -w if you diff against the older patch).

Please see http://people.redhat.com/faith/audit/ for an early example
user-space client (auditd-0.4.tar.gz) and instructions on how to try it.

My future intentions at the kernel level include improving filtering (e.g.,
syscall personality/exit codes) and syscall support for more architectures.
 First, though, I'm going to work on documentation, a (real) audit daemon,
and patches for other user-space tools so that people can play with the
framework and understand how it can be used with and without SELinux.


Update:

Light-weight Auditing Framework receive filter fixes
From: Rik Faith 

Since audit_receive_filter() is only called with audit_netlink_sem held, it
cannot race with either audit_del_rule() or audit_add_rule(), so the
list_for_each_entry_rcu()s may be replaced by list_for_each_entry()s, and
the rcu_read_{un,}lock()s removed.  A fix for this is part of the attached
patch.

Other features of the attached patch are:

1) generalized the ability to test for inequality
   
2) added syscall exit status reporting and testing

3) added ability to 

Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread David Miller
From: Andrew Morton 
Date: Tue, 4 Mar 2014 13:30:04 -0800

> On Fri, 28 Feb 2014 20:50:19 -0800 ebied...@xmission.com (Eric W. Biederman) 
> wrote:
> 
>> 
>> Modify audit_send_reply to directly use a non-blocking send and
>> to return an error on failure (if anyone cares).
>> 
>> Modify audit_list_rules_send to use audit_send_reply and give up
>> if we can not send a packet.
>> 
>> Merge audit_list_rules into iaudit_list_rules_send as the code
>> is now sufficiently simple to not justify to callers.
>> 
>> Kill audit_send_list, audit_send_reply_thread because using
>> a separate thread for replies is not needed when sending
>> packets syncrhonously.
> 
> Nothing much seems to be happening here?
> 
> In an earlier email you said "While reading through 3.14-rc1 I found a
> pretty siginficant mishandling of network namespaces in the recent
> audit changes." Were those recent audit changes a post-3.14 thing?  And
> what is the user-visible effect of the mishandling?

I took a quick look at this patch yesterday, and my only suspicion is
that threads are created currently by this code purely to cons up a
blockable context for the netlink code.

Perhaps it wants to avoid any netlink packet drops from being possible.

I'm not so sure.

Anyways, some investigation into perhaps figuring out the intentions of
the original implementation would be nice.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread Andrew Morton
On Fri, 28 Feb 2014 20:50:19 -0800 ebied...@xmission.com (Eric W. Biederman) 
wrote:

> 
> Modify audit_send_reply to directly use a non-blocking send and
> to return an error on failure (if anyone cares).
> 
> Modify audit_list_rules_send to use audit_send_reply and give up
> if we can not send a packet.
> 
> Merge audit_list_rules into iaudit_list_rules_send as the code
> is now sufficiently simple to not justify to callers.
> 
> Kill audit_send_list, audit_send_reply_thread because using
> a separate thread for replies is not needed when sending
> packets syncrhonously.

Nothing much seems to be happening here?

In an earlier email you said "While reading through 3.14-rc1 I found a
pretty siginficant mishandling of network namespaces in the recent
audit changes." Were those recent audit changes a post-3.14 thing?  And
what is the user-visible effect of the mishandling?

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread Andrew Morton
On Fri, 28 Feb 2014 20:50:19 -0800 ebied...@xmission.com (Eric W. Biederman) 
wrote:

 
 Modify audit_send_reply to directly use a non-blocking send and
 to return an error on failure (if anyone cares).
 
 Modify audit_list_rules_send to use audit_send_reply and give up
 if we can not send a packet.
 
 Merge audit_list_rules into iaudit_list_rules_send as the code
 is now sufficiently simple to not justify to callers.
 
 Kill audit_send_list, audit_send_reply_thread because using
 a separate thread for replies is not needed when sending
 packets syncrhonously.

Nothing much seems to be happening here?

In an earlier email you said While reading through 3.14-rc1 I found a
pretty siginficant mishandling of network namespaces in the recent
audit changes. Were those recent audit changes a post-3.14 thing?  And
what is the user-visible effect of the mishandling?

Thanks.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread David Miller
From: Andrew Morton a...@linux-foundation.org
Date: Tue, 4 Mar 2014 13:30:04 -0800

 On Fri, 28 Feb 2014 20:50:19 -0800 ebied...@xmission.com (Eric W. Biederman) 
 wrote:
 
 
 Modify audit_send_reply to directly use a non-blocking send and
 to return an error on failure (if anyone cares).
 
 Modify audit_list_rules_send to use audit_send_reply and give up
 if we can not send a packet.
 
 Merge audit_list_rules into iaudit_list_rules_send as the code
 is now sufficiently simple to not justify to callers.
 
 Kill audit_send_list, audit_send_reply_thread because using
 a separate thread for replies is not needed when sending
 packets syncrhonously.
 
 Nothing much seems to be happening here?
 
 In an earlier email you said While reading through 3.14-rc1 I found a
 pretty siginficant mishandling of network namespaces in the recent
 audit changes. Were those recent audit changes a post-3.14 thing?  And
 what is the user-visible effect of the mishandling?

I took a quick look at this patch yesterday, and my only suspicion is
that threads are created currently by this code purely to cons up a
blockable context for the netlink code.

Perhaps it wants to avoid any netlink packet drops from being possible.

I'm not so sure.

Anyways, some investigation into perhaps figuring out the intentions of
the original implementation would be nice.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread Eric W. Biederman
David Miller da...@davemloft.net writes:

 From: Andrew Morton a...@linux-foundation.org
 Date: Tue, 4 Mar 2014 13:30:04 -0800

 On Fri, 28 Feb 2014 20:50:19 -0800 ebied...@xmission.com (Eric W. Biederman) 
 wrote:
 
 
 Modify audit_send_reply to directly use a non-blocking send and
 to return an error on failure (if anyone cares).
 
 Modify audit_list_rules_send to use audit_send_reply and give up
 if we can not send a packet.
 
 Merge audit_list_rules into iaudit_list_rules_send as the code
 is now sufficiently simple to not justify to callers.
 
 Kill audit_send_list, audit_send_reply_thread because using
 a separate thread for replies is not needed when sending
 packets syncrhonously.
 
 Nothing much seems to be happening here?

Well you picked up the patch that fixes the worst of the bugs that I was
complaining about.  Beyond that I don't know what makes sense.

 In an earlier email you said While reading through 3.14-rc1 I found a
 pretty siginficant mishandling of network namespaces in the recent
 audit changes. Were those recent audit changes a post-3.14 thing?  And
 what is the user-visible effect of the mishandling?

 I took a quick look at this patch yesterday, and my only suspicion is
 that threads are created currently by this code purely to cons up a
 blockable context for the netlink code.

 Perhaps it wants to avoid any netlink packet drops from being possible.

 I'm not so sure.

Looking at the history (see below) the stated reason from David
Woodhouse is to prevent the removal of packets from the packet queues
when we have reached our rcvbuf limit.

The reasoning doesn't make a lick of sense to me and I was hoping the
folks dealing with audit would comment.

If we really want the ability to always appened to the queue of skb's
is to just have a version of netlink_send_skb that ignores the queued
limits.  Of course an evil program then could force the generation of
enough audit records to DOS the kernel, but we seem to be in that
situation now.  Shrug.

 Anyways, some investigation into perhaps figuring out the intentions of
 the original implementation would be nice.

I had looked briefly and missed a few things.  Here is the complete
history in all of it's confusion.

In brief.  The code came in using non-blocking netlink writes.

David Woodhouse made the writes blocking.

Al Viro, and then Eric Paris patch the code to deal with deadlocks
cause by the blocking writes.


commit 4527a30f157fca45c102ea49a2fb34a4502264eb
Author: akpm akpm
Date:   Mon Apr 12 20:29:12 2004 +

[PATCH] Light-weight Auditing Framework

From: Rik Faith fa...@redhat.com

This patch provides a low-overhead system-call auditing framework for Linux
that is usable by LSM components (e.g., SELinux).  This is an update of the
patch discussed in this thread:

http://marc.theaimsgroup.com/?t=10781588811r=1w=2

In brief, it provides for netlink-based logging of audit records that have
been generated in other parts of the kernel (e.g., SELinux) as well as the
ability to audit system calls, either independently (using simple
filtering) or as a compliment to the audit record that another part of the
kernel generated.

The main goals were to provide system call auditing with 1) as low overhead
as possible, and 2) without duplicating functionality that is already
provided by SELinux (and/or other security infrastructures).  This
framework will work stand-alone, but is not designed to provide, e.g.,
CAPP functionality without another security component in place.

This updated patch includes changes from feedback I have received,
including the ability to compile without CONFIG_NET (and better use of
tabs, so use -w if you diff against the older patch).

Please see http://people.redhat.com/faith/audit/ for an early example
user-space client (auditd-0.4.tar.gz) and instructions on how to try it.

My future intentions at the kernel level include improving filtering (e.g.,
syscall personality/exit codes) and syscall support for more architectures.
 First, though, I'm going to work on documentation, a (real) audit daemon,
and patches for other user-space tools so that people can play with the
framework and understand how it can be used with and without SELinux.


Update:

Light-weight Auditing Framework receive filter fixes
From: Rik Faith fa...@redhat.com

Since audit_receive_filter() is only called with audit_netlink_sem held, it
cannot race with either audit_del_rule() or audit_add_rule(), so the
list_for_each_entry_rcu()s may be replaced by list_for_each_entry()s, and
the rcu_read_{un,}lock()s removed.  A fix for this is part of the attached
patch.

Other features of the attached patch are:

1) generalized the ability to test for inequality
   
2) added syscall exit status reporting and testing

3) added 

Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread Andrew Morton
On Tue, 04 Mar 2014 14:41:16 -0800 ebied...@xmission.com (Eric W. Biederman) 
wrote:

 David Miller da...@davemloft.net writes:
 
  From: Andrew Morton a...@linux-foundation.org
  Date: Tue, 4 Mar 2014 13:30:04 -0800
 
  On Fri, 28 Feb 2014 20:50:19 -0800 ebied...@xmission.com (Eric W. 
  Biederman) wrote:
  
  
  Modify audit_send_reply to directly use a non-blocking send and
  to return an error on failure (if anyone cares).
  
  Modify audit_list_rules_send to use audit_send_reply and give up
  if we can not send a packet.
  
  Merge audit_list_rules into iaudit_list_rules_send as the code
  is now sufficiently simple to not justify to callers.
  
  Kill audit_send_list, audit_send_reply_thread because using
  a separate thread for replies is not needed when sending
  packets syncrhonously.
  
  Nothing much seems to be happening here?
 
 Well you picked up the patch that fixes the worst of the bugs that I was
 complaining about.  Beyond that I don't know what makes sense.

Oh, so I did.  I wasn't planning on merging it myself, hoping that
someone who hasaclue will step in.  Help.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][PATCH] audit: Simplify by assuming the callers socket buffer is large enough

2014-03-04 Thread David Miller
From: ebied...@xmission.com (Eric W. Biederman)
Date: Tue, 04 Mar 2014 14:41:16 -0800

 If we really want the ability to always appened to the queue of skb's
 is to just have a version of netlink_send_skb that ignores the queued
 limits.  Of course an evil program then could force the generation of
 enough audit records to DOS the kernel, but we seem to be in that
 situation now.  Shrug.

There is never a valid reason to bypass the socket limits.

It protects the system from things going out of control.

Netlink packet sends can fail, and audit should cope with that
event instead of trying to bludgeon it into not happening.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/