Re: [PATCH ghak25 v2 1/9] netfilter: normalize x_table function declarations

2020-01-08 Thread Nicolas Dichtel
On 06/01/2020 at 19:54, Richard Guy Briggs wrote:
> Git context diffs were being produced with unhelpful declaration types
> in the place of function names to help identify the function in which
> changes were made.
Just for my information, how do you reproduce that? With a 'git diff'?

> 
> Normalize x_table function declarations so that git context diff
> function labels work as expected.
> 
[snip]
> 
> -- 
> 1.8.3.1
git v1.8.3.1 is seven years old:
https://github.com/git/git/releases/tag/v1.8.3.1

I don't see any problem with git v2.24. I'm not sure the patch brings any
value beyond complicating backports.

Regards,
Nicolas


--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit

[PATCH net-next] netfilter: create audit records for ebtables replaces

2014-09-08 Thread Nicolas Dichtel
This is already done for x_tables (family AF_INET and AF_INET6), let's do it
for AF_BRIDGE also.

Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com>
---

v2: move audit chunks to do_replace_finish()

 net/bridge/netfilter/ebtables.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 6d69631b9f4d..d9a8c05d995d 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -26,6 +26,7 @@
 #include <asm/uaccess.h>
 #include <linux/smp.h>
 #include <linux/cpumask.h>
+#include <linux/audit.h>
 #include <net/sock.h>
 /* needed for logical [in,out]-dev filtering */
 #include "../br_private.h"
@@ -1058,6 +1059,20 @@ static int do_replace_finish(struct net *net, struct ebt_replace *repl,
 	vfree(table);
 
 	vfree(counterstmp);
+
+#ifdef CONFIG_AUDIT
+	if (audit_enabled) {
+		struct audit_buffer *ab;
+
+		ab = audit_log_start(current->audit_context, GFP_KERNEL,
+				     AUDIT_NETFILTER_CFG);
+		if (ab) {
+			audit_log_format(ab, "table=%s family=%u entries=%u",
+					 repl->name, AF_BRIDGE, repl->nentries);
+			audit_log_end(ab);
+		}
+	}
+#endif
 	return ret;
 
 free_unlock:
-- 
1.9.0
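
To make the record format in this patch concrete, here is a minimal userspace
sketch. It is not kernel code: format_nf_cfg_record() is a hypothetical helper
that only reuses the format string from the patch, with snprintf() standing in
for audit_log_format(). AF_BRIDGE is taken from <sys/socket.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/socket.h>	/* AF_BRIDGE */

/*
 * Hypothetical userspace helper: formats the same key=value payload that the
 * patch passes to audit_log_format(). Returns what snprintf() returns: the
 * number of characters needed, or a negative value on error.
 */
static int format_nf_cfg_record(char *buf, size_t len, const char *table,
				unsigned int nentries)
{
	assert(buf != NULL && table != NULL);
	return snprintf(buf, len, "table=%s family=%u entries=%u",
			table, (unsigned int)AF_BRIDGE, nentries);
}
```

Calling format_nf_cfg_record(buf, sizeof(buf), "filter", 3) fills buf with a
line starting with "table=filter family=" and ending with "entries=3".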



[PATCH net-next] netfilter: create audit records for ebtables replaces

2014-09-06 Thread Nicolas Dichtel
This is already done for x_tables (family AF_INET and AF_INET6), let's do it
for AF_BRIDGE also.

Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com>
---
 net/bridge/netfilter/ebtables.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 6d69631b9f4d..4ba0c5c78778 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -26,6 +26,7 @@
 #include <asm/uaccess.h>
 #include <linux/smp.h>
 #include <linux/cpumask.h>
+#include <linux/audit.h>
 #include <net/sock.h>
 /* needed for logical [in,out]-dev filtering */
 #include "../br_private.h"
@@ -1126,6 +1127,20 @@ static int do_replace(struct net *net, const void __user *user,
 	}
 
 	ret = do_replace_finish(net, &tmp, newinfo);
+#ifdef CONFIG_AUDIT
+	if (audit_enabled) {
+		struct audit_buffer *ab;
+
+		ab = audit_log_start(current->audit_context, GFP_KERNEL,
+				     AUDIT_NETFILTER_CFG);
+		if (ab) {
+			audit_log_format(ab, "table=%s family=%u entries=%u",
+					 tmp.name, AF_BRIDGE,
+					 tmp.nentries);
+			audit_log_end(ab);
+		}
+	}
+#endif
 	if (ret == 0)
 		return ret;
 free_entries:
-- 
1.9.0



Re: [PATCH V4 3/8] namespaces: expose ns instance serial numbers in proc

2014-08-25 Thread Nicolas Dichtel

On 24/08/2014 at 19:52, Andy Lutomirski wrote:

On Thu, Aug 21, 2014 at 6:58 PM, Richard Guy Briggs r...@redhat.com wrote:

On 14/08/21, Andy Lutomirski wrote:

On Aug 20, 2014 8:12 PM, Richard Guy Briggs r...@redhat.com wrote:

Expose the namespace instance serial numbers in the proc filesystem at
/proc/<pid>/ns/<ns>_snum.  The link text gives the serial number in hex.


What's the use case?

I understand the utility of giving unique numbers to the audit code,
but I don't think this part is necessary for that, and I'd like to
understand what else will use this before committing to a duplicative
API like this.


How does a container manager get those numbers?  It could provoke a task
to cause an audit event that emits a NS_INFO message, or it could run a
task in that container to report its namespace serial numbers directly
from its /proc mount.


Why does a container manager need them?  Is there any reason that
keeping them entirely contained within the audit system would be a
problem?



The discussion in this thread touches on the use cases:
 https://lkml.org/lkml/2014/4/22/662


Note that this API is thoroughly incompatible with CRIU.  If we do
this, someone will ask for a namespace number namespace, and that way
lies madness.


I had a very brief look at CRIU, but not enough to understand the issue.
Others have hinted at this problem.

Do you have a suggestion of a different approach that would be
compatible with CRIU?

I'd originally considered some sort of UUID that would be globally
unique, but that would be very hard to devise or guarantee, and besides,
namespaces aren't only used by containers and could be shared in other
ways.  Tracking the usage and migration of namespaces should be the task
of an upper layer.



CRIU wants to save the complete state of a namespace and then restore
it.  For that to work, any information exposed to things in the
namespace *cannot* be globally unique or unique per boot, since CRIU
needs to arrange for that information to match whatever it was when
CRIU saved it.

How are ifindex of network devices managed? These ifindexes are unique per boot,
thus can change depending on the order in which netdev are created.
These ifindexes are unique per boot and exposed to userspace ...



Also, I think that code running in a namespace has no business even
knowing a unique identity of that namespace from the perspective of
the host.

Another scenario is when you have virtual network devices across two netns. You
need to identify the peer netns to have a netlink message which is fully 
interpretable by the userspace.



Regards,
Nicolas



Re: [PATCH V4 3/8] namespaces: expose ns instance serial numbers in proc

2014-08-25 Thread Nicolas Dichtel

On 25/08/2014 at 16:04, Andy Lutomirski wrote:

On Aug 25, 2014 6:30 AM, Nicolas Dichtel nicolas.dich...@6wind.com wrote:

CRIU wants to save the complete state of a namespace and then restore
it.  For that to work, any information exposed to things in the
namespace *cannot* be globally unique or unique per boot, since CRIU
needs to arrange for that information to match whatever it was when
CRIU saved it.


How are ifindex of network devices managed? These ifindexes are unique per boot,
thus can change depending on the order in which netdev are created.
These ifindexes are unique per boot and exposed to userspace ...



This does not appear to be true.

$ sudo unshare --net
# ip link add veth0 type veth peer name veth1
# ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 06:0d:59:c7:a6:a8 brd ff:ff:ff:ff:ff:ff
3: veth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether b2:5c:8b:f2:12:28 brd ff:ff:ff:ff:ff:ff
# logout
$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: em1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000


I've probably misunderstood what you're trying to say. ifindexes are unique per
boot and per netns. These ifindexes depend on the interface creation order:

$ ip netns add 1
$ ip link set eth1 netns 1
$ ip netns exec 1 ip link add veth0 type veth peer name veth1
$ ip netns exec 1 ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 9a:a0:89:99:a0:3c brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff
4: veth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 96:86:44:49:ce:a8 brd ff:ff:ff:ff:ff:ff
$ ip netns del 1
$ ip netns add 1
$ ip netns exec 1 ip link add veth0 type veth peer name veth1
$ ip link set eth1 netns 1
$ ip netns exec 1 ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 86:92:90:01:32:6b brd ff:ff:ff:ff:ff:ff
3: veth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether ae:8b:d2:71:48:a2 brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff

Note: when an interface is moved to another netns, the ifindex is kept if
possible, else another ifindex is chosen.
I will dig a bit to understand how CRIU saves this netns information.





Also, I think that code running in a namespace has no business even
knowing a unique identity of that namespace from the perspective of
the host.


Another scenario is when you have virtual network devices across two netns. You
need to identify the peer netns to have a netlink message which is fully 
interpretable by the userspace.


Let me try again, with emphasis in the right place.

I think that *code running in a namespace* has no business even
knowing a unique identity of *that namespace* from the perspective of
the host.

In your example, if there's a veth device between netns A and netns B,
then code *in netns A* has no business knowing the identity of its
veth peer if its peer (B) is a sibling or ancestor.  It also IMO has
no business knowing the identity of its own netns (A) other than as
my netns.

I do not agree (see the example below).



If A and B are siblings, then their parent needs to know where that
veth device goes, but I think this is already the case to a sufficient
extent today.

I'm not aware of a hierarchy between netns. A daemon should be able to
get the full network configuration, even if it's started when this configuration
is already applied, i.e. even if it doesn't know what happened before it started.



I feel like this discussion is falling into a common trap of new API
discussions.  Can one of you who wants this API please articulate,
with a reasonably precise example, what it is that you want to do, why
you can't easily do it already, and how this API helps?  I currently
understand how the API creates problems, but I don't understand how it
solves any problems, and I will NAK it (and I suspect that Eric will,
too, which is pretty much fatal) unless that changes.

What I'm trying to solve is to have

Re: [PATCH V4 3/8] namespaces: expose ns instance serial numbers in proc

2014-08-25 Thread Nicolas Dichtel

On 25/08/2014 at 18:13, Andy Lutomirski wrote:

On Mon, Aug 25, 2014 at 8:43 AM, Nicolas Dichtel nicolas.dich...@6wind.com wrote:

On 25/08/2014 at 16:04, Andy Lutomirski wrote:


On Aug 25, 2014 6:30 AM, Nicolas Dichtel nicolas.dich...@6wind.com wrote:


CRIU wants to save the complete state of a namespace and then restore
it.  For that to work, any information exposed to things in the
namespace *cannot* be globally unique or unique per boot, since CRIU
needs to arrange for that information to match whatever it was when
CRIU saved it.



How are ifindex of network devices managed? These ifindexes are unique per boot,
thus can change depending on the order in which netdev are created.
These ifindexes are unique per boot and exposed to userspace ...



This does not appear to be true.

$ sudo unshare --net
# ip link add veth0 type veth peer name veth1
# ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 06:0d:59:c7:a6:a8 brd ff:ff:ff:ff:ff:ff
3: veth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether b2:5c:8b:f2:12:28 brd ff:ff:ff:ff:ff:ff
# logout
$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: em1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000


I've probably misunderstood what you're trying to say. ifindexes are unique
per boot and per netns.


I think we both misunderstood each other.  The ifindexes are unique
*per netns*, which means that, if you're unprivileged in a netns,
global information doesn't leak to you.  I think this is good.

Ok, I agree. I think audit daemons are always running under privileged users.





Let me try again, with emphasis in the right place.

I think that *code running in a namespace* has no business even
knowing a unique identity of *that namespace* from the perspective of
the host.

In your example, if there's a veth device between netns A and netns B,
then code *in netns A* has no business knowing the identity of its
veth peer if its peer (B) is a sibling or ancestor.  It also IMO has
no business knowing the identity of its own netns (A) other than as
my netns.


I do not agree (see the example below).




If A and B are siblings, then their parent needs to know where that
veth device goes, but I think this is already the case to a sufficient
extent today.


I'm not aware of a hierarchy between netns. A daemon should be able to
get the full network configuration, even if it's started when this
configuration is already applied, i.e. even if it doesn't know what happened
before it started.



I don't know exactly which namespaces have an explicit hierarchy, but
there is certainly a hierarchy of *user* namespaces, and network
namespaces live in user namespaces, so they at least have somewhat of
a hierarchy.





I feel like this discussion is falling into a common trap of new API
discussions.  Can one of you who wants this API please articulate,
with a reasonably precise example, what it is that you want to do, why
you can't easily do it already, and how this API helps?  I currently
understand how the API creates problems, but I don't understand how it
solves any problems, and I will NAK it (and I suspect that Eric will,
too, which is pretty much fatal) unless that changes.


What I'm trying to solve is to have full info in netlink messages sent by the
kernel, thus being able to identify a peer netns (and this is close to what
the audit guys are trying to have). Theoretically, messages sent by the kernel
can be reused as is to recreate the same configuration. This is not the case
with x-netns devices. Here is an example, with ip tunnels:

$ ip netns add 1
$ ip link add ipip1 type ipip remote 10.16.0.121 local 10.16.0.249 dev eth0
$ ip -d link ls ipip1
8: ipip1@eth0: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
    link/ipip 10.16.0.249 peer 10.16.0.121 promiscuity 0
    ipip remote 10.16.0.121 local 10.16.0.249 dev eth0 ttl inherit pmtudisc
$ ip link set ipip1 netns 1
$ ip netns exec 1 ip -d link ls ipip1
8: ipip1@tunl0: <POINTOPOINT,NOARP,M-DOWN> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
    link/ipip 10.16.0.249 peer 10.16.0.121 promiscuity 0
    ipip remote 10.16.0.121 local 10.16.0.249 dev tunl0 ttl inherit pmtudisc

Now the information returned by 'ip link' is wrong and incomplete:
  - the link dev is now tunl0 instead of eth0, because we only got an ifindex
    from the kernel without any netns information.
  - the encapsulation addresses are not part of this netns but the user
    doesn't know that (still because netns info is missing). These IPv4
    addresses may exist in this netns.
  - it's not possible to create the same netdevice

Re: [PATCH V3 0/6] namespaces: log namespaces per task

2014-08-20 Thread Nicolas Dichtel

On 20/08/2014 at 18:25, Richard Guy Briggs wrote:

On 14/08/19, Eric W. Biederman wrote:

Richard Guy Briggs r...@redhat.com writes:


On 14/05/20, Richard Guy Briggs wrote:

On 14/05/20, Eric Paris wrote:

On Tue, 2014-05-20 at 09:12 -0400, Richard Guy Briggs wrote:

The purpose is to track namespaces in use by logged processes from the
perspective of init_*_ns.


(Including the Linux API list due to the additions to /proc/pid/ns/.
Please see http://www.kernelhub.org/?p=2&msg=477668 and in particular
http://www.kernelhub.org/?msg=477678&p=2 )


Sigh, if you have to use something like this, use the proc inode
number.  It is the same thing.

I hate to claim it is unique absent of the proc superblock but it is and
will be for the foreseeable future.

It would be better to include the block device number that appears in
proc of 3h of the primary mount of to qualify the number.  But it is not
particularly important.  Coming up with an additional unique number that
needs to be maintained seems strongly silly.


I am reading a contradiction here:
https://www.redhat.com/archives/linux-audit/2013-March/msg00032.html

and this posting went completely ignored:
https://www.redhat.com/archives/linux-audit/2014-January/msg00180.html

And then there was this patchset and thread where there was some good
discussion to clarify the use case:
https://lkml.org/lkml/2014/4/22/662

Then V2:
https://lkml.org/lkml/2014/5/9/637

Then V3 3 months ago:
https://www.redhat.com/archives/linux-audit/2014-May/msg00071.html

I'm about to post another version of the patchset addressing Eric Paris'
concerns about record types, field naming...

I am also trying to find a way to identify a netns from userland, to solve
some networking problems (see
http://thread.gmane.org/gmane.linux.network/315933/focus=321753).


This serial number solution may be reused for this.
We really need to find a way to solve this.


Regards,
Nicolas



Re: [PATCH 0/2] namespaces: log namespaces per task

2014-05-07 Thread Nicolas Dichtel

On 06/05/2014 at 23:15, Richard Guy Briggs wrote:

On 14/05/05, Nicolas Dichtel wrote:

On 02/05/2014 at 16:28, Richard Guy Briggs wrote:

On 14/05/02, Serge E. Hallyn wrote:

Quoting Richard Guy Briggs (r...@redhat.com):

I saw no replies to my questions when I replied a year after Aris' posting, so
I don't know if it was ignored or got lost in stale threads:
 https://www.redhat.com/archives/linux-audit/2013-March/msg00020.html
 https://www.redhat.com/archives/linux-audit/2013-March/msg00033.html

(https://lists.linux-foundation.org/pipermail/containers/2013-March/032063.html)
 https://www.redhat.com/archives/linux-audit/2014-January/msg00180.html

I've tried to answer a number of questions that were raised in that thread.

The goal is not quite identical to Aris' patchset.

The purpose is to track namespaces in use by logged processes from the
perspective of init_*_ns.  The first patch defines a function to list them.
The second patch provides an example of usage for audit_log_task_info() which
is used by syscall audits, among others.  audit_log_task() and
audit_common_recv_message() would be other potential use cases.

Use a serial number per namespace (unique across one boot of one kernel)
instead of the inode number (which is claimed to have had the right to change
reserved and is not necessarily unique if there is more than one proc fs).  It
could be argued that the inode numbers have now become a defacto interface and
can't change now, but I'm proposing this approach to see if this helps address
some of the objections to the earlier patchset.
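
As a rough illustration of "a serial number per namespace, unique across one
boot of one kernel", here is a userspace sketch of a monotonically increasing
allocator. It is not the code from the patchset: ns_serial_counter and
ns_serial_alloc() are hypothetical names, and C11 atomics stand in for
whatever primitive the kernel side uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical global counter; 0 is reserved to mean "no serial assigned". */
static atomic_uint_fast64_t ns_serial_counter;

/*
 * Returns a new serial number. Values start at 1 and are never reused while
 * the process (standing in for one boot of one kernel) is running.
 */
static uint64_t ns_serial_alloc(void)
{
	uint64_t serial = atomic_fetch_add(&ns_serial_counter, 1) + 1;

	assert(serial != 0);	/* would only trip after 2^64 allocations */
	return serial;
}
```

Because the counter only ever grows, a number seen in an old log record cannot
later be handed to a different namespace during the same run, which is the
property that a recyclable inode number does not give.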

There could also have messages added to track the creation and the destruction
of namespaces, listing the parent for hierarchical namespaces such as pidns,
userns, and listing other ids for non-hierarchical namespaces, as well as other
information to help identify a namespace.

There has been some progress made for audit in net namespaces and pid
namespaces since this previous thread.  net namespaces are now served as peers
by one auditd in the init_net namespace with processes in a non-init_net
namespace being able to write records if they are in the init_user_ns and have
CAP_AUDIT_WRITE.  Processes in a non-init_pid_ns can now similarly write
records.  As for CAP_AUDIT_READ, I just posted a patchset to check capabilities
of userspace processes that try to join netlink broadcast groups.


Questions:
Is there a way to link serial numbers of namespaces involved in migration of a
container to another kernel?  (I had a brief look at CRIU.)  Is there a unique
identifier for each running instance of a kernel?  Or at least some identifier
within the container migration realm?


Eric Biederman has always been adamantly opposed to adding new namespaces
of namespaces, so the fact that you're asking this question concerns me.


I have seen that position and I don't fully understand the justification
for it other than added complexity.

Just FYI, have you seen this thread:
http://thread.gmane.org/gmane.linux.network/286572/

There are some explanations/examples on this topic.


Thanks for that reference.  I read it through, but will need to do so
again to get it to sink in.


I think audit has the same problem as x-netns netdevices: being able to
identify a peer netns when a userland app reads a message from the kernel.


The main problem with file descriptors is that you cannot use them when you
broadcast a message from kernel to userland.

Maybe we can use the local names concept (like file descriptors but without
their constraints), i.e. having an identifier of a peer (net)ns which is only
valid in the current (net)ns. When the kernel needs to identify a peer (net)ns,
it uses this identifier (or allocates it the first time). After that, userland
apps may reuse this identifier to configure things in the peer (net)ns.
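
A minimal sketch of the local names idea, assuming a small fixed-size table per
namespace: each namespace keeps its own mapping from peer to id, and an id is
allocated the first time a peer is referenced. Every name here (struct
local_ns_ids, local_ns_id(), MAX_PEERS) is hypothetical and only meant to show
the allocation rule.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PEERS 8	/* arbitrary bound for the sketch */

/* Hypothetical per-namespace table: slot index + 1 is the local id. */
struct local_ns_ids {
	const void *peer[MAX_PEERS];	/* opaque handle of each known peer */
	int count;
};

/*
 * Returns the id under which `peer` is known inside `self`, allocating one
 * on first use. Ids start at 1; 0 means the table is full.
 */
static int local_ns_id(struct local_ns_ids *self, const void *peer)
{
	int i;

	assert(self != NULL && peer != NULL);
	for (i = 0; i < self->count; i++)
		if (self->peer[i] == peer)
			return i + 1;
	if (self->count == MAX_PEERS)
		return 0;
	self->peer[self->count] = peer;
	return ++self->count;
}
```

The same peer can get different ids in two different tables, so an id carries
no meaning outside the namespace that allocated it.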

Eric, any thoughts about this?

Regards,
Nicolas



Re: [PATCH 0/2] namespaces: log namespaces per task

2014-05-06 Thread Nicolas Dichtel

On 06/05/2014 at 01:23, James Bottomley wrote:



On May 5, 2014 3:36:38 PM PDT, Serge Hallyn serge.hal...@ubuntu.com wrote:

Quoting James Bottomley (james.bottom...@hansenpartnership.com):

On Mon, 2014-05-05 at 22:27 +, Serge Hallyn wrote:

Quoting James Bottomley (james.bottom...@hansenpartnership.com):

On Mon, 2014-05-05 at 17:48 -0400, Richard Guy Briggs wrote:

On 14/05/05, Serge E. Hallyn wrote:

Quoting James Bottomley (james.bottom...@hansenpartnership.com):

On Tue, 2014-04-22 at 14:12 -0400, Richard Guy Briggs wrote:

Questions:
Is there a way to link serial numbers of namespaces involved in migration of a
container to another kernel?  (I had a brief look at CRIU.)  Is there a unique
identifier for each running instance of a kernel?  Or at least some identifier
within the container migration realm?


Are you asking for a way of distinguishing a migrated container from an
unmigrated one?  The answer is pretty much no because the job of
migration is to restore to the same state as much as possible.

Reading between the lines, I think your goal is to correlate audit
information across a container migration, right?  Ideally the management
system should be able to cough up an audit trail for a container
wherever it's running and however many times it's been migrated?

In that case, I think your idea of a numeric serial number in a dense
range is wrong.  Because the range is dense you're obviously never going
to be able to use the same serial number across a migration.  However,


Ah, but I was being silly before, we can actually address this pretty
simply.  If we just (for instance) add
/proc/self/ns/{ipc,mnt,net,pid,user,uts}_seq containing the serial number
for the relevant ns for the task, then criu can dump this info at
checkpoint.  Then at restart it can dump an audit message per task and
ns saying old_serial=%x,new_serial=%x.  That way the audit log reader
can if it cares keep track.


This is the sort of idea I had in mind...


OK, but I don't understand then why you need a serial number.  There are
plenty of things we preserve across a migration, like namespace name for
instance.  Could you explain what function it performs because I think I
might be missing something.


We're looking ahead to a time when audit is namespaced, and a container
can keep its own audit logs (without limiting what the host audits of
course).  So if a container is auditing suspicious activity by some
task in a sub-namespace, then the whole parent container gets migrated,
after migration we want to continue being able to correlate the namespaces.

We're also looking at audit trails on a host that is up for years.  We
would like every namespace to be uniquely logged there.  That is why
inode #s on /proc/self/ns/* are not sufficient, unless we add a generation
# (which would end up more complicated, not less, than a serial #).


Right, but when the container has an audit namespace, that namespace has
a name,


What ns has a name?


The netns for instance.

A netns does not have a name. iproute2 uses names (a filename, in fact, to hold
a reference on the netns), but the kernel never gets this name. It only gets a
file descriptor (or a pid).
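
To illustrate the point: with iproute2 the name is only the last component of
a path under /var/run/netns, and what reaches the kernel is the descriptor
obtained by opening that file (for example as the argument of setns(2)). The
helpers below are a hypothetical sketch of that lookup, not iproute2 code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

#define NETNS_RUN_DIR "/var/run/netns"	/* directory used by iproute2 */

/*
 * Builds the path of the file that iproute2 binds to a named netns.
 * Returns 0 on success, -1 if the result does not fit in buf.
 */
static int netns_path(char *buf, size_t len, const char *name)
{
	int n;

	assert(buf != NULL && name != NULL);
	n = snprintf(buf, len, "%s/%s", NETNS_RUN_DIR, name);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Opens the netns file for name. The returned descriptor (or -1 on error)
 * is the only handle the kernel is given; the name itself stays in userland.
 */
static int netns_open(const char *name)
{
	char path[256];

	if (netns_path(path, sizeof(path), name) < 0)
		return -1;
	return open(path, O_RDONLY);
}
```

For the netns called "1" in the examples of this thread, netns_path() yields
/var/run/netns/1.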


Regards,
Nicolas



[PATCH] audit: fix size of netlink messages

2013-06-07 Thread Nicolas Dichtel
Netlink messages must be aligned on NLMSG_ALIGNTO (4 bytes), thus we need to
update the skb length before sending it to userspace.

This patch adds the needed padding to be compliant with this requirement.

Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com>
---
 kernel/audit.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/kernel/audit.c b/kernel/audit.c
index 21c7fa6..31d213a 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1669,6 +1669,17 @@ void audit_log_end(struct audit_buffer *ab)
 		struct nlmsghdr *nlh = nlmsg_hdr(ab->skb);
 		nlh->nlmsg_len = ab->skb->len - NLMSG_HDRLEN;
 
+		if (NLMSG_ALIGN(ab->skb->len) != ab->skb->len) {
+			unsigned int pad = NLMSG_ALIGN(ab->skb->len) -
+					   ab->skb->len;
+
+			if (skb_tailroom(ab->skb) >= pad)
+				skb_put(ab->skb, pad);
+			else if (pskb_expand_head(ab->skb, 0, pad,
+						  GFP_KERNEL) < 0)
+				audit_log_lost("out of memory in audit_log_end");
+		}
+
 		if (audit_pid) {
 			skb_queue_tail(&audit_skb_queue, ab->skb);
 			wake_up_interruptible(&kauditd_wait);
-- 
1.8.2.1



Re: [PATCH] audit: fix size of netlink messages

2013-06-07 Thread Nicolas Dichtel

Put Thomas in CC.

On 07/06/2013 at 17:43, Eric Paris wrote:

On Fri, 2013-06-07 at 17:25 +0200, Nicolas Dichtel wrote:

NAK.

I tried this once before and as I recall userspace actually expected the
stoopidity of being unaligned and broke   :-(

Which userspace tools do you have in mind?

For example, in the libnl, the function which tries to get the next netlink 
message expects this alignment:


struct nlmsghdr *nlmsg_next(struct nlmsghdr *nlh, int *remaining)
{
	int totlen = NLMSG_ALIGN(nlh->nlmsg_len);

	*remaining -= totlen;

	return (struct nlmsghdr *) ((unsigned char *) nlh + totlen);
}

http://www.infradead.org/~tgr/libnl/doc/core.html#_message_parsing_amp_construction
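
To make the arithmetic explicit, here is a small standalone sketch of the
padding a sender has to add so that a parser stepping by the aligned length
lands on the next header. The EX_ macros are local copies of the usual
NLMSG_ALIGNTO / NLMSG_ALIGN definitions, and ex_nlmsg_padlen() is a
hypothetical helper written for this example, not the kernel's nlmsg_padlen().

```c
#include <assert.h>

/* Local copies of the usual NLMSG_ALIGNTO / NLMSG_ALIGN definitions. */
#define EX_NLMSG_ALIGNTO	4U
#define EX_NLMSG_ALIGN(len) \
	(((len) + EX_NLMSG_ALIGNTO - 1) & ~(EX_NLMSG_ALIGNTO - 1))

/* Number of padding bytes needed to bring len up to the next boundary. */
static unsigned int ex_nlmsg_padlen(unsigned int len)
{
	unsigned int pad = EX_NLMSG_ALIGN(len) - len;

	assert(pad < EX_NLMSG_ALIGNTO);
	return pad;
}
```

With a 17-byte message, a parser like nlmsg_next() above advances by 20 bytes,
so the sender must provide 3 bytes of padding; a 16-byte message needs none.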




Netlink messages must be aligned on NLMSG_ALIGNTO (4 bytes), thus we need to
update the skb length before sending it to userspace.

This patch adds the needed padding to be compliant with this requirement.

Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com>
---
  kernel/audit.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/kernel/audit.c b/kernel/audit.c
index 21c7fa6..31d213a 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1669,6 +1669,17 @@ void audit_log_end(struct audit_buffer *ab)
 		struct nlmsghdr *nlh = nlmsg_hdr(ab->skb);
 		nlh->nlmsg_len = ab->skb->len - NLMSG_HDRLEN;

+		if (NLMSG_ALIGN(ab->skb->len) != ab->skb->len) {

Here, I think it's better to use nlmsg_padlen().


+			unsigned int pad = NLMSG_ALIGN(ab->skb->len) -
+					   ab->skb->len;
+
+			if (skb_tailroom(ab->skb) >= pad)
+				skb_put(ab->skb, pad);
+			else if (pskb_expand_head(ab->skb, 0, pad,
+						  GFP_KERNEL) < 0)
+				audit_log_lost("out of memory in audit_log_end");
+		}
+
 		if (audit_pid) {
 			skb_queue_tail(&audit_skb_queue, ab->skb);
 			wake_up_interruptible(&kauditd_wait);




