Re: [PATCH ghak25 v2 1/9] netfilter: normalize x_table function declarations
On 06/01/2020 at 19:54, Richard Guy Briggs wrote:
> Git context diffs were being produced with unhelpful declaration types
> in the place of function names to help identify the function in which
> changes were made.
Just for my information, how do you reproduce that? With a 'git diff'?

>
> Normalize x_table function declarations so that git context diff
> function labels work as expected.
>
[snip]
>
> --
> 1.8.3.1
git v1.8.3.1 is seven years old:
https://github.com/git/git/releases/tag/v1.8.3.1

I don't see any problems with git v2.24. I am not sure that the patch brings
any helpful value, apart from complicating backports.

Regards,
Nicolas

--
Linux-audit mailing list
Linux-audit@redhat.com
https://www.redhat.com/mailman/listinfo/linux-audit
[PATCH net-next] netfilter: create audit records for ebtables replaces
This is already done for x_tables (family AF_INET and AF_INET6), let's do it
for AF_BRIDGE also.

Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com>
---
v2: move audit chunks to do_replace_finish()

 net/bridge/netfilter/ebtables.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 6d69631b9f4d..d9a8c05d995d 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -26,6 +26,7 @@
 #include <asm/uaccess.h>
 #include <linux/smp.h>
 #include <linux/cpumask.h>
+#include <linux/audit.h>
 #include <net/sock.h>
 /* needed for logical [in,out]-dev filtering */
 #include "../br_private.h"
@@ -1058,6 +1059,20 @@ static int do_replace_finish(struct net *net, struct ebt_replace *repl,
 	vfree(table);
 	vfree(counterstmp);
+
+#ifdef CONFIG_AUDIT
+	if (audit_enabled) {
+		struct audit_buffer *ab;
+
+		ab = audit_log_start(current->audit_context, GFP_KERNEL,
+				     AUDIT_NETFILTER_CFG);
+		if (ab) {
+			audit_log_format(ab, "table=%s family=%u entries=%u",
+					 repl->name, AF_BRIDGE, repl->nentries);
+			audit_log_end(ab);
+		}
+	}
+#endif
 	return ret;
 free_unlock:
--
1.9.0
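For readers who want to see what the record body produced by the format string in this patch looks like, here is a minimal userspace sketch. It is not kernel code: `format_ebt_audit()` and `SKETCH_AF_BRIDGE` are hypothetical names introduced for illustration, and the value 7 is an assumption taken from the Linux definition of AF_BRIDGE.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: AF_BRIDGE is 7 on Linux; redefined so the sketch builds anywhere. */
#define SKETCH_AF_BRIDGE 7u

/*
 * Hypothetical helper: renders the same "table=... family=... entries=..."
 * text that the patch hands to audit_log_format(). Returns what snprintf()
 * returns, i.e. the length the full string needs.
 */
int format_ebt_audit(char *buf, size_t len, const char *table,
		     unsigned int nentries)
{
	return snprintf(buf, len, "table=%s family=%u entries=%u",
			table, SKETCH_AF_BRIDGE, nentries);
}
```

Calling `format_ebt_audit(buf, sizeof(buf), "filter", 3)` fills `buf` with `table=filter family=7 entries=3`, the same key=value layout the x_tables records use.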
[PATCH net-next] netfilter: create audit records for ebtables replaces
This is already done for x_tables (family AF_INET and AF_INET6), let's do it
for AF_BRIDGE also.

Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com>
---
 net/bridge/netfilter/ebtables.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 6d69631b9f4d..4ba0c5c78778 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -26,6 +26,7 @@
 #include <asm/uaccess.h>
 #include <linux/smp.h>
 #include <linux/cpumask.h>
+#include <linux/audit.h>
 #include <net/sock.h>
 /* needed for logical [in,out]-dev filtering */
 #include "../br_private.h"
@@ -1126,6 +1127,20 @@ static int do_replace(struct net *net, const void __user *user,
 	}
 	ret = do_replace_finish(net, &tmp, newinfo);
+#ifdef CONFIG_AUDIT
+	if (audit_enabled) {
+		struct audit_buffer *ab;
+
+		ab = audit_log_start(current->audit_context, GFP_KERNEL,
+				     AUDIT_NETFILTER_CFG);
+		if (ab) {
+			audit_log_format(ab, "table=%s family=%u entries=%u",
+					 tmp.name, AF_BRIDGE,
+					 tmp.nentries);
+			audit_log_end(ab);
+		}
+	}
+#endif
 	if (ret == 0)
 		return ret;
 free_entries:
--
1.9.0
Re: [PATCH V4 3/8] namespaces: expose ns instance serial numbers in proc
On 24/08/2014 at 19:52, Andy Lutomirski wrote:
> On Thu, Aug 21, 2014 at 6:58 PM, Richard Guy Briggs <r...@redhat.com> wrote:
>> On 14/08/21, Andy Lutomirski wrote:
>>> On Aug 20, 2014 8:12 PM, Richard Guy Briggs <r...@redhat.com> wrote:
>>>> Expose the namespace instance serial numbers in the proc filesystem at
>>>> /proc/<pid>/ns/<ns>_snum. The link text gives the serial number in hex.
>>>
>>> What's the use case?
>>>
>>> I understand the utility of giving unique numbers to the audit code,
>>> but I don't think this part is necessary for that, and I'd like to
>>> understand what else will use this before committing to a duplicative
>>> API like this.
>>
>> How does a container manager get those numbers? It could provoke a task
>> to cause an audit event that emits a NS_INFO message, or it could run a
>> task in that container to report its namespace serial numbers directly
>> from its /proc mount.
>
> Why does a container manager need them? Is there any reason that
> keeping them entirely contained within the audit system would be a
> problem?
>
>> The discussion in this thread touches on the use cases:
>> https://lkml.org/lkml/2014/4/22/662
>>
>>> Note that this API is thoroughly incompatible with CRIU. If we do
>>> this, someone will ask for a namespace number namespace, and that way
>>> lies madness.
>>
>> I had a very brief look at CRIU, but not enough to understand the issue.
>> Others have hinted at this problem. Do you have a suggestion of a
>> different approach that would be compatible with CRIU?
>>
>> I'd originally considered some sort of UUID that would be globally
>> unique, but that would be very hard to devise or guarantee, and besides,
>> namespaces aren't only used by containers and could be shared in other
>> ways. Tracking the usage and migration of namespaces should be the task
>> of an upper layer.
>
> CRIU wants to save the complete state of a namespace and then restore
> it. For that to work, any information exposed to things in the
> namespace *cannot* be globally unique or unique per boot, since CRIU
> needs to arrange for that information to match whatever it was when
> CRIU saved it.
How are the ifindexes of network devices managed? These ifindexes are unique
per boot, and thus can change depending on the order in which netdevs are
created. These ifindexes are unique per boot and exposed to userspace ...

>
> Also, I think that code running in a namespace has no business even
> knowing a unique identity of that namespace from the perspective of
> the host.
Another scenario is when you have virtual network devices across two netns.
You need to identify the peer netns to have a netlink message which is
fully interpretable by userspace.

Regards,
Nicolas
Re: [PATCH V4 3/8] namespaces: expose ns instance serial numbers in proc
On 25/08/2014 at 16:04, Andy Lutomirski wrote:
> On Aug 25, 2014 6:30 AM, Nicolas Dichtel <nicolas.dich...@6wind.com> wrote:
>>> CRIU wants to save the complete state of a namespace and then restore
>>> it. For that to work, any information exposed to things in the
>>> namespace *cannot* be globally unique or unique per boot, since CRIU
>>> needs to arrange for that information to match whatever it was when
>>> CRIU saved it.
>> How are ifindex of network devices managed? These ifindexes are unique
>> per boot, thus can change depending on the order in which netdev are
>> created.
>> These ifindexes are unique per boot and exposed to userspace ...
>
> This does not appear to be true.
>
> $ sudo unshare --net
> # ip link add veth0 type veth peer name veth1
> # ip link
> 1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> 2: veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>     link/ether 06:0d:59:c7:a6:a8 brd ff:ff:ff:ff:ff:ff
> 3: veth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>     link/ether b2:5c:8b:f2:12:28 brd ff:ff:ff:ff:ff:ff
> # logout
> $ ip link
> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> 3: em1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
>
> I've probably misunderstood what you're trying to say.
ifindexes are unique per boot and per netns.
These ifindexes depend on the interface creation order:

$ ip netns add 1
$ ip link set eth1 netns 1
$ ip netns exec 1 ip link add veth0 type veth peer name veth1
$ ip netns exec 1 ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 9a:a0:89:99:a0:3c brd ff:ff:ff:ff:ff:ff
3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff
4: veth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 96:86:44:49:ce:a8 brd ff:ff:ff:ff:ff:ff
$ ip netns del 1
$ ip netns add 1
$ ip netns exec 1 ip link add veth0 type veth peer name veth1
$ ip link set eth1 netns 1
$ ip netns exec 1 ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 86:92:90:01:32:6b brd ff:ff:ff:ff:ff:ff
3: veth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether ae:8b:d2:71:48:a2 brd ff:ff:ff:ff:ff:ff
4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:12:34:57 brd ff:ff:ff:ff:ff:ff

Note: when an interface is moved to another netns, the ifindex is kept if
possible, else another ifindex is chosen.

I will dig a bit to understand how CRIU saves this netns information.

>>> Also, I think that code running in a namespace has no business even
>>> knowing a unique identity of that namespace from the perspective of
>>> the host.
>> Another scenario is when you have virtual network devices across two
>> netns. You need to identify the peer netns to have a netlink message
>> which is fully interpretable by the userspace.
>
> Let me try again, with emphasis in the right place.
>
> I think that *code running in a namespace* has no business even
> knowing a unique identity of *that namespace* from the perspective of
> the host.
>
> In your example, if there's a veth device between netns A and netns B,
> then code *in netns A* has no business knowing the identity of its
> veth peer if its peer (B) is a sibling or ancestor. It also IMO has
> no business knowing the identity of its own netns (A) other than as
> "my netns".
I do not agree (see the example below).

>
> If A and B are siblings, then their parent needs to know where that
> veth device goes, but I think this is already the case to a sufficient
> extent today.
I'm not aware of a hierarchy between netns. A daemon should be able to get the
full network configuration, even if it's started when this configuration is
already applied, i.e. even if it doesn't know what happened before it started.

>
> I feel like this discussion is falling into a common trap of new API
> discussions. Can one of you who wants this API please articulate,
> with a reasonably precise example, what it is that you want to do, why
> you can't easily do it already, and how this API helps? I currently
> understand how the API creates problems, but I don't understand how it
> solves any problems, and I will NAK it (and I suspect that Eric will,
> too, which is pretty much fatal) unless that changes.
What I'm trying to solve is to have
Re: [PATCH V4 3/8] namespaces: expose ns instance serial numbers in proc
On 25/08/2014 at 18:13, Andy Lutomirski wrote:
> On Mon, Aug 25, 2014 at 8:43 AM, Nicolas Dichtel <nicolas.dich...@6wind.com> wrote:
>> On 25/08/2014 at 16:04, Andy Lutomirski wrote:
>>> On Aug 25, 2014 6:30 AM, Nicolas Dichtel <nicolas.dich...@6wind.com> wrote:
>>>>> CRIU wants to save the complete state of a namespace and then restore
>>>>> it. For that to work, any information exposed to things in the
>>>>> namespace *cannot* be globally unique or unique per boot, since CRIU
>>>>> needs to arrange for that information to match whatever it was when
>>>>> CRIU saved it.
>>>> How are ifindex of network devices managed? These ifindexes are unique
>>>> per boot, thus can change depending on the order in which netdev are
>>>> created.
>>>> These ifindexes are unique per boot and exposed to userspace ...
>>>
>>> This does not appear to be true.
>>>
>>> $ sudo unshare --net
>>> # ip link add veth0 type veth peer name veth1
>>> # ip link
>>> 1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default
>>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>> 2: veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>>>     link/ether 06:0d:59:c7:a6:a8 brd ff:ff:ff:ff:ff:ff
>>> 3: veth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>>>     link/ether b2:5c:8b:f2:12:28 brd ff:ff:ff:ff:ff:ff
>>> # logout
>>> $ ip link
>>> 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
>>>     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
>>> 3: em1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast state DOWN qlen 1000
>>>
>>> I've probably misunderstood what you're trying to say.
>> ifindexes are unique per boot and per netns.
>
> I think we both misunderstood each other. The ifindexes are unique
> *per netns*, which means that, if you're unprivileged in a netns,
> global information doesn't leak to you. I think this is good.
Ok, I agree. I think audit daemons are always running under privileged users.

>>> Let me try again, with emphasis in the right place.
>>>
>>> I think that *code running in a namespace* has no business even
>>> knowing a unique identity of *that namespace* from the perspective of
>>> the host.
>>>
>>> In your example, if there's a veth device between netns A and netns B,
>>> then code *in netns A* has no business knowing the identity of its
>>> veth peer if its peer (B) is a sibling or ancestor. It also IMO has
>>> no business knowing the identity of its own netns (A) other than as
>>> "my netns".
>> I do not agree (see the example below).
>>
>>> If A and B are siblings, then their parent needs to know where that
>>> veth device goes, but I think this is already the case to a sufficient
>>> extent today.
>> I'm not aware of a hierarchy between netns. A daemon should be able to
>> get the full network configuration, even if it's started when this
>> configuration is already applied, i.e. even if it doesn't know what
>> happened before it started.
>
> I don't know exactly which namespaces have an explicit hierarchy, but
> there is certainly a hierarchy of *user* namespaces, and network
> namespaces live in user namespaces, so they at least have somewhat of
> a hierarchy.
>
>>> I feel like this discussion is falling into a common trap of new API
>>> discussions. Can one of you who wants this API please articulate,
>>> with a reasonably precise example, what it is that you want to do, why
>>> you can't easily do it already, and how this API helps? I currently
>>> understand how the API creates problems, but I don't understand how it
>>> solves any problems, and I will NAK it (and I suspect that Eric will,
>>> too, which is pretty much fatal) unless that changes.
>> What I'm trying to solve is to have full info in netlink messages sent
>> by the kernel, thus being able to identify a peer netns (and this is
>> close to what the audit guys are trying to have).
>> Theoretically, messages sent by the kernel can be reused as is to have
>> the same configuration. This is not the case with x-netns devices.
>>
>> Here is an example, with ip tunnels:
>>
>> $ ip netns add 1
>> $ ip link add ipip1 type ipip remote 10.16.0.121 local 10.16.0.249 dev eth0
>> $ ip -d link ls ipip1
>> 8: ipip1@eth0: <POINTOPOINT,NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
>>     link/ipip 10.16.0.249 peer 10.16.0.121 promiscuity 0
>>     ipip remote 10.16.0.121 local 10.16.0.249 dev eth0 ttl inherit pmtudisc
>> $ ip link set ipip1 netns 1
>> $ ip netns exec 1 ip -d link ls ipip1
>> 8: ipip1@tunl0: <POINTOPOINT,NOARP,M-DOWN> mtu 1480 qdisc noop state DOWN mode DEFAULT group default
>>     link/ipip 10.16.0.249 peer 10.16.0.121 promiscuity 0
>>     ipip remote 10.16.0.121 local 10.16.0.249 dev tunl0 ttl inherit pmtudisc
>>
>> Now the information obtained with 'ip link' is wrong and incomplete:
>>  - the link dev is now tunl0 instead of eth0, because we only got an
>>    ifindex from the kernel without any netns information.
>>  - the encapsulation addresses are not part of this netns but the user
>>    doesn't know that (still because netns info is missing). These IPv4
>>    addresses may exist in this netns.
>>  - it's not possible to create the same netdevice
Re: [PATCH V3 0/6] namespaces: log namespaces per task
On 20/08/2014 at 18:25, Richard Guy Briggs wrote:
> On 14/08/19, Eric W. Biederman wrote:
>> Richard Guy Briggs <r...@redhat.com> writes:
>>> On 14/05/20, Richard Guy Briggs wrote:
>>>> On 14/05/20, Eric Paris wrote:
>>>>> On Tue, 2014-05-20 at 09:12 -0400, Richard Guy Briggs wrote:
>>>>>> The purpose is to track namespaces in use by logged processes from the
>>>>>> perspective of init_*_ns.
>>>
>>> (Including the Linux API list due to the additions to /proc/<pid>/ns/.
>>> Please see http://www.kernelhub.org/?p=2&msg=477668 and in particular
>>> http://www.kernelhub.org/?msg=477678&p=2 )
>>
>> Sigh if you have to use something like this use the proc inode number.
>> It is the same thing. I hate to claim it is unique absent of the proc
>> superblock but it is and will be for the foreseeable future.
>>
>> It would be better to include the block device number that appears in
>> proc of 3h of the primary mount of to qualify the number. But it is not
>> particularly important.
>>
>> Coming up with an additional unique number that needs to be maintained
>> seems strongly silly.
>
> I am reading a contradiction here:
> https://www.redhat.com/archives/linux-audit/2013-March/msg00032.html
>
> and this posting went completely ignored:
> https://www.redhat.com/archives/linux-audit/2014-January/msg00180.html
>
> And then there was this patchset and thread where there was some good
> discussion to clarify the use case:
> https://lkml.org/lkml/2014/4/22/662
> Then V2:
> https://lkml.org/lkml/2014/5/9/637
> Then V3 3 months ago:
> https://www.redhat.com/archives/linux-audit/2014-May/msg00071.html
>
> I'm about to post another version of the patchset addressing Eric Paris'
> concerns about record types, field naming...
I am also trying to find a solution to identify netns in userland, to solve
some network problems (see
http://thread.gmane.org/gmane.linux.network/315933/focus=321753).
This serial number solution may be reused for this. We really need to find a
way to solve this.

Regards,
Nicolas
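As background for the "use the proc inode number" suggestion quoted above: on current kernels, readlink() on /proc/<pid>/ns/net returns link text of the form `net:[4026531956]`, where the bracketed number is the inode. The sketch below only parses such a string; `parse_ns_link()` is a hypothetical helper, not part of any patch in this thread, and the inode value in the usage note is purely illustrative.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: split "type:[inode]" link text, as returned by
 * readlink() on /proc/<pid>/ns/<type>, into its two parts.
 * Returns 0 on success, -1 if the text does not have that shape.
 */
int parse_ns_link(const char *text, char *type, size_t type_len,
		  unsigned long long *inode)
{
	const char *colon = strchr(text, ':');
	size_t n;
	char close;

	if (!colon || colon == text)
		return -1;
	n = (size_t)(colon - text);
	if (n >= type_len)
		return -1;
	/* Expect ":[<digits>]" right after the namespace type. */
	if (sscanf(colon, ":[%llu%c", inode, &close) != 2 || close != ']')
		return -1;
	memcpy(type, text, n);
	type[n] = '\0';
	return 0;
}
```

Given `"net:[4026531956]"`, the helper stores `net` in `type` and 4026531956 in `*inode`. As the thread points out, that number is only meaningful together with the proc superblock it came from, which is part of what the serial-number proposal tries to avoid.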
Re: [PATCH 0/2] namespaces: log namespaces per task
On 06/05/2014 at 23:15, Richard Guy Briggs wrote:
> On 14/05/05, Nicolas Dichtel wrote:
>> On 02/05/2014 at 16:28, Richard Guy Briggs wrote:
>>> On 14/05/02, Serge E. Hallyn wrote:
>>>> Quoting Richard Guy Briggs (r...@redhat.com):
>>>>> I saw no replies to my questions when I replied a year after Aris'
>>>>> posting, so I don't know if it was ignored or got lost in stale
>>>>> threads:
>>>>> https://www.redhat.com/archives/linux-audit/2013-March/msg00020.html
>>>>> https://www.redhat.com/archives/linux-audit/2013-March/msg00033.html
>>>>> (https://lists.linux-foundation.org/pipermail/containers/2013-March/032063.html)
>>>>> https://www.redhat.com/archives/linux-audit/2014-January/msg00180.html
>>>>>
>>>>> I've tried to answer a number of questions that were raised in that
>>>>> thread.
>>>>>
>>>>> The goal is not quite identical to Aris' patchset.
>>>>>
>>>>> The purpose is to track namespaces in use by logged processes from
>>>>> the perspective of init_*_ns. The first patch defines a function to
>>>>> list them. The second patch provides an example of usage for
>>>>> audit_log_task_info() which is used by syscall audits, among others.
>>>>> audit_log_task() and audit_common_recv_message() would be other
>>>>> potential use cases.
>>>>>
>>>>> Use a serial number per namespace (unique across one boot of one
>>>>> kernel) instead of the inode number (which is claimed to have had
>>>>> the right to change reserved and is not necessarily unique if there
>>>>> is more than one proc fs). It could be argued that the inode numbers
>>>>> have now become a defacto interface and can't change now, but I'm
>>>>> proposing this approach to see if this helps address some of the
>>>>> objections to the earlier patchset.
>>>>>
>>>>> There could also have messages added to track the creation and the
>>>>> destruction of namespaces, listing the parent for hierarchical
>>>>> namespaces such as pidns, userns, and listing other ids for
>>>>> non-hierarchical namespaces, as well as other information to help
>>>>> identify a namespace.
>>>>>
>>>>> There has been some progress made for audit in net namespaces and
>>>>> pid namespaces since this previous thread. net namespaces are now
>>>>> served as peers by one auditd in the init_net namespace with
>>>>> processes in a non-init_net namespace being able to write records if
>>>>> they are in the init_user_ns and have CAP_AUDIT_WRITE. Processes in
>>>>> a non-init_pid_ns can now similarly write records. As for
>>>>> CAP_AUDIT_READ, I just posted a patchset to check capabilities of
>>>>> userspace processes that try to join netlink broadcast groups.
>>>>>
>>>>> Questions:
>>>>> Is there a way to link serial numbers of namespaces involved in
>>>>> migration of a container to another kernel? (I had a brief look at
>>>>> CRIU.) Is there a unique identifier for each running instance of a
>>>>> kernel? Or at least some identifier within the container migration
>>>>> realm?
>>>>
>>>> Eric Biederman has always been adamantly opposed to adding new
>>>> namespaces of namespaces, so the fact that you're asking this question
>>>> concerns me.
>>>
>>> I have seen that position and I don't fully understand the
>>> justification for it other than added complexity.
>> Just FYI, have you seen this thread:
>> http://thread.gmane.org/gmane.linux.network/286572/
>>
>> There are some explanations/examples about this topic.
>
> Thanks for that reference. I read it through, but will need to do so
> again to get it to sink in.
I think audit has the same problem as x-netns netdevices: being able to
identify a peer netns when a userland app reads a message from the kernel.

The main problem with file descriptors is that you cannot use them when you
broadcast a message from the kernel to userland.

Maybe we can use the local names concept (like file descriptors but without
their constraints), i.e. having an identifier of a peer (net)ns which is only
valid in the current (net)ns. When the kernel needs to identify a peer (net)ns,
it uses this identifier (or allocates it the first time). After that, the
userland apps may reuse this identifier to configure things in the peer (net)ns.

Eric, any thoughts about this?

Regards,
Nicolas
Re: [PATCH 0/2] namespaces: log namespaces per task
On 06/05/2014 at 01:23, James Bottomley wrote:
> On May 5, 2014 3:36:38 PM PDT, Serge Hallyn <serge.hal...@ubuntu.com> wrote:
>> Quoting James Bottomley (james.bottom...@hansenpartnership.com):
>>> On Mon, 2014-05-05 at 22:27 +, Serge Hallyn wrote:
>>>> Quoting James Bottomley (james.bottom...@hansenpartnership.com):
>>>>> On Mon, 2014-05-05 at 17:48 -0400, Richard Guy Briggs wrote:
>>>>>> On 14/05/05, Serge E. Hallyn wrote:
>>>>>>> Quoting James Bottomley (james.bottom...@hansenpartnership.com):
>>>>>>>> On Tue, 2014-04-22 at 14:12 -0400, Richard Guy Briggs wrote:
>>>>>>>>> Questions:
>>>>>>>>> Is there a way to link serial numbers of namespaces involved in
>>>>>>>>> migration of a container to another kernel? (I had a brief look
>>>>>>>>> at CRIU.) Is there a unique identifier for each running instance
>>>>>>>>> of a kernel? Or at least some identifier within the container
>>>>>>>>> migration realm?
>>>>>>>>
>>>>>>>> Are you asking for a way of distinguishing a migrated container
>>>>>>>> from an unmigrated one? The answer is pretty much no because the
>>>>>>>> job of migration is to restore to the same state as much as
>>>>>>>> possible.
>>>>>>>>
>>>>>>>> Reading between the lines, I think your goal is to correlate audit
>>>>>>>> information across a container migration, right? Ideally the
>>>>>>>> management system should be able to cough up an audit trail for a
>>>>>>>> container wherever it's running and however many times it's been
>>>>>>>> migrated?
>>>>>>>>
>>>>>>>> In that case, I think your idea of a numeric serial number in a
>>>>>>>> dense range is wrong. Because the range is dense you're obviously
>>>>>>>> never going to be able to use the same serial number across a
>>>>>>>> migration. However,
>>>>>>>
>>>>>>> Ah, but I was being silly before, we can actually address this
>>>>>>> pretty simply. If we just (for instance) add
>>>>>>> /proc/self/ns/{ic,mnt,net,pid,user,uts}_seq containing the serial
>>>>>>> number for the relevant ns for the task, then criu can dump this
>>>>>>> info at checkpoint. Then at restart it can dump an audit message
>>>>>>> per task and ns saying old_serial=%x,new_serial=%x. That way the
>>>>>>> audit log reader can if it cares keep track.
>>>>>>
>>>>>> This is the sort of idea I had in mind...
>>>>>
>>>>> OK, but I don't understand then why you need a serial number. There
>>>>> are plenty of things we preserve across a migration, like namespace
>>>>> name for instance. Could you explain what function it performs
>>>>> because I think I might be missing something.
>>>>
>>>> We're looking ahead to a time when audit is namespaced, and a
>>>> container can keep its own audit logs (without limiting what the host
>>>> audits of course). So if a container is auditing suspicious activity
>>>> by some task in a sub-namespace, then the whole parent container gets
>>>> migrated, after migration we want to continue being able to correlate
>>>> the namespaces.
>>>>
>>>> We're also looking at audit trails on a host that is up for years. We
>>>> would like every namespace to be uniquely logged there. That is why
>>>> inode #s on /proc/self/ns/* are not sufficient, unless we add a
>>>> generation # (which would end more complicated, not less, than a
>>>> serial #).
>>>
>>> Right, but when the container has an audit namespace, that namespace
>>> has a name,
>>
>> What ns has a name?
>
> The netns for instance.
netns do not have names. iproute2 uses names (a filename in fact, to hold a
reference on the netns), but the kernel never gets this name. It only gets a
file descriptor (or a pid).

Regards,
Nicolas
[PATCH] audit: fix size of netlink messages
Netlink messages must be aligned on NLMSG_ALIGNTO (4 bytes), thus we need to
update the skb length before sending it to userspace.
This patch adds the needed padding to be compliant with this requirement.

Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com>
---
 kernel/audit.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/kernel/audit.c b/kernel/audit.c
index 21c7fa6..31d213a 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1669,6 +1669,17 @@ void audit_log_end(struct audit_buffer *ab)
 		struct nlmsghdr *nlh = nlmsg_hdr(ab->skb);
 		nlh->nlmsg_len = ab->skb->len - NLMSG_HDRLEN;
+		if (NLMSG_ALIGN(ab->skb->len) != ab->skb->len) {
+			unsigned int pad = NLMSG_ALIGN(ab->skb->len) -
+					   ab->skb->len;
+
+			if (skb_tailroom(ab->skb) >= pad)
+				skb_put(ab->skb, pad);
+			else if (pskb_expand_head(ab->skb, 0, pad,
+						  GFP_KERNEL) < 0)
+				audit_log_lost("out of memory in audit_log_end");
+		}
+
 		if (audit_pid) {
 			skb_queue_tail(&audit_skb_queue, ab->skb);
 			wake_up_interruptible(&kauditd_wait);
--
1.8.2.1
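The padding arithmetic in this patch can be checked in isolation. The sketch below is plain userspace C, not the kernel code: the `SKETCH_` macros are assumed to mirror NLMSG_ALIGNTO and NLMSG_ALIGN from <linux/netlink.h> (round up to a multiple of 4), and `sketch_padlen()` is a hypothetical helper name.

```c
#include <assert.h>

/* Assumed to mirror NLMSG_ALIGNTO / NLMSG_ALIGN; renamed so this builds anywhere. */
#define SKETCH_ALIGNTO 4u
#define SKETCH_ALIGN(len) (((len) + SKETCH_ALIGNTO - 1) & ~(SKETCH_ALIGNTO - 1))

/*
 * Hypothetical helper: number of padding bytes needed to bring a message
 * of `len` bytes up to the next 4-byte boundary (0 if already aligned).
 */
unsigned int sketch_padlen(unsigned int len)
{
	return SKETCH_ALIGN(len) - len;
}
```

A 16-byte header followed by a 5-byte payload gives a length of 21, so three bytes of padding are needed to reach 24; a length that is already a multiple of 4 needs none, which is the case the patch skips with its `!=` test.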
Re: [PATCH] audit: fix size of netlink messages
Put Thomas in CC.

On 07/06/2013 at 17:43, Eric Paris wrote:
> On Fri, 2013-06-07 at 17:25 +0200, Nicolas Dichtel wrote:
>
> NAK. I tried this once before and as I recall userspace actually
> expected the stoopidity of being unaligned and broke :-(
Which userspace tools are you thinking of?

For example, in libnl, the function which tries to get the next netlink
message expects this alignment:

struct nlmsghdr *nlmsg_next(struct nlmsghdr *nlh, int *remaining)
{
	int totlen = NLMSG_ALIGN(nlh->nlmsg_len);

	*remaining -= totlen;

	return (struct nlmsghdr *) ((unsigned char *) nlh + totlen);
}

http://www.infradead.org/~tgr/libnl/doc/core.html#_message_parsing_amp_construction

>> Netlink messages must be aligned on NLMSG_ALIGNTO (4 bytes), thus we need
>> to update the skb length before sending it to userspace.
>> This patch adds the needed padding to be compliant with this requirement.
>>
>> Signed-off-by: Nicolas Dichtel <nicolas.dich...@6wind.com>
>> ---
>>  kernel/audit.c | 11 +++
>>  1 file changed, 11 insertions(+)
>>
>> diff --git a/kernel/audit.c b/kernel/audit.c
>> index 21c7fa6..31d213a 100644
>> --- a/kernel/audit.c
>> +++ b/kernel/audit.c
>> @@ -1669,6 +1669,17 @@ void audit_log_end(struct audit_buffer *ab)
>>  		struct nlmsghdr *nlh = nlmsg_hdr(ab->skb);
>>  		nlh->nlmsg_len = ab->skb->len - NLMSG_HDRLEN;
>> +		if (NLMSG_ALIGN(ab->skb->len) != ab->skb->len) {
Here, I think it's better to use nlmsg_padlen().

>> +			unsigned int pad = NLMSG_ALIGN(ab->skb->len) -
>> +					   ab->skb->len;
>> +
>> +			if (skb_tailroom(ab->skb) >= pad)
>> +				skb_put(ab->skb, pad);
>> +			else if (pskb_expand_head(ab->skb, 0, pad,
>> +						  GFP_KERNEL) < 0)
>> +				audit_log_lost("out of memory in audit_log_end");
>> +		}
>> +
>>  		if (audit_pid) {
>>  			skb_queue_tail(&audit_skb_queue, ab->skb);
>>  			wake_up_interruptible(&kauditd_wait);
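To illustrate why a parser that steps by the aligned length, as the nlmsg_next() quoted above does, depends on the sender padding each message, here is a small self-contained sketch. It is not libnl code and does not model how audit actually delivers its skbs: `struct sketch_nlmsghdr`, `sketch_put()` and `sketch_walk()` are hypothetical names, with the header assumed to have the same 16-byte layout as struct nlmsghdr.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match the 16-byte layout of struct nlmsghdr. */
struct sketch_nlmsghdr {
	uint32_t len;	/* header + payload, padding not included */
	uint16_t type;
	uint16_t flags;
	uint32_t seq;
	uint32_t pid;
};

/* Round up to a multiple of 4, like NLMSG_ALIGN. */
size_t sketch_align(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Write one message at `off`; pad the end offset only if `pad` is set. */
size_t sketch_put(unsigned char *buf, size_t off, uint16_t type,
		  size_t payload, int pad)
{
	struct sketch_nlmsghdr h;

	memset(&h, 0, sizeof(h));
	h.len = (uint32_t)(sizeof(h) + payload);
	h.type = type;
	memcpy(buf + off, &h, sizeof(h));
	memset(buf + off + sizeof(h), 0xAA, payload);
	off += h.len;
	return pad ? sketch_align(off) : off;
}

/* Walk the buffer stepping by the aligned length; record each type seen. */
int sketch_walk(const unsigned char *buf, size_t total, uint16_t *types,
		int max)
{
	size_t off = 0;
	int n = 0;

	while (n < max && total - off >= sizeof(struct sketch_nlmsghdr)) {
		struct sketch_nlmsghdr h;
		size_t step;

		memcpy(&h, buf + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > total - off)
			break;	/* header does not describe a valid message */
		types[n++] = h.type;
		step = sketch_align(h.len);
		if (step > total - off)
			break;	/* last message, no room for padding */
		off += step;
	}
	return n;
}
```

With two messages of payload 5 and 4, the padded layout places the second header at offset 24 and the walker finds both. Without padding the second header sits at offset 21, the walker still looks at offset 24, reads a bogus length, and stops after the first message.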