Re: [PATCH net-next v2 04/16] l2tp: refactor tunnel lifetime handling wrt its socket
Hi James, I love your patch! Yet something to improve: [auto build test ERROR on net-next/master] url: https://github.com/0day-ci/linux/commits/James-Chapman/l2tp-fix-API-races-discovered-by-syzbot/20180215-060031 config: x86_64-rhel (attached as .config) compiler: gcc-7 (Debian 7.3.0-1) 7.3.0 reproduce: # save the attached .config to linux build tree make ARCH=x86_64 All errors (new ones prefixed by >>): >> ERROR: "l2tp_tunnel_free" [net/l2tp/l2tp_ppp.ko] undefined! >> ERROR: "l2tp_tunnel_free" [net/l2tp/l2tp_netlink.ko] undefined! --- 0-DAY kernel test infrastructureOpen Source Technology Center https://lists.01.org/pipermail/kbuild-all Intel Corporation .config.gz Description: application/gzip
pull request: bluetooth-next 2018-02-15
Hi Dave, Here's the first bluetooth-next pull request targetting the 4.17 kernel release. - Fixes & cleanups to Atheros and Marvell drivers - Support for two new Realtek controllers - Support for new Intel Bluetooth controller - Fix for supporting multiple slave-role Bluetooth LE connections Please let me know if there are any issues pulling. Thanks. Johan --- The following changes since commit 617aebe6a97efa539cc4b8a52adccd89596e6be0: Merge tag 'usercopy-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux (2018-02-03 16:25:42 -0800) are available in the Git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git for-upstream for you to fetch changes up to 907f84990924bf3a8d248c040dabeb5127ae6938: Bluetooth: btrtl: Add RTL8723D and RTL8821C devices (2018-02-11 21:50:35 +0100) Alex Lu (1): Bluetooth: btrtl: Add RTL8723D and RTL8821C devices Jia-Ju Bai (3): Bluetooth: hci_ath: Replace mdelay with msleep in ath_wakeup_ar3k Bluetooth: btmrvl_main: Replace GFP_ATOMIC with GFP_KERNEL in btmrvl_send_sync_cmd Bluetooth: hci_ll: Replace mdelay with msleep in download_firmware Larry Finger (1): Bluetooth: btusb: Add device ID for RTL8822BE Maxim Zhukov (5): Bluetooth: ath3k: replace hardcode numbers with define Bluetooth: ath3k: do not init variables Bluetooth: ath3k: remove blank line after if Bluetooth: ath3k: Fix warning: quoted string split across lines Bluetooth: ath3k: fix checkpatch warning Tedd Ho-Jeong An (1): Bluetooth: btusb: Add support for Intel Bluetooth device 22560 [8087:0026] Ćukasz Rymanowski (1): Bluetooth: Fix incorrect bits for LE states drivers/bluetooth/ath3k.c | 28 ++ drivers/bluetooth/btmrvl_main.c | 2 +- drivers/bluetooth/btrtl.c | 119 drivers/bluetooth/btusb.c | 10 drivers/bluetooth/hci_ath.c | 4 +- drivers/bluetooth/hci_ll.c | 2 +- net/bluetooth/hci_request.c | 6 +- 7 files changed, 118 insertions(+), 53 deletions(-) signature.asc Description: PGP signature
Re: [PATCH net-next 0/3] eBPF Seccomp filters
On Thu, Feb 15, 2018 at 1:30 PM, Alexei Starovoitov wrote: > Specifically for android we added bpf_lsm hooks, cookie/uid helpers, > and read-only maps. > Lorenzo, > there was a claim in this thread that bpf is disabled on android. > Can you please clarify ? It's not compiled out, at least at the moment. https://android.googlesource.com/kernel/configs/+/master/android-4.9/android-base.cfg has CONFIG_BPF_SYSCALL=y. As with many things on Android, use of EBPF is (heavily) restricted via selinux, and I'm not aware of any plans to allow unprivileged applications to use EBPF, or even or any usecases other than network accounting. Even for this use case, we're looking at having the program being completely read-only and baked into the system image. I definitely don't have a complete view of things though. Also, bear in mind that none of this code has been released yet, so things could change.
Re: Serious performance degradation in Linux 4.15
On Wed, Feb 14, 2018 at 10:46:20PM +, Matt Fleming wrote: > Here's some more numbers. This is with RETPOLINE=y but you'll see it > doesn't make much of a difference. Oh, this is also with powersave > cpufreq governor. Hurmph, I'll go have a look when I can boot tip/master again :/ But didn't you bench those patches before we merged them? I can't remember you reporting this..
[PATCH net v2] fib_semantics: Don't match route with mismatching tclassid
In fib_nh_match(), if output interface or gateway are passed in the FIB configuration, we don't have to check next hops of multipath routes to conclude whether we have a match or not. However, we might still have routes with different realms matching the same output interface and gateway configuration, and this needs to cause the match to fail. Otherwise the first route inserted in the FIB will match, regardless of the realms: # ip route add 1.1.1.1 dev eth0 table 1234 realms 1/2 # ip route append 1.1.1.1 dev eth0 table 1234 realms 3/4 # ip route list table 1234 1.1.1.1 dev eth0 scope link realms 1/2 1.1.1.1 dev eth0 scope link realms 3/4 # ip route del 1.1.1.1 dev ens3 table 1234 realms 3/4 # ip route list table 1234 1.1.1.1 dev ens3 scope link realms 3/4 whereas route with realms 3/4 should have been deleted instead. Explicitly check for fc_flow passed in the FIB configuration (this comes from RTA_FLOW extracted by rtm_to_fib_config()) and fail matching if it differs from nh_tclassid. The handling of RTA_FLOW for multipath routes later in fib_nh_match() is still needed, as we can have multiple RTA_FLOW attributes that need to be matched against the tclassid of each next hop. v2: Check that fc_flow is set before discarding the match, so that the user can still select the first matching rule by not specifying any realm, as suggested by David Ahern. Reported-by: Jianlin Shi Signed-off-by: Stefano Brivio --- net/ipv4/fib_semantics.c | 5 + 1 file changed, 5 insertions(+) diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index c586597da20d..7d36a950d961 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -646,6 +646,11 @@ int fib_nh_match(struct fib_config *cfg, struct fib_info *fi, fi->fib_nh, cfg, extack)) return 1; } +#ifdef CONFIG_IP_ROUTE_CLASSID + if (cfg->fc_flow && + cfg->fc_flow != fi->fib_nh->nh_tclassid) + return 1; +#endif if ((!cfg->fc_oif || cfg->fc_oif == fi->fib_nh->nh_oif) && (!cfg->fc_gw || cfg->fc_gw == fi->fib_nh->nh_gw)) return 0; -- 2.15.1
RE: [net-next 1/1] tipc: avoid unnecessary copying of bundled messages
> -Original Message- > From: netdev-ow...@vger.kernel.org [mailto:netdev- > ow...@vger.kernel.org] On Behalf Of David Miller > Sent: Wednesday, February 14, 2018 21:27 > To: Jon Maloy > Cc: netdev@vger.kernel.org; Mohan Krishna Ghanta Krishnamurthy > ; Tung Quang Nguyen > ; Hoang Huu Le > ; Canh Duc Luu > ; Ying Xue ; tipc- > discuss...@lists.sourceforge.net > Subject: Re: [net-next 1/1] tipc: avoid unnecessary copying of bundled > messages > > From: Jon Maloy > Date: Wed, 14 Feb 2018 13:50:31 +0100 > > > diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 4e1c6f6..a368fa8 > > 100644 > > --- a/net/tipc/msg.c > > +++ b/net/tipc/msg.c > > @@ -434,6 +434,9 @@ bool tipc_msg_extract(struct sk_buff *skb, struct > sk_buff **iskb, int *pos) > > skb_pull(*iskb, offset); > > imsz = msg_size(buf_msg(*iskb)); > > skb_trim(*iskb, imsz); > > + > > + /* Scale extracted buffer's truesize to avoid double accounting */ > > + (*iskb)->truesize = SKB_TRUESIZE(imsz); > > if (unlikely(!tipc_msg_validate(iskb))) > > goto none; > > *pos += align(imsz); > > As Eric said, you have to be really careful here. > > If you clone a 10K SKB 10 times, you really have to account for the full > truesize 10 times. > > That is unless you explicitly trim off frags in the new clone, then adjust the > truesize by explicitly decreasing it by the amount of memory backing the > frags you trimmed off completely (not partially). The buffers we are cloning are linearized 1 MTU incoming buffers. There are no fragments. Each clone normally points to only a tiny fraction of the data area of the base buffer. I don't claim that copying always is bad, but in this case it happens in the majority of cases, and as I see it completely unnecessarily. There is actually some under accounting, however, since we now only count the data area of the base buffer (== the sum of the data area of the clones) plus the overhead of the clones. A more accurate calculation, taking into account even the overhead of the base buffer, would look like this: (*iskb)->truesize = SKB_TRUSIZE(imsz) + (skb->truesize - skb->len) / msg_msgcnt(msg); I.e., we calculate the overhead of the base buffer and divide it equally among the clones. Now I really can't see we are missing anything. BR ///jon > > Finally, you can only do this on an SKB that has never entered a socket SKB > queue, otherwise you corrupt memory accounting.
Re: [PATCH bpf 0/2] tools: bpftool: minor fixes for JSON in batch mode
On 02/15/2018 07:42 AM, Jakub Kicinski wrote: > Quentin says: > > These are two minor fixes to avoid breaking JSON output in batch mode. The > first one makes bpftool output a "null" JSON object, as expected in batch > mode if nothing else is to be printed, when dumping program instructions > into an output file. The second one replaces a call to "perror()" with > something that does not break JSON when parsing input file for batch mode. Applied to bpf tree, thanks Quentin!
Re: [PATCH bpf-next 0/4] Misc test usability improvements & cleanup
On 02/14/2018 10:50 PM, Joe Stringer wrote: > This is series makes some minor changes primarily focused on making it easier > to understand why test_verifier is failing a test. This includes printing the > observed output when a test fails in a different way than expected, or when > unprivileged tests fail due to sysctl kernel.unprivileged_bpf_disabled=1. The > last patch removes some apparently dead code. Looks good, applied to bpf-next, thanks Joe!
[net-next 00/10] tipc: de-generealize topology server
The topology server is partially based on a template that is much more generic than what we need. This results in a code that is unnecessarily hard to follow and keeping bug free. We now take the consequence of the fact that we only have one such server in TIPC, - with no prospects for introducing any more, and adapt the code to the specialized task is really is doing. Jon Maloy (10): tipc: remove redundant code in topology server tipc: remove unnecessary function pointers tipc: eliminate struct tipc_subscriber tipc: simplify interaction between subscription and topology connection tipc: simplify endianness handling in topology subscriber tipc: collapse subscription creation functions tipc: some prefix changes tipc: make struct tipc_server private for server.c tipc: separate topology server listener socket from subcsriber sockets tipc: rename tipc_server to tipc_topsrv net/tipc/Makefile | 2 +- net/tipc/core.h | 6 +- net/tipc/group.c | 2 +- net/tipc/name_table.c | 73 +++--- net/tipc/name_table.h | 2 +- net/tipc/server.c | 710 -- net/tipc/server.h | 103 net/tipc/subscr.c | 361 + net/tipc/subscr.h | 66 +++-- net/tipc/topsrv.c | 702 + net/tipc/topsrv.h | 54 11 files changed, 912 insertions(+), 1169 deletions(-) delete mode 100644 net/tipc/server.c delete mode 100644 net/tipc/server.h create mode 100644 net/tipc/topsrv.c create mode 100644 net/tipc/topsrv.h -- 2.1.4
[net-next 01/10] tipc: remove redundant code in topology server
The socket handling in the topology server is unnecessarily generic. It is prepared to handle both SOCK_RDM, SOCK_DGRAM and SOCK_STREAM type sockets, as well as the only socket type which is really used, SOCK_SEQPACKET. We now remove this redundant code to make the code more readable. Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/server.c | 36 +++- net/tipc/server.h | 4 +--- net/tipc/subscr.c | 4 +--- 3 files changed, 9 insertions(+), 35 deletions(-) diff --git a/net/tipc/server.c b/net/tipc/server.c index df0c563..04a6dd9 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -82,7 +82,6 @@ struct tipc_conn { struct outqueue_entry { struct list_head list; struct kvec iov; - struct sockaddr_tipc dest; }; static void tipc_recv_work(struct work_struct *work); @@ -93,7 +92,6 @@ static void tipc_conn_kref_release(struct kref *kref) { struct tipc_conn *con = container_of(kref, struct tipc_conn, kref); struct tipc_server *s = con->server; - struct sockaddr_tipc *saddr = s->saddr; struct socket *sock = con->sock; struct sock *sk; @@ -103,8 +101,6 @@ static void tipc_conn_kref_release(struct kref *kref) __module_get(sock->ops->owner); __module_get(sk->sk_prot_creator->owner); } - saddr->scope = -TIPC_NODE_SCOPE; - kernel_bind(sock, (struct sockaddr *)saddr, sizeof(*saddr)); sock_release(sock); con->sock = NULL; } @@ -325,36 +321,24 @@ static struct socket *tipc_create_listen_sock(struct tipc_conn *con) { struct tipc_server *s = con->server; struct socket *sock = NULL; + int imp = TIPC_CRITICAL_IMPORTANCE; int ret; ret = sock_create_kern(s->net, AF_TIPC, SOCK_SEQPACKET, 0, &sock); if (ret < 0) return NULL; ret = kernel_setsockopt(sock, SOL_TIPC, TIPC_IMPORTANCE, - (char *)&s->imp, sizeof(s->imp)); + (char *)&imp, sizeof(imp)); if (ret < 0) goto create_err; ret = kernel_bind(sock, (struct sockaddr *)s->saddr, sizeof(*s->saddr)); if (ret < 0) goto create_err; - switch (s->type) { - case SOCK_STREAM: - case SOCK_SEQPACKET: - con->rx_action = tipc_accept_from_sock; - - ret = kernel_listen(sock, 0); - if (ret < 0) - goto create_err; - break; - case SOCK_DGRAM: - case SOCK_RDM: - con->rx_action = tipc_receive_from_sock; - break; - default: - pr_err("Unknown socket type %d\n", s->type); + con->rx_action = tipc_accept_from_sock; + ret = kernel_listen(sock, 0); + if (ret < 0) goto create_err; - } /* As server's listening socket owner and creator is the same module, * we have to decrease TIPC module reference count to guarantee that @@ -444,7 +428,7 @@ static void tipc_clean_outqueues(struct tipc_conn *con) } int tipc_conn_sendmsg(struct tipc_server *s, int conid, - struct sockaddr_tipc *addr, void *data, size_t len) + void *data, size_t len) { struct outqueue_entry *e; struct tipc_conn *con; @@ -464,9 +448,6 @@ int tipc_conn_sendmsg(struct tipc_server *s, int conid, return -ENOMEM; } - if (addr) - memcpy(&e->dest, addr, sizeof(struct sockaddr_tipc)); - spin_lock_bh(&con->outqueue_lock); list_add_tail(&e->list, &con->outqueue); spin_unlock_bh(&con->outqueue_lock); @@ -575,10 +556,6 @@ static void tipc_send_to_sock(struct tipc_conn *con) if (con->sock) { memset(&msg, 0, sizeof(msg)); msg.msg_flags = MSG_DONTWAIT; - if (s->type == SOCK_DGRAM || s->type == SOCK_RDM) { - msg.msg_name = &e->dest; - msg.msg_namelen = sizeof(struct sockaddr_tipc); - } ret = kernel_sendmsg(con->sock, &msg, &e->iov, 1, e->iov.iov_len); if (ret == -EWOULDBLOCK || ret == 0) { @@ -591,6 +568,7 @@ static void tipc_send_to_sock(struct tipc_conn *con) evt = e->iov.iov_base; tipc_send_kern_top_evt(s->net, evt); } + /* Don't starve users filling buffers */ if (++count >= MAX_SEND_MSG_COUNT) { cond_resched(); diff --git a/net/tipc/server.h b/net/tipc/server.h index 64df751..434736d 100644 --- a/net/tipc/server.h +++ b/net/tipc/server.h @@ -79,12 +79,10 @@ struct tipc_server {
[net-next 03/10] tipc: eliminate struct tipc_subscriber
It is unnecessary to keep two structures, struct tipc_conn and struct tipc_subscriber, with a one-to-one relationship and still with different life cycles. The fact that the two often run in different contexts, and still may access each other via direct pointers constitutes an additional hazard, something we have experienced at several occasions, and still see happening. We have identified at least two remaining problems that are easier to fix if we simplify the topology server data structure somewhat. - When there is a race between a subscription up/down event and a timeout event, it is fully possible that the former might be delivered after the latter, leading to confusion for the receiver. - The function tipc_subcrp_timeout() is executing in interrupt context, while the following call chain is at least theoretically possible: tipc_subscrp_timeout() tipc_subscrp_send_event() tipc_conn_sendmsg() conn_put() tipc_conn_kref_release() sock_release(sock) I.e., we end up calling a function that might try to sleep in interrupt context. To eliminate this, we need to ensure that the tipc_conn structure and the socket, as well as the subscription instances, only are deleted in work queue context, i.e., after the timeout event really has been sent out. We now remove this unnecessary complexity, by merging data and functionality of the subscriber structure into struct tipc_conn and the associated file server.c. We thereafter add a spinlock and a new 'inactive' state to the subscription structure. Using those, both problems described above can be easily solved. Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/server.c | 161 -- net/tipc/server.h | 2 +- net/tipc/subscr.c | 173 ++ net/tipc/subscr.h | 17 +++--- 4 files changed, 146 insertions(+), 207 deletions(-) diff --git a/net/tipc/server.c b/net/tipc/server.c index 8aa2a33..b8268c0 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -2,6 +2,7 @@ * net/tipc/server.c: TIPC server infrastructure * * Copyright (c) 2012-2013, Wind River Systems + * Copyright (c) 2017, Ericsson AB * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -57,12 +58,13 @@ * @sock: socket handler associated with connection * @flags: indicates connection state * @server: pointer to connected server + * @sub_list: lsit to all pertaing subscriptions + * @sub_lock: lock protecting the subscription list + * @outqueue_lock: control access to the outqueue * @rwork: receive work item - * @usr_data: user-specified field * @rx_action: what to do when connection socket is active * @outqueue: pointer to first outbound message in queue * @outqueue_lock: control access to the outqueue - * @outqueue: list of connection objects for its server * @swork: send work item */ struct tipc_conn { @@ -71,9 +73,10 @@ struct tipc_conn { struct socket *sock; unsigned long flags; struct tipc_server *server; + struct list_head sub_list; + spinlock_t sub_lock; /* for subscription list */ struct work_struct rwork; int (*rx_action) (struct tipc_conn *con); - void *usr_data; struct list_head outqueue; spinlock_t outqueue_lock; struct work_struct swork; @@ -81,6 +84,7 @@ struct tipc_conn { /* An entry waiting to be sent */ struct outqueue_entry { + u32 evt; struct list_head list; struct kvec iov; }; @@ -89,18 +93,33 @@ static void tipc_recv_work(struct work_struct *work); static void tipc_send_work(struct work_struct *work); static void tipc_clean_outqueues(struct tipc_conn *con); +static bool connected(struct tipc_conn *con) +{ + return con && test_bit(CF_CONNECTED, &con->flags); +} + +/** + * htohl - convert value to endianness used by destination + * @in: value to convert + * @swap: non-zero if endianness must be reversed + * + * Returns converted value + */ +static u32 htohl(u32 in, int swap) +{ + return swap ? swab32(in) : in; +} + static void tipc_conn_kref_release(struct kref *kref) { struct tipc_conn *con = container_of(kref, struct tipc_conn, kref); struct tipc_server *s = con->server; struct socket *sock = con->sock; - struct sock *sk; if (sock) { - sk = sock->sk; if (test_bit(CF_SERVER, &con->flags)) { __module_get(sock->ops->owner); - __module_get(sk->sk_prot_creator->owner); + __module_get(sock->sk->sk_prot_creator->owner); } sock_release(sock); con->sock = NULL; @@ -129,11 +148,8 @@ static struct tipc_conn *tipc_conn_lookup(struct tipc_server *s, int conid) spin_lock_bh(&s->idr_lock); con = idr_find(&s->conn_idr, conid); - if (con) { -
[net-next 02/10] tipc: remove unnecessary function pointers
Interaction between the functionality in server.c and subscr.c is done via function pointers installed in struct server. This makes the code harder to follow, and doesn't serve any obvious purpose. Here, we replace the function pointers with direct function calls. Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/server.c | 21 ++--- net/tipc/server.h | 5 - net/tipc/subscr.c | 27 ++- net/tipc/subscr.h | 4 4 files changed, 20 insertions(+), 37 deletions(-) diff --git a/net/tipc/server.c b/net/tipc/server.c index 04a6dd9..8aa2a33 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -33,6 +33,7 @@ * POSSIBILITY OF SUCH DAMAGE. */ +#include "subscr.h" #include "server.h" #include "core.h" #include "socket.h" @@ -182,7 +183,6 @@ static void tipc_register_callbacks(struct socket *sock, struct tipc_conn *con) static void tipc_close_conn(struct tipc_conn *con) { - struct tipc_server *s = con->server; struct sock *sk = con->sock->sk; bool disconnect = false; @@ -191,7 +191,7 @@ static void tipc_close_conn(struct tipc_conn *con) if (disconnect) { sk->sk_user_data = NULL; if (con->conid) - s->tipc_conn_release(con->conid, con->usr_data); + tipc_subscrb_delete(con->usr_data); } write_unlock_bh(&sk->sk_callback_lock); @@ -240,7 +240,6 @@ static int tipc_receive_from_sock(struct tipc_conn *con) { struct tipc_server *s = con->server; struct sock *sk = con->sock->sk; - struct sockaddr_tipc addr; struct msghdr msg = {}; struct kvec iov; void *buf; @@ -254,7 +253,7 @@ static int tipc_receive_from_sock(struct tipc_conn *con) iov.iov_base = buf; iov.iov_len = s->max_rcvbuf_size; - msg.msg_name = &addr; + msg.msg_name = NULL; iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC, &iov, 1, iov.iov_len); ret = sock_recvmsg(con->sock, &msg, MSG_DONTWAIT); if (ret <= 0) { @@ -264,8 +263,8 @@ static int tipc_receive_from_sock(struct tipc_conn *con) read_lock_bh(&sk->sk_callback_lock); if (test_bit(CF_CONNECTED, &con->flags)) - ret = s->tipc_conn_recvmsg(sock_net(con->sock->sk), con->conid, - &addr, con->usr_data, buf, ret); + ret = tipc_subscrb_rcv(sock_net(con->sock->sk), con->conid, + con->usr_data, buf, ret); read_unlock_bh(&sk->sk_callback_lock); kmem_cache_free(s->rcvbuf_cache, buf); if (ret < 0) @@ -284,7 +283,6 @@ static int tipc_receive_from_sock(struct tipc_conn *con) static int tipc_accept_from_sock(struct tipc_conn *con) { - struct tipc_server *s = con->server; struct socket *sock = con->sock; struct socket *newsock; struct tipc_conn *newcon; @@ -305,7 +303,8 @@ static int tipc_accept_from_sock(struct tipc_conn *con) tipc_register_callbacks(newsock, newcon); /* Notify that new connection is incoming */ - newcon->usr_data = s->tipc_conn_new(newcon->conid); + newcon->usr_data = tipc_subscrb_create(newcon->conid); + if (!newcon->usr_data) { sock_release(newsock); conn_put(newcon); @@ -489,7 +488,7 @@ bool tipc_topsrv_kern_subscr(struct net *net, u32 port, u32 type, u32 lower, *conid = con->conid; s = con->server; - scbr = s->tipc_conn_new(*conid); + scbr = tipc_subscrb_create(*conid); if (!scbr) { conn_put(con); return false; @@ -497,7 +496,7 @@ bool tipc_topsrv_kern_subscr(struct net *net, u32 port, u32 type, u32 lower, con->usr_data = scbr; con->sock = NULL; - s->tipc_conn_recvmsg(net, *conid, NULL, scbr, &sub, sizeof(sub)); + tipc_subscrb_rcv(net, *conid, scbr, &sub, sizeof(sub)); return true; } @@ -513,7 +512,7 @@ void tipc_topsrv_kern_unsubscr(struct net *net, int conid) test_and_clear_bit(CF_CONNECTED, &con->flags); srv = con->server; if (con->conid) - srv->tipc_conn_release(con->conid, con->usr_data); + tipc_subscrb_delete(con->usr_data); conn_put(con); conn_put(con); } diff --git a/net/tipc/server.h b/net/tipc/server.h index 434736d..b4b83bd 100644 --- a/net/tipc/server.h +++ b/net/tipc/server.h @@ -72,11 +72,6 @@ struct tipc_server { struct workqueue_struct *rcv_wq; struct workqueue_struct *send_wq; int max_rcvbuf_size; - void *(*tipc_conn_new)(int conid); - void (*tipc_conn_release)(int conid, void *usr_data); - int (*tipc_conn_recvmsg)(struct net *net, int conid, -struct sockaddr_tipc *addr, void *usr_data, -void *buf, size_t len); struct sockaddr_tipc *saddr; char nam
[net-next 04/10] tipc: simplify interaction between subscription and topology connection
The message transmission and reception in the topology server is more generic than is currently necessary. By basing the funtionality on the fact that we only send items of type struct tipc_event and always receive items of struct tipc_subcr we can make several simplifications, and also get rid of some unnecessary dynamic memory allocations. Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/name_table.c | 10 +-- net/tipc/server.c | 170 +- net/tipc/server.h | 12 +--- net/tipc/subscr.c | 40 ++-- net/tipc/subscr.h | 5 +- 5 files changed, 88 insertions(+), 149 deletions(-) diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c index ed0457c..c0ca7be 100644 --- a/net/tipc/name_table.c +++ b/net/tipc/name_table.c @@ -810,14 +810,15 @@ int tipc_nametbl_withdraw(struct net *net, u32 type, u32 lower, u32 ref, */ void tipc_nametbl_subscribe(struct tipc_subscription *s, bool status) { - struct tipc_net *tn = net_generic(s->net, tipc_net_id); + struct tipc_server *srv = s->server; + struct tipc_net *tn = tipc_net(srv->net); u32 type = tipc_subscrp_convert_seq_type(s->evt.s.seq.type, s->swap); int index = hash(type); struct name_seq *seq; struct tipc_name_seq ns; spin_lock_bh(&tn->nametbl_lock); - seq = nametbl_find_seq(s->net, type); + seq = nametbl_find_seq(srv->net, type); if (!seq) seq = tipc_nameseq_create(type, &tn->nametbl->seq_hlist[index]); if (seq) { @@ -837,12 +838,13 @@ void tipc_nametbl_subscribe(struct tipc_subscription *s, bool status) */ void tipc_nametbl_unsubscribe(struct tipc_subscription *s) { - struct tipc_net *tn = net_generic(s->net, tipc_net_id); + struct tipc_server *srv = s->server; + struct tipc_net *tn = tipc_net(srv->net); struct name_seq *seq; u32 type = tipc_subscrp_convert_seq_type(s->evt.s.seq.type, s->swap); spin_lock_bh(&tn->nametbl_lock); - seq = nametbl_find_seq(s->net, type); + seq = nametbl_find_seq(srv->net, type); if (seq != NULL) { spin_lock_bh(&seq->lock); list_del_init(&s->nameseq_list); diff --git a/net/tipc/server.c b/net/tipc/server.c index b8268c0..7933fb9 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -84,9 +84,9 @@ struct tipc_conn { /* An entry waiting to be sent */ struct outqueue_entry { - u32 evt; + bool inactive; + struct tipc_event evt; struct list_head list; - struct kvec iov; }; static void tipc_recv_work(struct work_struct *work); @@ -154,6 +154,9 @@ static struct tipc_conn *tipc_conn_lookup(struct tipc_server *s, int conid) return con; } +/* sock_data_ready - interrupt callback indicating the socket has data to read + * The queued job is launched in tipc_recv_from_sock() + */ static void sock_data_ready(struct sock *sk) { struct tipc_conn *con; @@ -168,6 +171,10 @@ static void sock_data_ready(struct sock *sk) read_unlock_bh(&sk->sk_callback_lock); } +/* sock_write_space - interrupt callback after a sendmsg EAGAIN + * Indicates that there now is more is space in the send buffer + * The queued job is launched in tipc_send_to_sock() + */ static void sock_write_space(struct sock *sk) { struct tipc_conn *con; @@ -273,10 +280,10 @@ static struct tipc_conn *tipc_alloc_conn(struct tipc_server *s) return con; } -int tipc_con_rcv_sub(struct net *net, int conid, struct tipc_conn *con, -void *buf, size_t len) +static int tipc_con_rcv_sub(struct tipc_server *srv, + struct tipc_conn *con, + struct tipc_subscr *s) { - struct tipc_subscr *s = (struct tipc_subscr *)buf; struct tipc_subscription *sub; bool status; int swap; @@ -292,7 +299,7 @@ int tipc_con_rcv_sub(struct net *net, int conid, struct tipc_conn *con, return 0; } status = !(s->filter & htohl(TIPC_SUB_NO_STATUS, swap)); - sub = tipc_subscrp_subscribe(net, s, conid, swap, status); + sub = tipc_subscrp_subscribe(srv, s, con->conid, swap, status); if (!sub) return -1; @@ -304,43 +311,27 @@ int tipc_con_rcv_sub(struct net *net, int conid, struct tipc_conn *con, static int tipc_receive_from_sock(struct tipc_conn *con) { - struct tipc_server *s = con->server; + struct tipc_server *srv = con->server; struct sock *sk = con->sock->sk; struct msghdr msg = {}; + struct tipc_subscr s; struct kvec iov; - void *buf; int ret; - buf = kmem_cache_alloc(s->rcvbuf_cache, GFP_ATOMIC); - if (!buf) { - ret = -ENOMEM; - goto out_close; - } - - iov.iov_base = buf; - iov.iov_len = s->max_rcvbuf_size; + iov.iov_base = &s; + iov.iov_le
[net-next 05/10] tipc: simplify endianness handling in topology subscriber
Because of the requirement for total distribution transparency, users send subscriptions and receive topology events in their own host format. It is up to the topology server to determine this format and do the correct conversions to and from its own host format when needed. Until now, this has been handled in a rather non-transparent way inside the topology server and subscriber code, leading to unnecessary complexity when creating subscriptions and issuing events. We now improve this situation by adding two new macros, tipc_sub_read() and tipc_evt_write(). Both those functions calculate the need for conversion internally before performing their respective operations. Hence, all handling of such conversions become transparent to the rest of the code. Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/name_table.c | 42 +++--- net/tipc/name_table.h | 2 +- net/tipc/server.c | 25 ++-- net/tipc/subscr.c | 83 +++ net/tipc/subscr.h | 36 -- 5 files changed, 86 insertions(+), 102 deletions(-) diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c index c0ca7be..2fbd0a2 100644 --- a/net/tipc/name_table.c +++ b/net/tipc/name_table.c @@ -412,18 +412,22 @@ static struct publication *tipc_nameseq_remove_publ(struct net *net, * sequence overlapping with the requested sequence */ static void tipc_nameseq_subscribe(struct name_seq *nseq, - struct tipc_subscription *s, - bool status) + struct tipc_subscription *sub) { struct sub_seq *sseq = nseq->sseqs; struct tipc_name_seq ns; + struct tipc_subscr *s = &sub->evt.s; + bool no_status; - tipc_subscrp_convert_seq(&s->evt.s.seq, s->swap, &ns); + ns.type = tipc_sub_read(s, seq.type); + ns.lower = tipc_sub_read(s, seq.lower); + ns.upper = tipc_sub_read(s, seq.upper); + no_status = tipc_sub_read(s, filter) & TIPC_SUB_NO_STATUS; - tipc_subscrp_get(s); - list_add(&s->nameseq_list, &nseq->subscriptions); + tipc_subscrp_get(sub); + list_add(&sub->nameseq_list, &nseq->subscriptions); - if (!status || !sseq) + if (no_status || !sseq) return; while (sseq != &nseq->sseqs[nseq->first_free]) { @@ -433,7 +437,7 @@ static void tipc_nameseq_subscribe(struct name_seq *nseq, int must_report = 1; list_for_each_entry(crs, &info->zone_list, zone_list) { - tipc_subscrp_report_overlap(s, sseq->lower, + tipc_subscrp_report_overlap(sub, sseq->lower, sseq->upper, TIPC_PUBLISHED, crs->ref, crs->node, @@ -808,11 +812,12 @@ int tipc_nametbl_withdraw(struct net *net, u32 type, u32 lower, u32 ref, /** * tipc_nametbl_subscribe - add a subscription object to the name table */ -void tipc_nametbl_subscribe(struct tipc_subscription *s, bool status) +void tipc_nametbl_subscribe(struct tipc_subscription *sub) { - struct tipc_server *srv = s->server; + struct tipc_server *srv = sub->server; struct tipc_net *tn = tipc_net(srv->net); - u32 type = tipc_subscrp_convert_seq_type(s->evt.s.seq.type, s->swap); + struct tipc_subscr *s = &sub->evt.s; + u32 type = tipc_sub_read(s, seq.type); int index = hash(type); struct name_seq *seq; struct tipc_name_seq ns; @@ -823,10 +828,12 @@ void tipc_nametbl_subscribe(struct tipc_subscription *s, bool status) seq = tipc_nameseq_create(type, &tn->nametbl->seq_hlist[index]); if (seq) { spin_lock_bh(&seq->lock); - tipc_nameseq_subscribe(seq, s, status); + tipc_nameseq_subscribe(seq, sub); spin_unlock_bh(&seq->lock); } else { - tipc_subscrp_convert_seq(&s->evt.s.seq, s->swap, &ns); + ns.type = tipc_sub_read(s, seq.type); + ns.lower = tipc_sub_read(s, seq.lower); + ns.upper = tipc_sub_read(s, seq.upper); pr_warn("Failed to create subscription for {%u,%u,%u}\n", ns.type, ns.lower, ns.upper); } @@ -836,19 +843,20 @@ void tipc_nametbl_subscribe(struct tipc_subscription *s, bool status) /** * tipc_nametbl_unsubscribe - remove a subscription object from name table */ -void tipc_nametbl_unsubscribe(struct tipc_subscription *s) +void tipc_nametbl_unsubscribe(struct tipc_subscription *sub) { - struct tipc_server *srv = s->server; + struct tipc_server *srv = sub->server; + struct tipc_subscr *s = &sub->evt.s; struct tipc_net *tn = tipc_net(srv->net); s
[net-next 06/10] tipc: collapse subscription creation functions
After the previous changes it becomes logical to collapse the two-level creation of subscription instances into one. We do that here. We also rename the creation and deletion functions for more consistency. Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/server.c | 4 ++-- net/tipc/server.h | 1 + net/tipc/subscr.c | 46 -- net/tipc/subscr.h | 14 +++--- 4 files changed, 22 insertions(+), 43 deletions(-) diff --git a/net/tipc/server.c b/net/tipc/server.c index 5d231fa..6a18b10 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -203,7 +203,7 @@ static void tipc_con_delete_sub(struct tipc_conn *con, struct tipc_subscr *s) spin_lock_bh(&con->sub_lock); list_for_each_entry_safe(sub, tmp, sub_list, subscrp_list) { if (!s || !memcmp(s, &sub->evt.s, sizeof(*s))) - tipc_sub_delete(sub); + tipc_sub_unsubscribe(sub); else if (s) break; } @@ -278,7 +278,7 @@ static int tipc_con_rcv_sub(struct tipc_server *srv, tipc_con_delete_sub(con, s); return 0; } - sub = tipc_subscrp_subscribe(srv, s, con->conid); + sub = tipc_sub_subscribe(srv, s, con->conid); if (!sub) return -1; diff --git a/net/tipc/server.h b/net/tipc/server.h index 2de8709..995b795 100644 --- a/net/tipc/server.h +++ b/net/tipc/server.h @@ -2,6 +2,7 @@ * net/tipc/server.h: Include file for TIPC server code * * Copyright (c) 2012-2013, Wind River Systems + * Copyright (c) 2017, Ericsson AB * All rights reserved. * * Redistribution and use in source and binary forms, with or without diff --git a/net/tipc/subscr.c b/net/tipc/subscr.c index 406b09f..8d37b61 100644 --- a/net/tipc/subscr.c +++ b/net/tipc/subscr.c @@ -134,33 +134,29 @@ void tipc_subscrp_get(struct tipc_subscription *subscription) kref_get(&subscription->kref); } -static struct tipc_subscription *tipc_subscrp_create(struct tipc_server *srv, -struct tipc_subscr *s, -int conid) +struct tipc_subscription *tipc_sub_subscribe(struct tipc_server *srv, +struct tipc_subscr *s, +int conid) { struct tipc_net *tn = tipc_net(srv->net); struct tipc_subscription *sub; u32 filter = tipc_sub_read(s, filter); + u32 timeout; - /* Refuse subscription if global limit exceeded */ - if (atomic_read(&tn->subscription_count) >= TIPC_MAX_SUBSCRIPTIONS) { - pr_warn("Subscription rejected, limit reached (%u)\n", - TIPC_MAX_SUBSCRIPTIONS); + if (atomic_read(&tn->subscription_count) >= TIPC_MAX_SUBSCR) { + pr_warn("Subscription rejected, max (%u)\n", TIPC_MAX_SUBSCR); + return NULL; + } + if ((filter & TIPC_SUB_PORTS && filter & TIPC_SUB_SERVICE) || + (tipc_sub_read(s, seq.lower) > tipc_sub_read(s, seq.upper))) { + pr_warn("Subscription rejected, illegal request\n"); return NULL; } - - /* Allocate subscription object */ sub = kmalloc(sizeof(*sub), GFP_ATOMIC); if (!sub) { pr_warn("Subscription rejected, no memory\n"); return NULL; } - - /* Initialize subscription object */ - if (filter & TIPC_SUB_PORTS && filter & TIPC_SUB_SERVICE) - goto err; - if (tipc_sub_read(s, seq.lower) > tipc_sub_read(s, seq.upper)) - goto err; sub->server = srv; sub->conid = conid; sub->inactive = false; @@ -168,24 +164,6 @@ static struct tipc_subscription *tipc_subscrp_create(struct tipc_server *srv, spin_lock_init(&sub->lock); atomic_inc(&tn->subscription_count); kref_init(&sub->kref); - return sub; -err: - pr_warn("Subscription rejected, illegal request\n"); - kfree(sub); - return NULL; -} - -struct tipc_subscription *tipc_subscrp_subscribe(struct tipc_server *srv, -struct tipc_subscr *s, -int conid) -{ - struct tipc_subscription *sub = NULL; - u32 timeout; - - sub = tipc_subscrp_create(srv, s, conid); - if (!sub) - return NULL; - tipc_nametbl_subscribe(sub); timer_setup(&sub->timer, tipc_subscrp_timeout, 0); timeout = tipc_sub_read(&sub->evt.s, timeout); @@ -194,7 +172,7 @@ struct tipc_subscription *tipc_subscrp_subscribe(struct tipc_server *srv, return sub; } -void tipc_sub_delete(struct tipc_subscription *sub) +void tipc_sub_unsubscribe(struct tipc_subscription *sub) { tipc_nametbl_unsubscribe(sub); if (sub->evt.s.timeout != TIP
[net-next 08/10] tipc: make struct tipc_server private for server.c
In order to narrow the interface and dependencies between the topology server and the subscription/binding table functionality we move struct tipc_server inside the file server.c. This requires some code adaptations in other files, but those are mostly minor. The most important change is that we have to move the start/stop functions for the topology server to server.c, where they logically belong anyway. Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/name_table.c | 10 ++-- net/tipc/server.c | 124 -- net/tipc/server.h | 36 +-- net/tipc/subscr.c | 64 ++ net/tipc/subscr.h | 4 +- 5 files changed, 110 insertions(+), 128 deletions(-) diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c index b234b7e..e01c9c6 100644 --- a/net/tipc/name_table.c +++ b/net/tipc/name_table.c @@ -813,8 +813,7 @@ int tipc_nametbl_withdraw(struct net *net, u32 type, u32 lower, u32 ref, */ void tipc_nametbl_subscribe(struct tipc_subscription *sub) { - struct tipc_server *srv = sub->server; - struct tipc_net *tn = tipc_net(srv->net); + struct tipc_net *tn = tipc_net(sub->net); struct tipc_subscr *s = &sub->evt.s; u32 type = tipc_sub_read(s, seq.type); int index = hash(type); @@ -822,7 +821,7 @@ void tipc_nametbl_subscribe(struct tipc_subscription *sub) struct tipc_name_seq ns; spin_lock_bh(&tn->nametbl_lock); - seq = nametbl_find_seq(srv->net, type); + seq = nametbl_find_seq(sub->net, type); if (!seq) seq = tipc_nameseq_create(type, &tn->nametbl->seq_hlist[index]); if (seq) { @@ -844,14 +843,13 @@ void tipc_nametbl_subscribe(struct tipc_subscription *sub) */ void tipc_nametbl_unsubscribe(struct tipc_subscription *sub) { - struct tipc_server *srv = sub->server; struct tipc_subscr *s = &sub->evt.s; - struct tipc_net *tn = tipc_net(srv->net); + struct tipc_net *tn = tipc_net(sub->net); struct name_seq *seq; u32 type = tipc_sub_read(s, seq.type); spin_lock_bh(&tn->nametbl_lock); - seq = nametbl_find_seq(srv->net, type); + seq = nametbl_find_seq(sub->net, type); if (seq != NULL) { spin_lock_bh(&seq->lock); list_del_init(&sub->nameseq_list); diff --git a/net/tipc/server.c b/net/tipc/server.c index a5c112e..0abbdd6 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -49,7 +49,37 @@ #define CF_CONNECTED 1 #define CF_SERVER 2 -#define sock2con(x) ((struct tipc_conn *)(x)->sk_user_data) +#define TIPC_SERVER_NAME_LEN 32 + +/** + * struct tipc_server - TIPC server structure + * @conn_idr: identifier set of connection + * @idr_lock: protect the connection identifier set + * @idr_in_use: amount of allocated identifier entry + * @net: network namspace instance + * @rcvbuf_cache: memory cache of server receive buffer + * @rcv_wq: receive workqueue + * @send_wq: send workqueue + * @max_rcvbuf_size: maximum permitted receive message length + * @tipc_conn_new: callback will be called when new connection is incoming + * @tipc_conn_release: callback will be called before releasing the connection + * @tipc_conn_recvmsg: callback will be called when message arrives + * @saddr: TIPC server address + * @name: server name + * @imp: message importance + * @type: socket type + */ +struct tipc_server { + struct idr conn_idr; + spinlock_t idr_lock; /* for idr list */ + int idr_in_use; + struct net *net; + struct workqueue_struct *rcv_wq; + struct workqueue_struct *send_wq; + int max_rcvbuf_size; + struct sockaddr_tipc *saddr; + char name[TIPC_SERVER_NAME_LEN]; +}; /** * struct tipc_conn - TIPC connection structure @@ -93,6 +123,11 @@ static void tipc_recv_work(struct work_struct *work); static void tipc_send_work(struct work_struct *work); static void tipc_clean_outqueues(struct tipc_conn *con); +static struct tipc_conn *sock2con(struct sock *sk) +{ + return sk->sk_user_data; +} + static bool connected(struct tipc_conn *con) { return con && test_bit(CF_CONNECTED, &con->flags); @@ -198,14 +233,17 @@ static void tipc_register_callbacks(struct socket *sock, struct tipc_conn *con) static void tipc_con_delete_sub(struct tipc_conn *con, struct tipc_subscr *s) { struct list_head *sub_list = &con->sub_list; + struct tipc_net *tn = tipc_net(con->server->net); struct tipc_subscription *sub, *tmp; spin_lock_bh(&con->sub_lock); list_for_each_entry_safe(sub, tmp, sub_list, sub_list) { - if (!s || !memcmp(s, &sub->evt.s, sizeof(*s))) + if (!s || !memcmp(s, &sub->evt.s, sizeof(*s))) { tipc_sub_unsubscribe(sub); - else if (s) + atomic_dec(&tn->subscription_count); + } else if (s) {
[net-next 07/10] tipc: some prefix changes
Since we now have removed struct tipc_subscriber from the code, and only struct tipc_subscription remains, there is no longer need for long and awkward prefixes to distinguish between their pertaining functions. We now change all tipc_subscrp_* prefixes to tipc_sub_*. This is a purely cosmetic change. Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/name_table.c | 33 net/tipc/server.c | 5 ++--- net/tipc/subscr.c | 52 +-- net/tipc/subscr.h | 20 ++-- 4 files changed, 54 insertions(+), 56 deletions(-) diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c index 2fbd0a2..b234b7e 100644 --- a/net/tipc/name_table.c +++ b/net/tipc/name_table.c @@ -326,10 +326,10 @@ static struct publication *tipc_nameseq_insert_publ(struct net *net, /* Any subscriptions waiting for notification? */ list_for_each_entry_safe(s, st, &nseq->subscriptions, nameseq_list) { - tipc_subscrp_report_overlap(s, publ->lower, publ->upper, - TIPC_PUBLISHED, publ->ref, - publ->node, publ->scope, - created_subseq); + tipc_sub_report_overlap(s, publ->lower, publ->upper, + TIPC_PUBLISHED, publ->ref, + publ->node, publ->scope, + created_subseq); } return publ; } @@ -397,10 +397,9 @@ static struct publication *tipc_nameseq_remove_publ(struct net *net, /* Notify any waiting subscriptions */ list_for_each_entry_safe(s, st, &nseq->subscriptions, nameseq_list) { - tipc_subscrp_report_overlap(s, publ->lower, publ->upper, - TIPC_WITHDRAWN, publ->ref, - publ->node, publ->scope, - removed_subseq); + tipc_sub_report_overlap(s, publ->lower, publ->upper, + TIPC_WITHDRAWN, publ->ref, publ->node, + publ->scope, removed_subseq); } return publ; @@ -424,25 +423,25 @@ static void tipc_nameseq_subscribe(struct name_seq *nseq, ns.upper = tipc_sub_read(s, seq.upper); no_status = tipc_sub_read(s, filter) & TIPC_SUB_NO_STATUS; - tipc_subscrp_get(sub); + tipc_sub_get(sub); list_add(&sub->nameseq_list, &nseq->subscriptions); if (no_status || !sseq) return; while (sseq != &nseq->sseqs[nseq->first_free]) { - if (tipc_subscrp_check_overlap(&ns, sseq->lower, sseq->upper)) { + if (tipc_sub_check_overlap(&ns, sseq->lower, sseq->upper)) { struct publication *crs; struct name_info *info = sseq->info; int must_report = 1; list_for_each_entry(crs, &info->zone_list, zone_list) { - tipc_subscrp_report_overlap(sub, sseq->lower, - sseq->upper, - TIPC_PUBLISHED, - crs->ref, crs->node, - crs->scope, - must_report); + tipc_sub_report_overlap(sub, sseq->lower, + sseq->upper, + TIPC_PUBLISHED, + crs->ref, crs->node, + crs->scope, + must_report); must_report = 0; } } @@ -856,7 +855,7 @@ void tipc_nametbl_unsubscribe(struct tipc_subscription *sub) if (seq != NULL) { spin_lock_bh(&seq->lock); list_del_init(&sub->nameseq_list); - tipc_subscrp_put(sub); + tipc_sub_put(sub); if (!seq->first_free && list_empty(&seq->subscriptions)) { hlist_del_init_rcu(&seq->ns_list); kfree(seq->sseqs); diff --git a/net/tipc/server.c b/net/tipc/server.c index 6a18b10..a5c112e 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -201,7 +201,7 @@ static void tipc_con_delete_sub(struct tipc_conn *con, struct tipc_subscr *s) struct tipc_subscription *sub, *tmp; spin_lock_bh(&con->sub_lock); - list_for_each_entry_safe(sub, tmp, sub_list, subscrp_list) { + list_for_each_entry_safe(sub, tmp, s
[net-next 09/10] tipc: separate topology server listener socket from subcsriber sockets
We move the listener socket to struct tipc_server and give it its own work item. This makes it easier to follow the code, and entails some simplifications in the reception code in subscriber sockets. Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/server.c | 328 -- 1 file changed, 147 insertions(+), 181 deletions(-) diff --git a/net/tipc/server.c b/net/tipc/server.c index 0abbdd6..0e351e8 100644 --- a/net/tipc/server.c +++ b/net/tipc/server.c @@ -64,7 +64,6 @@ * @tipc_conn_new: callback will be called when new connection is incoming * @tipc_conn_release: callback will be called before releasing the connection * @tipc_conn_recvmsg: callback will be called when message arrives - * @saddr: TIPC server address * @name: server name * @imp: message importance * @type: socket type @@ -74,10 +73,11 @@ struct tipc_server { spinlock_t idr_lock; /* for idr list */ int idr_in_use; struct net *net; + struct work_struct awork; struct workqueue_struct *rcv_wq; struct workqueue_struct *send_wq; int max_rcvbuf_size; - struct sockaddr_tipc *saddr; + struct socket *listener; char name[TIPC_SERVER_NAME_LEN]; }; @@ -106,7 +106,6 @@ struct tipc_conn { struct list_head sub_list; spinlock_t sub_lock; /* for subscription list */ struct work_struct rwork; - int (*rx_action) (struct tipc_conn *con); struct list_head outqueue; spinlock_t outqueue_lock; struct work_struct swork; @@ -121,12 +120,6 @@ struct outqueue_entry { static void tipc_recv_work(struct work_struct *work); static void tipc_send_work(struct work_struct *work); -static void tipc_clean_outqueues(struct tipc_conn *con); - -static struct tipc_conn *sock2con(struct sock *sk) -{ - return sk->sk_user_data; -} static bool connected(struct tipc_conn *con) { @@ -137,21 +130,21 @@ static void tipc_conn_kref_release(struct kref *kref) { struct tipc_conn *con = container_of(kref, struct tipc_conn, kref); struct tipc_server *s = con->server; - struct socket *sock = con->sock; + struct outqueue_entry *e, *safe; - if (sock) { - if (test_bit(CF_SERVER, &con->flags)) { - __module_get(sock->ops->owner); - __module_get(sock->sk->sk_prot_creator->owner); - } - sock_release(sock); - con->sock = NULL; - } spin_lock_bh(&s->idr_lock); idr_remove(&s->conn_idr, con->conid); s->idr_in_use--; spin_unlock_bh(&s->idr_lock); - tipc_clean_outqueues(con); + if (con->sock) + sock_release(con->sock); + + spin_lock_bh(&con->outqueue_lock); + list_for_each_entry_safe(e, safe, &con->outqueue, list) { + list_del(&e->list); + kfree(e); + } + spin_unlock_bh(&con->outqueue_lock); kfree(con); } @@ -178,14 +171,14 @@ static struct tipc_conn *tipc_conn_lookup(struct tipc_server *s, int conid) } /* sock_data_ready - interrupt callback indicating the socket has data to read - * The queued job is launched in tipc_recv_from_sock() + * The queued work is launched into tipc_recv_work()->tipc_recv_from_sock() */ static void sock_data_ready(struct sock *sk) { struct tipc_conn *con; read_lock_bh(&sk->sk_callback_lock); - con = sock2con(sk); + con = sk->sk_user_data; if (connected(con)) { conn_get(con); if (!queue_work(con->server->rcv_wq, &con->rwork)) @@ -195,15 +188,15 @@ static void sock_data_ready(struct sock *sk) } /* sock_write_space - interrupt callback after a sendmsg EAGAIN - * Indicates that there now is more is space in the send buffer - * The queued job is launched in tipc_send_to_sock() + * Indicates that there now is more space in the send buffer + * The queued work is launched into tipc_send_work()->tipc_send_to_sock() */ static void sock_write_space(struct sock *sk) { struct tipc_conn *con; read_lock_bh(&sk->sk_callback_lock); - con = sock2con(sk); + con = sk->sk_user_data; if (connected(con)) { conn_get(con); if (!queue_work(con->server->send_wq, &con->swork)) @@ -212,23 +205,8 @@ static void sock_write_space(struct sock *sk) read_unlock_bh(&sk->sk_callback_lock); } -static void tipc_register_callbacks(struct socket *sock, struct tipc_conn *con) -{ - struct sock *sk = sock->sk; - - write_lock_bh(&sk->sk_callback_lock); - - sk->sk_data_ready = sock_data_ready; - sk->sk_write_space = sock_write_space; - sk->sk_user_data = con; - - con->sock = sock; - - write_unlock_bh(&sk->sk_callback_lock); -} - /* tipc_con_delete_sub - delete a specific or all subscriptions - * for a given subscriber + * for a given subscriber */
[net-next 10/10] tipc: rename tipc_server to tipc_topsrv
We rename struct tipc_server to struct tipc_topsrv. This reflect its now specialized role as topology server. Accoringly, we change or add function prefixes to make it clearer which functionality those belong to. There are no functional changes in this commit. Acked-by: Ying.Xue Signed-off-by: Jon Maloy --- net/tipc/Makefile | 2 +- net/tipc/core.h | 6 +- net/tipc/group.c | 2 +- net/tipc/server.c | 700 - net/tipc/server.h | 57 - net/tipc/subscr.c | 2 +- net/tipc/subscr.h | 2 +- net/tipc/topsrv.c | 702 ++ net/tipc/topsrv.h | 54 + 9 files changed, 763 insertions(+), 764 deletions(-) delete mode 100644 net/tipc/server.c delete mode 100644 net/tipc/server.h create mode 100644 net/tipc/topsrv.c create mode 100644 net/tipc/topsrv.h diff --git a/net/tipc/Makefile b/net/tipc/Makefile index 37bb0bf..1edb719 100644 --- a/net/tipc/Makefile +++ b/net/tipc/Makefile @@ -9,7 +9,7 @@ tipc-y += addr.o bcast.o bearer.o \ core.o link.o discover.o msg.o \ name_distr.o subscr.o monitor.o name_table.o net.o \ netlink.o netlink_compat.o node.o socket.o eth_media.o \ - server.o socket.o group.o + topsrv.o socket.o group.o tipc-$(CONFIG_TIPC_MEDIA_UDP) += udp_media.o tipc-$(CONFIG_TIPC_MEDIA_IB) += ib_media.o diff --git a/net/tipc/core.h b/net/tipc/core.h index 20b21af..ff8b071 100644 --- a/net/tipc/core.h +++ b/net/tipc/core.h @@ -64,7 +64,7 @@ struct tipc_bearer; struct tipc_bc_base; struct tipc_link; struct tipc_name_table; -struct tipc_server; +struct tipc_topsrv; struct tipc_monitor; #define TIPC_MOD_VER "2.0.0" @@ -112,7 +112,7 @@ struct tipc_net { struct list_head dist_queue; /* Topology subscription server */ - struct tipc_server *topsrv; + struct tipc_topsrv *topsrv; atomic_t subscription_count; }; @@ -131,7 +131,7 @@ static inline struct list_head *tipc_nodes(struct net *net) return &tipc_net(net)->node_list; } -static inline struct tipc_server *tipc_topsrv(struct net *net) +static inline struct tipc_topsrv *tipc_topsrv(struct net *net) { return tipc_net(net)->topsrv; } diff --git a/net/tipc/group.c b/net/tipc/group.c index 122162a..03086cc 100644 --- a/net/tipc/group.c +++ b/net/tipc/group.c @@ -37,7 +37,7 @@ #include "addr.h" #include "group.h" #include "bcast.h" -#include "server.h" +#include "topsrv.h" #include "msg.h" #include "socket.h" #include "node.h" diff --git a/net/tipc/server.c b/net/tipc/server.c deleted file mode 100644 index 0e351e8..000 --- a/net/tipc/server.c +++ /dev/null @@ -1,700 +0,0 @@ -/* - * net/tipc/server.c: TIPC server infrastructure - * - * Copyright (c) 2012-2013, Wind River Systems - * Copyright (c) 2017, Ericsson AB - * All rights reserved. - * - * Redistribution and use in source and binary forms, with or without - * modification, are permitted provided that the following conditions are met: - * - * 1. Redistributions of source code must retain the above copyright - *notice, this list of conditions and the following disclaimer. - * 2. Redistributions in binary form must reproduce the above copyright - *notice, this list of conditions and the following disclaimer in the - *documentation and/or other materials provided with the distribution. - * 3. Neither the names of the copyright holders nor the names of its - *contributors may be used to endorse or promote products derived from - *this software without specific prior written permission. - * - * Alternatively, this software may be distributed under the terms of the - * GNU General Public License ("GPL") version 2 as published by the Free - * Software Foundation. - * - * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" - * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE - * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE - * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE - * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR - * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF - * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS - * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN - * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) - * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE - * POSSIBILITY OF SUCH DAMAGE. - */ - -#include "subscr.h" -#include "server.h" -#include "core.h" -#include "socket.h" -#include "addr.h" -#include "msg.h" -#include -#include - -/* Number of messages to send before rescheduling */ -#define MAX_SEND_MSG_COUNT 25 -#define MAX_RECV_MSG_COUNT 25 -#define CF_CONNECTED 1 -#define CF_SERVER 2 - -#define TIPC_SERVER_NAME_LEN 32 - -/** - * struc
Re: [vhost:vhost 24/24] drivers/firmware/qemu_fw_cfg.c:499:22: error: storage size of 'files' isn't known
Hi On Wed, Feb 14, 2018 at 9:27 PM, kbuild test robot wrote: > tree: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost > head: 5d457fe6aeaab9d0a1665eafc8af7139bc6b6f2e > commit: 5d457fe6aeaab9d0a1665eafc8af7139bc6b6f2e [24/24] fw_cfg: fix sparse > warnings around FW_CFG_FILE_DIR read > config: i386-randconfig-x015-201806 (attached as .config) > compiler: gcc-7 (Debian 7.3.0-1) 7.3.0 > reproduce: > git checkout 5d457fe6aeaab9d0a1665eafc8af7139bc6b6f2e > # save the attached .config to linux build tree > make ARCH=i386 > > All errors (new ones prefixed by >>): > >drivers/firmware/qemu_fw_cfg.c: In function 'fw_cfg_register_dir_entries': >>> drivers/firmware/qemu_fw_cfg.c:499:22: error: storage size of 'files' isn't >>> known > struct fw_cfg_files files; > ^ struct fw_cfg_files { __be32 count; /* number of entries */ struct fw_cfg_file f[]; }; Interesting, I don't have that warning with 7.3.1. I thought the size would be sizeof(count) by standard. I replaced it with a __be32 files_count variable instead. >drivers/firmware/qemu_fw_cfg.c:499:22: warning: unused variable 'files' > [-Wunused-variable] > files.count is used 3 lines below, that looks like a compiler bug to me. > vim +499 drivers/firmware/qemu_fw_cfg.c > >493 >494 /* iterate over all fw_cfg directory entries, registering each one */ >495 static int fw_cfg_register_dir_entries(void) >496 { >497 int ret = 0; >498 u32 count, i; > > 499 struct fw_cfg_files files; >500 struct fw_cfg_file *dir; >501 size_t dir_size; >502 >503 fw_cfg_read_blob(FW_CFG_FILE_DIR, &files.count, 0, > sizeof(files.count)); >504 count = be32_to_cpu(files.count); >505 dir_size = count * sizeof(struct fw_cfg_file); >506 >507 dir = kmalloc(dir_size, GFP_KERNEL); >508 if (!dir) >509 return -ENOMEM; >510 >511 fw_cfg_read_blob(FW_CFG_FILE_DIR, dir, sizeof(files.count), > dir_size); >512 >513 for (i = 0; i < count; i++) { >514 ret = fw_cfg_register_file(&dir[i]); >515 if (ret) >516 break; >517 } >518 >519 kfree(dir); >520 return ret; >521 } >522 > > --- > 0-DAY kernel test infrastructureOpen Source Technology Center > https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: [vhost:vhost 22/28] drivers/firmware/qemu_fw_cfg.c:35:10: fatal error: linux/fw_cfg.h: No such file or directory
On Wed, Feb 14, 2018 at 7:21 PM, kbuild test robot wrote: > tree: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost > head: 20b12f92d80433b9bd0d54b9712897501ac66fdd > commit: f59055103f6930c771fc597c42a92cbe997a765d [22/28] fw_cfg: add a public > uapi header > config: i386-randconfig-i0-201806 (attached as .config) > compiler: gcc-7 (Debian 7.3.0-1) 7.3.0 > reproduce: > git checkout f59055103f6930c771fc597c42a92cbe997a765d > # save the attached .config to linux build tree > make ARCH=i386 > > All errors (new ones prefixed by >>): > >>> drivers/firmware/qemu_fw_cfg.c:35:10: fatal error: linux/fw_cfg.h: No such >>> file or directory > #include > ^~~~ >compilation terminated. > > vim +35 drivers/firmware/qemu_fw_cfg.c > > > 35 #include For some reasons, my cflags have -I include/uapi. Let's be more explicit: -#include +#include > 36 > > --- > 0-DAY kernel test infrastructureOpen Source Technology Center > https://lists.01.org/pipermail/kbuild-all Intel Corporation
Re: ppp/pppoe, still panic 4.15.3 in ppp_push
On 2018-02-14 19:25, Guillaume Nault wrote: On Wed, Feb 14, 2018 at 06:49:19PM +0200, Denys Fedoryshchenko wrote: On 2018-02-14 18:47, Guillaume Nault wrote: > On Wed, Feb 14, 2018 at 06:29:34PM +0200, Denys Fedoryshchenko wrote: > > On 2018-02-14 18:07, Guillaume Nault wrote: > > > On Wed, Feb 14, 2018 at 03:17:23PM +0200, Denys Fedoryshchenko wrote: > > > > Hi, > > > > > > > > Upgraded kernel to 4.15.3, still it crashes after while (several > > > > hours, > > > > cannot do bisect, as it is production server). > > > > > > > > dev ppp # gdb ppp_generic.o > > > > GNU gdb (Gentoo 7.12.1 vanilla) 7.12.1 > > > > <> > > > > Reading symbols from ppp_generic.o...done. > > > > (gdb) list *ppp_push+0x73 > > > > 0x681 is in ppp_push (drivers/net/ppp/ppp_generic.c:1663). > > > > 1658list = list->next; > > > > 1659pch = list_entry(list, struct channel, clist); > > > > 1660 > > > > 1661spin_lock(&pch->downl); > > > > 1662if (pch->chan) { > > > > 1663if (pch->chan->ops->start_xmit(pch->chan, skb)) > > > > 1664ppp->xmit_pending = NULL; > > > > 1665} else { > > > > 1666/* channel got unregistered */ > > > > 1667kfree_skb(skb); > > > > > > > > > > > I expect a memory corruption. Do you have the possibility to run with > > > KASAN by any chance? > > I will try to enable it tonight. For now i reverted "drivers, net, > > ppp: > > convert ppp_file.refcnt from atomic_t to refcount_t" for test. > > > This commit looks good to me. Do you have doubts about it because it's > new in 4.15? Does it mean that your last known-good kernel is 4.14? I am just doing "manual" bisect, checking all possibilities, and picking patch to revert randomly. Must be a painful process. Are all of your networking modules required? With luck, you might be able to isolate a faulty module in fewer steps. Yes, correct, my known-good is 4.14.2. Good to know. Let me know if you can get a KASAN trace. Here we go: [24558.921549] == [24558.922167] BUG: KASAN: use-after-free in ppp_ioctl+0xa6a/0x1522 [ppp_generic] [24558.922776] Write of size 8 at addr 8803d35bf3f8 by task accel-pppd/12622 [24558.923113] [24558.923451] CPU: 0 PID: 12622 Comm: accel-pppd Tainted: G W4.15.3-build-0134 #1 [24558.924058] Hardware name: HP ProLiant DL320e Gen8 v2, BIOS P80 04/02/2015 [24558.924406] Call Trace: [24558.924753] dump_stack+0x46/0x59 [24558.925103] print_address_description+0x6b/0x23b [24558.925451] ? ppp_ioctl+0xa6a/0x1522 [ppp_generic] [24558.925797] kasan_report+0x21b/0x241 [24558.926136] ppp_ioctl+0xa6a/0x1522 [ppp_generic] [24558.926479] ? ppp_nl_newlink+0x1da/0x1da [ppp_generic] [24558.926829] ? sock_sendmsg+0x89/0x99 [24558.927176] ? __vfs_write+0xd9/0x4ad [24558.927523] ? kernel_read+0xed/0xed [24558.927872] ? SyS_getpeername+0x18c/0x18c [24558.928213] ? bit_waitqueue+0x2a/0x2a [24558.928561] ? wake_atomic_t_function+0x115/0x115 [24558.928898] vfs_ioctl+0x6e/0x81 [24558.929228] do_vfs_ioctl+0xa00/0xb10 [24558.929571] ? sigprocmask+0x1a6/0x1d0 [24558.929907] ? sigsuspend+0x13e/0x13e [24558.930239] ? ioctl_preallocate+0x14e/0x14e [24558.930568] ? SyS_rt_sigprocmask+0xf1/0x142 [24558.930904] ? sigprocmask+0x1d0/0x1d0 [24558.931252] SyS_ioctl+0x39/0x55 [24558.931595] ? do_vfs_ioctl+0xb10/0xb10 [24558.931942] do_syscall_64+0x1b1/0x31f [24558.932288] entry_SYSCALL_64_after_hwframe+0x21/0x86 [24558.932627] RIP: 0033:0x7f302849d8a7 [24558.932965] RSP: 002b:7f3029a52af8 EFLAGS: 0206 ORIG_RAX: 0010 [24558.933578] RAX: ffda RBX: 7f3027d861e3 RCX: 7f302849d8a7 [24558.933927] RDX: 7f3023f49468 RSI: 4004743a RDI: 3a67 [24558.934266] RBP: 7f3029a52b20 R08: R09: 55c8308d8e40 [24558.934607] R10: 0008 R11: 0206 R12: 7f3023f49358 [24558.934947] R13: 7ffe86e5723f R14: R15: 7f3029a53700 [24558.935288] [24558.935626] Allocated by task 12622: [24558.935972] ppp_register_net_channel+0x5f/0x5c6 [ppp_generic] [24558.936306] pppoe_connect+0xab7/0xc71 [pppoe] [24558.936640] SyS_connect+0x14b/0x1b7 [24558.936975] do_syscall_64+0x1b1/0x31f [24558.937319] entry_SYSCALL_64_after_hwframe+0x21/0x86 [24558.937655] [24558.937993] Freed by task 12622: [24558.938321] kfree+0xb0/0x11d [24558.938658] ppp_release+0x111/0x120 [ppp_generic] [24558.938994] __fput+0x2ba/0x51a [24558.939332] task_work_run+0x11c/0x13d [24558.939676] exit_to_usermode_loop+0x7c/0xaf [24558.940022] do_syscall_64+0x2ea/0x31f [24558.940368] entry_SYSCALL_64_after_hwframe+0x21/0x86 [24558.94
Re: [PATCHv2 net-next 3/8] net: sched: act: handle generic action errors
On Wed, 2018-02-14 at 17:13 -0500, Alexander Aring wrote: > This patch adds extack support for generic act handling. The extack > will be set deeper to each called function which is not part of netdev > core api. > > Based on work by David Ahern hello Alexander, after looking at the code more closely, I think there is margin for some more improvement here (i.e make the message contents easier to grep from a log). Please let me know if the comments below make sense for you. thank you in advance! -- davide > Cc: David Ahern > Signed-off-by: Alexander Aring > --- > net/sched/act_api.c | 93 > +++-- > 1 file changed, 61 insertions(+), 32 deletions(-) > > diff --git a/net/sched/act_api.c b/net/sched/act_api.c > index 8d89b026414f..a5138ae026a1 100644 > --- a/net/sched/act_api.c > +++ b/net/sched/act_api.c > @@ -617,31 +617,40 @@ struct tc_action *tcf_action_init_1(struct net *net, > struct tcf_proto *tp, > int err; > > if (name == NULL) { > - err = nla_parse_nested(tb, TCA_ACT_MAX, nla, NULL, NULL); > + err = nla_parse_nested(tb, TCA_ACT_MAX, nla, NULL, extack); > if (err < 0) > goto err_out; > err = -EINVAL; > kind = tb[TCA_ACT_KIND]; > - if (!kind) > + if (!kind) { > + NL_SET_ERR_MSG(extack, "TC action kind must be > specified"); > goto err_out; > - if (nla_strlcpy(act_name, kind, IFNAMSIZ) >= IFNAMSIZ) > + } > + if (nla_strlcpy(act_name, kind, IFNAMSIZ) >= IFNAMSIZ) { > + NL_SET_ERR_MSG(extack, "TC action name too long"); > goto err_out; > + } > if (tb[TCA_ACT_COOKIE]) { > int cklen = nla_len(tb[TCA_ACT_COOKIE]); > > - if (cklen > TC_COOKIE_MAX_SIZE) > + if (cklen > TC_COOKIE_MAX_SIZE) { > + NL_SET_ERR_MSG(extack, "TC cookie size above > the maximum"); > goto err_out; > + } > > cookie = nla_memdup_cookie(tb); > if (!cookie) { > + NL_SET_ERR_MSG(extack, "No memory to generate > TC cookie"); > err = -ENOMEM; > goto err_out; > } > } > } else { > - err = -EINVAL; > - if (strlcpy(act_name, name, IFNAMSIZ) >= IFNAMSIZ) > + if (strlcpy(act_name, name, IFNAMSIZ) >= IFNAMSIZ) { > + NL_SET_ERR_MSG(extack, "TC action name too long"); > + err = -EINVAL; > goto err_out; > + } > } > > a_o = tc_lookup_action_n(act_name); > @@ -664,6 +673,7 @@ struct tc_action *tcf_action_init_1(struct net *net, > struct tcf_proto *tp, > goto err_mod; > } > #endif > + NL_SET_ERR_MSG(extack, "Failed to load TC action module"); > err = -ENOENT; > goto err_out; > } > @@ -698,6 +708,7 @@ struct tc_action *tcf_action_init_1(struct net *net, > struct tcf_proto *tp, > > list_add_tail(&a->list, &actions); > tcf_action_destroy(&actions, bind); > + NL_SET_ERR_MSG(extack, "Failed to init action chain"); most of the times, the word 'action' is prepended by the word 'TC'. Proposal: "Failed to init TC action chain" > return ERR_PTR(err); > } > } > @@ -734,7 +745,7 @@ int tcf_action_init(struct net *net, struct tcf_proto > *tp, struct nlattr *nla, > int err; > int i; > > - err = nla_parse_nested(tb, TCA_ACT_MAX_PRIO, nla, NULL, NULL); > + err = nla_parse_nested(tb, TCA_ACT_MAX_PRIO, nla, NULL, extack); > if (err < 0) > return err; > > @@ -842,7 +853,8 @@ static int tca_get_fill(struct sk_buff *skb, struct > list_head *actions, > > static int > tcf_get_notify(struct net *net, u32 portid, struct nlmsghdr *n, > -struct list_head *actions, int event) > +struct list_head *actions, int event, > +struct netlink_ext_ack *extack) > { > struct sk_buff *skb; > > @@ -851,6 +863,7 @@ tcf_get_notify(struct net *net, u32 portid, struct > nlmsghdr *n, > return -ENOBUFS; > if (tca_get_fill(skb, actions, portid, n->nlmsg_seq, 0, event, >0, 0) <= 0) { > + NL_SET_ERR_MSG(extack, "Failed to fill netlink tc action > attributes"); see also @@ -975,6 +1000,7 @@ and @@ -975,6 +1000,7 @@: this is the same message you get in case tcf_add_notify() fails, and it's very similar to what you get when tcf_del_notify() fails: the only difference is uppercase/lowercase 'tc' word.
Re: [PATCH iproute2 2/7] devlink: mnlg: Add support for extended ack
On 02/14/2018 05:12 PM, Stephen Hemminger wrote: > On Wed, 14 Feb 2018 10:55:17 +0200 > Arkadi Sharshevsky wrote: > >> +static mnl_cb_t mnlg_cb_array[NLMSG_MIN_TYPE] = { >> +[NLMSG_NOOP]= mnlg_cb_noop, >> +[NLMSG_ERROR] = mnlg_cb_error, >> +[NLMSG_DONE]= mnlg_cb_stop, >> +[NLMSG_OVERRUN] = mnlg_cb_noop, >> +}; >> + > > Could be const? > I pass the array to mnl_cb_run2() which will discard the 'const' qualifier. So I dont think this is very beneficial.
Re: [PATCH V2 net-next 4/7] rds: support for zcopy completion notification
On (02/14/18 19:41), Willem de Bruijn wrote: > > One more thing: this code notifies that the operation succeeded, but > the data was copied in the process. It does not have to be set otherwise. I see. this one was a bit confusing for me (hence the copy/paste) - maybe because the TCP/UDP/PACKET case is a bit different from RDS, and sendmsg may find that it has to switch from zcopy to bcopy after the sendmsg() itself has returned, but this cannot happen for RDS (thus I should never set this code?). Is it ok if I fix this in the next round or do you think this warrants a V3? --Sowmini
[RFC PATCH] chtls_netdev() can be static
Fixes: 5995a3b59239 ("Makefile Kconfig") Signed-off-by: Fengguang Wu --- chtls_main.c | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/drivers/crypto/chelsio/chtls/chtls_main.c b/drivers/crypto/chelsio/chtls/chtls_main.c index 58efb4a..3452f44 100644 --- a/drivers/crypto/chelsio/chtls/chtls_main.c +++ b/drivers/crypto/chelsio/chtls/chtls_main.c @@ -136,8 +136,8 @@ static int chtls_stop_listen(struct sock *sk) return 0; } -struct net_device *chtls_netdev(struct tls_device *dev, - struct net_device *netdev) +static struct net_device *chtls_netdev(struct tls_device *dev, + struct net_device *netdev) { struct chtls_dev *cdev = to_chtls_dev(dev); int i; @@ -149,12 +149,12 @@ struct net_device *chtls_netdev(struct tls_device *dev, return cdev->ports[i]; } -void chtls_update_prot(struct tls_device *dev, struct sock *sk) +static void chtls_update_prot(struct tls_device *dev, struct sock *sk) { sk->sk_prot = &chtls_base_prot; } -int chtls_inline_feature(struct tls_device *dev) +static int chtls_inline_feature(struct tls_device *dev) { struct chtls_dev *cdev = to_chtls_dev(dev); struct net_device *netdev; @@ -168,14 +168,14 @@ int chtls_inline_feature(struct tls_device *dev) return 1; } -int chtls_create_hash(struct tls_device *dev, struct sock *sk) +static int chtls_create_hash(struct tls_device *dev, struct sock *sk) { if (sk->sk_state == TCP_LISTEN) return chtls_start_listen(sk); return 0; } -void chtls_destroy_hash(struct tls_device *dev, struct sock *sk) +static void chtls_destroy_hash(struct tls_device *dev, struct sock *sk) { if (sk->sk_state == TCP_LISTEN) chtls_stop_listen(sk);
Re: [Crypto v4 12/12] Makefile Kconfig
Hi Atul, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on cryptodev/master] [cannot apply to net/master net-next/master v4.16-rc1 next-20180214] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system] url: https://github.com/0day-ci/linux/commits/Atul-Gupta/Chelsio-Inline-TLS/20180215-072600 base: https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git master reproduce: # apt-get install sparse make ARCH=x86_64 allmodconfig make C=1 CF=-D__CHECK_ENDIAN__ sparse warnings: (new ones prefixed by >>) >> drivers/crypto/chelsio/chtls/chtls_main.c:139:19: sparse: symbol >> 'chtls_netdev' was not declared. Should it be >> drivers/crypto/chelsio/chtls/chtls_main.c:152:6: sparse: symbol >> 'chtls_update_prot' was not declared. Should it be >> drivers/crypto/chelsio/chtls/chtls_main.c:157:5: sparse: symbol >> 'chtls_inline_feature' was not declared. Should it be >> drivers/crypto/chelsio/chtls/chtls_main.c:171:5: sparse: symbol >> 'chtls_create_hash' was not declared. Should it be >> drivers/crypto/chelsio/chtls/chtls_main.c:178:6: sparse: symbol >> 'chtls_destroy_hash' was not declared. Should it be -- >> drivers/crypto/chelsio/chtls/chtls_cm.c:373:27: sparse: incorrect type in >> assignment (different address spaces) @@ expected struct socket_wq @@ got @@ drivers/crypto/chelsio/chtls/chtls_cm.c:373:27: expected struct socket_wq drivers/crypto/chelsio/chtls/chtls_cm.c:373:27: got struct socket_wq >> drivers/crypto/chelsio/chtls/chtls_cm.c:395:23: sparse: incompatible types >> in comparison expression (different address spaces) drivers/crypto/chelsio/chtls/chtls_cm.c:714:6: sparse: symbol 'free_atid' was not declared. Should it be >> drivers/crypto/chelsio/chtls/chtls_cm.h:150:35: sparse: incorrect type in >> argument 1 (different address spaces) @@ expected struct socket_wq @@ got >> struct socket_wq struct socket_wq @@ drivers/crypto/chelsio/chtls/chtls_cm.h:150:35: expected struct socket_wq drivers/crypto/chelsio/chtls/chtls_cm.h:150:35: got struct socket_wq >> drivers/crypto/chelsio/chtls/chtls_cm.c:1165:37: sparse: incorrect type in >> argument 2 (different base types) @@ expected unsigned int local_ip @@ got >> ed int local_ip @@ drivers/crypto/chelsio/chtls/chtls_cm.c:1165:37: expected unsigned int local_ip drivers/crypto/chelsio/chtls/chtls_cm.c:1165:37: got restricted __be32 daddr >> drivers/crypto/chelsio/chtls/chtls_cm.c:1165:49: sparse: incorrect type in >> argument 3 (different base types) @@ expected unsigned int peer_ip @@ got ed >> int peer_ip @@ drivers/crypto/chelsio/chtls/chtls_cm.c:1165:49: expected unsigned int peer_ip drivers/crypto/chelsio/chtls/chtls_cm.c:1165:49: got restricted __be32 saddr >> drivers/crypto/chelsio/chtls/chtls_cm.h:173:37: sparse: incorrect type in >> assignment (different base types) @@ expected restricted __be32 >> skc_rcv_saddr @@ got unsignrestricted __be32 skc_rcv_saddr @@ drivers/crypto/chelsio/chtls/chtls_cm.h:173:37: expected restricted __be32 skc_rcv_saddr drivers/crypto/chelsio/chtls/chtls_cm.h:173:37: got unsigned int local_ip >> drivers/crypto/chelsio/chtls/chtls_cm.h:174:37: sparse: incorrect type in >> assignment (different base types) @@ expected restricted __be32 skc_daddr @@ >> got unsignrestricted __be32 skc_daddr @@ drivers/crypto/chelsio/chtls/chtls_cm.h:174:37: expected restricted __be32 skc_daddr drivers/crypto/chelsio/chtls/chtls_cm.h:174:37: got unsigned int peer_ip >> drivers/crypto/chelsio/chtls/chtls_cm.c:1243:23: sparse: incorrect type in >> argument 1 (different address spaces) @@ expected struct dst_entry @@ got >> struct dst_entry struct dst_entry @@ drivers/crypto/chelsio/chtls/chtls_cm.c:1243:23: expected struct dst_entry drivers/crypto/chelsio/chtls/chtls_cm.c:1243:23: got struct dst_entry >> drivers/crypto/chelsio/chtls/chtls_cm.c:1539:24: sparse: cast to restricted >> __be16 >> drivers/crypto/chelsio/chtls/chtls_cm.c:1539:24: sparse: cast to restricted >> __be16 >> drivers/crypto/chelsio/chtls/chtls_cm.c:1539:24: sparse: cast to restricted >> __be16 >> drivers/crypto/chelsio/chtls/chtls_cm.c:1539:24: sparse: cast to restricted >> __be16 drivers/crypto/chelsio/chtls/chtls_cm.c:1540:31: sparse: cast to restricted __be16 drivers/crypto/chelsio/chtls/chtls_cm.c:1540:31: sparse: cast to restricted __be16 drivers/crypto/chelsio/chtls/chtls_cm.c:1540:31: sparse: cast to restricted __be16 drivers/crypto/chelsio/chtls/chtls_cm.c:1540:31: sparse: cast to restricted __be16 drivers/crypto/chelsio/chtls/chtls_cm.c:1664:31: s
[PATCH net] cxgb4: free up resources of pf 0-3
free pf 0-3 resources, commit baf5086840ab ("cxgb4: restructure VF mgmt code") erroneously removed the code which frees the pf 0-3 resources, causing the probe of pf 0-3 to fail in case of driver reload. Fixes: baf5086840ab ("cxgb4: restructure VF mgmt code") Signed-off-by: Ganesh Goudar --- drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 25 + 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 56bc626..7b452e8 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -4982,9 +4982,10 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs) pcie_fw = readl(adap->regs + PCIE_FW_A); /* Check if cxgb4 is the MASTER and fw is initialized */ - if (!(pcie_fw & PCIE_FW_INIT_F) || + if (num_vfs && + (!(pcie_fw & PCIE_FW_INIT_F) || !(pcie_fw & PCIE_FW_MASTER_VLD_F) || - PCIE_FW_MASTER_G(pcie_fw) != CXGB4_UNIFIED_PF) { + PCIE_FW_MASTER_G(pcie_fw) != CXGB4_UNIFIED_PF)) { dev_warn(&pdev->dev, "cxgb4 driver needs to be MASTER to support SRIOV\n"); return -EOPNOTSUPP; @@ -5599,24 +5600,24 @@ static void remove_one(struct pci_dev *pdev) #if IS_ENABLED(CONFIG_IPV6) t4_cleanup_clip_tbl(adapter); #endif - iounmap(adapter->regs); if (!is_t4(adapter->params.chip)) iounmap(adapter->bar2); - pci_disable_pcie_error_reporting(pdev); - if ((adapter->flags & DEV_ENABLED)) { - pci_disable_device(pdev); - adapter->flags &= ~DEV_ENABLED; - } - pci_release_regions(pdev); - kfree(adapter->mbox_log); - synchronize_rcu(); - kfree(adapter); } #ifdef CONFIG_PCI_IOV else { cxgb4_iov_configure(adapter->pdev, 0); } #endif + iounmap(adapter->regs); + pci_disable_pcie_error_reporting(pdev); + if ((adapter->flags & DEV_ENABLED)) { + pci_disable_device(pdev); + adapter->flags &= ~DEV_ENABLED; + } + pci_release_regions(pdev); + kfree(adapter->mbox_log); + synchronize_rcu(); + kfree(adapter); } /* "Shutdown" quiesces the device, stopping Ingress Packet and Interrupt -- 2.1.0
[PATCH net] cxgb4: fix trailing zero in CIM LA dump
Set correct size of the CIM LA dump for T6. Fixes: 27887bc7cb7f ("cxgb4: collect hardware LA dumps") Signed-off-by: Rahul Lakkireddy Signed-off-by: Ganesh Goudar --- drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c | 2 +- drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c index 557fd8bfd54e..00a1d2d13169 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cudbg_lib.c @@ -472,7 +472,7 @@ int cudbg_collect_cim_la(struct cudbg_init *pdbg_init, if (is_t6(padap->params.chip)) { size = padap->params.cim_la_size / 10 + 1; - size *= 11 * sizeof(u32); + size *= 10 * sizeof(u32); } else { size = padap->params.cim_la_size / 8; size *= 8 * sizeof(u32); diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c index 30485f9a598f..143686c60234 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c @@ -102,7 +102,7 @@ static u32 cxgb4_get_entity_length(struct adapter *adap, u32 entity) case CUDBG_CIM_LA: if (is_t6(adap->params.chip)) { len = adap->params.cim_la_size / 10 + 1; - len *= 11 * sizeof(u32); + len *= 10 * sizeof(u32); } else { len = adap->params.cim_la_size / 8; len *= 8 * sizeof(u32); -- 2.14.1
[net-next v2 1/1] tipc: avoid unnecessary copying of bundled messages
A received sk buffer may contain dozens of smaller 'bundled' messages which after extraction go each in their own direction. Unfortunately, when we extract those messages using skb_clone() each of the extracted buffers inherit the truesize value of the original buffer. Apart from causing massive overaccounting of the base buffer's memory, this often causes tipc_msg_validate() to come to the false conclusion that the ratio truesize/datasize > 4, and perform an unnecessary copying of the extracted buffer. We now fix this problem by explicitly correcting the truesize value of the buffer clones to be the truesize of the clone itself plus a calculated fraction of the base buffer's overhead. This change eliminates the overaccounting and at least mitigates the occurrence of unnecessary buffer copying. Reported-by: Hoang Le Acked-by: Ying Xue Signed-off-by: Jon Maloy --- net/tipc/msg.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/net/tipc/msg.c b/net/tipc/msg.c index 4e1c6f6..ce0bfc4 100644 --- a/net/tipc/msg.c +++ b/net/tipc/msg.c @@ -416,8 +416,8 @@ bool tipc_msg_bundle(struct sk_buff *skb, struct tipc_msg *msg, u32 mtu) */ bool tipc_msg_extract(struct sk_buff *skb, struct sk_buff **iskb, int *pos) { + int imsz, offset, clone_cnt, skb_overhead; struct tipc_msg *msg; - int imsz, offset; *iskb = NULL; if (unlikely(skb_linearize(skb))) @@ -434,6 +434,11 @@ bool tipc_msg_extract(struct sk_buff *skb, struct sk_buff **iskb, int *pos) skb_pull(*iskb, offset); imsz = msg_size(buf_msg(*iskb)); skb_trim(*iskb, imsz); + + /* Scale extracted buffer's truesize to avoid double accounting */ + clone_cnt = max_t(u32, 1, msg_msgcnt(msg)); + skb_overhead = skb->truesize - skb->len; + (*iskb)->truesize = SKB_TRUESIZE(imsz) + skb_overhead / clone_cnt; if (unlikely(!tipc_msg_validate(iskb))) goto none; *pos += align(imsz); -- 2.1.4
Re: [PATCH] net: sched: tbf: handle GSO_BY_FRAGS case in enqueue
On Mon, Feb 05, 2018 at 02:34:33PM +1100, Daniel Axtens wrote: > tbf_enqueue() checks the size of a packet before enqueuing it. > However, the GSO size check does not consider the GSO_BY_FRAGS > case, and so will drop GSO SCTP packets, causing a massive drop > in throughput. > > Use skb_gso_validate_mac_len() instead, as it does consider that > case. > > --- > > skb_gso_validate_mac_len() is an out-of-line call, but so is > skb_gso_mac_seglen(), so this is slower but not much slower. I > will send a patch to make the skb_gso_validate_* functions > inline-able shortly. > > Also, GSO_BY_FRAGS considered harmful - I'm pretty sure this is > not the only place it causes issues. Thanks for spotting these issues and fixing them. > > Signed-off-by: Daniel Axtens If this block was meant to be an out-of-band/changelog comment, your SOB line should be above the first --- marker. Anyhow, Reviewed-by: Marcelo Ricardo Leitner > --- > net/sched/sch_tbf.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c > index 229172d509cc..03225a8df973 100644 > --- a/net/sched/sch_tbf.c > +++ b/net/sched/sch_tbf.c > @@ -188,7 +188,8 @@ static int tbf_enqueue(struct sk_buff *skb, struct Qdisc > *sch, > int ret; > > if (qdisc_pkt_len(skb) > q->max_size) { > - if (skb_is_gso(skb) && skb_gso_mac_seglen(skb) <= q->max_size) > + if (skb_is_gso(skb) && > + skb_gso_validate_mac_len(skb, q->max_size)) > return tbf_segment(skb, sch, to_free); > return qdisc_drop(skb, sch, to_free); > } > -- > 2.14.1 >
[PATCH net-next 1/2] net: dsa: mv88e6xxx: Release mutex between each statistics read
The PTP code needs low latency access to the PTP hardware timestamps. Reading all the statistics in one go adds a lot of latency to the PTP code. So take and release the reg_lock mutex for each individual statistics, allowing the PTP thread jump in between. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx/chip.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index af63710e93c1..bd5cb8a0330e 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -714,9 +714,12 @@ static void mv88e6xxx_stats_get_stats(struct mv88e6xxx_chip *chip, int port, for (i = 0, j = 0; i < ARRAY_SIZE(mv88e6xxx_hw_stats); i++) { stat = &mv88e6xxx_hw_stats[i]; if (stat->type & types) { + mutex_lock(&chip->reg_lock); data[j] = _mv88e6xxx_get_ethtool_stat(chip, stat, port, bank1_select, histogram); + mutex_unlock(&chip->reg_lock); + j++; } } @@ -764,14 +767,13 @@ static void mv88e6xxx_get_ethtool_stats(struct dsa_switch *ds, int port, mutex_lock(&chip->reg_lock); ret = mv88e6xxx_stats_snapshot(chip, port); - if (ret < 0) { - mutex_unlock(&chip->reg_lock); + mutex_unlock(&chip->reg_lock); + + if (ret < 0) return; - } mv88e6xxx_get_stats(chip, port, data); - mutex_unlock(&chip->reg_lock); } static int mv88e6xxx_stats_set_histogram(struct mv88e6xxx_chip *chip) -- 2.15.1
[PATCH net-next 2/2] net: dsa: mv88e6xxx: Release mutex between each ATU read
The PTP code needs low latency access to the PTP hardware timestamps. Reading all the ATU entries in one go adds a lot of latency to the PTP code. So take and release the reg_lock mutex for each individual MAC address in the ATU, allowing the PTP thread jump in between. Signed-off-by: Andrew Lunn --- drivers/net/dsa/mv88e6xxx/chip.c | 14 -- 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index bd5cb8a0330e..39c7ad7e490f 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -1437,7 +1437,9 @@ static int mv88e6xxx_port_db_dump_fid(struct mv88e6xxx_chip *chip, eth_broadcast_addr(addr.mac); do { + mutex_lock(&chip->reg_lock); err = mv88e6xxx_g1_atu_getnext(chip, fid, &addr); + mutex_unlock(&chip->reg_lock); if (err) return err; @@ -1470,7 +1472,10 @@ static int mv88e6xxx_port_db_dump(struct mv88e6xxx_chip *chip, int port, int err; /* Dump port's default Filtering Information Database (VLAN ID 0) */ + mutex_lock(&chip->reg_lock); err = mv88e6xxx_port_get_fid(chip, port, &fid); + mutex_unlock(&chip->reg_lock); + if (err) return err; @@ -1480,7 +1485,9 @@ static int mv88e6xxx_port_db_dump(struct mv88e6xxx_chip *chip, int port, /* Dump VLANs' Filtering Information Databases */ do { + mutex_lock(&chip->reg_lock); err = mv88e6xxx_vtu_getnext(chip, &vlan); + mutex_unlock(&chip->reg_lock); if (err) return err; @@ -1500,13 +1507,8 @@ static int mv88e6xxx_port_fdb_dump(struct dsa_switch *ds, int port, dsa_fdb_dump_cb_t *cb, void *data) { struct mv88e6xxx_chip *chip = ds->priv; - int err; - - mutex_lock(&chip->reg_lock); - err = mv88e6xxx_port_db_dump(chip, port, cb, data); - mutex_unlock(&chip->reg_lock); - return err; + return mv88e6xxx_port_db_dump(chip, port, cb, data); } static int mv88e6xxx_bridge_map(struct mv88e6xxx_chip *chip, -- 2.15.1
[PATCH net-next 0/2] net: dsa: mv88e6xxx: Improve PTP access latency
PTP needs to retrieve the hardware timestamps from the switch device in a low latency manor. However ethtool -S and bridge fdb show can hold the switch register access mutex for a long time. These patches changes the reading the statistics and the ATU so that the mutex is released and taken again between each statistic or ATU entry. The PTP code can then interleave its access to the hardware, keeping its latency low. Andrew Lunn (2): net: dsa: mv88e6xxx: Release mutex between each statistics read net: dsa: mv88e6xxx: Release mutex between each ATU read drivers/net/dsa/mv88e6xxx/chip.c | 24 ++-- 1 file changed, 14 insertions(+), 10 deletions(-) -- 2.15.1
[PATCH net-next] cxgb4: append firmware dump to vmcore in kernel panic
Register callback to panic_notifier_list. Invoke dump collect routine to append dump to vmcore. Signed-off-by: Rahul Lakkireddy Signed-off-by: Ganesh Goudar --- drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 5 ++ drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c | 93 +++- drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.h | 4 + drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 13 4 files changed, 111 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h index d3fa53db61ee..3d578fa1d640 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h @@ -568,6 +568,7 @@ enum { /* adapter flags */ FW_OFLD_CONN = (1 << 9), ROOT_NO_RELAXED_ORDERING = (1 << 10), SHUTTING_DOWN = (1 << 11), + K_CRASH= (1 << 12), }; enum { @@ -946,6 +947,10 @@ struct adapter { /* Ethtool Dump */ struct ethtool_dump eth_dump; + + void *dump_buf; /* Dump buffer for collecting logs in panic */ + u32 dump_buf_size; /* Dump buffer size */ + struct notifier_block panic_nb; /* Panic notifier info */ }; /* Support for "sched-class" command to allow a TX Scheduling Class to be diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c index 30485f9a598f..579a019a246f 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c @@ -383,13 +383,25 @@ static void cxgb4_cudbg_collect_entity(struct cudbg_init *pdbg_init, static int cudbg_alloc_compress_buff(struct cudbg_init *pdbg_init) { + struct adapter *adap = pdbg_init->adap; u32 workspace_size; workspace_size = cudbg_get_workspace_size(); - pdbg_init->compress_buff = vzalloc(CUDBG_COMPRESS_BUFF_SIZE + - workspace_size); - if (!pdbg_init->compress_buff) - return -ENOMEM; + + if (adap->flags & K_CRASH) { + /* In panic scenario, the compression buffer is already +* allocated. So, just update accordingly. +*/ + pdbg_init->compress_buff = (u8 *)adap->dump_buf + + adap->dump_buf_size - + workspace_size - + CUDBG_COMPRESS_BUFF_SIZE; + } else { + pdbg_init->compress_buff = vzalloc(CUDBG_COMPRESS_BUFF_SIZE + + workspace_size); + if (!pdbg_init->compress_buff) + return -ENOMEM; + } pdbg_init->compress_buff_size = CUDBG_COMPRESS_BUFF_SIZE; pdbg_init->workspace = (u8 *)pdbg_init->compress_buff + @@ -399,6 +411,14 @@ static int cudbg_alloc_compress_buff(struct cudbg_init *pdbg_init) static void cudbg_free_compress_buff(struct cudbg_init *pdbg_init) { + struct adapter *adap = pdbg_init->adap; + + /* Don't free in panic scenario. We need the buffer to be present +* in vmcore so that we can extract the dump. +*/ + if (adap->flags & K_CRASH) + return; + if (pdbg_init->compress_buff) vfree(pdbg_init->compress_buff); } @@ -488,3 +508,68 @@ void cxgb4_init_ethtool_dump(struct adapter *adapter) adapter->eth_dump.version = adapter->params.fw_vers; adapter->eth_dump.len = 0; } + +static int cxgb4_panic_notify(struct notifier_block *this, unsigned long event, + void *ptr) +{ + struct adapter *adap = container_of(this, struct adapter, panic_nb); + bool use_bd; + u32 len; + + /* Save original value and restore after collection */ + use_bd = adap->use_bd; + + dev_info(adap->pdev_dev, "Initialized cxgb4 crash handler"); + adap->flags |= K_CRASH; + + /* Don't contact firmware. Directly access registers */ + adap->use_bd = true; + + len = adap->dump_buf_size; + cxgb4_cudbg_collect(adap, adap->dump_buf, &len, CXGB4_ETH_DUMP_ALL); + dev_info(adap->pdev_dev, "cxgb4 debug collection done..."); + + /* Restore original value */ + adap->use_bd = use_bd; + return NOTIFY_DONE; +} + +int cxgb4_cudbg_register_notifier(struct adapter *adap) +{ + u32 wsize, len; + + len = sizeof(struct cudbg_hdr) + + sizeof(struct cudbg_entity_hdr) * CUDBG_MAX_ENTITY; + len += cxgb4_get_dump_length(adap, CXGB4_ETH_DUMP_ALL); + + /* If compression is enabled, allocate extra memory needed for +* compression too. +*/ + wsize = cudbg_get_workspace_size(); + if (wsize) + wsize += CUDBG_COMPRESS_BUFF_SIZE; + + adap->dump_buf_size = len + wsize; + adap->dump_bu
Re: [PATCH 0/3] Remove IPVlan module dependencies on IPv6 and Netfilter
On Thu, Feb 15, 2018 at 2:11 AM, David Miller wrote: > From: Matteo Croce > Date: Wed, 14 Feb 2018 19:13:42 +0100 > >> The IPVlan module currently depends on IPv6 and Netfilter. >> Refactor the code to allow building IPVlan module regardless of the value of >> CONFIG_IPV6 and CONFIG_NETFILTER. >> Also change the dependency to CONFIG_NET_L3_MASTER_DEV into a select, >> as compiling L3 Master device alone has no sense. > > As stated, the L3 master and netfilter are hard depenencies when using > ipvlan in some modes. > > You can't just ifdef the driver like this, it changes fundamental > pieces of functionality. > > I would say leave things as they are right now. Hi David, yes, I noticed that L3 master and netfilter are really needed in l3s mode, so let's drop patch 2/3. What about the other two, removing IPv6 and change the Kconfig? Other devices like VXLan, Geneve and VRF uses the same architecture to allow conditional compilation of the IPv6 module, I think that IPVlan should do the same. Regards, -- Matteo Croce per aspera ad upstream
Re: tg3 crashes under high load, when using 100Mbits
On Mon, Feb 12, 2018 at 10:59 AM, Siva Reddy Kallam wrote: > On Fri, Feb 9, 2018 at 10:41 AM, Kai Heng Feng > wrote: >> Hi Broadcom folks, >> >> We are now enabling a new platform with tg3 nic, unfortunately we observed >> the bug [1] that dated back to 2015. >> I tried commit 4419bb1cedcd ("tg3: Add workaround to restrict 5762 MRRS to >> 2048â) but it doesât work. >> >> Do you have any idea how to solve the issue? >> >> [1] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1447664 >> >> Kai-Heng >> > Thank you for reporting. We will check and update you. With link aware mode, the clock speed could be slow and boot code does not complete within the expected time with lower link speeds. Need to override and the clock in driver. We are checking the feasibility of adding this in driver or firmware.
Re: [PATCH V6 2/4] sctp: Add ip option support
On Tue, Feb 13, 2018 at 08:54:44PM +, Richard Haines wrote: > Add ip option support to allow LSM security modules to utilise CIPSO/IPv4 > and CALIPSO/IPv6 services. > > Signed-off-by: Richard Haines > --- > include/net/sctp/sctp.h| 4 +++- > include/net/sctp/structs.h | 2 ++ > net/sctp/chunk.c | 12 +++- > net/sctp/ipv6.c| 42 +++--- > net/sctp/output.c | 5 - > net/sctp/protocol.c| 36 > net/sctp/socket.c | 14 ++ > 7 files changed, 97 insertions(+), 18 deletions(-) > > diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h > index f7ae6b0..25c5c87 100644 > --- a/include/net/sctp/sctp.h > +++ b/include/net/sctp/sctp.h > @@ -441,9 +441,11 @@ static inline int sctp_list_single_entry(struct > list_head *head) > static inline int sctp_frag_point(const struct sctp_association *asoc, int > pmtu) > { > struct sctp_sock *sp = sctp_sk(asoc->base.sk); > + struct sctp_af *af = sp->pf->af; > int frag = pmtu; > > - frag -= sp->pf->af->net_header_len; > + frag -= af->ip_options_len(asoc->base.sk); > + frag -= af->net_header_len; > frag -= sizeof(struct sctphdr) + sctp_datachk_len(&asoc->stream); > > if (asoc->user_frag) > diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h > index 03e92dd..ead5fce 100644 > --- a/include/net/sctp/structs.h > +++ b/include/net/sctp/structs.h > @@ -491,6 +491,7 @@ struct sctp_af { > void(*ecn_capable)(struct sock *sk); > __u16 net_header_len; > int sockaddr_len; > + int (*ip_options_len)(struct sock *sk); > sa_family_t sa_family; > struct list_head list; > }; > @@ -515,6 +516,7 @@ struct sctp_pf { > int (*addr_to_user)(struct sctp_sock *sk, union sctp_addr *addr); > void (*to_sk_saddr)(union sctp_addr *, struct sock *sk); > void (*to_sk_daddr)(union sctp_addr *, struct sock *sk); > + void (*copy_ip_options)(struct sock *sk, struct sock *newsk); > struct sctp_af *af; > }; > > diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c > index 991a530..d5c0ef7 100644 > --- a/net/sctp/chunk.c > +++ b/net/sctp/chunk.c > @@ -154,7 +154,6 @@ static void sctp_datamsg_assign(struct sctp_datamsg *msg, > struct sctp_chunk *chu > chunk->msg = msg; > } > > - > /* A data chunk can have a maximum payload of (2^16 - 20). Break > * down any such message into smaller chunks. Opportunistically, fragment > * the chunks down to the current MTU constraints. We may get refragmented > @@ -171,6 +170,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct > sctp_association *asoc, > struct list_head *pos, *temp; > struct sctp_chunk *chunk; > struct sctp_datamsg *msg; > + struct sctp_sock *sp; > + struct sctp_af *af; > int err; > > msg = sctp_datamsg_new(GFP_KERNEL); > @@ -189,9 +190,11 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct > sctp_association *asoc, > /* This is the biggest possible DATA chunk that can fit into >* the packet >*/ > - max_data = asoc->pathmtu - > -sctp_sk(asoc->base.sk)->pf->af->net_header_len - > -sizeof(struct sctphdr) - sctp_datachk_len(&asoc->stream); > + sp = sctp_sk(asoc->base.sk); > + af = sp->pf->af; > + max_data = asoc->pathmtu - af->net_header_len - > +sizeof(struct sctphdr) - sctp_datachk_len(&asoc->stream) - > +af->ip_options_len(asoc->base.sk); > max_data = SCTP_TRUNC4(max_data); > > /* If the the peer requested that we authenticate DATA chunks > @@ -211,7 +214,6 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct > sctp_association *asoc, > > /* Set first_len and then account for possible bundles on first frag */ > first_len = max_data; > - > /* Check to see if we have a pending SACK and try to let it be bundled >* with this message. Do this if we don't have any data queued already. >* To check that, look at out_qlen and retransmit list. > diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c > index e35d4f7..0b0f895 100644 > --- a/net/sctp/ipv6.c > +++ b/net/sctp/ipv6.c > @@ -427,6 +427,38 @@ static void sctp_v6_copy_addrlist(struct list_head > *addrlist, > rcu_read_unlock(); > } > > +/* Copy over any ip options */ > +static void sctp_v6_copy_ip_options(struct sock *sk, struct sock *newsk) > +{ > + struct ipv6_pinfo *newnp, *np = inet6_sk(sk); > + struct ipv6_txoptions *opt; > + > + newnp = inet6_sk(newsk); > + > + rcu_read_lock(); > + opt = rcu_dereference(np->opt); > + if (opt) > + opt = ipv6_dup_options(newsk, opt); do you want to print a warning here in the event the allocation for the dup operation fails? > + RCU_INIT_POINTER(newnp->opt, opt); > + rcu_read
Re: [PATCH V6 0/4] Add SELinux SCTP protocol support
On Wed, Feb 14, 2018 at 02:19:03PM -0500, Paul Moore wrote: > On Tue, Feb 13, 2018 at 3:52 PM, Richard Haines > wrote: > > These patches have been built on Fedora 27 with kernel-4.16.0-0.rc1 plus > > the following userspace patches to enable testing: > > > > 1) Updates to libsepol 2.7 to support the sctp portcon statement. > >The patch is available from: > > http://arctic.selinuxproject.org/~rhaines/selinux-sctp/ > > selinux-Add-support-for-the-SCTP-portcon-keyword.patch > > > > 2) Updates to the SELinux Test Suite adding SCTP tests. Please read the > >selinux-testsuite/README.sctp for details. The patch is available from: > > http://arctic.selinuxproject.org/~rhaines/selinux-sctp/ > > selinux-testsuite-Add-SCTP-test-support.patch > > > > 3) Updates to lksctp-tools that show SELinux info in sctp_darn and > >sctp_test. It also contains a minor patch for test_1_to_1_connect.c > >as when CIPSO/CALIPSO configured, NetLabel returns a different error > >code for illegal addresses in test 5. The patch is available from: > > http://arctic.selinuxproject.org/~rhaines/selinux-sctp/ > > lksctp-tools-Add-SELinux-support-to-sctp_test-and-sc.patch > > > > All SCTP lksctp-tools/src/func_tests run correctly in enforcing mode. > > > > All SCTP regression tests "./sctp-tests run" run correctly in enforcing > > mode. These tests are obtained from: https://github.com/sctp/sctp-tests > > > > The selinux-testsuite patch also adds remote tests (that need some manual > > configuration). These are useful for testing CIPSO/CALIPSO over a network > > with a number of categories to produce large ip option fields with various > > message sizes forcing fragmentation etc.. > > > > Changes since RFC Patch: > > Removed the NetLabel patch (was [RFC PATCH 4/5] netlabel: Add SCTP support) > > as re-engineered. However this patchset will require the NetLabel > > patch at [1] to fully run the SCTP selinux-testsuite. > > > > V1 Changes: > > PATCH 1/4 > > Remove unused parameter from security_sctp_assoc_request(). > > Reformat and update LSM-sctp.rst documentation. > > PATCH 2/4 > > Add variables and RCU locks as requested in [2] to support IP options. > > PATCH 3/4 > > Added security_sctp_assoc_request() hook to sctp_sf_do_unexpected_init() > > and sctp_sf_do_5_2_4_dupcook(). > > Removed security_sctp_assoc_request() hook from sctp_sf_do_5_1C_ack() as > > no longer required. > > PATCH 4/4 > > Reformat and update SELinux-sctp.rst documentation. > > Remove bindx and connectx permissions. > > Rework selinux_socket_connect() and selinux_netlbl_socket_connect() to > > utilise helpers for code reuse. > > Add spinlock to selinux_sctp_assoc_request(). > > Remove unused parameter from security_sctp_assoc_request(). > > Use address->sa_family == AF_INET in *_bind and *_connect to ensure > > correct address type. > > Minor cleanups. > > > > V2 Changes: > > PATCH 4/4 - Remove spin lock from selinux_sctp_assoc_request() > > PATCH 4/4 - Fix selinux_sctp_sk_clone() kbuild test robot catch [3] > > > > V3 Changes: > > PATCH 2/4 - Account for IP options length in sctp.h sctp_frag_point() by > > Marcelo > > > > V4 Changes: > > PATCH 1/4 - Move specific SELinux descriptions from LSM-sctp.rst and > > lsm_hooks.h to SELinux-sctp.rst in PATCH 4/4 > > PATCH 4/4 - Rename selinux_netlbl_sctp_socket_connect() to > > selinux_netlbl_socket_connect_locked() and move description comments to > > selinux_sctp_bind_connect() > > > > V5 Change: Rework selinux_netlbl_socket_connect() and > > selinux_netlbl_socket_connect_locked as requested by Paul. > > > > V6 Changes: > > Rework SCTP patches 2/4 and 3/4 as there have been major SCTP updates since > > kernel 4.14. > > > > [1] https://marc.info/?l=selinux&m=151061619115945&w=2 > > [2] https://marc.info/?l=selinux&m=150962470215797&w=2 > > [3] https://marc.info/?l=selinux&m=151198281817779&w=2 > > > > Richard Haines (4): > > security: Add support for SCTP security hooks > > sctp: Add ip option support > > sctp: Add LSM hooks > > selinux: Add SCTP support > > Marcelo, or any other SCTP folks, do the SCTP changes still look okay > to you? I'd like to merge these into the selinux/next tree by the end > of the week ... > I had a few comments that I just posted. Neil > -- > paul moore > www.paul-moore.com > -- > To unsubscribe from this list: send the line "unsubscribe linux-sctp" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >
[PATCH] PCI/cxgb4: Extend T3 PCI quirk to T4+ devices
From: Casey Leedom We've run into a problem where our device is attached to a Virtual Machine and the use of the new pci_set_vpd_size() API doesn't help. The VM kernel has been informed that the accesses are okay, but all of the actual VPD Capability Accesses are trapped down into the KVM Hypervisor where it goes ahead and imposes the silent denials. The right idea is to follow the kernel.org commit 1c7de2b4ff88 ("PCI: Enable access to non-standard VPD for Chelsio devices (cxgb3)") which Alexey Kardashevskiy authored to establish a PCI Quirk for our T3-based adapters. This commit extends that PCI Quirk to cover Chelsio T4 devices and later. The advantage of this approach is that the VPD Size gets set early in the Base OS/Hypervisor Boot and doesn't require that the cxgb4 driver even be available in the Base OS/Hypervisor. Thus PF4 can be exported to a Virtual Machine and everything should work. Fixes: 67e658794ca1 ("cxgb4: Set VPD size so we can read both VPD structures") Cc: # v4.9+ Signed-off-by: Casey Leedom Signed-off-by: Arjun Vynipadath Signed-off-by: Ganesh Goudar --- drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 10 drivers/pci/quirks.c | 39 ++ 2 files changed, 23 insertions(+), 26 deletions(-) diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c index 047609e..920bccd 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c +++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c @@ -2637,7 +2637,6 @@ void t4_get_regs(struct adapter *adap, void *buf, size_t buf_size) } #define EEPROM_STAT_ADDR 0x7bfc -#define VPD_SIZE 0x800 #define VPD_BASE 0x400 #define VPD_BASE_OLD 0 #define VPD_LEN1024 @@ -2704,15 +2703,6 @@ int t4_get_raw_vpd_params(struct adapter *adapter, struct vpd_params *p) if (!vpd) return -ENOMEM; - /* We have two VPD data structures stored in the adapter VPD area. -* By default, Linux calculates the size of the VPD area by traversing -* the first VPD area at offset 0x0, so we need to tell the OS what -* our real VPD size is. -*/ - ret = pci_set_vpd_size(adapter->pdev, VPD_SIZE); - if (ret < 0) - goto out; - /* Card information normally starts at VPD_BASE but early cards had * it at 0. */ diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index fc73401..8b14bd3 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -3419,22 +3419,29 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_PORT_RIDGE, static void quirk_chelsio_extend_vpd(struct pci_dev *dev) { - pci_set_vpd_size(dev, 8192); -} - -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x20, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x21, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x22, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x23, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x24, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x25, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x26, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x30, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x31, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x32, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x35, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x36, quirk_chelsio_extend_vpd); -DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, 0x37, quirk_chelsio_extend_vpd); + int chip = (dev->device & 0xf000) >> 12; + int func = (dev->device & 0x0f00) >> 8; + int prod = (dev->device & 0x00ff) >> 0; + + /* +* If this is a T3-based adapter, there's a 1KB VPD area at offset +* 0xc00 which contains the preferred VPD values. If this is a T4 or +* later based adapter, the special VPD is at offset 0x400 for the +* Physical Functions (the SR-IOV Virtual Functions have no VPD +* Capabilities). The PCI VPD Access core routines will normally +* compute the size of the VPD by parsing the VPD Data Structure at +* offset 0x000. This will result in silent failures when attempting +* to accesses these other VPD areas which are beyond those computed +* limits. +*/ + if (chip == 0x0 && prod >= 0x20) + pci_set_vpd_size(dev, 8192); + else if (chip >= 0x4 && func < 0x8) + pci_set_vpd_size(dev, 2048); +} + +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_CHELSIO, PCI_ANY_ID, + quirk_chelsio_extend_vpd); #ifdef CONFIG_ACPI /* -- 2.1.0
Re: [PATCH] net: sched: tbf: handle GSO_BY_FRAGS case in enqueue
Hi Marcelo, > > If this block was meant to be an out-of-band/changelog comment, your > SOB line should be above the first --- marker. > Anyhow, > Reviewed-by: Marcelo Ricardo Leitner Thanks - I did a v2 with this around the right way [1], but DaveM asked me to be a bit more thorough and look for other GSO_BY_FRAGS issues. [2] I did find a couple, including the one I asked you about with qdisk_pkt_len_init [3] (thanks for your answer, btw). Currently I'm trying to wrap my head around GSO_DODGY as Eric mentioned on that thread, and then I'll do a series with this fix and the other GSO_BY_FRAGS fixes I find. Regards, Daniel [1] https://patchwork.ozlabs.org/patch/869145/ [2] https://patchwork.ozlabs.org/comment/1852414/ [3] https://www.spinics.net/lists/netdev/msg482397.html > >> --- >> net/sched/sch_tbf.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c >> index 229172d509cc..03225a8df973 100644 >> --- a/net/sched/sch_tbf.c >> +++ b/net/sched/sch_tbf.c >> @@ -188,7 +188,8 @@ static int tbf_enqueue(struct sk_buff *skb, struct Qdisc >> *sch, >> int ret; >> >> if (qdisc_pkt_len(skb) > q->max_size) { >> -if (skb_is_gso(skb) && skb_gso_mac_seglen(skb) <= q->max_size) >> +if (skb_is_gso(skb) && >> +skb_gso_validate_mac_len(skb, q->max_size)) >> return tbf_segment(skb, sch, to_free); >> return qdisc_drop(skb, sch, to_free); >> } >> -- >> 2.14.1 >>
[PATCH net] net: sched: fix unbalance in the error path of tca_action_flush()
When tca_action_flush() calls the action walk() and gets an error, a successful call to nla_nest_start() is not followed by a call to nla_nest_cancel(). It's harmless, as the skb is freed in the error path - but it's worth to fix this unbalance. Signed-off-by: Davide Caratti --- net/sched/act_api.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/sched/act_api.c b/net/sched/act_api.c index eba6682727dd..43f4991b 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -938,8 +938,10 @@ static int tca_action_flush(struct net *net, struct nlattr *nla, goto out_module_put; err = ops->walk(net, skb, &dcb, RTM_DELACTION, ops); - if (err <= 0) + if (err <= 0) { + nla_nest_cancel(skb, nest); goto out_module_put; + } nla_nest_end(skb, nest); -- 2.14.3
Re: [Crypto v5 03/12] support for inline tls
On 02/15/18 12:24 PM, Atul Gupta wrote: > @@ -401,6 +430,15 @@ static int do_tls_setsockopt_tx(struct sock *sk, char > __user *optval, > goto out; > } > > + rc = get_tls_offload_dev(sk); > + if (rc) { > + goto out; > + } else { > + /* Retain HW unhash for cleanup and move to SW Tx */ > + sk->sk_prot[TLS_BASE_TX].unhash = > + sk->sk_prot[TLS_FULL_HW].unhash; Isn't sk->sk_prot a pointer to a global shared struct here still? It looks like this would actually modify the global struct, and not just for this sk.
[PATCHv3 net-next 0/8] net: sched: act: add extack support
Hi, this patch series adds extack support for the TC action subsystem. As example I for the extack support in a TC action I choosed mirred action. - Alex Cc: David Ahern changes since v3: - adapt recommended changes from Davide Caratti, please check if I catch everything. Thanks. changes since v2: - remove newline in extack of generic walker handling Thanks to Davide Caratti - add ker...@mojatatu.com in cc Alexander Aring (8): net: sched: act: fix code style net: sched: act: add extack to init net: sched: act: handle generic action errors net: sched: act: add extack to init callback net: sched: act: add extack for lookup callback net: sched: act: add extack for walk callback net: sched: act: handle extack in tcf_generic_walker net: sched: act: mirred: add extack support include/net/act_api.h | 17 -- net/sched/act_api.c| 135 + net/sched/act_bpf.c| 10 ++-- net/sched/act_connmark.c | 11 ++-- net/sched/act_csum.c | 10 ++-- net/sched/act_gact.c | 10 ++-- net/sched/act_ife.c| 10 ++-- net/sched/act_ipt.c| 20 --- net/sched/act_mirred.c | 25 ++--- net/sched/act_nat.c| 11 ++-- net/sched/act_pedit.c | 10 ++-- net/sched/act_police.c | 11 ++-- net/sched/act_sample.c | 10 ++-- net/sched/act_simple.c | 10 ++-- net/sched/act_skbedit.c| 10 ++-- net/sched/act_skbmod.c | 10 ++-- net/sched/act_tunnel_key.c | 10 ++-- net/sched/act_vlan.c | 10 ++-- net/sched/cls_api.c| 4 +- 19 files changed, 215 insertions(+), 129 deletions(-) -- 2.11.0
[PATCHv3 net-next 2/8] net: sched: act: add extack to init
This patch adds extack to tcf_action_init and tcf_action_init_1 functions. These are necessary to make individual extack handling in each act implementation. Based on work by David Ahern Cc: David Ahern Signed-off-by: Alexander Aring --- include/net/act_api.h | 5 +++-- net/sched/act_api.c | 17 +++-- net/sched/cls_api.c | 4 ++-- 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/include/net/act_api.h b/include/net/act_api.h index 32ef544f4ddc..41d95930ffbc 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -163,10 +163,11 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action **actions, int nr_actions, struct tcf_result *res); int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla, struct nlattr *est, char *name, int ovr, int bind, - struct list_head *actions); + struct list_head *actions, struct netlink_ext_ack *extack); struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, struct nlattr *nla, struct nlattr *est, - char *name, int ovr, int bind); + char *name, int ovr, int bind, + struct netlink_ext_ack *extack); int tcf_action_dump(struct sk_buff *skb, struct list_head *, int, int); int tcf_action_dump_old(struct sk_buff *skb, struct tc_action *a, int, int); int tcf_action_dump_1(struct sk_buff *skb, struct tc_action *a, int, int); diff --git a/net/sched/act_api.c b/net/sched/act_api.c index becc63689fae..8d89b026414f 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -605,7 +605,8 @@ static struct tc_cookie *nla_memdup_cookie(struct nlattr **tb) struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, struct nlattr *nla, struct nlattr *est, - char *name, int ovr, int bind) + char *name, int ovr, int bind, + struct netlink_ext_ack *extack) { struct tc_action *a; struct tc_action_ops *a_o; @@ -726,7 +727,7 @@ static void cleanup_a(struct list_head *actions, int ovr) int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla, struct nlattr *est, char *name, int ovr, int bind, - struct list_head *actions) + struct list_head *actions, struct netlink_ext_ack *extack) { struct nlattr *tb[TCA_ACT_MAX_PRIO + 1]; struct tc_action *act; @@ -738,7 +739,8 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla, return err; for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) { - act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind); + act = tcf_action_init_1(net, tp, tb[i], est, name, ovr, bind, + extack); if (IS_ERR(act)) { err = PTR_ERR(act); goto err; @@ -1060,12 +1062,14 @@ tcf_add_notify(struct net *net, struct nlmsghdr *n, struct list_head *actions, } static int tcf_action_add(struct net *net, struct nlattr *nla, - struct nlmsghdr *n, u32 portid, int ovr) + struct nlmsghdr *n, u32 portid, int ovr, + struct netlink_ext_ack *extack) { int ret = 0; LIST_HEAD(actions); - ret = tcf_action_init(net, NULL, nla, NULL, NULL, ovr, 0, &actions); + ret = tcf_action_init(net, NULL, nla, NULL, NULL, ovr, 0, &actions, + extack); if (ret) return ret; @@ -1113,7 +1117,8 @@ static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n, if (n->nlmsg_flags & NLM_F_REPLACE) ovr = 1; replay: - ret = tcf_action_add(net, tca[TCA_ACT_TAB], n, portid, ovr); + ret = tcf_action_add(net, tca[TCA_ACT_TAB], n, portid, ovr, +extack); if (ret == -EAGAIN) goto replay; break; diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 2bc1bc23d42e..f21610c5da1a 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -1434,7 +1434,7 @@ int tcf_exts_validate(struct net *net, struct tcf_proto *tp, struct nlattr **tb, if (exts->police && tb[exts->police]) { act = tcf_action_init_1(net, tp, tb[exts->police], rate_tlv, "police", ovr, - TCA_ACT_BIND); + TCA_ACT_BIND, extack); if (IS_ERR(act)) return PTR_ERR(act);
[PATCHv3 net-next 6/8] net: sched: act: add extack for walk callback
This patch adds extack support for act walker callback api. This prepares to handle extack support inside each specific act implementation. Cc: David Ahern Signed-off-by: Alexander Aring --- include/net/act_api.h | 3 ++- net/sched/act_api.c| 4 ++-- net/sched/act_bpf.c| 3 ++- net/sched/act_connmark.c | 3 ++- net/sched/act_csum.c | 3 ++- net/sched/act_gact.c | 3 ++- net/sched/act_ife.c| 3 ++- net/sched/act_ipt.c| 6 -- net/sched/act_mirred.c | 3 ++- net/sched/act_nat.c| 3 ++- net/sched/act_pedit.c | 3 ++- net/sched/act_police.c | 3 ++- net/sched/act_sample.c | 3 ++- net/sched/act_simple.c | 3 ++- net/sched/act_skbedit.c| 3 ++- net/sched/act_skbmod.c | 3 ++- net/sched/act_tunnel_key.c | 3 ++- net/sched/act_vlan.c | 3 ++- 18 files changed, 38 insertions(+), 20 deletions(-) diff --git a/include/net/act_api.h b/include/net/act_api.h index 0bd65db506ba..ab3529255377 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -94,7 +94,8 @@ struct tc_action_ops { int bind, struct netlink_ext_ack *extack); int (*walk)(struct net *, struct sk_buff *, struct netlink_callback *, int, - const struct tc_action_ops *); + const struct tc_action_ops *, + struct netlink_ext_ack *); void(*stats_update)(struct tc_action *, u64, u32, u64); struct net_device *(*get_dev)(const struct tc_action *a); }; diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 02b3c99da786..18df1e956e6a 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -963,7 +963,7 @@ static int tca_action_flush(struct net *net, struct nlattr *nla, goto out_module_put; } - err = ops->walk(net, skb, &dcb, RTM_DELACTION, ops); + err = ops->walk(net, skb, &dcb, RTM_DELACTION, ops, extack); if (err <= 0) goto out_module_put; @@ -1253,7 +1253,7 @@ static int tc_dump_action(struct sk_buff *skb, struct netlink_callback *cb) if (nest == NULL) goto out_module_put; - ret = a_o->walk(net, skb, cb, RTM_GETACTION, a_o); + ret = a_o->walk(net, skb, cb, RTM_GETACTION, a_o, NULL); if (ret < 0) goto out_module_put; diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c index d9654b863347..7e01e2c710c4 100644 --- a/net/sched/act_bpf.c +++ b/net/sched/act_bpf.c @@ -367,7 +367,8 @@ static void tcf_bpf_cleanup(struct tc_action *act) static int tcf_bpf_walker(struct net *net, struct sk_buff *skb, struct netlink_callback *cb, int type, - const struct tc_action_ops *ops) + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, bpf_net_id); diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c index 0504b7600fb6..cb722da0bb15 100644 --- a/net/sched/act_connmark.c +++ b/net/sched/act_connmark.c @@ -177,7 +177,8 @@ static inline int tcf_connmark_dump(struct sk_buff *skb, struct tc_action *a, static int tcf_connmark_walker(struct net *net, struct sk_buff *skb, struct netlink_callback *cb, int type, - const struct tc_action_ops *ops) + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, connmark_net_id); diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c index bdd17b9ef034..3e8efadb750f 100644 --- a/net/sched/act_csum.c +++ b/net/sched/act_csum.c @@ -631,7 +631,8 @@ static void tcf_csum_cleanup(struct tc_action *a) static int tcf_csum_walker(struct net *net, struct sk_buff *skb, struct netlink_callback *cb, int type, - const struct tc_action_ops *ops) + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, csum_net_id); diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index e1e69e38f4b0..d96ebe4bb65a 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -201,7 +201,8 @@ static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a, static int tcf_gact_walker(struct net *net, struct sk_buff *skb, struct netlink_callback *cb, int type, - const struct tc_action_ops *ops) + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, gact_net_id); diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c index 0b70fb0cc609..b777e381e0dd 100644 --- a/net/sc
[PATCHv3 net-next 8/8] net: sched: act: mirred: add extack support
This patch adds extack support for TC mirred action. Cc: David Ahern Signed-off-by: Alexander Aring --- net/sched/act_mirred.c | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index 05c2ebe92eca..fd34015331ab 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -80,13 +80,17 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla, bool exists = false; int ret; - if (!nla) + if (!nla) { + NL_SET_ERR_MSG_MOD(extack, "Mirred requires attributes to be passed"); return -EINVAL; - ret = nla_parse_nested(tb, TCA_MIRRED_MAX, nla, mirred_policy, NULL); + } + ret = nla_parse_nested(tb, TCA_MIRRED_MAX, nla, mirred_policy, extack); if (ret < 0) return ret; - if (!tb[TCA_MIRRED_PARMS]) + if (!tb[TCA_MIRRED_PARMS]) { + NL_SET_ERR_MSG_MOD(extack, "Missing required mirred parameters"); return -EINVAL; + } parm = nla_data(tb[TCA_MIRRED_PARMS]); exists = tcf_idr_check(tn, parm->index, a, bind); @@ -102,6 +106,7 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla, default: if (exists) tcf_idr_release(*a, bind); + NL_SET_ERR_MSG_MOD(extack, "Unknown mirred option"); return -EINVAL; } if (parm->ifindex) { @@ -117,8 +122,10 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla, } if (!exists) { - if (!dev) + if (!dev) { + NL_SET_ERR_MSG_MOD(extack, "Specified device does not exist"); return -EINVAL; + } ret = tcf_idr_create(tn, parm->index, est, a, &act_mirred_ops, bind, true); if (ret) -- 2.11.0
[PATCHv3 net-next 4/8] net: sched: act: add extack to init callback
This patch adds extack support for act init callback api. This prepares to handle extack support inside each specific act implementation. Based on work by David Ahern Cc: David Ahern Signed-off-by: Alexander Aring --- include/net/act_api.h | 2 +- net/sched/act_api.c| 5 +++-- net/sched/act_bpf.c| 2 +- net/sched/act_connmark.c | 3 ++- net/sched/act_csum.c | 2 +- net/sched/act_gact.c | 2 +- net/sched/act_ife.c| 2 +- net/sched/act_ipt.c| 4 ++-- net/sched/act_mirred.c | 2 +- net/sched/act_nat.c| 3 ++- net/sched/act_pedit.c | 2 +- net/sched/act_police.c | 3 ++- net/sched/act_sample.c | 2 +- net/sched/act_simple.c | 2 +- net/sched/act_skbedit.c| 2 +- net/sched/act_skbmod.c | 2 +- net/sched/act_tunnel_key.c | 2 +- net/sched/act_vlan.c | 2 +- 18 files changed, 24 insertions(+), 20 deletions(-) diff --git a/include/net/act_api.h b/include/net/act_api.h index 41d95930ffbc..3717e0f2bb1b 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -90,7 +90,7 @@ struct tc_action_ops { int (*lookup)(struct net *net, struct tc_action **a, u32 index); int (*init)(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **act, int ovr, - int bind); + int bind, struct netlink_ext_ack *extack); int (*walk)(struct net *, struct sk_buff *, struct netlink_callback *, int, const struct tc_action_ops *); diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 5d62f1d123ee..1f39f7800d7c 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -680,9 +680,10 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, /* backward compatibility for policer */ if (name == NULL) - err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, ovr, bind); + err = a_o->init(net, tb[TCA_ACT_OPTIONS], est, &a, ovr, bind, + extack); else - err = a_o->init(net, nla, est, &a, ovr, bind); + err = a_o->init(net, nla, est, &a, ovr, bind, extack); if (err < 0) goto err_mod; diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c index b3f2c15affa7..b3ebfa9598e2 100644 --- a/net/sched/act_bpf.c +++ b/net/sched/act_bpf.c @@ -272,7 +272,7 @@ static void tcf_bpf_prog_fill_cfg(const struct tcf_bpf *prog, static int tcf_bpf_init(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **act, - int replace, int bind) + int replace, int bind, struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, bpf_net_id); struct nlattr *tb[TCA_ACT_BPF_MAX + 1]; diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c index 2b15ba84e0c8..20e0215360b5 100644 --- a/net/sched/act_connmark.c +++ b/net/sched/act_connmark.c @@ -96,7 +96,8 @@ static const struct nla_policy connmark_policy[TCA_CONNMARK_MAX + 1] = { static int tcf_connmark_init(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **a, -int ovr, int bind) +int ovr, int bind, +struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, connmark_net_id); struct nlattr *tb[TCA_CONNMARK_MAX + 1]; diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c index b7ba9b06b147..3b8c48bb2683 100644 --- a/net/sched/act_csum.c +++ b/net/sched/act_csum.c @@ -46,7 +46,7 @@ static struct tc_action_ops act_csum_ops; static int tcf_csum_init(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **a, int ovr, -int bind) +int bind, struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, csum_net_id); struct tcf_csum_params *params_old, *params_new; diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index b56986d41c87..912f3398f1c1 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -56,7 +56,7 @@ static const struct nla_policy gact_policy[TCA_GACT_MAX + 1] = { static int tcf_gact_init(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **a, -int ovr, int bind) +int ovr, int bind, struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, gact_net_id); struct nlattr *tb[TCA_GACT_MAX + 1]; diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c index 5954e992685a..e5127d400737 100644 --- a/net/sched/act_ife.c +++ b/net/sched/act_ife.c @@ -447,7 +447,7 @@ static int populate_metalist(struct tcf_ife_info *ife,
[PATCHv3 net-next 7/8] net: sched: act: handle extack in tcf_generic_walker
This patch adds extack handling for a common used TC act function "tcf_generic_walker()" to add an extack message on failures. The tcf_generic_walker() function can fail if get a invalid command different than DEL and GET. The naming "action" here is wrong, the correct naming would be command. Cc: David Ahern Signed-off-by: Alexander Aring --- include/net/act_api.h | 3 ++- net/sched/act_api.c| 6 -- net/sched/act_bpf.c| 2 +- net/sched/act_connmark.c | 2 +- net/sched/act_csum.c | 2 +- net/sched/act_gact.c | 2 +- net/sched/act_ife.c| 2 +- net/sched/act_ipt.c| 4 ++-- net/sched/act_mirred.c | 2 +- net/sched/act_nat.c| 2 +- net/sched/act_pedit.c | 2 +- net/sched/act_police.c | 2 +- net/sched/act_sample.c | 2 +- net/sched/act_simple.c | 2 +- net/sched/act_skbedit.c| 2 +- net/sched/act_skbmod.c | 2 +- net/sched/act_tunnel_key.c | 2 +- net/sched/act_vlan.c | 2 +- 18 files changed, 23 insertions(+), 20 deletions(-) diff --git a/include/net/act_api.h b/include/net/act_api.h index ab3529255377..9c2f22695025 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -140,7 +140,8 @@ static inline void tc_action_net_exit(struct list_head *net_list, int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb, struct netlink_callback *cb, int type, - const struct tc_action_ops *ops); + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack); int tcf_idr_search(struct tc_action_net *tn, struct tc_action **a, u32 index); bool tcf_idr_check(struct tc_action_net *tn, u32 index, struct tc_action **a, int bind); diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 18df1e956e6a..ce18cd9c2860 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -202,7 +202,8 @@ static int tcf_del_walker(struct tcf_idrinfo *idrinfo, struct sk_buff *skb, int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb, struct netlink_callback *cb, int type, - const struct tc_action_ops *ops) + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack) { struct tcf_idrinfo *idrinfo = tn->idrinfo; @@ -211,7 +212,8 @@ int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb, } else if (type == RTM_GETACTION) { return tcf_dump_walker(idrinfo, skb, cb); } else { - WARN(1, "tcf_generic_walker: unknown action %d\n", type); + WARN(1, "tcf_generic_walker: unknown command %d\n", type); + NL_SET_ERR_MSG(extack, "tcf_generic_walker: unknown command"); return -EINVAL; } } diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c index 7e01e2c710c4..cb3c5d403c88 100644 --- a/net/sched/act_bpf.c +++ b/net/sched/act_bpf.c @@ -372,7 +372,7 @@ static int tcf_bpf_walker(struct net *net, struct sk_buff *skb, { struct tc_action_net *tn = net_generic(net, bpf_net_id); - return tcf_generic_walker(tn, skb, cb, type, ops); + return tcf_generic_walker(tn, skb, cb, type, ops, extack); } static int tcf_bpf_search(struct net *net, struct tc_action **a, u32 index, diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c index cb722da0bb15..e4b880fa51fe 100644 --- a/net/sched/act_connmark.c +++ b/net/sched/act_connmark.c @@ -182,7 +182,7 @@ static int tcf_connmark_walker(struct net *net, struct sk_buff *skb, { struct tc_action_net *tn = net_generic(net, connmark_net_id); - return tcf_generic_walker(tn, skb, cb, type, ops); + return tcf_generic_walker(tn, skb, cb, type, ops, extack); } static int tcf_connmark_search(struct net *net, struct tc_action **a, u32 index, diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c index 3e8efadb750f..d5c2e528d150 100644 --- a/net/sched/act_csum.c +++ b/net/sched/act_csum.c @@ -636,7 +636,7 @@ static int tcf_csum_walker(struct net *net, struct sk_buff *skb, { struct tc_action_net *tn = net_generic(net, csum_net_id); - return tcf_generic_walker(tn, skb, cb, type, ops); + return tcf_generic_walker(tn, skb, cb, type, ops, extack); } static int tcf_csum_search(struct net *net, struct tc_action **a, u32 index, diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index d96ebe4bb65a..f072bcf33760 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -206,7 +206,7 @@ static int tcf_gact_walker(struct net *net, struct sk_buff *skb, { struct tc_action_net *tn = net_generic(net, gact_net_id); - return tcf_generic_walker(tn, skb, cb, type, ops); + return tcf_generic_walker(tn, skb, cb, type, ops, extack); } static int tcf_gact_search(struct net *net, struct tc_action **a, u32 index, diff --git a/net/sched/act_ife.c b/net/sched/act
[PATCHv3 net-next 5/8] net: sched: act: add extack for lookup callback
This patch adds extack support for act lookup callback api. This prepares to handle extack support inside each specific act implementation. Cc: David Ahern Signed-off-by: Alexander Aring --- include/net/act_api.h | 3 ++- net/sched/act_api.c| 2 +- net/sched/act_bpf.c| 3 ++- net/sched/act_connmark.c | 3 ++- net/sched/act_csum.c | 3 ++- net/sched/act_gact.c | 3 ++- net/sched/act_ife.c| 3 ++- net/sched/act_ipt.c| 6 -- net/sched/act_mirred.c | 3 ++- net/sched/act_nat.c| 3 ++- net/sched/act_pedit.c | 3 ++- net/sched/act_police.c | 3 ++- net/sched/act_sample.c | 3 ++- net/sched/act_simple.c | 3 ++- net/sched/act_skbedit.c| 3 ++- net/sched/act_skbmod.c | 3 ++- net/sched/act_tunnel_key.c | 3 ++- net/sched/act_vlan.c | 3 ++- 18 files changed, 37 insertions(+), 19 deletions(-) diff --git a/include/net/act_api.h b/include/net/act_api.h index 3717e0f2bb1b..0bd65db506ba 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -87,7 +87,8 @@ struct tc_action_ops { struct tcf_result *); int (*dump)(struct sk_buff *, struct tc_action *, int, int); void(*cleanup)(struct tc_action *); - int (*lookup)(struct net *net, struct tc_action **a, u32 index); + int (*lookup)(struct net *net, struct tc_action **a, u32 index, + struct netlink_ext_ack *extack); int (*init)(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **act, int ovr, int bind, struct netlink_ext_ack *extack); diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 1f39f7800d7c..02b3c99da786 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -901,7 +901,7 @@ static struct tc_action *tcf_action_get_1(struct net *net, struct nlattr *nla, goto err_out; } err = -ENOENT; - if (ops->lookup(net, &a, index) == 0) + if (ops->lookup(net, &a, index, extack) == 0) goto err_mod; module_put(ops->owner); diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c index b3ebfa9598e2..d9654b863347 100644 --- a/net/sched/act_bpf.c +++ b/net/sched/act_bpf.c @@ -374,7 +374,8 @@ static int tcf_bpf_walker(struct net *net, struct sk_buff *skb, return tcf_generic_walker(tn, skb, cb, type, ops); } -static int tcf_bpf_search(struct net *net, struct tc_action **a, u32 index) +static int tcf_bpf_search(struct net *net, struct tc_action **a, u32 index, + struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, bpf_net_id); diff --git a/net/sched/act_connmark.c b/net/sched/act_connmark.c index 20e0215360b5..0504b7600fb6 100644 --- a/net/sched/act_connmark.c +++ b/net/sched/act_connmark.c @@ -184,7 +184,8 @@ static int tcf_connmark_walker(struct net *net, struct sk_buff *skb, return tcf_generic_walker(tn, skb, cb, type, ops); } -static int tcf_connmark_search(struct net *net, struct tc_action **a, u32 index) +static int tcf_connmark_search(struct net *net, struct tc_action **a, u32 index, + struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, connmark_net_id); diff --git a/net/sched/act_csum.c b/net/sched/act_csum.c index 3b8c48bb2683..bdd17b9ef034 100644 --- a/net/sched/act_csum.c +++ b/net/sched/act_csum.c @@ -638,7 +638,8 @@ static int tcf_csum_walker(struct net *net, struct sk_buff *skb, return tcf_generic_walker(tn, skb, cb, type, ops); } -static int tcf_csum_search(struct net *net, struct tc_action **a, u32 index) +static int tcf_csum_search(struct net *net, struct tc_action **a, u32 index, + struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, csum_net_id); diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index 912f3398f1c1..e1e69e38f4b0 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -208,7 +208,8 @@ static int tcf_gact_walker(struct net *net, struct sk_buff *skb, return tcf_generic_walker(tn, skb, cb, type, ops); } -static int tcf_gact_search(struct net *net, struct tc_action **a, u32 index) +static int tcf_gact_search(struct net *net, struct tc_action **a, u32 index, + struct netlink_ext_ack *extack) { struct tc_action_net *tn = net_generic(net, gact_net_id); diff --git a/net/sched/act_ife.c b/net/sched/act_ife.c index e5127d400737..0b70fb0cc609 100644 --- a/net/sched/act_ife.c +++ b/net/sched/act_ife.c @@ -831,7 +831,8 @@ static int tcf_ife_walker(struct net *net, struct sk_buff *skb, return tcf_generic_walker(tn, skb, cb, type, ops); } -static int tcf_ife_search(struct net *net, struct tc_action **a, u32 index) +static int tcf_ife_search(struct net *net, struct tc_action **a, u32 index, +
[PATCHv3 net-next 1/8] net: sched: act: fix code style
This patch is used by subsequent patches. It fixes code style issues caught by checkpatch. Signed-off-by: Alexander Aring --- include/net/act_api.h | 5 +++-- net/sched/act_api.c| 12 ++-- net/sched/act_mirred.c | 6 +++--- 3 files changed, 12 insertions(+), 11 deletions(-) diff --git a/include/net/act_api.h b/include/net/act_api.h index 6ed9692f20bd..32ef544f4ddc 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -87,12 +87,13 @@ struct tc_action_ops { struct tcf_result *); int (*dump)(struct sk_buff *, struct tc_action *, int, int); void(*cleanup)(struct tc_action *); - int (*lookup)(struct net *, struct tc_action **, u32); + int (*lookup)(struct net *net, struct tc_action **a, u32 index); int (*init)(struct net *net, struct nlattr *nla, struct nlattr *est, struct tc_action **act, int ovr, int bind); int (*walk)(struct net *, struct sk_buff *, - struct netlink_callback *, int, const struct tc_action_ops *); + struct netlink_callback *, int, + const struct tc_action_ops *); void(*stats_update)(struct tc_action *, u64, u32, u64); struct net_device *(*get_dev)(const struct tc_action *a); }; diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 4886ea4a7d6e..becc63689fae 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -621,7 +621,7 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, goto err_out; err = -EINVAL; kind = tb[TCA_ACT_KIND]; - if (kind == NULL) + if (!kind) goto err_out; if (nla_strlcpy(act_name, kind, IFNAMSIZ) >= IFNAMSIZ) goto err_out; @@ -822,7 +822,7 @@ static int tca_get_fill(struct sk_buff *skb, struct list_head *actions, t->tca__pad2 = 0; nest = nla_nest_start(skb, TCA_ACT_TAB); - if (nest == NULL) + if (!nest) goto out_nlmsg_trim; if (tcf_action_dump(skb, actions, bind, ref) < 0) @@ -934,7 +934,7 @@ static int tca_action_flush(struct net *net, struct nlattr *nla, t->tca__pad2 = 0; nest = nla_nest_start(skb, TCA_ACT_TAB); - if (nest == NULL) + if (!nest) goto out_module_put; err = ops->walk(net, skb, &dcb, RTM_DELACTION, ops); @@ -1005,10 +1005,10 @@ tca_action_gd(struct net *net, struct nlattr *nla, struct nlmsghdr *n, return ret; if (event == RTM_DELACTION && n->nlmsg_flags & NLM_F_ROOT) { - if (tb[1] != NULL) + if (tb[1]) return tca_action_flush(net, tb[1], n, portid); - else - return -EINVAL; + + return -EINVAL; } for (i = 1; i <= TCA_ACT_MAX_PRIO && tb[i]; i++) { diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index e6ff88f72900..abcd5f12b913 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -80,12 +80,12 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla, bool exists = false; int ret; - if (nla == NULL) + if (!nla) return -EINVAL; ret = nla_parse_nested(tb, TCA_MIRRED_MAX, nla, mirred_policy, NULL); if (ret < 0) return ret; - if (tb[TCA_MIRRED_PARMS] == NULL) + if (!tb[TCA_MIRRED_PARMS]) return -EINVAL; parm = nla_data(tb[TCA_MIRRED_PARMS]); @@ -117,7 +117,7 @@ static int tcf_mirred_init(struct net *net, struct nlattr *nla, } if (!exists) { - if (dev == NULL) + if (!dev) return -EINVAL; ret = tcf_idr_create(tn, parm->index, est, a, &act_mirred_ops, bind, true); -- 2.11.0
[PATCHv3 net-next 3/8] net: sched: act: handle generic action errors
This patch adds extack support for generic act handling. The extack will be set deeper to each called function which is not part of netdev core api. Based on work by David Ahern Cc: David Ahern Signed-off-by: Alexander Aring --- net/sched/act_api.c | 93 +++-- 1 file changed, 61 insertions(+), 32 deletions(-) diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 8d89b026414f..5d62f1d123ee 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -617,31 +617,40 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, int err; if (name == NULL) { - err = nla_parse_nested(tb, TCA_ACT_MAX, nla, NULL, NULL); + err = nla_parse_nested(tb, TCA_ACT_MAX, nla, NULL, extack); if (err < 0) goto err_out; err = -EINVAL; kind = tb[TCA_ACT_KIND]; - if (!kind) + if (!kind) { + NL_SET_ERR_MSG(extack, "TC action kind must be specified"); goto err_out; - if (nla_strlcpy(act_name, kind, IFNAMSIZ) >= IFNAMSIZ) + } + if (nla_strlcpy(act_name, kind, IFNAMSIZ) >= IFNAMSIZ) { + NL_SET_ERR_MSG(extack, "TC action name too long"); goto err_out; + } if (tb[TCA_ACT_COOKIE]) { int cklen = nla_len(tb[TCA_ACT_COOKIE]); - if (cklen > TC_COOKIE_MAX_SIZE) + if (cklen > TC_COOKIE_MAX_SIZE) { + NL_SET_ERR_MSG(extack, "TC cookie size above the maximum"); goto err_out; + } cookie = nla_memdup_cookie(tb); if (!cookie) { + NL_SET_ERR_MSG(extack, "No memory to generate TC cookie"); err = -ENOMEM; goto err_out; } } } else { - err = -EINVAL; - if (strlcpy(act_name, name, IFNAMSIZ) >= IFNAMSIZ) + if (strlcpy(act_name, name, IFNAMSIZ) >= IFNAMSIZ) { + NL_SET_ERR_MSG(extack, "TC action name too long"); + err = -EINVAL; goto err_out; + } } a_o = tc_lookup_action_n(act_name); @@ -664,6 +673,7 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, goto err_mod; } #endif + NL_SET_ERR_MSG(extack, "Failed to load TC action module"); err = -ENOENT; goto err_out; } @@ -698,6 +708,7 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp, list_add_tail(&a->list, &actions); tcf_action_destroy(&actions, bind); + NL_SET_ERR_MSG(extack, "Failed to init TC action chain"); return ERR_PTR(err); } } @@ -734,7 +745,7 @@ int tcf_action_init(struct net *net, struct tcf_proto *tp, struct nlattr *nla, int err; int i; - err = nla_parse_nested(tb, TCA_ACT_MAX_PRIO, nla, NULL, NULL); + err = nla_parse_nested(tb, TCA_ACT_MAX_PRIO, nla, NULL, extack); if (err < 0) return err; @@ -842,7 +853,8 @@ static int tca_get_fill(struct sk_buff *skb, struct list_head *actions, static int tcf_get_notify(struct net *net, u32 portid, struct nlmsghdr *n, - struct list_head *actions, int event) + struct list_head *actions, int event, + struct netlink_ext_ack *extack) { struct sk_buff *skb; @@ -851,6 +863,7 @@ tcf_get_notify(struct net *net, u32 portid, struct nlmsghdr *n, return -ENOBUFS; if (tca_get_fill(skb, actions, portid, n->nlmsg_seq, 0, event, 0, 0) <= 0) { + NL_SET_ERR_MSG(extack, "Failed to fill netlink attributes while adding TC action"); kfree_skb(skb); return -EINVAL; } @@ -859,7 +872,8 @@ tcf_get_notify(struct net *net, u32 portid, struct nlmsghdr *n, } static struct tc_action *tcf_action_get_1(struct net *net, struct nlattr *nla, - struct nlmsghdr *n, u32 portid) + struct nlmsghdr *n, u32 portid, + struct netlink_ext_ack *extack) { struct nlattr *tb[TCA_ACT_MAX + 1]; const struct tc_action_ops *ops; @@ -867,20 +881,24 @@ static struct tc_action *tcf_action_get_1(struct net *net, struct nlattr *nla, int index; int err; - err = nla_parse_nested(tb, TCA_ACT_MAX, nla, NULL, NULL);
Re: ppp/pppoe, still panic 4.15.3 in ppp_push
On Thu, Feb 15, 2018 at 12:19:52PM +0200, Denys Fedoryshchenko wrote: > Here we go: > > [24558.921549] > == > [24558.922167] BUG: KASAN: use-after-free in ppp_ioctl+0xa6a/0x1522 > [ppp_generic] > [24558.922776] Write of size 8 at addr 8803d35bf3f8 by task > accel-pppd/12622 > [24558.923113] > [24558.923451] CPU: 0 PID: 12622 Comm: accel-pppd Tainted: GW > 4.15.3-build-0134 #1 > [24558.924058] Hardware name: HP ProLiant DL320e Gen8 v2, BIOS P80 > 04/02/2015 > [24558.924406] Call Trace: > [24558.924753] dump_stack+0x46/0x59 > [24558.925103] print_address_description+0x6b/0x23b > [24558.925451] ? ppp_ioctl+0xa6a/0x1522 [ppp_generic] > [24558.925797] kasan_report+0x21b/0x241 > [24558.926136] ppp_ioctl+0xa6a/0x1522 [ppp_generic] > [24558.926479] ? ppp_nl_newlink+0x1da/0x1da [ppp_generic] > [24558.926829] ? sock_sendmsg+0x89/0x99 > [24558.927176] ? __vfs_write+0xd9/0x4ad > [24558.927523] ? kernel_read+0xed/0xed > [24558.927872] ? SyS_getpeername+0x18c/0x18c > [24558.928213] ? bit_waitqueue+0x2a/0x2a > [24558.928561] ? wake_atomic_t_function+0x115/0x115 > [24558.928898] vfs_ioctl+0x6e/0x81 > [24558.929228] do_vfs_ioctl+0xa00/0xb10 > [24558.929571] ? sigprocmask+0x1a6/0x1d0 > [24558.929907] ? sigsuspend+0x13e/0x13e > [24558.930239] ? ioctl_preallocate+0x14e/0x14e > [24558.930568] ? SyS_rt_sigprocmask+0xf1/0x142 > [24558.930904] ? sigprocmask+0x1d0/0x1d0 > [24558.931252] SyS_ioctl+0x39/0x55 > [24558.931595] ? do_vfs_ioctl+0xb10/0xb10 > [24558.931942] do_syscall_64+0x1b1/0x31f > [24558.932288] entry_SYSCALL_64_after_hwframe+0x21/0x86 > [24558.932627] RIP: 0033:0x7f302849d8a7 > [24558.932965] RSP: 002b:7f3029a52af8 EFLAGS: 0206 ORIG_RAX: > 0010 > [24558.933578] RAX: ffda RBX: 7f3027d861e3 RCX: > 7f302849d8a7 > [24558.933927] RDX: 7f3023f49468 RSI: 4004743a RDI: > 3a67 > [24558.934266] RBP: 7f3029a52b20 R08: R09: > 55c8308d8e40 > [24558.934607] R10: 0008 R11: 0206 R12: > 7f3023f49358 > [24558.934947] R13: 7ffe86e5723f R14: R15: > 7f3029a53700 > [24558.935288] > [24558.935626] Allocated by task 12622: > [24558.935972] ppp_register_net_channel+0x5f/0x5c6 [ppp_generic] > [24558.936306] pppoe_connect+0xab7/0xc71 [pppoe] > [24558.936640] SyS_connect+0x14b/0x1b7 > [24558.936975] do_syscall_64+0x1b1/0x31f > [24558.937319] entry_SYSCALL_64_after_hwframe+0x21/0x86 > [24558.937655] > [24558.937993] Freed by task 12622: > [24558.938321] kfree+0xb0/0x11d > [24558.938658] ppp_release+0x111/0x120 [ppp_generic] > [24558.938994] __fput+0x2ba/0x51a > [24558.939332] task_work_run+0x11c/0x13d > [24558.939676] exit_to_usermode_loop+0x7c/0xaf > [24558.940022] do_syscall_64+0x2ea/0x31f > [24558.940368] entry_SYSCALL_64_after_hwframe+0x21/0x86 > [24558.947099] Your first guess was right. It looks like we have an issue with reference counting on the channels. Can you send me your ppp_generic.o?
Re: [PATCHv3 net-next 0/8] net: sched: act: add extack support
On 18-02-15 10:54 AM, Alexander Aring wrote: Hi, this patch series adds extack support for the TC action subsystem. As example I for the extack support in a TC action I choosed mirred action. For the patch series: Acked-by: Jamal Hadi Salim cheers, jamal
Re: ppp/pppoe, still panic 4.15.3 in ppp_push
On 2018-02-15 17:55, Guillaume Nault wrote: On Thu, Feb 15, 2018 at 12:19:52PM +0200, Denys Fedoryshchenko wrote: Here we go: [24558.921549] == [24558.922167] BUG: KASAN: use-after-free in ppp_ioctl+0xa6a/0x1522 [ppp_generic] [24558.922776] Write of size 8 at addr 8803d35bf3f8 by task accel-pppd/12622 [24558.923113] [24558.923451] CPU: 0 PID: 12622 Comm: accel-pppd Tainted: G W 4.15.3-build-0134 #1 [24558.924058] Hardware name: HP ProLiant DL320e Gen8 v2, BIOS P80 04/02/2015 [24558.924406] Call Trace: [24558.924753] dump_stack+0x46/0x59 [24558.925103] print_address_description+0x6b/0x23b [24558.925451] ? ppp_ioctl+0xa6a/0x1522 [ppp_generic] [24558.925797] kasan_report+0x21b/0x241 [24558.926136] ppp_ioctl+0xa6a/0x1522 [ppp_generic] [24558.926479] ? ppp_nl_newlink+0x1da/0x1da [ppp_generic] [24558.926829] ? sock_sendmsg+0x89/0x99 [24558.927176] ? __vfs_write+0xd9/0x4ad [24558.927523] ? kernel_read+0xed/0xed [24558.927872] ? SyS_getpeername+0x18c/0x18c [24558.928213] ? bit_waitqueue+0x2a/0x2a [24558.928561] ? wake_atomic_t_function+0x115/0x115 [24558.928898] vfs_ioctl+0x6e/0x81 [24558.929228] do_vfs_ioctl+0xa00/0xb10 [24558.929571] ? sigprocmask+0x1a6/0x1d0 [24558.929907] ? sigsuspend+0x13e/0x13e [24558.930239] ? ioctl_preallocate+0x14e/0x14e [24558.930568] ? SyS_rt_sigprocmask+0xf1/0x142 [24558.930904] ? sigprocmask+0x1d0/0x1d0 [24558.931252] SyS_ioctl+0x39/0x55 [24558.931595] ? do_vfs_ioctl+0xb10/0xb10 [24558.931942] do_syscall_64+0x1b1/0x31f [24558.932288] entry_SYSCALL_64_after_hwframe+0x21/0x86 [24558.932627] RIP: 0033:0x7f302849d8a7 [24558.932965] RSP: 002b:7f3029a52af8 EFLAGS: 0206 ORIG_RAX: 0010 [24558.933578] RAX: ffda RBX: 7f3027d861e3 RCX: 7f302849d8a7 [24558.933927] RDX: 7f3023f49468 RSI: 4004743a RDI: 3a67 [24558.934266] RBP: 7f3029a52b20 R08: R09: 55c8308d8e40 [24558.934607] R10: 0008 R11: 0206 R12: 7f3023f49358 [24558.934947] R13: 7ffe86e5723f R14: R15: 7f3029a53700 [24558.935288] [24558.935626] Allocated by task 12622: [24558.935972] ppp_register_net_channel+0x5f/0x5c6 [ppp_generic] [24558.936306] pppoe_connect+0xab7/0xc71 [pppoe] [24558.936640] SyS_connect+0x14b/0x1b7 [24558.936975] do_syscall_64+0x1b1/0x31f [24558.937319] entry_SYSCALL_64_after_hwframe+0x21/0x86 [24558.937655] [24558.937993] Freed by task 12622: [24558.938321] kfree+0xb0/0x11d [24558.938658] ppp_release+0x111/0x120 [ppp_generic] [24558.938994] __fput+0x2ba/0x51a [24558.939332] task_work_run+0x11c/0x13d [24558.939676] exit_to_usermode_loop+0x7c/0xaf [24558.940022] do_syscall_64+0x2ea/0x31f [24558.940368] entry_SYSCALL_64_after_hwframe+0x21/0x86 [24558.947099] Your first guess was right. It looks like we have an issue with reference counting on the channels. Can you send me your ppp_generic.o? http://nuclearcat.com/ppp_generic.o Compiled with gcc version 6.4.0 (Gentoo 6.4.0-r1 p1.3)
Re: [PATCH net-next 0/3] eBPF Seccomp filters
> On Feb 14, 2018, at 8:30 PM, Alexei Starovoitov > wrote: > > On Wed, Feb 14, 2018 at 10:32:22AM -0700, Tycho Andersen wrote: What's the reason for adding eBPF support? seccomp shouldn't need it, and it only makes the code more complex. I'd rather stick with cBPF until we have an overwhelmingly good reason to use eBPF as a "native" seccomp filter language. >>> >>> I can think of two fairly strong use cases for eBPF's ability to call >>> functions: logging and Tycho's user notifier thing. >> >> Worth noting that there is one additional thing that I didn't >> implement, but which would be nice and is probably not possible with >> eBPF (at least, not without a bunch of additional infrastructure): >> passing fds back to the tracee from the manager if you intercept >> socket(), or accept() or something. >> >> This could again be accomplished via other means, though it would be a >> lot nicer to have a primitive for it. > > there is bpf_perf_event_output() interface that allows to stream > arbitrary data from kernel into user space via perf ring buffer. > User space can epoll on it. We use this in both tracing and networking > for notifications and streaming data transfers. > I suspect this can be used for 'logging' too, since it's cheap and fast. I think this is the right idea but we'd want to tweak it. We don't want the log messages to go to some systemwide buffer (seccomp can already so this and its annoying) -- we want them to go to the filter's creator. In fact, the seccomp listener fd concept could easily be extended to do exactly this. > > Also I think the argument that seccomp+eBPF will be faster than > seccomp+cBPF is a weak one. I bet kpti on/off makes no difference > under seccomp, since _all_ syscalls are already slow for sandboxed app. It's been a while since I benchmarked it, but I suspect that a simple seccomp filter is quite a bit faster than a PTI transition.
[PATCH net] dn_getsockoptdecnet: move nf_{get/set}sockopt outside sock lock
After commit 3f34cfae1238 ("netfilter: on sockopt() acquire sock lock only in the required scope"), the caller of nf_{get/set}sockopt() must not hold any lock, but, in such changeset, I forgot to cope with DECnet. This commit addresses the issue moving the nf call outside the lock, in the dn_{get,set}sockopt() with the same schema currently used by ipv4 and ipv6. Also moves the unhandled sockopts of the end of the main switch statements, to improve code readability. Reported-by: Petr Vandrovec BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198791#c2 Fixes: 3f34cfae1238 ("netfilter: on sockopt() acquire sock lock only in the required scope") Signed-off-by: Paolo Abeni --- net/decnet/af_decnet.c | 62 +++--- 1 file changed, 33 insertions(+), 29 deletions(-) diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c index 91dd09f79808..791aff68af88 100644 --- a/net/decnet/af_decnet.c +++ b/net/decnet/af_decnet.c @@ -1338,6 +1338,12 @@ static int dn_setsockopt(struct socket *sock, int level, int optname, char __use lock_sock(sk); err = __dn_setsockopt(sock, level, optname, optval, optlen, 0); release_sock(sk); +#ifdef CONFIG_NETFILTER + /* we need to exclude all possible ENOPROTOOPTs except default case */ + if (err == -ENOPROTOOPT && optname != DSO_LINKINFO && + optname != DSO_STREAM && optname != DSO_SEQPACKET) + err = nf_setsockopt(sk, PF_DECnet, optname, optval, optlen); +#endif return err; } @@ -1445,15 +1451,6 @@ static int __dn_setsockopt(struct socket *sock, int level,int optname, char __us dn_nsp_send_disc(sk, 0x38, 0, sk->sk_allocation); break; - default: -#ifdef CONFIG_NETFILTER - return nf_setsockopt(sk, PF_DECnet, optname, optval, optlen); -#endif - case DSO_LINKINFO: - case DSO_STREAM: - case DSO_SEQPACKET: - return -ENOPROTOOPT; - case DSO_MAXWINDOW: if (optlen != sizeof(unsigned long)) return -EINVAL; @@ -1501,6 +1498,12 @@ static int __dn_setsockopt(struct socket *sock, int level,int optname, char __us return -EINVAL; scp->info_loc = u.info; break; + + case DSO_LINKINFO: + case DSO_STREAM: + case DSO_SEQPACKET: + default: + return -ENOPROTOOPT; } return 0; @@ -1514,6 +1517,20 @@ static int dn_getsockopt(struct socket *sock, int level, int optname, char __use lock_sock(sk); err = __dn_getsockopt(sock, level, optname, optval, optlen, 0); release_sock(sk); +#ifdef CONFIG_NETFILTER + if (err == -ENOPROTOOPT && optname != DSO_STREAM && + optname != DSO_SEQPACKET && optname != DSO_CONACCEPT && + optname != DSO_CONREJECT) { + int len; + + if (get_user(len, optlen)) + return -EFAULT; + + err = nf_getsockopt(sk, PF_DECnet, optname, optval, &len); + if (err >= 0) + err = put_user(len, optlen); + } +#endif return err; } @@ -1579,26 +1596,6 @@ static int __dn_getsockopt(struct socket *sock, int level,int optname, char __us r_data = &link; break; - default: -#ifdef CONFIG_NETFILTER - { - int ret, len; - - if (get_user(len, optlen)) - return -EFAULT; - - ret = nf_getsockopt(sk, PF_DECnet, optname, optval, &len); - if (ret >= 0) - ret = put_user(len, optlen); - return ret; - } -#endif - case DSO_STREAM: - case DSO_SEQPACKET: - case DSO_CONACCEPT: - case DSO_CONREJECT: - return -ENOPROTOOPT; - case DSO_MAXWINDOW: if (r_len > sizeof(unsigned long)) r_len = sizeof(unsigned long); @@ -1630,6 +1627,13 @@ static int __dn_getsockopt(struct socket *sock, int level,int optname, char __us r_len = sizeof(unsigned char); r_data = &scp->info_rem; break; + + case DSO_STREAM: + case DSO_SEQPACKET: + case DSO_CONACCEPT: + case DSO_CONREJECT: + default: + return -ENOPROTOOPT; } if (r_data) { -- 2.14.3
Re: [PATCHv2 net-next 3/8] net: sched: act: handle generic action errors
On Thu, 2018-02-15 at 10:43 -0500, Alexander Aring wrote: > I will send v3 with your changes. But not with one error handling in > label, the most whole point is to get some messages when somebody > throw a -EINVAL to the userspace and TC subsystem has a lot of places > with that. that's ok for me. thanks! -- davide
RE: [Crypto v5 03/12] support for inline tls
-Original Message- From: Dave Watson [mailto:davejwat...@fb.com] Sent: Thursday, February 15, 2018 9:22 PM To: Atul Gupta Cc: da...@davemloft.net; herb...@gondor.apana.org.au; s...@queasysnail.net; linux-cry...@vger.kernel.org; netdev@vger.kernel.org; Ganesh GR Subject: Re: [Crypto v5 03/12] support for inline tls On 02/15/18 12:24 PM, Atul Gupta wrote: > @@ -401,6 +430,15 @@ static int do_tls_setsockopt_tx(struct sock *sk, char > __user *optval, > goto out; > } > > + rc = get_tls_offload_dev(sk); > + if (rc) { > + goto out; > + } else { > + /* Retain HW unhash for cleanup and move to SW Tx */ > + sk->sk_prot[TLS_BASE_TX].unhash = > + sk->sk_prot[TLS_FULL_HW].unhash; Isn't sk->sk_prot a pointer to a global shared struct here still? It looks like this would actually modify the global struct, and not just for this sk. Yes, its global. It require add on check to modify only when tls_offload dev list has an entry. I will revisit and correct. Can you look through other changes please? Thanks Atul
Re: [PATCHv2 net-next 3/8] net: sched: act: handle generic action errors
Hi, On Thu, Feb 15, 2018 at 6:14 AM, Davide Caratti wrote: > On Wed, 2018-02-14 at 17:13 -0500, Alexander Aring wrote: >> This patch adds extack support for generic act handling. The extack >> will be set deeper to each called function which is not part of netdev >> core api. >> >> Based on work by David Ahern > > hello Alexander, > > after looking at the code more closely, I think there is margin for some > more improvement here (i.e make the message contents easier to grep from a > log). Please let me know if the comments below make sense for you. > I will send v3 with your changes. But not with one error handling in label, the most whole point is to get some messages when somebody throw a -EINVAL to the userspace and TC subsystem has a lot of places with that. - Alex
Re: [RFC][PATCH bpf v2 1/2] bpf: allow 64-bit offsets for bpf function calls
On 02/13/2018 05:05 AM, Sandipan Das wrote: > The imm field of a bpf_insn is a signed 32-bit integer. For > JIT-ed bpf-to-bpf function calls, it stores the offset from > __bpf_call_base to the start of the callee function. > > For some architectures, such as powerpc64, it was found that > this offset may be as large as 64 bits because of which this > cannot be accomodated in the imm field without truncation. > > To resolve this, we additionally make aux->func within each > bpf_prog associated with the functions to point to the list > of all function addresses determined by the verifier. > > We keep the value assigned to the off field of the bpf_insn > as a way to index into aux->func and also set aux->func_cnt > so that this can be used for performing basic upper bound > checks for the off field. > > Signed-off-by: Sandipan Das > --- > v2: Make aux->func point to the list of functions determined > by the verifier rather than allocating a separate callee > list for each function. Approach looks good to me; do you know whether s390x JIT would have similar requirement? I think one limitation that would still need to be addressed later with such approach would be regarding the xlated prog dump in bpftool, see 'BPF calls via JIT' in 7105e828c087 ("bpf: allow for correlation of maps and helpers in dump"). Any ideas for this (potentially if we could use off + imm for calls, we'd get to 48 bits, but that seems still not be enough as you say)? Thanks, Daniel
Re: [Crypto v5 03/12] support for inline tls
On 02/15/18 04:10 PM, Atul Gupta wrote: > > -Original Message- > > From: Dave Watson [mailto:davejwat...@fb.com] > > Sent: Thursday, February 15, 2018 9:22 PM > > To: Atul Gupta > > Cc: da...@davemloft.net; herb...@gondor.apana.org.au; s...@queasysnail.net; > > linux-cry...@vger.kernel.org; netdev@vger.kernel.org; Ganesh GR > > > > Subject: Re: [Crypto v5 03/12] support for inline tls > > > > On 02/15/18 12:24 PM, Atul Gupta wrote: > > > @@ -401,6 +430,15 @@ static int do_tls_setsockopt_tx(struct sock *sk, > > > char __user *optval, > > > goto out; > > > } > > > > > > + rc = get_tls_offload_dev(sk); > > > + if (rc) { > > > + goto out; > > > + } else { > > > + /* Retain HW unhash for cleanup and move to SW Tx */ > > > + sk->sk_prot[TLS_BASE_TX].unhash = > > > + sk->sk_prot[TLS_FULL_HW].unhash; > > > > Isn't sk->sk_prot a pointer to a global shared struct here still? It looks > > like this would actually modify the global struct, and not just for this sk. > Yes, its global. It require add on check to modify only when tls_offload dev > list has an entry. I will revisit and correct. > > Can you look through other changes please? I looked through 1,2,3,11 (the tls-related ones) and don't have any other code comments. Patch 11 commit message still mentions ULP, could use updating / clarification.
RE: [Crypto v5 03/12] support for inline tls
> > > @@ -401,6 +430,15 @@ static int do_tls_setsockopt_tx(struct sock *sk, > > > char __user *optval, > > > goto out; > > > } > > > > > > + rc = get_tls_offload_dev(sk); > > > + if (rc) { > > > + goto out; > > > + } else { > > > + /* Retain HW unhash for cleanup and move to SW Tx */ > > > + sk->sk_prot[TLS_BASE_TX].unhash = > > > + sk->sk_prot[TLS_FULL_HW].unhash; > > > > Isn't sk->sk_prot a pointer to a global shared struct here still? It looks > > like this would actually modify the global struct, and not just for this sk. > Yes, its global. It require add on check to modify only when tls_offload dev > list has an entry. I will revisit and correct. > > Can you look through other changes please? I looked through 1,2,3,11 (the tls-related ones) and don't have any other code comments. Patch 11 commit message still mentions ULP, could use updating / clarification. Thank you, I will collate and clean for v6.
Re: [PATCH V2 net-next 4/7] rds: support for zcopy completion notification
On Thu, Feb 15, 2018 at 7:03 AM, Sowmini Varadhan wrote: > On (02/14/18 19:41), Willem de Bruijn wrote: >> >> One more thing: this code notifies that the operation succeeded, but >> the data was copied in the process. It does not have to be set otherwise. > > I see. this one was a bit confusing for me (hence the copy/paste) - > maybe because the TCP/UDP/PACKET case is a bit different from RDS, > and sendmsg may find that it has to switch from zcopy to bcopy > after the sendmsg() itself has returned, but this cannot happen > for RDS (thus I should never set this code?). Agreed. > Is it ok if I fix this in the next round or do you think > this warrants a V3? The flag is a hint, so on its own it does not require a respin. There are by now four small items to patch up. I would respin so that the patchset that is recorded can be read later as a single correct solution. But either way works, I don't feel strongly.
Re: [PATCH V2 net-next 4/7] rds: support for zcopy completion notification
On (02/15/18 11:35), Willem de Bruijn wrote: > > The flag is a hint, so on its own it does not require a respin. > > There are by now four small items to patch up. I would respin so that the > patchset that is recorded can be read later as a single correct solution. Agreed, I'll genenrate V3 later today --Sowmini
Re: [PATCH v7 04/37] earlycon: add reg-offset to physical address before mapping
On Tue, Feb 13, 2018 at 3:09 AM, Greentime Hu wrote: > It will get the wrong virtual address because port->mapbase is not added > the correct reg-offset yet. We have to update it before earlycon_map() > is called > > Signed-off-by: Greentime Hu > Acked-by: Arnd Bergmann > Cc: Peter Hurley > Cc: sta...@vger.kernel.org > Fixes: 088da2a17619 ("of: earlycon: Initialize port fields from DT > properties") This should get applied now rather than sit in numerous revisions of this series for nds32. Acked-by: Rob Herring Greg, can you apply? Rob
v4.16-rc1 misaligned atomics in skb__clone / __napi_alloc_skb
Hi, While fuzzing arm64 v4.16-rc1 with Syzkaller, I've been hitting a misaligned atomic in __skb_clone: atomic_inc(&(skb_shinfo(skb)->dataref)); .. where dataref doesn't have the required natural alignment, and the atomic operation faults. e.g. i often see it aligned to a single byte boundary rather than a four byte boundary. AFAICT, the skb_shared_info is misaligned at the instant it's allocated in __napi_alloc_skb(). With the patch at the end of this mail, the atomic_set() (which is a WRITE_ONCE()) in __build_skb() blows up, e.g. WARNING: CPU: 0 PID: 8457 at mm/access_once.c:12 access_once_alignment_check+0x34/0x40 mm/access_once.c:12 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 8457 Comm: syz-executor1 Not tainted 4.16.0-rc1-2-gb03ae7b8b0de #9 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x390 arch/arm64/kernel/time.c:52 show_stack+0x20/0x30 arch/arm64/kernel/traps.c:151 __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0xd0/0x130 lib/dump_stack.c:53 panic+0x220/0x3fc kernel/panic.c:183 __warn+0x270/0x2bc kernel/panic.c:547 report_bug+0x1dc/0x2d0 lib/bug.c:184 bug_handler+0x7c/0x128 arch/arm64/kernel/traps.c:758 call_break_hook arch/arm64/kernel/debug-monitors.c:305 [inline] brk_handler+0x1a0/0x300 arch/arm64/kernel/debug-monitors.c:320 do_debug_exception+0x15c/0x408 arch/arm64/mm/fault.c:808 el1_dbg+0x18/0x78 access_once_alignment_check+0x34/0x40 mm/access_once.c:12 __napi_alloc_skb+0x18c/0x2b8 net/core/skbuff.c:482 napi_alloc_skb include/linux/skbuff.h:2643 [inline] napi_get_frags+0x68/0x120 net/core/dev.c:5108 tun_napi_alloc_frags drivers/net/tun.c:1477 [inline] tun_get_user+0x13b0/0x3fe8 drivers/net/tun.c:1820 tun_chr_write_iter+0xa8/0x158 drivers/net/tun.c:1988 call_write_iter include/linux/fs.h:1781 [inline] do_iter_readv_writev+0x2f8/0x490 fs/read_write.c:653 do_iter_write+0x14c/0x4b0 fs/read_write.c:932 vfs_writev+0x130/0x288 fs/read_write.c:977 do_writev+0xe0/0x248 fs/read_write.c:1012 SYSC_writev fs/read_write.c:1085 [inline] SyS_writev+0x34/0x48 fs/read_write.c:1082 el0_svc_naked+0x30/0x34 SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x1002082 Memory Limit: none Rebooting in 86400 seconds.. ... I see these splats with both tun and virtio-net. I have some Syzkaller logs, and can reproduce the problem locally, but unfortunately the C reproducer it generated doesn't seem to work on its own. Any ideas as to how this could happen? Thanks, Mark. >8 diff --git a/include/linux/compiler.h b/include/linux/compiler.h index c2cc57a2f508..c06b810a3b3b 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -163,6 +163,8 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, #include +void access_once_alignment_check(const volatile void *ptr, int size); + #define __READ_ONCE_SIZE \ ({ \ switch (size) { \ @@ -180,6 +182,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val, static __always_inline void __read_once_size(const volatile void *p, void *res, int size) { + access_once_alignment_check(p, size); __READ_ONCE_SIZE; } @@ -203,6 +206,8 @@ void __read_once_size_nocheck(const volatile void *p, void *res, int size) static __always_inline void __write_once_size(volatile void *p, void *res, int size) { + access_once_alignment_check(p, size); + switch (size) { case 1: *(volatile __u8 *)p = *(__u8 *)res; break; case 2: *(volatile __u16 *)p = *(__u16 *)res; break; diff --git a/mm/Makefile b/mm/Makefile index e669f02c5a54..604d269d7d57 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -3,6 +3,7 @@ # Makefile for the linux memory manager. # +KASAN_SANITIZE_access_once.o := n KASAN_SANITIZE_slab_common.o := n KASAN_SANITIZE_slab.o := n KASAN_SANITIZE_slub.o := n @@ -10,6 +11,7 @@ KASAN_SANITIZE_slub.o := n # These files are disabled because they produce non-interesting and/or # flaky coverage that is not a function of syscall inputs. E.g. slab is out of # free pages, or a task is migrated between nodes. +KCOV_INSTRUMENT_access_once.o := n KCOV_INSTRUMENT_slab_common.o := n KCOV_INSTRUMENT_slob.o := n KCOV_INSTRUMENT_slab.o := n @@ -39,7 +41,7 @@ obj-y := filemap.o mempool.o oom_kill.o \ mm_init.o mmu_context.o percpu.o slab_common.o \ compaction.o vmacache.o swap_slots.o \ interval_tree.o list_lru.o workingset.o \ - debug.o $(mmu-y) + debug.o access_once.o $(mmu-y) obj-y += init-mm.o diff --git a/mm/access_once.c b/mm/access_once.c new file mode 100644 index ..42ee35d171c4 --- /dev/null +++ b/mm/access_once.c @@ -0,0 +1,15 @@ +#include +#include
[PATCH net] udplite: fix partial checksum initialization
Since UDP-Lite is always using checksum, the following path is triggered when calculating pseudo header for it: udp4_csum_init() or udp6_csum_init() skb_checksum_init_zero_check() __skb_checksum_validate_complete() The problem can appear if skb->len is less than CHECKSUM_BREAK. In this particular case __skb_checksum_validate_complete() also invokes __skb_checksum_complete(skb). If UDP-Lite is using partial checksum that covers only part of a packet, the function will return bad checksum and the packet will be dropped. It can be fixed if we skip skb_checksum_init_zero_check() and only set the required pseudo header checksum for UDP-Lite with partial checksum before udp4_csum_init()/udp6_csum_init() functions return. Fixes: ed70fcfcee95 ("net: Call skb_checksum_init in IPv4") Fixes: e4f45b7f40bd ("net: Call skb_checksum_init in IPv6") Signed-off-by: Alexey Kodanev --- Alternatively, we could modify skb_checksum_init_zero_check() because it is used only in udp4_csum_init() and udp6_csum_init() as well as introducing a new 'cscov' parameter in __skb_checksum_validate_complete() and calling __skb_checksum_complete_head() if 'cscov' not equals zero. But these changes would touch general code in skbuff.h include/net/udplite.h | 1 + net/ipv4/udp.c | 5 + net/ipv6/ip6_checksum.c | 5 + 3 files changed, 11 insertions(+) diff --git a/include/net/udplite.h b/include/net/udplite.h index 81bdbf9..9185e45 100644 --- a/include/net/udplite.h +++ b/include/net/udplite.h @@ -64,6 +64,7 @@ static inline int udplite_checksum_init(struct sk_buff *skb, struct udphdr *uh) UDP_SKB_CB(skb)->cscov = cscov; if (skb->ip_summed == CHECKSUM_COMPLETE) skb->ip_summed = CHECKSUM_NONE; + skb->csum_valid = 0; } return 0; diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index bfaefe5..e5ef7c3 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -2024,6 +2024,11 @@ static inline int udp4_csum_init(struct sk_buff *skb, struct udphdr *uh, err = udplite_checksum_init(skb, uh); if (err) return err; + + if (UDP_SKB_CB(skb)->partial_cov) { + skb->csum = inet_compute_pseudo(skb, proto); + return 0; + } } /* Note, we are only interested in != 0 or == 0, thus the diff --git a/net/ipv6/ip6_checksum.c b/net/ipv6/ip6_checksum.c index ec43d18..547515e 100644 --- a/net/ipv6/ip6_checksum.c +++ b/net/ipv6/ip6_checksum.c @@ -73,6 +73,11 @@ int udp6_csum_init(struct sk_buff *skb, struct udphdr *uh, int proto) err = udplite_checksum_init(skb, uh); if (err) return err; + + if (UDP_SKB_CB(skb)->partial_cov) { + skb->csum = ip6_compute_pseudo(skb, proto); + return 0; + } } /* To support RFC 6936 (allow zero checksum in UDP/IPV6 for tunnels) -- 1.8.3.1
Re: v4.16-rc1 misaligned atomics in skb__clone / __napi_alloc_skb
On Thu, Feb 15, 2018 at 9:04 AM, Mark Rutland wrote: > Hi, > > While fuzzing arm64 v4.16-rc1 with Syzkaller, I've been hitting a > misaligned atomic in __skb_clone: > > atomic_inc(&(skb_shinfo(skb)->dataref)); > > .. where dataref doesn't have the required natural alignment, and the > atomic operation faults. e.g. i often see it aligned to a single byte > boundary rather than a four byte boundary. > > AFAICT, the skb_shared_info is misaligned at the instant it's allocated > in __napi_alloc_skb(). With the patch at the end of this mail, the > atomic_set() (which is a WRITE_ONCE()) in __build_skb() blows up, e.g. > > WARNING: CPU: 0 PID: 8457 at mm/access_once.c:12 > access_once_alignment_check+0x34/0x40 mm/access_once.c:12 > Kernel panic - not syncing: panic_on_warn set ... > > CPU: 0 PID: 8457 Comm: syz-executor1 Not tainted > 4.16.0-rc1-2-gb03ae7b8b0de #9 > Hardware name: linux,dummy-virt (DT) > Call trace: > dump_backtrace+0x0/0x390 arch/arm64/kernel/time.c:52 > show_stack+0x20/0x30 arch/arm64/kernel/traps.c:151 > __dump_stack lib/dump_stack.c:17 [inline] > dump_stack+0xd0/0x130 lib/dump_stack.c:53 > panic+0x220/0x3fc kernel/panic.c:183 > __warn+0x270/0x2bc kernel/panic.c:547 > report_bug+0x1dc/0x2d0 lib/bug.c:184 > bug_handler+0x7c/0x128 arch/arm64/kernel/traps.c:758 > call_break_hook arch/arm64/kernel/debug-monitors.c:305 [inline] > brk_handler+0x1a0/0x300 arch/arm64/kernel/debug-monitors.c:320 > do_debug_exception+0x15c/0x408 arch/arm64/mm/fault.c:808 > el1_dbg+0x18/0x78 > access_once_alignment_check+0x34/0x40 mm/access_once.c:12 > __napi_alloc_skb+0x18c/0x2b8 net/core/skbuff.c:482 > napi_alloc_skb include/linux/skbuff.h:2643 [inline] > napi_get_frags+0x68/0x120 net/core/dev.c:5108 > tun_napi_alloc_frags drivers/net/tun.c:1477 [inline] > tun_get_user+0x13b0/0x3fe8 drivers/net/tun.c:1820 > tun_chr_write_iter+0xa8/0x158 drivers/net/tun.c:1988 > call_write_iter include/linux/fs.h:1781 [inline] > do_iter_readv_writev+0x2f8/0x490 fs/read_write.c:653 > do_iter_write+0x14c/0x4b0 fs/read_write.c:932 > vfs_writev+0x130/0x288 fs/read_write.c:977 > do_writev+0xe0/0x248 fs/read_write.c:1012 > SYSC_writev fs/read_write.c:1085 [inline] > SyS_writev+0x34/0x48 fs/read_write.c:1082 > el0_svc_naked+0x30/0x34 > SMP: stopping secondary CPUs > Kernel Offset: disabled > CPU features: 0x1002082 > Memory Limit: none > Rebooting in 86400 seconds.. > > ... I see these splats with both tun and virtio-net. > > I have some Syzkaller logs, and can reproduce the problem locally, but > unfortunately the C reproducer it generated doesn't seem to work on its > own. > > Any ideas as to how this could happen? > Yes, it seems tun.c breaks the assumptions. If it really wants to provide arbitrary fragments and alignments, it should use a separate Please try : diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 81e6cc951e7fc7c983919365c34842c34bcaedcf..92c6b6d02f7c18b63c42ffe1d9cb7286975e1263 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1500,7 +1500,7 @@ static struct sk_buff *tun_napi_alloc_frags(struct tun_file *tfile, } local_bh_disable(); - data = napi_alloc_frag(fragsz); + data = napi_alloc_frag(SKB_DATA_ALIGN(fragsz)); local_bh_enable(); if (!data) { err = -ENOMEM; > Thanks, > Mark. > > >8 > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > index c2cc57a2f508..c06b810a3b3b 100644 > --- a/include/linux/compiler.h > +++ b/include/linux/compiler.h > @@ -163,6 +163,8 @@ void ftrace_likely_update(struct ftrace_likely_data *f, > int val, > > #include > > +void access_once_alignment_check(const volatile void *ptr, int size); > + > #define __READ_ONCE_SIZE \ > ({ \ > switch (size) { \ > @@ -180,6 +182,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, > int val, > static __always_inline > void __read_once_size(const volatile void *p, void *res, int size) > { > + access_once_alignment_check(p, size); > __READ_ONCE_SIZE; > } > > @@ -203,6 +206,8 @@ void __read_once_size_nocheck(const volatile void *p, > void *res, int size) > > static __always_inline void __write_once_size(volatile void *p, void *res, > int size) > { > + access_once_alignment_check(p, size); > + > switch (size) { > case 1: *(volatile __u8 *)p = *(__u8 *)res; break; > case 2: *(volatile __u16 *)p = *(__u16 *)res; break; > diff --git a/mm/Makefile b/mm/Makefile > index e669f02c5a54..604d269d7d57 100644 > --- a/mm/Makefile > +++ b/mm/Makefile > @@ -3,6 +3,7 @@ > # Makefile for the linux memory manager. > # > > +KASAN_SANITIZE_access_once.o := n > KASAN_SANITIZE_slab_common.o := n > KASAN_SANITIZE_slab.o := n > KASAN_SANITIZE_slub.o :=
Re: v4.16-rc1 misaligned atomics in skb__clone / __napi_alloc_skb
On Thu, Feb 15, 2018 at 9:20 AM, Eric Dumazet wrote: > > Yes, it seems tun.c breaks the assumptions. > > If it really wants to provide arbitrary fragments and alignments, it > should use a separate Sorry, I have sent the message to soon. tun.c should use a private 'struct page_frag_cache' to deliver arbitrary frags/alignments, so that syzkaller might catch interesting bugs in the stack. > > Please try : > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c > index > 81e6cc951e7fc7c983919365c34842c34bcaedcf..92c6b6d02f7c18b63c42ffe1d9cb7286975e1263 > 100644 > --- a/drivers/net/tun.c > +++ b/drivers/net/tun.c > @@ -1500,7 +1500,7 @@ static struct sk_buff > *tun_napi_alloc_frags(struct tun_file *tfile, > } > > local_bh_disable(); > - data = napi_alloc_frag(fragsz); > + data = napi_alloc_frag(SKB_DATA_ALIGN(fragsz)); > local_bh_enable(); > if (!data) { > err = -ENOMEM; This patch should solve your immediate problem, but would lower fuzzer abilities to find bugs. I will send something more suited to original intent of these commits : 90e33d45940793def6f773b2d528e9f3c84ffdc7 tun: enable napi_gro_frags() for TUN/TAP driver 943170998b200190f99d3fe7e771437e2c51f319 tun: enable NAPI for TUN/TAP driver
Re: v4.16-rc1 misaligned atomics in skb__clone / __napi_alloc_skb
On Thu, Feb 15, 2018 at 09:24:36AM -0800, Eric Dumazet wrote: > On Thu, Feb 15, 2018 at 9:20 AM, Eric Dumazet wrote: > > > > Yes, it seems tun.c breaks the assumptions. > > > > If it really wants to provide arbitrary fragments and alignments, it > > should use a separate > > Sorry, I have sent the message to soon. > > tun.c should use a private 'struct page_frag_cache' to deliver > arbitrary frags/alignments, > so that syzkaller might catch interesting bugs in the stack. > > > Please try : > > > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c > > index > > 81e6cc951e7fc7c983919365c34842c34bcaedcf..92c6b6d02f7c18b63c42ffe1d9cb7286975e1263 > > 100644 > > --- a/drivers/net/tun.c > > +++ b/drivers/net/tun.c > > @@ -1500,7 +1500,7 @@ static struct sk_buff > > *tun_napi_alloc_frags(struct tun_file *tfile, > > } > > > > local_bh_disable(); > > - data = napi_alloc_frag(fragsz); > > + data = napi_alloc_frag(SKB_DATA_ALIGN(fragsz)); > > local_bh_enable(); > > if (!data) { > > err = -ENOMEM; > > This patch should solve your immediate problem, but would lower fuzzer > abilities to find bugs. So far so good, it seems! > I will send something more suited to original intent of these commits : > > 90e33d45940793def6f773b2d528e9f3c84ffdc7 tun: enable napi_gro_frags() > for TUN/TAP driver > 943170998b200190f99d3fe7e771437e2c51f319 tun: enable NAPI for TUN/TAP driver Thanks! I'd be more than happy to test any such patches. As I mentioned, I'm also seeing similar in virtio-net, e.g. WARNING: CPU: 0 PID: 8 at mm/access_once.c:12 access_once_alignment_check+0x34/0x40 mm/access_once.c:12 Kernel panic - not syncing: panic_on_warn set ... CPU: 0 PID: 8 Comm: ksoftirqd/0 Not tainted 4.16.0-rc1-2-gb03ae7b8b0de #9 Hardware name: linux,dummy-virt (DT) Call trace: dump_backtrace+0x0/0x390 arch/arm64/kernel/time.c:52 show_stack+0x20/0x30 arch/arm64/kernel/traps.c:151 __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0xd0/0x130 lib/dump_stack.c:53 panic+0x220/0x3fc kernel/panic.c:183 __warn+0x270/0x2bc kernel/panic.c:547 report_bug+0x1dc/0x2d0 lib/bug.c:184 bug_handler+0x7c/0x128 arch/arm64/kernel/traps.c:758 call_break_hook arch/arm64/kernel/debug-monitors.c:305 [inline] brk_handler+0x1a0/0x300 arch/arm64/kernel/debug-monitors.c:320 do_debug_exception+0x15c/0x408 arch/arm64/mm/fault.c:808 el1_dbg+0x18/0x78 access_once_alignment_check+0x34/0x40 mm/access_once.c:12 __napi_alloc_skb+0x18c/0x2b8 net/core/skbuff.c:482 napi_alloc_skb include/linux/skbuff.h:2643 [inline] page_to_skb.isra.17+0x58/0x610 drivers/net/virtio_net.c:345 receive_mergeable drivers/net/virtio_net.c:783 [inline] receive_buf+0x978/0x2a70 drivers/net/virtio_net.c:888 virtnet_receive drivers/net/virtio_net.c:1160 [inline] virtnet_poll+0x24c/0x850 drivers/net/virtio_net.c:1240 napi_poll net/core/dev.c:5690 [inline] net_rx_action+0x324/0xa50 net/core/dev.c:5756 __do_softirq+0x318/0x734 kernel/softirq.c:285 run_ksoftirqd+0x70/0xa8 kernel/softirq.c:666 smpboot_thread_fn+0x544/0x9d0 kernel/smpboot.c:164 kthread+0x2f8/0x380 kernel/kthread.c:238 ret_from_fork+0x10/0x18 arch/arm64/kernel/entry.S:1158 SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x1002082 Memory Limit: none Rebooting in 86400 seconds.. ... does similar apply there? Thanks, Mark.
[PATCH net-next,v3] net: sched: add em_ipt ematch for calling xtables matches
The commit a new tc ematch for using netfilter xtable matches. This allows early classification as well as mirroning/redirecting traffic based on logic implemented in netfilter extensions. Current supported use case is classification based on the incoming IPSec state used during decpsulation using the 'policy' iptables extension (xt_policy). The module dynamically fetches the netfilter match module and calls it using a fake xt_action_param structure based on validated userspace provided parameters. As the xt_policy match does not access skb->data, no skb modifications are needed on match. Signed-off-by: Eyal Birger --- v3: - limit supported match to xt_policy and validate parameters - receive match protocol from userspace v2: - Remove skb push/pull and limit functionality to ingress --- include/uapi/linux/pkt_cls.h | 3 +- include/uapi/linux/tc_ematch/tc_em_ipt.h | 20 +++ net/sched/Kconfig| 12 ++ net/sched/Makefile | 1 + net/sched/em_ipt.c | 257 +++ 5 files changed, 292 insertions(+), 1 deletion(-) create mode 100644 include/uapi/linux/tc_ematch/tc_em_ipt.h create mode 100644 net/sched/em_ipt.c diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index 46c5066..7cafb26d 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -555,7 +555,8 @@ enum { #defineTCF_EM_VLAN 6 #defineTCF_EM_CANID7 #defineTCF_EM_IPSET8 -#defineTCF_EM_MAX 8 +#defineTCF_EM_IPT 9 +#defineTCF_EM_MAX 9 enum { TCF_EM_PROG_TC diff --git a/include/uapi/linux/tc_ematch/tc_em_ipt.h b/include/uapi/linux/tc_ematch/tc_em_ipt.h new file mode 100644 index 000..49a6553 --- /dev/null +++ b/include/uapi/linux/tc_ematch/tc_em_ipt.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef __LINUX_TC_EM_IPT_H +#define __LINUX_TC_EM_IPT_H + +#include +#include + +enum { + TCA_EM_IPT_UNSPEC, + TCA_EM_IPT_HOOK, + TCA_EM_IPT_MATCH_NAME, + TCA_EM_IPT_MATCH_REVISION, + TCA_EM_IPT_NFPROTO, + TCA_EM_IPT_MATCH_DATA, + __TCA_EM_IPT_MAX +}; + +#define TCA_EM_IPT_MAX (__TCA_EM_IPT_MAX - 1) + +#endif diff --git a/net/sched/Kconfig b/net/sched/Kconfig index f24a6ae..a01169f 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -658,6 +658,18 @@ config NET_EMATCH_IPSET To compile this code as a module, choose M here: the module will be called em_ipset. +config NET_EMATCH_IPT + tristate "IPtables Matches" + depends on NET_EMATCH && NETFILTER && NETFILTER_XTABLES + ---help--- + Say Y here to be able to classify packets based on iptables + matches. + Current supported match is "policy" which allows packet classification + based on IPsec policy that was used during decapsulation + + To compile this code as a module, choose M here: the + module will be called em_ipt. + config NET_CLS_ACT bool "Actions" select NET_CLS diff --git a/net/sched/Makefile b/net/sched/Makefile index 5b63544..8811d38 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -75,3 +75,4 @@ obj-$(CONFIG_NET_EMATCH_META) += em_meta.o obj-$(CONFIG_NET_EMATCH_TEXT) += em_text.o obj-$(CONFIG_NET_EMATCH_CANID) += em_canid.o obj-$(CONFIG_NET_EMATCH_IPSET) += em_ipset.o +obj-$(CONFIG_NET_EMATCH_IPT) += em_ipt.o diff --git a/net/sched/em_ipt.c b/net/sched/em_ipt.c new file mode 100644 index 000..a5f34e9 --- /dev/null +++ b/net/sched/em_ipt.c @@ -0,0 +1,257 @@ +/* + * net/sched/em_ipt.c IPtables matches Ematch + * + * (c) 2018 Eyal Birger + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct em_ipt_match { + const struct xt_match *match; + u32 hook; + u8 match_data[0] __aligned(8); +}; + +struct em_ipt_xt_match { + char *match_name; + int (*validate_match_data)(struct nlattr **tb, u8 mrev); +}; + +static const struct nla_policy em_ipt_policy[TCA_EM_IPT_MAX + 1] = { + [TCA_EM_IPT_MATCH_NAME] = { .type = NLA_STRING, + .len = XT_EXTENSION_MAXNAMELEN }, + [TCA_EM_IPT_MATCH_REVISION] = { .type = NLA_U8 }, + [TCA_EM_IPT_HOOK] = { .type = NLA_U32 }, + [TCA_EM_IPT_NFPROTO]= { .type = NLA_U8 }, + [TCA_EM_IPT_MATCH_DATA] = { .type = NLA_UNSPEC }, +}; + +static int check_match(struct net *net, struct em_ipt_match *
Re: v4.16-rc1 misaligned atomics in skb__clone / __napi_alloc_skb
On Thu, 2018-02-15 at 09:24 -0800, Eric Dumazet wrote: > > I will send something more suited to original intent of these commits : > > 90e33d45940793def6f773b2d528e9f3c84ffdc7 tun: enable napi_gro_frags() > for TUN/TAP driver > 943170998b200190f99d3fe7e771437e2c51f319 tun: enable NAPI for TUN/TAP driver Can you try this patch ? Thanks !  drivers/net/tun.c |   16 ++--  1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 81e6cc951e7fc7c983919365c34842c34bcaedcf..b52258c327d2e1d7c7476de345e49f082909c246 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1489,27 +1489,23 @@ static struct sk_buff *tun_napi_alloc_frags(struct tun_file *tfile, skb->truesize += skb->data_len; for (i = 1; i < it->nr_segs; i++) { + struct page_frag *pfrag = €t->task_frag; size_t fragsz = it->iov[i].iov_len; - unsigned long offset; - struct page *page; - void *data; if (fragsz == 0 || fragsz > PAGE_SIZE) { err = -EINVAL; goto free; } - local_bh_disable(); - data = napi_alloc_frag(fragsz); - local_bh_enable(); - if (!data) { + if (!skb_page_frag_refill(fragsz, pfrag, GFP_KERNEL)) { err = -ENOMEM; goto free; } - page = virt_to_head_page(data); - offset = data - page_address(page); - skb_fill_page_desc(skb, i - 1, page, offset, fragsz); + skb_fill_page_desc(skb, i - 1, pfrag->page, + pfrag->offset, fragsz); + page_ref_inc(pfrag->page); + pfrag->offset += fragsz; } return skb;
Re: [PATCH net] net: sched: fix unbalance in the error path of tca_action_flush()
On 18-02-15 09:50 AM, Davide Caratti wrote: When tca_action_flush() calls the action walk() and gets an error, a successful call to nla_nest_start() is not followed by a call to nla_nest_cancel(). It's harmless, as the skb is freed in the error path - but it's worth to fix this unbalance. Kind of pushing the boundaries saying this targets net tree - there is no bug you are fixing (as you say the sk is freed). Maybe it makes the code prettier ... cheers, jamal
Re: [PATCH net-next 1/2] net: netfilter: export xt_policy match_policy_in() as xt_policy_match_policy_in()
Hi Pablo, On Wed, 14 Feb 2018 11:19:40 +0100 Pablo Neira Ayuso wrote: > On Wed, Feb 14, 2018 at 10:14:24AM +0200, Eyal Birger wrote: > > Hi Pablo, > > > > On Mon, 15 Jan 2018 13:48:41 +0200 > > Eyal Birger wrote: > > > > > On Mon, Jan 15, 2018 at 12:57 PM, Pablo Neira Ayuso > > > wrote: > > > > On Sun, Jan 14, 2018 at 02:47:46PM +0200, Eyal Birger wrote: > > > >> On Fri, Jan 12, 2018 at 4:00 PM, Pablo Neira Ayuso > > > >> wrote: > > > >> > On Fri, Jan 12, 2018 at 03:56:21PM +0200, Eyal Birger > > > >> > wrote: > > > >> >> On Fri, Jan 12, 2018 at 3:41 PM, Pablo Neira Ayuso > > > >> >> wrote: > > > >> >> > On Fri, Jan 12, 2018 at 02:57:24PM +0200, Eyal Birger > > > >> >> > wrote: > > > >> >> >> @@ -51,9 +52,9 @@ match_xfrm_state(const struct > > > >> >> >> xfrm_state *x, const struct xt_policy_elem *e, > > > >> >> >> MATCH(reqid, x->props.reqid); } > > > >> >> >> > > > >> >> >> -static int > > > >> >> >> -match_policy_in(const struct sk_buff *skb, const struct > > > >> >> >> xt_policy_info *info, > > > >> >> >> - unsigned short family) > > > >> >> >> +int xt_policy_match_policy_in(const struct sk_buff *skb, > > > >> >> >> + const struct xt_policy_info > > > >> >> >> *info, > > > >> >> >> + unsigned short family) > > > >> >> >> { > > > >> >> >> const struct xt_policy_elem *e; > > > >> >> >> const struct sec_path *sp = skb->sp; > > > >> >> >> @@ -80,10 +81,11 @@ match_policy_in(const struct sk_buff > > > >> >> >> *skb, const struct xt_policy_info *info, > > > >> >> >> > > > >> >> >> return strict ? 1 : 0; > > > >> >> >> } > > > >> >> >> +EXPORT_SYMBOL_GPL(xt_policy_match_policy_in); > > > >> >> > > > > >> >> > If you just want to call xt_policy_match from tc, then you > > > >> >> > could use tc ipt infrastructure instead. > > > >> >> > > > >> >> Thanks for the suggestion - > > > >> >> Are you referring to act_ipt? it looks like it allows > > > >> >> calling targets; I couldn't find a classifier calling a > > > >> >> netfilter matcher. > > > >> > > > > >> > Then, I'd suggest you extend that infrastructure to alllow to > > > >> > call matches, so we reduce the number of interdepencies > > > >> > between different subsystems. > > > >> > > > >> This appears very versatile. though in this case the use of the > > > >> xtables code and structures was done in order to avoid > > > >> introducing new uapi structures and supporting > > > >> match code, not necessarily to expose the full capabilities of > > > >> extended matches, similar in spirit to what was done in the > > > >> em_ipset ematch. > > > >> > > > >> Perhaps in order to avoid the direct export of xt_policy code, > > > >> I could call xt_request_find_match() from the em_policy module, > > > >> requesting the xt_policy match? > > > >> this way api exposure is minimized while not overly > > > >> complicating the scope of this feature. > > > >> > > > >> What do you think? > > > > > > > > That would look better indeed. > > > > > > > > But once you call xt_request_find_match() from there, how far > > > > is to allow any arbitrary match? I think you only have to > > > > specify the match name, family and the binary layout structure > > > > that represents xt_policy, right? > > > > > > > > > > I don't think that should be a problem. I'd need to pass the > > > protocol onto the ematches .change() callbacks and get the > > > appropriate match from there. > > > > > > > I'm telling this, because I think it would be fair enough to me > > > > if you add the generic infrastructure to the kernel to allow > > > > arbitrary load of xt matches, and then from userspace you just > > > > add the code to support this which is what you need. > > > > > > > > Probably someone else - not you - may follow up later on to > > > > generalize the userspace codebase to support other matches, by > > > > when that happens, the right bits will be in the kernel > > > > already. > > > > > > I'm fine with submitting the more generic infrastructure. > > > Will follow up with a new series. > > > > Following up on this thread, I think this feature would better be > > implemented utilizing xt_policy from tc instead of supporting > > arbitrary xt matches. > > > > Feedback on the generic framework ([1], [2]) revolved around the > > ability to create the skb environment for running matches accessing > > the skb->data. > > I think conclusion was that we're all fine. At ingress this turns into > noop and at egress there's no skb sharing at all. Anyway, see below. > > > My concern is that it would be difficult to maintain the correct > > environment for any xt match, whereas it is simple to create a > > designated ematch for a specific xt match - as done for ipset - > > which can validate the necessary prerequisites for that xt match. > > Then, artificially restrict this to work for xt_policy only. But > please, no new exported symbols to achieve t
Re: [PATCH] i40evf/i40evf_main: Fix variable assignment in i40evf_parse_cls_flower
On Thu, 2018-02-15 at 11:44 -0600, Gustavo A. R. Silva wrote: > It seems this is a copy-paste error and that the proper variable to > use > in this particular case is _src_ instead of _dst_. > > Addresses-Coverity-ID: 1465282 ("Copy-paste error") > Fixes: 0075fa0fadd0 ("i40evf: Add support to apply cloud filters") > Signed-off-by: Gustavo A. R. Silva Thanks for the patch! Acked-by:Harshitha Ramamurthy > --- > Â drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +- > Â 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c > b/drivers/net/ethernet/intel/i40evf/i40evf_main.c > index 4955ce3..58be99e 100644 > --- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c > +++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c > @@ -2763,7 +2763,7 @@ static int i40evf_parse_cls_flower(struct > i40evf_adapter *adapter, > Â > Â if (key->src) { > Â filter->f.mask.tcp_spec.src_port |= > cpu_to_be16(0x); > - filter->f.data.tcp_spec.src_port = key->dst; > + filter->f.data.tcp_spec.src_port = key->src; > Â } > Â } > Â filter->f.field_flags = field_flags;
Re: v4.16-rc1 misaligned atomics in skb__clone / __napi_alloc_skb
On Thu, Feb 15, 2018 at 09:43:06AM -0800, Eric Dumazet wrote: > On Thu, 2018-02-15 at 09:24 -0800, Eric Dumazet wrote: > > > > I will send something more suited to original intent of these commits : > > > > 90e33d45940793def6f773b2d528e9f3c84ffdc7 tun: enable napi_gro_frags() > > for TUN/TAP driver > > 943170998b200190f99d3fe7e771437e2c51f319 tun: enable NAPI for TUN/TAP driver > > Can you try this patch ? Looks good! No splats after 10 minutes with a test that usually fails in a few seconds. FWIW: Tested-by: Mark Rutland Thanks, Mark. >  drivers/net/tun.c |   16 ++-- >  1 file changed, 6 insertions(+), 10 deletions(-) > > diff --git a/drivers/net/tun.c b/drivers/net/tun.c > index > 81e6cc951e7fc7c983919365c34842c34bcaedcf..b52258c327d2e1d7c7476de345e49f082909c246 > 100644 > --- a/drivers/net/tun.c > +++ b/drivers/net/tun.c > @@ -1489,27 +1489,23 @@ static struct sk_buff *tun_napi_alloc_frags(struct > tun_file *tfile, > skb->truesize += skb->data_len; > > for (i = 1; i < it->nr_segs; i++) { > + struct page_frag *pfrag = €t->task_frag; > size_t fragsz = it->iov[i].iov_len; > - unsigned long offset; > - struct page *page; > - void *data; > > if (fragsz == 0 || fragsz > PAGE_SIZE) { > err = -EINVAL; > goto free; > } > > - local_bh_disable(); > - data = napi_alloc_frag(fragsz); > - local_bh_enable(); > - if (!data) { > + if (!skb_page_frag_refill(fragsz, pfrag, GFP_KERNEL)) { > err = -ENOMEM; > goto free; > } > > - page = virt_to_head_page(data); > - offset = data - page_address(page); > - skb_fill_page_desc(skb, i - 1, page, offset, fragsz); > + skb_fill_page_desc(skb, i - 1, pfrag->page, > +pfrag->offset, fragsz); > + page_ref_inc(pfrag->page); > + pfrag->offset += fragsz; > } > > return skb; >
[PATCH] i40evf/i40evf_main: Fix variable assignment in i40evf_parse_cls_flower
It seems this is a copy-paste error and that the proper variable to use in this particular case is _src_ instead of _dst_. Addresses-Coverity-ID: 1465282 ("Copy-paste error") Fixes: 0075fa0fadd0 ("i40evf: Add support to apply cloud filters") Signed-off-by: Gustavo A. R. Silva --- drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c index 4955ce3..58be99e 100644 --- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c +++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c @@ -2763,7 +2763,7 @@ static int i40evf_parse_cls_flower(struct i40evf_adapter *adapter, if (key->src) { filter->f.mask.tcp_spec.src_port |= cpu_to_be16(0x); - filter->f.data.tcp_spec.src_port = key->dst; + filter->f.data.tcp_spec.src_port = key->src; } } filter->f.field_flags = field_flags; -- 2.7.4
Re: [PATCH] i40evf/i40evf_main: Fix variable assignment in i40evf_parse_cls_flower
On Thu, 2018-02-15 at 09:56 -0800, Ramamurthy, Harshitha wrote: > On Thu, 2018-02-15 at 11:44 -0600, Gustavo A. R. Silva wrote: > > It seems this is a copy-paste error and that the proper variable to > > use > > in this particular case is _src_ instead of _dst_. > > > > Addresses-Coverity-ID: 1465282 ("Copy-paste error") > > Fixes: 0075fa0fadd0 ("i40evf: Add support to apply cloud filters") > > Signed-off-by: Gustavo A. R. Silva > > Thanks for the patch! > > Acked-by:Harshitha Ramamurthy Acked-by: Jeff Kirsher Dave, no need for me to pick this patch up and then push to you. You can circumvent me and pick this up. > > --- > > drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) signature.asc Description: This is a digitally signed message part
Re: [PATCH iproute2 2/7] devlink: mnlg: Add support for extended ack
On Thu, 15 Feb 2018 13:57:18 +0200 Arkadi Sharshevsky wrote: > On 02/14/2018 05:12 PM, Stephen Hemminger wrote: > > On Wed, 14 Feb 2018 10:55:17 +0200 > > Arkadi Sharshevsky wrote: > > > >> +static mnl_cb_t mnlg_cb_array[NLMSG_MIN_TYPE] = { > >> + [NLMSG_NOOP]= mnlg_cb_noop, > >> + [NLMSG_ERROR] = mnlg_cb_error, > >> + [NLMSG_DONE]= mnlg_cb_stop, > >> + [NLMSG_OVERRUN] = mnlg_cb_noop, > >> +}; > >> + > > > > Could be const? > > > > I pass the array to mnl_cb_run2() which will discard the 'const' > qualifier. So I dont think this is very beneficial. Thanks, makes sense,
Re: [vhost:vhost 24/24] drivers/firmware/qemu_fw_cfg.c:499:22: error: storage size of 'files' isn't known
On Thu, Feb 15, 2018 at 10:46:50AM +0100, Marc-Andre Lureau wrote: > Hi > > On Wed, Feb 14, 2018 at 9:27 PM, kbuild test robot > wrote: > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git vhost > > head: 5d457fe6aeaab9d0a1665eafc8af7139bc6b6f2e > > commit: 5d457fe6aeaab9d0a1665eafc8af7139bc6b6f2e [24/24] fw_cfg: fix sparse > > warnings around FW_CFG_FILE_DIR read > > config: i386-randconfig-x015-201806 (attached as .config) > > compiler: gcc-7 (Debian 7.3.0-1) 7.3.0 > > reproduce: > > git checkout 5d457fe6aeaab9d0a1665eafc8af7139bc6b6f2e > > # save the attached .config to linux build tree > > make ARCH=i386 > > > > All errors (new ones prefixed by >>): > > > >drivers/firmware/qemu_fw_cfg.c: In function > > 'fw_cfg_register_dir_entries': > >>> drivers/firmware/qemu_fw_cfg.c:499:22: error: storage size of 'files' > >>> isn't known > > struct fw_cfg_files files; > > ^ > > struct fw_cfg_files { > __be32 count; /* number of entries */ > struct fw_cfg_file f[]; > }; > > Interesting, I don't have that warning with 7.3.1. > > I thought the size would be sizeof(count) by standard. > > I replaced it with a __be32 files_count variable instead. > > >drivers/firmware/qemu_fw_cfg.c:499:22: warning: unused variable 'files' > > [-Wunused-variable] > > > > files.count is used 3 lines below, that looks like a compiler bug to me. No - i tried dropping one patch out of series, this did not work out. So whole series is out for now. > > vim +499 drivers/firmware/qemu_fw_cfg.c > > > >493 > >494 /* iterate over all fw_cfg directory entries, registering each one > > */ > >495 static int fw_cfg_register_dir_entries(void) > >496 { > >497 int ret = 0; > >498 u32 count, i; > > > 499 struct fw_cfg_files files; > >500 struct fw_cfg_file *dir; > >501 size_t dir_size; > >502 > >503 fw_cfg_read_blob(FW_CFG_FILE_DIR, &files.count, 0, > > sizeof(files.count)); > >504 count = be32_to_cpu(files.count); > >505 dir_size = count * sizeof(struct fw_cfg_file); > >506 > >507 dir = kmalloc(dir_size, GFP_KERNEL); > >508 if (!dir) > >509 return -ENOMEM; > >510 > >511 fw_cfg_read_blob(FW_CFG_FILE_DIR, dir, sizeof(files.count), > > dir_size); > >512 > >513 for (i = 0; i < count; i++) { > >514 ret = fw_cfg_register_file(&dir[i]); > >515 if (ret) > >516 break; > >517 } > >518 > >519 kfree(dir); > >520 return ret; > >521 } > >522 > > > > --- > > 0-DAY kernel test infrastructureOpen Source Technology > > Center > > https://lists.01.org/pipermail/kbuild-all Intel > > Corporation
[PATCH v2] net: dsa: mv88e6xxx: hwtstamp: fix potential negative array index read
_port_ is being used as index to array port_hwtstamp before verifying it is a non-negative number and a valid index at line 209 and 258: if (port < 0 || port >= mv88e6xxx_num_ports(chip)) Fix this by checking _port_ before using it as index to array port_hwtstamp. Addresses-Coverity-ID: 1465287 ("Negative array index read") Addresses-Coverity-ID: 1465291 ("Negative array index read") Fixes: c6fe0ad2c349 ("net: dsa: mv88e6xxx: add rx/tx timestamping support") Signed-off-by: Gustavo A. R. Silva --- Changes in v2: -Fix the same issue in mv88e6xxx_should_tstamp. -Update commit message. drivers/net/dsa/mv88e6xxx/hwtstamp.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.c b/drivers/net/dsa/mv88e6xxx/hwtstamp.c index b251d53..5a665aa 100644 --- a/drivers/net/dsa/mv88e6xxx/hwtstamp.c +++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.c @@ -200,8 +200,8 @@ int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr) { struct mv88e6xxx_chip *chip = ds->priv; - struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port]; - struct hwtstamp_config *config = &ps->tstamp_config; + struct mv88e6xxx_port_hwtstamp *ps; + struct hwtstamp_config *config; if (!chip->info->ptp_support) return -EOPNOTSUPP; @@ -209,6 +209,9 @@ int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port, if (port < 0 || port >= mv88e6xxx_num_ports(chip)) return -EINVAL; + ps = &chip->port_hwtstamp[port]; + config = &ps->tstamp_config; + return copy_to_user(ifr->ifr_data, config, sizeof(*config)) ? -EFAULT : 0; } @@ -249,7 +252,7 @@ static u8 *parse_ptp_header(struct sk_buff *skb, unsigned int type) static u8 *mv88e6xxx_should_tstamp(struct mv88e6xxx_chip *chip, int port, struct sk_buff *skb, unsigned int type) { - struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port]; + struct mv88e6xxx_port_hwtstamp *ps; u8 *hdr; if (!chip->info->ptp_support) @@ -262,6 +265,7 @@ static u8 *mv88e6xxx_should_tstamp(struct mv88e6xxx_chip *chip, int port, if (!hdr) return NULL; + ps = &chip->port_hwtstamp[port]; if (!test_bit(MV88E6XXX_HWTSTAMP_ENABLED, &ps->state)) return NULL; -- 2.7.4
[PATCH iproute2-next] ip: Use single variable to represent -pretty
After commit a233caa0aaee ("json: make pretty printing optional") I get following build failure: LINK rtmon ../lib/libutil.a(json_print.o): In function `new_json_obj': json_print.c:(.text+0x35): undefined reference to `show_pretty' collect2: error: ld returned 1 exit status make[1]: *** [rtmon] Error 1 make: *** [all] Error 2 It is caused by missing show_pretty variable in rtmon. On the other hand tc/tc.c there are two distinct variables and single matches() call that handles -pretty option thus setting show_pretty will never happen. Note that since commit 44dcfe820185 ("Change formatting of u32 back to default") show_pretty is used in tc/f_u32.c so this is first place where -pretty introduced. Furthermore other utilities like misc/ifstat.c and misc/nstat.c define pretty variable, however only for their own purposes. They both support JSON output and thus depend show_pretty in new_json_obj(). Assuming above use common variable to represent -pretty option, define it in utils.c and declare in utils.h that is commonly used. Replace show_pretty with pretty. Fixes: a233caa0aaee ("json: make pretty printing optional") Signed-off-by: Serhey Popovych --- include/json_print.h |2 -- ip/ip.c |3 +-- lib/json_print.c |2 +- lib/utils.c |1 + misc/ifstat.c|1 - misc/nstat.c |1 - tc/f_u32.c |4 +--- tc/tc.c |6 +- 8 files changed, 5 insertions(+), 15 deletions(-) diff --git a/include/json_print.h b/include/json_print.h index 45a817c..2ca7830 100644 --- a/include/json_print.h +++ b/include/json_print.h @@ -15,8 +15,6 @@ #include "json_writer.h" #include "color.h" -extern int show_pretty; - json_writer_t *get_json_writer(void); /* diff --git a/ip/ip.c b/ip/ip.c index 233a9d7..f624884 100644 --- a/ip/ip.c +++ b/ip/ip.c @@ -31,7 +31,6 @@ int show_stats; int show_details; int oneline; int brief; -int show_pretty; int json; int timestamp; const char *_SL_; @@ -261,7 +260,7 @@ int main(int argc, char **argv) } else if (matches(opt, "-json") == 0) { ++json; } else if (matches(opt, "-pretty") == 0) { - ++show_pretty; + ++pretty; } else if (matches(opt, "-rcvbuf") == 0) { unsigned int size; diff --git a/lib/json_print.c b/lib/json_print.c index b507b14..bda7293 100644 --- a/lib/json_print.c +++ b/lib/json_print.c @@ -28,7 +28,7 @@ void new_json_obj(int json) perror("json object"); exit(1); } - if (show_pretty) + if (pretty) jsonw_pretty(_jw, true); jsonw_start_array(_jw); } diff --git a/lib/utils.c b/lib/utils.c index d86c2ee..cbe5d8d 100644 --- a/lib/utils.c +++ b/lib/utils.c @@ -37,6 +37,7 @@ int resolve_hosts; int timestamp_short; +int pretty; int read_prop(const char *dev, char *prop, long *value) { diff --git a/misc/ifstat.c b/misc/ifstat.c index ac3eff6..50b906e 100644 --- a/misc/ifstat.c +++ b/misc/ifstat.c @@ -45,7 +45,6 @@ int no_update; int scan_interval; int time_constant; int show_errors; -int pretty; double W; char **patterns; int npatterns; diff --git a/misc/nstat.c b/misc/nstat.c index a4dd405..ffa14b1 100644 --- a/misc/nstat.c +++ b/misc/nstat.c @@ -37,7 +37,6 @@ int reset_history; int ignore_history; int no_output; int json_output; -int pretty; int no_update; int scan_interval; int time_constant; diff --git a/tc/f_u32.c b/tc/f_u32.c index 019d56c..b6064cd 100644 --- a/tc/f_u32.c +++ b/tc/f_u32.c @@ -25,8 +25,6 @@ #include "utils.h" #include "tc_util.h" -extern int show_pretty; - static void explain(void) { fprintf(stderr, @@ -965,7 +963,7 @@ static void show_keys(FILE *f, const struct tc_u32_key *key) { int i = 0; - if (!show_pretty) + if (!pretty) goto show_k; for (i = 0; i < ARRAY_SIZE(u32_pprinters); i++) { diff --git a/tc/tc.c b/tc/tc.c index aba5c10..cfccf87 100644 --- a/tc/tc.c +++ b/tc/tc.c @@ -33,7 +33,6 @@ int show_stats; int show_details; int show_raw; -int show_pretty; int show_graph; int timestamp; @@ -42,7 +41,6 @@ int use_iec; int force; bool use_names; int json; -int pretty; static char *conf_file; @@ -449,7 +447,7 @@ int main(int argc, char **argv) } else if (matches(argv[1], "-raw") == 0) { ++show_raw; } else if (matches(argv[1], "-pretty") == 0) { - ++show_pretty; + ++pretty; } else if (matches(argv[1], "-graph") == 0) { show_graph = 1; } else if (matches(argv[1], "-Version") == 0) { @@ -485,8 +483,6 @@ int main(int argc, char **argv) ++timestamp_short; } else if (matches(argv[1]
[PATCH iproute2-next v4 8/9] utils: Introduce and use print_name_and_link() to print name@link
There is at least three places implementing same things: two in ipaddress.c print_linkinfo() & print_linkinfo_brief() and one in bridge/link.c. They are diverge from each other very little: bridge/link.c does not support JSON output at the moment and print_linkinfo_brief() does not handle IFLA_LINK_NETNS case. Introduce and use print_name_and_link() routine to handle name@link output in all possible variations; respect IFLA_LINK_NETNS attribute to handle case when link is in different namespace; use ll_idx_n2a() for interface name instead of "" to share logic with other code (e.g. ll_name_to_index() and ll_index_to_name()) supporting such template. Signed-off-by: Serhey Popovych --- bridge/link.c | 13 +++-- include/utils.h |4 ip/ipaddress.c | 44 ++-- lib/utils.c | 49 + 4 files changed, 58 insertions(+), 52 deletions(-) diff --git a/bridge/link.c b/bridge/link.c index a11cbb1..90c9734 100644 --- a/bridge/link.c +++ b/bridge/link.c @@ -125,20 +125,13 @@ int print_linkinfo(const struct sockaddr_nl *who, if (n->nlmsg_type == RTM_DELLINK) fprintf(fp, "Deleted "); - fprintf(fp, "%d: %s ", ifi->ifi_index, - tb[IFLA_IFNAME] ? rta_getattr_str(tb[IFLA_IFNAME]) : ""); + fprintf(fp, "%d: ", ifi->ifi_index); + + print_name_and_link("%s: ", COLOR_NONE, name, tb); if (tb[IFLA_OPERSTATE]) print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); - if (tb[IFLA_LINK]) { - int iflink = rta_getattr_u32(tb[IFLA_LINK]); - - fprintf(fp, "@%s: ", - iflink ? ll_index_to_name(iflink) : "NONE"); - } else - fprintf(fp, ": "); - print_link_flags(fp, ifi->ifi_flags); if (tb[IFLA_MTU]) diff --git a/include/utils.h b/include/utils.h index 84ca873..75ddb4a 100644 --- a/include/utils.h +++ b/include/utils.h @@ -12,6 +12,7 @@ #include "libnetlink.h" #include "ll_map.h" #include "rtm_map.h" +#include "json_print.h" extern int preferred_family; extern int human_readable; @@ -250,6 +251,9 @@ void print_escape_buf(const __u8 *buf, size_t len, const char *escape); int print_timestamp(FILE *fp); void print_nlmsg_timestamp(FILE *fp, const struct nlmsghdr *n); +unsigned int print_name_and_link(const char *fmt, enum color_attr color, +const char *name, struct rtattr *tb[]); + #define BIT(nr) (1UL << (nr)) #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) diff --git a/ip/ipaddress.c b/ip/ipaddress.c index 08d2576..90cb5e5 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -760,7 +760,6 @@ int print_linkinfo_brief(const struct sockaddr_nl *who, struct rtattr *tb[IFLA_MAX+1]; int len = n->nlmsg_len; const char *name; - char buf[32] = { 0, }; unsigned int m_flag = 0; if (n->nlmsg_type != RTM_NEWLINK && n->nlmsg_type != RTM_DELLINK) @@ -810,25 +809,7 @@ int print_linkinfo_brief(const struct sockaddr_nl *who, if (n->nlmsg_type == RTM_DELLINK) print_bool(PRINT_ANY, "deleted", "Deleted ", true); - if (tb[IFLA_LINK]) { - int iflink = rta_getattr_u32(tb[IFLA_LINK]); - - if (iflink == 0) { - snprintf(buf, sizeof(buf), "%s@NONE", name); - print_null(PRINT_JSON, "link", NULL, NULL); - } else { - const char *link = ll_index_to_name(iflink); - - print_string(PRINT_JSON, "link", NULL, link); - snprintf(buf, sizeof(buf), "%s@%s", name, link); - m_flag = ll_index_to_flags(iflink); - m_flag = !(m_flag & IFF_UP); - } - } else - snprintf(buf, sizeof(buf), "%s", name); - - print_string(PRINT_FP, NULL, "%-16s ", buf); - print_string(PRINT_JSON, "ifname", NULL, name); + m_flag = print_name_and_link("%-16s ", COLOR_NONE, name, tb); if (tb[IFLA_OPERSTATE]) print_operstate(fp, rta_getattr_u8(tb[IFLA_OPERSTATE])); @@ -936,29 +917,8 @@ int print_linkinfo(const struct sockaddr_nl *who, print_bool(PRINT_ANY, "deleted", "Deleted ", true); print_int(PRINT_ANY, "ifindex", "%d: ", ifi->ifi_index); - print_color_string(PRINT_ANY, COLOR_IFNAME, "ifname", "%s", name); - - if (tb[IFLA_LINK]) { - int iflink = rta_getattr_u32(tb[IFLA_LINK]); - if (iflink == 0) - print_null(PRINT_ANY, "link", "@%s: ", "NONE"); - else { - if (tb[IFLA_LINK_NETNSID]) - print_int(PRINT_ANY, - "link_index", "@if%d: ", iflink); - else { - print_string(PRINT
[PATCH iproute2-next v4 4/9] ipaddress: Improve print_linkinfo()
There are few places to improve: 1) return -1 when entry is filtered instead of zero, which means accept entry: ipaddress_list_flush_or_save() the only user of this 2) use ll_idx_n2a() as last resort to translate name to index for "should never happen" cases when cache shouldn't be considered 3) replace open coded access to IFLA_IFNAME attribute data by RTA_DATA() with rta_getattr_str() 4) simplify ifname printing since name is never NULL, thanks to (2). Signed-off-by: Serhey Popovych --- ip/ipaddress.c | 30 +- 1 file changed, 13 insertions(+), 17 deletions(-) diff --git a/ip/ipaddress.c b/ip/ipaddress.c index 0daba8c..6eac370 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -778,14 +778,14 @@ int print_linkinfo_brief(const struct sockaddr_nl *who, parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), len); if (tb[IFLA_IFNAME] == NULL) { fprintf(stderr, "BUG: device with ifindex %d has nil ifname\n", ifi->ifi_index); - name = ""; + name = ll_idx_n2a(ifi->ifi_index); } else { name = rta_getattr_str(tb[IFLA_IFNAME]); } if (filter.label && (!filter.family || filter.family == AF_PACKET) && - fnmatch(filter.label, RTA_DATA(tb[IFLA_IFNAME]), 0)) + fnmatch(filter.label, name, 0)) return -1; if (tb[IFLA_GROUP]) { @@ -887,6 +887,7 @@ int print_linkinfo(const struct sockaddr_nl *who, struct ifinfomsg *ifi = NLMSG_DATA(n); struct rtattr *tb[IFLA_MAX+1]; int len = n->nlmsg_len; + const char *name; unsigned int m_flag = 0; if (n->nlmsg_type != RTM_NEWLINK && n->nlmsg_type != RTM_DELLINK) @@ -897,18 +898,22 @@ int print_linkinfo(const struct sockaddr_nl *who, return -1; if (filter.ifindex && ifi->ifi_index != filter.ifindex) - return 0; + return -1; if (filter.up && !(ifi->ifi_flags&IFF_UP)) - return 0; + return -1; parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), len); - if (tb[IFLA_IFNAME] == NULL) + if (tb[IFLA_IFNAME] == NULL) { fprintf(stderr, "BUG: device with ifindex %d has nil ifname\n", ifi->ifi_index); + name = ll_idx_n2a(ifi->ifi_index); + } else { + name = rta_getattr_str(tb[IFLA_IFNAME]); + } if (filter.label && (!filter.family || filter.family == AF_PACKET) && - fnmatch(filter.label, RTA_DATA(tb[IFLA_IFNAME]), 0)) - return 0; + fnmatch(filter.label, name, 0)) + return -1; if (tb[IFLA_GROUP]) { int group = rta_getattr_u32(tb[IFLA_GROUP]); @@ -935,16 +940,7 @@ int print_linkinfo(const struct sockaddr_nl *who, print_bool(PRINT_ANY, "deleted", "Deleted ", true); print_int(PRINT_ANY, "ifindex", "%d: ", ifi->ifi_index); - if (tb[IFLA_IFNAME]) { - print_color_string(PRINT_ANY, - COLOR_IFNAME, - "ifname", "%s", - rta_getattr_str(tb[IFLA_IFNAME])); - } else { - print_null(PRINT_JSON, "ifname", NULL, NULL); - print_color_null(PRINT_FP, COLOR_IFNAME, -"ifname", "%s", ""); - } + print_color_string(PRINT_ANY, COLOR_IFNAME, "ifname", "%s", name); if (tb[IFLA_LINK]) { int iflink = rta_getattr_u32(tb[IFLA_LINK]); -- 1.7.10.4
[PATCH iproute2-next v4 9/9] ipaddress: Make print_linkinfo_brief() static
It shares lot of code with print_linkinfo(): drop duplicated part, change parameters list, make it static and call from print_linkinfo() after common path. While there move SPRINT_BUF() to the function scope from blocks to avoid duplication and use "%s" to print "\n" to help compiler optimize exit for both print_linkinfo_brief() and normal paths. Signed-off-by: Serhey Popovych --- ip/ip_common.h |2 -- ip/ipaddress.c | 76 ip/iplink.c|5 +--- 3 files changed, 11 insertions(+), 72 deletions(-) diff --git a/ip/ip_common.h b/ip/ip_common.h index 1397d99..e4e628b 100644 --- a/ip/ip_common.h +++ b/ip/ip_common.h @@ -29,8 +29,6 @@ struct link_filter { int get_operstate(const char *name); int print_linkinfo(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg); -int print_linkinfo_brief(const struct sockaddr_nl *who, -struct nlmsghdr *n, void *arg); int print_addrinfo(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg); int print_addrlabel(const struct sockaddr_nl *who, diff --git a/ip/ipaddress.c b/ip/ipaddress.c index 90cb5e5..1380453 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -752,63 +752,12 @@ static void print_link_stats(FILE *fp, struct nlmsghdr *n) fprintf(fp, "%s", _SL_); } -int print_linkinfo_brief(const struct sockaddr_nl *who, -struct nlmsghdr *n, void *arg) +static int print_linkinfo_brief(FILE *fp, const char *name, + const struct ifinfomsg *ifi, + struct rtattr *tb[]) { - FILE *fp = (FILE *)arg; - struct ifinfomsg *ifi = NLMSG_DATA(n); - struct rtattr *tb[IFLA_MAX+1]; - int len = n->nlmsg_len; - const char *name; unsigned int m_flag = 0; - if (n->nlmsg_type != RTM_NEWLINK && n->nlmsg_type != RTM_DELLINK) - return -1; - - len -= NLMSG_LENGTH(sizeof(*ifi)); - if (len < 0) - return -1; - - if (filter.ifindex && ifi->ifi_index != filter.ifindex) - return -1; - if (filter.up && !(ifi->ifi_flags&IFF_UP)) - return -1; - - parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), len); - - name = get_ifname_rta(ifi->ifi_index, tb[IFLA_IFNAME]); - if (!name) - return -1; - - if (filter.label && - (!filter.family || filter.family == AF_PACKET) && - fnmatch(filter.label, name, 0)) - return -1; - - if (tb[IFLA_GROUP]) { - int group = rta_getattr_u32(tb[IFLA_GROUP]); - - if (filter.group != -1 && group != filter.group) - return -1; - } - - if (tb[IFLA_MASTER]) { - int master = rta_getattr_u32(tb[IFLA_MASTER]); - - if (filter.master > 0 && master != filter.master) - return -1; - } else if (filter.master > 0) - return -1; - - if (filter.kind && match_link_kind(tb, filter.kind, 0)) - return -1; - - if (filter.slave_kind && match_link_kind(tb, filter.slave_kind, 1)) - return -1; - - if (n->nlmsg_type == RTM_DELLINK) - print_bool(PRINT_ANY, "deleted", "Deleted ", true); - m_flag = print_name_and_link("%-16s ", COLOR_NONE, name, tb); if (tb[IFLA_OPERSTATE]) @@ -868,6 +817,7 @@ int print_linkinfo(const struct sockaddr_nl *who, int len = n->nlmsg_len; const char *name; unsigned int m_flag = 0; + SPRINT_BUF(b1); if (n->nlmsg_type != RTM_NEWLINK && n->nlmsg_type != RTM_DELLINK) return 0; @@ -916,6 +866,9 @@ int print_linkinfo(const struct sockaddr_nl *who, if (n->nlmsg_type == RTM_DELLINK) print_bool(PRINT_ANY, "deleted", "Deleted ", true); + if (brief) + return print_linkinfo_brief(fp, name, ifi, tb); + print_int(PRINT_ANY, "ifindex", "%d: ", ifi->ifi_index); m_flag = print_name_and_link("%s: ", COLOR_IFNAME, name, tb); @@ -949,7 +902,6 @@ int print_linkinfo(const struct sockaddr_nl *who, print_linkmode(fp, tb[IFLA_LINKMODE]); if (tb[IFLA_GROUP]) { - SPRINT_BUF(b1); int group = rta_getattr_u32(tb[IFLA_GROUP]); print_string(PRINT_ANY, @@ -965,8 +917,6 @@ int print_linkinfo(const struct sockaddr_nl *who, print_link_event(fp, rta_getattr_u32(tb[IFLA_EVENT])); if (!filter.family || filter.family == AF_PACKET || show_details) { - SPRINT_BUF(b1); - print_string(PRINT_FP, NULL, "%s", _SL_); print_string(PRINT_ANY, "link_type", @@ -1066,7 +1016,6 @@ int print_linkinfo(const struct sockaddr_nl *who, rta_getattr_str(tb[IFLA_PHYS_PORT_NAME]));
[PATCH] net: dsa: mv88e6xxx: hwtstamp: fix potential negative array index read
_port_ is being used as index to array port_hwtstamp before verifying it is a non-negative number and a valid index at 209: if (port < 0 || port >= mv88e6xxx_num_ports(chip)) Fix this by checking _port_ before using it as index to array port_hwtstamp. Addresses-Coverity-ID: 1465287 ("Negative array index read") Fixes: c6fe0ad2c349 ("net: dsa: mv88e6xxx: add rx/tx timestamping support") Signed-off-by: Gustavo A. R. Silva --- drivers/net/dsa/mv88e6xxx/hwtstamp.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.c b/drivers/net/dsa/mv88e6xxx/hwtstamp.c index b251d53..761decc 100644 --- a/drivers/net/dsa/mv88e6xxx/hwtstamp.c +++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.c @@ -200,8 +200,8 @@ int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port, struct ifreq *ifr) { struct mv88e6xxx_chip *chip = ds->priv; - struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port]; - struct hwtstamp_config *config = &ps->tstamp_config; + struct mv88e6xxx_port_hwtstamp *ps; + struct hwtstamp_config *config; if (!chip->info->ptp_support) return -EOPNOTSUPP; @@ -209,6 +209,9 @@ int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port, if (port < 0 || port >= mv88e6xxx_num_ports(chip)) return -EINVAL; + ps = &chip->port_hwtstamp[port]; + config = &ps->tstamp_config; + return copy_to_user(ifr->ifr_data, config, sizeof(*config)) ? -EFAULT : 0; } -- 2.7.4
[PATCH iproute2-next v4 7/9] utils: Introduce and use get_ifname_rta()
Be consistent in handling of IFLA_IFNAME attribute in all places: if there is no attribute report bug to stderr and use ll_idx_n2a() as last measure to get name in "if%u" format instead of "". Use check_ifname() to validate network device name: this catches both unexpected return from kernel and ll_idx_n2a(). Signed-off-by: Serhey Popovych --- bridge/link.c |8 include/utils.h |1 + ip/ipaddress.c | 20 lib/utils.c | 19 +++ 4 files changed, 32 insertions(+), 16 deletions(-) diff --git a/bridge/link.c b/bridge/link.c index 870ebe0..a11cbb1 100644 --- a/bridge/link.c +++ b/bridge/link.c @@ -99,9 +99,10 @@ int print_linkinfo(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg) { FILE *fp = arg; - int len = n->nlmsg_len; struct ifinfomsg *ifi = NLMSG_DATA(n); struct rtattr *tb[IFLA_MAX+1]; + int len = n->nlmsg_len; + const char *name; len -= NLMSG_LENGTH(sizeof(*ifi)); if (len < 0) { @@ -117,10 +118,9 @@ int print_linkinfo(const struct sockaddr_nl *who, parse_rtattr_flags(tb, IFLA_MAX, IFLA_RTA(ifi), len, NLA_F_NESTED); - if (tb[IFLA_IFNAME] == NULL) { - fprintf(stderr, "BUG: nil ifname\n"); + name = get_ifname_rta(ifi->ifi_index, tb[IFLA_IFNAME]); + if (!name) return -1; - } if (n->nlmsg_type == RTM_DELLINK) fprintf(fp, "Deleted "); diff --git a/include/utils.h b/include/utils.h index 867e540..84ca873 100644 --- a/include/utils.h +++ b/include/utils.h @@ -183,6 +183,7 @@ void duparg(const char *, const char *) __attribute__((noreturn)); void duparg2(const char *, const char *) __attribute__((noreturn)); int check_ifname(const char *); int get_ifname(char *, const char *); +const char *get_ifname_rta(int ifindex, const struct rtattr *rta); int matches(const char *arg, const char *pattern); int inet_addr_match(const inet_prefix *a, const inet_prefix *b, int bits); int inet_addr_match_rta(const inet_prefix *m, const struct rtattr *rta); diff --git a/ip/ipaddress.c b/ip/ipaddress.c index 749178d..08d2576 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -776,12 +776,10 @@ int print_linkinfo_brief(const struct sockaddr_nl *who, return -1; parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), len); - if (tb[IFLA_IFNAME] == NULL) { - fprintf(stderr, "BUG: device with ifindex %d has nil ifname\n", ifi->ifi_index); - name = ll_idx_n2a(ifi->ifi_index); - } else { - name = rta_getattr_str(tb[IFLA_IFNAME]); - } + + name = get_ifname_rta(ifi->ifi_index, tb[IFLA_IFNAME]); + if (!name) + return -1; if (filter.label && (!filter.family || filter.family == AF_PACKET) && @@ -903,12 +901,10 @@ int print_linkinfo(const struct sockaddr_nl *who, return -1; parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), len); - if (tb[IFLA_IFNAME] == NULL) { - fprintf(stderr, "BUG: device with ifindex %d has nil ifname\n", ifi->ifi_index); - name = ll_idx_n2a(ifi->ifi_index); - } else { - name = rta_getattr_str(tb[IFLA_IFNAME]); - } + + name = get_ifname_rta(ifi->ifi_index, tb[IFLA_IFNAME]); + if (!name) + return -1; if (filter.label && (!filter.family || filter.family == AF_PACKET) && diff --git a/lib/utils.c b/lib/utils.c index d86c2ee..572d42a 100644 --- a/lib/utils.c +++ b/lib/utils.c @@ -871,6 +871,25 @@ int get_ifname(char *buf, const char *name) return ret; } +const char *get_ifname_rta(int ifindex, const struct rtattr *rta) +{ + const char *name; + + if (rta) { + name = rta_getattr_str(rta); + } else { + fprintf(stderr, + "BUG: device with ifindex %d has nil ifname\n", + ifindex); + name = ll_idx_n2a(ifindex); + } + + if (check_ifname(name)) + return NULL; + + return name; +} + int matches(const char *cmd, const char *pattern) { int len = strlen(cmd); -- 1.7.10.4
[PATCH iproute2-next v4 1/9] ipaddress: Abstract IFA_LABEL matching code
There at least two places in ip/ipaddress.c where we match IFA_LABEL against filter.label if that is given. Get rid of "common" if () statement for inet_addr_match_rta() and ifa_label_match_rta(): it is not common because first will check for filter.pfx.family != AF_UNSPEC inside and second for filter.label being non NULL. This allows us to further simplify down code and prepare for ll_idx_n2a() replacement with ll_index_to_name() without 80 columns checkpatch notice. Signed-off-by: Serhey Popovych --- ip/ipaddress.c | 57 +++- 1 file changed, 27 insertions(+), 30 deletions(-) diff --git a/ip/ipaddress.c b/ip/ipaddress.c index 6990b81..ad69d09 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -1305,6 +1305,22 @@ static int get_filter(const char *arg) return 0; } +static int ifa_label_match_rta(int ifindex, const struct rtattr *rta) +{ + const char *label; + SPRINT_BUF(b1); + + if (!filter.label) + return 0; + + if (rta) + label = RTA_DATA(rta); + else + label = ll_idx_n2a(ifindex, b1); + + return fnmatch(filter.label, label, 0); +} + int print_addrinfo(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg) { @@ -1343,21 +1359,13 @@ int print_addrinfo(const struct sockaddr_nl *who, struct nlmsghdr *n, return 0; if ((filter.flags ^ ifa_flags) & filter.flagmask) return 0; - if (filter.label) { - SPRINT_BUF(b1); - const char *label; - - if (rta_tb[IFA_LABEL]) - label = RTA_DATA(rta_tb[IFA_LABEL]); - else - label = ll_idx_n2a(ifa->ifa_index, b1); - if (fnmatch(filter.label, label, 0) != 0) - return 0; - } if (filter.family && filter.family != ifa->ifa_family) return 0; + if (ifa_label_match_rta(ifa->ifa_index, rta_tb[IFA_LABEL])) + return 0; + if (inet_addr_match_rta(&filter.pfx, rta_tb[IFA_LOCAL])) return 0; @@ -1713,25 +1721,14 @@ static void ipaddr_filter(struct nlmsg_chain *linfo, struct nlmsg_chain *ainfo) if ((filter.flags ^ ifa_flags) & filter.flagmask) continue; - if (filter.pfx.family || filter.label) { - struct rtattr *rta = - tb[IFA_LOCAL] ? : tb[IFA_ADDRESS]; - - if (inet_addr_match_rta(&filter.pfx, rta)) - continue; - - if (filter.label) { - SPRINT_BUF(b1); - const char *label; - - if (tb[IFA_LABEL]) - label = RTA_DATA(tb[IFA_LABEL]); - else - label = ll_idx_n2a(ifa->ifa_index, b1); - if (fnmatch(filter.label, label, 0) != 0) - continue; - } - } + + if (ifa_label_match_rta(ifa->ifa_index, tb[IFA_LABEL])) + continue; + + if (!tb[IFA_LOCAL]) + tb[IFA_LOCAL] = tb[IFA_ADDRESS]; + if (inet_addr_match_rta(&filter.pfx, tb[IFA_LOCAL])) + continue; ok = 1; break; -- 1.7.10.4
[PATCH iproute2-next v4 3/9] utils: Reimplement ll_idx_n2a() and introduce ll_idx_a2n()
Now all users of ll_idx_n2a() replaced with ll_index_to_name() we can move it's functionality to ll_index_to_name() and implement index to name conversion using snprintf() and "if%u". Use %u specifier in "if%..." template consistently: network device indexes are always greather than zero. Also introduce ll_idx_n2a() conterpart: ll_idx_a2n() that is used to translate name of the "if%u" form to index using sscanf(). Signed-off-by: Serhey Popovych --- include/ll_map.h |4 +++- lib/ll_map.c | 31 +-- 2 files changed, 24 insertions(+), 11 deletions(-) diff --git a/include/ll_map.h b/include/ll_map.h index c8474e6..8546ff9 100644 --- a/include/ll_map.h +++ b/include/ll_map.h @@ -8,9 +8,11 @@ int ll_remember_index(const struct sockaddr_nl *who, void ll_init_map(struct rtnl_handle *rth); unsigned ll_name_to_index(const char *name); const char *ll_index_to_name(unsigned idx); -const char *ll_idx_n2a(unsigned idx, char *buf); int ll_index_to_type(unsigned idx); int ll_index_to_flags(unsigned idx); unsigned namehash(const char *str); +const char *ll_idx_n2a(unsigned int idx); +unsigned int ll_idx_a2n(const char *name); + #endif /* __LL_MAP_H__ */ diff --git a/lib/ll_map.c b/lib/ll_map.c index f65614f..0afe689 100644 --- a/lib/ll_map.c +++ b/lib/ll_map.c @@ -136,8 +136,26 @@ int ll_remember_index(const struct sockaddr_nl *who, return 0; } -const char *ll_idx_n2a(unsigned idx, char *buf) +const char *ll_idx_n2a(unsigned int idx) { + static char buf[IFNAMSIZ]; + + snprintf(buf, sizeof(buf), "if%u", idx); + return buf; +} + +unsigned int ll_idx_a2n(const char *name) +{ + unsigned int idx; + + if (sscanf(name, "if%u", &idx) != 1) + return 0; + return idx; +} + +const char *ll_index_to_name(unsigned int idx) +{ + static char buf[IFNAMSIZ]; const struct ll_cache *im; if (idx == 0) @@ -148,18 +166,11 @@ const char *ll_idx_n2a(unsigned idx, char *buf) return im->name; if (if_indextoname(idx, buf) == NULL) - snprintf(buf, IFNAMSIZ, "if%d", idx); + snprintf(buf, IFNAMSIZ, "if%u", idx); return buf; } -const char *ll_index_to_name(unsigned idx) -{ - static char nbuf[IFNAMSIZ]; - - return ll_idx_n2a(idx, nbuf); -} - int ll_index_to_type(unsigned idx) { const struct ll_cache *im; @@ -196,7 +207,7 @@ unsigned ll_name_to_index(const char *name) idx = if_nametoindex(name); if (idx == 0) - sscanf(name, "if%u", &idx); + idx = ll_idx_a2n(name); return idx; } -- 1.7.10.4
[PATCH iproute2-next v4 5/9] ipaddress: Simplify print_linkinfo_brief() and it's usage
Simplify calling code in ipaddr_list_flush_or_save() by introducing intermediate variable of @struct nlmsghdr, drop duplicated code: print_linkinfo_brief() never returns values other than <= 0 so we can move print_selected_addrinfo() outside of each block. Signed-off-by: Serhey Popovych --- ip/ipaddress.c | 31 ++- 1 file changed, 14 insertions(+), 17 deletions(-) diff --git a/ip/ipaddress.c b/ip/ipaddress.c index 6eac370..749178d 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -753,7 +753,7 @@ static void print_link_stats(FILE *fp, struct nlmsghdr *n) } int print_linkinfo_brief(const struct sockaddr_nl *who, - struct nlmsghdr *n, void *arg) +struct nlmsghdr *n, void *arg) { FILE *fp = (FILE *)arg; struct ifinfomsg *ifi = NLMSG_DATA(n); @@ -2013,24 +2013,21 @@ static int ipaddr_list_flush_or_save(int argc, char **argv, int action) ipaddr_filter(&linfo, ainfo); for (l = linfo.head; l; l = l->next) { - int res = 0; - struct ifinfomsg *ifi = NLMSG_DATA(&l->h); + struct nlmsghdr *n = &l->h; + struct ifinfomsg *ifi = NLMSG_DATA(n); + int res; open_json_object(NULL); - if (brief) { - if (print_linkinfo_brief(NULL, &l->h, stdout) == 0) - if (filter.family != AF_PACKET) - print_selected_addrinfo(ifi, - ainfo->head, - stdout); - } else if (no_link || - (res = print_linkinfo(NULL, &l->h, stdout)) >= 0) { - if (filter.family != AF_PACKET) - print_selected_addrinfo(ifi, - ainfo->head, stdout); - if (res > 0 && !do_link && show_stats) - print_link_stats(stdout, &l->h); - } + if (brief) + res = print_linkinfo_brief(NULL, n, stdout); + else if (no_link) + res = 0; + else + res = print_linkinfo(NULL, n, stdout); + if (res >= 0 && filter.family != AF_PACKET) + print_selected_addrinfo(ifi, ainfo->head, stdout); + if (res > 0 && !do_link && show_stats) + print_link_stats(stdout, n); close_json_object(); } fflush(stdout); -- 1.7.10.4
[PATCH iproute2-next v4 6/9] lib: Correct object file dependencies
Neither internal libnetlink nor libgenl depends on ll_map.o: prepare for upcoming changes that brings much more cleaner dependency between utils.o and ll_map.o. Signed-off-by: Serhey Popovych --- lib/Makefile |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/Makefile b/lib/Makefile index 7b34ed5..bab8cbf 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -3,11 +3,11 @@ include ../config.mk CFLAGS += -fPIC -UTILOBJ = utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o \ +UTILOBJ = utils.o rt_names.o ll_map.o ll_types.o ll_proto.o ll_addr.o \ inet_proto.o namespace.o json_writer.o json_print.o \ names.o color.o bpf.o exec.o fs.o -NLOBJ=libgenl.o ll_map.o libnetlink.o +NLOBJ=libgenl.o libnetlink.o all: libnetlink.a libutil.a -- 1.7.10.4
[PATCH iproute2-next v4 2/9] ipaddress: ll_map: Replace ll_idx_n2a() with ll_index_to_name()
There is no reentrancy as well as deferred result usage for all cases where ll_idx_n2a() being used: it is safe to use ll_index_to_name() that internally calls ll_idx_n2a() with static buffer to hold result. While there print master network device name using correct color. Signed-off-by: Serhey Popovych --- ip/ipaddress.c | 21 + 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/ip/ipaddress.c b/ip/ipaddress.c index ad69d09..0daba8c 100644 --- a/ip/ipaddress.c +++ b/ip/ipaddress.c @@ -813,14 +813,13 @@ int print_linkinfo_brief(const struct sockaddr_nl *who, print_bool(PRINT_ANY, "deleted", "Deleted ", true); if (tb[IFLA_LINK]) { - SPRINT_BUF(b1); int iflink = rta_getattr_u32(tb[IFLA_LINK]); if (iflink == 0) { snprintf(buf, sizeof(buf), "%s@NONE", name); print_null(PRINT_JSON, "link", NULL, NULL); } else { - const char *link = ll_idx_n2a(iflink, b1); + const char *link = ll_index_to_name(iflink); print_string(PRINT_JSON, "link", NULL, link); snprintf(buf, sizeof(buf), "%s@%s", name, link); @@ -957,12 +956,10 @@ int print_linkinfo(const struct sockaddr_nl *who, print_int(PRINT_ANY, "link_index", "@if%d: ", iflink); else { - SPRINT_BUF(b1); - print_string(PRINT_ANY, "link", "@%s: ", -ll_idx_n2a(iflink, b1)); +ll_index_to_name(iflink)); m_flag = ll_index_to_flags(iflink); m_flag = !(m_flag & IFF_UP); } @@ -984,12 +981,13 @@ int print_linkinfo(const struct sockaddr_nl *who, "qdisc %s ", rta_getattr_str(tb[IFLA_QDISC])); if (tb[IFLA_MASTER]) { - SPRINT_BUF(b1); + int master = rta_getattr_u32(tb[IFLA_MASTER]); - print_string(PRINT_ANY, -"master", -"master %s ", -ll_idx_n2a(rta_getattr_u32(tb[IFLA_MASTER]), b1)); + print_color_string(PRINT_ANY, + COLOR_IFNAME, + "master", + "master %s ", + ll_index_to_name(master)); } if (tb[IFLA_OPERSTATE]) @@ -1308,7 +1306,6 @@ static int get_filter(const char *arg) static int ifa_label_match_rta(int ifindex, const struct rtattr *rta) { const char *label; - SPRINT_BUF(b1); if (!filter.label) return 0; @@ -1316,7 +1313,7 @@ static int ifa_label_match_rta(int ifindex, const struct rtattr *rta) if (rta) label = RTA_DATA(rta); else - label = ll_idx_n2a(ifindex, b1); + label = ll_index_to_name(ifindex); return fnmatch(filter.label, label, 0); } -- 1.7.10.4
[PATCH iproute2-next v4 0/9] ipaddress: Make print_linkinfo_brief() static
With this series I propose to make print_linkinfo_brief() static in favor of print_linkinfo() as single point for linkinfo printing. Changes presented with this series tested using following script: \#!/bin/bash iproute2_dir="$1" iface='eth0.2' pushd "$iproute2_dir" &>/dev/null for i in new old; do DIR="/tmp/$i" mkdir -p "$DIR" ln -snf ip.$i ip/ip # normal ip/ip link show >"$DIR/ip-link-show" ip/ip -4 addr show >"$DIR/ip-4-addr-show" ip/ip -6 addr show >"$DIR/ip-6-addr-show" ip/ip addr show dev "$iface" >"$DIR/ip-addr-show-$iface" # brief ip/ip -br link show >"$DIR/ip-br-link-show" ip/ip -br -4 addr show >"$DIR/ip-br-4-addr-show" ip/ip -br -6 addr show >"$DIR/ip-br-6-addr-show" ip/ip -br addr show dev "$iface" >"$DIR/ip-br-addr-show-$iface" done rm -f ip/ip diff -urN /tmp/{old,new} |sed -n -Ee'/^(-{3}|\+{3})[[:space:]]+/!p' rc=$? popd &>/dev/null exit $rc Expected results : Actual results : Although test coverage is far from ideal in my opinion it covers most important aspects of the changes presented by the series. All this work is done in prepare of iplink_get() enhancements to support attribute parse that finally will be used to simplify ip/tunnel RTM_GETLINK code. As always reviews, comments, suggestions and criticism is welcome. v4 Print master network device name using correct color. v3 Fixed subject line. v2 Rebased to current iproute2-next/master. No changes. Thanks, Serhii Serhey Popovych (9): ipaddress: Abstract IFA_LABEL matching code ipaddress: ll_map: Replace ll_idx_n2a() with ll_index_to_name() utils: Reimplement ll_idx_n2a() and introduce ll_idx_a2n() ipaddress: Improve print_linkinfo() ipaddress: Simplify print_linkinfo_brief() and it's usage lib: Correct object file dependencies utils: Introduce and use get_ifname_rta() utils: Introduce and use print_name_and_link() to print name@link ipaddress: Make print_linkinfo_brief() static bridge/link.c| 21 ++--- include/ll_map.h |4 +- include/utils.h |5 ++ ip/ip_common.h |2 - ip/ipaddress.c | 231 ++ ip/iplink.c |5 +- lib/Makefile |4 +- lib/ll_map.c | 31 +--- lib/utils.c | 68 9 files changed, 166 insertions(+), 205 deletions(-) -- 1.7.10.4
Re: [PATCH iproute2-next] ip: Use single variable to represent -pretty
On Thu, 15 Feb 2018 20:31:33 +0200 Serhey Popovych wrote: > After commit a233caa0aaee ("json: make pretty printing optional") I get > following build failure: > > LINK rtmon > ../lib/libutil.a(json_print.o): In function `new_json_obj': > json_print.c:(.text+0x35): undefined reference to `show_pretty' > collect2: error: ld returned 1 exit status > make[1]: *** [rtmon] Error 1 > make: *** [all] Error 2 > > It is caused by missing show_pretty variable in rtmon. > > On the other hand tc/tc.c there are two distinct variables and single > matches() call that handles -pretty option thus setting show_pretty > will never happen. Note that since commit 44dcfe820185 ("Change > formatting of u32 back to default") show_pretty is used in tc/f_u32.c > so this is first place where -pretty introduced. > > Furthermore other utilities like misc/ifstat.c and misc/nstat.c define > pretty variable, however only for their own purposes. They both support > JSON output and thus depend show_pretty in new_json_obj(). > > Assuming above use common variable to represent -pretty option, define > it in utils.c and declare in utils.h that is commonly used. Replace > show_pretty with pretty. > > Fixes: a233caa0aaee ("json: make pretty printing optional") > Signed-off-by: Serhey Popovych > --- Looks Good to me. Not sure it did not show up in my builds. I am fixing bridge to use similar json/color/pretty flags.
Re: [PATCH iproute2-next v4 8/9] utils: Introduce and use print_name_and_link() to print name@link
On Thu, 15 Feb 2018 20:34:07 +0200 Serhey Popovych wrote: > + fprintf(fp, "%d: ", ifi->ifi_index); > + > + print_name_and_link("%s: ", COLOR_NONE, name, tb); This should be COLOR_IFNAME but I am working on that.
Re: [PATCH] i40evf/i40evf_main: Fix variable assignment in i40evf_parse_cls_flower
On 02/15/2018 12:11 PM, Jeff Kirsher wrote: On Thu, 2018-02-15 at 09:56 -0800, Ramamurthy, Harshitha wrote: On Thu, 2018-02-15 at 11:44 -0600, Gustavo A. R. Silva wrote: It seems this is a copy-paste error and that the proper variable to use in this particular case is _src_ instead of _dst_. Addresses-Coverity-ID: 1465282 ("Copy-paste error") Fixes: 0075fa0fadd0 ("i40evf: Add support to apply cloud filters") Signed-off-by: Gustavo A. R. Silva Thanks for the patch! Glad to help. :) Acked-by:Harshitha Ramamurthy Acked-by: Jeff Kirsher Dave, no need for me to pick this patch up and then push to you. You can circumvent me and pick this up. --- drivers/net/ethernet/intel/i40evf/i40evf_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Thanks guys. -- Gustavo
Re: [PATCH net] net: sched: fix unbalance in the error path of tca_action_flush()
On Thu, 2018-02-15 at 10:31 -0500, Jamal Hadi Salim wrote: > On 18-02-15 09:50 AM, Davide Caratti wrote: > > When tca_action_flush() calls the action walk() and gets an error, > > a successful call to nla_nest_start() is not followed by a call to > > nla_nest_cancel(). It's harmless, as the skb is freed in the error > > path - but it's worth to fix this unbalance. > > Kind of pushing the boundaries saying this targets net tree - there is > no bug you are fixing (as you say the sk is freed). > Maybe it makes the code prettier ... you are right, this fix is only cosmetic as long as the skb is no more used after walk() returns an error. It just triggered the doubt to me while I was reading the series that added support to ext_acks, and this is the followup. David, please apply to net-next (or drop it, at your convenience). thank you in advance, -- davide
[PATCH v3 net-next 1/7] skbuff: export mm_[un]account_pinned_pages for other modules
RDS would like to use the helper functions for managing pinned pages added by Commit a91dbff551a6 ("sock: ulimit on MSG_ZEROCOPY pages") Signed-off-by: Sowmini Varadhan --- include/linux/skbuff.h |3 +++ net/core/skbuff.c |6 -- 2 files changed, 7 insertions(+), 2 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 5ebc0f8..b1cc38a 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -466,6 +466,9 @@ struct ubuf_info { #define skb_uarg(SKB) ((struct ubuf_info *)(skb_shinfo(SKB)->destructor_arg)) +int mm_account_pinned_pages(struct mmpin *mmp, size_t size); +void mm_unaccount_pinned_pages(struct mmpin *mmp); + struct ubuf_info *sock_zerocopy_alloc(struct sock *sk, size_t size); struct ubuf_info *sock_zerocopy_realloc(struct sock *sk, size_t size, struct ubuf_info *uarg); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 09bd89c..1a7485a 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -890,7 +890,7 @@ struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src) } EXPORT_SYMBOL_GPL(skb_morph); -static int mm_account_pinned_pages(struct mmpin *mmp, size_t size) +int mm_account_pinned_pages(struct mmpin *mmp, size_t size) { unsigned long max_pg, num_pg, new_pg, old_pg; struct user_struct *user; @@ -919,14 +919,16 @@ static int mm_account_pinned_pages(struct mmpin *mmp, size_t size) return 0; } +EXPORT_SYMBOL_GPL(mm_account_pinned_pages); -static void mm_unaccount_pinned_pages(struct mmpin *mmp) +void mm_unaccount_pinned_pages(struct mmpin *mmp) { if (mmp->user) { atomic_long_sub(mmp->num_pg, &mmp->user->locked_vm); free_uid(mmp->user); } } +EXPORT_SYMBOL_GPL(mm_unaccount_pinned_pages); struct ubuf_info *sock_zerocopy_alloc(struct sock *sk, size_t size) { -- 1.7.1
[PATCH v3 net-next 3/7] sock: permit SO_ZEROCOPY on PF_RDS socket
allow the application to set SO_ZEROCOPY on the underlying sk of a PF_RDS socket Signed-off-by: Sowmini Varadhan --- net/core/sock.c | 25 ++--- 1 files changed, 14 insertions(+), 11 deletions(-) diff --git a/net/core/sock.c b/net/core/sock.c index e90d461..a1fa4a5 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1049,18 +1049,21 @@ int sock_setsockopt(struct socket *sock, int level, int optname, break; case SO_ZEROCOPY: - if (sk->sk_family != PF_INET && sk->sk_family != PF_INET6) + if (sk->sk_family == PF_INET || sk->sk_family == PF_INET6) { + if (sk->sk_protocol != IPPROTO_TCP) + ret = -ENOTSUPP; + else if (sk->sk_state != TCP_CLOSE) + ret = -EBUSY; + } else if (sk->sk_family != PF_RDS) { ret = -ENOTSUPP; - else if (sk->sk_protocol != IPPROTO_TCP) - ret = -ENOTSUPP; - else if (sk->sk_state != TCP_CLOSE) - ret = -EBUSY; - else if (val < 0 || val > 1) - ret = -EINVAL; - else - sock_valbool_flag(sk, SOCK_ZEROCOPY, valbool); - break; - + } + if (!ret) { + if (val < 0 || val > 1) + ret = -EINVAL; + else + sock_valbool_flag(sk, SOCK_ZEROCOPY, valbool); + break; + } default: ret = -ENOPROTOOPT; break; -- 1.7.1