Re: [PATCH] r8152: disable rx checksum offload on Dell TB dock
On Thu, Nov 23, 2017 at 01:38:38AM -0500, Kai-Heng Feng wrote:
> r8153 on Dell TB dock corrupts rx packets.
>
> The root cause is not found yet, but disabling rx checksumming can
> workaround the issue. We can use this connection to decide if it's
> a Dell TB dock:
> Realtek r8153 <-> SMSC hub <-> ASMedia XHCI controller
>
> BugLink: https://bugs.launchpad.net/bugs/1729674
> Cc: Mario Limonciello
> Signed-off-by: Kai-Heng Feng
> ---
>  drivers/net/usb/r8152.c | 33 -
>  1 file changed, 32 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
> index d51d9abf7986..58b80b5e7803 100644
> --- a/drivers/net/usb/r8152.c
> +++ b/drivers/net/usb/r8152.c
> @@ -27,6 +27,8 @@
>  #include
>  #include
>  #include
> +#include
> +#include
>
>  /* Information for net-next */
>  #define NETNEXT_VERSION "09"
> @@ -5135,6 +5137,35 @@ static u8 rtl_get_version(struct usb_interface *intf)
>  	return version;
>  }
>
> +/* Ethernet on Dell TB 15/16 dock is connected this way:
> + * Realtek r8153 <-> SMSC hub <-> ASMedia XHCI controller
> + * We use this connection to make sure r8153 is on the Dell TB dock.
> + */
> +static bool check_dell_tb_dock(struct usb_device *udev)
> +{
> +	struct usb_device *hub = udev->parent;
> +	struct usb_device *root_hub;
> +	struct pci_dev *controller;
> +
> +	if (!hub)
> +		return false;
> +
> +	if (!(le16_to_cpu(hub->descriptor.idVendor) == 0x0424 &&
> +	      le16_to_cpu(hub->descriptor.idProduct) == 0x5537))
> +		return false;
> +
> +	root_hub = hub->parent;
> +	if (!root_hub || root_hub->parent)
> +		return false;
> +
> +	controller = to_pci_dev(bus_to_hcd(root_hub->bus)->self.controller);

That's a very scary, and dangerous, cast. You can not ever be sure that
the hub really is a "root hub" like this.

> +	if (controller->vendor == 0x1b21 && controller->device == 0x1142)
> +		return true;

Why can't you just look at the USB device itself and go off of a quirk
in it? Something like a version or string or something else?

This sounds like a USB host controller issue, not a USB device issue, can't we fix the "real" problem here instead of this crazy work-around? Odds are any device plugged into the hub should have the same issue, right?

thanks,

greg k-h
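
One way to read the concern about the cast: `to_pci_dev()` is a `container_of()`-style downcast, so it is only meaningful when the generic device really is embedded in a PCI device. The user-space sketch below models that idea with hypothetical types (`demo_device`, `demo_pci_dev`) and a hypothetical bus tag; none of these are kernel APIs, and the tag merely stands in for whatever check a real driver would use before downcasting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's struct device / struct pci_dev. */
enum demo_bus { DEMO_BUS_PCI, DEMO_BUS_PLATFORM };

struct demo_device {
	enum demo_bus bus;
};

struct demo_pci_dev {
	uint16_t vendor;
	uint16_t device;
	struct demo_device dev;	/* embedded generic device */
};

/* Downcast only after confirming the generic device sits on the PCI bus. */
static struct demo_pci_dev *demo_to_pci_dev_checked(struct demo_device *dev)
{
	if (!dev || dev->bus != DEMO_BUS_PCI)
		return NULL;
	return (struct demo_pci_dev *)((char *)dev -
				       offsetof(struct demo_pci_dev, dev));
}

/* True only for a PCI controller carrying the IDs quoted in the patch. */
static bool demo_is_asmedia_xhci(struct demo_device *controller)
{
	struct demo_pci_dev *pdev = demo_to_pci_dev_checked(controller);

	return pdev && pdev->vendor == 0x1b21 && pdev->device == 0x1142;
}

/* Usage: a PCI controller matches, a non-PCI controller is rejected safely. */
void demo_cast_selfcheck(void)
{
	struct demo_pci_dev pci = { 0x1b21, 0x1142, { DEMO_BUS_PCI } };
	struct demo_device platform = { DEMO_BUS_PLATFORM };

	assert(demo_is_asmedia_xhci(&pci.dev));
	assert(!demo_is_asmedia_xhci(&platform));
}
```

Without the bus check, the second call would compute a pointer into memory that is not a `demo_pci_dev` at all, which is the failure mode the review comment warns about.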
Re: [Outreachy kernel] Re: [PATCH] net: usb: hso.c: remove unneeded DRIVER_LICENSE #define
On Thu, 23 Nov 2017, Greg Kroah-Hartman wrote: > On Wed, Nov 22, 2017 at 10:20:49PM +0100, Julia Lawall wrote: > > > > > > On Wed, 22 Nov 2017, Joe Perches wrote: > > > > > On Fri, 2017-11-17 at 15:19 +0100, Greg Kroah-Hartman wrote: > > > > There is no need to #define the license of the driver, just put it in > > > > the MODULE_LICENSE() line directly as a text string. > > > > > > > > This allows tools that check that the module license matches the source > > > > code license to work properly, as there is no need to unwind the > > > > unneeded dereference. > > > [] > > > > diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c > > > [] > > > > @@ -76,7 +76,6 @@ > > > > > > > > #define MOD_AUTHOR "Option Wireless" > > > > #define MOD_DESCRIPTION"USB High Speed Option > > > > driver" > > > > -#define MOD_LICENSE"GPL" > > > > > > > > #define HSO_MAX_NET_DEVICES10 > > > > #define HSO__MAX_MTU 2048 > > > > @@ -3288,7 +3287,7 @@ module_exit(hso_exit); > > > > > > > > MODULE_AUTHOR(MOD_AUTHOR); > > > > MODULE_DESCRIPTION(MOD_DESCRIPTION); > > > > -MODULE_LICENSE(MOD_LICENSE); > > > > +MODULE_LICENSE("GPL"); > > > > > > Probably all of these MODULE_(MOD_) uses could be > > > simplified as well. > > > > > > Perhaps there's utility in a (cocci?) script that looks for > > > used-once > > > macro #defines in various types of macros. 
> >
> > What about module_version, eg:
> >
> > diff -u -p a/drivers/ata/pata_pdc202xx_old.c b/drivers/ata/pata_pdc202xx_old.c
> > --- a/drivers/ata/pata_pdc202xx_old.c
> > +++ b/drivers/ata/pata_pdc202xx_old.c
> > @@ -21,7 +21,6 @@
> >  #include
> >
> >  #define DRV_NAME "pata_pdc202xx_old"
> > -#define DRV_VERSION "0.4.3"
> >
> >  static int pdc2026x_cable_detect(struct ata_port *ap)
> >  {
> > @@ -389,4 +388,4 @@ MODULE_AUTHOR("Alan Cox");
> >  MODULE_DESCRIPTION("low-level driver for Promise 2024x and 20262-20267");
> >  MODULE_LICENSE("GPL");
> >  MODULE_DEVICE_TABLE(pci, pdc202xx);
> > -MODULE_VERSION(DRV_VERSION);
> > +MODULE_VERSION("0.4.3");
>
> I've just deleted MODULE_VERSION() entirely from some subsystems, as
> once the driver is in the kernel source tree, the "version" makes almost
> no sense at all.
>
> But I know some companies love incrementing it (some network and scsi
> drivers specifically), so those might want to keep it around for some
> odd reason.

OK, that seems like a simple solution. Thanks.

julia
Re: [Outreachy kernel] Re: [PATCH] net: usb: hso.c: remove unneeded DRIVER_LICENSE #define
On Wed, Nov 22, 2017 at 10:20:49PM +0100, Julia Lawall wrote: > > > On Wed, 22 Nov 2017, Joe Perches wrote: > > > On Fri, 2017-11-17 at 15:19 +0100, Greg Kroah-Hartman wrote: > > > There is no need to #define the license of the driver, just put it in > > > the MODULE_LICENSE() line directly as a text string. > > > > > > This allows tools that check that the module license matches the source > > > code license to work properly, as there is no need to unwind the > > > unneeded dereference. > > [] > > > diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c > > [] > > > @@ -76,7 +76,6 @@ > > > > > > #define MOD_AUTHOR "Option Wireless" > > > #define MOD_DESCRIPTION "USB High Speed Option driver" > > > -#define MOD_LICENSE "GPL" > > > > > > #define HSO_MAX_NET_DEVICES 10 > > > #define HSO__MAX_MTU 2048 > > > @@ -3288,7 +3287,7 @@ module_exit(hso_exit); > > > > > > MODULE_AUTHOR(MOD_AUTHOR); > > > MODULE_DESCRIPTION(MOD_DESCRIPTION); > > > -MODULE_LICENSE(MOD_LICENSE); > > > +MODULE_LICENSE("GPL"); > > > > Probably all of these MODULE_(MOD_) uses could be > > simplified as well. > > > > Perhaps there's utility in a (cocci?) script that looks for > > used-once > > macro #defines in various types of macros. > > What about module_version, eg: > > diff -u -p a/drivers/ata/pata_pdc202xx_old.c > b/drivers/ata/pata_pdc202xx_old.c > --- a/drivers/ata/pata_pdc202xx_old.c > +++ b/drivers/ata/pata_pdc202xx_old.c > @@ -21,7 +21,6 @@ > #include > > #define DRV_NAME "pata_pdc202xx_old" > -#define DRV_VERSION "0.4.3" > > static int pdc2026x_cable_detect(struct ata_port *ap) > { > @@ -389,4 +388,4 @@ MODULE_AUTHOR("Alan Cox"); > MODULE_DESCRIPTION("low-level driver for Promise 2024x and 20262-20267"); > MODULE_LICENSE("GPL"); > MODULE_DEVICE_TABLE(pci, pdc202xx); > -MODULE_VERSION(DRV_VERSION); > +MODULE_VERSION("0.4.3"); I've just deleted MODULE_VERSION() entirely from some subsystems, as once the driver is in the kernel source tree, the "version" makes almost no sense at all. 

But I know some companies love incrementing it (some network and scsi drivers specifically), so those might want to keep it around for some odd reason.

thanks

greg k-h
[PATCH] r8152: disable rx checksum offload on Dell TB dock
r8153 on Dell TB dock corrupts rx packets.

The root cause is not found yet, but disabling rx checksumming can
workaround the issue. We can use this connection to decide if it's
a Dell TB dock:
Realtek r8153 <-> SMSC hub <-> ASMedia XHCI controller

BugLink: https://bugs.launchpad.net/bugs/1729674
Cc: Mario Limonciello
Signed-off-by: Kai-Heng Feng
---
 drivers/net/usb/r8152.c | 33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index d51d9abf7986..58b80b5e7803 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -27,6 +27,8 @@
 #include
 #include
 #include
+#include
+#include
 
 /* Information for net-next */
 #define NETNEXT_VERSION "09"
@@ -5135,6 +5137,35 @@ static u8 rtl_get_version(struct usb_interface *intf)
 	return version;
 }
 
+/* Ethernet on Dell TB 15/16 dock is connected this way:
+ * Realtek r8153 <-> SMSC hub <-> ASMedia XHCI controller
+ * We use this connection to make sure r8153 is on the Dell TB dock.
+ */
+static bool check_dell_tb_dock(struct usb_device *udev)
+{
+	struct usb_device *hub = udev->parent;
+	struct usb_device *root_hub;
+	struct pci_dev *controller;
+
+	if (!hub)
+		return false;
+
+	if (!(le16_to_cpu(hub->descriptor.idVendor) == 0x0424 &&
+	      le16_to_cpu(hub->descriptor.idProduct) == 0x5537))
+		return false;
+
+	root_hub = hub->parent;
+	if (!root_hub || root_hub->parent)
+		return false;
+
+	controller = to_pci_dev(bus_to_hcd(root_hub->bus)->self.controller);
+
+	if (controller->vendor == 0x1b21 && controller->device == 0x1142)
+		return true;
+
+	return false;
+}
+
 static int rtl8152_probe(struct usb_interface *intf,
 			 const struct usb_device_id *id)
 {
@@ -5202,7 +5233,7 @@ static int rtl8152_probe(struct usb_interface *intf,
 			      NETIF_F_HIGHDMA | NETIF_F_FRAGLIST |
 			      NETIF_F_IPV6_CSUM | NETIF_F_TSO6;
 
-	if (tp->version == RTL_VER_01) {
+	if (tp->version == RTL_VER_01 || check_dell_tb_dock(udev)) {
 		netdev->features &= ~NETIF_F_RXCSUM;
 		netdev->hw_features &= ~NETIF_F_RXCSUM;
 	}
-- 
2.14.1
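
The topology test and the feature-mask change in this patch can be tried out in user space. The sketch below is only a model under stated assumptions: `demo_usb_dev`, `DEMO_F_RXCSUM` and `DEMO_F_SG` are hypothetical stand-ins (not kernel definitions), the NIC IDs are illustrative, and the host-controller check from the patch is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a USB device node; not the kernel's struct usb_device. */
struct demo_usb_dev {
	uint16_t vendor;
	uint16_t product;
	struct demo_usb_dev *parent;	/* NULL for a root hub */
};

#define DEMO_F_RXCSUM (1u << 0)	/* stand-in for NETIF_F_RXCSUM */
#define DEMO_F_SG     (1u << 1)	/* some unrelated feature bit */

/* Match "device <-> SMSC hub (0424:5537) <-> root hub", as the patch does. */
static bool demo_behind_smsc_hub(const struct demo_usb_dev *udev)
{
	const struct demo_usb_dev *hub = udev->parent;

	if (!hub || hub->vendor != 0x0424 || hub->product != 0x5537)
		return false;
	/* The hub's parent must exist and must itself have no parent. */
	return hub->parent && !hub->parent->parent;
}

/* Clear only the RX checksum bit when the quirk applies. */
static uint32_t demo_apply_quirk(uint32_t features,
				 const struct demo_usb_dev *udev)
{
	if (demo_behind_smsc_hub(udev))
		features &= ~DEMO_F_RXCSUM;
	return features;
}

/* Usage: the same NIC behind the hub loses RXCSUM; attached directly it keeps it. */
void demo_quirk_selfcheck(void)
{
	struct demo_usb_dev root = { 0, 0, NULL };
	struct demo_usb_dev hub = { 0x0424, 0x5537, &root };
	struct demo_usb_dev nic = { 0x0bda, 0x8153, &hub };	/* illustrative IDs */
	struct demo_usb_dev direct = { 0x0bda, 0x8153, &root };

	assert(demo_apply_quirk(DEMO_F_RXCSUM | DEMO_F_SG, &nic) == DEMO_F_SG);
	assert(demo_apply_quirk(DEMO_F_RXCSUM | DEMO_F_SG, &direct) ==
	       (DEMO_F_RXCSUM | DEMO_F_SG));
}
```

Masking with `&= ~bit` leaves every other feature untouched, which is why the patch can reuse the existing `RTL_VER_01` branch unchanged.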
Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
Hi Bhadram,

You said that in the normal ping scenario this is not observed. I wonder if you could try, for example, ping with -s 1400. If that still fails, I think the issue could be the FIFO tuning, and I would expect overflow on the RX MMC counters.

Let me know.

Regards,
Peppe

On 11/20/2017 3:22 PM, Bhadram Varka wrote:

Hi Giuseppe,

Thanks for responding.

Actually I am using the net-next tree for making the changes. The patches below are already present in the code base.

a0daae1 net: stmmac: Disable flow ctrl for RX AVB queues and really enable TX AVB queues
52a7623 net: stmmac: Use correct values in TQS/RQS fields

Thanks,
Bhadram.

-----Original Message-----
From: Giuseppe CAVALLARO [mailto:peppe.cavall...@st.com]
Sent: Monday, November 20, 2017 6:37 PM
To: Bhadram Varka; joao.pi...@synopsys.com
Cc: linux-netdev
Subject: Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out

Hello Bhadram

there are some new patches actually in net/net-next repo that you should have; for example:

[PATCH net-next v2 0/2] net: stmmac: Improvements for multi-queuing and for AVB

Let me know if these help you.

Regards
Peppe

On 11/20/2017 7:38 AM, Bhadram Varka wrote:

Hi Joao/Peppe,

Observed this issue more frequently with the multi-channel case. Am I missing something in DT? Please help here to understand the issue.

Thanks,
Bhadram

-----Original Message-----
From: Bhadram Varka
Sent: Thursday, November 16, 2017 9:41 AM
To: linux-netdev
Subject: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out

Hi,

I am trying to enable multi-queue in Tegra186 EQOS (which has support for 4 channels). Observed the netdev watchdog warning below. It's easily reproducible with an iperf test. In the normal ping scenario this is not observed. I did not observe any issue if we disable TSO. Looks like an issue in stmmac_tso_xmit() in the multi-channel scenario.
[ 88.801672] NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 0 timed out [ 88.808818] [ cut here ] [ 88.813435] WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x2cc/0x2d8 [ 88.821681] Modules linked in: dwmac_dwc_qos_eth stmmac_platform crc32_ce crct10dif_ce stmmac ip_tables x_tables ipv6 [ 88.832290] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G S 4.14.0-rc7-01956-g9395db5-dirty #21 [ 88.841663] Hardware name: NVIDIA Tegra186 P2771- Development Board (DT) [ 88.848697] task: 8001ec8fd400 task.stack: 09e38000 [ 88.854606] PC is at dev_watchdog+0x2cc/0x2d8 [ 88.858952] LR is at dev_watchdog+0x2cc/0x2d8 [ 88.863300] pc : [] lr : [] pstate: 2145 [ 88.870678] sp : 0802bd80 [ 88.873983] x29: 0802bd80 x28: 00a0 [ 88.879287] x27: x26: 8001eae2c3b0 [ 88.884589] x25: 0005 x24: 8001ecb6be80 [ 88.889891] x23: 8001eae2c39c x22: 8001eae2bfb0 [ 88.895192] x21: 8001eae2c000 x20: 08fe7000 [ 88.900493] x19: 0001 x18: 0010 [ 88.905795] x17: x16: [ 88.911098] x15: x14: 756f2064656d6974 [ 88.916399] x13: 2031206575657571 x12: 08fe9df0 [ 88.921699] x11: 08586180 x10: 642d6874652d6377 [ 88.927000] x9 : 0016 x8 : 3a474f4448435441 [ 88.932301] x7 : 572056454454454e x6 : 014f [ 88.937602] x5 : 0020 x4 : [ 88.942902] x3 : x2 : 08fec4c0 [ 88.948203] x1 : 8001ec8fd400 x0 : 0041 [ 88.953504] Call trace: [ 88.955944] Exception stack(0x0802bc40 to 0x0802bd80) [ 88.962371] bc40: 0041 8001ec8fd400 08fec4c0 [ 88.970184] bc60: 0020 014f 572056454454454e [ 88.977998] bc80: 3a474f4448435441 0016 642d6874652d6377 08586180 [ 88.985811] bca0: 08fe9df0 2031206575657571 756f2064656d6974 [ 88.993624] bcc0: 0010 0001 [ 89.001439] bce0: 08fe7000 8001eae2c000 8001eae2bfb0 8001eae2c39c [ 89.009252] bd00: 8001ecb6be80 0005 8001eae2c3b0 [ 89.017065] bd20: 00a0 0802bd80 0894a76c 0802bd80 [ 89.024879] bd40: 0894a76c 2145 00b67570 0001 [ 89.032693] bd60: 0001 8001ecb6b200 0802bd80 0894a76c [ 89.040508] [] dev_watchdog+0x2cc/0x2d8 [ 89.045900] [] call_timer_fn.isra.5+0x24/0x80 [ 89.051809] [] 
expire_timers+0xa4/0xb0 [ 89.057111] [] run_timer_softirq+0x140/0x170 [ 89.062933] [] __do_softirq+0x12c/0x228 [ 89.068323] [] irq_exit+0xd0/0x108 [ 89.073278] [] __handle_domain_irq+0x60/0xb8 [ 89.079098] [] gic_handle_irq+0x58/0xa8 [ 89.084484] Exception
Re: [PATCH net] net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts
On Wed, Nov 22, 2017 at 9:27 PM, Eric Dumazet wrote:
> On Wed, 2017-11-22 at 15:37 +0300, Aleksey Makarov wrote:
>> From: Sunil Goutham
>>
>> This fixes a previous patch which missed some changes
>> and due to which L3 checksum offload was getting enabled
>> for IPv6 pkts. And HW is dropping these pkts as it assumes
>> the pkt is IPv4 when IP csum offload is set in the SQ
>> descriptor.
>>
>> Fixes: 494fd005 ("net: thunderx: Enable TSO and checksum offloads for ipv6")
>> Signed-off-by: Sunil Goutham
>> Signed-off-by: Aleksey Makarov
>> ---
>>  drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
>> index d4496e9afcdf..184d5bdbe7e0 100644
>> --- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
>> +++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
>> @@ -1355,10 +1355,11 @@ nicvf_sq_add_hdr_subdesc(struct nicvf *nic, struct snd_queue *sq, int qentry,
>>
>>  	/* Offload checksum calculation to HW */
>>  	if (skb->ip_summed == CHECKSUM_PARTIAL) {
>> -		hdr->csum_l3 = 1; /* Enable IP csum calculation */
>>  		hdr->l3_offset = skb_network_offset(skb);
>>  		hdr->l4_offset = skb_transport_offset(skb);
>>
>> +		/* Enable IP HDR csum calculation for V4 pkts */
>> +		hdr->csum_l3 = (ip.v4->version == 4) ? 1 : 0;
>
> Have you tried to set hdr->csum_l3 to 0 regardless of version being 4
> or 6 ?
>
> This would remove the need for yet another conditional.
>
> AFAIK, linux does not offload IPv4 header checksums to NIC, it is not
> worth the trouble.

Looks like I misunderstood the IPSUM netdev feature flag.
Thanks, will check.

Sunil.

>
> ___
> linux-arm-kernel mailing list
> linux-arm-ker...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
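
For readers unfamiliar with the `version == 4` test in the quoted hunk: the IP version is the high nibble of the first byte of the L3 header, for both IPv4 and IPv6. A minimal user-space sketch follows, with hypothetical helper names (`demo_ip_version`, `demo_csum_l3_by_version`); it illustrates the version-based choice from the patch, not the alternative of always leaving the bit at 0 that the review suggests.

```c
#include <assert.h>
#include <stdint.h>

/* The IP version is the high nibble of the first byte of the L3 header. */
static unsigned int demo_ip_version(const uint8_t *l3_hdr)
{
	return l3_hdr[0] >> 4;
}

/* Choice made in the patch: request the L3 header checksum only for IPv4. */
static unsigned int demo_csum_l3_by_version(const uint8_t *l3_hdr)
{
	return demo_ip_version(l3_hdr) == 4 ? 1 : 0;
}

/* Usage: typical first bytes of an IPv4 and an IPv6 header. */
void demo_csum_selfcheck(void)
{
	const uint8_t v4[] = { 0x45, 0x00 };	/* version 4, IHL 5 */
	const uint8_t v6[] = { 0x60, 0x00 };	/* version 6 */

	assert(demo_csum_l3_by_version(v4) == 1);
	assert(demo_csum_l3_by_version(v6) == 0);
}
```

IPv6 has no header checksum at all, so there is nothing for the hardware to compute at L3 for those packets.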
[PATCH net 1/2] ipvlan: Fix insufficient skb linear check for arp
From: Gao Feng

In the function ipvlan_get_L3_hdr, the current code uses pskb_may_pull to
make sure the skb header has enough linear room for the arp header. But it
would access the arp payload in func ipvlan_addr_lookup. So it may still
access unexpected memory.

Now use arp_hdr_len(port->dev) instead of the size of the arp header as
the param.

Signed-off-by: Gao Feng
---
 drivers/net/ipvlan/ipvlan_core.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index f2a7e92..4476425 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -116,7 +116,7 @@ bool ipvlan_addr_busy(struct ipvl_port *port, void *iaddr, bool is_v6)
 	return false;
 }
 
-static void *ipvlan_get_L3_hdr(struct sk_buff *skb, int *type)
+static void *ipvlan_get_L3_hdr(struct ipvl_port *port, struct sk_buff *skb, int *type)
 {
 	void *lyr3h = NULL;
 
@@ -124,7 +124,7 @@ static void *ipvlan_get_L3_hdr(struct sk_buff *skb, int *type)
 	case htons(ETH_P_ARP): {
 		struct arphdr *arph;
 
-		if (unlikely(!pskb_may_pull(skb, sizeof(*arph))))
+		if (unlikely(!pskb_may_pull(skb, arp_hdr_len(port->dev))))
 			return NULL;
 
 		arph = arp_hdr(skb);
@@ -510,7 +510,7 @@ static int ipvlan_xmit_mode_l3(struct sk_buff *skb, struct net_device *dev)
 	struct ipvl_addr *addr;
 	int addr_type;
 
-	lyr3h = ipvlan_get_L3_hdr(skb, &addr_type);
+	lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
 	if (!lyr3h)
 		goto out;
 
@@ -539,7 +539,7 @@ static int ipvlan_xmit_mode_l2(struct sk_buff *skb, struct net_device *dev)
 
 	if (!ipvlan_is_vepa(ipvlan->port) &&
 	    ether_addr_equal(eth->h_dest, eth->h_source)) {
-		lyr3h = ipvlan_get_L3_hdr(skb, &addr_type);
+		lyr3h = ipvlan_get_L3_hdr(ipvlan->port, skb, &addr_type);
 		if (lyr3h) {
 			addr = ipvlan_addr_lookup(ipvlan->port, lyr3h, addr_type, true);
 			if (addr) {
@@ -606,7 +606,7 @@ static bool ipvlan_external_frame(struct sk_buff *skb, struct ipvl_port *port)
 	int addr_type;
 
 	if (ether_addr_equal(eth->h_source, skb->dev->dev_addr)) {
-		lyr3h = ipvlan_get_L3_hdr(skb, &addr_type);
+		lyr3h = ipvlan_get_L3_hdr(port, skb, &addr_type);
 		if (!lyr3h)
 			return true;
 
@@ -627,7 +627,7 @@ static rx_handler_result_t ipvlan_handle_mode_l3(struct sk_buff **pskb,
 	struct sk_buff *skb = *pskb;
 	rx_handler_result_t ret = RX_HANDLER_PASS;
 
-	lyr3h = ipvlan_get_L3_hdr(skb, &addr_type);
+	lyr3h = ipvlan_get_L3_hdr(port, skb, &addr_type);
 	if (!lyr3h)
 		goto out;
 
@@ -666,7 +666,7 @@ static rx_handler_result_t ipvlan_handle_mode_l2(struct sk_buff **pskb,
 	} else {
 		struct ipvl_addr *addr;
 
-		lyr3h = ipvlan_get_L3_hdr(skb, &addr_type);
+		lyr3h = ipvlan_get_L3_hdr(port, skb, &addr_type);
 		if (!lyr3h)
 			return ret;
 
@@ -717,7 +717,7 @@ static struct ipvl_addr *ipvlan_skb_to_addr(struct sk_buff *skb,
 	if (!port || port->mode != IPVLAN_MODE_L3S)
 		goto out;
 
-	lyr3h = ipvlan_get_L3_hdr(skb, &addr_type);
+	lyr3h = ipvlan_get_L3_hdr(port, skb, &addr_type);
 	if (!lyr3h)
 		goto out;
 
-- 
1.9.1
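
To make the size difference concrete: the fixed ARP header is 8 bytes, while the sender and target addresses that the lookup reads sit after it. The sketch below computes the full length for IPv4 over a link with a given hardware address length. The helper names are hypothetical and the code is a plain user-space model of the length arithmetic, not the kernel's `arp_hdr_len()` itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ARP_FIXED_LEN 8	/* hrd(2) + pro(2) + hln(1) + pln(1) + op(2) */
#define DEMO_IPV4_ADDR_LEN 4

/* Full ARP length for IPv4 over a link with hw_addr_len-byte addresses. */
static size_t demo_arp_len(size_t hw_addr_len)
{
	/* sender hw + sender IP + target hw + target IP follow the fixed part */
	return DEMO_ARP_FIXED_LEN + 2 * (hw_addr_len + DEMO_IPV4_ADDR_LEN);
}

/* A parser may read the ARP payload only if this many bytes are linear. */
static bool demo_arp_payload_readable(size_t linear_len, size_t hw_addr_len)
{
	return linear_len >= demo_arp_len(hw_addr_len);
}

/* Usage: Ethernet (6-byte addresses) needs 28 bytes, not just the 8-byte header. */
void demo_arp_selfcheck(void)
{
	assert(demo_arp_len(6) == 28);
	assert(!demo_arp_payload_readable(8, 6));	/* fixed header only */
	assert(demo_arp_payload_readable(28, 6));
}
```

Checking only `sizeof(struct arphdr)` corresponds to the 8-byte case above, which is why the payload read could run past the linear area.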
[PATCH net 2/2] ipvlan: Fix insufficient skb linear check for ipv6 icmp
From: Gao Feng

In the function ipvlan_get_L3_hdr, the current code uses pskb_may_pull to
make sure the skb header has enough linear room for the ipv6 header. But it
would use the memory that follows directly, without a linear check, when
the packet is icmp. So it may still access unexpected memory in
ipvlan_addr_lookup.

Now invoke pskb_may_pull again if it is ipv6 icmp.

Signed-off-by: Gao Feng
---
 drivers/net/ipvlan/ipvlan_core.c | 20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c
index 4476425..11c1e79 100644
--- a/drivers/net/ipvlan/ipvlan_core.c
+++ b/drivers/net/ipvlan/ipvlan_core.c
@@ -165,8 +165,26 @@ static void *ipvlan_get_L3_hdr(struct ipvl_port *port, struct sk_buff *skb, int
 		/* Only Neighbour Solicitation pkts need different treatment */
 		if (ipv6_addr_any(&ip6h->saddr) &&
 		    ip6h->nexthdr == NEXTHDR_ICMP) {
+			struct icmp6hdr *icmph;
+
+			if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + sizeof(*icmph))))
+				return NULL;
+
+			ip6h = ipv6_hdr(skb);
+			icmph = (struct icmp6hdr *)(ip6h + 1);
+
+			if (icmph->icmp6_type == NDISC_NEIGHBOUR_SOLICITATION) {
+				/* Need to access the ipv6 address in body */
+				if (unlikely(!pskb_may_pull(skb, sizeof(*ip6h) + sizeof(*icmph) +
+						sizeof(struct in6_addr))))
+					return NULL;
+
+				ip6h = ipv6_hdr(skb);
+				icmph = (struct icmp6hdr *)(ip6h + 1);
+			}
+
 			*type = IPVL_ICMPV6;
-			lyr3h = ip6h + 1;
+			lyr3h = icmph;
 		}
 		break;
 	}
-- 
1.9.1
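
The staged check in this patch (first enough bytes to read the ICMPv6 type, then, for a Neighbour Solicitation, enough bytes to read the target address) can be modelled on a flat buffer. This is a user-space sketch with hypothetical names; it assumes no extension headers, and because a flat buffer never moves it does not need the header-pointer reload that the kernel code does after `pskb_may_pull()`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_IPV6_HDR_LEN   40	/* fixed IPv6 header */
#define DEMO_ICMP6_HDR_LEN   8	/* type, code, checksum, 4 bytes of data */
#define DEMO_IN6_ADDR_LEN   16
#define DEMO_ND_NS         135	/* ICMPv6 Neighbour Solicitation type */

/*
 * Returns a pointer to the ICMPv6 header if every byte the caller will
 * read lies inside buf[0..len), otherwise NULL.
 */
static const uint8_t *demo_icmp6_checked(const uint8_t *buf, size_t len)
{
	const uint8_t *icmp;

	if (len < DEMO_IPV6_HDR_LEN + DEMO_ICMP6_HDR_LEN)
		return NULL;
	icmp = buf + DEMO_IPV6_HDR_LEN;
	/* Only now is it safe to look at the type byte. */
	if (icmp[0] == DEMO_ND_NS &&
	    len < DEMO_IPV6_HDR_LEN + DEMO_ICMP6_HDR_LEN + DEMO_IN6_ADDR_LEN)
		return NULL;
	return icmp;
}

/* Usage: an NS needs 64 bytes; other ICMPv6 types pass with 48. */
void demo_icmp6_selfcheck(void)
{
	uint8_t pkt[64] = { 0 };

	pkt[DEMO_IPV6_HDR_LEN] = DEMO_ND_NS;
	assert(demo_icmp6_checked(pkt, 47) == NULL);	/* no full ICMPv6 header */
	assert(demo_icmp6_checked(pkt, 48) == NULL);	/* NS target address missing */
	assert(demo_icmp6_checked(pkt, 64) == pkt + DEMO_IPV6_HDR_LEN);

	pkt[DEMO_IPV6_HDR_LEN] = 128;			/* echo request */
	assert(demo_icmp6_checked(pkt, 48) == pkt + DEMO_IPV6_HDR_LEN);
}
```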
[PATCH net 0/2] ipvlan: Fix insufficient skb linear check
From: Gao Feng

The current ipvlan code uses pskb_may_pull to get the skb linear header in func ipvlan_get_L3_hdr, but the size isn't enough for arp and ipv6 icmp. So it may access unexpected memory in ipvlan_addr_lookup.

Gao Feng (2):
  ipvlan: Fix insufficient skb linear check for arp
  ipvlan: Fix insufficient skb linear check for ipv6 icmp

 drivers/net/ipvlan/ipvlan_core.c | 36 +++-
 1 file changed, 27 insertions(+), 9 deletions(-)

-- 
1.9.1
[PATCH net] geneve: only configure or fill UDP_ZERO_CSUM6_RX/TX info when CONFIG_IPV6
Stefano pointed out that configuring or showing UDP_ZERO_CSUM6_RX/TX info
doesn't make sense if we haven't enabled CONFIG_IPV6. Fix it by adding an
IS_ENABLED(CONFIG_IPV6) check.

Signed-off-by: Hangbin Liu
---
 drivers/net/geneve.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 4e16d83..b718a02 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1337,21 +1337,33 @@ static int geneve_nl2info(struct nlattr *tb[], struct nlattr *data[],
 	}
 
 	if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]) {
+#if IS_ENABLED(CONFIG_IPV6)
 		if (changelink) {
 			attrtype = IFLA_GENEVE_UDP_ZERO_CSUM6_TX;
 			goto change_notsup;
 		}
 		if (nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX]))
 			info->key.tun_flags &= ~TUNNEL_CSUM;
+#else
+		NL_SET_ERR_MSG_ATTR(extack, data[IFLA_GENEVE_UDP_ZERO_CSUM6_TX],
+				    "IPv6 support not enabled in the kernel");
+		return -EPFNOSUPPORT;
+#endif
 	}
 
 	if (data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]) {
+#if IS_ENABLED(CONFIG_IPV6)
 		if (changelink) {
 			attrtype = IFLA_GENEVE_UDP_ZERO_CSUM6_RX;
 			goto change_notsup;
 		}
 		if (nla_get_u8(data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX]))
 			*use_udp6_rx_checksums = false;
+#else
+		NL_SET_ERR_MSG_ATTR(extack, data[IFLA_GENEVE_UDP_ZERO_CSUM6_RX],
+				    "IPv6 support not enabled in the kernel");
+		return -EPFNOSUPPORT;
+#endif
 	}
 
 	return 0;
@@ -1527,11 +1539,13 @@ static int geneve_fill_info(struct sk_buff *skb, const struct net_device *dev)
 		goto nla_put_failure;
 
 	if (metadata && nla_put_flag(skb, IFLA_GENEVE_COLLECT_METADATA))
-		goto nla_put_failure;
+		goto nla_put_failure;
 
+#if IS_ENABLED(CONFIG_IPV6)
 	if (nla_put_u8(skb, IFLA_GENEVE_UDP_ZERO_CSUM6_RX,
 		       !geneve->use_udp6_rx_checksums))
 		goto nla_put_failure;
+#endif
 
 	return 0;
 
-- 
2.5.5
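
The pattern used here, accepting an attribute when the feature is compiled in and rejecting it with an explicit error otherwise, can be sketched in user space. Everything named `demo_*` or `DEMO_*` below is hypothetical; `DEMO_HAVE_IPV6` only stands in for `IS_ENABLED(CONFIG_IPV6)`, and the sketch assumes a platform whose `<errno.h>` defines `EPFNOSUPPORT` (Linux does).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical build switch standing in for IS_ENABLED(CONFIG_IPV6). */
#ifndef DEMO_HAVE_IPV6
#define DEMO_HAVE_IPV6 0
#endif

struct demo_cfg {
	bool udp6_zero_csum_rx;
};

/* Apply an IPv6-only attribute, or reject it when IPv6 is compiled out. */
static int demo_set_udp6_zero_csum_rx(struct demo_cfg *cfg, bool value)
{
#if DEMO_HAVE_IPV6
	cfg->udp6_zero_csum_rx = value;
	return 0;
#else
	(void)cfg;
	(void)value;
	return -EPFNOSUPPORT;	/* same errno the patch returns */
#endif
}

/* Usage: the result depends on how the sketch was built. */
void demo_cfg_selfcheck(void)
{
	struct demo_cfg cfg = { false };
	int ret = demo_set_udp6_zero_csum_rx(&cfg, true);

	assert(ret == (DEMO_HAVE_IPV6 ? 0 : -EPFNOSUPPORT));
	assert(cfg.udp6_zero_csum_rx == (DEMO_HAVE_IPV6 ? true : false));
}
```

Returning an error rather than silently ignoring the attribute lets user space tell "option applied" apart from "option not available in this build".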
Re: [PATCH linux-firmware 0/2] Mellanox: Add new mlxsw_spectrum firmware 13.1530.152
On Thu, 2017-11-09 at 09:15 +0200, Shalom Toledo wrote:
> This set adds a new firmware version 13.1530.152 as well as information about
> the previous firmware version of the mlxsw_spectrum driver
>
> Shalom Toledo (2):
>   WHENCE: Add missing entry for mlxsw_spectrum firmware
>   Mellanox: Add new mlxsw_spectrum firmware 13.1530.152
>
>  WHENCE | 38 +++
>  mellanox/mlxsw_spectrum-13.1530.152.mfa2 | Bin 0 -> 924020 bytes
>  2 files changed, 38 insertions(+)
>  create mode 100644 mellanox/mlxsw_spectrum-13.1530.152.mfa2

Applied both of these, thanks.

Ben.

-- 
Ben Hutchings
When in doubt, use brute force. - Ken Thompson
[PATCH v4 2/2] sock: Move the socket inuse to namespace.
This patch add a member in struct netns_core. And this is a counter for socket_inuse in the _net_ namespace. The patch will add/sub counter in the sk_alloc or sk_free. Because socket and sock is in pair. It's a easy way to maintain the code and help developer to review. More important, it avoids holding the _net_ namespace again. Signed-off-by: Martin ZhangSigned-off-by: Tonghao Zhang --- v3 --> v4: 1. add noop function for !CONF_PROC_FS case. 2. change the __this_cpu_add to this_cpu_add. This is reported by lkp. reported at: http://patchwork.ozlabs.org/patch/837424/ --- include/net/netns/core.h | 1 + include/net/sock.h | 1 + net/core/sock.c | 40 +++- net/socket.c | 13 ++--- 4 files changed, 43 insertions(+), 12 deletions(-) diff --git a/include/net/netns/core.h b/include/net/netns/core.h index 6490b79..1de41f3 100644 --- a/include/net/netns/core.h +++ b/include/net/netns/core.h @@ -11,6 +11,7 @@ struct netns_core { int sysctl_somaxconn; struct prot_inuse __percpu *prot_inuse; + int __percpu *sock_inuse; }; #endif diff --git a/include/net/sock.h b/include/net/sock.h index 6f1be97..169a26f 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1263,6 +1263,7 @@ static inline void sk_sockets_allocated_inc(struct sock *sk) /* Called with local bh disabled */ void sock_prot_inuse_add(struct net *net, struct proto *prot, int inc); int sock_prot_inuse_get(struct net *net, struct proto *proto); +int sock_inuse_get(struct net *net); #else static inline void sock_prot_inuse_add(struct net *net, struct proto *prot, int inc) diff --git a/net/core/sock.c b/net/core/sock.c index b899d86..04bbab1 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -145,6 +145,8 @@ static DEFINE_MUTEX(proto_list_mutex); static LIST_HEAD(proto_list); +static void sock_inuse_add(struct net *net, int val); + /** * sk_ns_capable - General socket capability test * @sk: Socket to use a capability on or through @@ -1536,6 +1538,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t 
priority, if (likely(sk->sk_net_refcnt)) get_net(net); sock_net_set(sk, net); + sock_inuse_add(net, 1); refcount_set(>sk_wmem_alloc, 1); mem_cgroup_sk_alloc(sk); @@ -1597,6 +1600,8 @@ void sk_destruct(struct sock *sk) static void __sk_free(struct sock *sk) { + sock_inuse_add(sock_net(sk), -1); + if (unlikely(sock_diag_has_destroy_listeners(sk) && sk->sk_net_refcnt)) sock_diag_broadcast_destroy(sk); else @@ -1665,6 +1670,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority) newsk->sk_backlog.head = newsk->sk_backlog.tail = NULL; newsk->sk_backlog.len = 0; + sock_inuse_add(sock_net(newsk), 1); atomic_set(>sk_rmem_alloc, 0); /* * sk_wmem_alloc set to one (see sk_free() and sock_wfree()) @@ -3060,15 +3066,43 @@ int sock_prot_inuse_get(struct net *net, struct proto *prot) } EXPORT_SYMBOL_GPL(sock_prot_inuse_get); +static void sock_inuse_add(struct net *net, int val) +{ + this_cpu_add(*net->core.sock_inuse, val); +} + +int sock_inuse_get(struct net *net) +{ + int cpu, res = 0; + + for_each_possible_cpu(cpu) + res += *per_cpu_ptr(net->core.sock_inuse, cpu); + + return res >= 0 ? res : 0; +} + +EXPORT_SYMBOL_GPL(sock_inuse_get); + static int __net_init sock_inuse_init_net(struct net *net) { net->core.prot_inuse = alloc_percpu(struct prot_inuse); - return net->core.prot_inuse ? 
0 : -ENOMEM; + if (!net->core.prot_inuse) + return -ENOMEM; + + net->core.sock_inuse = alloc_percpu(int); + if (!net->core.sock_inuse) + goto out; + + return 0; +out: + free_percpu(net->core.prot_inuse); + return -ENOMEM; } static void __net_exit sock_inuse_exit_net(struct net *net) { free_percpu(net->core.prot_inuse); + free_percpu(net->core.sock_inuse); } static struct pernet_operations net_inuse_ops = { @@ -3111,6 +3145,10 @@ static inline void assign_proto_idx(struct proto *prot) static inline void release_proto_idx(struct proto *prot) { } + +static void sock_inuse_add(struct net *net, int val) +{ +} #endif static void req_prot_cleanup(struct request_sock_ops *rsk_prot) diff --git a/net/socket.c b/net/socket.c index b085f14..1bdf364 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2646,17 +2646,8 @@ static int __init sock_init(void) #ifdef CONFIG_PROC_FS void socket_seq_show(struct seq_file *seq) { - int cpu; - int counter = 0; - - for_each_possible_cpu(cpu) - counter += per_cpu(sockets_in_use, cpu); - - /* It can be
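
The counter scheme in this patch (each CPU adjusts its own slot, a reader sums all slots and clamps at zero) can be illustrated without kernel per-CPU primitives. In the sketch below the CPU index is passed explicitly and `DEMO_NR_CPUS` is an arbitrary illustrative constant; the names are hypothetical and this is not the kernel implementation.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4	/* illustrative; real per-CPU storage is sized by the kernel */

struct demo_counter {
	int per_cpu[DEMO_NR_CPUS];
};

/* Adjust the slot of the CPU the caller runs on (passed in explicitly here). */
static void demo_inuse_add(struct demo_counter *c, int cpu, int val)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	c->per_cpu[cpu] += val;
}

/* Sum all slots; a transiently negative sum is reported as 0. */
static int demo_inuse_get(const struct demo_counter *c)
{
	int cpu, res = 0;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		res += c->per_cpu[cpu];
	return res >= 0 ? res : 0;
}

/* Usage: allocate twice on CPU 0, free once on CPU 3. */
void demo_counter_selfcheck(void)
{
	struct demo_counter c = { { 0 } };

	demo_inuse_add(&c, 0, 1);
	demo_inuse_add(&c, 0, 1);
	demo_inuse_add(&c, 3, -1);
	assert(c.per_cpu[3] == -1);	/* a single slot may go negative */
	assert(demo_inuse_get(&c) == 1);
}
```

Because a socket can be allocated on one CPU and freed on another, individual slots are not meaningful on their own; only the sum is, which is why the getter walks every slot.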
Re: pull-request Cavium LiquidIO firmware v1.7.0
On Mon, 2017-11-13 at 13:07 -0800, Felix Manlunas wrote:
> The following changes since commit bf04291309d3169c0ad3b8db52564235bbd08e30:
>
>   WHENCE: Add new qed firmware (2017-10-09 18:03:26 +0100)
>
> are available in the git repository at:
>
>   https://github.com/felix-cavium/linux-firmware.git for-upstreaming-v1.7.0
>
> for you to fetch changes up to 6c161c56d5bbf233f5931820193802a5bcb10356:
>
>   linux-firmware: liquidio: update firmware to v1.7.0 (2017-11-07 17:11:40 -0800)
>
> Signed-off-by: Felix Manlunas
> Signed-off-by: Derek Chickles
>
> Felix Manlunas (1):
>   linux-firmware: liquidio: update firmware to v1.7.0
>
>  WHENCE | 8 
>  liquidio/lio_210nv_nic.bin | Bin 1261080 -> 1265368 bytes
>  liquidio/lio_210sv_nic.bin | Bin 1159096 -> 1163128 bytes
>  liquidio/lio_23xx_nic.bin | Bin 1266528 -> 1271456 bytes
>  liquidio/lio_410nv_nic.bin | Bin 1261080 -> 1265368 bytes
>  5 files changed, 4 insertions(+), 4 deletions(-)

Pulled, thanks.

Ben.

-- 
Ben Hutchings
When in doubt, use brute force. - Ken Thompson
Re: pull request: Cavium Octeon III firmware
On Tue, 2017-10-31 at 17:05 -0500, Steven J. Hill wrote:
> Hello.
>
> Would like to add firmware for our Octeon III PKI driver. Thanks.

Where is this driver? I don't see any reference to the file in linux-next.

[...]
>  cavium/pki-cluster.bin | Bin 0 -> 7488 bytes
>  1 file changed, 0 insertions(+), 0 deletions(-)
>  create mode 100644 cavium/pki-cluster.bin

When adding a file you also need to update WHENCE to include its copyright details.

Ben.

-- 
Ben Hutchings
When in doubt, use brute force. - Ken Thompson
[PATCH v4 1/2] sock: Change the netns_core member name.
Changing the member name makes the code more readable. The rename is
used by the next patch.

Signed-off-by: Martin Zhang
Signed-off-by: Tonghao Zhang
---
 include/net/netns/core.h | 2 +-
 net/core/sock.c | 10 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 78eb1ff..6490b79 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -10,7 +10,7 @@ struct netns_core {
 
 	int sysctl_somaxconn;
 
-	struct prot_inuse __percpu *inuse;
+	struct prot_inuse __percpu *prot_inuse;
 };
 
 #endif
diff --git a/net/core/sock.c b/net/core/sock.c
index 03e1b1e..b899d86 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3044,7 +3044,7 @@ struct prot_inuse {
 
 void sock_prot_inuse_add(struct net *net, struct proto *prot, int val)
 {
-	__this_cpu_add(net->core.inuse->val[prot->inuse_idx], val);
+	__this_cpu_add(net->core.prot_inuse->val[prot->inuse_idx], val);
 }
 EXPORT_SYMBOL_GPL(sock_prot_inuse_add);
 
@@ -3054,7 +3054,7 @@ int sock_prot_inuse_get(struct net *net, struct proto *prot)
 	int res = 0;
 
 	for_each_possible_cpu(cpu)
-		res += per_cpu_ptr(net->core.inuse, cpu)->val[idx];
+		res += per_cpu_ptr(net->core.prot_inuse, cpu)->val[idx];
 
 	return res >= 0 ? res : 0;
 }
@@ -3062,13 +3062,13 @@ int sock_prot_inuse_get(struct net *net, struct proto *prot)
 
 static int __net_init sock_inuse_init_net(struct net *net)
 {
-	net->core.inuse = alloc_percpu(struct prot_inuse);
-	return net->core.inuse ? 0 : -ENOMEM;
+	net->core.prot_inuse = alloc_percpu(struct prot_inuse);
+	return net->core.prot_inuse ? 0 : -ENOMEM;
 }
 
 static void __net_exit sock_inuse_exit_net(struct net *net)
 {
-	free_percpu(net->core.inuse);
+	free_percpu(net->core.prot_inuse);
 }
 
 static struct pernet_operations net_inuse_ops = {
-- 
1.8.3.1
Re: [PATCH v7 0/4] Add the ability to do BPF directed error injection
On Wed, Nov 22, 2017 at 04:23:29PM -0500, Josef Bacik wrote:
> This is hopefully the final version, I've addressed the comment by Ingo and
> added his Acks.
>
> v6->v7:
> - moved the opt-in macro to bpf.h out of kprobes.h.

Thanks Josef!
All patches look great to me.
We'll probably take them all into bpf-next.git to start testing together
with other bpf changes, and when net-next reopens we will send them to Dave.
Then optionally we can send a pull-req for the first patch only to tip if
Ingo thinks that there can be conflicts with the work happening in parallel
on the kprobe/x86 bits.
This way hopefully there will be no conflicts during the next merge window.
Makes sense?

> v5->v6:
> - add BPF_ALLOW_ERROR_INJECTION() tagging for functions that will support this
>   feature. This way only functions that opt-in will be allowed to be
>   overridden.
> - added a btrfs patch to allow error injection for open_ctree() so that the
>   bpf sample actually works.
>
> v4->v5:
> - disallow kprobe_override programs from being put in the prog map array so we
>   don't tail call into something we didn't check. This allows us to make the
>   normal path still fast without a bunch of percpu operations.
>
> v3->v4:
> - fix a build error found by kbuild test bot (I didn't wait long enough
>   apparently.)
> - Added a warning message as per Daniel's suggestion.
>
> v2->v3:
> - added a ->kprobe_override flag to bpf_prog.
> - added some sanity checks to disallow attaching bpf progs that have
>   ->kprobe_override set that aren't for ftrace kprobes.
> - added the trace_kprobe_ftrace helper to check if the trace_event_call is a
>   ftrace kprobe.
> - renamed bpf_kprobe_state to bpf_kprobe_override, fixed it so we only read
>   this value in the kprobe path, and thus only write to it if we're
>   overriding or clearing the override.
>
> v1->v2:
> - moved things around to make sure that bpf_override_return could really only
>   be used for an ftrace kprobe.
> - killed the special return values from trace_call_bpf.
> - renamed pc_modified to bpf_kprobe_state so bpf_override_return could tell if
>   it was being called from an ftrace kprobe context.
> - reworked the logic in kprobe_perf_func to take advantage of
>   bpf_kprobe_state.
> - updated the test as per Alexei's review.
>
> - Original message -
>
> A lot of our error paths are not well tested because we have no good way of
> injecting errors generically. Some subsystems (block, memory) have ways to
> inject errors, but they are random so it's hard to get reproducible results.
>
> With BPF we can add determinism to our error injection. We can use kprobes
> and other things to verify we are injecting errors at the exact case we are
> trying to test. This patch gives us the tool to actually do the error
> injection part. It is very simple, we just set the return value of the
> pt_regs we're given to whatever we provide, and then override the PC with a
> dummy function that simply returns.
>
> Right now this only works on x86, but it would be simple enough to expand to
> other architectures. Thanks,
>
> Josef
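The mechanism described in the original message (set the return value in pt_regs, then point the PC at a dummy function that simply returns) can be modelled in plain user-space C. This is only a toy sketch, not kernel code: `struct toy_regs`, `toy_open_ctree()`, `toy_just_return()`, `toy_override_return()` and `toy_call()` are hypothetical names invented for illustration.

```c
#include <assert.h>

/* Toy stand-in for the two pt_regs fields the override touches. */
struct toy_regs {
	long ax;				/* return-value register */
	long (*ip)(struct toy_regs *regs);	/* where execution continues */
};

/* The probed function; left alone, it succeeds. */
long toy_open_ctree(struct toy_regs *regs)
{
	regs->ax = 0;
	return regs->ax;
}

/* Dummy target that simply returns whatever is already in ax. */
long toy_just_return(struct toy_regs *regs)
{
	return regs->ax;
}

/* Model of the override: set the return value, then redirect the PC. */
void toy_override_return(struct toy_regs *regs, long rc)
{
	regs->ax = rc;
	regs->ip = toy_just_return;
}

/* Enter the probed function, optionally injecting rc first. */
long toy_call(int inject, long rc)
{
	struct toy_regs regs = { .ax = 0, .ip = toy_open_ctree };

	if (inject)
		toy_override_return(&regs, rc);
	assert(regs.ip);
	return regs.ip(&regs);
}
```

With injection enabled, the body of the probed function never runs and the caller sees the injected value, which is what makes the failure deterministic.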
Re: [PATCH 1/1] qed: Add firmware 8.33.1.0
On Wed, 2017-10-11 at 00:57 -0700, Rahul Verma wrote:
> The new qed firmware contains fixes to firmware and added
> support for new features,
> -Add UFP support and drop action support.
> -DCQCN support for unlimited number of QP
> -Add IP type to GFT filter profile.
> -Added new TCP function counters.
> -Support flow ID in aRFS flow.
>
> Signed-off-by: Rahul Verma
> ---
>  WHENCE                                  |   1 +
>  qed/qed_init_values_zipped-8.33.1.0.bin | Bin 0 -> 838612 bytes
>  2 files changed, 1 insertion(+)
>  create mode 100755 qed/qed_init_values_zipped-8.33.1.0.bin
[...]

Applied; sorry for the delay.

Ben.

-- 
Ben Hutchings
When in doubt, use brute force. - Ken Thompson
[PATCH iproute2/master] bpf: initialize the verifier log
If program loading fails before verifier prints its first message,
the verifier log will not be initialized. Always set the first
character of the log buffer to zero to make sure we don't dump
non-printable characters to the terminal.

Signed-off-by: Jakub Kicinski
Reviewed-by: Quentin Monnet
---
 lib/bpf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/bpf.c b/lib/bpf.c
index 10ea23a471ef..fdc28772fb71 100644
--- a/lib/bpf.c
+++ b/lib/bpf.c
@@ -1153,7 +1153,7 @@ static int bpf_log_realloc(struct bpf_elf_ctx *ctx)
 {
 	const size_t log_max = UINT_MAX >> 8;
 	size_t log_size = ctx->log_size;
-	void *ptr;
+	char *ptr;
 
 	if (!ctx->log) {
 		log_size = 65536;
@@ -1169,6 +1169,7 @@ static int bpf_log_realloc(struct bpf_elf_ctx *ctx)
 	if (!ptr)
 		return -ENOMEM;
 
+	ptr[0] = 0;
 	ctx->log = ptr;
 	ctx->log_size = log_size;
 
-- 
2.14.1
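The idea behind the fix can be shown in a standalone sketch: whenever the log buffer is (re)allocated, clear its first byte so that printing it before anything was written yields an empty string. `struct log_ctx` and `log_realloc()` are hypothetical names, and the doubling policy is an assumption modelled loosely on the hunk above; this is not the iproute2 code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the log fields of the loader context. */
struct log_ctx {
	char *log;
	size_t log_size;
};

/*
 * Allocate the log buffer on first use, double it afterwards.
 * The first byte is always cleared, so the buffer is a valid empty
 * C string even if nothing ever gets written into it.
 */
int log_realloc(struct log_ctx *ctx)
{
	size_t new_size;
	char *ptr;

	assert(ctx != NULL);
	new_size = ctx->log ? ctx->log_size * 2 : 65536;
	ptr = realloc(ctx->log, new_size);
	if (!ptr)
		return -1;

	ptr[0] = '\0';
	ctx->log = ptr;
	ctx->log_size = new_size;
	return 0;
}
```

Clearing on every growth also discards the previous contents; that is harmless only if the caller retries the load to refill the larger buffer, which is assumed here.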
[PATCH v2 net] bpf: fix branch pruning logic
When the verifier detects that a register contains a runtime constant
and it's compared with another constant, it will prune exploration
of the branch that is guaranteed not to be taken at runtime.
This is all correct, but a malicious program may be constructed
in such a way that it always has a constant comparison and
the other branch is never taken under any conditions.
In this case such a path through the program will not be explored
by the verifier. It won't be taken at run-time either, but since
all instructions are JITed the malicious program may cause JITs
to complain about using reserved fields, etc.
To fix the issue we have to track the instructions explored by
the verifier and sanitize instructions that are dead at run time
with NOPs. We cannot reject such dead code, since llvm generates
it for valid C code, since it doesn't do as much data flow
analysis as the verifier does.

Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
Signed-off-by: Alexei Starovoitov
Acked-by: Daniel Borkmann
---
v1->v2:
made sanitize_dead_code() conditional. Only do it when the program was
successfully validated, since broken progs will be freed immediately
and there is no need to spend time clearing insns.

For net-next we might try to remove dead code and adjust all
branches instead of replacing with nops.

Implementation detail: converted_op_size is unused. We can reuse that space.
--- include/linux/bpf_verifier.h | 2 +- kernel/bpf/verifier.c| 27 +++ 2 files changed, 28 insertions(+), 1 deletion(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 07b96aaca256..7b418f0a62f6 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -115,7 +115,7 @@ struct bpf_insn_aux_data { struct bpf_map *map_ptr;/* pointer for call insn into lookup_elem */ }; int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ - int converted_op_size; /* the valid value width after perceived conversion */ + bool seen; /* this insn was processed by the verifier */ }; #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index dd54d20ace2f..0a34594dab96 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3825,6 +3825,7 @@ static int do_check(struct bpf_verifier_env *env) return err; regs = cur_regs(env); + env->insn_aux_data[insn_idx].seen = true; if (class == BPF_ALU || class == BPF_ALU64) { err = check_alu_op(env, insn); if (err) @@ -4020,6 +4021,7 @@ static int do_check(struct bpf_verifier_env *env) return err; insn_idx++; + env->insn_aux_data[insn_idx].seen = true; } else { verbose(env, "invalid BPF_LD mode\n"); return -EINVAL; @@ -4202,6 +4204,7 @@ static int adjust_insn_aux_data(struct bpf_verifier_env *env, u32 prog_len, u32 off, u32 cnt) { struct bpf_insn_aux_data *new_data, *old_data = env->insn_aux_data; + int i; if (cnt == 1) return 0; @@ -4211,6 +4214,8 @@ static int adjust_insn_aux_data(struct bpf_verifier_env *env, u32 prog_len, memcpy(new_data, old_data, sizeof(struct bpf_insn_aux_data) * off); memcpy(new_data + off + cnt - 1, old_data + off, sizeof(struct bpf_insn_aux_data) * (prog_len - off - cnt + 1)); + for (i = off; i < off + cnt - 1; i++) + new_data[i].seen = true; env->insn_aux_data = new_data; vfree(old_data); return 0; @@ -4229,6 +4234,25 @@ static struct bpf_prog 
*bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of
 	return new_prog;
 }
 
+/* The verifier does more data flow analysis than llvm and will not explore
+ * branches that are dead at run time. Malicious programs can have dead code
+ * too. Therefore replace all dead at-run-time code with nops.
+ */
+static void sanitize_dead_code(struct bpf_verifier_env *env)
+{
+	struct bpf_insn_aux_data *aux_data = env->insn_aux_data;
+	struct bpf_insn nop = BPF_MOV64_REG(BPF_REG_0, BPF_REG_0);
+	struct bpf_insn *insn = env->prog->insnsi;
+	const int insn_cnt = env->prog->len;
+	int i;
+
+	for (i = 0; i < insn_cnt; i++) {
+		if (aux_data[i].seen)
+			continue;
+		memcpy(insn + i, &nop, sizeof(nop));
+	}
+}
+
 /* convert load instructions that access fields of 'struct __sk_buff'
  * into sequence of instructions that access fields of 'struct sk_buff'
  */
@@ -4556,6 +4580,9 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr)
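A self-contained model of the sanitizing pass may help: every instruction that the verifier never marked as seen is overwritten with a no-op, so a JIT never has to translate unchecked bytes. `struct toy_insn`, `TOY_NOP` and `toy_sanitize_dead_code()` are hypothetical names; the real code operates on `struct bpf_insn` and `struct bpf_insn_aux_data`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction: an opcode plus an immediate. */
struct toy_insn {
	unsigned char code;
	int imm;
};

#define TOY_NOP 0xbf	/* marker opcode used as the no-op in this sketch */

/*
 * Replace every instruction not marked as seen with a no-op.
 * Returns the number of instructions that were patched.
 */
size_t toy_sanitize_dead_code(struct toy_insn *insns, const bool *seen,
			      size_t cnt)
{
	const struct toy_insn nop = { .code = TOY_NOP, .imm = 0 };
	size_t i, patched = 0;

	assert(insns && seen);
	for (i = 0; i < cnt; i++) {
		if (seen[i])
			continue;
		insns[i] = nop;
		patched++;
	}
	return patched;
}
```

Overwriting in place keeps the program length and all branch offsets unchanged, which is why the patch prefers nops over removing the dead instructions.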
[PATCH] net-sysfs: export gso_max_size attribute
The netdevice gso_max_size is exposed to allow users fine-control on systems with multiple NICs with different GSO buffer sizes, and where the virtual devices like bridge and veth, need to be aware of the GSO size of the underlying devices. In a virtualized environment, setting the right GSO sizes for physical and virtual devices makes all TSO work to be on physical NIC, improving throughput and reducing CPU util. If virtual devices send buffers greater than what NIC supports, it forces host to do TSO for buffers exceeding the limit, increasing CPU utilization in host. Suggested-by: Shiny SebastianSigned-off-by: Solio Sarabia --- In one test scenario with Hyper-V host, Ubuntu 16.04 VM, with Docker inside VM, and NTttcp sending 40 Gbps from one container, setting the right gso_max_size values for all network devices in the chain, reduces CPU overhead about 3x (for the sender), since all TSO work is done by physical NIC. net/core/net-sysfs.c | 30 ++ 1 file changed, 30 insertions(+) diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index 799b752..7314bc8 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -376,6 +376,35 @@ static ssize_t gro_flush_timeout_store(struct device *dev, } NETDEVICE_SHOW_RW(gro_flush_timeout, fmt_ulong); +static int change_gso_max_size(struct net_device *dev, unsigned long new_size) +{ + unsigned int orig_size = dev->gso_max_size; + + if (new_size != (unsigned int)new_size) + return -ERANGE; + + if (new_size == orig_size) + return 0; + + if (new_size <= 0 || new_size > GSO_MAX_SIZE) + return -ERANGE; + + dev->gso_max_size = new_size; + return 0; +} + +static ssize_t gso_max_size_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t len) +{ + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + + return netdev_store(dev, attr, buf, len, change_gso_max_size); +} + +NETDEVICE_SHOW_RW(gso_max_size, fmt_dec); + static ssize_t ifalias_store(struct device *dev, struct device_attribute *attr, const char 
*buf, size_t len)
 {
@@ -543,6 +572,7 @@ static struct attribute *net_class_attrs[] __ro_after_init = {
 	&dev_attr_flags.attr,
 	&dev_attr_tx_queue_len.attr,
 	&dev_attr_gro_flush_timeout.attr,
+	&dev_attr_gso_max_size.attr,
 	&dev_attr_phys_port_id.attr,
 	&dev_attr_phys_port_name.attr,
 	&dev_attr_phys_switch_id.attr,
-- 
2.7.4
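The range check performed by change_gso_max_size() can be sketched outside the kernel as follows. `struct toy_netdev`, `toy_change_gso_max_size()` and the bound `TOY_GSO_MAX_SIZE` (65536 is an assumed value) are hypothetical and only illustrate the validation logic. Since the argument is unsigned, the patch's `new_size <= 0` test is the same as a test for zero.

```c
#include <assert.h>
#include <errno.h>

#define TOY_GSO_MAX_SIZE 65536UL	/* assumed upper bound for this sketch */

/* Hypothetical stand-in for the one net_device field of interest. */
struct toy_netdev {
	unsigned int gso_max_size;
};

/*
 * Accept only sizes in 1..TOY_GSO_MAX_SIZE; leave the device
 * untouched and report -ERANGE otherwise.
 */
int toy_change_gso_max_size(struct toy_netdev *dev, unsigned long new_size)
{
	assert(dev);
	if (new_size == 0 || new_size > TOY_GSO_MAX_SIZE)
		return -ERANGE;

	dev->gso_max_size = (unsigned int)new_size;
	return 0;
}
```

Checking against the upper bound first makes the later narrowing to unsigned int safe, so a separate truncation test is not needed in this simplified form.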
Re: [e1000_shutdown] e1000 0000:00:03.0: disabling already-disabled device
On Wed, Nov 22, 2017 at 03:40:52AM +0530, Tushar Dave wrote: On 11/21/2017 06:11 PM, Fengguang Wu wrote: Hello, FYI this happens in mainline kernel 4.14.0-01330-g3c07399. It happens since 4.13 . It occurs in 3 out of 162 boots. [ 44.637743] advantechwdt: Unexpected close, not stopping watchdog! [ 44.997548] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input6 [ 45.013419] e1000 :00:03.0: disabling already-disabled device [ 45.013447] [ cut here ] [ 45.014868] WARNING: CPU: 1 PID: 71 at drivers/pci/pci.c:1641 pci_disable_device+0xa1/0x105: pci_disable_device at drivers/pci/pci.c:1640 [ 45.016171] CPU: 1 PID: 71 Comm: rcu_perf_shutdo Not tainted 4.14.0-01330-g3c07399 #1 [ 45.017197] task: 88011bee9e40 task.stack: c986 [ 45.017987] RIP: 0010:pci_disable_device+0xa1/0x105: pci_disable_device at drivers/pci/pci.c:1640 [ 45.018603] RSP: :c9863e30 EFLAGS: 00010286 [ 45.019282] RAX: 0035 RBX: 88013a230008 RCX: [ 45.020182] RDX: RSI: RDI: 0203 [ 45.021084] RBP: 88013a3f31e8 R08: 0001 R09: [ 45.021986] R10: 827ec29c R11: 0002 R12: 0001 [ 45.022946] R13: 88013a230008 R14: 880117802b20 R15: c9863e8f [ 45.023842] FS: () GS:88013fd0() knlGS: [ 45.024863] CS: 0010 DS: ES: CR0: 80050033 [ 45.025583] CR2: c96d4000 CR3: 0220f000 CR4: 06a0 [ 45.026478] Call Trace: [ 45.026811] __e1000_shutdown+0x1d4/0x1e2: __e1000_shutdown at drivers/net/ethernet/intel/e1000/e1000_main.c:5162 [ 45.027344] ? rcu_perf_cleanup+0x2a1/0x2a1: rcu_perf_shutdown at kernel/rcu/rcuperf.c:627 [ 45.027883] e1000_shutdown+0x14/0x3a: e1000_shutdown at drivers/net/ethernet/intel/e1000/e1000_main.c:5235 [ 45.028351] device_shutdown+0x110/0x1aa: device_shutdown at drivers/base/core.c:2807 [ 45.028858] kernel_power_off+0x31/0x64: kernel_power_off at kernel/reboot.c:260 [ 45.029343] rcu_perf_shutdown+0x9b/0xa7: rcu_perf_shutdown at kernel/rcu/rcuperf.c:637 [ 45.029852] ? 
__wake_up_common_lock+0xa2/0xa2: autoremove_wake_function at kernel/sched/wait.c:376 [ 45.030414] kthread+0x126/0x12e: kthread at kernel/kthread.c:233 [ 45.030834] ? __kthread_bind_mask+0x8e/0x8e: kthread at kernel/kthread.c:190 [ 45.031399] ? ret_from_fork+0x1f/0x30: ret_from_fork at arch/x86/entry/entry_64.S:443 [ 45.031883] ? kernel_init+0xa/0xf5: kernel_init at init/main.c:997 [ 45.032325] ret_from_fork+0x1f/0x30: ret_from_fork at arch/x86/entry/entry_64.S:443 [ 45.032777] Code: 00 48 85 ed 75 07 48 8b ab a8 00 00 00 48 8d bb 98 00 00 00 e8 aa d1 11 00 48 89 ea 48 89 c6 48 c7 c7 d8 e4 0b 82 e8 55 7d da ff <0f> ff b9 01 00 00 00 31 d2 be 01 00 00 00 48 c7 c7 f0 b1 61 82 [ 45.035222] ---[ end trace c257137b1b1976ef ]--- [ 45.037838] ACPI: Preparing to enter system sleep state S5 Attached the full dmesg, kconfig and reproduce scripts. Looks like e1000 pci/pxi-x device is already suspended. And therefore call to e1000_suspend() -> __e1000_shutdown() -> pci_disable_device() already had disabled the device. Disabling device again by e1000_shutdown handler during system shutdown causes warning at drivers/pci/pci.c:1641. I think function __e1000_shutdown should just return if device is already suspended! I don't have e1000 hardware to test right now. So if this seems logical to others I will send a patch. Tushar, it happens on QEMU boot testing, so do not rely on e1000 HW. Unless you'd like to prevent regressions on real HW. The original report attached a reproduce script to run the QEMU test. Or you may send me the patch for testing. Thanks, Fengguang
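The double-disable and the guard Tushar proposes can be modelled with a simple enable counter. This is a toy sketch under stated assumptions, not the e1000 or PCI core code: `struct toy_dev`, `toy_disable()` and `toy_shutdown()` are hypothetical names, and the `guard` flag stands for the suggested "return early if already suspended" check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device state: enable refcount, warning count, PM state. */
struct toy_dev {
	int enable_cnt;
	int warnings;
	bool suspended;
};

/* Disabling a device whose count is already zero only produces a warning. */
void toy_disable(struct toy_dev *dev)
{
	assert(dev);
	if (dev->enable_cnt <= 0) {
		dev->warnings++;
		return;
	}
	dev->enable_cnt--;
}

/* Shutdown path; with guard set, a device that is already suspended is skipped. */
void toy_shutdown(struct toy_dev *dev, bool guard)
{
	if (guard && dev->suspended)
		return;
	dev->suspended = true;
	toy_disable(dev);
}
```

Running the shutdown path twice without the guard triggers the warning once; with the guard the second call returns before touching the device.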
Re: kernel BUG at crypto/asymmetric_keys/public_key.c:80
On 11/22/2017 10:42 AM, Johannes Berg wrote:
> On Wed, 2017-11-22 at 19:29 +0100, Arend van Spriel wrote:
>> + Johannes
>>
>>
>>> BUG_ON(!sig->digest);
>> BUG_ON(!sig->s);
>
> I *think* this is the same bug that was reported before, then this
> should fix it:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=d7be102f2945a626f55e0501e52bb31ba3e77b81
>
> Can you try?

My baseline already has this commit actually, is there something else
you would want me to check?

Thanks!
-- 
Florian
[PATCH V2 14/29] bnx2x: deprecate pci_get_bus_and_slot()
pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as where a PCI device is present. This restricts the device drivers to be reused for other domain numbers. Getting ready to remove pci_get_bus_and_slot() function in favor of pci_get_domain_bus_and_slot(). Introduce bnx2x_vf_domain() function to extract the domain information and save it to VF specific data structure. Use the saved domain value while calling pci_get_domain_bus_and_slot(). Signed-off-by: Sinan Kaya--- drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c | 10 +- drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h | 1 + 2 files changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c index 9ca994d..9f40c23 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c @@ -812,7 +812,7 @@ static u8 bnx2x_vf_is_pcie_pending(struct bnx2x *bp, u8 abs_vfid) if (!vf) return false; - dev = pci_get_bus_and_slot(vf->bus, vf->devfn); + dev = pci_get_domain_bus_and_slot(vf->domain, vf->bus, vf->devfn); if (dev) return bnx2x_is_pcie_pending(dev); return false; @@ -1041,6 +1041,13 @@ void bnx2x_iov_init_dmae(struct bnx2x *bp) REG_WR(bp, DMAE_REG_BACKWARD_COMP_EN, 0); } +static int bnx2x_vf_domain(struct bnx2x *bp, int vfid) +{ + struct pci_dev *dev = bp->pdev; + + return pci_domain_nr(dev->bus); +} + static int bnx2x_vf_bus(struct bnx2x *bp, int vfid) { struct pci_dev *dev = bp->pdev; @@ -1611,6 +1618,7 @@ int bnx2x_iov_nic_init(struct bnx2x *bp) struct bnx2x_virtf *vf = BP_VF(bp, vfid); /* fill in the BDF and bars */ + vf->domain = bnx2x_vf_domain(bp, vfid); vf->bus = bnx2x_vf_bus(bp, vfid); vf->devfn = bnx2x_vf_devfn(bp, vfid); bnx2x_vf_set_bars(bp, vf); diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h index 53466f6..eb814c6 100644 --- 
a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.h @@ -182,6 +182,7 @@ struct bnx2x_virtf { u32 error; /* 0 means all's-well */ /* BDF */ + unsigned int domain; unsigned int bus; unsigned int devfn; -- 1.9.1
[PATCH V2 15/29] pch_gbe: deprecate pci_get_bus_and_slot()
pci_get_bus_and_slot() is restrictive such that it assumes domain=0 as
where a PCI device is present. This restricts the device drivers to be
reused for other domain numbers.

Getting ready to remove pci_get_bus_and_slot() function in favor of
pci_get_domain_bus_and_slot().

Use the domain information from pdev while calling into
pci_get_domain_bus_and_slot() function.

Signed-off-by: Sinan Kaya
---
 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index 5ae9681..4be7806 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -2603,8 +2603,10 @@ static int pch_gbe_probe(struct pci_dev *pdev,
 	if (adapter->pdata && adapter->pdata->platform_init)
 		adapter->pdata->platform_init(pdev);
 
-	adapter->ptp_pdev = pci_get_bus_and_slot(adapter->pdev->bus->number,
-					       PCI_DEVFN(12, 4));
+	adapter->ptp_pdev =
+		pci_get_domain_bus_and_slot(pci_domain_nr(adapter->pdev->bus),
+					    adapter->pdev->bus->number,
+					    PCI_DEVFN(12, 4));
 
 	netdev->netdev_ops = &pch_gbe_netdev_ops;
 	netdev->watchdog_timeo = PCH_GBE_WATCHDOG_PERIOD;
-- 
1.9.1
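Why the domain has to be part of the lookup key can be seen in a small user-space model: two devices with the same bus and devfn in different domains are indistinguishable without it. `TOY_PCI_DEVFN()` mirrors the usual slot/function packing, while `struct toy_pci_dev` and `toy_find()` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Pack a 5-bit slot and a 3-bit function number into one devfn value. */
#define TOY_PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))

/* Hypothetical device record keyed by domain, bus and devfn. */
struct toy_pci_dev {
	int domain;
	unsigned int bus;
	unsigned int devfn;
};

/* Linear lookup that takes the domain into account. */
const struct toy_pci_dev *toy_find(const struct toy_pci_dev *devs, size_t n,
				   int domain, unsigned int bus,
				   unsigned int devfn)
{
	size_t i;

	assert(devs);
	for (i = 0; i < n; i++) {
		if (devs[i].domain == domain && devs[i].bus == bus &&
		    devs[i].devfn == devfn)
			return &devs[i];
	}
	return NULL;
}
```

A lookup that hard-codes domain 0 could only ever return the first of two such devices, which is the limitation the series removes.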
Re: [PATCH v1] Bluetooth: introduce DEFINE_SHOW_ATTRIBUTE() macro
On 11/22/2017 02:04 PM, Randy Dunlap wrote: > On 11/22/2017 01:15 PM, Andy Shevchenko wrote: >> This macro deduplicates a lot of similar code across the hci_debugfs.c >> module. Targeting to be moved to seq_file.h eventually. >> >> Signed-off-by: Andy Shevchenko>> --- >> net/bluetooth/hci_debugfs.c | 184 >> +--- >> 1 file changed, 18 insertions(+), 166 deletions(-) > > Looks like a good idea, but below, there is a use of > DEFINE_SHOW_ATTRIBUTE() before it is #defined. > >> diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c >> index 63df63ebfb24..d4174d508cbf 100644 >> --- a/net/bluetooth/hci_debugfs.c >> +++ b/net/bluetooth/hci_debugfs.c >> @@ -88,6 +88,9 @@ static int __name ## _show(struct seq_file *f, void *ptr) >> \ >> return 0; \ >> } \ >>\ >> +DEFINE_SHOW_ATTRIBUTE(__name) > > eh? OK, it's a continuation of the macro above it. Sorry about that. >> +> +#define DEFINE_SHOW_ATTRIBUTE(__name) >> \ >> static int __name ## _open(struct inode *inode, struct file *file)\ >> { \ >> return single_open(file, __name ## _show, inode->i_private); \ > > -- ~Randy
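How a macro of this kind removes boilerplate can be shown with a user-space analogue: token pasting turns one `name` into a generated `name_open()` wrapper and a `name_fops` table that refer to a hand-written `name_show()`. All identifiers below (`struct toy_fops`, `TOY_DEFINE_SHOW_ATTRIBUTE`, `features_show`, `toy_read_features`) are hypothetical; this is not the kernel macro.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, much reduced "file operations" table. */
struct toy_fops {
	int (*open)(char *buf, size_t len);
};

/* Generate name_open() and name_fops from an existing name_show(). */
#define TOY_DEFINE_SHOW_ATTRIBUTE(name)					\
static int name ## _open(char *buf, size_t len)				\
{									\
	return name ## _show(buf, len);					\
}									\
static const struct toy_fops name ## _fops = {				\
	.open = name ## _open,						\
}

/* The only part written by hand for each attribute. */
static int features_show(char *buf, size_t len)
{
	return snprintf(buf, len, "features\n");
}
TOY_DEFINE_SHOW_ATTRIBUTE(features);

/* Call through the generated table. */
int toy_read_features(char *buf, size_t len)
{
	assert(buf && len > 0);
	return features_fops.open(buf, len);
}
```

Because every line of the macro body ends in a backslash, a line that follows such a continuation still belongs to the preceding definition, which is the situation that caused the confusion in the review above.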
Re: [PATCH net 0/4] bpf: fix semantics issues with helpers receiving NULL arguments
On 11/22/2017 07:32 PM, Gianluca Borello wrote:
> This set includes some fixes in semantics and usability issues that emerged
> recently, and would be good to have them in net before the next release.
>
> In particular, ARG_CONST_SIZE_OR_ZERO semantics was recently changed in
> commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO
> semantics") with the goal of letting the compiler generate simpler code
> that the verifier can more easily accept.
>
> To handle this change in semantics, a few checks in some helpers were
> added, like in commit 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2
> type to ARG_CONST_SIZE_OR_ZERO"), and those checks are less than ideal
> because once they make it into a released kernel bpf programs can start
> relying on them, preventing the possibility of being removed later on.
>
> This patch tries to fix the issue by introducing a new argument type
> ARG_PTR_TO_MEM_OR_NULL that can be used for helpers that can receive a
> <NULL, 0> tuple. By doing so, we can fix the semantics of the other helpers
> that don't need <NULL, 0> and can just handle <!NULL, 0>, allowing the code
> to get rid of those checks.

Series applied to bpf tree, thanks Gianluca!
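The split between the two pointer argument types can be sketched as a single predicate: a NULL pointer is acceptable only for the "or NULL" type and only together with a size of zero, while a non-NULL pointer may still come with size zero. The enum and `toy_check_mem_arg()` are hypothetical and merely illustrate the intended semantics; the verifier's real checks are considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the two pointer argument types discussed above. */
enum toy_arg_type {
	TOY_ARG_PTR_TO_MEM,
	TOY_ARG_PTR_TO_MEM_OR_NULL,
};

/*
 * Decide whether a (pointer, size) pair is acceptable for a helper
 * argument of the given type.
 */
bool toy_check_mem_arg(enum toy_arg_type type, const void *ptr, size_t size)
{
	assert(type == TOY_ARG_PTR_TO_MEM ||
	       type == TOY_ARG_PTR_TO_MEM_OR_NULL);

	if (ptr == NULL)
		return type == TOY_ARG_PTR_TO_MEM_OR_NULL && size == 0;

	return true;	/* non-NULL pointer, any size including zero */
}
```

With the rule expressed in the argument type, a helper that never wants NULL no longer needs its own runtime check for that case.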
Re: [PATCH v1] Bluetooth: introduce DEFINE_SHOW_ATTRIBUTE() macro
On 11/22/2017 01:15 PM, Andy Shevchenko wrote: > This macro deduplicates a lot of similar code across the hci_debugfs.c > module. Targeting to be moved to seq_file.h eventually. > > Signed-off-by: Andy Shevchenko> --- > net/bluetooth/hci_debugfs.c | 184 > +--- > 1 file changed, 18 insertions(+), 166 deletions(-) Looks like a good idea, but below, there is a use of DEFINE_SHOW_ATTRIBUTE() before it is #defined. > diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c > index 63df63ebfb24..d4174d508cbf 100644 > --- a/net/bluetooth/hci_debugfs.c > +++ b/net/bluetooth/hci_debugfs.c > @@ -88,6 +88,9 @@ static int __name ## _show(struct seq_file *f, void *ptr) > \ > return 0; \ > } \ > \ > +DEFINE_SHOW_ATTRIBUTE(__name) eh? > +> +#define DEFINE_SHOW_ATTRIBUTE(__name) > \ > static int __name ## _open(struct inode *inode, struct file *file) \ > { \ > return single_open(file, __name ## _show, inode->i_private); \ -- ~Randy
[PATCH v7 5/5] btrfs: allow us to inject errors at io_ctl_init
From: Josef Bacik

This was instrumental in reproducing a space cache bug.

Signed-off-by: Josef Bacik
Acked-by: Ingo Molnar
---
 fs/btrfs/free-space-cache.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index cdc9f4015ec3..daa98dc1f844 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 #include "ctree.h"
 #include "free-space-cache.h"
 #include "transaction.h"
@@ -332,6 +333,7 @@ static int io_ctl_init(struct btrfs_io_ctl *io_ctl, struct inode *inode,
 	return 0;
 }
+BPF_ALLOW_ERROR_INJECTION(io_ctl_init);
 
 static void io_ctl_free(struct btrfs_io_ctl *io_ctl)
 {
-- 
2.7.5
[PATCH v7 4/5] samples/bpf: add a test for bpf_override_return
From: Josef BacikThis adds a basic test for bpf_override_return to verify it works. We override the main function for mounting a btrfs fs so it'll return -ENOMEM and then make sure that trying to mount a btrfs fs will fail. Acked-by: Alexei Starovoitov Acked-by: Ingo Molnar Signed-off-by: Josef Bacik --- samples/bpf/Makefile | 4 samples/bpf/test_override_return.sh | 15 +++ samples/bpf/tracex7_kern.c| 16 samples/bpf/tracex7_user.c| 28 tools/include/uapi/linux/bpf.h| 7 ++- tools/testing/selftests/bpf/bpf_helpers.h | 3 ++- 6 files changed, 71 insertions(+), 2 deletions(-) create mode 100755 samples/bpf/test_override_return.sh create mode 100644 samples/bpf/tracex7_kern.c create mode 100644 samples/bpf/tracex7_user.c diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile index ea2b9e6135f3..83d06bc1f710 100644 --- a/samples/bpf/Makefile +++ b/samples/bpf/Makefile @@ -14,6 +14,7 @@ hostprogs-y += tracex3 hostprogs-y += tracex4 hostprogs-y += tracex5 hostprogs-y += tracex6 +hostprogs-y += tracex7 hostprogs-y += test_probe_write_user hostprogs-y += trace_output hostprogs-y += lathist @@ -58,6 +59,7 @@ tracex3-objs := bpf_load.o $(LIBBPF) tracex3_user.o tracex4-objs := bpf_load.o $(LIBBPF) tracex4_user.o tracex5-objs := bpf_load.o $(LIBBPF) tracex5_user.o tracex6-objs := bpf_load.o $(LIBBPF) tracex6_user.o +tracex7-objs := bpf_load.o $(LIBBPF) tracex7_user.o load_sock_ops-objs := bpf_load.o $(LIBBPF) load_sock_ops.o test_probe_write_user-objs := bpf_load.o $(LIBBPF) test_probe_write_user_user.o trace_output-objs := bpf_load.o $(LIBBPF) trace_output_user.o @@ -100,6 +102,7 @@ always += tracex3_kern.o always += tracex4_kern.o always += tracex5_kern.o always += tracex6_kern.o +always += tracex7_kern.o always += sock_flags_kern.o always += test_probe_write_user_kern.o always += trace_output_kern.o @@ -153,6 +156,7 @@ HOSTLOADLIBES_tracex3 += -lelf HOSTLOADLIBES_tracex4 += -lelf -lrt HOSTLOADLIBES_tracex5 += -lelf HOSTLOADLIBES_tracex6 += -lelf +HOSTLOADLIBES_tracex7 += -lelf 
HOSTLOADLIBES_test_cgrp2_sock2 += -lelf HOSTLOADLIBES_load_sock_ops += -lelf HOSTLOADLIBES_test_probe_write_user += -lelf diff --git a/samples/bpf/test_override_return.sh b/samples/bpf/test_override_return.sh new file mode 100755 index ..e68b9ee6814b --- /dev/null +++ b/samples/bpf/test_override_return.sh @@ -0,0 +1,15 @@ +#!/bin/bash + +rm -f testfile.img +dd if=/dev/zero of=testfile.img bs=1M seek=1000 count=1 +DEVICE=$(losetup --show -f testfile.img) +mkfs.btrfs -f $DEVICE +mkdir tmpmnt +./tracex7 $DEVICE +if [ $? -eq 0 ] +then + echo "SUCCESS!" +else + echo "FAILED!" +fi +losetup -d $DEVICE diff --git a/samples/bpf/tracex7_kern.c b/samples/bpf/tracex7_kern.c new file mode 100644 index ..1ab308a43e0f --- /dev/null +++ b/samples/bpf/tracex7_kern.c @@ -0,0 +1,16 @@ +#include +#include +#include +#include "bpf_helpers.h" + +SEC("kprobe/open_ctree") +int bpf_prog1(struct pt_regs *ctx) +{ + unsigned long rc = -12; + + bpf_override_return(ctx, rc); + return 0; +} + +char _license[] SEC("license") = "GPL"; +u32 _version SEC("version") = LINUX_VERSION_CODE; diff --git a/samples/bpf/tracex7_user.c b/samples/bpf/tracex7_user.c new file mode 100644 index ..8a52ac492e8b --- /dev/null +++ b/samples/bpf/tracex7_user.c @@ -0,0 +1,28 @@ +#define _GNU_SOURCE + +#include +#include +#include +#include "libbpf.h" +#include "bpf_load.h" + +int main(int argc, char **argv) +{ + FILE *f; + char filename[256]; + char command[256]; + int ret; + + snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]); + + if (load_bpf_file(filename)) { + printf("%s", bpf_log_buf); + return 1; + } + + snprintf(command, 256, "mount %s tmpmnt/", argv[1]); + f = popen(command, "r"); + ret = pclose(f); + + return ret ? 
0 : 1; +} diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 4a4b6e78c977..3756dde69834 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -673,6 +673,10 @@ union bpf_attr { * @buf: buf to fill * @buf_size: size of the buf * Return : 0 on success or negative error code + * + * int bpf_override_return(pt_regs, rc) + * @pt_regs: pointer to struct pt_regs + * @rc: the return value to set */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -732,7 +736,8 @@ union bpf_attr { FN(xdp_adjust_meta),\ FN(perf_event_read_value), \ FN(perf_prog_read_value), \ - FN(getsockopt), + FN(getsockopt), \ + FN(override_return), /* integer value in 'imm' field of BPF_CALL
[PATCH 04/11] net: ethernet: ti: cpsw: move mac_hi/lo defines in cpsw.h
Move mac_hi/lo defines in common header cpsw.h and re-use them for netcp_ethss.c. Signed-off-by: Grygorii Strashko--- drivers/net/ethernet/ti/cpsw.c| 4 drivers/net/ethernet/ti/cpsw.h| 4 drivers/net/ethernet/ti/netcp_ethss.c | 5 + 3 files changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index 2c50596..f914589 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -978,10 +978,6 @@ static inline void soft_reset(const char *module, void __iomem *reg) WARN(readl_relaxed(reg) & 1, "failed to soft-reset %s\n", module); } -#define mac_hi(mac)(((mac)[0] << 0) | ((mac)[1] << 8) |\ -((mac)[2] << 16) | ((mac)[3] << 24)) -#define mac_lo(mac)(((mac)[4] << 0) | ((mac)[5] << 8)) - static void cpsw_set_slave_mac(struct cpsw_slave *slave, struct cpsw_priv *priv) { diff --git a/drivers/net/ethernet/ti/cpsw.h b/drivers/net/ethernet/ti/cpsw.h index a325f555..cf111db 100644 --- a/drivers/net/ethernet/ti/cpsw.h +++ b/drivers/net/ethernet/ti/cpsw.h @@ -17,6 +17,10 @@ #include #include +#define mac_hi(mac)(((mac)[0] << 0) | ((mac)[1] << 8) |\ +((mac)[2] << 16) | ((mac)[3] << 24)) +#define mac_lo(mac)(((mac)[4] << 0) | ((mac)[5] << 8)) + void cpsw_phy_sel(struct device *dev, phy_interface_t phy_mode, int slave); int ti_cm_get_macid(struct device *dev, int slave, u8 *mac_addr); diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c index 4ad8216..1ff0ade 100644 --- a/drivers/net/ethernet/ti/netcp_ethss.c +++ b/drivers/net/ethernet/ti/netcp_ethss.c @@ -27,6 +27,7 @@ #include #include +#include "cpsw.h" #include "cpsw_ale.h" #include "netcp.h" #include "cpts.h" @@ -2047,10 +2048,6 @@ static const struct ethtool_ops keystone_ethtool_ops = { .get_ts_info= keystone_get_ts_info, }; -#define mac_hi(mac)(((mac)[0] << 0) | ((mac)[1] << 8) |\ -((mac)[2] << 16) | ((mac)[3] << 24)) -#define mac_lo(mac)(((mac)[4] << 0) | ((mac)[5] << 8)) - static void 
gbe_set_slave_mac(struct gbe_slave *slave, struct gbe_intf *gbe_intf) { -- 2.10.5
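What the shared mac_hi()/mac_lo() helpers compute can be checked with a standalone version: the first four address bytes are packed little-endian into one 32-bit word and the last two into another. `toy_mac_hi()` and `toy_mac_lo()` are hypothetical function equivalents of the macros; the casts to uint32_t are an addition of this sketch so that a byte of 0x80 or above is never shifted into the sign bit of an int.

```c
#include <assert.h>
#include <stdint.h>

/* Pack bytes 0..3 of a MAC address, byte 0 in the least significant bits. */
uint32_t toy_mac_hi(const uint8_t *mac)
{
	assert(mac);
	return ((uint32_t)mac[0] << 0) | ((uint32_t)mac[1] << 8) |
	       ((uint32_t)mac[2] << 16) | ((uint32_t)mac[3] << 24);
}

/* Pack bytes 4..5 of a MAC address into the low 16 bits. */
uint32_t toy_mac_lo(const uint8_t *mac)
{
	assert(mac);
	return ((uint32_t)mac[4] << 0) | ((uint32_t)mac[5] << 8);
}
```

For the address 00:11:22:ee:44:55 this yields 0xee221100 for the high word and 0x5544 for the low word.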
[PATCH 05/11] net: ethernet: ti: cpsw: fix ale port numbers
TI OMAP/Sitara SoCs have a fixed number of ALE ports (3), which includes the
host port. Hence, use a fixed value instead of the value calculated from DT,
which can be set by the user and might not reflect the actual HW configuration.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index f914589..ca7c52a 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -88,6 +88,7 @@ do {								\
 #define CPSW_VERSION_4		0x190112
 
 #define HOST_PORT_NUM		0
+#define CPSW_ALE_PORTS_NUM	3
 #define SLIVER_SIZE		0x40
 
 #define CPSW1_HOST_PORT_OFFSET	0x028
@@ -3076,7 +3077,7 @@ static int cpsw_probe(struct platform_device *pdev)
 	ale_params.dev			= &pdev->dev;
 	ale_params.ale_ageout		= ale_ageout;
 	ale_params.ale_entries		= data->ale_entries;
-	ale_params.ale_ports		= data->slaves;
+	ale_params.ale_ports		= CPSW_ALE_PORTS_NUM;
 
 	cpsw->ale = cpsw_ale_create(&ale_params);
 	if (!cpsw->ale) {
-- 
2.10.5
[PATCH 03/11] net: ethernet: ti: cpsw: move platform data struct to .c file
The CPSW platform data structs cpsw_platform_data and cpsw_slave_data
are used only inside the cpsw.c module, so move these definitions there.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw.c | 21 +++++++++++++++++++++
 drivers/net/ethernet/ti/cpsw.h | 21 ---------------------
 2 files changed, 21 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 955ee68..2c50596 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -352,6 +352,27 @@ struct cpsw_hw_stats {
 	u32	rxdmaoverruns;
 };
 
+struct cpsw_slave_data {
+	struct device_node *phy_node;
+	char		phy_id[MII_BUS_ID_SIZE];
+	int		phy_if;
+	u8		mac_addr[ETH_ALEN];
+	u16		dual_emac_res_vlan;	/* Reserved VLAN for DualEMAC */
+};
+
+struct cpsw_platform_data {
+	struct cpsw_slave_data	*slave_data;
+	u32	ss_reg_ofs;	/* Subsystem control register offset */
+	u32	channels;	/* number of cpdma channels (symmetric) */
+	u32	slaves;		/* number of slave cpgmac ports */
+	u32	active_slave; /* time stamping, ethtool and SIOCGMIIPHY slave */
+	u32	ale_entries;	/* ale table size */
+	u32	bd_ram_size;  /*buffer descriptor ram size */
+	u32	mac_control;	/* Mac control register */
+	u16	default_vlan;	/* Def VLAN for ALE lookup in VLAN aware mode*/
+	bool	dual_emac;	/* Enable Dual EMAC mode */
+};
+
 struct cpsw_slave {
 	void __iomem			*regs;
 	struct cpsw_sliver_regs __iomem	*sliver;
diff --git a/drivers/net/ethernet/ti/cpsw.h b/drivers/net/ethernet/ti/cpsw.h
index 6c3037a..a325f555 100644
--- a/drivers/net/ethernet/ti/cpsw.h
+++ b/drivers/net/ethernet/ti/cpsw.h
@@ -17,27 +17,6 @@
 #include
 #include
 
-struct cpsw_slave_data {
-	struct device_node *phy_node;
-	char		phy_id[MII_BUS_ID_SIZE];
-	int		phy_if;
-	u8		mac_addr[ETH_ALEN];
-	u16		dual_emac_res_vlan;	/* Reserved VLAN for DualEMAC */
-};
-
-struct cpsw_platform_data {
-	struct cpsw_slave_data	*slave_data;
-	u32	ss_reg_ofs;	/* Subsystem control register offset */
-	u32	channels;	/* number of cpdma channels (symmetric) */
-	u32	slaves;		/* number of slave cpgmac ports */
-	u32	active_slave; /* time stamping, ethtool and SIOCGMIIPHY slave */
-	u32	ale_entries;	/* ale table size */
-	u32	bd_ram_size;  /*buffer descriptor ram size */
-	u32	mac_control;	/* Mac control register */
-	u16	default_vlan;	/* Def VLAN for ALE lookup in VLAN aware mode*/
-	bool	dual_emac;	/* Enable Dual EMAC mode */
-};
-
 void cpsw_phy_sel(struct device *dev, phy_interface_t phy_mode, int slave);
 int ti_cm_get_macid(struct device *dev, int slave, u8 *mac_addr);
 
-- 
2.10.5
[PATCH 01/11] net: ethernet: ti: cpsw: drop unused var poll from cpsw_update_channels_res
Drop unused variable "poll" from cpsw_update_channels_res().

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 0c7c7a1..9235b9e 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -2300,7 +2300,6 @@ static int cpsw_check_ch_settings(struct cpsw_common *cpsw,
 
 static int cpsw_update_channels_res(struct cpsw_priv *priv, int ch_num, int rx)
 {
-	int (*poll)(struct napi_struct *, int);
 	struct cpsw_common *cpsw = priv->cpsw;
 	void (*handler)(void *, int, int);
 	struct netdev_queue *queue;
@@ -2311,12 +2310,10 @@ static int cpsw_update_channels_res(struct cpsw_priv *priv, int ch_num, int rx)
 		ch = &cpsw->rx_ch_num;
 		vec = cpsw->rxv;
 		handler = cpsw_rx_handler;
-		poll = cpsw_rx_poll;
 	} else {
 		ch = &cpsw->tx_ch_num;
 		vec = cpsw->txv;
 		handler = cpsw_tx_handler;
-		poll = cpsw_tx_poll;
 	}
 
 	while (*ch < ch_num) {
-- 
2.10.5
[PATCH 00/11] net: ethernet: ti: cpsw/ale clean up and optimization
This is a set of non-critical clean-ups and optimizations for the TI CPSW
and ALE drivers.

Grygorii Strashko (11):
  net: ethernet: ti: cpsw: drop unused var poll from cpsw_update_channels_res
  net: ethernet: ti: cpsw: use proper io apis
  net: ethernet: ti: cpsw: move platform data struct to .c file
  net: ethernet: ti: cpsw: move mac_hi/lo defines in cpsw.h
  net: ethernet: ti: cpsw: fix ale port numbers
  net: ethernet: ti: ale: use proper io apis
  net: ethernet: ti: ale: disable ale from stop()
  net: ethernet: ti: ale: optimize ale entry mask bits configuration
  net: ethernet: ti: ale: move static initialization in cpsw_ale_create()
  net: ethernet: ti: ale: use devm_kzalloc in cpsw_ale_create()
  net: ethernet: ti: ale: fix port check in cpsw_ale_control_set/get

 drivers/net/ethernet/ti/cpsw.c        |  84 +++---
 drivers/net/ethernet/ti/cpsw.h        |  23 +--
 drivers/net/ethernet/ti/cpsw_ale.c    | 109 ++
 drivers/net/ethernet/ti/cpsw_ale.h    |   1 -
 drivers/net/ethernet/ti/netcp_ethss.c |   6 +-
 5 files changed, 98 insertions(+), 125 deletions(-)

-- 
2.10.5
[PATCH 06/11] net: ethernet: ti: ale: use proper io apis
Switch to the writel_relaxed()/readl_relaxed() IO API instead of the raw
(__raw_writel()/__raw_readl()) versions, as recommended.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw_ale.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index cd1185e..b21ed3d 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -150,11 +150,11 @@ static int cpsw_ale_read(struct cpsw_ale *ale, int idx, u32 *ale_entry)
 
 	WARN_ON(idx > ale->params.ale_entries);
 
-	__raw_writel(idx, ale->params.ale_regs + ALE_TABLE_CONTROL);
+	writel_relaxed(idx, ale->params.ale_regs + ALE_TABLE_CONTROL);
 
 	for (i = 0; i < ALE_ENTRY_WORDS; i++)
-		ale_entry[i] = __raw_readl(ale->params.ale_regs +
-					   ALE_TABLE + 4 * i);
+		ale_entry[i] = readl_relaxed(ale->params.ale_regs +
+					     ALE_TABLE + 4 * i);
 
 	return idx;
 }
@@ -166,11 +166,11 @@ static int cpsw_ale_write(struct cpsw_ale *ale, int idx, u32 *ale_entry)
 	WARN_ON(idx > ale->params.ale_entries);
 
 	for (i = 0; i < ALE_ENTRY_WORDS; i++)
-		__raw_writel(ale_entry[i], ale->params.ale_regs +
-			     ALE_TABLE + 4 * i);
+		writel_relaxed(ale_entry[i], ale->params.ale_regs +
+			       ALE_TABLE + 4 * i);
 
-	__raw_writel(idx | ALE_TABLE_WRITE, ale->params.ale_regs +
-		     ALE_TABLE_CONTROL);
+	writel_relaxed(idx | ALE_TABLE_WRITE, ale->params.ale_regs +
+		       ALE_TABLE_CONTROL);
 
 	return idx;
 }
@@ -733,9 +733,9 @@ int cpsw_ale_control_set(struct cpsw_ale *ale, int port, int control,
 	offset = info->offset + (port * info->port_offset);
 	shift = info->shift + (port * info->port_shift);
 
-	tmp = __raw_readl(ale->params.ale_regs + offset);
+	tmp = readl_relaxed(ale->params.ale_regs + offset);
 	tmp = (tmp & ~(mask << shift)) | (value << shift);
-	__raw_writel(tmp, ale->params.ale_regs + offset);
+	writel_relaxed(tmp, ale->params.ale_regs + offset);
 
 	return 0;
 }
@@ -760,7 +760,7 @@ int cpsw_ale_control_get(struct cpsw_ale *ale, int port, int control)
 	offset = info->offset + (port * info->port_offset);
 	shift = info->shift + (port * info->port_shift);
 
-	tmp = __raw_readl(ale->params.ale_regs + offset) >> shift;
+	tmp = readl_relaxed(ale->params.ale_regs + offset) >> shift;
 	return tmp & BITMASK(info->bits);
 }
 EXPORT_SYMBOL_GPL(cpsw_ale_control_get);
@@ -781,7 +781,7 @@ void cpsw_ale_start(struct cpsw_ale *ale)
 {
 	u32 rev, ale_entries;
 
-	rev = __raw_readl(ale->params.ale_regs + ALE_IDVER);
+	rev = readl_relaxed(ale->params.ale_regs + ALE_IDVER);
 	if (!ale->params.major_ver_mask)
 		ale->params.major_ver_mask = 0xff;
 	ale->version =
@@ -793,8 +793,8 @@ void cpsw_ale_start(struct cpsw_ale *ale)
 
 	if (!ale->params.ale_entries) {
 		ale_entries =
-			__raw_readl(ale->params.ale_regs + ALE_STATUS) &
-					ALE_STATUS_SIZE_MASK;
+			readl_relaxed(ale->params.ale_regs + ALE_STATUS) &
+			ALE_STATUS_SIZE_MASK;
 		/* ALE available on newer NetCP switches has introduced
 		 * a register, ALE_STATUS, to indicate the size of ALE
 		 * table which shows the size as a multiple of 1024 entries.
-- 
2.10.5
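The cpsw_ale_control_set/get() hunks in this patch only swap the register accessor; the read-modify-write of the bit field around it is unchanged. As a rough user-space sketch of that field update (not driver code): field_set(), field_get() and BITMASK_SKETCH() are hypothetical names, the register is modelled as a plain uint32_t, and masking the value instead of rejecting an oversized one is a choice of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a "low 'bits' ones" mask; assumes bits < 32. */
#define BITMASK_SKETCH(bits)	((1u << (bits)) - 1u)

/*
 * Replace the 'bits'-wide field at 'shift' in 'reg' with 'value',
 * leaving every other bit of 'reg' untouched.
 */
static uint32_t field_set(uint32_t reg, unsigned int shift,
			  unsigned int bits, uint32_t value)
{
	uint32_t mask = BITMASK_SKETCH(bits);

	assert(bits < 32 && shift < 32);
	/* The sketch truncates 'value' to the field width. */
	return (reg & ~(mask << shift)) | ((value & mask) << shift);
}

/* Extract the 'bits'-wide field at 'shift' from 'reg'. */
static uint32_t field_get(uint32_t reg, unsigned int shift, unsigned int bits)
{
	assert(bits < 32 && shift < 32);
	return (reg >> shift) & BITMASK_SKETCH(bits);
}
```

For example, writing 1 into a 2-bit field at shift 4 of an all-ones register clears bit 5 and leaves the other bits alone.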
[PATCH 11/11] net: ethernet: ti: ale: fix port check in cpsw_ale_control_set/get
The ALE port count includes the host port and the external ports, and ALE
port numbering starts from 0, so the valid range is 0..ale_ports - 1.
Correct the corresponding port checks in cpsw_ale_control_set/get().

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw_ale.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index db5f28e..416efec4c 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -723,7 +723,7 @@ int cpsw_ale_control_set(struct cpsw_ale *ale, int port, int control,
 	if (info->port_offset == 0 && info->port_shift == 0)
 		port = 0; /* global, port is a dont care */
 
-	if (port < 0 || port > ale->params.ale_ports)
+	if (port < 0 || port >= ale->params.ale_ports)
 		return -EINVAL;
 
 	mask = BITMASK(info->bits);
@@ -754,7 +754,7 @@ int cpsw_ale_control_get(struct cpsw_ale *ale, int port, int control)
 	if (info->port_offset == 0 && info->port_shift == 0)
 		port = 0; /* global, port is a dont care */
 
-	if (port < 0 || port > ale->params.ale_ports)
+	if (port < 0 || port >= ale->params.ale_ports)
 		return -EINVAL;
 
 	offset = info->offset + (port * info->port_offset);
-- 
2.10.5
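To make the off-by-one concrete, here is a minimal user-space sketch of the two range checks (not driver code). port_is_valid_old() and port_is_valid_new() are hypothetical helper names, and a port count of 3 is only an assumed example value.

```c
#include <stdbool.h>

/* Old check: rejects only port > ale_ports, so port == ale_ports is accepted. */
static bool port_is_valid_old(int port, int ale_ports)
{
	return !(port < 0 || port > ale_ports);
}

/* New check: ports are numbered 0..ale_ports - 1, so ale_ports is rejected. */
static bool port_is_valid_new(int port, int ale_ports)
{
	return !(port < 0 || port >= ale_ports);
}
```

With ale_ports == 3 the old check lets port 3 through, one past the last valid port, while the new check accepts only 0, 1 and 2.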
[PATCH 08/11] net: ethernet: ti: ale: optimize ale entry mask bits configuration
The ale->params.ale_ports parameter can be used to derive values for all
ale entry mask bits: port_mask_bits, vlan_field_bits, port_num_bits.
Hence, calculate the above values and drop all hardcoded values. For the
port_num_bits calculation use the order_base_2() API.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw_ale.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index 322f87c..34f97c1 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -816,9 +816,9 @@ void cpsw_ale_start(struct cpsw_ale *ale)
 			 "ALE Table size %ld\n", ale->params.ale_entries);
 
 	/* set default bits for existing h/w */
-	ale->port_mask_bits = 3;
-	ale->port_num_bits = 2;
-	ale->vlan_field_bits = 3;
+	ale->port_mask_bits = ale->params.ale_ports;
+	ale->port_num_bits = order_base_2(ale->params.ale_ports);
+	ale->vlan_field_bits = ale->params.ale_ports;
 
 	/* Set defaults override for ALE on NetCP NU switch and for version
 	 * 1R3
@@ -847,13 +847,6 @@ void cpsw_ale_start(struct cpsw_ale *ale)
 		ale_controls[ALE_PORT_UNTAGGED_EGRESS].shift = 0;
 		ale_controls[ALE_PORT_UNTAGGED_EGRESS].offset =
 					ALE_UNKNOWNVLAN_FORCE_UNTAG_EGRESS;
-		ale->port_mask_bits = ale->params.ale_ports;
-		ale->port_num_bits = ale->params.ale_ports - 1;
-		ale->vlan_field_bits = ale->params.ale_ports;
-	} else if (ale->version == ALE_VERSION_1R3) {
-		ale->port_mask_bits = ale->params.ale_ports;
-		ale->port_num_bits = 3;
-		ale->vlan_field_bits = ale->params.ale_ports;
 	}
 
 	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 1);
-- 
2.10.5
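For readers who want to check the arithmetic, below is a small user-space sketch of the derivation (not the driver code). order_base_2_sketch() is a hypothetical stand-in that computes ceil(log2(n)), which is what the kernel's order_base_2() is documented to return. struct ale_bits and derive_ale_bits() are illustration-only names as well.

```c
/* Hypothetical stand-in for order_base_2(): ceil(log2(n)), with 0 and 1 -> 0. */
static unsigned int order_base_2_sketch(unsigned int n)
{
	unsigned int order = 0;

	if (n > 1) {
		/* Count the bits needed to represent n - 1. */
		for (n--; n; n >>= 1)
			order++;
	}
	return order;
}

/* Illustration-only container for the three derived field widths. */
struct ale_bits {
	unsigned int port_mask_bits;
	unsigned int port_num_bits;
	unsigned int vlan_field_bits;
};

/* Derive all three widths from the port count, as the patch does. */
static struct ale_bits derive_ale_bits(unsigned int ale_ports)
{
	struct ale_bits b;

	b.port_mask_bits = ale_ports;		/* one mask bit per port */
	b.port_num_bits = order_base_2_sketch(ale_ports); /* bits to encode a port number */
	b.vlan_field_bits = ale_ports;		/* one bit per port in VLAN fields */
	return b;
}
```

Assuming ale_ports == 3, this yields 3/2/3, which matches the hardcoded defaults the patch removes.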
[PATCH 07/11] net: ethernet: ti: ale: disable ale from stop()
ALE is enabled from cpsw_ale_start() now, but disabled only from
cpsw_ale_destroy(), which introduces an inconsistency: cpsw_ale_start() is
called when netif[s] is opened, but cpsw_ale_destroy() is called when the
driver is removed. Hence, move ALE disabling into cpsw_ale_stop().

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw_ale.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index b21ed3d..322f87c 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -870,6 +870,7 @@ EXPORT_SYMBOL_GPL(cpsw_ale_start);
 void cpsw_ale_stop(struct cpsw_ale *ale)
 {
 	del_timer_sync(&ale->timer);
+	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 0);
 }
 EXPORT_SYMBOL_GPL(cpsw_ale_stop);
 
@@ -892,7 +893,6 @@ int cpsw_ale_destroy(struct cpsw_ale *ale)
 {
 	if (!ale)
 		return -EINVAL;
-	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 0);
 	kfree(ale);
 	return 0;
 }
-- 
2.10.5
[PATCH 10/11] net: ethernet: ti: ale: use devm_kzalloc in cpsw_ale_create()
Use devm_kzalloc() in cpsw_ale_create(). This also makes the
cpsw_ale_destroy() function a no-op, so remove it.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw.c        | 17 +++--
 drivers/net/ethernet/ti/cpsw_ale.c    | 11 +--
 drivers/net/ethernet/ti/cpsw_ale.h    |  1 -
 drivers/net/ethernet/ti/netcp_ethss.c |  1 -
 4 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index ca7c52a..f033096 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -3089,14 +3089,14 @@ static int cpsw_probe(struct platform_device *pdev)
 	cpsw->cpts = cpts_create(cpsw->dev, cpts_regs, cpsw->dev->of_node);
 	if (IS_ERR(cpsw->cpts)) {
 		ret = PTR_ERR(cpsw->cpts);
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	ndev->irq = platform_get_irq(pdev, 1);
 	if (ndev->irq < 0) {
 		dev_err(priv->dev, "error getting irq resource\n");
 		ret = ndev->irq;
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	of_id = of_match_device(cpsw_of_mtable, &pdev->dev);
@@ -3120,7 +3120,7 @@ static int cpsw_probe(struct platform_device *pdev)
 	if (ret) {
 		dev_err(priv->dev, "error registering net device\n");
 		ret = -ENODEV;
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	if (cpsw->data.dual_emac) {
@@ -3143,7 +3143,7 @@ static int cpsw_probe(struct platform_device *pdev)
 	irq = platform_get_irq(pdev, 1);
 	if (irq < 0) {
 		ret = irq;
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	cpsw->irqs_table[0] = irq;
@@ -3151,14 +3151,14 @@ static int cpsw_probe(struct platform_device *pdev)
 			       0, dev_name(&pdev->dev), cpsw);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	/* TX IRQ */
 	irq = platform_get_irq(pdev, 2);
 	if (irq < 0) {
 		ret = irq;
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	cpsw->irqs_table[1] = irq;
@@ -3166,7 +3166,7 @@ static int cpsw_probe(struct platform_device *pdev)
 			       0, dev_name(&pdev->dev), cpsw);
 	if (ret < 0) {
 		dev_err(priv->dev, "error attaching irq (%d)\n", ret);
-		goto clean_ale_ret;
+		goto clean_dma_ret;
 	}
 
 	cpsw_notice(priv, probe,
@@ -3179,8 +3179,6 @@ static int cpsw_probe(struct platform_device *pdev)
 
 clean_unregister_netdev_ret:
 	unregister_netdev(ndev);
-clean_ale_ret:
-	cpsw_ale_destroy(cpsw->ale);
 clean_dma_ret:
 	cpdma_ctlr_destroy(cpsw->dma);
 clean_dt_ret:
@@ -3210,7 +3208,6 @@ static int cpsw_remove(struct platform_device *pdev)
 	unregister_netdev(ndev);
 
 	cpts_release(cpsw->cpts);
-	cpsw_ale_destroy(cpsw->ale);
 	cpdma_ctlr_destroy(cpsw->dma);
 	cpsw_remove_dt(pdev);
 	pm_runtime_put_sync(&pdev->dev);
diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index d542859..db5f28e 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -802,7 +802,7 @@ struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
 	struct cpsw_ale *ale;
 	u32 rev, ale_entries;
 
-	ale = kzalloc(sizeof(*ale), GFP_KERNEL);
+	ale = devm_kzalloc(params->dev, sizeof(*ale), GFP_KERNEL);
 	if (!ale)
 		return NULL;
 
@@ -881,15 +881,6 @@ struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
 }
 EXPORT_SYMBOL_GPL(cpsw_ale_create);
 
-int cpsw_ale_destroy(struct cpsw_ale *ale)
-{
-	if (!ale)
-		return -EINVAL;
-	kfree(ale);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(cpsw_ale_destroy);
-
 void cpsw_ale_dump(struct cpsw_ale *ale, u32 *data)
 {
 	int i;
diff --git a/drivers/net/ethernet/ti/cpsw_ale.h b/drivers/net/ethernet/ti/cpsw_ale.h
index 25d24e8..d4fe901 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.h
+++ b/drivers/net/ethernet/ti/cpsw_ale.h
@@ -100,7 +100,6 @@ enum cpsw_ale_port_state {
 #define ALE_ENTRY_WORDS	DIV_ROUND_UP(ALE_ENTRY_BITS, 32)
 
 struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params);
-int cpsw_ale_destroy(struct cpsw_ale *ale);
 void cpsw_ale_start(struct cpsw_ale *ale);
 void cpsw_ale_stop(struct cpsw_ale *ale);
 
diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c
index 1ff0ade..ceb15a2 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -3690,7 +3690,6 @@ static int gbe_remove(struct netcp_device *netcp_device, void *inst_priv)
[PATCH 02/11] net: ethernet: ti: cpsw: use proper io apis
Switch to use writel_relaxed/readl_relaxed() IO API instead of raw version as it is recommended. Signed-off-by: Grygorii Strashko--- drivers/net/ethernet/ti/cpsw.c | 36 ++-- 1 file changed, 18 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index 9235b9e..955ee68 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -365,12 +365,12 @@ struct cpsw_slave { static inline u32 slave_read(struct cpsw_slave *slave, u32 offset) { - return __raw_readl(slave->regs + offset); + return readl_relaxed(slave->regs + offset); } static inline void slave_write(struct cpsw_slave *slave, u32 val, u32 offset) { - __raw_writel(val, slave->regs + offset); + writel_relaxed(val, slave->regs + offset); } struct cpsw_vector { @@ -660,8 +660,8 @@ static void cpsw_ndo_set_rx_mode(struct net_device *ndev) static void cpsw_intr_enable(struct cpsw_common *cpsw) { - __raw_writel(0xFF, >wr_regs->tx_en); - __raw_writel(0xFF, >wr_regs->rx_en); + writel_relaxed(0xFF, >wr_regs->tx_en); + writel_relaxed(0xFF, >wr_regs->rx_en); cpdma_ctlr_int_ctrl(cpsw->dma, true); return; @@ -669,8 +669,8 @@ static void cpsw_intr_enable(struct cpsw_common *cpsw) static void cpsw_intr_disable(struct cpsw_common *cpsw) { - __raw_writel(0, >wr_regs->tx_en); - __raw_writel(0, >wr_regs->rx_en); + writel_relaxed(0, >wr_regs->tx_en); + writel_relaxed(0, >wr_regs->rx_en); cpdma_ctlr_int_ctrl(cpsw->dma, false); return; @@ -949,12 +949,12 @@ static inline void soft_reset(const char *module, void __iomem *reg) { unsigned long timeout = jiffies + HZ; - __raw_writel(1, reg); + writel_relaxed(1, reg); do { cpu_relax(); - } while ((__raw_readl(reg) & 1) && time_after(timeout, jiffies)); + } while ((readl_relaxed(reg) & 1) && time_after(timeout, jiffies)); - WARN(__raw_readl(reg) & 1, "failed to soft-reset %s\n", module); + WARN(readl_relaxed(reg) & 1, "failed to soft-reset %s\n", module); } #define mac_hi(mac)(((mac)[0] << 0) | ((mac)[1] << 8) |\ 
@@ -1015,7 +1015,7 @@ static void _cpsw_adjust_link(struct cpsw_slave *slave, if (mac_control != slave->mac_control) { phy_print_status(phy); - __raw_writel(mac_control, >sliver->mac_control); + writel_relaxed(mac_control, >sliver->mac_control); } slave->mac_control = mac_control; @@ -1278,7 +1278,7 @@ static void cpsw_slave_open(struct cpsw_slave *slave, struct cpsw_priv *priv) soft_reset_slave(slave); /* setup priority mapping */ - __raw_writel(RX_PRIORITY_MAPPING, >sliver->rx_pri_map); + writel_relaxed(RX_PRIORITY_MAPPING, >sliver->rx_pri_map); switch (cpsw->version) { case CPSW_VERSION_1: @@ -1304,7 +1304,7 @@ static void cpsw_slave_open(struct cpsw_slave *slave, struct cpsw_priv *priv) } /* setup max packet size, and mac address */ - __raw_writel(cpsw->rx_packet_max, >sliver->rx_maxlen); + writel_relaxed(cpsw->rx_packet_max, >sliver->rx_maxlen); cpsw_set_slave_mac(slave, priv); slave->mac_control = 0; /* no link yet */ @@ -1395,9 +1395,9 @@ static void cpsw_init_host_port(struct cpsw_priv *priv) writel(fifo_mode, >host_port_regs->tx_in_ctl); /* setup host port priority mapping */ - __raw_writel(CPDMA_TX_PRIORITY_MAP, ->host_port_regs->cpdma_tx_pri_map); - __raw_writel(0, >host_port_regs->cpdma_rx_chan_map); + writel_relaxed(CPDMA_TX_PRIORITY_MAP, + >host_port_regs->cpdma_tx_pri_map); + writel_relaxed(0, >host_port_regs->cpdma_rx_chan_map); cpsw_ale_control_set(cpsw->ale, HOST_PORT_NUM, ALE_PORT_STATE, ALE_PORT_STATE_FORWARD); @@ -1514,10 +1514,10 @@ static int cpsw_ndo_open(struct net_device *ndev) /* initialize shared resources for every ndev */ if (!cpsw->usage_count) { /* disable priority elevation */ - __raw_writel(0, >regs->ptype); + writel_relaxed(0, >regs->ptype); /* enable statistics collection only on all ports */ - __raw_writel(0x7, >regs->stat_port_en); + writel_relaxed(0x7, >regs->stat_port_en); /* Enable internal fifo flow control */ writel(0x7, >regs->flow_control); @@ -1703,7 +1703,7 @@ static void cpsw_hwtstamp_v2(struct cpsw_priv *priv) 
slave_write(slave, mtype, CPSW2_TS_SEQ_MTYPE); slave_write(slave, ctrl, CPSW2_CONTROL); - __raw_writel(ETH_P_1588, >regs->ts_ltype); + writel_relaxed(ETH_P_1588, >regs->ts_ltype); } static int cpsw_hwtstamp_set(struct net_device *dev, struct ifreq *ifr) --
[PATCH 09/11] net: ethernet: ti: ale: move static initialization in cpsw_ale_create()
Move static initialization from cpsw_ale_start() to cpsw_ale_create(), as
it does not make much sense to perform static initialization in
cpsw_ale_start(), which is called every time netif[s] is opened.

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw_ale.c | 57 +++---
 1 file changed, 28 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw_ale.c b/drivers/net/ethernet/ti/cpsw_ale.c
index 34f97c1..d542859 100644
--- a/drivers/net/ethernet/ti/cpsw_ale.c
+++ b/drivers/net/ethernet/ti/cpsw_ale.c
@@ -779,8 +779,36 @@ static void cpsw_ale_timer(unsigned long arg)
 
 void cpsw_ale_start(struct cpsw_ale *ale)
 {
+	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 1);
+	cpsw_ale_control_set(ale, 0, ALE_CLEAR, 1);
+
+	setup_timer(&ale->timer, cpsw_ale_timer, (unsigned long)ale);
+	if (ale->ageout) {
+		ale->timer.expires = jiffies + ale->ageout;
+		add_timer(&ale->timer);
+	}
+}
+EXPORT_SYMBOL_GPL(cpsw_ale_start);
+
+void cpsw_ale_stop(struct cpsw_ale *ale)
+{
+	del_timer_sync(&ale->timer);
+	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 0);
+}
+EXPORT_SYMBOL_GPL(cpsw_ale_stop);
+
+struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
+{
+	struct cpsw_ale *ale;
 	u32 rev, ale_entries;
 
+	ale = kzalloc(sizeof(*ale), GFP_KERNEL);
+	if (!ale)
+		return NULL;
+
+	ale->params = *params;
+	ale->ageout = ale->params.ale_ageout * HZ;
+
 	rev = readl_relaxed(ale->params.ale_regs + ALE_IDVER);
 	if (!ale->params.major_ver_mask)
 		ale->params.major_ver_mask = 0xff;
@@ -849,35 +877,6 @@ void cpsw_ale_start(struct cpsw_ale *ale)
 					ALE_UNKNOWNVLAN_FORCE_UNTAG_EGRESS;
 	}
 
-	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 1);
-	cpsw_ale_control_set(ale, 0, ALE_CLEAR, 1);
-
-	setup_timer(&ale->timer, cpsw_ale_timer, (unsigned long)ale);
-	if (ale->ageout) {
-		ale->timer.expires = jiffies + ale->ageout;
-		add_timer(&ale->timer);
-	}
-}
-EXPORT_SYMBOL_GPL(cpsw_ale_start);
-
-void cpsw_ale_stop(struct cpsw_ale *ale)
-{
-	del_timer_sync(&ale->timer);
-	cpsw_ale_control_set(ale, 0, ALE_ENABLE, 0);
-}
-EXPORT_SYMBOL_GPL(cpsw_ale_stop);
-
-struct cpsw_ale *cpsw_ale_create(struct cpsw_ale_params *params)
-{
-	struct cpsw_ale *ale;
-
-	ale = kzalloc(sizeof(*ale), GFP_KERNEL);
-	if (!ale)
-		return NULL;
-
-	ale->params = *params;
-	ale->ageout = ale->params.ale_ageout * HZ;
-
 	return ale;
 }
 EXPORT_SYMBOL_GPL(cpsw_ale_create);
-- 
2.10.5
[PATCH v7 3/5] bpf: add a bpf_override_function helper
From: Josef BacikError injection is sloppy and very ad-hoc. BPF could fill this niche perfectly with it's kprobe functionality. We could make sure errors are only triggered in specific call chains that we care about with very specific situations. Accomplish this with the bpf_override_funciton helper. This will modify the probe'd callers return value to the specified value and set the PC to an override function that simply returns, bypassing the originally probed function. This gives us a nice clean way to implement systematic error injection for all of our code paths. Acked-by: Alexei Starovoitov Acked-by: Ingo Molnar Signed-off-by: Josef Bacik --- arch/Kconfig | 3 +++ arch/x86/Kconfig | 1 + arch/x86/include/asm/kprobes.h | 4 +++ arch/x86/include/asm/ptrace.h| 5 arch/x86/kernel/kprobes/ftrace.c | 14 ++ include/linux/filter.h | 3 ++- include/linux/trace_events.h | 1 + include/uapi/linux/bpf.h | 7 - kernel/bpf/core.c| 3 +++ kernel/bpf/verifier.c| 2 ++ kernel/events/core.c | 7 + kernel/trace/Kconfig | 11 kernel/trace/bpf_trace.c | 38 +++ kernel/trace/trace_kprobe.c | 55 +++- kernel/trace/trace_probe.h | 12 + 15 files changed, 157 insertions(+), 9 deletions(-) diff --git a/arch/Kconfig b/arch/Kconfig index d789a89cb32c..4fb618082259 100644 --- a/arch/Kconfig +++ b/arch/Kconfig @@ -195,6 +195,9 @@ config HAVE_OPTPROBES config HAVE_KPROBES_ON_FTRACE bool +config HAVE_KPROBE_OVERRIDE + bool + config HAVE_NMI bool diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 971feac13506..5126d2750dd0 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -152,6 +152,7 @@ config X86 select HAVE_KERNEL_XZ select HAVE_KPROBES select HAVE_KPROBES_ON_FTRACE + select HAVE_KPROBE_OVERRIDE select HAVE_KRETPROBES select HAVE_KVM select HAVE_LIVEPATCH if X86_64 diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h index 6cf65437b5e5..c6c3b1f4306a 100644 --- a/arch/x86/include/asm/kprobes.h +++ b/arch/x86/include/asm/kprobes.h @@ -67,6 +67,10 @@ extern const int 
kretprobe_blacklist_size; void arch_remove_kprobe(struct kprobe *p); asmlinkage void kretprobe_trampoline(void); +#ifdef CONFIG_KPROBES_ON_FTRACE +extern void arch_ftrace_kprobe_override_function(struct pt_regs *regs); +#endif + /* Architecture specific copy of original instruction*/ struct arch_specific_insn { /* copy of the original instruction */ diff --git a/arch/x86/include/asm/ptrace.h b/arch/x86/include/asm/ptrace.h index 91c04c8e67fa..f04e71800c2f 100644 --- a/arch/x86/include/asm/ptrace.h +++ b/arch/x86/include/asm/ptrace.h @@ -108,6 +108,11 @@ static inline unsigned long regs_return_value(struct pt_regs *regs) return regs->ax; } +static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc) +{ + regs->ax = rc; +} + /* * user_mode(regs) determines whether a register set came from user * mode. On x86_32, this is true if V8086 mode was enabled OR if the diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c index 041f7b6dfa0f..3c455bf490cb 100644 --- a/arch/x86/kernel/kprobes/ftrace.c +++ b/arch/x86/kernel/kprobes/ftrace.c @@ -97,3 +97,17 @@ int arch_prepare_kprobe_ftrace(struct kprobe *p) p->ainsn.boostable = false; return 0; } + +asmlinkage void override_func(void); +asm( + ".type override_func, @function\n" + "override_func:\n" + " ret\n" + ".size override_func, .-override_func\n" +); + +void arch_ftrace_kprobe_override_function(struct pt_regs *regs) +{ + regs->ip = (unsigned long)_func; +} +NOKPROBE_SYMBOL(arch_ftrace_kprobe_override_function); diff --git a/include/linux/filter.h b/include/linux/filter.h index cdd78a7beaae..dfa44fd74bae 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -458,7 +458,8 @@ struct bpf_prog { locked:1, /* Program image locked? */ gpl_compatible:1, /* Is filter GPL compatible? */ cb_access:1,/* Is control block accessed? */ - dst_needed:1; /* Do we need dst entry? */ + dst_needed:1, /* Do we need dst entry? */ + kprobe_override:1; /* Do we override a kprobe? 
*/ kmemcheck_bitfield_end(meta); enum bpf_prog_type type; /* Type of BPF program */ u32 len;/* Number of filter blocks */ diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h index
[PATCH v7 2/5] btrfs: make open_ctree error injectable
From: Josef Bacik

This allows us to do error injection with BPF for open_ctree.

Signed-off-by: Josef Bacik
Acked-by: Ingo Molnar
---
 fs/btrfs/disk-io.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index dfdab849037b..69d17a640b94 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include
 #include "ctree.h"
 #include "disk-io.h"
@@ -3283,6 +3284,7 @@ int open_ctree(struct super_block *sb,
 		goto fail_block_groups;
 	goto retry_root_backup;
 }
+BPF_ALLOW_ERROR_INJECTION(open_ctree);
 
 static void btrfs_end_buffer_write_sync(struct buffer_head *bh, int uptodate)
 {
-- 
2.7.5
[PATCH v7 1/5] add infrastructure for tagging functions as error injectable
From: Josef BacikUsing BPF we can override kprob'ed functions and return arbitrary values. Obviously this can be a bit unsafe, so make this feature opt-in for functions. Simply tag a function with KPROBE_ERROR_INJECT_SYMBOL in order to give BPF access to that function for error injection purposes. Signed-off-by: Josef Bacik Acked-by: Ingo Molnar --- arch/x86/include/asm/asm.h| 6 ++ include/asm-generic/vmlinux.lds.h | 10 +++ include/linux/bpf.h | 11 +++ include/linux/kprobes.h | 1 + include/linux/module.h| 5 ++ kernel/kprobes.c | 163 ++ kernel/module.c | 6 +- 7 files changed, 201 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h index b0dc91f4bedc..340f4cc43255 100644 --- a/arch/x86/include/asm/asm.h +++ b/arch/x86/include/asm/asm.h @@ -85,6 +85,12 @@ _ASM_PTR (entry); \ .popsection +# define _ASM_KPROBE_ERROR_INJECT(entry) \ + .pushsection "_kprobe_error_inject_list","aw" ; \ + _ASM_ALIGN ;\ + _ASM_PTR (entry); \ + .popseciton + .macro ALIGN_DESTINATION /* check for bad alignment of destination */ movl %edi,%ecx diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h index 8acfc1e099e1..85822804861e 100644 --- a/include/asm-generic/vmlinux.lds.h +++ b/include/asm-generic/vmlinux.lds.h @@ -136,6 +136,15 @@ #define KPROBE_BLACKLIST() #endif +#ifdef CONFIG_BPF_KPROBE_OVERRIDE +#define ERROR_INJECT_LIST(). = ALIGN(8); \ + VMLINUX_SYMBOL(__start_kprobe_error_inject_list) = .; \ + KEEP(*(_kprobe_error_inject_list)) \ + VMLINUX_SYMBOL(__stop_kprobe_error_inject_list) = .; +#else +#define ERROR_INJECT_LIST() +#endif + #ifdef CONFIG_EVENT_TRACING #define FTRACE_EVENTS(). 
= ALIGN(8); \ VMLINUX_SYMBOL(__start_ftrace_events) = .; \ @@ -560,6 +569,7 @@ FTRACE_EVENTS() \ TRACE_SYSCALLS()\ KPROBE_BLACKLIST() \ + ERROR_INJECT_LIST() \ MEM_DISCARD(init.rodata)\ CLK_OF_TABLES() \ RESERVEDMEM_OF_TABLES() \ diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 520aeebe0d93..552a666a338b 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -530,4 +530,15 @@ extern const struct bpf_func_proto bpf_sock_map_update_proto; void bpf_user_rnd_init_once(void); u64 bpf_user_rnd_u32(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); +#if defined(__KERNEL__) && !defined(__ASSEMBLY__) +#ifdef CONFIG_BPF_KPROBE_OVERRIDE +#define BPF_ALLOW_ERROR_INJECTION(fname) \ +static unsigned long __used\ + __attribute__((__section__("_kprobe_error_inject_list"))) \ + _eil_addr_##fname = (unsigned long)fname; +#else +#define BPF_ALLOW_ERROR_INJECTION(fname) +#endif +#endif + #endif /* _LINUX_BPF_H */ diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h index bd2684700b74..4f501cb73aec 100644 --- a/include/linux/kprobes.h +++ b/include/linux/kprobes.h @@ -271,6 +271,7 @@ extern bool arch_kprobe_on_func_entry(unsigned long offset); extern bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset); extern bool within_kprobe_blacklist(unsigned long addr); +extern bool within_kprobe_error_injection_list(unsigned long addr); struct kprobe_insn_cache { struct mutex mutex; diff --git a/include/linux/module.h b/include/linux/module.h index fe5aa3736707..7bb1a9b9a322 100644 --- a/include/linux/module.h +++ b/include/linux/module.h @@ -475,6 +475,11 @@ struct module { ctor_fn_t *ctors; unsigned int num_ctors; #endif + +#ifdef CONFIG_BPF_KPROBE_OVERRIDE + unsigned int num_kprobe_ei_funcs; + unsigned long *kprobe_ei_funcs; +#endif } cacheline_aligned __randomize_layout; #ifndef MODULE_ARCH_INIT #define MODULE_ARCH_INIT {} diff --git a/kernel/kprobes.c b/kernel/kprobes.c index a1606a4224e1..bdd7dd724f6f 100644 --- a/kernel/kprobes.c +++ 
b/kernel/kprobes.c @@ -83,6 +83,16 @@ static raw_spinlock_t *kretprobe_table_lock_ptr(unsigned long hash) return &(kretprobe_table_locks[hash].lock); } +/* List of
[PATCH v7 0/4] Add the ability to do BPF directed error injection
This is hopefully the final version. I've addressed the comment by Ingo and
added his Acks.

v6->v7:
- moved the opt-in macro to bpf.h out of kprobes.h.

v5->v6:
- add BPF_ALLOW_ERROR_INJECTION() tagging for functions that will support
  this feature. This way only functions that opt in will be allowed to be
  overridden.
- added a btrfs patch to allow error injection for open_ctree() so that the
  bpf sample actually works.

v4->v5:
- disallow kprobe_override programs from being put in the prog map array so
  we don't tail call into something we didn't check. This allows us to make
  the normal path still fast without a bunch of percpu operations.

v3->v4:
- fix a build error found by kbuild test bot (I didn't wait long enough
  apparently.)
- Added a warning message as per Daniel's suggestion.

v2->v3:
- added a ->kprobe_override flag to bpf_prog.
- added some sanity checks to disallow attaching bpf progs that have
  ->kprobe_override set that aren't for ftrace kprobes.
- added the trace_kprobe_ftrace helper to check if the trace_event_call is
  an ftrace kprobe.
- renamed bpf_kprobe_state to bpf_kprobe_override, fixed it so we only read
  this value in the kprobe path, and thus only write to it if we're
  overriding or clearing the override.

v1->v2:
- moved things around to make sure that bpf_override_return could really
  only be used for an ftrace kprobe.
- killed the special return values from trace_call_bpf.
- renamed pc_modified to bpf_kprobe_state so bpf_override_return could tell
  if it was being called from an ftrace kprobe context.
- reworked the logic in kprobe_perf_func to take advantage of
  bpf_kprobe_state.
- updated the test as per Alexei's review.

- Original message -

A lot of our error paths are not well tested because we have no good way of
injecting errors generically. Some subsystems (block, memory) have ways to
inject errors, but they are random so it's hard to get reproducible results.

With BPF we can add determinism to our error injection. We can use kprobes
and other things to verify we are injecting errors at the exact case we are
trying to test. This patch gives us the tool to actually do the error
injection part. It is very simple: we just set the return value of the
pt_regs we're given to whatever we provide, and then override the PC with a
dummy function that simply returns.

Right now this only works on x86, but it would be simple enough to expand to
other architectures.

Thanks,

Josef
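The opt-in rule from the v5->v6 notes (only tagged functions may have their return value overridden) can be pictured with a plain user-space analogy. This is only a sketch of the idea, not how the series implements it: the kernel side uses a linker section and ftrace kprobes, whereas every name below (injectable_fn, open_thing(), close_thing(), error_inject_list, call_with_override()) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

typedef int (*injectable_fn)(void);

/* Two toy functions that normally succeed; only open_thing() opts in. */
static int open_thing(void)  { return 0; }
static int close_thing(void) { return 0; }

/* Hypothetical allow list, playing the role of the tagged-function list. */
static const injectable_fn error_inject_list[] = { open_thing };

/* True if 'fn' opted in to error injection. */
static bool within_error_inject_list(injectable_fn fn)
{
	size_t i;

	for (i = 0; i < sizeof(error_inject_list) / sizeof(error_inject_list[0]); i++)
		if (error_inject_list[i] == fn)
			return true;
	return false;
}

/*
 * If an override is requested and 'fn' opted in, skip the call and
 * return 'rc' instead; otherwise call the real function.
 */
static int call_with_override(injectable_fn fn, bool override, int rc)
{
	if (override && within_error_inject_list(fn))
		return rc;
	return fn();
}
```

In this analogy, requesting -12 for open_thing() returns -12 without running it, while the same request for close_thing() is ignored because it never opted in.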
Re: [Outreachy kernel] Re: [PATCH] net: usb: hso.c: remove unneeded DRIVER_LICENSE #define
On Wed, 22 Nov 2017, Joe Perches wrote:

> On Fri, 2017-11-17 at 15:19 +0100, Greg Kroah-Hartman wrote:
> > There is no need to #define the license of the driver, just put it in
> > the MODULE_LICENSE() line directly as a text string.
> >
> > This allows tools that check that the module license matches the source
> > code license to work properly, as there is no need to unwind the
> > unneeded dereference.
> []
> > diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c
> []
> > @@ -76,7 +76,6 @@
> >
> >  #define MOD_AUTHOR			"Option Wireless"
> >  #define MOD_DESCRIPTION		"USB High Speed Option driver"
> > -#define MOD_LICENSE			"GPL"
> >
> >  #define HSO_MAX_NET_DEVICES		10
> >  #define HSO__MAX_MTU			2048
> > @@ -3288,7 +3287,7 @@ module_exit(hso_exit);
> >
> >  MODULE_AUTHOR(MOD_AUTHOR);
> >  MODULE_DESCRIPTION(MOD_DESCRIPTION);
> > -MODULE_LICENSE(MOD_LICENSE);
> > +MODULE_LICENSE("GPL");
>
> Probably all of these MODULE_(MOD_) uses could be
> simplified as well.
>
> Perhaps there's utility in a (cocci?) script that looks for
> used-once macro #defines in various types of macros.

What about module_version, eg:

diff -u -p a/drivers/ata/pata_pdc202xx_old.c b/drivers/ata/pata_pdc202xx_old.c
--- a/drivers/ata/pata_pdc202xx_old.c
+++ b/drivers/ata/pata_pdc202xx_old.c
@@ -21,7 +21,6 @@
 #include
 
 #define DRV_NAME "pata_pdc202xx_old"
-#define DRV_VERSION "0.4.3"
 
 static int pdc2026x_cable_detect(struct ata_port *ap)
 {
@@ -389,4 +388,4 @@ MODULE_AUTHOR("Alan Cox");
 MODULE_DESCRIPTION("low-level driver for Promise 2024x and 20262-20267");
 MODULE_LICENSE("GPL");
 MODULE_DEVICE_TABLE(pci, pdc202xx);
-MODULE_VERSION(DRV_VERSION);
+MODULE_VERSION("0.4.3");

julia
[PATCH v1] Bluetooth: introduce DEFINE_SHOW_ATTRIBUTE() macro
This macro deduplicates a lot of similar code across the hci_debugfs.c module. Targeting to be moved to seq_file.h eventually. Signed-off-by: Andy Shevchenko--- net/bluetooth/hci_debugfs.c | 184 +--- 1 file changed, 18 insertions(+), 166 deletions(-) diff --git a/net/bluetooth/hci_debugfs.c b/net/bluetooth/hci_debugfs.c index 63df63ebfb24..d4174d508cbf 100644 --- a/net/bluetooth/hci_debugfs.c +++ b/net/bluetooth/hci_debugfs.c @@ -88,6 +88,9 @@ static int __name ## _show(struct seq_file *f, void *ptr) \ return 0; \ }\ \ +DEFINE_SHOW_ATTRIBUTE(__name) + +#define DEFINE_SHOW_ATTRIBUTE(__name)\ static int __name ## _open(struct inode *inode, struct file *file) \ {\ return single_open(file, __name ## _show, inode->i_private); \ @@ -126,17 +129,7 @@ static int features_show(struct seq_file *f, void *ptr) return 0; } -static int features_open(struct inode *inode, struct file *file) -{ - return single_open(file, features_show, inode->i_private); -} - -static const struct file_operations features_fops = { - .open = features_open, - .read = seq_read, - .llseek = seq_lseek, - .release= single_release, -}; +DEFINE_SHOW_ATTRIBUTE(features); static int device_id_show(struct seq_file *f, void *ptr) { @@ -150,17 +143,7 @@ static int device_id_show(struct seq_file *f, void *ptr) return 0; } -static int device_id_open(struct inode *inode, struct file *file) -{ - return single_open(file, device_id_show, inode->i_private); -} - -static const struct file_operations device_id_fops = { - .open = device_id_open, - .read = seq_read, - .llseek = seq_lseek, - .release= single_release, -}; +DEFINE_SHOW_ATTRIBUTE(device_id); static int device_list_show(struct seq_file *f, void *ptr) { @@ -180,17 +163,7 @@ static int device_list_show(struct seq_file *f, void *ptr) return 0; } -static int device_list_open(struct inode *inode, struct file *file) -{ - return single_open(file, device_list_show, inode->i_private); -} - -static const struct file_operations device_list_fops = { - .open = 
device_list_open, - .read = seq_read, - .llseek = seq_lseek, - .release= single_release, -}; +DEFINE_SHOW_ATTRIBUTE(device_list); static int blacklist_show(struct seq_file *f, void *p) { @@ -205,17 +178,7 @@ static int blacklist_show(struct seq_file *f, void *p) return 0; } -static int blacklist_open(struct inode *inode, struct file *file) -{ - return single_open(file, blacklist_show, inode->i_private); -} - -static const struct file_operations blacklist_fops = { - .open = blacklist_open, - .read = seq_read, - .llseek = seq_lseek, - .release= single_release, -}; +DEFINE_SHOW_ATTRIBUTE(blacklist); static int uuids_show(struct seq_file *f, void *p) { @@ -240,17 +203,7 @@ static int uuids_show(struct seq_file *f, void *p) return 0; } -static int uuids_open(struct inode *inode, struct file *file) -{ - return single_open(file, uuids_show, inode->i_private); -} - -static const struct file_operations uuids_fops = { - .open = uuids_open, - .read = seq_read, - .llseek = seq_lseek, - .release= single_release, -}; +DEFINE_SHOW_ATTRIBUTE(uuids); static int remote_oob_show(struct seq_file *f, void *ptr) { @@ -269,17 +222,7 @@ static int remote_oob_show(struct seq_file *f, void *ptr) return 0; } -static int remote_oob_open(struct inode *inode, struct file *file) -{ - return single_open(file, remote_oob_show, inode->i_private); -} - -static const struct file_operations remote_oob_fops = { - .open = remote_oob_open, - .read = seq_read, - .llseek = seq_lseek, - .release= single_release, -}; +DEFINE_SHOW_ATTRIBUTE(remote_oob); static int conn_info_min_age_set(void *data, u64 val) { @@ -443,17 +386,7 @@ static int inquiry_cache_show(struct seq_file *f, void *p) return 0; } -static int inquiry_cache_open(struct inode *inode, struct file *file) -{ - return single_open(file, inquiry_cache_show, inode->i_private); -} - -static const struct file_operations inquiry_cache_fops = { - .open = inquiry_cache_open, - .read = seq_read, - .llseek = seq_lseek, - .release=
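To make the effect of the macro concrete, here is a minimal user-space sketch of the same token-pasting idea. It is not the kernel code: `struct show_ops` and the `(buf, len)` signature are hypothetical stand-ins for `struct file_operations` and the seq_file callbacks, chosen only so the example compiles outside the kernel.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct file_operations. */
struct show_ops {
    int (*open)(char *buf, size_t len);
};

/*
 * Given an existing <name>_show(), generate <name>_open() and <name>_ops.
 * This mirrors the shape of the macro in the patch, not its exact body.
 */
#define DEFINE_SHOW_ATTRIBUTE(__name)                          \
static int __name ## _open(char *buf, size_t len)              \
{                                                              \
    return __name ## _show(buf, len);                          \
}                                                              \
static const struct show_ops __name ## _ops = {                \
    .open = __name ## _open,                                   \
}

static int features_show(char *buf, size_t len)
{
    return snprintf(buf, len, "features\n");
}
DEFINE_SHOW_ATTRIBUTE(features);

static int uuids_show(char *buf, size_t len)
{
    return snprintf(buf, len, "uuids\n");
}
DEFINE_SHOW_ATTRIBUTE(uuids);
```

Each attribute now costs one `_show()` function plus one macro line, which is where the 166 deleted lines in the diffstat come from.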
[PATCH net] bpf: fix branch pruning logic
When the verifier detects that a register contains a runtime constant and it's compared with another constant, it will prune exploration of the branch that is guaranteed not to be taken at runtime. This is all correct, but a malicious program may be constructed in such a way that it always has a constant comparison and the other branch is never taken under any conditions. In this case such a path through the program will not be explored by the verifier. It won't be taken at run-time either, but since all instructions are JITed the malicious program may cause JITs to complain about using reserved fields, etc. To fix the issue we have to track the instructions explored by the verifier and sanitize instructions that are dead at run time with NOPs. We cannot reject such dead code, because llvm generates it for valid C code, as it doesn't do as much data flow analysis as the verifier does. Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") Signed-off-by: Alexei Starovoitov Acked-by: Daniel Borkmann --- for net-next we might try to remove dead code and adjust all branches instead of replacing with nops Implementation detail: converted_op_size is unused. We can reuse that space. 
--- include/linux/bpf_verifier.h | 2 +- kernel/bpf/verifier.c| 26 ++ 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h index 07b96aaca256..7b418f0a62f6 100644 --- a/include/linux/bpf_verifier.h +++ b/include/linux/bpf_verifier.h @@ -115,7 +115,7 @@ struct bpf_insn_aux_data { struct bpf_map *map_ptr;/* pointer for call insn into lookup_elem */ }; int ctx_field_size; /* the ctx field size for load insn, maybe 0 */ - int converted_op_size; /* the valid value width after perceived conversion */ + bool seen; /* this insn was processed by the verifier */ }; #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */ diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index dd54d20ace2f..77a23e0db4e9 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3825,6 +3825,7 @@ static int do_check(struct bpf_verifier_env *env) return err; regs = cur_regs(env); + env->insn_aux_data[insn_idx].seen = true; if (class == BPF_ALU || class == BPF_ALU64) { err = check_alu_op(env, insn); if (err) @@ -4020,6 +4021,7 @@ static int do_check(struct bpf_verifier_env *env) return err; insn_idx++; + env->insn_aux_data[insn_idx].seen = true; } else { verbose(env, "invalid BPF_LD mode\n"); return -EINVAL; @@ -4202,6 +4204,7 @@ static int adjust_insn_aux_data(struct bpf_verifier_env *env, u32 prog_len, u32 off, u32 cnt) { struct bpf_insn_aux_data *new_data, *old_data = env->insn_aux_data; + int i; if (cnt == 1) return 0; @@ -4211,6 +4214,8 @@ static int adjust_insn_aux_data(struct bpf_verifier_env *env, u32 prog_len, memcpy(new_data, old_data, sizeof(struct bpf_insn_aux_data) * off); memcpy(new_data + off + cnt - 1, old_data + off, sizeof(struct bpf_insn_aux_data) * (prog_len - off - cnt + 1)); + for (i = off; i < off + cnt - 1; i++) + new_data[i].seen = true; env->insn_aux_data = new_data; vfree(old_data); return 0; @@ -4229,6 +4234,25 @@ static struct bpf_prog 
*bpf_patch_insn_data(struct bpf_verifier_env *env, u32 of return new_prog; } +/* The verifier does more data flow analysis than llvm and will not explore + * branches that are dead at run time. Malicious programs can have dead code + * too. Therefore replace all dead at-run-time code with nops. + */ +static void sanitize_dead_code(struct bpf_verifier_env *env) +{ + struct bpf_insn_aux_data *aux_data = env->insn_aux_data; + struct bpf_insn nop = BPF_MOV64_REG(BPF_REG_0, BPF_REG_0); + struct bpf_insn *insn = env->prog->insnsi; + const int insn_cnt = env->prog->len; + int i; + + for (i = 0; i < insn_cnt; i++) { + if (aux_data[i].seen) + continue; + memcpy(insn + i, &nop, sizeof(nop)); + } +} + /* convert load instructions that access fields of 'struct __sk_buff' * into sequence of instructions that access fields of 'struct sk_buff' */ @@ -4555,6 +4579,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr) while (!pop_stack(env, NULL, NULL)); free_states(env); + sanitize_dead_code(env); + if (ret == 0) /* program is valid, convert *(u32*)(ctx + off) accesses
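The "mark what was explored, then NOP out the rest" idea can be shown with a toy model. The sketch below is only an illustration under stated assumptions: `struct toy_insn`, the `TOY_*` opcode values and `toy_sanitize_dead_code()` are made up for this example and do not correspond to real BPF encodings or verifier data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Made-up opcodes for the sketch; TOY_RESERVED plays the role of an
 * encoding a JIT would complain about. */
enum { TOY_NOP = 0, TOY_ADD = 1, TOY_JMP = 2, TOY_RESERVED = 0xff };

struct toy_insn {
    unsigned char code;
};

/*
 * seen[i] is true if the (toy) verifier walked instruction i.
 * Every instruction that was never walked is overwritten with a NOP.
 * Returns the number of instructions patched.
 */
static size_t toy_sanitize_dead_code(struct toy_insn *insn,
                                     const bool *seen, size_t cnt)
{
    size_t i, patched = 0;

    for (i = 0; i < cnt; i++) {
        if (seen[i])
            continue;
        insn[i].code = TOY_NOP;
        patched++;
    }
    return patched;
}
```

Rewriting in place keeps every jump offset valid, which is presumably why the patch prefers NOPs over deleting the dead instructions for now.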
[PATCH iproute 2/5] ila: added csum neutral support to ipila
Add checksum neutral to ip ila configuration. This controls whether the C-bit is interpreted as the checksum neutral bit. Signed-off-by: Tom Herbert --- ip/ipila.c | 57 +++-- 1 file changed, 55 insertions(+), 2 deletions(-) diff --git a/ip/ipila.c b/ip/ipila.c index fe5c4d8b..d4935d18 100644 --- a/ip/ipila.c +++ b/ip/ipila.c @@ -26,7 +26,9 @@ static void usage(void) { fprintf(stderr, "Usage: ip ila add loc_match LOCATOR_MATCH " - "loc LOCATOR [ dev DEV ]\n"); + "loc LOCATOR [ dev DEV ] " + "[ csum-mode { adj-transport | neutral-map | " + "no-action } ]\n"); fprintf(stderr, " ip ila del loc_match LOCATOR_MATCH " "[ loc LOCATOR ] [ dev DEV ]\n"); fprintf(stderr, " ip ila list\n"); @@ -48,6 +50,32 @@ static int genl_family = -1; #define ADDR_BUF_SIZE sizeof(":::") +static char *ila_csum_mode2name(__u8 csum_mode) +{ + switch (csum_mode) { + case ILA_CSUM_ADJUST_TRANSPORT: + return "adj-transport"; + case ILA_CSUM_NEUTRAL_MAP: + return "neutral-map"; + case ILA_CSUM_NO_ACTION: + return "no-action"; + default: + return "unknown"; + } +} + +static int ila_csum_name2mode(char *name) +{ + if (strcmp(name, "adj-transport") == 0) + return ILA_CSUM_ADJUST_TRANSPORT; + else if (strcmp(name, "neutral-map") == 0) + return ILA_CSUM_NEUTRAL_MAP; + else if (strcmp(name, "no-action") == 0) + return ILA_CSUM_NO_ACTION; + else + return -1; +} + static int print_addr64(__u64 addr, char *buff, size_t len) { __u16 *words = (__u16 *) @@ -113,9 +141,19 @@ static int print_ila_mapping(const struct sockaddr_nl *who, print_ila_locid(fp, ILA_ATTR_LOCATOR, tb, ADDR_BUF_SIZE); if (tb[ILA_ATTR_IFINDEX]) - fprintf(fp, "%s", ll_index_to_name(rta_getattr_u32(tb[ILA_ATTR_IFINDEX]))); + fprintf(fp, "%-16s", + ll_index_to_name(rta_getattr_u32( + tb[ILA_ATTR_IFINDEX]))); + else + fprintf(fp, "%-16s", "-"); + + if (tb[ILA_ATTR_CSUM_MODE]) + fprintf(fp, "%s", + ila_csum_mode2name(rta_getattr_u8( + tb[ILA_ATTR_CSUM_MODE]))); else fprintf(fp, "-"); + fprintf(fp, "\n"); return 0; @@ -152,9 +190,11 @@ static int 
ila_parse_opt(int argc, char **argv, struct nlmsghdr *n, __u64 locator = 0; __u64 locator_match = 0; int ifindex = 0; + int csum_mode = 0; bool loc_set = false; bool loc_match_set = false; bool ifindex_set = false; + bool csum_mode_set = false; while (argc > 0) { if (!matches(*argv, "loc")) { @@ -174,6 +214,16 @@ static int ila_parse_opt(int argc, char **argv, struct nlmsghdr *n, return -1; } loc_match_set = true; + } else if (!matches(*argv, "csum-mode")) { + NEXT_ARG(); + + csum_mode = ila_csum_name2mode(*argv); + if (csum_mode < 0) { + fprintf(stderr, "Bad csum-mode: %s\n", + *argv); + return -1; + } + csum_mode_set = true; } else if (!matches(*argv, "dev")) { NEXT_ARG(); @@ -211,6 +261,9 @@ static int ila_parse_opt(int argc, char **argv, struct nlmsghdr *n, if (ifindex_set) addattr32(n, 1024, ILA_ATTR_IFINDEX, ifindex); + if (csum_mode_set) + addattr8(n, 1024, ILA_ATTR_CSUM_MODE, csum_mode); + return 0; } -- 2.11.0
[PATCH iproute 5/5] ila: create ila_common.h
Move common functions related to checksum, identifier and hook-type parsing to a common include file. Signed-off-by: Tom Herbert--- ip/ila_common.h | 105 ++ ip/ipila.c| 77 +--- ip/iproute_lwtunnel.c | 97 +- 3 files changed, 107 insertions(+), 172 deletions(-) create mode 100644 ip/ila_common.h diff --git a/ip/ila_common.h b/ip/ila_common.h new file mode 100644 index ..04c6c2ed --- /dev/null +++ b/ip/ila_common.h @@ -0,0 +1,105 @@ +#ifndef _ILA_COMMON_H_ +#define _ILA_COMMON_H_ + +#include +#include + +static inline char *ila_csum_mode2name(__u8 csum_mode) +{ + switch (csum_mode) { + case ILA_CSUM_ADJUST_TRANSPORT: + return "adj-transport"; + case ILA_CSUM_NEUTRAL_MAP: + return "neutral-map"; + case ILA_CSUM_NO_ACTION: + return "no-action"; + case ILA_CSUM_NEUTRAL_MAP_AUTO: + return "neutral-map-auto"; + default: + return "unknown"; + } +} + +static inline int ila_csum_name2mode(char *name) +{ + if (strcmp(name, "adj-transport") == 0) + return ILA_CSUM_ADJUST_TRANSPORT; + else if (strcmp(name, "neutral-map") == 0) + return ILA_CSUM_NEUTRAL_MAP; + else if (strcmp(name, "neutral-map-auto") == 0) + return ILA_CSUM_NEUTRAL_MAP_AUTO; + else if (strcmp(name, "no-action") == 0) + return ILA_CSUM_NO_ACTION; + else if (strcmp(name, "neutral-map-auto") == 0) + return ILA_CSUM_NEUTRAL_MAP_AUTO; + else + return -1; +} + +static inline char *ila_ident_type2name(__u8 ident_type) +{ + switch (ident_type) { + case ILA_ATYPE_IID: + return "iid"; + case ILA_ATYPE_LUID: + return "luid"; + case ILA_ATYPE_VIRT_V4: + return "virt-v4"; + case ILA_ATYPE_VIRT_UNI_V6: + return "virt-uni-v6"; + case ILA_ATYPE_VIRT_MULTI_V6: + return "virt-multi-v6"; + case ILA_ATYPE_NONLOCAL_ADDR: + return "nonlocal-addr"; + case ILA_ATYPE_USE_FORMAT: + return "use-format"; + default: + return "unknown"; + } +} + +static inline int ila_ident_name2type(char *name) +{ + if (!strcmp(name, "luid")) + return ILA_ATYPE_LUID; + else if (!strcmp(name, "use-format")) + return ILA_ATYPE_USE_FORMAT; +#if 0 /* No kernel 
support for configuring these yet */ + else if (!strcmp(name, "iid")) + return ILA_ATYPE_IID; + else if (!strcmp(name, "virt-v4")) + return ILA_ATYPE_VIRT_V4; + else if (!strcmp(name, "virt-uni-v6")) + return ILA_ATYPE_VIRT_UNI_V6; + else if (!strcmp(name, "virt-multi-v6")) + return ILA_ATYPE_VIRT_MULTI_V6; + else if (!strcmp(name, "nonlocal-addr")) + return ILA_ATYPE_NONLOCAL_ADDR; +#endif + else + return -1; +} + +static inline char *ila_hook_type2name(__u8 hook_type) +{ + switch (hook_type) { + case ILA_HOOK_ROUTE_OUTPUT: + return "output"; + case ILA_HOOK_ROUTE_INPUT: + return "input"; + default: + return "unknown"; + } +} + +static inline int ila_hook_name2type(char *name) +{ + if (!strcmp(name, "output")) + return ILA_HOOK_ROUTE_OUTPUT; + else if (!strcmp(name, "input")) + return ILA_HOOK_ROUTE_INPUT; + else + return -1; +} + +#endif /* _ILA_COMMON_H_ */ diff --git a/ip/ipila.c b/ip/ipila.c index c7a8ede8..fcc20bf7 100644 --- a/ip/ipila.c +++ b/ip/ipila.c @@ -22,6 +22,7 @@ #include "libgenl.h" #include "utils.h" #include "ip_common.h" +#include "ila_common.h" static void usage(void) { @@ -51,82 +52,6 @@ static int genl_family = -1; #define ADDR_BUF_SIZE sizeof(":::") -static char *ila_csum_mode2name(__u8 csum_mode) -{ - switch (csum_mode) { - case ILA_CSUM_ADJUST_TRANSPORT: - return "adj-transport"; - case ILA_CSUM_NEUTRAL_MAP: - return "neutral-map"; - case ILA_CSUM_NO_ACTION: - return "no-action"; - case ILA_CSUM_NEUTRAL_MAP_AUTO: - return "neutral-map-auto"; - default: - return "unknown"; - } -} - -static int ila_csum_name2mode(char *name) -{ - if (strcmp(name, "adj-transport") == 0) - return ILA_CSUM_ADJUST_TRANSPORT; - else if (strcmp(name, "neutral-map") == 0) - return ILA_CSUM_NEUTRAL_MAP; - else if (strcmp(name, "neutral-map-auto") == 0) - return ILA_CSUM_NEUTRAL_MAP_AUTO; - else if (strcmp(name, "no-action") == 0) - return ILA_CSUM_NO_ACTION; -
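As posted, ila_csum_name2mode() tests "neutral-map-auto" in two separate else-if branches, so the second one can never be reached. One way to rule out that kind of slip is to drive both lookup directions from a single table. The following is a sketch only: the `toy_` names are hypothetical, and the enum is a local stand-in whose numeric values are assumptions rather than the kernel's ILA_CSUM_* definitions.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins for ILA_CSUM_*; the numeric values are assumptions. */
enum toy_csum_mode {
    TOY_CSUM_ADJUST_TRANSPORT,
    TOY_CSUM_NEUTRAL_MAP,
    TOY_CSUM_NO_ACTION,
    TOY_CSUM_NEUTRAL_MAP_AUTO,
};

/* One row per mode; both lookup directions share this table. */
static const struct {
    const char *name;
    int mode;
} toy_csum_modes[] = {
    { "adj-transport",    TOY_CSUM_ADJUST_TRANSPORT },
    { "neutral-map",      TOY_CSUM_NEUTRAL_MAP },
    { "no-action",        TOY_CSUM_NO_ACTION },
    { "neutral-map-auto", TOY_CSUM_NEUTRAL_MAP_AUTO },
};

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Returns the mode for a name, or -1 if the name is not known. */
static int toy_csum_name2mode(const char *name)
{
    size_t i;

    for (i = 0; i < TOY_ARRAY_SIZE(toy_csum_modes); i++)
        if (strcmp(name, toy_csum_modes[i].name) == 0)
            return toy_csum_modes[i].mode;
    return -1;
}

/* Returns the name for a mode, or "unknown" if the mode is not listed. */
static const char *toy_csum_mode2name(int mode)
{
    size_t i;

    for (i = 0; i < TOY_ARRAY_SIZE(toy_csum_modes); i++)
        if (toy_csum_modes[i].mode == mode)
            return toy_csum_modes[i].name;
    return "unknown";
}
```

Adding a new mode then means adding one row, and the two functions cannot drift apart.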
[PATCH iproute 3/5] ila: support to configure checksum neutral-map-auto
Configuration support in both ip ila and ip LWT for checksum neutral-map-auto. This is a mode of ILA where checksum neutral mapping is assumed for packets (there is no C-bit in the identifier to indicate checksum neutral). Signed-off-by: Tom Herbert--- ip/ipila.c| 8 +--- ip/iproute_lwtunnel.c | 4 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/ip/ipila.c b/ip/ipila.c index d4935d18..0b706f0b 100644 --- a/ip/ipila.c +++ b/ip/ipila.c @@ -28,7 +28,7 @@ static void usage(void) fprintf(stderr, "Usage: ip ila add loc_match LOCATOR_MATCH " "loc LOCATOR [ dev DEV ] " "[ csum-mode { adj-transport | neutral-map | " - "no-action } ]\n"); + "neutral-map-auto | no-action } ]\n"); fprintf(stderr, " ip ila del loc_match LOCATOR_MATCH " "[ loc LOCATOR ] [ dev DEV ]\n"); fprintf(stderr, " ip ila list\n"); @@ -59,6 +59,8 @@ static char *ila_csum_mode2name(__u8 csum_mode) return "neutral-map"; case ILA_CSUM_NO_ACTION: return "no-action"; + case ILA_CSUM_NEUTRAL_MAP_AUTO: + return "neutral-map-auto"; default: return "unknown"; } @@ -70,8 +72,8 @@ static int ila_csum_name2mode(char *name) return ILA_CSUM_ADJUST_TRANSPORT; else if (strcmp(name, "neutral-map") == 0) return ILA_CSUM_NEUTRAL_MAP; - else if (strcmp(name, "no-action") == 0) - return ILA_CSUM_NO_ACTION; + else if (strcmp(name, "neutral-map-auto") == 0) + return ILA_CSUM_NEUTRAL_MAP_AUTO; else return -1; } diff --git a/ip/iproute_lwtunnel.c b/ip/iproute_lwtunnel.c index 1c8adbe7..ebedd94a 100644 --- a/ip/iproute_lwtunnel.c +++ b/ip/iproute_lwtunnel.c @@ -288,6 +288,8 @@ static char *ila_csum_mode2name(__u8 csum_mode) return "neutral-map"; case ILA_CSUM_NO_ACTION: return "no-action"; + case ILA_CSUM_NEUTRAL_MAP_AUTO: + return "neutral-map-auto"; default: return "unknown"; } @@ -301,6 +303,8 @@ static int ila_csum_name2mode(char *name) return ILA_CSUM_NEUTRAL_MAP; else if (strcmp(name, "no-action") == 0) return ILA_CSUM_NO_ACTION; + else if (strcmp(name, "neutral-map-auto") == 0) + return ILA_CSUM_NEUTRAL_MAP_AUTO; 
else return -1; } -- 2.11.0
[PATCH iproute 4/5] ila: support for configuring identifier and hook types
Expose identifier type and hook types in ILA configuration and reporting. This adds support in both ip ila and ILA LWT. Signed-off-by: Tom Herbert --- ip/ipila.c| 75 ++- ip/iproute_lwtunnel.c | 107 +- 2 files changed, 179 insertions(+), 3 deletions(-) diff --git a/ip/ipila.c b/ip/ipila.c index 0b706f0b..c7a8ede8 100644 --- a/ip/ipila.c +++ b/ip/ipila.c @@ -28,7 +28,8 @@ static void usage(void) fprintf(stderr, "Usage: ip ila add loc_match LOCATOR_MATCH " "loc LOCATOR [ dev DEV ] " "[ csum-mode { adj-transport | neutral-map | " - "neutral-map-auto | no-action } ]\n"); + "neutral-map-auto | no-action } ] " + "[ ident-type { luid | use-format } ]\n"); fprintf(stderr, " ip ila del loc_match LOCATOR_MATCH " "[ loc LOCATOR ] [ dev DEV ]\n"); fprintf(stderr, " ip ila list\n"); @@ -74,6 +75,54 @@ static int ila_csum_name2mode(char *name) return ILA_CSUM_NEUTRAL_MAP; else if (strcmp(name, "neutral-map-auto") == 0) return ILA_CSUM_NEUTRAL_MAP_AUTO; + else if (strcmp(name, "no-action") == 0) + return ILA_CSUM_NO_ACTION; + else if (strcmp(name, "neutral-map-auto") == 0) + return ILA_CSUM_NEUTRAL_MAP_AUTO; + else + return -1; +} + +static char *ila_ident_type2name(__u8 ident_type) +{ + switch (ident_type) { + case ILA_ATYPE_IID: + return "iid"; + case ILA_ATYPE_LUID: + return "luid"; + case ILA_ATYPE_VIRT_V4: + return "virt-v4"; + case ILA_ATYPE_VIRT_UNI_V6: + return "virt-uni-v6"; + case ILA_ATYPE_VIRT_MULTI_V6: + return "virt-multi-v6"; + case ILA_ATYPE_NONLOCAL_ADDR: + return "nonlocal-addr"; + case ILA_ATYPE_USE_FORMAT: + return "use-format"; + default: + return "unknown"; + } +} + +static int ila_ident_name2type(char *name) +{ + if (!strcmp(name, "luid")) + return ILA_ATYPE_LUID; + else if (!strcmp(name, "use-format")) + return ILA_ATYPE_USE_FORMAT; +#if 0 /* No kernel support for configuring these yet */ + else if (!strcmp(name, "iid")) + return ILA_ATYPE_IID; + else if (!strcmp(name, "virt-v4")) + return ILA_ATYPE_VIRT_V4; + else if (!strcmp(name, "virt-uni-v6")) + return 
ILA_ATYPE_VIRT_UNI_V6; + else if (!strcmp(name, "virt-multi-v6")) + return ILA_ATYPE_VIRT_MULTI_V6; + else if (!strcmp(name, "nonlocal-addr")) + return ILA_ATYPE_NONLOCAL_ADDR; +#endif else return -1; } @@ -147,13 +196,20 @@ static int print_ila_mapping(const struct sockaddr_nl *who, ll_index_to_name(rta_getattr_u32( tb[ILA_ATTR_IFINDEX]))); else - fprintf(fp, "%-16s", "-"); + fprintf(fp, "%-10s ", "-"); if (tb[ILA_ATTR_CSUM_MODE]) fprintf(fp, "%s", ila_csum_mode2name(rta_getattr_u8( tb[ILA_ATTR_CSUM_MODE]))); else + fprintf(fp, "%-10s ", "-"); + + if (tb[ILA_ATTR_IDENT_TYPE]) + fprintf(fp, "%s", + ila_ident_type2name(rta_getattr_u8( + tb[ILA_ATTR_IDENT_TYPE]))); + else fprintf(fp, "-"); fprintf(fp, "\n"); @@ -193,10 +249,12 @@ static int ila_parse_opt(int argc, char **argv, struct nlmsghdr *n, __u64 locator_match = 0; int ifindex = 0; int csum_mode = 0; + int ident_type = 0; bool loc_set = false; bool loc_match_set = false; bool ifindex_set = false; bool csum_mode_set = false; + bool ident_type_set = false; while (argc > 0) { if (!matches(*argv, "loc")) { @@ -226,6 +284,16 @@ static int ila_parse_opt(int argc, char **argv, struct nlmsghdr *n, return -1; } csum_mode_set = true; + } else if (!matches(*argv, "ident-type")) { + NEXT_ARG(); + + ident_type = ila_ident_name2type(*argv); + if (ident_type < 0) { + fprintf(stderr, "Bad ident-type: %s\n", + *argv); + return -1; + } + ident_type_set = true; } else if (!matches(*argv, "dev")) { NEXT_ARG(); @@ -266,6 +334,9 @@ static int ila_parse_opt(int argc, char
[PATCH iproute 1/5] ila: Fix reporting of ILA locators and locator match
Fix retrieval of locator value for RTA to get 64 bits instead of 32. Signed-off-by: Tom Herbert--- ip/ipila.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ip/ipila.c b/ip/ipila.c index 0403fc42..fe5c4d8b 100644 --- a/ip/ipila.c +++ b/ip/ipila.c @@ -79,7 +79,7 @@ static void print_ila_locid(FILE *fp, int attr, struct rtattr *tb[], int space) int i; if (tb[attr]) { - blen = print_addr64(rta_getattr_u32(tb[attr]), + blen = print_addr64(rta_getattr_u64(tb[attr]), abuf, sizeof(abuf)); fprintf(fp, "%s", abuf); } else { -- 2.11.0
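Why the one-word change matters can be seen with a small stand-alone sketch. The two helpers below are hypothetical stand-ins that simply read the first 4 or 8 bytes of an attribute payload; they are not the libnetlink functions themselves, and the locator value is an arbitrary example.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for a 32-bit attribute read: only the first 4 payload bytes. */
static uint32_t toy_getattr_u32(const void *payload)
{
    uint32_t v;

    memcpy(&v, payload, sizeof(v));
    return v;
}

/* Stand-in for a 64-bit attribute read: all 8 payload bytes. */
static uint64_t toy_getattr_u64(const void *payload)
{
    uint64_t v;

    memcpy(&v, payload, sizeof(v));
    return v;
}
```

With a 64-bit locator such as 0x20010db800010002, the 32-bit read returns only half of the value (which half depends on byte order), so the printed locator is wrong; the 64-bit read returns it intact.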
[PATCH iproute 0/5] ila: additional configuration support
Add configuration support for checksum neutral-map-auto, identifier types, and hook type (for LWT). Tom Herbert (5): ila: Fix reporting of ILA locators and locator match ila: added csum neutral support to ipila ila: support to configure checksum neutral-map-auto ila: support for configuring identifier and hook types ila: create ila_common.h ip/ila_common.h | 105 ++ ip/ipila.c| 57 +-- ip/iproute_lwtunnel.c | 68 +++- 3 files changed, 200 insertions(+), 30 deletions(-) create mode 100644 ip/ila_common.h -- 2.11.0
Re: [PATCH] net: sched: crash on blocks with goto chain action
On Tue, Nov 21, 2017 at 12:02 PM, Roman Kapl wrote: > > But maybe the "hold all chains" approach from 822e86d997 (net_sched: remove > tcf_block_put_deferred()) is simpler to understand? > Yes, it is much easier to understand for me, probably for others too.
Re: len = bpf_probe_read_str(); bpf_perf_event_output(... len) == FAIL
On Tue, Nov 21, 2017 at 2:31 PM, Alexei Starovoitov wrote: > > yeah sorry about this hack. Gianluca reported this issue as well. > Yonghong fixed it for bpf_probe_read only. We will extend > the fix to bpf_probe_read_str() and bpf_perf_event_output() asap. > The above workaround gets too much into llvm and verifier details > we should strive to make bpf program writing as easy as possible. > Hi Arnaldo, With the help of Alexei, Daniel and Yonghong I just submitted a new series ("bpf: fix semantics issues with helpers receiving NULL arguments") that includes a fix in bpf_perf_event_output. This should simplify the way you write your bpf programs, so you shouldn't be required to write those convoluted branches anymore (there are a few usage examples in the commit log). In my case it made writing the code much easier: after applying it I haven't been surprised by the compiler output in a while, and I hope your experience will be improved as well. Thanks
Re: kernel BUG at crypto/asymmetric_keys/public_key.c:80
On Wed, 2017-11-22 at 19:29 +0100, Arend van Spriel wrote: > + Johannes > > >>> BUG_ON(!sig->digest); > BUG_ON(!sig->s); I *think* this is the same bug that was reported before, then this should fix it: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/?id=d7be102f2945a626f55e0501e52bb31ba3e77b81 Can you try? johannes
[PATCH net 4/4] bpf: change bpf_perf_event_output arg5 type to ARG_CONST_SIZE_OR_ZERO
Commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semantics") relaxed the treatment of ARG_CONST_SIZE_OR_ZERO due to the way the compiler generates optimized BPF code when checking boundaries of an argument from C code. A typical example of this optimized code can be generated using the bpf_perf_event_output helper when operating on variable memory: /* len is a generic scalar */ if (len > 0 && len <= 0x7fff) bpf_perf_event_output(ctx, _map, 0, buf, len); 110: (79) r5 = *(u64 *)(r10 -40) 111: (bf) r1 = r5 112: (07) r1 += -1 113: (25) if r1 > 0x7ffe goto pc+6 114: (bf) r1 = r6 115: (18) r2 = 0x94e5f166c200 117: (b7) r3 = 0 118: (bf) r4 = r7 119: (85) call bpf_perf_event_output#25 R5 min value is negative, either use unsigned or 'var &= const' With this code, the verifier loses track of the variable. Replacing arg5 with ARG_CONST_SIZE_OR_ZERO is thus desirable since it avoids this quite common case which leads to usability issues, and the compiler generates code that the verifier can more easily test: if (len <= 0x7fff) bpf_perf_event_output(ctx, _map, 0, buf, len); or bpf_perf_event_output(ctx, _map, 0, buf, len & 0x7fff); No changes to the bpf_perf_event_output helper are necessary since it can handle a case where size is 0, and an empty frame is pushed. 
Reported-by: Arnaldo Carvalho de Melo Signed-off-by: Gianluca Borello Acked-by: Alexei Starovoitov Acked-by: Daniel Borkmann --- kernel/trace/bpf_trace.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index ed8601a1a861..27d1f4ffa3de 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -403,7 +403,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto = { .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, .arg4_type = ARG_PTR_TO_MEM, - .arg5_type = ARG_CONST_SIZE, + .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; static DEFINE_PER_CPU(struct pt_regs, bpf_pt_regs); @@ -605,7 +605,7 @@ static const struct bpf_func_proto bpf_perf_event_output_proto_tp = { .arg2_type = ARG_CONST_MAP_PTR, .arg3_type = ARG_ANYTHING, .arg4_type = ARG_PTR_TO_MEM, - .arg5_type = ARG_CONST_SIZE, + .arg5_type = ARG_CONST_SIZE_OR_ZERO, }; BPF_CALL_3(bpf_get_stackid_tp, void *, tp_buff, struct bpf_map *, map, -- 2.14.1
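The instruction listing in this patch shows the compiler folding the two-sided check into "subtract one, then one unsigned compare" on a copy of the length (r1), while the helper is called with the original register (r5). A plain C sketch of that transformation, with hypothetical function names, shows that the two forms accept exactly the same values:

```c
#include <stdbool.h>
#include <stdint.h>

/* The check as written in the C source of the BPF program. */
static bool in_range_source(uint64_t len)
{
    return len > 0 && len <= 0x7fff;
}

/*
 * The check as it appears in the listing: r1 = len; r1 += -1;
 * if r1 > 0x7ffe, skip the call. For len == 0 the subtraction wraps
 * around to the largest unsigned value, so zero is rejected as well.
 */
static bool in_range_compiled(uint64_t len)
{
    uint64_t r1 = len - 1;

    return !(r1 > 0x7ffe);
}
```

The bounds are established on the decremented copy, not on the register that is actually passed to the helper, which appears to be why the verifier still reports the length's minimum value as unknown. Dropping the `len > 0` half, which ARG_CONST_SIZE_OR_ZERO makes possible, removes the need for the subtraction altogether.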
[PATCH net 3/4] bpf: change bpf_probe_read_str arg2 type to ARG_CONST_SIZE_OR_ZERO
Commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semantics") relaxed the treatment of ARG_CONST_SIZE_OR_ZERO due to the way the compiler generates optimized BPF code when checking boundaries of an argument from C code. A typical example of this optimized code can be generated using the bpf_probe_read_str helper when operating on variable memory: /* len is a generic scalar */ if (len > 0 && len <= 0x7fff) bpf_probe_read_str(p, len, s); 251: (79) r1 = *(u64 *)(r10 -88) 252: (07) r1 += -1 253: (25) if r1 > 0x7ffe goto pc-42 254: (bf) r1 = r7 255: (79) r2 = *(u64 *)(r10 -88) 256: (bf) r8 = r4 257: (85) call bpf_probe_read_str#45 R2 min value is negative, either use unsigned or 'var &= const' With this code, the verifier loses track of the variable. Replacing arg2 with ARG_CONST_SIZE_OR_ZERO is thus desirable since it avoids this quite common case which leads to usability issues, and the compiler generates code that the verifier can more easily test: if (len <= 0x7fff) bpf_probe_read_str(p, len, s); or bpf_probe_read_str(p, len & 0x7fff, s); No changes to the bpf_probe_read_str helper are necessary since strncpy_from_unsafe itself immediately returns if the size passed is 0. Signed-off-by: Gianluca BorelloAcked-by: Alexei Starovoitov Acked-by: Daniel Borkmann --- kernel/trace/bpf_trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 728909f7951c..ed8601a1a861 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -494,7 +494,7 @@ static const struct bpf_func_proto bpf_probe_read_str_proto = { .gpl_only = true, .ret_type = RET_INTEGER, .arg1_type = ARG_PTR_TO_UNINIT_MEM, - .arg2_type = ARG_CONST_SIZE, + .arg2_type = ARG_CONST_SIZE_OR_ZERO, .arg3_type = ARG_ANYTHING, }; -- 2.14.1
[PATCH net 2/4] bpf: remove explicit handling of 0 for arg2 in bpf_probe_read
Commit 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2 type to ARG_CONST_SIZE_OR_ZERO") changed arg2 type to ARG_CONST_SIZE_OR_ZERO to simplify writing bpf programs by taking advantage of the new semantics introduced for ARG_CONST_SIZE_OR_ZERO which allows <!NULL, 0> arguments. In order to prevent the helper from actually passing a NULL pointer to probe_kernel_read, which can happen when <NULL, 0> is passed to the helper, the commit also introduced an explicit check against size == 0. After the recent introduction of the ARG_PTR_TO_MEM_OR_NULL type, bpf_probe_read can not receive a pair of <NULL, 0> arguments anymore, thus the check is not needed anymore and can be removed, since probe_kernel_read can correctly handle a <!NULL, 0> call. This also fixes the semantics of the helper before it gets officially released and bpf programs start relying on this check. Fixes: 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2 type to ARG_CONST_SIZE_OR_ZERO") Signed-off-by: Gianluca Borello Acked-by: Alexei Starovoitov Acked-by: Daniel Borkmann Acked-by: Yonghong Song --- kernel/trace/bpf_trace.c | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index a5580c670866..728909f7951c 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -78,16 +78,12 @@ EXPORT_SYMBOL_GPL(trace_call_bpf); BPF_CALL_3(bpf_probe_read, void *, dst, u32, size, const void *, unsafe_ptr) { - int ret = 0; - - if (unlikely(size == 0)) - goto out; + int ret; ret = probe_kernel_read(dst, unsafe_ptr, size); if (unlikely(ret < 0)) memset(dst, 0, size); - out: return ret; } -- 2.14.1
[PATCH net 1/4] bpf: introduce ARG_PTR_TO_MEM_OR_NULL
With the current ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM semantics, a helper argument can be NULL when the next argument type is ARG_CONST_SIZE_OR_ZERO and the verifier can prove the value of this next argument is 0. However, most helpers are just interested in handling <!NULL, 0>, so forcing them to deal with <NULL, 0> makes the implementation of those helpers more complicated for no apparent benefits, requiring them to explicitly handle those corner cases with checks that bpf programs could start relying upon, preventing the possibility of removing them later. Solve this by making ARG_PTR_TO_MEM/ARG_PTR_TO_UNINIT_MEM never accept NULL even when ARG_CONST_SIZE_OR_ZERO is set, and introduce a new argument type ARG_PTR_TO_MEM_OR_NULL to explicitly deal with the NULL case. Currently, the only helper that needs this is bpf_csum_diff_proto(), so change arg1 and arg3 to this new type as well. Also add a new battery of tests that explicitly test the !ARG_PTR_TO_MEM_OR_NULL combination: all the current ones testing the various <NULL, 0> variations are focused on bpf_csum_diff, so cover also other helpers. Signed-off-by: Gianluca Borello Acked-by: Alexei Starovoitov Acked-by: Daniel Borkmann --- include/linux/bpf.h | 1 + kernel/bpf/verifier.c | 4 +- net/core/filter.c | 4 +- tools/testing/selftests/bpf/test_verifier.c | 113 ++-- 4 files changed, 112 insertions(+), 10 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 76c577281d78..e55e4255a210 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -78,6 +78,7 @@ enum bpf_arg_type { * functions that access data on eBPF program stack */ ARG_PTR_TO_MEM, /* pointer to valid memory (stack, packet, map value) */ + ARG_PTR_TO_MEM_OR_NULL, /* pointer to valid memory or NULL */ ARG_PTR_TO_UNINIT_MEM, /* pointer to memory does not need to be initialized, * helper function must fill all bytes or clear * them in error case. 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index dd54d20ace2f..308b0638ec5d 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1384,13 +1384,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno, if (type != expected_type) goto err_type; } else if (arg_type == ARG_PTR_TO_MEM || + arg_type == ARG_PTR_TO_MEM_OR_NULL || arg_type == ARG_PTR_TO_UNINIT_MEM) { expected_type = PTR_TO_STACK; /* One exception here. In case function allows for NULL to be * passed in as argument, it's a SCALAR_VALUE type. Final test * happens during stack boundary checking. */ - if (register_is_null(*reg)) + if (register_is_null(*reg) && + arg_type == ARG_PTR_TO_MEM_OR_NULL) /* final test in check_stack_boundary() */; else if (!type_is_pkt_pointer(type) && type != PTR_TO_MAP_VALUE && diff --git a/net/core/filter.c b/net/core/filter.c index 1afa17935954..6a85e67fafce 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -1646,9 +1646,9 @@ static const struct bpf_func_proto bpf_csum_diff_proto = { .gpl_only = false, .pkt_access = true, .ret_type = RET_INTEGER, - .arg1_type = ARG_PTR_TO_MEM, + .arg1_type = ARG_PTR_TO_MEM_OR_NULL, .arg2_type = ARG_CONST_SIZE_OR_ZERO, - .arg3_type = ARG_PTR_TO_MEM, + .arg3_type = ARG_PTR_TO_MEM_OR_NULL, .arg4_type = ARG_CONST_SIZE_OR_ZERO, .arg5_type = ARG_ANYTHING, }; diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c index 2a5267bef160..3c64f30cf63c 100644 --- a/tools/testing/selftests/bpf/test_verifier.c +++ b/tools/testing/selftests/bpf/test_verifier.c @@ -5631,7 +5631,7 @@ static struct bpf_test tests[] = { .prog_type = BPF_PROG_TYPE_TRACEPOINT, }, { - "helper access to variable memory: size = 0 allowed on NULL", + "helper access to variable memory: size = 0 allowed on NULL (ARG_PTR_TO_MEM_OR_NULL)", .insns = { BPF_MOV64_IMM(BPF_REG_1, 0), BPF_MOV64_IMM(BPF_REG_2, 0), @@ -5645,7 +5645,7 @@ static struct bpf_test tests[] = { .prog_type = 
BPF_PROG_TYPE_SCHED_CLS, }, { - "helper access to variable memory: size > 0 not allowed on NULL", + "helper access to variable memory: size > 0 not allowed on NULL (ARG_PTR_TO_MEM_OR_NULL)", .insns = {
[PATCH net 0/4] bpf: fix semantics issues with helpers receiving NULL arguments
This set includes some fixes for semantics and usability issues that emerged recently, and it would be good to have them in net before the next release.

In particular, ARG_CONST_SIZE_OR_ZERO semantics was recently changed in commit 9fd29c08e520 ("bpf: improve verifier ARG_CONST_SIZE_OR_ZERO semantics") with the goal of letting the compiler generate simpler code that the verifier can more easily accept.

To handle this change in semantics, a few checks in some helpers were added, like in commit 9c019e2bc4b2 ("bpf: change helper bpf_probe_read arg2 type to ARG_CONST_SIZE_OR_ZERO"), and those checks are less than ideal because once they make it into a released kernel bpf programs can start relying on them, preventing the possibility of being removed later on.

This patch tries to fix the issue by introducing a new argument type ARG_PTR_TO_MEM_OR_NULL that can be used for helpers that can receive a <NULL, 0> tuple. By doing so, we can fix the semantics of the other helpers that don't need <NULL, 0> and can just handle <!NULL, 0>, allowing the code to get rid of those checks.

Gianluca Borello (4):
  bpf: introduce ARG_PTR_TO_MEM_OR_NULL
  bpf: remove explicit handling of 0 for arg2 in bpf_probe_read
  bpf: change bpf_probe_read_str arg2 type to ARG_CONST_SIZE_OR_ZERO
  bpf: change bpf_perf_event_output arg5 type to ARG_CONST_SIZE_OR_ZERO

 include/linux/bpf.h                         |   1 +
 kernel/bpf/verifier.c                       |   4 +-
 kernel/trace/bpf_trace.c                    |  12 +--
 net/core/filter.c                           |   4 +-
 tools/testing/selftests/bpf/test_verifier.c | 113 ++--
 5 files changed, 116 insertions(+), 18 deletions(-)

-- 
2.14.1
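For readers who want the proposed rule in one place, here is a minimal userspace sketch of which pointer/size combinations the series intends to accept. It is not verifier code: the enum is a reduced, hypothetical stand-in for the kernel's bpf_arg_type, mem_arg_accepts() is a made-up name, and bounds checking of non-NULL pointers is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced stand-ins for the argument types named in this
 * series; the real enum in include/linux/bpf.h has many more members.
 */
enum mem_arg_type {
	MEM,		/* stands in for ARG_PTR_TO_MEM */
	MEM_OR_NULL,	/* stands in for ARG_PTR_TO_MEM_OR_NULL */
	UNINIT_MEM,	/* stands in for ARG_PTR_TO_UNINIT_MEM */
};

/* ptr_is_null and size_is_zero model what the verifier can prove about
 * the pointer register and the size register that follows it.
 */
bool mem_arg_accepts(enum mem_arg_type type, bool ptr_is_null,
		     bool size_is_zero)
{
	/* Non-NULL pointers are not affected by this series; their bounds
	 * checks are outside the scope of this sketch.
	 */
	if (!ptr_is_null)
		return true;

	/* A NULL pointer is only acceptable for the explicit _OR_NULL type,
	 * and only when the accompanying size is provably zero.
	 */
	return type == MEM_OR_NULL && size_is_zero;
}

/* Usage example: the <NULL, 0> tuple is accepted only for MEM_OR_NULL. */
void mem_arg_selfcheck(void)
{
	assert(mem_arg_accepts(MEM_OR_NULL, true, true));
	assert(!mem_arg_accepts(MEM, true, true));
	assert(!mem_arg_accepts(UNINIT_MEM, true, true));
}
```

The point of the split is that helpers typed with plain ARG_PTR_TO_MEM no longer need their own NULL checks.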
Re: kernel BUG at crypto/asymmetric_keys/public_key.c:80
+ Johannes On 22-11-17 18:43, Florian Fainelli wrote: Hi, (sorry for the cross post) I am at v4.14-12995-g0c86a6bd85ff and just met the following, attached is my .config file. Is this a known problem? Thanks! [1.798714] cfg80211: Loading compiled-in X.509 certificates for regulatory database [1.809390] [ cut here ] [1.814020] kernel BUG at crypto/asymmetric_keys/public_key.c:80! [1.820123] Internal error: Oops - BUG: 0 [#1] SMP ARM [1.825273] Modules linked in: [1.828341] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-12995-g0c86a6bd85ff #15 [1.836096] Hardware name: Broadcom STB (Flattened Device Tree) [1.842025] task: ee0a task.stack: ee096000 [1.846576] PC is at public_key_verify_signature+0x21c/0x260 [1.852248] LR is at x509_check_for_self_signed+0xb0/0x10c [1.857743] pc : []lr : []psr: 6013 [1.864019] sp : ee097cf8 ip : c0a7a3ae fp : [1.869252] r10: c248e9d8 r9 : c0b401e0 r8 : ee374040 [1.874487] r7 : c0a7a340 r6 : ee374200 r5 : c2404c48 r4 : edac8880 [1.881024] r3 : r2 : c0b40480 r1 : ee3741c0 r0 : ee374040 [1.887563] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [1.894709] Control: 30c5387d Table: 3000 DAC: fffd [1.900465] Process swapper/0 (pid: 1, stack limit = 0xee096210) [1.906481] Stack: (0xee097cf8 to 0xee098000) [1.910845] 7ce0: 6013 [1.919037] 7d00: 014080c0 c052149c 11a0 c248e9d8 c033dcb8 ee097d98 014000c0 [1.927229] 7d20: 0001 ed58622e c0a7a634 0001 ed58622e c0a7a634 c0521530 [1.935421] 7d40: c052143c ed586200 c0518bc0 c248e9d8 [1.943612] 7d60: c02416f8 c244bc28 ed586200 ed586388 ed586384 6013 [1.951804] 7d80: c0b400d4 c0518bc0 c248e9d8 c025de10 [1.959995] 7da0: fffe fffe c0b400d4 c0518bc0 c0518c34 [1.968187] 7dc0: c0a23498 ee096000 00040e00 ee374302 edac8880 edac8880 ee374200 [1.976378] 7de0: c0a7a340 02a8 c0b401e0 c248e9d8 c0526ee8 edac8880 [1.984569] 7e00: ee374200 c0525f24 c244d210 ee097e80 c248e9d8 ee097e80 c244d1ac c0526b74 [1.992760] 7e20: c244d210 c244b988 c248e9d8 ee097e80 c244d1ac c0b401e0 c248e9d8 c0524f20 [2.000952] 
7e40: c2404c48 c244b988 c0a7a340 edac8801 ee02c180 edac8800 c0511148 [2.009143] 7e60: c24b7eda 0048 6013 c244d1b4 [2.017335] 7e80: c0a7a340 02a8 [2.025527] 7ea0: 7fff 00040e00 c0a7a340 c24df7b4 02a8 c0a7a5e8 c0bb10d8 c0b18088 [2.033718] 7ec0: 1f03 c0e47898 02a8 1f03 000e c2404c48 [2.041910] 7ee0: e000 c0e4777c c0e6583c c0e74f98 0008 c0201bd8 [2.050101] 7f00: 6013 c025dda4 c0c05a00 c0e005d8 0007 [2.058292] 7f20: 0007 00040e00 c240d790 c2404c48 c0e65818 [2.066483] 7f40: 00040e00 00040e00 c24a3100 c24a3100 c24a3100 0109 [2.074675] 7f60: c0e65838 c0e6583c c0e74f98 c0e00e6c 0007 0007 c0e005d8 [2.082866] 7f80: c09b47f8 c09b47f8 [2.091057] 7fa0: c09b4800 c0208920 [2.099248] 7fc0: [2.107440] 7fe0: 0013 60bd36df 5ae9d652 [2.115645] [] (public_key_verify_signature) from [] (x509_check_for_self_signed+0xb0/0x10c) [2.125842] [] (x509_check_for_self_signed) from [] (x509_cert_parse+0x14c/0x1a8) [2.135080] [] (x509_cert_parse) from [] (x509_key_preparse+0x14/0x18c) [2.143449] [] (x509_key_preparse) from [] (asymmetric_key_preparse+0x54/0xd4) [2.152430] [] (asymmetric_key_preparse) from [] (key_create_or_update+0x120/0x3c4) [2.161846] [] (key_create_or_update) from [] (regulatory_init_db+0x11c/0x1e4) [2.170828] [] (regulatory_init_db) from [] (do_one_initcall+0x54/0x18c) [2.179293] [] (do_one_initcall) from [] (kernel_init_freeable+0x140/0x1cc) [2.188011] [] (kernel_init_freeable) from [] (kernel_init+0x8/0x110) [2.196210] [] (kernel_init) from [] (ret_from_fork+0x14/0x34) [2.203796] Code: ebf8636b eaab e7f001f2 e7f001f2 (e7f001f2) [2.209901] ---[ end trace 4ec242c4e6a05178 ]--- [2.214553] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b This is the regulatory database stuff that Johannes added. The BUG() that triggers is here: int public_key_verify_signature(const struct public_key *pkey, const struct public_key_signature *sig) { struct crypto_wait cwait; struct
kernel BUG at crypto/asymmetric_keys/public_key.c:80
Hi, (sorry for the cross post) I am at v4.14-12995-g0c86a6bd85ff and just met the following, attached is my .config file. Is this a known problem? Thanks! [1.798714] cfg80211: Loading compiled-in X.509 certificates for regulatory database [1.809390] [ cut here ] [1.814020] kernel BUG at crypto/asymmetric_keys/public_key.c:80! [1.820123] Internal error: Oops - BUG: 0 [#1] SMP ARM [1.825273] Modules linked in: [1.828341] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0-12995-g0c86a6bd85ff #15 [1.836096] Hardware name: Broadcom STB (Flattened Device Tree) [1.842025] task: ee0a task.stack: ee096000 [1.846576] PC is at public_key_verify_signature+0x21c/0x260 [1.852248] LR is at x509_check_for_self_signed+0xb0/0x10c [1.857743] pc : []lr : []psr: 6013 [1.864019] sp : ee097cf8 ip : c0a7a3ae fp : [1.869252] r10: c248e9d8 r9 : c0b401e0 r8 : ee374040 [1.874487] r7 : c0a7a340 r6 : ee374200 r5 : c2404c48 r4 : edac8880 [1.881024] r3 : r2 : c0b40480 r1 : ee3741c0 r0 : ee374040 [1.887563] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user [1.894709] Control: 30c5387d Table: 3000 DAC: fffd [1.900465] Process swapper/0 (pid: 1, stack limit = 0xee096210) [1.906481] Stack: (0xee097cf8 to 0xee098000) [1.910845] 7ce0: 6013 [1.919037] 7d00: 014080c0 c052149c 11a0 c248e9d8 c033dcb8 ee097d98 014000c0 [1.927229] 7d20: 0001 ed58622e c0a7a634 0001 ed58622e c0a7a634 c0521530 [1.935421] 7d40: c052143c ed586200 c0518bc0 c248e9d8 [1.943612] 7d60: c02416f8 c244bc28 ed586200 ed586388 ed586384 6013 [1.951804] 7d80: c0b400d4 c0518bc0 c248e9d8 c025de10 [1.959995] 7da0: fffe fffe c0b400d4 c0518bc0 c0518c34 [1.968187] 7dc0: c0a23498 ee096000 00040e00 ee374302 edac8880 edac8880 ee374200 [1.976378] 7de0: c0a7a340 02a8 c0b401e0 c248e9d8 c0526ee8 edac8880 [1.984569] 7e00: ee374200 c0525f24 c244d210 ee097e80 c248e9d8 ee097e80 c244d1ac c0526b74 [1.992760] 7e20: c244d210 c244b988 c248e9d8 ee097e80 c244d1ac c0b401e0 c248e9d8 c0524f20 [2.000952] 7e40: c2404c48 c244b988 c0a7a340 edac8801 ee02c180 
edac8800 c0511148 [2.009143] 7e60: c24b7eda 0048 6013 c244d1b4 [2.017335] 7e80: c0a7a340 02a8 [2.025527] 7ea0: 7fff 00040e00 c0a7a340 c24df7b4 02a8 c0a7a5e8 c0bb10d8 c0b18088 [2.033718] 7ec0: 1f03 c0e47898 02a8 1f03 000e c2404c48 [2.041910] 7ee0: e000 c0e4777c c0e6583c c0e74f98 0008 c0201bd8 [2.050101] 7f00: 6013 c025dda4 c0c05a00 c0e005d8 0007 [2.058292] 7f20: 0007 00040e00 c240d790 c2404c48 c0e65818 [2.066483] 7f40: 00040e00 00040e00 c24a3100 c24a3100 c24a3100 0109 [2.074675] 7f60: c0e65838 c0e6583c c0e74f98 c0e00e6c 0007 0007 c0e005d8 [2.082866] 7f80: c09b47f8 c09b47f8 [2.091057] 7fa0: c09b4800 c0208920 [2.099248] 7fc0: [2.107440] 7fe0: 0013 60bd36df 5ae9d652 [2.115645] [] (public_key_verify_signature) from [] (x509_check_for_self_signed+0xb0/0x10c) [2.125842] [] (x509_check_for_self_signed) from [] (x509_cert_parse+0x14c/0x1a8) [2.135080] [] (x509_cert_parse) from [] (x509_key_preparse+0x14/0x18c) [2.143449] [] (x509_key_preparse) from [] (asymmetric_key_preparse+0x54/0xd4) [2.152430] [] (asymmetric_key_preparse) from [] (key_create_or_update+0x120/0x3c4) [2.161846] [] (key_create_or_update) from [] (regulatory_init_db+0x11c/0x1e4) [2.170828] [] (regulatory_init_db) from [] (do_one_initcall+0x54/0x18c) [2.179293] [] (do_one_initcall) from [] (kernel_init_freeable+0x140/0x1cc) [2.188011] [] (kernel_init_freeable) from [] (kernel_init+0x8/0x110) [2.196210] [] (kernel_init) from [] (ret_from_fork+0x14/0x34) [2.203796] Code: ebf8636b eaab e7f001f2 e7f001f2 (e7f001f2) [2.209901] ---[ end trace 4ec242c4e6a05178 ]--- [2.214553] Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b -- Florian config.gz Description: application/gzip
Re: [Outreachy kernel] Re: [PATCH] net: usb: hso.c: remove unneeded DRIVER_LICENSE #define
On Wed, 22 Nov 2017, Joe Perches wrote: > On Fri, 2017-11-17 at 15:19 +0100, Greg Kroah-Hartman wrote: > > There is no need to #define the license of the driver, just put it in > > the MODULE_LICENSE() line directly as a text string. > > > > This allows tools that check that the module license matches the source > > code license to work properly, as there is no need to unwind the > > unneeded dereference. > [] > > diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c > [] > > @@ -76,7 +76,6 @@ > > > > #define MOD_AUTHOR "Option Wireless" > > #define MOD_DESCRIPTION"USB High Speed Option driver" > > -#define MOD_LICENSE"GPL" > > > > #define HSO_MAX_NET_DEVICES10 > > #define HSO__MAX_MTU 2048 > > @@ -3288,7 +3287,7 @@ module_exit(hso_exit); > > > > MODULE_AUTHOR(MOD_AUTHOR); > > MODULE_DESCRIPTION(MOD_DESCRIPTION); > > -MODULE_LICENSE(MOD_LICENSE); > > +MODULE_LICENSE("GPL"); > > Probably all of these MODULE_(MOD_) uses could be > simplified as well. > > Perhaps there's utility in a (cocci?) script that looks for > used-once > macro #defines in various types of macros. It could be possible. It's a bit tricky due to ifdefs that Coccinelle doesn't see and header files, but perhaps in special cases like this there is not much worry. julia
Re: [PATCH] net: usb: hso.c: remove unneeded DRIVER_LICENSE #define
On Wed, Nov 22, 2017 at 09:05:36AM -0800, Joe Perches wrote: > On Fri, 2017-11-17 at 15:19 +0100, Greg Kroah-Hartman wrote: > > There is no need to #define the license of the driver, just put it in > > the MODULE_LICENSE() line directly as a text string. > > > > This allows tools that check that the module license matches the source > > code license to work properly, as there is no need to unwind the > > unneeded dereference. > [] > > diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c > [] > > @@ -76,7 +76,6 @@ > > > > #define MOD_AUTHOR "Option Wireless" > > #define MOD_DESCRIPTION"USB High Speed Option driver" > > -#define MOD_LICENSE"GPL" > > > > #define HSO_MAX_NET_DEVICES10 > > #define HSO__MAX_MTU 2048 > > @@ -3288,7 +3287,7 @@ module_exit(hso_exit); > > > > MODULE_AUTHOR(MOD_AUTHOR); > > MODULE_DESCRIPTION(MOD_DESCRIPTION); > > -MODULE_LICENSE(MOD_LICENSE); > > +MODULE_LICENSE("GPL"); > > Probably all of these MODULE_(MOD_) uses could be > simplified as well. Agreed, I did that for a bunch of USB drivers, need to do it for others as well. thanks, greg k-h
Re: [PATCH] net: usb: hso.c: remove unneeded DRIVER_LICENSE #define
On Fri, 2017-11-17 at 15:19 +0100, Greg Kroah-Hartman wrote: > There is no need to #define the license of the driver, just put it in > the MODULE_LICENSE() line directly as a text string. > > This allows tools that check that the module license matches the source > code license to work properly, as there is no need to unwind the > unneeded dereference. [] > diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c [] > @@ -76,7 +76,6 @@ > > #define MOD_AUTHOR "Option Wireless" > #define MOD_DESCRIPTION "USB High Speed Option driver" > -#define MOD_LICENSE "GPL" > > #define HSO_MAX_NET_DEVICES 10 > #define HSO__MAX_MTU 2048 > @@ -3288,7 +3287,7 @@ module_exit(hso_exit); > > MODULE_AUTHOR(MOD_AUTHOR); > MODULE_DESCRIPTION(MOD_DESCRIPTION); > -MODULE_LICENSE(MOD_LICENSE); > +MODULE_LICENSE("GPL"); Probably all of these MODULE_(MOD_) uses could be simplified as well. Perhaps there's utility in a (cocci?) script that looks for used-once macro #defines in various types of macros.
Re: Uninitialized value in __sk_nulls_add_node_rcu()
On Wed, Nov 22, 2017 at 5:38 AM, Alexander Potapenko wrote: > On Thu, Oct 26, 2017 at 4:56 PM, Alexander Potapenko > wrote: >> On Thu, Oct 26, 2017 at 4:52 PM, Eric Dumazet wrote: >>> On Thu, Oct 26, 2017 at 7:47 AM, Eric Dumazet wrote: On Thu, Oct 26, 2017 at 7:20 AM, Alexander Potapenko wrote: > On Thu, Oct 26, 2017 at 2:51 PM, Alexander Potapenko > wrote: >> Hi David, Eric, >> >> I've changed KMSAN instrumentation a bit and it's now reporting a new >> error (see below) when I SSH into a VM. > I've double-checked the old instrumentation and found a bug in it, > which led to masking some of the errors on uninitialized bitfields. > I now believe this is a true positive report. Please do not top post on netdev >> Sorry about that. A child is cloned from the listener, check sock_copy() sk_reuseport is part of the copied fields. You might have some bug at your side ? >>> >>> Oh these are request socket. >>> >>> This is an harmless bug added in commit >>> d894ba18d4e449b3a7f6eb491f16c9e02933736e >>> ("soreuseport: fix ordering for mixed v4/v6 sockets") >>> >>> I will send a patch, but really this has no effect at all. >> Thanks for clarifying! >> For me this was a question of the tool's correctness, because until >> recently I wasn't able to understand whether this is a true bug or >> not. > A friendly ping. > Eric, did you find the time to send the patch? Not yet, I am still investigating this issue. Thanks. >> For |req| allocated by inet_reqsk_alloc() the value of >> req_to_sk(req)->sk_reuseport is uninitialized. >> Does this look valid? >> I'm a bit surprised this didn't show up before, but I couldn't find >> any code to initialize sk_reuseport. 
>> >> == >> BUG: KMSAN: use of uninitialized memory in inet_ehash_insert+0xd40/0x1050 >> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0+ #3288 >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs >> 01/01/2011 >> Call Trace: >> >> __dump_stack lib/dump_stack.c:16 >> dump_stack+0x185/0x1d0 lib/dump_stack.c:52 >> kmsan_report+0x13f/0x1c0 mm/kmsan/kmsan.c:1016 >> __msan_warning_32+0x69/0xb0 mm/kmsan/kmsan_instr.c:766 >> __sk_nulls_add_node_rcu ./include/net/sock.h:684 >> inet_ehash_insert+0xd40/0x1050 net/ipv4/inet_hashtables.c:413 >> reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:754 >> inet_csk_reqsk_queue_hash_add+0x1cc/0x300 >> net/ipv4/inet_connection_sock.c:765 >> tcp_conn_request+0x31e7/0x36f0 net/ipv4/tcp_input.c:6414 >> tcp_v4_conn_request+0x16d/0x220 net/ipv4/tcp_ipv4.c:1314 >> tcp_rcv_state_process+0x42a/0x7210 net/ipv4/tcp_input.c:5917 >> tcp_v4_do_rcv+0xa6a/0xcd0 net/ipv4/tcp_ipv4.c:1483 >> tcp_v4_rcv+0x3de0/0x4ab0 net/ipv4/tcp_ipv4.c:1763 >> ip_local_deliver_finish+0x6bb/0xcb0 net/ipv4/ip_input.c:216 >> NF_HOOK ./include/linux/netfilter.h:248 >> ip_local_deliver+0x3fa/0x480 net/ipv4/ip_input.c:257 >> dst_input ./include/net/dst.h:477 >> ip_rcv_finish+0x6fb/0x1540 net/ipv4/ip_input.c:397 >> NF_HOOK ./include/linux/netfilter.h:248 >> ip_rcv+0x10f6/0x15c0 net/ipv4/ip_input.c:488 >> __netif_receive_skb_core+0x36f6/0x3f60 net/core/dev.c:4298 >> __netif_receive_skb net/core/dev.c:4336 >> netif_receive_skb_internal+0x63c/0x19c0 net/core/dev.c:4497 >> napi_skb_finish net/core/dev.c:4858 >> napi_gro_receive+0x629/0xa50 net/core/dev.c:4889 >> e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4018 >> e1000_clean_rx_irq+0x1492/0x1d30 >> drivers/net/ethernet/intel/e1000/e1000_main.c:4474 >> e1000_clean+0x43aa/0x5970 >> drivers/net/ethernet/intel/e1000/e1000_main.c:3819 >> napi_poll net/core/dev.c:5500 >> net_rx_action+0x73c/0x1820 net/core/dev.c:5566 >> __do_softirq+0x4b4/0x8dd kernel/softirq.c:284 >> invoke_softirq 
kernel/softirq.c:364 >> irq_exit+0x203/0x240 kernel/softirq.c:405 >> exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:638 >> do_IRQ+0x15e/0x1a0 arch/x86/kernel/irq.c:263 >> common_interrupt+0x86/0x86 >> ... >> >> arch_cpu_idle+0x20/0x30 arch/x86/kernel/process.c:332 >> default_idle_call kernel/sched/idle.c:98 >> cpuidle_idle_call kernel/sched/idle.c:156 >> do_idle+0x334/0x730 kernel/sched/idle.c:246 >> cpu_startup_entry+0x35/0x40 kernel/sched/idle.c:351 >> rest_init+0xb8/0xc0 init/main.c:437 >> start_kernel+0x4d7/0x530 init/main.c:703 >> x86_64_start_reservations arch/x86/kernel/head64.c:318 >> x86_64_start_kernel+0x3cd/0x3e0 arch/x86/kernel/head64.c:299 >> secondary_startup_64+0x9f/0x9f arch/x86/kernel/head_64.S:219 >>
Re: [PATCH net] net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts
On Wed, 2017-11-22 at 15:37 +0300, Aleksey Makarov wrote:
> From: Sunil Goutham
>
> This fixes a previous patch which missed some changes
> and due to which L3 checksum offload was getting enabled
> for IPv6 pkts. And HW is dropping these pkts as it assumes
> the pkt is IPv4 when IP csum offload is set in the SQ
> descriptor.
>
> Fixes: 494fd005 ("net: thunderx: Enable TSO and checksum offloads
> for ipv6")
> Signed-off-by: Sunil Goutham
> Signed-off-by: Aleksey Makarov
> ---
>  drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> index d4496e9afcdf..184d5bdbe7e0 100644
> --- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> +++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
> @@ -1355,10 +1355,11 @@ nicvf_sq_add_hdr_subdesc(struct nicvf *nic,
> struct snd_queue *sq, int qentry,
>
>          /* Offload checksum calculation to HW */
>          if (skb->ip_summed == CHECKSUM_PARTIAL) {
> -                hdr->csum_l3 = 1; /* Enable IP csum calculation */
>                  hdr->l3_offset = skb_network_offset(skb);
>                  hdr->l4_offset = skb_transport_offset(skb);
>
> +                /* Enable IP HDR csum calculation for V4 pkts */
> +                hdr->csum_l3 = (ip.v4->version == 4) ? 1 : 0;

Have you tried to set hdr->csum_l3 to 0 regardless of version being 4 or 6?

This would remove the need for yet another conditional.

AFAIK, linux does not offload IPv4 header checksums to NIC, it is not worth the trouble.
ipsec: ipcomp alg problem on vti interface
Hi Steffen,

LTP has vti test-cases which fail on ipcomp alg, e.g. "tcp_ipsec_vti.sh -p comp -m tunnel -s 100"

Basically, the setup consists of the following commands:

ip li add ltp_vti0 type vti local 10.0.0.2 remote 10.0.0.1 key 10 dev ltp_ns_veth2
ip li set ltp_vti0 up

ip -4 xf st add src 10.0.0.1 dst 10.0.0.2 proto comp spi 0x1001 comp deflate mode tunnel
ip -4 xf po add dir out tmpl src 10.0.0.2 dst 10.0.0.1 proto comp mode tunnel mark 10
ip -4 xf po add dir in tmpl src 10.0.0.1 dst 10.0.0.2 proto comp mode tunnel mark 10

ip route add 10.23.1.0/30 dev ltp_vti0
ip a add 10.23.1.1/30 dev ltp_vti0

...omitted corresponding setup in netns for the other end.

The problem appears with small packets like SYN, which are not compressed and are sent as is through the vti tunnel, so they appear on ltp_ns_veth2 as IPIP packets. On the other end, vti doesn't handle them and they are rejected (InNoPol stats increased).

As a workaround, setting:
# sysctl net.ipv4.conf.ltp_ns_veth2.disable_policy=1
# sysctl net.ipv4.conf.ltp_ns_veth1.disable_policy=1
works, but then compressed packets are seen on the vti device and the others on ltp_ns_veth2.

Is there some flaw in the setup, or is vti not designed to handle an ipcomp alg that can send packets with/without compression (or without further encryption)?

Maybe we should handle such packets by registering an additional tunnel handler on vti, like in the diff below?

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 89453cf..99ad70b 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -44,11 +44,20 @@
 #include
 #include
 
+static bool log_ecn_error = true;
+module_param(log_ecn_error, bool, 0644);
+MODULE_PARM_DESC(log_ecn_error, "Log packets received with corrupted ECN");
+
 static struct rtnl_link_ops vti_link_ops __read_mostly;
 
 static unsigned int vti_net_id __read_mostly;
 static int vti_tunnel_init(struct net_device *dev);
 
+static const struct tnl_ptk_info tpi = {
+        /* no tunnel info required for ipip. */
+        .proto = htons(ETH_P_IP),
+};
+
 static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
                      int encap_type)
 {
@@ -65,6 +74,13 @@ static int vti_input(struct sk_buff *skb, int nexthdr, __be32 spi,
 
         XFRM_TUNNEL_SKB_CB(skb)->tunnel.ip4 = tunnel;
 
+        if (ip_hdr(skb)->protocol == IPPROTO_IPIP) {
+                if (iptunnel_pull_header(skb, 0, tpi.proto, false))
+                        goto drop;
+                return ip_tunnel_rcv(tunnel, skb, &tpi, NULL,
+                                     log_ecn_error);
+        }
+
         return xfrm_input(skb, nexthdr, spi, encap_type);
 }
 
@@ -335,6 +351,11 @@ static int vti4_err(struct sk_buff *skb, u32 info)
         return 0;
 }
 
+static int vti_ip_err(struct sk_buff *skb, u32 info)
+{
+        return -ENOENT;
+}
+
 static int
 vti_tunnel_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
 {
@@ -440,6 +461,12 @@ static void __net_init vti_fb_tunnel_init(struct net_device *dev)
         .priority       =       100,
 };
 
+static struct xfrm_tunnel vti_ip4_tunnnel __read_mostly = {
+        .handler        =       vti_rcv,
+        .err_handler    =       vti_ip_err,
+        .priority       =       1,
+};
+
 static int __net_init vti_init_net(struct net *net)
 {
         int err;
@@ -607,6 +634,9 @@ static int __init vti_init(void)
         err = xfrm4_protocol_register(&vti_ipcomp4_protocol, IPPROTO_COMP);
         if (err < 0)
                 goto xfrm_proto_comp_failed;
+        err = xfrm4_tunnel_register(&vti_ip4_tunnnel, AF_INET);
+        if (err < 0)
+                goto xfrm_tunnel_failed;
 
         msg = "netlink interface";
         err = rtnl_link_register(&vti_link_ops);
@@ -616,6 +646,8 @@ static int __init vti_init(void)
         return err;
 
 rtnl_link_failed:
+        xfrm4_tunnel_deregister(&vti_ip4_tunnnel, AF_INET);
+xfrm_tunnel_failed:
         xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP);
 xfrm_proto_comp_failed:
         xfrm4_protocol_deregister(&vti_ah4_protocol, IPPROTO_AH);
@@ -631,6 +663,7 @@ static int __init vti_init(void)
 static void __exit vti_fini(void)
 {
         rtnl_link_unregister(&vti_link_ops);
+        xfrm4_tunnel_deregister(&vti_ip4_tunnnel, AF_INET);
         xfrm4_protocol_deregister(&vti_ipcomp4_protocol, IPPROTO_COMP);
         xfrm4_protocol_deregister(&vti_ah4_protocol, IPPROTO_AH);
         xfrm4_protocol_deregister(&vti_esp4_protocol, IPPROTO_ESP);

Thanks,
Alexey
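To make the failure mode described above easier to follow, here is a small userspace sketch of how the outer protocol number ends up differing between small and large packets in tunnel mode. It is only a model: outer_proto_for() is a hypothetical name, and the 90-byte threshold is an assumption chosen for illustration, not a value taken from this mail. The second check reflects the IPComp non-expansion rule, under which a payload that does not shrink is sent uncompressed.

```c
#include <assert.h>
#include <stddef.h>

#define PROTO_IPIP  4	/* value of IPPROTO_IPIP */
#define PROTO_COMP  108	/* value of IPPROTO_COMP */

/* Assumed compression threshold in bytes, for illustration only. */
#define COMP_THRESHOLD 90

/* Returns the protocol number that would follow the outer IPv4 header for
 * a tunnel-mode packet, given the inner packet length and the length that
 * compression would produce.
 */
int outer_proto_for(size_t plain_len, size_t compressed_len)
{
	assert(plain_len > 0);

	/* Small packets (for example a bare SYN) skip compression. */
	if (plain_len < COMP_THRESHOLD)
		return PROTO_IPIP;

	/* Compression that does not shrink the payload is discarded. */
	if (compressed_len >= plain_len)
		return PROTO_IPIP;

	return PROTO_COMP;
}
```

Under these assumptions a receiver that only registers a handler for protocol 108 never sees the uncompressed packets, which matches the InNoPol drops reported above.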
[PATCH] [WAN]: lmc: Use memdup_user() as a cleanup
Fix coccicheck warning which recommends to use memdup_user():

drivers/net/wan/lmc/lmc_main.c:497:27-34: WARNING opportunity for memdup_user

Generated by: scripts/coccinelle/memdup_user/memdup_user.cocci

Signed-off-by: Vasyl Gomonovych
---
 drivers/net/wan/lmc/lmc_main.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wan/lmc/lmc_main.c b/drivers/net/wan/lmc/lmc_main.c
index 4698450c77d1..ded78a466fe3 100644
--- a/drivers/net/wan/lmc/lmc_main.c
+++ b/drivers/net/wan/lmc/lmc_main.c
@@ -494,18 +494,11 @@ int lmc_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) /*fold00*/
                             break;
                     }
 
-                    data = kmalloc(xc.len, GFP_KERNEL);
-                    if (!data) {
-                            ret = -ENOMEM;
+                    data = memdup_user(xc.data, xc.len);
+                    if (IS_ERR(data)) {
+                            ret = PTR_ERR(data);
                             break;
                     }
-
-                    if(copy_from_user(data, xc.data, xc.len))
-                    {
-                            kfree(data);
-                            ret = -ENOMEM;
-                            break;
-                    }
 
                     printk("%s: Starting load of data Len: %d at 0x%p == 0x%p\n", dev->name, xc.len, xc.data, data);
 
-- 
1.9.1
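One side effect of the conversion is worth spelling out: the removed code returned -ENOMEM when the copy from user space failed, whereas memdup_user() reports -EFAULT for that case and -ENOMEM only for allocation failure. The following userspace sketch models the allocate-then-copy pattern with that error split. fake_copy_from_user() and dup_from_user() are hypothetical stand-ins written for this example, not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for copy_from_user(): returns the number of bytes
 * NOT copied. A NULL source models an invalid user pointer.
 */
size_t fake_copy_from_user(void *dst, const void *src, size_t len)
{
	if (!src)
		return len;
	memcpy(dst, src, len);
	return 0;
}

/* Allocate a buffer and fill it from src. Returns 0 and stores the buffer
 * in *out on success; returns a negative errno and sets *out to NULL on
 * failure, so the caller has a single error path.
 */
int dup_from_user(const void *src, size_t len, void **out)
{
	void *buf;

	assert(out != NULL);
	*out = NULL;

	buf = malloc(len ? len : 1);
	if (!buf)
		return -ENOMEM;		/* allocation failed */

	if (fake_copy_from_user(buf, src, len)) {
		free(buf);
		return -EFAULT;		/* copy failed, not out of memory */
	}

	*out = buf;
	return 0;
}
```

A caller frees the returned buffer with free() once it is done with it, mirroring the kfree() the driver would eventually do.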
pull-request: wireless-drivers 2017-11-22
Hi Dave,

here's the first pull request to net tree for 4.15. Please let me know if there are any problems.

Kalle

The following changes since commit 32a72bbd5da2411eab591bf9bc2e39349106193a:

  net: vxge: Fix some indentation issues (2017-11-20 11:36:30 +0900)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers.git tags/wireless-drivers-for-davem-2017-11-22

for you to fetch changes up to ed59b7d53c95548d83d4e7e1bc5edafcdcad09c9:

  Merge ath-current from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.git (2017-11-21 11:52:18 +0200)

wireless-drivers fixes for 4.15

First set of fixes for 4.15. Most important here is the iwlwifi fix for scan command firmware interface change.

ath10k

* fix CCMP-256, GCMP and GCMP-256 in raw mode, it was never working

wcn36xx

* fix device tree node search

iwlwifi

* fix a regression with firmware API change of scan cmd (introduced in firmware version 34)

* add a bunch of PCI IDs and fix configuration structs for A000 devices

* fix the exported firmware name strings for 9000 and A000 devices

Johan Hovold (1):
      wcn36xx: fix iris child-node lookup

Kalle Valo (2):
      Merge tag 'iwlwifi-for-kalle-2017-11-19' of git://git.kernel.org/.../iwlwifi/iwlwifi-fixes
      Merge ath-current from git://git.kernel.org/.../kvalo/ath.git

Luca Coelho (2):
      iwlwifi: mvm: support version 7 of the SCAN_REQ_UMAC FW command
      iwlwifi: fix PCI IDs and configuration mapping for 9000 series

Thomas Backlund (1):
      iwlwifi: fix firmware names for 9000 and A000 series hw

Vasanthakumar Thiagarajan (1):
      ath10k: fix data rx for CCMP-256, GCMP and GCMP-256 in raw mode

 drivers/net/wireless/ath/ath10k/htt_rx.c         |  51 ++---
 drivers/net/wireless/ath/wcn36xx/main.c          |   2 +-
 drivers/net/wireless/intel/iwlwifi/cfg/9000.c    |  73 +++--
 drivers/net/wireless/intel/iwlwifi/cfg/a000.c    |  10 +-
 drivers/net/wireless/intel/iwlwifi/fw/api/scan.h |  59 +++---
 drivers/net/wireless/intel/iwlwifi/fw/file.h     |   1 +
 drivers/net/wireless/intel/iwlwifi/iwl-config.h  |   5 +
 drivers/net/wireless/intel/iwlwifi/mvm/mvm.h     |   6 ++
 drivers/net/wireless/intel/iwlwifi/mvm/scan.c    |  86 +++
 drivers/net/wireless/intel/iwlwifi/pcie/drv.c    | 132 ++-
 10 files changed, 335 insertions(+), 90 deletions(-)
Re: Uninitialized value in __sk_nulls_add_node_rcu()
On Thu, Oct 26, 2017 at 4:56 PM, Alexander Potapenko wrote: > On Thu, Oct 26, 2017 at 4:52 PM, Eric Dumazet wrote: >> On Thu, Oct 26, 2017 at 7:47 AM, Eric Dumazet wrote: >>> On Thu, Oct 26, 2017 at 7:20 AM, Alexander Potapenko >>> wrote: On Thu, Oct 26, 2017 at 2:51 PM, Alexander Potapenko wrote: > Hi David, Eric, > > I've changed KMSAN instrumentation a bit and it's now reporting a new > error (see below) when I SSH into a VM. I've double-checked the old instrumentation and found a bug in it, which led to masking some of the errors on uninitialized bitfields. I now believe this is a true positive report. >>> >>> >>> Please do not top post on netdev > Sorry about that. >>> A child is cloned from the listener, check sock_copy() >>> >>> sk_reuseport is part of the copied fields. >>> >>> You might have some bug at your side ? >>> >> >> Oh these are request socket. >> >> This is an harmless bug added in commit >> d894ba18d4e449b3a7f6eb491f16c9e02933736e >> ("soreuseport: fix ordering for mixed v4/v6 sockets") >> >> I will send a patch, but really this has no effect at all. > Thanks for clarifying! > For me this was a question of the tool's correctness, because until > recently I wasn't able to understand whether this is a true bug or > not. A friendly ping. Eric, did you find the time to send the patch? >>> > For |req| allocated by inet_reqsk_alloc() the value of > req_to_sk(req)->sk_reuseport is uninitialized. > Does this look valid? > I'm a bit surprised this didn't show up before, but I couldn't find > any code to initialize sk_reuseport. 
> > == > BUG: KMSAN: use of uninitialized memory in inet_ehash_insert+0xd40/0x1050 > CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.13.0+ #3288 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs > 01/01/2011 > Call Trace: > > __dump_stack lib/dump_stack.c:16 > dump_stack+0x185/0x1d0 lib/dump_stack.c:52 > kmsan_report+0x13f/0x1c0 mm/kmsan/kmsan.c:1016 > __msan_warning_32+0x69/0xb0 mm/kmsan/kmsan_instr.c:766 > __sk_nulls_add_node_rcu ./include/net/sock.h:684 > inet_ehash_insert+0xd40/0x1050 net/ipv4/inet_hashtables.c:413 > reqsk_queue_hash_req net/ipv4/inet_connection_sock.c:754 > inet_csk_reqsk_queue_hash_add+0x1cc/0x300 > net/ipv4/inet_connection_sock.c:765 > tcp_conn_request+0x31e7/0x36f0 net/ipv4/tcp_input.c:6414 > tcp_v4_conn_request+0x16d/0x220 net/ipv4/tcp_ipv4.c:1314 > tcp_rcv_state_process+0x42a/0x7210 net/ipv4/tcp_input.c:5917 > tcp_v4_do_rcv+0xa6a/0xcd0 net/ipv4/tcp_ipv4.c:1483 > tcp_v4_rcv+0x3de0/0x4ab0 net/ipv4/tcp_ipv4.c:1763 > ip_local_deliver_finish+0x6bb/0xcb0 net/ipv4/ip_input.c:216 > NF_HOOK ./include/linux/netfilter.h:248 > ip_local_deliver+0x3fa/0x480 net/ipv4/ip_input.c:257 > dst_input ./include/net/dst.h:477 > ip_rcv_finish+0x6fb/0x1540 net/ipv4/ip_input.c:397 > NF_HOOK ./include/linux/netfilter.h:248 > ip_rcv+0x10f6/0x15c0 net/ipv4/ip_input.c:488 > __netif_receive_skb_core+0x36f6/0x3f60 net/core/dev.c:4298 > __netif_receive_skb net/core/dev.c:4336 > netif_receive_skb_internal+0x63c/0x19c0 net/core/dev.c:4497 > napi_skb_finish net/core/dev.c:4858 > napi_gro_receive+0x629/0xa50 net/core/dev.c:4889 > e1000_receive_skb drivers/net/ethernet/intel/e1000/e1000_main.c:4018 > e1000_clean_rx_irq+0x1492/0x1d30 > drivers/net/ethernet/intel/e1000/e1000_main.c:4474 > e1000_clean+0x43aa/0x5970 > drivers/net/ethernet/intel/e1000/e1000_main.c:3819 > napi_poll net/core/dev.c:5500 > net_rx_action+0x73c/0x1820 net/core/dev.c:5566 > __do_softirq+0x4b4/0x8dd kernel/softirq.c:284 > invoke_softirq kernel/softirq.c:364 > irq_exit+0x203/0x240 
kernel/softirq.c:405 > exiting_irq+0xe/0x10 ./arch/x86/include/asm/apic.h:638 > do_IRQ+0x15e/0x1a0 arch/x86/kernel/irq.c:263 > common_interrupt+0x86/0x86 > ... > > arch_cpu_idle+0x20/0x30 arch/x86/kernel/process.c:332 > default_idle_call kernel/sched/idle.c:98 > cpuidle_idle_call kernel/sched/idle.c:156 > do_idle+0x334/0x730 kernel/sched/idle.c:246 > cpu_startup_entry+0x35/0x40 kernel/sched/idle.c:351 > rest_init+0xb8/0xc0 init/main.c:437 > start_kernel+0x4d7/0x530 init/main.c:703 > x86_64_start_reservations arch/x86/kernel/head64.c:318 > x86_64_start_kernel+0x3cd/0x3e0 arch/x86/kernel/head64.c:299 > secondary_startup_64+0x9f/0x9f arch/x86/kernel/head_64.S:219 > origin: > save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:59 > kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302 > kmsan_internal_poison_shadow+0xb3/0x1b0 mm/kmsan/kmsan.c:198 > kmsan_kmalloc+0x80/0xe0 mm/kmsan/kmsan.c:337
Re: [PATCH net v2] net: accept UFO datagrams from tuntap and packet
On 2017-11-21 23:22, Willem de Bruijn wrote:

From: Willem de Bruijn

Tuntap and similar devices can inject GSO packets. Accept type VIRTIO_NET_HDR_GSO_UDP, even though not generating UFO natively.

Processes are expected to use feature negotiation such as TUNSETOFFLOAD to detect supported offload types and refrain from injecting other packets. This process breaks down with live migration: guest kernels do not renegotiate flags, so destination hosts need to expose all features that the source host does.

Partially revert the UFO removal from 182e0b6b5846~1..d9d30adf5677. This patch introduces nearly(*) no new code to simplify verification. It brings back verbatim tuntap UFO negotiation, VIRTIO_NET_HDR_GSO_UDP insertion and software UFO segmentation.

It does not reinstate protocol stack support, hardware offload (NETIF_F_UFO), SKB_GSO_UDP tunneling in SKB_GSO_SOFTWARE or reception of VIRTIO_NET_HDR_GSO_UDP packets in tuntap.

To support SKB_GSO_UDP reappearing in the stack, also reinstate logic in act_csum and openvswitch. Achieve equivalence with v4.13 HEAD by squashing in commit 939912216fa8 ("net: skb_needs_check() removes CHECKSUM_UNNECESSARY check for tx.") and reverting commit 8d63bee643f1 ("net: avoid skb_warn_bad_offload false positives on UFO").

(*) To avoid having to bring back skb_shinfo(skb)->ip6_frag_id, ipv6_proxy_select_ident is changed to return a __be32 and this is assigned directly to the frag_hdr. Also, SKB_GSO_UDP is inserted at the end of the enum to minimize code churn.

Tested

Booted a v4.13 guest kernel with QEMU. On a host kernel before this patch `ethtool -k eth0` shows UFO disabled. After the patch, it is enabled, same as on a v4.13 host kernel.

A UFO packet sent from the guest appears on the tap device:

  host:
    nc -l -p -u 8000 &
    tcpdump -n -i tap0

  guest:
    dd if=/dev/zero of=payload.txt bs=1 count=2000
    nc -u 192.16.1.1 8000 < payload.txt

Direct tap to tap transmission of VIRTIO_NET_HDR_GSO_UDP succeeds, packets arriving fragmented:

  ./with_tap_pair.sh ./tap_send_ufo tap0 tap1
  (from https://github.com/wdebruij/kerneltools/tree/master/tests)

Changes
  v1 -> v2
    - simplified set_offload change (review comment)
    - documented test procedure
[PATCH net] net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts
From: Sunil Goutham

This fixes a previous patch which missed some changes and due to which
L3 checksum offload was getting enabled for IPv6 pkts. And HW is dropping
these pkts as it assumes the pkt is IPv4 when IP csum offload is set
in the SQ descriptor.

Fixes: 494fd005 ("net: thunderx: Enable TSO and checksum offloads for ipv6")
Signed-off-by: Sunil Goutham
Signed-off-by: Aleksey Makarov
---
 drivers/net/ethernet/cavium/thunder/nicvf_queues.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
index d4496e9afcdf..184d5bdbe7e0 100644
--- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
+++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c
@@ -1355,10 +1355,11 @@ nicvf_sq_add_hdr_subdesc(struct nicvf *nic, struct snd_queue *sq, int qentry,
 
 	/* Offload checksum calculation to HW */
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
-		hdr->csum_l3 = 1; /* Enable IP csum calculation */
 		hdr->l3_offset = skb_network_offset(skb);
 		hdr->l4_offset = skb_transport_offset(skb);
 
+		/* Enable IP HDR csum calculation for V4 pkts */
+		hdr->csum_l3 = (ip.v4->version == 4) ? 1 : 0;
 		proto = (ip.v4->version == 4) ? ip.v4->protocol :
 			ip.v6->nexthdr;
 
--
2.15.0
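For readers following the fix above, here is a minimal userspace sketch of the decision the patch makes: request the L3 (IP header) checksum only when the packet is IPv4, since an IPv6 header carries no header checksum. The struct and function names (`model_csum_flags`, `model_pick_csum_flags`) are hypothetical and are not part of the thunderx driver; only the version test mirrors the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the checksum bits of a send-queue header. */
struct model_csum_flags {
	bool csum_l3;	/* ask hardware for the IP header checksum */
	bool csum_l4;	/* ask hardware for the TCP/UDP checksum */
};

/* The IP version is the high nibble of the first byte of the L3 header. */
static unsigned int model_ip_version(const uint8_t *l3_hdr)
{
	return l3_hdr[0] >> 4;
}

/* Assumption: called only for packets that need checksum offload. */
static struct model_csum_flags model_pick_csum_flags(const uint8_t *l3_hdr)
{
	struct model_csum_flags f;

	/* Only IPv4 has a header checksum; IPv6 must leave this clear. */
	f.csum_l3 = (model_ip_version(l3_hdr) == 4);
	f.csum_l4 = true;
	return f;
}
```

The L4 flag stays set for both address families, which matches the intent stated in the commit message: only the L3 request was wrong for IPv6.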
Re: [PATCH] xen-netfront: remove warning when unloading module
Hi Eduardo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on xen-tip/linux-next]
[also build test ERROR on v4.14 next-20171121]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Eduardo-Otubo/xen-netfront-remove-warning-when-unloading-module/20171122-163844
base:   https://git.kernel.org/pub/scm/linux/kernel/git/xen/tip.git linux-next
config: x86_64-allmodconfig (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

All errors (new ones prefixed by >>):

   drivers//net/xen-netfront.c: In function 'xennet_remove':
>> drivers//net/xen-netfront.c:2139:12: error: 'struct xenbus_device' has no member named 'xenbus_state'
     while (dev->xenbus_state != XenbusStateClosed){
               ^~

vim +2139 drivers//net/xen-netfront.c

  2126
  2127	static int xennet_remove(struct xenbus_device *dev)
  2128	{
  2129		struct netfront_info *info = dev_get_drvdata(&dev->dev);
  2130
  2131		dev_dbg(&dev->dev, "%s\n", dev->nodename);
  2132
  2133		xenbus_switch_state(dev, XenbusStateClosing);
  2134		while (xenbus_read_driver_state(dev->otherend) != XenbusStateClosing){
  2135			cpu_relax();
  2136			schedule();
  2137		}
  2138		xenbus_switch_state(dev, XenbusStateClosed);
> 2139		while (dev->xenbus_state != XenbusStateClosed){
  2140			cpu_relax();
  2141			schedule();
  2142		}
  2143
  2144		xennet_disconnect_backend(info);
  2145
  2146		unregister_netdev(info->netdev);
  2147
  2148		if (info->queues)
  2149			xennet_destroy_queues(info);
  2150		xennet_free_netdev(info->netdev);
  2151
  2152		return 0;
  2153	}
  2154

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Re: [PATCH 12/31] nds32: Device specific operations
2017-11-11 0:14 GMT+08:00 Arnd Bergmann:
> Could you move ioremap_nocache/ioremap_uc/ioremap_wc/ioremap_wt
> out of that #ifdef, or would that break other architectures?
>

It seems ok. I just tried arm64, x86 and nds32.

#endif /* CONFIG_MMU */

#ifndef ioremap_nocache
void __iomem *ioremap(phys_addr_t phys_addr, size_t size);
#define ioremap_nocache ioremap_nocache
static inline void __iomem *ioremap_nocache(phys_addr_t offset, size_t size)
{
	return ioremap(offset, size);
}
#endif

#ifndef ioremap_uc
#define ioremap_uc ioremap_uc
static inline void __iomem *ioremap_uc(phys_addr_t offset, size_t size)
{
	return ioremap_nocache(offset, size);
}
#endif

#ifndef ioremap_wc
#define ioremap_wc ioremap_wc
static inline void __iomem *ioremap_wc(phys_addr_t offset, size_t size)
{
	return ioremap_nocache(offset, size);
}
#endif

#ifndef ioremap_wt
#define ioremap_wt ioremap_wt
static inline void __iomem *ioremap_wt(phys_addr_t offset, size_t size)
{
	return ioremap_nocache(offset, size);
}
#endif
Wifi RTL8723bu driver test: failed to scan
Hello Jes Sorensen, I am currently testing a LM811 Wifi/BT USB dongle [1] on a Sinlinx SinA33 Allwinner SoC board [2]. I saw that I should use the realtek driver RTL8723BU for this USB dongle. Currently, I am only testing the Wifi and the mainline driver (kernel 4.14-rc7) does not seem to work. At least, the scanning does not output anything. I tested the driver recommended by LM Technologies [3] and it works fine (scan, connect and ping are ok). Before investigating on the differences between these two drivers, do you have any idea about this issue? Here are the commands and output I got with mainline's driver: # lsusb Bus 001 Device 001: ID 1d6b:0002 Bus 001 Device 003: ID 0bda:b720 Bus 001 Device 002: ID 05e3:0608 Bus 002 Device 001: ID 1d6b:0001 Bus 001 Device 004: ID 0bda:8152 # # modprobe rtl8xxxu [ 46.785896] usb 1-1.1: This Realtek USB WiFi dongle (0x0bda:0xb720) is untested! [ 46.802122] usb 1-1.1: Please report results to jes.soren...@gmail.com [ 46.980269] usb 1-1.1: Vendor: Realtek [ 46.988641] usb 1-1.1: Product: 802.11n WLAN Adapter [ 46.998182] usb 1-1.1: rtl8723bu_parse_efuse: dumping efuse (0x200 bytes): [ 47.014106] usb 1-1.1: 00: 29 81 03 7c 01 08 21 00 [ 47.023527] usb 1-1.1: 08: 40 07 05 35 10 00 00 00 [ 47.032888] usb 1-1.1: 10: 2c 2c 2c 2c 2c 2c 2b 2b [ 47.042197] usb 1-1.1: 18: 2b 2b 2b f4 ff ff ff ff [ 47.051442] usb 1-1.1: 20: ff ff ff ff ff ff ff ff [ 47.060608] usb 1-1.1: 28: ff ff ff ff ff ff ff ff [ 47.069672] usb 1-1.1: 30: ff ff ff ff ff ff ff ff [ 47.078679] usb 1-1.1: 38: ff ff 2d 2d 2d 2d 2d 2d [ 47.087599] usb 1-1.1: 40: 2d 2d 2d 2d 2d 03 ff ff [ 47.096539] usb 1-1.1: 48: ff ff ff ff ff ff ff ff [ 47.105489] usb 1-1.1: 50: ff ff ff ff ff ff ff ff [ 47.114418] usb 1-1.1: 58: ff ff ff ff ff ff ff ff [ 47.123322] usb 1-1.1: 60: ff ff ff ff ff ff ff ff [ 47.132238] usb 1-1.1: 68: ff ff ff ff ff ff ff ff [ 47.141059] usb 1-1.1: 70: ff ff ff ff ff ff ff ff [ 47.149810] usb 1-1.1: 78: ff ff ff ff ff ff ff ff [ 47.158479] usb 1-1.1: 80: 
ff ff ff ff ff ff ff ff [ 47.167107] usb 1-1.1: 88: ff ff ff ff ff ff ff ff [ 47.175651] usb 1-1.1: 90: ff ff ff ff ff ff ff ff [ 47.184102] usb 1-1.1: 98: ff ff ff ff ff ff ff ff [ 47.192463] usb 1-1.1: a0: ff ff ff ff ff ff ff ff [ 47.200728] usb 1-1.1: a8: ff ff ff ff ff ff ff ff [ 47.208943] usb 1-1.1: b0: ff ff ff ff ff ff ff ff [ 47.217053] usb 1-1.1: b8: 20 1e 20 00 00 00 ff ff [ 47.225146] usb 1-1.1: c0: ff 28 20 11 00 00 00 ff [ 47.233169] usb 1-1.1: c8: 00 ff ff ff ff ff ff ff [ 47.241140] usb 1-1.1: d0: ff ff ff ff ff ff ff ff [ 47.249009] usb 1-1.1: d8: ff ff ff ff ff ff ff ff [ 47.256776] usb 1-1.1: e0: ff ff ff ff ff ff ff ff [ 47.264481] usb 1-1.1: e8: ff ff ff ff ff ff ff ff [ 47.272098] usb 1-1.1: f0: ff ff ff ff ff ff ff ff [ 47.279619] usb 1-1.1: f8: ff ff ff ff ff ff ff ff [ 47.287029] usb 1-1.1: 100: da 0b 20 b7 e7 47 03 5c [ 47.294503] usb 1-1.1: 108: f3 70 32 1d c2 09 03 52 [ 47.301934] usb 1-1.1: 110: 65 61 6c 74 65 6b 16 03 [ 47.309294] usb 1-1.1: 118: 38 30 32 2e 31 31 6e 20 [ 47.316579] usb 1-1.1: 120: 57 4c 41 4e 20 41 64 61 [ 47.323811] usb 1-1.1: 128: 70 74 65 72 00 ff ff ff [ 47.331045] usb 1-1.1: 130: ff ff ff ff ff ff ff ff [ 47.338222] usb 1-1.1: 138: ff ff ff ff ff ff ff ff [ 47.345335] usb 1-1.1: 140: ff ff ff ff ff ff ff 0f [ 47.352364] usb 1-1.1: 148: ff ff ff ff ff ff ff ff [ 47.359299] usb 1-1.1: 150: ff ff ff ff ff ff ff ff [ 47.366135] usb 1-1.1: 158: ff ff ff ff ff ff ff ff [ 47.372882] usb 1-1.1: 160: ff ff ff ff ff ff ff ff [ 47.379541] usb 1-1.1: 168: ff ff ff ff ff ff ff ff [ 47.386117] usb 1-1.1: 170: ff ff ff ff ff ff ff ff [ 47.392664] usb 1-1.1: 178: ff ff ff ff ff ff ff ff [ 47.399154] usb 1-1.1: 180: ff ff ff ff ff ff ff ff [ 47.405530] usb 1-1.1: 188: ff ff ff ff ff ff ff ff [ 47.411846] usb 1-1.1: 190: ff ff ff ff ff ff ff ff [ 47.418125] usb 1-1.1: 198: ff ff ff ff ff ff ff ff [ 47.424365] usb 1-1.1: 1a0: ff ff ff ff ff ff ff ff [ 47.430587] usb 1-1.1: 1a8: ff ff ff ff ff ff ff ff [ 47.436769] usb 1-1.1: 1b0: 
ff ff ff ff ff ff ff ff [ 47.442967] usb 1-1.1: 1b8: ff ff ff ff ff ff ff ff [ 47.449168] usb 1-1.1: 1c0: ff ff ff ff ff ff ff ff [ 47.455323] usb 1-1.1: 1c8: ff ff ff ff ff ff ff ff [ 47.461471] usb 1-1.1: 1d0: ff ff ff ff ff ff ff ff [ 47.467592] usb 1-1.1: 1d8: ff ff ff ff ff ff ff ff [ 47.473713] usb 1-1.1: 1e0: ff ff ff ff ff ff ff ff [ 47.479841] usb 1-1.1: 1e8: ff ff ff ff ff ff ff ff [ 47.485942] usb 1-1.1: 1f0: ff ff ff ff ff ff ff ff [ 47.492062] usb 1-1.1: 1f8: ff ff ff ff ff ff ff ff [ 47.498173] usb 1-1.1: RTL8723BU rev E (SMIC) 1T1R, TX queues 3, WiFi=1, BT=1, GPS=0, HI PA=0 [ 47.509357] usb 1-1.1: RTL8723BU MAC:5c:f3:70:32:1d:c2 [ 47.516057] usb 1-1.1: rtl8xxxu: Loading firmware rtlwifi/rtl8723bu_nic.bin [ 47.532111] usb 1-1.1: Firmware revision 35.0 (signature 0x5301) [ 48.531644] usbcore: registered new interface
Re: broken ipv6 tcp csum offload on thunderx
On Wed, Nov 22, 2017 at 2:24 PM, Florian Westphal wrote:
> Hi.
>
> We are experiencing broken ipv6 connectivity with 4.14 kernel
> on arm64 with thunderx.
>
> ping6 still works, but it looks like tcp syn packets get sent
> with a wrong checksum -- socket remains in SYN-SENT state.
>
> after running
>
> ethtool -K enP2p1s0f1 tx-checksum-ipv6 off
>
> ipv6 tcp appears to work fine.
>
> # ethtool -i enP2p1s0f1
> driver: thunder-nicvf
> version: 1.0
> firmware-version:
> expansion-rom-version:
> bus-info: 0002:01:00.1
> supports-statistics: yes
> supports-test: no
> supports-eeprom-access: no
> supports-register-dump: yes
> supports-priv-flags: no
>
> [0.00] Boot CPU: AArch64 Processor [431f0a10]
> [0.00] Machine model: cavium,thunder-88xx
> [0.00] efi: Getting EFI parameters from FDT:
> [0.00] efi: EFI v2.40 by Cavium Thunder cn88xx EFI ThunderX-Firmware-Release-1.22.17 Sep 21 2017 14:26:28
> [0.00] efi: ACPI=0x ACPI 2.0=0x0014 SMBIOS=0xffef SMBIOS 3.0=0x10ffaf3 ESRT=0x10fff673e18
>
> What other information do you need to debug this?

We have a fix ready for this, will submit upstream asap.

Thanks,
Sunil.

>
> Thanks,
> Florian
broken ipv6 tcp csum offload on thunderx
Hi.

We are experiencing broken ipv6 connectivity with 4.14 kernel
on arm64 with thunderx.

ping6 still works, but it looks like tcp syn packets get sent
with a wrong checksum -- socket remains in SYN-SENT state.

after running

ethtool -K enP2p1s0f1 tx-checksum-ipv6 off

ipv6 tcp appears to work fine.

# ethtool -i enP2p1s0f1
driver: thunder-nicvf
version: 1.0
firmware-version:
expansion-rom-version:
bus-info: 0002:01:00.1
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

[0.00] Boot CPU: AArch64 Processor [431f0a10]
[0.00] Machine model: cavium,thunder-88xx
[0.00] efi: Getting EFI parameters from FDT:
[0.00] efi: EFI v2.40 by Cavium Thunder cn88xx EFI ThunderX-Firmware-Release-1.22.17 Sep 21 2017 14:26:28
[0.00] efi: ACPI=0x ACPI 2.0=0x0014 SMBIOS=0xffef SMBIOS 3.0=0x10ffaf3 ESRT=0x10fff673e18

What other information do you need to debug this?

Thanks,
Florian
[net 06/13] i40e: restore promiscuous after reset
From: Alan Brady

After a reset we rebuild the VSIs, which clobbers any promiscuous
settings we had before the reset. Restore the promiscuous settings
we had before the reset.

Signed-off-by: Alan Brady
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 155 +++-
 1 file changed, 83 insertions(+), 72 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 173e924d4dae..775d5a125887 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -2166,6 +2166,73 @@ i40e_aqc_broadcast_filter(struct i40e_vsi *vsi, const char *vsi_name,
 	return aq_ret;
 }
 
+/**
+ * i40e_set_promiscuous - set promiscuous mode
+ * @pf: board private structure
+ * @promisc: promisc on or off
+ *
+ * There are different ways of setting promiscuous mode on a PF depending on
+ * what state/environment we're in. This identifies and sets it appropriately.
+ * Returns 0 on success.
+ **/
+static int i40e_set_promiscuous(struct i40e_pf *pf, bool promisc)
+{
+	struct i40e_vsi *vsi = pf->vsi[pf->lan_vsi];
+	struct i40e_hw *hw = &pf->hw;
+	i40e_status aq_ret;
+
+	if (vsi->type == I40E_VSI_MAIN &&
+	    pf->lan_veb != I40E_NO_VEB &&
+	    !(pf->flags & I40E_FLAG_MFP_ENABLED)) {
+		/* set defport ON for Main VSI instead of true promisc
+		 * this way we will get all unicast/multicast and VLAN
+		 * promisc behavior but will not get VF or VMDq traffic
+		 * replicated on the Main VSI.
+		 */
+		if (promisc)
+			aq_ret = i40e_aq_set_default_vsi(hw,
+							 vsi->seid,
+							 NULL);
+		else
+			aq_ret = i40e_aq_clear_default_vsi(hw,
+							   vsi->seid,
+							   NULL);
+		if (aq_ret) {
+			dev_info(&pf->pdev->dev,
+				 "Set default VSI failed, err %s, aq_err %s\n",
+				 i40e_stat_str(hw, aq_ret),
+				 i40e_aq_str(hw, hw->aq.asq_last_status));
+		}
+	} else {
+		aq_ret = i40e_aq_set_vsi_unicast_promiscuous(
+						  hw,
+						  vsi->seid,
+						  promisc, NULL,
+						  true);
+		if (aq_ret) {
+			dev_info(&pf->pdev->dev,
+				 "set unicast promisc failed, err %s, aq_err %s\n",
+				 i40e_stat_str(hw, aq_ret),
+				 i40e_aq_str(hw, hw->aq.asq_last_status));
+		}
+		aq_ret = i40e_aq_set_vsi_multicast_promiscuous(
+						  hw,
+						  vsi->seid,
+						  promisc, NULL);
+		if (aq_ret) {
+			dev_info(&pf->pdev->dev,
+				 "set multicast promisc failed, err %s, aq_err %s\n",
+				 i40e_stat_str(hw, aq_ret),
+				 i40e_aq_str(hw, hw->aq.asq_last_status));
+		}
+	}
+
+	if (!aq_ret)
+		pf->cur_promisc = promisc;
+
+	return aq_ret;
+}
+
 /**
  * i40e_sync_vsi_filters - Update the VSI filter list to the HW
  * @vsi: ptr to the VSI
@@ -2467,81 +2534,16 @@ int i40e_sync_vsi_filters(struct i40e_vsi *vsi)
 		cur_promisc = (!!(vsi->current_netdev_flags & IFF_PROMISC) ||
 			       test_bit(__I40E_VSI_OVERFLOW_PROMISC,
 					vsi->state));
-		if ((vsi->type == I40E_VSI_MAIN) &&
-		    (pf->lan_veb != I40E_NO_VEB) &&
-		    !(pf->flags & I40E_FLAG_MFP_ENABLED)) {
-			/* set defport ON for Main VSI instead of true promisc
-			 * this way we will get all unicast/multicast and VLAN
-			 * promisc behavior but will not get VF or VMDq traffic
-			 * replicated on the Main VSI.
-			 */
-			if (pf->cur_promisc != cur_promisc) {
-				pf->cur_promisc = cur_promisc;
-				if (cur_promisc)
-					aq_ret =
-
[net 07/13] ixgbe: Fix skb list corruption on Power systems
From: Brian King

This patch fixes an issue seen on Power systems with ixgbe which results
in skb list corruption and an eventual kernel oops. The following is what
was observed:

     CPU 1                                 CPU2
1:   ixgbe_xmit_frame_ring                 ixgbe_clean_tx_irq
2:    first->skb = skb                     eop_desc = tx_buffer->next_to_watch
3:   ixgbe_tx_map                          read_barrier_depends()
4:    wmb                                  check adapter written status bit
5:    first->next_to_watch = tx_desc       napi_consume_skb(tx_buffer->skb ..);
6:   writel(i, tx_ring->tail);

The read_barrier_depends is insufficient to ensure that tx_buffer->skb does not
get loaded prior to tx_buffer->next_to_watch, which then results in loading
a stale skb pointer. This patch replaces the read_barrier_depends with
smp_rmb to ensure loads are ordered with respect to the load of
tx_buffer->next_to_watch.

Cc: stable
Signed-off-by: Brian King
Acked-by: Jesse Brandeburg
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index ca06c3cc2ca8..62a18914f00f 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1192,7 +1192,7 @@ static bool ixgbe_clean_tx_irq(struct ixgbe_q_vector *q_vector,
 			break;
 
 		/* prevent any other reads prior to eop_desc */
-		read_barrier_depends();
+		smp_rmb();
 
 		/* if DD is not set pending work has not been completed */
 		if (!(eop_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)))
--
2.15.0
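The ordering requirement described in the ixgbe patch above can be sketched in portable C11. This is a userspace analogue under stated assumptions, not kernel code: the release store stands in for the writer's `wmb` followed by the `next_to_watch` assignment, and the relaxed load followed by an acquire fence stands in for the reader's load of `next_to_watch` followed by `smp_rmb()`. The type and function names (`tx_buffer_model`, `model_publish`, `model_consume`) are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical model of the two fields involved in the race. */
struct tx_buffer_model {
	int *skb;			/* written first by the transmit path */
	_Atomic(int *) next_to_watch;	/* published last by the transmit path */
};

static void model_init(struct tx_buffer_model *b)
{
	b->skb = NULL;
	atomic_init(&b->next_to_watch, NULL);
}

/* Writer side: store the payload, then publish the descriptor pointer. */
static void model_publish(struct tx_buffer_model *b, int *skb, int *desc)
{
	b->skb = skb;
	atomic_store_explicit(&b->next_to_watch, desc, memory_order_release);
}

/* Reader side: load the descriptor pointer, fence, then load the payload. */
static int *model_consume(struct tx_buffer_model *b)
{
	int *desc = atomic_load_explicit(&b->next_to_watch,
					 memory_order_relaxed);

	if (!desc)
		return NULL;	/* nothing published yet */

	/* Keeps the skb load below from being satisfied before the load above. */
	atomic_thread_fence(memory_order_acquire);
	return b->skb;
}
```

The point of the sketch is that `skb` is not reached through the pointer just loaded (it lives in the same buffer structure, not behind `desc`), so an address-dependency barrier gives no ordering between the two loads; an explicit read barrier or acquire fence does.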
[net 04/13] i40e: Fix FLR reset timeout issue
From: Filip Sadowski

This patch allows detection of upcoming core reset in case NIC gets
stuck while performing FLR reset. The i40e_pf_reset() function returns
I40E_ERR_NOT_READY when global reset was detected.

Signed-off-by: Filip Sadowski
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_common.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 13c79468a6da..095965f268bd 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -1269,6 +1269,7 @@ i40e_status i40e_pf_reset(struct i40e_hw *hw)
 	 * we don't need to do the PF Reset
 	 */
 	if (!cnt) {
+		u32 reg2 = 0;
 		if (hw->revision_id == 0)
 			cnt = I40E_PF_RESET_WAIT_COUNT_A0;
 		else
@@ -1280,6 +1281,12 @@ i40e_status i40e_pf_reset(struct i40e_hw *hw)
 			reg = rd32(hw, I40E_PFGEN_CTRL);
 			if (!(reg & I40E_PFGEN_CTRL_PFSWR_MASK))
 				break;
+			reg2 = rd32(hw, I40E_GLGEN_RSTAT);
+			if (reg2 & I40E_GLGEN_RSTAT_DEVSTATE_MASK) {
+				hw_dbg(hw, "Core reset upcoming. Skipping PF reset request.\n");
+				hw_dbg(hw, "I40E_GLGEN_RSTAT = 0x%x\n", reg2);
+				return I40E_ERR_NOT_READY;
+			}
 			usleep_range(1000, 2000);
 		}
 		if (reg & I40E_PFGEN_CTRL_PFSWR_MASK) {
--
2.15.0
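The control flow added by the FLR patch above is a polling loop with an early exit. The sketch below models it in plain C with a fake register block so it can be compiled and run outside the kernel. All names and mask values (`model_regs`, `MODEL_PFSWR_MASK`, `MODEL_DEVSTATE_MASK`, `model_wait_pf_reset`) are hypothetical; only the three outcomes correspond to what the patch describes.

```c
#include <stdint.h>

enum model_reset_result {
	MODEL_RESET_DONE,	/* PF reset bit cleared */
	MODEL_RESET_TIMEOUT,	/* bit never cleared within the poll budget */
	MODEL_RESET_NOT_READY	/* a global reset is pending, give up early */
};

/* Hypothetical register block standing in for the device registers. */
struct model_regs {
	volatile uint32_t pf_ctrl;	/* PF software reset control */
	volatile uint32_t rstat;	/* global reset status */
};

#define MODEL_PFSWR_MASK	0x1u	/* assumed: reset still in progress */
#define MODEL_DEVSTATE_MASK	0x3u	/* assumed: non-zero means global reset */

static enum model_reset_result
model_wait_pf_reset(const struct model_regs *regs, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (!(regs->pf_ctrl & MODEL_PFSWR_MASK))
			return MODEL_RESET_DONE;

		/* The new check: stop waiting if a core reset is coming. */
		if (regs->rstat & MODEL_DEVSTATE_MASK)
			return MODEL_RESET_NOT_READY;

		/* The driver sleeps between polls (usleep_range(1000, 2000)). */
	}
	return MODEL_RESET_TIMEOUT;
}
```

Without the second check, a device that enters a global reset while the PF reset bit is stuck would only be reported after the whole poll budget is spent.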
[net 09/13] ixgbevf: Use smp_rmb rather than read_barrier_depends
From: Brian King

The original issue being fixed in this patch was seen with the ixgbe
driver, but the same issue exists with ixgbevf as well, as the code is
very similar. read_barrier_depends is not sufficient to ensure
loads following it are not speculatively loaded out of order
by the CPU, which can result in stale data being loaded, causing
potential system crashes.

Cc: stable
Signed-off-by: Brian King
Acked-by: Jesse Brandeburg
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
index feed11bc9ddf..1f4a69134ade 100644
--- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
@@ -326,7 +326,7 @@ static bool ixgbevf_clean_tx_irq(struct ixgbevf_q_vector *q_vector,
 			break;
 
 		/* prevent any other reads prior to eop_desc */
-		read_barrier_depends();
+		smp_rmb();
 
 		/* if DD is not set pending work has not been completed */
 		if (!(eop_desc->wb.status & cpu_to_le32(IXGBE_TXD_STAT_DD)))
--
2.15.0
[net 01/13] i40e: Fix for NUP NVM image downgrade failure
From: Jacob Keller

Since commit 96a39aed25e6 ("i40e: Acquire NVM lock before reads on all
devices") we've used the NVM lock to synchronize NVM reads even on devices
which don't strictly need the lock.

Doing so can cause a regression on older firmware prior to 1.5,
especially when downgrading the firmware.

Fix this by only grabbing the lock if we're running on an X722 device
(which requires the lock as it uses the AdminQ to read the NVM), or if
we're currently running 1.5 or newer firmware.

Signed-off-by: Jacob Keller
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_adminq.c | 6 ++
 drivers/net/ethernet/intel/i40e/i40e_common.c | 3 ++-
 drivers/net/ethernet/intel/i40e/i40e_nvm.c    | 8 +---
 drivers/net/ethernet/intel/i40e/i40e_type.h   | 1 +
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.c b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
index 9dcb2a961197..9af74253c3f7 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_adminq.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.c
@@ -613,6 +613,12 @@ i40e_status i40e_init_adminq(struct i40e_hw *hw)
 		hw->flags |= I40E_HW_FLAG_AQ_PHY_ACCESS_CAPABLE;
 	}
 
+	/* Newer versions of firmware require lock when reading the NVM */
+	if (hw->aq.api_maj_ver > 1 ||
+	    (hw->aq.api_maj_ver == 1 &&
+	     hw->aq.api_min_ver >= 5))
+		hw->flags |= I40E_HW_FLAG_NVM_READ_REQUIRES_LOCK;
+
 	/* The ability to RX (not drop) 802.1ad frames was added in API 1.7 */
 	if (hw->aq.api_maj_ver > 1 ||
 	    (hw->aq.api_maj_ver == 1 &&
diff --git a/drivers/net/ethernet/intel/i40e/i40e_common.c b/drivers/net/ethernet/intel/i40e/i40e_common.c
index 0203665cb53c..13c79468a6da 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_common.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_common.c
@@ -948,7 +948,8 @@ i40e_status i40e_init_shared_code(struct i40e_hw *hw)
 		hw->pf_id = (u8)(func_rid & 0x7);
 
 	if (hw->mac.type == I40E_MAC_X722)
-		hw->flags |= I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE;
+		hw->flags |= I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE |
+			     I40E_HW_FLAG_NVM_READ_REQUIRES_LOCK;
 
 	status = i40e_init_nvm(hw);
 	return status;
diff --git a/drivers/net/ethernet/intel/i40e/i40e_nvm.c b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
index 0ccab0a5d717..7689c2ee0d46 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_nvm.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_nvm.c
@@ -328,15 +328,17 @@ static i40e_status __i40e_read_nvm_word(struct i40e_hw *hw,
 i40e_status i40e_read_nvm_word(struct i40e_hw *hw, u16 offset,
 			       u16 *data)
 {
-	i40e_status ret_code;
+	i40e_status ret_code = 0;
 
-	ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ);
+	if (hw->flags & I40E_HW_FLAG_NVM_READ_REQUIRES_LOCK)
+		ret_code = i40e_acquire_nvm(hw, I40E_RESOURCE_READ);
 	if (ret_code)
 		return ret_code;
 
 	ret_code = __i40e_read_nvm_word(hw, offset, data);
 
-	i40e_release_nvm(hw);
+	if (hw->flags & I40E_HW_FLAG_NVM_READ_REQUIRES_LOCK)
+		i40e_release_nvm(hw);
 
 	return ret_code;
 }
diff --git a/drivers/net/ethernet/intel/i40e/i40e_type.h b/drivers/net/ethernet/intel/i40e/i40e_type.h
index 00d4833e9925..0e8568719b4e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_type.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_type.h
@@ -629,6 +629,7 @@ struct i40e_hw {
 #define I40E_HW_FLAG_AQ_SRCTL_ACCESS_ENABLE BIT_ULL(0)
 #define I40E_HW_FLAG_802_1AD_CAPABLE        BIT_ULL(1)
 #define I40E_HW_FLAG_AQ_PHY_ACCESS_CAPABLE  BIT_ULL(2)
+#define I40E_HW_FLAG_NVM_READ_REQUIRES_LOCK BIT_ULL(3)
 	u64 flags;
 
 	/* Used in set switch config AQ command */
--
2.15.0
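The rule in the NVM patch above (take the lock on X722, or when the AdminQ API version is 1.5 or newer) reduces to a small predicate. The sketch below restates it as a standalone function so the boundary cases are easy to check. The function name and its parameters are hypothetical; the driver itself records the result as a flag bit in `hw->flags` rather than calling a helper like this.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when NVM reads should take the NVM lock.
 * is_x722 models the MAC type check; api_maj/api_min model the AdminQ API
 * version reported by firmware.
 */
static bool model_nvm_read_requires_lock(bool is_x722,
					 unsigned int api_maj,
					 unsigned int api_min)
{
	if (is_x722)
		return true;	/* reads go through the AdminQ on this device */

	/* Version comparison: anything at or above 1.5 needs the lock. */
	return api_maj > 1 || (api_maj == 1 && api_min >= 5);
}
```

For example, API 1.4 on a non-X722 device yields false, while 1.5 and 2.0 both yield true, which is the split the commit message describes for downgraded firmware.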
[net 02/13] i40e: fix the calculation of VFs mac addresses
From: Zijie Pan

num_mac should be increased only after the call to i40e_add_mac_filter().

Fixes: 5f527ba962e2 ("i40e: Limit the number of MAC and VLAN addresses that can be added for VFs")
Signed-off-by: Zijie Pan
Signed-off-by: Nicolas Dichtel
Reviewed-by: Tushar Dave
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
index f8a794b72462..a3dc9b932946 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c
@@ -2218,18 +2218,19 @@ static int i40e_vc_add_mac_addr_msg(struct i40e_vf *vf, u8 *msg, u16 msglen)
 		struct i40e_mac_filter *f;
 
 		f = i40e_find_mac(vsi, al->list[i].addr);
-		if (!f)
+		if (!f) {
 			f = i40e_add_mac_filter(vsi, al->list[i].addr);
 
-		if (!f) {
-			dev_err(&pf->pdev->dev,
-				"Unable to add MAC filter %pM for VF %d\n",
-				 al->list[i].addr, vf->vf_id);
-			ret = I40E_ERR_PARAM;
-			spin_unlock_bh(&vsi->mac_filter_hash_lock);
-			goto error_param;
-		} else {
-			vf->num_mac++;
+			if (!f) {
+				dev_err(&pf->pdev->dev,
+					"Unable to add MAC filter %pM for VF %d\n",
+					al->list[i].addr, vf->vf_id);
+				ret = I40E_ERR_PARAM;
+				spin_unlock_bh(&vsi->mac_filter_hash_lock);
+				goto error_param;
+			} else {
+				vf->num_mac++;
+			}
 		}
 	}
 	spin_unlock_bh(&vsi->mac_filter_hash_lock);
--
2.15.0
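The accounting bug fixed above is easy to reproduce in a small model: if the counter is bumped whenever a filter exists after the lookup-or-add step, re-sending an address that is already present inflates the count. The sketch below keeps the counter increment inside the "newly added" branch, as the patch does. The structure and helpers (`vf_model`, `model_find_mac`, `model_add_mac`) are hypothetical and use a fixed array instead of the driver's hash list.

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_MAX_FILTERS 8
#define MODEL_ETH_ALEN 6

/* Hypothetical per-VF state: a tiny filter table plus the MAC counter. */
struct vf_model {
	unsigned char filters[MODEL_MAX_FILTERS][MODEL_ETH_ALEN];
	int num_filters;
	int num_mac;
};

static bool model_find_mac(const struct vf_model *vf, const unsigned char *mac)
{
	for (int i = 0; i < vf->num_filters; i++)
		if (memcmp(vf->filters[i], mac, MODEL_ETH_ALEN) == 0)
			return true;
	return false;
}

/* Returns 0 on success (including "already present"), -1 if the table is full. */
static int model_add_mac(struct vf_model *vf, const unsigned char *mac)
{
	if (model_find_mac(vf, mac))
		return 0;	/* existing filter: do not touch num_mac */

	if (vf->num_filters == MODEL_MAX_FILTERS)
		return -1;

	memcpy(vf->filters[vf->num_filters++], mac, MODEL_ETH_ALEN);
	vf->num_mac++;		/* count only filters that were really added */
	return 0;
}
```

Adding the same address twice leaves `num_mac` at 1 in this model, whereas the pre-patch flow would have counted it twice.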
[net 00/13][pull request] Intel Wired LAN Driver Fixes 2017-11-21
This series contains fixes for igb/vf, ixgbe/vf, i40e/vf and fm10k.

Jake fixes a regression issue with older firmware, where we were using
the NVM lock to synchronize NVM reads for all devices and firmware
versions, yet this caused issues with older firmware prior to version
1.5. Fixed this by only grabbing the lock for newer devices and
firmware version 1.5 or newer.

Zijie Pan fixes the calculation of the i40e VF MAC addresses, where it
was possible to increment to the next MAC entry without calling
i40e_add_mac_filter().

Amritha removes the upper limit of 64 queues on a channel VSI since the
upper bound is determined by the VSI's num_queue_pairs.

Filip fixes an issue during FLR resets, where we should have been
checking for an upcoming core reset and, if one is detected, just
returning I40E_ERR_NOT_READY.

Alan fixes the notifying clients of l2 parameters by copying the
parameters to the client instance struct and re-organizes the priority
in which the client tasks fire so that if the flag for notifying l2
params is set, it will trigger before the client open task. Also fixed
the promiscuous settings after reset for all the VSIs.

Brian King from IBM fixes an issue seen on Power systems which would
result in skb list corruption and eventual kernel oops. Brian provides
the same fix for nearly all our drivers, to replace the
read_barrier_depends with smp_rmb() to ensure loads are ordered with
respect to the load of tx_buffer->next_to_watch.

The following are changes since commit 0c86a6bd85ff0629cd2c5141027fc1c8bb6cde9c:
  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
and are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue 40GbE

Alan Brady (2):
  i40evf: fix client notify of l2 params
  i40e: restore promiscuous after reset

Amritha Nambiar (1):
  i40e: Remove limit of 64 max queues per channel

Brian King (7):
  ixgbe: Fix skb list corruption on Power systems
  i40e: Use smp_rmb rather than read_barrier_depends
  ixgbevf: Use smp_rmb rather than read_barrier_depends
  igbvf: Use smp_rmb rather than read_barrier_depends
  igb: Use smp_rmb rather than read_barrier_depends
  fm10k: Use smp_rmb rather than read_barrier_depends
  i40evf: Use smp_rmb rather than read_barrier_depends

Filip Sadowski (1):
  i40e: Fix FLR reset timeout issue

Jacob Keller (1):
  i40e: Fix for NUP NVM image downgrade failure

Zijie Pan (1):
  i40e: fix the calculation of VFs mac addresses

 drivers/net/ethernet/intel/fm10k/fm10k_main.c      |   2 +-
 drivers/net/ethernet/intel/i40e/i40e.h             |   1 -
 drivers/net/ethernet/intel/i40e/i40e_adminq.c      |   6 +
 drivers/net/ethernet/intel/i40e/i40e_common.c      |  10 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c        | 165 +++--
 drivers/net/ethernet/intel/i40e/i40e_nvm.c         |   8 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c        |   2 +-
 drivers/net/ethernet/intel/i40e/i40e_type.h        |   1 +
 drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c |  21 +--
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c      |   2 +-
 drivers/net/ethernet/intel/i40evf/i40evf_client.c  |  38 +++--
 drivers/net/ethernet/intel/i40evf/i40evf_main.c    |  10 +-
 drivers/net/ethernet/intel/igb/igb_main.c          |   2 +-
 drivers/net/ethernet/intel/igbvf/netdev.c          |   2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c      |   2 +-
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c  |   2 +-
 16 files changed, 154 insertions(+), 120 deletions(-)

--
2.15.0
[net 03/13] i40e: Remove limit of 64 max queues per channel
From: Amritha Nambiar

It is safe to remove the upper limit of 64 queues on a channel VSI.
The upper bound is determined by the VSI's num_queue_pairs and is
validated when the driver bounds-checks the queue mapping info passed
in through the mqprio interface.

Signed-off-by: Amritha Nambiar
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e.h      | 1 -
 drivers/net/ethernet/intel/i40e/i40e_main.c | 8 
 2 files changed, 9 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e.h b/drivers/net/ethernet/intel/i40e/i40e.h
index 5829715fa342..e019baa905c5 100644
--- a/drivers/net/ethernet/intel/i40e/i40e.h
+++ b/drivers/net/ethernet/intel/i40e/i40e.h
@@ -90,7 +90,6 @@
 #define I40E_AQ_LEN			256
 #define I40E_AQ_WORK_LIMIT		66 /* max number of VFs + a little */
 #define I40E_MAX_USER_PRIORITY		8
-#define I40E_MAX_QUEUES_PER_CH		64
 #define I40E_DEFAULT_TRAFFIC_CLASS	BIT(0)
 #define I40E_DEFAULT_MSG_ENABLE		4
 #define I40E_QUEUE_WAIT_RETRY_LIMIT	10
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 4a964d6e4a9e..173e924d4dae 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -5629,14 +5629,6 @@ static int i40e_validate_num_queues(struct i40e_pf *pf, int num_queues,
 		return -EINVAL;
 
 	*reconfig_rss = false;
-
-	if (num_queues > I40E_MAX_QUEUES_PER_CH) {
-		dev_err(&pf->pdev->dev,
-			"Failed to create VMDq VSI. User requested num_queues (%d) > I40E_MAX_QUEUES_PER_VSI (%u)\n",
-			num_queues, I40E_MAX_QUEUES_PER_CH);
-		return -EINVAL;
-	}
-
 	if (vsi->current_rss_size) {
 		if (num_queues > vsi->current_rss_size) {
 			dev_dbg(&pf->pdev->dev,
--
2.15.0
[net 13/13] i40evf: Use smp_rmb rather than read_barrier_depends
From: Brian King

The original issue being fixed in this patch was seen with the ixgbe
driver, but the same issue exists with i40evf as well, as the code is
very similar. read_barrier_depends is not sufficient to ensure
loads following it are not speculatively loaded out of order
by the CPU, which can result in stale data being loaded, causing
potential system crashes.

Cc: stable
Signed-off-by: Brian King
Acked-by: Jesse Brandeburg
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40evf/i40e_txrx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index fe817e2b6fef..50864f99446d 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -179,7 +179,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
 			break;
 
 		/* prevent any other reads prior to eop_desc */
-		read_barrier_depends();
+		smp_rmb();
 
 		i40e_trace(clean_tx_irq, tx_ring, tx_desc, tx_buf);
 		/* if the descriptor isn't done, no work yet to do */
--
2.15.0
[net 08/13] i40e: Use smp_rmb rather than read_barrier_depends
From: Brian King

The original issue being fixed in this patch was seen with the ixgbe
driver, but the same issue exists with i40e as well, as the code is
very similar. read_barrier_depends is not sufficient to ensure
loads following it are not speculatively loaded out of order
by the CPU, which can result in stale data being loaded, causing
potential system crashes.

Cc: stable
Signed-off-by: Brian King
Acked-by: Jesse Brandeburg
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 +-
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 775d5a125887..4c08cc86463e 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -3966,7 +3966,7 @@ static bool i40e_clean_fdir_tx_irq(struct i40e_ring *tx_ring, int budget)
 			break;
 
 		/* prevent any other reads prior to eop_desc */
-		read_barrier_depends();
+		smp_rmb();
 
 		/* if the descriptor isn't done, no work yet to do */
 		if (!(eop_desc->cmd_type_offset_bsz &
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index d6d352a6e6ea..4566d66ffc7c 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -759,7 +759,7 @@ static bool i40e_clean_tx_irq(struct i40e_vsi *vsi,
 			break;
 
 		/* prevent any other reads prior to eop_desc */
-		read_barrier_depends();
+		smp_rmb();
 
 		i40e_trace(clean_tx_irq, tx_ring, tx_desc, tx_buf);
 		/* we have caught up to head, no work left to do */
--
2.15.0
[net 11/13] igb: Use smp_rmb rather than read_barrier_depends
From: Brian King

The original issue being fixed in this patch was seen with the ixgbe driver, but the same issue exists with igb as well, as the code is very similar. read_barrier_depends is not sufficient to ensure loads following it are not speculatively loaded out of order by the CPU, which can result in stale data being loaded, causing potential system crashes.

Cc: stable
Signed-off-by: Brian King
Acked-by: Jesse Brandeburg
Tested-by: Aaron Brown
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/igb/igb_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index e94d3c256667..c208753ff5b7 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -7317,7 +7317,7 @@ static bool igb_clean_tx_irq(struct igb_q_vector *q_vector, int napi_budget)
 			break;

 		/* prevent any other reads prior to eop_desc */
-		read_barrier_depends();
+		smp_rmb();

 		/* if DD is not set pending work has not been completed */
 		if (!(eop_desc->wb.status & cpu_to_le32(E1000_TXD_STAT_DD)))
-- 
2.15.0
[net 10/13] igbvf: Use smp_rmb rather than read_barrier_depends
From: Brian King

The original issue being fixed in this patch was seen with the ixgbe driver, but the same issue exists with igbvf as well, as the code is very similar. read_barrier_depends is not sufficient to ensure loads following it are not speculatively loaded out of order by the CPU, which can result in stale data being loaded, causing potential system crashes.

Cc: stable
Signed-off-by: Brian King
Acked-by: Jesse Brandeburg
Tested-by: Aaron Brown
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/igbvf/netdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igbvf/netdev.c b/drivers/net/ethernet/intel/igbvf/netdev.c
index 713e8df23744..4214c1519a87 100644
--- a/drivers/net/ethernet/intel/igbvf/netdev.c
+++ b/drivers/net/ethernet/intel/igbvf/netdev.c
@@ -810,7 +810,7 @@ static bool igbvf_clean_tx_irq(struct igbvf_ring *tx_ring)
 			break;

 		/* prevent any other reads prior to eop_desc */
-		read_barrier_depends();
+		smp_rmb();

 		/* if DD is not set pending work has not been completed */
 		if (!(eop_desc->wb.status & cpu_to_le32(E1000_TXD_STAT_DD)))
-- 
2.15.0
[net 05/13] i40evf: fix client notify of l2 params
From: Alan Brady

The current method for notifying clients of l2 parameters is broken because we fail to copy the new parameters to the client instance struct, we need to do the notification before the client 'open' function pointer gets called, and lastly we should set the l2 parameters when first adding a client instance.

This patch first introduces the i40evf_client_get_params function to prevent code duplication in the i40evf_client_add_instance and the i40evf_notify_client_l2_params functions. We then fix the notify l2 params function to actually copy the parameters to client instance struct and do the same in the '*_add_instance' function. Lastly this patch reorganizes the priority in which client tasks fire so that if the flag for notifying l2 params is set, it will trigger before the open because the client needs these new parameters as part of a client open task.

Signed-off-by: Alan Brady
Tested-by: Andrew Bowers
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/i40evf/i40evf_client.c | 38 ---
 drivers/net/ethernet/intel/i40evf/i40evf_main.c | 10 +++---
 2 files changed, 31 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_client.c b/drivers/net/ethernet/intel/i40evf/i40evf_client.c
index d8131139565e..da60ce12b33d 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_client.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_client.c
@@ -25,6 +25,26 @@ static struct i40e_ops i40evf_lan_ops = {
 	.setup_qvlist = i40evf_client_setup_qvlist,
 };

+/**
+ * i40evf_client_get_params - retrieve relevant client parameters
+ * @vsi: VSI with parameters
+ * @params: client param struct
+ **/
+static
+void i40evf_client_get_params(struct i40e_vsi *vsi, struct i40e_params *params)
+{
+	int i;
+
+	memset(params, 0, sizeof(struct i40e_params));
+	params->mtu = vsi->netdev->mtu;
+	params->link_up = vsi->back->link_up;
+
+	for (i = 0; i < I40E_MAX_USER_PRIORITY; i++) {
+		params->qos.prio_qos[i].tc = 0;
+		params->qos.prio_qos[i].qs_handle = vsi->qs_handle;
+	}
+}
+
 /**
  * i40evf_notify_client_message - call the client message receive callback
  * @vsi: the VSI associated with this client
@@ -66,10 +86,6 @@ void i40evf_notify_client_l2_params(struct i40e_vsi *vsi)
 		return;

 	cinst = vsi->back->cinst;
-	memset(&params, 0, sizeof(params));
-	params.mtu = vsi->netdev->mtu;
-	params.link_up = vsi->back->link_up;
-	params.qos.prio_qos[0].qs_handle = vsi->qs_handle;

 	if (!cinst || !cinst->client || !cinst->client->ops ||
 	    !cinst->client->ops->l2_param_change) {
@@ -77,6 +93,8 @@ void i40evf_notify_client_l2_params(struct i40e_vsi *vsi)
 			"Cannot locate client instance l2_param_change function\n");
 		return;
 	}
+	i40evf_client_get_params(vsi, &params);
+	cinst->lan_info.params = params;
 	cinst->client->ops->l2_param_change(&cinst->lan_info, cinst->client,
 					    &params);
 }
@@ -166,9 +184,9 @@ static struct i40e_client_instance *
 i40evf_client_add_instance(struct i40evf_adapter *adapter)
 {
 	struct i40e_client_instance *cinst = NULL;
-	struct netdev_hw_addr *mac = NULL;
 	struct i40e_vsi *vsi = &adapter->vsi;
-	int i;
+	struct netdev_hw_addr *mac = NULL;
+	struct i40e_params params;

 	if (!vf_registered_client)
 		goto out;
@@ -192,18 +210,14 @@ i40evf_client_add_instance(struct i40evf_adapter *adapter)
 	cinst->lan_info.version.major = I40EVF_CLIENT_VERSION_MAJOR;
 	cinst->lan_info.version.minor = I40EVF_CLIENT_VERSION_MINOR;
 	cinst->lan_info.version.build = I40EVF_CLIENT_VERSION_BUILD;
+	i40evf_client_get_params(vsi, &params);
+	cinst->lan_info.params = params;
 	set_bit(__I40E_CLIENT_INSTANCE_NONE, &cinst->state);

 	cinst->lan_info.msix_count = adapter->num_iwarp_msix;
 	cinst->lan_info.msix_entries =
 			&adapter->msix_entries[adapter->iwarp_base_vector];

-	for (i = 0; i < I40E_MAX_USER_PRIORITY; i++) {
-		cinst->lan_info.params.qos.prio_qos[i].tc = 0;
-		cinst->lan_info.params.qos.prio_qos[i].qs_handle =
-								vsi->qs_handle;
-	}
-
 	mac = list_first_entry(&cinst->lan_info.netdev->dev_addrs.list,
 			       struct netdev_hw_addr, list);
 	if (mac)
diff --git a/drivers/net/ethernet/intel/i40evf/i40evf_main.c b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
index ca2ebdbd24d7..7b2a4eba92e2 100644
--- a/drivers/net/ethernet/intel/i40evf/i40evf_main.c
+++ b/drivers/net/ethernet/intel/i40evf/i40evf_main.c
@@ -2110,6 +2110,11 @@ static void i40evf_client_task(struct work_struct *work)
 		adapter->flags &=
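The i40evf_main.c hunk is cut off above, so the following is only a guess at the shape of the reordering the commit message describes, written as a standalone sketch. The flag names, structs and helpers (FLAG_NEEDS_L2_PARAMS, FLAG_NEEDS_OPEN, struct adapter, client_task and so on) are all hypothetical and simplified; the assumption that one pending item is handled per call is mine, not taken from the patch. It is meant to show why the l2 params check would need to come before the open check: the open callback reads the parameters that the notify step copies into the client instance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real driver has its own flag names. */
#define FLAG_NEEDS_L2_PARAMS (1u << 0)
#define FLAG_NEEDS_OPEN      (1u << 1)

struct params {
	int mtu;
	bool link_up;
};

struct client_instance {
	struct params params; /* what the client sees */
	int mtu_at_open;      /* recorded by the pretend open callback */
	bool opened;
};

struct adapter {
	unsigned int flags;
	int mtu;
	bool link_up;
	struct client_instance cinst;
};

/* Shared helper, playing the role of a "get params" function. */
static void get_params(const struct adapter *a, struct params *p)
{
	p->mtu = a->mtu;
	p->link_up = a->link_up;
}

static void notify_l2_params(struct adapter *a)
{
	struct params p;

	get_params(a, &p);
	a->cinst.params = p; /* copy into the instance struct */
}

static void client_open(struct adapter *a)
{
	/* open consumes whatever parameters the instance currently holds */
	a->cinst.mtu_at_open = a->cinst.params.mtu;
	a->cinst.opened = true;
}

/* Handle one pending item per call, l2 params ahead of open. */
static void client_task(struct adapter *a)
{
	assert(a != NULL);

	if (a->flags & FLAG_NEEDS_L2_PARAMS) {
		notify_l2_params(a);
		a->flags &= ~FLAG_NEEDS_L2_PARAMS;
		return;
	}
	if (a->flags & FLAG_NEEDS_OPEN) {
		client_open(a);
		a->flags &= ~FLAG_NEEDS_OPEN;
	}
}
```

With both flags set, the first call refreshes the instance parameters and the second call opens the client with them; swapping the two checks would let open run against stale parameters.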
[net 12/13] fm10k: Use smp_rmb rather than read_barrier_depends
From: Brian King

The original issue being fixed in this patch was seen with the ixgbe driver, but the same issue exists with fm10k as well, as the code is very similar. read_barrier_depends is not sufficient to ensure loads following it are not speculatively loaded out of order by the CPU, which can result in stale data being loaded, causing potential system crashes.

Cc: stable
Signed-off-by: Brian King
Acked-by: Jesse Brandeburg
Signed-off-by: Jeff Kirsher
---
 drivers/net/ethernet/intel/fm10k/fm10k_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index dbd69310f263..538b42d5c187 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -1231,7 +1231,7 @@ static bool fm10k_clean_tx_irq(struct fm10k_q_vector *q_vector,
 			break;

 		/* prevent any other reads prior to eop_desc */
-		read_barrier_depends();
+		smp_rmb();

 		/* if DD is not set pending work has not been completed */
 		if (!(eop_desc->flags & FM10K_TXD_FLAG_DONE))
-- 
2.15.0