Re: [PATCH net 2/2] r8152: reset device when tx timeout
On Tue, 2015-07-28 at 15:36 +0800, Hayes Wang wrote:
> The device reset is necessary if the hw becomes abnormal and stops
> transmitting packets.

You are not the first one to face this problem. Hence there is a helper:

 * usb_queue_reset_device - Reset a USB device from an atomic context
 * @iface: USB interface belonging to the device to reset
 *
 * This function can be used to reset a USB device from an atomic
 * context, where usb_reset_device() won't work (as it blocks).

Please use it if you can. Your version for example is buggy. It will
oops if you unplug the device while a reset is scheduled.

	Regards
		Oliver

--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net 1/2] r8152: add pre_reset and post_reset
On Tue, 2015-07-28 at 15:36 +0800, Hayes Wang wrote: Add rtl8152_pre_reset() and rtl8152_post_reset() which are used when calling usb_reset_device(). The two functions could reduce the time of reset when calling usb_reset_device() after probe(). Signed-off-by: Hayes Wang hayesw...@realtek.com --- drivers/net/usb/r8152.c | 68 + 1 file changed, 68 insertions(+) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 144dc64..a6caa60 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -3342,6 +3342,72 @@ static void r8153_init(struct r8152 *tp) r8153_u2p3en(tp, true); } +static int rtl8152_pre_reset(struct usb_interface *intf) +{ + struct r8152 *tp = usb_get_intfdata(intf); + struct net_device *netdev; + int ret; + + if (intf-condition != USB_INTERFACE_BOUND || !tp) If the interface weren't bound, you wouldn't be called. + return 0; + + netdev = tp-netdev; + if (!netif_running(netdev)) + return 0; + + ret = usb_autopm_get_interface(intf); + if (ret 0) + return ret; What sense does this make? + + napi_disable(tp-napi); + clear_bit(WORK_ENABLE, tp-flags); + usb_kill_urb(tp-intr_urb); + cancel_delayed_work_sync(tp-schedule); + if (netif_carrier_ok(netdev)) { + netif_stop_queue(netdev); + mutex_lock(tp-control); + tp-rtl_ops.disable(tp); + mutex_unlock(tp-control); + } + + usb_autopm_put_interface(intf); + + return 0; +} + +static int rtl8152_post_reset(struct usb_interface *intf) +{ + struct r8152 *tp = usb_get_intfdata(intf); + struct net_device *netdev; + int ret; + + if (intf-condition != USB_INTERFACE_BOUND || !tp) Again unnecessary + return 0; + + netdev = tp-netdev; + if (!netif_running(netdev)) + return 0; + + ret = usb_autopm_get_interface(intf); The device will be awake. 
+ if (ret 0) + return ret; + + set_bit(WORK_ENABLE, tp-flags); + if (netif_carrier_ok(netdev)) { + mutex_lock(tp-control); + tp-rtl_ops.enable(tp); + rtl8152_set_rx_mode(netdev); + mutex_unlock(tp-control); + netif_wake_queue(netdev); + } + + napi_enable(tp-napi); + + usb_autopm_put_interface(intf); + + return ret; +} + HTH Oliver -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Drivers: isdn: Drop unnecessary continue
The semantic patch used to make this change is:

@@
@@

for (...;...;...) {
   ...
   if (...) {
     ...
-   continue;
   }
}

Signed-off-by: Shraddha Barke <shraddha.6...@gmail.com>
---
 drivers/isdn/hardware/mISDN/hfcsusb.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/isdn/hardware/mISDN/hfcsusb.c b/drivers/isdn/hardware/mISDN/hfcsusb.c
index 114f3bc..34e4b6c 100644
--- a/drivers/isdn/hardware/mISDN/hfcsusb.c
+++ b/drivers/isdn/hardware/mISDN/hfcsusb.c
@@ -1923,7 +1923,6 @@ hfcsusb_probe(struct usb_interface *intf, const struct usb_device_id *id)
 		    (le16_to_cpu(dev->descriptor.idProduct)
 		     == hfcsusb_idtab[i].idProduct)) {
 			vend_idx = i;
-			continue;
 		}
 	}
--
2.1.0
[PATCH net-next 3/4] dwc_eth_qos: Add the synopsys folder to the build system.
Signed-off-by: Lars Persson lar...@axis.com --- drivers/net/ethernet/Kconfig | 1 + drivers/net/ethernet/Makefile | 1 + 2 files changed, 2 insertions(+) diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig index f3bb178..05aa759 100644 --- a/drivers/net/ethernet/Kconfig +++ b/drivers/net/ethernet/Kconfig @@ -167,6 +167,7 @@ source drivers/net/ethernet/sgi/Kconfig source drivers/net/ethernet/smsc/Kconfig source drivers/net/ethernet/stmicro/Kconfig source drivers/net/ethernet/sun/Kconfig +source drivers/net/ethernet/synopsys/Kconfig source drivers/net/ethernet/tehuti/Kconfig source drivers/net/ethernet/ti/Kconfig source drivers/net/ethernet/tile/Kconfig diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile index c51014b..f42177b 100644 --- a/drivers/net/ethernet/Makefile +++ b/drivers/net/ethernet/Makefile @@ -77,6 +77,7 @@ obj-$(CONFIG_NET_VENDOR_SGI) += sgi/ obj-$(CONFIG_NET_VENDOR_SMSC) += smsc/ obj-$(CONFIG_NET_VENDOR_STMICRO) += stmicro/ obj-$(CONFIG_NET_VENDOR_SUN) += sun/ +obj-$(CONFIG_NET_VENDOR_SYNOPSYS) += synopsys/ obj-$(CONFIG_NET_VENDOR_TEHUTI) += tehuti/ obj-$(CONFIG_NET_VENDOR_TI) += ti/ obj-$(CONFIG_TILE_NET) += tile/ -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 1/1] net/ipv4: Enable flow-based ECMP
On Tue, Jul 28, 2015 at 02:27:57AM +, Richard Laing wrote:
> From: Richard Laing <richard.la...@alliedtelesis.co.nz>
>
> Enable flow-based ECMP.
>
> Currently if equal-cost multipath is enabled the kernel chooses between
> equal cost paths for each matching packet, essentially packets are
> round-robined between the routes. This means that packets from a single
> flow can traverse different routes. If one of the routes experiences
> congestion this can result in delayed or out of order packets arriving
> at the destination.
>
> This patch allows packets to be routed based on their flow - packets
> in the same flow will always use the same route. This prevents out of
> order packets. There are other issues with round-robin based ECMP routing
> related to variable path MTU handling and debugging. See RFC2991 for more
> details on the problems associated with packet based ECMP routing.
>
> This patch relies on the skb hash value to select between routes. The
> selection uses a hash-threshold algorithm (see RFC2992).
>
> Signed-off-by: Richard Laing <richard.la...@alliedtelesis.co.nz>

The patch looks corrupted (long lines split, tabs converted to (four?)
spaces etc.).

                                                        Michal Kubecek
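The hash-threshold selection mentioned in the quoted description can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the code from the RFC patch: the function name `ecmp_select_path` and the use of a 32-bit flow hash are hypothetical, and the multiply-shift is one common way to divide the key space into equal regions in the spirit of RFC 2992.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): map a 32-bit flow hash to one
 * of n_paths equal-cost next hops.  The 32-bit key space is divided into
 * n_paths equally sized regions and the hash picks the region it falls in,
 * so every packet carrying the same hash selects the same path.
 */
unsigned int ecmp_select_path(uint32_t flow_hash, unsigned int n_paths)
{
	assert(n_paths > 0);

	/* floor(flow_hash * n_paths / 2^32), computed without a division */
	return (unsigned int)(((uint64_t)flow_hash * n_paths) >> 32);
}
```

Because the result depends only on the hash and the number of paths, packets of one flow stay on one route for as long as the set of next hops does not change.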
[PATCH] Drivers: isdn: Drop unnecessary continue
The semantic patch used to make this change is:

@@
@@

for (...;...;...) {
   ...
   if (...) {
     ...
-   continue;
   }
}

Signed-off-by: Shraddha Barke <shraddha.6...@gmail.com>
---
 drivers/isdn/hardware/mISDN/hfcsusb.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/isdn/hardware/mISDN/hfcsusb.c b/drivers/isdn/hardware/mISDN/hfcsusb.c
index 114f3bc..91beb83 100644
--- a/drivers/isdn/hardware/mISDN/hfcsusb.c
+++ b/drivers/isdn/hardware/mISDN/hfcsusb.c
@@ -1921,10 +1921,9 @@ hfcsusb_probe(struct usb_interface *intf, const struct usb_device_id *id)
 		if ((le16_to_cpu(dev->descriptor.idVendor)
 		     == hfcsusb_idtab[i].idVendor) &&
 		    (le16_to_cpu(dev->descriptor.idProduct)
-		     == hfcsusb_idtab[i].idProduct)) {
+		     == hfcsusb_idtab[i].idProduct))
 			vend_idx = i;
-			continue;
-		}
+
 	}

 	printk(KERN_DEBUG
--
2.1.0
RE: [PATCH v1 net-next 1/1] net: fec: add stop mode request on/off implemention
Hi, David, From: Duan Fugang-B38611 Sent: Monday, July 27, 2015 9:28 AM To: 'David Miller' Cc: netdev@vger.kernel.org; Li Frank-B20596; step...@networkplumber.org Subject: RE: [PATCH v1 net-next 1/1] net: fec: add stop mode request on/off implemention From: David Miller da...@davemloft.net Sent: Monday, July 27, 2015 7:27 AM To: Duan Fugang-B38611 Cc: netdev@vger.kernel.org; Li Frank-B20596; step...@networkplumber.org Subject: Re: [PATCH v1 net-next 1/1] net: fec: add stop mode request on/off implemention From: Fugang Duan b38...@freescale.com Date: Wed, 22 Jul 2015 18:13:43 +0800 The current driver depends on platform data to implement stop mode request on/off that call api pdata-sleep_mode_enable(). To reduce arch platform redundancy code, since the function only set SOC GPR register bit to request stop mode of/off, so we can move the function into driver. And the specifix GPR register offset and MASK bit can be transferred from DTS. Signed-off-by: Fugang Duan b38...@freescale.com Doesn't this break stop mode on those devices until the DTS is updated? That's really unfortunate, because you're leaving all of the platform data and implementation there, yet it's going to be unused. I really think you need to keep the code using the platform data bits around until all the DTSs are updated. No matter what you tell me about how DTSs are updated (don't even mention the details, I do not care) you simply cannot keep the platform data code around and not use it. It is completely nonsensible to have code that would properly function and properly support a feature for the device in the kernel, yet not use it. Period. Thanks for your comments. Firstly, I will send some board dts patches (and test). Secondly, the net/next tree have no platform data for stop mode because others suggest us to use dts not platform data, and there have no any boards support stop mode in net/next, so this doesn't break any boards in net/next. 
Regards,
Andy

I removed the platform data callback because no platform uses the stop
mode function, so the patch only removes redundant code. I tested the
patch on 4.1 with extra patches (the imx pm support patches) and it
works fine. However, Linux next still lacks the i.MX power management
patches, so the wakeup source cannot work in next. The patch itself has
no problem: you can accept it now, or wait until the imx pm patches
enter next, and then I will send it again with imx6x/7x support.

Regards,
Andy
RE: [PATCH net 2/2] r8152: reset device when tx timeout
Oliver Neukum [mailto:oneu...@suse.com]
> Sent: Tuesday, July 28, 2015 4:57 PM
[...]
>  * usb_queue_reset_device - Reset a USB device from an atomic context
>  * @iface: USB interface belonging to the device to reset
>  *
>  * This function can be used to reset a USB device from an atomic
>  * context, where usb_reset_device() won't work (as it blocks).
>
> Please use it if you can. Your version for example is buggy. It will
> oops if you unplug the device while a reset is scheduled.

Thanks for your suggestion. I would replace it.

Best Regards,
Hayes
[PATCH net-next 4/4] dwc_eth_qos: Add maintainer info
Add maintainer information for the Synopsys DWC Ethernet QOS driver. Signed-off-by: Lars Persson lar...@axis.com --- MAINTAINERS | 7 +++ 1 file changed, 7 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index a226416..0c78766 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8903,6 +8903,13 @@ F: include/linux/dma/dw.h F: include/linux/platform_data/dma-dw.h F: drivers/dma/dw/ +SYNOPSYS DESIGNWARE ETHERNET QOS 4.10a driver +M: Lars Persson lars.pers...@axis.com +L: netdev@vger.kernel.org +S: Supported +F: Documentation/devicetree/bindings/net/snps,dwc-qos-ethernet.txt +F: drivers/net/ethernet/synopsys/dwc_eth_qos.c + SYNOPSYS DESIGNWARE MMC/SD/SDIO DRIVER M: Seungwon Jeon tgih@samsung.com M: Jaehoon Chung jh80.ch...@samsung.com -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-next 2/4] dwc_eth_qos: Add support for Synopsys DWC Ethernet QoS
This patch adds a platform driver for the new generation of the gigabit ethernet IP from Synopsys. It is developed for version 4.10a of the IP core. Signed-off-by: Lars Persson lar...@axis.com --- drivers/net/ethernet/synopsys/Kconfig | 27 + drivers/net/ethernet/synopsys/Makefile |5 + drivers/net/ethernet/synopsys/dwc_eth_qos.c | 3019 +++ 3 files changed, 3051 insertions(+) create mode 100644 drivers/net/ethernet/synopsys/Kconfig create mode 100644 drivers/net/ethernet/synopsys/Makefile create mode 100644 drivers/net/ethernet/synopsys/dwc_eth_qos.c diff --git a/drivers/net/ethernet/synopsys/Kconfig b/drivers/net/ethernet/synopsys/Kconfig new file mode 100644 index 000..a8f3151 --- /dev/null +++ b/drivers/net/ethernet/synopsys/Kconfig @@ -0,0 +1,27 @@ +# +# Synopsys network device configuration +# + +config NET_VENDOR_SYNOPSYS + bool Synopsys devices + default y + ---help--- + If you have a network (Ethernet) device belonging to this class, say Y. + + Note that the answer to this question doesn't directly affect the + kernel: saying N will just cause the configurator to skip all + the questions about Synopsys devices. If you say Y, you will be asked + for your specific device in the following questions. + +if NET_VENDOR_SYNOPSYS + +config SYNOPSYS_DWC_ETH_QOS + tristate Sypnopsys DWC Ethernet QOS v4.10a support + select PHYLIB + select CRC32 + select MII + depends on OF + ---help--- + This driver supports the DWC Ethernet QoS from Synopsys + +endif # NET_VENDOR_SYNOPSYS diff --git a/drivers/net/ethernet/synopsys/Makefile b/drivers/net/ethernet/synopsys/Makefile new file mode 100644 index 000..7a37572 --- /dev/null +++ b/drivers/net/ethernet/synopsys/Makefile @@ -0,0 +1,5 @@ +# +# Makefile for the Synopsys network device drivers. 
+# + +obj-$(CONFIG_SYNOPSYS_DWC_ETH_QOS) += dwc_eth_qos.o diff --git a/drivers/net/ethernet/synopsys/dwc_eth_qos.c b/drivers/net/ethernet/synopsys/dwc_eth_qos.c new file mode 100644 index 000..85b3326 --- /dev/null +++ b/drivers/net/ethernet/synopsys/dwc_eth_qos.c @@ -0,0 +1,3019 @@ +/* Synopsys DWC Ethernet Quality-of-Service v4.10a linux driver + * + * This is a driver for the Synopsys DWC Ethernet QoS IP version 4.10a (GMAC). + * This version introduced a lot of changes which breaks backwards + * compatibility the non-QoS IP from Synopsys (used in the ST Micro drivers). + * Some fields differ between version 4.00a and 4.10a, mainly the interrupt + * bit fields. The driver could be made compatible with 4.00, if all relevant + * HW erratas are handled. + * + * The GMAC is highly configurable at synthesis time. This driver has been + * developed for a subset of the total available feature set. Currently + * it supports: + * - TSO + * - Checksum offload for RX and TX. + * - Energy efficient ethernet. + * - GMII phy interface. + * - The statistics module. + * - Single RX and TX queue. + * + * Copyright (C) 2015 Axis Communications AB. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. 
+ */ + +#include linux/clk.h +#include linux/module.h +#include linux/kernel.h +#include linux/init.h +#include linux/io.h +#include linux/ethtool.h +#include linux/stat.h +#include linux/types.h + +#include linux/types.h +#include linux/slab.h +#include linux/delay.h +#include linux/mm.h +#include linux/netdevice.h +#include linux/etherdevice.h +#include linux/platform_device.h + +#include linux/phy.h +#include linux/mii.h +#include linux/delay.h +#include linux/dma-mapping.h +#include linux/vmalloc.h +#include linux/version.h + +#include linux/device.h +#include linux/bitrev.h +#include linux/crc32.h + +#include linux/of.h +#include linux/interrupt.h +#include linux/clocksource.h +#include linux/net_tstamp.h +#include linux/pm_runtime.h +#include linux/of_net.h +#include linux/of_address.h +#include linux/of_mdio.h +#include linux/timer.h +#include linux/tcp.h + +#define DRIVER_NAMEdwceqos +#define DRIVER_DESCRIPTION Synopsys DWC Ethernet QoS driver +#define DRIVER_VERSION 0.9 + +#define DWCEQOS_MSG_DEFAULT(NETIF_MSG_DRV | NETIF_MSG_PROBE | \ + NETIF_MSG_LINK | NETIF_MSG_IFDOWN | NETIF_MSG_IFUP) + +#define DWCEQOS_TX_TIMEOUT 5 /* Seconds */ + +#define DWCEQOS_LPI_TIMER_MIN 8 +#define DWCEQOS_LPI_TIMER_MAX ((1 20) - 1) + +#define DWCEQOS_RX_BUF_SIZE 2048 + +#define DWCEQOS_RX_DCNT 256 +#define DWCEQOS_TX_DCNT 256 + +#define DWCEQOS_HASH_TABLE_SIZE 64 + +/* The size field in the DMA descriptor is 14 bits */ +#define BYTES_PER_DMA_DESC 16376 + +/* Hardware registers */ +#define START_MAC_REG_OFFSET0x +#define MAX_MAC_REG_OFFSET 0x0bd0 +#define START_MTL_REG_OFFSET0x0c00 +#define
[PATCH net-next 1/4] dwc_eth_qos: Add Synopsys DWC Ethernet QoS bindings
Add device tree binding documentation for the Synopsys DWC Ethernet QoS driver supporting revision 4.10a of the hardware IP. Signed-off-by: Lars Persson lar...@axis.com --- .../bindings/net/snps,dwc-qos-ethernet.txt | 75 ++ 1 file changed, 75 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/snps,dwc-qos-ethernet.txt diff --git a/Documentation/devicetree/bindings/net/snps,dwc-qos-ethernet.txt b/Documentation/devicetree/bindings/net/snps,dwc-qos-ethernet.txt new file mode 100644 index 000..51f8d2e --- /dev/null +++ b/Documentation/devicetree/bindings/net/snps,dwc-qos-ethernet.txt @@ -0,0 +1,75 @@ +* Synopsys DWC Ethernet QoS IP version 4.10 driver (GMAC) + + +Required properties: +- compatible: Should be snps,dwc-qos-ethernet-4.10 +- reg: Address and length of the register set for the device +- clocks: Phandles to the reference clock and the bus clock +- clock-names: Should be phy_ref_clk for the reference clock and apb_pclk + for the bus clock. +- interrupt-parent: Should be the phandle for the interrupt controller + that services interrupts for this device +- interrupts: Should contain the core's combined interrupt signal +- phy-mode: See ethernet.txt file in the same directory + +Optional properties: +- dma-coherent: Present if dma operations are coherent +- mac-address: See ethernet.txt in the same directory +- local-mac-address: See ethernet.txt in the same directory +- snps,en-lpi: If present it enables use of the AXI low-power interface +- snps,write-requests: Number of write requests that the AXI port can issue. + It depends on the SoC configuration. +- snps,read-requests: Number of read requests that the AXI port can issue. + It depends on the SoC configuration. +- snps,burst-map: Bitmap of allowed AXI burst lengts, with the LSB + representing 4, then 8 etc. 
+- snps,txpbl: DMA Programmable burst length for the TX DMA +- snps,rxpbl: DMA Programmable burst length for the RX DMA +- snps,en-tx-lpi-clockgating: Enable gating of the MAC TX clock during + TX low-power mode. +- phy-handle: See ethernet.txt file in the same directory +- mdio device tree subnode: When the GMAC has a phy connected to its local +mdio, there must be device tree subnode with the following +required properties: +- compatible: Must be snps,dwc-qos-ethernet-mdio. +- #address-cells: Must be 1. +- #size-cells: Must be 0. + +For each phy on the mdio bus, there must be a node with the following +fields: + +- reg: phy id used to communicate to phy. +- device_type: Must be ethernet-phy. +- fixed-mode device tree subnode: see fixed-link.txt in the same directory + +Examples: +ethernet2@4001 { + clock-names = phy_ref_clk, apb_pclk; + clocks = clkc 17, clkc 15; + compatible = snps,dwc-qos-ethernet-4.10; + interrupt-parent = intc; + interrupts = 0x0 0x1e 0x4; + reg = 0x4001 0x4000; + phy-handle = phy2; + phy-mode = gmii; + + snps,en-tx-lpi-clockgating; + snps,en-lpi; + snps,write-requests = 2; + snps,read-requests = 16; + snps,burst-map = 0x7; + snps,txpbl = 8; + snps,rxpbl = 2; + + dma-coherent; + + mdio { + #address-cells = 0x1; + #size-cells = 0x0; + phy2: phy@1 { + compatible = ethernet-phy-ieee802.3-c22; + device_type = ethernet-phy; + reg = 0x1; + }; + }; +}; -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] Drivers: isdn: Drop unnecessary continue
On Tue, 28 Jul 2015, Shraddha Barke wrote:

> The semantic patch used to make this change is:
>
> @@
> @@
>
> for (...;...;...) {
>    ...
>    if (...) {
>      ...
> -   continue;
>    }
> }
>
> Signed-off-by: Shraddha Barke <shraddha.6...@gmail.com>
> ---
>  drivers/isdn/hardware/mISDN/hfcsusb.c | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/drivers/isdn/hardware/mISDN/hfcsusb.c b/drivers/isdn/hardware/mISDN/hfcsusb.c
> index 114f3bc..34e4b6c 100644
> --- a/drivers/isdn/hardware/mISDN/hfcsusb.c
> +++ b/drivers/isdn/hardware/mISDN/hfcsusb.c
> @@ -1923,7 +1923,6 @@ hfcsusb_probe(struct usb_interface *intf, const struct usb_device_id *id)
>  		    (le16_to_cpu(dev->descriptor.idProduct)
>  		     == hfcsusb_idtab[i].idProduct)) {
>  			vend_idx = i;
> -			continue;

Now there is only one statement in the branch, so the {} should go as
well.

julia

>  		}
>  	}
> --
> 2.1.0
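Both the patch and the review comment can be illustrated with a small standalone loop. The sketch below is a userspace analogue, not the hfcsusb code: the function name `find_last_match` and its arguments are made up for illustration. A `continue` as the last statement of a loop body changes nothing, and once it is gone the single remaining statement needs no braces under kernel coding style.

```c
/*
 * Hypothetical analogue of the ID table scan: return the index of the last
 * entry equal to 'wanted', or -1 if there is none.
 */
int find_last_match(const int *ids, int n, int wanted)
{
	int idx = -1;
	int i;

	for (i = 0; i < n; i++) {
		/* Single statement in the branch: no braces, no continue. */
		if (ids[i] == wanted)
			idx = i;
	}

	return idx;
}
```

Since nothing follows the `if` inside the loop, control reaches the next iteration with or without an explicit `continue`, so the result is the same either way.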
Re: [PATCH net-next 0/4] net/mlx4_en: Hardware accelerated 802.1ad
On 7/28/2015 1:00 AM, David Miller wrote:
> Series applied, thanks.

Hi Dave,

I don't see this on your kernel.org clone.. maybe forgot to press on
the push button?

Or.
[PATCH net 2/2] r8152: reset device when tx timeout
The device reset is necessary if the hw becomes abnormal and stops transmitting packets. Signed-off-by: Hayes Wang hayesw...@realtek.com --- drivers/net/usb/r8152.c | 22 ++ 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index a6caa60..9bf6e0c 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -27,7 +27,7 @@ #include linux/usb/cdc.h /* Version Information */ -#define DRIVER_VERSION v1.08.0 (2015/01/13) +#define DRIVER_VERSION v1.08.1 (2015/07/28) #define DRIVER_AUTHOR Realtek linux nic maintainers nic_s...@realtek.com #define DRIVER_DESC Realtek RTL8152/RTL8153 Based USB Ethernet Adapters #define MODULENAME r8152 @@ -591,6 +591,7 @@ struct r8152 { struct sk_buff_head tx_queue, rx_queue; spinlock_t rx_lock, tx_lock; struct delayed_work schedule; + struct delayed_work work_reset; struct mii_if_info mii; struct mutex control; /* use for hw setting */ @@ -1902,11 +1903,11 @@ static void rtl_drop_queued_tx(struct r8152 *tp) static void rtl8152_tx_timeout(struct net_device *netdev) { struct r8152 *tp = netdev_priv(netdev); - int i; netif_warn(tp, tx_err, netdev, Tx timeout\n); - for (i = 0; i RTL8152_MAX_TX; i++) - usb_unlink_urb(tp-tx_info[i].urb); + + schedule_delayed_work(tp-work_reset, 0); + cancel_delayed_work(tp-schedule); } static void rtl8152_set_rx_mode(struct net_device *netdev) @@ -3408,6 +3409,18 @@ static int rtl8152_post_reset(struct usb_interface *intf) return ret; } +static void rtl_hw_reset(struct work_struct *work) +{ + struct r8152 *tp = container_of(work, struct r8152, work_reset.work); + + netif_info(tp, drv, tp-netdev, usb reset device\n); + + if (test_bit(RTL8152_UNPLUG, tp-flags)) + return; + + usb_reset_device(tp-udev); +} + static int rtl8152_suspend(struct usb_interface *intf, pm_message_t message) { struct r8152 *tp = usb_get_intfdata(intf); @@ -4102,6 +4115,7 @@ static int rtl8152_probe(struct usb_interface *intf, mutex_init(tp-control); 
INIT_DELAYED_WORK(tp-schedule, rtl_work_func_t); + INIT_DELAYED_WORK(tp-work_reset, rtl_hw_reset); netdev-netdev_ops = rtl8152_netdev_ops; netdev-watchdog_timeo = RTL8152_TX_TIMEOUT; -- 2.4.2 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net 1/2] r8152: add pre_reset and post_reset
Add rtl8152_pre_reset() and rtl8152_post_reset() which are used when calling usb_reset_device(). The two functions could reduce the time of reset when calling usb_reset_device() after probe(). Signed-off-by: Hayes Wang hayesw...@realtek.com --- drivers/net/usb/r8152.c | 68 + 1 file changed, 68 insertions(+) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 144dc64..a6caa60 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -3342,6 +3342,72 @@ static void r8153_init(struct r8152 *tp) r8153_u2p3en(tp, true); } +static int rtl8152_pre_reset(struct usb_interface *intf) +{ + struct r8152 *tp = usb_get_intfdata(intf); + struct net_device *netdev; + int ret; + + if (intf-condition != USB_INTERFACE_BOUND || !tp) + return 0; + + netdev = tp-netdev; + if (!netif_running(netdev)) + return 0; + + ret = usb_autopm_get_interface(intf); + if (ret 0) + return ret; + + napi_disable(tp-napi); + clear_bit(WORK_ENABLE, tp-flags); + usb_kill_urb(tp-intr_urb); + cancel_delayed_work_sync(tp-schedule); + if (netif_carrier_ok(netdev)) { + netif_stop_queue(netdev); + mutex_lock(tp-control); + tp-rtl_ops.disable(tp); + mutex_unlock(tp-control); + } + + usb_autopm_put_interface(intf); + + return 0; +} + +static int rtl8152_post_reset(struct usb_interface *intf) +{ + struct r8152 *tp = usb_get_intfdata(intf); + struct net_device *netdev; + int ret; + + if (intf-condition != USB_INTERFACE_BOUND || !tp) + return 0; + + netdev = tp-netdev; + if (!netif_running(netdev)) + return 0; + + ret = usb_autopm_get_interface(intf); + if (ret 0) + return ret; + + set_bit(WORK_ENABLE, tp-flags); + if (netif_carrier_ok(netdev)) { + mutex_lock(tp-control); + tp-rtl_ops.enable(tp); + rtl8152_set_rx_mode(netdev); + mutex_unlock(tp-control); + netif_wake_queue(netdev); + } + + napi_enable(tp-napi); + + usb_autopm_put_interface(intf); + + return ret; +} + static int rtl8152_suspend(struct usb_interface *intf, pm_message_t message) { struct r8152 *tp = usb_get_intfdata(intf); 
@@ -4164,6 +4230,8 @@ static struct usb_driver rtl8152_driver = { .suspend = rtl8152_suspend, .resume = rtl8152_resume, .reset_resume = rtl8152_resume, + .pre_reset =rtl8152_pre_reset, + .post_reset = rtl8152_post_reset, .supports_autosuspend = 1, .disable_hub_initiated_lpm = 1, }; -- 2.4.2 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net 0/2] r8152: device reset
Although the driver works normally, we find the device may get all 0xff
data when transmitting packets on certain platforms. This breaks the
device, and no packets can be transmitted. A reset is necessary to
recover the hw in this situation.

Hayes Wang (2):
  r8152: add pre_reset and post_reset
  r8152: reset device when tx timeout

 drivers/net/usb/r8152.c | 90 ++---
 1 file changed, 86 insertions(+), 4 deletions(-)

--
2.4.2
[PATCH] mac80211: fix invalid read in minstrel_sort_best_tp_rates()
At the last iteration of the loop, j may equal zero and thus
tp_list[j - 1] causes an invalid read.
Changed the logic of the loop so that j - 1 is always >= 0.

Signed-off-by: Adrien Schildknecht <adrien+...@schischi.me>
---
 net/mac80211/rc80211_minstrel.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/mac80211/rc80211_minstrel.c b/net/mac80211/rc80211_minstrel.c
index 247552a..3ece7d1 100644
--- a/net/mac80211/rc80211_minstrel.c
+++ b/net/mac80211/rc80211_minstrel.c
@@ -92,14 +92,15 @@ int minstrel_get_tp_avg(struct minstrel_rate *mr, int prob_ewma)
 static inline void
 minstrel_sort_best_tp_rates(struct minstrel_sta_info *mi, int i, u8 *tp_list)
 {
-	int j = MAX_THR_RATES;
-	struct minstrel_rate_stats *tmp_mrs = &mi->r[j - 1].stats;
+	int j;
+	struct minstrel_rate_stats *tmp_mrs;
 	struct minstrel_rate_stats *cur_mrs = &mi->r[i].stats;

-	while (j > 0 && (minstrel_get_tp_avg(&mi->r[i], cur_mrs->prob_ewma) >
-	       minstrel_get_tp_avg(&mi->r[tp_list[j - 1]], tmp_mrs->prob_ewma))) {
-		j--;
+	for (j = MAX_THR_RATES; j > 0; --j) {
 		tmp_mrs = &mi->r[tp_list[j - 1]].stats;
+		if (minstrel_get_tp_avg(&mi->r[i], cur_mrs->prob_ewma) <=
+		    minstrel_get_tp_avg(&mi->r[tp_list[j - 1]], tmp_mrs->prob_ewma))
+			break;
 	}

 	if (j < MAX_THR_RATES - 1)
--
2.4.6
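The corrected loop can be tried outside the kernel with a simplified model. In this sketch the array `tp[]` stands in for the values returned by minstrel_get_tp_avg(), and the shift and store after the loop are assumptions about how the rest of the function uses `j`; neither is part of the hunk above. The point it demonstrates is that `tp_list[j - 1]` is only read while `j > 0`.

```c
#include <string.h>

#define MAX_THR_RATES 4

/*
 * Simplified userspace model (assumption: tp[] replaces the per-rate
 * throughput lookup).  tp_list holds rate indices sorted by descending
 * throughput; rate i is inserted at its sorted position.
 */
void sort_best_tp_rates(const int *tp, unsigned char i, unsigned char *tp_list)
{
	int j;

	/* j - 1 is always >= 0 inside the body, so no read before the array. */
	for (j = MAX_THR_RATES; j > 0; --j) {
		if (tp[i] <= tp[tp_list[j - 1]])
			break;
	}

	/* Shift the lower-ranked entries down by one and store i at slot j. */
	if (j < MAX_THR_RATES - 1)
		memmove(&tp_list[j + 1], &tp_list[j], MAX_THR_RATES - (j + 1));
	if (j < MAX_THR_RATES)
		tp_list[j] = i;
}
```

When the new rate beats every entry, the loop ends with `j == 0` through the loop condition rather than through an out-of-bounds comparison, and the rate is stored at the head of the list.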
Re: [PATCH net-next 0/4] net/mlx4_en: Hardware accelerated 802.1ad
From: Or Gerlitz <ogerl...@mellanox.com>
Date: Tue, 28 Jul 2015 10:51:08 +0300

> On 7/28/2015 1:00 AM, David Miller wrote:
>> Series applied, thanks.
>
> Hi Dave,
>
> I don't see this on your kernel.org clone.. maybe forgot to press on
> the push button?

It should be there now.
Re: [PATCH] Drivers: isdn: Drop unnecessary continue
The patch should have v2 in the subject line, and should have a
description of the change since the previous version under the ---

On Tue, 28 Jul 2015, Shraddha Barke wrote:

> The semantic patch used to make this change is:
>
> @@
> @@
>
> for (...;...;...) {
>    ...
>    if (...) {
>      ...
> -   continue;
>    }
> }
>
> Signed-off-by: Shraddha Barke <shraddha.6...@gmail.com>
> ---
>  drivers/isdn/hardware/mISDN/hfcsusb.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/isdn/hardware/mISDN/hfcsusb.c b/drivers/isdn/hardware/mISDN/hfcsusb.c
> index 114f3bc..91beb83 100644
> --- a/drivers/isdn/hardware/mISDN/hfcsusb.c
> +++ b/drivers/isdn/hardware/mISDN/hfcsusb.c
> @@ -1921,10 +1921,9 @@ hfcsusb_probe(struct usb_interface *intf, const struct usb_device_id *id)
>  		if ((le16_to_cpu(dev->descriptor.idVendor)
>  		     == hfcsusb_idtab[i].idVendor) &&
>  		    (le16_to_cpu(dev->descriptor.idProduct)
> -		     == hfcsusb_idtab[i].idProduct)) {
> +		     == hfcsusb_idtab[i].idProduct))
>  			vend_idx = i;
> -			continue;
> -		}
> +

There is no need to add a blank line here.

julia

>  	}
>
>  	printk(KERN_DEBUG
> --
> 2.1.0
[PATCH net-next 0/4] dwc_eth_qos: Add support for Synopsys DWC Ethernet QoS
This is a driver supporting version 4.10a of the Synopsys DWC Ethernet QoS gigabit ethernet controller. The IP has changed significantly compared to the dwmac1000 so a separate driver is justified. The IP is highly configurable at synthesis time. This driver has been developed for a subset of the total available feature set. Currently it supports: * TSO * Checksum offload for RX and TX. * Energy efficient ethernet. * GMII phy interface. * The statistics module. * Single RX and TX queue. Lars Persson (4): dwc_eth_qos: Add Synopsys DWC Ethernet QoS bindings dwc_eth_qos: Add support for Synopsys DWC Ethernet QoS dwc_eth_qos: Add the synopsys folder to the build system. dwc_eth_qos: Add maintainer info .../bindings/net/snps,dwc-qos-ethernet.txt | 75 + MAINTAINERS|7 + drivers/net/ethernet/Kconfig |1 + drivers/net/ethernet/Makefile |1 + drivers/net/ethernet/synopsys/Kconfig | 27 + drivers/net/ethernet/synopsys/Makefile |5 + drivers/net/ethernet/synopsys/dwc_eth_qos.c| 3019 7 files changed, 3135 insertions(+) create mode 100644 Documentation/devicetree/bindings/net/snps,dwc-qos-ethernet.txt create mode 100644 drivers/net/ethernet/synopsys/Kconfig create mode 100644 drivers/net/ethernet/synopsys/Makefile create mode 100644 drivers/net/ethernet/synopsys/dwc_eth_qos.c -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net-next 0/2] net: Initialize sk_hash to random value and reset for failing cnxs
This patch set implements a common function to simply set sk_txhash to a random number instead of going through the trouble to call flow dissector. From dst_negative_advice we now reset the sk_txhash in hopes of finding a better ECMP path through the network.

Changing sk_txhash affects:
- IPv6 flow label and UDP source port which affect ECMP in the network
- Local ECMP route selection (pending changes to use sk_txhash)

Tom Herbert (2):
  net: Set sk_txhash from a random number
  net: Recompute sk_txhash on negative routing advice

 include/net/ip.h    | 16
 include/net/ipv6.h  | 19 ---
 include/net/sock.h  | 16
 net/ipv4/datagram.c |  2 +-
 net/ipv4/tcp_ipv4.c |  4 ++--
 net/ipv6/datagram.c |  2 +-
 net/ipv6/tcp_ipv6.c |  4 ++--
 7 files changed, 22 insertions(+), 41 deletions(-)

--
1.8.1
[PATCH net-next 1/2] net: Set sk_txhash from a random number
This patch creates sk_set_txhash and eliminates protocol specific inet_set_txhash and ip6_set_txhash. sk_set_txhash simply sets a random number instead of performing flow dissection. sk_set_txhash is also allowed to be called multiple times for the same socket; we'll need this when redoing the hash for negative routing advice.

Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/net/ip.h    | 16
 include/net/ipv6.h  | 19 ---
 include/net/sock.h  |  8
 net/ipv4/datagram.c |  2 +-
 net/ipv4/tcp_ipv4.c |  4 ++--
 net/ipv6/datagram.c |  2 +-
 net/ipv6/tcp_ipv6.c |  4 ++--
 7 files changed, 14 insertions(+), 41 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index d5fe9f2..bee5f35 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -370,22 +370,6 @@ static inline void iph_to_flow_copy_v4addrs(struct flow_keys *flow,
 	flow->control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
 }
 
-static inline void inet_set_txhash(struct sock *sk)
-{
-	struct inet_sock *inet = inet_sk(sk);
-	struct flow_keys keys;
-
-	memset(&keys, 0, sizeof(keys));
-
-	keys.addrs.v4addrs.src = inet->inet_saddr;
-	keys.addrs.v4addrs.dst = inet->inet_daddr;
-	keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
-	keys.ports.src = inet->inet_sport;
-	keys.ports.dst = inet->inet_dport;
-
-	sk->sk_txhash = flow_hash_from_keys(&keys);
-}
-
 static inline __wsum inet_gro_compute_pseudo(struct sk_buff *skb, int proto)
 {
 	const struct iphdr *iph = skb_gro_network_header(skb);
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 82dbdb0..7c79798 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -707,25 +707,6 @@ static inline void iph_to_flow_copy_v6addrs(struct flow_keys *flow,
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
-static inline void ip6_set_txhash(struct sock *sk)
-{
-	struct inet_sock *inet = inet_sk(sk);
-	struct ipv6_pinfo *np = inet6_sk(sk);
-	struct flow_keys keys;
-
-	memset(&keys, 0, sizeof(keys));
-
-	memcpy(&keys.addrs.v6addrs.src, &np->saddr,
-	       sizeof(keys.addrs.v6addrs.src));
-	memcpy(&keys.addrs.v6addrs.dst, &sk->sk_v6_daddr,
-	       sizeof(keys.addrs.v6addrs.dst));
-	keys.control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
-	keys.ports.src = inet->inet_sport;
-	keys.ports.dst = inet->inet_dport;
-
-	sk->sk_txhash = flow_hash_from_keys(&keys);
-}
-
 static inline __be32 ip6_make_flowlabel(struct net *net, struct sk_buff *skb,
 					__be32 flowlabel, bool autolabel)
 {
diff --git a/include/net/sock.h b/include/net/sock.h
index 4353ef7..fe735c4 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1687,6 +1687,14 @@ static inline void sock_graft(struct sock *sk, struct socket *parent)
 kuid_t sock_i_uid(struct sock *sk);
 unsigned long sock_i_ino(struct sock *sk);
 
+static inline void sk_set_txhash(struct sock *sk)
+{
+	sk->sk_txhash = prandom_u32();
+
+	if (unlikely(!sk->sk_txhash))
+		sk->sk_txhash = 1;
+}
+
 static inline struct dst_entry *
 __sk_dst_get(struct sock *sk)
 {
diff --git a/net/ipv4/datagram.c b/net/ipv4/datagram.c
index 574fad9..f915abf 100644
--- a/net/ipv4/datagram.c
+++ b/net/ipv4/datagram.c
@@ -74,7 +74,7 @@ int __ip4_datagram_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len
 	inet->inet_daddr = fl4->daddr;
 	inet->inet_dport = usin->sin_port;
 	sk->sk_state = TCP_ESTABLISHED;
-	inet_set_txhash(sk);
+	sk_set_txhash(sk);
 	inet->inet_id = jiffies;
 
 	sk_dst_set(sk, &rt->dst);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 486ba96..d27eb54 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -222,7 +222,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
 	if (err)
 		goto failure;
 
-	inet_set_txhash(sk);
+	sk_set_txhash(sk);
 
 	rt = ip_route_newports(fl4, rt, orig_sport, orig_dport,
 			       inet->inet_sport, inet->inet_dport, sk);
@@ -1277,7 +1277,7 @@ struct sock *tcp_v4_syn_recv_sock(struct sock *sk, struct sk_buff *skb,
 	newinet->mc_ttl = ip_hdr(skb)->ttl;
 	newinet->rcv_tos = ip_hdr(skb)->tos;
 	inet_csk(newsk)->icsk_ext_hdr_len = 0;
-	inet_set_txhash(newsk);
+	sk_set_txhash(newsk);
 	if (inet_opt)
 		inet_csk(newsk)->icsk_ext_hdr_len = inet_opt->opt.optlen;
 	newinet->inet_id = newtp->write_seq ^ jiffies;
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2572a32..9aadd57 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -199,7 +199,7 @@ ipv4_connected:
 		      NULL);
 
 	sk->sk_state = TCP_ESTABLISHED;
-	ip6_set_txhash(sk);
+	sk_set_txhash(sk);
 out:
 	fl6_sock_release(flowlabel);
 	return err;
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index
RE: [PATCH 1/5] Add functions producing system time given a backing counter value
-----Original Message-----
From: John Stultz [mailto:john.stu...@linaro.org]
Sent: Monday, July 27, 2015 8:44 PM
To: Hall, Christopher S
Cc: Thomas Gleixner; Richard Cochran; Ingo Molnar; Kirsher, Jeffrey T; Ronciak, John; H. Peter Anvin; x...@kernel.org; lkml; netdev@vger.kernel.org
Subject: Re: [PATCH 1/5] Add functions producing system time given a backing counter value

> On Mon, Jul 27, 2015 at 5:46 PM, Christopher Hall christopher.s.h...@intel.com wrote:
> > * counter_to_rawmono64
> > * counter_to_mono64
> > * counter_to_realtime64
> >
> > Enables drivers to translate a captured system clock counter to system time. This is useful for network and audio devices that capture timestamps in terms of both the system clock and device clock.
>
> Huh. So for counter_to_realtime64 & mono64, this seems to ignore the fact that the multiplier is constantly adjusted and corrected. So that calling the function twice with the same counter value may result in different returned values.
>
> I've not yet groked the whole patchset, but it seems like there needs to be some mechanism that ensures the counter value is captured and used in the same (or at least close) interval that the timekeeper data is valid for.

The ART (and derived TSC) values are always in the past. There's no chance that we could exceed the interval. I don't think any similar usage would be a problem either.

Are you suggesting that, for completeness, this be enforced by the conversion function? I do a check here to make sure that the current counter value isn't before the beginning of the current interval:

timekeeping_get_delta()
...
	if (cycle_now < tkr->cycle_last &&
	    tkr->cycle_last - cycle_now < ROLLOVER_THRESHOLD)
		return -EAGAIN;

If tkr->cycle_last - cycle_now is large, the assumption is that rollover occurred. Otherwise, the caller should re-read the counter so that it falls within the current interval. In my normal use testing, re-read never occurred.

Thanks for your input.
Chris

> thanks
> -john
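To make the check described above concrete, here is a stand-alone sketch of how I read it. A counter value slightly older than the start of the current interval is rejected with -EAGAIN, while a large backwards distance is treated as counter rollover. The value of ROLLOVER_THRESHOLD and the function name counter_delta_sketch() are assumptions for illustration; the thread does not give them, and the real code works on the timekeeper's read base rather than bare integers.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed value; the thread does not say what ROLLOVER_THRESHOLD is. */
#define ROLLOVER_THRESHOLD (1ULL << 31)

/*
 * Hypothetical helper: store the cycle delta since the start of the
 * interval and return 0, or return -EAGAIN if the counter was captured
 * shortly before the interval began and should be re-read.
 */
static int counter_delta_sketch(uint64_t cycle_now, uint64_t cycle_last,
				uint64_t *delta)
{
	if (cycle_now < cycle_last &&
	    cycle_last - cycle_now < ROLLOVER_THRESHOLD)
		return -EAGAIN;

	/* Unsigned subtraction also gives the right delta across a wrap. */
	*delta = cycle_now - cycle_last;
	return 0;
}
```

A caller would loop on -EAGAIN, re-reading the counter each time, which matches the "re-read" behaviour mentioned in the reply.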
Re: [PATCH iproute2 net-next] bridge: mdb: add support for vlans
On Tue, 28 Jul 2015 13:17:35 +0200 Nikolay Aleksandrov niko...@cumulusnetworks.com wrote:

> On 07/15/2015 05:45 PM, Nikolay Aleksandrov wrote:
> > This patch allows the user to specify the vlan of the mdb group being added or deleted and adds support for displaying the vlan when dumping mdb information or monitoring it. It also updates the man page to reflect the new vid argument for mdb.
> >
> > Signed-off-by: Nikolay Aleksandrov niko...@cumulusnetworks.com
> > ---
> > note: the cast in print_mdb_entry() was necessary to shut the compiler
> >
> >  bridge/mdb.c              | 31 +++
> >  include/linux/if_bridge.h |  1 +
> >  man/man8/bridge.8         |  8 +++-
> >  3 files changed, 27 insertions(+), 13 deletions(-)
>
> Hi Stephen,
> Just wondering what's the state of this patch because I'd like to submit some improvements in the same area and I'm wondering if I should do them on top of this patch or if I need to change something in it ?
>
> Thanks,
>  Nik

Now on net-next branch of iproute2 since support is not in 4.2 kernel.
RE: [PATCH 3/5] Add calls to translate Always Running Timer (ART) to system time
-----Original Message-----
From: John Stultz [mailto:john.stu...@linaro.org]
Sent: Monday, July 27, 2015 9:11 PM
To: Hall, Christopher S
Cc: Thomas Gleixner; Richard Cochran; Ingo Molnar; Kirsher, Jeffrey T; Ronciak, John; H. Peter Anvin; x...@kernel.org; lkml; netdev@vger.kernel.org
Subject: Re: [PATCH 3/5] Add calls to translate Always Running Timer (ART) to system time

> On Mon, Jul 27, 2015 at 5:46 PM, Christopher Hall christopher.s.h...@intel.com wrote:
> > +static bool checked_art_to_tsc(cycle_t *tsc)
> > +{
> > +	if (!has_art())
> > +		return false;
> > +	*tsc = art_to_tsc(*tsc);
> > +	return true;
> > +}
> > +
> > +static int art_to_rawmono64(struct timespec64 *rawmono, cycle_t art)
> > +{
> > +	if (!checked_art_to_tsc(&art))
> > +		return -ENXIO;
> > +	return tsc_to_rawmono64(rawmono, art);
> > +}
> > +EXPORT_SYMBOL(art_to_rawmono64);
>
> This all seems to assume the TSC is the current clocksource, which it may not be if the user has overridden it.

I don't make that assumption. The counter_to_* functions take a pointer to a clocksource struct. They return -ENXIO if that clocksource doesn’t match the current clocksource. The tsc_to_* functions pass the tsc clocksource pointer to the counter_to_* functions. These tsc conversion functions are called by the art_to_* functions.

> If instead there were a counter_to_rawmono64() which took the counter value and maybe the name of the clocksource (if the strncmp is affordable for your use), it might be easier for the core to provide an error if the current timekeeping clocksource isn't the one the counter value is based on. This would also allow the tsc_to_*() midlayers to be dropped (since they don't seem to do much).
>
> thanks
> -john

Again, thanks for your input.

Chris
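The layering described in the reply (a generic counter conversion that refuses to run unless the caller's clocksource is the current one) can be sketched in user space as follows. Everything here is hypothetical: struct clocksource_sketch, current_cs and counter_to_ns_sketch() only stand in for the kernel's clocksource and the counter_to_* functions from the patch set, and the 1:1 conversion is a placeholder.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct clocksource. */
struct clocksource_sketch {
	const char *name;
};

static const struct clocksource_sketch tsc_cs = { "tsc" };
static const struct clocksource_sketch hpet_cs = { "hpet" };

/* The clocksource the (imaginary) timekeeper is currently using. */
static const struct clocksource_sketch *current_cs = &tsc_cs;

/*
 * Convert a raw counter value, but only if it was read from the
 * clocksource that timekeeping currently uses; otherwise -ENXIO.
 */
static int counter_to_ns_sketch(const struct clocksource_sketch *cs,
				uint64_t cycles, uint64_t *ns)
{
	if (cs != current_cs)
		return -ENXIO;

	*ns = cycles;	/* placeholder 1:1 conversion, not real mult/shift */
	return 0;
}

/* A tsc_to_*-style wrapper just supplies the TSC clocksource pointer. */
static int tsc_to_ns_sketch(uint64_t tsc, uint64_t *ns)
{
	return counter_to_ns_sketch(&tsc_cs, tsc, ns);
}
```

With this shape, overriding the clocksource (say, to HPET) makes the TSC-based wrapper fail cleanly instead of returning a time computed against the wrong counter.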
Re: [PATCH net-next v4] af_mpls: fix undefined reference to ip6_route_output
Hi roopa,

On Tue, Jul 28, 2015, at 21:28, roopa wrote:
> On 7/28/15, 6:04 AM, Hannes Frederic Sowa wrote:
> > Can't you simply use ipv6_stub_impl.ipv6_dst_lookup with sk=NULL to do that and don't have a run-time dependency on IPv6 at all (for the cost of a function pointer).
>
> ipv6_stub_impl.ipv6_dst_lookup seems to require sk today. But it only needs it to get 'net' in the beginning and sk is optional afterwards. I will submit a patch to add 'net' as an arg to ipv6_dst_lookup. Users of ipv6_dst_lookup are few and that seems like an easy change and helps my patch. If you or others think otherwise, pls let me know.

No need to extend this function at any cost. Simply add your own function pointer to the struct if needed.

Probably you have to move the

	ipv6_stub = &ipv6_stub_impl;

initialization in inet6_init down so you don't expose the function pointer too early and thus it races with initialization (and error handling seems to be incorrect in this function, too).

Thanks,
Hannes
Re: [PATCH net-next v4] af_mpls: fix undefined reference to ip6_route_output
On 7/28/15, 3:22 PM, Hannes Frederic Sowa wrote:
> Hi roopa,
>
> On Tue, Jul 28, 2015, at 21:28, roopa wrote:
> > ipv6_stub_impl.ipv6_dst_lookup seems to require sk today. But it only needs it to get 'net' in the beginning and sk is optional afterwards. I will submit a patch to add 'net' as an arg to ipv6_dst_lookup. Users of ipv6_dst_lookup are few and that seems like an easy change and helps my patch. If you or others think otherwise, pls let me know.
>
> No need to extend this function at any cost. Simply add your own function pointer to the struct if needed.

I saw this email only after I hit send on the series. Since the new function pointer would be identical to ipv6_dst_lookup apart from one additional argument, a new function pointer does not seem necessary. But I can certainly change it to a new function pointer and resend if that is more acceptable.

> Probably you have to move the
>
> 	ipv6_stub = &ipv6_stub_impl;
>
> initialization in inet6_init down so you don't expose the function pointer too early and thus it races with initialization (and error handling seems to be incorrect in this function, too).

ok, will look.

thanks,
Roopa
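For anyone not familiar with the stub idea being discussed, the following is a user-space sketch of the pattern: built-in code calls through a table of function pointers that the optional component publishes once it has finished initialising, and gets -EAFNOSUPPORT until then. All names (struct stub_ops_sketch, stub_register_sketch(), lookup_dev_sketch() and so on) are invented for illustration and do not reproduce the kernel's struct ipv6_stub or its signatures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct net or struct ipv6_stub. */
struct net_sketch {
	int id;
};

struct stub_ops_sketch {
	int (*dst_lookup)(struct net_sketch *net, int *out_ifindex);
};

/* NULL until the optional component has registered itself. */
static const struct stub_ops_sketch *stub_sketch;

static int real_dst_lookup(struct net_sketch *net, int *out_ifindex)
{
	*out_ifindex = net->id + 1;	/* made-up result for the demo */
	return 0;
}

static const struct stub_ops_sketch stub_impl_sketch = {
	.dst_lookup = real_dst_lookup,
};

/* Publish the ops table only once initialisation is complete. */
static void stub_register_sketch(void)
{
	stub_sketch = &stub_impl_sketch;
}

/* Caller side: no link-time dependency, only a pointer check. */
static int lookup_dev_sketch(struct net_sketch *net, int *out_ifindex)
{
	if (!stub_sketch)
		return -EAFNOSUPPORT;
	return stub_sketch->dst_lookup(net, out_ifindex);
}
```

The point about ordering in inet6_init maps onto stub_register_sketch() here: the pointer should be assigned last, so that callers never see a table whose backing state is not ready yet.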
Re: [PATCH net] bridge: mdb: fix delmdb state in the notification
On Tue, Jul 28, 2015 at 4:10 AM, Nikolay Aleksandrov ra...@blackwall.org wrote:
> From: Nikolay Aleksandrov niko...@cumulusnetworks.com
>
> Since mdb states were introduced when deleting an entry the state was left as it was set in the delete request from the user which leads to the following output when doing a monitor (for example):
> $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
> (monitor) dev br0 port eth3 grp 239.0.0.1 permanent
> $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
> (monitor) dev br0 port eth3 grp 239.0.0.1 temp
>                                           ^^^
> Note the temp state in the delete notification which is wrong since the entry was permanent, the state in a delete is always reported as temp regardless of the real state of the entry.

Hmm? I think it is iproute2 who forgets to set entry->state when deleting it?

	} else if (strcmp(*argv, "permanent") == 0) {
		if (cmd == RTM_NEWMDB)
			entry.state |= MDB_PERMANENT;

Kernel simply returns what you pass to it. Please fix iproute2.
RE: [PATCH 3/5] Add calls to translate Always Running Timer (ART) to system time
-----Original Message-----
From: Andy Lutomirski [mailto:l...@kernel.org]
Sent: Monday, July 27, 2015 6:32 PM
To: Hall, Christopher S; john.stu...@linaro.org; t...@linutronix.de; richardcoch...@gmail.com; mi...@redhat.com; Kirsher, Jeffrey T; Ronciak, John; h...@zytor.com; x...@kernel.org
Cc: linux-ker...@vger.kernel.org; netdev@vger.kernel.org; Borislav Petkov
Subject: Re: [PATCH 3/5] Add calls to translate Always Running Timer (ART) to system time

> On 07/27/2015 05:46 PM, Christopher Hall wrote:
> > * art_to_mono64
> > * art_to_rawmono64
> > * art_to_realtime64
> >
> > Intel audio and PCH ethernet devices use the Always Running Timer (ART) to relate their device clock to system time
> >
> > Signed-off-by: Christopher Hall christopher.s.h...@intel.com
> > ---
> >  arch/x86/Kconfig           |  12
> >  arch/x86/include/asm/art.h |  42 ++
> >  arch/x86/kernel/Makefile   |   1 +
> >  arch/x86/kernel/art.c      | 134 +
> >  arch/x86/kernel/tsc.c      |   4 ++
> >  5 files changed, 193 insertions(+)
> >  create mode 100644 arch/x86/include/asm/art.h
> >  create mode 100644 arch/x86/kernel/art.c
> >
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index b3a1a5d..1ef9985 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -1175,6 +1175,18 @@ config X86_CPUID
> >  	  with major 203 and minors 0 to 31 for /dev/cpu/0/cpuid to
> >  	  /dev/cpu/31/cpuid.
> >
> > +config X86_ART
> > +	bool "Always Running Timer"
> > +	default y
> > +	depends on X86_TSC
> > +	---help---
> > +	  This option provides functionality to drivers and devices that use
> > +	  the always-running-timer (ART) to correlate their device clock
> > +	  counter with the system clock counter. The TSC is *exactly* related
> > +	  to the ART by a ratio m/n specified by CPUID leaf 0x15
> > +	  (n=EAX,m=EBX). If ART is unused or unavailable there isn't any
> > +	  performance impact. It's safe to say Y.
> > +
>
> Is there a good reason to make this optional?

If there aren't any objections, it sounds OK to me. So no, I don't know of any good reasons.

> Also, is there *still* no way to ask the thing for its nominal frequency? Or can we expect CPUID leaf 16H to work on CPUs that support this and can we expect it to actually work?

There isn't any way to query nominal frequency. CPUID leaf 0x15 only exposes the relationship between ART and TSC. CPUID leaf 0x16 stays more or less the same and isn't related to ART. The SDM says "The returned information should not be used for any other purpose as the returned information does not accurately correlate to information / counters returned by other processor interfaces."

> Also, does this thing let us learn the real time base? SDM 17.14.4 suggests that the ART value isn't affected by privileged software (aka buggy/malicious firmware). Or, alternatively, how do we learn the offset K between ART and scaled TSC?

ART isn't affected by software. The determination of K used to convert ART to TSC is in a footnote (2) in that section of the SDM. I'm not going to risk repeating it here and possibly altering its meaning.

> >  choice
> >  	prompt "High Memory Support"
> >  	default HIGHMEM4G
> > diff --git a/arch/x86/include/asm/art.h b/arch/x86/include/asm/art.h
> > new file mode 100644
> > index 000..da58ce4
> > --- /dev/null
> > +++ b/arch/x86/include/asm/art.h
> > @@ -0,0 +1,42 @@
> > +/*
> > + * x86 ART related functions
> > + */
> > +#ifndef _ASM_X86_ART_H
> > +#define _ASM_X86_ART_H
> > +
> > +#ifndef CONFIG_X86_ART
> > +
> > +static inline int setup_art(void)
> > +{
> > +	return 0;
> > +}
> > +
> > +static inline bool has_art(void)
> > +{
> > +	return false;
> > +}
> > +
> > +static inline int art_to_rawmono64(struct timespec64 *rawmono, cycle_t art)
> > +{
> > +	return -ENXIO;
> > +}
> > +static inline int art_to_realtime64(struct timespec64 *realtime, cycle_t art)
> > +{
> > +	return -ENXIO;
> > +}
> > +static inline int art_to_mono64(struct timespec64 *mono, cycle_t art)
> > +{
> > +	return -ENXIO;
> > +}
> > +
> > +#else
> > +
> > +extern int setup_art(void);
> > +extern bool has_art(void);
> > +extern int art_to_rawmono64(struct timespec64 *rawmono, cycle_t art);
> > +extern int art_to_realtime64(struct timespec64 *realtime, cycle_t art);
> > +extern int art_to_mono64(struct timespec64 *mono, cycle_t art);
> > +
> > +#endif
> > +
> > +#endif/*_ASM_X86_ART_H*/
> > diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> > index 0f15af4..0908311 100644
> > --- a/arch/x86/kernel/Makefile
> > +++ b/arch/x86/kernel/Makefile
> > @@ -109,6 +109,7 @@ obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o
> >  obj-$(CONFIG_TRACING)			+= tracepoint.o
> >  obj-$(CONFIG_IOSF_MBI)			+= iosf_mbi.o
> >  obj-$(CONFIG_PMC_ATOM)			+= pmc_atom.o
> > +obj-$(CONFIG_X86_ART)			+= art.o
> >
> >  ###
> >  # 64 bit specific files
> > diff --git a/arch/x86/kernel/art.c b/arch/x86/kernel/art.c
> > new file mode 100644
> > index 000..1906cf0
> > --- /dev/null
> > +++
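As an aside on the arithmetic: the Kconfig text says the TSC is related to the ART by the ratio m/n from CPUID leaf 0x15, and the thread mentions an offset K. A minimal sketch of that scaling, assuming the relation has the form TSC = ART * m / n + K and that n is non-zero, could look like the code below. The function name and the way K is passed in are assumptions; the quoted mail deliberately does not spell out how K is determined, and art.c itself is cut off in this archive, so this is not the patch's implementation.

```c
#include <stdint.h>

/*
 * Hypothetical helper: scale an ART value by m/n and add an offset k.
 * Splitting art into quotient and remainder keeps (rem * m) within
 * 64 bits, because rem < n and both m and n are 32-bit values.
 * The caller must ensure n != 0; quot * m can still overflow for
 * extremely large art values, which this sketch does not handle.
 */
static uint64_t art_to_tsc_sketch(uint64_t art, uint32_t m, uint32_t n,
				  uint64_t k)
{
	uint64_t quot = art / n;
	uint64_t rem = art % n;

	return quot * m + (rem * m) / n + k;
}
```

Because quot * m is an integer, the result equals floor(art * m / n) + k exactly, without needing a 128-bit intermediate.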
[PATCH 0/3] net: netcp: bug fixes for dynamic module support
This series fixes a few bugs to allow the keystone netcp modules to be dynamically loaded and removed. Currently it allows the following sequence to be run multiple times:

  insmod cpsw_ale.ko
  insmod davinci_mdio.ko
  insmod keystone_netcp.ko
  insmod keystone_netcp_ethss.ko
  ifup eth0
  ifup eth1
  ping hosts on eth0
  ping hosts on eth1
  ifdown eth1
  ifdown eth0
  rmmod keystone_netcp_ethss.ko
  rmmod keystone_netcp.ko
  rmmod davinci_mdio.ko
  rmmod cpsw_ale.ko

Murali Karicheri (3):
  net: netcp: fix cleanup interface list in netcp_remove()
  net: netcp: ethss: fix up incorrect use of list api
  net: netcp: ethss: cleanup gbe_probe() and gbe_remove() functions

 drivers/net/ethernet/ti/netcp_core.c  | 14 +++---
 drivers/net/ethernet/ti/netcp_ethss.c | 49 ++-
 2 files changed, 30 insertions(+), 33 deletions(-)

--
1.9.1
[PATCH 2/3] net: netcp: ethss: fix up incorrect use of list api
The code seems to assume that first_sec_slave() returns NULL when the list is empty, and relies on that to break out of the loop, which is incorrect. Fix the code by using list_empty().

Signed-off-by: Murali Karicheri m-kariche...@ti.com
---
 drivers/net/ethernet/ti/netcp_ethss.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_ethss.c b/drivers/net/ethernet/ti/netcp_ethss.c
index 9b7e0a3..77bcfca 100644
--- a/drivers/net/ethernet/ti/netcp_ethss.c
+++ b/drivers/net/ethernet/ti/netcp_ethss.c
@@ -2490,10 +2490,9 @@ static void free_secondary_ports(struct gbe_priv *gbe_dev)
 {
 	struct gbe_slave *slave;
 
-	for (;;) {
+	while (!list_empty(&gbe_dev->secondary_slaves)) {
 		slave = first_sec_slave(gbe_dev);
-		if (!slave)
-			break;
+
 		if (slave->phy)
 			phy_disconnect(slave->phy);
 		list_del(&slave->slave_list);
--
1.9.1
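The underlying pitfall is that a "first entry" helper built on a circular list head never yields NULL: on an empty list it computes a pointer from the head itself. The user-space sketch below uses a tiny hand-rolled circular list (not the kernel's list.h) to show the drain loop guarded by an emptiness check; struct slave_sketch and all helper names are illustrative only.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal circular doubly-linked list, modelled loosely on list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_init(struct list_node *h)
{
	h->next = h->prev = h;
}

static int list_is_empty(const struct list_node *h)
{
	return h->next == h;
}

static void list_add_tail_node(struct list_node *n, struct list_node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Hypothetical payload standing in for struct gbe_slave. */
struct slave_sketch {
	int id;
	struct list_node node;
};

#define node_to_slave(p) \
	((struct slave_sketch *)((char *)(p) - offsetof(struct slave_sketch, node)))

/* Drain the list; the loop condition, not a NULL check, ends it. */
static int free_all_slaves(struct list_node *head)
{
	int freed = 0;

	while (!list_is_empty(head)) {
		struct slave_sketch *s = node_to_slave(head->next);

		list_del_node(&s->node);
		free(s);
		freed++;
	}
	return freed;
}
```

Applying node_to_slave() to an empty head would return an address just before the head, which is non-NULL garbage; that is why the old `if (!slave) break;` could never fire.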
[PATCH 1/3] net: netcp: fix cleanup interface list in netcp_remove()
Currently, if the user does rmmod keystone_netcp.ko, the following warning is seen:

[ 59.035891] ------------[ cut here ]------------
[ 59.040535] WARNING: CPU: 2 PID: 1619 at drivers/net/ethernet/ti/netcp_core.c:2127 netcp_remove)

This is because the interface list is not cleaned up in netcp_remove. This patch fixes this. Also fix some checkpatch related warnings.

Signed-off-by: Murali Karicheri m-kariche...@ti.com
---
 drivers/net/ethernet/ti/netcp_core.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c
index ec8ed30..a1c6961 100644
--- a/drivers/net/ethernet/ti/netcp_core.c
+++ b/drivers/net/ethernet/ti/netcp_core.c
@@ -2112,6 +2112,7 @@ probe_quit:
 static int netcp_remove(struct platform_device *pdev)
 {
 	struct netcp_device *netcp_device = platform_get_drvdata(pdev);
+	struct netcp_intf *netcp_intf, *netcp_tmp;
 	struct netcp_inst_modpriv *inst_modpriv, *tmp;
 	struct netcp_module *module;
 
@@ -2123,8 +2124,16 @@ static int netcp_remove(struct platform_device *pdev)
 		list_del(&inst_modpriv->inst_list);
 		kfree(inst_modpriv);
 	}
-	WARN(!list_empty(&netcp_device->interface_head), "%s interface list not empty!\n",
-	     pdev->name);
+
+	/* now that all modules are removed, clean up the interfaces */
+	list_for_each_entry_safe(netcp_intf, netcp_tmp,
+				 &netcp_device->interface_head,
+				 interface_list) {
+		netcp_delete_interface(netcp_device, netcp_intf->ndev);
+	}
+
+	WARN(!list_empty(&netcp_device->interface_head),
+	     "%s interface list not empty!\n", pdev->name);
 
 	devm_kfree(&pdev->dev, netcp_device);
 	pm_runtime_put_sync(&pdev->dev);
--
1.9.1
[PATCH net-next v5 2/2] af_mpls: fix undefined reference to ip6_route_output
From: Roopa Prabhu ro...@cumulusnetworks.com

Undefined reference to ip6_route_output and ip_route_output was reported with CONFIG_INET=n and CONFIG_IPV6=n.

This patch uses ipv6_stub_impl.ipv6_dst_lookup instead of ip6_route_output. It also wraps the affected code under IS_ENABLED(CONFIG_INET) and IS_ENABLED(CONFIG_IPV6).

Reported-by: kbuild test robot fengguang...@intel.com
Reported-by: Thomas Graf tg...@suug.ch
Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
---
 net/mpls/af_mpls.c | 39 +++
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 49f1b0e..1c82888 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -15,7 +15,10 @@
 #include <net/ip_fib.h>
 #include <net/netevent.h>
 #include <net/netns/generic.h>
-#include <net/ip6_route.h>
+#if IS_ENABLED(CONFIG_IPV6)
+#include <net/ipv6.h>
+#include <net/addrconf.h>
+#endif
 #include "internal.h"
 
 #define LABEL_NOT_SPECIFIED (1<<20)
@@ -331,6 +334,7 @@ static unsigned find_free_label(struct net *net)
 	return LABEL_NOT_SPECIFIED;
 }
 
+#if IS_ENABLED(CONFIG_INET)
 static struct net_device *inet_fib_lookup_dev(struct net *net, void *addr)
 {
 	struct net_device *dev = NULL;
@@ -347,30 +351,47 @@ static struct net_device *inet_fib_lookup_dev(struct net *net, void *addr)
 
 	ip_rt_put(rt);
 
-errout:
 	return dev;
+errout:
+	return ERR_PTR(-ENODEV);
 }
+#else
+static struct net_device *inet_fib_lookup_dev(struct net *net, void *addr)
+{
+	return ERR_PTR(-EAFNOSUPPORT);
+}
+#endif
 
+#if IS_ENABLED(CONFIG_IPV6)
 static struct net_device *inet6_fib_lookup_dev(struct net *net, void *addr)
 {
 	struct net_device *dev = NULL;
 	struct dst_entry *dst;
 	struct flowi6 fl6;
 
+	if (!ipv6_stub)
+		return ERR_PTR(-EAFNOSUPPORT);
+
 	memset(&fl6, 0, sizeof(fl6));
 	memcpy(&fl6.daddr, addr, sizeof(struct in6_addr));
-	dst = ip6_route_output(net, NULL, &fl6);
-	if (dst->error)
+	if (ipv6_stub->ipv6_dst_lookup(net, NULL, &dst, &fl6))
 		goto errout;
 
 	dev = dst->dev;
 	dev_hold(dev);
-
-errout:
 	dst_release(dst);
 
 	return dev;
+
+errout:
+	return ERR_PTR(-ENODEV);
 }
+#else
+static struct net_device *inet6_fib_lookup_dev(struct net *net, void *addr)
+{
+	return ERR_PTR(-EAFNOSUPPORT);
+}
+#endif
 
 static struct net_device *find_outdev(struct net *net,
 				      struct mpls_route_config *cfg)
@@ -425,10 +446,12 @@ static int mpls_route_add(struct mpls_route_config *cfg)
 	if (cfg->rc_output_labels > MAX_NEW_LABELS)
 		goto errout;
 
-	err = -ENODEV;
 	dev = find_outdev(net, cfg);
-	if (!dev)
+	if (IS_ERR(dev)) {
+		err = PTR_ERR(dev);
+		dev = NULL;
 		goto errout;
+	}
 
 	/* Ensure this is a supported device */
 	err = -EINVAL;
--
1.7.10.4
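The patch moves the lookup helpers from "NULL means failure" to the kernel's error-pointer convention, so the caller can tell -ENODEV from -EAFNOSUPPORT. Below is a user-space approximation of that convention. The helpers err_ptr(), ptr_err() and is_err() are re-implementations written for this sketch (the kernel's versions are ERR_PTR, PTR_ERR and IS_ERR in linux/err.h), and struct dev_sketch plus the two lookup flags are invented for the demo.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space approximations of ERR_PTR / PTR_ERR / IS_ERR. */
static inline void *err_ptr(intptr_t err)
{
	return (void *)err;
}

static inline intptr_t ptr_err(const void *p)
{
	return (intptr_t)p;
}

static inline int is_err(const void *p)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical device type for the demo. */
struct dev_sketch {
	int ifindex;
};

static struct dev_sketch the_dev = { 7 };

static struct dev_sketch *find_outdev_sketch(int family_supported,
					     int have_route)
{
	if (!family_supported)
		return err_ptr(-EAFNOSUPPORT);
	if (!have_route)
		return err_ptr(-ENODEV);
	return &the_dev;
}

/* Returns the ifindex on success or a negative errno from the lookup. */
static int route_add_sketch(int family_supported, int have_route)
{
	struct dev_sketch *dev = find_outdev_sketch(family_supported,
						    have_route);

	if (is_err(dev))
		return (int)ptr_err(dev);
	return dev->ifindex;
}
```

This mirrors the mpls_route_add() hunk: the error code now comes from the lookup itself instead of being preset to -ENODEV before the call.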
[PATCH net-next v5 0/2] af_mpls: fix undefined reference to ip6_route_output with CONFIG_IPV6=n
From: Roopa Prabhu ro...@cumulusnetworks.com

This patch series uses ipv6_stub_impl.ipv6_dst_lookup instead of ip6_route_output. It follows the vxlan driver's usage of ipv6_stub_impl.ipv6_dst_lookup.

There is no sk in the af_mpls context from where ipv6_stub_impl.ipv6_dst_lookup is used. sk appears to be needed to get the namespace 'net' and is optional otherwise. This patch series changes ipv6_stub_impl.ipv6_dst_lookup to take net argument. sk remains optional.

The case of CONFIG_IPV6=m and MPLS_ROUTING=y is covered by checking if ipv6_stub is not NULL. I have tested this case for proper return values to the user. (I don't see an ipv6_stub null check in the vxlan driver. I will test it separately and submit a patch for the vxlan driver if needed.)

v1 -> v2: use IS_BUILTIN
v2 -> v3: Use new Kconfig option that depends on (IPV6 || IPV6=n) as suggested by Dave. Also uses IS_ERR as suggested by Thomas.
v3 -> v4: Include missed case of (MPLS_ROUTING=y IPV6=m) reported by Dave.
v4 -> v5: Use ipv6_stub_impl.ipv6_dst_lookup as suggested by Hannes

Dave, v4 uses a new Kconfig option and v5 uses ipv6_stub_impl.ipv6_dst_lookup, which looks like it was added for the vxlan driver for a similar use case. Thanks and apologies for the iterations on this.

Roopa Prabhu (2):
  ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument
  af_mpls: fix undefined reference to ip6_route_output

 drivers/net/vxlan.c    |  2 +-
 include/net/addrconf.h |  4 ++--
 include/net/ipv6.h     |  3 ++-
 net/ipv6/icmp.c        |  6 +++---
 net/ipv6/ip6_output.c  | 15 ---
 net/mpls/af_mpls.c     | 39 +++
 net/tipc/udp_media.c   |  3 ++-
 7 files changed, 49 insertions(+), 23 deletions(-)

--
1.7.10.4
[PATCH net-next v5 1/2] ipv6: change ipv6_stub_impl.ipv6_dst_lookup to take net argument
From: Roopa Prabhu ro...@cumulusnetworks.com

This patch adds net argument to ipv6_stub_impl.ipv6_dst_lookup for use cases where sk is not available (like mpls). sk appears to be needed to get the namespace 'net' and is optional otherwise. This patch series changes ipv6_stub_impl.ipv6_dst_lookup to take net argument. sk remains optional.

All callers of ipv6_stub_impl.ipv6_dst_lookup have been modified to pass net. I have modified them to use the 'net' already available in the scope of the call. I can change them to sock_net(sk) to avoid any unintended change in behaviour if the sock namespace is different. From code inspection, they don't seem to be.

Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
---
 drivers/net/vxlan.c    |  2 +-
 include/net/addrconf.h |  4 ++--
 include/net/ipv6.h     |  3 ++-
 net/ipv6/icmp.c        |  6 +++---
 net/ipv6/ip6_output.c  | 12 ++--
 net/tipc/udp_media.c   |  3 ++-
 6 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 81f0f24..beed5d4 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2034,7 +2034,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		fl6.flowi6_mark = skb->mark;
 		fl6.flowi6_proto = IPPROTO_UDP;
 
-		if (ipv6_stub->ipv6_dst_lookup(sk, &ndst, &fl6)) {
+		if (ipv6_stub->ipv6_dst_lookup(vxlan->net, sk, &ndst, &fl6)) {
 			netdev_dbg(dev, "no route to %pI6\n",
 				   &dst->sin6.sin6_addr);
 			dev->stats.tx_carrier_errors++;
diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index def59d3..0c3ac5a 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -158,8 +158,8 @@ struct ipv6_stub {
 				 const struct in6_addr *addr);
 	int (*ipv6_sock_mc_drop)(struct sock *sk, int ifindex,
 				 const struct in6_addr *addr);
-	int (*ipv6_dst_lookup)(struct sock *sk, struct dst_entry **dst,
-			       struct flowi6 *fl6);
+	int (*ipv6_dst_lookup)(struct net *net, struct sock *sk,
+			       struct dst_entry **dst, struct flowi6 *fl6);
 	void (*udpv6_encap_enable)(void);
 	void (*ndisc_send_na)(struct net_device *dev, struct neighbour *neigh,
 			      const struct in6_addr *daddr,
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 82dbdb0..09d0ea4 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -832,7 +832,8 @@ static inline struct sk_buff *ip6_finish_skb(struct sock *sk)
 			      &inet6_sk(sk)->cork);
 }
 
-int ip6_dst_lookup(struct sock *sk, struct dst_entry **dst, struct flowi6 *fl6);
+int ip6_dst_lookup(struct net *net, struct sock *sk, struct dst_entry **dst,
+		   struct flowi6 *fl6);
 struct dst_entry *ip6_dst_lookup_flow(struct sock *sk, struct flowi6 *fl6,
 				      const struct in6_addr *final_dst);
 struct dst_entry *ip6_sk_dst_lookup_flow(struct sock *sk, struct flowi6 *fl6,
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 713d743..6c2b213 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -329,7 +329,7 @@ static struct dst_entry *icmpv6_route_lookup(struct net *net,
 	struct flowi6 fl2;
 	int err;
 
-	err = ip6_dst_lookup(sk, &dst, fl6);
+	err = ip6_dst_lookup(net, sk, &dst, fl6);
 	if (err)
 		return ERR_PTR(err);
 
@@ -361,7 +361,7 @@ static struct dst_entry *icmpv6_route_lookup(struct net *net,
 	if (err)
 		goto relookup_failed;
 
-	err = ip6_dst_lookup(sk, &dst2, &fl2);
+	err = ip6_dst_lookup(net, sk, &dst2, &fl2);
 	if (err)
 		goto relookup_failed;
 
@@ -591,7 +591,7 @@ static void icmpv6_echo_reply(struct sk_buff *skb)
 	else if (!fl6.flowi6_oif)
 		fl6.flowi6_oif = np->ucast_oif;
 
-	err = ip6_dst_lookup(sk, &dst, &fl6);
+	err = ip6_dst_lookup(net, sk, &dst, &fl6);
 	if (err)
 		goto out;
 	dst = xfrm_lookup(net, dst, flowi6_to_flowi(&fl6), sk, 0);
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index c5fc852..92b7cf0 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -881,10 +881,9 @@ out:
 	return dst;
 }
 
-static int ip6_dst_lookup_tail(struct sock *sk,
+static int ip6_dst_lookup_tail(struct net *net, struct sock *sk,
 			       struct dst_entry **dst, struct flowi6 *fl6)
 {
-	struct net *net = sock_net(sk);
 #ifdef CONFIG_IPV6_OPTIMISTIC_DAD
 	struct neighbour *n;
 	struct rt6_info *rt;
@@ -994,10 +993,11 @@ out_err_release:
  *
  *	It returns zero on success, or a standard errno code on error.
  */
-int ip6_dst_lookup(struct sock *sk, struct dst_entry **dst, struct flowi6 *fl6)
+int ip6_dst_lookup(struct net *net, struct sock *sk, struct dst_entry **dst,
+
Re: [PATCH net] bridge: mdb: fix delmdb state in the notification
On 07/29/2015 12:38 AM, Cong Wang wrote:
> On Tue, Jul 28, 2015 at 4:10 AM, Nikolay Aleksandrov ra...@blackwall.org wrote:
>> From: Nikolay Aleksandrov niko...@cumulusnetworks.com
>>
>> Since mdb states were introduced, when deleting an entry the state was
>> left as it was set in the delete request from the user, which leads to
>> the following output when doing a monitor (for example):
>>
>> $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
>> (monitor) dev br0 port eth3 grp 239.0.0.1 permanent
>> $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
>> (monitor) dev br0 port eth3 grp 239.0.0.1 temp
>>                                           ^^^
>> Note the "temp" state in the delete notification, which is wrong since
>> the entry was permanent; the state in a delete is always reported as
>> "temp" regardless of the real state of the entry.
>
> Hmm? I think it is iproute2 who forgets to set entry->state when
> deleting it?
>
>     } else if (strcmp(*argv, "permanent") == 0) {
>         if (cmd == RTM_NEWMDB)
>             entry.state |= MDB_PERMANENT;
>
> Kernel simply returns what you pass to it. Please fix iproute2.

Hi Cong,

Please read the full commit log. I've explained that the state is not honored in the kernel, so it doesn't matter if iproute2 sets the correct state that you give on the command line. That is, if I give it "temp" and the entry is permanent, it will still get deleted and the notification will have the wrong state ("temp") because I've set it, while this way it'll at least return the correct state of the entry being deleted.

Again, I chose this solution over a check for the entry state because such a check may break some user-space tools that rely on the behaviour that the state is not checked in the kernel.

Cheers,
Nik
[PATCH net-next 2/2] net: Recompute sk_txhash on negative routing advice
When a connection is failing a transport protocol calls dst_negative_advice to try to get a better route. This patch includes changing the sk_txhash in that function. This provides a rudimentary method to try to find a different path in the network since sk_txhash affects ECMP on the local host and through the network (via flow labels or UDP source port in encapsulation).

Signed-off-by: Tom Herbert t...@herbertland.com
---
 include/net/sock.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index fe735c4..24aa75c 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1695,6 +1695,12 @@ static inline void sk_set_txhash(struct sock *sk)
 	sk->sk_txhash = 1;
 }
 
+static inline void sk_rethink_txhash(struct sock *sk)
+{
+	if (sk->sk_txhash)
+		sk_set_txhash(sk);
+}
+
 static inline struct dst_entry *
 __sk_dst_get(struct sock *sk)
 {
@@ -1719,6 +1725,8 @@ static inline void dst_negative_advice(struct sock *sk)
 {
 	struct dst_entry *ndst, *dst = __sk_dst_get(sk);
 
+	sk_rethink_txhash(sk);
+
 	if (dst && dst->ops->negative_advice) {
 		ndst = dst->ops->negative_advice(dst);
 
-- 
1.8.1
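For readers who want to see the idea outside the kernel, here is a minimal user-space sketch of the "rethink the hash on negative advice" behaviour described in the commit message. Everything in it is an assumption made for illustration: struct model_sock, the model_* functions and the xorshift PRNG are hypothetical stand-ins, not the kernel's struct sock or its random source. It only shows that a socket which never had a tx hash is left alone, while one that had a hash gets a fresh non-zero value, which in turn can move it to a different ECMP path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for a socket; not the kernel's struct sock. */
struct model_sock {
	uint32_t txhash;	/* 0 means "no hash assigned yet" */
	uint32_t rng_state;	/* per-socket PRNG state, must be non-zero */
};

/* xorshift32: a small deterministic PRNG standing in for a real random source. */
static uint32_t model_rand(struct model_sock *sk)
{
	uint32_t x = sk->rng_state;

	assert(x != 0);
	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	sk->rng_state = x;
	return x;
}

/* Assign a fresh hash; 0 is reserved for "unset", so map it to 1. */
static void model_set_txhash(struct model_sock *sk)
{
	sk->txhash = model_rand(sk);
	if (sk->txhash == 0)
		sk->txhash = 1;
}

/* Re-roll the hash only if the socket was already using one. */
static void model_rethink_txhash(struct model_sock *sk)
{
	if (sk->txhash)
		model_set_txhash(sk);
}

/* Toy ECMP selection: the hash picks one of npaths equal-cost paths. */
static unsigned int model_pick_path(const struct model_sock *sk,
				    unsigned int npaths)
{
	assert(npaths > 0);
	return sk->txhash % npaths;
}
```

Note that a new hash does not guarantee a new path: with npaths equal-cost paths, a re-roll may land on the same one, which is why the commit message calls the method rudimentary.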
Re: [PATCH iproute2 v7 4/4] ip link: proto_down config and display.
On Tue, 14 Jul 2015 13:43:22 -0700 anurad...@cumulusnetworks.com wrote:
> From: Anuradha Karuppiah anurad...@cumulusnetworks.com
>
> This patch adds support to set and display protodown on a switch port.
> The switch driver can handle this error state by doing a phys down on
> the port.
>
> One example user space application setting this flag is a multi-chassis
> LAG application to handle split-brain situation on peer-link failure.
>
> Example:
> root@net-next:~# ip link set eth1 protodown on
> root@net-next:~/iproute2# ip link show eth1
> 4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>     link/ether 52:54:00:12:35:01 brd ff:ff:ff:ff:ff:ff protodown on
> root@net-next:~/iproute2# ip link set eth1 protodown off
> root@net-next:~/iproute2# ip link show eth1
> 4: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
>     link/ether 52:54:00:12:35:01 brd ff:ff:ff:ff:ff:ff
> root@net-next:~/iproute2#
>
> Signed-off-by: Anuradha Karuppiah anurad...@cumulusnetworks.com
> Signed-off-by: Andy Gospodarek go...@cumulusnetworks.com
> Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
> Signed-off-by: Wilson Kok w...@cumulusnetworks.com

Applied to net-next branch.
Re: [PATCH 3/5] Add calls to translate Always Running Timer (ART) to system time
On Tue, Jul 28, 2015 at 6:18 PM, Hall, Christopher S christopher.s.h...@intel.com wrote: -Original Message- From: Andy Lutomirski [mailto:l...@kernel.org] Sent: Monday, July 27, 2015 6:32 PM To: Hall, Christopher S; john.stu...@linaro.org; t...@linutronix.de; richardcoch...@gmail.com; mi...@redhat.com; Kirsher, Jeffrey T; Ronciak, John; h...@zytor.com; x...@kernel.org Cc: linux-ker...@vger.kernel.org; netdev@vger.kernel.org; Borislav Petkov Subject: Re: [PATCH 3/5] Add calls to translate Always Running Timer (ART) to system time On 07/27/2015 05:46 PM, Christopher Hall wrote: * art_to_mono64 * art_to_rawmono64 * art_to_realtime64 Intel audio and PCH ethernet devices use the Always Running Timer (ART) to relate their device clock to system time Signed-off-by: Christopher Hall christopher.s.h...@intel.com --- arch/x86/Kconfig | 12 arch/x86/include/asm/art.h | 42 ++ arch/x86/kernel/Makefile | 1 + arch/x86/kernel/art.c | 134 + arch/x86/kernel/tsc.c | 4 ++ 5 files changed, 193 insertions(+) create mode 100644 arch/x86/include/asm/art.h create mode 100644 arch/x86/kernel/art.c diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index b3a1a5d..1ef9985 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -1175,6 +1175,18 @@ config X86_CPUID with major 203 and minors 0 to 31 for /dev/cpu/0/cpuid to /dev/cpu/31/cpuid. +config X86_ART + bool Always Running Timer + default y + depends on X86_TSC + ---help--- + This option provides functionality to drivers and devices that use + the always-running-timer (ART) to correlate their device clock + counter with the system clock counter. The TSC is *exactly* related + to the ART by a ratio m/n specified by CPUID leaf 0x15 + (n=EAX,m=EBX). If ART is unused or unavailable there isn't any + performance impact. It's safe to say Y. + Is there a good reason to make this optional? If there aren't any objections, it sound OK to me. So no, I don't know of any good reasons. 
Also, is there *still* no way to ask the thing for its nominal frequency? Or can we expect CPUID leaf 16H to work on CPUs that support this and can we expect it to actually work?

There isn't any way to query nominal frequency. CPUID leaf 0x15 only exposes the relationship between ART and TSC. CPUID leaf 0x16 stays more or less the same and isn't related to ART. The SDM says "The returned information should not be used for any other purpose as the returned information does not accurately correlate to information / counters returned by other processor interfaces."

Also, does this thing let us learn the real time base? SDM 17.14.4 suggests that the ART value isn't affected by privileged software (aka buggy/malicious firmware). Or, alternatively, how do we learn the offset K between ART and scaled TSC?

ART isn't affected by software. The determination of K used to convert ART to TSC is in a footnote (2) in that section of the SDM. I'm not going to risk repeating it here and possibly altering its meaning.
choice prompt High Memory Support default HIGHMEM4G diff --git a/arch/x86/include/asm/art.h b/arch/x86/include/asm/art.h new file mode 100644 index 000..da58ce4 --- /dev/null +++ b/arch/x86/include/asm/art.h @@ -0,0 +1,42 @@ +/* + * x86 ART related functions + */ +#ifndef _ASM_X86_ART_H +#define _ASM_X86_ART_H + +#ifndef CONFIG_X86_ART + +static inline int setup_art(void) +{ + return 0; +} + +static inline bool has_art(void) +{ + return false; +} + +static inline int art_to_rawmono64(struct timespec64 *rawmono, cycle_t art) +{ + return -ENXIO; +} +static inline int art_to_realtime64(struct timespec64 *realtime, cycle_t art) +{ + return -ENXIO; +} +static inline int art_to_mono64(struct timespec64 *mono, cycle_t art) +{ + return -ENXIO; +} + +#else + +extern int setup_art(void); +extern bool has_art(void); +extern int art_to_rawmono64(struct timespec64 *rawmono, cycle_t art); +extern int art_to_realtime64(struct timespec64 *realtime, cycle_t art); +extern int art_to_mono64(struct timespec64 *mono, cycle_t art); + +#endif + +#endif/*_ASM_X86_ART_H*/ diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index 0f15af4..0908311 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -109,6 +109,7 @@ obj-$(CONFIG_PERF_EVENTS) += perf_regs.o obj-$(CONFIG_TRACING) += tracepoint.o obj-$(CONFIG_IOSF_MBI)+= iosf_mbi.o obj-$(CONFIG_PMC_ATOM)+= pmc_atom.o +obj-$(CONFIG_X86_ART) += art.o ### # 64 bit specific files diff --git a/arch/x86/kernel/art.c
Re: [PATCHv2] net/ipv6: add sysctl option accept_ra_hop_limit
2015-07-28 11:58 GMT+08:00 YOSHIFUJI Hideaki hideaki.yoshif...@miraclelinux.com: Hi, Hangbin Liu wrote: 2015-07-28 7:50 GMT+08:00 YOSHIFUJI Hideaki/吉藤英明 hideaki.yoshif...@miraclelinux.com: Hi, Hangbin Liu wrote: Commit 6fd99094de2b (ipv6: Don't reduce hop limit for an interface) disabled accept hop limit from RA if it is higher than the current hop limit for security stuff. But this behavior kind of break the RFC definition. RFC 4861, 6.3.4. Processing Received Router Advertisements If the received Cur Hop Limit value is non-zero, the host SHOULD set its CurHopLimit variable to the received value. So add sysctl option accept_ra_hop_limit to let user choose whether accept hop limit info in RA. How about introducing minimum hop limit, instead? Hi Yoshifuji, This is a good idea. Maybe this can be another sysctl option? The minimum hop limit can be an enhancement of the security issue, then we will not only increase the hop limit, but also could decrease it in the range of values we accept. On the other hand, with this patch, we can enable, disable or partly enable accept hop limit. If we only use minimum hop limit, people could not use a static hop limit value. May be we use a “hop limit range instead? How do you think? I think name of sysctl is the same as you suggested and change the semantics. default value is 0 to accept all hotlimit value as before and people can set it to 32 (for example) to reject too-small hoplimit (0-31). OK, then I will try submit a minimum hop limit, thanks for your suggestion :) Regards Hangbin --yoshfuji Thanks Hangbin |commit 6fd99094de2b83d1d4c8457f2c83483b2828e75a |Author: D.S. Ljungmark ljungm...@modio.se |Date: Wed Mar 25 09:28:15 2015 +0100 | |ipv6: Don't reduce hop limit for an interface : |RFC 3756, Section 4.2.7, Parameter Spoofing | : | As an example, one possible approach to mitigate this threat is to | ignore very small hop limits. 
The nodes could implement a | configurable minimum hop limit, and ignore attempts to set it below | said limit. -- Hideaki Yoshifuji hideaki.yoshif...@miraclelinux.com Technical Division, MIRACLE LINUX CORPORATION -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [net-next PATCH 2/2] drivers: net: cpsw: add separate napi for tx packet handling for performance improvment
Mugunthan V N mugunthan...@ti.com : On Tuesday 28 July 2015 02:52 AM, Francois Romieu wrote: Mugunthan V N mugunthan...@ti.com : [...] @@ -752,13 +753,22 @@ static irqreturn_t cpsw_tx_interrupt(int irq, void *dev_id) struct cpsw_priv *priv = dev_id; cpdma_ctlr_eoi(priv-dma, CPDMA_EOI_TX); - cpdma_chan_process(priv-txch, 128); + writel(0, priv-wr_regs-tx_en); + + if (netif_running(priv-ndev)) { + napi_schedule(priv-napi_tx); + return IRQ_HANDLED; + } cpsw_ndo_stop calls napi_disable: you can remove netif_running. This netif_running check is to find which interface is up as the interrupt is shared by both the interfaces. When first interface is down and second interface is active then napi_schedule for first interface will fail and second interface napi needs to be scheduled. So I don't think netif_running needs to be removed. Each interface has its own napi tx (resp. rx) context: I would had expected two unconditional napi_schedule per tx (resp. rx) shared irq, not one. I'll read it again after some sleep. -- Ueimor -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCHv2] net/ipv6: add sysctl option accept_ra_hop_limit
Hangbin Liu wrote: 2015-07-28 11:58 GMT+08:00 YOSHIFUJI Hideaki hideaki.yoshif...@miraclelinux.com: Hi, Hangbin Liu wrote: 2015-07-28 7:50 GMT+08:00 YOSHIFUJI Hideaki/吉藤英明 hideaki.yoshif...@miraclelinux.com: Hi, Hangbin Liu wrote: Commit 6fd99094de2b (ipv6: Don't reduce hop limit for an interface) disabled accept hop limit from RA if it is higher than the current hop limit for security stuff. But this behavior kind of break the RFC definition. RFC 4861, 6.3.4. Processing Received Router Advertisements If the received Cur Hop Limit value is non-zero, the host SHOULD set its CurHopLimit variable to the received value. So add sysctl option accept_ra_hop_limit to let user choose whether accept hop limit info in RA. How about introducing minimum hop limit, instead? Hi Yoshifuji, This is a good idea. Maybe this can be another sysctl option? The minimum hop limit can be an enhancement of the security issue, then we will not only increase the hop limit, but also could decrease it in the range of values we accept. On the other hand, with this patch, we can enable, disable or partly enable accept hop limit. If we only use minimum hop limit, people could not use a static hop limit value. May be we use a “hop limit range instead? How do you think? I think name of sysctl is the same as you suggested and change the semantics. default value is 0 to accept all hotlimit value as before and people can set it to 32 (for example) to reject too-small hoplimit (0-31). OK, then I will try submit a minimum hop limit, thanks for your suggestion :) accept_ra_min_hop_limit would be better as we have accept_ra_rt_info_max_plen. Regards Hangbin --yoshfuji Thanks Hangbin |commit 6fd99094de2b83d1d4c8457f2c83483b2828e75a |Author: D.S. 
Ljungmark ljungm...@modio.se |Date: Wed Mar 25 09:28:15 2015 +0100 | |ipv6: Don't reduce hop limit for an interface : |RFC 3756, Section 4.2.7, Parameter Spoofing | : | As an example, one possible approach to mitigate this threat is to | ignore very small hop limits. The nodes could implement a | configurable minimum hop limit, and ignore attempts to set it below | said limit. -- Hideaki Yoshifuji hideaki.yoshif...@miraclelinux.com Technical Division, MIRACLE LINUX CORPORATION -- Hideaki Yoshifuji hideaki.yoshif...@miraclelinux.com Technical Division, MIRACLE LINUX CORPORATION -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
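The semantics agreed on in this thread (a minimum hop limit, default 0 meaning "accept everything as before", and a setting such as 32 rejecting advertised values 0-31) can be sketched in a few lines. This is a minimal illustration under assumptions, not the patch that was eventually submitted: struct model_if and model_ra_update_hop_limit() are hypothetical names, and only the field name accept_ra_min_hop_limit is taken from the suggestion above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-interface settings; the struct is illustrative only. */
struct model_if {
	uint8_t hop_limit;			/* hop limit currently in use */
	uint8_t accept_ra_min_hop_limit;	/* 0 accepts any non-zero value */
};

/*
 * Apply the Cur Hop Limit field of a received Router Advertisement.
 * Returns 1 if the advertised value was applied, 0 if it was ignored.
 */
static int model_ra_update_hop_limit(struct model_if *ifp, uint8_t ra_hop_limit)
{
	assert(ifp != 0);

	/* RFC 4861: a Cur Hop Limit of 0 means "unspecified by this router". */
	if (ra_hop_limit == 0)
		return 0;

	/* RFC 3756 4.2.7 style mitigation: ignore values below the minimum. */
	if (ra_hop_limit < ifp->accept_ra_min_hop_limit)
		return 0;

	ifp->hop_limit = ra_hop_limit;
	return 1;
}
```

With the minimum left at 0 this reduces to the RFC 4861 behaviour (any non-zero value is accepted, including one lower than the current hop limit); raising the minimum trades strict RFC conformance for protection against spoofed small values.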
[PATCH net] bridge: Fix network header pointer for vlan tagged packets
There are several devices that can receive vlan tagged packets with CHECKSUM_PARTIAL like tap, possibly veth and xennet. When (multiple) vlan tagged packets with CHECKSUM_PARTIAL are forwarded by bridge to a device with the IP_CSUM feature, they end up with checksum error because before entering bridge, the network header is set to ETH_HLEN (not including vlan header length) in __netif_receive_skb_core(), get_rps_cpu(), or drivers' rx functions, and nobody fixes the pointer later.

Since the network header is expected to be ETH_HLEN in flow-dissection and hash-calculation in RPS in rx path, and since the header pointer fix is needed only in tx path, set the appropriate network header on forwarding packets.

Signed-off-by: Toshiaki Makita makita.toshi...@lab.ntt.co.jp
---
 net/bridge/br_forward.c | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c
index 0ff6e1b..fa7bfce 100644
--- a/net/bridge/br_forward.c
+++ b/net/bridge/br_forward.c
@@ -37,15 +37,30 @@ static inline int should_deliver(const struct net_bridge_port *p,
 
 int br_dev_queue_push_xmit(struct sock *sk, struct sk_buff *skb)
 {
-	if (!is_skb_forwardable(skb->dev, skb)) {
-		kfree_skb(skb);
-	} else {
-		skb_push(skb, ETH_HLEN);
-		br_drop_fake_rtable(skb);
-		skb_sender_cpu_clear(skb);
-		dev_queue_xmit(skb);
+	if (!is_skb_forwardable(skb->dev, skb))
+		goto drop;
+
+	skb_push(skb, ETH_HLEN);
+	br_drop_fake_rtable(skb);
+	skb_sender_cpu_clear(skb);
+
+	if (skb->ip_summed == CHECKSUM_PARTIAL &&
+	    (skb->protocol == htons(ETH_P_8021Q) ||
+	     skb->protocol == htons(ETH_P_8021AD))) {
+		int depth;
+
+		if (!__vlan_get_protocol(skb, skb->protocol, &depth))
+			goto drop;
+
+		skb_set_network_header(skb, depth);
 	}
 
+	dev_queue_xmit(skb);
+
+	return 0;
+
+drop:
+	kfree_skb(skb);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(br_dev_queue_push_xmit);
-- 
1.8.1.2
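To make the offset arithmetic in the description concrete, the following user-space sketch walks a raw Ethernet frame and returns where the network header starts once any stacked VLAN tags are skipped. It is an illustration under assumptions only: model_network_offset() and the MODEL_* constants are hypothetical names, the function works on a plain byte buffer rather than an skb, and it is not how __vlan_get_protocol() is implemented. The point is simply that the correct offset is ETH_HLEN plus 4 bytes per tag, not ETH_HLEN alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ETH_HLEN	14	/* dst MAC (6) + src MAC (6) + EtherType (2) */
#define MODEL_VLAN_HLEN	4	/* TCI (2) + encapsulated EtherType (2) */
#define MODEL_P_8021Q	0x8100
#define MODEL_P_8021AD	0x88a8

/*
 * Return the offset of the network header from the start of the frame,
 * or -1 if the frame is too short to hold the headers it announces.
 */
static int model_network_offset(const uint8_t *frame, size_t len)
{
	size_t off = MODEL_ETH_HLEN;
	uint16_t proto;

	assert(frame != NULL);
	if (len < MODEL_ETH_HLEN)
		return -1;

	/* Outer EtherType sits in the last two bytes of the Ethernet header. */
	proto = (uint16_t)(frame[12] << 8 | frame[13]);

	while (proto == MODEL_P_8021Q || proto == MODEL_P_8021AD) {
		if (len < off + MODEL_VLAN_HLEN)
			return -1;
		/* Skip the 2-byte TCI, then read the encapsulated EtherType. */
		proto = (uint16_t)(frame[off + 2] << 8 | frame[off + 3]);
		off += MODEL_VLAN_HLEN;
	}
	return (int)off;
}
```

For an untagged frame this yields 14, for a single tag 18, and for an 802.1ad outer tag with an 802.1Q inner tag 22.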
[PATCH v4 0/7] introduce Hyper-V VM Sockets(hvsock)
Changes since v1:
- updated "[PATCH 6/7] hvsock: introduce Hyper-V VM Sockets feature"
- added __init and __exit for the module init/exit functions
- net/hv_sock/Kconfig: "default m" -> "default m if HYPERV"
- MODULE_LICENSE: "Dual MIT/GPL" -> "Dual BSD/GPL"

Changes since v2:
- fixed various coding issues pointed out by David Miller
- fixed indentation issues
- removed pr_debug in net/hv_sock/af_hvsock.c
- used reverse-Christmas-tree style for local variables.
- EXPORT_SYMBOL -> EXPORT_SYMBOL_GPL

Changes since v3:
- fixed a few coding issues pointed out by Vitaly Kuznetsov and Dan Carpenter
- fixed the ret value in vmbus_recvpacket_hvsock on error
- fixed the style of multi-line comment: vmbus_get_hvsock_rw_status()

Hyper-V VM Sockets (hvsock) is a byte-stream based communication mechanism between a Windows 10 (or later) host and a guest. It's kind of TCP over VMBus, but the transportation layer (VMBus) is much simpler than IP. With Hyper-V VM Sockets, applications between the host and a guest can talk with each other directly by the traditional BSD-style socket APIs.

The patchset implements the necessary support in the guest side by adding the necessary new APIs in the vmbus driver, and introducing a new driver hv_sock.ko, which implements a new socket address family AF_HYPERV.

I know the kernel has already had a VM Sockets driver (AF_VSOCK) based on VMware's VMCI (net/vmw_vsock/, drivers/misc/vmw_vmci), and KVM is proposing AF_VSOCK of virtio version: http://thread.gmane.org/gmane.linux.network/365205.

However, though Hyper-V VM Sockets may seem conceptually similar to AF_VSOCK, there are differences in the transportation layer, and IMO these make direct code reuse impractical:

1. In AF_VSOCK, the endpoint type is: u32 ContextID, u32 Port, but in AF_HYPERV, the endpoint type is: GUID VM_ID, GUID ServiceID. Here GUID is 128-bit.

2. AF_VSOCK supports SOCK_DGRAM, while AF_HYPERV doesn't.

3. AF_VSOCK supports some special sock opts, like SO_VM_SOCKETS_BUFFER_SIZE, SO_VM_SOCKETS_BUFFER_MIN/MAX_SIZE and SO_VM_SOCKETS_CONNECT_TIMEOUT. These are meaningless to AF_HYPERV.

4. Some of AF_VSOCK's VMCI transportation ops are meaningless to AF_HYPERV/VMBus, like
   .notify_recv_init
   .notify_recv_pre_block
   .notify_recv_pre_dequeue
   .notify_recv_post_dequeue
   .notify_send_init
   .notify_send_pre_block
   .notify_send_pre_enqueue
   .notify_send_post_enqueue
   etc.

So I think we'd better introduce a new address family: AF_HYPERV.

Please review the patchset. Looking forward to your comments!

Dexuan Cui (7):
  Drivers: hv: vmbus: define the new offer type for Hyper-V socket (hvsock)
  Drivers: hv: vmbus: define a new VMBus message type for hvsock
  Drivers: hv: vmbus: add APIs to send/recv hvsock packet and get the r/w-ability
  Drivers: hv: vmbus: add APIs to register callbacks to process hvsock connection
  Drivers: hv: vmbus: add a helper function to set a channel's pending send size
  hvsock: introduce Hyper-V VM Sockets feature
  Drivers: hv: vmbus: disable local interrupt when hvsock's callback is running

 MAINTAINERS                   |    2 +
 drivers/hv/Makefile           |    4 +-
 drivers/hv/channel.c          |  149 +
 drivers/hv/channel_mgmt.c     |   13 +
 drivers/hv/connection.c       |   15 +-
 drivers/hv/hvsock_callbacks.c |   71 ++
 drivers/hv/hyperv_vmbus.h     |    4 +
 drivers/hv/ring_buffer.c      |   14 +
 include/linux/hyperv.h        |   68 ++
 include/linux/socket.h        |    4 +-
 include/net/af_hvsock.h       |   44 ++
 include/uapi/linux/hyperv.h   |   16 +
 net/Kconfig                   |    1 +
 net/Makefile                  |    1 +
 net/hv_sock/Kconfig           |   10 +
 net/hv_sock/Makefile          |    3 +
 net/hv_sock/af_hvsock.c       | 1430 +
 17 files changed, 1846 insertions(+), 3 deletions(-)
 create mode 100644 drivers/hv/hvsock_callbacks.c
 create mode 100644 include/net/af_hvsock.h
 create mode 100644 net/hv_sock/Kconfig
 create mode 100644 net/hv_sock/Makefile
 create mode 100644 net/hv_sock/af_hvsock.c

-- 
2.1.0
Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
Hannes Frederic Sowa han...@stressinduktion.org writes: Hello Eric, On Mon, 2015-07-27 at 15:33 -0500, Eric W. Biederman wrote: David Ahern d...@cumulusnetworks.com writes: Allow tasks to have a default device index for binding sockets. If set the value is passed to all AF_INET/AF_INET6 sockets when they are created. The task setting is passed parent to child on fork, but can be set or changed after task creation using prctl (if task has CAP_NET_ADMIN permissions). The setting for a socket can be retrieved using prctl(). This option allows an administrator to restrict a task to only send/receive packets through the specified device. In the case of VRF devices this option restricts tasks to a specific VRF. Correlation of the device index to a specific VRF, ie., ifindex -- VRF device -- VRF id is left to userspace. Nacked-by: Eric W. Biederman ebied...@xmission.com Because it is broken by design. Your routing device is only safe for programs that know it's limitations it is not appropriate for general applications. Since you don't even seen to know it's limitations I think this is a bad path to walk down. Can you please elaborate about the broken by design? Different operating systems are already using this approach with good success. I read your other mail regarding isolation of different VRFs and I agree that all code which persists state depending solely on the IP address is affected by this and this must be dealt with and fixed (actually, there aren't too many). The size of struct net would tend to disagree with the assertion that there are not too many. But I wouldn't call that broken by design. This stuff will get fixed like e.g. cross-talk between fragmentation queues, icmp rate limiters etc, which could already happen in the past. What is your opinion on the fundamental approach only from a user perspective? Do you think that is broken, too? I think promising something to userspace that a design can not deliver is a fundamental problem. 
Eric
[PATCH] net: rfkill-regulator: fix compiler warning
pdata char* name => const char* name

Signed-off-by: Robert ABEL ra...@cit-ec.uni-bielefeld.de
---
 include/linux/rfkill-regulator.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/rfkill-regulator.h b/include/linux/rfkill-regulator.h
index aca36bc..594d8e7 100644
--- a/include/linux/rfkill-regulator.h
+++ b/include/linux/rfkill-regulator.h
@@ -41,7 +41,7 @@
 #include <linux/rfkill.h>
 
 struct rfkill_regulator_platform_data {
-	char *name;		/* the name for the rfkill switch */
+	const char *name;	/* the name for the rfkill switch */
 	enum rfkill_type type;	/* the type as specified in rfkill.h */
 };
 
-- 
2.5.0
[PATCH net-next 5/5] s390/bpf: recache skb-data/hlen for skb_vlan_push/pop
Allow eBPF programs attached to TC qdiscs call skb_vlan_push/pop via helper functions. These functions may change skb-data/hlen. This data is cached by s390 JIT to improve performance of ld_abs/ld_ind instructions. Therefore after a change we have to reload the data. In case of usage of skb_vlan_push/pop, in the prologue we store the SKB pointer on the stack and restore it after BPF_JMP_CALL to skb_vlan_push/pop. Signed-off-by: Michael Holzheu holz...@linux.vnet.ibm.com --- arch/s390/net/bpf_jit.h | 5 +++- arch/s390/net/bpf_jit_comp.c | 55 ++-- 2 files changed, 37 insertions(+), 23 deletions(-) diff --git a/arch/s390/net/bpf_jit.h b/arch/s390/net/bpf_jit.h index f6498ee..f010c93 100644 --- a/arch/s390/net/bpf_jit.h +++ b/arch/s390/net/bpf_jit.h @@ -36,6 +36,8 @@ extern u8 sk_load_word[], sk_load_half[], sk_load_byte[]; * | BPF stack | | * | | | * +---+ | + * | 8 byte skbp | | + * R15+170 - +---+ | * | 8 byte hlen | | * R15+168 - +---+ | * | 4 byte align | | @@ -51,11 +53,12 @@ extern u8 sk_load_word[], sk_load_half[], sk_load_byte[]; * We get 160 bytes stack space from calling function, but only use * 12 * 8 byte for old backchain, r15..r6, and tail_call_cnt. 
*/ -#define STK_SPACE (MAX_BPF_STACK + 8 + 4 + 4 + 160) +#define STK_SPACE (MAX_BPF_STACK + 8 + 8 + 4 + 4 + 160) #define STK_160_UNUSED (160 - 12 * 8) #define STK_OFF(STK_SPACE - STK_160_UNUSED) #define STK_OFF_TMP160 /* Offset of tmp buffer on stack */ #define STK_OFF_HLEN 168 /* Offset of SKB header length on stack */ +#define STK_OFF_SKBP 170 /* Offset of SKB pointer on stack */ #define STK_OFF_R6 (160 - 11 * 8) /* Offset of r6 on stack */ #define STK_OFF_TCCNT (160 - 12 * 8) /* Offset of tail_call_cnt on stack */ diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c index a025ddc..ece46d4 100644 --- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -53,6 +53,7 @@ struct bpf_jit { #define SEEN_LITERAL 8 /* code uses literals */ #define SEEN_FUNC 16 /* calls C functions */ #define SEEN_TAIL_CALL 32 /* code uses tail calls */ +#define SEEN_SKB_CHANGE64 /* code changes skb data */ #define SEEN_STACK (SEEN_FUNC | SEEN_MEM | SEEN_SKB) /* @@ -382,6 +383,26 @@ static void save_restore_regs(struct bpf_jit *jit, int op) } /* + * For SKB access %b1 contains the SKB pointer. For bpf_jit.S + * we store the SKB header length on the stack and the SKB data + * pointer in REG_SKB_DATA. + */ +static void emit_load_skb_data_hlen(struct bpf_jit *jit) +{ + /* Header length: llgf %w1,len(%b1) */ + EMIT6_DISP_LH(0xe300, 0x0016, REG_W1, REG_0, BPF_REG_1, + offsetof(struct sk_buff, len)); + /* s %w1,data_len(%b1) */ + EMIT4_DISP(0x5b00, REG_W1, BPF_REG_1, + offsetof(struct sk_buff, data_len)); + /* stg %w1,ST_OFF_HLEN(%r0,%r15) */ + EMIT6_DISP_LH(0xe300, 0x0024, REG_W1, REG_0, REG_15, STK_OFF_HLEN); + /* lg %skb_data,data_off(%b1) */ + EMIT6_DISP_LH(0xe300, 0x0004, REG_SKB_DATA, REG_0, + BPF_REG_1, offsetof(struct sk_buff, data)); +} + +/* * Emit function prologue * * Save registers and create stack frame if necessary. 
@@ -421,25 +442,12 @@ static void bpf_jit_prologue(struct bpf_jit *jit, bool is_classic) EMIT6_DISP_LH(0xe300, 0x0024, REG_W1, REG_0, REG_15, 152); } - /* -* For SKB access %b1 contains the SKB pointer. For bpf_jit.S -* we store the SKB header length on the stack and the SKB data -* pointer in REG_SKB_DATA. -*/ - if (jit-seen SEEN_SKB) { - /* Header length: llgf %w1,len(%b1) */ - EMIT6_DISP_LH(0xe300, 0x0016, REG_W1, REG_0, BPF_REG_1, - offsetof(struct sk_buff, len)); - /* s %w1,data_len(%b1) */ - EMIT4_DISP(0x5b00, REG_W1, BPF_REG_1, - offsetof(struct sk_buff, data_len)); - /* stg %w1,ST_OFF_HLEN(%r0,%r15) */ + if (jit-seen SEEN_SKB) + emit_load_skb_data_hlen(jit); + if (jit-seen SEEN_SKB_CHANGE) + /* stg %b1,ST_OFF_SKBP(%r0,%r15) */ EMIT6_DISP_LH(0xe300, 0x0024, REG_W1, REG_0, REG_15, - STK_OFF_HLEN); - /* lg %skb_data,data_off(%b1) */ - EMIT6_DISP_LH(0xe300, 0x0004, REG_SKB_DATA, REG_0, - BPF_REG_1, offsetof(struct sk_buff, data)); - } + STK_OFF_SKBP); /* Clear A (%b0) and X (%b7) registers for converted BPF programs */ if
[PATCH net-next 2/5] s390/bpf: Fix multiple macro expansions
The EMIT6_DISP_LH macro passes the "disp" parameter to the _EMIT6_DISP_LH macro. The _EMIT6_DISP_LH macro uses the "disp" parameter twice:

	unsigned int __disp_h = ((u32)disp) & 0xff000;
	unsigned int __disp_l = ((u32)disp) & 0x00fff;

The EMIT6_DISP_LH is used several times with EMIT_CONST_U64() as "disp" parameter. Therefore always two constants are created per usage of EMIT6_DISP_LH.

Fix this and add variable "__disp" to avoid multiple expansions.

Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Michael Holzheu holz...@linux.vnet.ibm.com
---
 arch/s390/net/bpf_jit_comp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 01ad166..de0f0bc 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -221,8 +221,9 @@ static inline void reg_set_seen(struct bpf_jit *jit, u32 b1)
 
 #define EMIT6_DISP_LH(op1, op2, b1, b2, b3, disp)		\
 ({								\
+	int __disp = (disp);					\
 	_EMIT6_DISP_LH(op1 | reg(b1, b2) << 16 |		\
-		       reg_high(b3) << 8, op2, disp);		\
+		       reg_high(b3) << 8, op2, __disp);		\
 	REG_SET_SEEN(b1);					\
 	REG_SET_SEEN(b2);					\
 	REG_SET_SEEN(b3);					\
-- 
2.3.8
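The class of bug fixed here is easy to reproduce in isolation. The sketch below is a stand-alone illustration, not code from the JIT: model_alloc_const() is a hypothetical stand-in for a side-effecting argument such as EMIT_CONST_U64(), and the two MODEL_SPLIT_* macros are invented for the example. The first macro expands its argument twice, so the side effect happens twice; the second evaluates it once into a local, which is the same remedy the patch applies.

```c
#include <assert.h>

/* Counts how many times the side-effecting argument was evaluated. */
static int model_calls;

/* Stand-in for EMIT_CONST_U64(): every call hands out a new literal slot. */
static int model_alloc_const(void)
{
	assert(model_calls >= 0);
	return model_calls++ * 8;
}

/* Buggy shape: "disp" appears twice, so its expression runs twice. */
#define MODEL_SPLIT_BAD(disp, hi, lo)					\
	do {								\
		(hi) = ((unsigned int)(disp)) & 0xff000;		\
		(lo) = ((unsigned int)(disp)) & 0x00fff;		\
	} while (0)

/* Fixed shape: evaluate "disp" once, then reuse the local copy. */
#define MODEL_SPLIT_GOOD(disp, hi, lo)					\
	do {								\
		unsigned int disp_once_ = (unsigned int)(disp);		\
		(hi) = disp_once_ & 0xff000;				\
		(lo) = disp_once_ & 0x00fff;				\
	} while (0)
```

Calling MODEL_SPLIT_BAD(model_alloc_const(), hi, lo) bumps the counter twice and can leave hi and lo derived from two different values; MODEL_SPLIT_GOOD bumps it once.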
[PATCH net-next 1/5] s390/bpf: clear correct BPF accumulator register
Currently we assumed the following BPF to eBPF register mapping:

 - BPF_REG_A -> BPF_REG_7
 - BPF_REG_X -> BPF_REG_8

Unfortunately this mapping is wrong. The correct mapping is:

 - BPF_REG_A -> BPF_REG_0
 - BPF_REG_X -> BPF_REG_7

So clear the correct registers and use the BPF_REG_A and BPF_REG_X macros instead of BPF_REG_0/7.

Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Cc: sta...@vger.kernel.org # 4.0+
Signed-off-by: Michael Holzheu holz...@linux.vnet.ibm.com
---
 arch/s390/net/bpf_jit_comp.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index 79c731e..01ad166 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -448,13 +448,13 @@ static void bpf_jit_prologue(struct bpf_jit *jit)
 		EMIT6_DISP_LH(0xe300, 0x0004, REG_SKB_DATA, REG_0,
 			      BPF_REG_1, offsetof(struct sk_buff, data));
 	}
-	/* BPF compatibility: clear A (%b7) and X (%b8) registers */
-	if (REG_SEEN(BPF_REG_7))
-		/* lghi %b7,0 */
-		EMIT4_IMM(0xa709, BPF_REG_7, 0);
-	if (REG_SEEN(BPF_REG_8))
-		/* lghi %b8,0 */
-		EMIT4_IMM(0xa709, BPF_REG_8, 0);
+	/* BPF compatibility: clear A (%b0) and X (%b7) registers */
+	if (REG_SEEN(BPF_REG_A))
+		/* lghi %ba,0 */
+		EMIT4_IMM(0xa709, BPF_REG_A, 0);
+	if (REG_SEEN(BPF_REG_X))
+		/* lghi %bx,0 */
+		EMIT4_IMM(0xa709, BPF_REG_X, 0);
 }
 
 /*
-- 
2.3.8
Re: ARP response with link local IP, why not broadcast
Just a quick update on the subject. Thanks for the input. It's good to see that I am not the only one that has this problem.

Right now we go with our initial approach and bcast our ARP responses. We have a very local network built only for one purpose. Other devices in that network use the same approach. And the master control software will ARP request every address eventually. It's not ideal and will potentially take a couple of minutes to resolve every conflict. But it's the best compromise between effort and benefit. I'll let you know about our test results. Maybe somebody is interested.

Btw, I still wonder if I can partially keep the kernel from answering ARP packets?

On Wed, Jul 22, 2015 at 9:49 AM, Sebastian Fett db_ext...@gmx.de wrote:

what is your use case?

My problem is a local network of audio devices. It is a valid possibility that two halves of the setup are set up individually (stage left and stage right). Both local networks will auto configure themselves via link local and will be stable. But there always can be two devices with the same IP in both networks. At one point those two networks will be connected. With the current behaviour the conflicting devices will never know of each other and the address conflict.

Ah yes, this is a valid problem (Partition-Join tolerance) and one that is being discussed in the IPv6 context on 6man: http://www.ietf.org/mail-archive/web/ipv6/current/msg22712.html

FWIW, when Solaris implemented ACD (RFC 5227) the compromise that was made between bcasting *every* ARP response while solving the type of issue that you describe was to use a periodic ARP announce, advertising the IP address (a Grat ARP) with exponential backoff. If a duplicate address is triggered (as would happen in the scenario that you describe) the system would fall into the aggressive defend mode. ARP announcements were bcast, but the noise is mitigated by tunable exponential backoff.

Of course, all of this only helps to *detect* the duplicate; eventually some other entity has to jump in and arbitrate on which one should own the address.

The devices are controlled by a central PC using avahi/bonjour. It will know of all conflicting devices, but will only be able to talk to the one that happens to be in its ARP cache. And renewing that cache will not change anything, because it will happen with unicast messages. I looked at a Dante Controller (an audio data streaming device). And here all ARP messages are answered with broadcasts. I think that behaviour is acceptable because it only happens in local networks. Waking up sleeping devices will not be a concern there.

I don't know if a short term solution that makes sense here is to have a tunable for this. But even the always-bcast ARP response will fail if you have a silent rejoin of the partitioned network: there is a reliance on the owner of an address bcasting their ARP resp at some point, right? (There's also a DoS vector here: I can create a lot of bcast traffic by arping for an address.)

That brings me to another question. When I react to an ARP packet in a userspace program, can I keep that packet from reaching the kernel as well? I would like to avoid handling ARP completely in userspace.
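The periodic announce with exponential backoff mentioned above can be sketched as a small schedule function. This is only an illustration: the minimum and maximum intervals are hypothetical tunables, not values taken from Solaris or RFC 5227, and resetting the counter after a conflict is an assumed way to model the "aggressive defend" behaviour.

```c
#include <assert.h>

/* Hypothetical tunables, chosen for illustration only. */
#define ANNOUNCE_MIN_SECS	1u
#define ANNOUNCE_MAX_SECS	64u

/* Seconds to wait before the n-th repeated announcement (n starts at 0).
 * The interval doubles each time until it reaches the cap. */
unsigned int announce_interval(unsigned int n)
{
	unsigned int secs = ANNOUNCE_MIN_SECS;

	while (n-- > 0 && secs < ANNOUNCE_MAX_SECS)
		secs *= 2;
	return secs;
}

/* Next value of the attempt counter. Assumption: a detected conflict
 * restarts the schedule at the shortest interval. */
unsigned int next_attempt(unsigned int n, int conflict_seen)
{
	return conflict_seen ? 0 : n + 1;
}
```

With these values the broadcasts go out after 1, 2, 4, ... seconds and settle at one per 64 seconds, so a rejoined partition is noticed eventually without broadcasting every ARP response.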
Re: [PATCH] Drivers: isdn: Drop unnecessary continue
On Tue, 2015-07-28 at 14:11 +0530, Shraddha Barke wrote: The semantic patch used to make this change is : @@ @@ for (...;...;...) { ... if (...) { ... - continue; } } Signed-off-by: Shraddha Barke shraddha.6...@gmail.com --- drivers/isdn/hardware/mISDN/hfcsusb.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/isdn/hardware/mISDN/hfcsusb.c b/drivers/isdn/hardware/mISDN/hfcsusb.c index 114f3bc..91beb83 100644 --- a/drivers/isdn/hardware/mISDN/hfcsusb.c +++ b/drivers/isdn/hardware/mISDN/hfcsusb.c @@ -1921,10 +1921,9 @@ hfcsusb_probe(struct usb_interface *intf, const struct usb_device_id *id) if ((le16_to_cpu(dev-descriptor.idVendor) == hfcsusb_idtab[i].idVendor) (le16_to_cpu(dev-descriptor.idProduct) - == hfcsusb_idtab[i].idProduct)) { + == hfcsusb_idtab[i].idProduct)) vend_idx = i; - continue; - } + } printk(KERN_DEBUG Well, it seems author intent was to use a break instead of a continue. Not a big deal... -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
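The difference between the dropped continue and the break suggested in the reply can be shown with a small lookup loop. This is a sketch with a hypothetical ID table, not the hfcsusb one: a continue as the last statement of a loop body has no effect, so the loop keeps scanning and the last match wins, whereas a break stops at the first match.

```c
#include <assert.h>
#include <stddef.h>

struct id_entry {
	unsigned short vendor;
	unsigned short product;
};

/* Hypothetical table; index 0 and index 2 are deliberately identical. */
static const struct id_entry id_table[] = {
	{ 0x1111, 0x0001 },
	{ 0x2222, 0x0002 },
	{ 0x1111, 0x0001 },
};

#define ID_TABLE_LEN (sizeof(id_table) / sizeof(id_table[0]))

/* No break: the whole table is scanned, the last match is returned. */
int find_last_match(unsigned short vendor, unsigned short product)
{
	int idx = -1;
	size_t i;

	for (i = 0; i < ID_TABLE_LEN; i++) {
		if (id_table[i].vendor == vendor &&
		    id_table[i].product == product)
			idx = (int)i;
	}
	return idx;
}

/* With break: scanning stops at the first match. */
int find_first_match(unsigned short vendor, unsigned short product)
{
	int idx = -1;
	size_t i;

	for (i = 0; i < ID_TABLE_LEN; i++) {
		if (id_table[i].vendor == vendor &&
		    id_table[i].product == product) {
			idx = (int)i;
			break;
		}
	}
	return idx;
}
```

The two functions only differ when the table has duplicate entries, which fits the "not a big deal" remark above.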
[PATCH net-next 3/5] s390/bpf: increase BPF_SIZE_MAX
Currently we have the restriction that jitted BPF programs can have a maximum size of one page. The reason is that we use short displacements for the literal pool.

The 20 bit displacements are available since z990 and BPF requires z196 as minimum. Therefore we can remove this restriction and use 20 bit signed long displacements everywhere.

Acked-by: Martin Schwidefsky schwidef...@de.ibm.com
Signed-off-by: Michael Holzheu holz...@linux.vnet.ibm.com
---
 arch/s390/net/bpf_jit_comp.c | 15 +++
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index de0f0bc..bea5cfc 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -45,7 +45,7 @@ struct bpf_jit {
 	int labels[1];		/* Labels for local jumps */
 };

-#define BPF_SIZE_MAX	4096	/* Max size for program */
+#define BPF_SIZE_MAX	0x7	/* Max size for program (20 bit signed displ) */

 #define SEEN_SKB	1	/* skb access */
 #define SEEN_MEM	2	/* use mem[] for temporary storage */
@@ -203,15 +203,6 @@ static inline void reg_set_seen(struct bpf_jit *jit, u32 b1)
 	_EMIT6(op1 | __disp, op2);				\
 })

-#define EMIT6_DISP(op1, op2, b1, b2, b3, disp)			\
-({								\
-	_EMIT6_DISP(op1 | reg(b1, b2) << 16 |			\
-		    reg_high(b3) << 8, op2, disp);		\
-	REG_SET_SEEN(b1);					\
-	REG_SET_SEEN(b2);					\
-	REG_SET_SEEN(b3);					\
-})
-
 #define _EMIT6_DISP_LH(op1, op2, disp)				\
 ({								\
 	unsigned int __disp_h = ((u32)disp) & 0xff000;		\
@@ -981,8 +972,8 @@ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, int i
 		REG_SET_SEEN(BPF_REG_5);
 		jit->seen |= SEEN_FUNC;
 		/* lg %w1,d(imm)(%l) */
-		EMIT6_DISP(0xe300, 0x0004, REG_W1, REG_0, REG_L,
-			   EMIT_CONST_U64(func));
+		EMIT6_DISP_LH(0xe300, 0x0004, REG_W1, REG_0, REG_L,
+			      EMIT_CONST_U64(func));
 		/* basr %r14,%w1 */
 		EMIT2(0x0d00, REG_14, REG_W1);
 		/* lgr %b0,%r2: load return value into %b0 */
--
2.3.8
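To make the size argument above concrete: a short displacement is an unsigned 12 bit field and so reaches at most 4095 bytes, which matches the old one-page limit, while a 20 bit signed displacement reaches from -0x80000 to 0x7ffff. The following is a minimal sketch of the range check and of the split into a high and a low part, using the same masks that _EMIT6_DISP_LH applies. The helper names are made up for this example, and how the two parts are placed into the instruction is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define DISP20_MIN	(-0x80000)
#define DISP20_MAX	0x7ffff

/* Does the value fit into a 20 bit signed displacement? */
int disp20_fits(long disp)
{
	return disp >= DISP20_MIN && disp <= DISP20_MAX;
}

/* Bits 12-19 of the displacement (mask 0xff000). */
uint32_t disp20_high(int32_t disp)
{
	return ((uint32_t)disp) & 0xff000;
}

/* Bits 0-11 of the displacement (mask 0x00fff). */
uint32_t disp20_low(int32_t disp)
{
	return ((uint32_t)disp) & 0x00fff;
}

/* Rebuild the signed value from both parts, sign-extending bit 19. */
int32_t disp20_join(uint32_t high, uint32_t low)
{
	uint32_t raw = (high | low) & 0xfffff;

	if (raw & 0x80000)
		return (int32_t)raw - 0x100000;
	return (int32_t)raw;
}
```

For example, disp20_fits(4096) is true although 4096 no longer fits into a 12 bit field, and splitting and rejoining -1 gives -1 again.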
[PATCH net-next 0/5] s390/bpf: recache skb->data/hlen for skb_vlan_push/pop
Hi Dave,

Here is the s390 backend for Alexei's patch 4e10df9a60d9 (bpf: introduce bpf_skb_vlan_push/pop() helpers) plus two bugfixes and two minor improvements.

The first patch "s390/bpf: clear correct BPF accumulator register" will also go upstream via Martin's fixes branch.

Ok for you?

Regards,
Michael

Michael Holzheu (5):
  s390/bpf: clear correct BPF accumulator register
  s390/bpf: Fix multiple macro expansions
  s390/bpf: increase BPF_SIZE_MAX
  s390/bpf: Only clear A and X for converted BPF programs
  s390/bpf: recache skb->data/hlen for skb_vlan_push/pop

 arch/s390/net/bpf_jit.h      |  5 ++-
 arch/s390/net/bpf_jit_comp.c | 91 +++-
 2 files changed, 52 insertions(+), 44 deletions(-)

--
2.3.8
[PATCH net-next 4/5] s390/bpf: Only clear A and X for converted BPF programs
Only classic BPF programs that have been converted to eBPF need to clear the A and X registers. We can check for converted programs with:

 bpf_prog->type == BPF_PROG_TYPE_UNSPEC

So add the check and skip initialization for real eBPF programs.

Signed-off-by: Michael Holzheu holz...@linux.vnet.ibm.com
---
 arch/s390/net/bpf_jit_comp.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index bea5cfc..a025ddc 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -387,7 +387,7 @@ static void save_restore_regs(struct bpf_jit *jit, int op)
  * Save registers and create stack frame if necessary.
  * See stack frame layout desription in bpf_jit.h!
  */
-static void bpf_jit_prologue(struct bpf_jit *jit)
+static void bpf_jit_prologue(struct bpf_jit *jit, bool is_classic)
 {
 	if (jit->seen & SEEN_TAIL_CALL) {
 		/* xc STK_OFF_TCCNT(4,%r15),STK_OFF_TCCNT(%r15) */
@@ -440,13 +440,15 @@ static void bpf_jit_prologue(struct bpf_jit *jit)
 		EMIT6_DISP_LH(0xe300, 0x0004, REG_SKB_DATA, REG_0,
 			      BPF_REG_1, offsetof(struct sk_buff, data));
 	}
-	/* BPF compatibility: clear A (%b0) and X (%b7) registers */
-	if (REG_SEEN(BPF_REG_A))
-		/* lghi %ba,0 */
-		EMIT4_IMM(0xa709, BPF_REG_A, 0);
-	if (REG_SEEN(BPF_REG_X))
-		/* lghi %bx,0 */
-		EMIT4_IMM(0xa709, BPF_REG_X, 0);
+	/* Clear A (%b0) and X (%b7) registers for converted BPF programs */
+	if (is_classic) {
+		if (REG_SEEN(BPF_REG_A))
+			/* lghi %ba,0 */
+			EMIT4_IMM(0xa709, BPF_REG_A, 0);
+		if (REG_SEEN(BPF_REG_X))
+			/* lghi %bx,0 */
+			EMIT4_IMM(0xa709, BPF_REG_X, 0);
+	}
 }

 /*
@@ -1232,7 +1234,7 @@ static int bpf_jit_prog(struct bpf_jit *jit, struct bpf_prog *fp)
 	jit->lit = jit->lit_start;
 	jit->prg = 0;

-	bpf_jit_prologue(jit);
+	bpf_jit_prologue(jit, fp->type == BPF_PROG_TYPE_UNSPEC);
 	for (i = 0; i < fp->len; i += insn_count) {
 		insn_count = bpf_jit_insn(jit, fp, i);
 		if (insn_count < 0)
--
2.3.8
Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
On Tue, 2015-07-28 at 08:54 -0500, Eric W. Biederman wrote: Hannes Frederic Sowa han...@stressinduktion.org writes: Hello Eric, On Mon, 2015-07-27 at 15:33 -0500, Eric W. Biederman wrote: David Ahern d...@cumulusnetworks.com writes: Allow tasks to have a default device index for binding sockets. If set the value is passed to all AF_INET/AF_INET6 sockets when they are created. The task setting is passed parent to child on fork, but can be set or changed after task creation using prctl (if task has CAP_NET_ADMIN permissions). The setting for a socket can be retrieved using prctl(). This option allows an administrator to restrict a task to only send/receive packets through the specified device. In the case of VRF devices this option restricts tasks to a specific VRF. Correlation of the device index to a specific VRF, ie., ifindex -- VRF device -- VRF id is left to userspace. Nacked-by: Eric W. Biederman ebied...@xmission.com Because it is broken by design. Your routing device is only safe for programs that know it's limitations it is not appropriate for general applications. Since you don't even seen to know it's limitations I think this is a bad path to walk down. Can you please elaborate about the broken by design? Different operating systems are already using this approach with good success. I read your other mail regarding isolation of different VRFs and I agree that all code which persists state depending solely on the IP address is affected by this and this must be dealt with and fixed (actually, there aren't too many). The size of struct net would tend to disagree with the assertion that there are not too many. netns_frags and inet_peer comes to my mind at first. All those data structures simply need to have an opaque id added to the hash and comparison functions to deal with this problem. 
And we will need this in future anyway, as openvswitch will get connection tracking support and thus the fragmentation engine and icmp rate limiter will need to be taught about zones in OVS. But I wouldn't call that broken by design. This stuff will get fixed like e.g. cross-talk between fragmentation queues, icmp rate limiters etc, which could already happen in the past. What is your opinion on the fundamental approach only from a user perspective? Do you think that is broken, too? I think promising something to userspace that a design can not deliver is a fundamental problem. You are still talking about the isolation aspect, right? Bye, Hannes -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH net] ebpf, x86: fix general protection fault when tail call is invoked
With eBPF JIT compiler enabled on x86_64, I was able to reliably trigger the following general protection fault out of an eBPF program with a simple tail call, f.e. tracex5 (or a stripped down version of it): [ 927.097918] general protection fault: [#1] SMP DEBUG_PAGEALLOC [...] [ 927.100870] task: 8801f228b780 ti: 880016a64000 task.ti: 880016a64000 [ 927.102096] RIP: 0010:[a002440d] [a002440d] 0xa002440d [ 927.103390] RSP: 0018:880016a67a68 EFLAGS: 00010006 [ 927.104683] RAX: 5a5a5a5a5a5a5a5a RBX: RCX: 0001 [ 927.105921] RDX: RSI: 88014e438000 RDI: 880016a67e00 [ 927.107137] RBP: 880016a67c90 R08: R09: 0001 [ 927.108351] R10: R11: R12: 880016a67e00 [ 927.109567] R13: R14: 88026500e460 R15: 880220a81520 [ 927.110787] FS: 7fe7d5c1f740() GS:88026500() knlGS: [ 927.112021] CS: 0010 DS: ES: CR0: 80050033 [ 927.113255] CR2: 003e7bbb91a0 CR3: 6e04b000 CR4: 001407e0 [ 927.114500] Stack: [ 927.115737] c90008cdb000 880016a67e00 88026500e460 880220a81520 [ 927.117005] 0001 001b 880016a67aa8 8106c548 [ 927.118276] 7ffcdaf22e58 880016a67ff0 [ 927.119543] Call Trace: [ 927.120797] [8106c548] ? lookup_address+0x28/0x30 [ 927.122058] [8113d176] ? __module_text_address+0x16/0x70 [ 927.123314] [8117bf0e] ? is_ftrace_trampoline+0x3e/0x70 [ 927.124562] [810c1a0f] ? __kernel_text_address+0x5f/0x80 [ 927.125806] [8102086f] ? print_context_stack+0x7f/0xf0 [ 927.127033] [810f7852] ? __lock_acquire+0x572/0x2050 [ 927.128254] [810f7852] ? __lock_acquire+0x572/0x2050 [ 927.129461] [8119edfa] ? trace_call_bpf+0x3a/0x140 [ 927.130654] [8119ee4a] trace_call_bpf+0x8a/0x140 [ 927.131837] [8119edfa] ? trace_call_bpf+0x3a/0x140 [ 927.133015] [8119f008] kprobe_perf_func+0x28/0x220 [ 927.134195] [811a1668] kprobe_dispatcher+0x38/0x60 [ 927.135367] [81174b91] ? seccomp_phase1+0x1/0x230 [ 927.136523] [81061400] kprobe_ftrace_handler+0xf0/0x150 [ 927.137666] [81174b95] ? 
seccomp_phase1+0x5/0x230 [ 927.138802] [8117950c] ftrace_ops_recurs_func+0x5c/0xb0 [ 927.139934] [a022b0d5] 0xa022b0d5 [ 927.141066] [81174b91] ? seccomp_phase1+0x1/0x230 [ 927.142199] [81174b95] seccomp_phase1+0x5/0x230 [ 927.143323] [8102c0a4] syscall_trace_enter_phase1+0xc4/0x150 [ 927.144450] [81174b95] ? seccomp_phase1+0x5/0x230 [ 927.145572] [8102c0a4] ? syscall_trace_enter_phase1+0xc4/0x150 [ 927.14] [817f9a9f] tracesys+0xd/0x44 [ 927.147723] Code: 48 8b 46 10 48 39 d0 76 2c 8b 85 fc fd ff ff 83 f8 20 77 21 83 c0 01 89 85 fc fd ff ff 48 8d 44 d6 80 48 8b 00 48 83 f8 00 74 0a 48 8b 40 20 48 83 c0 33 ff e0 48 89 d8 48 8b 9d d8 fd ff ff 4c [ 927.150046] RIP [a002440d] 0xa002440d The code section with the instructions that traps points into the eBPF JIT image of the root program (the one invoking the tail call instruction). Using bpf_jit_disasm -o on the eBPF root program image: [...] 4e: mov-0x204(%rbp),%eax 8b 85 fc fd ff ff 54: cmp$0x20,%eax --- if (tail_call_cnt MAX_TAIL_CALL_CNT) 83 f8 20 57: ja 0x007a 77 21 59: add$0x1,%eax--- tail_call_cnt++ 83 c0 01 5c: mov%eax,-0x204(%rbp) 89 85 fc fd ff ff 62: lea-0x80(%rsi,%rdx,8),%rax --- prog = array-prog[index] 48 8d 44 d6 80 67: mov(%rax),%rax 48 8b 00 6a: cmp$0x0,%rax--- check for NULL 48 83 f8 00 6e: je 0x007a 74 0a 70: mov0x20(%rax),%rax --- GPF triggered here! fetch of bpf_func 48 8b 40 20 [ matches 48 8b 40 20 ... from above ] 74: add$0x33,%rax --- prologue skip of new prog 48 83 c0 33 78: jmpq *%rax--- jump to new prog insns ff e0 [...] The problem is that rax has 5a5a5a5a5a5a5a5a, which suggests a tail call jump to map slot 0 is pointing to a poisoned page. The issue is the following: lea instruction has a wrong offset, i.e. it should be ... lea0x80(%rsi,%rdx,8),%rax ... but it actually seems to be ... lea -0x80(%rsi,%rdx,8),%rax ... where 0x80 is offsetof(struct bpf_array, prog), thus the offset needs to be positive instead of negative. Disassembling the
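The message is cut off at this point in the archive, so the following is only a reading of the bytes quoted above, not the author's own explanation. The lea is dumped as 48 8d 44 d6 80, where the final byte is a one-byte displacement, and x86 sign-extends one-byte displacements. A value of 0x80 therefore cannot be expressed that way. A minimal sketch of that arithmetic, with helper names invented for this example:

```c
#include <assert.h>
#include <stdint.h>

/* Value the CPU uses for a one-byte displacement: the byte is
 * sign-extended, so 0x80..0xff become negative. */
int32_t disp8_effective(uint8_t byte)
{
	return byte >= 0x80 ? (int32_t)byte - 0x100 : (int32_t)byte;
}

/* Can this offset be encoded as a one-byte displacement at all? */
int fits_disp8(int32_t value)
{
	return value >= -128 && value <= 127;
}
```

Here disp8_effective(0x80) is -128, which matches the -0x80 in the disassembly, and fits_disp8(0x80) is false, so an offset of 0x80 would need a wider displacement.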
[PATCH net] sctp: fix sockopt size check
The problem is not on being bigger than what we want, but on being smaller, as it causes read of invalid memory.

Note that the struct changes on commit 7e8616d8e773 didn't affect sctp_setsockopt_events one but that's where this check was flipped.

Fixes: 7e8616d8e773 ([SCTP]: Update AUTH structures to match declarations in draft-16.)
Signed-off-by: Marcelo Ricardo Leitner marcelo.leit...@gmail.com
---
 net/sctp/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 1425ec2bbd5ae359a8e0408a89a6da6bb60bd87e..6c4f0dac2104d38ba6420ce6740224866a2ece82 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -2195,7 +2195,7 @@ static int sctp_setsockopt_events(struct sock *sk, char __user *optval,
 	struct sctp_association *asoc;
 	struct sctp_ulpevent *event;

-	if (optlen > sizeof(struct sctp_event_subscribe))
+	if (optlen < sizeof(struct sctp_event_subscribe))
 		return -EINVAL;
 	if (copy_from_user(&sctp_sk(sk)->subscribe, optval, optlen))
 		return -EFAULT;
--
2.4.3
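The kind of length check discussed above can be modelled in userspace. This is a sketch under stated assumptions, not the SCTP code: struct event_subscribe is a made-up stand-in for struct sctp_event_subscribe, memcpy() stands in for copy_from_user(), and, unlike the patch, the sketch copies a fixed sizeof() amount instead of optlen, which is a design choice of this example only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical option structure, for illustration only. */
struct event_subscribe {
	unsigned char data_io;
	unsigned char association;
	unsigned char shutdown;
};

/* Reject buffers that are too short to hold the structure, then copy
 * exactly one structure. Returns 0 or a negative errno value. */
int set_events(struct event_subscribe *dst, const void *optval, size_t optlen)
{
	if (optlen < sizeof(*dst))
		return -EINVAL;	/* too short: the copy would read past optval */

	memcpy(dst, optval, sizeof(*dst));
	return 0;
}
```

A caller passing a two-byte buffer gets -EINVAL, while a buffer of at least sizeof(struct event_subscribe) bytes is accepted.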
Re: [PATCH net-next v4] af_mpls: fix undefined reference to ip6_route_output
On 28/07/15 07:40, Roopa Prabhu wrote: From: Roopa Prabhu ro...@cumulusnetworks.com Undefined reference to ip6_route_output and ip_route_output was reported with CONFIG_INET=n and CONFIG_IPV6=n. This patch adds new CONFIG_MPLS_NEXTHOP_DEVLOOKUP to lookup nexthop device if user has not specified it in RTA_OIF attribute. Make CONFIG_MPLS_NEXTHOP_DEVLOOKUP depend on INET and (IPV6 || IPV6=n) because it uses ip6_route_output and ip_route_output. Reported-by: kbuild test robot fengguang...@intel.com Reported-by: Thomas Graf tg...@suug.ch Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com Is there a compelling reason to allow the user/applications to not specify the output interface and to derive it from the nexthop? If the user/application intends to treat this as a recursive route then it has to make sure to trigger route updates to the kernel anyway, and an application should have the output interface and real nexthop close to hand in that case. If there isn't a compelling reason, then perhaps the best course of action is to revert the commit, instead of introducing a level of config complexity that means that users/applications may not be able to rely on this capability anyway? Thanks, Rob -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net-next v4] af_mpls: fix undefined reference to ip6_route_output
On 7/28/15, 7:17 AM, Robert Shearman wrote: On 28/07/15 07:40, Roopa Prabhu wrote: From: Roopa Prabhu ro...@cumulusnetworks.com Undefined reference to ip6_route_output and ip_route_output was reported with CONFIG_INET=n and CONFIG_IPV6=n. This patch adds new CONFIG_MPLS_NEXTHOP_DEVLOOKUP to lookup nexthop device if user has not specified it in RTA_OIF attribute. Make CONFIG_MPLS_NEXTHOP_DEVLOOKUP depend on INET and (IPV6 || IPV6=n) because it uses ip6_route_output and ip_route_output. Reported-by: kbuild test robot fengguang...@intel.com Reported-by: Thomas Graf tg...@suug.ch Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com Is there a compelling reason to allow the user/applications to not specify the output interface and to derive it from the nexthop? If the user/application intends to treat this as a recursive route then it has to make sure to trigger route updates to the kernel anyway, and an application should have the output interface and real nexthop close to hand in that case. RTA_OIF is optional for ipv4 and ipv6 routes and we wanted to keep it that way for mpls routes as well (Quagga is the application in our use case). It was a simple patch...until i realized the IPV6 dependency issues (I will sure remember this next time). If there isn't a compelling reason, then perhaps the best course of action is to revert the commit, instead of introducing a level of config complexity that means that users/applications may not be able to rely on this capability anyway? The config option though looks complex should not introduce any complexity for the user. It is on by default and always on for the default case. Only for the cases where the IPV6 is a loaded as a module and MPLS_ROUTING is not, the app may get family not supported errors. I did suggest a revert the first time. Mainly for me to fix the mistake i made and resubmit after proper IPV6 dependency testing. I am in the process of trying the option that hannes suggested. 
Thanks, Roopa -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [net-next 0/16] Proposal for VRF-lite - v3
On 7/27/15 2:30 PM, Eric W. Biederman wrote: This paragraph is false when it comes to sockets, as I have already pointed out. - VPN Routing and Forwarding (RFC4364 and it's kin) implies isolation strong enough to allow using the the same ip on different machines in different VPN instances and not have confusion. - The routing table is not the only table in the kernel that uses an ip address as a key. The result is that you can combine packets fragments that come in on different interfaces (irrespective of your VPN), confuse tcp parameters between interfaces, scramble your ipsec connections and I don't know what else. The duplicate IP address is a problem with the networking stack today; the VRF device does not introduce it. The VRF device does allow duplicate IP addresses within a namespace but separate VRFs, though yes various places that rely solely on source address like IP fragmentation do need to be fixed. I looked at the IPv4 fragmentation code yesterday and will continue today. So help me with the history: is there any reason why the device index is not used today? It seems like a straight forward change. 1. simple netdevices with the same IP address -- no problem using index in the lookup 2. 2 ipsec tunnels -- different netdevices, same IP address -- no problem using index 3. stacked devices like bonding and team interfaces appear to the stack as a single device -- no problem using index of stacked device 4. If an interface is deleted and a new one is created with the same IP address then we want to fail the lookup -- no problem using index 5. other??? Is there a use case where I can't add ifindex of the incoming device (or higher level device if skb-dev is changed) to the hash and lookup for fragments? Version 3 - addressed comments from first 2 RFCs with the exception of the name Nicolas: We will do the name conversion once we agree on what the correct name should be (vrf, mrf or something else) Not so. 
I described the deep problems between your goals and your implementation and they are not even mentioned let alone addressed. I have addressed comments to the extent that I can. As I stated in my last followup to you Eric I did not understand your point. I asked for clarification, a --verbose if you will. I can't read your mind, so I need you to elaborate on your points to be able to respond and address your concerns. - packets flow through the VRF device in both directions allowing the following: - tcpdump -i vrfn - tc rules on vrf device - netfilter rules on vrf device Ingo/Andy: I added you two as a start point for the proposed task related changes. Not sure who should be the reviewer; please let me know if someone else is more appropriate. Thanks. It looks like you are trying to implement a namespace that isn't a namespace. Given that it is broken by design you have my nack. This is an L3 separation within a namespace, not a device level separation which is what namespaces provide. David -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
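The question above about adding the ifindex of the incoming device to the hash and lookup for fragments can be illustrated with a small key structure. This is a hypothetical sketch: the field names and the mixing function are chosen for brevity and are not the kernel's fragment queue code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reassembly key; ifindex is the field under discussion. */
struct frag_key {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t id;
	uint8_t protocol;
	int ifindex;
};

/* Two fragments belong together only if every field matches,
 * including the device index. */
int frag_key_equal(const struct frag_key *a, const struct frag_key *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->id == b->id && a->protocol == b->protocol &&
	       a->ifindex == b->ifindex;
}

/* Simple multiplicative mixing step, used only for this example. */
static uint32_t mix(uint32_t h, uint32_t v)
{
	return (h ^ v) * 16777619u;
}

/* Hash over all key fields, with the device index mixed in last. */
uint32_t frag_key_hash(const struct frag_key *k)
{
	uint32_t h = 2166136261u;

	h = mix(h, k->saddr);
	h = mix(h, k->daddr);
	h = mix(h, ((uint32_t)k->protocol << 16) | k->id);
	h = mix(h, (uint32_t)k->ifindex);
	return h;
}
```

With such a key, fragments that carry the same addresses and IP ID but arrive on different devices are not combined.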
Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
On 7/28/15 10:01 AM, Eric Dumazet wrote: On Tue, 2015-07-28 at 14:19 +0200, Hannes Frederic Sowa wrote: Hello Eric, On Mon, 2015-07-27 at 15:33 -0500, Eric W. Biederman wrote: David Ahern d...@cumulusnetworks.com writes: Allow tasks to have a default device index for binding sockets. If set the value is passed to all AF_INET/AF_INET6 sockets when they are created. The task setting is passed parent to child on fork, but can be set or changed after task creation using prctl (if task has CAP_NET_ADMIN permissions). The setting for a socket can be retrieved using prctl(). This option allows an administrator to restrict a task to only send/receive packets through the specified device. In the case of VRF devices this option restricts tasks to a specific VRF. Correlation of the device index to a specific VRF, ie., ifindex -- VRF device -- VRF id is left to userspace. Nacked-by: Eric W. Biederman ebied...@xmission.com Because it is broken by design. Your routing device is only safe for programs that know it's limitations it is not appropriate for general applications. Since you don't even seen to know it's limitations I think this is a bad path to walk down. Can you please elaborate about the broken by design? Different operating systems are already using this approach with good success. I read your other mail regarding isolation of different VRFs and I agree that all code which persists state depending solely on the IP address is affected by this and this must be dealt with and fixed (actually, there aren't too many). But I wouldn't call that broken by design. This stuff will get fixed like e.g. cross-talk between fragmentation queues, icmp rate limiters etc, which could already happen in the past. What is your opinion on the fundamental approach only from a user perspective? Do you think that is broken, too? I agree with Eric here. This sk_bind_dev_if on task_struct is quite a hack. What will be added next ? An array of dev_if ? netfilter support ? af_packet support ? 
What about /proc files and netlink dumps ? It could just as easily be a pointer to a struct (e.g., struct net_ctx) such that the intrusion to task_struct is simply 8 bytes -- very similar to the nsproxy used for the assorted namespaces. The struct can then contain whatever network config is imposed on the task. We already have network namespaces. Extend this if needed, instead of bypassing them. Problems with using network namespaces for VRFs has been discussed in the past. e.g., http://www.spinics.net/lists/netdev/msg298368.html David No need to add something else (with lack of proper reporting for various tools) -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
On Jul 27, 2015 11:33 AM, David Ahern d...@cumulusnetworks.com wrote: Allow tasks to have a default device index for binding sockets. If set the value is passed to all AF_INET/AF_INET6 sockets when they are created. This is not intended to be a review of the concept. I haven't thought about whether the concept is a good idea, broken by design, or whatever. FWIW, if this were added to the kernel and didn't require excessive privilege, I'd probably use it. (I still don't really understand why binding to a device requires privilege in the first place, but, again, I haven't thought about it very much.) +#ifdef CONFIG_NET + case PR_SET_SK_BIND_DEV_IF: + { + struct net_device *dev; + int idx = (int) arg2; + + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + Can you either use ns_capable or add a comment as to why not? Also, please return -EINVAL if unused args are nonzero. + if (idx) { + dev = dev_get_by_index(me-nsproxy-net_ns, idx); + if (!dev) + return -EINVAL; + dev_put(dev); + } + me-sk_bind_dev_if = idx; + break; + } + case PR_GET_SK_BIND_DEV_IF: + { + struct task_struct *tsk; + int sk_bind_dev_if = -EINVAL; + + rcu_read_lock(); + tsk = find_task_by_vpid(arg2); + if (tsk) + sk_bind_dev_if = tsk-sk_bind_dev_if; Why do you support different tasks here? Could this use proc instead? The same -EINVAL issue applies. Also, I think you need to hook setns and unshare to do something reasonable when the task is bound to a device. --Andy -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
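The two review requests above (reject nonzero unused arguments, and check the privilege before changing state) can be sketched as a userspace model of the set path. Everything here is hypothetical: struct task_model, device_exists() and the has_net_admin flag only stand in for the task, the dev_get_by_index() lookup and whichever capability check (capable() or ns_capable()) ends up being used, and the order of the checks is a choice made for this example.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the per-task setting. */
struct task_model {
	int sk_bind_dev_if;
};

/* Stand-in for the device lookup: pretend indices 1..4 exist. */
static int device_exists(int idx)
{
	return idx >= 1 && idx <= 4;
}

/* Model of the PR_SET_SK_BIND_DEV_IF path with the suggested checks. */
long set_bind_dev_if(struct task_model *t, int has_net_admin,
		     unsigned long arg2, unsigned long arg3,
		     unsigned long arg4, unsigned long arg5)
{
	int idx = (int)arg2;

	/* Unused arguments must be zero so they can gain meaning later. */
	if (arg3 || arg4 || arg5)
		return -EINVAL;
	if (!has_net_admin)
		return -EPERM;
	/* Index 0 clears the binding; any other index must exist. */
	if (idx && !device_exists(idx))
		return -EINVAL;

	t->sk_bind_dev_if = idx;
	return 0;
}
```

Rejecting nonzero unused arguments from the start keeps the option of defining them later without breaking existing callers.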
Re: [PATCH 1/3] net: mdio-octeon: Modify driver to work on both ThunderX and Octeon
On 07/27/2015 07:14 PM, mohun...@gmail.com wrote: From: Radha Mohan Chintakuntla rchintakun...@cavium.com This patch modifies the mdio-octeon driver to work on both ThunderX and Octeon SoCs from Cavium Inc. Signed-off-by: Sunil Goutham sgout...@cavium.com Signed-off-by: Radha Mohan Chintakuntla rchintakun...@cavium.com Signed-off-by: David Daney david.da...@cavium.com --- drivers/net/phy/Kconfig |9 ++- drivers/net/phy/mdio-octeon.c | 122 +++- 2 files changed, 111 insertions(+), 20 deletions(-) diff --git a/drivers/net/phy/Kconfig b/drivers/net/phy/Kconfig index cf18940..0d6af19 100644 --- a/drivers/net/phy/Kconfig +++ b/drivers/net/phy/Kconfig @@ -145,13 +145,14 @@ config MDIO_GPIO will be called mdio-gpio. config MDIO_OCTEON - tristate Support for MDIO buses on Octeon SOCs - depends on CAVIUM_OCTEON_SOC + tristate Support for MDIO buses on Octeon and ThunderX SOCs + depends on 64BIT default y If it now depends only on 64BIT, we should probably remove the default. People building for x86 are not interested in this driver. [...] +#ifdef __BIG_ENDIAN_BITFIELD +#define OCT_MDIO_BITFIELD_FIELD(field, more) \ + field; \ + more + +#else +#define OCT_MDIO_BITFIELD_FIELD(field, more) \ + more\ + field; + +#endif + +union cvmx_smix_clk { + uint64_t u64; Perhaps: s/uint64_t/u64/ There are several of these. + struct cvmx_smix_clk_s { + OCT_MDIO_BITFIELD_FIELD(u64 reserved_25_63:39, + OCT_MDIO_BITFIELD_FIELD(u64 mode:1, + OCT_MDIO_BITFIELD_FIELD(u64 reserved_21_23:3, + OCT_MDIO_BITFIELD_FIELD(u64 sample_hi:5, + OCT_MDIO_BITFIELD_FIELD(u64 sample_mode:1, + OCT_MDIO_BITFIELD_FIELD(u64 reserved_14_14:1, + OCT_MDIO_BITFIELD_FIELD(u64 clk_idle:1, + OCT_MDIO_BITFIELD_FIELD(u64 preamble:1, + OCT_MDIO_BITFIELD_FIELD(u64 sample:4, + OCT_MDIO_BITFIELD_FIELD(u64 phase:8, + ;)) + } s; +}; + [...] -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
On Tue, 2015-07-28 at 14:19 +0200, Hannes Frederic Sowa wrote: Hello Eric, On Mon, 2015-07-27 at 15:33 -0500, Eric W. Biederman wrote: David Ahern d...@cumulusnetworks.com writes: Allow tasks to have a default device index for binding sockets. If set the value is passed to all AF_INET/AF_INET6 sockets when they are created. The task setting is passed parent to child on fork, but can be set or changed after task creation using prctl (if task has CAP_NET_ADMIN permissions). The setting for a socket can be retrieved using prctl(). This option allows an administrator to restrict a task to only send/receive packets through the specified device. In the case of VRF devices this option restricts tasks to a specific VRF. Correlation of the device index to a specific VRF, ie., ifindex -- VRF device -- VRF id is left to userspace. Nacked-by: Eric W. Biederman ebied...@xmission.com Because it is broken by design. Your routing device is only safe for programs that know it's limitations it is not appropriate for general applications. Since you don't even seen to know it's limitations I think this is a bad path to walk down. Can you please elaborate about the broken by design? Different operating systems are already using this approach with good success. I read your other mail regarding isolation of different VRFs and I agree that all code which persists state depending solely on the IP address is affected by this and this must be dealt with and fixed (actually, there aren't too many). But I wouldn't call that broken by design. This stuff will get fixed like e.g. cross-talk between fragmentation queues, icmp rate limiters etc, which could already happen in the past. What is your opinion on the fundamental approach only from a user perspective? Do you think that is broken, too? I agree with Eric here. This sk_bind_dev_if on task_struct is quite a hack. What will be added next ? An array of dev_if ? netfilter support ? af_packet support ? What about /proc files and netlink dumps ? 
We already have network namespaces. Extend this if needed, instead of bypassing them. No need to add something else (with lack of proper reporting for various tools)
Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
On 7/28/15 9:25 AM, Andy Lutomirski wrote:
> On Jul 27, 2015 11:33 AM, David Ahern <d...@cumulusnetworks.com> wrote:
>>
>> Allow tasks to have a default device index for binding sockets. If set
>> the value is passed to all AF_INET/AF_INET6 sockets when they are
>> created.
>
> This is not intended to be a review of the concept. I haven't thought
> about whether the concept is a good idea, broken by design, or
> whatever. FWIW, if this were added to the kernel and didn't require
> excessive privilege, I'd probably use it. (I still don't really
> understand why binding to a device requires privilege in the first
> place, but, again, I haven't thought about it very much.)

The intent here is to restrict a task to only sending and receiving packets from a single network device. The device can be a single ethernet interface, a stacked device (e.g., bond) or in our case a VRF device which restricts a task to interfaces (and hence network paths) associated with the VRF.

>> +#ifdef CONFIG_NET
>> +	case PR_SET_SK_BIND_DEV_IF:
>> +	{
>> +		struct net_device *dev;
>> +		int idx = (int) arg2;
>> +
>> +		if (!capable(CAP_NET_ADMIN))
>> +			return -EPERM;
>> +
>
> Can you either use ns_capable or add a comment as to why not?

will do.

> Also, please return -EINVAL if unused args are nonzero.

ok.

>> +		if (idx) {
>> +			dev = dev_get_by_index(me->nsproxy->net_ns, idx);
>> +			if (!dev)
>> +				return -EINVAL;
>> +			dev_put(dev);
>> +		}
>> +		me->sk_bind_dev_if = idx;
>> +		break;
>> +	}
>> +	case PR_GET_SK_BIND_DEV_IF:
>> +	{
>> +		struct task_struct *tsk;
>> +		int sk_bind_dev_if = -EINVAL;
>> +
>> +		rcu_read_lock();
>> +		tsk = find_task_by_vpid(arg2);
>> +		if (tsk)
>> +			sk_bind_dev_if = tsk->sk_bind_dev_if;
>
> Why do you support different tasks here? Could this use proc instead?

In this case we want to allow a separate process to determine if a task is restricted to a device.

> The same -EINVAL issue applies.
>
> Also, I think you need to hook setns and unshare to do something
> reasonable when the task is bound to a device.

ack on both.
Thanks for the review,
David
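The two review requests above (a capability check, and -EINVAL for nonzero unused arguments) can be sketched in a small userspace model. This is only an illustration under stated assumptions, not the code from the patch: `struct task_model`, `set_bind_dev_if()`, the `has_net_admin` flag and the `only_index_one_exists()` predicate are hypothetical stand-ins for the task struct field, the prctl handler, the capability check and `dev_get_by_index()`.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-task field added by the patch. */
struct task_model {
	int sk_bind_dev_if;
};

/* Hypothetical predicate: does a device with this index exist? */
typedef bool (*dev_exists_fn)(int ifindex);

/* Toy device table for the model: only ifindex 1 exists. */
static bool only_index_one_exists(int ifindex)
{
	return ifindex == 1;
}

/* Model of a PR_SET_SK_BIND_DEV_IF-style handler with argument validation. */
static long set_bind_dev_if(struct task_model *me, bool has_net_admin,
			    dev_exists_fn dev_exists,
			    unsigned long arg2, unsigned long arg3,
			    unsigned long arg4, unsigned long arg5)
{
	int idx = (int)arg2;

	if (!has_net_admin)
		return -EPERM;

	/* Unused arguments must be zero, as requested in the review. */
	if (arg3 || arg4 || arg5)
		return -EINVAL;

	/* Index 0 clears the binding; any other index must name a device. */
	if (idx != 0 && !dev_exists(idx))
		return -EINVAL;

	me->sk_bind_dev_if = idx;
	return 0;
}
```

Rejecting nonzero unused arguments up front keeps them available for a later extension without changing the behaviour seen by existing callers.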
[PATCH 0/8] Use correctly the Xen memory terminologies in Linux
Hi all,

This patch series aims to use the memory terminologies described in include/linux/mm.h [1] for Linux xen code.

Linux mistakenly uses MFN where GFN is meant; I suspect this is because the first Xen support was for PV. This has led to incorrect implementations of memory helpers on ARM and confuses developers about the expected behavior.

For instance, based on its name we expect pfn_to_mfn to return an MFN. However, the x86 implementation returns a GFN, and most of the callers also use it this way.

The first 2 patches of this series are ARM related: they remove PV specific helpers which should not be used and fix the implementation of pfn_to_mfn.

The rest of the series renames most of the usage of MFN in the common code to GFN. I also took the opportunity to replace most of the calls to pfn_to_gfn in the common code by page_to_gfn, to avoid constructions such as pfn_to_gfn(page_to_pfn(...)).

Note that the one in xen-blkfront will be dropped by the 64K series [2]; I can include it if necessary.

This series is based on Linux 4.2-rc4. A branch with all the patches can be found here:
    git://xenbits.xen.org/people/julieng/linux-arm.git branch page-renaming-v1

Sincerely yours,

[1] Xen tree: e758ed14f390342513405dd766e874934573e6cb
[2] https://lkml.org/lkml/2015/7/9/628

Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: David Vrabel <david.vra...@citrix.com>
Cc: Dmitry Torokhov <dmitry.torok...@gmail.com>
Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
Cc: H. Peter Anvin <h...@zytor.com>
Cc: Ian Campbell <ian.campb...@citrix.com>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: James E.J. Bottomley <jbottom...@odin.com>
Cc: Jean-Christophe Plagniol-Villard <plagn...@jcrosoft.com>
Cc: Jiri Slaby <jsl...@suse.com>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: linux-...@vger.kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-fb...@vger.kernel.org
Cc: linux-in...@vger.kernel.org
Cc: linuxppc-...@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: Roger Pau Monné <roger@citrix.com>
Cc: Russell King <li...@arm.linux.org.uk>
Cc: Stefano Stabellini <stefano.stabell...@eu.citrix.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: Tomi Valkeinen <tomi.valkei...@ti.com>
Cc: Wei Liu <wei.l...@citrix.com>
Cc: x...@kernel.org

Julien Grall (8):
  arm/xen: Remove helpers which are PV specific
  xen: Make clear that swiotlb and biomerge are dealing with DMA address
  arm/xen: implement correctly pfn_to_mfn
  xen: Use the correctly the Xen memory terminologies
  xen/tmem: Use page_to_gfn rather than pfn_to_gfn
  video/xen-fbfront: Further s/MFN/GFN clean-up
  hvc/xen: Further s/MFN/GFN clean-up
  xen/privcmd: Further s/MFN/GFN/ clean-up

 arch/arm/include/asm/xen/page.h         | 44 +++---
 arch/arm/xen/enlighten.c                | 18 ++---
 arch/arm/xen/mm.c                       |  4 +--
 arch/x86/include/asm/xen/page.h         | 34 +--
 arch/x86/xen/enlighten.c                |  4 +--
 arch/x86/xen/mmu.c                      | 48 -
 arch/x86/xen/p2m.c                      | 32 +++---
 arch/x86/xen/setup.c                    | 12 -
 arch/x86/xen/smp.c                      |  4 +--
 arch/x86/xen/suspend.c                  |  8 +++---
 drivers/block/xen-blkfront.c            |  6 ++---
 drivers/input/misc/xen-kbdfront.c       |  4 +--
 drivers/net/xen-netback/netback.c       |  4 +--
 drivers/net/xen-netfront.c              |  8 +++---
 drivers/scsi/xen-scsifront.c            |  8 +++---
 drivers/tty/hvc/hvc_xen.c               | 18 +
 drivers/video/fbdev/xen-fbfront.c       | 20 +++---
 drivers/xen/balloon.c                   |  2 +-
 drivers/xen/biomerge.c                  |  6 ++---
 drivers/xen/events/events_base.c        |  2 +-
 drivers/xen/events/events_fifo.c        |  4 +--
 drivers/xen/gntalloc.c                  |  3 ++-
 drivers/xen/manage.c                    |  2 +-
 drivers/xen/privcmd.c                   | 44 +++---
 drivers/xen/swiotlb-xen.c               | 16 +--
 drivers/xen/tmem.c                      | 21 +--
 drivers/xen/xenbus/xenbus_client.c      |  2 +-
 drivers/xen/xenbus/xenbus_dev_backend.c |  2 +-
 drivers/xen/xenbus/xenbus_probe.c       |  8 +++---
 drivers/xen/xlate_mmu.c                 | 18 ++---
 include/uapi/xen/privcmd.h              |  4 +++
 include/xen/page.h                      |  4 +--
 include/xen/xen-ops.h                   | 10 +++
 33 files changed, 210 insertions(+), 214 deletions(-)

-- 
2.1.4
Re: [PATCH net-next v4] af_mpls: fix undefined reference to ip6_route_output
On 7/28/15, 6:04 AM, Hannes Frederic Sowa wrote:
> On Mon, 2015-07-27 at 23:40 -0700, Roopa Prabhu wrote:
>> From: Roopa Prabhu <ro...@cumulusnetworks.com>
>>
>> Undefined reference to ip6_route_output and ip_route_output
>> was reported with CONFIG_INET=n and CONFIG_IPV6=n.
>>
>> This patch adds new CONFIG_MPLS_NEXTHOP_DEVLOOKUP
>> to lookup nexthop device if user has not specified it
>> in RTA_OIF attribute. Make CONFIG_MPLS_NEXTHOP_DEVLOOKUP
>> depend on INET and (IPV6 || IPV6=n) because it
>> uses ip6_route_output and ip_route_output.
>>
>> Reported-by: kbuild test robot <fengguang...@intel.com>
>> Reported-by: Thomas Graf <tg...@suug.ch>
>> Signed-off-by: Roopa Prabhu <ro...@cumulusnetworks.com>
>> ---
>> v1 -> v2: use IS_BUILTIN
>> v2 -> v3: Use new Kconfig option that depends on (IPV6 || IPV6=n)
>>           as suggested by Dave. Also uses IS_ERR as suggested by Thomas.
>> v3 -> v4: Include missed case of (MPLS_ROUTING=y && IPV6=m)
>>           reported by Dave.
>>
>>  net/mpls/Kconfig   |  8
>>  net/mpls/af_mpls.c | 19 ++-
>>  2 files changed, 26 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/mpls/Kconfig b/net/mpls/Kconfig
>> index 5c467ef..134764e 100644
>> --- a/net/mpls/Kconfig
>> +++ b/net/mpls/Kconfig
>> @@ -33,4 +33,12 @@ config MPLS_IPTUNNEL
>>  	---help---
>>  	 mpls ip tunnel support.
>>
>> +config MPLS_NEXTHOP_DEVLOOKUP
>> +	bool "MPLS: nexthop oif dev lookup"
>> +	depends on MPLS_ROUTING && INET && \
>> +		((IPV6 && !(MPLS_ROUTING=y && IPV6=m)) || IPV6=n)
>> +	---help---
>> +	 This enables mpls route nexthop dev lookup when oif is not
>> +	 specified by user
>> +
>
> Urks. Can't you simply use ipv6_stub_impl.ipv6_dst_lookup with sk=NULL
> to do that and don't have a run-time dependency on IPv6 at all (for the
> cost of a function pointer).

I did not realize that this could be an option. I now see vxlan using it. I will try it out.

> Maybe same for IPv4?

I would prefer leaving IPv4 alone with CONFIG_INET. IPv6 was my problem case. Let me see if I can fix that first without introducing a config option.
Thanks,
Roopa
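The suggestion discussed above, reaching IPv6 functionality through a function pointer instead of a direct symbol reference, can be sketched as a userspace model. Everything here is hypothetical and only loosely modelled on the idea of an IPv6 stub: `struct v6_stub_ops`, `v6_module_load()`, `find_nexthop_dev()` and the toy lookup that returns ifindex 2 are illustrative names, not kernel APIs.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table; callers never reference an IPv6 symbol directly. */
struct v6_stub_ops {
	int (*dst_lookup)(int oif_hint, int *out_ifindex);
};

/* Default implementation used while the "IPv6 module" is absent. */
static int dst_lookup_unsupported(int oif_hint, int *out_ifindex)
{
	(void)oif_hint;
	(void)out_ifindex;
	return -EAFNOSUPPORT;
}

static const struct v6_stub_ops default_ops = {
	.dst_lookup = dst_lookup_unsupported,
};

/* The single pointer every caller goes through. */
static const struct v6_stub_ops *v6_stub = &default_ops;

/* Toy implementation a loaded "IPv6 module" might install. */
static int dst_lookup_real(int oif_hint, int *out_ifindex)
{
	*out_ifindex = oif_hint ? oif_hint : 2;
	return 0;
}

static const struct v6_stub_ops real_ops = {
	.dst_lookup = dst_lookup_real,
};

static void v6_module_load(void)   { v6_stub = &real_ops; }
static void v6_module_unload(void) { v6_stub = &default_ops; }

/* Caller side: links and runs whether or not the "module" is present. */
static int find_nexthop_dev(int *out_ifindex)
{
	return v6_stub->dst_lookup(0, out_ifindex);
}
```

The caller pays one indirect call, and in exchange there is no link-time dependency to express in Kconfig.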
RE: [PATCH net 1/2] r8152: add pre_reset and post_reset
Oliver Neukum [mailto:oneu...@suse.com]
> Sent: Tuesday, July 28, 2015 4:53 PM
[...]
> > +		return 0;
> > +
> > +	netdev = tp->netdev;
> > +	if (!netif_running(netdev))
> > +		return 0;
> > +
> > +	ret = usb_autopm_get_interface(intf);
> > +	if (ret < 0)
> > +		return ret;
>
> What sense does this make?
[...]
> > +		return 0;
> > +
> > +	netdev = tp->netdev;
> > +	if (!netif_running(netdev))
> > +		return 0;
> > +
> > +	ret = usb_autopm_get_interface(intf);
>
> The device will be awake.

I wasn't sure whether the device would be in runtime suspend, so I woke it up myself. I think you mean that I don't have to do this. I will remove these calls and resend the patch. Thanks.

Best Regards,
Hayes
[PATCH V4 2/7] Drivers: hv: vmbus: define a new VMBus message type for hvsock
A function to send the type of message is also added.

The coming net/hvsock driver will use this function to proactively request the host to offer a VMBus channel for a new hvsock connection.

Signed-off-by: Dexuan Cui <de...@microsoft.com>
---
 drivers/hv/channel.c      | 15 +++
 drivers/hv/channel_mgmt.c |  4
 include/linux/hyperv.h    | 13 +
 3 files changed, 32 insertions(+)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index 603ce97..b09d1b7 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -218,6 +218,21 @@ error0:
 }
 EXPORT_SYMBOL_GPL(vmbus_open);
 
+/* Used for Hyper-V Socket: a guest client's connect() to the host */
+int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id,
+				  const uuid_le *shv_host_servie_id)
+{
+	struct vmbus_channel_tl_connect_request conn_msg;
+
+	memset(&conn_msg, 0, sizeof(conn_msg));
+	conn_msg.header.msgtype = CHANNELMSG_TL_CONNECT_REQUEST;
+	conn_msg.guest_endpoint_id = *shv_guest_servie_id;
+	conn_msg.host_service_id = *shv_host_servie_id;
+
+	return vmbus_post_msg(&conn_msg, sizeof(conn_msg));
+}
+EXPORT_SYMBOL_GPL(vmbus_send_tl_connect_request);
+
 /*
  * create_gpadl_header - Creates a gpadl for the specified buffer
  */
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 4506a66..7018c53 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -772,6 +772,10 @@ struct vmbus_channel_message_table_entry
 	{CHANNELMSG_VERSION_RESPONSE,	1, vmbus_onversion_response},
 	{CHANNELMSG_UNLOAD,		0, NULL},
 	{CHANNELMSG_UNLOAD_RESPONSE,	1, vmbus_unload_response},
+	{CHANNELMSG_18,			0, NULL},
+	{CHANNELMSG_19,			0, NULL},
+	{CHANNELMSG_20,			0, NULL},
+	{CHANNELMSG_TL_CONNECT_REQUEST,	0, NULL},
 };
 
 /*
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 2ca3ac1..264093a 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -393,6 +393,10 @@ enum vmbus_channel_message_type {
 	CHANNELMSG_VERSION_RESPONSE		= 15,
 	CHANNELMSG_UNLOAD			= 16,
 	CHANNELMSG_UNLOAD_RESPONSE		= 17,
+	CHANNELMSG_18				= 18,
+	CHANNELMSG_19				= 19,
+	CHANNELMSG_20				= 20,
+	CHANNELMSG_TL_CONNECT_REQUEST		= 21,
 	CHANNELMSG_COUNT
 };
 
@@ -563,6 +567,13 @@ struct vmbus_channel_initiate_contact {
 	u64 monitor_page2;
 } __packed;
 
+/* Hyper-V socket: guest's connect()-ing to host */
+struct vmbus_channel_tl_connect_request {
+	struct vmbus_channel_message_header header;
+	uuid_le guest_endpoint_id;
+	uuid_le host_service_id;
+} __packed;
+
 struct vmbus_channel_version_response {
 	struct vmbus_channel_message_header header;
 	u8 version_supported;
@@ -1248,4 +1259,6 @@ extern struct resource hyperv_mmio;
 
 extern __u32 vmbus_proto_version;
 
+int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id,
+				  const uuid_le *shv_host_servie_id);
 #endif /* _HYPERV_H */
-- 
2.1.0
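The patch does not say why the placeholder types CHANNELMSG_18 to CHANNELMSG_20 are added together with the new message. One plausible reading, assumed here and not confirmed by the posting, is that the message table is indexed directly by message type, so the array must have one entry per enum value with no gaps. The following userspace sketch illustrates that invariant with a toy enum; all names (`enum msg_type`, `msg_table`, `msg_lookup()`) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy message types; MSG_2 and MSG_3 are unused placeholders. */
enum msg_type {
	MSG_A = 0,
	MSG_B = 1,
	MSG_2 = 2,
	MSG_3 = 3,
	MSG_NEW = 4,
	MSG_COUNT
};

struct msg_table_entry {
	enum msg_type type;
	bool has_handler;
};

/* One entry per enum value, in enum order, so table[type] is valid. */
static const struct msg_table_entry msg_table[] = {
	{ MSG_A,   true  },
	{ MSG_B,   false },
	{ MSG_2,   false },	/* placeholder keeps the index aligned */
	{ MSG_3,   false },	/* placeholder keeps the index aligned */
	{ MSG_NEW, false },
};

#define MSG_TABLE_LEN (sizeof(msg_table) / sizeof(msg_table[0]))

/* Direct lookup by type; returns NULL for out-of-range values. */
static const struct msg_table_entry *msg_lookup(enum msg_type type)
{
	if ((size_t)type >= MSG_TABLE_LEN)
		return NULL;
	return &msg_table[type];
}

/* Sanity check: every slot holds the entry for its own index. */
static bool msg_table_is_consistent(void)
{
	size_t i;

	if (MSG_TABLE_LEN != MSG_COUNT)
		return false;
	for (i = 0; i < MSG_TABLE_LEN; i++)
		if ((size_t)msg_table[i].type != i)
			return false;
	return true;
}
```

Without the placeholder rows, `msg_lookup(MSG_NEW)` would index past the end of a three-entry table.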
[PATCH V4 3/7] Drivers: hv: vmbus: add APIs to send/recv hvsock packet and get the r/w-ability
This will be used by the coming net/hvsock driver. Signed-off-by: Dexuan Cui de...@microsoft.com --- drivers/hv/channel.c | 134 ++ drivers/hv/hyperv_vmbus.h | 4 ++ drivers/hv/ring_buffer.c | 14 + include/linux/hyperv.h| 32 +++ 4 files changed, 184 insertions(+) diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c index b09d1b7..531a142 100644 --- a/drivers/hv/channel.c +++ b/drivers/hv/channel.c @@ -758,6 +758,53 @@ int vmbus_sendpacket_pagebuffer_ctl(struct vmbus_channel *channel, EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer_ctl); /* + * vmbus_sendpacket_hvsock - Send the hvsock payload 'buf' into the vmbus + * ringbuffer + */ +int vmbus_sendpacket_hvsock(struct vmbus_channel *channel, void *buf, u32 len) +{ + struct vmpipe_proto_header pipe_hdr; + struct vmpacket_descriptor desc; + struct kvec bufferlist[4]; + u32 packetlen_aligned; + u32 packetlen; + u64 aligned_data = 0; + bool signal = false; + int ret; + + packetlen = HVSOCK_HEADER_LEN + len; + packetlen_aligned = ALIGN(packetlen, sizeof(u64)); + + /* Setup the descriptor */ + desc.type = VM_PKT_DATA_INBAND; + /* in 8-bytes granularity */ + desc.offset8 = sizeof(struct vmpacket_descriptor) 3; + desc.len8 = (u16)(packetlen_aligned 3); + desc.flags = 0; + desc.trans_id = 0; + + pipe_hdr.pkt_type = 1; + pipe_hdr.data_size = len; + + bufferlist[0].iov_base = desc; + bufferlist[0].iov_len = sizeof(struct vmpacket_descriptor); + bufferlist[1].iov_base = pipe_hdr; + bufferlist[1].iov_len = sizeof(struct vmpipe_proto_header); + bufferlist[2].iov_base = buf; + bufferlist[2].iov_len = len; + bufferlist[3].iov_base = aligned_data; + bufferlist[3].iov_len = packetlen_aligned - packetlen; + + ret = hv_ringbuffer_write(channel-outbound, bufferlist, 4, signal); + + if (ret == 0 signal) + vmbus_setevent(channel); + + return ret; +} +EXPORT_SYMBOL_GPL(vmbus_sendpacket_hvsock); + +/* * vmbus_sendpacket_pagebuffer - Send a range of single-page buffer * packets using a GPADL Direct packet type. 
*/ @@ -978,3 +1025,90 @@ int vmbus_recvpacket_raw(struct vmbus_channel *channel, void *buffer, return ret; } EXPORT_SYMBOL_GPL(vmbus_recvpacket_raw); + +/* + * vmbus_recvpacket_hvsock - Receive the hvsock payload from the vmbus + * ringbuffer into the 'buffer'. + */ +int vmbus_recvpacket_hvsock(struct vmbus_channel *channel, void *buffer, + u32 bufferlen, u32 *buffer_actual_len) +{ + struct vmpipe_proto_header *pipe_hdr; + struct vmpacket_descriptor *desc; + u32 packet_len, payload_len; + bool signal = false; + int ret; + + *buffer_actual_len = 0; + + if (bufferlen HVSOCK_HEADER_LEN) + return -ENOBUFS; + + ret = hv_ringbuffer_peek(channel-inbound, buffer, +HVSOCK_HEADER_LEN); + if (ret != 0) + return ret; + + desc = (struct vmpacket_descriptor *)buffer; + packet_len = desc-len8 3; + if (desc-type != VM_PKT_DATA_INBAND || + desc-offset8 != (sizeof(*desc) / 8) || + packet_len HVSOCK_HEADER_LEN) + return -EIO; + + pipe_hdr = (struct vmpipe_proto_header *)(desc + 1); + payload_len = pipe_hdr-data_size; + + if (pipe_hdr-pkt_type != 1 || payload_len == 0) + return -EIO; + + if (HVSOCK_PKT_LEN(payload_len) != packet_len + PREV_INDICES_LEN) + return -EIO; + + if (bufferlen packet_len - HVSOCK_HEADER_LEN) + return -ENOBUFS; + + /* Copy over the hvsock payload to the user buffer */ + ret = hv_ringbuffer_read(channel-inbound, buffer, +packet_len - HVSOCK_HEADER_LEN, +HVSOCK_HEADER_LEN, signal); + if (ret != 0) + return ret; + + *buffer_actual_len = payload_len; + + if (signal) + vmbus_setevent(channel); + + return 0; +} +EXPORT_SYMBOL_GPL(vmbus_recvpacket_hvsock); + +/* + * vmbus_get_hvsock_rw_status - can the ringbuffer be read/written? 
+ */ +void vmbus_get_hvsock_rw_status(struct vmbus_channel *channel, + bool *can_read, bool *can_write) +{ + u32 avl_read_bytes, avl_write_bytes, dummy; + + if (can_read != NULL) { + hv_get_ringbuffer_available_space(channel-inbound, + avl_read_bytes, + dummy); + *can_read = avl_read_bytes = HVSOCK_MIN_PKT_LEN; + } + + /* +* We write into the ringbuffer only when we're able to write a +* a payload of 4096 bytes (the actual written payload's length may be +* less than
[PATCH V4 6/7] hvsock: introduce Hyper-V VM Sockets feature
Hyper-V VM Sockets (hvsock) provides a byte-stream based communication mechanism between the host and a guest. It's kind of TCP over VMBus, but the transportation layer (VMBus) is much simpler than IP.

With Hyper-V VM Sockets, applications between the host and a guest can talk with each other directly by the traditional BSD-style socket APIs.

Hyper-V VM Sockets is only available on Windows 10 host and later. The patch implements the necessary support in the guest side by introducing a new socket address family AF_HYPERV.

Signed-off-by: Dexuan Cui <de...@microsoft.com>
---

Changes since v1:
- added __init and __exit for the module init/exit functions
- net/hv_sock/Kconfig: "default m" -> "default m if HYPERV"
- MODULE_LICENSE: "Dual MIT/GPL" -> "Dual BSD/GPL"

Changes since v2:
- fixed indentation issues
- removed pr_debug

I know the kernel already has a VM Sockets driver (AF_VSOCK) based on VMware's VMCI (net/vmw_vsock/, drivers/misc/vmw_vmci), and KVM is proposing AF_VSOCK of virtio version:
http://thread.gmane.org/gmane.linux.network/365205.

However, though Hyper-V VM Sockets may seem conceptually similar to AF_VSOCK, there are differences in the transportation layer, and IMO these make direct code reuse impractical:

1. In AF_VSOCK, the endpoint type is: <u32 ContextID, u32 Port>, but in AF_HYPERV, the endpoint type is: <GUID VM_ID, GUID ServiceID>. Here GUID is 128-bit.

2. AF_VSOCK supports SOCK_DGRAM, while AF_HYPERV doesn't.

3. AF_VSOCK supports some special sock opts, like SO_VM_SOCKETS_BUFFER_SIZE, SO_VM_SOCKETS_BUFFER_MIN/MAX_SIZE and SO_VM_SOCKETS_CONNECT_TIMEOUT. These are meaningless to AF_HYPERV.

4. Some of AF_VSOCK's VMCI transportation ops are meaningless to AF_HYPERV/VMBus, like
    .notify_recv_init
    .notify_recv_pre_block
    .notify_recv_pre_dequeue
    .notify_recv_post_dequeue
    .notify_send_init
    .notify_send_pre_block
    .notify_send_pre_enqueue
    .notify_send_post_enqueue
etc.

So I think we'd better introduce a new address family: AF_HYPERV.
MAINTAINERS |2 + include/linux/socket.h |4 +- include/net/af_hvsock.h | 44 ++ include/uapi/linux/hyperv.h | 16 + net/Kconfig |1 + net/Makefile|1 + net/hv_sock/Kconfig | 10 + net/hv_sock/Makefile|3 + net/hv_sock/af_hvsock.c | 1430 +++ 9 files changed, 1510 insertions(+), 1 deletion(-) create mode 100644 include/net/af_hvsock.h create mode 100644 net/hv_sock/Kconfig create mode 100644 net/hv_sock/Makefile create mode 100644 net/hv_sock/af_hvsock.c diff --git a/MAINTAINERS b/MAINTAINERS index e7bdbac..a4a7e03 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -4941,7 +4941,9 @@ F:drivers/input/serio/hyperv-keyboard.c F: drivers/net/hyperv/ F: drivers/scsi/storvsc_drv.c F: drivers/video/fbdev/hyperv_fb.c +F: net/hv_sock/ F: include/linux/hyperv.h +F: include/net/af_hvsock.h F: tools/hv/ I2C OVER PARALLEL PORT diff --git a/include/linux/socket.h b/include/linux/socket.h index 5bf59c8..d5ef612 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -200,7 +200,8 @@ struct ucred { #define AF_ALG 38 /* Algorithm sockets*/ #define AF_NFC 39 /* NFC sockets */ #define AF_VSOCK 40 /* vSockets */ -#define AF_MAX 41 /* For now.. */ +#define AF_HYPERV 41 /* Hyper-V virtual sockets */ +#define AF_MAX 42 /* For now.. */ /* Protocol families, same as address families. */ #define PF_UNSPEC AF_UNSPEC @@ -246,6 +247,7 @@ struct ucred { #define PF_ALG AF_ALG #define PF_NFC AF_NFC #define PF_VSOCK AF_VSOCK +#define PF_HYPERV AF_HYPERV #define PF_MAX AF_MAX /* Maximum queue length specifiable by listen. 
*/ diff --git a/include/net/af_hvsock.h b/include/net/af_hvsock.h new file mode 100644 index 000..9951658 --- /dev/null +++ b/include/net/af_hvsock.h @@ -0,0 +1,44 @@ +#ifndef __AF_HVSOCK_H__ +#define __AF_HVSOCK_H__ + +#include linux/kernel.h +#include linux/hyperv.h +#include net/sock.h + +#define VMBUS_RINGBUFFER_SIZE_HVSOCK_RECV (5 * PAGE_SIZE) +#define VMBUS_RINGBUFFER_SIZE_HVSOCK_SEND (5 * PAGE_SIZE) + +#define HVSOCK_RCV_BUF_SZ VMBUS_RINGBUFFER_SIZE_HVSOCK_RECV +#define HVSOCK_SND_BUF_SZ PAGE_SIZE + +#define sk_to_hvsock(__sk)((struct hvsock_sock *)(__sk)) +#define hvsock_to_sk(__hvsk) ((struct sock *)(__hvsk)) + +struct hvsock_sock { + /* sk must be the first member. */ + struct sock sk; + + struct sockaddr_hv local_addr; + struct sockaddr_hv remote_addr; + + /* protected by the global hvsock_mutex */ + struct list_head bound_list; + struct list_head connected_list; + + struct list_head accept_queue; + /* used by enqueue and
[PATCH V4 7/7] Drivers: hv: vmbus: disable local interrupt when hvsock's callback is running
In the SMP guest case, when the per-channel callback hvsock_events() is running on virtual CPU A and the guest tries to close the connection on virtual CPU B, we invoke vmbus_close() -> vmbus_close_internal() and can run into trouble: on B, vmbus_close_internal() will send the IPI reset_channel_cb() to A, trying to set channel->onchannel_callback to NULL; on A, if the IPI handler runs between "if (channel->onchannel_callback != NULL)" and the invocation of channel->onchannel_callback, we'll invoke a NULL function pointer.

This is why the patch is necessary.

Signed-off-by: Dexuan Cui <de...@microsoft.com>
---
 drivers/hv/connection.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index 4fc2e88..4766fd8 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -319,6 +319,9 @@ static void process_chn_event(u32 relid)
 	void *arg;
 	bool read_state;
 	u32 bytes_to_read;
+	bool is_hvsock = false;
+
+	local_irq_disable();
 
 	/*
 	 * Find the channel based on this relid and invokes the
@@ -327,7 +330,11 @@ static void process_chn_event(u32 relid)
 	channel = pcpu_relid2channel(relid);
 
 	if (!channel)
-		return;
+		goto out;
+
+	is_hvsock = is_hvsock_channel(channel);
+	if (!is_hvsock)
+		local_irq_enable();
 
 	/*
 	 * A channel once created is persistent even when there
@@ -363,6 +370,12 @@ static void process_chn_event(u32 relid)
 			bytes_to_read = 0;
 		} while (read_state && (bytes_to_read != 0));
 	}
+
+	/* local_irq_enable() is already invoked above */
+	if (!is_hvsock)
+		return;
+out:
+	local_irq_enable();
 }
 
 /*
-- 
2.1.0
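The window described in the commit message (the pointer is tested, then cleared by the IPI handler, then called) can be modelled in a single-threaded userspace sketch. Note that this shows a different technique from the patch: the patch disables local interrupts around the callback, whereas the sketch reads the pointer once into a local variable. The names `struct chan_model`, `reset_cb()` and `invoke_with_snapshot()` are hypothetical, the "interrupt" is simulated by an explicit call, and real kernel code would additionally need a READ_ONCE()-style access. A snapshot also still lets the callback run after the reset, which may or may not be acceptable depending on what state the callback touches.

```c
#include <stddef.h>

typedef void (*chan_cb)(void *ctx);

/* Hypothetical stand-in for the channel and its callback field. */
struct chan_model {
	chan_cb on_callback;
	void *ctx;
};

/* Stand-in for the IPI handler: clears the callback pointer. */
static void reset_cb(struct chan_model *c)
{
	c->on_callback = NULL;
}

/* Toy callback: counts how many times it ran. */
static void count_call(void *ctx)
{
	++*(int *)ctx;
}

/*
 * Check-then-call through a local snapshot.  'interrupt' simulates the
 * IPI arriving between the NULL check and the call.
 * Returns 1 if the callback ran, 0 if the pointer was already NULL.
 */
static int invoke_with_snapshot(struct chan_model *c,
				void (*interrupt)(struct chan_model *))
{
	chan_cb cb = c->on_callback;	/* read the pointer exactly once */

	if (cb == NULL)
		return 0;
	if (interrupt)
		interrupt(c);		/* the field may become NULL here */
	cb(c->ctx);			/* calls the snapshot, not the field */
	return 1;
}
```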
[PATCH V4 4/7] Drivers: hv: vmbus: add APIs to register callbacks to process hvsock connection
With the 2 APIs supplied by the VMBus driver, the coming net/hvsock driver can register 2 callbacks and can know when a new hvsock connection is offered by the host, and when a hvsock connection is being closed by the host. Signed-off-by: Dexuan Cui de...@microsoft.com --- drivers/hv/Makefile | 4 ++- drivers/hv/channel_mgmt.c | 9 ++ drivers/hv/hvsock_callbacks.c | 71 +++ include/linux/hyperv.h| 10 ++ 4 files changed, 93 insertions(+), 1 deletion(-) create mode 100644 drivers/hv/hvsock_callbacks.c diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile index 39c9b2c..ef6f8a8 100644 --- a/drivers/hv/Makefile +++ b/drivers/hv/Makefile @@ -4,5 +4,7 @@ obj-$(CONFIG_HYPERV_BALLOON)+= hv_balloon.o hv_vmbus-y := vmbus_drv.o \ hv.o connection.o channel.o \ -channel_mgmt.o ring_buffer.o +channel_mgmt.o ring_buffer.o \ +hvsock_callbacks.o + hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_fcopy.o hv_utils_transport.o diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 7018c53..a8b1e61 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -300,6 +300,12 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel) return; } + if (is_hvsock_channel(newchannel)) { + if (hvsock_process_offer(newchannel) != 0) + goto err_deq_chan; + return; + } + /* * Start the process of binding this offer to the driver * We need to set the DeviceObject field before calling @@ -564,7 +570,10 @@ static void vmbus_onoffer_rescind(struct vmbus_channel_message_header *hdr) vmbus_device_unregister(channel-device_obj); put_device(dev); } + } else if (is_hvsock_channel(channel)) { + hvsock_process_offer_rescind(channel); } else { + /* it is a sub-channel. */ hv_process_channel_removal(channel, channel-offermsg.child_relid); } diff --git a/drivers/hv/hvsock_callbacks.c b/drivers/hv/hvsock_callbacks.c new file mode 100644 index 000..28f7b75 --- /dev/null +++ b/drivers/hv/hvsock_callbacks.c @@ -0,0 +1,71 @@ +/* + * Copyright (c) 2015, Microsoft Corporation. 
+ * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + */ +#define pr_fmt(fmt) KBUILD_MODNAME : fmt + +#include linux/hyperv.h + +/* We should hold the mutex when getting/setting the function pointers */ +static DEFINE_MUTEX(hvsock_cb_mutex); +static int (*__process_offer)(struct vmbus_channel *channel); +static void (*__process_offer_rescind)(struct vmbus_channel *channel); + +int hvsock_process_offer(struct vmbus_channel *channel) +{ + int ret = -ENODEV; + + mutex_lock(hvsock_cb_mutex); + + if (__process_offer != NULL) + ret = __process_offer(channel); + + mutex_unlock(hvsock_cb_mutex); + + return ret; +} + +void hvsock_process_offer_rescind(struct vmbus_channel *channel) +{ + mutex_lock(hvsock_cb_mutex); + + if (__process_offer_rescind != NULL) + __process_offer_rescind(channel); + else + hv_process_channel_removal(channel, + channel-offermsg.child_relid); + + mutex_unlock(hvsock_cb_mutex); +} + +void vmbus_register_hvsock_callbacks( + int (*process_offer)(struct vmbus_channel *), + void (*process_offer_rescind)(struct vmbus_channel *)) +{ + mutex_lock(hvsock_cb_mutex); + + __process_offer = process_offer; + __process_offer_rescind = process_offer_rescind; + + mutex_unlock(hvsock_cb_mutex); +} +EXPORT_SYMBOL_GPL(vmbus_register_hvsock_callbacks); + +void vmbus_unregister_hvsock_callbacks(void) +{ + mutex_lock(hvsock_cb_mutex); + + __process_offer = NULL; + __process_offer_rescind = NULL; + + mutex_unlock(hvsock_cb_mutex); +} +EXPORT_SYMBOL_GPL(vmbus_unregister_hvsock_callbacks); diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h index c8e27da..fda9790 
100644 --- a/include/linux/hyperv.h +++ b/include/linux/hyperv.h @@ -1269,6 +1269,16 @@ extern __u32 vmbus_proto_version; int vmbus_send_tl_connect_request(const uuid_le *shv_guest_servie_id, const uuid_le *shv_host_servie_id); + +extern int hvsock_process_offer(struct vmbus_channel *channel); +extern void hvsock_process_offer_rescind(struct vmbus_channel
Re: [PATCH net 1/2] r8152: add pre_reset and post_reset
On Tue, 2015-07-28 at 09:52 +, Hayes Wang wrote:
> Oliver Neukum [mailto:oneu...@suse.com]
> > Sent: Tuesday, July 28, 2015 4:53 PM
> [...]
> > > +		return 0;
> > > +
> > > +	netdev = tp->netdev;
> > > +	if (!netif_running(netdev))
> > > +		return 0;
> > > +
> > > +	ret = usb_autopm_get_interface(intf);
> > > +	if (ret < 0)
> > > +		return ret;
> >
> > What sense does this make?
> [...]
> > > +		return 0;
> > > +
> > > +	netdev = tp->netdev;
> > > +	if (!netif_running(netdev))
> > > +		return 0;
> > > +
> > > +	ret = usb_autopm_get_interface(intf);
> >
> > The device will be awake.
>
> I don't sure if the device would be in runtimesuspend,
> so I wake it up by myself. I think you mean I don't have
> to do this. I would remove them and resend the patch.
> Thanks.

Usbcore will resume the device.

	HTH
		Oliver A.

	/* Prevent autosuspend during the reset */
	usb_autoresume_device(udev);

	if (config) {
		for (i = 0; i < config->desc.bNumInterfaces; ++i) {
			struct usb_interface *cintf = config->interface[i];
			struct usb_driver *drv;
			int unbind = 0;

			if (cintf->dev.driver) {
				drv = to_usb_driver(cintf->dev.driver);
				if (drv->pre_reset && drv->post_reset)
					unbind = (drv->pre_reset)(cintf);
[PATCH net-next] net/mlx4_en: Hardware accelerated 802.1ad works only on the first port
Fix mistakenly used, hard coded, port number in get_phv_bit()

Fixes: 77fc29c ("net/mlx4_core: Preparations for 802.1ad VLAN support")
Signed-off-by: Amir Vadai <am...@mellanox.com>
---
Hi Dave,

Because of my mistake I sent a version [1] without some internal review fixes. This patch fixes the only code issue that was missing. The rest were only improvements to the commit messages, which unfortunately are too late to fix now.

[1] - http://www.spinics.net/lists/netdev/msg337148.html

Thanks,
Amir

 drivers/net/ethernet/mellanox/mlx4/fw.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index 5a1c3d2..e8ec1de 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -2815,7 +2815,7 @@ int get_phv_bit(struct mlx4_dev *dev, u8 port, int *phv)
 	struct mlx4_func_cap func_cap;
 
 	memset(&func_cap, 0, sizeof(func_cap));
-	err = mlx4_QUERY_FUNC_CAP(dev, 1, &func_cap);
+	err = mlx4_QUERY_FUNC_CAP(dev, port, &func_cap);
 	if (!err)
 		*phv = func_cap.flags & QUERY_FUNC_CAP_PHV_BIT;
 	return err;
-- 
2.4.3.413.ga5fe668
[PATCH V4 1/7] Drivers: hv: vmbus: define the new offer type for Hyper-V socket (hvsock)
A helper function is also added.

Signed-off-by: Dexuan Cui <de...@microsoft.com>
---
 include/linux/hyperv.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 30d3a1f..2ca3ac1 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -236,6 +236,7 @@ struct vmbus_channel_offer {
 #define VMBUS_CHANNEL_LOOPBACK_OFFER			0x100
 #define VMBUS_CHANNEL_PARENT_OFFER			0x200
 #define VMBUS_CHANNEL_REQUEST_MONITORED_NOTIFICATION	0x400
+#define VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER		0x2000
 
 struct vmpacket_descriptor {
 	u16 type;
@@ -758,6 +759,12 @@ struct vmbus_channel {
 	struct list_head percpu_list;
 };
 
+static inline bool is_hvsock_channel(const struct vmbus_channel *c)
+{
+	return !!(c->offermsg.offer.chn_flags &
+		  VMBUS_CHANNEL_TLNPI_PROVIDER_OFFER);
+}
+
 static inline void set_channel_read_state(struct vmbus_channel *c, bool state)
 {
 	c->batched_reading = state;
-- 
2.1.0
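The helper in this patch is a plain bit-flag test on the offer flags. A minimal userspace sketch of the same test follows; the flag value 0x2000 is taken from the patch, while `struct offer_model`, the 16-bit width of `chn_flags` and the function name are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag value as defined in the patch above. */
#define CHANNEL_TLNPI_PROVIDER_OFFER 0x2000u

/* Hypothetical stand-in for the offer; the field width is an assumption. */
struct offer_model {
	uint16_t chn_flags;
};

/* True when the offer carries the hvsock (TLNPI provider) flag. */
static bool is_hvsock_offer(const struct offer_model *o)
{
	return (o->chn_flags & CHANNEL_TLNPI_PROVIDER_OFFER) != 0;
}
```

With a `bool` return type, `!!(x & FLAG)` and `(x & FLAG) != 0` give the same result; the explicit comparison is used here only for readability.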
Re: [PATCH iproute2 net-next] bridge: mdb: add support for router add/del notifications monitoring
On 07/27/2015 11:49 PM, Nikolay Aleksandrov wrote:
>
>> On 27 Jul 2015, at 23:40, Stephen Hemminger <step...@networkplumber.org> wrote:
>>
>> On Mon, 27 Jul 2015 13:44:05 +0200
>> Nikolay Aleksandrov <ra...@blackwall.org> wrote:
>>
>>> From: Nikolay Aleksandrov <niko...@cumulusnetworks.com>
>>>
>>> This patch adds support for ADDMDB/DELMDB notifications about router
>>> ports which have been added or deleted/expired respectively.
>>>
>>> Example output:
>>> $ bridge -s monitor mdb
>>> Deleted router port dev eth3 master br0
>>> router port dev eth3 master br0
>>>
>>> Signed-off-by: Nikolay Aleksandrov <niko...@cumulusnetworks.com>
>>
>> Looks useful, applied.
>> Does usage or manual page need to be updated as well?
>
> Good question :-) I'll look into it.
> Thanks!

I've looked into it and we don't need any documentation/man changes: the mdb monitoring command is the same and the description doesn't specify what exactly is being returned, so we're good.

Cheers,
 Nik
Re: [PATCH] packet: tpacket_snd(): fix signed/unsigned comparison
On Tue, 2015-07-28 at 13:07 +0200, Daniel Borkmann wrote:
> On 07/28/2015 12:57 PM, Alexander Drozdov wrote:
>> tpacket_fill_skb() can return a negative value (-errno) which
>> is stored in tp_len variable. In that case the following
>> condition will be (but shouldn't be) true:
>>
>> tp_len > dev->mtu + dev->hard_header_len
>>
>> as dev->mtu and dev->hard_header_len are both unsigned.
>>
>> That may lead to just returning an incorrect EMSGSIZE errno
>> to the user.
>>
>> Signed-off-by: Alexander Drozdov al.droz...@gmail.com
>
> Looks good to me, thanks!
>
> Acked-by: Daniel Borkmann dan...@iogearbox.net

Fixes: 52f1454f629fa ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case")
[PATCH net] bridge: mdb: fix delmdb state in the notification
From: Nikolay Aleksandrov niko...@cumulusnetworks.com

Since mdb states were introduced, when deleting an entry the state was
left as it was set in the delete request from the user, which leads to
the following output when doing a monitor (for example):

  $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent
  $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 temp
                                            ^^^

Note the "temp" state in the delete notification, which is wrong since the
entry was permanent: the state in a delete is always reported as "temp"
regardless of the real state of the entry.

After this patch:

  $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent
  $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent

There's one important note to make here: the state is actually not matched
when doing a delete, so one can delete a permanent entry by stating "temp"
at the end of the command. I've chosen this fix in order not to break
user-space tools which rely on this (incorrect) behaviour.

So, to give an example after this patch and using the wrong state:

  $ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent
  $ bridge mdb del dev br0 port eth3 grp 239.0.0.1 temp
  (monitor) dev br0 port eth3 grp 239.0.0.1 permanent

Note that the state of the entry that got deleted is correct in the
notification.

Signed-off-by: Nikolay Aleksandrov niko...@cumulusnetworks.com
Fixes: ccb1c31a7a87 ("bridge: add flags to distinguish permanent mdb entires")
---
I propose to fix the state matching in net-next but we may risk breaking
some user-space tools which rely on this behaviour.

 net/bridge/br_mdb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index 1198a3dbad95..c94321955db7 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -445,6 +445,7 @@ static int __br_mdb_del(struct net_bridge *br, struct br_mdb_entry *entry)
 		if (p->port->state == BR_STATE_DISABLED)
 			goto unlock;
 
+		entry->state = p->state;
 		rcu_assign_pointer(*pp, p->next);
 		hlist_del_init(&p->mglist);
 		del_timer(&p->timer);
-- 
2.4.3
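The effect of the one-line fix can be modelled in userspace. The following is only a sketch under assumptions: `demo_entry`, `demo_group`, `demo_mdb_del()` and the `DEMO_MDB_*` constants are hypothetical stand-ins, not the bridge code, and it assumes that the request structure passed to the delete path is the same one later used to build the notification.

```c
#include <stdbool.h>

/* Hypothetical state values, standing in for "temp" and "permanent". */
enum { DEMO_MDB_TEMPORARY = 0, DEMO_MDB_PERMANENT = 1 };

/* Hypothetical delete request from user space; reused for the notification. */
struct demo_entry {
	int state;
};

/* Hypothetical stored group entry. */
struct demo_group {
	int state;
	bool present;
};

/*
 * Delete the group and record its real state in the request, so that a
 * notification built from the request reports what was actually deleted.
 * Returns 0 on success, -1 if the group is not present.
 */
static int demo_mdb_del(struct demo_group *p, struct demo_entry *entry)
{
	if (!p->present)
		return -1;

	entry->state = p->state;	/* models the line added by the patch */
	p->present = false;
	return 0;
}
```

With a permanent group and a request that says "temp", the request reports "permanent" after the delete, which matches the last monitor example in the commit message.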
Re: [PATCH net v2 2/2] r8152: reset device when tx timeout
On Tue, 2015-07-28 at 20:08 +0800, Hayes Wang wrote:
>  static void rtl8152_tx_timeout(struct net_device *netdev)
>  {
>  	struct r8152 *tp = netdev_priv(netdev);
> -	int i;
>
>  	netif_warn(tp, tx_err, netdev, "Tx timeout\n");
> -	for (i = 0; i < RTL8152_MAX_TX; i++)
> -		usb_unlink_urb(tp->tx_info[i].urb);
> +
> +	usb_queue_reset_device(tp->intf);
> +	cancel_delayed_work(&tp->schedule);

Sorry to bother you again, but this looks wrong. You want to cancel first.
There is no point in running any work before the reset is done. It will
undo any progress anyway.

	Regards
		Oliver
RE: [PATCH net v2 2/2] r8152: reset device when tx timeout
Oliver Neukum [mailto:oneu...@suse.com]
> Sent: Tuesday, July 28, 2015 8:14 PM
[...]
>>  static void rtl8152_tx_timeout(struct net_device *netdev)
>>  {
>>  	struct r8152 *tp = netdev_priv(netdev);
>> -	int i;
>>
>>  	netif_warn(tp, tx_err, netdev, "Tx timeout\n");
>> -	for (i = 0; i < RTL8152_MAX_TX; i++)
>> -		usb_unlink_urb(tp->tx_info[i].urb);
>> +
>> +	usb_queue_reset_device(tp->intf);
>> +	cancel_delayed_work(&tp->schedule);
>
> Sorry to bother you again, but this looks wrong. You want to cancel first.
> There is no point in running any work before the reset is done. It will
> undo any progress anyway.

Excuse me. Do you mean I don't need to cancel the other work because it
wouldn't be run before the reset is finished?

Best Regards,
Hayes
[PATCH net] bridge: mcast: give fast leave precedence over multicast router and querier
From: Satish Ashok sas...@cumulusnetworks.com

When fast leave is configured on a bridge port and an IGMP leave is
received for a group, the group is not deleted immediately if there is
a router detected or if multicast querier is configured.
Ideally the group should be deleted immediately when fast leave is
configured.

Signed-off-by: Satish Ashok sas...@cumulusnetworks.com
---
 net/bridge/br_multicast.c | 50 ++++++++++++++++++++++++++------------------------
 1 file changed, 26 insertions(+), 24 deletions(-)

diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 79db489cdade..0b39dcc65b94 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1416,8 +1416,7 @@ br_multicast_leave_group(struct net_bridge *br,
 
 	spin_lock(&br->multicast_lock);
 	if (!netif_running(br->dev) ||
-	    (port && port->state == BR_STATE_DISABLED) ||
-	    timer_pending(&other_query->timer))
+	    (port && port->state == BR_STATE_DISABLED))
 		goto out;
 
 	mdb = mlock_dereference(br->mdb, br);
@@ -1425,6 +1424,31 @@ br_multicast_leave_group(struct net_bridge *br,
 	if (!mp)
 		goto out;
 
+	if (port && (port->flags & BR_MULTICAST_FAST_LEAVE)) {
+		struct net_bridge_port_group __rcu **pp;
+
+		for (pp = &mp->ports;
+		     (p = mlock_dereference(*pp, br)) != NULL;
+		     pp = &p->next) {
+			if (p->port != port)
+				continue;
+
+			rcu_assign_pointer(*pp, p->next);
+			hlist_del_init(&p->mglist);
+			del_timer(&p->timer);
+			call_rcu_bh(&p->rcu, br_multicast_free_pg);
+			br_mdb_notify(br->dev, port, group, RTM_DELMDB);
+
+			if (!mp->ports && !mp->mglist &&
+			    netif_running(br->dev))
+				mod_timer(&mp->timer, jiffies);
+		}
+		goto out;
+	}
+
+	if (timer_pending(&other_query->timer))
+		goto out;
+
 	if (br->multicast_querier) {
 		__br_multicast_send_query(br, port, &mp->addr);
 
@@ -1450,28 +1474,6 @@ br_multicast_leave_group(struct net_bridge *br,
 		}
 	}
 
-	if (port && (port->flags & BR_MULTICAST_FAST_LEAVE)) {
-		struct net_bridge_port_group __rcu **pp;
-
-		for (pp = &mp->ports;
-		     (p = mlock_dereference(*pp, br)) != NULL;
-		     pp = &p->next) {
-			if (p->port != port)
-				continue;
-
-			rcu_assign_pointer(*pp, p->next);
-			hlist_del_init(&p->mglist);
-			del_timer(&p->timer);
-			call_rcu_bh(&p->rcu, br_multicast_free_pg);
-			br_mdb_notify(br->dev, port, group, RTM_DELMDB);
-
-			if (!mp->ports && !mp->mglist &&
-			    netif_running(br->dev))
-				mod_timer(&mp->timer, jiffies);
-		}
-		goto out;
-	}
-
 	now = jiffies;
 	time = now + br->multicast_last_member_count *
 		     br->multicast_last_member_interval;
-- 
2.4.3
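The change in precedence can be summarised as a small decision function. This is a simplified userspace sketch, not the bridge code: the `demo_*` names are hypothetical, and the branch that sends a query when the bridge itself is the querier is left out so that only the ordering of the two checks moved by the diff is visible.

```c
#include <stdbool.h>

enum demo_action {
	DEMO_IGNORE,		/* "goto out" without touching the group */
	DEMO_DELETE_NOW,	/* fast leave: unlink the port group at once */
	DEMO_LAST_MEMBER,	/* fall through to the last-member timers */
};

/* Ordering before the patch: a pending other-querier timer wins. */
static enum demo_action demo_leave_before(bool fast_leave,
					  bool other_querier_pending)
{
	if (other_querier_pending)
		return DEMO_IGNORE;
	if (fast_leave)
		return DEMO_DELETE_NOW;
	return DEMO_LAST_MEMBER;
}

/* Ordering after the patch: fast leave is checked first. */
static enum demo_action demo_leave_after(bool fast_leave,
					 bool other_querier_pending)
{
	if (fast_leave)
		return DEMO_DELETE_NOW;
	if (other_querier_pending)
		return DEMO_IGNORE;
	return DEMO_LAST_MEMBER;
}
```

The two functions differ only when fast leave is set and another querier's timer is pending, which is the case the commit message describes.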
[PATCH] packet: tpacket_snd(): fix signed/unsigned comparison
tpacket_fill_skb() can return a negative value (-errno) which
is stored in tp_len variable. In that case the following
condition will be (but shouldn't be) true:

tp_len > dev->mtu + dev->hard_header_len

as dev->mtu and dev->hard_header_len are both unsigned.

That may lead to just returning an incorrect EMSGSIZE errno
to the user.

Signed-off-by: Alexander Drozdov al.droz...@gmail.com
---
 net/packet/af_packet.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index c9e8741..d1d3625 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2403,7 +2403,8 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
 		}
 		tp_len = tpacket_fill_skb(po, skb, ph, dev, size_max, proto,
 					  addr, hlen);
-		if (tp_len > dev->mtu + dev->hard_header_len) {
+		if (likely(tp_len >= 0) &&
+		    tp_len > dev->mtu + dev->hard_header_len) {
 			struct ethhdr *ehdr;
 			/* Earlier code assumed this would be a VLAN pkt,
 			 * double-check this now that we have the actual
-- 
1.9.1
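The comparison problem described in the commit message can be reproduced in plain userspace C. This is a minimal sketch, assuming an `unsigned int` MTU and an `unsigned short` header length; the `demo_*` function names are hypothetical. The explicit cast spells out the conversion the compiler otherwise applies implicitly when a signed `int` is compared with an unsigned sum.

```c
#include <stdbool.h>

/*
 * Models the original check: a negative tp_len (an -errno value) is
 * converted to a very large unsigned number, so the test is true.
 */
static bool demo_too_long_buggy(int tp_len, unsigned int mtu,
				unsigned short hard_header_len)
{
	return (unsigned int)tp_len > mtu + hard_header_len;
}

/* Models the patched check: negative values are excluded first. */
static bool demo_too_long_fixed(int tp_len, unsigned int mtu,
				unsigned short hard_header_len)
{
	return tp_len >= 0 &&
	       (unsigned int)tp_len > mtu + hard_header_len;
}
```

With `tp_len = -22`, an MTU of 1500 and a header length of 14, the first function reports an oversized packet and the second does not, so the real error code is no longer masked by EMSGSIZE.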
Re: [PATCH] packet: tpacket_snd(): fix signed/unsigned comparison
On 07/28/2015 12:57 PM, Alexander Drozdov wrote:
> tpacket_fill_skb() can return a negative value (-errno) which
> is stored in tp_len variable. In that case the following
> condition will be (but shouldn't be) true:
>
> tp_len > dev->mtu + dev->hard_header_len
>
> as dev->mtu and dev->hard_header_len are both unsigned.
>
> That may lead to just returning an incorrect EMSGSIZE errno
> to the user.
>
> Signed-off-by: Alexander Drozdov al.droz...@gmail.com

Looks good to me, thanks!

Acked-by: Daniel Borkmann dan...@iogearbox.net
[PATCH net v2 1/2] r8152: add pre_reset and post_reset
Add rtl8152_pre_reset() and rtl8152_post_reset() which are used when
calling usb_reset_device(). The two functions could reduce the time
of reset when calling usb_reset_device() after probe().

Signed-off-by: Hayes Wang hayesw...@realtek.com
---
 drivers/net/usb/r8152.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 144dc64..e1b6d6d 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -3342,6 +3342,58 @@ static void r8153_init(struct r8152 *tp)
 	r8153_u2p3en(tp, true);
 }
 
+static int rtl8152_pre_reset(struct usb_interface *intf)
+{
+	struct r8152 *tp = usb_get_intfdata(intf);
+	struct net_device *netdev;
+
+	if (!tp)
+		return 0;
+
+	netdev = tp->netdev;
+	if (!netif_running(netdev))
+		return 0;
+
+	napi_disable(&tp->napi);
+	clear_bit(WORK_ENABLE, &tp->flags);
+	usb_kill_urb(tp->intr_urb);
+	cancel_delayed_work_sync(&tp->schedule);
+	if (netif_carrier_ok(netdev)) {
+		netif_stop_queue(netdev);
+		mutex_lock(&tp->control);
+		tp->rtl_ops.disable(tp);
+		mutex_unlock(&tp->control);
+	}
+
+	return 0;
+}
+
+static int rtl8152_post_reset(struct usb_interface *intf)
+{
+	struct r8152 *tp = usb_get_intfdata(intf);
+	struct net_device *netdev;
+
+	if (!tp)
+		return 0;
+
+	netdev = tp->netdev;
+	if (!netif_running(netdev))
+		return 0;
+
+	set_bit(WORK_ENABLE, &tp->flags);
+	if (netif_carrier_ok(netdev)) {
+		mutex_lock(&tp->control);
+		tp->rtl_ops.enable(tp);
+		rtl8152_set_rx_mode(netdev);
+		mutex_unlock(&tp->control);
+		netif_wake_queue(netdev);
+	}
+
+	napi_enable(&tp->napi);
+
+	return 0;
+}
+
 static int rtl8152_suspend(struct usb_interface *intf, pm_message_t message)
 {
 	struct r8152 *tp = usb_get_intfdata(intf);
@@ -4164,6 +4216,8 @@ static struct usb_driver rtl8152_driver = {
 	.suspend =	rtl8152_suspend,
 	.resume =	rtl8152_resume,
 	.reset_resume =	rtl8152_resume,
+	.pre_reset =	rtl8152_pre_reset,
+	.post_reset =	rtl8152_post_reset,
 	.supports_autosuspend = 1,
 	.disable_hub_initiated_lpm = 1,
 };
-- 
2.4.2
[PATCH net v2 2/2] r8152: reset device when tx timeout
The device reset is necessary if the hw becomes abnormal and stops
transmitting packets.

Signed-off-by: Hayes Wang hayesw...@realtek.com
---
 drivers/net/usb/r8152.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index e1b6d6d..6af299f 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -27,7 +27,7 @@
 #include <linux/usb/cdc.h>
 
 /* Version Information */
-#define DRIVER_VERSION "v1.08.0 (2015/01/13)"
+#define DRIVER_VERSION "v1.08.1 (2015/07/28)"
 #define DRIVER_AUTHOR "Realtek linux nic maintainers <nic_s...@realtek.com>"
 #define DRIVER_DESC "Realtek RTL8152/RTL8153 Based USB Ethernet Adapters"
 #define MODULENAME "r8152"
@@ -1902,11 +1902,11 @@ static void rtl_drop_queued_tx(struct r8152 *tp)
 static void rtl8152_tx_timeout(struct net_device *netdev)
 {
 	struct r8152 *tp = netdev_priv(netdev);
-	int i;
 
 	netif_warn(tp, tx_err, netdev, "Tx timeout\n");
-	for (i = 0; i < RTL8152_MAX_TX; i++)
-		usb_unlink_urb(tp->tx_info[i].urb);
+
+	usb_queue_reset_device(tp->intf);
+	cancel_delayed_work(&tp->schedule);
 }
 
 static void rtl8152_set_rx_mode(struct net_device *netdev)
-- 
2.4.2
[PATCH net v2 0/2] r8152: device reset
v2:
For patch #1, remove usb_autopm_get_interface(), usb_autopm_put_interface(),
and the checking of intf->condition.

For patch #2, replace the original method with usb_queue_reset_device() to
reset the device.

v1:
Although the driver works normally, we find the device may get all 0xff data
when transmitting packets on certain platforms. It would break the device and
no packet could be transmitted. The reset is necessary to recover the hw for
this situation.

Hayes Wang (2):
  r8152: add pre_reset and post_reset
  r8152: reset device when tx timeout

 drivers/net/usb/r8152.c | 90 ++--
 1 file changed, 86 insertions(+), 4 deletions(-)

-- 
2.4.2
Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
Hello Eric,

On Mon, 2015-07-27 at 15:33 -0500, Eric W. Biederman wrote:
> David Ahern d...@cumulusnetworks.com writes:
>
>> Allow tasks to have a default device index for binding sockets. If set
>> the value is passed to all AF_INET/AF_INET6 sockets when they are
>> created.
>>
>> The task setting is passed parent to child on fork, but can be set or
>> changed after task creation using prctl (if task has CAP_NET_ADMIN
>> permissions). The setting for a socket can be retrieved using prctl().
>> This option allows an administrator to restrict a task to only
>> send/receive packets through the specified device. In the case of VRF
>> devices this option restricts tasks to a specific VRF.
>>
>> Correlation of the device index to a specific VRF, ie.,
>> ifindex --> VRF device --> VRF id
>> is left to userspace.
>
> Nacked-by: Eric W. Biederman ebied...@xmission.com
>
> Because it is broken by design. Your routing device is only safe for
> programs that know its limitations; it is not appropriate for general
> applications.
>
> Since you don't even seem to know its limitations I think this is a bad
> path to walk down.

Can you please elaborate on the "broken by design"?

Different operating systems are already using this approach with good
success. I read your other mail regarding isolation of different VRFs and
I agree that all code which persists state depending solely on the IP
address is affected by this and this must be dealt with and fixed
(actually, there aren't too many). But I wouldn't call that broken by
design. This stuff will get fixed like e.g. cross-talk between
fragmentation queues, icmp rate limiters etc, which could already happen
in the past.

What is your opinion on the fundamental approach only from a user
perspective? Do you think that is broken, too?

Thanks,
Hannes
[PATCH net-next] packet: remove handling of tx_ring from prb_shutdown_retire_blk_timer()
Follow e8e85cc5eb57 ("packet: remove handling of tx_ring") and remove
the tx_ring parameter from prb_shutdown_retire_blk_timer() as it is only
called with tx_ring = 0.

Signed-off-by: Tobias Klauser tklau...@distanz.ch
---
 net/packet/af_packet.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index c9e8741..2af8590 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -518,13 +518,11 @@ static void prb_del_retire_blk_timer(struct tpacket_kbdq_core *pkc)
 }
 
 static void prb_shutdown_retire_blk_timer(struct packet_sock *po,
-		int tx_ring,
 		struct sk_buff_head *rb_queue)
 {
 	struct tpacket_kbdq_core *pkc;
 
-	pkc = tx_ring ? GET_PBDQC_FROM_RB(&po->tx_ring) :
-			GET_PBDQC_FROM_RB(&po->rx_ring);
+	pkc = GET_PBDQC_FROM_RB(&po->rx_ring);
 
 	spin_lock_bh(&rb_queue->lock);
 	pkc->delete_blk_timer = 1;
@@ -4044,7 +4042,7 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
 	if (closing && (po->tp_version > TPACKET_V2)) {
 		/* Because we don't support block-based V3 on tx-ring */
 		if (!tx_ring)
-			prb_shutdown_retire_blk_timer(po, tx_ring, rb_queue);
+			prb_shutdown_retire_blk_timer(po, rb_queue);
 	}
 	release_sock(sk);
 
-- 
2.2.2
Re: [PATCH iproute2 net-next] bridge: mdb: add support for vlans
On 07/15/2015 05:45 PM, Nikolay Aleksandrov wrote: This patch allows the user to specify the vlan of the mdb group being added or deleted and adds support for displaying the vlan when dumping mdb information or monitoring it. It also updates the man page to reflect the new vid argument for mdb. Signed-off-by: Nikolay Aleksandrov niko...@cumulusnetworks.com --- note: the cast in print_mdb_entry() was necessary to shut the compiler bridge/mdb.c | 31 +++ include/linux/if_bridge.h | 1 + man/man8/bridge.8 | 8 +++- 3 files changed, 27 insertions(+), 13 deletions(-) Hi Stephen, Just wondering what's the state of this patch because I'd like to submit some improvements in the same area and I'm wondering if I should do them on top of this patch or if I need to change something in it ? Thanks, Nik -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH net v2 2/2] r8152: reset device when tx timeout
On Tue, 2015-07-28 at 12:31 +, Hayes Wang wrote:
> Oliver Neukum [mailto:oneu...@suse.com]
>> Sent: Tuesday, July 28, 2015 8:14 PM
> [...]
>>>  static void rtl8152_tx_timeout(struct net_device *netdev)
>>>  {
>>>  	struct r8152 *tp = netdev_priv(netdev);
>>> -	int i;
>>>
>>>  	netif_warn(tp, tx_err, netdev, "Tx timeout\n");
>>> -	for (i = 0; i < RTL8152_MAX_TX; i++)
>>> -		usb_unlink_urb(tp->tx_info[i].urb);
>>> +
>>> +	usb_queue_reset_device(tp->intf);
>>> +	cancel_delayed_work(&tp->schedule);
>>
>> Sorry to bother you again, but this looks wrong. You want to cancel first.
>> There is no point in running any work before the reset is done. It will
>> undo any progress anyway.
>
> Excuse me. Do you mean I don't need to cancel the other work because it
> wouldn't be run before the reset is finished?

No, whatever the other work will do, the reset will undo.

	Regards
		Oliver
Re: [PATCH net-next v4] af_mpls: fix undefined reference to ip6_route_output
On Mon, 2015-07-27 at 23:40 -0700, Roopa Prabhu wrote:
> From: Roopa Prabhu ro...@cumulusnetworks.com
>
> Undefined reference to ip6_route_output and ip_route_output
> was reported with CONFIG_INET=n and CONFIG_IPV6=n.
>
> This patch adds new CONFIG_MPLS_NEXTHOP_DEVLOOKUP
> to lookup nexthop device if user has not specified it
> in RTA_OIF attribute. Make CONFIG_MPLS_NEXTHOP_DEVLOOKUP
> depend on INET and (IPV6 || IPV6=n) because it
> uses ip6_route_output and ip_route_output.
>
> Reported-by: kbuild test robot fengguang...@intel.com
> Reported-by: Thomas Graf tg...@suug.ch
> Signed-off-by: Roopa Prabhu ro...@cumulusnetworks.com
> ---
> v1 -> v2: use IS_BUILTIN
>
> v2 -> v3: Use new Kconfig option that depends on (IPV6 || IPV6=n) as
> suggested by Dave. Also uses IS_ERR as suggested by Thomas.
>
> v3 -> v4: Include missed case of (MPLS_ROUTING=y && IPV6=m) reported by Dave.
>
>  net/mpls/Kconfig   |  8 ++++++++
>  net/mpls/af_mpls.c | 19 ++++++++++++++++++-
>  2 files changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/net/mpls/Kconfig b/net/mpls/Kconfig
> index 5c467ef..134764e 100644
> --- a/net/mpls/Kconfig
> +++ b/net/mpls/Kconfig
> @@ -33,4 +33,12 @@ config MPLS_IPTUNNEL
>  	---help---
>  	 mpls ip tunnel support.
>
> +config MPLS_NEXTHOP_DEVLOOKUP
> +	bool "MPLS: nexthop oif dev lookup"
> +	depends on MPLS_ROUTING && INET && \
> +		((IPV6 && !(MPLS_ROUTING=y && IPV6=m)) || IPV6=n)
> +	---help---
> +	 This enables mpls route nexthop dev lookup when oif is not
> +	 specified by user
> +

Urks.

Can't you simply use ipv6_stub_impl.ipv6_dst_lookup with sk=NULL to do
that and don't have a run-time dependency on IPv6 at all (for the cost of
a function pointer). Maybe same for IPv4? If builtin you can inline those
calls anyway.

Bye,
Hannes
Re: [net-next PATCH 2/2] drivers: net: cpsw: add separate napi for tx packet handling for performance improvment
On Wednesday 29 July 2015 04:00 AM, Francois Romieu wrote:
> Mugunthan V N mugunthan...@ti.com :
>> On Tuesday 28 July 2015 02:52 AM, Francois Romieu wrote:
>>> Mugunthan V N mugunthan...@ti.com :
>>> [...]
>>>> @@ -752,13 +753,22 @@ static irqreturn_t cpsw_tx_interrupt(int irq, void *dev_id)
>>>>  	struct cpsw_priv *priv = dev_id;
>>>>
>>>>  	cpdma_ctlr_eoi(priv->dma, CPDMA_EOI_TX);
>>>> -	cpdma_chan_process(priv->txch, 128);
>>>> +	writel(0, &priv->wr_regs->tx_en);
>>>> +
>>>> +	if (netif_running(priv->ndev)) {
>>>> +		napi_schedule(&priv->napi_tx);
>>>> +		return IRQ_HANDLED;
>>>> +	}
>>>
>>> cpsw_ndo_stop calls napi_disable: you can remove netif_running.
>>
>> This netif_running check is to find which interface is up as the
>> interrupt is shared by both the interfaces. When first interface is
>> down and second interface is active then napi_schedule for first
>> interface will fail and second interface napi needs to be scheduled.
>>
>> So I don't think netif_running needs to be removed.
>
> Each interface has its own napi tx (resp. rx) context: I would have
> expected two unconditional napi_schedule per tx (resp. rx) shared irq,
> not one.
>
> I'll read it again after some sleep.

For each interrupt only one napi will be scheduled; when the first
interface is down, only the second interface's napi is scheduled in both
tx and rx irqs.

Regards
Mugunthan V N
Re: [PATCH net-next 5/5] s390/bpf: recache skb->data/hlen for skb_vlan_push/pop
On 7/28/15 7:10 AM, Michael Holzheu wrote:
> Allow eBPF programs attached to TC qdiscs call skb_vlan_push/pop via
> helper functions. These functions may change skb->data/hlen. This data
> is cached by s390 JIT to improve performance of ld_abs/ld_ind
> instructions. Therefore after a change we have to reload the data.
>
> In case of usage of skb_vlan_push/pop, in the prologue we store the
> SKB pointer on the stack and restore it after BPF_JMP_CALL
> to skb_vlan_push/pop.
>
> Signed-off-by: Michael Holzheu holz...@linux.vnet.ibm.com

Thanks!

Acked-by: Alexei Starovoitov a...@plumgrid.com
Re: [PATCH net] ebpf, x86: fix general protection fault when tail call is invoked
On 7/28/15 6:26 AM, Daniel Borkmann wrote:
> After patch, disassembly:
>
>   [...]
>   9e:	lea    0x80(%rsi,%rdx,8),%rax  <--- CONFIG_LOCKDEP/CONFIG_LOCK_STAT
>   	48 8d 84 d6 80 00 00 00
>   a6:	mov    (%rax),%rax
>   	48 8b 00
>   [...]
>
>   [...]
>   9e:	lea    0x50(%rsi,%rdx,8),%rax  <--- No CONFIG_LOCKDEP
>   	48 8d 84 d6 50 00 00 00
>   a6:	mov    (%rax),%rax
>   	48 8b 00
>   [...]
>
> Fixes: b52f00e6a715 ("x86: bpf_jit: implement bpf_tail_call() helper")
> Signed-off-by: Daniel Borkmann dan...@iogearbox.net

Thanks for fixing it.
Most of my development is actually with LOCKDEP on, but I don't ever turn
LOCK_STAT on, so sadly missed this 48 byte increase of 80 byte structure :(

Acked-by: Alexei Starovoitov a...@plumgrid.com
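The two displacements in the disassembly, 0x50 and 0x80, differ by 0x30, which is the 48 bytes mentioned in the reply. The following userspace sketch shows how an offset taken with `offsetof` follows such a size change where a hard-coded constant would not. It is an illustration under assumptions: the `demo_*` structures are hypothetical, the byte counts are taken from the disassembly and the reply above, and it does not claim to reproduce the kernel's structures or the actual fix.

```c
#include <stddef.h>

/* Hypothetical map header: 80 bytes when no lock statistics are built in. */
struct demo_map_plain {
	char fields[80];
};

/* Same header with 48 extra bytes, as when lock statistics are enabled. */
struct demo_map_lockstat {
	char fields[80];
	char lock_stat[48];
};

/* Hypothetical array types: the pointer table follows the header. */
struct demo_array_plain {
	struct demo_map_plain map;
	void *ptrs[];
};

struct demo_array_lockstat {
	struct demo_map_lockstat map;
	void *ptrs[];
};

/* Offset of the pointer table for each layout, computed by the compiler. */
static size_t demo_ptrs_offset_plain(void)
{
	return offsetof(struct demo_array_plain, ptrs);
}

static size_t demo_ptrs_offset_lockstat(void)
{
	return offsetof(struct demo_array_lockstat, ptrs);
}
```

A code generator that emits the result of such a helper as the `lea` displacement would pick up whichever layout the build uses.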
Re: [Xen-devel] [PATCH 4/8] xen: Use the correctly the Xen memory terminologies
On 28/07/15 16:02, Julien Grall wrote:
> Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN is
> meant, I suspect this is because the first support for Xen was for PV.
> This brought some misimplementation of helpers on ARM and confused
> developers about the expected behavior.

For the benefit of other subsystem maintainers, this is a purely
mechanical change in Xen-specific terminology. It doesn't need reviews
or acks from non-Xen people (IMO).

> For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
> Although, if we look at the implementation on x86, it's returning a GFN.
>
> For clarity and to avoid new confusion, replace any reference of mfn with
> gfn in any helpers used by PV drivers.
>
> Take also the opportunity to simplify simple constructions such as
> pfn_to_mfn(page_to_pfn(page)) into page_to_gfn. More complex clean up
> will come in follow-up patches.
>
> I think it may be possible to do further clean up in the x86 code to
> ensure that helpers returning machine address (such as virt_address) are
> not used by non-auto-translated guests. I will let x86 xen experts do it.

Reviewed-by: David Vrabel david.vra...@citrix.com

It looks a bit odd to use GFN in some of the PV code where the
hypervisor API uses MFN but overall I think using the correct
terminology where possible is best. But I'd like to have Boris's or
Konrad's opinion on this.

David
Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
On Tue, Jul 28, 2015 at 9:11 AM, David Ahern d...@cumulusnetworks.com wrote:
> On 7/28/15 9:25 AM, Andy Lutomirski wrote:
>> On Jul 27, 2015 11:33 AM, David Ahern d...@cumulusnetworks.com wrote:
>>> Allow tasks to have a default device index for binding sockets. If set
>>> the value is passed to all AF_INET/AF_INET6 sockets when they are
>>> created.
>>
>> This is not intended to be a review of the concept. I haven't thought
>> about whether the concept is a good idea, broken by design, or
>> whatever. FWIW, if this were added to the kernel and didn't require
>> excessive privilege, I'd probably use it. (I still don't really
>> understand why binding to a device requires privilege in the first
>> place, but, again, I haven't thought about it very much.)
>
> The intent here is to restrict a task to only sending and receiving
> packets from a single network device. The device can be single ethernet
> interface, a stacked device (e.g, bond) or in our case a VRF device which
> restricts a task to interfaces (and hence network paths) associated with
> the VRF.

We are also intending to implement similar functionality for ILA to
restrict tasks (probably from cgroup) to binding to its assigned
addresses. This seems most easily accomplished by adding a binding
interface which is only checked at bind time. After binding, a
connection should be processed no differently than any others;
additional plumbing in the data path for network name spaces just seems
like overhead.

Tom

>>> +#ifdef CONFIG_NET
>>> +	case PR_SET_SK_BIND_DEV_IF:
>>> +	{
>>> +		struct net_device *dev;
>>> +		int idx = (int) arg2;
>>> +
>>> +		if (!capable(CAP_NET_ADMIN))
>>> +			return -EPERM;
>>> +
>>
>> Can you either use ns_capable or add a comment as to why not?
>
> will do.
>
>> Also, please return -EINVAL if unused args are nonzero.
>
> ok.
>
>>> +		if (idx) {
>>> +			dev = dev_get_by_index(me->nsproxy->net_ns, idx);
>>> +			if (!dev)
>>> +				return -EINVAL;
>>> +			dev_put(dev);
>>> +		}
>>> +		me->sk_bind_dev_if = idx;
>>> +		break;
>>> +	}
>>> +	case PR_GET_SK_BIND_DEV_IF:
>>> +	{
>>> +		struct task_struct *tsk;
>>> +		int sk_bind_dev_if = -EINVAL;
>>> +
>>> +		rcu_read_lock();
>>> +		tsk = find_task_by_vpid(arg2);
>>> +		if (tsk)
>>> +			sk_bind_dev_if = tsk->sk_bind_dev_if;
>>
>> Why do you support different tasks here? Could this use proc instead?
>
> In this case we want to allow a separate process to determine if a task
> is restricted to a device.
>
>> The same -EINVAL issue applies. Also, I think you need to hook setns
>> and unshare to do something reasonable when the task is bound to a
>> device.
>
> ack on both.
>
> Thanks for the review,
> David
Re: [PATCH net-next 14/16] net: Add sk_bind_dev_if to task_struct
On Tue, 2015-07-28 at 10:07 -0600, David Ahern wrote:

> Problems with using network namespaces for VRFs has been discussed in
> the past. e.g.,
> http://www.spinics.net/lists/netdev/msg298368.html

Great. Are you suggesting to get rid of network namespaces ?

If not, your proposal only increases bloat and maintenance burden.

If namespaces can't be fixed, they are the wrong design and we should
remove them.

If they can be fixed, they must be fixed.
Re: [PATCH net-next 13/16] net: Introduce VRF device driver - v2
On 7/27/15 2:01 PM, Nikolay Aleksandrov wrote:
>> +
>> +	if (!vrf_is_master(dev) || vrf_is_master(port_dev) ||
>
> Hmm, this means that bonds won't be able to be VRF slaves.
> They have the IFF_MASTER flag set.

Right, will change to the IFF_VRF_MASTER flag.

>> +	    vrf_is_slave(port_dev))
>> +		return -EINVAL;
>> +
>> +	return do_vrf_add_slave(dev, port_dev);
>> +}
>> +
>> +/* inverse of do_vrf_add_slave */
>> +static int do_vrf_del_slave(struct net_device *dev, struct net_device *port_dev)
>> +{
>> +	struct net_vrf *vrf = netdev_priv(dev);
>> +	struct slave_queue *queue = &vrf->queue;
>> +	struct net_vrf_dev *vrf_ptr = NULL;
>> +	struct slave *slave;
>> +
>> +	vrf_ptr = rcu_dereference(dev->vrf_ptr);
>> +	RCU_INIT_POINTER(dev->vrf_ptr, NULL);
>
> I think this isn't safe, you should wait for a grace period before
> freeing the pointer. Actually you can just move the kfree() below the
> netdev_rx_handler_unregister() since it does synchronize_rcu() anyway.

ok

And ack on all other comments.

Thanks for the review,
David