[PATCH, net-next] r8169: Disable interrupts.
Disable interrupts when closing the interface.

Signed-off-by: Corcodel Marian corcodel.mar...@gmail.com

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 6cd7226..ea461fe 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7548,6 +7548,7 @@ static int rtl8169_close(struct net_device *dev)
 	/* Update counters before going down */
 	rtl8169_update_counters(dev);
 
+	rtl8169_irq_mask_and_ack(tp);
 	rtl_lock_work(tp);
 	clear_bit(RTL_FLAG_TASK_ENABLED, tp->wk.flags);
 
-- 
2.1.4
--
To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2 net-next] netfilter: ipset: Fixing unnamed union init
Following up on Vinson Lee's proposed post [1], this patch fixes compilation issues found with gcc 4.4.7. Initializing the .cidr field of an unnamed union causes a compilation error in gcc 4.4.x.

References

[1] https://lkml.org/lkml/2015/7/5/74

Signed-off-by: Elad Raz el...@mellanox.com
---
 net/netfilter/ipset/ip_set_hash_netnet.c | 20 ++--
 net/netfilter/ipset/ip_set_hash_netportnet.c | 20 ++--
 2 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_netnet.c b/net/netfilter/ipset/ip_set_hash_netnet.c
index 3c862c0..a93dfeb 100644
--- a/net/netfilter/ipset/ip_set_hash_netnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netnet.c
@@ -131,6 +131,13 @@ hash_netnet4_data_next(struct hash_netnet4_elem *next,
 #define HOST_MASK	32
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netnet4_init(struct hash_netnet4_elem *e)
+{
+	e->cidr[0] = HOST_MASK;
+	e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netnet4_kadt(struct ip_set *set, const struct sk_buff *skb,
 		  const struct xt_action_param *par,
@@ -160,7 +167,7 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[],
 {
 	const struct hash_netnet *h = set->data;
 	ipset_adtfn adtfn = set->variant->adt[adt];
-	struct hash_netnet4_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, };
+	struct hash_netnet4_elem e = { };
 	struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
 	u32 ip = 0, ip_to = 0, last;
 	u32 ip2 = 0, ip2_from = 0, ip2_to = 0, last2;
@@ -169,6 +176,7 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[],
 	if (tb[IPSET_ATTR_LINENO])
 		*lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]);
 
+	hash_netnet4_init(&e);
 	if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
 		     !ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS)))
 		return -IPSET_ERR_PROTOCOL;
@@ -357,6 +365,13 @@ hash_netnet6_data_next(struct hash_netnet4_elem *next,
 #define IP_SET_EMIT_CREATE
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netnet6_init(struct hash_netnet6_elem *e)
+{
+	e->cidr[0] = HOST_MASK;
+	e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netnet6_kadt(struct ip_set *set, const struct sk_buff *skb,
 		  const struct xt_action_param *par,
@@ -385,13 +400,14 @@ hash_netnet6_uadt(struct ip_set *set, struct nlattr *tb[],
 		  enum ipset_adt adt, u32 *lineno, u32 flags, bool retried)
 {
 	ipset_adtfn adtfn = set->variant->adt[adt];
-	struct hash_netnet6_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, };
+	struct hash_netnet6_elem e = { };
 	struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
 	int ret;
 
 	if (tb[IPSET_ATTR_LINENO])
 		*lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]);
 
+	hash_netnet6_init(&e);
 	if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
 		     !ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS)))
 		return -IPSET_ERR_PROTOCOL;
diff --git a/net/netfilter/ipset/ip_set_hash_netportnet.c b/net/netfilter/ipset/ip_set_hash_netportnet.c
index 0c68734..9a14c23 100644
--- a/net/netfilter/ipset/ip_set_hash_netportnet.c
+++ b/net/netfilter/ipset/ip_set_hash_netportnet.c
@@ -142,6 +142,13 @@ hash_netportnet4_data_next(struct hash_netportnet4_elem *next,
 #define HOST_MASK	32
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netportnet4_init(struct hash_netportnet4_elem *e)
+{
+	e->cidr[0] = HOST_MASK;
+	e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netportnet4_kadt(struct ip_set *set, const struct sk_buff *skb,
 		      const struct xt_action_param *par,
@@ -175,7 +182,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
 {
 	const struct hash_netportnet *h = set->data;
 	ipset_adtfn adtfn = set->variant->adt[adt];
-	struct hash_netportnet4_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, };
+	struct hash_netportnet4_elem e = { };
 	struct ip_set_ext ext = IP_SET_INIT_UEXT(set);
 	u32 ip = 0, ip_to = 0, ip_last, p = 0, port, port_to;
 	u32 ip2_from = 0, ip2_to = 0, ip2_last, ip2;
@@ -185,6 +192,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[],
 	if (tb[IPSET_ATTR_LINENO])
 		*lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]);
 
+	hash_netportnet4_init(&e);
 	if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] ||
 		     !ip_set_attr_netorder(tb, IPSET_ATTR_PORT) ||
 		     !ip_set_optattr_netorder(tb, IPSET_ATTR_PORT_TO) ||
@@ -412,6 +420,13 @@ hash_netportnet6_data_next(struct hash_netportnet4_elem *next,
 #define IP_SET_EMIT_CREATE
 #include "ip_set_hash_gen.h"
 
+static void
+hash_netportnet6_init(struct hash_netportnet6_elem *e)
+{
+	e->cidr[0] = HOST_MASK;
+	e->cidr[1] = HOST_MASK;
+}
+
 static int
 hash_netportnet6_kadt(struct ip_set *set, const struct sk_buff
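
For readers following the thread, here is a minimal userspace sketch of the workaround pattern the patch describes: leave the element zeroed and assign the members of the unnamed union in a helper, instead of naming them in a designated initializer (the construct gcc 4.4.x is reported to reject). `struct demo_elem`, `demo_elem_init` and `DEMO_HOST_MASK` are hypothetical stand-ins, not the ipset definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_HOST_MASK 32	/* hypothetical stand-in for HOST_MASK */

/* Hypothetical element with an unnamed (anonymous) union. */
struct demo_elem {
	uint32_t ip[2];
	union {
		uint8_t cidr[2];
		uint16_t ccmp;
	};
};

/* Assign the union members at run time instead of in an initializer. */
static void demo_elem_init(struct demo_elem *e)
{
	assert(e != NULL);
	memset(e, 0, sizeof(*e));
	e->cidr[0] = DEMO_HOST_MASK;
	e->cidr[1] = DEMO_HOST_MASK;
}
```

The cost of this shape is one extra call per element, but it avoids depending on compiler support for initializing anonymous-union members by name.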
Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters
Sorry, I forgot to mention that I tested this patch on three different chip versions, RTL_GIGA_MAC_VER_23, RTL_GIGA_MAC_VER_33 and RTL_GIGA_MAC_VER_35.

I couldn't test on pre-RTL_GIGA_MAC_VER_19, but the offset handling without counter reset already worked as expected on later chip versions, so I'm pretty confident that older chip versions should work accordingly.

On Aug 21 12:09, Corinna Vinschen wrote:
> The r8169 driver collects statistical information returned by @get_stats64 by counting them in the driver itself, even though many (but not all) of the values are already collected by tally counters (TCs) in the NIC. Some of these TC values are not returned by @get_stats64. In particular, the received multicast packets are missing from /proc/net/dev.
>
> Rectify this by fetching the TCs and returning them from rtl8169_get_stats64.
>
> The counters collected in the driver obviously disappear as soon as the driver is unloaded, so after a driver is loaded the counters always start at 0. The TCs, on the other hand, are only reset by a power cycle. Without further considerations the values collected by the driver would not match up against the TC values.
>
> This patch introduces a new function rtl8169_reset_counters which resets the TCs. Unfortunately chip versions prior to RTL_GIGA_MAC_VER_19 don't allow resetting the TCs programmatically. Therefore introduce an addition to the rtl8169_private struct and a function rtl8169_init_counter_offsets to store the TCs at first rtl_open. Use these values as offsets in rtl8169_get_stats64.
>
> Signed-off-by: Corinna Vinschen vinsc...@redhat.com
> ---
>  drivers/net/ethernet/realtek/r8169.c | 107 +++
>  1 file changed, 107 insertions(+)
>
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index f790f61..f26a48d 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -637,6 +637,9 @@ enum rtl_register_content {
>  	/* _TBICSRBit */
>  	TBILinkOK	= 0x02000000,
>  
> +	/* ResetCounterCommand */
> +	CounterReset	= 0x1,
> +
>  	/* DumpCounterCommand */
>  	CounterDump	= 0x8,
>  
> @@ -747,6 +750,14 @@ struct rtl8169_counters {
>  	__le16	tx_underun;
>  };
>  
> +struct rtl8169_tc_offsets {
> +	bool	inited;
> +	__le64	tx_errors;
> +	__le32	tx_multi_collision;
> +	__le32	rx_multicast;
> +	__le16	tx_aborted;
> +};
> +
>  enum rtl_flag {
>  	RTL_FLAG_TASK_ENABLED,
>  	RTL_FLAG_TASK_SLOW_PENDING,
> @@ -824,6 +835,7 @@ struct rtl8169_private {
>  
>  	struct mii_if_info mii;
>  	struct rtl8169_counters counters;
> +	struct rtl8169_tc_offsets tc_offset;
>  	u32 saved_wolopts;
>  	u32 opts1_mask;
>  
> @@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device *dev, int sset)
>  	}
>  }
>  
> +DECLARE_RTL_COND(rtl_reset_counters_cond)
> +{
> +	void __iomem *ioaddr = tp->mmio_addr;
> +
> +	return RTL_R32(CounterAddrLow) & CounterReset;
> +}
> +
> +static void rtl8169_reset_counters(struct net_device *dev)
> +{
> +	struct rtl8169_private *tp = netdev_priv(dev);
> +	void __iomem *ioaddr = tp->mmio_addr;
> +	struct device *d = &tp->pci_dev->dev;
> +	struct rtl8169_counters *counters;
> +	dma_addr_t paddr;
> +	u32 cmd;
> +
> +	/*
> +	 * Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the
> +	 * tally counters.
> +	 */
> +	if (tp->mac_version < RTL_GIGA_MAC_VER_19)
> +		return;
> +
> +	counters = dma_alloc_coherent(d, sizeof(*counters), &paddr, GFP_KERNEL);
> +	if (!counters)
> +		return;
> +
> +	RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
> +	cmd = (u64)paddr & DMA_BIT_MASK(32);
> +	RTL_W32(CounterAddrLow, cmd);
> +	RTL_W32(CounterAddrLow, cmd | CounterReset);
> +
> +	if (!rtl_udelay_loop_wait_low(tp, &rtl_reset_counters_cond, 10, 1000))
> +		netif_warn(tp, hw, dev, "counter reset failed\n");
> +
> +	RTL_W32(CounterAddrLow, 0);
> +	RTL_W32(CounterAddrHigh, 0);
> +
> +	dma_free_coherent(d, sizeof(*counters), counters, paddr);
> +}
> +
>  DECLARE_RTL_COND(rtl_counters_cond)
>  {
>  	void __iomem *ioaddr = tp->mmio_addr;
> @@ -2220,6 +2273,39 @@ static void rtl8169_update_counters(struct net_device *dev)
>  	dma_free_coherent(d, sizeof(*counters), counters, paddr);
>  }
>  
> +static void rtl8169_init_counter_offsets(struct net_device *dev)
> +{
> +	struct rtl8169_private *tp = netdev_priv(dev);
> +
> +	/*
> +	 * rtl8169_init_counter_offsets is called from rtl_open. On chip
> +	 * versions prior to RTL_GIGA_MAC_VER_19 the tally counters are only
> +	 * reset by a power cycle, while the counter values collected by the
> +	 * driver are reset at every driver unload/load cycle.
> +	 *
> +	 * To make sure the HW values returned by @get_stats64 match the SW
> +	 * values, we collect the initial values at first open(*) and use them
> +	 * as offsets to normalize the values
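
The offset idea described in the commit message can be sketched in plain userspace C. This is only an illustration under assumptions: `struct demo_tally`, `struct demo_offsets` and both helpers are hypothetical names, the counters are kept in host byte order, and no hardware access is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of hardware tally counters (host byte order). */
struct demo_tally {
	uint64_t tx_errors;
	uint32_t rx_multicast;
};

/* Baseline captured once, at the first open. */
struct demo_offsets {
	bool inited;
	struct demo_tally base;
};

/* Record the counter values seen at first open; later calls are no-ops. */
static void demo_init_offsets(struct demo_offsets *off,
			      const struct demo_tally *hw)
{
	assert(off && hw);
	if (off->inited)
		return;
	off->base = *hw;
	off->inited = true;
}

/* Report rx_multicast relative to the stored baseline. */
static uint32_t demo_rx_multicast(const struct demo_offsets *off,
				  const struct demo_tally *hw)
{
	/* Unsigned subtraction stays correct across one 32-bit wrap. */
	return hw->rx_multicast - off->base.rx_multicast;
}
```

Capturing the baseline only once means that a second open without a power cycle keeps reporting values relative to the first open, which matches the "at first rtl_open" wording above.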
Re: e1000e: possible regression?
Hi Eric,

It was probably it. Uptime is 5+ hours with no problem. Thanks for the hint; I was compiling Linus' tree yesterday around 12:00 UTC, and your change was added later.

Regards
Tomas

On Thu, Aug 20, 2015 at 10:29 PM, Eric Dumazet eric.duma...@gmail.com wrote:
> On Thu, 2015-08-20 at 21:37 +0200, Tomas Papan wrote:
> > Hi there,
> >
> > I’m observing a freeze with the recent kernel (4.2-rc7). Unfortunately I can’t preserve the full traces. There is nothing in the messages after reboot; I was just lucky one time to see it when tail -f /var/log/messages was running. This is the only line which I was able to get:
> >
> > eth1 (e1000e): transmit queue 0 timed out
> >
> > I’ve got this message in the past, but ethtool -K eth1 tso off solved that. I have been running this command at boot time ever since. There is no issue with 4.2-rc4. It is hard to bisect, because this machine is used as a headless server and it happens randomly (usually within 2 hours).
> >
> > Do you have any idea how to trace it or what I can do? Please keep me on CC since I’m not subscribed to this list.
> >
> > Regards
> > Tomas
>
> I would pull latest tree from Linus and pray the bug was fixed.
>
> My feeling is that you hit the issue fixed with
>
> commit 83fccfc3940c4a2db90fd7e7079f5b465cd8c6af
> Author: Eric Dumazet eduma...@google.com
> Date: Thu Aug 13 15:44:51 2015 -0700
>
>     inet: fix potential deadlock in reqsk_queue_unlink()
>
>     When replacing del_timer() with del_timer_sync(), I introduced a deadlock condition:
>
>     reqsk_queue_unlink() is called from inet_csk_reqsk_queue_drop()
>
>     inet_csk_reqsk_queue_drop() can be called from many contexts, one being the timer handler itself (reqsk_timer_handler()).
>
>     In this case, del_timer_sync() loops forever.
>
>     Simple fix is to test if timer is pending.
>
>     Fixes: 2235f2ac75fd ("inet: fix races with reqsk timers")
>     Signed-off-by: Eric Dumazet eduma...@google.com
>     Signed-off-by: David S. Miller da...@davemloft.net
[PATCH V4 net-next 0/2] net: implement SMC-R solution
From: Ursula Braun ursula.br...@de.ibm.com

Dave,

this is V4 of my SMC-R patches: Since you are asking for a solution 100% in our own separate module with our own can of worms, we have to give up the transparent detection of whether a communication peer can do SMC-R or not (this was the purpose of the rejected TCP hooks). Instead, we just want the new self-contained SMC-R socket family added to the kernel.

By the way, since August 2015 the SMC-R Informational RFC is no longer a draft, but published as RFC7609.

V4 changes:
1. Remove tcp patches supporting TCP experimental options
2. Remove references to tcp_sock syn_smc flag in smc-code, since TCP experimental options are not supported by the Linux-tcp.
3. clc_wait_msg() simplified

V3 changes:
1. Avoid adding of new space for smc-related bits in the tcp structures.
2. Make the smc feature to be nearly zero cost using Static Keys / jump labels
3. Increase / decrease smc static key in the smc-code
4. Make sure the next-to-last patch does not break the build
5. Additional pnet table checking

V2 changes:
1. activate tcp changes for CONFIG_AFSMC only (as suggested by Eric Dumazet)
2. add additional hook in net/core/sock.c
3. fix bitfield endianness problem

Thanks, Ursula

In 2013, IBM introduced an optimized communications solution for the IBM zEnterprise EC12 and BC12 (s390 in Linux terminology) that is comprised of the IBM 10GbE RoCE Express feature with Shared Memory Communications-RDMA (SMC-R) protocol [1]. SMC-R is designed for the enterprise data center environment and is an open protocol as specified in the informational RFC7609 [2], which was published in August 2015. Another implementation of this protocol has been available since 2013 with IBM z/OS Version 2 Release 1. SMC-R provides a “sockets over RDMA” solution that leverages industry standard RDMA over Converged Ethernet (RoCE) technology.

IBM has developed a Linux implementation of the SMC-R standard. A new socket protocol family AF_SMC is introduced. A preload library can be used to enable TCP-based applications to use SMC-R without changes.

Key aspects of SMC-R are:

1. Provides optimized performance compared to standard TCP/IP over Ethernet within the data center for both request/response (latency) and streaming workloads (CPU savings) [3]. Initial benchmarks on Linux on x86 processors have shown latency reduction of up to 52% with a throughput gain of 111% using SMC-R vs TCP for request/response message patterns (10 concurrent TCP connections with 16KB messages) and CPU savings of up to 69% for streaming data patterns (single TCP connection with 20MB of data in one direction). [1] is currently being updated to contain more detailed information on Linux and performance.

2. In order to preserve the traditional network administrative model the SMC-R protocol ties into the existing IP addresses and uses TCP's handshake to establish connections. This allows existing management tools and security infrastructure to control the creation of SMC connections.

3. The SMC-R protocol logically bonds multiple RoCE adapters together, providing redundancy with transparent fail-over for improved high availability, increased bandwidth and load balancing across multiple RDMA-capable devices.

Without the rejected TCP Experimental Options the following aspects are restricted; alternate solutions are in discussion.

4. Due to its handshake protocol, SMC-R is compatible with (transparent to) existing TCP connection load balancers that are commonly used in the enterprise data center environment for multi-tier application workloads.

5. SMC-R's handshake protocol allows for transparent fallback to TCP/IP, should one of the peers not be capable of the protocol.

Additional SMC-R overview and reference materials are available [1].

The SMC-R “rendezvous” protocol eliminates the need for RDMA-CM and the exchange occurs through an initial TCP connection. Building on a TCP connection to establish an SMC-R connection solves many key requirements. The rendezvous process occurs now in 1 phase only:

1. TCP/IP 3-way exchange with TCP experimental options is skipped.

2. SMC-R 3-way exchange: It is assumed both partners indicate SMC-R capability. Then at the completion of the 3-way TCP handshake the SMC-R layers in each peer take control of the TCP connection and exchange their RDMA credentials. If this 3-way exchange completes successfully the connection continues using SMC-R. If the exchange is not successful the connection falls back to standard TCP/IP.

References:
[1] SMC-R Overview and Reference Materials: http://www-01.ibm.com/software/network/commserver/SMCR/
[2] SMC-R Informational RFC: https://tools.ietf.org/rfc/rfc7609
[3] Linux SMC-R Overview and Performance Summary (archs x86 and s390): http://www-01.ibm.com/software/network/commserver/SMCR/

The patch series is prepared to apply to net-next and
[PATCH V4 net-next 1/2] net: introduce socket family constants
From: Ursula Braun ursula.br...@de.ibm.com

The new socket family is assigned the next available address / protocol family constant, 41. Implementing SO_KEEPALIVE for SMC-R requires an extra hook in net/ipv4/tcp_timer.c.

Signed-off-by: Ursula Braun ursula.br...@de.ibm.com
---
 include/linux/socket.h | 4 +++-
 include/net/smc.h | 13 +
 net/ipv4/tcp_timer.c | 2 +-
 3 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 include/net/smc.h

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 5bf59c8..1adcbcc 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -200,7 +200,8 @@ struct ucred {
 #define AF_ALG		38	/* Algorithm sockets		*/
 #define AF_NFC		39	/* NFC sockets			*/
 #define AF_VSOCK	40	/* vSockets			*/
-#define AF_MAX		41	/* For now.. */
+#define AF_SMC		41	/* smc sockets			*/
+#define AF_MAX		42	/* For now.. */
 
 /* Protocol families, same as address families. */
 #define PF_UNSPEC	AF_UNSPEC
@@ -246,6 +247,7 @@ struct ucred {
 #define PF_ALG		AF_ALG
 #define PF_NFC		AF_NFC
 #define PF_VSOCK	AF_VSOCK
+#define PF_SMC		AF_SMC
 #define PF_MAX		AF_MAX
 
 /* Maximum queue length specifiable by listen. */
diff --git a/include/net/smc.h b/include/net/smc.h
new file mode 100644
index 0000000..cd513ee
--- /dev/null
+++ b/include/net/smc.h
@@ -0,0 +1,13 @@
+/*
+ * SMC Definitions for the SMC protocol.
+ *
+ * Author: Ursula Braun ursula.br...@de.ibm.com
+ */
+#ifndef _SMC_H
+#define _SMC_H
+
+/* SMC socket options - disjunct with TCP socket options */
+#define SMC_KEEPALIVE 99 /* start/stop keepalives */
+
+#endif /* _SMC_H */
+
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 7149ebc..070bfc7 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -557,7 +557,7 @@ void tcp_set_keepalive(struct sock *sk, int val)
 	else if (!val)
 		inet_csk_delete_keepalive_timer(sk);
 }
-
+EXPORT_SYMBOL(tcp_set_keepalive);
 
 static void tcp_keepalive_timer (unsigned long data)
 {
-- 
2.3.8
[PATCH net-next] route: fix breakage after moving lwtunnel state
__refcnt and related fields need to be in their own cacheline for performance reasons. Commit 61adedf3e3f1 ("route: move lwtunnel state to dst_entry") broke that on 32bit archs, causing the BUILD_BUG_ON in dst_hold to be triggered.

This patch fixes the breakage by moving the lwtunnel state to the end of dst_entry on 32bit archs. Unfortunately, this makes it share the cacheline with __refcnt and may affect performance, thus further patches may be needed.

Reported-by: kbuild test robot fengguang...@intel.com
Fixes: 61adedf3e3f1 ("route: move lwtunnel state to dst_entry")
Signed-off-by: Jiri Benc jb...@redhat.com
---
I'm working on this. I'm going to grab performance numbers with this patch applied and work on follow-up patches as necessary. Until then, this patch at least fixes the 32bit build.

I'm very sorry for the breakage. I tried to build the patchset with various configs (IPv6 off, lwtunnel off, etc.) but obviously did not test on 32bit. I have no excuse for this; I should have tested it, the #ifdef was very obvious.
---
 include/net/dst.h | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index 0a9a723f6c19..ef8f1d43a203 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -44,7 +44,6 @@ struct dst_entry {
 #else
 	void			*__pad1;
 #endif
-	struct lwtunnel_state   *lwtstate;
 	int			(*input)(struct sk_buff *);
 	int			(*output)(struct sock *sk, struct sk_buff *skb);
 
@@ -85,11 +84,12 @@ struct dst_entry {
 	__u32			__pad2;
 #endif
 
+#ifdef CONFIG_64BIT
+	struct lwtunnel_state   *lwtstate;
 	/*
 	 * Align __refcnt to a 64 bytes alignment
 	 * (L1_CACHE_SIZE would be too much)
 	 */
-#ifdef CONFIG_64BIT
 	long			__pad_to_align_refcnt[1];
 #endif
 	/*
@@ -99,6 +99,9 @@ struct dst_entry {
 	atomic_t		__refcnt;	/* client references	*/
 	int			__use;
 	unsigned long		lastuse;
+#ifndef CONFIG_64BIT
+	struct lwtunnel_state   *lwtstate;
+#endif
 	union {
 		struct dst_entry	*next;
 		struct rtable __rcu	*rt_next;
-- 
1.8.3.1
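
The kind of compile-time layout check that fired here can be sketched in userspace C. This is an illustration only: `struct demo_dst` is a hypothetical layout built from fixed-width types so that it is the same on 32bit and 64bit, and `_Static_assert` plays the role that BUILD_BUG_ON plays in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: 56 bytes of read-mostly fields, one explicit pad
 * word, then the hot reference counter at the start of the next 64 bytes.
 */
struct demo_dst {
	uint64_t read_mostly[7];	/* bytes 0..55 */
	uint64_t pad_to_align_refcnt;	/* bytes 56..63 */
	int32_t refcnt;			/* byte 64 */
	int32_t use;
};

/* Refuse to compile if refcnt does not sit at a multiple of 64 bytes. */
_Static_assert((offsetof(struct demo_dst, refcnt) & 63) == 0,
	       "refcnt must be at a 64-byte offset");

/* Helper so the offset can also be inspected at run time. */
static size_t demo_refcnt_offset(void)
{
	return offsetof(struct demo_dst, refcnt);
}
```

Adding a pointer-sized member ahead of `refcnt` in a layout like this shifts the offset by 4 or 8 bytes depending on the architecture, which is why a change that is fine on 64bit can trip the assertion on 32bit.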
Re: DEBUG_LOCKS_WARN_ON(in_interrupt()) triggering in socket code
On Fri, Aug 21, 2015 at 03:42:33PM +0200, Jason A. Donenfeld wrote:
> Ahhh, interesting, so it turns out you can't do a number of things with a read_lock_bh held, because it increases the softirq count. Mystery solved.

You must not do anything that can sleep (like taking a mutex) while holding a rwlock (even for reading), as someone else could call write_lock() on the same rwlock on the same CPU in the meantime and would end up spinning indefinitely while waiting for you to release it.

Michal Kubecek
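
A toy userspace model of the rule above, for illustration only: a nesting counter stands in for the softirq count that read_lock_bh raises, and the update helper copies data inside the locked section and does the possibly sleeping work only after the unlock. All `demo_*` names are hypothetical; no real locking happens here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the nesting count raised while the lock is held. */
static int demo_atomic_depth;

static void demo_read_lock_bh(void)
{
	demo_atomic_depth++;
}

static void demo_read_unlock_bh(void)
{
	assert(demo_atomic_depth > 0);
	demo_atomic_depth--;
}

/* True only when no such lock is held, i.e. sleeping would be allowed. */
static bool demo_may_sleep(void)
{
	return demo_atomic_depth == 0;
}

/*
 * Copy what is needed under the lock, drop the lock, then do the work
 * that may sleep. Returns whether that work ran in a sleepable context.
 */
static bool demo_update(const int *shared, int *out)
{
	int snapshot;

	demo_read_lock_bh();
	snapshot = *shared;	/* short, non-sleeping critical section */
	demo_read_unlock_bh();

	*out = snapshot * 2;	/* placeholder for work that may sleep */
	return demo_may_sleep();
}
```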
Re: DEBUG_LOCKS_WARN_ON(in_interrupt()) triggering in socket code
Ahhh, interesting, so it turns out you can't do a number of things with a read_lock_bh held, because it increases the softirq count. Mystery solved.
Re: e1000e: possible regression?
On Fri, 2015-08-21 at 12:48 +0200, Tomas Papan wrote:
> Hi Eric,
>
> It was probably it. Uptime 5+ hours and no problem. Thanks for the hint, I was compiling linus tree yesterday around 12:00 UTC, your change was added later.

Sure, let me know if you have any problems.

A timer fix was also queued, not yet in Linus tree.

https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=d0023a1448abdcc892b8bca631e74bb1888efd02
Re: [PATCH 3/5] net: add Hisilicon Network Subsystem MDIO support
On Monday 17 August 2015 17:17:50 Kenneth Lee wrote:
> Thanks, Arnd,
>
> You are right. This is the same IP as hip04_mdio.c. We just misunderstood the hardware design. We will merge them and re-submit the patches.

Ok, great!

	Arnd
Re: [net-next:master 1179/1189] include/linux/compiler.h:447:38: error: call to '__compiletime_assert_243' declared with attribute error: BUILD_BUG_ON failed: offsetof(struct dst_entry, __refcnt) & 63
On Thu, 20 Aug 2015 23:26:50 -0700 (PDT), David Miller wrote:
> Yeah, I should have predicted this would happen on 32-bit builds when I saw the adjustment of __pad_to_align_refcnt[] for 64-bit.
>
> Jiri, you might not have any reasonable options to fix this I'm afraid.

Still working on this, the patch I sent should at least relieve the pressure (but of course, I'll understand if you revert the whole set).

I'm currently fighting with vxlan triggering null pointer dereference in include/net/netns/generic.h:41, seems that net->gen is NULL. This is with commit 938049e18dca, i.e. before my lwtunnel ipv6 patchset. Pasting the trace below in case anyone has an idea. CONFIG_NET_NS is enabled.

When adding debug printk to vxlan_init_net (before the call to net_generic), the issue disappears. Smells like a race.

I'm not sure how much time I will have during the weekend.

 Jiri

[ 26.102174] BUG: unable to handle kernel NULL pointer dereference at 0010
[ 26.109299] IP: [f8501154] vxlan_init_net+0x14/0x50 [vxlan]
[ 26.115032] *pdpt = 33b48001 *pde =
[ 26.120770] Oops: [#1] SMP
[ 26.124000] Modules linked in: vxlan(+) tg3(+) ip6_udp_tunnel snd_pcm udp_tunnel snd_timer hp_wmi sparse_keymap snd ptp coretemp rfkill pps_core gpio_ich iTCO_wdt mdio dca iTCO_vendor_support ppdev kvm_intel kvm soundcore lpc_ich mfd_core pcspkr crc32_pclmul floppy parport_pc i7core_edac parport edac_core acpi_cpufreq xfs libcrc32c nouveau video mxm_wmi i2c_algo_bit drm_kms_helper ttm drm mptsas scsi_transport_sas firewire_ohci mptscsih crc32c_intel serio_raw firewire_core mptbase crc_itu_t wmi
[ 26.168070] CPU: 0 PID: 370 Comm: systemd-udevd Not tainted 4.2.0-rc6+ #1
[ 26.174829] Hardware name: Hewlett-Packard HP Z800 Workstation/0AECh, BIOS 786G5 v03.54 11/02/2011
[ 26.183750] task: f4ec45c0 ti: f4a82000 task.ti: f4a82000
[ 26.189125] EIP: 0060:[f8501154] EFLAGS: 00010282 CPU: 0
[ 26.194588] EIP is at vxlan_init_net+0x14/0x50 [vxlan]
[ 26.199703] EAX: EBX: f8509000 ECX: 0002 EDX: 0002
[ 26.205942] ESI: f6583000 EDI: c0df7900 EBP: f4a83d74 ESP: f4a83d74
[ 26.212182] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 26.217557] CR0: 80050033 CR2: 0010 CR3: 34a3f4a0 CR4: 06f0
[ 26.223797] Stack:
[ 26.225797] f4a83d94 c09beab1 f6aed2c0 0002 f8509000 c0df7900 f8509000
[ 26.233576] f4a83db4 c09bedb4 f4a83d9c f4a83d9c 3cd52c75 f8509000 c0d72780 f8091000
[ 26.241355] f4a83dc0 c09bee51 f4f08960 f4a83dcc f8091040 f4f08960 f4a83e48 c040211a
[ 26.249133] Call Trace:
[ 26.251568] [c09beab1] ops_init+0x31/0x130
[ 26.255905] [c09bedb4] register_pernet_operations+0xe4/0x160
[ 26.261798] [f8091000] ? 0xf8091000
[ 26.265530] [c09bee51] register_pernet_subsys+0x21/0x40
[ 26.270993] [f8091040] vxlan_init_module+0x40/0x1000 [vxlan]
[ 26.276889] [c040211a] do_one_initcall+0xaa/0x200
[ 26.281829] [f8091000] ? 0xf8091000
[ 26.285562] [c05a45d5] ? kmem_cache_alloc_trace+0x175/0x1f0
[ 26.291369] [c0ac73db] ? do_init_module+0x21/0x1b5
[ 26.296398] [c0ac73db] ? do_init_module+0x21/0x1b5
[ 26.301428] [c0ac740a] do_init_module+0x50/0x1b5
[ 26.306285] [c04e743b] load_module+0x1dbb/0x23c0
[ 26.311143] [c04e4049] ? copy_module_from_fd.isra.48+0xf9/0x190
[ 26.317297] [c04e7c75] SyS_finit_module+0xa5/0xf0
[ 26.322240] [c05724cb] ? vm_mmap_pgoff+0x9b/0xc0
[ 26.327097] [c0acd79f] sysenter_do_call+0x12/0x12
[ 26.332038] Code: c0 00 00 00 5d c3 90 55 89 e5 66 66 66 66 90 8b 80 18 05 00 00 5d c3 55 89 e5 66 66 66 66 90 8b 15 4c 94 50 f8 8b 80 98 0c 00 00 8b 54 90 08 89 12 89 52 04 8d 42 08 c7 82 08 04 00 00 00 00 00
[ 26.351295] EIP: [f8501154] vxlan_init_net+0x14/0x50 [vxlan] SS:ESP 0068:f4a83d74
[ 26.358938] CR2: 0010
[ 26.362259] ---[ end trace 3aa9af5192e30e1f ]---

-- 
Jiri Benc
Re: [PATCH net-next] tcp: fix slow start after idle vs TSO/GSO
On Thu, Aug 20, 2015 at 1:08 PM, Eric Dumazet eric.duma...@gmail.com wrote:
> From: Eric Dumazet eduma...@google.com
>
> slow start after idle might reduce cwnd, but we perform this after first packet was cooked and sent.
>
> With TSO/GSO, it means that we might send a full TSO packet even if cwnd should have been reduced to IW10.
>
> Moving the SSAI check in skb_entail() makes sense, because we slightly reduce number of times this check is done, especially for large send() and TCP Small queue callbacks from softirq context.

Very nice catch, and this fix seems like a definite improvement.

One potential issue is that the connection can restart from idle not just because new data has been written (which this patch addresses), but also because the receive window opens and so now packets can be sent again. The old version of the code implicitly fired the restart code path in the "receive window opens" case as well, since it fired every time new data was sent.

We might want to check if we need to call tcp_cwnd_restart() in tcp_ack_update_window(), next to the call for tcp_fast_path_check()?

neal
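
For readers unfamiliar with slow start after idle, a small userspace sketch of the window decay being discussed. This is an assumption-laden model rather than the kernel's code: `demo_cwnd_after_idle` is a hypothetical name, times are plain integers in the same unit, and the floor is taken to be the smaller of the initial window and the current cwnd.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Each loop step consumes one RTO of idle time and halves cwnd while
 * idle time is still left and cwnd is above the floor. The result is
 * never below min(init_cwnd, cwnd).
 */
static uint32_t demo_cwnd_after_idle(uint32_t cwnd, uint32_t init_cwnd,
				     int32_t idle, int32_t rto)
{
	uint32_t floor = init_cwnd < cwnd ? init_cwnd : cwnd;

	assert(rto > 0);
	while ((idle -= rto) > 0 && cwnd > floor)
		cwnd >>= 1;
	return cwnd > floor ? cwnd : floor;
}
```

With a window of 80 packets, an initial window of 10 and an idle period of three RTOs, this model ends at 20 packets. A single TSO burst sized for the old window of 80 would therefore overshoot, which is the situation the patch description is concerned with.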
Re: Reply: [PATCH 1/5] net: add Hisilicon Network Subsystem support (config and documents)
On Monday 17 August 2015 01:28:07 Liguozhu wrote:
> Thanks, Arnd.
>
> Regarding the ae-name: it is the name of the Acceleration Engine. It is provided by the BIOS according to the position and the enabled features of the IP. So soc0 means it is on SoC No. 0, while n4 means it is running in Non-dsaf mode 4.
>
> Ideally, we should set up a rule to name it. But as I said in the patchset, the IP was originally designed for a bare metal solution, it is worthless to export all modes, and we are planning to add more modes for Linux itself in a future version of the IP. So I think the better way is to leave it as a name but add more meaning in the future.

The name property is a bit awkward. The position is normally implied by the location of the parent device in the DT, so you should not need that at all and instead derive it elsewhere. You can also add strings to the compatible property instead of this, to signify differences in the programming that are based on how the IP block is used.

> Regarding the ae-opts: it is the initial value for the AE's runtime options. Currently, we have only the port number (there are 6XGE+2GE ports for a DSAF AE) as an option. But in a future version, we will add other options (such as enabling the Spanning Tree Protocol algorithm) and so on.

I think these can easily be converted into an index property and boolean flags (present if true, absent otherwise) for additional features.

> Should I add this background somewhere?

The binding document needs to list all supported configurations. If you have a string property, describe specifically what strings are allowed and what they mean, but better try to avoid strings altogether.

	Arnd
[PATCH net-next] netfilter: ipset: Fixing unnamed union init
In continue to proposed Vinson Lee's post [1], this patch fixes compilation issues founded at gcc 4.4.7. The initialization of .cidr field of unnamed unions causes compilation error in gcc 4.4.x. References Visible links [1] https://lkml.org/lkml/2015/7/5/74 Signed-off-by: Elad Raz el...@mellanox.com --- net/netfilter/ipset/ip_set_hash_netnet.c | 23 +-- net/netfilter/ipset/ip_set_hash_netportnet.c | 23 +-- 2 files changed, 42 insertions(+), 4 deletions(-) diff --git a/net/netfilter/ipset/ip_set_hash_netnet.c b/net/netfilter/ipset/ip_set_hash_netnet.c index 3c862c0..2bff1f0 100644 --- a/net/netfilter/ipset/ip_set_hash_netnet.c +++ b/net/netfilter/ipset/ip_set_hash_netnet.c @@ -131,6 +131,14 @@ hash_netnet4_data_next(struct hash_netnet4_elem *next, #define HOST_MASK 32 #include ip_set_hash_gen.h +static void +hash_netnet4_init(struct hash_netnet4_elem *e) +{ + e-ipcmp = 0; + e-cidr[0] = HOST_MASK; + e-cidr[1] = HOST_MASK; +} + static int hash_netnet4_kadt(struct ip_set *set, const struct sk_buff *skb, const struct xt_action_param *par, @@ -160,7 +168,7 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[], { const struct hash_netnet *h = set-data; ipset_adtfn adtfn = set-variant-adt[adt]; - struct hash_netnet4_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, }; + struct hash_netnet4_elem e = { }; struct ip_set_ext ext = IP_SET_INIT_UEXT(set); u32 ip = 0, ip_to = 0, last; u32 ip2 = 0, ip2_from = 0, ip2_to = 0, last2; @@ -169,6 +177,7 @@ hash_netnet4_uadt(struct ip_set *set, struct nlattr *tb[], if (tb[IPSET_ATTR_LINENO]) *lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]); + hash_netnet4_init(e); if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] || !ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS))) return -IPSET_ERR_PROTOCOL; @@ -357,6 +366,15 @@ hash_netnet6_data_next(struct hash_netnet4_elem *next, #define IP_SET_EMIT_CREATE #include ip_set_hash_gen.h +static void +hash_netnet6_init(struct hash_netnet6_elem *e) +{ + ipv6_addr_set(e-ip[0].in6, 0, 0, 0, 0); + 
ipv6_addr_set(e-ip[1].ip6, 0, 0, 0, 0); + e-cidr[0] = HOST_MASK; + e-cidr[1] = HOST_MASK; +} + static int hash_netnet6_kadt(struct ip_set *set, const struct sk_buff *skb, const struct xt_action_param *par, @@ -385,13 +403,14 @@ hash_netnet6_uadt(struct ip_set *set, struct nlattr *tb[], enum ipset_adt adt, u32 *lineno, u32 flags, bool retried) { ipset_adtfn adtfn = set-variant-adt[adt]; - struct hash_netnet6_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, }; + struct hash_netnet6_elem e = { }; struct ip_set_ext ext = IP_SET_INIT_UEXT(set); int ret; if (tb[IPSET_ATTR_LINENO]) *lineno = nla_get_u32(tb[IPSET_ATTR_LINENO]); + hash_netnet6_init(e); if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] || !ip_set_optattr_netorder(tb, IPSET_ATTR_CADT_FLAGS))) return -IPSET_ERR_PROTOCOL; diff --git a/net/netfilter/ipset/ip_set_hash_netportnet.c b/net/netfilter/ipset/ip_set_hash_netportnet.c index 0c68734..0695c5c 100644 --- a/net/netfilter/ipset/ip_set_hash_netportnet.c +++ b/net/netfilter/ipset/ip_set_hash_netportnet.c @@ -142,6 +142,14 @@ hash_netportnet4_data_next(struct hash_netportnet4_elem *next, #define HOST_MASK 32 #include ip_set_hash_gen.h +static void +hash_netportnet4_init(struct hash_netportnet4_elem *e) +{ + e-ipcmp = 0; + e-cidr[0] = HOST_MASK; + e-cidr[1] = HOST_MASK; +} + static int hash_netportnet4_kadt(struct ip_set *set, const struct sk_buff *skb, const struct xt_action_param *par, @@ -175,7 +183,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[], { const struct hash_netportnet *h = set-data; ipset_adtfn adtfn = set-variant-adt[adt]; - struct hash_netportnet4_elem e = { .cidr = { HOST_MASK, HOST_MASK, }, }; + struct hash_netportnet4_elem e = { }; struct ip_set_ext ext = IP_SET_INIT_UEXT(set); u32 ip = 0, ip_to = 0, ip_last, p = 0, port, port_to; u32 ip2_from = 0, ip2_to = 0, ip2_last, ip2; @@ -185,6 +193,7 @@ hash_netportnet4_uadt(struct ip_set *set, struct nlattr *tb[], if (tb[IPSET_ATTR_LINENO]) *lineno = 
nla_get_u32(tb[IPSET_ATTR_LINENO]); + hash_netportnet4_init(e); if (unlikely(!tb[IPSET_ATTR_IP] || !tb[IPSET_ATTR_IP2] || !ip_set_attr_netorder(tb, IPSET_ATTR_PORT) || !ip_set_optattr_netorder(tb, IPSET_ATTR_PORT_TO) || @@ -412,6 +421,15 @@ hash_netportnet6_data_next(struct hash_netportnet4_elem *next, #define IP_SET_EMIT_CREATE #include ip_set_hash_gen.h +static void +hash_netportnet6_init(struct hash_netportnet6_elem *e) +{
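
The workaround in this patch can be sketched in plain user-space C. This is an illustration only, not the ipset code: `struct elem`, its field layout, and `elem_init()` are hypothetical stand-ins, and only `HOST_MASK` and the `cidr` member name are taken from the patch. The idea is to zero the element first and then assign the members of the unnamed union at run time, instead of naming them in a designated initializer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HOST_MASK 32

/* Hypothetical element type: cidr[] lives inside an unnamed union. */
struct elem {
	uint32_t ip[2];
	union {
		uint8_t  cidr[2];
		uint16_t ccmp;	/* both prefix lengths viewed as one value */
	};
};

/*
 * Run-time initialisation. gcc 4.4.x is reported to reject
 * "{ .cidr = { HOST_MASK, HOST_MASK } }" for members of an unnamed
 * union, so the members are assigned one by one instead.
 */
static void elem_init(struct elem *e)
{
	assert(e != NULL);
	memset(e, 0, sizeof(*e));
	e->cidr[0] = HOST_MASK;
	e->cidr[1] = HOST_MASK;
}
```

A caller would declare `struct elem e;` and call `elem_init(&e);` before filling in the addresses, which mirrors the order used in the `*_uadt()` functions of the patch.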
[net-next PATCH 3/3] net: sched: fall back to noqueue when removing root qdisc
When removing the root qdisc, the interface should fall back to noqueue as the 'real' minimal qdisc instead of the default one. Therefore dev_graft_qdisc() has to be adjusted to assign noqueue if NULL was passed as new qdisc, and qdisc_graft() needs to assign noqueue to dev->qdisc instead of noop to prevent dev_activate() from attaching default qdiscs to the interface.

Note that it is also necessary to have dev_graft_qdisc() set dev_queue->qdisc to the new qdisc instead of (unconditionally) noop. I don't know why this was there at all (originates from pre-git time), but it seems wrong to me. It could be worked around by dropping the extra check for noqueue in transition_one_qdisc(), maybe with unintended side-effects.

Signed-off-by: Phil Sutter p...@nwl.cc
---
 net/sched/sch_api.c     | 2 +-
 net/sched/sch_generic.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 224374c..3b2cf30 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -839,7 +839,7 @@ skip:
 					   dev->qdisc, new);
 			if (new && !new->ops->attach)
 				atomic_inc(&new->refcnt);
-			dev->qdisc = new ? : &noop_qdisc;
+			dev->qdisc = new ? : &noqueue_qdisc;
 
 			if (new && new->ops->attach)
 				new->ops->attach(new);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index ab614ee..556de30 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -723,9 +723,9 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
 
 	/* ... and graft new one */
 	if (qdisc == NULL)
-		qdisc = &noop_qdisc;
+		qdisc = &noqueue_qdisc;
 	dev_queue->qdisc_sleeping = qdisc;
-	rcu_assign_pointer(dev_queue->qdisc, &noop_qdisc);
+	rcu_assign_pointer(dev_queue->qdisc, qdisc);
 
 	spin_unlock_bh(root_lock);
 
--
2.1.2
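
The behaviour change described in this patch can be modelled in a few lines of user-space C. This is a simplified sketch under stated assumptions: `struct qdisc`, `struct tx_queue`, and `graft()` are hypothetical stand-ins for the kernel objects, and locking and RCU are left out entirely. It shows only the two points the commit message makes: a NULL replacement falls back to noqueue, and the active pointer follows the newly grafted qdisc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel objects. */
struct qdisc {
	const char *name;
};

struct tx_queue {
	struct qdisc *qdisc;		/* discipline currently in use */
	struct qdisc *qdisc_sleeping;	/* discipline configured for the queue */
};

static struct qdisc noqueue_qdisc = { "noqueue" };

/* Attach new_q to txq; a NULL new_q means "the root qdisc was removed". */
static struct qdisc *graft(struct tx_queue *txq, struct qdisc *new_q)
{
	struct qdisc *old;

	assert(txq != NULL);
	old = txq->qdisc_sleeping;
	if (new_q == NULL)
		new_q = &noqueue_qdisc;	/* fall back to noqueue, not noop */
	txq->qdisc_sleeping = new_q;
	txq->qdisc = new_q;		/* active pointer follows the new qdisc */
	return old;
}
```

Calling `graft(&q, NULL)` on a queue that previously used some FIFO leaves both pointers at `noqueue_qdisc` and returns the old discipline to the caller.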
[net-next PATCH 1/3] net: sched: make noqueue_qdisc non-static
This needs to be referenced from within net/sched/sch_api.c later.

Signed-off-by: Phil Sutter p...@nwl.cc
---
 include/net/sch_generic.h | 1 +
 net/sched/sch_generic.c   | 3 +--
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 2eab08c..4495193 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -337,6 +337,7 @@ static inline void sch_tree_unlock(const struct Qdisc *q)
 #define tcf_tree_unlock(tp)	sch_tree_unlock((tp)->q)
 
 extern struct Qdisc noop_qdisc;
+extern struct Qdisc noqueue_qdisc;
 extern struct Qdisc_ops noop_qdisc_ops;
 extern struct Qdisc_ops pfifo_fast_ops;
 extern struct Qdisc_ops mq_qdisc_ops;
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 942fea8..1fb65f9 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -425,13 +425,12 @@ static struct Qdisc_ops noqueue_qdisc_ops __read_mostly = {
 	.owner		=	THIS_MODULE,
 };
 
-static struct Qdisc noqueue_qdisc;
 static struct netdev_queue noqueue_netdev_queue = {
 	.qdisc		=	&noqueue_qdisc,
 	.qdisc_sleeping	=	&noqueue_qdisc,
 };
 
-static struct Qdisc noqueue_qdisc = {
+struct Qdisc noqueue_qdisc = {
 	.enqueue	=	NULL,
 	.dequeue	=	noop_dequeue,
 	.flags		=	TCQ_F_BUILTIN,
--
2.1.2
[net-next PATCH 0/3] net: sched: allow switching qdisc to noqueue intuitively
This patch series improves the integration of the noqueue qdisc to become the fallback queueing if no other is attached to an interface.

Before it was rather an add-on, a simpler alternative to a FIFO if no congestion is expected or possible. It has become the default qdisc for virtual interfaces, and could be attached by this mechanism only (through removing the root qdisc after having set tx_queue_len to zero for interfaces not defaulting to noqueue otherwise).

This series does not change the default qdisc chosen for new interfaces, but upon removal of the root qdisc from an interface, the kernel won't fall back to the default but to noqueue instead.

Phil Sutter (3):
  net: sched: make noqueue_qdisc non-static
  net: sched: allocate a handle to default qdiscs
  net: sched: fall back to noqueue when removing root qdisc

 include/net/sch_generic.h |  2 ++
 net/sched/sch_api.c       |  5 +++--
 net/sched/sch_generic.c   | 12
 3 files changed, 13 insertions(+), 6 deletions(-)

--
2.1.2
Re: [PATCH net-next] tcp: fix slow start after idle vs TSO/GSO
On Fri, 2015-08-21 at 11:10 -0400, Neal Cardwell wrote:

> Very nice catch, and this fix seems like a definite improvement.
>
> One potential issue is that the connection can restart from idle not
> just because new data has been written (which this patch addresses),
> but also because the receive window opens and so now packets can be
> sent again. The old version of the code implicitly fired the restart
> code path in the receive window opens case as well, since it fired
> every time new data was sent. We might want to check if we need to
> call tcp_cwnd_restart() in tcp_ack_update_window(), next to the call
> for tcp_fast_path_check()?

Excellent, I wrote a 2nd packetdrill test to exercise this path, will submit a v2 soon.

Thanks Neal
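
For readers following the thread, the kind of window decay that a restart after idle applies can be sketched as below. This is a user-space illustration, not the patch under discussion and not a copy of tcp_cwnd_restart(): the function name, the parameters, and the millisecond units are assumptions. The congestion window is halved once for every RTO period that the idle time strictly exceeds, and it never drops below the restart window.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the congestion window to use after the
 * connection has been idle for idle_ms, given a retransmission timeout
 * of rto_ms. All window values are in packets.
 */
static uint32_t cwnd_after_idle(uint32_t cwnd, uint32_t restart_cwnd,
				int32_t idle_ms, int32_t rto_ms)
{
	assert(rto_ms > 0);

	/* The restart window never exceeds the current window. */
	if (restart_cwnd > cwnd)
		restart_cwnd = cwnd;

	/* Halve once per RTO of idle time, down to the restart window. */
	while ((idle_ms -= rto_ms) > 0 && cwnd > restart_cwnd)
		cwnd >>= 1;

	return cwnd > restart_cwnd ? cwnd : restart_cwnd;
}
```

Neal's point is about when such a helper gets called: a sender can leave the idle state because the application wrote data or because the peer opened its receive window, and both paths need to apply the decay.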
Re: [net-next PATCH 2/3] net: sched: allocate a handle to default qdiscs
On Fri, 2015-08-21 at 17:58 +0200, Phil Sutter wrote:

> Since tc_get_qdisc() does not allow to remove a qdisc with zero handle,
> a handle needs to be allocated to default qdiscs (currently pfifo_fast
> or mq) in order to allow removing them.
>
> Signed-off-by: Phil Sutter p...@nwl.cc
> ---
>  include/net/sch_generic.h | 1 +
>  net/sched/sch_api.c       | 3 ++-
>  net/sched/sch_generic.c   | 5 +
>  3 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
> index 4495193..2bfc898 100644
> --- a/include/net/sch_generic.h
> +++ b/include/net/sch_generic.h
> @@ -391,6 +391,7 @@ void dev_deactivate(struct net_device *dev);
>  void dev_deactivate_many(struct list_head *head);
>  struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
>  			      struct Qdisc *qdisc);
> +u32 qdisc_alloc_handle(struct net_device *dev);
>  void qdisc_reset(struct Qdisc *qdisc);
>  void qdisc_destroy(struct Qdisc *qdisc);
>  void qdisc_tree_decrease_qlen(struct Qdisc *qdisc, unsigned int n);
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index f06aa01..224374c 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -723,7 +723,7 @@ EXPORT_SYMBOL(qdisc_class_hash_remove);
>  /* Allocate an unique handle from space managed by kernel
>   * Possible range is [8000-FFFF]:0000 (0x8000 values)
>   */
> -static u32 qdisc_alloc_handle(struct net_device *dev)
> +u32 qdisc_alloc_handle(struct net_device *dev)
>  {
>  	int i = 0x8000;
>  	static u32 autohandle = TC_H_MAKE(0x80000000U, 0);
> @@ -739,6 +739,7 @@ static u32 qdisc_alloc_handle(struct net_device *dev)
>
>  	return 0;
>  }
> +EXPORT_SYMBOL(qdisc_alloc_handle);
>
>  void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
>  {
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 1fb65f9..ab614ee 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -634,6 +634,11 @@ struct Qdisc *qdisc_create_dflt(struct netdev_queue *dev_queue,
>  	if (IS_ERR(sch))
>  		goto errout;
>  	sch->parent = parentid;
> +#ifdef CONFIG_NET_SCHED
> +	sch->handle = qdisc_alloc_handle(dev_queue->dev);
> +	if (!sch->handle)
> +		goto errout;
> +#endif
>
>  	if (!ops->init || ops->init(sch, NULL) == 0)
>  		return sch;

This might break HTB setups with more than 32768 classes?

The pfifo qdisc that gets attached had no handle.

qdisc_alloc_handle() has a limited range.
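
The concern about the limited range can be made concrete with a small user-space model. This is a sketch under stated assumptions, not the kernel allocator: the bitmap, `alloc_handle()`, and `free_handle()` are hypothetical, the pool size of 0x8000 is taken from the comment in the quoted patch, and special values that the real code may skip are ignored. Once every major number in the pool is taken, the allocator returns 0, which the quoted hunk in qdisc_create_dflt() turns into a failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIRST_MAJOR 0x8000u	/* first major number handed out */
#define POOL_SIZE   0x8000u	/* number of majors in the pool */

static bool in_use[POOL_SIZE];
static uint32_t next_slot;

/* Return a free handle (major in the upper 16 bits), or 0 if none is left. */
static uint32_t alloc_handle(void)
{
	uint32_t tries;

	for (tries = 0; tries < POOL_SIZE; tries++) {
		uint32_t slot = next_slot;

		next_slot = (next_slot + 1) % POOL_SIZE;
		if (!in_use[slot]) {
			in_use[slot] = true;
			return (FIRST_MAJOR + slot) << 16;
		}
	}
	return 0;
}

/* Give a handle obtained from alloc_handle() back to the pool. */
static void free_handle(uint32_t handle)
{
	uint32_t major = handle >> 16;

	assert(major >= FIRST_MAJOR);
	in_use[major - FIRST_MAJOR] = false;
}
```

If every leaf class receives a default child qdisc and each of those now consumes one handle, a setup with more classes than the pool holds would start failing, which is the scenario raised in the reply.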
[PATCH 3/4 nf-next] ipvs: add sync_maxlen parameter for the sync daemon
From: Julian Anastasov j...@ssi.bg Allow setups with large MTU to send large sync packets by adding sync_maxlen parameter. The default value is now based on MTU but no more than 1500 for compatibility reasons. To avoid problems if MTU changes allow fragmentation by sending packets with DF=0. Problem reported by Dan Carpenter. Reported-by: Dan Carpenter dan.carpen...@oracle.com Signed-off-by: Julian Anastasov j...@ssi.bg Signed-off-by: Simon Horman ho...@verge.net.au --- include/net/ip_vs.h | 19 +++--- include/uapi/linux/ip_vs.h | 1 + net/netfilter/ipvs/ip_vs_ctl.c | 53 ++-- net/netfilter/ipvs/ip_vs_sync.c | 137 ++-- 4 files changed, 108 insertions(+), 102 deletions(-) diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h index 4e3731ee4eac..2fdc13caf712 100644 --- a/include/net/ip_vs.h +++ b/include/net/ip_vs.h @@ -846,6 +846,13 @@ struct ipvs_master_sync_state { /* How much time to keep dests in trash */ #define IP_VS_DEST_TRASH_PERIOD(120 * HZ) +struct ipvs_sync_daemon_cfg { + int syncid; + u16 sync_maxlen; + /* multicast interface name */ + charmcast_ifn[IP_VS_IFNAME_MAXLEN]; +}; + /* IPVS in network namespace */ struct netns_ipvs { int gen;/* Generation */ @@ -961,15 +968,10 @@ struct netns_ipvs { spinlock_t sync_buff_lock; struct task_struct **backup_threads; int threads_mask; - int send_mesg_maxlen; - int recv_mesg_maxlen; volatile intsync_state; - volatile intmaster_syncid; - volatile intbackup_syncid; struct mutexsync_mutex; - /* multicast interface name */ - charmaster_mcast_ifn[IP_VS_IFNAME_MAXLEN]; - charbackup_mcast_ifn[IP_VS_IFNAME_MAXLEN]; + struct ipvs_sync_daemon_cfg mcfg; /* Master Configuration */ + struct ipvs_sync_daemon_cfg bcfg; /* Backup Configuration */ /* net name space ptr */ struct net *net;/* Needed by timer routines */ /* Number of heterogeneous destinations, needed becaus heterogeneous @@ -1408,7 +1410,8 @@ static inline void ip_vs_dest_put_and_free(struct ip_vs_dest *dest) /* IPVS sync daemon data and function prototypes * (from 
ip_vs_sync.c) */ -int start_sync_thread(struct net *net, int state, char *mcast_ifn, __u8 syncid); +int start_sync_thread(struct net *net, struct ipvs_sync_daemon_cfg *cfg, + int state); int stop_sync_thread(struct net *net, int state); void ip_vs_sync_conn(struct net *net, struct ip_vs_conn *cp, int pkts); diff --git a/include/uapi/linux/ip_vs.h b/include/uapi/linux/ip_vs.h index 3199243f2028..68377d8c8870 100644 --- a/include/uapi/linux/ip_vs.h +++ b/include/uapi/linux/ip_vs.h @@ -406,6 +406,7 @@ enum { IPVS_DAEMON_ATTR_STATE, /* sync daemon state (master/backup) */ IPVS_DAEMON_ATTR_MCAST_IFN, /* multicast interface name */ IPVS_DAEMON_ATTR_SYNC_ID, /* SyncID we belong to */ + IPVS_DAEMON_ATTR_SYNC_MAXLEN, /* UDP Payload Size */ __IPVS_DAEMON_ATTR_MAX, }; diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index af0b69e411b7..96f7bbfd5e1d 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2336,10 +2336,15 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len) struct ip_vs_daemon_user *dm = (struct ip_vs_daemon_user *)arg; if (cmd == IP_VS_SO_SET_STARTDAEMON) { + struct ipvs_sync_daemon_cfg cfg; + + memset(cfg, 0, sizeof(cfg)); + strlcpy(cfg.mcast_ifn, dm-mcast_ifn, + sizeof(cfg.mcast_ifn)); + cfg.syncid = dm-syncid; rtnl_lock(); mutex_lock(ipvs-sync_mutex); - ret = start_sync_thread(net, dm-state, dm-mcast_ifn, - dm-syncid); + ret = start_sync_thread(net, cfg, dm-state); mutex_unlock(ipvs-sync_mutex); rtnl_unlock(); } else { @@ -2650,15 +2655,15 @@ do_ip_vs_get_ctl(struct sock *sk, int cmd, void __user *user, int *len) mutex_lock(ipvs-sync_mutex); if (ipvs-sync_state IP_VS_STATE_MASTER) { d[0].state = IP_VS_STATE_MASTER; - strlcpy(d[0].mcast_ifn, ipvs-master_mcast_ifn, + strlcpy(d[0].mcast_ifn, ipvs-mcfg.mcast_ifn, sizeof(d[0].mcast_ifn)); -
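
The default described in the commit message ("based on MTU but no more than 1500") can be illustrated with a short calculation. This is only a sketch: `default_sync_maxlen()` is a hypothetical helper, and subtracting a 20-byte IPv4 header and an 8-byte UDP header is an assumption made here because the new attribute is documented as a UDP payload size. The hunk that actually computes the value is not part of the quoted text.

```c
#include <assert.h>

#define COMPAT_MTU	1500u	/* cap named in the commit message */
#define IPV4_HDR_LEN	20u	/* assumption: IPv4 header without options */
#define UDP_HDR_LEN	8u

/* Hypothetical helper: default sync payload size for a given device MTU. */
static unsigned int default_sync_maxlen(unsigned int mtu)
{
	unsigned int capped = mtu < COMPAT_MTU ? mtu : COMPAT_MTU;

	assert(capped > IPV4_HDR_LEN + UDP_HDR_LEN);
	return capped - IPV4_HDR_LEN - UDP_HDR_LEN;
}
```

Under these assumptions a jumbo-frame interface still defaults to the same payload as a 1500-byte one, and a larger value has to be requested explicitly through the new sync_maxlen parameter.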
[GIT PULL nf-next] Second Round of IPVS Updates for v4.3
Hi Pablo,

please consider these IPVS Updates for v4.3.

I realise these are a little late in the cycle, so if you would prefer me to repost them for v4.4 then just let me know.

The updates include:
* A new scheduler from Raducu Deaconu
* Enhanced configurability of the sync daemon from Julian Anastasov

The following changes since commit 81bf1c64e7fe08f956c74fe2b0f1fa6eb163bd91:

  Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next (2015-08-21 06:09:05 +0200)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next.git tags/ipvs2-for-v4.3

for you to fetch changes up to d33288172e72c4729e8b9f2243fb40601afabc8f:

  ipvs: add more mcast parameters for the sync daemon (2015-08-21 09:10:11 -0700)

Julian Anastasov (3):
      ipvs: call rtnl_lock early
      ipvs: add sync_maxlen parameter for the sync daemon
      ipvs: add more mcast parameters for the sync daemon

Raducu Deaconu (1):
      ipvs: Add ovf scheduler

 include/net/ip_vs.h             |  23 ++--
 include/uapi/linux/ip_vs.h      |   5 +
 net/netfilter/ipvs/Kconfig      |  11 ++
 net/netfilter/ipvs/Makefile     |   1 +
 net/netfilter/ipvs/ip_vs_ctl.c  | 143 -
 net/netfilter/ipvs/ip_vs_ovf.c  |  86 +
 net/netfilter/ipvs/ip_vs_sync.c | 269 ++--
 7 files changed, 402 insertions(+), 136 deletions(-)
 create mode 100644 net/netfilter/ipvs/ip_vs_ovf.c
[PATCH 4/4 nf-next] ipvs: add more mcast parameters for the sync daemon
From: Julian Anastasov j...@ssi.bg - mcast_group: configure the multicast address, now IPv6 is supported too - mcast_port: configure the multicast port - mcast_ttl: configure the multicast TTL/HOP_LIMIT Signed-off-by: Julian Anastasov j...@ssi.bg Signed-off-by: Simon Horman ho...@verge.net.au --- include/net/ip_vs.h | 4 ++ include/uapi/linux/ip_vs.h | 4 ++ net/netfilter/ipvs/ip_vs_ctl.c | 50 ++- net/netfilter/ipvs/ip_vs_sync.c | 138 +--- 4 files changed, 172 insertions(+), 24 deletions(-) diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h index 2fdc13caf712..9b9ca87a4210 100644 --- a/include/net/ip_vs.h +++ b/include/net/ip_vs.h @@ -847,8 +847,12 @@ struct ipvs_master_sync_state { #define IP_VS_DEST_TRASH_PERIOD(120 * HZ) struct ipvs_sync_daemon_cfg { + union nf_inet_addr mcast_group; int syncid; u16 sync_maxlen; + u16 mcast_port; + u8 mcast_af; + u8 mcast_ttl; /* multicast interface name */ charmcast_ifn[IP_VS_IFNAME_MAXLEN]; }; diff --git a/include/uapi/linux/ip_vs.h b/include/uapi/linux/ip_vs.h index 68377d8c8870..391395c06c7e 100644 --- a/include/uapi/linux/ip_vs.h +++ b/include/uapi/linux/ip_vs.h @@ -407,6 +407,10 @@ enum { IPVS_DAEMON_ATTR_MCAST_IFN, /* multicast interface name */ IPVS_DAEMON_ATTR_SYNC_ID, /* SyncID we belong to */ IPVS_DAEMON_ATTR_SYNC_MAXLEN, /* UDP Payload Size */ + IPVS_DAEMON_ATTR_MCAST_GROUP, /* IPv4 Multicast Address */ + IPVS_DAEMON_ATTR_MCAST_GROUP6, /* IPv6 Multicast Address */ + IPVS_DAEMON_ATTR_MCAST_PORT,/* Multicast Port (base) */ + IPVS_DAEMON_ATTR_MCAST_TTL, /* Multicast TTL */ __IPVS_DAEMON_ATTR_MAX, }; diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index 96f7bbfd5e1d..1a23e91d50d8 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2819,6 +2819,10 @@ static const struct nla_policy ip_vs_daemon_policy[IPVS_DAEMON_ATTR_MAX + 1] = { .len = IP_VS_IFNAME_MAXLEN }, [IPVS_DAEMON_ATTR_SYNC_ID] = { .type = NLA_U32 }, [IPVS_DAEMON_ATTR_SYNC_MAXLEN] = { .type = NLA_U16 
}, + [IPVS_DAEMON_ATTR_MCAST_GROUP] = { .type = NLA_U32 }, + [IPVS_DAEMON_ATTR_MCAST_GROUP6] = { .len = sizeof(struct in6_addr) }, + [IPVS_DAEMON_ATTR_MCAST_PORT] = { .type = NLA_U16 }, + [IPVS_DAEMON_ATTR_MCAST_TTL]= { .type = NLA_U8 }, }; /* Policy used for attributes in nested attribute IPVS_CMD_ATTR_SERVICE */ @@ -3288,8 +3292,21 @@ static int ip_vs_genl_fill_daemon(struct sk_buff *skb, __u32 state, if (nla_put_u32(skb, IPVS_DAEMON_ATTR_STATE, state) || nla_put_string(skb, IPVS_DAEMON_ATTR_MCAST_IFN, c-mcast_ifn) || nla_put_u32(skb, IPVS_DAEMON_ATTR_SYNC_ID, c-syncid) || - nla_put_u16(skb, IPVS_DAEMON_ATTR_SYNC_MAXLEN, c-sync_maxlen)) + nla_put_u16(skb, IPVS_DAEMON_ATTR_SYNC_MAXLEN, c-sync_maxlen) || + nla_put_u16(skb, IPVS_DAEMON_ATTR_MCAST_PORT, c-mcast_port) || + nla_put_u8(skb, IPVS_DAEMON_ATTR_MCAST_TTL, c-mcast_ttl)) goto nla_put_failure; +#ifdef CONFIG_IP_VS_IPV6 + if (c-mcast_af == AF_INET6) { + if (nla_put_in6_addr(skb, IPVS_DAEMON_ATTR_MCAST_GROUP6, +c-mcast_group.in6)) + goto nla_put_failure; + } else +#endif + if (c-mcast_af == AF_INET + nla_put_in_addr(skb, IPVS_DAEMON_ATTR_MCAST_GROUP, + c-mcast_group.ip)) + goto nla_put_failure; nla_nest_end(skb, nl_daemon); return 0; @@ -3370,6 +3387,37 @@ static int ip_vs_genl_new_daemon(struct net *net, struct nlattr **attrs) if (a) c.sync_maxlen = nla_get_u16(a); + a = attrs[IPVS_DAEMON_ATTR_MCAST_GROUP]; + if (a) { + c.mcast_af = AF_INET; + c.mcast_group.ip = nla_get_in_addr(a); + if (!ipv4_is_multicast(c.mcast_group.ip)) + return -EINVAL; + } else { + a = attrs[IPVS_DAEMON_ATTR_MCAST_GROUP6]; + if (a) { +#ifdef CONFIG_IP_VS_IPV6 + int addr_type; + + c.mcast_af = AF_INET6; + c.mcast_group.in6 = nla_get_in6_addr(a); + addr_type = ipv6_addr_type(c.mcast_group.in6); + if (!(addr_type IPV6_ADDR_MULTICAST)) + return -EINVAL; +#else + return -EAFNOSUPPORT; +#endif + } + } + + a = attrs[IPVS_DAEMON_ATTR_MCAST_PORT]; + if (a) +
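
The address checks that this patch applies to a configured group can be sketched in user-space C. The two helpers below are hypothetical and are not the kernel's ipv4_is_multicast() or ipv6_addr_type(); they only encode the address ranges involved: 224.0.0.0/4 for IPv4 and ff00::/8 for IPv6. The IPv4 helper takes the address in host byte order, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IPv4 multicast is 224.0.0.0/4; the address is in host byte order here. */
static bool is_ipv4_multicast(uint32_t addr_host_order)
{
	return (addr_host_order & 0xf0000000u) == 0xe0000000u;
}

/* IPv6 multicast is ff00::/8: the first byte of the address is 0xff. */
static bool is_ipv6_multicast(const uint8_t addr[16])
{
	assert(addr != NULL);
	return addr[0] == 0xff;
}
```

A configuration path in this style would reject a group address with an error before starting the daemon, as the quoted hunk does by returning -EINVAL.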
[PATCH 2/4 nf-next] ipvs: call rtnl_lock early
From: Julian Anastasov j...@ssi.bg

When the sync daemon is started we need to hold rtnl lock while calling ip_mc_join_group. Currently, we have a wrong locking order because the correct one is rtnl_lock -> __ip_vs_mutex. It is implied from the usage of __ip_vs_mutex in ip_vs_dst_event() which is called under rtnl lock during NETDEV_* notifications.

Fix the problem by calling rtnl_lock early only for the start_sync_thread call. As a bonus this fixes the usage of __dev_get_by_name which was not called under rtnl lock.

This patch actually extends and depends on commit 54ff9ef36bdf ("ipv4, ipv6: kill ip_mc_{join, leave}_group and ipv6_sock_mc_{join, drop}").

Signed-off-by: Julian Anastasov j...@ssi.bg
Signed-off-by: Simon Horman ho...@verge.net.au
---
 net/netfilter/ipvs/ip_vs_ctl.c  | 50 +++--
 net/netfilter/ipvs/ip_vs_sync.c |  2 --
 2 files changed, 33 insertions(+), 19 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index 24c554201a76..af0b69e411b7 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -2335,13 +2335,18 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user *user, unsigned int len) cmd == IP_VS_SO_SET_STOPDAEMON) { struct ip_vs_daemon_user *dm = (struct ip_vs_daemon_user *)arg; - mutex_lock(ipvs-sync_mutex); - if (cmd == IP_VS_SO_SET_STARTDAEMON) + if (cmd == IP_VS_SO_SET_STARTDAEMON) { + rtnl_lock(); + mutex_lock(ipvs-sync_mutex); ret = start_sync_thread(net, dm-state, dm-mcast_ifn, dm-syncid); - else + mutex_unlock(ipvs-sync_mutex); + rtnl_unlock(); + } else { + mutex_lock(ipvs-sync_mutex); ret = stop_sync_thread(net, dm-state); - mutex_unlock(ipvs-sync_mutex); + mutex_unlock(ipvs-sync_mutex); + } goto out_dec; } @@ -3342,6 +3347,9 @@ nla_put_failure: static int ip_vs_genl_new_daemon(struct net *net, struct nlattr **attrs) { + struct netns_ipvs *ipvs = net_ipvs(net); + int ret; + if (!(attrs[IPVS_DAEMON_ATTR_STATE] attrs[IPVS_DAEMON_ATTR_MCAST_IFN] attrs[IPVS_DAEMON_ATTR_SYNC_ID])) @@ 
-3353,19 +3361,30 @@ static int ip_vs_genl_new_daemon(struct net *net, struct nlattr **attrs) if (net_ipvs(net)-mixed_address_family_dests 0) return -EINVAL; - return start_sync_thread(net, -nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE]), -nla_data(attrs[IPVS_DAEMON_ATTR_MCAST_IFN]), -nla_get_u32(attrs[IPVS_DAEMON_ATTR_SYNC_ID])); + rtnl_lock(); + mutex_lock(ipvs-sync_mutex); + ret = start_sync_thread(net, + nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE]), + nla_data(attrs[IPVS_DAEMON_ATTR_MCAST_IFN]), + nla_get_u32(attrs[IPVS_DAEMON_ATTR_SYNC_ID])); + mutex_unlock(ipvs-sync_mutex); + rtnl_unlock(); + return ret; } static int ip_vs_genl_del_daemon(struct net *net, struct nlattr **attrs) { + struct netns_ipvs *ipvs = net_ipvs(net); + int ret; + if (!attrs[IPVS_DAEMON_ATTR_STATE]) return -EINVAL; - return stop_sync_thread(net, - nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE])); + mutex_lock(ipvs-sync_mutex); + ret = stop_sync_thread(net, + nla_get_u32(attrs[IPVS_DAEMON_ATTR_STATE])); + mutex_unlock(ipvs-sync_mutex); + return ret; } static int ip_vs_genl_set_config(struct net *net, struct nlattr **attrs) @@ -3389,7 +3408,7 @@ static int ip_vs_genl_set_config(struct net *net, struct nlattr **attrs) static int ip_vs_genl_set_daemon(struct sk_buff *skb, struct genl_info *info) { - int ret = 0, cmd; + int ret = -EINVAL, cmd; struct net *net; struct netns_ipvs *ipvs; @@ -3400,22 +3419,19 @@ static int ip_vs_genl_set_daemon(struct sk_buff *skb, struct genl_info *info) if (cmd == IPVS_CMD_NEW_DAEMON || cmd == IPVS_CMD_DEL_DAEMON) { struct nlattr *daemon_attrs[IPVS_DAEMON_ATTR_MAX + 1]; - mutex_lock(ipvs-sync_mutex); if (!info-attrs[IPVS_CMD_ATTR_DAEMON] || nla_parse_nested(daemon_attrs, IPVS_DAEMON_ATTR_MAX, info-attrs[IPVS_CMD_ATTR_DAEMON], -ip_vs_daemon_policy)) { - ret = -EINVAL; +ip_vs_daemon_policy)) goto out; - }
[PATCH 1/4 nf-next] ipvs: Add ovf scheduler
From: Raducu Deaconu rhadoo.i...@gmail.com The weighted overflow scheduling algorithm directs network connections to the server with the highest weight that is currently available and overflows to the next when active connections exceed the node's weight. Signed-off-by: Raducu Deaconu rhadoo.i...@gmail.com Acked-by: Julian Anastasov j...@ssi.bg Signed-off-by: Simon Horman ho...@verge.net.au --- net/netfilter/ipvs/Kconfig | 11 ++ net/netfilter/ipvs/Makefile| 1 + net/netfilter/ipvs/ip_vs_ovf.c | 86 ++ 3 files changed, 98 insertions(+) create mode 100644 net/netfilter/ipvs/ip_vs_ovf.c diff --git a/net/netfilter/ipvs/Kconfig b/net/netfilter/ipvs/Kconfig index 3b6929dec748..b32fb0dbe237 100644 --- a/net/netfilter/ipvs/Kconfig +++ b/net/netfilter/ipvs/Kconfig @@ -162,6 +162,17 @@ config IP_VS_FO If you want to compile it in kernel, say Y. To compile it as a module, choose M here. If unsure, say N. +config IP_VS_OVF + tristate weighted overflow scheduling + ---help--- + The weighted overflow scheduling algorithm directs network + connections to the server with the highest weight that is + currently available and overflows to the next when active + connections exceed the node's weight. + + If you want to compile it in kernel, say Y. To compile it as a + module, choose M here. If unsure, say N. 
+ config IP_VS_LBLC tristate locality-based least-connection scheduling ---help--- diff --git a/net/netfilter/ipvs/Makefile b/net/netfilter/ipvs/Makefile index 38b2723b2e3d..67f3f4389602 100644 --- a/net/netfilter/ipvs/Makefile +++ b/net/netfilter/ipvs/Makefile @@ -27,6 +27,7 @@ obj-$(CONFIG_IP_VS_WRR) += ip_vs_wrr.o obj-$(CONFIG_IP_VS_LC) += ip_vs_lc.o obj-$(CONFIG_IP_VS_WLC) += ip_vs_wlc.o obj-$(CONFIG_IP_VS_FO) += ip_vs_fo.o +obj-$(CONFIG_IP_VS_OVF) += ip_vs_ovf.o obj-$(CONFIG_IP_VS_LBLC) += ip_vs_lblc.o obj-$(CONFIG_IP_VS_LBLCR) += ip_vs_lblcr.o obj-$(CONFIG_IP_VS_DH) += ip_vs_dh.o diff --git a/net/netfilter/ipvs/ip_vs_ovf.c b/net/netfilter/ipvs/ip_vs_ovf.c new file mode 100644 index ..f7d62c3b7329 --- /dev/null +++ b/net/netfilter/ipvs/ip_vs_ovf.c @@ -0,0 +1,86 @@ +/* + * IPVS:Overflow-Connection Scheduling module + * + * Authors: Raducu Deaconu rhadoo...@yahoo.com + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * Scheduler implements overflow loadbalancing according to number of active + * connections , will keep all conections to the node with the highest weight + * and overflow to the next node if the number of connections exceeds the node's + * weight. 
+ * Note that this scheduler might not be suitable for UDP because it only uses + * active connections + * + */ + +#define KMSG_COMPONENT IPVS +#define pr_fmt(fmt) KMSG_COMPONENT : fmt + +#include linux/module.h +#include linux/kernel.h + +#include net/ip_vs.h + +/* OVF Connection scheduling */ +static struct ip_vs_dest * +ip_vs_ovf_schedule(struct ip_vs_service *svc, const struct sk_buff *skb, + struct ip_vs_iphdr *iph) +{ + struct ip_vs_dest *dest, *h = NULL; + int hw = 0, w; + + IP_VS_DBG(6, ip_vs_ovf_schedule(): Scheduling...\n); + /* select the node with highest weight, go to next in line if active + * connections exceed weight + */ + list_for_each_entry_rcu(dest, svc-destinations, n_list) { + w = atomic_read(dest-weight); + if ((dest-flags IP_VS_DEST_F_OVERLOAD) || + atomic_read(dest-activeconns) w || + w == 0) + continue; + if (!h || w hw) { + h = dest; + hw = w; + } + } + + if (h) { + IP_VS_DBG_BUF(6, OVF: server %s:%u active %d w %d\n, + IP_VS_DBG_ADDR(h-af, h-addr), + ntohs(h-port), + atomic_read(h-activeconns), + atomic_read(h-weight)); + return h; + } + + ip_vs_scheduler_err(svc, no destination available); + return NULL; +} + +static struct ip_vs_scheduler ip_vs_ovf_scheduler = { + .name = ovf, + .refcnt = ATOMIC_INIT(0), + .module = THIS_MODULE, + .n_list = LIST_HEAD_INIT(ip_vs_ovf_scheduler.n_list), + .schedule = ip_vs_ovf_schedule, +}; + +static int __init ip_vs_ovf_init(void) +{ + return register_ip_vs_scheduler(ip_vs_ovf_scheduler); +} + +static void __exit ip_vs_ovf_cleanup(void) +{ +
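
The selection rule of the overflow scheduler can be restated as a small user-space function. This is a sketch, not the module code: `struct dest` and `ovf_pick()` are hypothetical, plain integers replace the kernel's atomic counters, and the comparison operators are assumed to be `>` because they were lost from the archived copy of the patch. A destination is skipped when it is flagged as overloaded, has zero weight, or already has more active connections than its weight; among the rest, the one with the highest weight wins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of a real server. */
struct dest {
	int weight;
	int activeconns;
	bool overloaded;
};

/* Return the eligible destination with the highest weight, or NULL. */
static const struct dest *ovf_pick(const struct dest *dests, size_t n)
{
	const struct dest *best = NULL;
	size_t i;

	assert(dests != NULL || n == 0);
	for (i = 0; i < n; i++) {
		const struct dest *d = &dests[i];

		/* Skip servers that are unusable or already over their weight. */
		if (d->overloaded || d->weight == 0 ||
		    d->activeconns > d->weight)
			continue;
		if (best == NULL || d->weight > best->weight)
			best = d;
	}
	return best;
}
```

With weights 100 and 50, new connections go to the first server until its active connection count exceeds 100, then overflow to the second, and return to the first once its count drops again.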
[PATCH] /net/ethernet/3com/3c59x.c:Fixed coding style errors and warnings.
Checks are also cleared Signed-off-by: Ravinder Atla rednivara...@gmail.com --- drivers/net/ethernet/3com/3c59x.c | 39 +-- 1 file changed, 21 insertions(+), 18 deletions(-) diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c index 2d1ce3c..18d242c 100644 --- a/drivers/net/ethernet/3com/3c59x.c +++ b/drivers/net/ethernet/3com/3c59x.c @@ -208,15 +208,18 @@ limit of 4K. of the drivers, and will likely be provided by some future kernel. */ enum pci_flags_bit { - PCI_USES_MASTER=4, + PCI_USES_MASTER = 4, }; -enum { IS_VORTEX=1, IS_BOOMERANG=2, IS_CYCLONE=4, IS_TORNADO=8, - EEPROM_8BIT=0x10, /* AKPM: Uses 0x230 as the base bitmaps for EEPROM reads */ - HAS_PWR_CTRL=0x20, HAS_MII=0x40, HAS_NWAY=0x80, HAS_CB_FNS=0x100, - INVERT_MII_PWR=0x200, INVERT_LED_PWR=0x400, MAX_COLLISION_RESET=0x800, - EEPROM_OFFSET=0x1000, HAS_HWCKSM=0x2000, WNO_XCVR_PWR=0x4000, - EXTRA_PREAMBLE=0x8000, EEPROM_RESET=0x1, }; +enum { IS_VORTEX=1, IS_BOOMERANG = 2, IS_CYCLONE = 4, IS_TORNADO = 8, + EEPROM_8BIT = 0x10, + /* AKPM: Uses 0x230 as the base bitmaps for EEPROM reads */ + HAS_PWR_CTRL = 0x20, HAS_MII = 0x40, HAS_NWAY = 0x80, + HAS_CB_FNS = 0x100, + INVERT_MII_PWR = 0x200, INVERT_LED_PWR = 0x400, + MAX_COLLISION_RESET = 0x800, + EEPROM_OFFSET = 0x1000, HAS_HWCKSM = 0x2000, WNO_XCVR_PWR = 0x4000, + EXTRA_PREAMBLE = 0x8000, EEPROM_RESET = 0x1, }; enum vortex_chips { CH_3C590 = 0, @@ -267,7 +270,6 @@ enum vortex_chips { CH_920B_EMB_WNM, }; - /* note: this array directly indexed by above enums, and MUST * be kept in sync with both the enums above, and the PCI device * table below @@ -280,9 +282,9 @@ static struct vortex_chip_info { } vortex_info_tbl[] = { {3c590 Vortex 10Mbps, PCI_USES_MASTER, IS_VORTEX, 32, }, - {3c592 EISA 10Mbps Demon/Vortex, /* AKPM: from Don's 3c59x_cb.c 0.49H */ + {3c592 EISA 10Mbps Demon/Vortex, /* AKPM: from Don's 3c59x_cb.c 0.49H */ PCI_USES_MASTER, IS_VORTEX, 32, }, - {3c597 EISA Fast Demon/Vortex, /* AKPM: from Don's 3c59x_cb.c 0.49H */ + 
{3c597 EISA Fast Demon/Vortex,/* AKPM: from Don's 3c59x_cb.c 0.49H */ PCI_USES_MASTER, IS_VORTEX, 32, }, {3c595 Vortex 100baseTx, PCI_USES_MASTER, IS_VORTEX, 32, }, @@ -292,15 +294,15 @@ static struct vortex_chip_info { {3c595 Vortex 100base-MII, PCI_USES_MASTER, IS_VORTEX, 32, }, {3c900 Boomerang 10baseT, -PCI_USES_MASTER, IS_BOOMERANG|EEPROM_RESET, 64, }, +PCI_USES_MASTER, IS_BOOMERANG | EEPROM_RESET, 64, }, {3c900 Boomerang 10Mbps Combo, -PCI_USES_MASTER, IS_BOOMERANG|EEPROM_RESET, 64, }, - {3c900 Cyclone 10Mbps TPO, /* AKPM: from Don's 0.99M */ +PCI_USES_MASTER, IS_BOOMERANG | EEPROM_RESET, 64, }, + {3c900 Cyclone 10Mbps TPO,/* AKPM: from Don's 0.99M */ PCI_USES_MASTER, IS_CYCLONE|HAS_HWCKSM, 128, }, {3c900 Cyclone 10Mbps Combo, PCI_USES_MASTER, IS_CYCLONE|HAS_HWCKSM, 128, }, - {3c900 Cyclone 10Mbps TPC, /* AKPM: from Don's 0.99M */ + {3c900 Cyclone 10Mbps TPC,/* AKPM: from Don's 0.99M */ PCI_USES_MASTER, IS_CYCLONE|HAS_HWCKSM, 128, }, {3c900B-FL Cyclone 10base-FL, PCI_USES_MASTER, IS_CYCLONE|HAS_HWCKSM, 128, }, @@ -331,8 +333,8 @@ static struct vortex_chip_info { {3c555 Laptop Hurricane, PCI_USES_MASTER, IS_CYCLONE|EEPROM_8BIT|HAS_HWCKSM, 128, }, {3c556 Laptop Tornado, -PCI_USES_MASTER, IS_TORNADO|HAS_NWAY|EEPROM_8BIT|HAS_CB_FNS|INVERT_MII_PWR| - HAS_HWCKSM, 128, }, +PCI_USES_MASTER, IS_TORNADO | HAS_NWAY | EEPROM_8BIT | HAS_CB_FNS | + INVERT_MII_PWR | HAS_HWCKSM, 128, }, {3c556B Laptop Hurricane, PCI_USES_MASTER, IS_TORNADO|HAS_NWAY|EEPROM_OFFSET|HAS_CB_FNS|INVERT_MII_PWR| WNO_XCVR_PWR|HAS_HWCKSM, 128, }, @@ -474,13 +476,14 @@ enum vortex_status { On the Vortex this window is always mapped at offsets 0x10-0x1f. */ enum Window1 { TX_FIFO = 0x10, RX_FIFO = 0x10, RxErrors = 0x14, - RxStatus = 0x18, Timer=0x1A, TxStatus = 0x1B, + RxStatus = 0x18, Timer = 0x1A, TxStatus = 0x1B, TxFree = 0x1C, /* Remaining free bytes in Tx buffer. */ }; + enum Window0 { Wn0EepromCmd = 10, /* Window 0: EEPROM command register. 
*/ Wn0EepromData = 12, /* Window 0: EEPROM results register. */ - IntrStatus=0x0E,/* Valid in all windows. */ + IntrStatus = 0x0E, /*
ipg and dl2k mess
Hello,

I've got an Asus NX1101 card with ICPlus IP1000A chip:

02:01.0 Ethernet controller [0200]: Sundance Technology Inc / IC Plus Corp IP1000 Family Gigabit Ethernet [13f0:1023] (rev 41)
	Subsystem: ASUSTeK Computer Inc. NX1101 [1043:8180]
	Flags: bus master, 66MHz, medium devsel, latency 32, IRQ 19
	I/O ports at a000 [size=256]
	Memory at f500 (32-bit, non-prefetchable) [size=256]
	[virtual] Expansion ROM at 3000 [disabled] [size=64K]
	Capabilities: [50] Power Management version 2

It does not work properly because the ipg driver is broken - it loses packets (easily reproduced by ping -f) and stops working under load with no messages (copying a 200MB file using scp at 100mbit is enough to reproduce it).

The dl2k (for TC902x chips, DL2000 is probably a rebranded TC902x) driver is very similar to ipg (for IP1000A). According to datasheets, IP1000A chip looks like a TC9021 with integrated PHY.

The patch below is enough to make my IP1000A card work with dl2k driver - no more lost packets and hangs. Haven't tested gigabit speed yet - the PHY will probably need some tweaking but that should be easy.

So maybe we should add IP1000A support to dl2k and remove the broken ipg driver. Does anyone have HW to test?
diff --git a/drivers/net/ethernet/dlink/dl2k.c b/drivers/net/ethernet/dlink/dl2k.c
index cf0a5fc..d5a60fe 100644
--- a/drivers/net/ethernet/dlink/dl2k.c
+++ b/drivers/net/ethernet/dlink/dl2k.c
@@ -433,9 +455,9 @@ rio_open (struct net_device *dev)
 	alloc_list (dev);
-	/* Get station address */
-	for (i = 0; i < 6; i++)
-		dw8(StationAddr0 + i, dev->dev_addr[i]);
+	/* Set station address */
+	for (i = 0; i < 3; i++)
+		dw16(StationAddr0 + 2 * i, cpu_to_le16(((u16 *)dev->dev_addr)[i]));
 	set_multicast (dev);
 	if (np->coalesce) {
diff --git a/drivers/net/ethernet/dlink/dl2k.h b/drivers/net/ethernet/dlink/dl2k.h
index 23c07b0..da35e66 100644
--- a/drivers/net/ethernet/dlink/dl2k.h
+++ b/drivers/net/ethernet/dlink/dl2k.h
@@ -411,6 +411,7 @@ struct netdev_private {
 static const struct pci_device_id rio_pci_tbl[] = {
 	{0x1186, 0x4000, PCI_ANY_ID, PCI_ANY_ID, },
 	{0x13f0, 0x1021, PCI_ANY_ID, PCI_ANY_ID, },
+	{0x13f0, 0x1023, PCI_ANY_ID, PCI_ANY_ID, },
 	{ }
 };
 MODULE_DEVICE_TABLE (pci, rio_pci_tbl);

--
Ondrej Zary
--
To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
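For readers following the station address change, here is a minimal userspace sketch of what the three 16-bit writes carry. It only models the byte packing as seen on a little-endian host; mac_to_words is a hypothetical helper and not part of dl2k, and the real driver goes through dw16() and cpu_to_le16() as in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from dl2k): pack a 6-byte MAC address into
 * three 16-bit words. Byte 2*i lands in the low half of word i and
 * byte 2*i+1 in the high half, which is what reading the address as
 * u16 values yields on a little-endian host.
 */
static void mac_to_words(const uint8_t mac[6], uint16_t words[3])
{
	size_t i;

	assert(mac != NULL && words != NULL);
	for (i = 0; i < 3; i++)
		words[i] = (uint16_t)(mac[2 * i] | (mac[2 * i + 1] << 8));
}
```

Whether the hardware expects exactly this layout is something only the datasheet or a test on real hardware can confirm; the sketch just makes the arithmetic visible.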
Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters
Corinna Vinschen vinsc...@redhat.com :
[...]
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index f790f61..f26a48d 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
[...]
@@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device *dev, int sset)
 	}
 }
+DECLARE_RTL_COND(rtl_reset_counters_cond)
+{
+	void __iomem *ioaddr = tp->mmio_addr;
+
+	return RTL_R32(CounterAddrLow) & CounterReset;
+}
+
+static void rtl8169_reset_counters(struct net_device *dev)
+{

rtl8169_reset_counters duplicates most of rtl8169_update_counters. Please factor out the dma_alloc + parametrized CounterAddrLow write + cleanup.

[...]
@@ -2220,6 +2273,39 @@ static void rtl8169_update_counters(struct net_device *dev)
 	dma_free_coherent(d, sizeof(*counters), counters, paddr);
 }
+static void rtl8169_init_counter_offsets(struct net_device *dev)
+{
+	struct rtl8169_private *tp = netdev_priv(dev);
+
+	/*
+	 * rtl8169_init_counter_offsets is called from rtl_open. On chip
+	 * versions prior to RTL_GIGA_MAC_VER_19 the tally counters are only
+	 * reset by a power cycle, while the counter values collected by the
+	 * driver are reset at every driver unload/load cycle.
+	 *
+	 * To make sure the HW values returned by @get_stats64 match the SW
+	 * values, we collect the initial values at first open(*) and use them
+	 * as offsets to normalize the values returned by @get_stats64.
+	 *
+	 * (*) We can't call rtl8169_init_counter_offsets from rtl_init_one
+	 * for the reason stated in rtl8169_update_counters; CmdRxEnb is only
+	 * set at open time by rtl_hw_start.
+	 */
+
+	if (tp->tc_offset.inited)
+		return;
+
+	rtl8169_reset_counters(dev);
+
+	rtl8169_update_counters(dev);

The code should propagate failure when both rtl8169_reset_counters and rtl8169_update_counters fail.

--
Ueimor
Re: [PATCH ipsec-next v2] xfrm: Use VRF master index if output device is enslaved
On Fri, Aug 21, 2015 at 02:11:21AM +0300, Nikolay Aleksandrov wrote:

On Aug 21, 2015, at 1:06 AM, David Ahern d...@cumulusnetworks.com wrote:

Directs route lookups to VRF table. Compiles out if NET_VRF is not enabled. With this patch able to successfully bring up ipsec tunnels in VRFs, even with duplicate network configuration.

Signed-off-by: David Ahern d...@cumulusnetworks.com
---
v2
- use vrf_master_ifindex rather than vrf_master_ifindex_rcu

 net/ipv4/xfrm4_policy.c | 7 +--
 net/ipv6/xfrm6_policy.c | 7 +--
 2 files changed, 10 insertions(+), 4 deletions(-)

Looks good to me,
Acked-by: Nikolay Aleksandrov niko...@cumulusnetworks.com

David, can you please take this directly into net-next?

Acked-by: Steffen Klassert steffen.klass...@secunet.com
[PATCH] rsi: Fix possible leak when loading firmware
Commit 5d5cd85ff441 (rsi: Fix failure to load firmware after memory leak fix and fix the leak) also added a check on the allocation of DMA-accessible memory that may directly return. In that case the already allocated firmware data is leaked. Make sure the data is always freed correctly. Detected by Coverity CID 1316519.

Signed-off-by: Christian Engelmayer cenge...@gmx.at
---
Compile tested only.
---
 drivers/net/wireless/rsi/rsi_91x_sdio_ops.c | 8 ++--
 drivers/net/wireless/rsi/rsi_91x_usb_ops.c | 8 ++--
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/rsi/rsi_91x_sdio_ops.c b/drivers/net/wireless/rsi/rsi_91x_sdio_ops.c
index 1c6788aecc62..40d72312f3df 100644
--- a/drivers/net/wireless/rsi/rsi_91x_sdio_ops.c
+++ b/drivers/net/wireless/rsi/rsi_91x_sdio_ops.c
@@ -203,8 +203,10 @@ static int rsi_load_ta_instructions(struct rsi_common *common)
 	/* Copy firmware into DMA-accessible memory */
 	fw = kmemdup(fw_entry->data, fw_entry->size, GFP_KERNEL);
-	if (!fw)
-		return -ENOMEM;
+	if (!fw) {
+		status = -ENOMEM;
+		goto out;
+	}
 	len = fw_entry->size;
 	if (len % 4)
@@ -217,6 +219,8 @@ static int rsi_load_ta_instructions(struct rsi_common *common)
 	status = rsi_copy_to_card(common, fw, len, num_blocks);
 	kfree(fw);
+
+out:
 	release_firmware(fw_entry);
 	return status;
 }
diff --git a/drivers/net/wireless/rsi/rsi_91x_usb_ops.c b/drivers/net/wireless/rsi/rsi_91x_usb_ops.c
index 30c2cf7fa93b..de4900862836 100644
--- a/drivers/net/wireless/rsi/rsi_91x_usb_ops.c
+++ b/drivers/net/wireless/rsi/rsi_91x_usb_ops.c
@@ -148,8 +148,10 @@ static int rsi_load_ta_instructions(struct rsi_common *common)
 	/* Copy firmware into DMA-accessible memory */
 	fw = kmemdup(fw_entry->data, fw_entry->size, GFP_KERNEL);
-	if (!fw)
-		return -ENOMEM;
+	if (!fw) {
+		status = -ENOMEM;
+		goto out;
+	}
 	len = fw_entry->size;
 	if (len % 4)
@@ -162,6 +164,8 @@ static int rsi_load_ta_instructions(struct rsi_common *common)
 	status = rsi_copy_to_card(common, fw, len, num_blocks);
 	kfree(fw);
+
+out:
 	release_firmware(fw_entry);
 	return status;
 }
--
1.9.1
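The single-exit cleanup used by this fix can be tried out in userspace. The sketch below is an assumption-laden stand-in: struct fake_fw, fake_release_firmware() and the fail_alloc switch are hypothetical and only imitate the roles of the firmware entry, release_firmware() and a failing kmemdup().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a loaded firmware image. */
struct fake_fw {
	const unsigned char *data;
	size_t size;
	int released;		/* set once the image has been released */
};

static void fake_release_firmware(struct fake_fw *fw)
{
	fw->released = 1;
}

/* Every path after the image is acquired leaves through "out", so the
 * image is released even when the intermediate copy cannot be made. */
static int load_instructions(struct fake_fw *fw_entry, int fail_alloc)
{
	unsigned char *copy;
	int status = 0;

	assert(fw_entry != NULL);
	copy = fail_alloc ? NULL : malloc(fw_entry->size);
	if (!copy) {
		status = -ENOMEM;
		goto out;
	}
	memcpy(copy, fw_entry->data, fw_entry->size);
	/* a real driver would push the copy to the device here */
	free(copy);
out:
	fake_release_firmware(fw_entry);
	return status;
}
```

With an early "return -ENOMEM" in place of the goto, the released flag would stay 0 on the failure path, which is the leak the patch describes.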
[PATCH v2 net-next] tcp: fix slow start after idle vs TSO/GSO
From: Eric Dumazet eduma...@google.com slow start after idle might reduce cwnd, but we perform this after first packet was cooked and sent. With TSO/GSO, it means that we might send a full TSO packet even if cwnd should have been reduced to IW10. Moving the SSAI check in skb_entail() makes sense, because we slightly reduce number of times this check is done, especially for large send() and TCP Small queue callbacks from softirq context. As Neal pointed out, we also need to perform the check if/when receive window opens. Tested: Following packetdrill test demonstrates the problem // Test of slow start after idle `sysctl -q net.ipv4.tcp_slow_start_after_idle=1` 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0bind(3, ..., ...) = 0 +0listen(3, 1) = 0 +0 S 0:0(0) win 65535 mss 1000,sackOK,nop,nop,nop,wscale 7 +0 S. 0:0(0) ack 1 mss 1460,nop,nop,sackOK,nop,wscale 6 +.100 . 1:1(0) ack 1 win 511 +0accept(3, ..., ...) = 4 +0setsockopt(4, SOL_SOCKET, SO_SNDBUF, [20], 4) = 0 +0write(4, ..., 26000) = 26000 +0 . 1:5001(5000) ack 1 +0 . 5001:10001(5000) ack 1 +0%{ assert tcpi_snd_cwnd == 10 }% +.100 . 1:1(0) ack 10001 win 511 +0%{ assert tcpi_snd_cwnd == 20, tcpi_snd_cwnd }% +0 . 10001:20001(1) ack 1 +0 P. 20001:26001(6000) ack 1 +.100 . 1:1(0) ack 26001 win 511 +0%{ assert tcpi_snd_cwnd == 36, tcpi_snd_cwnd }% +4 write(4, ..., 2) = 2 // If slow start after idle works properly, we should send 5 MSS here (cwnd/2) +0 . 26001:31001(5000) ack 1 +0%{ assert tcpi_snd_cwnd == 10, tcpi_snd_cwnd }% +0 . 
31001:36001(5000) ack 1 Signed-off-by: Eric Dumazet eduma...@google.com Cc: Neal Cardwell ncardw...@google.com Cc: Yuchung Cheng ych...@google.com --- include/net/tcp.h | 13 + net/ipv4/tcp.c|2 ++ net/ipv4/tcp_input.c |3 +++ net/ipv4/tcp_output.c | 12 4 files changed, 22 insertions(+), 8 deletions(-) diff --git a/include/net/tcp.h b/include/net/tcp.h index 364426a..309801f 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1165,6 +1165,19 @@ static inline void tcp_sack_reset(struct tcp_options_received *rx_opt) } u32 tcp_default_init_rwnd(u32 mss); +void tcp_cwnd_restart(struct sock *sk, s32 delta); + +static inline void tcp_slow_start_after_idle_check(struct sock *sk) +{ + struct tcp_sock *tp = tcp_sk(sk); + s32 delta; + + if (!sysctl_tcp_slow_start_after_idle || tp-packets_out) + return; + delta = tcp_time_stamp - tp-lsndtime; + if (delta inet_csk(sk)-icsk_rto) + tcp_cwnd_restart(sk, delta); +} /* Determine a window scaling and initial window to offer. */ void tcp_select_initial_window(int __space, __u32 mss, __u32 *rcv_wnd, diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 45534a5..b8b8fa1 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -627,6 +627,8 @@ static void skb_entail(struct sock *sk, struct sk_buff *skb) sk_mem_charge(sk, skb-truesize); if (tp-nonagle TCP_NAGLE_PUSH) tp-nonagle = ~TCP_NAGLE_PUSH; + + tcp_slow_start_after_idle_check(sk); } static inline void tcp_mark_urg(struct tcp_sock *tp, int flags) diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 4e4d6bc..0abca28 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -3332,6 +3332,9 @@ static int tcp_ack_update_window(struct sock *sk, const struct sk_buff *skb, u32 tp-pred_flags = 0; tcp_fast_path_check(sk); + if (tcp_send_head(sk)) + tcp_slow_start_after_idle_check(sk); + if (nwin tp-max_window) { tp-max_window = nwin; tcp_sync_mss(sk, inet_csk(sk)-icsk_pmtu_cookie); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 444ab5b..1188e4f 100644 --- 
a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -137,12 +137,12 @@ static __u16 tcp_advertise_mss(struct sock *sk) } /* RFC2861. Reset CWND after idle period longer RTO to restart window. - * This is the first part of cwnd validation mechanism. */ -static void tcp_cwnd_restart(struct sock *sk, const struct dst_entry *dst) + * This is the first part of cwnd validation mechanism. + */ +void tcp_cwnd_restart(struct sock *sk, s32 delta) { struct tcp_sock *tp = tcp_sk(sk); - s32 delta = tcp_time_stamp - tp-lsndtime; - u32 restart_cwnd = tcp_init_cwnd(tp, dst); + u32 restart_cwnd = tcp_init_cwnd(tp, __sk_dst_get(sk)); u32 cwnd = tp-snd_cwnd; tcp_ca_event(sk, CA_EVENT_CWND_RESTART); @@ -164,10 +164,6 @@ static void tcp_event_data_sent(struct tcp_sock *tp, struct inet_connection_sock *icsk = inet_csk(sk); const u32 now = tcp_time_stamp; - if (sysctl_tcp_slow_start_after_idle -
Re: [PATCH net-next v2] tcp: reduce cpu usage under tcp memory pressure when SO_SNDBUF is set
On 08/11/2015 01:59 PM, Jason Baron wrote: On 08/11/2015 12:12 PM, Eric Dumazet wrote: On Tue, 2015-08-11 at 11:03 -0400, Jason Baron wrote: Yes, so the test case I'm using to test against is somewhat contrived. In that I am simply allocating around 40,000 sockets that are idle to create a 'permanent' memory pressure in the background. Then, I have just 1 flow that sets SO_SNDBUF, which results in the: poll(), write() loop. That said, we encountered this issue initially where we had 10,000+ flows and whenever the system would get into memory pressure, we would see all the cpus spin at 100%. So the testcase I wrote, was just a simplistic version for testing. But I am going to try and test against the more realistic workload where this issue was initially observed. Note that I am still trying to understand why we need to increase socket structure, for something which is inherently a problem of sharing memory with an unknown (potentially big) number of sockets. I was trying to mirror the wakeups when SO_SNDBUF is not set, where we continue to trigger on 1/3 of the buffer being available, as the sk-sndbuf is shrunk. And I saw this value as dynamic depending on number of sockets and read/write buffer usage. So that's where I was coming from with it. Also, at least with the .config I have the tcp_sock structure didn't increase in size (although struct sock did go up by 8 and not 4). I suggested to use a flag (one bit). If set, then we should fallback to tcp_wmem[0] (each socket has 4096 bytes, so that we can avoid starvation) Ok, I will test this approach. Hi Eric, So I created a test here with 20,000 streams, and if I set SO_SNDBUF high enough on the server side, I can create tcp memory pressure above tcp_mem[2]. In this case, with the 'one bit' approach using tcp_wmem[0] as the wakeup threshold I can still observe the 100% cpu spinning issue, but with this v2 patch, cpu usage is minimal (1-2%). Since, we don't guarantee tcp_wmem[0], above tcp_mem[2]. 
So using the 'one bit' definitely alleviates the spinning between tcp_mem[1] and tcp_mem[2], but not above tcp_mem[2] in my testing. Maybe nobody cares about this case (you are getting what you ask for by using SO_SNDBUF), but it seems to me that it would be nice to avoid this sort of behavior. I also like the fact that with the sk_effective_sndbuf, we keep doing wakeups on 1/3 of the write buffer emptying, which keeps the wakeup behavior consistent. In theory this would matter for high latency and bandwidth link, but in the testing I did, I didn't observe any throughput differences between this v2 patch, and the 'one bit' approach. As I mentioned with this v2, the 'struct sock' grows by 4 bytes, but struct tcp_sock does not increase. So since this is tcp specific, we could add the sk_effective_sndbuf only to the struct tcp_sock. So the 'one bit' approach definitely seems to me to be an improvement, but I wanted to get feedback on this testing, before deciding how to proceed. Thanks, -Jason -- To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
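The "wakeup on 1/3 of the write buffer" behaviour mentioned in this thread can be illustrated with a small sketch. It assumes the usual stream-socket rule that a socket counts as writeable once the free space is at least half of what is queued; stream_is_writeable and its plain integer arguments are hypothetical simplifications, not the kernel helpers themselves.

```c
#include <assert.h>

/*
 * Assumed rule: writeable when free >= queued / 2. With
 * free = sndbuf - queued this holds once roughly one third of the
 * send buffer is free.
 */
static int stream_is_writeable(int sndbuf, int queued)
{
	int free_space;

	assert(sndbuf >= 0 && queued >= 0);
	free_space = sndbuf - queued;
	return free_space >= (queued >> 1);
}
```

If sndbuf is shrunk under memory pressure while queued stays large, the condition stays false for a long time; that is the situation the sk_effective_sndbuf idea in this thread tries to handle.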
[PATCH] iproute2: provide common json output formatter
Formatting JSON is moderately painful. Provide a simple API to do the syntax formatting. Use it to replace existing json in *stat commands. --- include/json_writer.h | 61 ++ lib/Makefile | 3 +- lib/json_writer.c | 312 ++ misc/Makefile | 2 +- misc/ifstat.c | 103 + misc/lnstat.c | 22 ++-- misc/nstat.c | 59 ++ 7 files changed, 477 insertions(+), 85 deletions(-) create mode 100644 include/json_writer.h create mode 100644 lib/json_writer.c diff --git a/include/json_writer.h b/include/json_writer.h new file mode 100644 index 000..ab9a008 --- /dev/null +++ b/include/json_writer.h @@ -0,0 +1,61 @@ +/* + * Simple streaming JSON writer + * + * This takes care of the annoying bits of JSON syntax like the commas + * after elements + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + * + * Authors:Stephen Hemminger step...@networkplumber.org + */ + +#ifndef _JSON_WRITER_H_ +#define _JSON_WRITER_H_ + +#include stdbool.h +#include stdint.h + +/* Opaque class structure */ +typedef struct json_writer json_writer_t; + +/* Create a new JSON stream */ +json_writer_t *jsonw_new(FILE *f); +/* End output to JSON stream */ +void jsonw_destroy(json_writer_t **self_p); + +/* Cause output to have pretty whitespace */ +void jsonw_pretty(json_writer_t *self, bool on); + +/* Add property name */ +void jsonw_name(json_writer_t *self, const char *name); + +/* Add value */ +void jsonw_string(json_writer_t *self, const char *value); +void jsonw_bool(json_writer_t *self, bool value); +void jsonw_float(json_writer_t *self, double number); +void jsonw_uint(json_writer_t *self, uint64_t number); +void jsonw_int(json_writer_t *self, int64_t number); +void jsonw_null(json_writer_t *self); + +/* Useful Combinations of name and value */ +void jsonw_string_field(json_writer_t *self, const char *prop, 
const char *val); +void jsonw_bool_field(json_writer_t *self, const char *prop, bool value); +void jsonw_float_field(json_writer_t *self, const char *prop, double num); +void jsonw_uint_field(json_writer_t *self, const char *prop, uint64_t num); +void jsonw_int_field(json_writer_t *self, const char *prop, int64_t num); +void jsonw_null_field(json_writer_t *self, const char *prop); + +/* Collections */ +void jsonw_start_object(json_writer_t *self); +void jsonw_end_object(json_writer_t *self); + +void jsonw_start_array(json_writer_t *self); +void jsonw_end_array(json_writer_t *self); + +/* Override default exception handling */ +typedef void (jsonw_err_handler_fn)(const char *); + +#endif /* _JSON_WRITER_H_ */ diff --git a/lib/Makefile b/lib/Makefile index 1d4045f..9d1307d 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -6,7 +6,8 @@ endif CFLAGS += -fPIC -UTILOBJ=utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o inet_proto.o namespace.o \ +UTILOBJ = utils.o rt_names.o ll_types.o ll_proto.o ll_addr.o \ + inet_proto.o namespace.o json_writer.o \ names.o color.o NLOBJ=libgenl.o ll_map.o libnetlink.o diff --git a/lib/json_writer.c b/lib/json_writer.c new file mode 100644 index 000..2af16e1 --- /dev/null +++ b/lib/json_writer.c @@ -0,0 +1,312 @@ +/* + * Simple streaming JSON writer + * + * This takes care of the annoying bits of JSON syntax like the commas + * after elements + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. 
+ * + * Authors:Stephen Hemminger step...@networkplumber.org + */ + +#include stdio.h +#include stdbool.h +#include stdarg.h +#include assert.h +#include malloc.h +#include inttypes.h +#include stdint.h + +#include json_writer.h + +struct json_writer { + FILE*out; /* output file */ + unsigneddepth; /* nesting */ + boolpretty; /* optional whitepace */ + charsep;/* either nul or comma */ +}; + +/* indentation for pretty print */ +static void jsonw_indent(json_writer_t *self) +{ + unsigned i; + for (i = 0; i = self-depth; ++i) + fputs(, self-out); +} + +/* end current line and indent if pretty printing */ +static void jsonw_eol(json_writer_t *self) +{ + if (!self-pretty) + return; + + putc('\n', self-out); + jsonw_indent(self); +} + +/* If current object is not empty print a comma */ +static void jsonw_eor(json_writer_t *self) +{ + if (self-sep != '\0') + putc(self-sep, self-out); + self-sep = ','; +} + + +/* Output JSON encoded string */ +/* Handles C escapes, does not do Unicode */
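The separator bookkeeping done by jsonw_eor() in this patch can be shown with a much smaller writer. The sketch below is not the proposed library: struct mini_writer and the mw_* functions are hypothetical, write into a fixed buffer instead of a FILE, and handle only arrays and unsigned numbers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical miniature writer, output collected in a fixed buffer. */
struct mini_writer {
	char buf[128];
	size_t len;
	char sep;	/* '\0' before the first element, ',' afterwards */
};

static void mw_put(struct mini_writer *w, const char *s)
{
	size_t n = strlen(s);

	assert(w->len + n < sizeof(w->buf));
	memcpy(w->buf + w->len, s, n + 1);
	w->len += n;
}

/* Emit the pending separator, if any, and arm the next one. */
static void mw_eor(struct mini_writer *w)
{
	if (w->sep != '\0') {
		char s[2] = { w->sep, '\0' };

		mw_put(w, s);
	}
	w->sep = ',';
}

static void mw_start_array(struct mini_writer *w)
{
	mw_eor(w);
	mw_put(w, "[");
	w->sep = '\0';	/* first element of the new array gets no comma */
}

static void mw_end_array(struct mini_writer *w)
{
	mw_put(w, "]");
	w->sep = ',';	/* whatever follows the array is a sibling */
}

static void mw_uint(struct mini_writer *w, unsigned int v)
{
	char tmp[16];

	mw_eor(w);
	snprintf(tmp, sizeof(tmp), "%u", v);
	mw_put(w, tmp);
}
```

Keeping one pending separator in the writer means callers never have to know whether they are emitting the first element of a collection.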
Re: [net-next PATCH 2/3] net: sched: allocate a handle to default qdiscs
On Fri, Aug 21, 2015 at 09:14:58AM -0700, Eric Dumazet wrote:
[...]
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 1fb65f9..ab614ee 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -634,6 +634,11 @@ struct Qdisc *qdisc_create_dflt(struct netdev_queue *dev_queue,
 	if (IS_ERR(sch))
 		goto errout;
 	sch->parent = parentid;
+#ifdef CONFIG_NET_SCHED
+	sch->handle = qdisc_alloc_handle(dev_queue->dev);
+	if (!sch->handle)
+		goto errout;
+#endif
 	if (!ops->init || ops->init(sch, NULL) == 0)
 		return sch;

This might break HTB setups with more than 32768 classes ?

Urgh. Thanks for noticing this!

The pfifo qdisc that gets attached had no handle.

Yes, looks like I need to leave qdisc_create_dflt() alone. It is possible, by doing the above twice in sch_generic.c (once in attach_one_default_qdisc(), and in attach_default_qdiscs() as well).

qdisc_alloc_handle() has a limited range.

Yes, I noticed this. Handles aren't reused in a running system either, which might contribute to this problem in other situations.

V2 will follow, thanks again.

Cheers, Phil
Re: [PATCHv4 net-next 10/10] openvswitch: Allow attaching helpers to ct action
On Thu, Aug 20, 2015 at 5:47 PM, Joe Stringer joestrin...@nicira.com wrote:

On 19 August 2015 at 15:57, Pravin Shelar pshe...@nicira.com wrote:

On Tue, Aug 18, 2015 at 4:39 PM, Joe Stringer joestrin...@nicira.com wrote:

Add support for using conntrack helpers to assist protocol detection. The new OVS_CT_ATTR_HELPER attribute of the ct action specifies a helper to be used for this connection.

Example ODP flows allowing FTP connections from ports 1-2:

in_port=1,tcp,action=ct(helper=ftp,commit),2
in_port=2,tcp,ct_state=-trk,action=ct(),recirc(1)
recirc_id=1,in_port=2,tcp,ct_state=+trk-new+est,action=1
recirc_id=1,in_port=2,tcp,ct_state=+trk+rel,action=1

Signed-off-by: Joe Stringer joestrin...@nicira.com
---
v2-v3: No change.
v4: Change error code for unknown helper ENOENT->EINVAL.

I got following compilation warning :

net/openvswitch/conntrack.c:352:42: error: incompatible types in comparison expression (different address spaces)

Is this made available via another sparse flag? It looks like it's related to the __rcu as you've mentioned below, but I'm not seeing this (latest sparse, gcc-4.9.2)

You need to enable RCU space checker in kernel config.
Re: [PATCH 0/4 net-next] enic: add devcmd2
From: Brandeburg, Jesse jesse.brandeb...@intel.com
Date: Fri, 21 Aug 2015 17:19:03 +

This series introduces a compile error

drivers/net/ethernet/cisco/enic/enic_main.c: In function 'enic_probe':
drivers/net/ethernet/cisco/enic/enic_main.c:2490:3: error: label 'err_out_vnic_unregister' used but not defined
   goto err_out_vnic_unregister;
   ^

Thanks

I'm about to push the following to fix this:

[PATCH] enic: Fix build failure with SRIOV disabled.

err_out_vnic_unregister is used regardless of whether SRIOV is enabled or not.

Reported-by: Jesse Brandeburg jesse.brangeb...@intel.com
Signed-off-by: David S. Miller da...@davemloft.net
---
 drivers/net/ethernet/cisco/enic/enic_main.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/cisco/enic/enic_main.c b/drivers/net/ethernet/cisco/enic/enic_main.c
index cb1fdc3..3352d02 100644
--- a/drivers/net/ethernet/cisco/enic/enic_main.c
+++ b/drivers/net/ethernet/cisco/enic/enic_main.c
@@ -2663,8 +2663,8 @@ err_out_disable_sriov_pp:
 		pci_disable_sriov(pdev);
 		enic->priv_flags &= ~ENIC_SRIOV_ENABLED;
 	}
-err_out_vnic_unregister:
 #endif
+err_out_vnic_unregister:
 	vnic_dev_unregister(enic->vdev);
 err_out_iounmap:
 	enic_iounmap(enic);
--
1.7.10.4
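The build failure comes down to a label that lived inside a conditional block while a goto to it was compiled unconditionally. A minimal sketch of the fixed layout follows; HYPOTHETICAL_FEATURE, probe_sketch and the error values are made up for illustration and stand in for the SRIOV config option and enic_probe.

```c
#include <assert.h>
#include <stddef.h>

/* Define HYPOTHETICAL_FEATURE to compile the optional cleanup step;
 * the function must build either way. */

static int probe_sketch(int fail_feature, int fail_late, int *unregistered)
{
	int err = 0;

	assert(unregistered != NULL);
	*unregistered = 0;
#ifdef HYPOTHETICAL_FEATURE
	if (fail_feature) {
		err = -1;
		goto err_out_disable_feature;
	}
#else
	(void)fail_feature;
#endif
	if (fail_late) {
		err = -2;
		goto err_out_unregister;	/* used in every configuration */
	}
	return 0;

#ifdef HYPOTHETICAL_FEATURE
err_out_disable_feature:
	/* feature-specific cleanup would go here */
#endif
err_out_unregister:			/* so it sits outside the #ifdef */
	*unregistered = 1;
	return err;
}
```

Moving err_out_unregister back above the #endif reproduces the "label used but not defined" error whenever the feature macro is not set.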
[PATCH net-next 1/2] lan78xx: change phy id and fix phy reset issue
Patch to change internal PHYID to 1 and fix PHY reset issue.

Signed-off-by: Woojung Huh woojung@microchip.com
---
 drivers/net/usb/lan78xx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 39364a4..4bcbf28 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -36,7 +36,7 @@
 #define DRIVER_AUTHOR	"WOOJUNG HUH woojung@microchip.com"
 #define DRIVER_DESC	"LAN78XX USB 3.0 Gigabit Ethernet Devices"
 #define DRIVER_NAME	"lan78xx"
-#define DRIVER_VERSION	"1.0.0"
+#define DRIVER_VERSION	"1.0.1"
 #define TX_TIMEOUT_JIFFIES	(5 * HZ)
 #define THROTTLE_JIFFIES	(HZ / 8)
@@ -57,7 +57,7 @@
 #define DEFAULT_RX_CSUM_ENABLE	(true)
 #define DEFAULT_TSO_CSUM_ENABLE	(true)
 #define DEFAULT_VLAN_FILTER_ENABLE	(true)
-#define INTERNAL_PHY_ID	(2) /* 2: GMII */
+#define INTERNAL_PHY_ID	(1)
 #define TX_OVERHEAD	(8)
 #define RXW_PADDING	2
@@ -2003,7 +2003,7 @@ static int lan78xx_reset(struct lan78xx_net *dev)
 			netdev_warn(dev->net, "timeout waiting for PHY Reset");
 			return -EIO;
 		}
-	} while (buf & PMT_CTL_PHY_RST_);
+	} while ((buf & PMT_CTL_PHY_RST_) || !(buf & PMT_CTL_READY_));
 	lan78xx_mii_init(dev);
--
2.1.4
Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters
On Aug 21 21:39, Francois Romieu wrote:

Corinna Vinschen vinsc...@redhat.com :
[...]
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index f790f61..f26a48d 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
[...]
@@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device *dev, int sset)
 	}
 }
+DECLARE_RTL_COND(rtl_reset_counters_cond)
+{
+	void __iomem *ioaddr = tp->mmio_addr;
+
+	return RTL_R32(CounterAddrLow) & CounterReset;
+}
+
+static void rtl8169_reset_counters(struct net_device *dev)
+{

rtl8169_reset_counters duplicates most of rtl8169_update_counters. Please factor out the dma_alloc + parametrized CounterAddrLow write + cleanup.

Ok, will do.

+	rtl8169_reset_counters(dev);
+
+	rtl8169_update_counters(dev);

The code should propagate failure when both rtl8169_reset_counters and rtl8169_update_counters fail.

This one I don't understand. Neither failing to reset the counters nor failing to update the counters is fatal for the driver. So far the (unchanged) rtl8169_update_counters doesn't even print a log message, while a failing reset in rtl8169_reset_counters now does. Why is that not sufficient?

Thanks,
Corinna
Re: [RFC PATCH v5 net-next 4/4] tcp: add NV congestion control
Kenneth, thank you for your comments, I've implemented most of the improvements you've mentioned. I'm finishing the new patch and the updated results, they should be done by Monday (including cdg).

On 8/5/15, 5:51 PM, knn...@gmail.com on behalf of Kenneth Klette Jonassen knn...@gmail.com on behalf of kenne...@ifi.uio.no wrote:

On Wed, Aug 5, 2015 at 3:39 AM, Lawrence Brakmo bra...@fb.com wrote:

This is a request for comments.

Nice to see more development on delay-based congestion control.

Thank you.

It would be good to see how NV stacks up against CDG. Any chance of adding cdg as a congestion control parameter to your experiments? Experiments on NV without its temporary cwnd reductions would also be of interest -- to get a reference of how effective this mechanism is.

I'm finishing with cdg experiments, they will be up on Monday together with an update to the NV patch. I will also have some experiments with variations in the temporary cwnd reduction. This mechanism is meant to reduce min_rtt creep, but it is now always successful. Its drawback is that it can increase high percentile latency.

+#define NV_INIT_RTT 0x

Maybe use U32_MAX?

Done

+static void tcpnv_init(struct sock *sk)
+{
+	struct tcpnv *ca = inet_csk_ca(sk);
+
+	tcpnv_reset(ca, sk);
+
+	ca->nv_min_rtt_reset_jiffies = jiffies + 2*HZ;
+	ca->nv_min_rtt = NV_INIT_RTT;
+	ca->nv_min_rtt_new = NV_INIT_RTT;
+	ca->nv_enable = nv_enable;

Can this assignment be ca->nv_enable = 1? That would match the TCP_CA_Open case in tcpnv_state().

Done

+	if (nv_dec_eval_min_calls > 255)
+		nv_dec_eval_min_calls = 255;
+	if (nv_rtt_min_cnt > 63)
+		nv_rtt_min_cnt = 63;

nv_dec_eval_min_calls can be clamped to 0-255 by changing its type to u8. nv_rtt_min_cnt can also be u8? In struct tcpnv, perhaps move nv_rtt_cnt to the available byte.

Done

+static void tcpnv_cong_avoid(struct sock *sk, u32 ack, u32 acked)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tcpnv *ca = inet_csk_ca(sk);
+
+	if (!tcp_is_cwnd_limited(sk))
+		return;
+
+	/* Only grow cwnd if NV has not detected congestion */
+	if (nv_enable && ca->nv_enable && !ca->nv_allow_cwnd_growth)
+		return;

The check for ca->nv_enable might be overly harsh on some unfortunate sockets in TCP_CA_Disorder. Is it needed here?

TCP_CA_Disorder should not affect ca->nv_enable in the new patch

+static void tcpnv_acked(struct sock *sk, struct ack_sample *sample)

Maybe move some of this function to tcpnv_cong_avoid()?

It needs to be here since we need the information provided in argument sample

+{
+	const struct inet_connection_sock *icsk = inet_csk(sk);
+	struct tcp_sock *tp = tcp_sk(sk);
+	struct tcpnv *ca = inet_csk_ca(sk);
+	unsigned long now = jiffies;
+	s64 rate64 = 0;
+	u32 rate, max_win, cwnd_by_slope;
+	u32 avg_rtt;
+	u32 bytes_acked = 0;
+
+	/* Some calls are for duplicates without timetamps */
+	if (sample->rtt_us < 0)
+		return;
+
+	/* If not in TCP_CA_Open state, skip. */
+	if (icsk->icsk_ca_state != TCP_CA_Open)
+		return;

Consider using samples in other states too, especially TCP_CA_Disorder. Linux 4.2 enhances RTT sampling from SACKs, so any non-negative RTT sample should be fully usable.

Done
[PATCH net-next 2/2] lan78xx: update eee code
Patch to update EEE code.

Signed-off-by: Woojung Huh woojung@microchip.com
---
 drivers/net/usb/lan78xx.c | 44
 drivers/net/usb/lan78xx.h | 22 +++---
 2 files changed, 35 insertions(+), 31 deletions(-)

diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 4bcbf28..af102b0 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -1296,38 +1296,37 @@ static int lan78xx_get_eee(struct net_device *net, struct ethtool_eee *edata)
 	if (ret < 0)
 		return ret;
+	buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
+			       PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT);
+	adv = mmd_eee_adv_to_ethtool_adv_t(buf);
+	buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
+			       PHY_MMD_DEV_7, PHY_EEE_LP_ADVERTISEMENT);
+	lpadv = mmd_eee_adv_to_ethtool_adv_t(buf);
+
 	ret = lan78xx_read_reg(dev, MAC_CR, &buf);
 	if (buf & MAC_CR_EEE_EN_) {
-		buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
-				       PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT);
-		adv = mmd_eee_adv_to_ethtool_adv_t(buf);
-		buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
-				       PHY_MMD_DEV_7, PHY_EEE_LP_ADVERTISEMENT);
-		lpadv = mmd_eee_adv_to_ethtool_adv_t(buf);
-
 		edata->eee_enabled = true;
-		edata->supported = true;
 		edata->eee_active = !!(adv & lpadv);
-		edata->advertised = adv;
-		edata->lp_advertised = lpadv;
 		edata->tx_lpi_enabled = true;
 		/* EEE_TX_LPI_REQ_DLY & tx_lpi_timer are same uSec unit */
 		ret = lan78xx_read_reg(dev, EEE_TX_LPI_REQ_DLY, &buf);
 		edata->tx_lpi_timer = buf;
 	} else {
-		buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
-				       PHY_MMD_DEV_7, PHY_EEE_LP_ADVERTISEMENT);
-		lpadv = mmd_eee_adv_to_ethtool_adv_t(buf);
 		edata->eee_enabled = false;
 		edata->eee_active = false;
-		edata->supported = false;
-		edata->advertised = 0;
-		edata->lp_advertised = mmd_eee_adv_to_ethtool_adv_t(lpadv);
 		edata->tx_lpi_enabled = false;
 		edata->tx_lpi_timer = 0;
 	}
+	edata->supported = ADVERTISED_100baseT_Full |
+			   ADVERTISED_1000baseT_Full;
+
+	edata->advertised = ADVERTISED_100baseT_Full |
+			    ADVERTISED_1000baseT_Full;
+
+	edata->lp_advertised = lpadv;
+
 	usb_autopm_put_interface(dev->intf);
 	return 0;
@@ -1351,6 +1350,9 @@ static int lan78xx_set_eee(struct net_device *net, struct ethtool_eee *edata)
 		buf = ethtool_adv_to_mmd_eee_adv_t(edata->advertised);
 		lan78xx_mmd_write(dev->net, dev->mii.phy_id,
 				  PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT, buf);
+
+		buf = (u32)edata->tx_lpi_timer;
+		ret = lan78xx_write_reg(dev, EEE_TX_LPI_REQ_DLY, buf);
 	} else {
 		ret = lan78xx_read_reg(dev, MAC_CR, &buf);
 		buf &= ~MAC_CR_EEE_EN_;
@@ -1641,6 +1643,12 @@ static int lan78xx_phy_init(struct lan78xx_net *dev)
 	mii->mdio_write(mii->dev, mii->phy_id, MII_CTRL1000,
 			temp & ~ADVERTISE_1000HALF);
+	/* Set EEE advertise */
+	lan78xx_mmd_write(dev->net, dev->mii.phy_id,
+			  PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT,
+			  PHY_EEE_ADVERTISEMENT_1000BT_EEE_ |
+			  PHY_EEE_ADVERTISEMENT_100BT_EEE_);
+
 	/* clear interrupt */
 	mii->mdio_read(mii->dev, mii->phy_id, PHY_VTSE_INT_STS);
 	mii->mdio_write(mii->dev, mii->phy_id, PHY_VTSE_INT_MASK,
@@ -2016,10 +2024,6 @@ static int lan78xx_reset(struct lan78xx_net *dev)
 	ret = lan78xx_write_reg(dev, MAC_CR, buf);
-	/* enable on PHY */
-	if (buf & MAC_CR_EEE_EN_)
-		lan78xx_mmd_write(dev->net, dev->mii.phy_id, 0x07, 0x3C, 0x06);
-
 	/* enable PHY interrupts */
 	ret = lan78xx_read_reg(dev, INT_EP_CTL, &buf);
 	buf |= INT_ENP_PHY_INT;
diff --git a/drivers/net/usb/lan78xx.h b/drivers/net/usb/lan78xx.h
index ae7562e..95e721b 100644
--- a/drivers/net/usb/lan78xx.h
+++ b/drivers/net/usb/lan78xx.h
@@ -1047,23 +1047,23 @@
 #define PHY_MMD_DEV_3	3
 #define PHY_EEE_PCS_STATUS	(0x1)
-#define PHY_EEE_PCS_STATUS_TX_LPI_RCVD_	((WORD)0x0800)
-#define PHY_EEE_PCS_STATUS_RX_LPI_RCVD_	((WORD)0x0400)
-#define PHY_EEE_PCS_STATUS_TX_LPI_IND_	((WORD)0x0200)
-#define PHY_EEE_PCS_STATUS_RX_LPI_IND_	((WORD)0x0100)
-#define PHY_EEE_PCS_STATUS_PCS_RCV_LNK_STS_	((WORD)0x0004)
+#define PHY_EEE_PCS_STATUS_TX_LPI_RCVD_	(0x0800)
Re: [PATCH net-next 2/2] lan78xx: update eee code
On 21/08/15 14:41, woojung@microchip.com wrote:
> Patch to update EEE code.

This really deserves a better explanation of what it is that you are fixing here.

> Signed-off-by: Woojung Huh <woojung@microchip.com>
> ---
>  drivers/net/usb/lan78xx.c | 44
>  drivers/net/usb/lan78xx.h | 22 +++---
>  2 files changed, 35 insertions(+), 31 deletions(-)
>
> diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
> index 4bcbf28..af102b0 100644
> --- a/drivers/net/usb/lan78xx.c
> +++ b/drivers/net/usb/lan78xx.c
> @@ -1296,38 +1296,37 @@ static int lan78xx_get_eee(struct net_device *net, struct ethtool_eee *edata)
>  	if (ret < 0)
>  		return ret;
>
> +	buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
> +			       PHY_MMD_DEV_7, PHY_EEE_ADVERTISEMENT);
> +	adv = mmd_eee_adv_to_ethtool_adv_t(buf);
> +	buf = lan78xx_mmd_read(dev->net, dev->mii.phy_id,
> +			       PHY_MMD_DEV_7, PHY_EEE_LP_ADVERTISEMENT);
> +	lpadv = mmd_eee_adv_to_ethtool_adv_t(buf);

Considering your function signatures, it sounds like you should implement a libphy driver and you could get things like phy_init_eee() for free.

[snip]

>  	/* enable PHY interrupts */
>  	ret = lan78xx_read_reg(dev, INT_EP_CTL, &buf);
>  	buf |= INT_ENP_PHY_INT;
> diff --git a/drivers/net/usb/lan78xx.h b/drivers/net/usb/lan78xx.h
> index ae7562e..95e721b 100644
> --- a/drivers/net/usb/lan78xx.h
> +++ b/drivers/net/usb/lan78xx.h
> @@ -1047,23 +1047,23 @@
>  #define PHY_MMD_DEV_3 3
>  #define PHY_EEE_PCS_STATUS (0x1)
> -#define PHY_EEE_PCS_STATUS_TX_LPI_RCVD_ ((WORD)0x0800)
> -#define PHY_EEE_PCS_STATUS_RX_LPI_RCVD_ ((WORD)0x0400)
> -#define PHY_EEE_PCS_STATUS_TX_LPI_IND_ ((WORD)0x0200)
> -#define PHY_EEE_PCS_STATUS_RX_LPI_IND_ ((WORD)0x0100)
> -#define PHY_EEE_PCS_STATUS_PCS_RCV_LNK_STS_ ((WORD)0x0004)
> +#define PHY_EEE_PCS_STATUS_TX_LPI_RCVD_ (0x0800)
> +#define PHY_EEE_PCS_STATUS_RX_LPI_RCVD_ (0x0400)
> +#define PHY_EEE_PCS_STATUS_TX_LPI_IND_ (0x0200)
> +#define PHY_EEE_PCS_STATUS_RX_LPI_IND_ (0x0100)
> +#define PHY_EEE_PCS_STATUS_PCS_RCV_LNK_STS_ (0x0004)

Can you look at updating include/uapi/linux/mdio.h with the missing registers for your use case instead of replicating this in a driver?
--
Florian
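As a rough illustration of the eee_active computation discussed in this thread, here is a minimal sketch in plain C. The bit names EEE_ADV_100TX and EEE_ADV_1000T and the helper eee_active() are hypothetical; the bit values are an assumption chosen to match the 0x06 that the removed code wrote to MMD 7 register 0x3C. It is not the driver's code, only the idea that EEE can be active solely for a mode both link partners advertise.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values assumed from the 0x06 written to
 * MMD 7 register 0x3C by the code the patch removes. */
#define EEE_ADV_100TX  0x0002u
#define EEE_ADV_1000T  0x0004u

/* EEE can only be active for a mode that both ends advertise. */
static bool eee_active(uint16_t local_adv, uint16_t lp_adv)
{
	uint16_t common = local_adv & lp_adv;

	return (common & (EEE_ADV_100TX | EEE_ADV_1000T)) != 0;
}
```

With both modes advertised locally (0x0006) and a partner advertising only 1000BASE-T EEE (0x0004), the helper reports EEE as active; with no common bit it does not.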
[PATCH] phylib: Make PHYs children of their MDIO bus, not the bus' parent.
From: David Daney <david.da...@cavium.com>

commit 18ee49ddb0d2 ("phylib: rename mii_bus::dev to mii_bus::parent") changed the parent of PHY devices from the bus to the bus parent. Then, commit 4dea547fef1b ("phylib: rework to prepare for OF registration of PHYs") moved the code into phy_device.c

At this point, it is somewhat unclear why the change was seen as necessary. But, when we look at the device model tree in /sys/devices, it is clearly incorrect. The PHYs should be children of their MDIO bus.

Change the PHY's parent device to be the MDIO bus device.

Cc: Lennert Buytenhek <buyt...@wantstofly.org>
Cc: Grant Likely <grant.lik...@secretlab.ca>
Signed-off-by: David Daney <david.da...@cavium.com>
---
 drivers/net/phy/phy_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 0302483..55f0178 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -176,7 +176,7 @@ struct phy_device *phy_device_create(struct mii_bus *bus, int addr, int phy_id,
 	if (c45_ids)
 		dev->c45_ids = *c45_ids;
 	dev->bus = bus;
-	dev->dev.parent = bus->parent;
+	dev->dev.parent = &bus->dev;
 	dev->dev.bus = &mdio_bus_type;
 	dev->irq = bus->irq != NULL ? bus->irq[addr] : PHY_POLL;
 	dev_set_name(&dev->dev, PHY_ID_FMT, bus->id, addr);
--
1.7.11.7
[net-next PATCH v2 0/3] net: sched: allow switching qdisc to noqueue intuitively
This patch series improves the integration of the noqueue qdisc to become the fallback queueing if no other is attached to an interface. Before it was rather an add-on, a simpler alternative to a FIFO if no congestion is expected or possible. It has become the default qdisc for virtual interfaces, and could be attached by this mechanism only (through removing the root qdisc after having set tx_queue_len to zero for interfaces not defaulting to noqueue otherwise).

This series does not change the default qdisc chosen for new interfaces, but upon removal of the root qdisc from an interface, the kernel won't fall back to the default but to noqueue instead.

Changes since v1:
- Leave qdisc_create_dflt() alone as it is used in sch_htb.c as well. Instead allocate the handle in attach_default_qdiscs() and attach_one_default_qdisc().

Phil Sutter (3):
  net: sched: make noqueue_qdisc non-static
  net: sched: allocate a handle to default qdiscs
  net: sched: fall back to noqueue when removing root qdisc

 include/net/sch_generic.h | 2 ++
 net/sched/sch_api.c | 5 +++--
 net/sched/sch_generic.c | 18 ++
 3 files changed, 19 insertions(+), 6 deletions(-)

--
2.1.2
RE: [PATCH v1 0/6] Add fec1 and fec2 support for i.MX7d sdb board
From: Florian Fainelli <f.faine...@gmail.com> Sent: Friday, August 21, 2015 5:14 AM
> To: David Miller; Duan Fugang-B38611
> Cc: netdev@vger.kernel.org; shawn...@kernel.org; linux-arm-ker...@lists.infradead.org
> Subject: Re: [PATCH v1 0/6] Add fec1 and fec2 support for i.MX7d sdb board
>
> On 20/08/15 14:05, David Miller wrote:
> > From: Fugang Duan <b38...@freescale.com>
> > Date: Wed, 19 Aug 2015 13:33:58 +0800
> >
> > > The patch series is to add fec support for i.MX7d sdb board. Since i.MX7d fec ip is the same as i.MX6SX, so there have no change for driver itself.
> > > Patch#1: add bcm54220 PHY ID entry into brcmphy.h file.
> >
> > This is completely, and utterly, pointless.
> >
> > The only reason a PHY ID should be defined in brcmphy.h is so that it can be used in the broadcom.c PHY driver or similar.
> >
> > If there is no user in the tree, there is no reason to add it to the header file.
>
> There is a valid reason for which you may have a PHY id, which is defining a PHY fixup in your platform code like Andy is doing, however, this should not be used in conjunction with the Generic PHY driver, because this driver has absolutely no clue about your PHY fixup, and this could create at best inconsistencies in how the fixup is managed later on.
>
> At the very least, I would like to see a change to drivers/net/phy/broadcom.c which identifies this PHY id, and eventually just invokes the genphy_* functions where relevant.
> --
> Florian

I will try to add the phy support in Broadcom phy driver. Thanks for your comments.

Regards,
Andy
Re: [PATCH net-next] tcp: refine pacing rate determination
On Fri, Aug 21, 2015 at 8:38 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
> From: Eric Dumazet <eduma...@google.com>
>
> When TCP pacing was added back in linux-3.12, we chose to apply a fixed ratio of 200 % against current rate, to allow probing for optimal throughput even during slow start phase, where cwnd can be doubled every other gRTT.
>
> At Google, we found it was better applying a different ratio while in Congestion Avoidance phase. This ratio was set to 120 %.
>
> We've used the normal tcp_in_slow_start() helper for a while, then tuned the condition to select the conservative ratio as soon as cwnd >= ssthresh/2 :
>
> - After cwnd reduction, it is safer to ramp up more slowly, as we approach optimal cwnd.
> - Initial ramp up (ssthresh == INFINITY) still allows doubling cwnd every other RTT.
>
> Signed-off-by: Eric Dumazet <eduma...@google.com>
> Cc: Neal Cardwell <ncardw...@google.com>
> Cc: Yuchung Cheng <ych...@google.com>

Acked-by: Neal Cardwell <ncardw...@google.com>

Looks great to me. Thanks, Eric!

neal
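The ratio selection described in the quoted commit message can be sketched in user-space C as follows. This is only an illustration under stated assumptions: pacing_rate() is a hypothetical helper, the smoothed RTT is passed as plain microseconds (the kernel keeps it scaled, which is left out here), and the 200/120 values are the defaults named in the patch.

```c
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* Ratios in percent, mirroring the defaults proposed in the patch. */
static const unsigned int pacing_ss_ratio = 200;
static const unsigned int pacing_ca_ratio = 120;

/* Hypothetical helper: returns a pacing rate in bytes per second.
 * current_rate = cwnd * mss / srtt, then scaled by a percentage. */
static uint64_t pacing_rate(uint32_t mss, uint32_t cwnd,
			    uint32_t ssthresh, uint32_t srtt_us)
{
	uint64_t rate = (uint64_t)mss * cwnd * (USEC_PER_SEC / 100);

	/* Aggressive ratio only while cwnd is below half of ssthresh. */
	if (cwnd < ssthresh / 2)
		rate *= pacing_ss_ratio;
	else
		rate *= pacing_ca_ratio;

	if (srtt_us)
		rate /= srtt_us;
	return rate;
}
```

For example, with mss = 1000 bytes, cwnd = 10 and a 100 ms RTT, the current rate is 100000 bytes/s; the sketch paces at 200000 bytes/s during the initial ramp up (ssthresh effectively infinite) and at 120000 bytes/s once cwnd has reached ssthresh/2.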
[PATCH v3 3/4] Add support for driver cross-timestamp to PTP_SYS_OFFSET ioctl
From: Christopher Hall <christopher.s.h...@intel.com>

This patch allows the capture of system and device time (cross-timestamp) to be performed by the driver.

Currently, the cross-timestamping is performed in the PTP_SYS_OFFSET ioctl. The PTP clock driver reads gettimeofday() and the gettime64() callback provided by the driver. The cross-timestamp is best effort where the latency between the capture of system time (getnstimeofday()) and the device time (driver callback) may be significant.

This patch adds an additional callback, getsynctime64(), which will be called when the driver is able to perform a more accurate, implementation specific cross-timestamping. For example, future network devices that implement PCIE PTM will be able to precisely correlate the device clock with the system clock with virtually zero latency between captures. This added callback can be used by the driver to expose this functionality.

The callback, getsynctime64(), will only be called when defined and n_samples == 1 because the driver returns only 1 cross-timestamp where multiple samples cannot be chained together.

This patch also adds to the capabilities ioctl (PTP_CLOCK_GETCAPS), allowing applications to query whether or not drivers implement the getsynctime callback, providing more precise cross timestamping.

Commit Details:

Added additional callback to ptp_clock_info:
* getsynctime64()
This takes 2 arguments referring to system and device time
With this callback drivers may provide both system time and device time to ensure precise correlation

Modified PTP_SYS_OFFSET ioctl in PTP clock driver to use the above callback if it's available

Added capability (PTP_CLOCK_GETCAPS) for checking whether driver supports cross timestamping

Added check for cross timestamping flag to testptp.c

Signed-off-by: Christopher S. Hall <christopher.s.h...@intel.com>
---
 Documentation/ptp/testptp.c | 6 --
 drivers/ptp/ptp_chardev.c | 29 +
 include/linux/ptp_clock_kernel.h | 7 +++
 include/uapi/linux/ptp_clock.h | 4 +++-
 4 files changed, 35 insertions(+), 11 deletions(-)

diff --git a/Documentation/ptp/testptp.c b/Documentation/ptp/testptp.c
index 2bc8abc..8004efd 100644
--- a/Documentation/ptp/testptp.c
+++ b/Documentation/ptp/testptp.c
@@ -276,13 +276,15 @@ int main(int argc, char *argv[])
 			       "  %d external time stamp channels\n"
 			       "  %d programmable periodic signals\n"
 			       "  %d pulse per second\n"
-			       "  %d programmable pins\n",
+			       "  %d programmable pins\n"
+			       "  %d cross timestamping\n",
 			       caps.max_adj,
 			       caps.n_alarm,
 			       caps.n_ext_ts,
 			       caps.n_per_out,
 			       caps.pps,
-			       caps.n_pins);
+			       caps.n_pins,
+			       caps.cross_timestamping);
 		}
 	}
 
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index da7bae9..392ccfa 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -124,7 +124,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 	struct ptp_clock *ptp = container_of(pc, struct ptp_clock, clock);
 	struct ptp_clock_info *ops = ptp->info;
 	struct ptp_clock_time *pct;
-	struct timespec64 ts;
+	struct timespec64 ts, systs;
 	int enable, err = 0;
 	unsigned int i, pin_index;
 
@@ -138,6 +138,7 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 		caps.n_per_out = ptp->info->n_per_out;
 		caps.pps = ptp->info->pps;
 		caps.n_pins = ptp->info->n_pins;
+		caps.cross_timestamping = ptp->info->getsynctime64 != NULL;
 		if (copy_to_user((void __user *)arg, &caps, sizeof(caps)))
 			err = -EFAULT;
 		break;
@@ -196,19 +197,31 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
 			break;
 		}
 		pct = &sysoff->ts[0];
-		for (i = 0; i < sysoff->n_samples; i++) {
-			getnstimeofday64(&ts);
+		if (ptp->info->getsynctime64 && sysoff->n_samples == 1 &&
+		    ptp->info->getsynctime64(ptp->info, &ts, &systs) == 0) {
+			pct->sec = systs.tv_sec;
+			pct->nsec = systs.tv_nsec;
+			pct++;
 			pct->sec = ts.tv_sec;
 			pct->nsec = ts.tv_nsec;
 			pct++;
-			ptp->info->gettime64(ptp->info, &ts);
+			pct->sec = systs.tv_sec;
+			pct->nsec = systs.tv_nsec;
+		} else {
+			for (i = 0; i
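To show why the latency between captures matters, here is a small sketch of how an application might turn one (system, device, system) sample triple from PTP_SYS_OFFSET into an offset estimate. The struct xts_estimate and the helper estimate_offset() are hypothetical names, and the midpoint assumption is a common convention rather than something this patch specifies.

```c
#include <stdint.h>

struct xts_estimate {
	int64_t offset_ns;	/* estimated system time minus device time */
	int64_t window_ns;	/* width of the bracketing system-time window */
};

/* Hypothetical helper: estimate the clock offset from a device
 * timestamp bracketed by two system timestamps (all in nanoseconds). */
static struct xts_estimate estimate_offset(int64_t sys_before_ns,
					   int64_t dev_ns,
					   int64_t sys_after_ns)
{
	struct xts_estimate e;

	e.window_ns = sys_after_ns - sys_before_ns;
	/* Assume the device clock was read halfway through the window. */
	e.offset_ns = sys_before_ns + e.window_ns / 2 - dev_ns;
	return e;
}
```

With the existing loop the window is whatever the two system reads happen to span, so the estimate carries up to half that window of uncertainty. In the getsynctime64() path shown in the diff, the same system timestamp is stored on both sides of the device timestamp, so the window collapses to zero.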
RE: [PATCH v1 0/6] Add fec1 and fec2 support for i.MX7d sdb board
From: David Miller <da...@davemloft.net> Sent: Friday, August 21, 2015 5:06 AM
> To: Duan Fugang-B38611
> Cc: shawn...@kernel.org; linux-arm-ker...@lists.infradead.org; netdev@vger.kernel.org
> Subject: Re: [PATCH v1 0/6] Add fec1 and fec2 support for i.MX7d sdb board
>
> From: Fugang Duan <b38...@freescale.com>
> Date: Wed, 19 Aug 2015 13:33:58 +0800
>
> > The patch series is to add fec support for i.MX7d sdb board. Since i.MX7d fec ip is the same as i.MX6SX, so there have no change for driver itself.
> > Patch#1: add bcm54220 PHY ID entry into brcmphy.h file.
>
> This is completely, and utterly, pointless.
>
> The only reason a PHY ID should be defined in brcmphy.h is so that it can be used in the broadcom.c PHY driver or similar.
>
> If there is no user in the tree, there is no reason to add it to the header file.

Ok, I will try to add the phy support in Broadcom phy driver. Thanks for your comment.

Regards,
Andy
Re: [PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters
Corinna Vinschen <vinsc...@redhat.com> :
> On Aug 21 21:39, Francois Romieu wrote:
[...]
> > The code should propagate failure when both rtl8169_reset_counters and rtl8169_update_counters fail.
>
> This one I don't understand. Neither failing to reset the counters nor failing to update the counters is fatal for the driver. So far the (unchanged) rtl8169_update_counters doesn't even print a log message,

I wouldn't overestimate the value of log messages vs real status return. Users can be quite unhappy with default settings that spam their logs (it isn't a problem in open(), it's marginally murphy plausible from a periodic get_stats context).

The driver can't propagate errors from the current get_stats context where rtl8169_update_counters is used. However it can be done in open().

> while a failing reset in rtl8169_reset_counters now does. Why is that not sufficient?

Because of the same reason(s) why this patch wants to improve things.

It isn't a showstopper.

--
Ueimor
[net-next PATCH v2 3/3] net: sched: fall back to noqueue when removing root qdisc
When removing the root qdisc, the interface should fall back to noqueue as the 'real' minimal qdisc instead of the default one. Therefore dev_graft_qdisc() has to be adjusted to assign noqueue if NULL was passed as new qdisc, and qdisc_graft() needs to assign noqueue to dev->qdisc instead of noop to prevent dev_activate() from attaching default qdiscs to the interface.

Note that it is also necessary to have dev_graft_qdisc() set dev_queue->qdisc to the new qdisc instead of (unconditionally) noop. I don't know why this was there at all (originates from pre-git time), but it seems wrong to me. It could be worked around by dropping the extra check for noqueue in transition_one_qdisc(), maybe with unintended side-effects.

Signed-off-by: Phil Sutter <p...@nwl.cc>
---
 net/sched/sch_api.c | 2 +-
 net/sched/sch_generic.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 224374c..3b2cf30 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -839,7 +839,7 @@ skip:
 					 dev->qdisc, new);
 		if (new && !new->ops->attach)
 			atomic_inc(&new->refcnt);
-		dev->qdisc = new ? : &noop_qdisc;
+		dev->qdisc = new ? : &noqueue_qdisc;
 
 		if (new && new->ops->attach)
 			new->ops->attach(new);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 68df721..ecc369b 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -718,9 +718,9 @@ struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
 
 	/* ... and graft new one */
 	if (qdisc == NULL)
-		qdisc = &noop_qdisc;
+		qdisc = &noqueue_qdisc;
 	dev_queue->qdisc_sleeping = qdisc;
-	rcu_assign_pointer(dev_queue->qdisc, &noop_qdisc);
+	rcu_assign_pointer(dev_queue->qdisc, qdisc);
 
 	spin_unlock_bh(root_lock);
 
--
2.1.2
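The grafting behaviour this patch describes can be modelled outside the kernel with a few lines of C. This is a simplified sketch only: struct qdisc, struct tx_queue and graft() are stand-ins invented for illustration, with no locking or RCU, and they are not the kernel's types.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (illustrative only). */
struct qdisc {
	const char *name;
};

struct tx_queue {
	struct qdisc *qdisc;		/* discipline currently in use */
	struct qdisc *qdisc_sleeping;	/* discipline configured for the queue */
};

static struct qdisc noqueue_qdisc = { "noqueue" };

/* Attach new_q to the queue and return the previously configured one.
 * Passing NULL means "remove", which now falls back to noqueue, and the
 * active pointer follows the new discipline instead of a fixed noop. */
static struct qdisc *graft(struct tx_queue *q, struct qdisc *new_q)
{
	struct qdisc *old = q->qdisc_sleeping;

	if (new_q == NULL)
		new_q = &noqueue_qdisc;
	q->qdisc_sleeping = new_q;
	q->qdisc = new_q;
	return old;
}
```

Calling graft(&q, NULL) on a queue that had some other discipline returns the old one and leaves both pointers at noqueue, which is the fallback the series aims for.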
[net-next PATCH v2 2/3] net: sched: allocate a handle to default qdiscs
Since tc_get_qdisc() does not allow removing a qdisc with zero handle, a handle needs to be allocated to default qdiscs (currently pfifo_fast or mq) in order to allow removing them.

Signed-off-by: Phil Sutter <p...@nwl.cc>
---
 include/net/sch_generic.h | 1 +
 net/sched/sch_api.c | 3 ++-
 net/sched/sch_generic.c | 11 +++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 4495193..2bfc898 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -391,6 +391,7 @@ void dev_deactivate(struct net_device *dev);
 void dev_deactivate_many(struct list_head *head);
 struct Qdisc *dev_graft_qdisc(struct netdev_queue *dev_queue,
 			      struct Qdisc *qdisc);
+u32 qdisc_alloc_handle(struct net_device *dev);
 void qdisc_reset(struct Qdisc *qdisc);
 void qdisc_destroy(struct Qdisc *qdisc);
 void qdisc_tree_decrease_qlen(struct Qdisc *qdisc, unsigned int n);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index f06aa01..224374c 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -723,7 +723,7 @@ EXPORT_SYMBOL(qdisc_class_hash_remove);
 /* Allocate an unique handle from space managed by kernel
  * Possible range is [8000-FFFF]:0000 (0x8000 values)
  */
-static u32 qdisc_alloc_handle(struct net_device *dev)
+u32 qdisc_alloc_handle(struct net_device *dev)
 {
 	int i = 0x8000;
 	static u32 autohandle = TC_H_MAKE(0x80000000U, 0);
@@ -739,6 +739,7 @@ static u32 qdisc_alloc_handle(struct net_device *dev)
 
 	return 0;
 }
+EXPORT_SYMBOL(qdisc_alloc_handle);
 
 void qdisc_tree_decrease_qlen(struct Qdisc *sch, unsigned int n)
 {
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 1fb65f9..68df721 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -741,6 +741,11 @@ static void attach_one_default_qdisc(struct net_device *dev,
 		netdev_info(dev, "activation failed\n");
 		return;
 	}
+#ifdef CONFIG_NET_SCHED
+	qdisc->handle = qdisc_alloc_handle(dev);
+	if (!qdisc->handle)
+		netdev_info(dev, "qdisc handle allocation failed\n");
+#endif
 	if (!netif_is_multiqueue(dev))
 		qdisc->flags |= TCQ_F_ONETXQUEUE;
 }
@@ -763,6 +768,12 @@ static void attach_default_qdiscs(struct net_device *dev)
 	} else {
 		qdisc = qdisc_create_dflt(txq, &mq_qdisc_ops, TC_H_ROOT);
 		if (qdisc) {
+#ifdef CONFIG_NET_SCHED
+			qdisc->handle = qdisc_alloc_handle(dev);
+			if (!qdisc->handle)
+				netdev_info(dev,
+					    "qdisc handle allocation failed\n");
+#endif
 			dev->qdisc = qdisc;
 			qdisc->ops->attach(qdisc);
 		}
--
2.1.2
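For readers unfamiliar with how such a handle is picked, the following is a standalone sketch of the allocation idea: walk the majors 0x8001 to 0xFFFE, wrap around, and return the first handle that is not in use, or 0 on failure. The names alloc_handle(), handle_in_use_fn and none_in_use() are hypothetical, and the lookup is replaced by a caller-supplied callback; this is an assumption-laden model, not the kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

#define HANDLE_FIRST 0x80000000u	/* major 0x8000, minor 0 */
#define HANDLE_STEP  0x00010000u	/* advance the major number by one */
#define HANDLE_ROOT  0xFFFF0000u	/* major 0xFFFF is reserved */

/* Hypothetical lookup callback: returns true if the handle is taken. */
typedef bool (*handle_in_use_fn)(uint32_t handle, void *ctx);

/* Example callback for an interface with no qdiscs attached. */
static bool none_in_use(uint32_t handle, void *ctx)
{
	(void)handle;
	(void)ctx;
	return false;
}

/* Return an unused handle from the automatic range, or 0 on failure. */
static uint32_t alloc_handle(handle_in_use_fn in_use, void *ctx)
{
	static uint32_t next = HANDLE_FIRST;
	int tries = 0x8000;

	do {
		next += HANDLE_STEP;
		if (next == HANDLE_ROOT)
			next = HANDLE_FIRST;
		if (!in_use(next, ctx))
			return next;
	} while (--tries > 0);

	return 0;
}
```

A return value of 0 is what the patch checks for before logging "qdisc handle allocation failed"; a zero handle is also exactly the case the commit message says tc_get_qdisc() refuses to remove.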
[PATCH net-next] tcp: refine pacing rate determination
From: Eric Dumazet <eduma...@google.com>

When TCP pacing was added back in linux-3.12, we chose to apply a fixed ratio of 200 % against current rate, to allow probing for optimal throughput even during slow start phase, where cwnd can be doubled every other gRTT.

At Google, we found it was better applying a different ratio while in Congestion Avoidance phase. This ratio was set to 120 %.

We've used the normal tcp_in_slow_start() helper for a while, then tuned the condition to select the conservative ratio as soon as cwnd >= ssthresh/2 :

- After cwnd reduction, it is safer to ramp up more slowly, as we approach optimal cwnd.
- Initial ramp up (ssthresh == INFINITY) still allows doubling cwnd every other RTT.

Signed-off-by: Eric Dumazet <eduma...@google.com>
Cc: Neal Cardwell <ncardw...@google.com>
Cc: Yuchung Cheng <ych...@google.com>
---
 Documentation/networking/ip-sysctl.txt | 15 +++
 include/net/tcp.h | 2 ++
 net/ipv4/sysctl_net_ipv4.c | 19 +++
 net/ipv4/tcp_input.c | 18 +-
 4 files changed, 53 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 46e88ed7f41d..ac77a13d2ea2 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -586,6 +586,21 @@ tcp_min_tso_segs - INTEGER
 	if available window is too small.
 	Default: 2
 
+tcp_pacing_ss_ratio - INTEGER
+	sk->sk_pacing_rate is set by TCP stack using a ratio applied
+	to current rate. (current_rate = cwnd * mss / srtt)
+	If TCP is in slow start, tcp_pacing_ss_ratio is applied
+	to let TCP probe for bigger speeds, assuming cwnd can be
+	doubled every other RTT.
+	Default: 200
+
+tcp_pacing_ca_ratio - INTEGER
+	sk->sk_pacing_rate is set by TCP stack using a ratio applied
+	to current rate. (current_rate = cwnd * mss / srtt)
+	If TCP is in congestion avoidance phase, tcp_pacing_ca_ratio
+	is applied to conservatively probe for bigger throughput.
+	Default: 120
+
 tcp_tso_win_divisor - INTEGER
 	This allows control over what percentage of the congestion window
 	can be consumed by a single TSO frame.
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 364426a2be5a..3e2b3ba43ae5 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -281,6 +281,8 @@ extern unsigned int sysctl_tcp_notsent_lowat;
 extern int sysctl_tcp_min_tso_segs;
 extern int sysctl_tcp_autocorking;
 extern int sysctl_tcp_invalid_ratelimit;
+extern int sysctl_tcp_pacing_ss_ratio;
+extern int sysctl_tcp_pacing_ca_ratio;
 
 extern atomic_long_t tcp_memory_allocated;
 extern struct percpu_counter tcp_sockets_allocated;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 0330ab2e2b63..879bdc5c95b1 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -29,6 +29,7 @@
 static int zero;
 static int one = 1;
 static int four = 4;
+static int thousand = 1000;
 static int gso_max_segs = GSO_MAX_SEGS;
 static int tcp_retr1_max = 255;
 static int ip_local_port_range_min[] = { 1, 1 };
@@ -712,6 +713,24 @@ static struct ctl_table ipv4_table[] = {
 		.extra2 = &gso_max_segs,
 	},
 	{
+		.procname = "tcp_pacing_ss_ratio",
+		.data = &sysctl_tcp_pacing_ss_ratio,
+		.maxlen = sizeof(int),
+		.mode = 0644,
+		.proc_handler = proc_dointvec_minmax,
+		.extra1 = &zero,
+		.extra2 = &thousand,
+	},
+	{
+		.procname = "tcp_pacing_ca_ratio",
+		.data = &sysctl_tcp_pacing_ca_ratio,
+		.maxlen = sizeof(int),
+		.mode = 0644,
+		.proc_handler = proc_dointvec_minmax,
+		.extra1 = &zero,
+		.extra2 = &thousand,
+	},
+	{
 		.procname = "tcp_autocorking",
 		.data = &sysctl_tcp_autocorking,
 		.maxlen = sizeof(int),
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4e4d6bcd0ca9..7e1623775744 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -753,13 +753,29 @@ static void tcp_rtt_estimator(struct sock *sk, long mrtt_us)
  * TCP pacing, to smooth the burst on large writes when packets
  * in flight is significantly lower than cwnd (or rwin)
  */
+int sysctl_tcp_pacing_ss_ratio __read_mostly = 200;
+int sysctl_tcp_pacing_ca_ratio __read_mostly = 120;
+
 static void tcp_update_pacing_rate(struct sock *sk)
 {
 	const struct tcp_sock *tp = tcp_sk(sk);
 	u64 rate;
 
 	/* set sk_pacing_rate to 200 % of current rate (mss * cwnd / srtt) */
-	rate = (u64)tp->mss_cache * 2 * (USEC_PER_SEC << 3);
+	rate = (u64)tp->mss_cache * ((USEC_PER_SEC
[PATCH v3 2/4] Added ART correlated clocksource and ART CPU feature
Add detect_art() call to early TSC initialization which reads ART-TSC numerator/denominator and sets CPU feature if present

Add convert_art_to_tsc() function performing conversion ART to TSC

Add art_timestamp referencing art_to_tsc() and clocksource_tsc enabling driver conversion of ART to TSC

Signed-off-by: Christopher S. Hall <christopher.s.h...@intel.com>
---
 arch/x86/include/asm/cpufeature.h | 3 ++-
 arch/x86/include/asm/tsc.h | 2 ++
 arch/x86/kernel/tsc.c | 54 +++
 3 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 3d6606f..a9322e5 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -85,7 +85,7 @@
 #define X86_FEATURE_P4 ( 3*32+ 7) /* P4 */
 #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* TSC ticks at a constant rate */
 #define X86_FEATURE_UP ( 3*32+ 9) /* smp kernel running on up */
-/* free, was #define X86_FEATURE_FXSAVE_LEAK ( 3*32+10) * FXSAVE leaks FOP/FIP/FOP */
+#define X86_FEATURE_ART (3*32+10) /* Platform has always running timer (ART) */
 #define X86_FEATURE_ARCH_PERFMON ( 3*32+11) /* Intel Architectural PerfMon */
 #define X86_FEATURE_PEBS ( 3*32+12) /* Precise-Event Based Sampling */
 #define X86_FEATURE_BTS ( 3*32+13) /* Branch Trace Store */
@@ -352,6 +352,7 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 #define cpu_has_de boot_cpu_has(X86_FEATURE_DE)
 #define cpu_has_pse boot_cpu_has(X86_FEATURE_PSE)
 #define cpu_has_tsc boot_cpu_has(X86_FEATURE_TSC)
+#define cpu_has_art boot_cpu_has(X86_FEATURE_ART)
 #define cpu_has_pge boot_cpu_has(X86_FEATURE_PGE)
 #define cpu_has_apic boot_cpu_has(X86_FEATURE_APIC)
 #define cpu_has_sep boot_cpu_has(X86_FEATURE_SEP)
diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index 94605c0..8d52d91 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -45,6 +45,8 @@ static __always_inline cycles_t vget_cycles(void)
 	return (cycles_t)__native_read_tsc();
 }
 
+extern struct correlated_cs art_timestamper;
+
 extern void tsc_init(void);
 extern void mark_tsc_unstable(char *reason);
 extern int unsynchronized_tsc(void);
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 7437b41..13f12e0 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -939,10 +939,36 @@ static struct notifier_block time_cpufreq_notifier_block = {
 	.notifier_call = time_cpufreq_notifier
 };
 
+#define ART_CPUID_LEAF (0x15)
+#define ART_MIN_DENOMINATOR (2)
+
+static u32 art_to_tsc_numerator;
+static u32 art_to_tsc_denominator;
+
+/*
+ * If ART is present detect the numerator:denominator to convert to TSC
+ */
+void detect_art(void)
+{
+	unsigned int unused[2];
+
+	if (boot_cpu_data.cpuid_level >= ART_CPUID_LEAF) {
+		cpuid(ART_CPUID_LEAF, &art_to_tsc_denominator,
+		      &art_to_tsc_numerator, unused, unused+1);
+
+		if (art_to_tsc_denominator >= ART_MIN_DENOMINATOR) {
+			set_cpu_cap(&boot_cpu_data, X86_FEATURE_ART);
+		}
+	}
+}
+
 static int __init cpufreq_tsc(void)
 {
 	if (!cpu_has_tsc)
 		return 0;
+
+	detect_art();
+
 	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
 		return 0;
 	cpufreq_register_notifier(&time_cpufreq_notifier_block,
@@ -1059,6 +1085,32 @@ int unsynchronized_tsc(void)
 	return 0;
 }
 
+/*
+ * Convert ART to TSC given numerator/denominator found in detect_art()
+ */
+static u64 convert_art_to_tsc(struct correlated_cs *cs, u64 cycles)
+{
+	u64 tmp, res;
+
+	switch (art_to_tsc_denominator) {
+	default:
+		res = (cycles / art_to_tsc_denominator) * art_to_tsc_numerator;
+		tmp = (cycles % art_to_tsc_denominator) * art_to_tsc_numerator;
+		res += tmp / art_to_tsc_denominator;
+		break;
+	case 2:
+		res = (cycles >> 1) * art_to_tsc_numerator;
+		tmp = (cycles & 0x1) * art_to_tsc_numerator;
+		res += tmp >> 1;
+		break;
+	}
+	return res;
+}
+
+struct correlated_cs art_timestamper = {
+	.convert = convert_art_to_tsc,
+};
+EXPORT_SYMBOL(art_timestamper);
 
 static void tsc_refine_calibration_work(struct work_struct *work);
 static DECLARE_DELAYED_WORK(tsc_irqwork, tsc_refine_calibration_work);
@@ -1130,6 +1182,8 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 		(unsigned long)tsc_khz % 1000);
 
 out:
+	if (cpu_has_art)
+		art_timestamper.related_cs = &clocksource_tsc;
 	clocksource_register_khz(&clocksource_tsc, tsc_khz);
 }
 
--
2.1.4
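The quotient/remainder split used by convert_art_to_tsc() is there to scale a 64-bit counter by a ratio without overflowing the intermediate product. A standalone version of the general case can be sketched as follows; art_to_tsc() is a hypothetical user-space name, and the caller is assumed to pass a non-zero denominator.

```c
#include <stdint.h>

/* Compute art * num / den without overflowing the intermediate product.
 * art = q * den + r with r < den, so
 *   art * num / den = q * num + (r * num) / den,
 * and r * num fits in 64 bits because both factors are below 2^32.
 * The caller must ensure den != 0. */
static uint64_t art_to_tsc(uint64_t art, uint32_t num, uint32_t den)
{
	uint64_t quot = art / den;
	uint64_t rem = art % den;

	return quot * num + (rem * num) / den;
}
```

For instance, art = 11 with a ratio of 3/2 gives 5 * 3 + (1 * 3) / 2 = 16, the floor of the exact value 16.5. The `case 2` branch in the patch is the same computation with the division and modulo replaced by a shift and a mask.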
[PATCH v3 4/4] Enabling hardware supported PTP system/device crosstimestamping
From: Christopher Hall christopher.s.h...@intel.com

Add a getsynctime() PTP device callback to cross timestamp the system and
device clocks using ART translation. This depends on the platform being
>= SPT and having ART.

getsynctime() reads the ART (TSC-derived)/device cross timestamp and
converts it to realtime/device time, reporting the cross timestamp to the
PTP driver.

Signed-off-by: Christopher S. Hall christopher.s.h...@intel.com
---
 drivers/net/ethernet/intel/e1000e/defines.h |  5 ++
 drivers/net/ethernet/intel/e1000e/ptp.c     | 88 +
 drivers/net/ethernet/intel/e1000e/regs.h    |  4 ++
 3 files changed, 97 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/defines.h b/drivers/net/ethernet/intel/e1000e/defines.h
index 133d407..13cff75 100644
--- a/drivers/net/ethernet/intel/e1000e/defines.h
+++ b/drivers/net/ethernet/intel/e1000e/defines.h
@@ -527,6 +527,11 @@
 #define E1000_RXCW_C		0x2000	/* Receive config */
 #define E1000_RXCW_SYNCH	0x4000	/* Receive config synch */

+/* HH Time Sync */
+#define E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK	0xF000 /* max delay */
+#define E1000_TSYNCTXCTL_SYNC_COMP		0x4000 /* sync complete */
+#define E1000_TSYNCTXCTL_START_SYNC		0x8000 /* initiate sync */
+
 #define E1000_TSYNCTXCTL_VALID		0x0001	/* Tx timestamp valid */
 #define E1000_TSYNCTXCTL_ENABLED	0x0010	/* enable Tx timestamping */

diff --git a/drivers/net/ethernet/intel/e1000e/ptp.c b/drivers/net/ethernet/intel/e1000e/ptp.c
index 25a0ad5..228f3f3 100644
--- a/drivers/net/ethernet/intel/e1000e/ptp.c
+++ b/drivers/net/ethernet/intel/e1000e/ptp.c
@@ -25,6 +25,8 @@
  */

 #include "e1000.h"
+#include <asm/tsc.h>
+#include <linux/timekeeping.h>

 /**
  * e1000e_phc_adjfreq - adjust the frequency of the hardware clock
@@ -98,6 +100,87 @@ static int e1000e_phc_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	return 0;
 }

+#define MAX_HW_WAIT_COUNT (3)
+
+static int e1000e_phc_get_ts(struct correlated_ts *cts)
+{
+	struct e1000_adapter *adapter = (struct e1000_adapter *)cts->private;
+	struct e1000_hw *hw = &adapter->hw;
+	int i;
+	u32 tsync_ctrl;
+	int ret;
+
+	tsync_ctrl = er32(TSYNCTXCTL);
+	tsync_ctrl |= E1000_TSYNCTXCTL_START_SYNC |
+		E1000_TSYNCTXCTL_MAX_ALLOWED_DLY_MASK;
+	ew32(TSYNCTXCTL, tsync_ctrl);
+	for (i = 0; i < MAX_HW_WAIT_COUNT; ++i) {
+		udelay(1);
+		tsync_ctrl = er32(TSYNCTXCTL);
+		if (tsync_ctrl & E1000_TSYNCTXCTL_SYNC_COMP)
+			break;
+	}
+
+	if (i == MAX_HW_WAIT_COUNT) {
+		ret = -ETIMEDOUT;
+	} else {
+		ret = 0;
+		cts->system_ts = er32(PLTSTMPH);
+		cts->system_ts <<= 32;
+		cts->system_ts |= er32(PLTSTMPL);
+		cts->device_ts = er32(SYSSTMPH);
+		cts->device_ts <<= 32;
+		cts->device_ts |= er32(SYSSTMPL);
+	}
+
+	return ret;
+}
+
+/**
+ * e1000e_phc_getsynctime - Reads the current time from the hardware clock and
+ *                          correlated system time
+ * @ptp: ptp clock structure
+ * @devts: timespec structure to hold the current device time value
+ * @systs: timespec structure to hold the current system time value
+ *
+ * Read device and system (ART) clock simultaneously and return the correct
+ * clock values in ns after converting into a struct timespec.
+ **/
+static int e1000e_phc_getsynctime(struct ptp_clock_info *ptp,
+				  struct timespec64 *devts,
+				  struct timespec64 *systs)
+{
+	struct e1000_adapter *adapter = container_of(ptp, struct e1000_adapter,
+						     ptp_clock_info);
+	unsigned long flags;
+	u32 remainder;
+	struct correlated_ts art_correlated_ts;
+	u64 device_time;
+	int ret;
+
+	art_correlated_ts.get_ts = e1000e_phc_get_ts;
+	art_correlated_ts.private = adapter;
+	ret = get_correlated_timestamp(&art_correlated_ts,
+				       &art_timestamper);
+	if (ret != 0)
+		goto bail;
+
+	systs->tv_sec =
+		div_u64_rem(art_correlated_ts.system_real.tv64,
+			    NSEC_PER_SEC, &remainder);
+	systs->tv_nsec = remainder;
+	spin_lock_irqsave(&adapter->systim_lock, flags);
+	device_time = timecounter_cyc2time(&adapter->tc,
+					   art_correlated_ts.device_ts);
+	spin_unlock_irqrestore(&adapter->systim_lock, flags);
+	devts->tv_sec =
+		div_u64_rem(device_time, NSEC_PER_SEC, &remainder);
+	devts->tv_nsec = remainder;
+
+bail:
+	return ret;
+}
+
 /**
  * e1000e_phc_gettime - Reads the current time from the hardware clock
  * @ptp: ptp clock structure
@@ -190,6 +273,7 @@ static const struct ptp_clock_info
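As a rough user-space illustration of the arithmetic in the e1000e patch above, the sketch below joins the two 32-bit halves of a latched 64-bit counter and splits a nanosecond count into seconds and leftover nanoseconds. The names join_hi_lo(), ns_to_ts64() and struct ts64 are hypothetical stand-ins, not kernel APIs; the patch itself uses er32(), div_u64_rem() and struct timespec64.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the kernel's struct timespec64. */
struct ts64 {
	int64_t tv_sec;
	long tv_nsec;
};

/* Join the high and low 32-bit halves of a latched 64-bit counter. */
static uint64_t join_hi_lo(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}

/* Split a nanosecond count into whole seconds and leftover nanoseconds. */
static struct ts64 ns_to_ts64(uint64_t ns)
{
	struct ts64 ts;

	ts.tv_sec = (int64_t)(ns / NSEC_PER_SEC);
	ts.tv_nsec = (long)(ns % NSEC_PER_SEC);
	/* The remainder is always a valid nanosecond field. */
	assert(ts.tv_nsec >= 0 && ts.tv_nsec < (long)NSEC_PER_SEC);
	return ts;
}
```

The cast before the shift matters: shifting a 32-bit value left by 32 is undefined in C, which is presumably why the patch assigns the high half to the 64-bit field first and shifts afterwards.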
[net-next:master 790/1189] xt_TEE.c:undefined reference to `nf_dup_ipv6'
tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   a9e01ed986aa80d3092134428f453072752da223
commit: bbde9fc1824aab58bc78c084163007dd6c03fe5b [790/1189] netfilter: factor out packet duplication for IPv4/IPv6
config: x86_64-nfsroot (attached as .config)
reproduce:
        git checkout bbde9fc1824aab58bc78c084163007dd6c03fe5b
        # save the attached .config to linux build tree
        make ARCH=x86_64

All error/warnings (new ones prefixed by >>):

   net/built-in.o: In function `tee_tg6':
>> xt_TEE.c:(.text+0x6cd8c): undefined reference to `nf_dup_ipv6'

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.2.0-rc4 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_PERF_EVENTS_INTEL_UNCORE=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_HAVE_LATENCYTOP_SUPPORT=y
CONFIG_MMU=y
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
CONFIG_HAVE_SETUP_PER_CPU_AREA=y
CONFIG_NEED_PER_CPU_EMBED_FIRST_CHUNK=y
CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK=y
CONFIG_ARCH_HIBERNATION_POSSIBLE=y
CONFIG_ARCH_SUSPEND_POSSIBLE=y
CONFIG_ARCH_WANT_HUGE_PMD_SHARE=y
CONFIG_ARCH_WANT_GENERAL_HUGETLB=y
CONFIG_ZONE_DMA32=y
CONFIG_AUDIT_ARCH=y
CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y
CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y
CONFIG_X86_64_SMP=y
CONFIG_ARCH_HWEIGHT_CFLAGS="-fcall-saved-rdi -fcall-saved-rsi -fcall-saved-rdx -fcall-saved-rcx -fcall-saved-r8 -fcall-saved-r9 -fcall-saved-r10 -fcall-saved-r11"
CONFIG_ARCH_SUPPORTS_UPROBES=y
CONFIG_FIX_EARLYCON_MEM=y CONFIG_PGTABLE_LEVELS=4 CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config CONFIG_IRQ_WORK=y CONFIG_BUILDTIME_EXTABLE_SORT=y # # General setup # CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_CROSS_COMPILE= # CONFIG_COMPILE_TEST is not set CONFIG_LOCALVERSION= # CONFIG_LOCALVERSION_AUTO is not set CONFIG_HAVE_KERNEL_GZIP=y CONFIG_HAVE_KERNEL_BZIP2=y CONFIG_HAVE_KERNEL_LZMA=y CONFIG_HAVE_KERNEL_XZ=y CONFIG_HAVE_KERNEL_LZO=y CONFIG_HAVE_KERNEL_LZ4=y CONFIG_KERNEL_GZIP=y # CONFIG_KERNEL_BZIP2 is not set # CONFIG_KERNEL_LZMA is not set # CONFIG_KERNEL_XZ is not set # CONFIG_KERNEL_LZO is not set # CONFIG_KERNEL_LZ4 is not set CONFIG_DEFAULT_HOSTNAME=(none) CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_POSIX_MQUEUE_SYSCTL=y CONFIG_CROSS_MEMORY_ATTACH=y # CONFIG_FHANDLE is not set CONFIG_USELIB=y CONFIG_AUDIT=y CONFIG_HAVE_ARCH_AUDITSYSCALL=y CONFIG_AUDITSYSCALL=y CONFIG_AUDIT_WATCH=y CONFIG_AUDIT_TREE=y # # IRQ subsystem # CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_IRQ_SHOW=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_IRQ_DOMAIN=y CONFIG_IRQ_DOMAIN_HIERARCHY=y CONFIG_GENERIC_MSI_IRQ=y CONFIG_GENERIC_MSI_IRQ_DOMAIN=y # CONFIG_IRQ_DOMAIN_DEBUG is not set CONFIG_IRQ_FORCED_THREADING=y CONFIG_SPARSE_IRQ=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_ARCH_CLOCKSOURCE_DATA=y CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y CONFIG_GENERIC_TIME_VSYSCALL=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y CONFIG_GENERIC_CMOS_UPDATE=y # # Timers subsystem # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ_COMMON=y # CONFIG_HZ_PERIODIC is not set CONFIG_NO_HZ_IDLE=y # CONFIG_NO_HZ_FULL is not set CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y # # CPU/Task time and stats accounting # CONFIG_TICK_CPU_ACCOUNTING=y # CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set # CONFIG_IRQ_TIME_ACCOUNTING is not set CONFIG_BSD_PROCESS_ACCT=y CONFIG_BSD_PROCESS_ACCT_V3=y CONFIG_TASKSTATS=y CONFIG_TASK_DELAY_ACCT=y 
CONFIG_TASK_XACCT=y CONFIG_TASK_IO_ACCOUNTING=y # # RCU Subsystem # CONFIG_TREE_RCU=y # CONFIG_RCU_EXPERT is not set CONFIG_SRCU=y CONFIG_TASKS_RCU=y CONFIG_RCU_STALL_COMMON=y CONFIG_TREE_RCU_TRACE=y # CONFIG_RCU_NOCB_CPU is not set # CONFIG_RCU_EXPEDITE_BOOT is not set CONFIG_BUILD_BIN2C=y CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=18 CONFIG_LOG_CPU_MAX_BUF_SHIFT=12 CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y CONFIG_ARCH_SUPPORTS_INT128=y # CONFIG_NUMA_BALANCING is not set CONFIG_CGROUPS=y CONFIG_CGROUP_DEBUG=y CONFIG_CGROUP_FREEZER=y CONFIG_CGROUP_DEVICE=y CONFIG_CPUSETS=y CONFIG_PROC_PID_CPUSET=y # CONFIG_CGROUP_CPUACCT is not set # CONFIG_MEMCG is not set # CONFIG_CGROUP_HUGETLB is not set # CONFIG_CGROUP_PERF is not set CONFIG_CGROUP_SCHED=y CONFIG_FAIR_GROUP_SCHED=y # CONFIG_CFS_BANDWIDTH is not set # CONFIG_RT_GROUP_SCHED is not set CONFIG_BLK_CGROUP=y CONFIG_DEBUG_BLK_CGROUP=y # CONFIG_CHECKPOINT_RESTORE
[net-next:master 791/1189] net/ipv4/netfilter/nft_dup_ipv4.c:29:37: sparse: incorrect type in initializer (different base types)
tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   a9e01ed986aa80d3092134428f453072752da223
commit: d877f07112f1e5a247c6b585c971a93895c9f738 [791/1189] netfilter: nf_tables: add nft_dup expression
reproduce:
        # apt-get install sparse
        git checkout d877f07112f1e5a247c6b585c971a93895c9f738
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__

sparse warnings: (new ones prefixed by >>)

>> net/ipv4/netfilter/nft_dup_ipv4.c:29:37: sparse: incorrect type in initializer (different base types)
   net/ipv4/netfilter/nft_dup_ipv4.c:29:37:    expected restricted __be32 [usertype] s_addr
   net/ipv4/netfilter/nft_dup_ipv4.c:29:37:    got unsigned int [unsigned] <noident>

vim +29 net/ipv4/netfilter/nft_dup_ipv4.c

    13	#include <linux/netfilter.h>
    14	#include <linux/netfilter/nf_tables.h>
    15	#include <net/netfilter/nf_tables.h>
    16	#include <net/netfilter/ipv4/nf_dup_ipv4.h>
    17
    18	struct nft_dup_ipv4 {
    19		enum nft_registers	sreg_addr:8;
    20		enum nft_registers	sreg_dev:8;
    21	};
    22
    23	static void nft_dup_ipv4_eval(const struct nft_expr *expr,
    24				      struct nft_regs *regs,
    25				      const struct nft_pktinfo *pkt)
    26	{
    27		struct nft_dup_ipv4 *priv = nft_expr_priv(expr);
    28		struct in_addr gw = {
    29			.s_addr = regs->data[priv->sreg_addr],
    30		};
    31		int oif = regs->data[priv->sreg_dev];
    32
    33		nf_dup_ipv4(pkt->skb, pkt->ops->hooknum, &gw, oif);
    34	}
    35
    36	static int nft_dup_ipv4_init(const struct nft_ctx *ctx,
    37				     const struct nft_expr *expr,

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
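The warning above is about byte order: s_addr is expected to hold a network-order (big-endian) value, while the register data is typed as a plain unsigned int. The following user-space sketch only illustrates that distinction; the helper names addr_from_host() and addr_from_wire() are hypothetical, and this is not a claim about how the kernel code should be, or was, fixed.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Build an in_addr from a host-order value, converting explicitly. */
static struct in_addr addr_from_host(uint32_t host_order)
{
	struct in_addr a;

	a.s_addr = htonl(host_order);
	return a;
}

/* Build an in_addr from four bytes already in network order,
 * e.g. bytes copied straight out of a packet. No swap is needed. */
static struct in_addr addr_from_wire(const uint8_t bytes[4])
{
	struct in_addr a;

	assert(bytes != NULL);
	memcpy(&a.s_addr, bytes, sizeof(a.s_addr));
	return a;
}
```

Both helpers produce the same s_addr for 10.0.0.2 on little-endian and big-endian hosts alike, which is the invariant that sparse's __be32 annotation is meant to protect.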
Re: Bug in tc of iproute2 ? Deleting single filter, deletes all the filters (apart from hashtable 800::) ...
I actually posted this on lartc first, but it was suggested that I post it
here, as you might be able to guide me better. Please help ...

On Fri, Aug 21, 2015 at 10:38 AM, Akshat Kakkar akshat.1...@gmail.com wrote:
> When I try to delete a single tc filter, it deletes all the filters
> with the same priority/preference, i.e. it ignores the specified handle.
>
> But when I delete a filter in hashtable 800:, it deletes only the
> specified filter.
>
> For example, the following set of commands creates hashtable 15: and
> adds 2 filters to it.
>
> tc filter add dev eth0 parent 1:0 prio 5 handle 15: protocol ip u32 divisor 256
> tc filter add dev eth0 protocol ip parent 1: prio 5 handle 15:2:2 u32 ht 15:2: match ip src 10.0.0.2 flowid 1:10
> tc filter add dev eth0 protocol ip parent 1: prio 5 handle 15:2:3 u32 ht 15:2: match ip src 10.0.0.3 flowid 1:10
>
> Now the following command DELETES ALL THE FILTERS, though it should
> only delete FILTER 15:2:3!
>
> tc filter del dev eth0 protocol ip parent 1: prio 5 handle 15:2:3 u32
>
> The output of "tc filter show dev eth0" in this case is blank, as all
> filters are deleted.
>
> However, similar commands executed for hashtable 800: delete only the
> specified filter.
>
> tc filter add dev eth0 protocol ip parent 1: prio 5 handle 800:0:2 u32 ht 800:0: match ip src 10.0.0.2 flowid 1:10
> tc filter add dev eth0 protocol ip parent 1: prio 5 handle 800:0:3 u32 ht 800:0: match ip src 10.0.0.3 flowid 1:10
> tc filter del dev eth0 protocol ip parent 1: prio 5 handle 800:0:2 u32
>
> The above command deletes only a single filter.
>
> The output of "tc filter show dev eth0" in the 2nd case is:
>
> filter parent 1: protocol ip pref 5 u32
> filter parent 1: protocol ip pref 5 u32 fh 800: ht divisor 1
> filter parent 1: protocol ip pref 5 u32 fh 800::3 order 3 key ht 800 bkt 0 flowid 1:10
>   match 0a03/ at 12
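For readers unfamiliar with the handle notation in the report, a u32 filter handle such as 15:2:3 is written in hex as table:bucket:node. The sketch below packs and unpacks such a handle, assuming the 12/8/12-bit layout expressed by the kernel's TC_U32_HTID, TC_U32_HASH and TC_U32_NODE macros. The function names are hypothetical, and the sketch does not attempt to explain the deletion behaviour being reported.

```c
#include <assert.h>
#include <stdint.h>

/* Pack table id (12 bits), hash bucket (8 bits) and node id (12 bits)
 * into one 32-bit handle. Layout is an assumption, see the lead-in. */
static uint32_t u32_handle(uint32_t htid, uint32_t bucket, uint32_t node)
{
	assert(htid <= 0xFFF && bucket <= 0xFF && node <= 0xFFF);
	return (htid << 20) | (bucket << 12) | node;
}

/* Extract the individual fields again. */
static uint32_t u32_htid(uint32_t handle)   { return handle >> 20; }
static uint32_t u32_bucket(uint32_t handle) { return (handle >> 12) & 0xFF; }
static uint32_t u32_node(uint32_t handle)   { return handle & 0xFFF; }
```

Under that assumption, 15:2:3 corresponds to 0x01502003 and 800:0:2 to 0x80000002, so the two filters in each example differ only in their node bits.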
Re: [lkp] [rhashtable] 9d901bc0515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/ioremap.c:63 __ioremap_check_ram+0x6a/0x99()
On Fri, Aug 21, 2015 at 02:05:19PM +0800, kernel test robot wrote:
> FYI, we noticed the below changes on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> commit 9d901bc05153bbf33b5da2cd6266865e531f0545 (rhashtable: Free bucket tables asynchronously after rehash)
>
> With the commit, the possibility of OOM is increased under our boot
> testing.

Can you gather some stats on how much memory rhashtable is actually
using? With that kernel you've probably got only one rhashtable user,
which is netlink.

Bear in mind that this is a fairly low-memory machine (< 300M), so
it's not clear to me that this patch is the root cause of your OOM
problem.

Thanks,
--
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
RE: [PATCH v2.2 01/22] fjes: Introduce FUJITSU Extended Socket Network Device driver
Dear David,

Thank you for reviewing.
I'll update the patchset according to your comments.

Sincerely,
Taku Izumi

-----Original Message-----
From: David Miller [mailto:da...@davemloft.net]
Sent: Friday, August 21, 2015 7:49 AM
To: Izumi, Taku/泉 拓
Cc: netdev@vger.kernel.org; platform-driver-...@vger.kernel.org; dvh...@infradead.org; rk...@redhat.com; alexander.h.du...@redhat.com; linux-a...@vger.kernel.org; j...@perches.com; sergei.shtyl...@cogentembedded.com; step...@networkplumber.org; yasu.isim...@gmail.com
Subject: Re: [PATCH v2.2 01/22] fjes: Introduce FUJITSU Extended Socket Network Device driver

From: Taku Izumi izumi.t...@jp.fujitsu.com
Date: Thu, 20 Aug 2015 17:46:05 +0900

> +obj-$(CONFIG_FUJITSU_ES) += fjes.o
> +
> +fjes-objs := fjes_main.o
> +

Please do not have trailing empty lines in any files you add or edit,
'git' warns about this even when applying patches.

> +static int fjes_acpi_add(struct acpi_device *device)
> +{
> +	acpi_status status;
> +	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL};
> +	union acpi_object *str;
> +	char str_buf[sizeof(FJES_ACPI_SYMBOL) + 1];
> +	int result;
> +	struct platform_device *plat_dev;

Please order your local variables in reverse christmas tree order, which
means longer lines come before shorter ones.

Please correct this problem in your entire submission, as I am not going
to point out each and every other place where this problem exists.
[net-next:master 790/1189] net/ipv6/netfilter/nf_dup_ipv6.c:48:23: sparse: incorrect type in assignment (different base types)
tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git master
head:   a9e01ed986aa80d3092134428f453072752da223
commit: bbde9fc1824aab58bc78c084163007dd6c03fe5b [790/1189] netfilter: factor out packet duplication for IPv4/IPv6
reproduce:
        # apt-get install sparse
        git checkout bbde9fc1824aab58bc78c084163007dd6c03fe5b
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__

sparse warnings: (new ones prefixed by >>)

>> net/ipv6/netfilter/nf_dup_ipv6.c:48:23: sparse: incorrect type in assignment (different base types)
   net/ipv6/netfilter/nf_dup_ipv6.c:48:23:    expected restricted __be32 [addressable] [assigned] [usertype] flowlabel
   net/ipv6/netfilter/nf_dup_ipv6.c:48:23:    got int

vim +48 net/ipv6/netfilter/nf_dup_ipv6.c

    32		return &init_net;
    33	}
    34
    35	static bool nf_dup_ipv6_route(struct sk_buff *skb, const struct in6_addr *gw,
    36				      int oif)
    37	{
    38		const struct ipv6hdr *iph = ipv6_hdr(skb);
    39		struct net *net = pick_net(skb);
    40		struct dst_entry *dst;
    41		struct flowi6 fl6;
    42
    43		memset(&fl6, 0, sizeof(fl6));
    44		if (oif != -1)
    45			fl6.flowi6_oif = oif;
    46
    47		fl6.daddr = *gw;
    48		fl6.flowlabel = ((iph->flow_lbl[0] & 0xF) << 16) |
    49				(iph->flow_lbl[1] << 8) | iph->flow_lbl[2];
    50		dst = ip6_route_output(net, NULL, &fl6);
    51		if (dst->error) {
    52			dst_release(dst);
    53			return false;
    54		}
    55		skb_dst_drop(skb);
    56		skb_dst_set(skb, dst);

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
Re: [PATCH 2/5] net: add Hisilicon Network Subsystem hnae framework support
Thanks, Klimov, You are right. I will fix it in next patches. On Tue, Aug 18, 2015 at 03:12:02AM +0300, Alexey Klimov wrote: Date: Tue, 18 Aug 2015 03:12:02 +0300 From: Alexey Klimov klimov.li...@gmail.com To: Kenneth Lee liguo...@hisilicon.com CC: robh...@kernel.org, pawel.m...@arm.com, Mark Rutland mark.rutl...@arm.com, ijc+devicet...@hellion.org.uk, Kumar Gala ga...@codeaurora.org, Catalin Marinas catalin.mari...@arm.com, Will Deacon will.dea...@arm.com, yisen.zhu...@huawei.com, David S. Miller da...@davemloft.net, paul.gortma...@windriver.com, dingtianh...@huawei.com, zhangfei@linaro.org, devicet...@vger.kernel.org, Linux Kernel Mailing List linux-ker...@vger.kernel.org, linux-arm-ker...@lists.infradead.org, netdev@vger.kernel.org, linux...@huawei.com, salil.me...@huawei.com, huangda...@hisilicon.com, Kenneth Lee liguo...@huawei.com, Yury Norov yury.no...@gmail.com Subject: Re: [PATCH 2/5] net: add Hisilicon Network Subsystem hnae framework support Message-ID: CALW4P+J8LkLshu5TuRT+8c__KRwJ8XAdMV4yA0KEnrfUg=m...@mail.gmail.com Hi Kenneth, just small minor question. On Fri, Aug 14, 2015 at 1:30 PM, Kenneth Lee liguo...@hisilicon.com wrote: HNAE (Hisilicon Network Acceleration Engine) is a framework to provide a unified ring buffer interface for Hisilicon Network Acceleration Engines. With the interface, upper layer can work as ethernet driver, ODP driver or other service driver on purpose. 
Signed-off-by: Kenneth Lee liguo...@huawei.com Signed-off-by: Yisen Zhuang yisen.zhu...@huawei.com --- drivers/net/ethernet/hisilicon/Kconfig | 33 +- drivers/net/ethernet/hisilicon/Makefile | 1 + drivers/net/ethernet/hisilicon/hns/Makefile | 15 + drivers/net/ethernet/hisilicon/hns/hnae.c | 494 +++ drivers/net/ethernet/hisilicon/hns/hnae.h | 582 5 files changed, 1124 insertions(+), 1 deletion(-) create mode 100644 drivers/net/ethernet/hisilicon/hns/Makefile create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.c create mode 100644 drivers/net/ethernet/hisilicon/hns/hnae.h diff --git a/drivers/net/ethernet/hisilicon/Kconfig b/drivers/net/ethernet/hisilicon/Kconfig index dead17b..1e4f5a7 100644 --- a/drivers/net/ethernet/hisilicon/Kconfig +++ b/drivers/net/ethernet/hisilicon/Kconfig @@ -5,7 +5,7 @@ config NET_VENDOR_HISILICON bool Hisilicon devices default y - depends on ARM + depends on ARM || ARM64 ---help--- If you have a network (Ethernet) card belonging to this class, say Y. @@ -31,4 +31,35 @@ config HIP04_ETH If you wish to compile a kernel for a hardware with hisilicon p04 SoC and want to use the internal ethernet then you should answer Y to this. +config HNS + tristate Hisilicon Network Subsystem Support (Framework) + ---help--- + This selects the framework support for Hisilicon Network Subsystem. It + is needed by any driver which provides HNS acceleration engine or make + use of the engine + +config HNS_DSAF + tristate Hisilicon HNS DSAF device Support + select HNS + select HNS_MDIO + ---help--- + This selects the DSAF (Distributed System Area Frabric) network + acceleration engine support. The engine is used in Hisilicon P660, + Hi1610 and further ICT SoC + +config HNS_MDIO + tristate Hisilicon HNS MDIO device Support + select MDIO + ---help--- + This selects the HNS MDIO support. 
It is needed by HNS_DSAF to access + the PHY + +config HNS_ENET + tristate Hisilicon HNS Ethernet Device Support + select PHYLIB + select HNS + ---help--- + This selects the general ethernet driver for HNS. This module make + use of any HNS AE driver, such as HNS_DSAF + endif # NET_VENDOR_HISILICON diff --git a/drivers/net/ethernet/hisilicon/Makefile b/drivers/net/ethernet/hisilicon/Makefile index 6c14540..2503a9b 100644 --- a/drivers/net/ethernet/hisilicon/Makefile +++ b/drivers/net/ethernet/hisilicon/Makefile @@ -4,3 +4,4 @@ obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o obj-$(CONFIG_HIP04_ETH) += hip04_mdio.o hip04_eth.o +obj-$(CONFIG_HNS) += hns/ diff --git a/drivers/net/ethernet/hisilicon/hns/Makefile b/drivers/net/ethernet/hisilicon/hns/Makefile new file mode 100644 index 000..6680602 --- /dev/null +++ b/drivers/net/ethernet/hisilicon/hns/Makefile @@ -0,0 +1,15 @@ +# +# Makefile for the HISILICON network device drivers. +# + +obj-$(CONFIG_HNS) += hnae.o + +obj-$(CONFIG_HNS_DSAF) += hns_dsaf.o +hns_dsaf-objs = hns_ae_adapt.o hns_dsaf_gmac.o hns_dsaf_mac.o hns_dsaf_misc.o \ + hns_dsaf_main.o hns_dsaf_ppe.o hns_dsaf_rcb.o hns_dsaf_xgmac.o +
[PATCH v3 19/22] fjes: update_zone_task
This patch adds update_zone_task. Zoning information can be changed by user. This task is used to monitor if zoning information is changed or not. Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com --- drivers/net/fjes/fjes_hw.c | 179 +++ drivers/net/fjes/fjes_hw.h | 1 + drivers/net/fjes/fjes_main.c | 14 3 files changed, 194 insertions(+) diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c index 4a4b750..4525d36 100644 --- a/drivers/net/fjes/fjes_hw.c +++ b/drivers/net/fjes/fjes_hw.c @@ -22,6 +22,8 @@ #include fjes_hw.h #include fjes.h +static void fjes_hw_update_zone_task(struct work_struct *); + /* supported MTU list */ const u32 fjes_support_mtu[] = { FJES_MTU_DEFINE(8 * 1024), @@ -322,6 +324,8 @@ int fjes_hw_init(struct fjes_hw *hw) fjes_hw_set_irqmask(hw, REG_ICTL_MASK_ALL, true); + INIT_WORK(hw-update_zone_task, fjes_hw_update_zone_task); + mutex_init(hw-hw_info.lock); hw-max_epid = fjes_hw_get_max_epid(hw); @@ -349,6 +353,8 @@ void fjes_hw_exit(struct fjes_hw *hw) } fjes_hw_cleanup(hw); + + cancel_work_sync(hw-update_zone_task); } static enum fjes_dev_command_response_e @@ -913,3 +919,176 @@ int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *epbh, return 0; } + +static void fjes_hw_update_zone_task(struct work_struct *work) +{ + struct fjes_hw *hw = container_of(work, + struct fjes_hw, update_zone_task); + + struct my_s {u8 es_status; u8 zone; } *info; + union fjes_device_command_res *res_buf; + enum ep_partner_status pstatus; + + struct fjes_adapter *adapter; + struct net_device *netdev; + + ulong unshare_bit = 0; + ulong share_bit = 0; + ulong irq_bit = 0; + + int epidx; + int ret; + + adapter = (struct fjes_adapter *)hw-back; + netdev = adapter-netdev; + res_buf = hw-hw_info.res_buf; + info = (struct my_s *)res_buf-info.info; + + mutex_lock(hw-hw_info.lock); + + ret = fjes_hw_request_info(hw); + switch (ret) { + case -ENOMSG: + case -EBUSY: + default: + if (!work_pending(adapter-force_close_task)) { + adapter-force_reset = true; + 
schedule_work(adapter-force_close_task); + } + break; + + case 0: + + for (epidx = 0; epidx hw-max_epid; epidx++) { + if (epidx == hw-my_epid) { + hw-ep_shm_info[epidx].es_status = + info[epidx].es_status; + hw-ep_shm_info[epidx].zone = + info[epidx].zone; + continue; + } + + pstatus = fjes_hw_get_partner_ep_status(hw, epidx); + switch (pstatus) { + case EP_PARTNER_UNSHARE: + default: + if ((info[epidx].zone != + FJES_ZONING_ZONE_TYPE_NONE) + (info[epidx].es_status == + FJES_ZONING_STATUS_ENABLE) + (info[epidx].zone == + info[hw-my_epid].zone)) + set_bit(epidx, share_bit); + else + set_bit(epidx, unshare_bit); + break; + + case EP_PARTNER_COMPLETE: + case EP_PARTNER_WAITING: + if ((info[epidx].zone == + FJES_ZONING_ZONE_TYPE_NONE) || + (info[epidx].es_status != + FJES_ZONING_STATUS_ENABLE) || + (info[epidx].zone != + info[hw-my_epid].zone)) { + set_bit(epidx, + adapter-unshare_watch_bitmask); + set_bit(epidx, + hw-hw_info.buffer_unshare_reserve_bit); + } + break; + + case EP_PARTNER_SHARED: + if ((info[epidx].zone == + FJES_ZONING_ZONE_TYPE_NONE) || + (info[epidx].es_status != + FJES_ZONING_STATUS_ENABLE) || + (info[epidx].zone != + info[hw-my_epid].zone)) + set_bit(epidx,
[PATCH v3 21/22] fjes: handle receive cancellation request interrupt
This patch adds handling of the IRQ raised for another receiver's
receive cancellation request.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes_main.c | 78
 1 file changed, 78 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 5e77d0c..5f93e42 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -820,6 +820,74 @@ static int fjes_vlan_rx_kill_vid(struct net_device *netdev,
 	return 0;
 }

+static void fjes_txrx_stop_req_irq(struct fjes_adapter *adapter,
+				   int src_epid)
+{
+	struct fjes_hw *hw = &adapter->hw;
+	enum ep_partner_status status;
+
+	status = fjes_hw_get_partner_ep_status(hw, src_epid);
+	switch (status) {
+	case EP_PARTNER_UNSHARE:
+	case EP_PARTNER_COMPLETE:
+	default:
+		break;
+	case EP_PARTNER_WAITING:
+		if (src_epid < hw->my_epid) {
+			hw->ep_shm_info[src_epid].tx.info->v1i.rx_status |=
+				FJES_RX_STOP_REQ_DONE;
+
+			clear_bit(src_epid, &hw->txrx_stop_req_bit);
+			set_bit(src_epid, &adapter->unshare_watch_bitmask);
+
+			if (!work_pending(&adapter->unshare_watch_task))
+				queue_work(adapter->control_wq,
+					   &adapter->unshare_watch_task);
+		}
+		break;
+	case EP_PARTNER_SHARED:
+		if (hw->ep_shm_info[src_epid].rx.info->v1i.rx_status &
+		    FJES_RX_STOP_REQ_REQUEST) {
+			set_bit(src_epid, &hw->epstop_req_bit);
+			if (!work_pending(&hw->epstop_task))
+				queue_work(adapter->control_wq,
+					   &hw->epstop_task);
+		}
+		break;
+	}
+}
+
+static void fjes_stop_req_irq(struct fjes_adapter *adapter, int src_epid)
+{
+	struct fjes_hw *hw = &adapter->hw;
+	enum ep_partner_status status;
+
+	set_bit(src_epid, &hw->hw_info.buffer_unshare_reserve_bit);
+
+	status = fjes_hw_get_partner_ep_status(hw, src_epid);
+	switch (status) {
+	case EP_PARTNER_WAITING:
+		hw->ep_shm_info[src_epid].tx.info->v1i.rx_status |=
+				FJES_RX_STOP_REQ_DONE;
+		clear_bit(src_epid, &hw->txrx_stop_req_bit);
+		/* fall through */
+	case EP_PARTNER_UNSHARE:
+	case EP_PARTNER_COMPLETE:
+	default:
+		set_bit(src_epid, &adapter->unshare_watch_bitmask);
+		if (!work_pending(&adapter->unshare_watch_task))
+			queue_work(adapter->control_wq,
+				   &adapter->unshare_watch_task);
+		break;
+	case EP_PARTNER_SHARED:
+		set_bit(src_epid, &hw->epstop_req_bit);
+
+		if (!work_pending(&hw->epstop_task))
+			queue_work(adapter->control_wq, &hw->epstop_task);
+		break;
+	}
+}
+
 static void fjes_update_zone_irq(struct fjes_adapter *adapter,
 				 int src_epid)
 {
@@ -842,6 +910,16 @@ static irqreturn_t fjes_intr(int irq, void *data)
 		if (icr & REG_ICTL_MASK_RX_DATA)
 			fjes_rx_irq(adapter, icr & REG_IS_MASK_EPID);

+		if (icr & REG_ICTL_MASK_DEV_STOP_REQ)
+			fjes_stop_req_irq(adapter, icr & REG_IS_MASK_EPID);
+
+		if (icr & REG_ICTL_MASK_TXRX_STOP_REQ)
+			fjes_txrx_stop_req_irq(adapter, icr & REG_IS_MASK_EPID);
+
+		if (icr & REG_ICTL_MASK_TXRX_STOP_DONE)
+			fjes_hw_set_irqmask(hw,
+					    REG_ICTL_MASK_TXRX_STOP_DONE, true);
+
 		if (icr & REG_ICTL_MASK_INFO_UPDATE)
 			fjes_update_zone_irq(adapter, icr & REG_IS_MASK_EPID);

--
1.8.3.1
[PATCH v3 09/22] fjes: raise_intr_rxdata_task
This patch adds raise_intr_rxdata_task.

The Extended Socket Network Device is shared-memory based, so one
endpoint's transmission is another's reception. To notify the receivers,
the sender has to raise an interrupt on them.
raise_intr_rxdata_task does this work.

Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com
---
 drivers/net/fjes/fjes.h      |  4 +++
 drivers/net/fjes/fjes_main.c | 63
 2 files changed, 67 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 7af4304..8e9899e 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -50,6 +50,10 @@ struct fjes_adapter {

 	bool irq_registered;

+	struct workqueue_struct *txrx_wq;
+
+	struct work_struct raise_intr_rxdata_task;
+
 	struct fjes_hw hw;
 };

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 220ff3d..80e180f 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -52,6 +52,7 @@ static int fjes_close(struct net_device *);
 static int fjes_setup_resources(struct fjes_adapter *);
 static void fjes_free_resources(struct fjes_adapter *);
 static netdev_tx_t fjes_xmit_frame(struct sk_buff *, struct net_device *);
+static void fjes_raise_intr_rxdata_task(struct work_struct *);
 static irqreturn_t fjes_intr(int, void*);

 static int fjes_acpi_add(struct acpi_device *);
@@ -276,6 +277,8 @@ static int fjes_close(struct net_device *netdev)

 	fjes_free_irq(adapter);

+	cancel_work_sync(&adapter->raise_intr_rxdata_task);
+
 	fjes_hw_wait_epstop(hw);

 	fjes_free_resources(adapter);
@@ -404,6 +407,54 @@ static void fjes_free_resources(struct fjes_adapter *adapter)
 	}
 }

+static void fjes_raise_intr_rxdata_task(struct work_struct *work)
+{
+	struct fjes_adapter *adapter = container_of(work,
+			struct fjes_adapter, raise_intr_rxdata_task);
+	struct fjes_hw *hw = &adapter->hw;
+	enum ep_partner_status pstatus;
+	int max_epid, my_epid, epid;
+
+	my_epid = hw->my_epid;
+	max_epid = hw->max_epid;
+
+	for (epid = 0; epid < max_epid; epid++)
+		hw->ep_shm_info[epid].tx_status_work = 0;
+
+	for (epid = 0; epid < max_epid; epid++) {
+		if (epid == my_epid)
+			continue;
+
+		pstatus = fjes_hw_get_partner_ep_status(hw, epid);
+		if (pstatus == EP_PARTNER_SHARED) {
+			hw->ep_shm_info[epid].tx_status_work =
+				hw->ep_shm_info[epid].tx.info->v1i.tx_status;
+
+			if (hw->ep_shm_info[epid].tx_status_work ==
+				FJES_TX_DELAY_SEND_PENDING) {
+				hw->ep_shm_info[epid].tx.info->v1i.tx_status =
+					FJES_TX_DELAY_SEND_NONE;
+			}
+		}
+	}
+
+	for (epid = 0; epid < max_epid; epid++) {
+		if (epid == my_epid)
+			continue;
+
+		pstatus = fjes_hw_get_partner_ep_status(hw, epid);
+		if ((hw->ep_shm_info[epid].tx_status_work ==
+		     FJES_TX_DELAY_SEND_PENDING) &&
+		    (pstatus == EP_PARTNER_SHARED) &&
+		    !(hw->ep_shm_info[epid].rx.info->v1i.rx_status)) {
+			fjes_hw_raise_interrupt(hw, epid,
+						REG_ICTL_MASK_RX_DATA);
+		}
+	}
+
+	usleep_range(500, 1000);
+}
+
 static int fjes_tx_send(struct fjes_adapter *adapter, int dest,
 			void *data, size_t len)
 {
@@ -416,6 +467,9 @@ static int fjes_tx_send(struct fjes_adapter *adapter, int dest,

 	adapter->hw.ep_shm_info[dest].tx.info->v1i.tx_status =
 		FJES_TX_DELAY_SEND_PENDING;
+	if (!work_pending(&adapter->raise_intr_rxdata_task))
+		queue_work(adapter->txrx_wq,
+			   &adapter->raise_intr_rxdata_task);

 	retval = 0;
 	return retval;
@@ -630,6 +684,11 @@ static int fjes_probe(struct platform_device *plat_dev)
 	adapter->force_reset = false;
 	adapter->open_guard = false;

+	adapter->txrx_wq = create_workqueue(DRV_NAME "/txrx");
+
+	INIT_WORK(&adapter->raise_intr_rxdata_task,
+		  fjes_raise_intr_rxdata_task);
+
 	res = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
 	hw->hw_res.start = res->start;
 	hw->hw_res.size = res->end - res->start + 1;
@@ -669,6 +728,10 @@ static int fjes_remove(struct platform_device *plat_dev)
 	struct fjes_adapter *adapter = netdev_priv(netdev);
 	struct fjes_hw *hw = &adapter->hw;

+	cancel_work_sync(&adapter->raise_intr_rxdata_task);
+	if (adapter->txrx_wq)
+		destroy_workqueue(adapter->txrx_wq);
+
 	unregister_netdev(netdev);

 	fjes_hw_exit(hw);
--
1.8.3.1
[PATCH v3 02/22] fjes: Hardware initialization routine
This patch adds hardware initialization routine to be invoked at driver's .probe routine. Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com --- drivers/net/fjes/Makefile| 2 +- drivers/net/fjes/fjes.h | 1 + drivers/net/fjes/fjes_hw.c | 295 +++ drivers/net/fjes/fjes_hw.h | 251 drivers/net/fjes/fjes_regs.h | 102 +++ 5 files changed, 650 insertions(+), 1 deletion(-) create mode 100644 drivers/net/fjes/fjes_hw.c create mode 100644 drivers/net/fjes/fjes_hw.h create mode 100644 drivers/net/fjes/fjes_regs.h diff --git a/drivers/net/fjes/Makefile b/drivers/net/fjes/Makefile index 34bccba..753d52f 100644 --- a/drivers/net/fjes/Makefile +++ b/drivers/net/fjes/Makefile @@ -27,4 +27,4 @@ obj-$(CONFIG_FUJITSU_ES) += fjes.o -fjes-objs := fjes_main.o +fjes-objs := fjes_main.o fjes_hw.o diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h index 52eb60b..15ded96 100644 --- a/drivers/net/fjes/fjes.h +++ b/drivers/net/fjes/fjes.h @@ -28,5 +28,6 @@ extern char fjes_driver_name[]; extern char fjes_driver_version[]; +extern const u32 fjes_support_mtu[]; #endif /* FJES_H_ */ diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c new file mode 100644 index 000..ae26638 --- /dev/null +++ b/drivers/net/fjes/fjes_hw.c @@ -0,0 +1,295 @@ +/* + * FUJITSU Extended Socket Network Device driver + * Copyright (c) 2015 FUJITSU LIMITED + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, see http://www.gnu.org/licenses/. 
+ * + * The full GNU General Public License is included in this distribution in + * the file called COPYING. + * + */ + +#include fjes_hw.h +#include fjes.h + +/* supported MTU list */ +const u32 fjes_support_mtu[] = { + FJES_MTU_DEFINE(8 * 1024), + FJES_MTU_DEFINE(16 * 1024), + FJES_MTU_DEFINE(32 * 1024), + FJES_MTU_DEFINE(64 * 1024), + 0 +}; + +u32 fjes_hw_rd32(struct fjes_hw *hw, u32 reg) +{ + u8 *base = hw-base; + u32 value = 0; + + value = readl(base[reg]); + + return value; +} + +static u8 *fjes_hw_iomap(struct fjes_hw *hw) +{ + u8 *base; + + if (!request_mem_region(hw-hw_res.start, hw-hw_res.size, + fjes_driver_name)) { + pr_err(request_mem_region failed\n); + return NULL; + } + + base = (u8 *)ioremap_nocache(hw-hw_res.start, hw-hw_res.size); + + return base; +} + +int fjes_hw_reset(struct fjes_hw *hw) +{ + union REG_DCTL dctl; + int timeout; + + dctl.reg = 0; + dctl.bits.reset = 1; + wr32(XSCT_DCTL, dctl.reg); + + timeout = FJES_DEVICE_RESET_TIMEOUT * 1000; + dctl.reg = rd32(XSCT_DCTL); + while ((dctl.bits.reset == 1) (timeout 0)) { + msleep(1000); + dctl.reg = rd32(XSCT_DCTL); + timeout -= 1000; + } + + return timeout 0 ? 
0 : -EIO; +} + +static int fjes_hw_get_max_epid(struct fjes_hw *hw) +{ + union REG_MAX_EP info; + + info.reg = rd32(XSCT_MAX_EP); + + return info.bits.maxep; +} + +static int fjes_hw_get_my_epid(struct fjes_hw *hw) +{ + union REG_OWNER_EPID info; + + info.reg = rd32(XSCT_OWNER_EPID); + + return info.bits.epid; +} + +static int fjes_hw_alloc_shared_status_region(struct fjes_hw *hw) +{ + size_t size; + + size = sizeof(struct fjes_device_shared_info) + + (sizeof(u8) * hw-max_epid); + hw-hw_info.share = kzalloc(size, GFP_KERNEL); + if (!hw-hw_info.share) + return -ENOMEM; + + hw-hw_info.share-epnum = hw-max_epid; + + return 0; +} + +static int fjes_hw_alloc_epbuf(struct epbuf_handler *epbh) +{ + void *mem; + + mem = vzalloc(EP_BUFFER_SIZE); + if (!mem) + return -ENOMEM; + + epbh-buffer = mem; + epbh-size = EP_BUFFER_SIZE; + + epbh-info = (union ep_buffer_info *)mem; + epbh-ring = (u8 *)(mem + sizeof(union ep_buffer_info)); + + return 0; +} + +void fjes_hw_setup_epbuf(struct epbuf_handler *epbh, u8 *mac_addr, u32 mtu) +{ + union ep_buffer_info *info = epbh-info; + u16 vlan_id[EP_BUFFER_SUPPORT_VLAN_MAX]; + int i; + + for (i = 0; i EP_BUFFER_SUPPORT_VLAN_MAX; i++) + vlan_id[i] = info-v1i.vlan_id[i]; + + memset(info, 0, sizeof(union ep_buffer_info)); + + info-v1i.version = 0; /* version 0 */
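The reset routine in this patch follows a common pattern: set a reset bit, then poll it until the device clears it or a time budget runs out. Below is a minimal userspace sketch of that loop, assuming the condition in the posted code reads `reset == 1 && timeout > 0` (the operators were lost in this archive copy). `struct fake_dev` and its helpers are hypothetical stand-ins for the register accessors, and the sleep between polls is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device: the reset bit clears itself
 * after `polls_until_done` reads. Not part of the fjes driver. */
struct fake_dev {
	bool reset_bit;
	int polls_until_done;
};

static void fake_write_reset(struct fake_dev *dev)
{
	dev->reset_bit = true;
}

static bool fake_read_reset(struct fake_dev *dev)
{
	if (dev->reset_bit && dev->polls_until_done-- <= 0)
		dev->reset_bit = false;
	return dev->reset_bit;
}

/* Poll until the reset bit clears or the budget (in ms) is used up.
 * Returns 0 on success and -EIO once the budget reaches zero, which is
 * the same return rule as in the posted loop. */
static int poll_reset(struct fake_dev *dev, int timeout_ms, int step_ms)
{
	bool busy;

	fake_write_reset(dev);
	busy = fake_read_reset(dev);
	while (busy && timeout_ms > 0) {
		/* kernel code would msleep(step_ms) here */
		timeout_ms -= step_ms;
		busy = fake_read_reset(dev);
	}
	return timeout_ms > 0 ? 0 : -EIO;
}
```

With this return rule, a device that finishes on the very last poll is still reported as -EIO, because the budget is already zero when the loop exits.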
[PATCH v3 08/22] fjes: net_device_ops.ndo_start_xmit
This patch adds net_device_ops.ndo_start_xmit callback, which is called when sending packets. Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com --- drivers/net/fjes/fjes.h | 1 + drivers/net/fjes/fjes_hw.c | 55 ++ drivers/net/fjes/fjes_hw.h | 12 +++ drivers/net/fjes/fjes_main.c | 177 +++ 4 files changed, 245 insertions(+) diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h index f182ed3..7af4304 100644 --- a/drivers/net/fjes/fjes.h +++ b/drivers/net/fjes/fjes.h @@ -29,6 +29,7 @@ #define FJES_ACPI_SYMBOL Extended Socket #define FJES_MAX_QUEUES1 #define FJES_TX_RETRY_INTERVAL (20 * HZ) +#define FJES_TX_RETRY_TIMEOUT (100) #define FJES_OPEN_ZONE_UPDATE_WAIT (300) /* msec */ /* board specific private data structure */ diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c index 1935f48..487dbc6 100644 --- a/drivers/net/fjes/fjes_hw.c +++ b/drivers/net/fjes/fjes_hw.c @@ -791,3 +791,58 @@ int fjes_hw_wait_epstop(struct fjes_hw *hw) return (wait_time FJES_COMMAND_EPSTOP_WAIT_TIMEOUT * 1000) ? 
0 : -EBUSY; } + +bool fjes_hw_check_epbuf_version(struct epbuf_handler *epbh, u32 version) +{ + union ep_buffer_info *info = epbh-info; + + return (info-common.version == version); +} + +bool fjes_hw_check_mtu(struct epbuf_handler *epbh, u32 mtu) +{ + union ep_buffer_info *info = epbh-info; + + return (info-v1i.frame_max == FJES_MTU_TO_FRAME_SIZE(mtu)); +} + +bool fjes_hw_check_vlan_id(struct epbuf_handler *epbh, u16 vlan_id) +{ + union ep_buffer_info *info = epbh-info; + bool ret = false; + int i; + + if (vlan_id == 0) { + ret = true; + } else { + for (i = 0; i EP_BUFFER_SUPPORT_VLAN_MAX; i++) { + if (vlan_id == info-v1i.vlan_id[i]) { + ret = true; + break; + } + } + } + return ret; +} + +int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *epbh, + void *frame, size_t size) +{ + union ep_buffer_info *info = epbh-info; + struct esmem_frame *ring_frame; + + if (EP_RING_FULL(info-v1i.head, info-v1i.tail, info-v1i.count_max)) + return -ENOBUFS; + + ring_frame = (struct esmem_frame *)(epbh-ring[EP_RING_INDEX +(info-v1i.tail - 1, + info-v1i.count_max) * +info-v1i.frame_max]); + + ring_frame-frame_size = size; + memcpy((void *)(ring_frame-frame_data), (void *)frame, size); + + EP_RING_INDEX_INC(epbh-info-v1i.tail, info-v1i.count_max); + + return 0; +} diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h index 9b8df55..07e1226 100644 --- a/drivers/net/fjes/fjes_hw.h +++ b/drivers/net/fjes/fjes_hw.h @@ -50,6 +50,9 @@ struct fjes_hw; #define FJES_ZONING_ZONE_TYPE_NONE (0xFF) +#define FJES_TX_DELAY_SEND_NONE(0) +#define FJES_TX_DELAY_SEND_PENDING (1) + #define FJES_RX_STOP_REQ_NONE (0x0) #define FJES_RX_STOP_REQ_DONE (0x1) #define FJES_RX_STOP_REQ_REQUEST (0x2) @@ -61,6 +64,11 @@ struct fjes_hw; #define EP_RING_NUM(buffer_size, frame_size) \ (u32)((buffer_size) / (frame_size)) +#define EP_RING_INDEX(_num, _max) (((_num) + (_max)) % (_max)) +#define EP_RING_INDEX_INC(_num, _max) \ + ((_num) = EP_RING_INDEX((_num) + 1, (_max))) +#define EP_RING_FULL(_head, 
_tail, _max) \ + (0 == EP_RING_INDEX(((_tail) - (_head)), (_max))) #define FJES_MTU_TO_BUFFER_SIZE(mtu) \ (ETH_HLEN + VLAN_HLEN + (mtu) + ETH_FCS_LEN) @@ -309,5 +317,9 @@ enum ep_partner_status bool fjes_hw_epid_is_same_zone(struct fjes_hw *, int); int fjes_hw_epid_is_shared(struct fjes_device_shared_info *, int); +bool fjes_hw_check_epbuf_version(struct epbuf_handler *, u32); +bool fjes_hw_check_mtu(struct epbuf_handler *, u32); +bool fjes_hw_check_vlan_id(struct epbuf_handler *, u16); +int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *, void *, size_t); #endif /* FJES_HW_H_ */ diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c index bd50cbd..220ff3d 100644 --- a/drivers/net/fjes/fjes_main.c +++ b/drivers/net/fjes/fjes_main.c @@ -51,6 +51,7 @@ static int fjes_open(struct net_device *); static int fjes_close(struct net_device *); static int fjes_setup_resources(struct fjes_adapter *); static void fjes_free_resources(struct fjes_adapter *); +static netdev_tx_t fjes_xmit_frame(struct sk_buff *, struct net_device *); static irqreturn_t fjes_intr(int, void*); static int fjes_acpi_add(struct acpi_device *); @@ -212,6 +213,7 @@ static void fjes_free_irq(struct fjes_adapter *adapter) static const struct net_device_ops
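The EP_RING_* macros added to fjes_hw.h can be exercised outside the kernel. The sketch below copies the three macros as posted and applies them to a hypothetical four-slot ring of plain ints. The start state (head = 0, tail = 1) is an assumption, since the setup code is cut off in this archive copy; `struct mini_ring` and its helpers are illustrative names only.

```c
#include <stdbool.h>

/* Macros as posted in the patch, applied to plain ints here. */
#define EP_RING_INDEX(_num, _max)	(((_num) + (_max)) % (_max))
#define EP_RING_INDEX_INC(_num, _max) \
	((_num) = EP_RING_INDEX((_num) + 1, (_max)))
#define EP_RING_FULL(_head, _tail, _max) \
	(0 == EP_RING_INDEX(((_tail) - (_head)), (_max)))

/* Hypothetical miniature ring, not the driver's ep_buffer_info. */
struct mini_ring {
	int head;
	int tail;
	int count_max;
	int slots[4];
};

static void mini_ring_init(struct mini_ring *r)
{
	r->head = 0;	/* assumed start state */
	r->tail = 1;
	r->count_max = 4;
}

/* Store one value in slot (tail - 1) and advance tail.
 * Returns false when the ring is full. */
static bool mini_ring_push(struct mini_ring *r, int value)
{
	if (EP_RING_FULL(r->head, r->tail, r->count_max))
		return false;
	r->slots[EP_RING_INDEX(r->tail - 1, r->count_max)] = value;
	EP_RING_INDEX_INC(r->tail, r->count_max);
	return true;
}
```

Under these assumptions a ring with count_max slots accepts count_max - 1 entries before EP_RING_FULL reports full, because tail catching up with head is the full condition.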
[PATCH v3 06/22] fjes: buffer address regist/unregistration routine
This patch adds buffer address regist/unregistration routine. This function is mainly invoked when network device's activation (open) and deactivation (close) in order to retist/unregist shared buffer address. Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com --- drivers/net/fjes/fjes_hw.c | 186 + drivers/net/fjes/fjes_hw.h | 9 ++- 2 files changed, 194 insertions(+), 1 deletion(-) diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c index c31be7f..1e807df 100644 --- a/drivers/net/fjes/fjes_hw.c +++ b/drivers/net/fjes/fjes_hw.c @@ -452,6 +452,192 @@ int fjes_hw_request_info(struct fjes_hw *hw) return result; } +int fjes_hw_register_buff_addr(struct fjes_hw *hw, int dest_epid, + struct ep_share_mem_info *buf_pair) +{ + union fjes_device_command_req *req_buf = hw-hw_info.req_buf; + union fjes_device_command_res *res_buf = hw-hw_info.res_buf; + enum fjes_dev_command_response_e ret; + int page_count; + int timeout; + int i, idx; + void *addr; + int result; + + if (test_bit(dest_epid, hw-hw_info.buffer_share_bit)) + return 0; + + memset(req_buf, 0, hw-hw_info.req_buf_size); + memset(res_buf, 0, hw-hw_info.res_buf_size); + + req_buf-share_buffer.length = FJES_DEV_COMMAND_SHARE_BUFFER_REQ_LEN( + buf_pair-tx.size, + buf_pair-rx.size); + req_buf-share_buffer.epid = dest_epid; + + idx = 0; + req_buf-share_buffer.buffer[idx++] = buf_pair-tx.size; + page_count = buf_pair-tx.size / EP_BUFFER_INFO_SIZE; + for (i = 0; i page_count; i++) { + addr = ((u8 *)(buf_pair-tx.buffer)) + + (i * EP_BUFFER_INFO_SIZE); + req_buf-share_buffer.buffer[idx++] = + (__le64)(page_to_phys(vmalloc_to_page(addr)) + + offset_in_page(addr)); + } + + req_buf-share_buffer.buffer[idx++] = buf_pair-rx.size; + page_count = buf_pair-rx.size / EP_BUFFER_INFO_SIZE; + for (i = 0; i page_count; i++) { + addr = ((u8 *)(buf_pair-rx.buffer)) + + (i * EP_BUFFER_INFO_SIZE); + req_buf-share_buffer.buffer[idx++] = + (__le64)(page_to_phys(vmalloc_to_page(addr)) + + offset_in_page(addr)); + } + + 
res_buf-share_buffer.length = 0; + res_buf-share_buffer.code = 0; + + ret = fjes_hw_issue_request_command(hw, FJES_CMD_REQ_SHARE_BUFFER); + + timeout = FJES_COMMAND_REQ_BUFF_TIMEOUT * 1000; + while ((ret == FJES_CMD_STATUS_NORMAL) + (res_buf-share_buffer.length == + FJES_DEV_COMMAND_SHARE_BUFFER_RES_LEN) + (res_buf-share_buffer.code == FJES_CMD_REQ_RES_CODE_BUSY) + (timeout 0)) { + msleep(200 + hw-my_epid * 20); + timeout -= (200 + hw-my_epid * 20); + + res_buf-share_buffer.length = 0; + res_buf-share_buffer.code = 0; + + ret = fjes_hw_issue_request_command( + hw, FJES_CMD_REQ_SHARE_BUFFER); + } + + result = 0; + + if (res_buf-share_buffer.length != + FJES_DEV_COMMAND_SHARE_BUFFER_RES_LEN) + result = -ENOMSG; + else if (ret == FJES_CMD_STATUS_NORMAL) { + switch (res_buf-share_buffer.code) { + case FJES_CMD_REQ_RES_CODE_NORMAL: + result = 0; + set_bit(dest_epid, hw-hw_info.buffer_share_bit); + break; + case FJES_CMD_REQ_RES_CODE_BUSY: + result = -EBUSY; + break; + default: + result = -EPERM; + break; + } + } else { + switch (ret) { + case FJES_CMD_STATUS_UNKNOWN: + result = -EPERM; + break; + case FJES_CMD_STATUS_TIMEOUT: + result = -EBUSY; + break; + case FJES_CMD_STATUS_ERROR_PARAM: + case FJES_CMD_STATUS_ERROR_STATUS: + default: + result = -EPERM; + break; + } + } + + return result; +} + +int fjes_hw_unregister_buff_addr(struct fjes_hw *hw, int dest_epid) +{ + union fjes_device_command_req *req_buf = hw-hw_info.req_buf; + union fjes_device_command_res *res_buf = hw-hw_info.res_buf; + struct fjes_device_shared_info *share = hw-hw_info.share; + enum fjes_dev_command_response_e ret; +
[PATCH v3 22/22] fjes: ethtool support
This patch adds implementation for ethtool support. Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com --- drivers/net/fjes/Makefile | 2 +- drivers/net/fjes/fjes.h | 2 + drivers/net/fjes/fjes_ethtool.c | 137 drivers/net/fjes/fjes_main.c| 1 + 4 files changed, 141 insertions(+), 1 deletion(-) create mode 100644 drivers/net/fjes/fjes_ethtool.c diff --git a/drivers/net/fjes/Makefile b/drivers/net/fjes/Makefile index 753d52f..523e3d7 100644 --- a/drivers/net/fjes/Makefile +++ b/drivers/net/fjes/Makefile @@ -27,4 +27,4 @@ obj-$(CONFIG_FUJITSU_ES) += fjes.o -fjes-objs := fjes_main.o fjes_hw.o +fjes-objs := fjes_main.o fjes_hw.o fjes_ethtool.o diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h index 57feee8..a592fe2 100644 --- a/drivers/net/fjes/fjes.h +++ b/drivers/net/fjes/fjes.h @@ -72,4 +72,6 @@ extern char fjes_driver_name[]; extern char fjes_driver_version[]; extern const u32 fjes_support_mtu[]; +void fjes_set_ethtool_ops(struct net_device *); + #endif /* FJES_H_ */ diff --git a/drivers/net/fjes/fjes_ethtool.c b/drivers/net/fjes/fjes_ethtool.c new file mode 100644 index 000..0119dd1 --- /dev/null +++ b/drivers/net/fjes/fjes_ethtool.c @@ -0,0 +1,137 @@ +/* + * FUJITSU Extended Socket Network Device driver + * Copyright (c) 2015 FUJITSU LIMITED + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program; if not, see http://www.gnu.org/licenses/. + * + * The full GNU General Public License is included in this distribution in + * the file called COPYING. 
+ * + */ + +/* ethtool support for fjes */ + +#include linux/vmalloc.h +#include linux/netdevice.h +#include linux/ethtool.h +#include linux/platform_device.h + +#include fjes.h + +struct fjes_stats { + char stat_string[ETH_GSTRING_LEN]; + int sizeof_stat; + int stat_offset; +}; + +#define FJES_STAT(name, stat) { \ + .stat_string = name, \ + .sizeof_stat = FIELD_SIZEOF(struct fjes_adapter, stat), \ + .stat_offset = offsetof(struct fjes_adapter, stat) \ +} + +static const struct fjes_stats fjes_gstrings_stats[] = { + FJES_STAT(rx_packets, stats64.rx_packets), + FJES_STAT(tx_packets, stats64.tx_packets), + FJES_STAT(rx_bytes, stats64.rx_bytes), + FJES_STAT(tx_bytes, stats64.rx_bytes), + FJES_STAT(rx_dropped, stats64.rx_dropped), + FJES_STAT(tx_dropped, stats64.tx_dropped), +}; + +static void fjes_get_ethtool_stats(struct net_device *netdev, + struct ethtool_stats *stats, u64 *data) +{ + struct fjes_adapter *adapter = netdev_priv(netdev); + char *p; + int i; + + for (i = 0; i ARRAY_SIZE(fjes_gstrings_stats); i++) { + p = (char *)adapter + fjes_gstrings_stats[i].stat_offset; + data[i] = (fjes_gstrings_stats[i].sizeof_stat == sizeof(u64)) + ? 
*(u64 *)p : *(u32 *)p; + } +} + +static void fjes_get_strings(struct net_device *netdev, +u32 stringset, u8 *data) +{ + u8 *p = data; + int i; + + switch (stringset) { + case ETH_SS_STATS: + for (i = 0; i ARRAY_SIZE(fjes_gstrings_stats); i++) { + memcpy(p, fjes_gstrings_stats[i].stat_string, + ETH_GSTRING_LEN); + p += ETH_GSTRING_LEN; + } + break; + } +} + +static int fjes_get_sset_count(struct net_device *netdev, int sset) +{ + switch (sset) { + case ETH_SS_STATS: + return ARRAY_SIZE(fjes_gstrings_stats); + default: + return -EOPNOTSUPP; + } +} + +static void fjes_get_drvinfo(struct net_device *netdev, +struct ethtool_drvinfo *drvinfo) +{ + struct fjes_adapter *adapter = netdev_priv(netdev); + struct platform_device *plat_dev; + + plat_dev = adapter-plat_dev; + + strlcpy(drvinfo-driver, fjes_driver_name, sizeof(drvinfo-driver)); + strlcpy(drvinfo-version, fjes_driver_version, + sizeof(drvinfo-version)); + + strlcpy(drvinfo-fw_version, none, sizeof(drvinfo-fw_version)); + snprintf(drvinfo-bus_info, sizeof(drvinfo-bus_info), +platform:%s, plat_dev-name); + drvinfo-regdump_len = 0; + drvinfo-eedump_len = 0; +} + +static int fjes_get_settings(struct net_device *netdev, +struct ethtool_cmd *ecmd)
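The statistics table in this patch pairs each name with the size and offset of a struct field, so one loop can read counters of different widths. A userspace sketch of the same idea follows; `struct demo_adapter`, `DEMO_STAT` and `demo_get_stats` are hypothetical names, and the sizeof expression stands in for the kernel's FIELD_SIZEOF. Note that the posted table maps tx_bytes to stats64.rx_bytes, which looks like a copy-and-paste slip; the sketch maps each name to its own field.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical adapter with mixed-width counters (not the fjes struct). */
struct demo_adapter {
	uint64_t rx_packets;
	uint64_t tx_bytes;
	uint32_t rx_dropped;
};

struct demo_stat {
	const char *name;
	size_t size;
	size_t offset;
};

#define DEMO_STAT(label, field) { \
	.name = label, \
	.size = sizeof(((struct demo_adapter *)0)->field), \
	.offset = offsetof(struct demo_adapter, field) \
}

static const struct demo_stat demo_stats[] = {
	DEMO_STAT("rx_packets", rx_packets),
	DEMO_STAT("tx_bytes", tx_bytes),
	DEMO_STAT("rx_dropped", rx_dropped),
};

#define DEMO_STAT_COUNT (sizeof(demo_stats) / sizeof(demo_stats[0]))

/* Copy every counter into data[], widening 32-bit fields to 64 bits. */
static void demo_get_stats(const struct demo_adapter *a, uint64_t *data)
{
	const char *base = (const char *)a;
	size_t i;

	for (i = 0; i < DEMO_STAT_COUNT; i++) {
		const char *p = base + demo_stats[i].offset;

		if (demo_stats[i].size == sizeof(uint64_t)) {
			uint64_t v;

			memcpy(&v, p, sizeof(v));
			data[i] = v;
		} else {
			uint32_t v;

			memcpy(&v, p, sizeof(v));
			data[i] = v;
		}
	}
}
```

Adding a counter then only needs one more table row; the string and count callbacks can both be derived from the same array.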
[PATCH net-next 1/1] sfc: Allow driver to cope with a lower number of VIs than it needs for RSS
Previously, the driver would refuse to load if it couldn't secure enough VIs from the MC to fulfill its RSS requirements. This caused probe to fail on later functions in configurations where we ran out of VIs, for example with many VFs.

This change allows the driver to load with fewer VIs, down to a minimum of 2. A warning will be printed saying that the RSS requirements were not met, possibly affecting performance.

efx->max_tx_channels needs to be set to avoid going down the failure path in efx_probe_nic() immediately in the loop after the probe() NIC-type function.

Also, set rc=ENOSPC when bombing out of efx_probe_nic due to lack of VIs.

Signed-off-by: Shradha Shah <ss...@solarflare.com>
---
 drivers/net/ethernet/sfc/ef10.c       | 38 ++
 drivers/net/ethernet/sfc/efx.c        | 44 +--
 drivers/net/ethernet/sfc/efx.h        | 1 +
 drivers/net/ethernet/sfc/falcon.c     | 1 +
 drivers/net/ethernet/sfc/net_driver.h | 1 +
 drivers/net/ethernet/sfc/siena.c      | 1 +
 6 files changed, 64 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 06b8061..99e3510 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -295,11 +295,11 @@ static int efx_ef10_probe(struct efx_nic *efx)
 	/* We can have one VI for each 8K region. However, until we
 	 * use TX option descriptors we need two TX queues per channel.
*/ - efx-max_channels = - min_t(unsigned int, - EFX_MAX_CHANNELS, - efx_ef10_mem_map_size(efx) / - (EFX_VI_PAGE_SIZE * EFX_TXQ_TYPES)); + efx-max_channels = min_t(unsigned int, + EFX_MAX_CHANNELS, + efx_ef10_mem_map_size(efx) / + (EFX_VI_PAGE_SIZE * EFX_TXQ_TYPES)); + efx-max_tx_channels = efx-max_channels; if (WARN_ON(efx-max_channels == 0)) return -EIO; @@ -824,11 +824,12 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx) { struct efx_ef10_nic_data *nic_data = efx-nic_data; unsigned int uc_mem_map_size, wc_mem_map_size; - unsigned int min_vis, pio_write_vi_base, max_vis; + unsigned int min_vis = max(EFX_TXQ_TYPES, separate_tx_channels ? 2 : 1); + unsigned int channel_vis, pio_write_vi_base, max_vis; void __iomem *membase; int rc; - min_vis = max(efx-n_channels, efx-n_tx_channels * EFX_TXQ_TYPES); + channel_vis = max(efx-n_channels, efx-n_tx_channels * EFX_TXQ_TYPES); #ifdef EFX_USE_PIO /* Try to allocate PIO buffers if wanted and if the full @@ -862,11 +863,11 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx) * page size is 4K). So we may allocate some extra VIs just * for writing PIO buffers through. * -* The UC mapping contains (min_vis - 1) complete VIs and the +* The UC mapping contains (channel_vis - 1) complete VIs and the * first half of the next VI. Then the WC mapping begins with * the second half of this last VI. 
*/ - uc_mem_map_size = PAGE_ALIGN((min_vis - 1) * EFX_VI_PAGE_SIZE + + uc_mem_map_size = PAGE_ALIGN((channel_vis - 1) * EFX_VI_PAGE_SIZE + ER_DZ_TX_PIOBUF); if (nic_data-n_piobufs) { /* pio_write_vi_base rounds down to give the number of complete @@ -881,7 +882,7 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx) } else { pio_write_vi_base = 0; wc_mem_map_size = 0; - max_vis = min_vis; + max_vis = channel_vis; } /* In case the last attached driver failed to free VIs, do it now */ @@ -893,6 +894,23 @@ static int efx_ef10_dimension_resources(struct efx_nic *efx) if (rc != 0) return rc; + if (nic_data-n_allocated_vis channel_vis) { + netif_info(efx, drv, efx-net_dev, + Could not allocate enough VIs to satisfy RSS + requirements. Performance may not be optimal.\n); + /* We didn't get the VIs to populate our channels. +* We could keep what we got but then we'd have more +* interrupts than we need. +* Instead calculate new max_channels and restart +*/ + efx-max_channels = nic_data-n_allocated_vis; + efx-max_tx_channels = + nic_data-n_allocated_vis / EFX_TXQ_TYPES; + + efx_ef10_free_vis(efx); + return -EAGAIN; + } + /* If we didn't get enough VIs to map all the PIO buffers, free the * PIO buffers */ diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c index
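The fallback described in the commit message (lower the channel limits to what was actually granted, return -EAGAIN, size again) can be sketched in userspace. The caller loop in efx_probe_nic() is not visible in this archive copy, so `demo_probe` below is only an assumed shape. `struct demo_nic`, `DEMO_TXQ_TYPES` (set to 2 here purely for illustration) and `vis_available` are hypothetical, and the -ENOSPC floor follows the "minimum of 2" quoted in the message.

```c
#include <errno.h>

#define DEMO_TXQ_TYPES	2	/* assumed value, for illustration only */
#define DEMO_MIN_VIS	2	/* floor quoted in the commit message */

/* Hypothetical NIC state, loosely modelled on the fields in the patch. */
struct demo_nic {
	unsigned int max_channels;
	unsigned int max_tx_channels;
	unsigned int n_channels;
	unsigned int n_tx_channels;
	unsigned int vis_available;	/* what the "firmware" can grant */
	unsigned int n_allocated_vis;
};

static unsigned int umin(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

static unsigned int umax(unsigned int a, unsigned int b)
{
	return a > b ? a : b;
}

/* Pick channel counts within the current limits. */
static void demo_pick_channels(struct demo_nic *nic, unsigned int wanted)
{
	nic->n_channels = umin(wanted, nic->max_channels);
	nic->n_tx_channels = umin(nic->n_channels, nic->max_tx_channels);
}

/* One sizing attempt: 0 on success, -EAGAIN after lowering the limits,
 * -ENOSPC when even the minimum cannot be met. */
static int demo_dimension(struct demo_nic *nic)
{
	unsigned int channel_vis = umax(nic->n_channels,
					nic->n_tx_channels * DEMO_TXQ_TYPES);

	nic->n_allocated_vis = umin(channel_vis, nic->vis_available);
	if (nic->n_allocated_vis < DEMO_MIN_VIS)
		return -ENOSPC;
	if (nic->n_allocated_vis < channel_vis) {
		nic->max_channels = nic->n_allocated_vis;
		nic->max_tx_channels = nic->n_allocated_vis / DEMO_TXQ_TYPES;
		nic->n_allocated_vis = 0;	/* hand the VIs back before retrying */
		return -EAGAIN;
	}
	return 0;
}

/* Assumed caller loop: re-pick channels and retry while told to. */
static int demo_probe(struct demo_nic *nic, unsigned int wanted)
{
	int rc;

	do {
		demo_pick_channels(nic, wanted);
		rc = demo_dimension(nic);
	} while (rc == -EAGAIN);
	return rc;
}
```

In this sketch the second pass always fits, because the limits were lowered to exactly what the first pass was granted.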
[PATCH v2 net-next] r8169: Add values missing in @get_stats64 from HW counters
The r8169 driver collects the statistics returned by @get_stats64 by counting them in the driver itself, even though many (but not all) of the values are already collected by tally counters (TCs) in the NIC. Some of these TC values are not returned by @get_stats64. In particular, received multicast packets are missing from /proc/net/dev. Rectify this by fetching the TCs and returning them from rtl8169_get_stats64.

The counters collected in the driver disappear as soon as the driver is unloaded, so after a driver load they always start at 0. The TCs, on the other hand, are only reset by a power cycle. Without further measures, the values collected by the driver would not match the TC values.

This patch introduces a new function, rtl8169_reset_counters, which resets the TCs. Unfortunately, chip versions prior to RTL_GIGA_MAC_VER_19 don't allow resetting the TCs programmatically. Therefore, add a member to the rtl8169_private struct and a function rtl8169_init_counter_offsets to store the TCs at first rtl_open. Use these values as offsets in rtl8169_get_stats64.
Signed-off-by: Corinna Vinschen vinsc...@redhat.com --- drivers/net/ethernet/realtek/r8169.c | 107 +++ 1 file changed, 107 insertions(+) diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c index f790f61..f26a48d 100644 --- a/drivers/net/ethernet/realtek/r8169.c +++ b/drivers/net/ethernet/realtek/r8169.c @@ -637,6 +637,9 @@ enum rtl_register_content { /* _TBICSRBit */ TBILinkOK = 0x0200, + /* ResetCounterCommand */ + CounterReset= 0x1, + /* DumpCounterCommand */ CounterDump = 0x8, @@ -747,6 +750,14 @@ struct rtl8169_counters { __le16 tx_underun; }; +struct rtl8169_tc_offsets { + boolinited; + __le64 tx_errors; + __le32 tx_multi_collision; + __le32 rx_multicast; + __le16 tx_aborted; +}; + enum rtl_flag { RTL_FLAG_TASK_ENABLED, RTL_FLAG_TASK_SLOW_PENDING, @@ -824,6 +835,7 @@ struct rtl8169_private { struct mii_if_info mii; struct rtl8169_counters counters; + struct rtl8169_tc_offsets tc_offset; u32 saved_wolopts; u32 opts1_mask; @@ -2179,6 +2191,47 @@ static int rtl8169_get_sset_count(struct net_device *dev, int sset) } } +DECLARE_RTL_COND(rtl_reset_counters_cond) +{ + void __iomem *ioaddr = tp-mmio_addr; + + return RTL_R32(CounterAddrLow) CounterReset; +} + +static void rtl8169_reset_counters(struct net_device *dev) +{ + struct rtl8169_private *tp = netdev_priv(dev); + void __iomem *ioaddr = tp-mmio_addr; + struct device *d = tp-pci_dev-dev; + struct rtl8169_counters *counters; + dma_addr_t paddr; + u32 cmd; + + /* +* Versions prior to RTL_GIGA_MAC_VER_19 don't support resetting the +* tally counters. 
+*/ + if (tp-mac_version RTL_GIGA_MAC_VER_19) + return; + + counters = dma_alloc_coherent(d, sizeof(*counters), paddr, GFP_KERNEL); + if (!counters) + return; + + RTL_W32(CounterAddrHigh, (u64)paddr 32); + cmd = (u64)paddr DMA_BIT_MASK(32); + RTL_W32(CounterAddrLow, cmd); + RTL_W32(CounterAddrLow, cmd | CounterReset); + + if (!rtl_udelay_loop_wait_low(tp, rtl_reset_counters_cond, 10, 1000)) + netif_warn(tp, hw, dev, counter reset failed\n); + + RTL_W32(CounterAddrLow, 0); + RTL_W32(CounterAddrHigh, 0); + + dma_free_coherent(d, sizeof(*counters), counters, paddr); +} + DECLARE_RTL_COND(rtl_counters_cond) { void __iomem *ioaddr = tp-mmio_addr; @@ -2220,6 +2273,39 @@ static void rtl8169_update_counters(struct net_device *dev) dma_free_coherent(d, sizeof(*counters), counters, paddr); } +static void rtl8169_init_counter_offsets(struct net_device *dev) +{ + struct rtl8169_private *tp = netdev_priv(dev); + + /* +* rtl8169_init_counter_offsets is called from rtl_open. On chip +* versions prior to RTL_GIGA_MAC_VER_19 the tally counters are only +* reset by a power cycle, while the counter values collected by the +* driver are reset at every driver unload/load cycle. +* +* To make sure the HW values returned by @get_stats64 match the SW +* values, we collect the initial values at first open(*) and use them +* as offsets to normalize the values returned by @get_stats64. +* +* (*) We can't call rtl8169_init_counter_offsets from rtl_init_one +* for the reason stated in rtl8169_update_counters; CmdRxEnb is only +* set at open time by rtl_hw_start. +*/ + + if (tp-tc_offset.inited) + return; + + rtl8169_reset_counters(dev); + + rtl8169_update_counters(dev); + + tp-tc_offset.tx_errors =
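The offset scheme described in the commit message (record the tally counters once at first open, then report later readings relative to that baseline) can be sketched in a few lines of userspace C. `struct demo_tally`, `demo_init_offsets` and `demo_normalize` are hypothetical names; the driver itself works on little-endian DMA data, while the sketch uses values already in CPU byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of a few hardware tally counters. */
struct demo_tally {
	uint64_t tx_errors;
	uint32_t rx_multicast;
	uint16_t tx_aborted;
};

struct demo_offsets {
	bool inited;
	struct demo_tally base;
};

/* Record the counters seen at first open; later calls are no-ops. */
static void demo_init_offsets(struct demo_offsets *off,
			      const struct demo_tally *hw)
{
	if (off->inited)
		return;
	off->base = *hw;
	off->inited = true;
}

/* Report hardware counters relative to the recorded baseline.
 * Unsigned subtraction keeps the result right across a counter wrap. */
static struct demo_tally demo_normalize(const struct demo_offsets *off,
					const struct demo_tally *hw)
{
	struct demo_tally out;

	out.tx_errors = hw->tx_errors - off->base.tx_errors;
	out.rx_multicast = hw->rx_multicast - off->base.rx_multicast;
	out.tx_aborted = (uint16_t)(hw->tx_aborted - off->base.tx_aborted);
	return out;
}
```

On chips where the counters can be reset, the baseline would simply be all zeroes and the same subtraction still applies.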
Re: [PATCH] lib/Makefile: remove CONFIG_AVERAGE build rule
On Fri, 2015-08-21 at 10:05 +, Valentin Rothberg wrote:
> The Kconfig option AVERAGE and its implementation has been removed by
> commit f4e774f55fe0 ("average: remove out-of-line implementation").
> Remove the dead build rule in lib/Makefile.

D'oh, sorry about that.

Reviewed-by: Johannes Berg <johan...@sipsolutions.net>

[reproducing patch in full for netdev]

> Signed-off-by: Valentin Rothberg <valentinrothb...@gmail.com>
> ---
> I detected the issue with scripts/checkkconfigsymbols.py
>
>  lib/Makefile | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/lib/Makefile b/lib/Makefile
> index 51e1d761f0b9..f32d342b75de 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -143,8 +143,6 @@ obj-$(CONFIG_GENERIC_ATOMIC64) += atomic64.o
>
>  obj-$(CONFIG_ATOMIC64_SELFTEST) += atomic64_test.o
>
> -obj-$(CONFIG_AVERAGE) += average.o
> -
>  obj-$(CONFIG_CPU_RMAP) += cpu_rmap.o
>
>  obj-$(CONFIG_CORDIC) += cordic.o
> --
> 1.9.1
[PATCH v3 15/22] fjes: net_device_ops.ndo_vlan_rx_add/kill_vid
This patch adds the net_device_ops.ndo_vlan_rx_add_vid and net_device_ops.ndo_vlan_rx_kill_vid callbacks.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes_hw.c   | 27 +++
 drivers/net/fjes/fjes_hw.h   | 2 ++
 drivers/net/fjes/fjes_main.c | 40
 3 files changed, 69 insertions(+)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 3c96d06..4a4b750 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -825,6 +825,33 @@ bool fjes_hw_check_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
 	return ret;
 }
 
+bool fjes_hw_set_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
+{
+	union ep_buffer_info *info = epbh->info;
+	int i;
+
+	for (i = 0; i < EP_BUFFER_SUPPORT_VLAN_MAX; i++) {
+		if (info->v1i.vlan_id[i] == 0) {
+			info->v1i.vlan_id[i] = vlan_id;
+			return true;
+		}
+	}
+	return false;
+}
+
+void fjes_hw_del_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
+{
+	union ep_buffer_info *info = epbh->info;
+	int i;
+
+	if (0 != vlan_id) {
+		for (i = 0; i < EP_BUFFER_SUPPORT_VLAN_MAX; i++) {
+			if (vlan_id == info->v1i.vlan_id[i])
+				info->v1i.vlan_id[i] = 0;
+		}
+	}
+}
+
 bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *epbh)
 {
 	union ep_buffer_info *info = epbh->info;
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index 3511db2..95e632b 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -322,6 +322,8 @@ int fjes_hw_epid_is_shared(struct fjes_device_shared_info *, int);
 bool fjes_hw_check_epbuf_version(struct epbuf_handler *, u32);
 bool fjes_hw_check_mtu(struct epbuf_handler *, u32);
 bool fjes_hw_check_vlan_id(struct epbuf_handler *, u16);
+bool fjes_hw_set_vlan_id(struct epbuf_handler *, u16);
+void fjes_hw_del_vlan_id(struct epbuf_handler *, u16);
 bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *);
 void *fjes_hw_epbuf_rx_curpkt_get_addr(struct epbuf_handler *, size_t *);
 void fjes_hw_epbuf_rx_curpkt_drop(struct epbuf_handler *);
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 94ccc11..4a4ce81 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -58,6 +58,8 @@ static irqreturn_t fjes_intr(int, void*);
 static struct rtnl_link_stats64 *
 fjes_get_stats64(struct net_device *, struct rtnl_link_stats64 *);
 static int fjes_change_mtu(struct net_device *, int);
+static int fjes_vlan_rx_add_vid(struct net_device *, __be16 proto, u16);
+static int fjes_vlan_rx_kill_vid(struct net_device *, __be16 proto, u16);
 static void fjes_tx_retry(struct net_device *);
 
 static int fjes_acpi_add(struct acpi_device *);
@@ -226,6 +228,8 @@ static const struct net_device_ops fjes_netdev_ops = {
 	.ndo_get_stats64	= fjes_get_stats64,
 	.ndo_change_mtu		= fjes_change_mtu,
 	.ndo_tx_timeout		= fjes_tx_retry,
+	.ndo_vlan_rx_add_vid	= fjes_vlan_rx_add_vid,
+	.ndo_vlan_rx_kill_vid	= fjes_vlan_rx_kill_vid,
 };
 
 /* fjes_open - Called when a network interface is made active */
@@ -751,6 +755,42 @@ static int fjes_change_mtu(struct net_device *netdev, int new_mtu)
 	return -EINVAL;
 }
 
+static int fjes_vlan_rx_add_vid(struct net_device *netdev,
+				__be16 proto, u16 vid)
+{
+	struct fjes_adapter *adapter = netdev_priv(netdev);
+	bool ret = true;
+	int epid;
+
+	for (epid = 0; epid < adapter->hw.max_epid; epid++) {
+		if (epid == adapter->hw.my_epid)
+			continue;
+
+		if (!fjes_hw_check_vlan_id(
+			&adapter->hw.ep_shm_info[epid].tx, vid))
+			ret = fjes_hw_set_vlan_id(
+				&adapter->hw.ep_shm_info[epid].tx, vid);
+	}
+
+	return ret ? 0 : -ENOSPC;
+}
+
+static int fjes_vlan_rx_kill_vid(struct net_device *netdev,
+				 __be16 proto, u16 vid)
+{
+	struct fjes_adapter *adapter = netdev_priv(netdev);
+	int epid;
+
+	for (epid = 0; epid < adapter->hw.max_epid; epid++) {
+		if (epid == adapter->hw.my_epid)
+			continue;
+
+		fjes_hw_del_vlan_id(&adapter->hw.ep_shm_info[epid].tx, vid);
+	}
+
+	return 0;
+}
+
 static irqreturn_t fjes_intr(int irq, void *data)
 {
 	struct fjes_adapter *adapter = data;
-- 
1.8.3.1
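The VLAN helpers in this patch treat the vlan_id array as a small slot table in which 0 marks a free entry. A userspace sketch of one such table follows; `struct demo_vlan_table`, the `demo_*` functions and the table size of 4 are hypothetical, chosen only to make the full-table case easy to reach.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_VLAN_MAX 4	/* assumed table size for the sketch */

struct demo_vlan_table {
	uint16_t vlan_id[DEMO_VLAN_MAX];	/* 0 marks a free slot */
};

/* VLAN 0 is always accepted; otherwise the ID must be in the table. */
static bool demo_check_vlan_id(const struct demo_vlan_table *t, uint16_t vid)
{
	int i;

	if (vid == 0)
		return true;
	for (i = 0; i < DEMO_VLAN_MAX; i++) {
		if (t->vlan_id[i] == vid)
			return true;
	}
	return false;
}

/* Claim the first free slot; false when the table is full. */
static bool demo_set_vlan_id(struct demo_vlan_table *t, uint16_t vid)
{
	int i;

	for (i = 0; i < DEMO_VLAN_MAX; i++) {
		if (t->vlan_id[i] == 0) {
			t->vlan_id[i] = vid;
			return true;
		}
	}
	return false;
}

/* Free every slot holding vid; deleting VLAN 0 is a no-op. */
static void demo_del_vlan_id(struct demo_vlan_table *t, uint16_t vid)
{
	int i;

	if (vid == 0)
		return;
	for (i = 0; i < DEMO_VLAN_MAX; i++) {
		if (t->vlan_id[i] == vid)
			t->vlan_id[i] = 0;
	}
}

/* Add only when not yet present, for a single table. */
static int demo_add_vid(struct demo_vlan_table *t, uint16_t vid)
{
	if (demo_check_vlan_id(t, vid))
		return 0;
	return demo_set_vlan_id(t, vid) ? 0 : -ENOSPC;
}
```

Checking before setting keeps duplicates out of the table, so a repeated add of the same ID does not consume a second slot.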
[PATCH v3 14/22] fjes: net_device_ops.ndo_tx_timeout
This patch adds the net_device_ops.ndo_tx_timeout callback.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes_main.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 519976c..94ccc11 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -58,6 +58,7 @@ static irqreturn_t fjes_intr(int, void*);
 static struct rtnl_link_stats64 *
 fjes_get_stats64(struct net_device *, struct rtnl_link_stats64 *);
 static int fjes_change_mtu(struct net_device *, int);
+static void fjes_tx_retry(struct net_device *);
 
 static int fjes_acpi_add(struct acpi_device *);
 static int fjes_acpi_remove(struct acpi_device *);
@@ -224,6 +225,7 @@ static const struct net_device_ops fjes_netdev_ops = {
 	.ndo_start_xmit		= fjes_xmit_frame,
 	.ndo_get_stats64	= fjes_get_stats64,
 	.ndo_change_mtu		= fjes_change_mtu,
+	.ndo_tx_timeout		= fjes_tx_retry,
 };
 
 /* fjes_open - Called when a network interface is made active */
@@ -705,6 +707,13 @@ fjes_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 	return ret;
 }
 
+static void fjes_tx_retry(struct net_device *netdev)
+{
+	struct netdev_queue *queue = netdev_get_tx_queue(netdev, 0);
+
+	netif_tx_wake_queue(queue);
+}
+
 static struct rtnl_link_stats64 *
 fjes_get_stats64(struct net_device *netdev, struct rtnl_link_stats64 *stats)
 {
-- 
1.8.3.1
[PATCH v3 16/22] fjes: interrupt_watch_task
This patch adds interrupt_watch_task. This task is used to prevent delay of interrupts. Signed-off-by: Taku Izumi izumi.t...@jp.fujitsu.com --- drivers/net/fjes/fjes.h | 5 + drivers/net/fjes/fjes_main.c | 40 +++- 2 files changed, 44 insertions(+), 1 deletion(-) diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h index b04ea9d..1743dbb 100644 --- a/drivers/net/fjes/fjes.h +++ b/drivers/net/fjes/fjes.h @@ -32,6 +32,7 @@ #define FJES_TX_RETRY_TIMEOUT (100) #define FJES_TX_TX_STALL_TIMEOUT (FJES_TX_RETRY_INTERVAL / 2) #define FJES_OPEN_ZONE_UPDATE_WAIT (300) /* msec */ +#define FJES_IRQ_WATCH_DELAY (HZ) /* board specific private data structure */ struct fjes_adapter { @@ -52,10 +53,14 @@ struct fjes_adapter { bool irq_registered; struct workqueue_struct *txrx_wq; + struct workqueue_struct *control_wq; struct work_struct tx_stall_task; struct work_struct raise_intr_rxdata_task; + struct delayed_work interrupt_watch_task; + bool interrupt_watch_enable; + struct fjes_hw hw; }; diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c index 4a4ce81..5fce33d 100644 --- a/drivers/net/fjes/fjes_main.c +++ b/drivers/net/fjes/fjes_main.c @@ -71,7 +71,7 @@ static int fjes_remove(struct platform_device *); static int fjes_sw_init(struct fjes_adapter *); static void fjes_netdev_setup(struct net_device *); - +static void fjes_irq_watch_task(struct work_struct *); static void fjes_rx_irq(struct fjes_adapter *, int); static int fjes_poll(struct napi_struct *, int); @@ -197,6 +197,13 @@ static int fjes_request_irq(struct fjes_adapter *adapter) struct net_device *netdev = adapter-netdev; int result = -1; + adapter-interrupt_watch_enable = true; + if (!delayed_work_pending(adapter-interrupt_watch_task)) { + queue_delayed_work(adapter-control_wq, + adapter-interrupt_watch_task, + FJES_IRQ_WATCH_DELAY); + } + if (!adapter-irq_registered) { result = request_irq(adapter-hw.hw_res.irq, fjes_intr, IRQF_SHARED, netdev-name, adapter); @@ -213,6 +220,9 @@ static void 
fjes_free_irq(struct fjes_adapter *adapter)
 {
 	struct fjes_hw *hw = &adapter->hw;

+	adapter->interrupt_watch_enable = false;
+	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
+
 	fjes_hw_set_irqmask(hw, REG_ICTL_MASK_ALL, true);

 	if (adapter->irq_registered) {
@@ -297,6 +307,7 @@ static int fjes_close(struct net_device *netdev)

 	fjes_free_irq(adapter);

+	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
 	cancel_work_sync(&adapter->tx_stall_task);

@@ -996,11 +1007,15 @@ static int fjes_probe(struct platform_device *plat_dev)
 	adapter->open_guard = false;

 	adapter->txrx_wq = create_workqueue(DRV_NAME "/txrx");
+	adapter->control_wq = create_workqueue(DRV_NAME "/control");

 	INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
 	INIT_WORK(&adapter->raise_intr_rxdata_task,
 		  fjes_raise_intr_rxdata_task);

+	INIT_DELAYED_WORK(&adapter->interrupt_watch_task, fjes_irq_watch_task);
+	adapter->interrupt_watch_enable = false;
+
 	res = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
 	hw->hw_res.start = res->start;
 	hw->hw_res.size = res->end - res->start + 1;
@@ -1040,8 +1055,11 @@ static int fjes_remove(struct platform_device *plat_dev)
 	struct fjes_adapter *adapter = netdev_priv(netdev);
 	struct fjes_hw *hw = &adapter->hw;

+	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
 	cancel_work_sync(&adapter->tx_stall_task);
+	if (adapter->control_wq)
+		destroy_workqueue(adapter->control_wq);
 	if (adapter->txrx_wq)
 		destroy_workqueue(adapter->txrx_wq);

@@ -1077,6 +1095,26 @@ static void fjes_netdev_setup(struct net_device *netdev)
 	netdev->features |= NETIF_F_HW_CSUM | NETIF_F_HW_VLAN_CTAG_FILTER;
 }

+static void fjes_irq_watch_task(struct work_struct *work)
+{
+	struct fjes_adapter *adapter = container_of(to_delayed_work(work),
+			struct fjes_adapter, interrupt_watch_task);
+
+	local_irq_disable();
+	fjes_intr(adapter->hw.hw_res.irq, adapter);
+	local_irq_enable();
+
+	if (fjes_rxframe_search_exist(adapter, 0) >= 0)
+		napi_schedule(&adapter->napi);
+
+	if (adapter->interrupt_watch_enable) {
+		if (!delayed_work_pending(&adapter->interrupt_watch_task))
+			queue_delayed_work(adapter->control_wq,
+					   &adapter->interrupt_watch_task,
+
[PATCH v3 18/22] fjes: unshare_watch_task
This patch adds unshare_watch_task.
A shared buffer's status can change to unshared.
This task monitors the status of the shared buffers.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes.h      |   3 ++
 drivers/net/fjes/fjes_main.c | 126 +++
 2 files changed, 129 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index d31d4c3..57feee8 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -59,6 +59,9 @@ struct fjes_adapter {
 	struct work_struct tx_stall_task;
 	struct work_struct raise_intr_rxdata_task;

+	struct work_struct unshare_watch_task;
+	unsigned long unshare_watch_bitmask;
+
 	struct delayed_work interrupt_watch_task;
 	bool interrupt_watch_enable;

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index caecfb3..c47ecf3 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -73,6 +73,7 @@ static int fjes_remove(struct platform_device *);
 static int fjes_sw_init(struct fjes_adapter *);
 static void fjes_netdev_setup(struct net_device *);
 static void fjes_irq_watch_task(struct work_struct *);
+static void fjes_watch_unshare_task(struct work_struct *);
 static void fjes_rx_irq(struct fjes_adapter *, int);
 static int fjes_poll(struct napi_struct *, int);

@@ -309,6 +310,8 @@ static int fjes_close(struct net_device *netdev)
 	fjes_free_irq(adapter);

 	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
+	cancel_work_sync(&adapter->unshare_watch_task);
+	adapter->unshare_watch_bitmask = 0;
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
 	cancel_work_sync(&adapter->tx_stall_task);

@@ -1025,6 +1028,8 @@ static int fjes_probe(struct platform_device *plat_dev)
 	INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
 	INIT_WORK(&adapter->raise_intr_rxdata_task,
 		  fjes_raise_intr_rxdata_task);
+	INIT_WORK(&adapter->unshare_watch_task, fjes_watch_unshare_task);
+	adapter->unshare_watch_bitmask = 0;

 	INIT_DELAYED_WORK(&adapter->interrupt_watch_task, fjes_irq_watch_task);
 	adapter->interrupt_watch_enable = false;
@@ -1069,6 +1074,7 @@ static int fjes_remove(struct platform_device *plat_dev)
 	struct fjes_hw *hw = &adapter->hw;

 	cancel_delayed_work_sync(&adapter->interrupt_watch_task);
+	cancel_work_sync(&adapter->unshare_watch_task);
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
 	cancel_work_sync(&adapter->tx_stall_task);
 	if (adapter->control_wq)
@@ -1128,6 +1134,126 @@ static void fjes_irq_watch_task(struct work_struct *work)
 	}
 }

+static void fjes_watch_unshare_task(struct work_struct *work)
+{
+	struct fjes_adapter *adapter =
+	container_of(work, struct fjes_adapter, unshare_watch_task);
+
+	struct net_device *netdev = adapter->netdev;
+	struct fjes_hw *hw = &adapter->hw;
+
+	int unshare_watch, unshare_reserve;
+	int max_epid, my_epid, epidx;
+	int stop_req, stop_req_done;
+	ulong unshare_watch_bitmask;
+	int wait_time = 0;
+	int is_shared;
+	int ret;
+
+	my_epid = hw->my_epid;
+	max_epid = hw->max_epid;
+
+	unshare_watch_bitmask = adapter->unshare_watch_bitmask;
+	adapter->unshare_watch_bitmask = 0;
+
+	while ((unshare_watch_bitmask || hw->txrx_stop_req_bit) &&
+	       (wait_time < 3000)) {
+		for (epidx = 0; epidx < hw->max_epid; epidx++) {
+			if (epidx == hw->my_epid)
+				continue;
+
+			is_shared = fjes_hw_epid_is_shared(hw->hw_info.share,
+							   epidx);
+
+			stop_req = test_bit(epidx, &hw->txrx_stop_req_bit);
+
+			stop_req_done = hw->ep_shm_info[epidx].rx.info->v1i.rx_status &
+					FJES_RX_STOP_REQ_DONE;
+
+			unshare_watch = test_bit(epidx, &unshare_watch_bitmask);
+
+			unshare_reserve = test_bit(epidx,
+						   &hw->hw_info.buffer_unshare_reserve_bit);
+
+			if ((!stop_req ||
+			     (is_shared && (!is_shared || !stop_req_done))) &&
+			    (is_shared || !unshare_watch || !unshare_reserve))
+				continue;
+
+			mutex_lock(&hw->hw_info.lock);
+			ret = fjes_hw_unregister_buff_addr(hw, epidx);
+			switch (ret) {
+			case 0:
+				break;
+			case -ENOMSG:
+			case -EBUSY:
+			default:
+				if (!work_pending(
+					&adapter->force_close_task)) {
+
[PATCH v3 04/22] fjes: platform_driver's .probe and .remove routine
This patch implements platform_driver's .probe and .remove
routine, and also adds board specific private data structure.

This driver registers net_device at platform_driver's .probe
routine and unregisters net_device at its .remove routine.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes.h      | 25
 drivers/net/fjes/fjes_main.c | 94
 2 files changed, 119 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 15ded96..54bc189 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -24,7 +24,32 @@

 #include <linux/acpi.h>

+#include "fjes_hw.h"
+
 #define FJES_ACPI_SYMBOL	"Extended Socket"
+#define FJES_MAX_QUEUES		1
+#define FJES_TX_RETRY_INTERVAL	(20 * HZ)
+
+/* board specific private data structure */
+struct fjes_adapter {
+	struct net_device *netdev;
+	struct platform_device *plat_dev;
+
+	struct napi_struct napi;
+	struct rtnl_link_stats64 stats64;
+
+	unsigned int tx_retry_count;
+	unsigned long tx_start_jiffies;
+	unsigned long rx_last_jiffies;
+	bool unset_rx_last;
+
+	bool force_reset;
+	bool open_guard;
+
+	bool irq_registered;
+
+	struct fjes_hw hw;
+};

 extern char fjes_driver_name[];
 extern char fjes_driver_version[];
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 9517666..45a8b9c 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -23,6 +23,7 @@
 #include <linux/types.h>
 #include <linux/nls.h>
 #include <linux/platform_device.h>
+#include <linux/netdevice.h>

 #include "fjes.h"

@@ -49,6 +50,9 @@ static acpi_status fjes_get_acpi_resource(struct acpi_resource *, void*);
 static int fjes_probe(struct platform_device *);
 static int fjes_remove(struct platform_device *);

+static int fjes_sw_init(struct fjes_adapter *);
+static void fjes_netdev_setup(struct net_device *);
+
 static const struct acpi_device_id fjes_acpi_ids[] = {
 	{"PNP0C02", 0},
 	{"", 0},
@@ -166,18 +170,108 @@ fjes_get_acpi_resource(struct acpi_resource *acpi_res, void *data)
 	return AE_OK;
 }

+static const struct net_device_ops fjes_netdev_ops = {
+};
+
 /* fjes_probe - Device Initialization Routine */
 static int fjes_probe(struct platform_device *plat_dev)
 {
+	struct fjes_adapter *adapter;
+	struct net_device *netdev;
+	struct resource *res;
+	struct fjes_hw *hw;
+	int err;
+
+	err = -ENOMEM;
+	netdev = alloc_netdev_mq(sizeof(struct fjes_adapter), "es%d",
+				 NET_NAME_UNKNOWN, fjes_netdev_setup,
+				 FJES_MAX_QUEUES);
+
+	if (!netdev)
+		goto err_out;
+
+	SET_NETDEV_DEV(netdev, &plat_dev->dev);
+
+	dev_set_drvdata(&plat_dev->dev, netdev);
+	adapter = netdev_priv(netdev);
+	adapter->netdev = netdev;
+	adapter->plat_dev = plat_dev;
+	hw = &adapter->hw;
+	hw->back = adapter;
+
+	/* setup the private structure */
+	err = fjes_sw_init(adapter);
+	if (err)
+		goto err_free_netdev;
+
+	adapter->force_reset = false;
+	adapter->open_guard = false;
+
+	res = platform_get_resource(plat_dev, IORESOURCE_MEM, 0);
+	hw->hw_res.start = res->start;
+	hw->hw_res.size = res->end - res->start + 1;
+	hw->hw_res.irq = platform_get_irq(plat_dev, 0);
+	err = fjes_hw_init(&adapter->hw);
+	if (err)
+		goto err_free_netdev;
+
+	/* setup MAC address (02:00:00:00:00:[epid])*/
+	netdev->dev_addr[0] = 2;
+	netdev->dev_addr[1] = 0;
+	netdev->dev_addr[2] = 0;
+	netdev->dev_addr[3] = 0;
+	netdev->dev_addr[4] = 0;
+	netdev->dev_addr[5] = hw->my_epid; /* EPID */
+
+	err = register_netdev(netdev);
+	if (err)
+		goto err_hw_exit;
+
+	netif_carrier_off(netdev);
+
 	return 0;
+
+err_hw_exit:
+	fjes_hw_exit(&adapter->hw);
+err_free_netdev:
+	free_netdev(netdev);
+err_out:
+	return err;
 }

 /* fjes_remove - Device Removal Routine */
 static int fjes_remove(struct platform_device *plat_dev)
 {
+	struct net_device *netdev = dev_get_drvdata(&plat_dev->dev);
+	struct fjes_adapter *adapter = netdev_priv(netdev);
+	struct fjes_hw *hw = &adapter->hw;
+
+	unregister_netdev(netdev);
+
+	fjes_hw_exit(hw);
+
+	free_netdev(netdev);
+
 	return 0;
 }

+static int fjes_sw_init(struct fjes_adapter *adapter)
+{
+	return 0;
+}
+
+/* fjes_netdev_setup - netdevice initialization routine */
+static void fjes_netdev_setup(struct net_device *netdev)
+{
+	ether_setup(netdev);
+
+	netdev->watchdog_timeo = FJES_TX_RETRY_INTERVAL;
+	netdev->netdev_ops = &fjes_netdev_ops;
+	netdev->mtu = fjes_support_mtu[0];
+	netdev->flags |= IFF_BROADCAST;
+	netdev->features |= NETIF_F_HW_CSUM |
[PATCH v3 07/22] fjes: net_device_ops.ndo_open and .ndo_stop
This patch adds net_device_ops.ndo_open and .ndo_stop
callbacks. These functions are called on network device
activation and deactivation.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes.h      |   1 +
 drivers/net/fjes/fjes_hw.c   | 145 +
 drivers/net/fjes/fjes_hw.h   |  30 ++
 drivers/net/fjes/fjes_main.c | 246 +++
 drivers/net/fjes/fjes_regs.h |  17 +++
 5 files changed, 439 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 54bc189..f182ed3 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -29,6 +29,7 @@
 #define FJES_ACPI_SYMBOL	"Extended Socket"
 #define FJES_MAX_QUEUES		1
 #define FJES_TX_RETRY_INTERVAL	(20 * HZ)
+#define FJES_OPEN_ZONE_UPDATE_WAIT	(300) /* msec */

 /* board specific private data structure */
 struct fjes_adapter {
diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 1e807df..1935f48 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -638,6 +638,25 @@ int fjes_hw_unregister_buff_addr(struct fjes_hw *hw, int dest_epid)
 	return result;
 }

+int fjes_hw_raise_interrupt(struct fjes_hw *hw, int dest_epid,
+			    enum REG_ICTL_MASK mask)
+{
+	u32 ig = mask | dest_epid;
+
+	wr32(XSCT_IG, cpu_to_le32(ig));
+
+	return 0;
+}
+
+u32 fjes_hw_capture_interrupt_status(struct fjes_hw *hw)
+{
+	u32 cur_is;
+
+	cur_is = rd32(XSCT_IS);
+
+	return cur_is;
+}
+
 void fjes_hw_set_irqmask(struct fjes_hw *hw,
 			 enum REG_ICTL_MASK intr_mask, bool mask)
 {
@@ -646,3 +665,129 @@ void fjes_hw_set_irqmask(struct fjes_hw *hw,
 	else
 		wr32(XSCT_IMC, intr_mask);
 }
+
+bool fjes_hw_epid_is_same_zone(struct fjes_hw *hw, int epid)
+{
+	if (epid >= hw->max_epid)
+		return false;
+
+	if ((hw->ep_shm_info[epid].es_status !=
+			FJES_ZONING_STATUS_ENABLE) ||
+		(hw->ep_shm_info[hw->my_epid].zone ==
+			FJES_ZONING_ZONE_TYPE_NONE))
+		return false;
+	else
+		return (hw->ep_shm_info[epid].zone ==
+				hw->ep_shm_info[hw->my_epid].zone);
+}
+
+int fjes_hw_epid_is_shared(struct fjes_device_shared_info *share,
+			   int dest_epid)
+{
+	int value = false;
+
+	if (dest_epid < share->epnum)
+		value = share->ep_status[dest_epid];
+
+	return value;
+}
+
+static bool fjes_hw_epid_is_stop_requested(struct fjes_hw *hw, int src_epid)
+{
+	return test_bit(src_epid, &hw->txrx_stop_req_bit);
+}
+
+static bool fjes_hw_epid_is_stop_process_done(struct fjes_hw *hw, int src_epid)
+{
+	return (hw->ep_shm_info[src_epid].tx.info->v1i.rx_status &
+			FJES_RX_STOP_REQ_DONE);
+}
+
+enum ep_partner_status
+fjes_hw_get_partner_ep_status(struct fjes_hw *hw, int epid)
+{
+	enum ep_partner_status status;
+
+	if (fjes_hw_epid_is_shared(hw->hw_info.share, epid)) {
+		if (fjes_hw_epid_is_stop_requested(hw, epid)) {
+			status = EP_PARTNER_WAITING;
+		} else {
+			if (fjes_hw_epid_is_stop_process_done(hw, epid))
+				status = EP_PARTNER_COMPLETE;
+			else
+				status = EP_PARTNER_SHARED;
+		}
+	} else {
+		status = EP_PARTNER_UNSHARE;
+	}
+
+	return status;
+}
+
+void fjes_hw_raise_epstop(struct fjes_hw *hw)
+{
+	enum ep_partner_status status;
+	int epidx;
+
+	for (epidx = 0; epidx < hw->max_epid; epidx++) {
+		if (epidx == hw->my_epid)
+			continue;
+
+		status = fjes_hw_get_partner_ep_status(hw, epidx);
+		switch (status) {
+		case EP_PARTNER_SHARED:
+			fjes_hw_raise_interrupt(hw, epidx,
+						REG_ICTL_MASK_TXRX_STOP_REQ);
+			break;
+		default:
+			break;
+		}
+
+		set_bit(epidx, &hw->hw_info.buffer_unshare_reserve_bit);
+		set_bit(epidx, &hw->txrx_stop_req_bit);
+
+		hw->ep_shm_info[epidx].tx.info->v1i.rx_status |=
+				FJES_RX_STOP_REQ_REQUEST;
+	}
+}
+
+int fjes_hw_wait_epstop(struct fjes_hw *hw)
+{
+	enum ep_partner_status status;
+	union ep_buffer_info *info;
+	int wait_time = 0;
+	int epidx;
+
+	while (hw->hw_info.buffer_unshare_reserve_bit &&
+	       (wait_time < FJES_COMMAND_EPSTOP_WAIT_TIMEOUT * 1000)) {
+		for (epidx = 0; epidx < hw->max_epid; epidx++) {
+			if (epidx == hw->my_epid)
+				continue;
+			status =
[PATCH v3 03/22] fjes: Hardware cleanup routine
This patch adds hardware cleanup routine to be
invoked at driver's .remove routine.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes_hw.c | 66 ++
 drivers/net/fjes/fjes_hw.h |  1 +
 2 files changed, 67 insertions(+)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index ae26638..757cece 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -56,6 +56,12 @@ static u8 *fjes_hw_iomap(struct fjes_hw *hw)
 	return base;
 }

+static void fjes_hw_iounmap(struct fjes_hw *hw)
+{
+	iounmap(hw->base);
+	release_mem_region(hw->hw_res.start, hw->hw_res.size);
+}
+
 int fjes_hw_reset(struct fjes_hw *hw)
 {
 	union REG_DCTL dctl;
@@ -109,6 +115,12 @@ static int fjes_hw_alloc_shared_status_region(struct fjes_hw *hw)
 	return 0;
 }

+static void fjes_hw_free_shared_status_region(struct fjes_hw *hw)
+{
+	kfree(hw->hw_info.share);
+	hw->hw_info.share = NULL;
+}
+
 static int fjes_hw_alloc_epbuf(struct epbuf_handler *epbh)
 {
 	void *mem;
@@ -126,6 +138,18 @@ static int fjes_hw_alloc_epbuf(struct epbuf_handler *epbh)
 	return 0;
 }

+static void fjes_hw_free_epbuf(struct epbuf_handler *epbh)
+{
+	if (epbh->buffer)
+		vfree(epbh->buffer);
+
+	epbh->buffer = NULL;
+	epbh->size = 0;
+
+	epbh->info = NULL;
+	epbh->ring = NULL;
+}
+
 void fjes_hw_setup_epbuf(struct epbuf_handler *epbh, u8 *mac_addr, u32 mtu)
 {
 	union ep_buffer_info *info = epbh->info;
@@ -258,6 +282,32 @@ static int fjes_hw_setup(struct fjes_hw *hw)
 	return 0;
 }

+static void fjes_hw_cleanup(struct fjes_hw *hw)
+{
+	int epidx;
+
+	if (!hw->ep_shm_info)
+		return;
+
+	fjes_hw_free_shared_status_region(hw);
+
+	kfree(hw->hw_info.req_buf);
+	hw->hw_info.req_buf = NULL;
+
+	kfree(hw->hw_info.res_buf);
+	hw->hw_info.res_buf = NULL;
+
+	for (epidx = 0; epidx < hw->max_epid ; epidx++) {
+		if (epidx == hw->my_epid)
+			continue;
+		fjes_hw_free_epbuf(&hw->ep_shm_info[epidx].tx);
+		fjes_hw_free_epbuf(&hw->ep_shm_info[epidx].rx);
+	}
+
+	kfree(hw->ep_shm_info);
+	hw->ep_shm_info = NULL;
+}
+
 int fjes_hw_init(struct fjes_hw *hw)
 {
 	int ret;
@@ -285,6 +335,22 @@ int fjes_hw_init(struct fjes_hw *hw)
 	return ret;
 }

+void fjes_hw_exit(struct fjes_hw *hw)
+{
+	int ret;
+
+	if (hw->base) {
+		ret = fjes_hw_reset(hw);
+		if (ret)
+			pr_err("%s: reset error", __func__);
+
+		fjes_hw_iounmap(hw);
+		hw->base = NULL;
+	}
+
+	fjes_hw_cleanup(hw);
+}
+
 void fjes_hw_set_irqmask(struct fjes_hw *hw,
 			 enum REG_ICTL_MASK intr_mask, bool mask)
 {
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index 836ebe2..1b3e9ca 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -241,6 +241,7 @@ struct fjes_hw {
 };

 int fjes_hw_init(struct fjes_hw *);
+void fjes_hw_exit(struct fjes_hw *);
 int fjes_hw_reset(struct fjes_hw *);
 void fjes_hw_init_command_registers(struct fjes_hw *,
-- 
1.8.3.1
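The cleanup routine above frees each resource and then resets the pointer to NULL, with an early return when the top-level table is already gone, so a second call is harmless. A minimal user-space sketch of that pattern follows, assuming plain malloc/free in place of the kernel allocators; struct demo_hw, demo_hw_setup() and demo_hw_cleanup() are hypothetical names for illustration and are not part of the driver.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the driver's hardware state. */
struct demo_hw {
	void *req_buf;
	void *res_buf;
	void **ep_bufs;		/* one buffer per endpoint */
	int max_epid;
};

/* Free everything and NULL the pointers so a repeated call is a no-op. */
static void demo_hw_cleanup(struct demo_hw *hw)
{
	int i;

	if (!hw->ep_bufs)
		return;		/* nothing allocated, or already cleaned up */

	free(hw->req_buf);
	hw->req_buf = NULL;

	free(hw->res_buf);
	hw->res_buf = NULL;

	for (i = 0; i < hw->max_epid; i++) {
		free(hw->ep_bufs[i]);	/* free(NULL) is allowed */
		hw->ep_bufs[i] = NULL;
	}

	free(hw->ep_bufs);
	hw->ep_bufs = NULL;
}

/* Allocate all buffers; on any failure undo the partial setup. */
static int demo_hw_setup(struct demo_hw *hw, int max_epid)
{
	int i;

	hw->req_buf = NULL;
	hw->res_buf = NULL;
	hw->max_epid = max_epid;

	/* calloc zeroes the table, so unused slots are NULL */
	hw->ep_bufs = calloc((size_t)max_epid, sizeof(*hw->ep_bufs));
	if (!hw->ep_bufs)
		return -1;

	hw->req_buf = malloc(64);
	hw->res_buf = malloc(64);
	if (!hw->req_buf || !hw->res_buf)
		goto err;

	for (i = 0; i < max_epid; i++) {
		hw->ep_bufs[i] = malloc(128);
		if (!hw->ep_bufs[i])
			goto err;
	}

	return 0;

err:
	demo_hw_cleanup(hw);
	return -1;
}
```

Because every pointer is cleared after it is freed, the same cleanup function can serve both the error path of setup and the normal removal path.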
[PATCH v3 20/22] fjes: epstop_task
This patch adds epstop_task.
This task is used to process another receiver's
cancellation request.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes_hw.c   | 31 +++
 drivers/net/fjes/fjes_hw.h   |  1 +
 drivers/net/fjes/fjes_main.c |  1 +
 3 files changed, 33 insertions(+)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 4525d36..b5f4a78 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -23,6 +23,7 @@
 #include "fjes.h"

 static void fjes_hw_update_zone_task(struct work_struct *);
+static void fjes_hw_epstop_task(struct work_struct *);

 /* supported MTU list */
 const u32 fjes_support_mtu[] = {
@@ -325,6 +326,7 @@ int fjes_hw_init(struct fjes_hw *hw)
 	fjes_hw_set_irqmask(hw, REG_ICTL_MASK_ALL, true);

 	INIT_WORK(&hw->update_zone_task, fjes_hw_update_zone_task);
+	INIT_WORK(&hw->epstop_task, fjes_hw_epstop_task);

 	mutex_init(&hw->hw_info.lock);

@@ -355,6 +357,7 @@ void fjes_hw_exit(struct fjes_hw *hw)
 	fjes_hw_cleanup(hw);

 	cancel_work_sync(&hw->update_zone_task);
+	cancel_work_sync(&hw->epstop_task);
 }

 static enum fjes_dev_command_response_e
@@ -1092,3 +1095,31 @@ static void fjes_hw_update_zone_task(struct work_struct *work)
 				   &adapter->unshare_watch_task);
 	}
 }
+
+static void fjes_hw_epstop_task(struct work_struct *work)
+{
+	struct fjes_hw *hw = container_of(work, struct fjes_hw, epstop_task);
+	struct fjes_adapter *adapter = (struct fjes_adapter *)hw->back;
+
+	ulong remain_bit;
+	int epid_bit;
+
+	while ((remain_bit = hw->epstop_req_bit)) {
+		for (epid_bit = 0; remain_bit; remain_bit >>= 1, epid_bit++) {
+			if (remain_bit & 1) {
+				hw->ep_shm_info[epid_bit].
+					tx.info->v1i.rx_status |=
+						FJES_RX_STOP_REQ_DONE;
+
+				clear_bit(epid_bit, &hw->epstop_req_bit);
+				set_bit(epid_bit,
+					&adapter->unshare_watch_bitmask);
+
+				if (!work_pending(&adapter->unshare_watch_task))
+					queue_work(
+						adapter->control_wq,
+						&adapter->unshare_watch_task);
+			}
+		}
+	}
+}
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index e59b737..6d57b89 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -283,6 +283,7 @@ struct fjes_hw {
 	unsigned long txrx_stop_req_bit;
 	unsigned long epstop_req_bit;
 	struct work_struct update_zone_task;
+	struct work_struct epstop_task;

 	int my_epid;
 	int max_epid;
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 8e3a084..5e77d0c 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -316,6 +316,7 @@ static int fjes_close(struct net_device *netdev)
 	cancel_work_sync(&adapter->tx_stall_task);

 	cancel_work_sync(&hw->update_zone_task);
+	cancel_work_sync(&hw->epstop_task);

 	fjes_hw_wait_epstop(hw);

-- 
1.8.3.1
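The inner loop of the epstop task above walks a request bitmask from the lowest bit upward, shifting the mask right once per step and stopping as soon as no set bits remain. A small user-space sketch of only that bit walk, assuming an ordinary unsigned long mask and no concurrency; collect_set_bits() and its parameters are hypothetical names for illustration, not driver functions.

```c
#include <stddef.h>

/*
 * Record the index of every set bit in mask, lowest first.
 * At most out_len indices are stored; the number stored is returned.
 */
static size_t collect_set_bits(unsigned long mask, int *out, size_t out_len)
{
	size_t n = 0;
	int bit;

	/* The loop ends once the remaining mask is zero, not at a fixed width. */
	for (bit = 0; mask; mask >>= 1, bit++) {
		if ((mask & 1UL) && n < out_len)
			out[n++] = bit;
	}

	return n;
}
```

The driver wraps this walk in an outer while loop that re-reads the shared request word, so bits set while the walk was in progress are picked up on the next pass; the sketch omits that part.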
Re: RX packet loss on i.MX6Q running 4.2-rc7
On Fri, Aug 21, 2015 at 06:49:20AM +0200, Jon Nettleton wrote:
> On Fri, Aug 21, 2015 at 12:30 AM, Clemens Gruber
> <clemens.gru...@pqgruber.com> wrote:
> > Hi,
> >
> > I am experiencing massive RX packet loss on my i.MX6Q (Chip rev 1.3) on
> > Linux 4.2-rc7 with a Marvell 88E1510 Gigabit Ethernet PHY connected over
> > RGMII.
> > I noticed it when doing an UDP benchmark with iperf3. When sending UDP
> > packets from a Debian PC to the i.MX6 with a rate of 100 Mbit/s, 99% of
> > the packets are lost. With a rate of 10 Mbit/s, we are still losing 93%
> > of all packets. TCP RX does suffer from packet loss too, but still
> > achieves about 211 Mbit/s.
> > TX is not affected.
> >
> > Steps to reproduce:
> > On the i.MX6: iperf3 -s
> > On a desktop PC:
> > iperf3 -b 10M -u -c MX6IP
> >
> > The iperf3 results:
> > [ ID] Interval        Transfer     Bandwidth       Jitter    Lost/Total
> > [  4] 0.00-10.00 sec  11.8 MBytes  9.90 Mbits/sec  0.687 ms  1397/1497 (93%)
> >
> > During the 10 Mbit UDP test, the IEEE_rx_macerr counter increased to 5371.
> > ifconfig eth0 shows:
> >  RX packets:9216 errors:5248 dropped:170 overruns:5248 frame:5248
> >  TX packets:83 errors:0 dropped:0 overruns:0 carrier:0
> >  collisions:0
> >
> > Here are the TCP results with iperf3 -c MX6IP:
> > [ ID] Interval        Transfer    Bandwidth      Retr
> > [  4] 0.00-10.00 sec  252 MBytes  211 Mbits/sec  4343   sender
> > [  4] 0.00-10.00 sec  251 MBytes  211 Mbits/sec         receiver
> >
> > During the TCP test, IEEE_rx_macerr increased to 4059.
> > ifconfig eth0 shows:
> >  RX packets:186368 errors:4206 dropped:50 overruns:4206 frame:4206
> >  TX packets:41861 errors:0 dropped:0 overruns:0 carrier:0
> >  collisions:0
> >
> > Freescale errata entry ERR004512 did mention a RX FIFO overrun. Is this
> > related?
> >
> > Forcing pause frames via ethtool -A eth0 rx on tx on, does not improve it:
> > Same amount of UDP packet loss with reduced TCP throughput of 190 Mbit/s.
> > IEEE_rx_macerr increased up to 5232 during UDP 10Mbit and up to 4270 for
> > TCP.
> >
> > I am already using the MX6QDL_PAD_GPIO_6__ENET_IRQ workaround, which
> > solved the ping latency issues from ERR006687 but not the packet loss
> > problem.
> >
> > I read through the mailing list archives and found a discussion between
> > Russell King, Marek Vasut, Eric Nelson, Fugang Duan and others about a
> > similar problem. I therefore added you and contributors to fec_main.c to
> > the CC.
> >
> > One suggestion I found, was adding udelay(210); to fec_enet_rx():
> > https://lkml.org/lkml/2014/8/22/88
> > But this also did not reduce the packet loss. (I added it to the
> > fec_enet_rx function just before return pkt_received; but I still got 93%
> > packet loss)
> >
> > Does anyone have the equipment/setup to trace an i.MX6Q during UDP RX
> > traffic from iperf3 to find the root cause of this packet loss problem?
> >
> > What else could we do to fix this?
>
> This is a bug in iperf3's UDP tests. Do the same test with iperf2 and
> you will see expected performance. I believe there is a bug open in
> github about it.
>
> -Jon

Thank you, Jon. You are right:

With iperf2 I get the following results:
10 Mbit/s: 0% packet loss
50 Mbit/s: 0.045% packet loss
100 Mbit/s: 0.31% packet loss
200 Mbit/s: 0.64% packet loss

Much better! :)

Cheers,
Clemens
[PATCH v3 01/22] fjes: Introduce FUJITSU Extended Socket Network Device driver
This patch adds the basic code of FUJITSU Extended Socket
Network Device driver.

When "PNP0C02" is found in ACPI DSDT, it evaluates "_STR"
to check if "PNP0C02" is for Extended Socket device driver
and retrieves ACPI resource information. It then creates
a platform_device.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/Kconfig          |   7 ++
 drivers/net/Makefile         |   2 +
 drivers/net/fjes/Makefile    |  30 ++
 drivers/net/fjes/fjes.h      |  32 +++
 drivers/net/fjes/fjes_main.c | 213 +++
 5 files changed, 284 insertions(+)
 create mode 100644 drivers/net/fjes/Makefile
 create mode 100644 drivers/net/fjes/fjes.h
 create mode 100644 drivers/net/fjes/fjes_main.c

diff --git a/drivers/net/Kconfig b/drivers/net/Kconfig
index c18f9e6..c78a81a 100644
--- a/drivers/net/Kconfig
+++ b/drivers/net/Kconfig
@@ -407,6 +407,13 @@ config VMXNET3
 	  To compile this driver as a module, choose M here: the
 	  module will be called vmxnet3.

+config FUJITSU_ES
+	tristate "FUJITSU Extended Socket Network Device driver"
+	depends on ACPI
+	help
+	  This driver provides support for Extended Socket network device
+	  on Extended Partitioning of FUJITSU PRIMEQUEST 2000 E2 series.
+
 source "drivers/net/hyperv/Kconfig"

 endif # NETDEVICES
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index c12cb22..677c7b4 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -67,3 +67,5 @@ obj-$(CONFIG_USB_NET_DRIVERS) += usb/

 obj-$(CONFIG_HYPERV_NET) += hyperv/
 obj-$(CONFIG_NTB_NETDEV) += ntb_netdev.o
+
+obj-$(CONFIG_FUJITSU_ES) += fjes/
diff --git a/drivers/net/fjes/Makefile b/drivers/net/fjes/Makefile
new file mode 100644
index 0000000..34bccba
--- /dev/null
+++ b/drivers/net/fjes/Makefile
@@ -0,0 +1,30 @@
+
+#
+# FUJITSU Extended Socket Network Device driver
+# Copyright (c) 2015 FUJITSU LIMITED
+#
+# This program is free software; you can redistribute it and/or modify it
+# under the terms and conditions of the GNU General Public License,
+# version 2, as published by the Free Software Foundation.
+#
+# This program is distributed in the hope it will be useful, but WITHOUT
+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+# more details.
+#
+# You should have received a copy of the GNU General Public License along with
+# this program; if not, see <http://www.gnu.org/licenses/>.
+#
+# The full GNU General Public License is included in this distribution in
+# the file called "COPYING".
+#
+
+
+
+#
+# Makefile for the FUJITSU Extended Socket network device driver
+#
+
+obj-$(CONFIG_FUJITSU_ES) += fjes.o
+
+fjes-objs := fjes_main.o
diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
new file mode 100644
index 0000000..52eb60b
--- /dev/null
+++ b/drivers/net/fjes/fjes.h
@@ -0,0 +1,32 @@
+/*
+ *  FUJITSU Extended Socket Network Device driver
+ *  Copyright (c) 2015 FUJITSU LIMITED
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * The full GNU General Public License is included in this distribution in
+ * the file called "COPYING".
+ *
+ */
+
+#ifndef FJES_H_
+#define FJES_H_
+
+#include <linux/acpi.h>
+
+#define FJES_ACPI_SYMBOL	"Extended Socket"
+
+extern char fjes_driver_name[];
+extern char fjes_driver_version[];
+
+#endif /* FJES_H_ */
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
new file mode 100644
index 0000000..9517666
--- /dev/null
+++ b/drivers/net/fjes/fjes_main.c
@@ -0,0 +1,213 @@
+/*
+ *  FUJITSU Extended Socket Network Device driver
+ *  Copyright (c) 2015 FUJITSU LIMITED
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, see
[PATCH v3 00/22] FUJITSU Extended Socket network device driver
This patchset adds the FUJITSU Extended Socket network device driver.
Extended Socket network device is a shared memory based high-speed
network interface between Extended Partitions of PRIMEQUEST 2000 E2
series.

You can get some information about Extended Partition and Extended
Socket by referring to the following manual.

http://globalsp.ts.fujitsu.com/dmsp/Publications/public/CA92344-0537.pdf
  3.2.1 Extended Partitioning
  3.2.2 Extended Socket

v2.2 -> v3:
  - Fix up according to David's comment (No functional change)

Taku Izumi (22):
  fjes: Introduce FUJITSU Extended Socket Network Device driver
  fjes: Hardware initialization routine
  fjes: Hardware cleanup routine
  fjes: platform_driver's .probe and .remove routine
  fjes: ES information acquisition routine
  fjes: buffer address regist/unregistration routine
  fjes: net_device_ops.ndo_open and .ndo_stop
  fjes: net_device_ops.ndo_start_xmit
  fjes: raise_intr_rxdata_task
  fjes: tx_stall_task
  fjes: NAPI polling function
  fjes: net_device_ops.ndo_get_stats64
  fjes: net_device_ops.ndo_change_mtu
  fjes: net_device_ops.ndo_tx_timeout
  fjes: net_device_ops.ndo_vlan_rx_add/kill_vid
  fjes: interrupt_watch_task
  fjes: force_close_task
  fjes: unshare_watch_task
  fjes: update_zone_task
  fjes: epstop_task
  fjes: handle receive cancellation request interrupt
  fjes: ethtool support

 drivers/net/Kconfig             |    7 +
 drivers/net/Makefile            |    2 +
 drivers/net/fjes/Makefile       |   30 +
 drivers/net/fjes/fjes.h         |   77 +++
 drivers/net/fjes/fjes_ethtool.c |  137
 drivers/net/fjes/fjes_hw.c      | 1125 +++
 drivers/net/fjes/fjes_hw.h      |  334 ++
 drivers/net/fjes/fjes_main.c    | 1383 +++
 drivers/net/fjes/fjes_regs.h    |  142
 9 files changed, 3237 insertions(+)
 create mode 100644 drivers/net/fjes/Makefile
 create mode 100644 drivers/net/fjes/fjes.h
 create mode 100644 drivers/net/fjes/fjes_ethtool.c
 create mode 100644 drivers/net/fjes/fjes_hw.c
 create mode 100644 drivers/net/fjes/fjes_hw.h
 create mode 100644 drivers/net/fjes/fjes_main.c
 create mode 100644 drivers/net/fjes/fjes_regs.h

-- 
1.8.3.1
[PATCH v3 11/22] fjes: NAPI polling function
This patch adds NAPI polling function and receive related work.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes_hw.c   |  40 ++
 drivers/net/fjes/fjes_hw.h   |   5 ++
 drivers/net/fjes/fjes_main.c | 171 ++-
 3 files changed, 214 insertions(+), 2 deletions(-)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 487dbc6..3c96d06 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -825,6 +825,46 @@ bool fjes_hw_check_vlan_id(struct epbuf_handler *epbh, u16 vlan_id)
 	return ret;
 }

+bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *epbh)
+{
+	union ep_buffer_info *info = epbh->info;
+
+	if (info->v1i.count_max == 0)
+		return true;
+
+	return EP_RING_EMPTY(info->v1i.head, info->v1i.tail,
+			     info->v1i.count_max);
+}
+
+void *fjes_hw_epbuf_rx_curpkt_get_addr(struct epbuf_handler *epbh,
+				       size_t *psize)
+{
+	union ep_buffer_info *info = epbh->info;
+	struct esmem_frame *ring_frame;
+	void *frame;
+
+	ring_frame = (struct esmem_frame *)&(epbh->ring[EP_RING_INDEX
+					     (info->v1i.head,
+					      info->v1i.count_max) *
+					     info->v1i.frame_max]);
+
+	*psize = (size_t)ring_frame->frame_size;
+
+	frame = ring_frame->frame_data;
+
+	return frame;
+}
+
+void fjes_hw_epbuf_rx_curpkt_drop(struct epbuf_handler *epbh)
+{
+	union ep_buffer_info *info = epbh->info;
+
+	if (fjes_hw_epbuf_rx_is_empty(epbh))
+		return;
+
+	EP_RING_INDEX_INC(epbh->info->v1i.head, info->v1i.count_max);
+}
+
 int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *epbh,
 			      void *frame, size_t size)
 {
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index 07e1226..3511db2 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -69,6 +69,8 @@ struct fjes_hw;
 	((_num) = EP_RING_INDEX((_num) + 1, (_max)))
 #define EP_RING_FULL(_head, _tail, _max)	\
 	(0 == EP_RING_INDEX(((_tail) - (_head)), (_max)))
+#define EP_RING_EMPTY(_head, _tail, _max) \
+	(1 == EP_RING_INDEX(((_tail) - (_head)), (_max)))

 #define FJES_MTU_TO_BUFFER_SIZE(mtu) \
 	(ETH_HLEN + VLAN_HLEN + (mtu) + ETH_FCS_LEN)
@@ -320,6 +322,9 @@ int fjes_hw_epid_is_shared(struct fjes_device_shared_info *, int);
 bool fjes_hw_check_epbuf_version(struct epbuf_handler *, u32);
 bool fjes_hw_check_mtu(struct epbuf_handler *, u32);
 bool fjes_hw_check_vlan_id(struct epbuf_handler *, u16);
+bool fjes_hw_epbuf_rx_is_empty(struct epbuf_handler *);
+void *fjes_hw_epbuf_rx_curpkt_get_addr(struct epbuf_handler *, size_t *);
+void fjes_hw_epbuf_rx_curpkt_drop(struct epbuf_handler *);
 int fjes_hw_epbuf_tx_pkt_send(struct epbuf_handler *, void *, size_t);

 #endif /* FJES_HW_H_ */
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index ac1e076..6194962 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -66,6 +66,9 @@ static int fjes_remove(struct platform_device *);
 static int fjes_sw_init(struct fjes_adapter *);
 static void fjes_netdev_setup(struct net_device *);

+static void fjes_rx_irq(struct fjes_adapter *, int);
+static int fjes_poll(struct napi_struct *, int);
+
 static const struct acpi_device_id fjes_acpi_ids[] = {
 	{"PNP0C02", 0},
 	{"", 0},
@@ -235,6 +238,8 @@ static int fjes_open(struct net_device *netdev)
 	hw->txrx_stop_req_bit = 0;
 	hw->epstop_req_bit = 0;

+	napi_enable(&adapter->napi);
+
 	fjes_hw_capture_interrupt_status(hw);

 	result = fjes_request_irq(adapter);
@@ -250,6 +255,7 @@ static int fjes_open(struct net_device *netdev)

 err_req_irq:
 	fjes_free_irq(adapter);
+	napi_disable(&adapter->napi);

 err_setup_res:
 	fjes_free_resources(adapter);
@@ -268,6 +274,8 @@ static int fjes_close(struct net_device *netdev)

 	fjes_hw_raise_epstop(hw);

+	napi_disable(&adapter->napi);
+
 	for (epidx = 0; epidx < hw->max_epid; epidx++) {
 		if (epidx == hw->my_epid)
 			continue;
@@ -701,14 +709,167 @@ static irqreturn_t fjes_intr(int irq, void *data)

 	icr = fjes_hw_capture_interrupt_status(hw);

-	if (icr & REG_IS_MASK_IS_ASSERT)
+	if (icr & REG_IS_MASK_IS_ASSERT) {
+		if (icr & REG_ICTL_MASK_RX_DATA)
+			fjes_rx_irq(adapter, icr & REG_IS_MASK_EPID);
+
 		ret = IRQ_HANDLED;
-	else
+	} else {
 		ret = IRQ_NONE;
+	}

 	return ret;
 }

+static int fjes_rxframe_search_exist(struct fjes_adapter *adapter,
+				     int start_epid)
+{
+	struct fjes_hw *hw = &adapter->hw;
+	enum
[PATCH v3 13/22] fjes: net_device_ops.ndo_change_mtu
This patch adds net_device_ops.ndo_change_mtu.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes_main.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 20feb3e..519976c 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -57,6 +57,7 @@ static void fjes_tx_stall_task(struct work_struct *);
 static irqreturn_t fjes_intr(int, void*);
 static struct rtnl_link_stats64 *
 fjes_get_stats64(struct net_device *, struct rtnl_link_stats64 *);
+static int fjes_change_mtu(struct net_device *, int);
 
 static int fjes_acpi_add(struct acpi_device *);
 static int fjes_acpi_remove(struct acpi_device *);
@@ -222,6 +223,7 @@ static const struct net_device_ops fjes_netdev_ops = {
 	.ndo_stop		= fjes_close,
 	.ndo_start_xmit		= fjes_xmit_frame,
 	.ndo_get_stats64	= fjes_get_stats64,
+	.ndo_change_mtu		= fjes_change_mtu,
 };
 
 /* fjes_open - Called when a network interface is made active */
@@ -713,6 +715,33 @@ fjes_get_stats64(struct net_device *netdev, struct rtnl_link_stats64 *stats)
 	return stats;
 }
 
+static int fjes_change_mtu(struct net_device *netdev, int new_mtu)
+{
+	bool running = netif_running(netdev);
+	int ret = 0;
+	int idx;
+
+	for (idx = 0; fjes_support_mtu[idx] != 0; idx++) {
+		if (new_mtu <= fjes_support_mtu[idx]) {
+			new_mtu = fjes_support_mtu[idx];
+			if (new_mtu == netdev->mtu)
+				return 0;
+
+			if (running)
+				fjes_close(netdev);
+
+			netdev->mtu = new_mtu;
+
+			if (running)
+				ret = fjes_open(netdev);
+
+			return ret;
+		}
+	}
+
+	return -EINVAL;
+}
+
 static irqreturn_t fjes_intr(int irq, void *data)
 {
 	struct fjes_adapter *adapter = data;
-- 
1.8.3.1
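Editorial sketch, not part of the patch: the loop in fjes_change_mtu appears to round the requested MTU up to the first entry of a zero-terminated, ascending table and to reject anything larger than the last entry. The userspace function below isolates just that selection step. The table contents and the names example_support_mtu and round_up_mtu are hypothetical; the real fjes_support_mtu[] values are not shown in this message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table: ascending and zero-terminated, like fjes_support_mtu[]. */
const int example_support_mtu[] = { 8192, 16384, 32768, 65536, 0 };

/*
 * Return the smallest table entry that is >= requested,
 * or -1 when the request exceeds every supported value
 * (the patch returns -EINVAL in that case).
 */
int round_up_mtu(const int *table, int requested)
{
	size_t idx;

	assert(table != NULL);

	for (idx = 0; table[idx] != 0; idx++) {
		if (requested <= table[idx])
			return table[idx];
	}

	return -1;
}
```

With the hypothetical table above, a request of 9000 would select 16384, and a request of 70000 would be rejected.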
[PATCH v3 12/22] fjes: net_device_ops.ndo_get_stats64
This patch adds net_device_ops.ndo_get_stats64 callback.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes_main.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 6194962..20feb3e 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -55,6 +55,8 @@ static netdev_tx_t fjes_xmit_frame(struct sk_buff *, struct net_device *);
 static void fjes_raise_intr_rxdata_task(struct work_struct *);
 static void fjes_tx_stall_task(struct work_struct *);
 static irqreturn_t fjes_intr(int, void*);
+static struct rtnl_link_stats64 *
+fjes_get_stats64(struct net_device *, struct rtnl_link_stats64 *);
 
 static int fjes_acpi_add(struct acpi_device *);
 static int fjes_acpi_remove(struct acpi_device *);
@@ -219,6 +221,7 @@ static const struct net_device_ops fjes_netdev_ops = {
 	.ndo_open		= fjes_open,
 	.ndo_stop		= fjes_close,
 	.ndo_start_xmit		= fjes_xmit_frame,
+	.ndo_get_stats64	= fjes_get_stats64,
 };
 
 /* fjes_open - Called when a network interface is made active */
@@ -700,6 +703,16 @@ fjes_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 	return ret;
 }
 
+static struct rtnl_link_stats64 *
+fjes_get_stats64(struct net_device *netdev, struct rtnl_link_stats64 *stats)
+{
+	struct fjes_adapter *adapter = netdev_priv(netdev);
+
+	memcpy(stats, &adapter->stats64, sizeof(struct rtnl_link_stats64));
+
+	return stats;
+}
+
 static irqreturn_t fjes_intr(int irq, void *data)
 {
 	struct fjes_adapter *adapter = data;
-- 
1.8.3.1
[PATCH v3 10/22] fjes: tx_stall_task
This patch adds tx_stall_task. When the receiver's buffer is full, the sender stops its tx queue. This task monitors the receiver's status and resumes the tx queue once the receiver's buffer is available again.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes.h      |  2 ++
 drivers/net/fjes/fjes_main.c | 61
 2 files changed, 63 insertions(+)

diff --git a/drivers/net/fjes/fjes.h b/drivers/net/fjes/fjes.h
index 8e9899e..b04ea9d 100644
--- a/drivers/net/fjes/fjes.h
+++ b/drivers/net/fjes/fjes.h
@@ -30,6 +30,7 @@
 #define FJES_MAX_QUEUES		1
 #define FJES_TX_RETRY_INTERVAL	(20 * HZ)
 #define FJES_TX_RETRY_TIMEOUT	(100)
+#define FJES_TX_TX_STALL_TIMEOUT	(FJES_TX_RETRY_INTERVAL / 2)
 #define FJES_OPEN_ZONE_UPDATE_WAIT	(300) /* msec */
 
 /* board specific private data structure */
@@ -52,6 +53,7 @@ struct fjes_adapter {
 
 	struct workqueue_struct *txrx_wq;
 
+	struct work_struct tx_stall_task;
 	struct work_struct raise_intr_rxdata_task;
 
 	struct fjes_hw hw;
diff --git a/drivers/net/fjes/fjes_main.c b/drivers/net/fjes/fjes_main.c
index 80e180f..ac1e076 100644
--- a/drivers/net/fjes/fjes_main.c
+++ b/drivers/net/fjes/fjes_main.c
@@ -53,6 +53,7 @@ static int fjes_setup_resources(struct fjes_adapter *);
 static void fjes_free_resources(struct fjes_adapter *);
 static netdev_tx_t fjes_xmit_frame(struct sk_buff *, struct net_device *);
 static void fjes_raise_intr_rxdata_task(struct work_struct *);
+static void fjes_tx_stall_task(struct work_struct *);
 static irqreturn_t fjes_intr(int, void*);
 
 static int fjes_acpi_add(struct acpi_device *);
@@ -278,6 +279,7 @@ static int fjes_close(struct net_device *netdev)
 	fjes_free_irq(adapter);
 
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
+	cancel_work_sync(&adapter->tx_stall_task);
 
 	fjes_hw_wait_epstop(hw);
 
@@ -407,6 +409,59 @@ static void fjes_free_resources(struct fjes_adapter *adapter)
 	}
 }
 
+static void fjes_tx_stall_task(struct work_struct *work)
+{
+	struct fjes_adapter *adapter = container_of(work,
+			struct fjes_adapter, tx_stall_task);
+	struct net_device *netdev = adapter->netdev;
+	struct fjes_hw *hw = &adapter->hw;
+	int all_queue_available, sendable;
+	enum ep_partner_status pstatus;
+	int max_epid, my_epid, epid;
+	union ep_buffer_info *info;
+	int i;
+
+	if (((long)jiffies -
+	     (long)(netdev->trans_start)) > FJES_TX_TX_STALL_TIMEOUT) {
+		netif_wake_queue(netdev);
+		return;
+	}
+
+	my_epid = hw->my_epid;
+	max_epid = hw->max_epid;
+
+	for (i = 0; i < 5; i++) {
+		all_queue_available = 1;
+
+		for (epid = 0; epid < max_epid; epid++) {
+			if (my_epid == epid)
+				continue;
+
+			pstatus = fjes_hw_get_partner_ep_status(hw, epid);
+			sendable = (pstatus == EP_PARTNER_SHARED);
+			if (!sendable)
+				continue;
+
+			info = adapter->hw.ep_shm_info[epid].tx.info;
+
+			if (EP_RING_FULL(info->v1i.head, info->v1i.tail,
+					 info->v1i.count_max)) {
+				all_queue_available = 0;
+				break;
+			}
+		}
+
+		if (all_queue_available) {
+			netif_wake_queue(netdev);
+			return;
+		}
+	}
+
+	usleep_range(50, 100);
+
+	queue_work(adapter->txrx_wq, &adapter->tx_stall_task);
+}
+
 static void fjes_raise_intr_rxdata_task(struct work_struct *work)
 {
 	struct fjes_adapter *adapter = container_of(work,
@@ -602,6 +657,10 @@ fjes_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 				netdev->trans_start = jiffies;
 				netif_tx_stop_queue(cur_queue);
 
+				if (!work_pending(&adapter->tx_stall_task))
+					queue_work(adapter->txrx_wq,
+						   &adapter->tx_stall_task);
+
 				ret = NETDEV_TX_BUSY;
 			}
 		} else {
@@ -686,6 +745,7 @@ static int fjes_probe(struct platform_device *plat_dev)
 
 	adapter->txrx_wq = create_workqueue(DRV_NAME "/txrx");
 
+	INIT_WORK(&adapter->tx_stall_task, fjes_tx_stall_task);
 	INIT_WORK(&adapter->raise_intr_rxdata_task,
 		  fjes_raise_intr_rxdata_task);
 
@@ -729,6 +789,7 @@ static int fjes_remove(struct platform_device *plat_dev)
 	struct fjes_hw *hw = &adapter->hw;
 
 	cancel_work_sync(&adapter->raise_intr_rxdata_task);
+
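Editorial sketch, not part of the patch: the first check in fjes_tx_stall_task wakes the queue once the time since the last transmit exceeds FJES_TX_TX_STALL_TIMEOUT, and it relies on signed arithmetic so the comparison keeps working when the tick counter wraps. The userspace version below shows the same idea with one deliberate difference: it subtracts as unsigned and casts the difference, which avoids signed overflow outside the kernel's build flags. The function names are hypothetical, and the cast assumes the usual two's-complement behaviour of gcc and clang.

```c
#include <assert.h>
#include <limits.h>

/*
 * Return 1 when more than `timeout` ticks have elapsed between
 * last_tx and now, even if the counter wrapped in between.
 */
int stalled_too_long(unsigned long now, unsigned long last_tx, long timeout)
{
	assert(timeout >= 0);

	/* Unsigned subtraction wraps modulo 2^N, so the difference is exact. */
	return (long)(now - last_tx) > timeout;
}

/* Example: the counter wrapped between the last transmit and now. */
int demo_wrapped_case(void)
{
	unsigned long last_tx = ULONG_MAX - 4;	/* 5 ticks before the wrap to 0 */
	unsigned long now = 5;			/* 5 ticks after it: 10 elapsed */

	return stalled_too_long(now, last_tx, 8);
}
```

A naive `now > last_tx + timeout` test would give the wrong answer in the wrapped case, which is presumably why the driver compares the difference rather than the absolute values.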
Re: [lkp] [rhashtable] 9d901bc0515: WARNING: CPU: 0 PID: 1 at arch/x86/mm/ioremap.c:63 __ioremap_check_ram+0x6a/0x99()
On Fri, Aug 21, 2015 at 03:09:42PM +0800, Huang Ying wrote:
>
> Sorry, my fault.  There are OOM for parent commit too, just some dmesg
> difference, which I miss understood.  Please ignore this report.  I will
> be more careful next time.

Thanks for the confirmation.
-- 
Email: Herbert Xu <herb...@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
[PATCH] net: phy: add interrupt support for aquantia phy
From: Shaohui Xie <shaohui@freescale.com>

By implementing config_intr and ack_interrupt, the PHY can now support
link connect/disconnect interrupts.

Signed-off-by: Shaohui Xie <shaohui@freescale.com>
---
 drivers/net/phy/aquantia.c | 49 ++
 1 file changed, 49 insertions(+)

diff --git a/drivers/net/phy/aquantia.c b/drivers/net/phy/aquantia.c
index 73d347d..d6111af 100644
--- a/drivers/net/phy/aquantia.c
+++ b/drivers/net/phy/aquantia.c
@@ -44,6 +44,43 @@ static int aquantia_aneg_done(struct phy_device *phydev)
 	return (reg < 0) ? reg : (reg & BMSR_ANEGCOMPLETE);
 }
 
+static int aquantia_config_intr(struct phy_device *phydev)
+{
+	int err;
+
+	if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
+		err = phy_write_mmd(phydev, MDIO_MMD_AN, 0xd401, 1);
+		if (err < 0)
+			return err;
+
+		err = phy_write_mmd(phydev, MDIO_MMD_VEND1, 0xff00, 1);
+		if (err < 0)
+			return err;
+
+		err = phy_write_mmd(phydev, MDIO_MMD_VEND1, 0xff01, 0x1001);
+	} else {
+		err = phy_write_mmd(phydev, MDIO_MMD_AN, 0xd401, 0);
+		if (err < 0)
+			return err;
+
+		err = phy_write_mmd(phydev, MDIO_MMD_VEND1, 0xff00, 0);
+		if (err < 0)
+			return err;
+
+		err = phy_write_mmd(phydev, MDIO_MMD_VEND1, 0xff01, 0);
+	}
+
+	return err;
+}
+
+static int aquantia_ack_interrupt(struct phy_device *phydev)
+{
+	int reg;
+
+	reg = phy_read_mmd(phydev, MDIO_MMD_AN, 0xcc01);
+	return (reg < 0) ? reg : 0;
+}
+
 static int aquantia_read_status(struct phy_device *phydev)
 {
 	int reg;
@@ -85,8 +122,11 @@ static struct phy_driver aquantia_driver[] = {
 	.phy_id_mask	= 0xfff0,
 	.name		= "Aquantia AQ1202",
 	.features	= PHY_AQUANTIA_FEATURES,
+	.flags		= PHY_HAS_INTERRUPT,
 	.aneg_done	= aquantia_aneg_done,
 	.config_aneg	= aquantia_config_aneg,
+	.config_intr	= aquantia_config_intr,
+	.ack_interrupt	= aquantia_ack_interrupt,
 	.read_status	= aquantia_read_status,
 	.driver		= { .owner = THIS_MODULE,},
 },
@@ -95,8 +135,11 @@ static struct phy_driver aquantia_driver[] = {
 	.phy_id_mask	= 0xfff0,
 	.name		= "Aquantia AQ2104",
 	.features	= PHY_AQUANTIA_FEATURES,
+	.flags		= PHY_HAS_INTERRUPT,
 	.aneg_done	= aquantia_aneg_done,
 	.config_aneg	= aquantia_config_aneg,
+	.config_intr	= aquantia_config_intr,
+	.ack_interrupt	= aquantia_ack_interrupt,
 	.read_status	= aquantia_read_status,
 	.driver		= { .owner = THIS_MODULE,},
 },
@@ -105,8 +148,11 @@ static struct phy_driver aquantia_driver[] = {
 	.phy_id_mask	= 0xfff0,
 	.name		= "Aquantia AQR105",
 	.features	= PHY_AQUANTIA_FEATURES,
+	.flags		= PHY_HAS_INTERRUPT,
 	.aneg_done	= aquantia_aneg_done,
 	.config_aneg	= aquantia_config_aneg,
+	.config_intr	= aquantia_config_intr,
+	.ack_interrupt	= aquantia_ack_interrupt,
 	.read_status	= aquantia_read_status,
 	.driver		= { .owner = THIS_MODULE,},
 },
@@ -115,8 +161,11 @@ static struct phy_driver aquantia_driver[] = {
 	.phy_id_mask	= 0xfff0,
 	.name		= "Aquantia AQR405",
 	.features	= PHY_AQUANTIA_FEATURES,
+	.flags		= PHY_HAS_INTERRUPT,
 	.aneg_done	= aquantia_aneg_done,
 	.config_aneg	= aquantia_config_aneg,
+	.config_intr	= aquantia_config_intr,
+	.ack_interrupt	= aquantia_ack_interrupt,
 	.read_status	= aquantia_read_status,
 	.driver		= { .owner = THIS_MODULE,},
 },
-- 
2.1.0.27.g96db324
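Editorial sketch, not part of the patch: aquantia_config_intr issues three register writes in a fixed order and returns at the first negative result. The userspace code below expresses that pattern as a table of writes walked by a helper. Everything here (struct reg_write, write_sequence, mock_write, enable_seq) is hypothetical scaffolding; only the register addresses and values come from the patch, and the device numbers 7 and 30 are the values of MDIO_MMD_AN and MDIO_MMD_VEND1 in linux/mdio.h.

```c
#include <assert.h>
#include <stddef.h>

/* One MMD register write: device address, register number, value. */
struct reg_write {
	int devad;
	int regnum;
	int val;
};

/* Signature of a bus write; returns a negative value on failure. */
typedef int (*write_fn)(void *ctx, int devad, int regnum, int val);

/* Writes used by the patch to enable interrupts (7 = AN, 30 = VEND1). */
const struct reg_write enable_seq[] = {
	{ 7,  0xd401, 1 },
	{ 30, 0xff00, 1 },
	{ 30, 0xff01, 0x1001 },
};

/* Apply the writes in order and stop at the first error. */
int write_sequence(write_fn write, void *ctx,
		   const struct reg_write *seq, size_t n)
{
	size_t i;
	int err = 0;

	assert(write != NULL && seq != NULL);

	for (i = 0; i < n; i++) {
		err = write(ctx, seq[i].devad, seq[i].regnum, seq[i].val);
		if (err < 0)
			return err;
	}

	return err;
}

/* Stand-in for a bus write: succeeds *budget times, then fails with -5. */
int mock_write(void *ctx, int devad, int regnum, int val)
{
	int *budget = ctx;

	(void)devad;
	(void)regnum;
	(void)val;

	if (*budget == 0)
		return -5;

	(*budget)--;
	return 0;
}
```

The disable path in the patch would be the same three registers with zero values. Whether a table is preferable to the explicit sequence in the driver is a style choice; the explicit form keeps each write visible at the call site.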
[PATCH v3 05/22] fjes: ES information acquisition routine
This patch adds the ES information acquisition routine. ES information can be retrieved by issuing an information request command. ES information includes which receivers are in the same zone.

Signed-off-by: Taku Izumi <izumi.t...@jp.fujitsu.com>
---
 drivers/net/fjes/fjes_hw.c   | 101 +++
 drivers/net/fjes/fjes_hw.h   |  24 ++
 drivers/net/fjes/fjes_regs.h |  23 ++
 3 files changed, 148 insertions(+)

diff --git a/drivers/net/fjes/fjes_hw.c b/drivers/net/fjes/fjes_hw.c
index 757cece..c31be7f 100644
--- a/drivers/net/fjes/fjes_hw.c
+++ b/drivers/net/fjes/fjes_hw.c
@@ -351,6 +351,107 @@ void fjes_hw_exit(struct fjes_hw *hw)
 	fjes_hw_cleanup(hw);
 }
 
+static enum fjes_dev_command_response_e
+fjes_hw_issue_request_command(struct fjes_hw *hw,
+			      enum fjes_dev_command_request_type type)
+{
+	enum fjes_dev_command_response_e ret = FJES_CMD_STATUS_UNKNOWN;
+	union REG_CR cr;
+	union REG_CS cs;
+	int timeout;
+
+	cr.reg = 0;
+	cr.bits.req_start = 1;
+	cr.bits.req_code = type;
+	wr32(XSCT_CR, cr.reg);
+	cr.reg = rd32(XSCT_CR);
+
+	if (cr.bits.error == 0) {
+		timeout = FJES_COMMAND_REQ_TIMEOUT * 1000;
+		cs.reg = rd32(XSCT_CS);
+
+		while ((cs.bits.complete != 1) && timeout > 0) {
+			msleep(1000);
+			cs.reg = rd32(XSCT_CS);
+			timeout -= 1000;
+		}
+
+		if (cs.bits.complete == 1)
+			ret = FJES_CMD_STATUS_NORMAL;
+		else if (timeout <= 0)
+			ret = FJES_CMD_STATUS_TIMEOUT;
+
+	} else {
+		switch (cr.bits.err_info) {
+		case FJES_CMD_REQ_ERR_INFO_PARAM:
+			ret = FJES_CMD_STATUS_ERROR_PARAM;
+			break;
+		case FJES_CMD_REQ_ERR_INFO_STATUS:
+			ret = FJES_CMD_STATUS_ERROR_STATUS;
+			break;
+		default:
+			ret = FJES_CMD_STATUS_UNKNOWN;
+			break;
+		}
+	}
+
+	return ret;
+}
+
+int fjes_hw_request_info(struct fjes_hw *hw)
+{
+	union fjes_device_command_req *req_buf = hw->hw_info.req_buf;
+	union fjes_device_command_res *res_buf = hw->hw_info.res_buf;
+	enum fjes_dev_command_response_e ret;
+	int result;
+
+	memset(req_buf, 0, hw->hw_info.req_buf_size);
+	memset(res_buf, 0, hw->hw_info.res_buf_size);
+
+	req_buf->info.length = FJES_DEV_COMMAND_INFO_REQ_LEN;
+
+	res_buf->info.length = 0;
+	res_buf->info.code = 0;
+
+	ret = fjes_hw_issue_request_command(hw, FJES_CMD_REQ_INFO);
+
+	result = 0;
+
+	if (FJES_DEV_COMMAND_INFO_RES_LEN((*hw->hw_info.max_epid)) !=
+		res_buf->info.length) {
+		result = -ENOMSG;
+	} else if (ret == FJES_CMD_STATUS_NORMAL) {
+		switch (res_buf->info.code) {
+		case FJES_CMD_REQ_RES_CODE_NORMAL:
+			result = 0;
+			break;
+		default:
+			result = -EPERM;
+			break;
+		}
+	} else {
+		switch (ret) {
+		case FJES_CMD_STATUS_UNKNOWN:
+			result = -EPERM;
+			break;
+		case FJES_CMD_STATUS_TIMEOUT:
+			result = -EBUSY;
+			break;
+		case FJES_CMD_STATUS_ERROR_PARAM:
+			result = -EPERM;
+			break;
+		case FJES_CMD_STATUS_ERROR_STATUS:
+			result = -EPERM;
+			break;
+		default:
+			result = -EPERM;
+			break;
+		}
+	}
+
+	return result;
+}
+
 void fjes_hw_set_irqmask(struct fjes_hw *hw,
 			 enum REG_ICTL_MASK intr_mask, bool mask)
 {
diff --git a/drivers/net/fjes/fjes_hw.h b/drivers/net/fjes/fjes_hw.h
index 1b3e9ca..cc1ef21 100644
--- a/drivers/net/fjes/fjes_hw.h
+++ b/drivers/net/fjes/fjes_hw.h
@@ -34,6 +34,12 @@ struct fjes_hw;
 #define EP_BUFFER_INFO_SIZE 4096
 
 #define FJES_DEVICE_RESET_TIMEOUT  ((17 + 1) * 3) /* sec */
+#define FJES_COMMAND_REQ_TIMEOUT  (5 + 1) /* sec */
+
+#define FJES_CMD_REQ_ERR_INFO_PARAM  (0x0001)
+#define FJES_CMD_REQ_ERR_INFO_STATUS (0x0002)
+
+#define FJES_CMD_REQ_RES_CODE_NORMAL (0)
 
 #define EP_BUFFER_SIZE \
 	(((sizeof(union ep_buffer_info) + (128 * (64 * 1024))) \
@@ -50,6 +56,7 @@ struct fjes_hw;
 	((size) - sizeof(struct esmem_frame) - \
 	(ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN))
 
+#define FJES_DEV_COMMAND_INFO_REQ_LEN	(4)
 #define FJES_DEV_COMMAND_INFO_RES_LEN(epnum) (8 + 2 * (epnum))
 #define FJES_DEV_COMMAND_SHARE_BUFFER_REQ_LEN(txb, rxb) \
 	(24 + (8 * ((txb) / EP_BUFFER_INFO_SIZE + (rxb)
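Editorial sketch, not part of the patch: fjes_hw_issue_request_command polls a completion bit once per second against a millisecond budget and reports either normal completion or a timeout. The userspace code below reproduces only that control flow. The names cmd_status, wait_complete and completes_after are hypothetical, the sleep is omitted, and the poll callback stands in for reading XSCT_CS.

```c
#include <assert.h>
#include <stddef.h>

enum cmd_status { CMD_NORMAL, CMD_TIMEOUT };

/* Poll callback: returns nonzero once the command has completed. */
typedef int (*poll_fn)(void *ctx);

/*
 * Poll until completion or until the time budget is used up.
 * The driver sleeps step_ms between polls; that is left out here.
 */
enum cmd_status wait_complete(poll_fn poll, void *ctx,
			      int timeout_ms, int step_ms)
{
	int complete;

	assert(poll != NULL && step_ms > 0);

	complete = poll(ctx);
	while (!complete && timeout_ms > 0) {
		complete = poll(ctx);
		timeout_ms -= step_ms;
	}

	return complete ? CMD_NORMAL : CMD_TIMEOUT;
}

/* Stand-in device: reports "not complete" *remaining times, then complete. */
int completes_after(void *ctx)
{
	int *remaining = ctx;

	if (*remaining == 0)
		return 1;

	(*remaining)--;
	return 0;
}
```

With a 6000 ms budget and 1000 ms steps (the values FJES_COMMAND_REQ_TIMEOUT implies), a device that completes on the third poll would yield CMD_NORMAL, while one that never completes would yield CMD_TIMEOUT after the initial poll plus six retries.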
Re: DEBUG_LOCKS_WARN_ON(in_interrupt()) triggering in socket code
Bueller? ... Bueller?

On Thu, Aug 20, 2015 at 2:39 AM, Jason A. Donenfeld <ja...@zx2c4.com> wrote:
> Hi folks,
>
> In setting up a socket, there are two functions I make use of that in
> turn wind up calling static_key_slow_inc: setup_udp_tunnel_sock and
> sk_set_memalloc. These both make use of static_key_slow_inc because
> they selectively enable certain important code paths. This is all
> fine, except it poses some problems when calling these functions
> inside of .ndo_open. In that case, I get ugly (debug) warnings like
> this:
>
> WARNING: CPU: 1 PID: 2002 at kernel/locking/mutex.c:526 mutex_lock_nested+0x39b/0x3b0()
> DEBUG_LOCKS_WARN_ON(in_interrupt())
>  [81621d0e] dump_stack+0x45/0x57
>  [810505ca] warn_slowpath_common+0x8a/0xc0
>  [81050655] warn_slowpath_fmt+0x55/0x70
>  [8162513b] mutex_lock_nested+0x39b/0x3b0
>  [8113d699] static_key_slow_inc+0x59/0xc0
>  [8154ebc0] udp_encap_enable+0x20/0x30
>  [8157a885] setup_udp_tunnel_sock+0x55/0x70
>  [816028ac] socket_init+0x1cc/0x3a0
>  [81600341] open+0x21/0x1b0
>  [81476af0] __dev_open+0xb0/0x110
>  [81476e01] __dev_change_flags+0xa1/0x160
>  [81476ee9] dev_change_flags+0x29/0x70
>  [8148652a] do_setlink+0x5da/0xa80
>  [81487bed] rtnl_newlink+0x50d/0x8a0
>  [81485141] rtnetlink_rcv_msg+0xa1/0x240
>  [8149f1fb] netlink_rcv_skb+0x9b/0xc0
>  [8148508e] rtnetlink_rcv+0x2e/0x40
>  [8149ec3f] netlink_unicast+0x16f/0x200
>  [8149f009] netlink_sendmsg+0x339/0x380
>  [814559d9] ___sys_sendmsg+0x2f9/0x310
>  [814566d7] __sys_sendmsg+0x57/0xa0
>  [81456732] SyS_sendmsg+0x12/0x20
>  [816295b2] entry_SYSCALL_64_fastpath+0x16/0x7a
>
> The reason is that the static key code makes use of mutexes. And the
> mutex debug code ensures that in_interrupt() is zero; otherwise it
> prints that warning. In this case, in_interrupt() has a value of 512.
>
> So, questions:
>
> 1. Is the best thing to do just move my socket creation routine into a
> workqueue, and avoid this issue all together?
>
> 2. Is it, in fact, incorrect to check for in_interrupt(), and the
> debug assertion is actually wrong?
>
> 3. Is it a bug that in_interrupt() is returning non-zero in relation
> to a syscall?
>
> Thanks,
> Jason
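Editorial sketch, not part of the thread: the value 512 reported for in_interrupt() can be decoded in userspace, assuming the preempt_count() bit layout from include/linux/preempt.h in kernels of that period (8 preempt bits, then 8 softirq bits, 4 hardirq bits and 1 NMI bit). The struct and function names below are made up for illustration; only the bit widths are meant to mirror the kernel header.

```c
#include <assert.h>

/* Bit layout assumed from include/linux/preempt.h; not stated in the thread. */
#define PREEMPT_BITS	8
#define SOFTIRQ_BITS	8
#define HARDIRQ_BITS	4
#define NMI_BITS	1

#define PREEMPT_SHIFT	0
#define SOFTIRQ_SHIFT	(PREEMPT_SHIFT + PREEMPT_BITS)
#define HARDIRQ_SHIFT	(SOFTIRQ_SHIFT + SOFTIRQ_BITS)
#define NMI_SHIFT	(HARDIRQ_SHIFT + HARDIRQ_BITS)

#define FIELD(count, bits, shift) \
	(((count) >> (shift)) & ((1UL << (bits)) - 1))

/* One softirq "unit"; local_bh_disable() adds two of these. */
#define SOFTIRQ_OFFSET	(1UL << SOFTIRQ_SHIFT)

struct preempt_fields {
	unsigned long preempt;
	unsigned long softirq;
	unsigned long hardirq;
	unsigned long nmi;
};

/* Split a preempt_count-style value into its four fields. */
struct preempt_fields decode_preempt_count(unsigned long count)
{
	struct preempt_fields f;

	assert((count >> (NMI_SHIFT + NMI_BITS)) == 0 ||
	       "higher bits are ignored by this sketch");

	f.preempt = FIELD(count, PREEMPT_BITS, PREEMPT_SHIFT);
	f.softirq = FIELD(count, SOFTIRQ_BITS, SOFTIRQ_SHIFT);
	f.hardirq = FIELD(count, HARDIRQ_BITS, HARDIRQ_SHIFT);
	f.nmi = FIELD(count, NMI_BITS, NMI_SHIFT);

	return f;
}

/* Mirrors the idea of in_serving_softirq(): the lowest softirq bit only. */
int serving_softirq(unsigned long count)
{
	return (count & SOFTIRQ_OFFSET) != 0;
}
```

Under that layout, 512 decodes to a softirq field of 2 with no hardirq or NMI bits, and the "serving softirq" bit is clear. That pattern is what one level of bottom-half disabling produces, so the warning may come from a caller that disabled BHs rather than from genuine interrupt context. Whether that applies to this particular .ndo_open path is not established by the thread.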
Re: [net-next:master 1179/1189] include/linux/compiler.h:447:38: error: call to '__compiletime_assert_243' declared with attribute error: BUILD_BUG_ON failed: offsetof(struct dst_entry, __refcnt) & 63
Yeah, I should have predicted this would happen on 32-bit builds when I saw the adjustment of __pad_to_align_refcnt[] for 64-bit. Jiri, you might not have any reasonable options to fix this I'm afraid.