Re: INFO: rcu detected stall in xfrm_confirm_neigh

2018-02-18 Thread Steffen Klassert
On Tue, Feb 13, 2018 at 10:19:17AM +0100, Dmitry Vyukov wrote:
> On Mon, Feb 12, 2018 at 4:26 PM, Dmitry Vyukov  wrote:
> > On Mon, Feb 12, 2018 at 4:23 PM, syzbot
> >  wrote:
> >> Hello,
> >>
> >> syzbot hit the following crash on net-next commit
> >> 9515a2e082f91457db0ecff4b65371d0fb5d9aad (Thu Jan 25 03:37:38 2018 +0000)
> >> net/ipv4: Allow send to local broadcast from a socket bound to a VRF
> >>
> >> So far this crash happened 6 times on net-next.
> >> Unfortunately, I don't have any reproducer for this crash yet.
> >> Raw console output is attached.
> >> compiler: gcc (GCC) 7.1.1 20170620
> >> .config is attached.
> >
> >
> > +xfrm maintainers
> 
> Here is a C repro:
> https://gist.githubusercontent.com/dvyukov/92c67ba9afaaa960bcfbdc6ef549ac10/raw/786f9221c1d707c7f4a15effcb1d5997dd4f8638/gistfile1.txt

Seems like syzbot does not know about this reproducer.

I've sent a patch to test and got this as the reply:

This crash does not have a reproducer. I cannot test it.


[PATCH net] tg3: APE heartbeat changes

2018-02-18 Thread Satish Baddipadige
From: Prashant Sreedharan 

In case of an ungraceful host shutdown or a driver crash, BMC
connectivity is lost because the APE firmware does not have the
driver state it needs to keep the BMC connectivity alive.
This patch makes the changes below to address this issue.

Add a heartbeat mechanism with the APE firmware. The heartbeat
is needed to notify the APE firmware about the driver state.

This patch also changes the wait time for an APE event from
1ms to 20ms, as there can be some delay in getting a response.

v2: Drop inline keyword as per David's suggestion.

Signed-off-by: Prashant Sreedharan 
Signed-off-by: Satish Baddipadige 
Signed-off-by: Siva Reddy Kallam 
Acked-by: Michael Chan 
---
 drivers/net/ethernet/broadcom/tg3.c | 35 ---
 drivers/net/ethernet/broadcom/tg3.h |  5 +
 2 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c 
b/drivers/net/ethernet/broadcom/tg3.c
index a77ee2f..c1841db 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -820,7 +820,7 @@ static int tg3_ape_event_lock(struct tg3 *tp, u32 
timeout_us)
 
tg3_ape_unlock(tp, TG3_APE_LOCK_MEM);
 
-   udelay(10);
+   usleep_range(10, 20);
timeout_us -= (timeout_us > 10) ? 10 : timeout_us;
}
 
@@ -922,8 +922,8 @@ static int tg3_ape_send_event(struct tg3 *tp, u32 event)
if (!(apedata & APE_FW_STATUS_READY))
return -EAGAIN;
 
-   /* Wait for up to 1 millisecond for APE to service previous event. */
-   err = tg3_ape_event_lock(tp, 1000);
+   /* Wait for up to 20 milliseconds for APE to service previous event. */
+   err = tg3_ape_event_lock(tp, 20000);
if (err)
return err;
 
@@ -946,6 +946,7 @@ static void tg3_ape_driver_state_change(struct tg3 *tp, int 
kind)
 
switch (kind) {
case RESET_KIND_INIT:
+   tg3_ape_write32(tp, TG3_APE_HOST_HEARTBEAT_COUNT, tp->ape_hb++);
tg3_ape_write32(tp, TG3_APE_HOST_SEG_SIG,
APE_HOST_SEG_SIG_MAGIC);
tg3_ape_write32(tp, TG3_APE_HOST_SEG_LEN,
@@ -962,13 +963,6 @@ static void tg3_ape_driver_state_change(struct tg3 *tp, 
int kind)
event = APE_EVENT_STATUS_STATE_START;
break;
case RESET_KIND_SHUTDOWN:
-   /* With the interface we are currently using,
-* APE does not track driver state.  Wiping
-* out the HOST SEGMENT SIGNATURE forces
-* the APE to assume OS absent status.
-*/
-   tg3_ape_write32(tp, TG3_APE_HOST_SEG_SIG, 0x0);
-
if (device_may_wakeup(&tp->pdev->dev) &&
tg3_flag(tp, WOL_ENABLE)) {
tg3_ape_write32(tp, TG3_APE_HOST_WOL_SPEED,
@@ -990,6 +984,18 @@ static void tg3_ape_driver_state_change(struct tg3 *tp, 
int kind)
tg3_ape_send_event(tp, event);
 }
 
+static void tg3_send_ape_heartbeat(struct tg3 *tp,
+  unsigned long interval)
+{
+   /* Check if hb interval has exceeded */
+   if (!tg3_flag(tp, ENABLE_APE) ||
+   time_before(jiffies, tp->ape_hb_jiffies + interval))
+   return;
+
+   tg3_ape_write32(tp, TG3_APE_HOST_HEARTBEAT_COUNT, tp->ape_hb++);
+   tp->ape_hb_jiffies = jiffies;
+}
+
 static void tg3_disable_ints(struct tg3 *tp)
 {
int i;
@@ -7262,6 +7268,7 @@ static int tg3_poll_msix(struct napi_struct *napi, int 
budget)
}
}
 
+   tg3_send_ape_heartbeat(tp, TG3_APE_HB_INTERVAL << 1);
return work_done;
 
 tx_recovery:
@@ -7344,6 +7351,7 @@ static int tg3_poll(struct napi_struct *napi, int budget)
}
}
 
+   tg3_send_ape_heartbeat(tp, TG3_APE_HB_INTERVAL << 1);
return work_done;
 
 tx_recovery:
@@ -10732,7 +10740,7 @@ static int tg3_reset_hw(struct tg3 *tp, bool reset_phy)
if (tg3_flag(tp, ENABLE_APE))
/* Write our heartbeat update interval to APE. */
tg3_ape_write32(tp, TG3_APE_HOST_HEARTBEAT_INT_MS,
-   APE_HOST_HEARTBEAT_INT_DISABLE);
+   APE_HOST_HEARTBEAT_INT_5SEC);
 
tg3_write_sig_post_reset(tp, RESET_KIND_INIT);
 
@@ -11077,6 +11085,9 @@ static void tg3_timer(struct timer_list *t)
tp->asf_counter = tp->asf_multiplier;
}
 
+   /* Update the APE heartbeat every 5 seconds.*/
+   tg3_send_ape_heartbeat(tp, TG3_APE_HB_INTERVAL);
+
spin_unlock(&tp->lock);
 
 restart_timer:
@@ -16653,6 +16664,8 @@ static int tg3_get_invariants(struct tg3 *tp, const 
struct pci_device_id *ent)
   pci_state_reg);
 
tg3_ape_lock_init(tp);
+   tp->ape_hb_interval =
+   msecs_to_jiffies(APE_HOST_HEARTBEAT_INT_5SEC);
}
 
/* Set up tp->g
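
For readers following the heartbeat logic in the patch above, here is a minimal user-space sketch of the interval gating done by tg3_send_ape_heartbeat(). It is not the driver code: struct hb_state, tick_before() and send_heartbeat() are hypothetical stand-ins, and the tick counter is a plain uint32_t instead of jiffies. The comparison relies on the same wraparound-safe subtraction idea as the kernel's time_before().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Wraparound-safe "a is before b" on a free-running 32-bit tick counter. */
static bool tick_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Hypothetical stand-in for the driver fields the patch touches. */
struct hb_state {
	bool ape_enabled;    /* mirrors tg3_flag(tp, ENABLE_APE) */
	uint32_t count;      /* heartbeat counter written to the APE */
	uint32_t last_tick;  /* tick of the most recent heartbeat */
};

/* Returns true when a heartbeat is due and has been recorded. */
static bool send_heartbeat(struct hb_state *s, uint32_t now, uint32_t interval)
{
	assert(s);
	if (!s->ape_enabled || tick_before(now, s->last_tick + interval))
		return false;

	s->count++;          /* the driver writes the counter to a register here */
	s->last_tick = now;
	return true;
}
```

In the patch the timer path passes the interval as is, while the NAPI poll paths pass twice the interval, so polling only fills in when the timer has fallen behind.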

[Crypto v6 03/12] tls: support for inline tls

2018-02-18 Thread Atul Gupta
Add a facility to register Inline TLS drivers with net/tls. Set up
the TLS_FULL_HW prot to listen on the offload device.

Cases handled
1. An Inline TLS device exists: set up the prot for TLS_FULL_HW.
2. At least one Inline TLS device exists: set TLS_FULL_HW. If a
non-inline-capable device establishes the connection, move to TLS_SW_TX.
3. The default mode, TLS_SW_TX, continues.

Signed-off-by: Atul Gupta 
---
 net/tls/tls_main.c | 123 ++---
 1 file changed, 116 insertions(+), 7 deletions(-)

diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index b0d5fce..88eafec 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -45,13 +46,9 @@
 MODULE_DESCRIPTION("Transport Layer Security Support");
 MODULE_LICENSE("Dual BSD/GPL");
 
-enum {
-   TLS_BASE_TX,
-   TLS_SW_TX,
-   TLS_NUM_CONFIG,
-};
-
-static struct proto tls_prots[TLS_NUM_CONFIG];
+static LIST_HEAD(device_list);
+static DEFINE_MUTEX(device_mutex);
+struct proto tls_prots[TLS_NUM_CONFIG];
 
 static inline void update_sk_prot(struct sock *sk, struct tls_context *ctx)
 {
@@ -260,6 +257,37 @@ static void tls_sk_proto_close(struct sock *sk, long 
timeout)
sk_proto_close(sk, timeout);
 }
 
+static struct net_device *get_netdev(struct sock *sk)
+{
+   struct inet_sock *inet = inet_sk(sk);
+   struct net_device *netdev = NULL;
+
+   netdev = dev_get_by_index(sock_net(sk), inet->cork.fl.flowi_oif);
+   return netdev;
+}
+
+static int get_tls_offload_dev(struct sock *sk)
+{
+   struct net_device *netdev;
+   struct tls_device *dev;
+   int rc = 0;
+
+   netdev = get_netdev(sk);
+   if (!netdev)
+   return -EINVAL;
+
+   mutex_lock(&device_mutex);
+   list_for_each_entry(dev, &device_list, dev_list) {
+   if (dev->netdev && dev->netdev(dev, netdev)) {
+   rc = -EEXIST;
+   break;
+   }
+   }
+   mutex_unlock(&device_mutex);
+   dev_put(netdev);
+   return rc;
+}
+
 static int do_tls_getsockopt_tx(struct sock *sk, char __user *optval,
int __user *optlen)
 {
@@ -403,6 +431,15 @@ static int do_tls_setsockopt_tx(struct sock *sk, char 
__user *optval,
goto err_crypto_info;
}
 
+   rc = get_tls_offload_dev(sk);
+   if (rc == -EINVAL) {
+   goto out;
+   } else if (rc == -EEXIST) {
+   /* Retain HW unhash for cleanup and move to SW Tx */
+   sk->sk_prot[TLS_BASE_TX].unhash =
+   sk->sk_prot[TLS_FULL_HW].unhash;
+   }
+
/* currently SW is default, we will have ethtool in future */
rc = tls_set_sw_offload(sk, ctx);
tx_conf = TLS_SW_TX;
@@ -450,6 +487,54 @@ static int tls_setsockopt(struct sock *sk, int level, int 
optname,
return do_tls_setsockopt(sk, optname, optval, optlen);
 }
 
+static int tls_hw_prot(struct sock *sk)
+{
+   struct tls_context *ctx = tls_get_ctx(sk);
+   struct tls_device *dev;
+
+   mutex_lock(&device_mutex);
+   list_for_each_entry(dev, &device_list, dev_list) {
+   if (dev->feature && dev->feature(dev)) {
+   ctx->tx_conf = TLS_FULL_HW;
+   update_sk_prot(sk, ctx);
+   break;
+   }
+   }
+   mutex_unlock(&device_mutex);
+   return ctx->tx_conf;
+}
+
+static void tls_hw_unhash(struct sock *sk)
+{
+   struct tls_device *dev;
+
+   mutex_lock(&device_mutex);
+   list_for_each_entry(dev, &device_list, dev_list) {
+   if (dev->unhash)
+   dev->unhash(dev, sk);
+   }
+   mutex_unlock(&device_mutex);
+   tcp_prot.unhash(sk);
+}
+
+static int tls_hw_hash(struct sock *sk)
+{
+   struct tls_device *dev;
+   int err;
+
+   err = tcp_prot.hash(sk);
+   mutex_lock(&device_mutex);
+   list_for_each_entry(dev, &device_list, dev_list) {
+   if (dev->hash)
+   err |= dev->hash(dev, sk);
+   }
+   mutex_unlock(&device_mutex);
+
+   if (err)
+   tls_hw_unhash(sk);
+   return err;
+}
+
 static int tls_init(struct sock *sk)
 {
struct inet_connection_sock *icsk = inet_csk(sk);
@@ -477,6 +562,9 @@ static int tls_init(struct sock *sk)
ctx->sk_proto_close = sk->sk_prot->close;
 
ctx->tx_conf = TLS_BASE_TX;
+   if (tls_hw_prot(sk) == TLS_FULL_HW)
+   goto out;
+
update_sk_prot(sk, ctx);
 out:
return rc;
@@ -500,7 +588,27 @@ static void build_protos(struct proto *prot, struct proto 
*base)
prot[TLS_SW_TX] = prot[TLS_BASE_TX];
prot[TLS_SW_TX].sendmsg = tls_sw_sendmsg;
prot[TLS_SW_TX].sendpage= tls_sw_sendpage;
+
+   prot[TLS_FULL_HW] = prot[TLS_BASE_TX];
+   prot[TLS_FULL_HW].hash  = tls_hw_hash;
+   prot[TLS_FULL_HW].unhash  
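
The device-list walk in tls_hw_prot() above can be sketched in plain C roughly as follows. This is only an illustration under assumptions: the struct layout, the helper names (tls_dev_register, pick_tx_conf, has_inline_tls, no_inline_tls) and the numeric order of the enum are mine, and the mutex the patch takes around the walk is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Mode names follow the patch; the numeric order is an assumption. */
enum tls_conf { TLS_BASE_TX, TLS_SW_TX, TLS_FULL_HW };

/* Hypothetical, trimmed-down device record with one capability callback. */
struct tls_dev {
	int (*feature)(const struct tls_dev *dev);
	struct tls_dev *next;
};

/* Example callbacks for a capable and a non-capable device. */
static int has_inline_tls(const struct tls_dev *dev) { (void)dev; return 1; }
static int no_inline_tls(const struct tls_dev *dev) { (void)dev; return 0; }

/* Push a device onto the front of the registry. */
static void tls_dev_register(struct tls_dev **head, struct tls_dev *dev)
{
	assert(head && dev);
	dev->next = *head;
	*head = dev;
}

/* The first device that reports the feature selects TLS_FULL_HW;
 * with no such device the socket stays in TLS_BASE_TX. */
static enum tls_conf pick_tx_conf(const struct tls_dev *head)
{
	for (const struct tls_dev *d = head; d; d = d->next)
		if (d->feature && d->feature(d))
			return TLS_FULL_HW;
	return TLS_BASE_TX;
}
```

Note that the selection is per registry, not per socket route: one capable device is enough to pick TLS_FULL_HW, which is why the patch later falls back to TLS_SW_TX in do_tls_setsockopt_tx().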

[Crypto v6 12/12] Makefile Kconfig

2018-02-18 Thread Atul Gupta
Add Kconfig and Makefile entries for Inline TLS as another driver
dependent on cxgb4 and chcr.

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/Kconfig| 11 +++
 drivers/crypto/chelsio/Makefile   |  1 +
 drivers/crypto/chelsio/chtls/Makefile |  4 
 3 files changed, 16 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/Makefile

diff --git a/drivers/crypto/chelsio/Kconfig b/drivers/crypto/chelsio/Kconfig
index 5ae9f87..930d82d 100644
--- a/drivers/crypto/chelsio/Kconfig
+++ b/drivers/crypto/chelsio/Kconfig
@@ -29,3 +29,14 @@ config CHELSIO_IPSEC_INLINE
 default n
 ---help---
   Enable support for IPSec Tx Inline.
+
+config CRYPTO_DEV_CHELSIO_TLS
+tristate "Chelsio Crypto Inline TLS Driver"
+depends on CHELSIO_T4
+depends on TLS
+select CRYPTO_DEV_CHELSIO
+---help---
+  Support Chelsio Inline TLS with Chelsio crypto accelerator.
+
+  To compile this driver as a module, choose M here: the module
+  will be called chtls.
diff --git a/drivers/crypto/chelsio/Makefile b/drivers/crypto/chelsio/Makefile
index eaecaf1..639e571 100644
--- a/drivers/crypto/chelsio/Makefile
+++ b/drivers/crypto/chelsio/Makefile
@@ -3,3 +3,4 @@ ccflags-y := -Idrivers/net/ethernet/chelsio/cxgb4
 obj-$(CONFIG_CRYPTO_DEV_CHELSIO) += chcr.o
 chcr-objs :=  chcr_core.o chcr_algo.o
 chcr-$(CONFIG_CHELSIO_IPSEC_INLINE) += chcr_ipsec.o
+obj-$(CONFIG_CRYPTO_DEV_CHELSIO_TLS) += chtls/
diff --git a/drivers/crypto/chelsio/chtls/Makefile 
b/drivers/crypto/chelsio/chtls/Makefile
new file mode 100644
index 0000000..df13795
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/Makefile
@@ -0,0 +1,4 @@
+ccflags-y := -Idrivers/net/ethernet/chelsio/cxgb4 -Idrivers/crypto/chelsio/
+
+obj-$(CONFIG_CRYPTO_DEV_CHELSIO_TLS) += chtls.o
+chtls-objs := chtls_main.o chtls_cm.o chtls_io.o chtls_hw.o
-- 
1.8.3.1



[Crypto v6 10/12] chtls: Inline crypto request Tx/Rx

2018-02-18 Thread Atul Gupta
TLS handler for record transmit and receive.
Create Inline TLS work request and post to FW.

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/chtls/chtls_io.c | 1867 +++
 1 file changed, 1867 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_io.c

diff --git a/drivers/crypto/chelsio/chtls/chtls_io.c 
b/drivers/crypto/chelsio/chtls/chtls_io.c
new file mode 100644
index 0000000..0c5d6c1
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls_io.c
@@ -0,0 +1,1867 @@
+/*
+ * Copyright (c) 2017 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Written by: Atul Gupta (atul.gu...@chelsio.com)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "chtls.h"
+#include "chtls_cm.h"
+
+static bool is_tls_hw(struct chtls_sock *csk)
+{
+   return csk->tlshws.ofld;
+}
+
+static bool is_tls_rx(struct chtls_sock *csk)
+{
+   return (csk->tlshws.rxkey >= 0);
+}
+
+static bool is_tls_tx(struct chtls_sock *csk)
+{
+   return (csk->tlshws.txkey >= 0);
+}
+
+static bool is_tls_skb(struct chtls_sock *csk, const struct sk_buff *skb)
+{
+   return (is_tls_hw(csk) && skb_ulp_tls_skb_flags(skb));
+}
+
+static int key_size(void *sk)
+{
+   return 16; /* Key on DDR */
+}
+
+#define ceil(x, y) \
+   ({ unsigned long __x = (x), __y = (y); (__x + __y - 1) / __y; })
+
+static int data_sgl_len(const struct sk_buff *skb)
+{
+   unsigned int cnt;
+
+   cnt = skb_shinfo(skb)->nr_frags;
+   return (sgl_len(cnt) * 8);
+}
+
+static int nos_ivs(struct sock *sk, unsigned int size)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+
+   return ceil(size, csk->tlshws.mfs);
+}
+
+#define TLS_WR_CPL_LEN \
+   (sizeof(struct fw_tlstx_data_wr) + \
+   sizeof(struct cpl_tx_tls_sfo))
+
+static int is_ivs_imm(struct sock *sk, const struct sk_buff *skb)
+{
+   int ivs_size = nos_ivs(sk, skb->len) * CIPHER_BLOCK_SIZE;
+   int hlen = TLS_WR_CPL_LEN + data_sgl_len(skb);
+
+   if ((hlen + key_size(sk) + ivs_size) <
+   MAX_IMM_OFLD_TX_DATA_WR_LEN) {
+   ULP_SKB_CB(skb)->ulp.tls.iv = 1;
+   return 1;
+   }
+   ULP_SKB_CB(skb)->ulp.tls.iv = 0;
+   return 0;
+}
+
+static int max_ivs_size(struct sock *sk, int size)
+{
+   return (nos_ivs(sk, size) * CIPHER_BLOCK_SIZE);
+}
+
+static int ivs_size(struct sock *sk, const struct sk_buff *skb)
+{
+   return (is_ivs_imm(sk, skb) ? (nos_ivs(sk, skb->len) *
+CIPHER_BLOCK_SIZE) : 0);
+}
+
+static int flowc_wr_credits(int nparams, int *flowclenp)
+{
+   int flowclen16, flowclen;
+
+   flowclen = offsetof(struct fw_flowc_wr, mnemval[nparams]);
+   flowclen16 = DIV_ROUND_UP(flowclen, 16);
+   flowclen = flowclen16 * 16;
+
+   if (flowclenp)
+   *flowclenp = flowclen;
+
+   return flowclen16;
+}
+
+static struct sk_buff *create_flowc_wr_skb(struct sock *sk,
+  struct fw_flowc_wr *flowc,
+  int flowclen)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct sk_buff *skb;
+
+   skb = alloc_skb(flowclen, GFP_ATOMIC);
+   if (!skb)
+   return NULL;
+
+   memcpy(__skb_put(skb, flowclen), flowc, flowclen);
+   set_queue(skb, (csk->txq_idx << 1) | CPL_PRIORITY_DATA, sk);
+
+   return skb;
+}
+
+static int send_flowc_wr(struct sock *sk, struct fw_flowc_wr *flowc,
+int flowclen)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   bool syn_sent = (sk->sk_state == TCP_SYN_SENT);
+   struct tcp_sock *tp = tcp_sk(sk);
+   int flowclen16 = flowclen / 16;
+   struct sk_buff *skb;
+
+   if (csk_flag(sk, CSK_TX_DATA_SENT)) {
+   skb = create_flowc_wr_skb(sk, flowc, flowclen);
+   if (!skb)
+   return -ENOMEM;
+
+   if (syn_sent)
+   __skb_queue_tail(&csk->ooo_queue, skb);
+   else
+   skb_entail(sk, skb,
+  ULPCB_FLAG_NO_HDR | ULPCB_FLAG_NO_APPEND);
+   return 0;
+   }
+
+   if (!syn_sent) {
+   int ret;
+
+   ret = cxgb4_immdata_send(csk->egress_dev,
+csk->txq_idx,
+flowc, flowclen);
+   if (!ret)
+   return flowclen16;
+   }
+   skb = create_flowc_wr_skb(sk, flowc, flowclen);
+   if (!skb)
+   return -ENOMEM;
+   send_or_defer(sk, tp, skb, 0);
+   return flowclen16;
+}
+
+static u8 tcp_state_to_flowc_s
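
The size arithmetic in nos_ivs(), max_ivs_size() and flowc_wr_credits() above boils down to round-up division. A small stand-alone sketch, assuming CIPHER_BLOCK_SIZE is 16 bytes (the AES block size) and using hypothetical helper names ivs_bytes() and len_to_credits():

```c
#include <assert.h>

#define CIPHER_BLOCK_SIZE 16u  /* assumed value: AES block size in bytes */

/* Round-up integer division, the same formula as the patch's ceil() macro. */
static unsigned long div_round_up(unsigned long x, unsigned long y)
{
	assert(y != 0);
	return (x + y - 1) / y;
}

/* One IV per TLS record: records needed to carry `size` bytes when
 * each record holds at most `mfs` bytes (maximum fragment size). */
static unsigned long nos_ivs(unsigned long size, unsigned long mfs)
{
	return div_round_up(size, mfs);
}

/* Total bytes of IV material for `size` bytes of payload. */
static unsigned long ivs_bytes(unsigned long size, unsigned long mfs)
{
	return nos_ivs(size, mfs) * CIPHER_BLOCK_SIZE;
}

/* Work-request length expressed in 16-byte credit units. */
static unsigned long len_to_credits(unsigned long len)
{
	return div_round_up(len, 16);
}
```

For example, 40000 bytes with a 16384-byte fragment size need three records and therefore 48 bytes of IVs.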

[Crypto v6 09/12] chtls: CPL handler definition

2018-02-18 Thread Atul Gupta
CPL handlers for TLS session, record transmit and receive.

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/chtls/chtls_cm.c | 2041 +++
 net/ipv4/tcp_minisocks.c|1 +
 2 files changed, 2042 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_cm.c

diff --git a/drivers/crypto/chelsio/chtls/chtls_cm.c 
b/drivers/crypto/chelsio/chtls/chtls_cm.c
new file mode 100644
index 0000000..1c95e87
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls_cm.c
@@ -0,0 +1,2041 @@
+/*
+ * Copyright (c) 2017 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Written by: Atul Gupta (atul.gu...@chelsio.com)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "chtls.h"
+#include "chtls_cm.h"
+
+extern struct request_sock_ops chtls_rsk_ops;
+
+/*
+ * State transitions and actions for close.  Note that if we are in SYN_SENT
+ * we remain in that state as we cannot control a connection while it's in
+ * SYN_SENT; such connections are allowed to establish and are then aborted.
+ */
+static unsigned char new_state[16] = {
+   /* current state: new state:  action: */
+   /* (Invalid)   */ TCP_CLOSE,
+   /* TCP_ESTABLISHED */ TCP_FIN_WAIT1 | TCP_ACTION_FIN,
+   /* TCP_SYN_SENT*/ TCP_SYN_SENT,
+   /* TCP_SYN_RECV*/ TCP_FIN_WAIT1 | TCP_ACTION_FIN,
+   /* TCP_FIN_WAIT1   */ TCP_FIN_WAIT1,
+   /* TCP_FIN_WAIT2   */ TCP_FIN_WAIT2,
+   /* TCP_TIME_WAIT   */ TCP_CLOSE,
+   /* TCP_CLOSE   */ TCP_CLOSE,
+   /* TCP_CLOSE_WAIT  */ TCP_LAST_ACK | TCP_ACTION_FIN,
+   /* TCP_LAST_ACK*/ TCP_LAST_ACK,
+   /* TCP_LISTEN  */ TCP_CLOSE,
+   /* TCP_CLOSING */ TCP_CLOSING,
+};
+
+static struct chtls_sock *chtls_sock_create(struct chtls_dev *cdev)
+{
+   struct chtls_sock *csk = kzalloc(sizeof(*csk), GFP_ATOMIC);
+
+   if (!csk)
+   return NULL;
+
+   csk->txdata_skb_cache = alloc_skb(TXDATA_SKB_LEN, GFP_ATOMIC);
+   if (!csk->txdata_skb_cache) {
+   kfree(csk);
+   return NULL;
+   }
+
+   kref_init(&csk->kref);
+   csk->cdev = cdev;
+   skb_queue_head_init(&csk->txq);
+   csk->wr_skb_head = NULL;
+   csk->wr_skb_tail = NULL;
+   csk->mss = MAX_MSS;
+   csk->tlshws.ofld = 1;
+   csk->tlshws.txkey = -1;
+   csk->tlshws.rxkey = -1;
+   csk->tlshws.mfs = TLS_MFS;
+   skb_queue_head_init(&csk->tlshws.sk_recv_queue);
+   return csk;
+}
+
+static void chtls_sock_release(struct kref *ref)
+{
+   struct chtls_sock *csk =
+   container_of(ref, struct chtls_sock, kref);
+
+   kfree(csk);
+}
+
+static struct net_device *chtls_ipv4_netdev(struct chtls_dev *cdev,
+   struct sock *sk)
+{
+   struct net_device *ndev = cdev->ports[0];
+
+   if (likely(!inet_sk(sk)->inet_rcv_saddr))
+   return ndev;
+
+   ndev = ip_dev_find(&init_net, inet_sk(sk)->inet_rcv_saddr);
+   if (!ndev)
+   return NULL;
+
+   if (is_vlan_dev(ndev))
+   return vlan_dev_real_dev(ndev);
+   return ndev;
+}
+
+static void assign_rxopt(struct sock *sk, unsigned int opt)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct tcp_sock *tp = tcp_sk(sk);
+   const struct chtls_dev *cdev;
+
+   cdev = csk->cdev;
+   tp->tcp_header_len   = sizeof(struct tcphdr);
+   tp->rx_opt.mss_clamp = cdev->mtus[TCPOPT_MSS_G(opt)] - 40;
+   tp->mss_cache= tp->rx_opt.mss_clamp;
+   tp->rx_opt.tstamp_ok = TCPOPT_TSTAMP_G(opt);
+   tp->rx_opt.sack_ok   = TCPOPT_SACK_G(opt);
+   tp->rx_opt.wscale_ok = TCPOPT_WSCALE_OK_G(opt);
+   SND_WSCALE(tp)   = TCPOPT_SND_WSCALE_G(opt);
+   if (!tp->rx_opt.wscale_ok)
+   tp->rx_opt.rcv_wscale = 0;
+   if (tp->rx_opt.tstamp_ok) {
+   tp->tcp_header_len += TCPOLEN_TSTAMP_ALIGNED;
+   tp->rx_opt.mss_clamp -= TCPOLEN_TSTAMP_ALIGNED;
+   } else if (csk->opt2 & TSTAMPS_EN_F) {
+   csk->opt2 &= ~TSTAMPS_EN_F;
+   csk->mtu_idx = TCPOPT_MSS_G(opt);
+   }
+}
+
+static void chtls_purge_rcv_queue(struct sock *sk)
+{
+   struct sk_buff *skb;
+
+   while ((skb = __skb_dequeue(&sk->sk_receive_queue)) != NULL) {
+   skb_dst_set(skb, (void *)NULL);
+   kfree_skb(skb);
+   }
+}
+
+static void chtls_purge_write_queue(struct sock *sk)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct sk_buff *skb;
+
+   while ((skb = __skb_dequeue(&csk->txq))) {
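
To make the new_state[] table above easier to read, here is a stand-alone sketch of how such a table is typically decoded: the low four bits carry the next state and a high bit flags that a FIN must be sent. The state numbering follows Linux's tcp_states.h as I recall it, TCP_ACTION_FIN is taken to be 1 << 7, and close_transition() is a hypothetical helper, not a function from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* State numbering as in include/net/tcp_states.h (assumed here). */
enum {
	TCP_ESTABLISHED = 1, TCP_SYN_SENT, TCP_SYN_RECV, TCP_FIN_WAIT1,
	TCP_FIN_WAIT2, TCP_TIME_WAIT, TCP_CLOSE, TCP_CLOSE_WAIT,
	TCP_LAST_ACK, TCP_LISTEN, TCP_CLOSING
};

#define TCP_STATE_MASK 0xF
#define TCP_ACTION_FIN (1 << 7)  /* assumed value */

static const unsigned char new_state[16] = {
	[0]               = TCP_CLOSE,
	[TCP_ESTABLISHED] = TCP_FIN_WAIT1 | TCP_ACTION_FIN,
	[TCP_SYN_SENT]    = TCP_SYN_SENT,
	[TCP_SYN_RECV]    = TCP_FIN_WAIT1 | TCP_ACTION_FIN,
	[TCP_FIN_WAIT1]   = TCP_FIN_WAIT1,
	[TCP_FIN_WAIT2]   = TCP_FIN_WAIT2,
	[TCP_TIME_WAIT]   = TCP_CLOSE,
	[TCP_CLOSE]       = TCP_CLOSE,
	[TCP_CLOSE_WAIT]  = TCP_LAST_ACK | TCP_ACTION_FIN,
	[TCP_LAST_ACK]    = TCP_LAST_ACK,
	[TCP_LISTEN]      = TCP_CLOSE,
	[TCP_CLOSING]     = TCP_CLOSING,
};

/* Look up the state to enter on close; *send_fin is set when the
 * table entry carries the FIN action bit. */
static int close_transition(int cur, bool *send_fin)
{
	assert(send_fin);
	unsigned char entry = new_state[cur & TCP_STATE_MASK];

	*send_fin = (entry & TCP_ACTION_FIN) != 0;
	return entry & TCP_STATE_MASK;
}
```

As the comment in the patch says, SYN_SENT maps to itself with no FIN: the connection is allowed to establish and is aborted afterwards.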

[Crypto v6 07/12] chcr: Key Macro

2018-02-18 Thread Atul Gupta
Define macro for TLS Key context

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/chcr_algo.h | 42 +
 drivers/crypto/chelsio/chcr_core.h | 55 +-
 2 files changed, 96 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.h 
b/drivers/crypto/chelsio/chcr_algo.h
index d1673a5..f263cd4 100644
--- a/drivers/crypto/chelsio/chcr_algo.h
+++ b/drivers/crypto/chelsio/chcr_algo.h
@@ -86,6 +86,39 @@
 KEY_CONTEXT_OPAD_PRESENT_M)
 #define KEY_CONTEXT_OPAD_PRESENT_F  KEY_CONTEXT_OPAD_PRESENT_V(1U)
 
+#define TLS_KEYCTX_RXFLIT_CNT_S 24
+#define TLS_KEYCTX_RXFLIT_CNT_V(x) ((x) << TLS_KEYCTX_RXFLIT_CNT_S)
+
+#define TLS_KEYCTX_RXPROT_VER_S 20
+#define TLS_KEYCTX_RXPROT_VER_M 0xf
+#define TLS_KEYCTX_RXPROT_VER_V(x) ((x) << TLS_KEYCTX_RXPROT_VER_S)
+
+#define TLS_KEYCTX_RXCIPH_MODE_S 16
+#define TLS_KEYCTX_RXCIPH_MODE_M 0xf
+#define TLS_KEYCTX_RXCIPH_MODE_V(x) ((x) << TLS_KEYCTX_RXCIPH_MODE_S)
+
+#define TLS_KEYCTX_RXAUTH_MODE_S 12
+#define TLS_KEYCTX_RXAUTH_MODE_M 0xf
+#define TLS_KEYCTX_RXAUTH_MODE_V(x) ((x) << TLS_KEYCTX_RXAUTH_MODE_S)
+
+#define TLS_KEYCTX_RXCIAU_CTRL_S 11
+#define TLS_KEYCTX_RXCIAU_CTRL_V(x) ((x) << TLS_KEYCTX_RXCIAU_CTRL_S)
+
+#define TLS_KEYCTX_RX_SEQCTR_S 9
+#define TLS_KEYCTX_RX_SEQCTR_M 0x3
+#define TLS_KEYCTX_RX_SEQCTR_V(x) ((x) << TLS_KEYCTX_RX_SEQCTR_S)
+
+#define TLS_KEYCTX_RX_VALID_S 8
+#define TLS_KEYCTX_RX_VALID_V(x) ((x) << TLS_KEYCTX_RX_VALID_S)
+
+#define TLS_KEYCTX_RXCK_SIZE_S 3
+#define TLS_KEYCTX_RXCK_SIZE_M 0x7
+#define TLS_KEYCTX_RXCK_SIZE_V(x) ((x) << TLS_KEYCTX_RXCK_SIZE_S)
+
+#define TLS_KEYCTX_RXMK_SIZE_S 0
+#define TLS_KEYCTX_RXMK_SIZE_M 0x7
+#define TLS_KEYCTX_RXMK_SIZE_V(x) ((x) << TLS_KEYCTX_RXMK_SIZE_S)
+
 #define CHCR_HASH_MAX_DIGEST_SIZE 64
 #define CHCR_MAX_SHA_DIGEST_SIZE 64
 
@@ -176,6 +209,15 @@
  KEY_CONTEXT_SALT_PRESENT_V(1) | \
  KEY_CONTEXT_CTX_LEN_V((ctx_len)))
 
+#define  FILL_KEY_CRX_HDR(ck_size, mk_size, d_ck, opad, ctx_len) \
+   htonl(TLS_KEYCTX_RXMK_SIZE_V(mk_size) | \
+ TLS_KEYCTX_RXCK_SIZE_V(ck_size) | \
+ TLS_KEYCTX_RX_VALID_V(1) | \
+ TLS_KEYCTX_RX_SEQCTR_V(3) | \
+ TLS_KEYCTX_RXAUTH_MODE_V(4) | \
+ TLS_KEYCTX_RXCIPH_MODE_V(2) | \
+ TLS_KEYCTX_RXFLIT_CNT_V((ctx_len)))
+
 #define FILL_WR_OP_CCTX_SIZE \
htonl( \
FW_CRYPTO_LOOKASIDE_WR_OPCODE_V( \
diff --git a/drivers/crypto/chelsio/chcr_core.h 
b/drivers/crypto/chelsio/chcr_core.h
index 3c29ee0..77056a9 100644
--- a/drivers/crypto/chelsio/chcr_core.h
+++ b/drivers/crypto/chelsio/chcr_core.h
@@ -65,10 +65,58 @@
 struct _key_ctx {
__be32 ctx_hdr;
u8 salt[MAX_SALT];
-   __be64 reserverd;
+   __be64 iv_to_auth;
unsigned char key[0];
 };
 
+#define KEYCTX_TX_WR_IV_S  55
+#define KEYCTX_TX_WR_IV_M  0x1ffULL
+#define KEYCTX_TX_WR_IV_V(x) ((x) << KEYCTX_TX_WR_IV_S)
+#define KEYCTX_TX_WR_IV_G(x) \
+   (((x) >> KEYCTX_TX_WR_IV_S) & KEYCTX_TX_WR_IV_M)
+
+#define KEYCTX_TX_WR_AAD_S 47
+#define KEYCTX_TX_WR_AAD_M 0xffULL
+#define KEYCTX_TX_WR_AAD_V(x) ((x) << KEYCTX_TX_WR_AAD_S)
+#define KEYCTX_TX_WR_AAD_G(x) (((x) >> KEYCTX_TX_WR_AAD_S) & \
+   KEYCTX_TX_WR_AAD_M)
+
+#define KEYCTX_TX_WR_AADST_S 39
+#define KEYCTX_TX_WR_AADST_M 0xffULL
+#define KEYCTX_TX_WR_AADST_V(x) ((x) << KEYCTX_TX_WR_AADST_S)
+#define KEYCTX_TX_WR_AADST_G(x) \
+   (((x) >> KEYCTX_TX_WR_AADST_S) & KEYCTX_TX_WR_AADST_M)
+
+#define KEYCTX_TX_WR_CIPHER_S 30
+#define KEYCTX_TX_WR_CIPHER_M 0x1ffULL
+#define KEYCTX_TX_WR_CIPHER_V(x) ((x) << KEYCTX_TX_WR_CIPHER_S)
+#define KEYCTX_TX_WR_CIPHER_G(x) \
+   (((x) >> KEYCTX_TX_WR_CIPHER_S) & KEYCTX_TX_WR_CIPHER_M)
+
+#define KEYCTX_TX_WR_CIPHERST_S 23
+#define KEYCTX_TX_WR_CIPHERST_M 0x7f
+#define KEYCTX_TX_WR_CIPHERST_V(x) ((x) << KEYCTX_TX_WR_CIPHERST_S)
+#define KEYCTX_TX_WR_CIPHERST_G(x) \
+   (((x) >> KEYCTX_TX_WR_CIPHERST_S) & KEYCTX_TX_WR_CIPHERST_M)
+
+#define KEYCTX_TX_WR_AUTH_S 14
+#define KEYCTX_TX_WR_AUTH_M 0x1ff
+#define KEYCTX_TX_WR_AUTH_V(x) ((x) << KEYCTX_TX_WR_AUTH_S)
+#define KEYCTX_TX_WR_AUTH_G(x) \
+   (((x) >> KEYCTX_TX_WR_AUTH_S) & KEYCTX_TX_WR_AUTH_M)
+
+#define KEYCTX_TX_WR_AUTHST_S 7
+#define KEYCTX_TX_WR_AUTHST_M 0x7f
+#define KEYCTX_TX_WR_AUTHST_V(x) ((x) << KEYCTX_TX_WR_AUTHST_S)
+#define KEYCTX_TX_WR_AUTHST_G(x) \
+   (((x) >> KEYCTX_TX_WR_AUTHST_S) & KEYCTX_TX_WR_AUTHST_M)
+
+#define KEYCTX_TX_WR_AUTHIN_S 0
+#define KEYCTX_TX_WR_AUTHIN_M 0x7f
+#define KEYCTX_TX_WR_AUTHIN_V(x) ((x) << KEYCTX_TX_WR_AUTHIN_S)
+#define KEYCTX_TX_WR_AUTHIN_G(x) \
+   (((x) >> KEYCTX_TX_WR_AUTHIN_S) & KEYCTX_TX_WR_AUTHIN_M)
+
 struct chcr_wr {
struct fw_crypto_lookaside_wr wreq;
struct ulp_txpkt ulptx;
@@ -90,6 +138,11 @@ struct uld_ctx {
struct chcr_dev *dev;
 };
 
+struct sge_opaq
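
The macros above follow the usual cxgb4 naming pattern: _S is the shift, _M the mask, _V(x) places a value and _G(x) extracts it. A minimal stand-alone sketch of that pattern, using two of the 64-bit fields and a host-order version of the Rx header from FILL_KEY_CRX_HDR. The uint64_t cast inside _V() and the rx_key_hdr() helper are my additions for illustration; the real macro also applies htonl().

```c
#include <assert.h>
#include <stdint.h>

#define KEYCTX_TX_WR_IV_S   55
#define KEYCTX_TX_WR_IV_M   0x1ffULL
#define KEYCTX_TX_WR_IV_V(x) ((uint64_t)(x) << KEYCTX_TX_WR_IV_S)
#define KEYCTX_TX_WR_IV_G(x) \
	(((x) >> KEYCTX_TX_WR_IV_S) & KEYCTX_TX_WR_IV_M)

#define KEYCTX_TX_WR_AAD_S  47
#define KEYCTX_TX_WR_AAD_M  0xffULL
#define KEYCTX_TX_WR_AAD_V(x) ((uint64_t)(x) << KEYCTX_TX_WR_AAD_S)
#define KEYCTX_TX_WR_AAD_G(x) \
	(((x) >> KEYCTX_TX_WR_AAD_S) & KEYCTX_TX_WR_AAD_M)

/* Host-order Rx key-context header with the same fixed fields as
 * FILL_KEY_CRX_HDR: valid = 1, seqctr = 3, auth mode = 4, cipher mode = 2. */
static uint32_t rx_key_hdr(uint32_t ck_size, uint32_t mk_size, uint32_t ctx_len)
{
	assert(ck_size <= 0x7 && mk_size <= 0x7 && ctx_len <= 0xff);
	return (mk_size << 0) |   /* TLS_KEYCTX_RXMK_SIZE   */
	       (ck_size << 3) |   /* TLS_KEYCTX_RXCK_SIZE   */
	       (1u << 8) |        /* TLS_KEYCTX_RX_VALID    */
	       (3u << 9) |        /* TLS_KEYCTX_RX_SEQCTR   */
	       (4u << 12) |       /* TLS_KEYCTX_RXAUTH_MODE */
	       (2u << 16) |       /* TLS_KEYCTX_RXCIPH_MODE */
	       (ctx_len << 24);   /* TLS_KEYCTX_RXFLIT_CNT  */
}
```

One thing the sketch makes visible: with shifts of 39 and above, the _V() argument has to be a 64-bit value, otherwise the shift overflows a plain int.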

[Crypto v6 06/12] cxgb4: LLD driver changes to enable TLS

2018-02-18 Thread Atul Gupta
Read FW capability. Read key area size. Dump the TLS record count.

Signed-off-by: Atul Gupta 
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 32 +---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h  |  7 ++
 drivers/net/ethernet/chelsio/cxgb4/sge.c| 98 -
 3 files changed, 126 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 56bc626..ab5937e 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -4284,18 +4284,32 @@ static int adap_init0(struct adapter *adap)
adap->num_ofld_uld += 2;
}
if (caps_cmd.cryptocaps) {
-   /* Should query params here...TODO */
-   params[0] = FW_PARAM_PFVF(NCRYPTO_LOOKASIDE);
-   ret = t4_query_params(adap, adap->mbox, adap->pf, 0, 2,
- params, val);
-   if (ret < 0) {
-   if (ret != -EINVAL)
+   if (ntohs(caps_cmd.cryptocaps) &
+   FW_CAPS_CONFIG_CRYPTO_LOOKASIDE) {
+   params[0] = FW_PARAM_PFVF(NCRYPTO_LOOKASIDE);
+   ret = t4_query_params(adap, adap->mbox, adap->pf, 0,
+ 2, params, val);
+   if (ret < 0) {
+   if (ret != -EINVAL)
+   goto bye;
+   } else {
+   adap->vres.ncrypto_fc = val[0];
+   }
+   adap->num_ofld_uld += 1;
+   }
+   if (ntohs(caps_cmd.cryptocaps) &
+   FW_CAPS_CONFIG_TLS_INLINE) {
+   params[0] = FW_PARAM_PFVF(TLS_START);
+   params[1] = FW_PARAM_PFVF(TLS_END);
+   ret = t4_query_params(adap, adap->mbox, adap->pf, 0,
+ 2, params, val);
+   if (ret < 0)
goto bye;
-   } else {
-   adap->vres.ncrypto_fc = val[0];
+   adap->vres.key.start = val[0];
+   adap->vres.key.size = val[1] - val[0] + 1;
+   adap->num_uld += 1;
}
adap->params.crypto = ntohs(caps_cmd.cryptocaps);
-   adap->num_uld += 1;
}
 #undef FW_PARAM_PFVF
 #undef FW_PARAM_DEV
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
index a14e8db..3d3ef3f 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h
@@ -237,6 +237,7 @@ enum cxgb4_uld {
CXGB4_ULD_ISCSI,
CXGB4_ULD_ISCSIT,
CXGB4_ULD_CRYPTO,
+   CXGB4_ULD_TLS,
CXGB4_ULD_MAX
 };
 
@@ -287,6 +288,7 @@ struct cxgb4_virt_res {  /* virtualized 
HW resources */
struct cxgb4_range qp;
struct cxgb4_range cq;
struct cxgb4_range ocq;
+   struct cxgb4_range key;
unsigned int ncrypto_fc;
 };
 
@@ -298,6 +300,9 @@ struct chcr_stats_debug {
atomic_t error;
atomic_t fallback;
atomic_t ipsec_cnt;
+   atomic_t tls_pdu_tx;
+   atomic_t tls_pdu_rx;
+   atomic_t tls_key;
 };
 
 #define OCQ_WIN_OFFSET(pdev, vres) \
@@ -378,6 +383,8 @@ struct cxgb4_uld_info {
 int cxgb4_register_uld(enum cxgb4_uld type, const struct cxgb4_uld_info *p);
 int cxgb4_unregister_uld(enum cxgb4_uld type);
 int cxgb4_ofld_send(struct net_device *dev, struct sk_buff *skb);
+int cxgb4_immdata_send(struct net_device *dev, unsigned int idx,
+  const void *src, unsigned int len);
 int cxgb4_crypto_send(struct net_device *dev, struct sk_buff *skb);
 unsigned int cxgb4_dbfifo_count(const struct net_device *dev, int lpfifo);
 unsigned int cxgb4_port_chan(const struct net_device *dev);
diff --git a/drivers/net/ethernet/chelsio/cxgb4/sge.c 
b/drivers/net/ethernet/chelsio/cxgb4/sge.c
index 6e310a0..32e3779 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/sge.c
@@ -1740,9 +1740,9 @@ static void txq_stop_maperr(struct sge_uld_txq *q)
  * Stops an offload Tx queue that has become full and modifies the packet
  * being written to request a wakeup.
  */
-static void ofldtxq_stop(struct sge_uld_txq *q, struct sk_buff *skb)
+static void ofldtxq_stop(struct sge_uld_txq *q, void *src)
 {
-   struct fw_wr_hdr *wr = (struct fw_wr_hdr *)skb->data;
+   struct fw_wr_hdr *wr = (struct fw_wr_hdr *)src;
 
wr->lo |= htonl(FW_WR_EQUEQ_F | FW_WR_EQUIQ_F);
q->q.stops++;
@@ -2005,6 +2005,100 @@ int cxgb4_ofld_send(struct net_device *dev, struct 
sk_buff *skb)
 }
 EXPORT_SYMBOL(cxgb4_ofld_send);
 
+static void *inline_tx_header(const void *src,
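
A short sketch of the bookkeeping added to adap_init0() above: the firmware returns inclusive start/end bounds for the TLS key area, and each crypto capability bit bumps a different ULD counter. The flag values and the struct and helper names below are hypothetical; the real FW_CAPS_CONFIG_* constants come from the firmware API header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define CAP_CRYPTO_LOOKASIDE 0x1u
#define CAP_TLS_INLINE       0x2u

struct key_range { uint32_t start, size; };
struct uld_counts { unsigned int num_uld, num_ofld_uld; };

/* Inclusive [first, last] bounds give a size of last - first + 1. */
static struct key_range range_from_bounds(uint32_t first, uint32_t last)
{
	assert(last >= first);
	struct key_range r = { first, last - first + 1 };
	return r;
}

/* Lookaside crypto counts as an offload ULD, inline TLS as a plain ULD,
 * matching the two counters the patch increments. */
static void count_ulds(uint16_t caps, struct uld_counts *c)
{
	assert(c);
	if (caps & CAP_CRYPTO_LOOKASIDE)
		c->num_ofld_uld += 1;
	if (caps & CAP_TLS_INLINE)
		c->num_uld += 1;
}
```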

[Crypto v6 11/12] chtls: Register chtls Inline TLS with net tls

2018-02-18 Thread Atul Gupta
Register chtls as an Inline TLS driver; chtls is a ULD to cxgb4.
Use setsockopt to program the (tx/rx) keys on the chip. Support
AES GCM with a key size of 128. Support both Inline Rx and Tx.

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/chtls/chtls_main.c | 600 ++
 include/uapi/linux/tls.h  |   1 +
 2 files changed, 601 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_main.c

diff --git a/drivers/crypto/chelsio/chtls/chtls_main.c 
b/drivers/crypto/chelsio/chtls/chtls_main.c
new file mode 100644
index 0000000..657c515
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls_main.c
@@ -0,0 +1,600 @@
+/*
+ * Copyright (c) 2017 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Written by: Atul Gupta (atul.gu...@chelsio.com)
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "chtls.h"
+#include "chtls_cm.h"
+
+#define DRV_NAME "chtls"
+
+/*
+ * chtls device management
+ * maintains a list of the chtls devices
+ */
+static LIST_HEAD(cdev_list);
+static DEFINE_MUTEX(cdev_mutex);
+static DEFINE_MUTEX(cdev_list_lock);
+
+static struct proto chtls_cpl_prot;
+static struct proto chtls_base_prot;
+static DEFINE_MUTEX(notify_mutex);
+static RAW_NOTIFIER_HEAD(listen_notify_list);
+struct request_sock_ops chtls_rsk_ops;
+static uint send_page_order = (14 - PAGE_SHIFT < 0) ? 0 : 14 - PAGE_SHIFT;
+
+static int register_listen_notifier(struct notifier_block *nb)
+{
+   int err;
+
+   mutex_lock(&notify_mutex);
+   err = raw_notifier_chain_register(&listen_notify_list, nb);
+   mutex_unlock(&notify_mutex);
+   return err;
+}
+
+static int unregister_listen_notifier(struct notifier_block *nb)
+{
+   int err;
+
+   mutex_lock(&notify_mutex);
+   err = raw_notifier_chain_unregister(&listen_notify_list, nb);
+   mutex_unlock(&notify_mutex);
+   return err;
+}
+
+static int listen_notify_handler(struct notifier_block *this,
+unsigned long event, void *data)
+{
+   struct sock *sk = data;
+   struct chtls_dev *cdev;
+   int ret =  NOTIFY_DONE;
+
+   switch (event) {
+   case CHTLS_LISTEN_START:
+   case CHTLS_LISTEN_STOP:
+   mutex_lock(&cdev_list_lock);
+   list_for_each_entry(cdev, &cdev_list, list) {
+   if (event == CHTLS_LISTEN_START)
+   ret = chtls_listen_start(cdev, sk);
+   else
+   chtls_listen_stop(cdev, sk);
+   }
+   mutex_unlock(&cdev_list_lock);
+   break;
+   }
+   return ret;
+}
+
+static struct notifier_block listen_notifier = {
+   .notifier_call = listen_notify_handler
+};
+
+static int listen_backlog_rcv(struct sock *sk, struct sk_buff *skb)
+{
+   if (likely(skb_transport_header(skb) != skb_network_header(skb)))
+   return tcp_v4_do_rcv(sk, skb);
+   BLOG_SKB_CB(skb)->backlog_rcv(sk, skb);
+   return 0;
+}
+
+static int chtls_start_listen(struct sock *sk)
+{
+   int err;
+
+   if (sk->sk_protocol != IPPROTO_TCP)
+   return -EPROTONOSUPPORT;
+
+   if (sk->sk_family == PF_INET &&
+   LOOPBACK(inet_sk(sk)->inet_rcv_saddr))
+   return -EADDRNOTAVAIL;
+
+   sk->sk_backlog_rcv = listen_backlog_rcv;
+   mutex_lock(&notify_mutex);
+   err = raw_notifier_call_chain(&listen_notify_list,
+ CHTLS_LISTEN_START, sk);
+   mutex_unlock(&notify_mutex);
+   return err;
+}
+
+static int chtls_hash(struct sock *sk)
+{
+   int err;
+
+   err = tcp_prot.hash(sk);
+   if (sk->sk_state == TCP_LISTEN)
+   err |= chtls_start_listen(sk);
+
+   if (err)
+   tcp_prot.unhash(sk);
+   return err;
+}
+
+static int chtls_stop_listen(struct sock *sk)
+{
+   if (sk->sk_protocol != IPPROTO_TCP)
+   return -EPROTONOSUPPORT;
+
+   mutex_lock(&notify_mutex);
+   raw_notifier_call_chain(&listen_notify_list,
+   CHTLS_LISTEN_STOP, sk);
+   mutex_unlock(&notify_mutex);
+   return 0;
+}
+
+static void chtls_unhash(struct sock *sk)
+{
+   if (sk->sk_state == TCP_LISTEN)
+   chtls_stop_listen(sk);
+   tcp_prot.unhash(sk);
+}
+
+static int chtls_netdev(struct tls_device *dev,
+   struct net_device *netdev)
+{
+   struct chtls_dev *cdev = to_chtls_dev(dev);
+   int i;
+
+   for (i = 0; i < cdev->lldi->nports; i++)
+   if (cdev->ports[i] == netdev)
+   return 1;
+
+   return 0;
+}
+
+static int chtls_inline_feature(struct tls_device *dev)
+{
+   struct chtls_dev *cdev = to_chtls_dev(dev);
+  

[Crypto v6 08/12] chtls: Key program

2018-02-18 Thread Atul Gupta
Program the tx and rx key on chip.

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/chtls/chtls_hw.c | 394 
 1 file changed, 394 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_hw.c

diff --git a/drivers/crypto/chelsio/chtls/chtls_hw.c 
b/drivers/crypto/chelsio/chtls/chtls_hw.c
new file mode 100644
index 000..c3e17159
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls_hw.c
@@ -0,0 +1,394 @@
+/*
+ * Copyright (c) 2017 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Written by: Atul Gupta (atul.gu...@chelsio.com)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "chtls.h"
+#include "chtls_cm.h"
+
+static void __set_tcb_field_direct(struct chtls_sock *csk,
+  struct cpl_set_tcb_field *req, u16 word,
+  u64 mask, u64 val, u8 cookie, int no_reply)
+{
+   struct ulptx_idata *sc;
+
+   INIT_TP_WR_CPL(req, CPL_SET_TCB_FIELD, csk->tid);
+   req->wr.wr_mid |= htonl(FW_WR_FLOWID_V(csk->tid));
+   req->reply_ctrl = htons(NO_REPLY_V(no_reply) |
+   QUEUENO_V(csk->rss_qid));
+   req->word_cookie = htons(TCB_WORD_V(word) | TCB_COOKIE_V(cookie));
+   req->mask = cpu_to_be64(mask);
+   req->val = cpu_to_be64(val);
+   sc = (struct ulptx_idata *)(req + 1);
+   sc->cmd_more = htonl(ULPTX_CMD_V(ULP_TX_SC_NOOP));
+   sc->len = htonl(0);
+}
+
+static void __set_tcb_field(struct sock *sk, struct sk_buff *skb, u16 word,
+   u64 mask, u64 val, u8 cookie, int no_reply)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct cpl_set_tcb_field *req;
+   struct ulptx_idata *sc;
+   unsigned int wrlen = roundup(sizeof(*req) + sizeof(*sc), 16);
+
+   req = (struct cpl_set_tcb_field *)__skb_put(skb, wrlen);
+   __set_tcb_field_direct(csk, req, word, mask, val, cookie, no_reply);
+   set_wr_txq(skb, CPL_PRIORITY_CONTROL, csk->port_id);
+}
+
+static int chtls_set_tcb_field(struct sock *sk, u16 word, u64 mask, u64 val)
+{
+   struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
+   struct sk_buff *skb;
+   struct cpl_set_tcb_field *req;
+   struct ulptx_idata *sc;
+   unsigned int wrlen = roundup(sizeof(*req) + sizeof(*sc), 16);
+   unsigned int credits_needed = DIV_ROUND_UP(wrlen, 16);
+
+   skb = alloc_skb(wrlen, GFP_ATOMIC);
+   if (!skb)
+   return -ENOMEM;
+
+   __set_tcb_field(sk, skb, word, mask, val, 0, 1);
+   set_queue(skb, (csk->txq_idx << 1) | CPL_PRIORITY_DATA, sk);
+   csk->wr_credits -= credits_needed;
+   csk->wr_unacked += credits_needed;
+   enqueue_wr(csk, skb);
+   cxgb4_ofld_send(csk->egress_dev, skb);
+   return 0;
+}
+
+/*
+ * Set one of the t_flags bits in the TCB.
+ */
+int chtls_set_tcb_tflag(struct sock *sk, unsigned int bit_pos, int val)
+{
+   return chtls_set_tcb_field(sk, 1, 1ULL << bit_pos,
+   val << bit_pos);
+}
+
+static int chtls_set_tcb_keyid(struct sock *sk, int keyid)
+{
+   return chtls_set_tcb_field(sk, 31, 0xULL, keyid);
+}
+
+static int chtls_set_tcb_seqno(struct sock *sk)
+{
+   return chtls_set_tcb_field(sk, 28, ~0ULL, 0);
+}
+
+static int chtls_set_tcb_quiesce(struct sock *sk, int val)
+{
+   return chtls_set_tcb_field(sk, 1, (1ULL << TF_RX_QUIESCE_S),
+  TF_RX_QUIESCE_V(val));
+}
+
+static void *chtls_alloc_mem(unsigned long size)
+{
+   void *p = kmalloc(size, GFP_KERNEL);
+
+   if (!p)
+   p = vmalloc(size);
+   if (p)
+   memset(p, 0, size);
+   return p;
+}
+
+static void chtls_free_mem(void *addr)
+{
+   unsigned long p = (unsigned long)addr;
+
+   if (p >= VMALLOC_START && p < VMALLOC_END)
+   vfree(addr);
+   else
+   kfree(addr);
+}
+
+/* TLS Key bitmap processing */
+int chtls_init_kmap(struct chtls_dev *cdev, struct cxgb4_lld_info *lldi)
+{
+   unsigned int num_key_ctx, bsize;
+
+   num_key_ctx = (lldi->vr->key.size / TLS_KEY_CONTEXT_SZ);
+   bsize = BITS_TO_LONGS(num_key_ctx);
+
+   cdev->kmap.size = num_key_ctx;
+   cdev->kmap.available = bsize;
+   cdev->kmap.addr = chtls_alloc_mem(sizeof(*cdev->kmap.addr) *
+ bsize);
+   if (!cdev->kmap.addr)
+   return -1;
+
+   cdev->kmap.start = lldi->vr->key.start;
+   spin_lock_init(&cdev->kmap.lock);
+   return 0;
+}
+
+void chtls_free_kmap(struct chtls_dev *cdev)
+{
+   if (cdev->kmap.addr)
+   chtls_free_mem(cdev->kmap.addr);
+}
+
+static int get_new_keyid(struct chtls

[Crypto v6 04/12] chtls: structure and macro definiton

2018-02-18 Thread Atul Gupta
Inline TLS state, connection management. Supporting macros definition.

Signed-off-by: Atul Gupta 
---
 drivers/crypto/chelsio/chtls/chtls.h| 487 
 drivers/crypto/chelsio/chtls/chtls_cm.h | 202 +
 2 files changed, 689 insertions(+)
 create mode 100644 drivers/crypto/chelsio/chtls/chtls.h
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_cm.h

diff --git a/drivers/crypto/chelsio/chtls/chtls.h 
b/drivers/crypto/chelsio/chtls/chtls.h
new file mode 100644
index 000..3ae7145
--- /dev/null
+++ b/drivers/crypto/chelsio/chtls/chtls.h
@@ -0,0 +1,487 @@
+/*
+ * Copyright (c) 2016 Chelsio Communications, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef __CHTLS_H__
+#define __CHTLS_H__
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "t4fw_api.h"
+#include "t4_msg.h"
+#include "cxgb4.h"
+#include "cxgb4_uld.h"
+#include "l2t.h"
+#include "chcr_algo.h"
+#include "chcr_core.h"
+#include "chcr_crypto.h"
+
+#define CIPHER_BLOCK_SIZE       16
+#define MAX_IVS_PAGE            256
+#define TLS_KEY_CONTEXT_SZ      64
+#define TLS_HEADER_LENGTH       5
+#define SCMD_CIPH_MODE_AES_GCM  2
+#define GCM_TAG_SIZE            16
+#define AEAD_EXPLICIT_DATA_SIZE 8
+/* Any MFS size should work and come from openssl */
+#define TLS_MFS                 16384
+
+#define SOCK_INLINE (31)
+#define RSS_HDR sizeof(struct rss_header)
+
+enum {
+   CHTLS_KEY_CONTEXT_DSGL,
+   CHTLS_KEY_CONTEXT_IMM,
+   CHTLS_KEY_CONTEXT_DDR,
+};
+
+enum {
+   CHTLS_LISTEN_START,
+   CHTLS_LISTEN_STOP,
+};
+
+/* Flags for return value of CPL message handlers */
+enum {
+   CPL_RET_BUF_DONE = 1,   /* buffer processing done */
+   CPL_RET_BAD_MSG = 2,    /* bad CPL message */
+   CPL_RET_UNKNOWN_TID = 4 /* unexpected unknown TID */
+};
+
+#define TLS_RCV_ST_READ_HEADER  0xF0
+#define TLS_RCV_ST_READ_BODY    0xF1
+#define TLS_RCV_ST_READ_DONE    0xF2
+#define TLS_RCV_ST_READ_NB      0xF3
+
+#define RSPQ_HASH_BITS 5
+#define LISTEN_INFO_HASH_SIZE 32
+struct listen_info {
+   struct listen_info *next;  /* Link to next entry */
+   struct sock *sk;   /* The listening socket */
+   unsigned int stid; /* The server TID */
+};
+
+enum {
+   T4_LISTEN_START_PENDING,
+   T4_LISTEN_STARTED
+};
+
+enum csk_flags {
+   CSK_CALLBACKS_CHKD,      /* socket callbacks have been sanitized */
+   CSK_ABORT_REQ_RCVD,      /* received one ABORT_REQ_RSS message */
+   CSK_TX_MORE_DATA,        /* sending ULP data; don't set SHOVE bit */
+   CSK_TX_WAIT_IDLE,        /* suspend Tx until in-flight data is ACKed */
+   CSK_ABORT_SHUTDOWN,      /* shouldn't send more abort requests */
+   CSK_ABORT_RPL_PENDING,   /* expecting an abort reply */
+   CSK_CLOSE_CON_REQUESTED, /* we've sent a close_conn_req */
+   CSK_TX_DATA_SENT,        /* sent a TX_DATA WR on this connection */
+   CSK_TX_FAILOVER,         /* Tx traffic failing over */
+   CSK_UPDATE_RCV_WND,      /* Need to update rcv window */
+   CSK_RST_ABORTED,         /* outgoing RST was aborted */
+   CSK_TLS_HANDSHK,         /* TLS Handshake */
+};
+
+struct listen_ctx {
+   struct sock *lsk;
+   struct chtls_dev *cdev;
+   u32 state;
+};
+
+struct key_map {
+   unsigned long *addr;
+   unsigned int start;
+   unsigned int available;
+   unsigned int size;
+   spinlock_t lock; /* lock for key id request from map */
+} __packed;
+
+struct tls_scmd {
+   u32 seqno_numivs;
+   u32 ivgen_hdrlen;
+};
+
+struct chtls_dev {
+   struct tls_device tlsdev;
+   struct list_head list;
+   struct cxgb4_lld_info *lldi;
+   struct pci_dev *pdev;
+   struct listen_info *listen_hash_tab[LISTEN_INFO_HASH_SIZE];
+   spinlock_t listen_lock; /* lock for listen list */
+   struct net_device **ports;
+   struct tid_info *tids;
+   unsigned int pfvf;
+   const unsigned short *mtus;
+
+   spinlock_t aidr_lock ____cacheline_aligned_in_smp;
+   struct idr aidr; /* ATID id space */
+   struct idr hwtid_idr;
+   struct idr stid_idr;
+
+   spinlock_t idr_lock ____cacheline_aligned_in_smp;
+
+   struct net_device *egr_dev[NCHAN * 2];
+   struct sk_buff *rspq_skb_cache[1 << RSPQ_HASH_BITS];
+   struct sk_buff *askb;
+
+   struct sk_buff_head deferq;
+   struct work_struct deferq_task;
+
+   struct list_head list_node;
+   struct list_head rcu_node;
+   struct list_head na_node;
+   unsigned int send_page_order;
+   struct key_map kmap;
+};
+
+struct chtls_hws {
+   struct sk_buf

[Crypto v6 02/12] ethtool: enable Inline TLS in HW

2018-02-18 Thread Atul Gupta
Signed-off-by: Atul Gupta 
---
 include/linux/netdev_features.h | 2 ++
 net/core/ethtool.c  | 1 +
 2 files changed, 3 insertions(+)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index db84c51..aacabe2 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -79,6 +79,7 @@ enum {
NETIF_F_RX_UDP_TUNNEL_PORT_BIT, /* Offload of RX port for UDP tunnels */
 
NETIF_F_GRO_HW_BIT, /* Hardware Generic receive offload */
+   NETIF_F_HW_TLS_INLINE_BIT,  /* Offload TLS record */
 
/*
 * Add your fresh new feature above and remember to update
@@ -145,6 +146,7 @@ enum {
 #define NETIF_F_HW_ESP __NETIF_F(HW_ESP)
 #define NETIF_F_HW_ESP_TX_CSUM __NETIF_F(HW_ESP_TX_CSUM)
 #define NETIF_F_RX_UDP_TUNNEL_PORT  __NETIF_F(RX_UDP_TUNNEL_PORT)
+#define NETIF_F_HW_TLS_INLINE  __NETIF_F(HW_TLS_INLINE)
 
 #define for_each_netdev_feature(mask_addr, bit)\
for_each_set_bit(bit, (unsigned long *)mask_addr, NETDEV_FEATURE_COUNT)
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 494e6a5..ab16781 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -107,6 +107,7 @@ int ethtool_op_get_ts_info(struct net_device *dev, struct 
ethtool_ts_info *info)
[NETIF_F_HW_ESP_BIT] =   "esp-hw-offload",
[NETIF_F_HW_ESP_TX_CSUM_BIT] =   "esp-tx-csum-hw-offload",
[NETIF_F_RX_UDP_TUNNEL_PORT_BIT] =   "rx-udp_tunnel-port-offload",
+   [NETIF_F_HW_TLS_INLINE_BIT] =   "tls-inline",
 };
 
 static const char
-- 
1.8.3.1



[Crypto v6 05/12] cxgb4: Inline TLS FW Interface

2018-02-18 Thread Atul Gupta
Key area size in hw-config file. CPL struct for TLS request
and response. Work request for Inline TLS.

Signed-off-by: Atul Gupta 
---
 drivers/net/ethernet/chelsio/cxgb4/t4_msg.h   | 121 ++-
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h  |   2 +
 drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h | 165 +-
 3 files changed, 283 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_msg.h 
b/drivers/net/ethernet/chelsio/cxgb4/t4_msg.h
index d0db442..507cb5a 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_msg.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_msg.h
@@ -81,6 +81,7 @@ enum {
CPL_RX_ISCSI_CMP  = 0x45,
CPL_TRACE_PKT_T5  = 0x48,
CPL_RX_ISCSI_DDP  = 0x49,
+   CPL_RX_TLS_CMP= 0x4E,
 
CPL_RDMA_READ_REQ = 0x60,
 
@@ -88,6 +89,7 @@ enum {
CPL_ACT_OPEN_REQ6 = 0x83,
 
CPL_TX_TLS_PDU =0x88,
+   CPL_TX_TLS_SFO= 0x89,
CPL_TX_SEC_PDU= 0x8A,
CPL_TX_TLS_ACK= 0x8B,
 
@@ -97,6 +99,7 @@ enum {
CPL_RX_MPS_PKT= 0xAF,
 
CPL_TRACE_PKT = 0xB0,
+   CPL_TLS_DATA  = 0xB1,
CPL_ISCSI_DATA= 0xB2,
 
CPL_FW4_MSG   = 0xC0,
@@ -151,6 +154,7 @@ enum {
ULP_MODE_RDMA  = 4,
ULP_MODE_TCPDDP= 5,
ULP_MODE_FCOE  = 6,
+   ULP_MODE_TLS   = 8,
 };
 
 enum {
@@ -1415,6 +1419,14 @@ struct cpl_tx_data {
 #define TX_FORCE_S 13
 #define TX_FORCE_V(x)  ((x) << TX_FORCE_S)
 
+#define TX_SHOVE_S        14
+#define TX_SHOVE_V(x)     ((x) << TX_SHOVE_S)
+
+#define TX_ULP_MODE_S     10
+#define TX_ULP_MODE_M     0x7
+#define TX_ULP_MODE_V(x)  ((x) << TX_ULP_MODE_S)
+#define TX_ULP_MODE_G(x)  (((x) >> TX_ULP_MODE_S) & TX_ULP_MODE_M)
+
 #define T6_TX_FORCE_S  20
 #define T6_TX_FORCE_V(x)   ((x) << T6_TX_FORCE_S)
 #define T6_TX_FORCE_F  T6_TX_FORCE_V(1U)
@@ -1429,12 +1441,21 @@ enum {
ULP_TX_SC_NOOP = 0x80,
ULP_TX_SC_IMM  = 0x81,
ULP_TX_SC_DSGL = 0x82,
-   ULP_TX_SC_ISGL = 0x83
+   ULP_TX_SC_ISGL = 0x83,
+   ULP_TX_SC_MEMRD = 0x86
 };
 
 #define ULPTX_CMD_S    24
 #define ULPTX_CMD_V(x) ((x) << ULPTX_CMD_S)
 
+#define ULPTX_LEN16_S    0
+#define ULPTX_LEN16_M    0xFF
+#define ULPTX_LEN16_V(x) ((x) << ULPTX_LEN16_S)
+
+#define ULP_TX_SC_MORE_S 23
+#define ULP_TX_SC_MORE_V(x) ((x) << ULP_TX_SC_MORE_S)
+#define ULP_TX_SC_MORE_F  ULP_TX_SC_MORE_V(1U)
+
 struct ulptx_sge_pair {
__be32 len[2];
__be64 addr[2];
@@ -2112,4 +2133,102 @@ enum {
X_CPL_RX_MPS_PKT_TYPE_QFC   = 1 << 2,
X_CPL_RX_MPS_PKT_TYPE_PTP   = 1 << 3
 };
+
+struct cpl_tx_tls_sfo {
+   __be32 op_to_seg_len;
+   __be32 pld_len;
+   __be32 type_protover;
+   __be32 r1_lo;
+   __be32 seqno_numivs;
+   __be32 ivgen_hdrlen;
+   __be64 scmd1;
+};
+
+/* cpl_tx_tls_sfo macros */
+#define CPL_TX_TLS_SFO_OPCODE_S 24
+#define CPL_TX_TLS_SFO_OPCODE_V(x)  ((x) << CPL_TX_TLS_SFO_OPCODE_S)
+
+#define CPL_TX_TLS_SFO_DATA_TYPE_S  20
+#define CPL_TX_TLS_SFO_DATA_TYPE_V(x)   ((x) << CPL_TX_TLS_SFO_DATA_TYPE_S)
+
+#define CPL_TX_TLS_SFO_CPL_LEN_S    16
+#define CPL_TX_TLS_SFO_CPL_LEN_V(x) ((x) << CPL_TX_TLS_SFO_CPL_LEN_S)
+
+#define CPL_TX_TLS_SFO_SEG_LEN_S    0
+#define CPL_TX_TLS_SFO_SEG_LEN_M    0x
+#define CPL_TX_TLS_SFO_SEG_LEN_V(x) ((x) << CPL_TX_TLS_SFO_SEG_LEN_S)
+#define CPL_TX_TLS_SFO_SEG_LEN_G(x) \
+   (((x) >> CPL_TX_TLS_SFO_SEG_LEN_S) & CPL_TX_TLS_SFO_SEG_LEN_M)
+
+#define CPL_TX_TLS_SFO_TYPE_S   24
+#define CPL_TX_TLS_SFO_TYPE_M   0xff
+#define CPL_TX_TLS_SFO_TYPE_V(x)((x) << CPL_TX_TLS_SFO_TYPE_S)
+#define CPL_TX_TLS_SFO_TYPE_G(x)\
+   (((x) >> CPL_TX_TLS_SFO_TYPE_S) & CPL_TX_TLS_SFO_TYPE_M)
+
+#define CPL_TX_TLS_SFO_PROTOVER_S   8
+#define CPL_TX_TLS_SFO_PROTOVER_M   0x
+#define CPL_TX_TLS_SFO_PROTOVER_V(x)((x) << CPL_TX_TLS_SFO_PROTOVER_S)
+#define CPL_TX_TLS_SFO_PROTOVER_G(x)\
+   (((x) >> CPL_TX_TLS_SFO_PROTOVER_S) & CPL_TX_TLS_SFO_PROTOVER_M)
+
+struct cpl_tls_data {
+   struct rss_header rsshdr;
+   union opcode_tid ot;
+   __be32 length_pkd;
+   __be32 seq;
+   __be32 r1;
+};
+
+#define CPL_TLS_DATA_OPCODE_S   24
+#define CPL_TLS_DATA_OPCODE_M   0xff
+#define CPL_TLS_DATA_OPCODE_V(x)((x) << CPL_TLS_DATA_OPCODE_S)
+#define CPL_TLS_DATA_OPCODE_G(x)\
+   (((x) >> CPL_TLS_DATA_OPCODE_S) & CPL_TLS_DATA_OPCODE_M)
+
+#define CPL_TLS_DATA_TID_S  0
+#define CPL_TLS_DATA_TID_M  0xff
+#define CPL_TLS_DATA_TID_V(x)   ((x) << CPL_TLS_DATA_TID_S)
+#define CPL_TLS_DATA_TID_G(x)   \
+   (((x) >> CPL_TLS_DATA_TID_S) & CPL_TLS_DATA_TID_M)
+
+#define CPL_TLS_DATA_LENGTH_S   0
+#define CPL_TLS_DATA_LENGTH_M   0x
+#define CPL_TLS_DATA_LENGTH_V(x)((x) 

[Crypto v6 01/12] tls: tls_device struct to register TLS drivers

2018-02-18 Thread Atul Gupta
tls_device structure to register Inline TLS drivers
with net/tls

Signed-off-by: Atul Gupta 
---
 include/net/tls.h | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/include/net/tls.h b/include/net/tls.h
index 4913430..e315bf9 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -55,6 +55,27 @@
 #define TLS_RECORD_TYPE_DATA   0x17
 
 #define TLS_AAD_SPACE_SIZE 13
+#define TLS_DEVICE_NAME_MAX     32
+
+enum {
+   TLS_BASE_TX,
+   TLS_SW_TX,
+   TLS_FULL_HW, /* TLS record processed Inline */
+   TLS_NUM_CONFIG,
+};
+extern struct proto tls_prots[TLS_NUM_CONFIG];
+
+struct tls_device {
+   char name[TLS_DEVICE_NAME_MAX];
+   struct list_head dev_list;
+
+   /* netdev present in registered inline tls driver */
+   int (*netdev)(struct tls_device *device,
+ struct net_device *netdev);
+   int (*feature)(struct tls_device *device);
+   int (*hash)(struct tls_device *device, struct sock *sk);
+   void (*unhash)(struct tls_device *device, struct sock *sk);
+};
 
 struct tls_sw_context {
struct crypto_aead *aead_send;
@@ -256,5 +277,7 @@ static inline struct tls_offload_context *tls_offload_ctx(
 
 int tls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
  unsigned char *record_type);
+void tls_register_device(struct tls_device *device);
+void tls_unregister_device(struct tls_device *device);
 
 #endif /* _TLS_OFFLOAD_H */
-- 
1.8.3.1



[Crypto v6 00/12] Chelsio Inline TLS

2018-02-18 Thread Atul Gupta
Series for Chelsio Inline TLS driver (chtls.ko)

Use the tls ULP infrastructure to register chtls as an Inline TLS driver.
chtls uses TCP sockets to transmit and receive TLS records. The TCP proto_ops 
are extended to offload TLS records.

T6 adapter provides the following features:
-TLS record offload, TLS header, encrypt, digest and transmit
-TLS record receive and decrypt
-TLS keys store
-TCP/IP engine
-TLS engine
-GCM crypto engine [support CBC also]

TLS provides security at the transport layer. It uses TCP to provide reliable 
end-to-end transport of application data and relies on TCP for any 
retransmission. A TLS session comprises three parts:
a. TCP/IP connection
b. TLS handshake
c. Record layer processing

The TLS handshake state machine is executed on the host (refer to a standard 
implementation, e.g. OpenSSL). setsockopt() with [SOL_TCP, TCP_ULP] initializes 
the TCP proto-ops for Chelsio inline TLS support:
setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls"));

Tx and Rx keys are decided during the handshake and programmed onto the chip 
after CCS is exchanged:
struct tls12_crypto_info_aes_gcm_128 crypto_info;
setsockopt(sock, SOL_TLS, TLS_TX, &crypto_info, sizeof(crypto_info));
Finished is the first message that is encrypted/decrypted inline on tx/rx.
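
For illustration only, the two setsockopt() calls described above could be
wrapped from user space roughly as follows. This is a minimal sketch and not
part of the series: the helper names fill_tx_crypto_info() and enable_tls_tx()
are made up, the key material is assumed to come from the handshake library,
and the fallback #defines are only there for older libc headers.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <linux/tls.h>

/* Fallbacks for older libc headers (values as used by Linux). */
#ifndef SOL_TCP
#define SOL_TCP 6
#endif
#ifndef TCP_ULP
#define TCP_ULP 31
#endif
#ifndef SOL_TLS
#define SOL_TLS 282
#endif

/* Hypothetical helper: copy handshake-derived key material into crypto_info. */
void fill_tx_crypto_info(struct tls12_crypto_info_aes_gcm_128 *ci,
			 const unsigned char *key, const unsigned char *salt,
			 const unsigned char *iv, const unsigned char *rec_seq)
{
	memset(ci, 0, sizeof(*ci));
	ci->info.version = TLS_1_2_VERSION;
	ci->info.cipher_type = TLS_CIPHER_AES_GCM_128;
	memcpy(ci->key, key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
	memcpy(ci->salt, salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
	memcpy(ci->iv, iv, TLS_CIPHER_AES_GCM_128_IV_SIZE);
	memcpy(ci->rec_seq, rec_seq, TLS_CIPHER_AES_GCM_128_REC_SEQ_SIZE);
}

/*
 * Hypothetical helper: attach the "tls" ULP, then program the Tx key.
 * Returns 0 on success, -1 with errno set on failure.
 */
int enable_tls_tx(int sock, const struct tls12_crypto_info_aes_gcm_128 *ci)
{
	if (setsockopt(sock, SOL_TCP, TCP_ULP, "tls", sizeof("tls")) < 0)
		return -1;
	return setsockopt(sock, SOL_TLS, TLS_TX, ci, sizeof(*ci));
}
```

The application would call enable_tls_tx() once CCS has been exchanged;
whether the record is then processed inline or in software is decided by the
kernel, not by this code.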

On the Tx path, the TLS engine receives plain text from openssl, inserts the 
IV, fetches the tx key, creates cipher text records and generates the MAC. The 
TLS header is added to the cipher text, which is forwarded to the TCP/IP engine 
for transport layer processing and transmission on the wire.
TX:
Application--openssl--chtls---TLS engine---encrypt/auth---TCP/IP engine---wire.

On the Rx side, received data is PDU aligned at record boundaries. The TLS 
engine processes only complete records. If the rx key is programmed on CCS 
receive, data is decrypted and plain text is posted to the host.
RX:
Wire--cipher-text--TCP/IP engine [PDU align]---TLS engine---decrypt/auth---plain-text--chtls--openssl--application

v6: modify prot only for FULL_HW
   -corrected commit message for patch 11

v5: set TLS_FULL_HW for registered inline tls drivers
   -set TLS_FULL_HW prot for offload connection else move
to TLS_SW_TX
   -Case handled for interface with same IP [Dave Miller]
   -Removed Specific IP and INADDR_ANY handling [v4]

v4: removed chtls ULP type, retained tls ULP
   -registered chtls with net tls
   -defined struct tls_device to register the Inline drivers
   -ethtool interface tls-inline to enable Inline TLS for interface
   -prot update to support inline TLS

v3: fixed the kbuild test issues
   -made a few functions static
   -initialized a few variables

v2: fixed the following based on the review comments of Stephan Mueller,
Stefano Brivio and Hannes Frederic
-Added more details in cover letter
-Fixed indentation and formatting issues
-Using aes instead of aes-generic
-memset key info after programming the key on chip
-reordered the patch sequence

Atul Gupta (12):
  tls: tls_device struct to register TLS drivers
  ethtool: enable Inline TLS in HW
  tls: support for inline tls
  chtls: structure and macro definiton
  cxgb4: Inline TLS FW Interface
  cxgb4: LLD driver changes to enable TLS
  chcr: Key Macro
  chtls: Key program
  chtls: CPL handler definition
  chtls: Inline crypto request Tx/Rx
  chtls: Register chtls Inline TLS with net tls
  Makefile Kconfig

 drivers/crypto/chelsio/Kconfig  |   11 +
 drivers/crypto/chelsio/Makefile |1 +
 drivers/crypto/chelsio/chcr_algo.h  |   42 +
 drivers/crypto/chelsio/chcr_core.h  |   55 +-
 drivers/crypto/chelsio/chtls/Makefile   |4 +
 drivers/crypto/chelsio/chtls/chtls.h|  487 ++
 drivers/crypto/chelsio/chtls/chtls_cm.c | 2041 +++
 drivers/crypto/chelsio/chtls/chtls_cm.h |  202 +++
 drivers/crypto/chelsio/chtls/chtls_hw.c |  394 +
 drivers/crypto/chelsio/chtls/chtls_io.c | 1867 +
 drivers/crypto/chelsio/chtls/chtls_main.c   |  600 +++
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |   32 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.h  |7 +
 drivers/net/ethernet/chelsio/cxgb4/sge.c|   98 +-
 drivers/net/ethernet/chelsio/cxgb4/t4_msg.h |  121 +-
 drivers/net/ethernet/chelsio/cxgb4/t4_regs.h|2 +
 drivers/net/ethernet/chelsio/cxgb4/t4fw_api.h   |  165 +-
 include/linux/netdev_features.h |2 +
 include/net/tls.h   |   23 +
 include/uapi/linux/tls.h|1 +
 net/core/ethtool.c  |1 +
 net/ipv4/tcp_minisocks.c|1 +
 net/tls/tls_main.c  |  123 +-
 23 files changed, 6256 insertions(+), 24 deletions(-)
 create mode 100644 drivers/crypto/chelsio/chtls/Makefile
 create mode 100644 drivers/crypto/chelsio/chtls/chtls.h
 create mode 100644 drivers/crypto/chelsio/chtls/chtls_cm.c
 create mode 10064

Re: [PATCH net] tg3: APE heartbeat changes

2018-02-18 Thread Satish Baddipadige
On Sat, Feb 17, 2018 at 2:56 AM, David Miller  wrote:
>
> From: Satish Baddipadige 
> Date: Fri, 16 Feb 2018 10:01:29 +0530
>
> > @@ -990,6 +984,18 @@ static void tg3_ape_driver_state_change(struct tg3 
> > *tp, int kind)
> >   tg3_ape_send_event(tp, event);
> >  }
> >
> > +static inline void tg3_send_ape_heartbeat(struct tg3 *tp,
>
> Inline functions are not appropriate in foo.c files, please drop the
> inline keyword and let the compiler decide.

Thanks David. Will send a v2 patch with inline dropped.

Thanks,
Satish


Re: [RFC PATCH v3 0/3] Enable virtio_net to act as a backup for a passthru device

2018-02-18 Thread Jakub Kicinski
On Sat, 17 Feb 2018 09:12:01 -0800, Alexander Duyck wrote:
> >> We noticed a couple of issues with this approach during testing.
> >> - As both 'bypass' and 'backup' netdevs are associated with the same
> >>   virtio pci device, udev tries to rename both of them with the same name
> >>   and the 2nd rename will fail. This would be OK as long as the first netdev
> >>   to be renamed is the 'bypass' netdev, but the order in which udev gets
> >>   to rename the 2 netdevs is not reliable.  
> >
> > Out of curiosity - why do you link the master netdev to the virtio
> > struct device?  
> 
> The basic idea of all this is that we wanted this to work with an
> existing VM image that was using virtio. As such we were trying to
> make it so that the bypass interface takes the place of the original
> virtio and get udev to rename the bypass to what the original
> virtio_net was.

That makes sense.  Is it udev/naming that you're most concerned about
here?  I.e. what's the user space app that expects the netdev to be
linked?  This is just out of curiosity, the linking of netdevs to
devices is a bit of a PITA in the switchdev eswitch mode world, with
libvirt expecting only certain devices to be there..  Right now we're
not linking VF reprs, which breaks naming.  I wanted to revisit that.

> > FWIW two solutions that immediately come to mind is to export "backup"
> > as phys_port_name of the backup virtio link and/or assign a name to the
> > master like you are doing already.  I think team uses team%d and bond
> > uses bond%d, soft naming of master devices seems quite natural in this
> > case.  
> 
> I figured I had overlooked something like that.. Thanks for pointing
> this out. Okay so I think the phys_port_name approach might resolve
> the original issue. If I am reading things correctly what we end up
> with is the master showing up as "ens1" for example and the backup
> showing up as "ens1nbackup". Am I understanding that right?

Yes, provided systemd is new enough.

> The problem with the team/bond%d approach is that it creates a new
> netdevice and so it would require guest configuration changes.
>
> > IMHO phys_port_name == "backup" if BACKUP bit is set on slave virtio
> > link is quite neat.  
> 
> I agree. For non-"backup" virtio_net devices would it be okay for us to
> just return -EOPNOTSUPP? I assume it would be and that way the legacy
> behavior could be maintained although the function still exists.

That's my understanding too.
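
To make the proposal above concrete, here is a user-space mock of the
behaviour being discussed, written in the shape of ndo_get_phys_port_name().
It is only a sketch under stated assumptions: struct mock_virtnet and
mock_get_phys_port_name() are hypothetical names, not virtio_net code, and
returning -EINVAL for a too-small buffer is my assumption.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the virtio_net private state; not kernel code. */
struct mock_virtnet {
	bool backup;	/* BACKUP feature bit negotiated on this link */
};

/*
 * Same calling convention as ndo_get_phys_port_name(): write the name into
 * buf and return 0. Links without the BACKUP bit return -EOPNOTSUPP so the
 * legacy naming is kept.
 */
int mock_get_phys_port_name(const struct mock_virtnet *vi, char *buf,
			    size_t len)
{
	int ret;

	if (!vi->backup)
		return -EOPNOTSUPP;

	ret = snprintf(buf, len, "backup");
	if (ret < 0 || (size_t)ret >= len)
		return -EINVAL;	/* assumed error code for a too-small buffer */
	return 0;
}
```

With a new enough systemd the backup link would then show up with the port
name appended (the "ens1nbackup" example above), while the master keeps the
plain name.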


Re: [PATCH net-next3/5] ibmvnic: Free and re-allocate scrqs when tx/rx scrqs change

2018-02-18 Thread kbuild test robot
Hi Nathan,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.16-rc2 next-20180216]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Nathan-Fontenot/ibmvnic-Free-and-re-allocate-scrqs-when-tx-rx-scrqs-change/20180218-203503
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   drivers/net//ethernet/ibm/ibmvnic.c: In function 'release_sub_crqs':
>> drivers/net//ethernet/ibm/ibmvnic.c:2340:28: error: 'struct ibmvnic_adapter' has no member named 'num_active_tx_scrqs'; did you mean 'num_active_tx_pools'?
  for (i = 0; i < adapter->num_active_tx_scrqs; i++) {
   ^~~
   num_active_tx_pools
>> drivers/net//ethernet/ibm/ibmvnic.c:2362:28: error: 'struct ibmvnic_adapter' has no member named 'num_active_rx_scrqs'; did you mean 'num_active_rx_pools'?
  for (i = 0; i < adapter->num_active_rx_scrqs; i++) {
   ^~~
   num_active_rx_pools

vim +2340 drivers/net//ethernet/ibm/ibmvnic.c

  2334  
  2335  static void release_sub_crqs(struct ibmvnic_adapter *adapter, int 
do_h_free)
  2336  {
  2337  int i;
  2338  
  2339  if (adapter->tx_scrq) {
> 2340  for (i = 0; i < adapter->num_active_tx_scrqs; i++) {
  2341  if (!adapter->tx_scrq[i])
  2342  continue;
  2343  
  2344  netdev_dbg(adapter->netdev, "Releasing 
tx_scrq[%d]\n",
  2345 i);
  2346  if (adapter->tx_scrq[i]->irq) {
  2347  free_irq(adapter->tx_scrq[i]->irq,
  2348   adapter->tx_scrq[i]);
  2349  
irq_dispose_mapping(adapter->tx_scrq[i]->irq);
  2350  adapter->tx_scrq[i]->irq = 0;
  2351  }
  2352  
  2353  release_sub_crq_queue(adapter, 
adapter->tx_scrq[i],
  2354do_h_free);
  2355  }
  2356  
  2357  kfree(adapter->tx_scrq);
  2358  adapter->tx_scrq = NULL;
  2359  }
  2360  
  2361  if (adapter->rx_scrq) {
> 2362  for (i = 0; i < adapter->num_active_rx_scrqs; i++) {
  2363  if (!adapter->rx_scrq[i])
  2364  continue;
  2365  
  2366  netdev_dbg(adapter->netdev, "Releasing 
rx_scrq[%d]\n",
  2367 i);
  2368  if (adapter->rx_scrq[i]->irq) {
  2369  free_irq(adapter->rx_scrq[i]->irq,
  2370   adapter->rx_scrq[i]);
  2371  
irq_dispose_mapping(adapter->rx_scrq[i]->irq);
  2372  adapter->rx_scrq[i]->irq = 0;
  2373  }
  2374  
  2375  release_sub_crq_queue(adapter, 
adapter->rx_scrq[i],
  2376do_h_free);
  2377  }
  2378  
  2379  kfree(adapter->rx_scrq);
  2380  adapter->rx_scrq = NULL;
  2381  }
  2382  }
  2383  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation




Re: [net-next v2 2/2] bpf: Add eBPF seccomp sample programs

2018-02-18 Thread Sargun Dhillon
On Sat, Feb 17, 2018 at 9:58 AM, Randy Dunlap  wrote:
> On 02/16/2018 11:36 PM, Sargun Dhillon wrote:
>> + close(111);
>> + assert(errno == EBADF);
>> + close(999);
>> + assert(errno = EPERM);
>
> should that be   == ?
>
Whoops. Embarrassing. Will fix that in the next re-spin.
>> +
>> + return 0;
>> +}
>
>
> --
> ~Randy
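
For readers skimming the thread, a small stand-alone illustration of why the
`=` version never fires. The function names check_assign() and check_compare()
are made up for this sketch and do not appear in the sample program.

```c
#include <errno.h>

/*
 * Buggy form, as in assert(errno = EPERM): the assignment stores EPERM and
 * the expression evaluates to EPERM, which is non-zero, so it always passes.
 */
int check_assign(int *err)
{
	return (*err = EPERM) != 0;
}

/* Intended form: a comparison that leaves the value untouched. */
int check_compare(const int *err)
{
	return *err == EPERM;
}
```

So with the typo the second assert passes regardless of what close(999)
actually returned, and it also overwrites errno.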


[PATCH iproute2-next 2/4] bridge: colorize output and use JSON print library

2018-02-18 Thread Stephen Hemminger
Use new functions from json_print to simplify code.
Provide standard flag for colorizing output.

Note: the simple -c flag is ambiguous it could mean color or
compressvlan; it is now changed to mean color for consistency
with other iproute2 commands.
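
The ambiguity comes from prefix matching of abbreviated options: whichever
test comes first wins. A minimal sketch of that behaviour, assuming a
prefix-matching helper; prefix_matches(), parse_opt() and the enum are
illustrative names written for this note, not the bridge code itself.

```c
#include <string.h>

/*
 * Hypothetical prefix matcher: returns 0 when arg is a (possibly
 * abbreviated) prefix of pattern, non-zero otherwise.
 */
int prefix_matches(const char *arg, const char *pattern)
{
	size_t len = strlen(arg);

	if (len > strlen(pattern))
		return -1;
	return memcmp(pattern, arg, len);
}

enum bridge_opt { OPT_NONE, OPT_COLOR, OPT_COMPRESSVLANS };

/* "-color" is tested before "-compressvlans", so a bare "-c" means color. */
enum bridge_opt parse_opt(const char *opt)
{
	if (prefix_matches(opt, "-color") == 0)
		return OPT_COLOR;
	if (prefix_matches(opt, "-compressvlans") == 0)
		return OPT_COMPRESSVLANS;
	return OPT_NONE;
}
```

Under this ordering "-c" and "-co" select color, and "-com" is the shortest
abbreviation that still selects compressvlans.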

Signed-off-by: Stephen Hemminger 
---
 bridge/br_common.h |   2 +-
 bridge/bridge.c|   9 +-
 bridge/fdb.c   | 281 +++--
 bridge/mdb.c   | 362 ++---
 bridge/vlan.c  | 275 +++-
 5 files changed, 362 insertions(+), 567 deletions(-)

diff --git a/bridge/br_common.h b/bridge/br_common.h
index 32dc46a2415d..1dcb14ae8062 100644
--- a/bridge/br_common.h
+++ b/bridge/br_common.h
@@ -6,7 +6,7 @@
 #define MDB_RTR_RTA(r) \
((struct rtattr *)(((char *)(r)) + RTA_ALIGN(sizeof(__u32
 
-extern void print_vlan_info(FILE *fp, struct rtattr *tb, int ifindex);
+extern void print_vlan_info(FILE *fp, struct rtattr *tb);
 extern int print_linkinfo(const struct sockaddr_nl *who,
  struct nlmsghdr *n,
  void *arg);
diff --git a/bridge/bridge.c b/bridge/bridge.c
index 3837c3d79290..fe4050407377 100644
--- a/bridge/bridge.c
+++ b/bridge/bridge.c
@@ -16,6 +16,7 @@
 #include "utils.h"
 #include "br_common.h"
 #include "namespace.h"
+#include "color.h"
 
 struct rtnl_handle rth = { .fd = -1 };
 int preferred_family = AF_UNSPEC;
@@ -23,6 +24,7 @@ int oneline;
 int show_stats;
 int show_details;
 int show_pretty;
+int color;
 int compress_vlans;
 int json;
 int timestamp;
@@ -40,7 +42,7 @@ static void usage(void)
 "where OBJECT := { link | fdb | mdb | vlan | monitor }\n"
 "  OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] |\n"
 "   -o[neline] | -t[imestamp] | -n[etns] name |\n"
-"   -c[ompressvlans] -p[retty] -j{son} }\n");
+"   -c[ompressvlans] -color -p[retty] -j{son} }\n");
exit(-1);
 }
 
@@ -171,6 +173,8 @@ main(int argc, char **argv)
NEXT_ARG();
if (netns_switch(argv[1]))
exit(-1);
+   } else if (matches(opt, "-color") == 0) {
+   enable_color();
} else if (matches(opt, "-compressvlans") == 0) {
++compress_vlans;
} else if (matches(opt, "-force") == 0) {
@@ -196,6 +200,9 @@ main(int argc, char **argv)
 
_SL_ = oneline ? "\\" : "\n";
 
+   if (json)
+   check_if_color_enabled();
+
if (batch_file)
return batch(batch_file);
 
diff --git a/bridge/fdb.c b/bridge/fdb.c
index 82dff5167e86..b4f6e8b3a01b 100644
--- a/bridge/fdb.c
+++ b/bridge/fdb.c
@@ -22,9 +22,9 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
+#include "json_print.h"
 #include "libnetlink.h"
 #include "br_common.h"
 #include "rt_names.h"
@@ -32,8 +32,6 @@
 
 static unsigned int filter_index, filter_vlan, filter_state;
 
-json_writer_t *jw_global;
-
 static void usage(void)
 {
fprintf(stderr,
@@ -83,13 +81,46 @@ static int state_a2n(unsigned int *s, const char *arg)
return 0;
 }
 
-static void start_json_fdb_flags_array(bool *fdb_flags)
+static void fdb_print_flags(FILE *fp, unsigned int flags)
+{
+   open_json_array(PRINT_JSON,
+   is_json_context() ?  "flags" : "");
+
+   if (flags & NTF_SELF)
+   print_string(PRINT_ANY, NULL, "%s ", "self");
+
+   if (flags & NTF_ROUTER)
+   print_string(PRINT_ANY, NULL, "%s ", "router");
+
+   if (flags & NTF_EXT_LEARNED)
+   print_string(PRINT_ANY, NULL, "%s ", "extern_learn");
+
+   if (flags & NTF_OFFLOADED)
+   print_string(PRINT_ANY, NULL, "%s ", "offload");
+
+   if (flags & NTF_MASTER)
+   print_string(PRINT_ANY, NULL, "%s ", "master");
+
+   close_json_array(PRINT_JSON, NULL);
+}
+
+static void fdb_print_stats(FILE *fp, const struct nda_cacheinfo *ci)
 {
-   if (*fdb_flags)
-   return;
-   jsonw_name(jw_global, "flags");
-   jsonw_start_array(jw_global);
-   *fdb_flags = true;
+   static int hz;
+
+   if (!hz)
+   hz = get_user_hz();
+
+   if (is_json_context()) {
+   print_uint(PRINT_JSON, "used", NULL,
+ci->ndm_used / hz);
+   print_uint(PRINT_JSON, "updated", NULL,
+   ci->ndm_updated / hz);
+   } else {
+   fprintf(fp, "used %d/%d ", ci->ndm_used / hz,
+   ci->ndm_updated / hz);
+
+   }
 }
 
 int print_fdb(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
@@ -99,8 +130,6 @@ int print_fdb(const struct sockaddr_nl *who, struct nlmsghdr 
*n, void *arg)
int len = n->nlmsg_len;
struct rtattr *tb[NDA_MAX+1];
__u16 vid = 0;
-   bool fdb_flags = fals

[PATCH iproute2-next 4/4] bridge: update man page for new color and json changes

2018-02-18 Thread Stephen Hemminger
Document the color option and remove the note restricting JSON output to vlan and fdb.

Signed-off-by: Stephen Hemminger 
---
 man/man8/bridge.8 | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/man/man8/bridge.8 b/man/man8/bridge.8
index 490ef58da16c..05512b029fb9 100644
--- a/man/man8/bridge.8
+++ b/man/man8/bridge.8
@@ -22,6 +22,7 @@ bridge \- show / manipulate bridge addresses and devices
 \fB\-s\fR[\fItatistics\fR] |
 \fB\-n\fR[\fIetns\fR] name |
 \fB\-b\fR[\fIatch\fR] filename |
+\fB\-c\fR[\fIolor\fR] |
 \fB\-p\fR[\fIretty\fR] |
 \fB\-j\fR[\fIson\fR] }
 
@@ -169,11 +170,15 @@ If there were any errors during execution of the 
commands, the application
 return code will be non zero.
 
 .TP
-.BR "\-json"
-Display results in JSON format. Currently available for vlan and fdb.
+.BR "\-c" , " -color"
+Use color output.
 
 .TP
-.BR "\-pretty"
+.BR "\-j", " \-json"
+Output results in JavaScript Object Notation (JSON).
+
+.TP
+.BR "\-p", " \-pretty"
 When combined with -j generate a pretty JSON output.
 
 
-- 
2.16.1



[PATCH iproute2-next 3/4] bridge: add json support for link command

2018-02-18 Thread Stephen Hemminger
Add JSON output for the bridge link show command.
This also changes the non-JSON output format slightly so that it
matches the format of the ip link show command.
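
For illustration only, not part of the patch: a standalone sketch of the
"one helper, two output modes" pattern that the converted print functions
follow. It deliberately does not use iproute2's json_print.h; fmt_onoff()
and its json_mode argument are made up for this example, and it writes into
a caller-supplied buffer so the result is easy to inspect.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format an on/off flag either as a JSON member
 * or as the traditional "name on " / "name off " text.
 * json_mode stands in for a global "-json was given" flag.
 */
static int fmt_onoff(char *buf, size_t len, int json_mode,
		     const char *flag, unsigned char val)
{
	if (json_mode)
		return snprintf(buf, len, "\"%s\":%s",
				flag, val ? "true" : "false");

	return snprintf(buf, len, "%s %s ", flag, val ? "on" : "off");
}
```

With an example flag name "flag" and val set to 1, text mode yields
"flag on " and JSON mode yields "flag":true, so callers never need to
branch on the output mode themselves.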

Signed-off-by: Stephen Hemminger 
---
 bridge/link.c | 304 --
 1 file changed, 188 insertions(+), 116 deletions(-)

diff --git a/bridge/link.c b/bridge/link.c
index 870ebe050477..12cc995e05e3 100644
--- a/bridge/link.c
+++ b/bridge/link.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 
+#include "json_print.h"
 #include "libnetlink.h"
 #include "utils.h"
 #include "br_common.h"
@@ -26,15 +27,27 @@ static const char *port_states[] = {
[BR_STATE_BLOCKING] = "blocking",
 };
 
-static void print_link_flags(FILE *fp, unsigned int flags)
+
+static const char *oper_states[] = {
+   "UNKNOWN", "NOTPRESENT", "DOWN", "LOWERLAYERDOWN",
+   "TESTING", "DORMANT","UP"
+};
+
+static const char *hw_mode[] = {
+   "VEB", "VEPA"
+};
+
+static void print_link_flags(FILE *fp, unsigned int flags, unsigned int mdown)
 {
-   fprintf(fp, "<");
+   open_json_array(PRINT_ANY, is_json_context() ? "flags" : "<");
if (flags & IFF_UP && !(flags & IFF_RUNNING))
-   fprintf(fp, "NO-CARRIER%s", flags ? "," : "");
+   print_string(PRINT_ANY, NULL,
+flags ? "%s," : "%s", "NO-CARRIER");
flags &= ~IFF_RUNNING;
-#define _PF(f) if (flags&IFF_##f) { \
- flags &= ~IFF_##f ; \
- fprintf(fp, #f "%s", flags ? "," : ""); }
+
+#define _PF(f) if (flags&IFF_##f) {\
+   flags &= ~IFF_##f ; \
+   print_string(PRINT_ANY, NULL, flags ? "%s," : "%s", #f); }
_PF(LOOPBACK);
_PF(BROADCAST);
_PF(POINTOPOINT);
@@ -55,44 +68,136 @@ static void print_link_flags(FILE *fp, unsigned int flags)
_PF(ECHO);
 #undef _PF
if (flags)
-   fprintf(fp, "%x", flags);
-   fprintf(fp, "> ");
+   print_hex(PRINT_ANY, NULL, "%x", flags);
+   if (mdown)
+   print_string(PRINT_ANY, NULL, ",%s", "M-DOWN");
+   close_json_array(PRINT_ANY, "> ");
 }
 
-static const char *oper_states[] = {
-   "UNKNOWN", "NOTPRESENT", "DOWN", "LOWERLAYERDOWN",
-   "TESTING", "DORMANT","UP"
-};
-
-static const char *hw_mode[] = {"VEB", "VEPA"};
-
 static void print_operstate(FILE *f, __u8 state)
 {
-   if (state >= ARRAY_SIZE(oper_states))
-   fprintf(f, "state %#x ", state);
-   else
-   fprintf(f, "state %s ", oper_states[state]);
+   if (state >= ARRAY_SIZE(oper_states)) {
+   print_0xhex(PRINT_ANY, "operstate", "operstat %#x", state);
+   } else {
+   if (is_json_context())
+   print_string(PRINT_JSON,
+"operstate",
+NULL, oper_states[state]);
+   else {
+   fprintf(f, "state ");
+   color_fprintf(f, oper_state_color(state),
+ "%s ", oper_states[state]);
+   }
+   }
 }
 
-static void print_portstate(FILE *f, __u8 state)
+static void print_portstate(__u8 state)
 {
if (state <= BR_STATE_BLOCKING)
-   fprintf(f, "state %s ", port_states[state]);
+   print_string(PRINT_ANY, "state",
+"state %s ", port_states[state]);
else
-   fprintf(f, "state (%d) ", state);
+   print_uint(PRINT_ANY, "state",
+"state (%d) ", state);
 }
 
-static void print_onoff(FILE *f, char *flag, __u8 val)
+static void print_onoff(FILE *fp, const char *flag, __u8 val)
 {
-   fprintf(f, "%s %s ", flag, val ? "on" : "off");
+   if (is_json_context())
+   print_bool(PRINT_JSON, flag, NULL, val);
+   else
+   fprintf(fp, "%s %s ", flag, val ? "on" : "off");
 }
 
-static void print_hwmode(FILE *f, __u16 mode)
+static void print_hwmode(__u16 mode)
 {
if (mode >= ARRAY_SIZE(hw_mode))
-   fprintf(f, "hwmode %#hx ", mode);
+   print_0xhex(PRINT_ANY, "hwmode",
+   "hwmode %#hx ", mode);
else
-   fprintf(f, "hwmode %s ", hw_mode[mode]);
+   print_string(PRINT_ANY, "hwmode",
+"hwmode %s ", hw_mode[mode]);
+}
+
+static void print_protinfo(FILE *fp, struct rtattr *attr)
+{
+   if (attr->rta_type & NLA_F_NESTED) {
+   struct rtattr *prtb[IFLA_BRPORT_MAX + 1];
+
+   parse_rtattr_nested(prtb, IFLA_BRPORT_MAX, attr);
+
+   if (prtb[IFLA_BRPORT_STATE])
+   
print_portstate(rta_getattr_u8(prtb[IFLA_BRPORT_STATE]));
+
+   if (prtb[IFLA_BRPORT_PRIORITY])
+   print_uint(PRINT_ANY, "priority",
+   

[PATCH iproute2-next 0/4] bridge: json and color output

2018-02-18 Thread Stephen Hemminger
This set of patches adds color and full JSON support to bridge command.
It also aligns the output format and command line options to match
ip command; this is a change that may impact those using "compress vlan"
option or doing screen scraping.

The JSON output also changes slightly. But this should not impact
any real user, because in several cases the current format was
invalid JSON!

Stephen Hemminger (4):
  bridge: implement json pretty print flag
  bridge: colorize output and use JSON print library
  bridge: add json support for link command
  bridge: update man page for new color and json changes

 bridge/br_common.h |   5 +-
 bridge/bridge.c|  16 ++-
 bridge/fdb.c   | 279 -
 bridge/link.c  | 304 +++--
 bridge/mdb.c   | 354 ++---
 bridge/vlan.c  | 280 --
 man/man8/bridge.8  |  15 ++-
 7 files changed, 574 insertions(+), 679 deletions(-)

-- 
2.16.1



[PATCH iproute2-next 1/4] bridge: implement json pretty print flag

2018-02-18 Thread Stephen Hemminger
Make bridge work like other iproute2 commands with standard -j and -p
flags.
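
For illustration only, not part of the patch: a rough standalone sketch of
how abbreviated -j[son] and -p[retty] flags can be recognised by prefix
matching. The names struct opts, opt_matches() and parse_opts() are
hypothetical and this is not the code in bridge.c; in particular, the
requirement of at least two characters is an assumption of this sketch.

```c
#include <string.h>

/* Hypothetical option state; counters mirror the "++flag" style. */
struct opts {
	int json;
	int pretty;
};

/*
 * Returns 0 if arg is an abbreviation of full (a prefix of at least
 * two characters, e.g. "-j" for "-json"), nonzero otherwise.
 */
static int opt_matches(const char *arg, const char *full)
{
	size_t n = strlen(arg);

	if (n < 2 || n > strlen(full))
		return -1;
	return memcmp(full, arg, n);
}

/*
 * Consumes leading options starting at argv[1]. Returns the index of
 * the first non-option argument, or -1 on an unknown option.
 */
static int parse_opts(int argc, char **argv, struct opts *o)
{
	int i;

	for (i = 1; i < argc && argv[i][0] == '-'; i++) {
		if (opt_matches(argv[i], "-json") == 0)
			o->json++;
		else if (opt_matches(argv[i], "-pretty") == 0)
			o->pretty++;
		else
			return -1;
	}
	return i;
}
```

In this sketch the pretty flag is only recorded; as in the patch, it is up
to the JSON writer setup to apply it when JSON output was requested.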

Signed-off-by: Stephen Hemminger 
---
 bridge/br_common.h |  3 ++-
 bridge/bridge.c|  9 ++---
 bridge/fdb.c   | 10 ++
 bridge/mdb.c   | 12 ++--
 bridge/vlan.c  | 11 +++
 man/man8/bridge.8  |  6 ++
 6 files changed, 37 insertions(+), 14 deletions(-)

diff --git a/bridge/br_common.h b/bridge/br_common.h
index f07c7d1c9090..32dc46a2415d 100644
--- a/bridge/br_common.h
+++ b/bridge/br_common.h
@@ -24,7 +24,8 @@ extern int do_link(int argc, char **argv);
 extern int preferred_family;
 extern int show_stats;
 extern int show_details;
+extern int show_pretty;
 extern int timestamp;
 extern int compress_vlans;
-extern int json_output;
+extern int json;
 extern struct rtnl_handle rth;
diff --git a/bridge/bridge.c b/bridge/bridge.c
index 48fe1c8b2ef1..3837c3d79290 100644
--- a/bridge/bridge.c
+++ b/bridge/bridge.c
@@ -22,8 +22,9 @@ int preferred_family = AF_UNSPEC;
 int oneline;
 int show_stats;
 int show_details;
+int show_pretty;
 int compress_vlans;
-int json_output;
+int json;
 int timestamp;
 char *batch_file;
 int force;
@@ -39,7 +40,7 @@ static void usage(void)
 "where OBJECT := { link | fdb | mdb | vlan | monitor }\n"
 "  OPTIONS := { -V[ersion] | -s[tatistics] | -d[etails] |\n"
 "   -o[neline] | -t[imestamp] | -n[etns] name |\n"
-"   -c[ompressvlans] -j{son} }\n");
+"   -c[ompressvlans] -p[retty] -j{son} }\n");
exit(-1);
 }
 
@@ -175,7 +176,9 @@ main(int argc, char **argv)
} else if (matches(opt, "-force") == 0) {
++force;
} else if (matches(opt, "-json") == 0) {
-   ++json_output;
+   ++json;
+   } else if (matches(opt, "-pretty") == 0) {
+   ++show_pretty;
} else if (matches(opt, "-batch") == 0) {
argc--;
argv++;
diff --git a/bridge/fdb.c b/bridge/fdb.c
index 8b133f9c1c8d..82dff5167e86 100644
--- a/bridge/fdb.c
+++ b/bridge/fdb.c
@@ -132,10 +132,8 @@ int print_fdb(const struct sockaddr_nl *who, struct 
nlmsghdr *n, void *arg)
if (filter_vlan && filter_vlan != vid)
return 0;
 
-   if (jw_global) {
-   jsonw_pretty(jw_global, 1);
+   if (jw_global)
jsonw_start_object(jw_global);
-   }
 
if (n->nlmsg_type == RTM_DELNEIGH) {
if (jw_global)
@@ -388,14 +386,18 @@ static int fdb_show(int argc, char **argv)
exit(1);
}
 
-   if (json_output) {
+   if (json) {
jw_global = jsonw_new(stdout);
if (!jw_global) {
fprintf(stderr, "Error allocation json object\n");
exit(1);
}
+   if (show_pretty)
+   jsonw_pretty(jw_global, 1);
+
jsonw_start_array(jw_global);
}
+
if (rtnl_dump_filter(&rth, print_fdb, stdout) < 0) {
fprintf(stderr, "Dump terminated\n");
exit(1);
diff --git a/bridge/mdb.c b/bridge/mdb.c
index 62dc8a0c58c3..0873502e8a24 100644
--- a/bridge/mdb.c
+++ b/bridge/mdb.c
@@ -325,7 +325,7 @@ static int mdb_show(int argc, char **argv)
return -1;
}
 
-   if (!json_output) {
+   if (!json) {
/* Normal output */
if (rtnl_dump_filter(&rth, print_mdb, stdout) < 0) {
fprintf(stderr, "Dump terminated\n");
@@ -333,9 +333,17 @@ static int mdb_show(int argc, char **argv)
}
return 0;
}
+
/* Json output */
jw_global = jsonw_new(stdout);
-   jsonw_pretty(jw_global, 1);
+   if (!jw_global) {
+   fprintf(stderr, "Error allocation json object\n");
+   exit(1);
+   }
+
+   if (show_pretty)
+   jsonw_pretty(jw_global, 1);
+
jsonw_start_object(jw_global);
jsonw_name(jw_global, "mdb");
jsonw_start_array(jw_global);
diff --git a/bridge/vlan.c b/bridge/vlan.c
index f42d7e6ba244..e4eab4c2384f 100644
--- a/bridge/vlan.c
+++ b/bridge/vlan.c
@@ -258,7 +258,6 @@ static int filter_vlan_check(__u16 vid, __u16 flags)
 static void print_vlan_port(FILE *fp, int ifi_index)
 {
if (jw_global) {
-   jsonw_pretty(jw_global, 1);
jsonw_name(jw_global,
   ll_index_to_name(ifi_index));
jsonw_start_array(jw_global);
@@ -578,17 +577,21 @@ static int vlan_show(int argc, char **argv)
if (!show_stats) {
if (rtnl_wilddump_req_filter(&rth, PF_BRIDGE, RTM_GETLINK,
 (compress_vlans ?
-   RTEXT_FILTER_BRVLAN_COMPRESSED :
-   RTEXT_FILTER_BRVLAN)) 

Re: [PATCH net 1/4] net/8390: Remove redundant make dependencies

2018-02-18 Thread Greg Ungerer
Hi Finn,

On 19/02/18 12:39, Finn Thain wrote:
> The hydra, zorro8390 and mcf8390 drivers all #include "lib8390.c" and
> have no need for 8390.o. modinfo confirms no dependency on 8390.ko.
> Drop the redundant dependency from the Makefile. objdump confirms
> that this patch has no effect on the module binaries.
> 
> The superfluous additions of 8390.o were introduced in
> commit 644570b83026 ("8390: Move the 8390 related drivers").
> 
> Cc: Greg Ungerer 

Looks right for mcf8390.c.

Acked-by: Greg Ungerer 

Regards
Greg


> Cc: Geert Uytterhoeven 
> Signed-off-by: Finn Thain 
> Acked-by: Geert Uytterhoeven 
> ---
>  drivers/net/ethernet/8390/Makefile | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/8390/Makefile 
> b/drivers/net/ethernet/8390/Makefile
> index f975c2fc88a3..1d650e66cc6e 100644
> --- a/drivers/net/ethernet/8390/Makefile
> +++ b/drivers/net/ethernet/8390/Makefile
> @@ -7,8 +7,8 @@ obj-$(CONFIG_MAC8390) += mac8390.o
>  obj-$(CONFIG_APNE) += apne.o 8390.o
>  obj-$(CONFIG_ARM_ETHERH) += etherh.o
>  obj-$(CONFIG_AX88796) += ax88796.o
> -obj-$(CONFIG_HYDRA) += hydra.o 8390.o
> -obj-$(CONFIG_MCF8390) += mcf8390.o 8390.o
> +obj-$(CONFIG_HYDRA) += hydra.o
> +obj-$(CONFIG_MCF8390) += mcf8390.o
>  obj-$(CONFIG_NE2000) += ne.o 8390p.o
>  obj-$(CONFIG_NE2K_PCI) += ne2k-pci.o 8390.o
>  obj-$(CONFIG_PCMCIA_AXNET) += axnet_cs.o 8390.o
> @@ -16,4 +16,4 @@ obj-$(CONFIG_PCMCIA_PCNET) += pcnet_cs.o 8390.o
>  obj-$(CONFIG_STNIC) += stnic.o 8390.o
>  obj-$(CONFIG_ULTRA) += smc-ultra.o 8390.o
>  obj-$(CONFIG_WD80x3) += wd.o 8390.o
> -obj-$(CONFIG_ZORRO8390) += zorro8390.o 8390.o
> +obj-$(CONFIG_ZORRO8390) += zorro8390.o
> 



[PATCH net 2/4] net/8390: Fix msg_enable patch snafu

2018-02-18 Thread Finn Thain
The lib8390 module parameter 'msg_enable' doesn't do anything useful:
it causes an ancient version string to be logged.

Remove redundant code that logs the same string.

In ne.c and wd.c, the value of ei_local->msg_enable is used before
being assigned. Use ne_msg_enable and wd_msg_enable, respectively.

Most of the other 8390 drivers never assign ei_local->msg_enable.
Use the 'msg_enable' module parameter from lib8390 as the default
value.

Eliminate the pointless static and local variables.

Clean up an indentation mistake.

All of these issues originated from the same patch.

Cc: Russell King 
Fixes: c45f812f0280 ("8390 : Replace ei_debug with msg_enable/NETIF_MSG_* 
feature")
Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
Only the mac8390.c and lib8390.c changes have been tested. The other
changes are similar but untested.
---
 drivers/net/ethernet/8390/ax88796.c   |  3 ---
 drivers/net/ethernet/8390/axnet_cs.c  |  2 --
 drivers/net/ethernet/8390/etherh.c| 17 -
 drivers/net/ethernet/8390/hydra.c |  4 
 drivers/net/ethernet/8390/lib8390.c   |  2 ++
 drivers/net/ethernet/8390/mac8390.c   |  8 
 drivers/net/ethernet/8390/mcf8390.c   |  4 
 drivers/net/ethernet/8390/ne.c|  2 +-
 drivers/net/ethernet/8390/pcnet_cs.c  |  4 
 drivers/net/ethernet/8390/wd.c|  2 +-
 drivers/net/ethernet/8390/zorro8390.c |  5 -
 11 files changed, 4 insertions(+), 49 deletions(-)

diff --git a/drivers/net/ethernet/8390/ax88796.c 
b/drivers/net/ethernet/8390/ax88796.c
index 245554707163..da61cf3cb3a9 100644
--- a/drivers/net/ethernet/8390/ax88796.c
+++ b/drivers/net/ethernet/8390/ax88796.c
@@ -77,8 +77,6 @@ static unsigned char version[] = "ax88796.c: Copyright 
2005,2007 Simtec Electron
 
 #define AX_GPOC_PPDSET BIT(6)
 
-static u32 ax_msg_enable;
-
 /* device private data */
 
 struct ax_device {
@@ -747,7 +745,6 @@ static int ax_init_dev(struct net_device *dev)
ei_local->block_output = &ax_block_output;
ei_local->get_8390_hdr = &ax_get_8390_hdr;
ei_local->priv = 0;
-   ei_local->msg_enable = ax_msg_enable;
 
dev->netdev_ops = &ax_netdev_ops;
dev->ethtool_ops = &ax_ethtool_ops;
diff --git a/drivers/net/ethernet/8390/axnet_cs.c 
b/drivers/net/ethernet/8390/axnet_cs.c
index 7bddb8efb6d5..d422a124cd7c 100644
--- a/drivers/net/ethernet/8390/axnet_cs.c
+++ b/drivers/net/ethernet/8390/axnet_cs.c
@@ -104,7 +104,6 @@ static void AX88190_init(struct net_device *dev, int 
startp);
 static int ax_open(struct net_device *dev);
 static int ax_close(struct net_device *dev);
 static irqreturn_t ax_interrupt(int irq, void *dev_id);
-static u32 axnet_msg_enable;
 
 /**/
 
@@ -151,7 +150,6 @@ static int axnet_probe(struct pcmcia_device *link)
return -ENOMEM;
 
 ei_local = netdev_priv(dev);
-ei_local->msg_enable = axnet_msg_enable;
 spin_lock_init(&ei_local->page_lock);
 
 info = PRIV(dev);
diff --git a/drivers/net/ethernet/8390/etherh.c 
b/drivers/net/ethernet/8390/etherh.c
index 11cbf22ad201..32e9627e3880 100644
--- a/drivers/net/ethernet/8390/etherh.c
+++ b/drivers/net/ethernet/8390/etherh.c
@@ -64,8 +64,6 @@ static char version[] =
 
 #include "lib8390.c"
 
-static u32 etherh_msg_enable;
-
 struct etherh_priv {
void __iomem*ioc_fast;
void __iomem*memc;
@@ -501,18 +499,6 @@ etherh_close(struct net_device *dev)
return 0;
 }
 
-/*
- * Initialisation
- */
-
-static void __init etherh_banner(void)
-{
-   static int version_printed;
-
-   if ((etherh_msg_enable & NETIF_MSG_DRV) && (version_printed++ == 0))
-   pr_info("%s", version);
-}
-
 /*
  * Read the ethernet address string from the on board rom.
  * This is an ascii string...
@@ -671,8 +657,6 @@ etherh_probe(struct expansion_card *ec, const struct 
ecard_id *id)
struct etherh_priv *eh;
int ret;
 
-   etherh_banner();
-
ret = ecard_request_resources(ec);
if (ret)
goto out;
@@ -757,7 +741,6 @@ etherh_probe(struct expansion_card *ec, const struct 
ecard_id *id)
ei_local->block_output  = etherh_block_output;
ei_local->get_8390_hdr  = etherh_get_header;
ei_local->interface_num = 0;
-   ei_local->msg_enable = etherh_msg_enable;
 
etherh_reset(dev);
__NS8390_init(dev, 0);
diff --git a/drivers/net/ethernet/8390/hydra.c 
b/drivers/net/ethernet/8390/hydra.c
index 8ae249195301..941754ea78ec 100644
--- a/drivers/net/ethernet/8390/hydra.c
+++ b/drivers/net/ethernet/8390/hydra.c
@@ -66,7 +66,6 @@ static void hydra_block_input(struct net_device *dev, int 
count,
 static void hydra_block_output(struct net_device *dev, int count,
   const unsigned char *buf, int start_page);
 static void hydra_remove_one(struct zorro_dev *z);
-static u32 hydra_msg_enable;
 
 static struct zorro_device_id hydra_zorro_tbl[] = {
 { ZORRO_PROD_HYDRA

[PATCH net 4/4] net/mac8390: Fix log messages

2018-02-18 Thread Finn Thain
Use dev_foo() to log the slot number instead of the unexpanded "eth%d"
format string.
Disambiguate the two identical "Card type %s is unsupported" messages.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/net/ethernet/8390/mac8390.c | 36 +---
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/8390/mac8390.c 
b/drivers/net/ethernet/8390/mac8390.c
index 8042dd73eb6a..b6d735bf8011 100644
--- a/drivers/net/ethernet/8390/mac8390.c
+++ b/drivers/net/ethernet/8390/mac8390.c
@@ -308,14 +308,14 @@ static bool mac8390_rsrc_init(struct net_device *dev,
 */
 
if (nubus_get_func_dir(fres, &dir) == -1) {
-   pr_err("%s: Unable to get Nubus functional directory for slot 
%X!\n",
-  dev->name, board->slot);
+   dev_err(&board->dev,
+   "Unable to get Nubus functional directory\n");
return false;
}
 
/* Get the MAC address */
if (nubus_find_rsrc(&dir, NUBUS_RESID_MAC_ADDRESS, &ent) == -1) {
-   pr_info("%s: Couldn't get MAC address!\n", dev->name);
+   dev_info(&board->dev, "MAC address resource not found\n");
return false;
}
 
@@ -325,8 +325,8 @@ static bool mac8390_rsrc_init(struct net_device *dev,
nubus_rewinddir(&dir);
if (nubus_find_rsrc(&dir, NUBUS_RESID_MINOR_BASEOS,
&ent) == -1) {
-   pr_err("%s: Memory offset resource for slot %X not 
found!\n",
-  dev->name, board->slot);
+   dev_err(&board->dev,
+   "Memory offset resource not found\n");
return false;
}
nubus_get_rsrc_mem(&offset, &ent, 4);
@@ -336,8 +336,8 @@ static bool mac8390_rsrc_init(struct net_device *dev,
nubus_rewinddir(&dir);
if (nubus_find_rsrc(&dir, NUBUS_RESID_MINOR_LENGTH,
&ent) == -1) {
-   pr_info("%s: Memory length resource for slot %X not 
found, probing\n",
-   dev->name, board->slot);
+   dev_info(&board->dev,
+"Memory length resource not found, probing\n");
offset = mac8390_memsize(dev->mem_start);
} else {
nubus_get_rsrc_mem(&offset, &ent, 4);
@@ -380,8 +380,8 @@ static bool mac8390_rsrc_init(struct net_device *dev,
break;
 
default:
-   pr_err("Card type %s is unsupported, sorry\n",
-  board->name);
+   dev_err(&board->dev,
+   "No known base address for card type\n");
return false;
}
}
@@ -533,7 +533,8 @@ static int mac8390_initdev(struct net_device *dev, struct 
nubus_board *board,
case MAC8390_APPLE:
switch (mac8390_testio(dev->mem_start)) {
case ACCESS_UNKNOWN:
-   pr_err("Don't know how to access card memory!\n");
+   dev_err(&board->dev,
+   "Don't know how to access card memory\n");
return -ENODEV;
 
case ACCESS_16:
@@ -599,21 +600,18 @@ static int mac8390_initdev(struct net_device *dev, struct 
nubus_board *board,
break;
 
default:
-   pr_err("Card type %s is unsupported, sorry\n",
-  board->name);
+   dev_err(&board->dev, "Unsupported card type\n");
return -ENODEV;
}
 
__NS8390_init(dev, 0);
 
/* Good, done, now spit out some messages */
-   pr_info("%s: %s in slot %X (type %s)\n",
-   dev->name, board->name, board->slot,
-   cardname[type]);
-   pr_info("MAC %pM IRQ %d, %d KB shared memory at %#lx, %d-bit access.\n",
-   dev->dev_addr, dev->irq,
-   (unsigned int)(dev->mem_end - dev->mem_start) >> 10,
-   dev->mem_start, access_bitmode ? 32 : 16);
+   dev_info(&board->dev, "%s (type %s)\n", board->name, cardname[type]);
+   dev_info(&board->dev, "MAC %pM, IRQ %d, %d KB shared memory at %#lx, 
%d-bit access.\n",
+dev->dev_addr, dev->irq,
+(unsigned int)(dev->mem_end - dev->mem_start) >> 10,
+dev->mem_start, access_bitmode ? 32 : 16);
return 0;
 }
 
-- 
2.16.1



[PATCH net 3/4] net/mac8390: Convert to nubus_driver

2018-02-18 Thread Finn Thain
This resolves an old bug that constrained this driver to no more than
one card.

Tested-by: Stan Johnson 
Signed-off-by: Finn Thain 
---
 drivers/net/Space.c |   3 -
 drivers/net/ethernet/8390/mac8390.c | 139 +---
 include/net/Space.h |   1 -
 3 files changed, 67 insertions(+), 76 deletions(-)

diff --git a/drivers/net/Space.c b/drivers/net/Space.c
index 11fe71278f40..64333ec999ac 100644
--- a/drivers/net/Space.c
+++ b/drivers/net/Space.c
@@ -114,9 +114,6 @@ static struct devprobe2 m68k_probes[] __initdata = {
 #ifdef CONFIG_MVME147_NET  /* MVME147 internal Ethernet */
{mvme147lance_probe, 0},
 #endif
-#ifdef CONFIG_MAC8390   /* NuBus NS8390-based cards */
-   {mac8390_probe, 0},
-#endif
 #ifdef CONFIG_MAC89x0
{mac89x0_probe, 0},
 #endif
diff --git a/drivers/net/ethernet/8390/mac8390.c 
b/drivers/net/ethernet/8390/mac8390.c
index abe50338b9f7..8042dd73eb6a 100644
--- a/drivers/net/ethernet/8390/mac8390.c
+++ b/drivers/net/ethernet/8390/mac8390.c
@@ -123,8 +123,7 @@ enum mac8390_access {
 };
 
 extern int mac8390_memtest(struct net_device *dev);
-static int mac8390_initdev(struct net_device *dev,
-  struct nubus_rsrc *ndev,
+static int mac8390_initdev(struct net_device *dev, struct nubus_board *board,
   enum mac8390_type type);
 
 static int mac8390_open(struct net_device *dev);
@@ -169,7 +168,7 @@ static void slow_sane_block_output(struct net_device *dev, 
int count,
 static void word_memcpy_tocard(unsigned long tp, const void *fp, int count);
 static void word_memcpy_fromcard(void *tp, unsigned long fp, int count);
 
-static enum mac8390_type __init mac8390_ident(struct nubus_rsrc *fres)
+static enum mac8390_type mac8390_ident(struct nubus_rsrc *fres)
 {
switch (fres->dr_sw) {
case NUBUS_DRSW_3COM:
@@ -235,7 +234,7 @@ static enum mac8390_type __init mac8390_ident(struct 
nubus_rsrc *fres)
return MAC8390_NONE;
 }
 
-static enum mac8390_access __init mac8390_testio(volatile unsigned long 
membase)
+static enum mac8390_access mac8390_testio(unsigned long membase)
 {
unsigned long outdata = 0xA5A0B5B0;
unsigned long indata =  0x;
@@ -253,7 +252,7 @@ static enum mac8390_access __init mac8390_testio(volatile 
unsigned long membase)
return ACCESS_UNKNOWN;
 }
 
-static int __init mac8390_memsize(unsigned long membase)
+static int mac8390_memsize(unsigned long membase)
 {
unsigned long flags;
int i, j;
@@ -289,28 +288,28 @@ static int __init mac8390_memsize(unsigned long membase)
return i * 0x1000;
 }
 
-static bool __init mac8390_init(struct net_device *dev,
-   struct nubus_rsrc *ndev,
-   enum mac8390_type cardtype)
+static bool mac8390_rsrc_init(struct net_device *dev,
+ struct nubus_rsrc *fres,
+ enum mac8390_type cardtype)
 {
+   struct nubus_board *board = fres->board;
struct nubus_dir dir;
struct nubus_dirent ent;
int offset;
volatile unsigned short *i;
 
-   dev->irq = SLOT2IRQ(ndev->board->slot);
+   dev->irq = SLOT2IRQ(board->slot);
/* This is getting to be a habit */
-   dev->base_addr = (ndev->board->slot_addr |
- ((ndev->board->slot & 0xf) << 20));
+   dev->base_addr = board->slot_addr | ((board->slot & 0xf) << 20);
 
/*
 * Get some Nubus info - we will trust the card's idea
 * of where its memory and registers are.
 */
 
-   if (nubus_get_func_dir(ndev, &dir) == -1) {
+   if (nubus_get_func_dir(fres, &dir) == -1) {
pr_err("%s: Unable to get Nubus functional directory for slot 
%X!\n",
-  dev->name, ndev->board->slot);
+  dev->name, board->slot);
return false;
}
 
@@ -327,7 +326,7 @@ static bool __init mac8390_init(struct net_device *dev,
if (nubus_find_rsrc(&dir, NUBUS_RESID_MINOR_BASEOS,
&ent) == -1) {
pr_err("%s: Memory offset resource for slot %X not 
found!\n",
-  dev->name, ndev->board->slot);
+  dev->name, board->slot);
return false;
}
nubus_get_rsrc_mem(&offset, &ent, 4);
@@ -338,7 +337,7 @@ static bool __init mac8390_init(struct net_device *dev,
if (nubus_find_rsrc(&dir, NUBUS_RESID_MINOR_LENGTH,
&ent) == -1) {
pr_info("%s: Memory length resource for slot %X not 
found, probing\n",
-   dev->name, ndev->board->slot);
+   dev->name, board->slot);
offset = mac8390_memsize(dev->mem_start);
} else {
 

[PATCH net 1/4] net/8390: Remove redundant make dependencies

2018-02-18 Thread Finn Thain
The hydra, zorro8390 and mcf8390 drivers all #include "lib8390.c" and
have no need for 8390.o. modinfo confirms no dependency on 8390.ko.
Drop the redundant dependency from the Makefile. objdump confirms
that this patch has no effect on the module binaries.

The superfluous additions of 8390.o were introduced in
commit 644570b83026 ("8390: Move the 8390 related drivers").

Cc: Greg Ungerer 
Cc: Geert Uytterhoeven 
Signed-off-by: Finn Thain 
Acked-by: Geert Uytterhoeven 
---
 drivers/net/ethernet/8390/Makefile | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/8390/Makefile 
b/drivers/net/ethernet/8390/Makefile
index f975c2fc88a3..1d650e66cc6e 100644
--- a/drivers/net/ethernet/8390/Makefile
+++ b/drivers/net/ethernet/8390/Makefile
@@ -7,8 +7,8 @@ obj-$(CONFIG_MAC8390) += mac8390.o
 obj-$(CONFIG_APNE) += apne.o 8390.o
 obj-$(CONFIG_ARM_ETHERH) += etherh.o
 obj-$(CONFIG_AX88796) += ax88796.o
-obj-$(CONFIG_HYDRA) += hydra.o 8390.o
-obj-$(CONFIG_MCF8390) += mcf8390.o 8390.o
+obj-$(CONFIG_HYDRA) += hydra.o
+obj-$(CONFIG_MCF8390) += mcf8390.o
 obj-$(CONFIG_NE2000) += ne.o 8390p.o
 obj-$(CONFIG_NE2K_PCI) += ne2k-pci.o 8390.o
 obj-$(CONFIG_PCMCIA_AXNET) += axnet_cs.o 8390.o
@@ -16,4 +16,4 @@ obj-$(CONFIG_PCMCIA_PCNET) += pcnet_cs.o 8390.o
 obj-$(CONFIG_STNIC) += stnic.o 8390.o
 obj-$(CONFIG_ULTRA) += smc-ultra.o 8390.o
 obj-$(CONFIG_WD80x3) += wd.o 8390.o
-obj-$(CONFIG_ZORRO8390) += zorro8390.o 8390.o
+obj-$(CONFIG_ZORRO8390) += zorro8390.o
-- 
2.16.1



[PATCH net 0/4] Fixes, cleanup and modernization for 8390 ethernet drivers

2018-02-18 Thread Finn Thain
Changes since v4 of combined patch series:
- Removed redundant and non-portable MACH_IS_MAC tests.
- Added acked-by tags from Geert Uytterhoeven.
- Omitted patches unrelated to 8390 drivers.


Finn Thain (4):
  net/8390: Remove redundant make dependencies
  net/8390: Fix msg_enable patch snafu
  net/mac8390: Convert to nubus_driver
  net/mac8390: Fix log messages

 drivers/net/Space.c   |   3 -
 drivers/net/ethernet/8390/Makefile|   6 +-
 drivers/net/ethernet/8390/ax88796.c   |   3 -
 drivers/net/ethernet/8390/axnet_cs.c  |   2 -
 drivers/net/ethernet/8390/etherh.c|  17 
 drivers/net/ethernet/8390/hydra.c |   4 -
 drivers/net/ethernet/8390/lib8390.c   |   2 +
 drivers/net/ethernet/8390/mac8390.c   | 171 --
 drivers/net/ethernet/8390/mcf8390.c   |   4 -
 drivers/net/ethernet/8390/ne.c|   2 +-
 drivers/net/ethernet/8390/pcnet_cs.c  |   4 -
 drivers/net/ethernet/8390/wd.c|   2 +-
 drivers/net/ethernet/8390/zorro8390.c |   5 -
 include/net/Space.h   |   1 -
 14 files changed, 85 insertions(+), 141 deletions(-)

-- 
2.16.1



Re: [PATCH v5 00/14] net: pch_gbe: Fixes & MIPS support

2018-02-18 Thread Florian Fainelli


On 02/18/2018 05:15 PM, David Miller wrote:
> From: Paul Burton 
> Date: Sun, 18 Feb 2018 09:03:10 -0800
> 
>> Hi David,
>>
>> On Sun, Feb 18, 2018 at 10:31:12AM -0500, David Miller wrote:
>>> Nobody is going to see and apply these patches if you don't CC: the
>>> Linux networking development list, netdev@vger.kernel.org
>>
>> You're replying to mail that was "To: netdev@vger.kernel.org" and I see
>> the whole series in the archives[1] so it definitely reached the list.
>>
>> I'm not sure I see the problem?
> 
> Sorry.
> 
> The issue is that your patch series didn't make it into patchwork
> properly, I wonder what happened since you did send it to netdev.
> 
> Hmmm...

The guys at buildroot seem to have seen a number of their patches not
making it to patchwork, thread starts here:

http://buildroot-busybox.2317881.n4.nabble.com/patchwork-ozlabs-org-down-and-e-mails-not-recorded-td183918.html
-- 
Florian


[PATCH nf] netfilter: bridge: ebt_among: add missing match size checks

2018-02-18 Thread Florian Westphal
ebt_among is special, it has a dynamic match size and is exempt
from the central size checks.

Therefore it must check that the size of the match structure
provided from userspace is sane by making sure em->match_size
is at least the minimum size of the expected structure.

The module has such a check, but it's only done after accessing
a structure that might be out of bounds.
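
For illustration only, not part of the patch: a minimal userspace sketch of
the ordering described above, where every length is compared against the
declared size before the next field is read. The blob layout and the names
struct hdr, struct pool_hdr, struct tuple and check_blob() are made up for
this example and do not match the ebtables ABI; only the poolsize sanity
test is modelled on the one in the diff.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical layout: a hdr, then for each pool that is present a
 * pool_hdr followed by poolsize tuples. Not the real ebt_among layout.
 */
struct hdr {
	uint32_t has_dst;
	uint32_t has_src;
};

struct pool_hdr {
	uint32_t poolsize;
};

struct tuple {
	uint32_t cmp[2];
	uint32_t ip;
};

/* Returns 0 if the blob is exactly match_size bytes long, -1 otherwise. */
static int check_blob(const unsigned char *buf, size_t match_size)
{
	struct hdr h;
	size_t expected = sizeof(h);
	int i;

	/* The fixed header must fit before any of it is read. */
	if (expected > match_size)
		return -1;
	memcpy(&h, buf, sizeof(h));

	for (i = 0; i < 2; i++) {
		uint32_t present = i == 0 ? h.has_dst : h.has_src;
		struct pool_hdr p;

		if (!present)
			continue;

		/* The pool header must fit before poolsize is read. */
		if (sizeof(p) > match_size - expected)
			return -1;
		memcpy(&p, buf + expected, sizeof(p));
		expected += sizeof(p);

		/* Reject sizes whose byte count could overflow. */
		if (p.poolsize >= INT_MAX / sizeof(struct tuple))
			return -1;

		/* The tuples themselves must fit as well. */
		if (p.poolsize * sizeof(struct tuple) > match_size - expected)
			return -1;
		expected += p.poolsize * sizeof(struct tuple);
	}

	return expected == match_size ? 0 : -1;
}
```

Because expected never exceeds match_size at any point, the subtraction
match_size - expected cannot wrap, which is the reason for checking after
every step rather than once at the end.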

tested with: ebtables -A INPUT ... \
--among-dst fe:fe:fe:fe:fe:fe
--among-dst fe:fe:fe:fe:fe:fe --among-src 
fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fb,fe:fe:fe:fe:fc:fd,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe
--among-src 
fe:fe:fe:fe:ff:f,fe:fe:fe:fe:fe:fa,fe:fe:fe:fe:fe:fd,fe:fe:fe:fe:fe:fe,fe:fe:fe:fe:fe:fe

Reported-by: 
Signed-off-by: Florian Westphal 
---
 net/bridge/netfilter/ebt_among.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/net/bridge/netfilter/ebt_among.c b/net/bridge/netfilter/ebt_among.c
index ce7152a12bd8..c5afb4232ecb 100644
--- a/net/bridge/netfilter/ebt_among.c
+++ b/net/bridge/netfilter/ebt_among.c
@@ -172,18 +172,35 @@ ebt_among_mt(const struct sk_buff *skb, struct 
xt_action_param *par)
return true;
 }
 
+static bool poolsize_invalid(const struct ebt_mac_wormhash *w)
+{
+   return w && w->poolsize >= (INT_MAX / sizeof(struct 
ebt_mac_wormhash_tuple));
+}
+
 static int ebt_among_mt_check(const struct xt_mtchk_param *par)
 {
const struct ebt_among_info *info = par->matchinfo;
const struct ebt_entry_match *em =
container_of(par->matchinfo, const struct ebt_entry_match, 
data);
-   int expected_length = sizeof(struct ebt_among_info);
+   unsigned int expected_length = sizeof(struct ebt_among_info);
const struct ebt_mac_wormhash *wh_dst, *wh_src;
int err;
 
+   if (expected_length > em->match_size)
+   return -EINVAL;
+
wh_dst = ebt_among_wh_dst(info);
-   wh_src = ebt_among_wh_src(info);
+   if (poolsize_invalid(wh_dst))
+   return -EINVAL;
+
expected_length += ebt_mac_wormhash_size(wh_dst);
+   if (expected_length > em->match_size)
+   return -EINVAL;
+
+   wh_src = ebt_among_wh_src(info);
+   if (poolsize_invalid(wh_src))
+   return -EINVAL;
+
expected_length += ebt_mac_wormhash_size(wh_src);
 
if (em->match_size != EBT_ALIGN(expected_length)) {
-- 
2.16.1



Re: [PATCH net-next3/5] ibmvnic: Free and re-allocate scrqs when tx/rx scrqs change

2018-02-18 Thread Nathan Fontenot
On 02/17/2018 05:32 PM, Nathan Fontenot wrote:
> When the driver resets it is possible that the number of tx/rx
> sub-crqs can change. This patch handles this so that the driver does
> not try to access non-existent sub-crqs.
> 
> Additionally, a parameter is added to release_sub_crqs() so that
> we know if the h_call to free the sub-crq needs to be made. In
> the reset path we have to do a reset of the main crq, which is
> a free followed by a register of the main crq. The free of main
> crq results in all of the sub crq's being free'ed. When updating
> sub-crq count in the reset path we do not want to h_free the
> sub-crqs, they are already free'ed.

This patch has a bug in that it does not use the correct queue count
when releasing the sub crqs when the driver is in the probed state.

A version 2 of the patch set will be sent.

-Nathan

> 
> Signed-off-by: Nathan Fontenot 
> ---
>  drivers/net/ethernet/ibm/ibmvnic.c |   69 ++--
>  1 file changed, 43 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
> index 9cfbb20b5ac1..a3079d5c072c 100644
> --- a/drivers/net/ethernet/ibm/ibmvnic.c
> +++ b/drivers/net/ethernet/ibm/ibmvnic.c
> @@ -90,7 +90,7 @@ MODULE_VERSION(IBMVNIC_DRIVER_VERSION);
> 
>  static int ibmvnic_version = IBMVNIC_INITIAL_VERSION;
>  static int ibmvnic_remove(struct vio_dev *);
> -static void release_sub_crqs(struct ibmvnic_adapter *);
> +static void release_sub_crqs(struct ibmvnic_adapter *, int);
>  static int ibmvnic_reset_crq(struct ibmvnic_adapter *);
>  static int ibmvnic_send_crq_init(struct ibmvnic_adapter *);
>  static int ibmvnic_reenable_crq_queue(struct ibmvnic_adapter *);
> @@ -740,7 +740,7 @@ static int ibmvnic_login(struct net_device *netdev)
>   do {
>   if (adapter->renegotiate) {
>   adapter->renegotiate = false;
> - release_sub_crqs(adapter);
> + release_sub_crqs(adapter, 1);
> 
>   reinit_completion(&adapter->init_done);
>   send_cap_queries(adapter);
> @@ -1602,7 +1602,7 @@ static int do_reset(struct ibmvnic_adapter *adapter,
>   if (adapter->reset_reason == VNIC_RESET_CHANGE_PARAM ||
>   adapter->wait_for_reset) {
>   release_resources(adapter);
> - release_sub_crqs(adapter);
> + release_sub_crqs(adapter, 1);
>   release_crq_queue(adapter);
>   }
> 
> @@ -2241,24 +2241,27 @@ static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter)
>  }
> 
>  static void release_sub_crq_queue(struct ibmvnic_adapter *adapter,
> -   struct ibmvnic_sub_crq_queue *scrq)
> +   struct ibmvnic_sub_crq_queue *scrq,
> +   int do_h_free)
>  {
>   struct device *dev = &adapter->vdev->dev;
>   long rc;
> 
>   netdev_dbg(adapter->netdev, "Releasing sub-CRQ\n");
> 
> - /* Close the sub-crqs */
> - do {
> - rc = plpar_hcall_norets(H_FREE_SUB_CRQ,
> - adapter->vdev->unit_address,
> - scrq->crq_num);
> - } while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
> + if (do_h_free) {
> + /* Close the sub-crqs */
> + do {
> + rc = plpar_hcall_norets(H_FREE_SUB_CRQ,
> + adapter->vdev->unit_address,
> + scrq->crq_num);
> + } while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
> 
> - if (rc) {
> - netdev_err(adapter->netdev,
> -"Failed to release sub-CRQ %16lx, rc = %ld\n",
> -scrq->crq_num, rc);
> + if (rc) {
> + netdev_err(adapter->netdev,
> +"Failed to release sub-CRQ %16lx, rc = %ld\n",
> +scrq->crq_num, rc);
> + }
>   }
> 
>   dma_unmap_single(dev, scrq->msg_token, 4 * PAGE_SIZE,
> @@ -2326,12 +2329,12 @@ static struct ibmvnic_sub_crq_queue *init_sub_crq_queue(struct ibmvnic_adapter
>   return NULL;
>  }
> 
> -static void release_sub_crqs(struct ibmvnic_adapter *adapter)
> +static void release_sub_crqs(struct ibmvnic_adapter *adapter, int do_h_free)
>  {
>   int i;
> 
>   if (adapter->tx_scrq) {
> - for (i = 0; i < adapter->req_tx_queues; i++) {
> + for (i = 0; i < adapter->num_active_tx_scrqs; i++) {
>   if (!adapter->tx_scrq[i])
>   continue;
> 
> @@ -2344,7 +2347,8 @@ static void release_sub_crqs(struct ibmvnic_adapter *adapter)
>   adapter->tx_scrq[i]->irq = 0;
>   }
> 
> - release_sub_crq_queue(adapter, adapter->tx_scrq[i]);
> + release_su

[PATCH V2 net-next 5/5] ibmvnic: Allocate max queues stats buffers

2018-02-18 Thread Nathan Fontenot
To avoid losing any stats when the number of sub-crqs changes, allocate
the max number of stats buffers so a stats buffer exists for all possible
sub-crqs.
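
As a rough userspace illustration of the idea (not the driver code), the stats array can be sized once for the upper bound so that later changes to the queue count never reallocate it. MAX_QUEUES, struct adapter and the helper names below are hypothetical and only chosen for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_QUEUES 16	/* hypothetical upper bound on queues */

struct queue_stats {
	unsigned long packets;
	unsigned long bytes;
};

struct adapter {
	unsigned int req_queues;	/* queues currently requested */
	struct queue_stats *stats;	/* always MAX_QUEUES entries */
};

/* Allocate one zeroed stats slot per possible queue, not per requested queue. */
static int init_stats(struct adapter *a)
{
	a->stats = calloc(MAX_QUEUES, sizeof(*a->stats));
	return a->stats ? 0 : -1;
}

/* Changing the queue count leaves the stats array, and its counters, alone. */
static void set_queue_count(struct adapter *a, unsigned int n)
{
	assert(n <= MAX_QUEUES);
	a->req_queues = n;
}

static void release_stats(struct adapter *a)
{
	free(a->stats);
	a->stats = NULL;
}
```

The trade-off in this sketch is a small fixed memory cost in exchange for counters that survive any shrink or grow of the queue count.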

Signed-off-by: Nathan Fontenot 
---
 drivers/net/ethernet/ibm/ibmvnic.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 0e2894a90885..14f2cffbbb55 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -361,14 +361,14 @@ static void release_stats_buffers(struct ibmvnic_adapter *adapter)
 static int init_stats_buffers(struct ibmvnic_adapter *adapter)
 {
adapter->tx_stats_buffers =
-   kcalloc(adapter->req_tx_queues,
+   kcalloc(IBMVNIC_MAX_QUEUES,
sizeof(struct ibmvnic_tx_queue_stats),
GFP_KERNEL);
if (!adapter->tx_stats_buffers)
return -ENOMEM;
 
adapter->rx_stats_buffers =
-   kcalloc(adapter->req_rx_queues,
+   kcalloc(IBMVNIC_MAX_QUEUES,
sizeof(struct ibmvnic_rx_queue_stats),
GFP_KERNEL);
if (!adapter->rx_stats_buffers)



[PATCH V2 net-next 4/5] ibmvnic: Make napi usage dynamic

2018-02-18 Thread Nathan Fontenot
In order to handle the number of rx sub crqs changing during a driver
reset, the ibmvnic driver also needs to update the number of napi structs.
To do this, the code to init and free the napi structs is moved into its
own routines so they can be called during the reset process.
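
A minimal userspace sketch of that pattern follows, assuming a hypothetical struct poll_ctx in place of the napi structs: init sizes the array from the requested count, release walks the active count, and reset is release followed by init. None of the names below are kernel APIs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-queue polling context. */
struct poll_ctx {
	int registered;
};

struct adapter {
	unsigned int req_rx_queues;	/* count asked for on next init */
	unsigned int num_active_rx;	/* count actually set up */
	struct poll_ctx *ctx;
};

/* Allocate and register one context per requested rx queue. */
static int init_ctx(struct adapter *a)
{
	unsigned int i;

	a->ctx = calloc(a->req_rx_queues, sizeof(*a->ctx));
	if (!a->ctx)
		return -1;

	for (i = 0; i < a->req_rx_queues; i++)
		a->ctx[i].registered = 1;

	return 0;
}

/* Unregister the contexts that are active, then free the array. */
static void release_ctx(struct adapter *a)
{
	unsigned int i;

	if (!a->ctx)
		return;

	for (i = 0; i < a->num_active_rx; i++)
		a->ctx[i].registered = 0;

	free(a->ctx);
	a->ctx = NULL;
}

/* Reset path: drop the old contexts, rebuild for the new count. */
static int reset_ctx(struct adapter *a, unsigned int new_count)
{
	release_ctx(a);
	a->req_rx_queues = new_count;
	if (init_ctx(a)) {
		a->num_active_rx = 0;
		return -1;
	}
	a->num_active_rx = a->req_rx_queues;
	return 0;
}
```

Having both halves as separate routines is what lets the reset path reuse them; the sketch checks the init return value in the reset path, which is a choice of this example only.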

Signed-off-by: Nathan Fontenot 
---
 drivers/net/ethernet/ibm/ibmvnic.c |   67 
 1 file changed, 45 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 35a17234be30..0e2894a90885 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -730,6 +730,43 @@ static void ibmvnic_napi_disable(struct ibmvnic_adapter *adapter)
adapter->napi_enabled = false;
 }
 
+static int init_napi(struct ibmvnic_adapter *adapter)
+{
+   int i;
+
+   adapter->napi = kcalloc(adapter->req_rx_queues,
+   sizeof(struct napi_struct), GFP_KERNEL);
+   if (!adapter->napi)
+   return -ENOMEM;
+
+   for (i = 0; i < adapter->req_rx_queues; i++) {
+   netdev_dbg(adapter->netdev, "Adding napi[%d]\n", i);
+   netif_napi_add(adapter->netdev, &adapter->napi[i],
+  ibmvnic_poll, NAPI_POLL_WEIGHT);
+   }
+
+   return 0;
+}
+
+static void release_napi(struct ibmvnic_adapter *adapter)
+{
+   int i;
+
+   if (!adapter->napi)
+   return;
+
+   for (i = 0; i < adapter->num_active_rx_scrqs; i++) {
+   if (&adapter->napi[i]) {
+   netdev_dbg(adapter->netdev,
+  "Releasing napi[%d]\n", i);
+   netif_napi_del(&adapter->napi[i]);
+   }
+   }
+
+   kfree(adapter->napi);
+   adapter->napi = NULL;
+}
+
 static int ibmvnic_login(struct net_device *netdev)
 {
struct ibmvnic_adapter *adapter = netdev_priv(netdev);
@@ -783,8 +820,6 @@ static int ibmvnic_login(struct net_device *netdev)
 
 static void release_resources(struct ibmvnic_adapter *adapter)
 {
-   int i;
-
release_vpd_data(adapter);
 
release_tx_pools(adapter);
@@ -793,16 +828,7 @@ static void release_resources(struct ibmvnic_adapter *adapter)
release_stats_token(adapter);
release_stats_buffers(adapter);
release_error_buffers(adapter);
-
-   if (adapter->napi) {
-   for (i = 0; i < adapter->req_rx_queues; i++) {
-   if (&adapter->napi[i]) {
-   netdev_dbg(adapter->netdev,
-  "Releasing napi[%d]\n", i);
-   netif_napi_del(&adapter->napi[i]);
-   }
-   }
-   }
+   release_napi(adapter);
 }
 
 static int set_link_state(struct ibmvnic_adapter *adapter, u8 link_state)
@@ -921,7 +947,7 @@ static int ibmvnic_get_vpd(struct ibmvnic_adapter *adapter)
 static int init_resources(struct ibmvnic_adapter *adapter)
 {
struct net_device *netdev = adapter->netdev;
-   int i, rc;
+   int rc;
 
rc = set_real_num_queues(netdev);
if (rc)
@@ -947,16 +973,10 @@ static int init_resources(struct ibmvnic_adapter *adapter)
}
 
adapter->map_id = 1;
-   adapter->napi = kcalloc(adapter->req_rx_queues,
-   sizeof(struct napi_struct), GFP_KERNEL);
-   if (!adapter->napi)
-   return -ENOMEM;
 
-   for (i = 0; i < adapter->req_rx_queues; i++) {
-   netdev_dbg(netdev, "Adding napi[%d]\n", i);
-   netif_napi_add(netdev, &adapter->napi[i], ibmvnic_poll,
-  NAPI_POLL_WEIGHT);
-   }
+   rc = init_napi(adapter);
+   if (rc)
+   return rc;
 
send_map_query(adapter);
 
@@ -1641,6 +1661,9 @@ static int do_reset(struct ibmvnic_adapter *adapter,
init_rx_pools(netdev);
init_tx_pools(netdev);
 
+   release_napi(adapter);
+   init_napi(adapter);
+
adapter->num_active_tx_scrqs = adapter->req_tx_queues;
adapter->num_active_rx_scrqs = adapter->req_rx_queues;
} else {



[PATCH V2 net-next 3/5] ibmvnic: Free and re-allocate scrqs when tx/rx scrqs change

2018-02-18 Thread Nathan Fontenot
When the driver resets it is possible that the number of tx/rx
sub-crqs can change. This patch handles this so that the driver does
not try to access non-existent sub-crqs.

The count for releasing sub crqs depends on the adapter state. The
active queue count is not set in probe, so if we are releasing in probe
state we use the requested queue count.

Additionally, a parameter is added to release_sub_crqs() so that
we know if the h_call to free the sub-crq needs to be made. In
the reset path we have to do a reset of the main crq, which is
a free followed by a register of the main crq. The free of the main
crq results in all of the sub-crqs being freed. When updating the
sub-crq count in the reset path we do not want to h_free the
sub-crqs because they are already freed.
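
The two decisions described above (which count to walk, and whether to make the hypervisor call) can be sketched in userspace C. This is an illustration only: the enum, struct adapter and the counters that stand in for the h_call are hypothetical, not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>

enum state { STATE_PROBED, STATE_OPEN };

struct adapter {
	enum state state;
	unsigned int req_queues;	/* requested count, valid after probe */
	unsigned int num_active_queues;	/* set only once resources are up */
	unsigned int hcalls_made;	/* simulated hypervisor free calls */
	unsigned int released;		/* queues whose local memory was freed */
};

/* In probed state the active count is not set yet, so use the requested one. */
static unsigned int queues_to_release(const struct adapter *a)
{
	return a->state == STATE_PROBED ? a->req_queues : a->num_active_queues;
}

/*
 * Release every queue. The (simulated) hypervisor free is skipped when the
 * caller knows the queues were already freed on the hypervisor side.
 */
static void release_queues(struct adapter *a, bool do_h_free)
{
	unsigned int i, n = queues_to_release(a);

	for (i = 0; i < n; i++) {
		if (do_h_free)
			a->hcalls_made++;
		a->released++;
	}
	a->num_active_queues = 0;
}
```

Local cleanup always runs in this sketch; only the remote free is conditional, which mirrors the split the commit message describes.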

Signed-off-by: Nathan Fontenot 
---

Updates for V2: Use the correct scrq count for releasing scrqs
when adapter is in probed state.

 drivers/net/ethernet/ibm/ibmvnic.c |   78 
 1 file changed, 52 insertions(+), 26 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 9cfbb20b5ac1..35a17234be30 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -90,7 +90,7 @@ MODULE_VERSION(IBMVNIC_DRIVER_VERSION);
 
 static int ibmvnic_version = IBMVNIC_INITIAL_VERSION;
 static int ibmvnic_remove(struct vio_dev *);
-static void release_sub_crqs(struct ibmvnic_adapter *);
+static void release_sub_crqs(struct ibmvnic_adapter *, int);
 static int ibmvnic_reset_crq(struct ibmvnic_adapter *);
 static int ibmvnic_send_crq_init(struct ibmvnic_adapter *);
 static int ibmvnic_reenable_crq_queue(struct ibmvnic_adapter *);
@@ -740,7 +740,7 @@ static int ibmvnic_login(struct net_device *netdev)
do {
if (adapter->renegotiate) {
adapter->renegotiate = false;
-   release_sub_crqs(adapter);
+   release_sub_crqs(adapter, 1);
 
reinit_completion(&adapter->init_done);
send_cap_queries(adapter);
@@ -1602,7 +1602,7 @@ static int do_reset(struct ibmvnic_adapter *adapter,
if (adapter->reset_reason == VNIC_RESET_CHANGE_PARAM ||
adapter->wait_for_reset) {
release_resources(adapter);
-   release_sub_crqs(adapter);
+   release_sub_crqs(adapter, 1);
release_crq_queue(adapter);
}
 
@@ -2241,24 +2241,27 @@ static int reset_sub_crq_queues(struct ibmvnic_adapter *adapter)
 }
 
 static void release_sub_crq_queue(struct ibmvnic_adapter *adapter,
- struct ibmvnic_sub_crq_queue *scrq)
+ struct ibmvnic_sub_crq_queue *scrq,
+ int do_h_free)
 {
struct device *dev = &adapter->vdev->dev;
long rc;
 
netdev_dbg(adapter->netdev, "Releasing sub-CRQ\n");
 
-   /* Close the sub-crqs */
-   do {
-   rc = plpar_hcall_norets(H_FREE_SUB_CRQ,
-   adapter->vdev->unit_address,
-   scrq->crq_num);
-   } while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
+   if (do_h_free) {
+   /* Close the sub-crqs */
+   do {
+   rc = plpar_hcall_norets(H_FREE_SUB_CRQ,
+   adapter->vdev->unit_address,
+   scrq->crq_num);
+   } while (rc == H_BUSY || H_IS_LONG_BUSY(rc));
 
-   if (rc) {
-   netdev_err(adapter->netdev,
-  "Failed to release sub-CRQ %16lx, rc = %ld\n",
-  scrq->crq_num, rc);
+   if (rc) {
+   netdev_err(adapter->netdev,
+  "Failed to release sub-CRQ %16lx, rc = %ld\n",
+  scrq->crq_num, rc);
+   }
}
 
dma_unmap_single(dev, scrq->msg_token, 4 * PAGE_SIZE,
@@ -2326,12 +2329,21 @@ static struct ibmvnic_sub_crq_queue *init_sub_crq_queue(struct ibmvnic_adapter
return NULL;
 }
 
-static void release_sub_crqs(struct ibmvnic_adapter *adapter)
+static void release_sub_crqs(struct ibmvnic_adapter *adapter, int do_h_free)
 {
+   u64 num_tx_scrqs, num_rx_scrqs;
int i;
 
+   if (adapter->state == VNIC_PROBED) {
+   num_tx_scrqs = adapter->req_tx_queues;
+   num_rx_scrqs = adapter->req_rx_queues;
+   } else {
+   num_tx_scrqs = adapter->num_active_tx_scrqs;
+   num_rx_scrqs = adapter->num_active_rx_scrqs;
+   }
+
if (adapter->tx_scrq) {
-   for (i = 0; i < adapter->req_tx_queues; i++) {
+   for (i = 0; i < num_tx_scrqs; i++) {
if (!adapter->tx_scrq[i])
continue;
 
@@ -2344,7 +2356,8 @@ stati

[PATCH V2 net-next 2/5] ibmvnic: Move active sub-crq count settings

2018-02-18 Thread Nathan Fontenot
In preparation for using the active scrq count to track more active
resources, move the setting of the active count to after initialization
occurs in initial driver init and during driver reset.

Signed-off-by: Nathan Fontenot 
---
 drivers/net/ethernet/ibm/ibmvnic.c |   17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index ca2e3fbfd848..9cfbb20b5ac1 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -484,7 +484,6 @@ static void release_rx_pools(struct ibmvnic_adapter *adapter)
 
kfree(adapter->rx_pool);
adapter->rx_pool = NULL;
-   adapter->num_active_rx_scrqs = 0;
 }
 
 static int init_rx_pools(struct net_device *netdev)
@@ -509,8 +508,6 @@ static int init_rx_pools(struct net_device *netdev)
return -1;
}
 
-   adapter->num_active_rx_scrqs = 0;
-
for (i = 0; i < rxadd_subcrqs; i++) {
rx_pool = &adapter->rx_pool[i];
 
@@ -554,8 +551,6 @@ static int init_rx_pools(struct net_device *netdev)
rx_pool->next_free = 0;
}
 
-   adapter->num_active_rx_scrqs = rxadd_subcrqs;
-
return 0;
 }
 
@@ -624,7 +619,6 @@ static void release_tx_pools(struct ibmvnic_adapter *adapter)
 
kfree(adapter->tx_pool);
adapter->tx_pool = NULL;
-   adapter->num_active_tx_scrqs = 0;
 }
 
 static int init_tx_pools(struct net_device *netdev)
@@ -641,8 +635,6 @@ static int init_tx_pools(struct net_device *netdev)
if (!adapter->tx_pool)
return -1;
 
-   adapter->num_active_tx_scrqs = 0;
-
for (i = 0; i < tx_subcrqs; i++) {
tx_pool = &adapter->tx_pool[i];
 
@@ -690,8 +682,6 @@ static int init_tx_pools(struct net_device *netdev)
tx_pool->producer_index = 0;
}
 
-   adapter->num_active_tx_scrqs = tx_subcrqs;
-
return 0;
 }
 
@@ -975,6 +965,10 @@ static int init_resources(struct ibmvnic_adapter *adapter)
return rc;
 
rc = init_tx_pools(netdev);
+
+   adapter->num_active_tx_scrqs = adapter->req_tx_queues;
+   adapter->num_active_rx_scrqs = adapter->req_rx_queues;
+
return rc;
 }
 
@@ -1646,6 +1640,9 @@ static int do_reset(struct ibmvnic_adapter *adapter,
release_tx_pools(adapter);
init_rx_pools(netdev);
init_tx_pools(netdev);
+
+   adapter->num_active_tx_scrqs = adapter->req_tx_queues;
+   adapter->num_active_rx_scrqs = adapter->req_rx_queues;
} else {
rc = reset_tx_pools(adapter);
if (rc)



[PATCH V2 net-next 1/5] ibmvnic: Rename active queue count variables

2018-02-18 Thread Nathan Fontenot
Rename the tx/rx active pool variables to be tx/rx active scrq
counts. The tx/rx pools are per sub-crq so this is a more appropriate
name. This also is a preparatory step for using these variables
for handling updates to sub-crqs and napi based on the active
count.

Signed-off-by: Nathan Fontenot 
---
 drivers/net/ethernet/ibm/ibmvnic.c |   16 
 drivers/net/ethernet/ibm/ibmvnic.h |4 ++--
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 27447260215d..ca2e3fbfd848 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -461,7 +461,7 @@ static void release_rx_pools(struct ibmvnic_adapter *adapter)
if (!adapter->rx_pool)
return;
 
-   for (i = 0; i < adapter->num_active_rx_pools; i++) {
+   for (i = 0; i < adapter->num_active_rx_scrqs; i++) {
rx_pool = &adapter->rx_pool[i];
 
netdev_dbg(adapter->netdev, "Releasing rx_pool[%d]\n", i);
@@ -484,7 +484,7 @@ static void release_rx_pools(struct ibmvnic_adapter *adapter)
 
kfree(adapter->rx_pool);
adapter->rx_pool = NULL;
-   adapter->num_active_rx_pools = 0;
+   adapter->num_active_rx_scrqs = 0;
 }
 
 static int init_rx_pools(struct net_device *netdev)
@@ -509,7 +509,7 @@ static int init_rx_pools(struct net_device *netdev)
return -1;
}
 
-   adapter->num_active_rx_pools = 0;
+   adapter->num_active_rx_scrqs = 0;
 
for (i = 0; i < rxadd_subcrqs; i++) {
rx_pool = &adapter->rx_pool[i];
@@ -554,7 +554,7 @@ static int init_rx_pools(struct net_device *netdev)
rx_pool->next_free = 0;
}
 
-   adapter->num_active_rx_pools = rxadd_subcrqs;
+   adapter->num_active_rx_scrqs = rxadd_subcrqs;
 
return 0;
 }
@@ -613,7 +613,7 @@ static void release_tx_pools(struct ibmvnic_adapter *adapter)
if (!adapter->tx_pool)
return;
 
-   for (i = 0; i < adapter->num_active_tx_pools; i++) {
+   for (i = 0; i < adapter->num_active_tx_scrqs; i++) {
netdev_dbg(adapter->netdev, "Releasing tx_pool[%d]\n", i);
tx_pool = &adapter->tx_pool[i];
kfree(tx_pool->tx_buff);
@@ -624,7 +624,7 @@ static void release_tx_pools(struct ibmvnic_adapter *adapter)
 
kfree(adapter->tx_pool);
adapter->tx_pool = NULL;
-   adapter->num_active_tx_pools = 0;
+   adapter->num_active_tx_scrqs = 0;
 }
 
 static int init_tx_pools(struct net_device *netdev)
@@ -641,7 +641,7 @@ static int init_tx_pools(struct net_device *netdev)
if (!adapter->tx_pool)
return -1;
 
-   adapter->num_active_tx_pools = 0;
+   adapter->num_active_tx_scrqs = 0;
 
for (i = 0; i < tx_subcrqs; i++) {
tx_pool = &adapter->tx_pool[i];
@@ -690,7 +690,7 @@ static int init_tx_pools(struct net_device *netdev)
tx_pool->producer_index = 0;
}
 
-   adapter->num_active_tx_pools = tx_subcrqs;
+   adapter->num_active_tx_scrqs = tx_subcrqs;
 
return 0;
 }
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index fe21a6e2ddae..c6d0b4afe899 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -1091,8 +1091,8 @@ struct ibmvnic_adapter {
u64 opt_rxba_entries_per_subcrq;
__be64 tx_rx_desc_req;
u8 map_id;
-   u64 num_active_rx_pools;
-   u64 num_active_tx_pools;
+   u64 num_active_rx_scrqs;
+   u64 num_active_tx_scrqs;
 
struct tasklet_struct tasklet;
enum vnic_state state;



[PATCH V2 net-next 0/5] ibmvnic: Make driver resources dynamic

2018-02-18 Thread Nathan Fontenot
The ibmvnic driver needs to be able to handle the number of tx/rx
sub-crqs changing during a reset of the driver. To do this several
changes need to be made. First the num_active_[tx|rx]_pools
counters need to be renamed to num_active_[tx|rx]_scrqs, and
updated after resource initialization.

With this change we can now release and init the sub crqs and napi
(for rx sub crqs) when the number of sub crqs changes.

Lastly, the stats buffer allocation is updated to always allocate
stats buffers for the maximum number of sub-crqs.

-Nathan
---

Updates for V2: 
Patch 3/5 - Use correct queue count when driver is in probed state
for releasing sub crqs.

Nathan Fontenot (5):
  ibmvnic: Rename active queue count variables
  ibmvnic: Move active sub-crq count settings
  ibmvnic: Free and re-allocate scrqs when tx/rx scrqs change
  ibmvnic: Make napi usage dynamic
  ibmvnic: Allocate max queues stats buffers


 drivers/net/ethernet/ibm/ibmvnic.c |  170 +++-
 drivers/net/ethernet/ibm/ibmvnic.h |4 -
 2 files changed, 110 insertions(+), 64 deletions(-)



Re: [PATCH V6 2/4] sctp: Add ip option support

2018-02-18 Thread Neil Horman
On Sun, Feb 18, 2018 at 01:44:42PM +, Richard Haines wrote:
> On Fri, 2018-02-16 at 23:28 -0500, Neil Horman wrote:
> > On Fri, Feb 16, 2018 at 07:51:02PM -0200, Marcelo Ricardo Leitner
> > wrote:
> > > On Fri, Feb 16, 2018 at 03:14:35PM -0500, Neil Horman wrote:
> > > > On Fri, Feb 16, 2018 at 10:56:07AM -0200, Marcelo Ricardo Leitner
> > > > wrote:
> > > > > On Thu, Feb 15, 2018 at 09:15:40AM -0500, Neil Horman wrote:
> > > > > > On Tue, Feb 13, 2018 at 08:54:44PM +, Richard Haines
> > > > > > wrote:
> > > > > > > Add ip option support to allow LSM security modules to
> > > > > > > utilise CIPSO/IPv4
> > > > > > > and CALIPSO/IPv6 services.
> > > > > > > 
> > > > > > > Signed-off-by: Richard Haines
> > > > > > > ---
> > > > > > >  include/net/sctp/sctp.h|  4 +++-
> > > > > > >  include/net/sctp/structs.h |  2 ++
> > > > > > >  net/sctp/chunk.c   | 12 +++-
> > > > > > >  net/sctp/ipv6.c| 42
> > > > > > > +++---
> > > > > > >  net/sctp/output.c  |  5 -
> > > > > > >  net/sctp/protocol.c| 36
> > > > > > > 
> > > > > > >  net/sctp/socket.c  | 14 ++
> > > > > > >  7 files changed, 97 insertions(+), 18 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/include/net/sctp/sctp.h
> > > > > > > b/include/net/sctp/sctp.h
> > > > > > > index f7ae6b0..25c5c87 100644
> > > > > > > --- a/include/net/sctp/sctp.h
> > > > > > > +++ b/include/net/sctp/sctp.h
> > > > > > > @@ -441,9 +441,11 @@ static inline int
> > > > > > > sctp_list_single_entry(struct list_head *head)
> > > > > > >  static inline int sctp_frag_point(const struct
> > > > > > > sctp_association *asoc, int pmtu)
> > > > > > >  {
> > > > > > >   struct sctp_sock *sp = sctp_sk(asoc->base.sk);
> > > > > > > + struct sctp_af *af = sp->pf->af;
> > > > > > >   int frag = pmtu;
> > > > > > >  
> > > > > > > - frag -= sp->pf->af->net_header_len;
> > > > > > > + frag -= af->ip_options_len(asoc->base.sk);
> > > > > > > + frag -= af->net_header_len;
> > > > > > >   frag -= sizeof(struct sctphdr) +
> > > > > > > sctp_datachk_len(&asoc->stream);
> > > > > > >  
> > > > > > >   if (asoc->user_frag)
> > > > > > > diff --git a/include/net/sctp/structs.h
> > > > > > > b/include/net/sctp/structs.h
> > > > > > > index 03e92dd..ead5fce 100644
> > > > > > > --- a/include/net/sctp/structs.h
> > > > > > > +++ b/include/net/sctp/structs.h
> > > > > > > @@ -491,6 +491,7 @@ struct sctp_af {
> > > > > > >   void(*ecn_capable)(struct sock
> > > > > > > *sk);
> > > > > > >   __u16   net_header_len;
> > > > > > >   int sockaddr_len;
> > > > > > > + int (*ip_options_len)(struct sock
> > > > > > > *sk);
> > > > > > >   sa_family_t sa_family;
> > > > > > >   struct list_head list;
> > > > > > >  };
> > > > > > > @@ -515,6 +516,7 @@ struct sctp_pf {
> > > > > > >   int (*addr_to_user)(struct sctp_sock *sk, union
> > > > > > > sctp_addr *addr);
> > > > > > >   void (*to_sk_saddr)(union sctp_addr *, struct sock
> > > > > > > *sk);
> > > > > > >   void (*to_sk_daddr)(union sctp_addr *, struct sock
> > > > > > > *sk);
> > > > > > > + void (*copy_ip_options)(struct sock *sk, struct
> > > > > > > sock *newsk);
> > > > > > >   struct sctp_af *af;
> > > > > > >  };
> > > > > > >  
> > > > > > > diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
> > > > > > > index 991a530..d5c0ef7 100644
> > > > > > > --- a/net/sctp/chunk.c
> > > > > > > +++ b/net/sctp/chunk.c
> > > > > > > @@ -154,7 +154,6 @@ static void sctp_datamsg_assign(struct
> > > > > > > sctp_datamsg *msg, struct sctp_chunk *chu
> > > > > > >   chunk->msg = msg;
> > > > > > >  }
> > > > > > >  
> > > > > > > -
> > > > > > >  /* A data chunk can have a maximum payload of (2^16 -
> > > > > > > 20).  Break
> > > > > > >   * down any such message into smaller
> > > > > > > chunks.  Opportunistically, fragment
> > > > > > >   * the chunks down to the current MTU constraints.  We may
> > > > > > > get refragmented
> > > > > > > @@ -171,6 +170,8 @@ struct sctp_datamsg
> > > > > > > *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > > > >   struct list_head *pos, *temp;
> > > > > > >   struct sctp_chunk *chunk;
> > > > > > >   struct sctp_datamsg *msg;
> > > > > > > + struct sctp_sock *sp;
> > > > > > > + struct sctp_af *af;
> > > > > > >   int err;
> > > > > > >  
> > > > > > >   msg = sctp_datamsg_new(GFP_KERNEL);
> > > > > > > @@ -189,9 +190,11 @@ struct sctp_datamsg
> > > > > > > *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > > > >   /* This is the biggest possible DATA chunk that
> > > > > > > can fit into
> > > > > > >* the packet
> > > > > > >*/
> > > > > > > - max_data = asoc->pathmtu -
> > > > > > > -sctp_sk(asoc->base.sk)->pf->af-
> > > > > > > >net_header_len -
> > > > > > > -sizeof(struct sctphdr) -
> > > > > > > sctp_datachk_len(&asoc->stream);
> > > > > > > + s

Re: [PATCH v5 00/14] net: pch_gbe: Fixes & MIPS support

2018-02-18 Thread David Miller
From: Paul Burton 
Date: Sun, 18 Feb 2018 09:03:10 -0800

> Hi David,
> 
> On Sun, Feb 18, 2018 at 10:31:12AM -0500, David Miller wrote:
>> Nobody is going to see and apply these patches if you don't CC: the
>> Linux networking development list, netdev@vger.kernel.org
> 
> You're replying to mail that was "To: netdev@vger.kernel.org" and I see
> the whole series in the archives[1] so it definitely reached the list.
> 
> I'm not sure I see the problem?

Sorry.

The issue is that your patch series didn't make it into patchwork
properly, I wonder what happened since you did send it to netdev.

Hmmm...


[PATCH nf] netfilter: ebtables: CONFIG_COMPAT: don't trust userland offsets

2018-02-18 Thread Florian Westphal
We need to make sure the offsets are not out of range of the
total size.
Also check that they are in ascending order.
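
The two checks named above (each offset inside the total size, and offsets in ascending order) can be sketched as a small standalone C helper. offsets_valid() and its parameters are hypothetical names for this illustration, not functions from ebtables.c.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 if every offset is smaller than total and the offsets never
 * decrease from one entry to the next, 0 otherwise. Equal neighbours are
 * accepted, matching a "previous > current is an error" rule.
 */
static int offsets_valid(const unsigned int *offsets, size_t n,
			 unsigned int total)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (offsets[i] >= total)
			return 0;	/* points at or past the end */
		if (i > 0 && offsets[i - 1] > offsets[i])
			return 0;	/* not in ascending order */
	}

	return 1;
}
```

Running both checks up front means later arithmetic such as offsets[j] - offsets[i] cannot go negative or reach outside the entry in this sketch.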

The WARN_ON triggered by syzkaller (it sets panic_on_warn) is
changed to also bail out, no point in continuing parsing.

Briefly tested with a simple ruleset of
-A INPUT --limit 1/s --log
plus jump to custom chains using a 32bit ebtables binary.

Reported-by: 
Signed-off-by: Florian Westphal 
---
 net/bridge/netfilter/ebtables.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 02c4b409d317..3f536c7a3354 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -2053,7 +2053,9 @@ static int ebt_size_mwt(struct compat_ebt_entry_mwt *match32,
if (match_kern)
match_kern->match_size = ret;
 
-   WARN_ON(type == EBT_COMPAT_TARGET && size_left);
+   if (WARN_ON(type == EBT_COMPAT_TARGET && size_left))
+   return -EINVAL;
+
match32 = (struct compat_ebt_entry_mwt *) buf;
}
 
@@ -2109,6 +2111,15 @@ static int size_entry_mwt(struct ebt_entry *entry, const unsigned char *base,
 *
 * offsets are relative to beginning of struct ebt_entry (i.e., 0).
 */
+   for (i = 0; i < 4 ; ++i) {
+   if (offsets[i] >= *total)
+   return -EINVAL;
+   if (i == 0)
+   continue;
+   if (offsets[i-1] > offsets[i])
+   return -EINVAL;
+   }
+
for (i = 0, j = 1 ; j < 4 ; j++, i++) {
struct compat_ebt_entry_mwt *match32;
unsigned int size;
-- 
2.16.1



Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-18 Thread Florian Westphal
Daniel Borkmann  wrote:
> As rule translation can potentially become very complex, this is performed
> entirely in user space. In order to ease deployment, request_module() code
> is extended to allow user mode helpers to be invoked. Idea is that user mode
> helpers are built as part of the kernel build and installed as traditional
> kernel modules with .ko file extension into distro specified location,
> such that from a distribution point of view, they are no different than
> regular kernel modules. Thus, allow request_module() logic to load such
> user mode helper (umh) binaries via:
> 
>   request_module("foo") ->
> call_umh("modprobe foo") ->
>   sys_finit_module(FD of /lib/modules/.../foo.ko) ->
> call_umh(struct file)
> 
> Such approach enables kernel to delegate functionality traditionally done
> by kernel modules into user space processes (either root or !root)

Unrelated:  AFAIU this would allow to e.g. move the compat32 handlers
(which are very ugly/error prone) off to userspace?

compat_syscall -> umh_32_64_xlate -> syscall() ?

[ feel free to move this to different thread, only mentioning this
  so I won't forget ]


a lot of WARNING, nf_xfrm_me_harder in 4.15.x

2018-02-18 Thread Denys Fedoryshchenko

Is there any bug with that, or is it just some sort of spam?
I ask because I am troubleshooting a "hard to catch" bug in ppp/pppoe
at the same time.

Workload: pppoe bras
I am going to try the latest stable 4.14.x in 1-2 days as well, but
I think I noticed this message appearing there too, under some
conditions.


[   49.784216] WARNING: CPU: 4 PID: 0 at ./include/net/dst.h:256 
nf_xfrm_me_harder+0x12d/0x2d7 [nf_nat]
[   49.784847] Modules linked in: pppoe pppox ppp_generic slhc 
netconsole configfs coretemp nf_nat_pptp nf_nat_proto_gre 
nf_conntrack_pptp nf_conntrack_proto_gre tun xt_TEE nf_dup_ipv4 
xt_REDIRECT nf_nat_redirect xt_nat xt_TCPMSS ipt_REJECT nf_reject_ipv4 
xt_set xt_string xt_connmark xt_DSCP xt_mark xt_tcpudp ip_set_hash_net 
ip_set_hash_ip ip_set nfnetlink iptable_mangle iptable_filter 
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat 
nf_conntrack ip_tables x_tables 8021q garp mrp stp llc ixgbe dca
[   49.786762] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 
4.15.4-build-0134 #2
[   49.787104] Hardware name: HP ProLiant DL320e Gen8 v2, BIOS P80 
04/02/2015

[   49.787448] RIP: 0010:nf_xfrm_me_harder+0x12d/0x2d7 [nf_nat]
[   49.787782] RSP: 0018:8803f23078e0 EFLAGS: 00010246
[   49.788114] RAX:  RBX: 8803d8acad00 RCX: 
11007a875b00
[   49.788463] RDX: 11007b159500 RSI:  RDI: 
8803d8acad48
[   49.788818] RBP: 8803d43ada40 R08: ed007e460f27 R09: 
8803f2307900
[   49.789175] R10: ed007e460f26 R11: 0001 R12: 
11007e460f1c
[   49.789528] R13: 8803d43ada98 R14: 83e2b600 R15: 
8803d8acad80
[   49.789881] FS:  () GS:8803f230() 
knlGS:

[   49.790500] CS:  0010 DS:  ES:  CR0: 80050033
[   49.790850] CR2: 7f758e3aa490 CR3: 000445a0d001 CR4: 
001606e0

[   49.791192] Call Trace:
[   49.791517]  
[   49.791845]  ? __nf_nat_decode_session+0x108/0x108 [nf_nat]
[   49.792180]  ? nf_nat_ipv4_fn+0x33d/0x4df [nf_nat_ipv4]
[   49.792515]  ? iptable_nat_ipv4_fn+0xc/0xc [iptable_nat]
[   49.792849]  nf_nat_ipv4_out+0x235/0x305 [nf_nat_ipv4]
[   49.793183]  ? iptable_nat_ipv4_local_fn+0xc/0xc [iptable_nat]
[   49.793519]  nf_hook_slow+0xb1/0x11b
[   49.793850]  ip_output+0x205/0x243
[   49.794180]  ? ip_mc_output+0x548/0x548
[   49.794508]  ? ip_fragment.constprop.5+0x197/0x197
[   49.794841]  ? iptable_filter_net_init+0x1a/0x1a [iptable_filter]
[   49.795173]  ? nf_hook_slow+0xb1/0x11b
[   49.795504]  ip_forward+0xe9c/0xecb
[   49.795836]  ? ip_forward_finish+0x110/0x110
[   49.796166]  ? ip_frag_mem+0x3d/0x3d
[   49.796493]  ? ip_rcv_finish+0xcf8/0xd91
[   49.796830]  ip_rcv+0x985/0xa12
[   49.797178]  ? ip_local_deliver+0x225/0x225
[   49.797536]  ? ip_local_deliver_finish+0x599/0x599
[   49.797893]  ? ip_local_deliver+0x225/0x225
[   49.798254]  __netif_receive_skb_core+0x10ce/0x1c76
[   49.798613]  ? netif_set_xps_queue+0xbdb/0xbdb
[   49.798972]  ? process_backlog+0x1c5/0x3c0
[   49.799323]  process_backlog+0x1c5/0x3c0
[   49.799674]  net_rx_action+0x3aa/0x840
[   49.800026]  ? napi_complete_done+0x22b/0x22b
[   49.800378]  ? __tick_nohz_idle_enter+0x42b/0x9b3
[   49.800733]  ? get_cpu_iowait_time_us+0x16f/0x16f
[   49.801084]  __do_softirq+0x17f/0x34a
[   49.801411]  ? flush_smp_call_function_queue+0x16a/0x229
[   49.801750]  irq_exit+0x8f/0xf9
[   49.802080]  call_function_single_interrupt+0x92/0xa0
[   49.802420]  
[   49.802765] RIP: 0010:mwait_idle+0x99/0xac
[   49.803106] RSP: 0018:8803f0317ef8 EFLAGS: 0246 ORIG_RAX: 
ff04
[   49.803709] RAX:  RBX: 8803f02e4240 RCX: 

[   49.804042] RDX: 11007e05c848 RSI:  RDI: 

[   49.804372] RBP: 8803f02e4240 R08: 55574f086bb0 R09: 
7f78bd996700
[   49.804705] R10: 8803f0317dd0 R11: 0293 R12: 

[   49.805038] R13: dc00 R14: ed007e05c848 R15: 
8803f02e4240

[   49.805373]  do_idle+0xe6/0x19a
[   49.805700]  cpu_startup_entry+0x18/0x1a
[   49.806033]  secondary_startup_64+0xa5/0xb0
[   49.806359] Code: e0 07 83 c0 03 38 d0 7c 0c 84 d2 74 08 4c 89 ff e8 
65 3f 26 e1 8b 83 80 00 00 00 85 c0 74 0c 8d 50 01 f0 41 0f b1 17 74 04 
eb f0 <0f> ff 48 8d 7d 18 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1

[   49.807242] ---[ end trace 2654a347942730c3 ]---
[   49.807580] dst_release: dst:24366567 refcnt:-1
[  164.894058] WARNING: CPU: 5 PID: 22617 at ./include/net/dst.h:256 
nf_xfrm_me_harder+0x12d/0x2d7 [nf_nat]
[  164.894686] Modules linked in: pppoe pppox ppp_generic slhc 
netconsole configfs coretemp nf_nat_pptp nf_nat_proto_gre 
nf_conntrack_pptp nf_conntrack_proto_gre tun xt_TEE nf_dup_ipv4 
xt_REDIRECT nf_nat_redirect xt_nat xt_TCPMSS ipt_REJECT nf_reject_ipv4 
xt_set xt_string xt_connmark xt_DSCP xt_mark xt_tcpudp ip_set_hash_net 
ip_set_hash_ip ip_set nfnetlink iptable_mangle iptable_filter 
iptable_nat nf_conntrack_ipv4 nf_def

Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-18 Thread Eric Dumazet
On Sun, 2018-02-18 at 22:49 +0100, Oleksandr Natalenko wrote:
> Hi.
> 
> On Sunday, 18 February 2018 22:04:27 CET Eric Dumazet wrote:
> > I was able to take a look today, and I believe this is the time to
> > switch TCP to GSO being always on.
> > 
> > As a bonus, we get speed boost for cubic as well.
> > 
> > Todays high BDP and recent TCP improvements (rtx queue as rb-tree, sack
> > coalescing, TCP pacing...) all were developed/tested/maintained with
> > GSO/TSO being the norm.
> > 
> > Can you please test the following patch ?
> 
> Yes, results below:
> 
> BBR+fq:
> sg on:  6.02 Gbits/sec
> sg off: 1.33 Gbits/sec
> 
> BBR+pfifo_fast:
> sg on:  4.13 Gbits/sec
> sg off: 1.34 Gbits/sec
> 
> BBR+fq_codel:
> sg on:  4.16 Gbits/sec
> sg off: 1.35 Gbits/sec
> 
> Reno+fq:
> sg on:  6.44 Gbits/sec
> sg off: 1.39 Gbits/sec
> 
> Reno+pfifo_fast:
> sg on:  6.36 Gbits/sec
> sg off: 1.39 Gbits/sec
> 
> Reno+fq_codel:
> sg on:  6.41 Gbits/sec
> sg off: 1.38 Gbits/sec
> 
> While BBR still suffers when fq is not used, disabling sg doesn't bring 
> drastic throughput drop anymore. So, looks good to me, eh?
> 

Indeed :)

Here are my results on 40Gbit link (mlx4) :

BBR+fq:
sg on:  26 Gbits/sec
sg off: 15.7 Gbits/sec   (was 2.3 Gbit before patch)

BBR+pfifo_fast:
sg on:  24.2 Gbits/sec
sg off: 14.9 Gbits/sec  (was 0.66 Gbit before patch !!! )

BBR+fq_codel:
sg on:  24.4 Gbits/sec
sg off: 15 Gbits/sec  (was 0.66 Gbit before patch !!! )

Reno+fq:
sg on:  20 Gbits/sec
sg off: 15.7 Gbits/sec  (was 6 Gbit)

Reno+pfifo_fast:
sg on:  25.7 Gbits/sec
sg off: 15.5 Gbits/sec  (was 7 Gbit)

Reno+fq_codel:
sg on:  25.8 Gbits/sec
sg off: 16 Gbits/sec(was 7 Gbit)

Definitely worth it ;)

Thanks !



[PATCH] net: sched: report if filter is too large to dump

2018-02-18 Thread Roman Kapl
So far, if the filter was too large to fit in the allocated skb, the
kernel did not return any error and stopped dumping. Modify the dumper
so that it returns -EMSGSIZE when a filter fails to dump and it is the
first filter in the skb. If we are not first, we will get a next chance
with more room.

I understand this is pretty near to being an API change, but the
original design (silent truncation) can be considered a bug.

Note: The error case can happen pretty easily if you create a filter
with 32 actions and have 4kb pages. Also recent versions of iproute try
to be clever with their buffer allocation size, which in turn leads to
smaller SKBs in kernel.

Signed-off-by: Roman Kapl 
---
 net/sched/cls_api.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index f21610c5da1a..b5771a586c2d 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -1399,13 +1399,17 @@ static int tc_dump_tfilter(struct sk_buff *skb, struct netlink_callback *cb)
nla_get_u32(tca[TCA_CHAIN]) != chain->index)
continue;
if (!tcf_chain_dump(chain, q, parent, skb, cb,
-   index_start, &index))
+   index_start, &index)) {
+   err = -EMSGSIZE;
break;
+   }
}
 
cb->args[0] = index;
 
 out:
+   if (skb->len == 0 && err)
+   return err;
return skb->len;
 }
 
-- 
2.16.1
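
As an illustration of the return-value rule in the patch above, here is a
minimal user-space sketch. It is not kernel code: dump_result() and its
parameters are hypothetical names, with bytes_written standing in for
skb->len and err for the local error variable. Only the rule itself (report
the error when nothing was written, otherwise report the bytes written so
the next dump call gets more room) is taken from the patch.

```c
#include <errno.h>

/* Hypothetical user-space model of the tail of tc_dump_tfilter().
 * bytes_written stands in for skb->len, err for the local error code.
 */
static int dump_result(int bytes_written, int err)
{
	/* Nothing fit into the buffer and a filter failed to dump:
	 * surface the error instead of silently returning 0.
	 */
	if (bytes_written == 0 && err)
		return err;

	/* Otherwise report progress; the dump is called again later
	 * with an empty buffer, so the failed filter gets another try.
	 */
	return bytes_written;
}
```

With this shape, dump_result(0, -EMSGSIZE) yields -EMSGSIZE, while any
non-empty buffer is still returned as a length.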



Re: [PATCH v5 00/14] net: pch_gbe: Fixes & MIPS support

2018-02-18 Thread Paul Burton
Hi Andrew,

On Sun, Feb 18, 2018 at 06:56:07PM +0100, Andrew Lunn wrote:
> On Sun, Feb 18, 2018 at 09:03:10AM -0800, Paul Burton wrote:
> > Hi David,
> > 
> > On Sun, Feb 18, 2018 at 10:31:12AM -0500, David Miller wrote:
> > > Nobody is going to see and apply these patches if you don't CC: the
> > > Linux networking development list, netdev@vger.kernel.org
> > 
> > You're replying to mail that was "To: netdev@vger.kernel.org" and I see
> > the whole series in the archives[1] so it definitely reached the list.
> > 
> > I'm not sure I see the problem?
> 
> Hi Paul
> 
> I'm guess that David is wondering about version 1-4 of this patchset?
> As far as i can see, they were sent to the mips list, not the netdev
> list.

It has been quite a while since v4, but it and earlier revisions were
submitted to the netdev list too:

v4: https://www.spinics.net/lists/netdev/msg438550.html
v3: https://www.spinics.net/lists/netdev/msg438313.html
v2: https://marc.info/?l=linux-netdev&m=145450117711515&w=2

v1 was part of a larger series, but netdev was also copied on the
relevant patches starting here & the patches following it:

v1: https://marc.info/?l=linux-netdev&m=144890083511222&w=2

Thanks,
Paul


Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-18 Thread Oleksandr Natalenko
Hi.

On Sunday, 18 February 2018 22:04:27 CET Eric Dumazet wrote:
> I was able to take a look today, and I believe this is the time to
> switch TCP to GSO being always on.
> 
> As a bonus, we get speed boost for cubic as well.
> 
> Todays high BDP and recent TCP improvements (rtx queue as rb-tree, sack
> coalescing, TCP pacing...) all were developed/tested/maintained with
> GSO/TSO being the norm.
> 
> Can you please test the following patch ?

Yes, results below:

BBR+fq:
sg on:  6.02 Gbits/sec
sg off: 1.33 Gbits/sec

BBR+pfifo_fast:
sg on:  4.13 Gbits/sec
sg off: 1.34 Gbits/sec

BBR+fq_codel:
sg on:  4.16 Gbits/sec
sg off: 1.35 Gbits/sec

Reno+fq:
sg on:  6.44 Gbits/sec
sg off: 1.39 Gbits/sec

Reno+pfifo_fast:
sg on:  6.36 Gbits/sec
sg off: 1.39 Gbits/sec

Reno+fq_codel:
sg on:  6.41 Gbits/sec
sg off: 1.38 Gbits/sec

While BBR still suffers when fq is not used, disabling sg doesn't bring 
drastic throughput drop anymore. So, looks good to me, eh?

> Note that some cleanups can be done later in TCP stack, removing lots
> of legacy stuff.
> 
> Also TCP internal-pacing could benefit from something similar to this
> fq patch eventually, although there is no hurry.
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=fefa569a9d4bc4b7758c0fddd75bb0382c95da77

Feel free to ping me if you have something else to test then ;).

> Of course, you have to consider why SG was disabled on your device,
> this looks very pessimistic.

Dunno why that happens, but I've managed to just enable it automatically on 
interface up.

Thanks.

Oleksandr









Re: [PATCH net] tcp: restrict F-RTO to work-around broken middle-boxes

2018-02-18 Thread Teodor Milkov

Hello,

I've had numerous reports from Windows users that, after a kernel upgrade
from 4.9 to 4.14, they experienced major slowdowns and transfer stalls.

After some digging, I found that the slowness starts with this commit:

 tcp: extend F-RTO to catch more spurious timeouts (89fe18e44)
 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=89fe18e44f7ee5ab1c90d0dff5835acee7751427

Which is partially reverted later with this one:

 tcp: restrict F-RTO to work-around broken middle-boxes (cc663f4d4)
 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cc663f4d4c97b7297fb45135ab23cfd508b35a77

But, still, we had stalls until I fully reverted 89fe18e44.


 
---


The recent extension of F-RTO 89fe18e44 ("tcp: extend F-RTO
to catch more spurious timeouts") interacts badly with certain
broken middle-boxes.  These broken boxes modify and falsely raise
the receive window on the ACKs. During a timeout induced recovery,
F-RTO would send new data packets to probe if the timeout is false
or not. Since the receive window is falsely raised, the receiver
would silently drop these F-RTO packets. The recovery would take N
(exponentially backoff) timeouts to repair N packet losses.  A TCP
performance killer.

Due to this unfortunate situation, this patch removes this extension
to revert F-RTO back to the RFC specification.

Fixes: 89fe18e44f7e ("tcp: extend F-RTO to catch more spurious timeouts")
Signed-off-by: Yuchung Cheng 
Signed-off-by: Neal Cardwell 
Signed-off-by: Soheil Hassas Yeganeh 
Signed-off-by: Eric Dumazet 
---
 net/ipv4/tcp_input.c | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2c1f59386a7b..659d1baefb2b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -1935,6 +1935,7 @@ void tcp_enter_loss(struct sock *sk)
struct tcp_sock *tp = tcp_sk(sk);
struct net *net = sock_net(sk);
struct sk_buff *skb;
+   bool new_recovery = icsk->icsk_ca_state < TCP_CA_Recovery;
bool is_reneg;  /* is receiver reneging on SACKs? */
bool mark_lost;
 
@@ -1994,15 +1995,18 @@ void tcp_enter_loss(struct sock *sk)

tp->high_seq = tp->snd_nxt;
tcp_ecn_queue_cwr(tp);
 
-	/* F-RTO RFC5682 sec 3.1 step 1 mandates to disable F-RTO

-* if a previous recovery is underway, otherwise it may incorrectly
-* call a timeout spurious if some previously retransmitted packets
-* are s/acked (sec 3.2). We do not apply that retriction since
-* retransmitted skbs are permanently tagged with TCPCB_EVER_RETRANS
-* so FLAG_ORIG_SACK_ACKED is always correct. But we do disable F-RTO
-* on PTMU discovery to avoid sending new data.
+   /* F-RTO RFC5682 sec 3.1 step 1: retransmit SND.UNA if no previous
+* loss recovery is underway except recurring timeout(s) on
+* the same SND.UNA (sec 3.2). Disable F-RTO on path MTU probing
+*
+* In theory F-RTO can be used repeatedly during loss recovery.
+* In practice this interacts badly with broken middle-boxes that
+* falsely raise the receive window, which results in repeated
+* timeouts and stop-and-go behavior.
 */
-   tp->frto = sysctl_tcp_frto && !inet_csk(sk)->icsk_mtup.probe_size;
+   tp->frto = sysctl_tcp_frto &&
+  (new_recovery || icsk->icsk_retransmits) &&
+  !inet_csk(sk)->icsk_mtup.probe_size;
 }
 
 /* If ACK arrived pointing to a remembered SACK, it means that our

--
2.12.2.715.g7642488e1d-goog
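
The condition the patch assigns to tp->frto can be read as a plain boolean
predicate. The sketch below is a hedged user-space model, not the kernel
implementation: frto_enabled() and its parameter names are made up for
illustration, with new_recovery standing in for
"icsk->icsk_ca_state < TCP_CA_Recovery", retransmits for
icsk->icsk_retransmits and mtu_probe_size for icsk_mtup.probe_size.

```c
#include <stdbool.h>

/* Hypothetical model of the tp->frto assignment in the patch above.
 * F-RTO is used only when the sysctl allows it, when this is either a
 * fresh recovery or a recurring timeout, and when no path MTU probe is
 * in flight.
 */
static bool frto_enabled(bool sysctl_frto, bool new_recovery,
			 unsigned int retransmits,
			 unsigned int mtu_probe_size)
{
	return sysctl_frto &&
	       (new_recovery || retransmits != 0) &&
	       mtu_probe_size == 0;
}
```

In this model a timeout that hits while an earlier recovery is still
underway, with no retransmits recorded yet, leaves F-RTO off, which is the
case the patch restricts.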



Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-18 Thread Eric Dumazet
On Sun, 2018-02-18 at 13:04 -0800, Eric Dumazet wrote:
> 
> Can you please test the following patch ?
> 
> Note that some cleanups can be done later in TCP stack, removing lots
> of legacy stuff.
> 
> Also TCP internal-pacing could benefit from something similar to this
> fq patch eventually, although there is no hurry.
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=fefa569a9d4bc4b7758c0fddd75bb0382c95da77
>   
> 
> Of course, you have to consider why SG was disabled on your device,
> this looks very pessimistic.
> 
> Thanks !
> 
>  include/net/sock.h  |1 +
>  net/core/sock.c |2 +-
>  net/ipv4/tcp_ipv4.c |1 +
>  net/ipv6/tcp_ipv6.c |1 +
>  4 files changed, 4 insertions(+), 1 deletion(-)

Also note that the patch only deals with active connections.

My official patch will also take care of passive ones of course.



Re: TCP and BBR: reproducibly low cwnd and bandwidth

2018-02-18 Thread Eric Dumazet
On Sat, 2018-02-17 at 10:52 -0800, Eric Dumazet wrote:
> 
> This must be some race condition in the code I added in TCP for
> self-pacing, when a short timeout is programmed.
> 
> Disabling SG means TCP cooks 1-MSS packets.
> 
> I will take a look, probably after the (long) week-end : Tuesday.

I was able to take a look today, and I believe this is the time to
switch TCP to GSO being always on.

As a bonus, we get speed boost for cubic as well.

Today's high BDP and recent TCP improvements (rtx queue as rb-tree, sack
coalescing, TCP pacing...) all were developed/tested/maintained with
GSO/TSO being the norm.

Can you please test the following patch ?

Note that some cleanups can be done later in TCP stack, removing lots
of legacy stuff.

Also TCP internal-pacing could benefit from something similar to this
fq patch eventually, although there is no hurry.
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=fefa569a9d4bc4b7758c0fddd75bb0382c95da77
  

Of course, you have to consider why SG was disabled on your device,
this looks very pessimistic.

Thanks !

 include/net/sock.h  |1 +
 net/core/sock.c |2 +-
 net/ipv4/tcp_ipv4.c |1 +
 net/ipv6/tcp_ipv6.c |1 +
 4 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 169c92afcafa3d548f8238e91606b87c187559f4..df4ac691870ff9f779f1782ded58140eb4d3a961 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -417,6 +417,7 @@ struct sock {
struct page_fragsk_frag;
netdev_features_t   sk_route_caps;
netdev_features_t   sk_route_nocaps;
+   netdev_features_t   sk_route_forced_caps;
int sk_gso_type;
unsigned intsk_gso_max_size;
gfp_t   sk_allocation;
diff --git a/net/core/sock.c b/net/core/sock.c
index c501499a04fe973e80e18655b306d762d348ff44..b084acb3b3b96791663b731788a392041148416c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1773,7 +1773,7 @@ void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
u32 max_segs = 1;
 
sk_dst_set(sk, dst);
-   sk->sk_route_caps = dst->dev->features;
+   sk->sk_route_caps = dst->dev->features | sk->sk_route_forced_caps;
if (sk->sk_route_caps & NETIF_F_GSO)
sk->sk_route_caps |= NETIF_F_GSO_SOFTWARE;
sk->sk_route_caps &= ~sk->sk_route_nocaps;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index f8ad397e285e9b8db0b04f8abc30a42f22294ef9..eaf1e30fc5af879442f5f33ed4bd69f89dff8cfb 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -233,6 +233,7 @@ int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
}
/* OK, now commit destination to socket.  */
sk->sk_gso_type = SKB_GSO_TCPV4;
+   sk->sk_route_forced_caps = NETIF_F_GSO;
sk_setup_caps(sk, &rt->dst);
rt = NULL;
 
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 412139f4eccd96923daaea064cd9fb8be13f5916..4a461e8e2d654aa341d525a0df609a294c2040df 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -269,6 +269,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr,
inet->inet_rcv_saddr = LOOPBACK4_IPV6;
 
sk->sk_gso_type = SKB_GSO_TCPV6;
+   sk->sk_route_forced_caps = NETIF_F_GSO;
ip6_dst_store(sk, dst, NULL, NULL);
 
icsk->icsk_ext_hdr_len = 0;
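
To make the effect of sk_route_forced_caps concrete, here is a small
user-space sketch of the feature-mask computation in sk_setup_caps() as
changed by the patch above. It is an illustration under stated assumptions:
route_caps() is a hypothetical helper, and the F_* bit values are invented
(the real NETIF_F_* constants differ, and NETIF_F_GSO_SOFTWARE is a mask of
several bits rather than the single stand-in bit used here).

```c
#include <stdint.h>

/* Illustrative bit values only; not the kernel's NETIF_F_* constants. */
#define F_SG            (1u << 0)
#define F_GSO           (1u << 1)
#define F_GSO_SOFTWARE  (1u << 2)  /* single-bit stand-in for the real mask */

/* Hypothetical model of the mask computed in sk_setup_caps():
 * device features are OR-ed with the socket's forced caps, GSO pulls
 * in the software GSO features, and "nocaps" bits are masked out last.
 */
static uint32_t route_caps(uint32_t dev_features, uint32_t forced_caps,
			   uint32_t nocaps)
{
	uint32_t caps = dev_features | forced_caps;

	if (caps & F_GSO)
		caps |= F_GSO_SOFTWARE;
	caps &= ~nocaps;
	return caps;
}
```

In this model a device that advertises nothing still ends up with GSO once
the socket forces F_GSO, which matches the intent of "GSO being always on"
for TCP.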



[PATCH net-next] dpaa_eth: fix pause capability advertisement logic

2018-02-18 Thread Jake Moroni
The ADVERTISED_Asym_Pause bit was being improperly set when both
rx and tx pause were enabled. When rx and tx are both enabled, only
the ADVERTISED_Pause bit is supposed to be set.

Signed-off-by: Jake Moroni 
---
 drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
index faea674..85306d1 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
@@ -211,7 +211,7 @@ static int dpaa_set_pauseparam(struct net_device *net_dev,
if (epause->rx_pause)
newadv = ADVERTISED_Pause | ADVERTISED_Asym_Pause;
if (epause->tx_pause)
-   newadv |= ADVERTISED_Asym_Pause;
+   newadv ^= ADVERTISED_Asym_Pause;
 
oldadv = phydev->advertising &
(ADVERTISED_Pause | ADVERTISED_Asym_Pause);
-- 
2.7.4



[PATCH iproute2-next] tc: fix an off-by-one error while printing tc actions

2018-02-18 Thread Adam Vyskovsky
The tc_print_action() function did not print all tc actions
when e.g. TCA_ACT_MAX_PRIO actions were defined for a single
tc filter.

Signed-off-by: Adam Vyskovsky 
---
 tc/m_action.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tc/m_action.c b/tc/m_action.c
index 744bde41..6c3049c7 100644
--- a/tc/m_action.c
+++ b/tc/m_action.c
@@ -365,7 +365,7 @@ tc_print_action(FILE *f, const struct rtattr *arg, unsigned short tot_acts)
return tc_print_action_flush(f, tb[0]);
 
open_json_array(PRINT_JSON, "actions");
-   for (i = 0; i < tot_acts; i++) {
+   for (i = 0; i <= tot_acts; i++) {
if (tb[i]) {
open_json_object(NULL);
print_uint(PRINT_ANY, "order",
-- 
2.16.1
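
The off-by-one can be reproduced with a small user-space model. This is a
sketch only: count_actions() is a hypothetical helper, MAX_PRIO mirrors
TCA_ACT_MAX_PRIO with an assumed value of 32, and the model assumes the
parsed actions occupy slots 1..tot_acts of the table, with slot 0 not
holding an action.

```c
#define MAX_PRIO 32  /* mirrors TCA_ACT_MAX_PRIO; value assumed here */

/* Count the non-NULL slots a print loop would visit.
 * inclusive == 0 models the old bound (i < tot_acts),
 * inclusive != 0 models the patched bound (i <= tot_acts).
 */
static int count_actions(const void *const tb[], int tot_acts, int inclusive)
{
	int i, n = 0;

	for (i = 0; inclusive ? i <= tot_acts : i < tot_acts; i++) {
		if (tb[i])
			n++;
	}
	return n;
}
```

With a table of MAX_PRIO + 1 slots and actions in slots 1..MAX_PRIO, the old
bound visits only MAX_PRIO - 1 actions and the last one is never printed;
the inclusive bound visits all of them.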



Re: [PATCH] i40evf: remove redundant array comparisons to 0 checks

2018-02-18 Thread Colin Ian King
On 18/02/18 16:31, Joe Perches wrote:
> On Sun, 2018-02-18 at 16:58 +0200, Andy Shevchenko wrote:
>> On Fri, Feb 16, 2018 at 6:53 PM, Colin Ian King
>>  wrote:
>>> On 16/02/18 16:51, Andy Shevchenko wrote:
>>>> On Thu, Feb 15, 2018 at 9:42 PM, Colin King
>>>> wrote:
>>>>> +   filter->f.mask.tcp_spec.dst_ip[i] |=
>>>>>
>>>>> cpu_to_be32(0x);
>>>>
>>>> Can it be one line then?
>>>
>>> I re-adjusted the text because checkpatch was complaining.
>>>>
>>>>> +   filter->f.mask.tcp_spec.src_ip[i] |=
>>>>>
>>>>> cpu_to_be32(0x);
>>
>> For the rest OK, but for the above two how much over 80 it went if
>> would be one line?
>> If it 2-3 characters, consider to make it one line. It would increase
>> readability.
> 
> Another possibility would be to use temporaries for
>   filter->f.mask.tcp_spec
> and
>   filter->f.data.tcp_spec
> as both are used ~10 times in the function

That's a good idea. I'll fix this up tomorrow when I get back to work



Re: [PATCH v5 00/14] net: pch_gbe: Fixes & MIPS support

2018-02-18 Thread Andrew Lunn
On Sun, Feb 18, 2018 at 09:03:10AM -0800, Paul Burton wrote:
> Hi David,
> 
> On Sun, Feb 18, 2018 at 10:31:12AM -0500, David Miller wrote:
> > Nobody is going to see and apply these patches if you don't CC: the
> > Linux networking development list, netdev@vger.kernel.org
> 
> You're replying to mail that was "To: netdev@vger.kernel.org" and I see
> the whole series in the archives[1] so it definitely reached the list.
> 
> I'm not sure I see the problem?

Hi Paul

I'm guessing that David is wondering about versions 1-4 of this patchset?
As far as I can see, they were sent to the mips list, not the netdev
list.

Andrew


Re: [PATCH v5 00/14] net: pch_gbe: Fixes & MIPS support

2018-02-18 Thread Paul Burton
Hi David,

On Sun, Feb 18, 2018 at 10:31:12AM -0500, David Miller wrote:
> Nobody is going to see and apply these patches if you don't CC: the
> Linux networking development list, netdev@vger.kernel.org

You're replying to mail that was "To: netdev@vger.kernel.org" and I see
the whole series in the archives[1] so it definitely reached the list.

I'm not sure I see the problem?

Thanks,
Paul

[1] https://www.spinics.net/lists/netdev/msg484102.html


Re: [PATCH] i40evf: remove redundant array comparisons to 0 checks

2018-02-18 Thread Joe Perches
On Sun, 2018-02-18 at 16:58 +0200, Andy Shevchenko wrote:
> On Fri, Feb 16, 2018 at 6:53 PM, Colin Ian King
>  wrote:
> > On 16/02/18 16:51, Andy Shevchenko wrote:
> > > On Thu, Feb 15, 2018 at 9:42 PM, Colin King  
> > > wrote:
> > > > +   filter->f.mask.tcp_spec.dst_ip[i] |=
> > > > 
> > > > cpu_to_be32(0x);
> > > 
> > > Can it be one line then?
> > 
> > I re-adjusted the text because checkpatch was complaining.
> > > 
> > > > +   filter->f.mask.tcp_spec.src_ip[i] |=
> > > > 
> > > > cpu_to_be32(0x);
> 
> For the rest OK, but for the above two how much over 80 it went if
> would be one line?
> If it 2-3 characters, consider to make it one line. It would increase
> readability.

Another possibility would be to use temporaries for
filter->f.mask.tcp_spec
and
filter->f.data.tcp_spec
as both are used ~10 times in the function




Re: [PATCH v5 02/14] net: pch_gbe: Pull PHY GPIO handling out of Minnow code

2018-02-18 Thread Andrew Lunn
> How would you feel if I were to adjust the binding to match the standard
> PHY binding, but internally leave the driver's PHY handling as-is for
> now?

Hi Paul

That is a reasonable compromise.

Thanks

 Andrew


[PATCH net-next] ibmvnic: Keep track of supplementary TX descriptors

2018-02-18 Thread Thomas Falcon
Supplementary TX descriptors were not being accounted for, which
was resulting in an overflow of the hardware device's transmit
queue. Keep track of those descriptors now when determining
how many entries remain on the TX queue.

Signed-off-by: Thomas Falcon 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 8 ++--
 drivers/net/ethernet/ibm/ibmvnic.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index a3865ff5a48e..7f5488695fd5 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1469,6 +1469,7 @@ static int ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
if ((*hdrs >> 7) & 1) {
build_hdr_descs_arr(tx_buff, &num_entries, *hdrs);
tx_crq.v1.n_crq_elem = num_entries;
+   tx_buff->num_entries = num_entries;
tx_buff->indir_arr[0] = tx_crq;
tx_buff->indir_dma = dma_map_single(dev, tx_buff->indir_arr,
sizeof(tx_buff->indir_arr),
@@ -1517,7 +1518,7 @@ static int ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
goto out;
}
 
-   if (atomic_inc_return(&tx_scrq->used)
+   if (atomic_add_return(num_entries, &tx_scrq->used)
>= adapter->req_tx_entries_per_subcrq) {
netdev_info(netdev, "Stopping queue %d\n", queue_num);
netif_stop_subqueue(netdev, queue_num);
@@ -2470,6 +2471,7 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
 restart_loop:
while (pending_scrq(adapter, scrq)) {
unsigned int pool = scrq->pool_index;
+   int num_entries = 0;
 
next = ibmvnic_next_scrq(adapter, scrq);
for (i = 0; i < next->tx_comp.num_comps; i++) {
@@ -2500,6 +2502,8 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
txbuff->skb = NULL;
}
 
+   num_entries += txbuff->num_entries;
+
adapter->tx_pool[pool].free_map[adapter->tx_pool[pool].
 producer_index] = index;
adapter->tx_pool[pool].producer_index =
@@ -2509,7 +2513,7 @@ static int ibmvnic_complete_tx(struct ibmvnic_adapter *adapter,
/* remove tx_comp scrq*/
next->tx_comp.first = 0;
 
-   if (atomic_sub_return(next->tx_comp.num_comps, &scrq->used) <=
+   if (atomic_sub_return(num_entries, &scrq->used) <=
(adapter->req_tx_entries_per_subcrq / 2) &&
__netif_subqueue_stopped(adapter->netdev,
 scrq->pool_index)) {
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index fe21a6e2ddae..2f51458ccdc3 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -909,6 +909,7 @@ struct ibmvnic_tx_buff {
union sub_crq indir_arr[6];
u8 hdr_data[140];
dma_addr_t indir_dma;
+   int num_entries;
 };
 
 struct ibmvnic_tx_pool {
-- 
2.15.0
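
The accounting idea in the patch above can be sketched in user space: charge
the queue for every descriptor a frame consumed, and release the same number
on completion. This is a simplified model under stated assumptions, not the
driver: struct txq_model, txq_charge() and txq_release() are hypothetical
names, plain integers replace the kernel's atomic counters, and the check
that the subqueue is actually stopped is left out.

```c
#include <stdbool.h>

/* Hypothetical model of one TX sub-queue's bookkeeping. */
struct txq_model {
	int used;   /* descriptors currently outstanding */
	int limit;  /* stands in for req_tx_entries_per_subcrq */
};

/* Transmit path: account for all descriptors of the frame, including
 * supplementary ones. Returns true when the queue should be stopped.
 */
static bool txq_charge(struct txq_model *q, int num_entries)
{
	q->used += num_entries;
	return q->used >= q->limit;
}

/* Completion path: give back the same number that was charged.
 * Returns true when usage has dropped to half the limit or less,
 * i.e. when a stopped queue may be woken again.
 */
static bool txq_release(struct txq_model *q, int num_entries)
{
	q->used -= num_entries;
	return q->used <= q->limit / 2;
}
```

Charging one unit per frame while the hardware consumes several descriptors
would let the model's "used" count lag behind the real queue occupancy,
which is the overflow the commit message describes.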



[PATCH net] ibmvnic: Clean RX pools only during a hard reset

2018-02-18 Thread Thomas Falcon
Sorry, this fixes a bug in commit d0869c0071e4. The cause of the
bug is that "stale" RX buffers containing packet data are returned
to the driver after device close and open. While most buffers will be
returned with an error and handled by the polling routine, some buffers
will be returned as containing valid data. Unfortunately, the socket
buffers allocated were already freed when the device was closed, so
attempts to process them result in a panic.

RX pools still need to be cleaned in some cases, such as during
a fatal reset. In all other cases, the socket buffers will either
be freed in the polling routine or processed by the kernel.

Fixes: d0869c0071e4 ("ibmvnic: Clean RX pool buffers during device close")
Signed-off-by: Thomas Falcon 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 996f475..6710313 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -1179,7 +1179,9 @@ static int __ibmvnic_close(struct net_device *netdev)
}
}
}
-   clean_rx_pools(adapter);
+   if (unlikely(adapter->resetting &&
+adapter->reset_reason != VNIC_RESET_NON_FATAL))
+   clean_rx_pools(adapter);
clean_tx_pools(adapter);
adapter->state = VNIC_CLOSED;
return rc;
-- 
2.7.5



Re: [PATCH v5 12/14] net: pch_gbe: Fix TX RX descriptor accesses for big endian systems

2018-02-18 Thread kbuild test robot
Hi Hassan,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]
[also build test WARNING on v4.16-rc1 next-20180216]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Paul-Burton/net-pch_gbe-Fixes-MIPS-support/20180218-213023
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:33: sparse: 
incorrect type in argument 2 (different base types) @@ expected unsigned short 
uid_hi @@ got short uid_hi @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:33: expected 
unsigned short uid_hi
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:33: got restricted 
__be16 
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:45: sparse: 
incorrect type in argument 3 (different base types) @@ expected unsigned int 
uid_lo @@ got ed int uid_lo @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:45: expected 
unsigned int uid_lo
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:45: got restricted 
__be32 
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:56: sparse: 
incorrect type in argument 4 (different base types) @@ expected unsigned short 
seqid @@ got short seqid @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:56: expected 
unsigned short seqid
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:56: got restricted 
__be16 
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: sparse: 
incorrect type in argument 1 (different address spaces) @@ expected void const 
volatile @@ got @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: expected void 
const volatile
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: got void
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: sparse: 
incorrect type in argument 1 (different address spaces) @@ expected void const 
volatile @@ got @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: expected void 
const volatile
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: got void
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:354:33: sparse: 
incorrect type in argument 1 (different address spaces) @@ expected void @@ got 
unsigned int > drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:1587:17: sparse: cast 
>> from restricted __le16
>> drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:1587:17: sparse: 
>> restricted __le16 degrades to integer

vim +1587 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c

  1527  
  1528  /**
  1529   * pch_gbe_clean_tx - Reclaim resources after transmit completes
  1530   * @adapter:   Board private structure
  1531   * @tx_ring:   Tx descriptor ring
  1532   * Returns:
  1533   *  true:  Cleaned the descriptor
  1534   *  false: Not cleaned the descriptor
  1535   */
  1536  static bool
  1537  pch_gbe_clean_tx(struct pch_gbe_adapter *adapter,
  1538   struct pch_gbe_tx_ring *tx_ring)
  1539  {
  1540  struct pch_gbe_tx_desc *tx_desc;
  1541  struct pch_gbe_buffer *buffer_info;
  1542  struct sk_buff *skb;
  1543  unsigned int i;
  1544  unsigned int cleaned_count = 0;
  1545  bool cleaned = false;
  1546  int unused, thresh;
  1547  
  1548  netdev_dbg(adapter->netdev, "next_to_clean : %d\n",
  1549 tx_ring->next_to_clean);
  1550  
  1551  i = tx_ring->next_to_clean;
  1552  tx_desc = PCH_GBE_TX_DESC(*tx_ring, i);
  1553  netdev_dbg(adapter->netdev, "gbec_status:0x%04x  
dma_status:0x%04x\n",
  1554 le16_to_cpu(tx_desc->gbec_status), 
tx_desc->dma_status);
  1555  
  1556  unused = PCH_GBE_DESC_UNUSED(tx_ring);
  1557  thresh = tx_ring->count - PCH_GBE_TX_WEIGHT;
  1558  if ((le16_to_cpu(tx_desc->gbec_status) == DSC_INIT16) &&
  1559  (unused < thresh))
  1560  {  /* current marked clean, tx queue filling up, do extra clean 
*/
  1561  int j, k;
  1562  if (unused < 8) {  /* tx queue nearly full */
  1563  netdev_dbg(adapter->netdev,
  1564 "clean_tx: transmit queue warning 
(%x,%x) unused=%d\n",
  1565 tx_ring->next_to_clean, 
tx_ring->next_to_use,
  1566 unused);
  1567  }
  1568  
  1569  /* current marked clean, scan for more that need 
cleaning. */
  1570  k = i;
  1571  for (j = 0; j < PCH_GBE_TX_WEIGHT; j+

Re: [PATCH v5 02/14] net: pch_gbe: Pull PHY GPIO handling out of Minnow code

2018-02-18 Thread Paul Burton
Hi Andrew,

On Sun, Feb 18, 2018 at 12:34:42AM +0100, Andrew Lunn wrote:
> > Even if that is true, rewriting the driver's PHY handling would be a
> > very separate change to the changes this series make which allow this
> > driver to work on a platform besides the Minnowboard. The *only* thing
> > this series does relating to the PHY is allow the reset GPIO to be
> > handled properly - rewriting the existing PHY handling is beyond it's
> > scope.
> 
> Well, you are adding a device tree binding, which needs to be
> supported forever. This is going to make things messy in the future
> when you do such a cleanup that you follow the PHY binding, in that
> you have to handle both what you add here, and the official PHY
> binding.

Thank you - it's useful to know what your concern actually is.

> I would prefer that for the moment, you drop the PHY binding patches
> in this series. That is what i object to the most. Adding an MDIO
> driver and using the standard PHY driver for this PHY is all
> internal. You can change that anytime. But adding a binding means an
> ABI.

The problem is that the device in question doesn't actually work unless
we reset the PHY, so just removing the PHY reset GPIO handling would
break things.

How would you feel if I were to adjust the binding to match the standard
PHY binding, but internally leave the driver's PHY handling as-is for
now? That would:

  1) Allow for the pch_gbe driver to move towards more standard PHY
 handling in the future without DT changes.

  2) Be fairly straightforward to implement in this patchset - the code
 reading the DT would just follow the phandle to the PHY node to
 find the reset GPIO - thereby not holding up the rest of the series.

  3) Still function on our hardware.

Thanks,
Paul


Re: [PATCH v5 00/14] net: pch_gbe: Fixes & MIPS support

2018-02-18 Thread David Miller

Nobody is going to see and apply these patches if you don't CC: the
Linux networking development list, netdev@vger.kernel.org

Thank you.


Re: [PATCH v5 06/14] net: pch_gbe: Allow longer for resets

2018-02-18 Thread kbuild test robot
Hi Paul,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]
[also build test WARNING on v4.16-rc1 next-20180216]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Paul-Burton/net-pch_gbe-Fixes-MIPS-support/20180218-213023
reproduce:
# apt-get install sparse
make ARCH=x86_64 allmodconfig
make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:33: sparse: 
incorrect type in argument 2 (different base types) @@ expected unsigned short 
uid_hi @@ got short uid_hi @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:33: expected 
unsigned short uid_hi
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:33: got restricted 
__be16 
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:45: sparse: 
incorrect type in argument 3 (different base types) @@ expected unsigned int 
uid_lo @@ got ed int uid_lo @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:45: expected 
unsigned int uid_lo
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:45: got restricted 
__be32 
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:56: sparse: 
incorrect type in argument 4 (different base types) @@ expected unsigned short 
seqid @@ got short seqid @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:56: expected 
unsigned short seqid
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:177:56: got restricted 
__be16 
>> drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: sparse: 
>> incorrect type in argument 1 (different address spaces) @@ expected void 
>> const volatile @@ got @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: expected void 
const volatile
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: got void
>> drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: sparse: 
>> incorrect type in argument 1 (different address spaces) @@ expected void 
>> const volatile @@ got @@
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: expected void 
const volatile
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:325:15: got void
   drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:354:33: sparse: 
incorrect type in argument 1 (different address spaces) @@ expected void @@ got 
unsigned int hwts_rx_en)
   161  return;
   162  
   163  /* Get ieee1588's dev information */
   164  pdev = adapter->ptp_pdev;
   165  
   166  val = pch_ch_event_read(pdev);
   167  
   168  if (!(val & RX_SNAPSHOT_LOCKED))
   169  return;
   170  
   171  lo = pch_src_uuid_lo_read(pdev);
   172  hi = pch_src_uuid_hi_read(pdev);
   173  
   174  uid = hi & 0xffff;
   175  seq = (hi >> 16) & 0xffff;
   176  
 > 177  if (!pch_ptp_match(skb, htons(uid), htonl(lo), htons(seq)))
   178  goto out;
   179  
   180  ns = pch_rx_snap_read(pdev);
   181  
   182  shhwtstamps = skb_hwtstamps(skb);
   183  memset(shhwtstamps, 0, sizeof(*shhwtstamps));
   184  shhwtstamps->hwtstamp = ns_to_ktime(ns);
   185  out:
   186  pch_ch_event_write(pdev, RX_SNAPSHOT_LOCKED);
   187  }
   188  
   189  static void
   190  pch_tx_timestamp(struct pch_gbe_adapter *adapter, struct sk_buff *skb)
   191  {
   192  struct skb_shared_hwtstamps shhwtstamps;
   193  struct pci_dev *pdev;
   194  struct skb_shared_info *shtx;
   195  u64 ns;
   196  u32 cnt, val;
   197  
   198  shtx = skb_shinfo(skb);
   199  if (likely(!(shtx->tx_flags & SKBTX_HW_TSTAMP && adapter->hwts_tx_en)))
   200  return;
   201  
   202  shtx->tx_flags |= SKBTX_IN_PROGRESS;
   203  
   204  /* Get ieee1588's dev information */
   205  pdev = adapter->ptp_pdev;
   206  
   207  /*
   208   * This really stinks, but we have to poll for the Tx time stamp.
   209   */
   210  for (cnt = 0; cnt < 100; cnt++) {
   211  val = pch_ch_event_read(pdev);
   212  if (val & TX_SNAPSHOT_LOCKED)
   213  break;
   214  udelay(1);
   215  }
   216  if (!(val & TX_SNAPSHOT_LOCKED)) {
   217  shtx->tx_flags &= ~SKBTX_IN_PROGRESS;
   218  return;
   219  }
   220  
   221  ns = pch_tx_snap_read(pdev);
   222  
   223  memset(&shhwtstamps, 0, sizeof(shhwtstamps));
   224  shhwtstamps.hwtstamp = ns_to_ktime(ns);
   225  skb_tstamp_tx(skb, &shhwtstamps);
   226  
   227  pch

[PATCH] sh_eth: simplify sh_eth_check_reset()

2018-02-18 Thread Sergei Shtylyov
The *while* loop in this function can be turned into a normal *for* loop.
And getting rid of the single return point saves us a few more LoCs...

Signed-off-by: Sergei Shtylyov 

---
The patch is against DaveM's 'net-next.git' repo.

 drivers/net/ethernet/renesas/sh_eth.c |   16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

Index: net-next/drivers/net/ethernet/renesas/sh_eth.c
===
--- net-next.orig/drivers/net/ethernet/renesas/sh_eth.c
+++ net-next/drivers/net/ethernet/renesas/sh_eth.c
@@ -962,20 +962,16 @@ static void sh_eth_set_default_cpu_data(
 
 static int sh_eth_check_reset(struct net_device *ndev)
 {
-   int ret = 0;
-   int cnt = 100;
+   int cnt;
 
-   while (cnt > 0) {
+   for (cnt = 100; cnt > 0; cnt--) {
if (!(sh_eth_read(ndev, EDMR) & EDMR_SRST_GETHER))
-   break;
+   return 0;
mdelay(1);
-   cnt--;
}
-   if (cnt <= 0) {
-   netdev_err(ndev, "Device reset failed\n");
-   ret = -ETIMEDOUT;
-   }
-   return ret;
+
+   netdev_err(ndev, "Device reset failed\n");
+   return -ETIMEDOUT;
 }
 
 static int sh_eth_reset(struct net_device *ndev)
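
For illustration only, the control flow that results from this patch can be
sketched as a self-contained userspace fragment. The register read is
replaced by a hypothetical fake_reset_pending() stub and the mdelay(1) call
is left out, so this is an assumption-laden sketch of the loop shape, not the
driver's code.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the EDMR soft-reset bit: the "hardware" reports
 * that the reset is still pending for the next polls_needed reads. */
static int polls_needed;

static int fake_reset_pending(void)
{
	if (polls_needed > 0) {
		polls_needed--;
		return 1;
	}
	return 0;
}

/* Same shape as the patched function: a bounded for loop that returns 0 as
 * soon as the reset bit clears, and falls through to the error path after
 * 100 unsuccessful polls. */
static int check_reset(void)
{
	int cnt;

	for (cnt = 100; cnt > 0; cnt--) {
		if (!fake_reset_pending())
			return 0;
		/* the driver waits 1 ms here; omitted in this sketch */
	}

	fprintf(stderr, "Device reset failed\n");
	return -ETIMEDOUT;
}
```

With polls_needed set to a small value, check_reset() returns 0; with a
value of 100 or more it returns -ETIMEDOUT, matching the single error path
at the bottom of the function.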


Re: [PATCH] i40evf: remove redundant array comparisons to 0 checks

2018-02-18 Thread Andy Shevchenko
On Fri, Feb 16, 2018 at 6:53 PM, Colin Ian King
 wrote:
> On 16/02/18 16:51, Andy Shevchenko wrote:
>> On Thu, Feb 15, 2018 at 9:42 PM, Colin King  wrote:

>>> +   filter->f.mask.tcp_spec.dst_ip[i] |=
>>> 
>>> cpu_to_be32(0x);
>>
>> Can it be one line then?
>
> I re-adjusted the text because checkpatch was complaining.

>>
>>> +   filter->f.mask.tcp_spec.src_ip[i] |=
>>> 
>>> cpu_to_be32(0x);

The rest is OK, but for the two above, how far over 80 characters would
each go if it were on one line?
If it is only 2-3 characters, consider making it one line. It would improve
readability.

-- 
With Best Regards,
Andy Shevchenko


Re: [RFC net PATCH] virtio_net: disable XDP_REDIRECT in receive_mergeable() case

2018-02-18 Thread Jesper Dangaard Brouer
On Fri, 16 Feb 2018 13:31:37 +0800 Jason Wang  wrote:

> > Besides the described bug:
> >
> > Update(1): There is also a OOM leak in the XDP_REDIRECT code, which
> > receive_small() is likely also affected by.

Found the issue behind this memory leak... page refcnt issues when
hitting xmit error paths.

> > Update(2): Also observed a guest crash when redirecting out an
> > another virtio_net device, when device is down.  

Also found the reason behind this. It is not related to a "down" device.
It is caused by a redirect into a virtio_net device that has not set up
sufficient XDP TX queues (which it assumes is one per CPU). I have a
preliminary fix patch.
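
To make the failure mode concrete, here is a rough userspace sketch of the
kind of check involved. It is not the preliminary fix itself. The struct and
function names are hypothetical, and the queue layout (XDP TX queues placed
after the regular ones, one per CPU) is an assumption taken from the
description above.

```c
/* Hypothetical, simplified view of a device's queue layout. */
struct dev_queues {
	unsigned int curr_queue_pairs; /* all active queue pairs */
	unsigned int xdp_queue_pairs;  /* pairs reserved for XDP TX */
};

/* Return the XDP TX queue index for the given CPU, or -1 when the target
 * device has not reserved one XDP TX queue per CPU. In that case the
 * redirected frame has to be dropped instead of indexing a queue that
 * does not exist. */
static int xdp_tx_queue(const struct dev_queues *q, unsigned int cpu,
			unsigned int nr_cpus)
{
	if (cpu >= nr_cpus)
		return -1;
	if (q->xdp_queue_pairs < nr_cpus)
		return -1;
	if (q->xdp_queue_pairs > q->curr_queue_pairs)
		return -1;

	return (int)(q->curr_queue_pairs - q->xdp_queue_pairs + cpu);
}
```

A device with 8 queue pairs, 4 of them reserved for XDP on a 4-CPU guest,
maps CPU 2 to queue 6. A device with no XDP queues reserved yields -1.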

> Will have a look at these issues. (Holiday in china now, so will do it 
> after).

No worries.  I can take care of this... I'll clean up my patches and test
them Monday, and should have patches ready Tuesday... as I want to make
sure they work in all the different error cases.

I'll be working on improvements for the XDP_REDIRECT code paths anyway
next week... in preparation for supporting the AF_XDP use case.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [PATCH V6 2/4] sctp: Add ip option support

2018-02-18 Thread Richard Haines
On Fri, 2018-02-16 at 23:28 -0500, Neil Horman wrote:
> On Fri, Feb 16, 2018 at 07:51:02PM -0200, Marcelo Ricardo Leitner
> wrote:
> > On Fri, Feb 16, 2018 at 03:14:35PM -0500, Neil Horman wrote:
> > > On Fri, Feb 16, 2018 at 10:56:07AM -0200, Marcelo Ricardo Leitner
> > > wrote:
> > > > On Thu, Feb 15, 2018 at 09:15:40AM -0500, Neil Horman wrote:
> > > > > On Tue, Feb 13, 2018 at 08:54:44PM +, Richard Haines
> > > > > wrote:
> > > > > > Add ip option support to allow LSM security modules to
> > > > > > utilise CIPSO/IPv4
> > > > > > and CALIPSO/IPv6 services.
> > > > > > 
> > > > > > Signed-off-by: Richard Haines  > > > > > com>
> > > > > > ---
> > > > > >  include/net/sctp/sctp.h|  4 +++-
> > > > > >  include/net/sctp/structs.h |  2 ++
> > > > > >  net/sctp/chunk.c   | 12 +++-
> > > > > >  net/sctp/ipv6.c| 42
> > > > > > +++---
> > > > > >  net/sctp/output.c  |  5 -
> > > > > >  net/sctp/protocol.c| 36
> > > > > > 
> > > > > >  net/sctp/socket.c  | 14 ++
> > > > > >  7 files changed, 97 insertions(+), 18 deletions(-)
> > > > > > 
> > > > > > diff --git a/include/net/sctp/sctp.h
> > > > > > b/include/net/sctp/sctp.h
> > > > > > index f7ae6b0..25c5c87 100644
> > > > > > --- a/include/net/sctp/sctp.h
> > > > > > +++ b/include/net/sctp/sctp.h
> > > > > > @@ -441,9 +441,11 @@ static inline int
> > > > > > sctp_list_single_entry(struct list_head *head)
> > > > > >  static inline int sctp_frag_point(const struct
> > > > > > sctp_association *asoc, int pmtu)
> > > > > >  {
> > > > > > struct sctp_sock *sp = sctp_sk(asoc->base.sk);
> > > > > > +   struct sctp_af *af = sp->pf->af;
> > > > > > int frag = pmtu;
> > > > > >  
> > > > > > -   frag -= sp->pf->af->net_header_len;
> > > > > > +   frag -= af->ip_options_len(asoc->base.sk);
> > > > > > +   frag -= af->net_header_len;
> > > > > > frag -= sizeof(struct sctphdr) +
> > > > > > sctp_datachk_len(&asoc->stream);
> > > > > >  
> > > > > > if (asoc->user_frag)
> > > > > > diff --git a/include/net/sctp/structs.h
> > > > > > b/include/net/sctp/structs.h
> > > > > > index 03e92dd..ead5fce 100644
> > > > > > --- a/include/net/sctp/structs.h
> > > > > > +++ b/include/net/sctp/structs.h
> > > > > > @@ -491,6 +491,7 @@ struct sctp_af {
> > > > > > void(*ecn_capable)(struct sock
> > > > > > *sk);
> > > > > > __u16   net_header_len;
> > > > > > int sockaddr_len;
> > > > > > +   int (*ip_options_len)(struct sock
> > > > > > *sk);
> > > > > > sa_family_t sa_family;
> > > > > > struct list_head list;
> > > > > >  };
> > > > > > @@ -515,6 +516,7 @@ struct sctp_pf {
> > > > > > int (*addr_to_user)(struct sctp_sock *sk, union
> > > > > > sctp_addr *addr);
> > > > > > void (*to_sk_saddr)(union sctp_addr *, struct sock
> > > > > > *sk);
> > > > > > void (*to_sk_daddr)(union sctp_addr *, struct sock
> > > > > > *sk);
> > > > > > +   void (*copy_ip_options)(struct sock *sk, struct
> > > > > > sock *newsk);
> > > > > > struct sctp_af *af;
> > > > > >  };
> > > > > >  
> > > > > > diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
> > > > > > index 991a530..d5c0ef7 100644
> > > > > > --- a/net/sctp/chunk.c
> > > > > > +++ b/net/sctp/chunk.c
> > > > > > @@ -154,7 +154,6 @@ static void sctp_datamsg_assign(struct
> > > > > > sctp_datamsg *msg, struct sctp_chunk *chu
> > > > > > chunk->msg = msg;
> > > > > >  }
> > > > > >  
> > > > > > -
> > > > > >  /* A data chunk can have a maximum payload of (2^16 -
> > > > > > 20).  Break
> > > > > >   * down any such message into smaller
> > > > > > chunks.  Opportunistically, fragment
> > > > > >   * the chunks down to the current MTU constraints.  We may
> > > > > > get refragmented
> > > > > > @@ -171,6 +170,8 @@ struct sctp_datamsg
> > > > > > *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > > > struct list_head *pos, *temp;
> > > > > > struct sctp_chunk *chunk;
> > > > > > struct sctp_datamsg *msg;
> > > > > > +   struct sctp_sock *sp;
> > > > > > +   struct sctp_af *af;
> > > > > > int err;
> > > > > >  
> > > > > > msg = sctp_datamsg_new(GFP_KERNEL);
> > > > > > @@ -189,9 +190,11 @@ struct sctp_datamsg
> > > > > > *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > > > /* This is the biggest possible DATA chunk that
> > > > > > can fit into
> > > > > >  * the packet
> > > > > >  */
> > > > > > -   max_data = asoc->pathmtu -
> > > > > > -  sctp_sk(asoc->base.sk)->pf->af-
> > > > > > >net_header_len -
> > > > > > -  sizeof(struct sctphdr) -
> > > > > > sctp_datachk_len(&asoc->stream);
> > > > > > +   sp = sctp_sk(asoc->base.sk);
> > > > > > +   af = sp->pf->af;
> > > > > > +   max_data = asoc->pathmtu - af->net_header_len -
> > > > > > +  sizeof(struct sctphdr) -
> > > > > > sctp_datachk_len(&asoc->stream) -

Re: [PATCH RFC net-next 0/7] net/ipv6: Add support for path selection using hash of 5-tuple

2018-02-18 Thread Or Gerlitz
On Thu, Feb 15, 2018 at 12:56 AM, David Ahern  wrote:
> On 2/14/18 3:45 PM, Or Gerlitz wrote:

>> how the various systems you are dealing with do with traffic that involves
>> ipv6 extension headers? what about environments with GRE? in ipv4 GRE
>> fabrics are just broken for ECMP, in ipv6 they can fly with flow label but
>> will crash again with L4 hash.

> If you like your ecmp hash algorithm, you can keep your ecmp hash algorithm.

> This gives users a choice; it is not a requirement to move from L3 only
> to L4. Further, this makes IPv6 on par with IPv4 with a choice between
> L3 and L4 and allows users to decide what works best for them.

Looking at the code for tunnels, e.g. in the vxlan xmit path (but I
believe this is the case for other UDP tunnels as well), a call is made
to generate the source UDP port as the hash of the overlay tuple,
regardless of whether we are on v4 or v6. Next, when it comes to ipv6, the
kernel computes the flow label, which effectively takes into account the
inner header.
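
As a rough illustration of the source-port part, here is a userspace sketch
of deriving an outer UDP source port from a hash of the inner flow. The
flow_tuple struct and the FNV-1a based flow_hash() are hypothetical
stand-ins for the kernel's flow hashing, and the port range used in the
example is only an assumed range. What the sketch shows is the scaling of a
32-bit hash into [min, max).

```c
#include <stdint.h>

/* Hypothetical inner (overlay) flow description. */
struct flow_tuple {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t proto;
};

/* Fold one 32-bit value into an FNV-1a hash, byte by byte. */
static uint32_t mix(uint32_t h, uint32_t v)
{
	int i;

	for (i = 0; i < 4; i++) {
		h ^= (v >> (8 * i)) & 0xff;
		h *= 16777619u;
	}
	return h;
}

/* Stand-in flow hash over the inner 5-tuple (not the kernel's algorithm). */
static uint32_t flow_hash(const struct flow_tuple *t)
{
	uint32_t h = 2166136261u;

	h = mix(h, t->saddr);
	h = mix(h, t->daddr);
	h = mix(h, ((uint32_t)t->sport << 16) | t->dport);
	h = mix(h, t->proto);
	return h;
}

/* Scale a 32-bit hash into the port range [min, max); requires max > min. */
static uint16_t src_port_from_hash(uint32_t hash, uint16_t min, uint16_t max)
{
	uint64_t span = (uint64_t)(max - min);

	return (uint16_t)(((hash * span) >> 32) + min);
}
```

Because the outer source port is a pure function of the inner tuple, packets
of one inner flow keep the same outer port, while different inner flows tend
to spread across the range. That spread is what lets an L4 hash in the
underlay still distinguish tunnelled flows.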


Re: ppp/pppoe, still panic 4.15.3 in ppp_push

2018-02-18 Thread Denys Fedoryshchenko

On 2018-02-16 20:48, Guillaume Nault wrote:

On Fri, Feb 16, 2018 at 01:13:18PM +0200, Denys Fedoryshchenko wrote:

On 2018-02-15 21:42, Guillaume Nault wrote:
> On Thu, Feb 15, 2018 at 09:34:42PM +0200, Denys Fedoryshchenko wrote:
> > On 2018-02-15 21:31, Guillaume Nault wrote:
> > > On Thu, Feb 15, 2018 at 06:01:16PM +0200, Denys Fedoryshchenko wrote:
> > > > On 2018-02-15 17:55, Guillaume Nault wrote:
> > > > > On Thu, Feb 15, 2018 at 12:19:52PM +0200, Denys Fedoryshchenko wrote:
> > > > > > Here we go:
> > > > > >
> > > > > >   [24558.921549]
> > > > > > ==
> > > > > >   [24558.922167] BUG: KASAN: use-after-free in
> > > > > > ppp_ioctl+0xa6a/0x1522
> > > > > > [ppp_generic]
> > > > > >   [24558.922776] Write of size 8 at addr 8803d35bf3f8 by 
task
> > > > > > accel-pppd/12622
> > > > > >   [24558.923113]
> > > > > >   [24558.923451] CPU: 0 PID: 12622 Comm: accel-pppd Tainted: G
> > > > > > W
> > > > > > 4.15.3-build-0134 #1
> > > > > >   [24558.924058] Hardware name: HP ProLiant DL320e Gen8 v2,
> > > > > > BIOS P80
> > > > > > 04/02/2015
> > > > > >   [24558.924406] Call Trace:
> > > > > >   [24558.924753]  dump_stack+0x46/0x59
> > > > > >   [24558.925103]  print_address_description+0x6b/0x23b
> > > > > >   [24558.925451]  ? ppp_ioctl+0xa6a/0x1522 [ppp_generic]
> > > > > >   [24558.925797]  kasan_report+0x21b/0x241
> > > > > >   [24558.926136]  ppp_ioctl+0xa6a/0x1522 [ppp_generic]
> > > > > >   [24558.926479]  ? ppp_nl_newlink+0x1da/0x1da [ppp_generic]
> > > > > >   [24558.926829]  ? sock_sendmsg+0x89/0x99
> > > > > >   [24558.927176]  ? __vfs_write+0xd9/0x4ad
> > > > > >   [24558.927523]  ? kernel_read+0xed/0xed
> > > > > >   [24558.927872]  ? SyS_getpeername+0x18c/0x18c
> > > > > >   [24558.928213]  ? bit_waitqueue+0x2a/0x2a
> > > > > >   [24558.928561]  ? wake_atomic_t_function+0x115/0x115
> > > > > >   [24558.928898]  vfs_ioctl+0x6e/0x81
> > > > > >   [24558.929228]  do_vfs_ioctl+0xa00/0xb10
> > > > > >   [24558.929571]  ? sigprocmask+0x1a6/0x1d0
> > > > > >   [24558.929907]  ? sigsuspend+0x13e/0x13e
> > > > > >   [24558.930239]  ? ioctl_preallocate+0x14e/0x14e
> > > > > >   [24558.930568]  ? SyS_rt_sigprocmask+0xf1/0x142
> > > > > >   [24558.930904]  ? sigprocmask+0x1d0/0x1d0
> > > > > >   [24558.931252]  SyS_ioctl+0x39/0x55
> > > > > >   [24558.931595]  ? do_vfs_ioctl+0xb10/0xb10
> > > > > >   [24558.931942]  do_syscall_64+0x1b1/0x31f
> > > > > >   [24558.932288]  entry_SYSCALL_64_after_hwframe+0x21/0x86
> > > > > >   [24558.932627] RIP: 0033:0x7f302849d8a7
> > > > > >   [24558.932965] RSP: 002b:7f3029a52af8 EFLAGS: 0206
> > > > > > ORIG_RAX:
> > > > > > 0010
> > > > > >   [24558.933578] RAX: ffda RBX: 7f3027d861e3 
RCX:
> > > > > > 7f302849d8a7
> > > > > >   [24558.933927] RDX: 7f3023f49468 RSI: 4004743a 
RDI:
> > > > > > 3a67
> > > > > >   [24558.934266] RBP: 7f3029a52b20 R08:  
R09:
> > > > > > 55c8308d8e40
> > > > > >   [24558.934607] R10: 0008 R11: 0206 
R12:
> > > > > > 7f3023f49358
> > > > > >   [24558.934947] R13: 7ffe86e5723f R14:  
R15:
> > > > > > 7f3029a53700
> > > > > >   [24558.935288]
> > > > > >   [24558.935626] Allocated by task 12622:
> > > > > >   [24558.935972]  ppp_register_net_channel+0x5f/0x5c6
> > > > > > [ppp_generic]
> > > > > >   [24558.936306]  pppoe_connect+0xab7/0xc71 [pppoe]
> > > > > >   [24558.936640]  SyS_connect+0x14b/0x1b7
> > > > > >   [24558.936975]  do_syscall_64+0x1b1/0x31f
> > > > > >   [24558.937319]  entry_SYSCALL_64_after_hwframe+0x21/0x86
> > > > > >   [24558.937655]
> > > > > >   [24558.937993] Freed by task 12622:
> > > > > >   [24558.938321]  kfree+0xb0/0x11d
> > > > > >   [24558.938658]  ppp_release+0x111/0x120 [ppp_generic]
> > > > > >   [24558.938994]  __fput+0x2ba/0x51a
> > > > > >   [24558.939332]  task_work_run+0x11c/0x13d
> > > > > >   [24558.939676]  exit_to_usermode_loop+0x7c/0xaf
> > > > > >   [24558.940022]  do_syscall_64+0x2ea/0x31f
> > > > > >   [24558.940368]  entry_SYSCALL_64_after_hwframe+0x21/0x86
> > > > > >   [24558.947099]
> > > > >
> > > > > Your first guess was right. It looks like we have an issue with
> > > > > reference counting on the channels. Can you send me your 
ppp_generic.o?
> > > > http://nuclearcat.com/ppp_generic.o
> > > > Compiled with gcc version 6.4.0 (Gentoo 6.4.0-r1 p1.3)
> > > >
> > > From what I can see, ppp_release() and ioctl(PPPIOCCONNECT) are called
> > > concurrently on the same ppp_file. Even if this ppp_file was pointed at
> > > by two different file descriptors, I can't see how this could defeat
> > > the reference counting mechanism. I'm going to think more about it.
> > >
> > > Can you test with CONFIG_REFCOUNT_FULL? (and keep
> > > d780cd44e3ce ("drivers, net, ppp: convert ppp_file.refcnt from
> > > atomic_t to refcount_t")).
> > Ok, i will try that tonight. On vanilla kernel or reversing
>