date:20170723

Re: [PATCH] lib: test_rhashtable: fix for large entry counts

2017-07-23 Thread Herbert Xu

On Fri, Jul 21, 2017 at 04:51:31PM +0200, Phil Sutter wrote:
> During concurrent access testing, threadfunc() concatenated thread ID
> and object index to create a unique key like so:
> 
> | tdata->objs[i].value = (tdata->id << 16) | i;
> 
> This breaks if a user passes an entries parameter of 64k or higher,
> since 'i' might use more than 16 bits then. Effectively, this will lead
> to duplicate keys in the table.
> 
> Fix the problem by introducing a struct holding object and thread ID and
> using that as key instead of a single integer type field.
> 
> Fixes: f4a3e90ba5739 ("rhashtable-test: extend to test concurrency")
> Reported by: Manuel Messner 
> Signed-off-by: Phil Sutter 

Acked-by: Herbert Xu 
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

Re: [PATCH net] ipv6: no need to return rt->dst.error if it is not null entry.

2017-07-23 Thread Hangbin Liu

Hi Cong,
On Fri, Jul 21, 2017 at 11:42:46AM -0700, Cong Wang wrote:
> On Thu, Jul 20, 2017 at 8:23 AM, Hangbin Liu  wrote:
> > 2017-07-20 23:06 GMT+08:00 Hangbin Liu :
> >>> +++ b/net/ipv6/route.c
> >>> @@ -3637,12 +3637,6 @@ static int inet6_rtm_getroute(struct sk_buff 
> >>> *in_skb, struct nlmsghdr *nlh,
> >>> dst = ip6_route_lookup(net, , 0);
> >>>
> >>> rt = container_of(dst, struct rt6_info, dst);
> >>> -   if (rt->dst.error) {
> >>> -   err = rt->dst.error;
> >>> -   ip6_rt_put(rt);
> >>> -   goto errout;
> >>> -   }
> >>
> >> hmm... or instead of remove this check, should we check all the entry? Like
> >> if ((rt->dst.error && rt != net->ipv6.ip6_null_entry && rt !=
> > ^^ mistake here
> >> net->ipv6.ip6_blk_hole_entry) ||
> >>  rt == net->ipv6.ip6_null_entry )
> >
> > Sorry,  there should be no need to check ip6_null_entry since the
> > error is already
> > -ENETUNREACH. So how about
> 
> Hmm? All of these 3 entries have error set, right??
> So we should only check dst.error...

A question, since you have fixed the ip6_null_entry->rt6i_idev issue via

  ipv6: initialize route null entry in addrconf_init()
  ipv6: reorder ip6_route_dev_notifier after ipv6_dev_notf
  ipv6: only call ip6_route_dev_notify() once for NETDEV_UNREGISTER
  ipv6: avoid unregistering inet6_dev for loopback

Do we still need this net->ipv6.ip6_null_entry check? How about remove all
the checks?

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4d30c96..b5e3fe0 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -3637,17 +3637,6 @@ static int inet6_rtm_getroute(struct sk_buff *in_skb, 
struct nlmsghdr *nlh,
dst = ip6_route_lookup(net, , 0);

rt = container_of(dst, struct rt6_info, dst);
-   if (rt->dst.error) {
-   err = rt->dst.error;
-   ip6_rt_put(rt);
-   goto errout;
-   }
-
-   if (rt == net->ipv6.ip6_null_entry) {
-   err = rt->dst.error;
-   ip6_rt_put(rt);
-   goto errout;
-   }

skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL);
if (!skb) {


Before this patch:
+ ip netns exec client ip -6 route get 2003::1
RTNETLINK answers: Network is unreachable

After this patch:
+ ip netns exec client ip -6 route get 2003::1
unreachable 2003::1 dev lo table unspec proto kernel src 2001::1 metric 
4294967295 error -101


Thanks
Hangbin

[PATCH 1/1] mlx4_en: remove unnecessary returned value

2017-07-23 Thread Zhu Yanjun

The function mlx4_en_arm_cq always returns zero. So change the
return type of the function mlx4_en_arm_cq to void.

CC: Joe Jin 
CC: Junxiao Bi 
Signed-off-by: Zhu Yanjun 
---
 drivers/net/ethernet/mellanox/mlx4/en_cq.c   | 4 +---
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c 
b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index 85fe17e..87d1f4d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -208,12 +208,10 @@ int mlx4_en_set_cq_moder(struct mlx4_en_priv *priv, 
struct mlx4_en_cq *cq)
  cq->moder_cnt, cq->moder_time);
 }
 
-int mlx4_en_arm_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq)
+void mlx4_en_arm_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq)
 {
mlx4_cq_arm(>mcq, MLX4_CQ_DB_REQ_NOT, priv->mdev->uar_map,
>mdev->uar_lock);
-
-   return 0;
 }
 
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h 
b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index d350b21..fdb3ad0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -685,7 +685,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct 
mlx4_en_cq *cq,
int cq_idx);
 void mlx4_en_deactivate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq);
 int mlx4_en_set_cq_moder(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq);
-int mlx4_en_arm_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq);
+void mlx4_en_arm_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq);
 
 void mlx4_en_tx_irq(struct mlx4_cq *mcq);
 u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
-- 
2.7.4

Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-23 Thread Neal Cardwell

On Sun, Jul 23, 2017 at 10:36 PM, Neal Cardwell  wrote:
> On Fri, Jul 21, 2017 at 5:16 PM, Yuchung Cheng  wrote:
>> On Fri,  Jul 21, 2017 at 1:46 PM, Neal Cardwell  wrote:
>>> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu  wrote:

 Hi Yuchung,

 This test scenario is only one example to trigger this bug. In general, as
 long as cwnd <4, the undo function has this bug.
>>>
>>>
>>> Yes, personally I agree that this seems like an issue that is general enough
>>> to be worth fixing. In the sense that, if cwnd <4, then we may well be very
>>> congested. So we don't want to get hit by this bug wherein an undo of a loss
>>> recovery can cause cwnd to suddenly jump (from 1, 2, or 3) up to 4.
>>>
>>> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have this
>>> bug.
>>>
>>> I guess in my mind the only question is whether we want to add a
>>> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
>>> issue (i.e. make every CC module handle it the way CUBIC does), or (my
>> I would prefer the former b/c loss_cwnd may not be universal TCP
>> state, just like ssthresh carries no meaning in some CC (bbr). It also
>> seems also more consistent with the recent change on undo
>>
>> commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
>> Author: Florian Westphal 
>> Date:   Mon Nov 21 14:18:38 2016 +0100
>>
>> tcp: make undo_cwnd mandatory for congestion modules
>>
>
> You are certainly right that it is more pure to keep a CC detail like
> that inside the CC module.
>
> But it's a bit sad to me that we have 9 separate identical
> implementations of a cwnd undo function, and that approach would add 6
> more.
>
> We do have tp->snd_ssthresh and tp->prior_ssthresh, even though not
> all CC modules use ssthresh.
>
> What if we call the field tp->prior_cwnd? Then at least we'd have some
> nice symmetry:
>
> - tp->snd_cwnd,  which is saved in tp->prior_cwnd  (and restored upon undo)
> - tp->snd_ssthresh,  which is saved in tp-> prior_ssthresh  (and
> restored upon undo)
>
> That sounds appealing to me. WDYT?

And, I should add, if we go with the tp->prior_cwnd approach, then we
can have a single "default"/CUBIC-style undo function, instead of 15
separate but identical implementations...

neal

Re: A buggy behavior for Linux TCP Reno and HTCP

2017-07-23 Thread Neal Cardwell

On Fri, Jul 21, 2017 at 5:16 PM, Yuchung Cheng  wrote:
> On Fri, Jul 21, 2017 at 1:46 PM, Neal Cardwell  wrote:
>> On Fri, Jul 21, 2017 at 4:27 PM, Lisong Xu  wrote:
>>>
>>> Hi Yuchung,
>>>
>>> This test scenario is only one example to trigger this bug. In general, as
>>> long as cwnd <4, the undo function has this bug.
>>
>>
>> Yes, personally I agree that this seems like an issue that is general enough
>> to be worth fixing. In the sense that, if cwnd <4, then we may well be very
>> congested. So we don't want to get hit by this bug wherein an undo of a loss
>> recovery can cause cwnd to suddenly jump (from 1, 2, or 3) up to 4.
>>
>> Seems like any of the several CCs that use tcp_reno_undo_cwnd() have this
>> bug.
>>
>> I guess in my mind the only question is whether we want to add a
>> tcp_foo_undo_cwnd() and ca->loss_cwnd to every CC module to handle this
>> issue (i.e. make every CC module handle it the way CUBIC does), or (my
> I would prefer the former b/c loss_cwnd may not be universal TCP
> state, just like ssthresh carries no meaning in some CC (bbr). It also
> seems also more consistent with the recent change on undo
>
> commit e97991832a4ea4a5f47d65f068a4c966a2eb5730
> Author: Florian Westphal 
> Date:   Mon Nov 21 14:18:38 2016 +0100
>
> tcp: make undo_cwnd mandatory for congestion modules
>

You are certainly right that it is more pure to keep a CC detail like
that inside the CC module.

But it's a bit sad to me that we have 9 separate identical
implementations of a cwnd undo function, and that approach would add 6
more.

We do have tp->snd_ssthresh and tp->prior_ssthresh, even though not
all CC modules use ssthresh.

What if we call the field tp->prior_cwnd? Then at least we'd have some
nice symmetry:

- tp->snd_cwnd,  which is saved in tp->prior_cwnd  (and restored upon undo)
- tp->snd_ssthresh,  which is saved in tp-> prior_ssthresh  (and
restored upon undo)

That sounds appealing to me. WDYT?

neal

Re: [PATCH net] ipv6: no need to return rt->dst.error if it is not null entry.

2017-07-23 Thread Hangbin Liu

Hi Roopa,
On Sat, Jul 22, 2017 at 09:55:51PM -0700, Roopa Prabhu wrote:
> On Thu, Jul 20, 2017 at 7:51 AM, Hangbin Liu  wrote:
> > After commit 18c3a61c4264 ("net: ipv6: RTM_GETROUTE: return matched fib
> > result when requested"). When we get a prohibit ertry, we will return
> > -EACCES directly.
> >
> > Before:
> > + ip netns exec client ip -6 route get 2003::1
> > prohibit 2003::1 dev lo table unspec proto kernel src 2001::1 metric
> > 4294967295 error -13
> >
> > After:
> > + ip netns exec server ip -6 route get 2002::1
> > RTNETLINK answers: Network is unreachable

Sorry, I have a copy/paste error here, the return error should be

+ ip netns exec server ip -6 route get 2002::1
RTNETLINK answers: Permission denied

I fixed it in v2 patch, but forgot to add
#ifdef CONFIG_IPV6_MULTIPLE_TABLES before net->ipv6.ip6_prohibit_entry.

Since you acked this patch. I will post a v3 patch with fixed comment.

> >
> > Since we will check the ip6_null_entry later. There is not sense
> > to check the dst.error after get rt.
> >
> > Fixes: 18c3a61c4264 ("net: ipv6: RTM_GETROUTE: return matched fib...")
> > Signed-off-by: Hangbin Liu 
> > ---
> 
> Acked-by: Roopa Prabhu 

Thanks
Hangbin

Re: [PATCH] ceph: kernel client startsync can be removed

2017-07-23 Thread Yan, Zheng


> On 21 Jul 2017, at 17:20, Yanhu Cao  wrote:
> 
> kernel client is still sending write,startsync.
> that startsync is a no-op (has been for years) and can probably be removed
> 
> Link: http://tracker.ceph.com/issues/20604
> 
> Signed-off-by: Yanhu Cao 
> ---
> fs/ceph/addr.c | 9 ++---
> fs/ceph/file.c | 5 +
> include/linux/ceph/rados.h | 1 -
> net/ceph/osd_client.c  | 5 -
> 4 files changed, 3 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
> index 5083628..f9a1805 100644
> --- a/fs/ceph/addr.c
> +++ b/fs/ceph/addr.c
> @@ -939,7 +939,7 @@ static int ceph_writepages_start(struct address_space 
> *mapping,
>   break;
>   }
> 
> - num_ops = 1 + do_sync;
> + num_ops = 1;
>   strip_unit_end = page->index +
>   ((len - 1) >> PAGE_SHIFT);
> 
> @@ -1045,7 +1045,7 @@ static int ceph_writepages_start(struct address_space 
> *mapping,
>   for (i = 0; i < locked_pages; i++) {
>   u64 cur_offset = page_offset(pages[i]);
>   if (offset + len != cur_offset) {
> - if (op_idx + do_sync + 1 == req->r_num_ops)
> + if (op_idx + 1 == req->r_num_ops)
>   break;
>   osd_req_op_extent_dup_last(req, op_idx,
>  cur_offset - offset);
> @@ -1082,17 +1082,12 @@ static int ceph_writepages_start(struct address_space 
> *mapping,
>0, !!pool, false);
>   osd_req_op_extent_update(req, op_idx, len);
> 
> - if (do_sync) {
> - op_idx++;
> - osd_req_op_init(req, op_idx, CEPH_OSD_OP_STARTSYNC, 0);
> - }
>   BUG_ON(op_idx + 1 != req->r_num_ops);
> 
>   pool = NULL;
>   if (i < locked_pages) {
>   BUG_ON(num_ops <= req->r_num_ops);
>   num_ops -= req->r_num_ops;
> - num_ops += do_sync;
>   locked_pages -= i;
> 
>   /* allocate new pages array for next request */
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index 3d48c41..405c99b 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -800,7 +800,6 @@ static void ceph_aio_retry_work(struct work_struct *work)
>   }
> 
>   req->r_ops[0] = orig_req->r_ops[0];
> - osd_req_op_init(req, 1, CEPH_OSD_OP_STARTSYNC, 0);
> 
>   req->r_mtime = aio_req->mtime;
>   req->r_data_offset = req->r_ops[0].extent.offset;
> @@ -874,8 +873,7 @@ static void ceph_aio_retry_work(struct work_struct *work)
>   vino = ceph_vino(inode);
>   req = ceph_osdc_new_request(>client->osdc, >i_layout,
>   vino, pos, , 0,
> - /*include a 'startsync' command*/
> - write ? 2 : 1,
> + 1,
>   write ? CEPH_OSD_OP_WRITE :
>   CEPH_OSD_OP_READ,
>   flags, snapc,
> @@ -922,7 +920,6 @@ static void ceph_aio_retry_work(struct work_struct *work)
>   truncate_inode_pages_range(inode->i_mapping, pos,
>   (pos+len) | (PAGE_SIZE - 1));
> 
> - osd_req_op_init(req, 1, CEPH_OSD_OP_STARTSYNC, 0);
>   req->r_mtime = mtime;
>   }
> 
> diff --git a/include/linux/ceph/rados.h b/include/linux/ceph/rados.h
> index 385db08..ba5adf0 100644
> --- a/include/linux/ceph/rados.h
> +++ b/include/linux/ceph/rados.h
> @@ -226,7 +226,6 @@ struct ceph_eversion {
>   \
>   /* fancy write */   \
>   f(APPEND,   __CEPH_OSD_OP(WR, DATA, 6), "append")   \
> - f(STARTSYNC,__CEPH_OSD_OP(WR, DATA, 7), "startsync")\
>   f(SETTRUNC, __CEPH_OSD_OP(WR, DATA, 8), "settrunc") \
>   f(TRIMTRUNC,__CEPH_OSD_OP(WR, DATA, 9), "trimtrunc")\
>   \
> diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
> index 901bb82..5c9d696 100644
> --- a/net/ceph/osd_client.c
> +++ b/net/ceph/osd_client.c
> @@ -863,8 +863,6 @@ static u32 osd_req_encode_op(struct ceph_osd_op *dst,
>   dst->cls.method_len = src->cls.method_len;
>   dst->cls.indata_len = cpu_to_le32(src->cls.indata_len);
>

[PATCH net-next v11 1/4] net netlink: Add new type NLA_BITFIELD_32

2017-07-23 Thread Jamal Hadi Salim

From: Jamal Hadi Salim 

Generic bitflags attribute content sent to the kernel by user.
With this type the user can either set or unset a flag in the
kernel.

The nla_value is a bitmap that defines the values being set
The nla_selector is a bitmask that defines which value is legit.

A check is made to ensure the rules that a kernel subsystem always
conforms to bitflags the kernel already knows about. i.e
if the user tries to set a bit flag that is not understood then
the _it will be rejected_.

In the most basic form, the user specifies the attribute policy as:
[ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data =  },

where myvalidflags is the bit mask of the flags the kernel understands.

If the user _does not_ provide myvalidflags then the attribute will
also be rejected.

Examples:
nla_value = 0x0, and nla_selector = 0x1
implies we are selecting bit 1 and we want to set its value to 0.

nla_value = 0x2, and nla_selector = 0x2
implies we are selecting bit 2 and we want to set its value to 1.

Signed-off-by: Jamal Hadi Salim 
---
 include/net/netlink.h|  4 
 include/uapi/linux/netlink.h | 17 +
 lib/nlattr.c | 21 +
 3 files changed, 42 insertions(+)

diff --git a/include/net/netlink.h b/include/net/netlink.h
index ef8e6c3..e33d1fb 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -178,6 +178,7 @@ enum {
NLA_S16,
NLA_S32,
NLA_S64,
+   NLA_BITFIELD_32,
__NLA_TYPE_MAX,
 };
 
@@ -206,6 +207,7 @@ enum {
  *NLA_MSECSLeaving the length field zero will verify the
  * given type fits, using it verifies minimum length
  * just like "All other"
+ *NLA_BITFIELD_32  A 32-bit bitmap/bitselector attribute
  *All otherMinimum length of attribute payload
  *
  * Example:
@@ -213,11 +215,13 @@ enum {
  * [ATTR_FOO] = { .type = NLA_U16 },
  * [ATTR_BAR] = { .type = NLA_STRING, .len = BARSIZ },
  * [ATTR_BAZ] = { .len = sizeof(struct mystruct) },
+ * [ATTR_GOO] = { .type = NLA_BITFIELD_32, .validation_data = 
 },
  * };
  */
 struct nla_policy {
u16 type;
u16 len;
+   void*validation_data;
 };
 
 /**
diff --git a/include/uapi/linux/netlink.h b/include/uapi/linux/netlink.h
index f86127a..0ac05a6 100644
--- a/include/uapi/linux/netlink.h
+++ b/include/uapi/linux/netlink.h
@@ -226,5 +226,22 @@ struct nlattr {
 #define NLA_ALIGN(len) (((len) + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1))
 #define NLA_HDRLEN ((int) NLA_ALIGN(sizeof(struct nlattr)))
 
+/* Generic 32 bitflags attribute content sent to the kernel.
+ *
+ * The nla_value is a bitmap that defines the values being set
+ * The nla_selector is a bitmask that defines which value is legit
+ *
+ * Examples:
+ *  nla_value = 0x0, and nla_selector = 0x1
+ *  implies we are selecting bit 1 and we want to set its value to 0.
+ *
+ *  nla_value = 0x2, and nla_selector = 0x2
+ *  implies we are selecting bit 2 and we want to set its value to 1.
+ *
+ */
+struct nla_bitfield_32 {
+   __u32 nla_value;
+   __u32 nla_selector;
+};
 
 #endif /* _UAPI__LINUX_NETLINK_H */
diff --git a/lib/nlattr.c b/lib/nlattr.c
index fb52435..c8aad7e 100644
--- a/lib/nlattr.c
+++ b/lib/nlattr.c
@@ -27,6 +27,20 @@
[NLA_S64]   = sizeof(s64),
 };
 
+static int validate_nla_bitfield_32(const struct nlattr *nla, void *valid_data)
+{
+   const struct nla_bitfield_32 *nbf = nla_data(nla);
+   u32 *valid_flags_mask = valid_data;
+
+   if (!valid_data)
+   return -EINVAL;
+
+   if (nbf->nla_value & ~*valid_flags_mask)
+   return -EINVAL;
+
+   return 0;
+}
+
 static int validate_nla(const struct nlattr *nla, int maxtype,
const struct nla_policy *policy)
 {
@@ -46,6 +60,13 @@ static int validate_nla(const struct nlattr *nla, int 
maxtype,
return -ERANGE;
break;
 
+   case NLA_BITFIELD_32:
+   if (attrlen != sizeof(struct nla_bitfield_32))
+   return -ERANGE;
+
+   return validate_nla_bitfield_32(nla, pt->validation_data);
+   break;
+
case NLA_NUL_STRING:
if (pt->len)
minlen = min_t(int, attrlen, pt->len + 1);
-- 
1.9.1

[PATCH net-next v11 4/4] net sched actions: add time filter for action dumping

2017-07-23 Thread Jamal Hadi Salim

From: Jamal Hadi Salim 

This patch adds support for filtering based on time since last used.
When we are dumping a large number of actions it is useful to
have the option of filtering based on when the action was last
used to reduce the amount of data crossing to user space.

With this patch the user space app sets the TCA_ROOT_TIME_DELTA
attribute with the value in milliseconds with "time of interest
since now".  The kernel converts this to jiffies and does the
filtering comparison matching entries that have seen activity
since then and returns them to user space.
Old kernels and old tc continue to work in legacy mode since
they dont specify this attribute.

Some example (we have 400 actions bound to 400 filters); at
installation time using hacked tc setting the time of
interest to 120 seconds earlier:
prompt$ hackedtc actions ls action gact since 12| grep index | wc -l
400

go get some coffee and wait for > 120 seconds and try again:

prompt$ hackedtc actions ls action gact since 12 | grep index | wc -l
0

Lets see a filter bound to one of these actions:
..
filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule 
hit 2 success 1)
  match 7f02/ at 12 (success 1 )
action order 1: gact action pass
 random type none pass val 0
 index 23 ref 2 bind 1 installed 1145 sec used 802 sec
Action statistics:
Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0


that coffee took long, no? It was good.

Now lets ping -c 1 127.0.0.2, then run the actions again:
prompt$ hackedtc actions ls action gact since 120 | grep index | wc -l
1

More details please:

prompt$ hackedtc -s actions ls action gact since 12

action order 0: gact action pass
 random type none pass val 0
 index 23 ref 2 bind 1 installed 1270 sec used 30 sec
Action statistics:
Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

And the filter?

filter pref 10 u32
filter pref 10 u32 fh 800: ht divisor 1
filter pref 10 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10  (rule 
hit 4 success 2)
  match 7f02/ at 12 (success 2 )
action order 1: gact action pass
 random type none pass val 0
 index 23 ref 2 bind 1 installed 1324 sec used 84 sec
Action statistics:
Sent 168 bytes 2 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

Signed-off-by: Jamal Hadi Salim 
---
 include/uapi/linux/rtnetlink.h |  1 +
 net/sched/act_api.c| 20 +++-
 2 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index bfa80a6..dab7dad 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -691,6 +691,7 @@ enum {
 #define TCAA_MAX TCA_ROOT_TAB
TCA_ROOT_FLAGS,
TCA_ROOT_COUNT,
+   TCA_ROOT_TIME_DELTA, /* in msecs */
__TCA_ROOT_MAX,
 #defineTCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
 };
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 15d6c46..9ffad9c 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -111,6 +111,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, 
struct sk_buff *skb,
 {
int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
u32 act_flags = cb->args[2];
+   unsigned long jiffy_since = cb->args[3];
struct nlattr *nest;
 
spin_lock_bh(>lock);
@@ -128,6 +129,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, 
struct sk_buff *skb,
if (index < s_i)
continue;
 
+   if (jiffy_since &&
+   time_after(jiffy_since,
+  (unsigned long)p->tcfa_tm.lastuse))
+   continue;
+
nest = nla_nest_start(skb, n_i);
if (nest == NULL)
goto nla_put_failure;
@@ -145,9 +151,11 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, 
struct sk_buff *skb,
}
}
 done:
+   if (index >= 0)
+   cb->args[0] = index + 1;
+
spin_unlock_bh(>lock);
if (n_i) {
-   cb->args[0] += n_i;
if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
cb->args[1] = n_i;
}
@@ -1077,6 +1085,7 @@ static int tcf_action_add(struct net *net, struct nlattr 
*nla,
 static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1] = {
[TCA_ROOT_FLAGS] = { .type = NLA_BITFIELD_32,
 .validation_data = _root_flags_allowed },
+   [TCA_ROOT_TIME_DELTA]  = { .type = NLA_U32 },
 };
 
 static int tc_ctl_action(struct sk_buff *skb, struct nlmsghdr *n,
@@ -1167,7 +1176,9 @@ static int tc_dump_action(struct sk_buff *skb, struct

[PATCH net-next v11 2/4] net sched actions: Use proper root attribute table for actions

2017-07-23 Thread Jamal Hadi Salim

From: Jamal Hadi Salim 

Bug fix for an issue which has been around for about a decade.
We got away with it because the enumeration was larger than needed.

Fixes: 7ba699c604ab ("[NET_SCHED]: Convert actions from rtnetlink to new 
netlink API")
Suggested-by: Jiri Pirko 
Reviewed-by: Simon Horman 
Signed-off-by: Jamal Hadi Salim 
---
 net/sched/act_api.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index f2e9ed3..848370e 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -1072,7 +1072,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct 
nlmsghdr *n,
 struct netlink_ext_ack *extack)
 {
struct net *net = sock_net(skb->sk);
-   struct nlattr *tca[TCA_ACT_MAX + 1];
+   struct nlattr *tca[TCAA_MAX + 1];
u32 portid = skb ? NETLINK_CB(skb).portid : 0;
int ret = 0, ovr = 0;
 
@@ -1080,7 +1080,7 @@ static int tc_ctl_action(struct sk_buff *skb, struct 
nlmsghdr *n,
!netlink_capable(skb, CAP_NET_ADMIN))
return -EPERM;
 
-   ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCA_ACT_MAX, NULL,
+   ret = nlmsg_parse(n, sizeof(struct tcamsg), tca, TCAA_MAX, NULL,
  extack);
if (ret < 0)
return ret;
-- 
1.9.1

[PATCH net-next v11 3/4] net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch

2017-07-23 Thread Jamal Hadi Salim

From: Jamal Hadi Salim 

When you dump hundreds of thousands of actions, getting only 32 per
dump batch even when the socket buffer and memory allocations allow
is inefficient.

With this change, the user will get as many as possibly fitting
within the given constraints available to the kernel.

The top level action TLV space is extended. An attribute
TCA_ROOT_FLAGS is used to carry flags; flag TCA_FLAG_LARGE_DUMP_ON
is set by the user indicating the user is capable of processing
these large dumps. Older user space which doesnt set this flag
doesnt get the large (than 32) batches.
The kernel uses the TCA_ROOT_COUNT attribute to tell the user how many
actions are put in a single batch. As such user space app knows how long
to iterate (independent of the type of action being dumped)
instead of hardcoded maximum of 32 thus maintaining backward compat.

Some results dumping 1.5M actions below:
first an unpatched tc which doesnt understand these features...

prompt$ time -p tc actions ls action gact | grep index | wc -l
150
real 1388.43
user 2.07
sys 1386.79

Now lets see a patched tc which sets the correct flags when requesting
a dump:

prompt$ time -p updatedtc actions ls action gact | grep index | wc -l
150
real 178.13
user 2.02
sys 176.96

That is about 8x performance improvement for tc app which sets its
receive buffer to about 32K.

Signed-off-by: Jamal Hadi Salim 
---
 include/net/netlink.h  | 12 +++
 include/uapi/linux/rtnetlink.h | 22 +--
 net/sched/act_api.c| 48 +-
 3 files changed, 70 insertions(+), 12 deletions(-)

diff --git a/include/net/netlink.h b/include/net/netlink.h
index e33d1fb..87c0b15 100644
--- a/include/net/netlink.h
+++ b/include/net/netlink.h
@@ -1207,6 +1207,18 @@ static inline struct in6_addr nla_get_in6_addr(const 
struct nlattr *nla)
 }
 
 /**
+ * nla_get_bitfield_32 - return payload of 32 bitfield attribute
+ * @nla: nla_bitfield_32 attribute
+ */
+static inline struct nla_bitfield_32 nla_get_bitfield_32(const struct nlattr 
*nla)
+{
+   struct nla_bitfield_32 tmp;
+
+   nla_memcpy(, nla, sizeof(tmp));
+   return tmp;
+}
+
+/**
  * nla_memdup - duplicate attribute memory (kmemdup)
  * @src: netlink attribute to duplicate from
  * @gfp: GFP mask
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index d148505..bfa80a6 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -683,10 +683,28 @@ struct tcamsg {
unsigned char   tca__pad1;
unsigned short  tca__pad2;
 };
+
+enum {
+   TCA_ROOT_UNSPEC,
+   TCA_ROOT_TAB,
+#define TCA_ACT_TAB TCA_ROOT_TAB
+#define TCAA_MAX TCA_ROOT_TAB
+   TCA_ROOT_FLAGS,
+   TCA_ROOT_COUNT,
+   __TCA_ROOT_MAX,
+#defineTCA_ROOT_MAX (__TCA_ROOT_MAX - 1)
+};
+
 #define TA_RTA(r)  ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct 
tcamsg
 #define TA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct tcamsg))
-#define TCA_ACT_TAB 1 /* attr type must be >=1 */  
-#define TCAA_MAX 1
+/* tcamsg flags stored in attribute TCA_ROOT_FLAGS
+ *
+ * TCA_FLAG_LARGE_DUMP_ON user->kernel to request for larger than 
TCA_ACT_MAX_PRIO
+ * actions in a dump. All dump responses will contain the number of actions
+ * being dumped stored in for user app's consumption in TCA_ROOT_COUNT
+ *
+ */
+#define TCA_FLAG_LARGE_DUMP_ON (1 << 0)
 
 /* New extended info filters for IFLA_EXT_MASK */
 #define RTEXT_FILTER_VF(1 << 0)
diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 848370e..15d6c46 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -110,6 +110,7 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, 
struct sk_buff *skb,
   struct netlink_callback *cb)
 {
int err = 0, index = -1, i = 0, s_i = 0, n_i = 0;
+   u32 act_flags = cb->args[2];
struct nlattr *nest;
 
spin_lock_bh(>lock);
@@ -138,14 +139,18 @@ static int tcf_dump_walker(struct tcf_hashinfo *hinfo, 
struct sk_buff *skb,
}
nla_nest_end(skb, nest);
n_i++;
-   if (n_i >= TCA_ACT_MAX_PRIO)
+   if (!(act_flags & TCA_FLAG_LARGE_DUMP_ON) &&
+   n_i >= TCA_ACT_MAX_PRIO)
goto done;
}
}
 done:
spin_unlock_bh(>lock);
-   if (n_i)
+   if (n_i) {
cb->args[0] += n_i;
+   if (act_flags & TCA_FLAG_LARGE_DUMP_ON)
+   cb->args[1] = n_i;
+   }
return n_i;
 
 nla_put_failure:
@@ -1068,11 +1073,17 @@ static int tcf_action_add(struct net *net, struct 
nlattr *nla,
return tcf_add_notify(net, n, , portid);
 }
 
+static u32 tcaa_root_flags_allowed = TCA_FLAG_LARGE_DUMP_ON;
+static const struct nla_policy tcaa_policy[TCA_ROOT_MAX + 1]

[PATCH net-next v11 0/4] net sched actions: improve dump performance

2017-07-23 Thread Jamal Hadi Salim

From: Jamal Hadi Salim 


Changes since v10:
-
1) Jiri: move type->validate_content() to its own patch
Jamal: decided to remove it altogether so we can get this patch set in.

2) Change name of NLA_FLAG_BITS to NLA_BITFIELD_32 based on discussions
with D. Ahern and Jiri. D. Ahern suggests to make this a variable bitmap size.
My analysis at this point is it too complex and i only need a few bit
flags. If we run out of bits someone else can create a new NLA_BITFIELD_XXX
and start using that. So please let this go.

3) Jamal - Add Suggested-by: Jiri for type NLA_BITFIELD_32

4) Jiri: Change name allowed_flags to tcaa_root_flags_allowed

5) Jiri: Introduce nla_get_flag_bits_values() helper instead of using
memcpy for retrieving nla_bitfield_32 fields.

Changes since v9:
-

1) General consensus:
- remove again the use of BIT() to maintain uapi consistency ;->

1) Jiri:
- Add a new netlink type NLA_FLAG_BITS to check for valid bits 
  and use it instead of inline vetting (patch 4/4 now)
  
Changes since v8:
-

1) Jiri:
- Add back the use of BIT(). Eventually fix iproute2 instead
- Rename VALID_TCA_FLAGS to VALID_TCA_ROOT_FLAGS

Changes since v7:
-

Jamal:
No changes.
Patch 1 went out twice. Resend without two copies of patch 1

changes since v6:
-

1) DaveM:
New rules for netlink messages. From now on we are going to start
checking for bits that are not used and rejecting anything we dont
understand. In the future this is going to require major changes
to user space code (tc etc). This is just a start.

To quote, David:
"
 Again, bits you aren't using now, make sure userspace doesn't
   set them.  And if it does, reject.
"
Added checks for ensuring things work as above.

2) Jiri:
a)Fix the commit message to properly use "Fixes" description
b)Align assignments for nla_policy

Changes since v5:


0)
Remove use of BIT() because it is kernel specific. Requires a separate
patch (Jiri can submit that in his cleanups)

1)To paraphrase Eric D.

"memcpy(nla_data(count_attr), >args[1], sizeof(u32));
wont work on 64bit BE machines because cb->args[1]
(which is 64 bit is larger in size than sizeof(u32))"

Fixed

2) Jiri Pirko

i) Spotted a bug fix mixed in the patch for wrong TLV
fix. Add patch 1/3 to address this. Make part of this
series because of dependencies.

ii) Rename ACT_LARGE_DUMP_ON -> TCA_FLAG_LARGE_DUMP_ON

iii) Satisfy Jiri's obsession against the noun "tcaa"
a)Rename struct nlattr *tcaa --> struct nlattr *tb
b)Rename TCAA_ACT_XXX -> TCA_ROOT_XXX

Changes since v4:
-

1) Eric D.

pointed out that when all skb space is used up by the dump
there will be no space to insert the TCAA_ACT_COUNT attribute.

2) Jiri:

i) Change:

enum {
TCAA_UNSPEC,
TCAA_ACT_TAB,
TCAA_ACT_FLAGS,
TCAA_ACT_COUNT,
TCAA_ACT_TIME_FILTER,
__TCAA_MAX
};

#define TCAA_MAX (__TCAA_MAX - 1)
#define ACT_LARGE_DUMP_ON   (1 << 0)

to:
enum {
   TCAA_UNSPEC,
   TCAA_ACT_TAB,
#define TCA_ACT_TAB TCAA_ACT_TAB
   TCAA_ACT_FLAGS,
   TCAA_ACT_COUNT,
   __TCAA_MAX,
#defineTCAA_MAX (__TCAA_MAX - 1)
};

#define ACT_LARGE_DUMP_ON  BIT(0)

Jiri plans to followup with the rest of the code to make the
style consistent.

ii) Rename attribute TCAA_ACT_TIME_FILTER --> TCAA_ACT_TIME_DELTA

iii) Rename variable jiffy_filter --> jiffy_since
iv) Rename msecs_filter --> msecs_since
v) get rid of unused cb->args[0] and rename cb->args[4] to cb->args[0]

Earlier Changes

- Jiri mostly on names of things.

Jamal Hadi Salim (4):
  net netlink: Add new type NLA_BITFIELD_32
  net sched actions: Use proper root attribute table for actions
  net sched actions: dump more than TCA_ACT_MAX_PRIO actions per batch
  net sched actions: add time filter for action dumping

 include/net/netlink.h  | 16 ++
 include/uapi/linux/netlink.h   | 17 +++
 include/uapi/linux/rtnetlink.h | 23 --
 lib/nlattr.c   | 21 +
 net/sched/act_api.c| 68 +++---
 5 files changed, 132 insertions(+), 13 deletions(-)

-- 
1.9.1

Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

2017-07-23 Thread Ding Tianhong



On 2017/7/24 9:09, Ding Tianhong wrote:
> 
> 
> On 2017/7/24 1:03, Cong Wang wrote:
>> On Sun, Jul 23, 2017 at 5:48 AM, liujian (CE)  wrote:
>>> Hi
>>>
>>> I find it caused by below steps:
>>> 1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1
>>> 2. set tp_block_nr to 0
>>> Then pg_vec was freed, and we did not delete the timer?
>>
>> Thanks for testing!
>>
>> Ah, I overlook the initialization case in my previous patch.
>>
>> How about the following one? Does it cover all the cases?
>>
>>
>> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
>> index 008bb34ee324..0615c2a950fa 100644
>> --- a/net/packet/af_packet.c
>> +++ b/net/packet/af_packet.c
>> @@ -4329,7 +4329,7 @@ static int packet_set_ring(struct sock *sk,
>> union tpacket_req_u *req_u,
>> register_prot_hook(sk);
>> }
>> spin_unlock(>bind_lock);
>> -   if (closing && (po->tp_version > TPACKET_V2)) {
>> +   if (pg_vec && (po->tp_version > TPACKET_V2)) {
>> /* Because we don't support block-based V3 on tx-ring */
>> if (!tx_ring)
>> prb_shutdown_retire_blk_timer(po, rb_queue);
>>
>> .
> 
> Hi, Cong:
> 
> It looks like could not cover the case: req->tp_block_nr = 2 -> 
> reg->tp_block_nr = 1 .
> 

Oh, looks like this case would never happen, so I think your solution is ok.

> what about this way:
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -4331,13 +4331,17 @@ static int packet_set_ring(struct sock *sk, union 
> tpacket_req_u *req_u,
> register_prot_hook(sk);
> }
> spin_unlock(>bind_lock);
> -   if (closing && (po->tp_version > TPACKET_V2)) {
> +   if ((closing || (pg_vec && !reg->tp_block_nr))&& (po->tp_version > 
> TPACKET_V2)) {
> /* Because we don't support block-based V3 on tx-ring */
> if (!tx_ring)
> prb_shutdown_retire_blk_timer(po, rb_queue);
> 
> 

>>
> 
> 
> .
>

Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

2017-07-23 Thread Ding Tianhong



On 2017/7/24 1:03, Cong Wang wrote:
> On Sun, Jul 23, 2017 at 5:48 AM, liujian (CE)  wrote:
>> Hi
>>
>> I find it caused by below steps:
>> 1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1
>> 2. set tp_block_nr to 0
>> Then pg_vec was freed, and we did not delete the timer?
> 
> Thanks for testing!
> 
> Ah, I overlook the initialization case in my previous patch.
> 
> How about the following one? Does it cover all the cases?
> 
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index 008bb34ee324..0615c2a950fa 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -4329,7 +4329,7 @@ static int packet_set_ring(struct sock *sk,
> union tpacket_req_u *req_u,
> register_prot_hook(sk);
> }
> spin_unlock(>bind_lock);
> -   if (closing && (po->tp_version > TPACKET_V2)) {
> +   if (pg_vec && (po->tp_version > TPACKET_V2)) {
> /* Because we don't support block-based V3 on tx-ring */
> if (!tx_ring)
> prb_shutdown_retire_blk_timer(po, rb_queue);
> 
> .

Hi, Cong:

It looks like could not cover the case: req->tp_block_nr = 2 -> 
reg->tp_block_nr = 1 .

what about this way:
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4331,13 +4331,17 @@ static int packet_set_ring(struct sock *sk, union 
tpacket_req_u *req_u,
register_prot_hook(sk);
}
spin_unlock(>bind_lock);
-   if (closing && (po->tp_version > TPACKET_V2)) {
+   if ((closing || (pg_vec && !reg->tp_block_nr))&& (po->tp_version > 
TPACKET_V2)) {
/* Because we don't support block-based V3 on tx-ring */
if (!tx_ring)
prb_shutdown_retire_blk_timer(po, rb_queue);


>

Re: [PATCH V2 3/4] net-next: dsa: fix flow dissection

2017-07-23 Thread kbuild test robot

Hi John,

[auto build test ERROR on net-next/master]
[also build test ERROR on v4.13-rc1 next-20170721]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/John-Crispin/net-next-dsa-move-struct-dsa_device_ops-to-the-global-header-file/20170724-034620
config: i386-randconfig-a0-07231650 (attached as .config)
compiler: gcc-5 (Debian 5.4.1-2) 5.4.1 20160904
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All errors (new ones prefixed by >>):

   net/core/flow_dissector.c: In function '__skb_flow_dissect':
>> net/core/flow_dissector.c:449:18: error: 'struct net_device' has no member 
>> named 'dsa_ptr'
   ops = skb->dev->dsa_ptr->tag_ops;
 ^

vim +449 net/core/flow_dissector.c

   404  
   405  /**
   406   * __skb_flow_dissect - extract the flow_keys struct and return it
   407   * @skb: sk_buff to extract the flow from, can be NULL if the rest are 
specified
   408   * @flow_dissector: list of keys to dissect
   409   * @target_container: target structure to put dissected values into
   410   * @data: raw buffer pointer to the packet, if NULL use skb->data
   411   * @proto: protocol for which to get the flow, if @data is NULL use 
skb->protocol
   412   * @nhoff: network header offset, if @data is NULL use 
skb_network_offset(skb)
   413   * @hlen: packet header length, if @data is NULL use skb_headlen(skb)
   414   *
   415   * The function will try to retrieve individual keys into target 
specified
   416   * by flow_dissector from either the skbuff or a raw buffer specified 
by the
   417   * rest parameters.
   418   *
   419   * Caller must take care of zeroing target container memory.
   420   */
   421  bool __skb_flow_dissect(const struct sk_buff *skb,
   422  struct flow_dissector *flow_dissector,
   423  void *target_container,
   424  void *data, __be16 proto, int nhoff, int hlen,
   425  unsigned int flags)
   426  {
   427  struct flow_dissector_key_control *key_control;
   428  struct flow_dissector_key_basic *key_basic;
   429  struct flow_dissector_key_addrs *key_addrs;
   430  struct flow_dissector_key_ports *key_ports;
   431  struct flow_dissector_key_icmp *key_icmp;
   432  struct flow_dissector_key_tags *key_tags;
   433  struct flow_dissector_key_vlan *key_vlan;
   434  bool skip_vlan = false;
   435  u8 ip_proto = 0;
   436  bool ret;
   437  
   438  if (!data) {
   439  data = skb->data;
   440  proto = skb_vlan_tag_present(skb) ?
   441   skb->vlan_proto : skb->protocol;
   442  nhoff = skb_network_offset(skb);
   443  hlen = skb_headlen(skb);
   444  
   445  if (unlikely(netdev_uses_dsa(skb->dev))) {
   446  const struct dsa_device_ops *ops;
   447  u8 *p = (u8 *)data;
   448  
 > 449  ops = skb->dev->dsa_ptr->tag_ops;
   450  if (ops->hash_proto_off)
   451  proto = (u16)p[ops->hash_proto_off];
   452  hlen -= ops->hash_nh_off;
   453  nhoff += ops->hash_nh_off;
   454  }
   455  }
   456  
   457  /* It is ensured by skb_flow_dissector_init() that control key 
will
   458   * be always present.
   459   */
   460  key_control = skb_flow_dissector_target(flow_dissector,
   461  
FLOW_DISSECTOR_KEY_CONTROL,
   462  target_container);
   463  
   464  /* It is ensured by skb_flow_dissector_init() that basic key 
will
   465   * be always present.
   466   */
   467  key_basic = skb_flow_dissector_target(flow_dissector,
   468FLOW_DISSECTOR_KEY_BASIC,
   469target_container);
   470  
   471  if (dissector_uses_key(flow_dissector,
   472 FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
   473  struct ethhdr *eth = eth_hdr(skb);
   474  struct flow_dissector_key_eth_addrs *key_eth_addrs;
   475  
   476  key_eth_addrs = 
skb_flow_dissector_target(flow_dissector,
   477
FLOW_DISSECTOR_KEY_ETH_ADDRS,
   478
target_container);
   479  memcpy(key_eth_addrs, >h_dest, 
sizeof(*key_eth_addrs));
   480  }
   481  
   482  proto_again:
   483  switch (proto) {
   484

Kernel TLS in 4.13-rc1

2017-07-23 Thread David Oberhollenzer

Hi!

I recently wanted to take a look at the kernel TLS support that
made it into 4.13-rc1, but ran into some issues.

After fixing the benchmark/test tool that the patch description
linked to (https://github.com/Mellanox/tls-af_ktls_tool) to make
sure that the server and client actually *agree* on AES-128-GCM,
I simply ran the client program with the --verify-sendpage option.

The handshake and setting up of the sockets appears to work but
the program complains that the sent and received page contents
do not match (sent is 0x12 repeated all over and received looks
pretty random).

I compiled the 4.13-rc1 tarball from kernel.org with
defconfig/kvmconfig for x86_64 and ran it on qemu using a
freshly debootstraped Debian sid rootfs.

I previously also tried it on a physical machine (localmodconfig,
also x86_64), running CentOS 7 and a custom build of recent gnutls
and its dependencies, with the same results.

Surely somebody must have tested this before it was merged? What
am I missing? Am I using a broken version of the benchmark tool
or am I holding it wrong?


Thanks,

David

?

2017-07-23 Thread Robert

> Did you receive my previous mail ? When and what time can i call you?

[PATCH] of_mdio: kill useless variable in of_phy_register_fixed_link()

2017-07-23 Thread Sergei Shtylyov

of_phy_register_fixed_link() declares the 'err' variable to hold the result
of of_property_read_string() but only uses it once after that, while that
function can be called directly from the *if* statement...

Remove that variable and move/regroup 'link_gpio' and 'len' variables in
order to sort the declarations in the reverse Xmas tree order -- to please
DaveM. ;-)

Signed-off-by: Sergei Shtylyov 

---
 drivers/of/of_mdio.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

Index: linux/drivers/of/of_mdio.c
===
--- linux.orig/drivers/of/of_mdio.c
+++ linux/drivers/of/of_mdio.c
@@ -422,13 +422,11 @@ int of_phy_register_fixed_link(struct de
struct fixed_phy_status status = {};
struct device_node *fixed_link_node;
const __be32 *fixed_link_prop;
-   int link_gpio;
-   int len, err;
struct phy_device *phy;
const char *managed;
+   int link_gpio, len;
 
-   err = of_property_read_string(np, "managed", );
-   if (err == 0) {
+   if (of_property_read_string(np, "managed", ) == 0) {
if (strcmp(managed, "in-band-status") == 0) {
/* status is zeroed, namely its .link member */
phy = fixed_phy_register(PHY_POLL, , -1, np);

[PATCH net-next] skbuff: re-add check for NULL skb->head in kfree_skb path

2017-07-23 Thread Florian Westphal

A null check is needed after all.  netlink skbs can have skb->head be
backed by vmalloc.  The netlink destructor vfree()s head, then sets it to
NULL.  We then panic in skb_release_data with a NULL dereference.

Re-add such a test.

Alternative would be to switch to kvfree to free skb->head memory
and remove the special handling in netlink destructor.

Reported-by: kernel test robot 
Fixes: 06dc75ab06943 ("net: Revert "net: add function to allocate sk_buff head 
without data area")
Signed-off-by: Florian Westphal 
---
 net/core/skbuff.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 84bdfa2..c27da51 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -638,7 +638,8 @@ void skb_release_head_state(struct sk_buff *skb)
 static void skb_release_all(struct sk_buff *skb)
 {
skb_release_head_state(skb);
-   skb_release_data(skb);
+   if (likely(skb->head))
+   skb_release_data(skb);
 }
 
 /**
-- 
1.8.3.1

Re: [PATCH V4 net-next 7/8] net: hns3: Add Ethtool support to HNS3 driver

2017-07-23 Thread Stephen Hemminger

On Sat, 22 Jul 2017 23:09:41 +0100
Salil Mehta  wrote:

> + HNS3_NETDEV_STAT("rx_packets", rx_packets),
> + HNS3_NETDEV_STAT("tx_packets", tx_packets),
> + HNS3_NETDEV_STAT("rx_bytes", rx_bytes),
> + HNS3_NETDEV_STAT("tx_bytes", tx_bytes),
> + HNS3_NETDEV_STAT("rx_errors", rx_errors),
> + HNS3_NETDEV_STAT("tx_errors", tx_errors),
> + HNS3_NETDEV_STAT("rx_dropped", rx_dropped),
> + HNS3_NETDEV_STAT("tx_dropped", tx_dropped),
> + HNS3_NETDEV_STAT("multicast", multicast),
> + HNS3_NETDEV_STAT("collisions", collisions),
> +
> + /* detailed Rx errors */

Do not put network statistics in ethtool statistics.
This is redundant and unnecessary. Yes some other drivers may do it
but it is not best practice.

Re: [PATCH V4 net-next 1/8] net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC

2017-07-23 Thread Florian Fainelli



On 07/22/2017 03:09 PM, Salil Mehta wrote:
> This patch adds the support of Hisilicon Network Subsystem 3
> Ethernet driver to hip08 family of SoCs.
> 
> This driver includes basic Rx/Tx functionality. It also includes
> the client registration code with the HNAE3(Hisilicon Network
> Acceleration Engine 3) framework.
> 
> This work provides the initial support to the hip08 SoC and
> would incrementally add features or enhancements.
> 
> Signed-off-by: Daode Huang 
> Signed-off-by: lipeng 
> Signed-off-by: Salil Mehta 
> Signed-off-by: Yisen Zhuang 
> ---
> Patch V4: addressed comments by:
>   1. Andrew Lunn:
>  https://lkml.org/lkml/2017/6/17/222
>  https://lkml.org/lkml/2017/6/17/232
>   2. Bo Yu:
>  https://lkml.org/lkml/2017/6/18/110
>  https://lkml.org/lkml/2017/6/18/115
> Patch V3: Addresed below comments:
>   1. Stephen Hemminger:
>  https://lkml.org/lkml/2017/6/13/972
>   2. Yuval Mintz:
>  https://lkml.org/lkml/2017/6/14/151
> Patch V2: Addressed below comments:
>   1. Kbuild:
>  https://lkml.org/lkml/2017/6/11/73
>   2. Yuval Mintz:
>  https://lkml.org/lkml/2017/6/10/78
> Patch V1: Initial Submit
> ---
>  .../net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c | 2894 
> 
>  .../net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.h |  598 
>  2 files changed, 3492 insertions(+)
>  create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c
>  create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.h
> 
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c 
> b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c
> new file mode 100644
> index ..6e0e2967db42
> --- /dev/null
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_enet.c
> @@ -0,0 +1,2894 @@
> +/*
> + * Copyright (c) 2016~2017 Hisilicon Limited.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "hnae3.h"
> +#include "hns3_enet.h"
> +
> +const char hns3_driver_name[] = "hns3";
> +static const char hns3_driver_string[] =
> + "Hisilicon Ethernet Network Driver for Hi162x Family";
> +static const char hns3_copyright[] = "Copyright (c) 2017 Huawei 
> Corporation.";
> +static struct hnae3_client client;
> +
> +/* hns3_pci_tbl - PCI Device ID Table
> + *
> + * Last entry must be all 0s
> + *
> + * { Vendor ID, Device ID, SubVendor ID, SubDevice ID,
> + *   Class, Class Mask, private data (not used) }
> + */
> +static const struct pci_device_id hns3_pci_tbl[] = {
> + {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_GE), 0},
> + {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE), 0},
> + {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE_RDMA), 0},
> + {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_25GE_RDMA_MACSEC), 0},
> + {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_50GE_RDMA), 0},
> + {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_50GE_RDMA_MACSEC), 0},
> + {PCI_VDEVICE(HUAWEI, HNAE3_DEV_ID_100G_RDMA_MACSEC), 0},
> + /* required last entry */
> + {0, }
> +};
> +MODULE_DEVICE_TABLE(pci, hns3_pci_tbl);
> +
> +static irqreturn_t hns3_irq_handle(int irq, void *dev)
> +{
> + struct hns3_enet_tqp_vector *tqp_vector = dev;
> +
> + napi_schedule(_vector->napi);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static int hns3_nic_init_irq(struct hns3_nic_priv *priv)
> +{
> + struct pci_dev *pdev = priv->ae_handle->pdev;
> + struct hns3_enet_tqp_vector *tqp_vectors;
> + int txrx_int_idx = 0;
> + int rx_int_idx = 0;
> + int tx_int_idx = 0;
> + int ret;
> + int i;

unsigned int i

> +
> + for (i = 0; i < priv->vector_num; i++) {
> + tqp_vectors = >tqp_vector[i];
> +
> + if (tqp_vectors->irq_init_flag == HNS3_VECTOR_INITED)
> + continue;
> +
> + if (tqp_vectors->tx_group.ring && tqp_vectors->rx_group.ring) {
> + snprintf(tqp_vectors->name, HNAE3_INT_NAME_LEN - 1,
> +  "%s-%s-%d", priv->netdev->name, "TxRx",
> +  txrx_int_idx++);
> + txrx_int_idx++;
> + } else if (tqp_vectors->rx_group.ring) {
> + snprintf(tqp_vectors->name, HNAE3_INT_NAME_LEN - 1,
> +  "%s-%s-%d", priv->netdev->name, "Rx",
> +  rx_int_idx++);
> + } else if (tqp_vectors->tx_group.ring) {
> + snprintf(tqp_vectors->name, HNAE3_INT_NAME_LEN - 1,
> +  "%s-%s-%d", priv->netdev->name,

Re: [PATCH net-next v3 1/1] geneve: add rtnl changelink support

2017-07-23 Thread Pravin Shelar

On Thu, Jul 20, 2017 at 10:44 PM, Girish Moodalbail
 wrote:
> This patch adds changelink rtnl operation support for geneve devices
> and the code changes involve:
>
>   - added geneve_quiesce() which quiesces the geneve device data path
> for both TX and RX. This lets us perform the changelink operation
> atomically w.r.t data path. Also added geneve_unquiesce() to
> reverse the operation of geneve_quiesce().
>
>   - refactor geneve_newlink into geneve_nl2info to be used by both
> geneve_newlink and geneve_changelink
>
>   - geneve_nl2info takes a changelink boolean argument to isolate
> changelink checks.
>
>   - Allow changing only a few attributes (ttl, tos, and remote tunnel
> endpoint IP address (within the same address family)):
> - return -EOPNOTSUPP for attributes that cannot be changed for
>   now. Incremental patches can make the non-supported one
>   available in the future if needed.
>
> Signed-off-by: Girish Moodalbail 
> ---
> v2 -> v3:
>- removed the use of inline for new functions in my patch
>- removed an extra check for socket in the datapath and instead
>  I am piggybacking on an already existing check
>- added more comments to quiesce/unquiesce functions
>
> v1 -> v2:
>- added geneve_quiesce() and geneve_unquiesce() functions to
>  perform the changelink operation atomically w.r.t data path

Thanks for working on it.

Acked-by: Pravin B Shelar

Re: [PATCH V4 net-next 7/8] net: hns3: Add Ethtool support to HNS3 driver

2017-07-23 Thread Florian Fainelli



On 07/22/2017 03:09 PM, Salil Mehta wrote:
> This patch adds the support of the Ethtool interface to
> the HNS3 Ethernet driver. Various commands to read the
> statistics, configure the offloading, loopback selftest etc.
> are supported.
> 
> Signed-off-by: Daode Huang 
> Signed-off-by: lipeng 
> Signed-off-by: Salil Mehta 
> Signed-off-by: Yisen Zhuang 
> ---
> Patch V4: addressed below comments
>  1. Andrew Lunn
> Removed the support of loop PHY back for now
> Patch V3: Address below comments
>  1. Stephen Hemminger
> https://lkml.org/lkml/2017/6/13/974
>  2. Andrew Lunn
> https://lkml.org/lkml/2017/6/13/1037
> Patch V2: No change
> Patch V1: Initial Submit
> ---
>  .../ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c  | 543 
> +
>  1 file changed, 543 insertions(+)
>  create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c
> 
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c 
> b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c
> new file mode 100644
> index ..82b0d4d829f8
> --- /dev/null
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hns3_ethtool.c
> @@ -0,0 +1,543 @@
> +/*
> + * Copyright (c) 2016~2017 Hisilicon Limited.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include 
> +#include "hns3_enet.h"
> +
> +struct hns3_stats {
> + char stats_string[ETH_GSTRING_LEN];
> + int stats_size;
> + int stats_offset;
> +};
> +
> +/* netdev related stats */
> +#define HNS3_NETDEV_STAT(_string, _member)   \
> + { _string,  \
> +   FIELD_SIZEOF(struct rtnl_link_stats64, _member),  \
> +   offsetof(struct rtnl_link_stats64, _member),  \
> + }

Can you make this macro use named initializers?

> +
> +static const struct hns3_stats hns3_netdev_stats[] = {
> + /* misc. Rx/Tx statistics */
> + HNS3_NETDEV_STAT("rx_packets", rx_packets),
> + HNS3_NETDEV_STAT("tx_packets", tx_packets),
> + HNS3_NETDEV_STAT("rx_bytes", rx_bytes),
> + HNS3_NETDEV_STAT("tx_bytes", tx_bytes),
> + HNS3_NETDEV_STAT("rx_errors", rx_errors),
> + HNS3_NETDEV_STAT("tx_errors", tx_errors),
> + HNS3_NETDEV_STAT("rx_dropped", rx_dropped),
> + HNS3_NETDEV_STAT("tx_dropped", tx_dropped),
> + HNS3_NETDEV_STAT("multicast", multicast),
> + HNS3_NETDEV_STAT("collisions", collisions),
> +
> + /* detailed Rx errors */
> + HNS3_NETDEV_STAT("rx_length_errors", rx_length_errors),
> + HNS3_NETDEV_STAT("rx_over_errors", rx_over_errors),
> + HNS3_NETDEV_STAT("rx_crc_errors", rx_crc_errors),
> + HNS3_NETDEV_STAT("rx_frame_errors", rx_frame_errors),
> + HNS3_NETDEV_STAT("rx_fifo_errors", rx_fifo_errors),
> + HNS3_NETDEV_STAT("rx_missed_errors", rx_missed_errors),
> +
> + /* detailed Tx errors */
> + HNS3_NETDEV_STAT("tx_aborted_errors", tx_aborted_errors),
> + HNS3_NETDEV_STAT("tx_carrier_errors", tx_carrier_errors),
> + HNS3_NETDEV_STAT("tx_fifo_errors", tx_fifo_errors),
> + HNS3_NETDEV_STAT("tx_heartbeat_errors", tx_heartbeat_errors),
> + HNS3_NETDEV_STAT("tx_window_errors", tx_window_errors),
> +
> + /* for cslip etc */
> + HNS3_NETDEV_STAT("rx_compressed", rx_compressed),
> + HNS3_NETDEV_STAT("tx_compressed", tx_compressed),
> +};
> +
> +#define HNS3_NETDEV_STATS_COUNT ARRAY_SIZE(hns3_netdev_stats)
> +
> +/* tqp related stats */
> +#define HNS3_TQP_STAT(_string, _member)  \
> + { _string,  \
> +   FIELD_SIZEOF(struct ring_stats, _member), \
> +   offsetof(struct hns3_enet_ring, stats),   \
> + }
> +

Same here.

> +static const struct hns3_stats hns3_txq_stats[] = {
> + /* Tx per-queue statistics */
> + HNS3_TQP_STAT("tx_io_err_cnt", io_err_cnt),
> + HNS3_TQP_STAT("tx_sw_err_cnt", sw_err_cnt),
> + HNS3_TQP_STAT("tx_seg_pkt_cnt", seg_pkt_cnt),
> + HNS3_TQP_STAT("tx_pkts", tx_pkts),
> + HNS3_TQP_STAT("tx_bytes", tx_bytes),
> + HNS3_TQP_STAT("tx_err_cnt", tx_err_cnt),
> + HNS3_TQP_STAT("tx_restart_queue", restart_queue),
> + HNS3_TQP_STAT("tx_busy", tx_busy),
> +};
> +
> +#define HNS3_TXQ_STATS_COUNT ARRAY_SIZE(hns3_txq_stats)
> +
> +static const struct hns3_stats hns3_rxq_stats[] = {
> + /* Rx per-queue statistics */
> + HNS3_TQP_STAT("rx_io_err_cnt", io_err_cnt),
> + HNS3_TQP_STAT("rx_sw_err_cnt", sw_err_cnt),
> + HNS3_TQP_STAT("rx_seg_pkt_cnt", seg_pkt_cnt),
> + HNS3_TQP_STAT("rx_pkts", rx_pkts),
> + HNS3_TQP_STAT("rx_bytes", rx_bytes),
> +

Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

2017-07-23 Thread Cong Wang

On Sun, Jul 23, 2017 at 5:48 AM, liujian (CE)  wrote:
> Hi
>
> I find it caused by below steps:
> 1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1
> 2. set tp_block_nr to 0
> Then pg_vec was freed, and we did not delete the timer?

Thanks for testing!

Ah, I overlook the initialization case in my previous patch.

How about the following one? Does it cover all the cases?


diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 008bb34ee324..0615c2a950fa 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4329,7 +4329,7 @@ static int packet_set_ring(struct sock *sk,
union tpacket_req_u *req_u,
register_prot_hook(sk);
}
spin_unlock(>bind_lock);
-   if (closing && (po->tp_version > TPACKET_V2)) {
+   if (pg_vec && (po->tp_version > TPACKET_V2)) {
/* Because we don't support block-based V3 on tx-ring */
if (!tx_ring)
prb_shutdown_retire_blk_timer(po, rb_queue);

Re: [PATCH net] openvswitch: fix potential out of bound access in parse_ct

2017-07-23 Thread Pravin Shelar

On Sun, Jul 23, 2017 at 2:52 AM, Liping Zhang  wrote:
> From: Liping Zhang 
>
> Before the 'type' is validated, we shouldn't use it to fetch the
> ovs_ct_attr_lens's minlen and maxlen, else, out of bound access
> may happen.
>
> Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> Signed-off-by: Liping Zhang 

Good catch!
Acked-by: Pravin B Shelar

Re: [PATCH V4 net-next 6/8] net: hns3: Add MDIO support to HNS3 Ethernet driver for hip08 SoC

2017-07-23 Thread Florian Fainelli



On 07/22/2017 03:09 PM, Salil Mehta wrote:
> This patch adds the support of MDIO bus interface for HNS3 driver.
> Code provides various interfaces to start and stop the PHY layer
> and to read and write the MDIO bus or PHY.
> 
> Signed-off-by: Daode Huang 
> Signed-off-by: lipeng 
> Signed-off-by: Salil Mehta 
> Signed-off-by: Yisen Zhuang 
> ---
> Patch V4: Addressed following comments:
>  1. Andrew Lunn:
> https://lkml.org/lkml/2017/6/17/208
> Patch V3: Addressed Below comments:
>  1. Florian Fainelli:
> https://lkml.org/lkml/2017/6/13/963
>  2. Andrew Lunn:
> https://lkml.org/lkml/2017/6/13/1039
> Patch V2: Addressed below comments:
>  1. Florian Fainelli:
> https://lkml.org/lkml/2017/6/10/130
>  2. Andrew Lunn:
> https://lkml.org/lkml/2017/6/10/168
> Patch V1: Initial Submit
> ---
>  .../ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c| 230 
> +
>  1 file changed, 230 insertions(+)
>  create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
> 
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c 
> b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
> new file mode 100644
> index ..6036a97f7de5
> --- /dev/null
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mdio.c
> @@ -0,0 +1,230 @@
> +/*
> + * Copyright (c) 2016~2017 Hisilicon Limited.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include 
> +#include 
> +
> +#include "hclge_cmd.h"
> +#include "hclge_main.h"
> +
> +enum hclge_mdio_c22_op_seq {
> + HCLGE_MDIO_C22_WRITE = 1,
> + HCLGE_MDIO_C22_READ = 2
> +};
> +
> +#define HCLGE_MDIO_CTRL_START_B  0
> +#define HCLGE_MDIO_CTRL_ST_S 1
> +#define HCLGE_MDIO_CTRL_ST_M (0x3 << HCLGE_MDIO_CTRL_ST_S)
> +#define HCLGE_MDIO_CTRL_OP_S 3
> +#define HCLGE_MDIO_CTRL_OP_M (0x3 << HCLGE_MDIO_CTRL_OP_S)
> +
> +#define HCLGE_MDIO_PHYID_S   0
> +#define HCLGE_MDIO_PHYID_M   (0x1f << HCLGE_MDIO_PHYID_S)
> +
> +#define HCLGE_MDIO_PHYREG_S  0
> +#define HCLGE_MDIO_PHYREG_M  (0x1f << HCLGE_MDIO_PHYREG_S)
> +
> +#define HCLGE_MDIO_STA_B 0
> +
> +struct hclge_mdio_cfg_cmd {
> + u8 ctrl_bit;
> + u8 phyid;
> + u8 phyad;
> + u8 rsvd;
> + __le16 reserve;
> + __le16 data_wr;
> + __le16 data_rd;
> + __le16 sta;
> +};
> +
> +static int hclge_mdio_write(struct mii_bus *bus, int phyid, int regnum,
> + u16 data)
> +{
> + struct hclge_dev *hdev = (struct hclge_dev *)bus->priv;

Cast is not needed here since bus->priv is already a void *.

> + struct hclge_mdio_cfg_cmd *mdio_cmd;
> + enum hclge_cmd_status status;
> + struct hclge_desc desc;
> +
> + if (!bus)
> + return -EINVAL;

How can this be possible?

> +
> + hclge_cmd_setup_basic_desc(, HCLGE_OPC_MDIO_CONFIG, false);
> +
> + mdio_cmd = (struct hclge_mdio_cfg_cmd *)desc.data;

Same here, can we not cast this into a struct hclge_mdio_cfg_cmd?

> +
> + hnae_set_field(mdio_cmd->phyid, HCLGE_MDIO_PHYID_M,
> +HCLGE_MDIO_PHYID_S, phyid);
> + hnae_set_field(mdio_cmd->phyad, HCLGE_MDIO_PHYREG_M,
> +HCLGE_MDIO_PHYREG_S, regnum);
> +
> + hnae_set_bit(mdio_cmd->ctrl_bit, HCLGE_MDIO_CTRL_START_B, 1);
> + hnae_set_field(mdio_cmd->ctrl_bit, HCLGE_MDIO_CTRL_ST_M,
> +HCLGE_MDIO_CTRL_ST_S, 1);
> + hnae_set_field(mdio_cmd->ctrl_bit, HCLGE_MDIO_CTRL_OP_M,
> +HCLGE_MDIO_CTRL_OP_S, HCLGE_MDIO_C22_WRITE);
> +
> + mdio_cmd->data_wr = cpu_to_le16(data);
> +
> + status = hclge_cmd_send(>hw, , 1);
> + if (status) {
> + dev_err(>pdev->dev,
> + "mdio write fail when sending cmd, status is %d.\n",
> + status);
> + return -EIO;
> + }
> +
> + return 0;
> +}
> +
> +static int hclge_mdio_read(struct mii_bus *bus, int phyid, int regnum)
> +{
> + struct hclge_dev *hdev = (struct hclge_dev *)bus->priv;

Same here.

> + struct hclge_mdio_cfg_cmd *mdio_cmd;
> + enum hclge_cmd_status status;
> + struct hclge_desc desc;
> +
> + if (!bus)
> + return -EINVAL;

And here.

> +
> + hclge_cmd_setup_basic_desc(, HCLGE_OPC_MDIO_CONFIG, true);
> +
> + mdio_cmd = (struct hclge_mdio_cfg_cmd *)desc.data;

And here.

> +
> + hnae_set_field(mdio_cmd->phyid, HCLGE_MDIO_PHYID_M,
> +HCLGE_MDIO_PHYID_S, phyid);
> + hnae_set_field(mdio_cmd->phyad, HCLGE_MDIO_PHYREG_M,
> +HCLGE_MDIO_PHYREG_S, regnum);
> +
> + hnae_set_bit(mdio_cmd->ctrl_bit,

Greetings'''''

2017-07-23 Thread Dr Paul Benk

Greetings!

I have a proposal for you, this however is not mandatory nor will I in
any manner compel you to honor against your will. I am Dr.Paul Benk a
former executive director with Arab Tunisian Bank here in Tunis; I
retired 1 year 7 months ago after putting in 28 years of meticulous
service. During my days with Arab Tunisian Bank, I was the personal
account officer and one of the financial advisers to Zine Al-Abidine
Ben Ali the past Tunisian President in self exile at Saudi Arabia.
During his trying period, he instructed me to move all his investment
in my care which consists of US$115M and 767KG of gold out of the Gulf
States for safe keeping; and that I successfully did by moving US$50
Million to Madrid Spain, US$50 Million to Dubai United Arab Emirate,
US$15 Million to Burkina Faso and the 767KG of gold to Burkina Faso in
West Africa as an anonymous deposits, so that the funds will in no way
be traced to him. He has instructed me to find an investor who would
stand as the anonymous depositor of the fund and gold; and claim it
for further investment.

Consequent upon the above, my proposal is that I would like you as a
foreigner to stand in as the anonymous depositor of this fund and gold
that I have successfully moved outside the country and provide an
account overseas where this said fund will be transferred into. It is
a careful network and my voluntary retirement from the Arab Tunisian
Bank is to ensure a hitch-free operation as all modalities for you to
stand as beneficiary and owner of the deposits has been perfected by
me. Mr. Zine al-Abidine Ben Ali will be offering you 20% of the total
investment if you can be the investor and claim this deposits in Spain
and Burkina Faso as the anonymous depositor.

Now my questions are:-

1. Can you handle this transaction?
2. Can I give you this trust?

Consider this and get back to me as soon as possible on my private
email Only here: paulben...@gmail.com, so that I can give you more
details regarding this transaction. Finally, it is my humble request
that the information as contained herein be accorded the necessary
attention, urgency as well as the secrecy it deserves.

I expect your urgent response if you can handle this project.

Respectfully yours,

From:Dr.Paul Benk.
Email: paulben...@gmail.com

--
--

Re: [PATCH-next] net: ethernet: mediatek: fix implicit irq include causing build fails

2017-07-23 Thread Paul Gortmaker

[Re: [PATCH-next] net: ethernet: mediatek: fix implicit irq include causing 
build fails] On 23/07/2017 (Sun 17:16) John Crispin wrote:

> Hi Paul
> 
> mark sent the same patch a couple of days ago [1]

OK, thanks --  I searched for mediatek in my mail and scanned netdev
patchworks for patches in "New" state but didn't see it because it is
apparently already in "In review" state.

Paul.
--

> 
> John
> 
> [1] https://patchwork.ozlabs.org/patch/791550/
> 
> 
> On 23/07/17 16:56, Paul Gortmaker wrote:
> >To fix:
> >
> >drivers/net/ethernet/mediatek/mtk_eth_soc.c:1685:1: error: unknown type name 
> >'irqreturn_t'
> >drivers/net/ethernet/mediatek/mtk_eth_soc.c: In function 'mtk_handle_irq_rx':
> >drivers/net/ethernet/mediatek/mtk_eth_soc.c:1694:9: error: 'IRQ_HANDLED' 
> >undeclared (first use in this function)
> >drivers/net/ethernet/mediatek/mtk_eth_soc.c:1694:9: note: each undeclared 
> >identifier is reported only once for each function it appears in
> >
> >and similar follow-on errors as seen in arm allmodconfig builds.
> >
> >Cc: Sean Wang 
> >Cc: Felix Fietkau 
> >Cc: John Crispin 
> >Cc: Matthias Brugger 
> >Cc: linux-media...@lists.infradead.org
> >Signed-off-by: Paul Gortmaker 
> >---
> >
> >[Seen on current linux-next, where similar implicit irq includes
> >  were recently revealed by some header shuffle somewhere.]
> >
> >  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> >diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
> >b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> >index 7e95cf547ff1..ea15eb095d83 100644
> >--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> >+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> >@@ -16,6 +16,7 @@
> >  #include 
> >  #include 
> >  #include 
> >+#include 
> >  #include 
> >  #include 
> >  #include 
>

Re: [PATCH-next] net: ethernet: mediatek: fix implicit irq include causing build fails

2017-07-23 Thread John Crispin


Hi Paul

mark sent the same patch a couple of days ago [1]

John

[1] https://patchwork.ozlabs.org/patch/791550/


On 23/07/17 16:56, Paul Gortmaker wrote:

To fix:

drivers/net/ethernet/mediatek/mtk_eth_soc.c:1685:1: error: unknown type name 
'irqreturn_t'
drivers/net/ethernet/mediatek/mtk_eth_soc.c: In function 'mtk_handle_irq_rx':
drivers/net/ethernet/mediatek/mtk_eth_soc.c:1694:9: error: 'IRQ_HANDLED' 
undeclared (first use in this function)
drivers/net/ethernet/mediatek/mtk_eth_soc.c:1694:9: note: each undeclared 
identifier is reported only once for each function it appears in

and similar follow-on errors as seen in arm allmodconfig builds.

Cc: Sean Wang 
Cc: Felix Fietkau 
Cc: John Crispin 
Cc: Matthias Brugger 
Cc: linux-media...@lists.infradead.org
Signed-off-by: Paul Gortmaker 
---

[Seen on current linux-next, where similar implicit irq includes
  were recently revealed by some header shuffle somewhere.]

  drivers/net/ethernet/mediatek/mtk_eth_soc.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 7e95cf547ff1..ea15eb095d83 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -16,6 +16,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include

Re: [PATCH net-next v2 00/13] Change DSA's FDB API and perform switchdev cleanup

2017-07-23 Thread Arkadi Sharshevsky



On 07/20/2017 07:26 PM, Vivien Didelot wrote:
> Hi Arkadi,
> 
> Arkadi Sharshevsky  writes:
> 
>> Hi, thanks for the test. If the fdb is marked as self its not in the
>> bridge at all. So before my patch it was OK because you supported the
>> self thing.
>>
>> Please notice that both fdbs you added are marked the same because the
>> default is self: vim bridge/fdb.c +499 (I think this is a bug because
>> the man page states that master is the default). So in order to put it
>> in the bridge you should specify "master":
>>
>> $bridge fdb add e4:1d:2d:a5:f0:4a dev sw1p7 master
>> $bridge fdb show brport sw1p7
>> e4:1d:2d:a5:f0:4a vlan 1 offload master br0 permanent <---also should
>> e4:1d:2d:46:13:f1 vlan 1 master br0 permanent be offloaded*
>> e4:1d:2d:46:13:f1 master br0 permanent
>> e4:1d:2d:a5:f0:4a offload master br0 permanent
>> 33:33:00:00:00:01 self permanent
>> 01:00:5e:00:00:01 self permanent
>> 33:33:ff:46:13:f1 self permanent
>>
>> *you should take the latest iproute.
> 
> Thanks for the explanation, it makes more sense now. Now with latest
> net-next iproute2 and your patch, I have this behavior, first before
> your patchset:
> 
> # bridge fdb add e4:1d:2d:a5:f0:2a dev lan3
> # bridge fdb add e4:1d:2d:a5:f0:4a dev lan4 master
> # bridge fdb show
> 01:00:5e:00:00:01 dev eth0 self permanent
> 01:00:5e:00:00:01 dev eth1 self permanent
> 2a:6e:b6:a8:25:f1 dev lan0 master br0 permanent
> e4:1d:2d:a5:f0:2a dev lan3 self static
> e4:1d:2d:a5:f0:4a dev lan4 master br0 permanent
> 01:00:5e:00:00:01 dev br0 self permanent
> # bridge fdb del e4:1d:2d:a5:f0:2a dev lan3
> # bridge fdb del e4:1d:2d:a5:f0:4a dev lan4 master
> # bridge fdb show
> 01:00:5e:00:00:01 dev eth0 self permanent
> 01:00:5e:00:00:01 dev eth1 self permanent
> 2a:6e:b6:a8:25:f1 dev lan0 master br0 permanent
> 01:00:5e:00:00:01 dev br0 self permanent
> 
> and after your patchset:
> 
> # bridge fdb add e4:1d:2d:a5:f0:2a dev lan3
> # bridge fdb add e4:1d:2d:a5:f0:4a dev lan4 master
> # bridge fdb show
> 01:00:5e:00:00:01 dev eth0 self permanent
> e4:1d:2d:a5:f0:2a dev eth1 self permanent
> 01:00:5e:00:00:01 dev eth1 self permanent
> da:ac:a3:36:f2:10 dev lan0 master br0 permanent
> e4:1d:2d:a5:f0:4a dev lan4 offload master br0 permanent
> e4:1d:2d:a5:f0:4a dev lan4 self static
> 01:00:5e:00:00:01 dev br0 self permanent
> # bridge fdb del e4:1d:2d:a5:f0:2a dev lan3
> # bridge fdb del e4:1d:2d:a5:f0:4a dev lan4 master
> # bridge fdb show
> 01:00:5e:00:00:01 dev eth0 self permanent
> 01:00:5e:00:00:01 dev eth1 self permanent
> da:ac:a3:36:f2:10 dev lan0 master br0 permanent
> 01:00:5e:00:00:01 dev br0 self permanent
> 
> For lan4, the behavior seems correct. Even if reporting "lan4 self
> static" seems odd and redundant with the above "lan4 offload master".

Yeah, but remember that because we didn't remove the fdb dump from DSA
the dump operation will dumps the bridge fdb and then dumps the DSA's
fdbs (via the .ndo_fdb_dump()). So we see it twice due to the hw
limitation for syncing the bridge.

> However, adding an address to lan3 without flag (hence self...) ends up
> in the switch master device (i.e. SoC side of the CPU port conduit.)
> 
> Do you have an idea why the patchset changes that?
> 

Yeah, think that I figured it out, after my patchset there is no DSA
driver implementation for ndo_fdb_add(), thus, the default
implementation will be called:

vim net/core/rtnetlink.c +3113

The default ndo_fdb_add() implementation does two things
(Commit 090096bf3):

1. Adds the address to the unicast list of the device.
2. Calls __dev_set_rx_mode.

In dsa, the set_rx_mode implementation, dsa_slave_set_rx_mode(),
will sync the master device with the addresses.

After that an fdb dump will show this address also on the master
device.

I think this default implementation is relevant mostly for nics
and is not relevant for switchdev net devices.

We can discuss how to bypass this default implementation in our
case because at least in case of mlxsw it does not make sense.
I can send it as followup patch.

>> Also it seems strange that I removed the self support from the driver
>> but you still managed to configure it. The reason is the default
>> self implementation:
>>
>> vim net/core/rtnetlink.c +3112
>>
>> I think it is relevant for NICs mostly, so we can ignore it.
> 
> Regardless your patchset, this is unfortunately inconsistent with the
> bridge man page, describing:
> 
> self - the address is associated with the port drivers fdb. Usually
>hardware.
> master - the address is associated with master devices fdb. Usually
>  software (default).
> 
> So a user would imagine "master" (which is the default) programs the
> address in the software bridge only. Adding "self" means please also
> call into the driver to program the underlying hardware.
> 

As it

[PATCH-next] net: ethernet: mediatek: fix implicit irq include causing build fails

2017-07-23 Thread Paul Gortmaker

To fix:

drivers/net/ethernet/mediatek/mtk_eth_soc.c:1685:1: error: unknown type name 
'irqreturn_t'
drivers/net/ethernet/mediatek/mtk_eth_soc.c: In function 'mtk_handle_irq_rx':
drivers/net/ethernet/mediatek/mtk_eth_soc.c:1694:9: error: 'IRQ_HANDLED' 
undeclared (first use in this function)
drivers/net/ethernet/mediatek/mtk_eth_soc.c:1694:9: note: each undeclared 
identifier is reported only once for each function it appears in

and similar follow-on errors as seen in arm allmodconfig builds.

Cc: Sean Wang 
Cc: Felix Fietkau 
Cc: John Crispin 
Cc: Matthias Brugger 
Cc: linux-media...@lists.infradead.org
Signed-off-by: Paul Gortmaker 
---

[Seen on current linux-next, where similar implicit irq includes
 were recently revealed by some header shuffle somewhere.]

 drivers/net/ethernet/mediatek/mtk_eth_soc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c 
b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 7e95cf547ff1..ea15eb095d83 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-- 
2.11.0

[PATCH-next] liquidio: fix implicit irq include causing build failures

2017-07-23 Thread Paul Gortmaker

To fix

In file included from 
drivers/net/ethernet/cavium/liquidio/octeon_mem_ops.c:24:0:
drivers/net/ethernet/cavium/liquidio/octeon_device.h:216:2: error: expected 
specifier-qualifier-list before ‘irqreturn_t’
  irqreturn_t (*process_interrupt_regs)(void *);
  ^

as seen on arm64 allmodconfig builds.

Cc: Derek Chickles 
Cc: Satanand Burla 
Cc: Felix Manlunas 
Cc: Raghu Vatsavayi 
Signed-off-by: Paul Gortmaker 
---

[Seen on current linux-next, where similar implicit irq includes
 were recently revealed by some header shuffle somewhere.]

 drivers/net/ethernet/cavium/liquidio/octeon_device.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/cavium/liquidio/octeon_device.h 
b/drivers/net/ethernet/cavium/liquidio/octeon_device.h
index c90ed48ae8ab..ad464788c923 100644
--- a/drivers/net/ethernet/cavium/liquidio/octeon_device.h
+++ b/drivers/net/ethernet/cavium/liquidio/octeon_device.h
@@ -22,6 +22,8 @@
 #ifndef _OCTEON_DEVICE_H_
 #define  _OCTEON_DEVICE_H_
 
+#include 
+
 /** PCI VendorId Device Id */
 #define  OCTEON_CN68XX_PCIID  0x91177d
 #define  OCTEON_CN66XX_PCIID  0x92177d
-- 
2.11.0

Re: [PATCH V4 net-next 2/8] net: hns3: Add support of the HNAE3 framework

2017-07-23 Thread Leon Romanovsky

On Sat, Jul 22, 2017 at 11:09:36PM +0100, Salil Mehta wrote:
> This patch adds the support of the HNAE3 (Hisilicon Network
> Acceleration Engine 3) framework support to the HNS3 driver.
>
> Framework facilitates clients like ENET(HNS3 Ethernet Driver), RoCE
> and user-space Ethernet drivers (like ODP etc.) to register with HNAE3
> devices and their associated operations.
>
> Signed-off-by: Daode Huang 
> Signed-off-by: lipeng 
> Signed-off-by: Salil Mehta 
> Signed-off-by: Yisen Zhuang 
> ---
> Patch V4: Addressed following comments
>   1. Andrew Lunn:
>  https://lkml.org/lkml/2017/6/17/233
>  https://lkml.org/lkml/2017/6/18/105
>   2. Bo Yu:
>  https://lkml.org/lkml/2017/6/18/112
>   3. Stephen Hamminger:
>  https://lkml.org/lkml/2017/6/19/778
> Patch V3: Addressed below comments
>   1. Andrew Lunn:
>  https://lkml.org/lkml/2017/6/13/1025
> Patch V2: No change
> Patch V1: Initial Submit
> ---
>  drivers/net/ethernet/hisilicon/hns3/hnae3.c | 319 
>  drivers/net/ethernet/hisilicon/hns3/hnae3.h | 449 
> 
>  2 files changed, 768 insertions(+)
>  create mode 100644 drivers/net/ethernet/hisilicon/hns3/hnae3.c
>  create mode 100644 drivers/net/ethernet/hisilicon/hns3/hnae3.h
>
> diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.c 
> b/drivers/net/ethernet/hisilicon/hns3/hnae3.c
> new file mode 100644
> index ..7a11aaff0a23
> --- /dev/null
> +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.c
> @@ -0,0 +1,319 @@
> +/*
> + * Copyright (c) 2016-2017 Hisilicon Limited.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#include "hnae3.h"
> +
> +static LIST_HEAD(hnae3_ae_algo_list);
> +static LIST_HEAD(hnae3_client_list);
> +static LIST_HEAD(hnae3_ae_dev_list);
> +
> +/* we are keeping things simple and using single lock for all the
> + * list. This is a non-critical code so other updations, if happen
> + * in parallel, can wait.
> + */
> +static DEFINE_MUTEX(hnae3_common_lock);
> +
> +static bool hnae3_client_match(enum hnae3_client_type client_type,
> +enum hnae3_dev_type dev_type)
> +{
> + if (dev_type == HNAE3_DEV_KNIC) {
> + switch (client_type) {
> + case HNAE3_CLIENT_KNIC:
> + case HNAE3_CLIENT_ROCE:
> + return true;
> + default:
> + return false;
> + }
> + } else if (dev_type == HNAE3_DEV_UNIC) {
> + switch (client_type) {
> + case HNAE3_CLIENT_UNIC:
> + return true;
> + default:
> + return false;
> + }
> + } else {
> + return false;
> + }
> +}
> +
> +static int hnae3_match_n_instantiate(struct hnae3_client *client,
> +  struct hnae3_ae_dev *ae_dev,
> +  bool is_reg, bool *matched)
> +{
> + int ret;
> +
> + *matched = false;
> +
> + /* check if this client matches the type of ae_dev */
> + if (!(hnae3_client_match(client->type, ae_dev->dev_type) &&
> +   hnae_get_bit(ae_dev->flag, HNAE3_DEV_INITED_B))) {
> + return 0;
> + }
> + /* there is a match of client and dev */
> + *matched = true;
> +
> + if (!(ae_dev->ops && ae_dev->ops->init_client_instance &&
> +   ae_dev->ops->uninit_client_instance)) {
> + dev_err(_dev->pdev->dev,
> + "ae_dev or client init/uninit ops are null\n");
> + return -EOPNOTSUPP;
> + }
> +
> + /* now, (un-)instantiate client by calling lower layer */
> + if (is_reg) {
> + ret = ae_dev->ops->init_client_instance(client, ae_dev);
> + if (ret)
> + dev_err(_dev->pdev->dev,
> + "fail to instantiate client\n");
> + return ret;
> + }
> +
> + ae_dev->ops->uninit_client_instance(client, ae_dev);
> + return 0;
> +}
> +
> +int hnae3_register_client(struct hnae3_client *client)
> +{
> + struct hnae3_client *client_tmp;
> + struct hnae3_ae_dev *ae_dev;
> + bool matched;
> + int ret = 0;
> +
> + mutex_lock(_common_lock);
> + /* one system should only have one client for every type */
> + list_for_each_entry(client_tmp, _client_list, node) {
> + if (client_tmp->type == client->type)
> + goto exit;
> + }
> +
> + list_add_tail(>node, _client_list);
> +
> + /* initialize the client on every matched port */
> + list_for_each_entry(ae_dev, _ae_dev_list, node) {
> +

Re: [PATCH V4 net-next 8/8] net: hns3: Add HNS3 driver to kernel build framework & MAINTAINERS

2017-07-23 Thread Leon Romanovsky

On Sat, Jul 22, 2017 at 11:09:42PM +0100, Salil Mehta wrote:
> This patch updates the MAINTAINERS file with HNS3 Ethernet driver
> maintainers names and other details. This also introduces the new
> Makefiles required to build the HNS3 Ethernet driver and updates
> the existing Kconfig file in the hisilicon folder.
>
> Signed-off-by: Salil Mehta 
> ---
> Patch V3: Addressed below errors:
>  1. Intel kbuild: https://lkml.org/lkml/2017/6/14/313
>  2. Intel Kbuild: https://lkml.org/lkml/2017/6/14/636
> Patch V2: No change
> Patch V1: Initial Submit
> ---
>  MAINTAINERS|  8 +++
>  drivers/net/ethernet/hisilicon/Kconfig | 27 
> ++
>  drivers/net/ethernet/hisilicon/Makefile|  1 +
>  drivers/net/ethernet/hisilicon/hns3/Makefile   |  7 ++
>  .../net/ethernet/hisilicon/hns3/hns3pf/Makefile| 11 +
>  5 files changed, 54 insertions(+)
>  create mode 100644 drivers/net/ethernet/hisilicon/hns3/Makefile
>  create mode 100644 drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 297e610c9163..a22d5b86c2b7 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -6197,6 +6197,14 @@ S: Maintained
>  F:   drivers/net/ethernet/hisilicon/
>  F:   Documentation/devicetree/bindings/net/hisilicon*.txt
>
> +HISILICON NETWORK SUBSYSTEM 3 DRIVER (HNS3)
> +M:   Yisen Zhuang 
> +M:   Salil Mehta 
> +L:   netdev@vger.kernel.org
> +W:   http://www.hisilicon.com
> +S:   Maintained
> +F:   drivers/net/ethernet/hisilicon/hns3/
> +
>  HISILICON ROCE DRIVER
>  M:   Lijun Ou 
>  M:   Wei Hu(Xavier) 
> diff --git a/drivers/net/ethernet/hisilicon/Kconfig 
> b/drivers/net/ethernet/hisilicon/Kconfig
> index d11287e11371..9f8ea283c531 100644
> --- a/drivers/net/ethernet/hisilicon/Kconfig
> +++ b/drivers/net/ethernet/hisilicon/Kconfig
> @@ -76,4 +76,31 @@ config HNS_ENET
> This selects the general ethernet driver for HNS.  This module make
> use of any HNS AE driver, such as HNS_DSAF
>
> +config HNS3
> + tristate "Hisilicon Network Subsystem Support HNS3 (Framework)"
> +depends on PCI
> + ---help---
> +   This selects the framework support for Hisilicon Network Subsystem 3.
> +   This layer facilitates clients like ENET, RoCE and user-space ethernet
> +   drivers(like ODP)to register with HNAE devices and their associated
> +   operations.
> +
> +config HNS3_HCLGE
> + tristate "Hisilicon HNS3 HCLGE Acceleration Engine & Compatibility 
> Layer Support"
> +depends on PCI_MSI
> + select HNS3

IMHO it should be "depends" and not "select".

> + ---help---
> +   This selects the HNS3_HCLGE network acceleration engine & its hardware
> +   compatibility layer. The engine would be used in Hisilicon hip08 
> family of
> +   SoCs and further upcoming SoCs.
> +
> +config HNS3_ENET
> + tristate "Hisilicon HNS3 Ethernet Device Support"
> +depends on 64BIT && PCI
> + select HNS3

Ditto

> + ---help---
> +   This selects the Ethernet Driver for Hisilicon Network Subsystem 3 
> for hip08
> +   family of SoCs. This module depends upon HNAE3 driver to access the 
> HNAE3
> +   devices and their associated operations.
> +
>  endif # NET_VENDOR_HISILICON
> diff --git a/drivers/net/ethernet/hisilicon/Makefile 
> b/drivers/net/ethernet/hisilicon/Makefile
> index 8661695024dc..3828c435c18f 100644
> --- a/drivers/net/ethernet/hisilicon/Makefile
> +++ b/drivers/net/ethernet/hisilicon/Makefile
> @@ -6,4 +6,5 @@ obj-$(CONFIG_HIX5HD2_GMAC) += hix5hd2_gmac.o
>  obj-$(CONFIG_HIP04_ETH) += hip04_eth.o
>  obj-$(CONFIG_HNS_MDIO) += hns_mdio.o
>  obj-$(CONFIG_HNS) += hns/
> +obj-$(CONFIG_HNS3) += hns3/
>  obj-$(CONFIG_HISI_FEMAC) += hisi_femac.o
> diff --git a/drivers/net/ethernet/hisilicon/hns3/Makefile 
> b/drivers/net/ethernet/hisilicon/hns3/Makefile
> new file mode 100644
> index ..5e53735b2d4e
> --- /dev/null
> +++ b/drivers/net/ethernet/hisilicon/hns3/Makefile
> @@ -0,0 +1,7 @@
> +#
> +# Makefile for the HISILICON network device drivers.
> +#
> +
> +obj-$(CONFIG_HNS3) += hns3pf/
> +
> +obj-$(CONFIG_HNS3) +=hnae3.o

There is a missing space after "+="

> diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile 
> b/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
> new file mode 100644
> index ..c0a92b5690a9
> --- /dev/null
> +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/Makefile
> @@ -0,0 +1,11 @@
> +#
> +# Makefile for the HISILICON network device drivers.
> +#
> +
> +ccflags-y := -Idrivers/net/ethernet/hisilicon/hns3
> +
> +obj-$(CONFIG_HNS3_HCLGE) += hclge.o
> +hclge-objs =hclge_main.o hclge_cmd.o hclge_mdio.o hclge_tm.o

Missing space.

> +
> +obj-$(CONFIG_HNS3_ENET) += hns3.o
> +hns3-objs = hns3_enet.o hns3_ethtool.o
> --
> 2.11.0
>
>


signature.asc
Description: PGP

RE: af_packet: use after free in prb_retire_rx_blk_timer_expired

2017-07-23 Thread liujian (CE)

Hi 

I find it caused by below steps:
1. set tp_version to TPACKET_V3 and req->tp_block_nr to 1
2. set tp_block_nr to 0
Then pg_vec was freed, and we did not delete the timer?

Best Regards,
liujian


> -Original Message-
> From: liujian (CE)
> Sent: Sunday, July 23, 2017 5:47 PM
> To: liujian (CE); 'Cong Wang'; Dingtianhong
> Cc: 'Willem de Bruijn'; 'Dave Jones'; 'alexander.le...@verizon.com';
> 'da...@davemloft.net'; 'eduma...@google.com'; 'will...@google.com';
> 'dan...@iogearbox.net'; 'netdev@vger.kernel.org';
> 'linux-ker...@vger.kernel.org'
> Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired
> 
> Hi,
> 
> Do we need delete the v3 ring, when tp_version changed from TPACKET_V3 to
> TPACKET_V1 ?
> 
> 
> Best Regards,
> liujian
> 
> 
> > -Original Message-
> > From: liujian (CE)
> > Sent: Sunday, July 23, 2017 4:21 PM
> > To: 'Cong Wang'; Dingtianhong
> > Cc: Willem de Bruijn; Dave Jones; alexander.le...@verizon.com;
> > da...@davemloft.net; eduma...@google.com; will...@google.com;
> > dan...@iogearbox.net; netdev@vger.kernel.org;
> > linux-ker...@vger.kernel.org
> > Subject: RE: af_packet: use after free in
> > prb_retire_rx_blk_timer_expired
> >
> > Hi Wang Cong,
> >
> > With this patch , the system was crashed when setsockopt.
> >
> > The call trace as below:
> >
> > crash> bt
> > PID: 3069   TASK: 8800afcc  CPU: 0   COMMAND: "trinity-main"
> >  #0 [8801bec03ce0] machine_kexec at 8105354b
> >  #1 [8801bec03d40] crash_kexec at 810f7e82
> >  #2 [8801bec03e10] panic at 81650058
> >  #3 [8801bec03e90] watchdog_timer_fn at 81122533
> >  #4 [8801bec03ec8] __hrtimer_run_queues at 810abeb2
> >  #5 [8801bec03f20] hrtimer_interrupt at 810ac450
> >  #6 [8801bec03f70] local_apic_timer_interrupt at 8104a457
> >  #7 [8801bec03f88] smp_apic_timer_interrupt at 8166aed0
> >  #8 [8801bec03fb0] apic_timer_interrupt at 8166931d
> > ---  ---
> >  #9 [8801b301fcb8] apic_timer_interrupt at 8166931d
> > [exception RIP: lock_timer_base+77]
> > RIP: 8108dced  RSP: 8801b301fd60  RFLAGS: 0246
> > RAX:   RBX: 8800afcc  RCX:
> > 0001
> > RDX: 8800afcc  RSI: 8801b301fd90  RDI: 8800b0d853c8
> > RBP: 8801b301fd80   R8: 8800afcc   R9:
> ea000268
> > R10: 003c  R11: 8801b301fb2e  R12:
> 8800afcc
> > R13: 8800afcc  R14:   R15:
> 83d1a340
> > ORIG_RAX: ff10  CS: 0010  SS: 0018
> > #10 [8801b301fd88] try_to_del_timer_sync at 8108f19f
> > #11 [8801b301fdb8] del_timer_sync at 8108f252
> > #12 [8801b301fdd0] packet_set_ring at 81635e60
> > #13 [8801b301fe98] packet_setsockopt at 81636760
> > #14 [8801b301ff38] sys_setsockopt at 81531860
> > #15 [8801b301ff80] tracesys at 816687ed (via system_call)
> > RIP: 7fcc78b03e3a  RSP: 7fff16f246b8  RFLAGS: 0202
> > RAX: ffda  RBX: 816687ed  RCX: 
> > RDX: 0005  RSI: 0107  RDI:
> > 0180
> > RBP: 0180   R8: 001c   R9:
> > 7fcc78dc7160
> > R10: 01fd6ba0  R11: 0202  R12:
> > 
> > R13: 0011  R14: 01fd6b60  R15:
> > 01fd6b70
> > ORIG_RAX: 0036  CS: 0033  SS: 002b
> >
> >
> > Best Regards,
> > liujian
> >
> >
> > > -Original Message-
> > > From: Cong Wang [mailto:xiyou.wangc...@gmail.com]
> > > Sent: Sunday, July 23, 2017 1:59 PM
> > > To: Dingtianhong
> > > Cc: liujian (CE); Willem de Bruijn; Dave Jones;
> > > alexander.le...@verizon.com; da...@davemloft.net;
> > eduma...@google.com;
> > > will...@google.com; dan...@iogearbox.net; netdev@vger.kernel.org;
> > > linux-ker...@vger.kernel.org
> > > Subject: Re: af_packet: use after free in
> > > prb_retire_rx_blk_timer_expired
> > >
> > > On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong
> > > 
> > > wrote:
> > > > Hi, Cong:
> > > >
> > > > Thanks for your quirk solution, but I still has some doubts about
> > > > it, it looks like fix the problem in the
> > > > packet_setsockopt->packet_set_ring processing, but when in
> > > > packet_release processing, it may could not release the real
> > > > pg_vec for the TPACKET_V3 ring, and then cause the mem leak, maybe
> > > > I miss something here, nice to hear from your feedback. :)
> > >
> > > Yes you miss that packet_release() has memset()'s so we won't hit
> > > that path. :)
> > >
> > > However, I missed the swap() in this messy function, actually I
> > > believe the bug is that we modify tpacket_kbdq_core inside rx_ring
> > > in non-closing case without actually stopping its timer. I feel more
> > > confident
> > with the following patch:
> > >
> > >

Re: [PATCH V2 net-next 01/21] net-next/hinic: Initialize hw interface

2017-07-23 Thread Aviad Krawczyk

Hi Francois,

ERR_PTR / IS ERR - we will change it

err_xyz labels - we will change it according to other companies style.

hinic_free_hwdev - It is there to mark us changes for VF code. We will remove 
it,
it can't be failed.

hinic_remove - If insmod failed and someone calls rmmod, we will get a crash 
because
the resource are already free. Therefore we test if the device exists, please 
tell me
if you meant to something different

pci_id_tbl - will be moved to the .c file.

void* - usually void * is something to avoid.

The priv data is in type void * because the
caller can use any struct that it wants, like the priv data in Linux
(netdev, irq, tasklet, work..) -

we can change it but if we will pass different struct
in the future, we will have to change the prototype of the functions.

According to the other void *:
The wq struct is used for cmdq, sq and rq. Therefore the wqe is in type
void *. There are 4 operations get_wqe, write_wqe, read_wqe and put_wqe - there
is no option that one function will be fed with a wrong pointer because the 
caller
should use what it got in get_wqe function.

When something is used as multiple types, it can be used as void * or union.
Usually, I would prefer union. But, in this case if we will use union, maybe 
there is a chance
of using the wrong wqe type in the wrong work queue type.
Another option is to use a wrapper for each wq type operations, so only the 
basic wq ops will
use void *.

Then, there will be cmdq, rq, sq operations with the correct wqe type and wq 
operations that
will use void *.

I will be glad to hear your opinion about the preferred style and about 
hinic_remove issue you mentioned
above.

Thanks for your time and review,
Aviad

On 7/20/2017 1:27 AM, Francois Romieu wrote:
> Aviad Krawczyk  :
> [...]
>> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c 
>> b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c
>> new file mode 100644
>> index 000..fbc9de4
>> --- /dev/null
>> +++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_dev.c
> [...]
>> +/**
>> + * hinic_init_hwdev - Initialize the NIC HW
>> + * @hwdev: the NIC HW device that is returned from the initialization
>> + * @pdev: the NIC pci device
>> + *
>> + * Return 0 - Success, negative - Failure
>> + *
>> + * Initialize the NIC HW device and return a pointer to it in the first arg
>> + **/
> 
> Return a pointer and use ERR_PTR / IS_ERR ?
> 
>> +int hinic_init_hwdev(struct hinic_hwdev **hwdev, struct pci_dev *pdev)
>> +{
>> +struct hinic_pfhwdev *pfhwdev;
>> +struct hinic_hwif *hwif;
>> +int err;
>> +
>> +hwif = devm_kzalloc(>dev, sizeof(*hwif), GFP_KERNEL);
>> +if (!hwif)
>> +return -ENOMEM;
>> +
>> +err = hinic_init_hwif(hwif, pdev);
>> +if (err) {
>> +dev_err(>dev, "Failed to init HW interface\n");
>> +return err;
>> +}
>> +
>> +if (!HINIC_IS_PF(hwif) && !HINIC_IS_PPF(hwif)) {
>> +dev_err(>dev, "Unsupported PCI Function type\n");
>> +err = -EFAULT;
>> +goto func_type_err;
>> +}
>> +
>> +pfhwdev = devm_kzalloc(>dev, sizeof(*pfhwdev), GFP_KERNEL);
>> +if (!pfhwdev) {
>> +err = -ENOMEM;
>> +goto pfhwdev_alloc_err;
> 
> Intel, Mellanox, Broadcom, Amazon and friends use "err_xyz" labels.
> 
> Please consider using the same style.
> 
> [...]
>> +void hinic_free_hwdev(struct hinic_hwdev *hwdev)
>> +{
>> +struct hinic_hwif *hwif = hwdev->hwif;
>> +struct pci_dev *pdev = hwif->pdev;
>> +struct hinic_pfhwdev *pfhwdev;
>> +
>> +if (!HINIC_IS_PF(hwif) && !HINIC_IS_PPF(hwif)) {
>> +dev_err(>dev, "unsupported PCI Function type\n");
>> +return;
>> +}
> 
> If it succeeded in hinic_init_hwdev, how could it fail here ?
> 
> If it failed in hinic_init_hwdev, hinic_free_hwdev should not even
> be called.
> 
> -> remove ?
> 
> [...]
>> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_main.c 
>> b/drivers/net/ethernet/huawei/hinic/hinic_main.c
>> new file mode 100644
>> index 000..c61c769
>> --- /dev/null
>> +++ b/drivers/net/ethernet/huawei/hinic/hinic_main.c
> [...]
>> +static void hinic_remove(struct pci_dev *pdev)
>> +{
>> +struct net_device *netdev = pci_get_drvdata(pdev);
>> +struct hinic_dev *nic_dev;
>> +
>> +if (!netdev)
>> +return;
> 
> Your driver is flawed if this test can ever succeed.
> 
> [...]
>> +static int __init hinic_init(void)
>> +{
>> +return pci_register_driver(_driver);
>> +}
>> +
>> +static void __exit hinic_exit(void)
>> +{
>> +pci_unregister_driver(_driver);
>> +}
> 
> Use module_pci_driver(hinic_driver).
> 
> Remove hinic_init() and hinic_exit().
> 
>> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_pci_id_tbl.h 
>> b/drivers/net/ethernet/huawei/hinic/hinic_pci_id_tbl.h
>> new file mode 100644
>> index 000..1d92617
>> --- /dev/null
>> +++ b/drivers/net/ethernet/huawei/hinic/hinic_pci_id_tbl.h
> [...]
>> +#ifndef

Re: [PATCH V2 net-next 12/21] net-next/hinic: Add qp resources

2017-07-23 Thread Aviad Krawczyk

Hi David,

Strange that checkpatch.pl --strict didn't warn about it.
We will fix it.

Thanks for review,
Aviad

On 7/20/2017 2:13 AM, David Miller wrote:
> From: Aviad Krawczyk 
> Date: Wed, 19 Jul 2017 17:19:10 +0800
> 
>> diff --git a/drivers/net/ethernet/huawei/hinic/Makefile 
>> b/drivers/net/ethernet/huawei/hinic/Makefile
>> index 519382b..24728f0 100644
>> --- a/drivers/net/ethernet/huawei/hinic/Makefile
>> +++ b/drivers/net/ethernet/huawei/hinic/Makefile
>> @@ -1,5 +1,5 @@
>>  obj-$(CONFIG_HINIC) += hinic.o
>>  
>>  hinic-y := hinic_main.o hinic_tx.o hinic_rx.o hinic_port.o hinic_hw_dev.o \
>> -   hinic_hw_io.o hinic_hw_wq.o hinic_hw_mgmt.o hinic_hw_api_cmd.o \
>> -   hinic_hw_eqs.o hinic_hw_if.o
>> \ No newline at end of file
>> +   hinic_hw_io.o hinic_hw_qp.o hinic_hw_wq.o hinic_hw_mgmt.o \
>> +   hinic_hw_api_cmd.o hinic_hw_eqs.o hinic_hw_if.o
>> \ No newline at end of file
> 
> Please add the missing newline to the end of this Makefile so that these 
> patches
> don't keep emitting this message.
> 
> .
>

RE: af_packet: use after free in prb_retire_rx_blk_timer_expired

2017-07-23 Thread liujian (CE)

Hi, 

Do we need delete the v3 ring, when tp_version changed from TPACKET_V3 to 
TPACKET_V1 ?


Best Regards,
liujian


> -Original Message-
> From: liujian (CE)
> Sent: Sunday, July 23, 2017 4:21 PM
> To: 'Cong Wang'; Dingtianhong
> Cc: Willem de Bruijn; Dave Jones; alexander.le...@verizon.com;
> da...@davemloft.net; eduma...@google.com; will...@google.com;
> dan...@iogearbox.net; netdev@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: RE: af_packet: use after free in prb_retire_rx_blk_timer_expired
> 
> Hi Wang Cong,
> 
> With this patch , the system was crashed when setsockopt.
> 
> The call trace as below:
> 
> crash> bt
> PID: 3069   TASK: 8800afcc  CPU: 0   COMMAND: "trinity-main"
>  #0 [8801bec03ce0] machine_kexec at 8105354b
>  #1 [8801bec03d40] crash_kexec at 810f7e82
>  #2 [8801bec03e10] panic at 81650058
>  #3 [8801bec03e90] watchdog_timer_fn at 81122533
>  #4 [8801bec03ec8] __hrtimer_run_queues at 810abeb2
>  #5 [8801bec03f20] hrtimer_interrupt at 810ac450
>  #6 [8801bec03f70] local_apic_timer_interrupt at 8104a457
>  #7 [8801bec03f88] smp_apic_timer_interrupt at 8166aed0
>  #8 [8801bec03fb0] apic_timer_interrupt at 8166931d
> ---  ---
>  #9 [8801b301fcb8] apic_timer_interrupt at 8166931d
> [exception RIP: lock_timer_base+77]
> RIP: 8108dced  RSP: 8801b301fd60  RFLAGS: 0246
> RAX:   RBX: 8800afcc  RCX:
> 0001
> RDX: 8800afcc  RSI: 8801b301fd90  RDI: 8800b0d853c8
> RBP: 8801b301fd80   R8: 8800afcc   R9: ea000268
> R10: 003c  R11: 8801b301fb2e  R12: 8800afcc
> R13: 8800afcc  R14:   R15: 83d1a340
> ORIG_RAX: ff10  CS: 0010  SS: 0018
> #10 [8801b301fd88] try_to_del_timer_sync at 8108f19f
> #11 [8801b301fdb8] del_timer_sync at 8108f252
> #12 [8801b301fdd0] packet_set_ring at 81635e60
> #13 [8801b301fe98] packet_setsockopt at 81636760
> #14 [8801b301ff38] sys_setsockopt at 81531860
> #15 [8801b301ff80] tracesys at 816687ed (via system_call)
> RIP: 7fcc78b03e3a  RSP: 7fff16f246b8  RFLAGS: 0202
> RAX: ffda  RBX: 816687ed  RCX: 
> RDX: 0005  RSI: 0107  RDI:
> 0180
> RBP: 0180   R8: 001c   R9:
> 7fcc78dc7160
> R10: 01fd6ba0  R11: 0202  R12:
> 
> R13: 0011  R14: 01fd6b60  R15:
> 01fd6b70
> ORIG_RAX: 0036  CS: 0033  SS: 002b
> 
> 
> Best Regards,
> liujian
> 
> 
> > -Original Message-
> > From: Cong Wang [mailto:xiyou.wangc...@gmail.com]
> > Sent: Sunday, July 23, 2017 1:59 PM
> > To: Dingtianhong
> > Cc: liujian (CE); Willem de Bruijn; Dave Jones;
> > alexander.le...@verizon.com; da...@davemloft.net;
> eduma...@google.com;
> > will...@google.com; dan...@iogearbox.net; netdev@vger.kernel.org;
> > linux-ker...@vger.kernel.org
> > Subject: Re: af_packet: use after free in
> > prb_retire_rx_blk_timer_expired
> >
> > On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong
> > 
> > wrote:
> > > Hi, Cong:
> > >
> > > Thanks for your quirk solution, but I still has some doubts about
> > > it, it looks like fix the problem in the
> > > packet_setsockopt->packet_set_ring processing, but when in
> > > packet_release processing, it may could not release the real pg_vec
> > > for the TPACKET_V3 ring, and then cause the mem leak, maybe I miss
> > > something here, nice to hear from your feedback. :)
> >
> > Yes you miss that packet_release() has memset()'s so we won't hit that
> > path. :)
> >
> > However, I missed the swap() in this messy function, actually I
> > believe the bug is that we modify tpacket_kbdq_core inside rx_ring in
> > non-closing case without actually stopping its timer. I feel more confident
> with the following patch:
> >
> >
> > diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index
> > 008bb34ee324..267b181fef15 100644
> > --- a/net/packet/af_packet.c
> > +++ b/net/packet/af_packet.c
> > @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk,
> > union tpacket_req_u *req_u,
> > case TPACKET_V3:
> > /* Block transmit is not supported yet */
> > if (!tx_ring) {
> > +   prb_shutdown_retire_blk_timer(po,
> > + rb_queue);
> > init_prb_bdqc(po, rb, pg_vec, req_u);
> > } else {
> > struct tpacket_req3 *req3 =
> > _u->req3;

[PATCH net] openvswitch: fix potential out of bound access in parse_ct

2017-07-23 Thread Liping Zhang

From: Liping Zhang 

Before the 'type' is validated, we shouldn't use it to fetch the
ovs_ct_attr_lens's minlen and maxlen, else, out of bound access
may happen.

Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
Signed-off-by: Liping Zhang 
---
 net/openvswitch/conntrack.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index e3c4c6c..03859e3 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1310,8 +1310,8 @@ static int parse_ct(const struct nlattr *attr, struct 
ovs_conntrack_info *info,
 
nla_for_each_nested(a, attr, rem) {
int type = nla_type(a);
-   int maxlen = ovs_ct_attr_lens[type].maxlen;
-   int minlen = ovs_ct_attr_lens[type].minlen;
+   int maxlen;
+   int minlen;
 
if (type > OVS_CT_ATTR_MAX) {
OVS_NLERR(log,
@@ -1319,6 +1319,9 @@ static int parse_ct(const struct nlattr *attr, struct 
ovs_conntrack_info *info,
  type, OVS_CT_ATTR_MAX);
return -EINVAL;
}
+
+   maxlen = ovs_ct_attr_lens[type].maxlen;
+   minlen = ovs_ct_attr_lens[type].minlen;
if (nla_len(a) < minlen || nla_len(a) > maxlen) {
OVS_NLERR(log,
  "Conntrack attr type has unexpected length 
(type=%d, length=%d, expected=%d)",
-- 
2.5.5

RE: [PATCH V4 net-next 5/8] net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver

2017-07-23 Thread Salil Mehta

Hi Richard,

> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Richard Cochran
> Sent: Sunday, July 23, 2017 7:17 AM
> To: Salil Mehta
> Cc: da...@davemloft.net; Zhuangyuzeng (Yisen); huangdaode; lipeng (Y);
> mehta.salil@gmail.com; netdev@vger.kernel.org; linux-
> ker...@vger.kernel.org; linux-r...@vger.kernel.org; Linuxarm
> Subject: Re: [PATCH V4 net-next 5/8] net: hns3: Add support of TX
> Scheduler & Shaper to HNS3 driver
> 
> On Sat, Jul 22, 2017 at 11:09:39PM +0100, Salil Mehta wrote:
> > +int hclge_tm_setup_tc(struct hclge_dev *hdev);
> 
> Like I said before, this function DNE.
Sorry, it was removed, I mistakenly saw the old code.
Will remove this prototype as well.

Thanks
Salil
> 
> Thanks,
> Richard
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

RE: af_packet: use after free in prb_retire_rx_blk_timer_expired

2017-07-23 Thread liujian (CE)

Hi Wang Cong,

With this patch , the system was crashed when setsockopt. 

The call trace as below:  

crash> bt
PID: 3069   TASK: 8800afcc  CPU: 0   COMMAND: "trinity-main"
 #0 [8801bec03ce0] machine_kexec at 8105354b
 #1 [8801bec03d40] crash_kexec at 810f7e82
 #2 [8801bec03e10] panic at 81650058
 #3 [8801bec03e90] watchdog_timer_fn at 81122533
 #4 [8801bec03ec8] __hrtimer_run_queues at 810abeb2
 #5 [8801bec03f20] hrtimer_interrupt at 810ac450
 #6 [8801bec03f70] local_apic_timer_interrupt at 8104a457
 #7 [8801bec03f88] smp_apic_timer_interrupt at 8166aed0
 #8 [8801bec03fb0] apic_timer_interrupt at 8166931d
---  ---
 #9 [8801b301fcb8] apic_timer_interrupt at 8166931d
[exception RIP: lock_timer_base+77]
RIP: 8108dced  RSP: 8801b301fd60  RFLAGS: 0246
RAX:   RBX: 8800afcc  RCX: 0001
RDX: 8800afcc  RSI: 8801b301fd90  RDI: 8800b0d853c8
RBP: 8801b301fd80   R8: 8800afcc   R9: ea000268
R10: 003c  R11: 8801b301fb2e  R12: 8800afcc
R13: 8800afcc  R14:   R15: 83d1a340
ORIG_RAX: ff10  CS: 0010  SS: 0018
#10 [8801b301fd88] try_to_del_timer_sync at 8108f19f
#11 [8801b301fdb8] del_timer_sync at 8108f252
#12 [8801b301fdd0] packet_set_ring at 81635e60
#13 [8801b301fe98] packet_setsockopt at 81636760
#14 [8801b301ff38] sys_setsockopt at 81531860
#15 [8801b301ff80] tracesys at 816687ed (via system_call)
RIP: 7fcc78b03e3a  RSP: 7fff16f246b8  RFLAGS: 0202
RAX: ffda  RBX: 816687ed  RCX: 
RDX: 0005  RSI: 0107  RDI: 0180
RBP: 0180   R8: 001c   R9: 7fcc78dc7160
R10: 01fd6ba0  R11: 0202  R12: 
R13: 0011  R14: 01fd6b60  R15: 01fd6b70
ORIG_RAX: 0036  CS: 0033  SS: 002b


Best Regards,
liujian


> -Original Message-
> From: Cong Wang [mailto:xiyou.wangc...@gmail.com]
> Sent: Sunday, July 23, 2017 1:59 PM
> To: Dingtianhong
> Cc: liujian (CE); Willem de Bruijn; Dave Jones; alexander.le...@verizon.com;
> da...@davemloft.net; eduma...@google.com; will...@google.com;
> dan...@iogearbox.net; netdev@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired
> 
> On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong 
> wrote:
> > Hi, Cong:
> >
> > Thanks for your quirk solution, but I still has some doubts about it,
> > it looks like fix the problem in the
> > packet_setsockopt->packet_set_ring processing, but when in
> > packet_release processing, it may could not release the real pg_vec
> > for the TPACKET_V3 ring, and then cause the mem leak, maybe I miss
> > something here, nice to hear from your feedback. :)
> 
> Yes you miss that packet_release() has memset()'s so we won't hit that path. 
> :)
> 
> However, I missed the swap() in this messy function, actually I believe the 
> bug
> is that we modify tpacket_kbdq_core inside rx_ring in non-closing case without
> actually stopping its timer. I feel more confident with the following patch:
> 
> 
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index
> 008bb34ee324..267b181fef15 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk, union
> tpacket_req_u *req_u,
> case TPACKET_V3:
> /* Block transmit is not supported yet */
> if (!tx_ring) {
> +   prb_shutdown_retire_blk_timer(po,
> + rb_queue);
> init_prb_bdqc(po, rb, pg_vec, req_u);
> } else {
> struct tpacket_req3 *req3 =
> _u->req3;

Re: [PATCH V4 net-next 5/8] net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver

2017-07-23 Thread Richard Cochran

On Sat, Jul 22, 2017 at 11:09:39PM +0100, Salil Mehta wrote:
> +int hclge_tm_setup_tc(struct hclge_dev *hdev);

Like I said before, this function DNE.

Thanks,
Richard

Re: [PATCH V3 net-next 5/8] net: hns3: Add support of TX Scheduler & Shaper to HNS3 driver

2017-07-23 Thread Richard Cochran

On Sat, Jul 22, 2017 at 11:38:26PM +, Salil Mehta wrote:
> > On Sat, Jun 17, 2017 at 06:24:28PM +0100, Salil Mehta wrote:
> > > +
> > > +int hclge_tm_schd_init(struct hclge_dev *hdev);
> > > +int hclge_tm_setup_tc(struct hclge_dev *hdev);
> > 
> > The definition of this function DNE.
> Sorry, I did not get what DNE means? Does Not Exist ?
> If yes, the I can see the definition of both the functions.

Where in this series is hclge_tm_setup_tc() defined?

Nowhere, AFAICT.

Thanks,
Richard

Re: af_packet: use after free in prb_retire_rx_blk_timer_expired

2017-07-23 Thread Cong Wang

On Sat, Jul 22, 2017 at 8:40 PM, Ding Tianhong  wrote:
> Hi, Cong:
>
> Thanks for your quirk solution, but I still has some doubts about it,
> it looks like fix the problem in the packet_setsockopt->packet_set_ring 
> processing,
> but when in packet_release processing, it may could not release the
> real pg_vec for the TPACKET_V3 ring, and then cause the mem leak,
> maybe I miss something here, nice to hear from your feedback. :)

Yes you miss that packet_release() has memset()'s so we won't hit
that path. :)

However, I missed the swap() in this messy function, actually I
believe the bug is that we modify tpacket_kbdq_core inside rx_ring
in non-closing case without actually stopping its timer. I feel
more confident with the following patch:


diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 008bb34ee324..267b181fef15 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4263,6 +4263,7 @@ static int packet_set_ring(struct sock *sk,
union tpacket_req_u *req_u,
case TPACKET_V3:
/* Block transmit is not supported yet */
if (!tx_ring) {
+   prb_shutdown_retire_blk_timer(po, rb_queue);
init_prb_bdqc(po, rb, pg_vec, req_u);
} else {
struct tpacket_req3 *req3 = _u->req3;

44 matches

Mail list logo