Re: [PATCH] veth: fix memory leak in veth_newlink()

2020-08-30 Thread Toshiaki Makita

On 2020/08/31 9:51, Rustam Kovhaev wrote:

On Mon, Aug 31, 2020 at 09:16:32AM +0900, Toshiaki Makita wrote:

On 2020/08/30 22:13, Rustam Kovhaev wrote:

when register_netdevice(dev) fails we should check whether struct
veth_rq has been allocated via ndo_init callback and free it, because,
depending on the code path, register_netdevice() might not call
priv_destructor() callback


AFAICS, register_netdevice() always goes to err_uninit and calls priv_destructor()
on failure after ndo_init() succeeded.
So I could not find such a code path.
Would you elaborate on it?


in net/core/dev.c:9863, where register_netdevice() calls rollback_registered(),
which does not call priv_destructor(), then register_netdevice() returns error
net/core/dev.c:9884


Thank you, now I see the code path.
But then all devices which allocate something in ndo_init() and free it in
priv_destructor() are affected? E.g. loopback and ifb seem to do such a thing.
Why not call priv_destructor() after the invocation of rollback_registered()?
It looks weird that only that path does not call priv_destructor().

Toshiaki Makita
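
A minimal userspace model of the two failure paths may make the question concrete. Everything below (struct fake_dev, fake_register(), the fail_early/fail_late switches) is hypothetical and only mimics the control flow described in this thread; it is not the code in net/core/dev.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct net_device; not the kernel definition. */
struct fake_dev {
	void *rq;					/* allocated by ndo_init */
	void (*priv_destructor)(struct fake_dev *);
};

static int live_allocs;				/* outstanding allocations */

static int fake_ndo_init(struct fake_dev *dev)
{
	dev->rq = malloc(64);
	if (!dev->rq)
		return -1;
	live_allocs++;
	return 0;
}

static void fake_priv_destructor(struct fake_dev *dev)
{
	if (dev->rq) {
		free(dev->rq);
		dev->rq = NULL;
		live_allocs--;
	}
}

/*
 * Model of the two failure paths discussed in the thread.
 * fail_early: failure right after ndo_init (the err_uninit path).
 * fail_late:  failure after that point (the rollback path).
 * destructor_on_rollback models calling priv_destructor() after rollback.
 */
int fake_register(struct fake_dev *dev, int fail_early, int fail_late,
		  int destructor_on_rollback)
{
	assert(dev && dev->priv_destructor);

	if (fake_ndo_init(dev))
		return -1;

	if (fail_early) {
		/* err_uninit: the destructor runs, nothing leaks */
		dev->priv_destructor(dev);
		return -1;
	}

	if (fail_late) {
		/* rollback path: the destructor runs only if we add the call */
		if (destructor_on_rollback)
			dev->priv_destructor(dev);
		return -1;
	}
	return 0;
}

/* Returns how many allocations are still live after a failed register. */
int leaked_after_failure(int fail_early, int fail_late, int fix)
{
	struct fake_dev dev = { .rq = NULL, .priv_destructor = fake_priv_destructor };
	int leaked;

	live_allocs = 0;
	fake_register(&dev, fail_early, fail_late, fix);
	leaked = live_allocs;
	fake_priv_destructor(&dev);	/* keep the model itself leak-free */
	return leaked;
}
```

In this model, leaked_after_failure(0, 1, 0) reports one outstanding allocation, while passing fix = 1 corresponds to the suggestion of calling priv_destructor() after rollback_registered().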


Re: [PATCH] veth: fix memory leak in veth_newlink()

2020-08-30 Thread Toshiaki Makita

On 2020/08/30 22:13, Rustam Kovhaev wrote:

when register_netdevice(dev) fails we should check whether struct
veth_rq has been allocated via ndo_init callback and free it, because,
depending on the code path, register_netdevice() might not call
priv_destructor() callback


AFAICS, register_netdevice() always goes to err_uninit and calls priv_destructor()
on failure after ndo_init() succeeded.
So I could not find such a code path.
Would you elaborate on it?

Thanks,
Toshiaki Makita


Re: [PATCH net] net: ethtool: Allow matching on vlan CFI bit

2019-06-12 Thread Toshiaki Makita

On 2019/06/12 0:54, Maxime Chevallier wrote:

Using ethtool, users can specify a classification action matching on the
full vlan tag, which includes the CFI bit.

However, when converting the ethtool_flow_spec to a flow_rule, we use
dissector keys to represent the matching patterns.

Since the vlan dissector key doesn't include the CFI bit, this
information was silently discarded when translating the ethtool
flow spec into a flow_rule.

This commit adds the CFI bit into the vlan dissector key, and allows
propagating the information to the driver when parsing the ethtool flow
spec.

Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator")
Reported-by: Michał Mirosław 
Signed-off-by: Maxime Chevallier 
---
Hi all,

Although this prevents information from being silently discarded when parsing
an ethtool_flow_spec, this information doesn't seem to be used by any
driver that converts an ethtool_flow_spec to a flow_rule, hence I'm not
sure this is suitable for -net.

Thanks,

Maxime

  include/net/flow_dissector.h | 1 +
  net/core/ethtool.c   | 5 +
  2 files changed, 6 insertions(+)

diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h
index 7c5a8d9a8d2a..9d2e395c6568 100644
--- a/include/net/flow_dissector.h
+++ b/include/net/flow_dissector.h
@@ -46,6 +46,7 @@ struct flow_dissector_key_tags {
 
 struct flow_dissector_key_vlan {
 	u16	vlan_id:12,
+		vlan_cfi:1,

Current IEEE 802.1Q defines this bit as DEI not CFI, so IMO this should be
vlan_dei.

Toshiaki Makita
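
For reference, the 16-bit 802.1Q TCI carries a 3-bit priority (PCP), the 1-bit DEI (formerly CFI) and the 12-bit VLAN ID. The sketch below splits and rebuilds a TCI in host byte order; the struct and function names are made up for illustration and are not the flow dissector API.

```c
#include <assert.h>
#include <stdint.h>

#define TCI_PCP_SHIFT 13
#define TCI_DEI_MASK  0x1000u
#define TCI_VID_MASK  0x0fffu

/* Illustrative container; field names are not taken from the kernel. */
struct vlan_tci_fields {
	uint8_t  pcp;	/* priority code point, 3 bits */
	uint8_t  dei;	/* drop eligible indicator, 1 bit (formerly CFI) */
	uint16_t vid;	/* VLAN identifier, 12 bits */
};

/* Split a TCI given in host byte order into its three fields. */
struct vlan_tci_fields vlan_tci_split(uint16_t tci)
{
	struct vlan_tci_fields f;

	f.pcp = (uint8_t)(tci >> TCI_PCP_SHIFT);
	f.dei = (tci & TCI_DEI_MASK) ? 1 : 0;
	f.vid = (uint16_t)(tci & TCI_VID_MASK);
	return f;
}

/* Rebuild the TCI (host byte order) from the three fields. */
uint16_t vlan_tci_join(struct vlan_tci_fields f)
{
	assert(f.pcp < 8 && f.dei < 2 && f.vid < 4096);
	return (uint16_t)((f.pcp << TCI_PCP_SHIFT) |
			  (f.dei ? TCI_DEI_MASK : 0) | f.vid);
}
```

A match on the full tag has to carry all three fields; dropping the DEI bit is exactly the information loss the patch description mentions.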


Re: KMSAN: uninit-value in __netif_receive_skb_core

2018-04-13 Thread Toshiaki Makita
0246 R12: 
>> R13: 06cd R14: 006fd3d8 R15: 
>>
>> Uninit was stored to memory at:
>>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
>>  kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
>>  kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
>>  __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
>>  skb_vlan_untag+0x950/0xee0 include/linux/if_vlan.h:597
>>  __netif_receive_skb_core+0x70a/0x4a80 net/core/dev.c:4460
>>  __netif_receive_skb net/core/dev.c:4627 [inline]
>>  process_backlog+0x62d/0xe20 net/core/dev.c:5307
>>  napi_poll net/core/dev.c:5705 [inline]
>>  net_rx_action+0x7c1/0x1a70 net/core/dev.c:5771
>>  __do_softirq+0x56d/0x93d kernel/softirq.c:285
>> Uninit was created at:
>>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
>>  kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
>>  kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
>>  kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321
>>  slab_post_alloc_hook mm/slab.h:445 [inline]
>>  slab_alloc_node mm/slub.c:2737 [inline]
>>  __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369
>>  __kmalloc_reserve net/core/skbuff.c:138 [inline]
>>  __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
>>  alloc_skb include/linux/skbuff.h:984 [inline]
>>  alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
>>  sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085
>>  packet_alloc_skb net/packet/af_packet.c:2803 [inline]
>>  packet_snd net/packet/af_packet.c:2894 [inline]
>>  packet_sendmsg+0x6444/0x8a10 net/packet/af_packet.c:2969
>>  sock_sendmsg_nosec net/socket.c:630 [inline]
>>  sock_sendmsg net/socket.c:640 [inline]
>>  sock_write_iter+0x3b9/0x470 net/socket.c:909
>>  do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
>>  do_iter_write+0x30d/0xd40 fs/read_write.c:932
>>  vfs_writev fs/read_write.c:977 [inline]
>>  do_writev+0x3c9/0x830 fs/read_write.c:1012
>>  SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
>>  SyS_writev+0x56/0x80 fs/read_write.c:1082
>>  do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
>>  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
>> ==
>>
>>
>> ---
>> This bug is generated by a dumb bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for details.
>> Direct all questions to syzkal...@googlegroups.com.
>>
>> syzbot will keep track of this bug report.
>> If you forgot to add the Reported-by tag, once the fix for this bug is
>> merged
>> into any tree, please reply to this email with:
>> #syz fix: exact-commit-title
>> To mark this as a duplicate of another syzbot report, please reply with:
>> #syz dup: exact-subject-of-another-report
>> If it's a one-off invalid bug report, please reply with:
>> #syz invalid
>> Note: if the crash happens again, it will cause creation of a new bug
>> report.
>> Note: all commands must start from beginning of the line in the email body.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "syzkaller-bugs" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to syzkaller-bugs+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/syzkaller-bugs/94eb2c059ce01f643c0569a228ee%40google.com.
>> For more options, visit https://groups.google.com/d/optout.
> 
> 

-- 
Toshiaki Makita



Re: KMSAN: uninit-value in netif_skb_features

2018-04-13 Thread Toshiaki Makita
ine]
>>  __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206
>>  alloc_skb include/linux/skbuff.h:984 [inline]
>>  alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234
>>  sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085
>>  packet_alloc_skb net/packet/af_packet.c:2803 [inline]
>>  packet_snd net/packet/af_packet.c:2894 [inline]
>>  packet_sendmsg+0x6444/0x8a10 net/packet/af_packet.c:2969
>>  sock_sendmsg_nosec net/socket.c:630 [inline]
>>  sock_sendmsg net/socket.c:640 [inline]
>>  sock_write_iter+0x3b9/0x470 net/socket.c:909
>>  do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
>>  do_iter_write+0x30d/0xd40 fs/read_write.c:932
>>  vfs_writev fs/read_write.c:977 [inline]
>>  do_writev+0x3c9/0x830 fs/read_write.c:1012
>>  SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
>>  SyS_writev+0x56/0x80 fs/read_write.c:1082
>>  do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
>>  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
>> ==
>>
>>
>> ---
>> This bug is generated by a dumb bot. It may contain errors.
>> See https://goo.gl/tpsmEJ for details.
>> Direct all questions to syzkal...@googlegroups.com.
>>
>> syzbot will keep track of this bug report.
>> If you forgot to add the Reported-by tag, once the fix for this bug is
>> merged
>> into any tree, please reply to this email with:
>> #syz fix: exact-commit-title
>> If you want to test a patch for this bug, please reply with:
>> #syz test: git://repo/address.git branch
>> and provide the patch inline or as an attachment.
>> To mark this as a duplicate of another syzbot report, please reply with:
>> #syz dup: exact-subject-of-another-report
>> If it's a one-off invalid bug report, please reply with:
>> #syz invalid
>> Note: if the crash happens again, it will cause creation of a new bug
>> report.
>> Note: all commands must start from beginning of the line in the email body.
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "syzkaller-bugs" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to syzkaller-bugs+unsubscr...@googlegroups.com.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/syzkaller-bugs/089e082d0cb81b67d10569a2283f%40google.com.
>> For more options, visit https://groups.google.com/d/optout.
> 
> 

-- 
Toshiaki Makita



Re: [PATCH bpf-next v3 1/3] libbpf: add function to setup XDP

2017-12-28 Thread Toshiaki Makita
On 2017/12/28 17:04, Eric Leblond wrote:
> Most of the code is taken from set_link_xdp_fd() in bpf_load.c and
> slightly modified to be library compliant.
> 
> Signed-off-by: Eric Leblond <e...@regit.org>
> Acked-by: Alexei Starovoitov <a...@kernel.org>
> ---
...
> +int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
...
> + if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
> + ret = -errno;
> + goto cleanup;
> + }
> +
> + addrlen = sizeof(sa);
> + if (getsockname(sock, (struct sockaddr *)&sa, &addrlen) < 0) {
> + ret = errno;

Still errno is not inverted,

> + goto cleanup;
> + }
> +
> + if (addrlen != sizeof(sa)) {
> + ret = errno;

And errno is not set here.

> + goto cleanup;
> + }

-- 
Toshiaki Makita



Re: [PATCH 1/4] libbpf: add function to setup XDP

2017-12-27 Thread Toshiaki Makita
On 2017/12/28 3:02, Eric Leblond wrote:
> Most of the code is taken from set_link_xdp_fd() in bpf_load.c and
> slightly modified to be library compliant.
> 
> Signed-off-by: Eric Leblond <e...@regit.org>
> ---
...
> +int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
...
> + if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
> + ret = -errno;
> + goto cleanup;
> + }
> +
> + addrlen = sizeof(sa);
> + if (getsockname(sock, (struct sockaddr *)&sa, &addrlen) < 0) {
> + ret = errno;

forgot to prepend '-'?

> + goto cleanup;
> + }
> +
> + if (addrlen != sizeof(sa)) {
> + ret = errno;

errno is not set?

> + goto cleanup;
> + }

-- 
Toshiaki Makita



Re: [PATCH net-next] libbpf: add function to setup XDP

2017-12-10 Thread Toshiaki Makita
On 2017/12/09 23:43, Eric Leblond wrote:
> Most of the code is taken from set_link_xdp_fd() in bpf_load.c and
> slightly modified to be library compliant.
> 
> Signed-off-by: Eric Leblond <e...@regit.org>
...
> +int bpf_set_link_xdp_fd(int ifindex, int fd, __u32 flags)
...
> + for (nh = (struct nlmsghdr *)buf; NLMSG_OK(nh, len);
> +  nh = NLMSG_NEXT(nh, len)) {
> + if (nh->nlmsg_pid != getpid()) {

Generally nlmsg_pid should not be compared with process id.
See man netlink and
https://github.com/iovisor/bcc/pull/1275/commits/69ce96a54c55960c8de3392061254c97b6306a6d

> + ret = -LIBBPF_ERRNO__WRNGPID;
> + goto cleanup;
> + }

-- 
Toshiaki Makita
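
One way to avoid the getpid() comparison is to bind with nl_pid = 0, let the kernel choose the port id, read it back with getsockname(), and compare replies against that value. The sketch below assumes a Linux system with NETLINK_ROUTE available; open_netlink() and reply_is_for_us() are hypothetical names, not libbpf functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>

/*
 * Open a NETLINK_ROUTE socket, let the kernel pick the port id
 * (nl_pid = 0 at bind time) and report the id that was assigned.
 * Returns the socket fd, or a negative errno value on failure.
 */
int open_netlink(uint32_t *port_id)
{
	struct sockaddr_nl sa;
	socklen_t addrlen = sizeof(sa);
	int sock, err;

	assert(port_id != NULL);

	sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
	if (sock < 0)
		return -errno;

	memset(&sa, 0, sizeof(sa));
	sa.nl_family = AF_NETLINK;	/* nl_pid stays 0: kernel assigns it */

	if (bind(sock, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
	    getsockname(sock, (struct sockaddr *)&sa, &addrlen) < 0) {
		err = -errno;		/* save before close() can change errno */
		close(sock);
		return err;
	}

	*port_id = sa.nl_pid;
	return sock;
}

/* Accept a reply only if it is addressed to this socket's port id. */
int reply_is_for_us(const struct nlmsghdr *nh, uint32_t port_id)
{
	return nh->nlmsg_pid == port_id;
}
```

Two sockets opened by the same process receive different port ids, so at most one of them can match getpid(); comparing against the id returned by getsockname() works for both.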



Re: Sending 802.1Q packets using AF_PACKET socket on filtered bridge forwards with wrong MAC addresses

2017-12-05 Thread Toshiaki Makita
Hi,
(CC: Vlad)

On 2017/11/30 7:01, Brandon Carpenter wrote:
> I narrowed the search to a memmove() called from
> skb_reorder_vlan_header() in net/core/skbuff.c.
> 
>> memmove(skb->data - ETH_HLEN, skb->data - skb->mac_len - VLAN_HLEN,
>>2 * ETH_ALEN);
> 
> Calling skb_reset_mac_len() after skb_reset_mac_header() before
> calling br_allowed_ingress() in net/bridge/br_device.c fixes the
> problem.
> 
> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
> index af5b8c87f590..e10131e2f68f 100644
> --- a/net/bridge/br_device.c
> +++ b/net/bridge/br_device.c
> @@ -58,6 +58,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
> BR_INPUT_SKB_CB(skb)->brdev = dev;
> 
> skb_reset_mac_header(skb);
> +   skb_reset_mac_len(skb);
> eth = eth_hdr(skb);
> skb_pull(skb, ETH_HLEN);

Thanks for debugging this problem.
It seems this has been broken since a6e18ff11170 ("vlan: Fix untag
operations of stacked vlans with REORDER_HEADER off").

Unfortunately this does not always work correctly, since in tx path
drivers assume network header to be set to L3 protocol header offset.
Packet socket (packet_snd()) determines network header by
dev_hard_header which is ETH_HLEN in bridge devices, so this works for
packet socket, but with vlan devices on top of bridge device with
tx-vlan hwaccel disabled we get ETH_HLEN + VLAN_HLEN or longer by mac_len.

Since mac_len can be arbitrarily long if we stack vlan devices on bridge
devices, and since we want to untag the outermost tag, using mac_len to
untag in tx path is probably no longer correct.

I'll think deeper about how to fix it.

> I'll put together an official patch  and submit it. Should I use
> another email account? Are my emails being ignored because of that
> stupid disclaimer my employer attaches to my messages (outside my
> control)?
> 
> Brandon
> 

-- 
Toshiaki Makita
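
To see why a wrong mac_len corrupts the addresses, the quoted memmove() can be replayed on a plain byte buffer. This is a userspace sketch under stated assumptions: data_off plays the role of skb->data after the VLAN header has been pulled, the frame layout is a single-tagged Ethernet frame, and the stale mac_len value of 0 is hypothetical, chosen only to show the effect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN  6
#define ETH_HLEN  14
#define VLAN_HLEN 4

/* Fill frame with: dst (0xAA x6), src (0xBB x6), TPID 0x8100, TCI 0x0064,
 * encapsulated ethertype 0x0800, then 0xEE payload bytes. */
void build_tagged_frame(uint8_t *frame, size_t len)
{
	assert(len >= ETH_HLEN + VLAN_HLEN + 8);
	memset(frame, 0xEE, len);
	memset(frame, 0xAA, ETH_ALEN);
	memset(frame + ETH_ALEN, 0xBB, ETH_ALEN);
	frame[12] = 0x81; frame[13] = 0x00;
	frame[14] = 0x00; frame[15] = 0x64;
	frame[16] = 0x08; frame[17] = 0x00;
}

/*
 * Same pointer arithmetic as the quoted memmove(): move the two MAC
 * addresses so that they end directly in front of the encapsulated
 * ethertype. Returns the offset of the new MAC header.
 */
size_t untag_reorder(uint8_t *frame, size_t data_off, size_t mac_len)
{
	assert(data_off >= mac_len + VLAN_HLEN && data_off >= ETH_HLEN);
	memmove(frame + data_off - ETH_HLEN,
		frame + data_off - mac_len - VLAN_HLEN,
		2 * ETH_ALEN);
	return data_off - ETH_HLEN;
}

/* 1 if hdr starts with the dst/src pattern written by build_tagged_frame(). */
int macs_intact(const uint8_t *hdr)
{
	for (int i = 0; i < ETH_ALEN; i++)
		if (hdr[i] != 0xAA || hdr[ETH_ALEN + i] != 0xBB)
			return 0;
	return 1;
}
```

With mac_len equal to ETH_HLEN the source of the copy is the real MAC header; with any other value the copy starts somewhere else in the frame and the forwarded addresses are wrong.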



Re: Inconsistency in packet drop due to MTU (eth vs veth)

2017-02-03 Thread Toshiaki Makita

On 17/02/03 (金) 17:07, Fredrik Markstrom wrote:

  On Tue, 31 Jan 2017 17:27:09 +0100 Eric Dumazet <eric.duma...@gmail.com> wrote
 > On Tue, 2017-01-31 at 14:32 +0100, Fredrik Markstrom wrote:
 > >   On Thu, 19 Jan 2017 19:53:47 +0100 Eric Dumazet <eric.duma...@gmail.com> wrote
 > >  > On Thu, 2017-01-19 at 17:41 +0100, Fredrik Markstrom wrote:
 > >  > > Hello,
 > >  > >
 > >  > > I've noticed an inconsistency between how physical ethernet and
 > >  > > veth handles mtu.
 > >  > >
 > >  > > If I setup two physical interfaces (directly connected) with
 > >  > > different mtu:s, only the size of the outgoing packets are limited by
 > >  > > the mtu. But with veth a packet is dropped if the mtu of the receiving
 > >  > > interface is smaller than the packet size.
 > >  > >
 > >  > > This seems inconsistent to me, but maybe there is a reason for
 > >  > > it ?
 > >  > >
 > >  > > Can someone confirm if it's a deliberate inconsistency or just a
 > >  > > side effect of using dev_forward_skb() ?
 > >  >
 > >  > It looks like this was added in commit
 > >  > 38d408152a86598a50680a82fe3353b506630409
 > >  > ("veth: Allow setting the L3 MTU")
 > >  >
 > >  > But what was really needed here was a way to change MRU :(
 > >
 > > Ok, do we consider this correct and/or something we need to be
 > > backwards compatible with ? Is it insane to believe that we can fix
 > > this "inconsistency" by removing the check ?
 > >
 > > The commit message reads "For consistency I drop packets on the
 > > receive side when they are larger than the MTU", do we know what it's
 > > supposed to be consistent with or is that lost in history ?
 >
 > There is no consistency among existing Ethernet drivers.
 >
 > Many ethernet drivers size the buffers they post in RX ring buffer
 > according to MTU.
 >
 > If MTU is set to 1500, RX buffers are sized to be about 1536 bytes,
 > so you won't be able to receive a 1700 bytes frame.
 >
 > I guess that you could add a specific veth attribute to precisely
 > control MRU, that would not break existing applications.

Ok, I will propose a patch shortly. And thanks, your response time is
awesome !


But why do you want to configure MRU?
What is the problem with setting MTU instead?

Toshiaki Makita
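
The receive-side drop under discussion can be modelled roughly as a length check against the receiver's MTU. The sketch below assumes the limit is MTU plus the Ethernet header plus one VLAN tag, which is how I understand the check reached through dev_forward_skb(); it is a simplified model that ignores GSO and device state, and frame_fits_mtu() is a made-up name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define ETH_HLEN  14
#define VLAN_HLEN 4

/*
 * Simplified model of a receive-side length check: a frame is accepted
 * when it fits in the receiver's MTU plus the link-layer header and one
 * VLAN tag. GSO packets and device state are deliberately ignored.
 */
bool frame_fits_mtu(size_t frame_len, unsigned int rx_mtu)
{
	assert(rx_mtu > 0);
	return frame_len <= (size_t)rx_mtu + ETH_HLEN + VLAN_HLEN;
}
```

Under this model a 1700-byte frame is dropped by a receiver with MTU 1500 and accepted once the receiver's MTU is raised, which is the behaviour that setting the MTU on the receiving side would give.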


Re: DSA vs envelope frames

2016-12-01 Thread Toshiaki Makita
On 2016/11/30 23:58, Nikita Yushchenko wrote:
>>> (1) When DSA is in use, frames processed by FEC chip contain DSA tag and
>>> thus can be larger than hardcoded limit of 1522. This issue is not
>>> FEC-specific, any driver that hardcodes maximum frame size to 1522 (many
>>> do) will have this issue if used with DSA.
>>
>> BTW I'm trying to introduce envelope frames to solve this kind of problem.
>> http://marc.info/?t=14749669155=1=2
>> http://marc.info/?t=14749669153=1=2
>> http://marc.info/?t=14749669152=1=2
>> http://marc.info/?t=14749669154=1=2
>> http://marc.info/?t=14749669151=1=2
>>
>> It needs jumbo frame support of NICs though.
> 
> Thanks for pointing to this.
> 
> Indeed frame with DSA tag conceptually is an envelope frame.
> 
> ndev->env_hdr_len introduced by your patches, actually is explicitly
> handled difference between (MTU + 18) and frame that HW should allow.
> If this is known, hardware can be configured to work with DSA. At least
> FEC hardware that can send and receive "slightly larger" frames after
> simple register configuration.
> 
> Furthermore, since DSA configuration is known statically (it comes from
> device tree), ndo_set_env_hdr_len method could be automatically called
> at init, making setup working by default if driver supports that. And if
> not, perhaps can automatically lower MTU.
> 
> Looks like a solution :)
> 
> What's current status of this work?

Thank you for taking a look.
I'm planning to post v2 soon.

> What is not really clear - what if several tagging protocols are used
> together. AFAIU, things may be more complex than simple appending of
> tags, e.g. EDSA tag can carry VLAN id inside.

If the kernel is aware of the VLAN configuration, add 4 bytes + DSA tag size.
(I'm not familiar with how dsa learns the vlan configuration, but probably
through switchdev_port_obj_add()? If so, dsa should be able to take the
additional vlan tag size into account.)

If the vlan tag is opaque to the kernel, e.g. when forwarding vlan tagged frames
without configuring vlan_filtering in the bridge, the admin needs to set
env_hdr_len manually. This is why I'm proposing manual operation.

Regards,
Toshiaki Makita
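
The arithmetic behind env_hdr_len can be written down as a small sketch. It assumes the usual Ethernet constants (14-byte header, 4-byte FCS, 4-byte VLAN tag); the helper names are hypothetical and the switch tag length is a parameter, since it depends on the tagging protocol in use.

```c
#include <assert.h>

#define ETH_HLEN    14	/* dst + src + ethertype */
#define ETH_FCS_LEN 4
#define VLAN_HLEN   4

/*
 * Largest frame on the wire for a given MTU when env_hdr_len extra bytes
 * of encapsulation (VLAN tags, switch tags, ...) precede the payload.
 */
unsigned int max_frame_len(unsigned int mtu, unsigned int env_hdr_len)
{
	assert(mtu > 0);
	return mtu + ETH_HLEN + ETH_FCS_LEN + env_hdr_len;
}

/* env_hdr_len for nr_vlan_tags VLAN tags plus a switch tag of tag_len bytes. */
unsigned int env_hdr_len_for(unsigned int nr_vlan_tags, unsigned int tag_len)
{
	return nr_vlan_tags * VLAN_HLEN + tag_len;
}
```

With MTU 1500 this gives 1518 without any tag and 1522 with one VLAN tag, the limit many drivers hardcode; any switch tag on top of that pushes the frame past 1522, for example 1526 for an illustrative 4-byte tag.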




Re: DSA vs envelope frames

2016-12-01 Thread Toshiaki Makita
On 2016/11/30 23:58, Nikita Yushchenko wrote:
>>> (1) When DSA is in use, frames processed by FEC chip contain DSA tag and
>>> thus can be larger than hardcoded limit of 1522. This issue is not
>>> FEC-specific, any driver that hardcodes maximum frame size to 1522 (many
>>> do) will have this issue if used with DSA.
>>
>> BTW I'm trying to introduce envelope frames to solve this kind of problems.
>> http://marc.info/?t=14749669155=1=2
>> http://marc.info/?t=14749669153=1=2
>> http://marc.info/?t=14749669152=1=2
>> http://marc.info/?t=14749669154=1=2
>> http://marc.info/?t=14749669151=1=2
>>
>> It needs jumbo frame support of NICs though.
> 
> Thanks for pointing to this.
> 
> Indeed frame with DSA tag conceptually is an envelope frame.
> 
> ndev->env_hdr_len introduced by your patches, actually is explicitly
> handled difference between (MTU + 18) and frame that HW should allow.
> If this is known, hardware can be configured to work with DSA. At least
> FEC hardware that can send and receive "slightly larger" frames after
> simple register configuration.
> 
> Furthermore, since DSA configuration is known statically (it comes from
> device tree), ndo_set_env_hdr_len method could be automatically called
> at init, making setup working by default if driver supports that. And if
> not, perhaps can automatically lower MTU.
> 
> Looks like a solution :)
> 
> What's current status of this work?

Thank you for taking a look.
I'm planning to post v2 soon.

> What is not really clear - what if several tagging protocols are used
> together. AFAIU, things may be more complex that simple appending of
> tags, e.g. EDSA tag can carry VLAN id inside.

If kernel is aware of VLAN configuration, add 4 bytes + DSA tag size.
(I'm not familiar with how dsa knows vlan configuration, but probably
through switchdev_port_obj_add()? If so, dsa should be able to take into
account additional vlan tag size.)

If vlan tag is opaque from kernel, e.g. forwarding vlan tagged frames
without configuring vlan_filtering in bridge, admin needs to set
env_hdr_len manually. This is why I'm proposing manual operation.

Regards,
Toshiaki Makita
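
As a rough illustration of the sizing discussed in this message, the sketch below computes the extra header room (what the patches call env_hdr_len) from a VLAN flag and a switch tag length, and derives the MTU that still fits when the hardware limit cannot be raised. The function names and the 4-byte switch tag are assumptions made for the example; they are not taken from the patch series.

```c
#include <assert.h>

#define ETH_OVERHEAD	18	/* 14-byte Ethernet header + 4-byte FCS */
#define VLAN_TAG_LEN	4	/* one 802.1Q tag */

/* Extra room needed on top of (MTU + 18): optional VLAN tag plus switch tag. */
unsigned int env_hdr_len_needed(int vlan_aware, unsigned int switch_tag_len)
{
	return (vlan_aware ? VLAN_TAG_LEN : 0) + switch_tag_len;
}

/*
 * Largest MTU that fits when the hardware accepts frames of at most
 * hw_max_frame bytes and env_hdr_len bytes are reserved for tags.
 * Returns 0 if the limit cannot even hold the headers.
 */
unsigned int usable_mtu(unsigned int hw_max_frame, unsigned int env_hdr_len)
{
	unsigned int overhead = ETH_OVERHEAD + env_hdr_len;

	return hw_max_frame > overhead ? hw_max_frame - overhead : 0;
}

/* Worked numbers; the 4-byte switch tag is an assumed example value. */
void env_hdr_len_examples(void)
{
	assert(env_hdr_len_needed(1, 4) == 8);
	assert(usable_mtu(1522, 8) == 1496);	/* MTU has to drop below 1500 */
	assert(usable_mtu(1530, 8) == 1504);	/* a raised limit keeps MTU 1500 usable */
}
```

With a fixed 1522-byte limit the only option left is the lower MTU, which matches the "automatically lower MTU" fallback mentioned in the quoted text.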




Re: [patch net / RFC] net: fec: increase frame size limitation to actually available buffer

2016-11-30 Thread Toshiaki Makita
On 2016/11/30 15:36, Nikita Yushchenko wrote:
>> But I think it is not necessary since the driver don't support jumbo frame.
> 
> Hardcoded 1522 raises two separate issues.
> 
> (1) When DSA is in use, frames processed by FEC chip contain DSA tag and
> thus can be larger than hardcoded limit of 1522. This issue is not
> FEC-specific, any driver that hardcodes maximum frame size to 1522 (many
> do) will have this issue if used with DSA.
> 
> Clean solution for this must take into account that difference between
> MTU and max frame size is no longer known at compile time. Actually this
> is the case even without DSA, due to VLANs: max frame size is (MTU + 18)
> without VLANs, but (MTU + 22) with VLANs. However currently drivers tend
> to ignore this and hardcode 22.  With DSA, 22 is not enough, need to add
> switch-specific tag size to that.
> 
> Not yet sure how to handle this. DSA-specific API to find out tag size
> could be added, but generic solution should handle all cases of dynamic
> difference between MTU and max frame size, not only DSA.

BTW I'm trying to introduce envelope frames to solve this kind of problems.
http://marc.info/?t=14749669155=1=2
http://marc.info/?t=14749669153=1=2
http://marc.info/?t=14749669152=1=2
http://marc.info/?t=14749669154=1=2
http://marc.info/?t=14749669151=1=2

It needs jumbo frame support of NICs though.

Regards,
Toshiaki Makita
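
The frame-size arithmetic in the quoted text can be sketched as follows. This is only a model of the numbers in the thread (MTU + 18, MTU + 22, plus a switch tag); the function names and the 4-byte tag used in the checks are assumptions, not part of any driver.

```c
#include <assert.h>
#include <stdbool.h>

#define ETH_HDR_LEN	14	/* destination MAC + source MAC + EtherType */
#define ETH_FCS_LEN	4	/* frame check sequence */
#define VLAN_TAG_LEN	4	/* one 802.1Q tag */

/*
 * Largest frame on the wire for a given MTU. switch_tag_len is the size
 * of any extra switch tag (0 when none is used); it is an input here,
 * not something this sketch can discover.
 */
unsigned int max_frame_size(unsigned int mtu, bool vlan,
			    unsigned int switch_tag_len)
{
	unsigned int len = mtu + ETH_HDR_LEN + ETH_FCS_LEN;	/* MTU + 18 */

	if (vlan)
		len += VLAN_TAG_LEN;				/* MTU + 22 */
	return len + switch_tag_len;
}

/* True if a driver with a fixed frame limit would accept the frame. */
bool fits_hw_limit(unsigned int frame_len, unsigned int hw_limit)
{
	return frame_len <= hw_limit;
}

/* Worked numbers for MTU 1500; the 4-byte tag is an assumed example. */
void frame_size_examples(void)
{
	assert(max_frame_size(1500, false, 0) == 1518);
	assert(max_frame_size(1500, true, 0) == 1522);
	assert(!fits_hw_limit(max_frame_size(1500, true, 4), 1522));
}
```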




Re: [PATCH] bridge: missing null bridge device check causing null pointer dereference (bugfix)

2014-11-06 Thread Toshiaki Makita
On 2014/11/06 16:58, 박수현 wrote:
>> -----Original Message-----
>> From: Toshiaki Makita [mailto:makita.toshi...@lab.ntt.co.jp]
>> Sent: Thursday, November 06, 2014 4:07 PM
>> To: 박수현; Stephen Hemminger; David S. Miller
>> Cc: bri...@lists.linux-foundation.org; net...@vger.kernel.org; linux-
>> ker...@vger.kernel.org
>> Subject: Re: [PATCH] bridge: missing null bridge device check causing null
>> pointer dereference (bugfix)
>>
>> On 2014/11/06 15:26, Su-Hyun Park wrote:
>>> the bridge device can be null if the bridge is being deleted while
>>> processing the packet, which causes the null pointer dereference in
>> switch statement.
>>
>> How can this happen??
>> It is guarded by rcu.
>> netdev_rx_handler_unregister() ensures rx_handler_data is non NULL.
>>
> 
> The RCU protect rx_handler_data, not the bridge member port. It can be NULL 
> according to below code.
> 
> static inline struct net_bridge_port *br_port_get_rcu(const struct net_device 
> *dev) {
>   struct net_bridge_port *port = rcu_dereference(dev->rx_handler_data);
>   return br_port_exists(dev) ? port : NULL; 
> }

This seems to have been fixed a year ago.
716ec052d228 ("bridge: fix NULL pointer deref of br_port_get_rcu")

Thanks,
Toshiaki Makita
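
To make the quoted argument concrete, here is a small user-space model of a lookup shaped like the quoted br_port_get_rcu(): it returns NULL whenever the "is a bridge port" flag is clear, even if the handler data is still set. The struct and function names are simplified stand-ins invented for this sketch, and it says nothing about what the referenced commit actually changed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct port_model {
	int forwarding;
};

struct dev_model {
	bool is_bridge_port;			/* models br_port_exists(dev) */
	struct port_model *rx_handler_data;	/* still set during teardown */
};

/* Same shape as the quoted helper: NULL unless the flag is set. */
struct port_model *port_get(const struct dev_model *dev)
{
	assert(dev != NULL);
	return dev->is_bridge_port ? dev->rx_handler_data : NULL;
}

enum rx_result { RX_DROP, RX_PASS };

/* A caller that checks the lookup result before dereferencing it. */
enum rx_result handle_frame(const struct dev_model *dev)
{
	struct port_model *p = port_get(dev);

	if (!p)
		return RX_DROP;
	return p->forwarding ? RX_PASS : RX_DROP;
}
```

Clearing the flag while rx_handler_data is still set makes port_get() return NULL, which is the window the quoted text describes.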

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] bridge: missing null bridge device check causing null pointer dereference (bugfix)

2014-11-05 Thread Toshiaki Makita
On 2014/11/06 15:26, Su-Hyun Park wrote:
> the bridge device can be null if the bridge is being deleted while processing 
> the packet, which causes the null pointer dereference in switch statement.

How can this happen??
It is guarded by rcu.
netdev_rx_handler_unregister() ensures rx_handler_data is non NULL.

Thanks,
Toshiaki Makita

> 
> crash dump snippet:
> 
> <1>BUG: unable to handle kernel NULL pointer dereference at 0021
> <1>IP: [] br_handle_frame+0xe6/0x270
> 
> <0>Code: 4c 0f 44 f0 89 f8 66 33 15 32 52 24 00 66 33 05 29 52 24 00 09 c2 89 
> f0 66 33 05 22 52 24 00 80 e4 f0 66 09 c2 0f 84 eb 00 00 00 <41> 0f b6 46 21 
> 3c 02 74 61 3c 03 74 1d 48 89 df e8 d5 bc f0 ff
> ---
>  net/bridge/br_input.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
> index 6fd5522..7e899ca 100644
> --- a/net/bridge/br_input.c
> +++ b/net/bridge/br_input.c
> @@ -176,6 +176,8 @@ rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
>   return RX_HANDLER_CONSUMED;
>  
>   p = br_port_get_rcu(skb->dev);
> + if (!p)
> + goto drop;
>  
>   if (unlikely(is_link_local_ether_addr(dest))) {
>   u16 fwd_mask = p->br->group_fwd_mask_required;
> 
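
A note on reading the crash dump: the marked instruction bytes (41 0f b6 46 21) appear to decode as a one-byte load at displacement 0x21 from a base register, which is consistent with the reported fault address ending in 21 if that base pointer was NULL. The sketch below shows the same arithmetic with offsetof(). The struct is hypothetical and only arranged so that a member named state sits at offset 0x21; the real struct net_bridge_port layout is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 0x21 bytes of earlier members, then 'state'. */
struct port_model {
	unsigned char earlier_fields[0x21];
	unsigned char state;
};

/*
 * Address the CPU would touch when reading p->state, computed from the
 * base address alone, without dereferencing anything.
 */
uintptr_t state_field_address(uintptr_t base)
{
	assert(offsetof(struct port_model, state) == 0x21);
	return base + offsetof(struct port_model, state);
}
```

For a NULL base the computed address is 0x21, so a small non-zero fault address usually points at a member access through a NULL pointer rather than at a wild pointer.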



Re: [PATCH net] bridge: notify user space after fdb update

2014-05-29 Thread Toshiaki Makita
(2014/05/29 16:27), Jon Maxwell wrote:
> There have been a number of incidents recently where customers running KVM
> have reported that VM hosts on different hypervisors are unreachable. Based
> on pcap traces we found that the bridge was broadcasting the ARP request out
> onto the network. However, some NICs have an inbuilt switch which on occasion
> broadcast the VM's ARP request back through the physical NIC on the
> hypervisor. This resulted in the bridge changing ports and incorrectly
> learning that the VM's MAC address was external. As a result the ARP reply
> was directed back onto the external network and the VM never updated its ARP
> cache. This patch notifies the bridge command after an fdb entry has been
> updated, so that such port toggling can be identified.
> 
> Signed-off-by: Jon Maxwell <jmaxwel...@gmail.com>

Acked-by: Toshiaki Makita <makita.toshi...@lab.ntt.co.jp>


Re: [PATCH net] bridge: notify user space of fdb port change

2014-05-23 Thread Toshiaki Makita
(2014/05/23 13:59), Jon Maxwell wrote:
...
> Makita-san,
> 
> I recoded this using your idea and ran it through a reproducer.
> It works fine. After some more consideration I agree that
> setting fdb->dst = source is only required when source != fdb->dst.
> 
> Thanks for your suggestions. This is the revised patch. It should 
> retain the original behaviour except for the notify after the fdb update.  
> 
> Please let me know if you have any further input?

I have no more comments except for style problems (bracket position,
indentation, type mismatch).
thank you for rewriting :)

Thanks,
Toshiaki Makita

> 
> $ diff -Naur br_fdb.c br_fdb.c.patch
> --- br_fdb.c	2014-05-17 12:43:23.346319609 +1000
> +++ br_fdb.c.patch	2014-05-17 16:54:46.280235728 +1000
> @@ -487,6 +487,7 @@
>  {
>  struct hlist_head *head = &br->hash[br_mac_hash(addr, vid)];
>  struct net_bridge_fdb_entry *fdb;
> +bool fdb_modified = 0;
>  
>  /* some users want to always flood. */
>  if (hold_time(br) == 0)
> @@ -507,10 +508,16 @@
>  source->dev->name);
>  } else {
>  /* fastpath: update of existing entry */
> -fdb->dst = source;
> +if (unlikely(source != fdb->dst))
> +{
> +fdb->dst = source;
> +fdb_modified = 1;
> +}
>  fdb->updated = jiffies;
>  if (unlikely(added_by_user))
>  fdb->added_by_user = 1;
> +if (unlikely(fdb_modified))
> +fdb_notify(br, fdb, RTM_NEWNEIGH);
>  }
>  } else {
>  spin_lock(&br->hash_lock);
> 
> 


Re: [PATCH net] bridge: notify user space of fdb port change

2014-05-13 Thread Toshiaki Makita
(2014/05/13 16:55), Jon Maxwell wrote:
> From: Jon Maxwell <jmaxwel...@gmail.com>
> 
> There have been a number of incidents recently where customers running KVM
> have reported that VM hosts on different hypervisors are unreachable. Based
> on pcap traces we found that the bridge was broadcasting the ARP request out
> onto the network. However, some NICs have an inbuilt switch which on occasion
> broadcast the VM's ARP request back through the physical NIC on the
> hypervisor. This resulted in the bridge changing ports and incorrectly
> learning that the VM's MAC address was external. As a result the ARP reply
> was directed back onto the external network and the VM never updated its ARP
> cache. This patch notifies the bridge command so that such port toggling can
> be identified.
> 
> Signed-off-by: Jon Maxwell <jmaxwel...@gmail.com>
> ---
>  net/bridge/br_fdb.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index 9203d5a..37742e2 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -507,6 +507,8 @@ void br_fdb_update(struct net_bridge *br, struct 
> net_bridge_port *source,
>   source->dev->name);
>   } else {
>   /* fastpath: update of existing entry */
> + if (source->port_no != fdb->dst->port_no)

It seems that we don't need to fetch port_no and it is enough to compare
source and fdb->dst.

> + fdb_notify(br, fdb, RTM_NEWNEIGH);
>   fdb->dst = source;
>   fdb->updated = jiffies;
>   if (unlikely(added_by_user))
> 

This notifies the fdb entry before the existing entry is updated. Is this on
purpose? I think we should notify the updated fdb entry.
The similar code in fdb_add_entry() notifies after updating it.

Also, isn't it better to move the update of dst into the "if" block?

if (source != fdb->dst) {
	fdb->dst = source;
	modified = true;
}
...
if (modified) ...

Thanks,
Toshiaki Makita
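
The ordering suggested above can be modelled in user space as follows: the entry is updated first, and the notification fires only when the port actually changed, so the listener sees the new port. All type and function names here are invented for this sketch; the real code calls fdb_notify() with kernel structures.

```c
#include <assert.h>
#include <stdbool.h>

struct port_model {
	int port_no;
};

struct fdb_model {
	const struct port_model *dst;	/* port the address was last seen on */
	unsigned long updated;		/* time of the last refresh */
};

/* Records how many notifications were sent and what the last one saw. */
struct notify_log {
	int count;
	const struct port_model *last_dst;
};

static void notify(struct notify_log *log, const struct fdb_model *fdb)
{
	log->count++;
	log->last_dst = fdb->dst;
}

/*
 * Refresh an existing entry. Notify only if the port changed, and only
 * after the entry already holds the new port.
 */
void fdb_update(struct fdb_model *fdb, const struct port_model *source,
		unsigned long now, struct notify_log *log)
{
	bool modified = false;

	assert(fdb && source && log);
	if (source != fdb->dst) {
		fdb->dst = source;
		modified = true;
	}
	fdb->updated = now;
	if (modified)
		notify(log, fdb);
}
```

A refresh from the same port only bumps the timestamp; a refresh from a different port produces exactly one notification carrying the new port.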


Re: [Bridge] [PATCH 1/3] bridge: preserve random init MAC address

2014-03-19 Thread Toshiaki Makita
On Tue, 2014-03-18 at 18:10 -0700, Luis R. Rodriguez wrote:
> On Tue, Mar 18, 2014 at 6:04 PM, Toshiaki Makita
> <makita.toshi...@lab.ntt.co.jp> wrote:
> > (2014/03/19 9:50), Luis R. Rodriguez wrote:
> >> On Tue, Mar 18, 2014 at 5:42 PM, Toshiaki Makita
> >> <makita.toshi...@lab.ntt.co.jp> wrote:
> >>> nit,
> >>> If the last detached port happens to have the same addr as
> >>> random_init_addr, this seems to call br_stp_change_bridge_id() even
> >>> though bridge_id is not changed.
> >>
> >> Ah good point.
> >>
> >>> Shouldn't the assignment of random_init_addr be done before the check of
> >>> "no change"?
> >>
> >> Good question, should we even allow two ports to have the same MAC
> >> address or should we complain and refuse to add it? If so that should
> >> mean we should also have to monitor any manual address changes or
> >> events for address changes on the ports.
> >
> > This was recently discussed by Stephen and me.
> > I'm thinking it should be allowed.
> >
> > http://marc.info/?l=linux-netdev&m=139182743919257&w=2
> 
> Great now that that's sorted out though I still think calling
> br_stp_change_bridge_id() is right just as calling the update features
> as the device is different. It could however be confusing when this
> situation is run and folks might report odd bugs unless we could tell
> them apart clearly. Thoughts?

br_stp_change_bridge_id() is currently called only if bridge_id.addr
should be changed.
If the addr should not be changed but some updates are needed,
br_stp_recalculate_bridge_id() doesn't seem to fit into it.

Toshiaki Makita



Re: [PATCH 1/3] bridge: preserve random init MAC address

2014-03-18 Thread Toshiaki Makita
(2014/03/19 9:50), Luis R. Rodriguez wrote:
> On Tue, Mar 18, 2014 at 5:42 PM, Toshiaki Makita
> <makita.toshi...@lab.ntt.co.jp> wrote:
>> nit,
>> If the last detached port happens to have the same addr as
>> random_init_addr, this seems to call br_stp_change_bridge_id() even
>> though bridge_id is not changed.
> 
> Ah good point.
> 
>> Shouldn't the assignment of random_init_addr be done before the check of
>> "no change"?
> 
> Good question, should we even allow two ports to have the same MAC
> address or should we complain and refuse to add it? If so that should
> mean we should also have to monitor any manual address changes or
> events for address changes on the ports.

This was recently discussed by Stephen and me.
I'm thinking it should be allowed.

http://marc.info/?l=linux-netdev&m=139182743919257&w=2

Toshiaki Makita

> 
> Stephen?
> 
>   Luis


Re: [PATCH 1/3] bridge: preserve random init MAC address

2014-03-18 Thread Toshiaki Makita
(2014/03/13 12:15), Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcg...@suse.com>
> 
> As it is now, if you create a bridge it gets started
> with a random MAC address, and if you then add a net_device
> as a slave but later kick it out you end up with a zero
> MAC address. Instead preserve the original random MAC
> address and use it.
> 
> If you manually set the bridge address that will always
> be respected. This change only takes effect if at the time
> of computing the new root port we determine we have found
> no candidates.
> 
> Cc: Stephen Hemminger <step...@networkplumber.org>
> Cc: bri...@lists.linux-foundation.org
> Cc: net...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: xen-de...@lists.xenproject.org
> Cc: k...@vger.kernel.org
> Signed-off-by: Luis R. Rodriguez <mcg...@suse.com>
> ---
>  net/bridge/br_device.c  | 1 +
>  net/bridge/br_private.h | 1 +
>  net/bridge/br_stp_if.c  | 3 +++
>  3 files changed, 5 insertions(+)
> 
> diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
> index b063050..5f13eac 100644
> --- a/net/bridge/br_device.c
> +++ b/net/bridge/br_device.c
> @@ -368,6 +368,7 @@ void br_dev_setup(struct net_device *dev)
>   br->bridge_id.prio[1] = 0x00;
>  
>   ether_addr_copy(br->group_addr, eth_reserved_addr_base);
> + ether_addr_copy(br->random_init_addr, dev->dev_addr);
>  
>   br->stp_enabled = BR_NO_STP;
>   br->group_fwd_mask = BR_GROUPFWD_DEFAULT;
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index e1ca1dc..32a06da 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -240,6 +240,7 @@ struct net_bridge
>   unsigned long   bridge_hello_time;
>   unsigned long   bridge_forward_delay;
>  
> + u8  random_init_addr[ETH_ALEN];
>   u8  group_addr[ETH_ALEN];
>   u16 root_port;
>  
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 189ba1e..4c9ad45 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -239,6 +239,9 @@ bool br_stp_recalculate_bridge_id(struct net_bridge *br)
>   if (ether_addr_equal(br->bridge_id.addr, addr))
>   return false;   /* no change */
>  
> + if (ether_addr_equal(addr, br_mac_zero))
> + addr = br->random_init_addr;
> +
>   br_stp_change_bridge_id(br, addr);
>   return true;
>  }

nit,
If the last detached port happens to have the same addr as
random_init_addr, this seems to call br_stp_change_bridge_id() even
though bridge_id is not changed.

Shouldn't the assignment of random_init_addr be done before the check of
"no change"?

Toshiaki Makita
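
The reordering asked about above can be sketched in user space like this: the fallback to the saved random address is applied before the "no change" comparison, so an unchanged bridge ID is detected in the fallback case too. The struct, the flat address array and the change counter are assumptions made for this model; only the order of the two checks reflects the suggestion.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define ETH_ALEN 6

struct bridge_model {
	unsigned char bridge_id_addr[ETH_ALEN];
	unsigned char random_init_addr[ETH_ALEN];
	int change_calls;	/* how often the ID was rewritten */
};

static const unsigned char mac_zero[ETH_ALEN] = { 0 };

/*
 * port_addrs holds n_ports addresses of ETH_ALEN bytes each, back to
 * back (it may be NULL when n_ports is 0). Picks the smallest port
 * address, falls back to the saved random address when there is none,
 * and only then checks for "no change".
 */
bool recalculate_bridge_id(struct bridge_model *br,
			   const unsigned char *port_addrs, int n_ports)
{
	const unsigned char *addr = mac_zero;
	int i;

	assert(br != NULL);
	for (i = 0; i < n_ports; i++) {
		const unsigned char *cand = port_addrs + i * ETH_ALEN;

		if (addr == mac_zero || memcmp(cand, addr, ETH_ALEN) < 0)
			addr = cand;
	}

	/* Fallback first ... */
	if (memcmp(addr, mac_zero, ETH_ALEN) == 0)
		addr = br->random_init_addr;

	/* ... so the comparison also covers the fallback address. */
	if (memcmp(br->bridge_id_addr, addr, ETH_ALEN) == 0)
		return false;	/* no change */

	memcpy(br->bridge_id_addr, addr, ETH_ALEN);
	br->change_calls++;
	return true;
}
```

If the bridge ID already equals random_init_addr and the last port is removed, this version returns false without rewriting the ID.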


Re: [PATCH 12/21] bridge: slight optimization of addr compare

2013-12-23 Thread Toshiaki Makita
On Mon, 2013-12-23 at 13:10 +0800, Ding Tianhong wrote:
> Use the recently added and possibly more efficient
> ether_addr_equal_unaligned instead of memcmp.
> 
> Cc: Stephen Hemminger <step...@networkplumber.org>
> Cc: David Miller <da...@davemloft.net>
> Cc: bri...@lists.linux-foundation.org
> Cc: net...@vger.kernel.org
> Signed-off-by: Wang Weidong <wangweido...@huawei.com>
> Signed-off-by: Ding Tianhong <dingtianh...@huawei.com>
> ---
>  net/bridge/br_stp_if.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 656a6f3..04217d1 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -229,7 +229,7 @@ bool br_stp_recalculate_bridge_id(struct net_bridge *br)
>  
>   list_for_each_entry(p, &br->port_list, list) {
>   if (addr == br_mac_zero ||
> - memcmp(p->dev->dev_addr, addr, ETH_ALEN) < 0)
> + !ether_addr_equal_unaligned(p->dev->dev_addr, addr) < 0)
>   addr = p->dev->dev_addr;
>  
>   }

We cannot make this change.
!ether_addr_equal() isn't equivalent to memcmp().
memcmp() can return a negative value but ether_addr_equal() cannot.
br_stp_recalculate_bridge_id() is searching for the smallest address among
its ports. This change breaks it.
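
To illustrate the difference, here is a minimal userspace sketch (not the
kernel code). addr_equal() is a hypothetical stand-in for ether_addr_equal(),
the port table is a plain array instead of the bridge's port list, and NULL
takes the place of br_mac_zero as the "nothing chosen yet" marker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for the kernel helper: reports equality only. */
static bool addr_equal(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, ETH_ALEN) == 0;
}

/* Original logic: memcmp() orders the addresses, so the smallest wins. */
static const unsigned char *
smallest_by_memcmp(const unsigned char (*ports)[ETH_ALEN], size_t n)
{
	const unsigned char *addr = NULL; /* assumption: NULL replaces br_mac_zero */
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		if (addr == NULL || memcmp(ports[i], addr, ETH_ALEN) < 0)
			addr = ports[i];
	return addr;
}

/* Patched logic: the negated boolean is 0 or 1, so "< 0" is never true. */
static const unsigned char *
smallest_by_equal(const unsigned char (*ports)[ETH_ALEN], size_t n)
{
	const unsigned char *addr = NULL; /* assumption: NULL replaces br_mac_zero */
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		/* Mirrors "!ether_addr_equal_unaligned(...) < 0" from the patch. */
		int differs = !addr_equal(ports[i], addr ? addr : ports[i]);

		if (addr == NULL || differs < 0)
			addr = ports[i];
	}
	return addr;
}
```

Given three ports whose last bytes are 0x30, 0x10 and 0x20 (in that order),
smallest_by_memcmp() returns the 0x10 entry, while smallest_by_equal() never
moves past the first port and returns the 0x30 entry.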

Thanks,
Toshiaki Makita


