WARNING in ip_rt_bug
Hello,

syzbot hit the following crash on net-next commit
8bde261e535257e81087d39ff808414e2f5aa39d (Sun Apr 1 02:31:43 2018 +)
Merge tag 'mlx5-updates-2018-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=b09ac67a2af842b12eab

Unfortunately, I don't have any reproducer for this crash yet.
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5991727739437056
Kernel config: https://syzkaller.appspot.com/x/.config?id=3327544840960562528
compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+b09ac67a2af842b12...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.
If you forward the report, please keep this part and the footer.

netlink: 'syz-executor6': attribute type 3 has an invalid length.
WARNING: CPU: 0 PID: 11678 at net/ipv4/route.c:1213 ip_rt_bug+0x15/0x20 net/ipv4/route.c:1212
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 11678 Comm: kworker/u4:7 Not tainted 4.16.0-rc6+ #289
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x194/0x24d lib/dump_stack.c:53
 panic+0x1e4/0x41c kernel/panic.c:183
 __warn+0x1dc/0x200 kernel/panic.c:547
 report_bug+0x1f4/0x2b0 lib/bug.c:186
 fixup_bug.part.10+0x37/0x80 arch/x86/kernel/traps.c:178
 fixup_bug arch/x86/kernel/traps.c:247 [inline]
 do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
 invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
RIP: 0010:ip_rt_bug+0x15/0x20 net/ipv4/route.c:1212
RSP: 0018:8801db007290 EFLAGS: 00010282
RAX: dc00 RBX: 8801d8dda3c0 RCX: 856c31ca
RDX: 0100 RSI: 8858c300 RDI: 0282
RBP: 8801db007298 R08: 11003b600de1 R09:
R10: R11: R12: 8801d8dda3c0
R13: 88019bdb2200 R14: 88019bdeed80 R15: 8801d8dda418
 dst_output include/net/dst.h:444 [inline]
 ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
 ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1414
 ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1434
 icmp_push_reply+0x395/0x4f0 net/ipv4/icmp.c:394
 icmp_send+0x1136/0x19b0 net/ipv4/icmp.c:741
 ipv4_link_failure+0x2a/0x1b0 net/ipv4/route.c:1200
 dst_link_failure include/net/dst.h:427 [inline]
 arp_error_report+0xae/0x180 net/ipv4/arp.c:297
 neigh_invalidate+0x225/0x530 net/core/neighbour.c:883
 neigh_timer_handler+0x897/0xd60 net/core/neighbour.c:969
 call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
 expire_timers kernel/time/timer.c:1363 [inline]
 __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
 __do_softirq+0x2d7/0xb85 kernel/softirq.c:285
 invoke_softirq kernel/softirq.c:365 [inline]
 irq_exit+0x1cc/0x200 kernel/softirq.c:405
 exiting_irq arch/x86/include/asm/apic.h:541 [inline]
 smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:857
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:778 [inline]
RIP: 0010:lock_acquire+0x256/0x580 kernel/locking/lockdep.c:3923
RSP: 0018:880197b3f980 EFLAGS: 0282 ORIG_RAX: ff12
RAX: dc00 RBX: 8801d225e400 RCX:
RDX: 110a24e5 RSI: b98b8227 RDI: 0282
RBP: 880197b3fa78 R08: 110032f67e93 R09: 0004
R10: 880197b3f960 R11: 0003 R12: 110032f67f36
R13: R14: R15: 0001
 down_write_killable+0x8a/0x140 kernel/locking/rwsem.c:84
 __bprm_mm_init fs/exec.c:297 [inline]
 bprm_mm_init fs/exec.c:414 [inline]
 do_execveat_common.isra.30+0xc8e/0x23c0 fs/exec.c:1771
 do_execve+0x31/0x40 fs/exec.c:1847
 call_usermodehelper_exec_async+0x457/0x8f0 kernel/umh.c:100
 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
Dumping ftrace buffer:
   (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..

---
This bug is generated by a dumb bot. It may contain errors.
See https://goo.gl/tpsmEJ for details.
Direct all questions to syzkal...@googlegroups.com.

syzbot will keep track of this bug report.
If you forgot to add the Reported-by tag, once the fix for this bug is merged
into any tree, please reply to this email with:
#syz fix: exact-commit-title
To mark this as a duplicate of another syzbot report, please reply with:
#syz dup: exact-subject-of-another-report
If it's a one-off invalid bug report, please reply with:
#syz invalid
Note: if the crash happens again, it will cause creation of a new bug report.
Note: all commands must start from beginning of the line in the email body.
Re: KASAN: slab-out-of-bounds Read in pfkey_add
On Sun, Apr 08, 2018 at 09:04:33PM -0700, Eric Biggers wrote:
...
>
> Looks like this is going to be fixed by
> https://patchwork.kernel.org/patch/10327883/ ("af_key: Always verify length of
> provided sadb_key"), but it's not applied to the ipsec tree yet. Kevin, for
> future reference, for syzbot bugs it would be helpful to reply to the original
> bug report and say that a patch was sent out, or even better send the patch as a
> reply to the bug report email, e.g.
>
>	git format-patch --in-reply-to="<001a114292fadd3e250560706...@google.com>"
>
> for this one (and the Message ID can be found in the syzkaller-bugs archive even
> if the email isn't in your inbox).

Sure, I can do that.

- Kevin
Re: [PATCH] vhost-net: set packet weight of tx polling to 2 * vq size
On Mon, Apr 09, 2018 at 04:09:20AM +, haibinzhang(张海斌) wrote:
>
> > On Fri, Apr 06, 2018 at 08:22:37AM +, haibinzhang(张海斌) wrote:
> > > handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
> > > polling udp packets with small length(e.g. 1byte udp payload), because setting
> > > VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet length.
> > >
> > > Ping-Latencies shown below were tested between two Virtual Machines using
> > > netperf (UDP_STREAM, len=1), and then another machine pinged the client:
> > >
> > > Packet-Weight   Ping-Latencies(millisecond)
> > >                    min      avg      max
> > > Origin           3.319   18.489   57.303
> > > 64               1.643    2.021    2.552
> > > 128              1.825    2.600    3.224
> > > 256              1.997    2.710    4.295
> > > 512              1.860    3.171    4.631
> > > 1024             2.002    4.173    9.056
> > > 2048             2.257    5.650    9.688
> > > 4096             2.093    8.508   15.943
> >
> > And this is with Q size 256 right?
>
> Yes. Ping-latencies with 512 VQ size show below.
>
> Packet-Weight   Ping-Latencies(millisecond)
>                    min      avg      max
> Origin           6.357   29.177   66.245
> 64               2.798    3.614    4.403
> 128              2.861    3.820    4.775
> 256              3.008    4.018    4.807
> 512              3.254    4.523    5.824
> 1024             3.079    5.335    7.747
> 2048             3.944    8.201   12.762
> 4096             4.158   11.057   19.985
>
> We will submit again. Is there anything else?

Seems pretty consistent, a small dip at 2 VQ sizes.

Acked-by: Michael S. Tsirkin

> >
> > > Ring size is a hint from device about a burst size it can tolerate. Based on
> > > benchmarks, set the weight to 2 * vq size.
> > >
> > > To evaluate this change, another tests were done using netperf(RR, TX) between
> > > two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was
> > > tweaked through qemu. Results shown below does not show obvious changes.
> >
> > What I asked for is ping-latency with different VQ sizes,
> > streaming below does not show anything.
> >
> > > vq size=256 TCP_RR                vq size=512 TCP_RR
> > > size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
> > >    1/       1/  -7%/        -2%      1/       1/   0%/        -2%
> > >    1/       4/  +1%/         0%      1/       4/  +1%/         0%
> > >    1/       8/  +1%/        -2%      1/       8/   0%/        +1%
> > >   64/       1/  -6%/         0%     64/       1/  +7%/        +3%
> > >   64/       4/   0%/        +2%     64/       4/  -1%/        +1%
> > >   64/       8/   0%/         0%     64/       8/  -1%/        -2%
> > >  256/       1/  -3%/        -4%    256/       1/  -4%/        -2%
> > >  256/       4/  +3%/        +4%    256/       4/  +1%/        +2%
> > >  256/       8/  +2%/         0%    256/       8/  +1%/        -1%
> > >
> > > vq size=256 UDP_RR                vq size=512 UDP_RR
> > > size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
> > >    1/       1/  -5%/        +1%      1/       1/  -3%/        -2%
> > >    1/       4/  +4%/        +1%      1/       4/  -2%/        +2%
> > >    1/       8/  -1%/        -1%      1/       8/  -1%/         0%
> > >   64/       1/  -2%/        -3%     64/       1/  +1%/        +1%
> > >   64/       4/  -5%/        -1%     64/       4/  +2%/         0%
> > >   64/       8/   0%/        -1%     64/       8/  -2%/        +1%
> > >  256/       1/  +7%/        +1%    256/       1/  -7%/         0%
> > >  256/       4/  +1%/        +1%    256/       4/  -3%/        -4%
> > >  256/       8/  +2%/        +2%    256/       8/  +1%/        +1%
> > >
> > > vq size=256 TCP_STREAM            vq size=512 TCP_STREAM
> > > size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
> > >   64/       1/   0%/        -3%     64/       1/   0%/         0%
> > >   64/       4/  +3%/        -1%     64/       4/  -2%/        +4%
> > >   64/       8/  +9%/        -4%     64/       8/  -1%/        +2%
> > >  256/       1/  +1%/        -4%    256/       1/  +1%/        +1%
> > >  256/       4/  -1%/        -1%    256/       4/  -3%/         0%
> > >  256/       8/  +7%/        +5%    256/       8/  -3%/         0%
> > >  512/       1/  +1%/         0%    512/       1/  -1%/        -1%
> > >  512/       4/  +1%/        -1%    512/       4/   0%/         0%
> > >  512/       8/  +7%/        -5%    512/       8/  +6%/        -1%
> > > 1024/       1/   0%/        -1%   1024/       1/   0%/        +1%
> > > 1024/       4/  +3%/         0%   1024/       4/  +1%/         0%
> > > 1024/       8/  +8%/        +5%   1024/       8/  -1%/         0%
> > > 2048/       1/  +2%/        +2%   2048/       1/  -1%/         0%
> > > 2048/       4/  +1%/         0%   2048/       4/
Re: [RFC PATCH bpf-next 2/6] bpf: add bpf_get_stack helper
On 4/8/18 9:53 PM, Yonghong Song wrote:
> > > @@ -1004,7 +1007,8 @@ static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
> > >  			bpf_prog_kallsyms_del(prog->aux->func[i]);
> > >  		bpf_prog_kallsyms_del(prog);
> > >
> > > -		call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu);
> > > +		synchronize_rcu();
> > > +		__bpf_prog_put_rcu(&prog->aux->rcu);
> >
> > there should have been lockdep splat.
> > We cannot call synchronize_rcu here, since we cannot sleep
> > in some cases.
>
> Let me double check this. The following is the reason
> why I am using synchronize_rcu().
>
> With call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu), where
> __bpf_prog_put_rcu calls put_callchain_buffers() which
> calls mutex_lock(), the runtime with CONFIG_DEBUG_ATOMIC_SLEEP=y
> will complain since a potential sleep inside the call_rcu callback
> is not allowed.

I see. Indeed. We cannot call put_callchain_buffers() from rcu callback,
but doing synchronize_rcu() here is also not possible.
How about moving put_callchain into bpf_prog_free_deferred() ?
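
[Editorial sketch, not part of the thread.] The constraint discussed above is that the RCU callback must not sleep, while the cleanup (put_callchain_buffers(), which takes a mutex) may sleep. The suggested direction is to do the sleeping cleanup from the deferred free, which runs from a work item in process context. The toy userspace model below only illustrates that ordering. Every name in it (in_atomic_ctx, queue_work_model, run_pending_work, prog_put_rcu_cb, and so on) is hypothetical and is not a kernel API; it is a sketch under those assumptions, not the eventual patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model; none of these names are kernel APIs. */

static bool in_atomic_ctx;          /* true while "inside the RCU callback" */
static int sleeping_cleanups_done;  /* times the sleeping cleanup has run */
static int atomic_violations;       /* sleeping cleanup seen in atomic context */

/* Stands in for put_callchain_buffers(), which takes a mutex and may sleep. */
static void sleeping_cleanup(void)
{
	if (in_atomic_ctx)
		atomic_violations++;    /* roughly what a sleep-in-atomic check would flag */
	sleeping_cleanups_done++;
}

struct work {
	void (*fn)(struct work *w);
	struct work *next;
};

static struct work *pending;        /* queued, not yet executed */

static void queue_work_model(struct work *w)
{
	w->next = pending;
	pending = w;
}

/* Stands in for the worker thread: process context, sleeping allowed. */
static void run_pending_work(void)
{
	while (pending) {
		struct work *w = pending;

		pending = w->next;
		w->fn(w);
	}
}

struct prog {
	struct work work;
	bool need_callchain_buf;
};

/* Deferred free: the place the sleeping cleanup is moved to. */
static void prog_free_deferred(struct work *w)
{
	struct prog *p = (struct prog *)((char *)w - offsetof(struct prog, work));

	if (p->need_callchain_buf)
		sleeping_cleanup();
}

/* RCU callback model: must not sleep, so it only queues the work item. */
static void prog_put_rcu_cb(struct prog *p)
{
	in_atomic_ctx = true;
	p->work.fn = prog_free_deferred;
	queue_work_model(&p->work);
	in_atomic_ctx = false;
}
```

Calling prog_put_rcu_cb() on a program with need_callchain_buf set leaves the cleanup pending; it only runs once run_pending_work() is called, outside the modelled atomic section, so atomic_violations stays at zero.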
Re: [RFC PATCH bpf-next 2/6] bpf: add bpf_get_stack helper
On 4/8/18 8:34 PM, Alexei Starovoitov wrote: On 4/6/18 2:48 PM, Yonghong Song wrote: Currently, stackmap and bpf_get_stackid helper are provided for bpf program to get the stack trace. This approach has a limitation though. If two stack traces have the same hash, only one will get stored in the stackmap table, so some stack traces are missing from user perspective. This patch implements a new helper, bpf_get_stack, will send stack traces directly to bpf program. The bpf program is able to see all stack traces, and then can do in-kernel processing or send stack traces to user space through shared map or bpf_perf_event_output. Signed-off-by: Yonghong Song--- include/linux/bpf.h | 1 + include/linux/filter.h | 3 ++- include/uapi/linux/bpf.h | 17 +-- kernel/bpf/stackmap.c | 56 kernel/bpf/syscall.c | 12 ++- kernel/bpf/verifier.c | 3 +++ kernel/trace/bpf_trace.c | 50 +- 7 files changed, 137 insertions(+), 5 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 95a7abd..72ccb9a 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -676,6 +676,7 @@ extern const struct bpf_func_proto bpf_get_current_comm_proto; extern const struct bpf_func_proto bpf_skb_vlan_push_proto; extern const struct bpf_func_proto bpf_skb_vlan_pop_proto; extern const struct bpf_func_proto bpf_get_stackid_proto; +extern const struct bpf_func_proto bpf_get_stack_proto; extern const struct bpf_func_proto bpf_sock_map_update_proto; /* Shared helpers among cBPF and eBPF. */ diff --git a/include/linux/filter.h b/include/linux/filter.h index fc4e8f9..9b64f63 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -467,7 +467,8 @@ struct bpf_prog { dst_needed:1, /* Do we need dst entry? */ blinded:1, /* Was blinded */ is_func:1, /* program is a bpf function */ - kprobe_override:1; /* Do we override a kprobe? */ + kprobe_override:1, /* Do we override a kprobe? */ + need_callchain_buf:1; /* Needs callchain buffer? 
*/ enum bpf_prog_type type; /* Type of BPF program */ enum bpf_attach_type expected_attach_type; /* For some prog types */ u32 len; /* Number of filter blocks */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index c5ec897..a4ff5b7 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -517,6 +517,17 @@ union bpf_attr { * other bits - reserved * Return: >= 0 stackid on success or negative error * + * int bpf_get_stack(ctx, buf, size, flags) + * walk user or kernel stack and store the ips in buf + * @ctx: struct pt_regs* + * @buf: user buffer to fill stack + * @size: the buf size + * @flags: bits 0-7 - numer of stack frames to skip + * bit 8 - collect user stack instead of kernel + * bit 11 - get build-id as well if user stack + * other bits - reserved + * Return: >= 0 size copied on success or negative error + * * s64 bpf_csum_diff(from, from_size, to, to_size, seed) * calculate csum diff * @from: raw from buffer @@ -821,7 +832,8 @@ union bpf_attr { FN(msg_apply_bytes), \ FN(msg_cork_bytes), \ FN(msg_pull_data), \ - FN(bind), + FN(bind), \ + FN(get_stack), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call @@ -855,11 +867,12 @@ enum bpf_func_id { /* BPF_FUNC_skb_set_tunnel_key and BPF_FUNC_skb_get_tunnel_key flags. */ #define BPF_F_TUNINFO_IPV6 (1ULL << 0) -/* BPF_FUNC_get_stackid flags. */ +/* BPF_FUNC_get_stackid and BPF_FUNC_get_stack flags. */ #define BPF_F_SKIP_FIELD_MASK 0xffULL #define BPF_F_USER_STACK (1ULL << 8) #define BPF_F_FAST_STACK_CMP (1ULL << 9) #define BPF_F_REUSE_STACKID (1ULL << 10) +#define BPF_F_USER_BUILD_ID (1ULL << 11) the comment above is not quite correct. This new flag is only available for new helper. Right, some flags are used for both helpers and some are only used for one of them. Will make it clear in the next revision. /* BPF_FUNC_skb_set_tunnel_key flags. 
*/ #define BPF_F_ZERO_CSUM_TX (1ULL << 1) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 04f6ec1..371c72e 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -402,6 +402,62 @@ const struct bpf_func_proto bpf_get_stackid_proto = { .arg3_type = ARG_ANYTHING, }; +BPF_CALL_4(bpf_get_stack, struct pt_regs *, regs, void *, buf, u32, size, + u64, flags) +{ + u32 init_nr, trace_nr, copy_len, elem_size, num_elem; + bool user_build_id = flags & BPF_F_USER_BUILD_ID; + u32 skip = flags & BPF_F_SKIP_FIELD_MASK; + bool user =
Re: [PATCH] vhost-net: set packet weight of tx polling to 2 * vq size
> On Fri, Apr 06, 2018 at 08:22:37AM +, haibinzhang(张海斌) wrote:
> > handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
> > polling udp packets with small length(e.g. 1byte udp payload), because setting
> > VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet length.
> >
> > Ping-Latencies shown below were tested between two Virtual Machines using
> > netperf (UDP_STREAM, len=1), and then another machine pinged the client:
> >
> > Packet-Weight   Ping-Latencies(millisecond)
> >                    min      avg      max
> > Origin           3.319   18.489   57.303
> > 64               1.643    2.021    2.552
> > 128              1.825    2.600    3.224
> > 256              1.997    2.710    4.295
> > 512              1.860    3.171    4.631
> > 1024             2.002    4.173    9.056
> > 2048             2.257    5.650    9.688
> > 4096             2.093    8.508   15.943
>
> And this is with Q size 256 right?

Yes. Ping-latencies with 512 VQ size show below.

Packet-Weight   Ping-Latencies(millisecond)
                   min      avg      max
Origin           6.357   29.177   66.245
64               2.798    3.614    4.403
128              2.861    3.820    4.775
256              3.008    4.018    4.807
512              3.254    4.523    5.824
1024             3.079    5.335    7.747
2048             3.944    8.201   12.762
4096             4.158   11.057   19.985

We will submit again. Is there anything else?

>
> > Ring size is a hint from device about a burst size it can tolerate. Based on
> > benchmarks, set the weight to 2 * vq size.
> >
> > To evaluate this change, another tests were done using netperf(RR, TX) between
> > two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was
> > tweaked through qemu. Results shown below does not show obvious changes.
>
> What I asked for is ping-latency with different VQ sizes,
> streaming below does not show anything.
>
> > vq size=256 TCP_RR                vq size=512 TCP_RR
> > size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
> >    1/       1/  -7%/        -2%      1/       1/   0%/        -2%
> >    1/       4/  +1%/         0%      1/       4/  +1%/         0%
> >    1/       8/  +1%/        -2%      1/       8/   0%/        +1%
> >   64/       1/  -6%/         0%     64/       1/  +7%/        +3%
> >   64/       4/   0%/        +2%     64/       4/  -1%/        +1%
> >   64/       8/   0%/         0%     64/       8/  -1%/        -2%
> >  256/       1/  -3%/        -4%    256/       1/  -4%/        -2%
> >  256/       4/  +3%/        +4%    256/       4/  +1%/        +2%
> >  256/       8/  +2%/         0%    256/       8/  +1%/        -1%
> >
> > vq size=256 UDP_RR                vq size=512 UDP_RR
> > size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
> >    1/       1/  -5%/        +1%      1/       1/  -3%/        -2%
> >    1/       4/  +4%/        +1%      1/       4/  -2%/        +2%
> >    1/       8/  -1%/        -1%      1/       8/  -1%/         0%
> >   64/       1/  -2%/        -3%     64/       1/  +1%/        +1%
> >   64/       4/  -5%/        -1%     64/       4/  +2%/         0%
> >   64/       8/   0%/        -1%     64/       8/  -2%/        +1%
> >  256/       1/  +7%/        +1%    256/       1/  -7%/         0%
> >  256/       4/  +1%/        +1%    256/       4/  -3%/        -4%
> >  256/       8/  +2%/        +2%    256/       8/  +1%/        +1%
> >
> > vq size=256 TCP_STREAM            vq size=512 TCP_STREAM
> > size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
> >   64/       1/   0%/        -3%     64/       1/   0%/         0%
> >   64/       4/  +3%/        -1%     64/       4/  -2%/        +4%
> >   64/       8/  +9%/        -4%     64/       8/  -1%/        +2%
> >  256/       1/  +1%/        -4%    256/       1/  +1%/        +1%
> >  256/       4/  -1%/        -1%    256/       4/  -3%/         0%
> >  256/       8/  +7%/        +5%    256/       8/  -3%/         0%
> >  512/       1/  +1%/         0%    512/       1/  -1%/        -1%
> >  512/       4/  +1%/        -1%    512/       4/   0%/         0%
> >  512/       8/  +7%/        -5%    512/       8/  +6%/        -1%
> > 1024/       1/   0%/        -1%   1024/       1/   0%/        +1%
> > 1024/       4/  +3%/         0%   1024/       4/  +1%/         0%
> > 1024/       8/  +8%/        +5%   1024/       8/  -1%/         0%
> > 2048/       1/  +2%/        +2%   2048/       1/  -1%/         0%
> > 2048/       4/  +1%/         0%   2048/       4/   0%/        -1%
> > 2048/       8/  -2%/         0%   2048/       8/   5%/        -1%
> > 4096/       1/  -2%/         0%   4096/       1/  -2%/         0%
> > 4096/       4/  +2%/         0%   4096/       4/   0%/         0%
> > 4096/       8/  +9%/        -2%   4096/       8/  -5%/        -1%
> >
> > Signed-off-by: Haibin Zhang
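
[Editorial sketch, not part of the thread.] The problem described in the patch is that a byte-only budget lets a tx pass send a very large number of tiny packets before yielding to rx. The userspace model below shows the effect of adding a per-pass packet budget of 2 * vq size next to the byte budget. The struct and function names are hypothetical, and BYTE_WEIGHT is assumed to match the value of VHOST_NET_WEIGHT (0x80000, as far as I recall); this is not the patch itself.

```c
#include <stddef.h>

/* Byte budget per tx pass; assumed to match VHOST_NET_WEIGHT. */
#define BYTE_WEIGHT 0x80000

/* Hypothetical budget description for one tx pass. */
struct tx_budget {
	size_t byte_weight;  /* stop once this many bytes were sent */
	size_t pkt_weight;   /* stop once this many packets were sent; 0 = no limit */
};

/*
 * Returns how many packets of pkt_len bytes one tx pass sends before it
 * yields, given a backlog of queued packets.
 */
static size_t packets_before_yield(const struct tx_budget *b,
				   size_t pkt_len, size_t backlog)
{
	size_t sent_bytes = 0, sent_pkts = 0;

	while (sent_pkts < backlog) {
		sent_bytes += pkt_len;
		sent_pkts++;
		if (sent_bytes >= b->byte_weight)
			break;              /* byte budget exhausted */
		if (b->pkt_weight && sent_pkts >= b->pkt_weight)
			break;              /* packet budget exhausted */
	}
	return sent_pkts;
}
```

With 1-byte payloads and only the byte budget, a pass in this model sends 524288 packets before yielding; with a packet weight of 2 * 256 it yields after 512. For 1500-byte packets the byte budget is reached first (350 packets), so the packet limit does not change behaviour for large packets, which is consistent with the streaming results above showing no obvious change.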
Re: KASAN: slab-out-of-bounds Read in pfkey_add
On Fri, Dec 15, 2017 at 11:51:01PM -0800, syzbot wrote:
> Hello,
>
> syzkaller hit the following crash on
> 50c4c4e268a2d7a3e58ebb698ac74da0de40ae36
> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
> C reproducer is attached
> syzkaller reproducer is attached. See https://goo.gl/kgGztJ
> for information about syzkaller reproducers
>
>
> audit: type=1400 audit(1513021744.055:7): avc: denied { map } for
> pid=3149 comm="syzkaller428285" path="/root/syzkaller428285483" dev="sda1"
> ino=16481 scontext=unconfined_u:system_r:insmod_t:s0-s0:c0.c1023
> tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=1
> ==
> BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:341 [inline]
> BUG: KASAN: slab-out-of-bounds in pfkey_msg2xfrm_state net/key/af_key.c:1212 [inline]
> BUG: KASAN: slab-out-of-bounds in pfkey_add+0x1634/0x3270 net/key/af_key.c:1491
> Read of size 8192 at addr 8801c5197318 by task syzkaller428285/3149
>
> CPU: 0 PID: 3149 Comm: syzkaller428285 Not tainted 4.15.0-rc3+ #127
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  print_address_description+0x73/0x250 mm/kasan/report.c:252
>  kasan_report_error mm/kasan/report.c:351 [inline]
>  kasan_report+0x25b/0x340 mm/kasan/report.c:409
>  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>  check_memory_region+0x137/0x190 mm/kasan/kasan.c:267
>  memcpy+0x23/0x50 mm/kasan/kasan.c:302
>  memcpy include/linux/string.h:341 [inline]
>  pfkey_msg2xfrm_state net/key/af_key.c:1212 [inline]
>  pfkey_add+0x1634/0x3270 net/key/af_key.c:1491
>  pfkey_process+0x60b/0x720 net/key/af_key.c:2809
>  pfkey_sendmsg+0x4d6/0x9f0 net/key/af_key.c:3648
>  sock_sendmsg_nosec net/socket.c:636 [inline]
>  sock_sendmsg+0xca/0x110 net/socket.c:646
>  ___sys_sendmsg+0x75b/0x8a0 net/socket.c:2026
>  __sys_sendmsg+0xe5/0x210 net/socket.c:2060
>  C_SYSC_sendmsg net/compat.c:739 [inline]
>  compat_SyS_sendmsg+0x2a/0x40 net/compat.c:737
>  do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
>  do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
>  entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
> RIP: 0023:0xf7fd4c79
> RSP: 002b:ff9d7c1c EFLAGS: 0203 ORIG_RAX: 0172
> RAX: ffda RBX: 0003 RCX: 205f5000
> RDX: RSI: 0167 RDI: 000f
> RBP: 0003 R08: R09:
> R10: R11: R12:
> R13: R14: R15:
>

Looks like this is going to be fixed by
https://patchwork.kernel.org/patch/10327883/ ("af_key: Always verify length of
provided sadb_key"), but it's not applied to the ipsec tree yet. Kevin, for
future reference, for syzbot bugs it would be helpful to reply to the original
bug report and say that a patch was sent out, or even better send the patch as a
reply to the bug report email, e.g.

	git format-patch --in-reply-to="<001a114292fadd3e250560706...@google.com>"

for this one (and the Message ID can be found in the syzkaller-bugs archive even
if the email isn't in your inbox). Otherwise people may not know that a patch
was sent out and do redundant work. Thanks!

I also simplified the reproducer for this, so here it is just in case someone
wants it anyway:

	#include <sys/socket.h>
	#include <unistd.h>

	int main()
	{
		int fd = socket(AF_KEY, SOCK_RAW, 2);
		char msg[96] =
			"\x02\x03\x00\x02\x0c\x00\x00\x00\x00\x00\x00\x01\x02\x00\x00\x00"
			"\x03\x00\x05\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00"
			"\x00\x00\x00\x00\x00\x00\x00\x00"
			"\x03\x00\x06\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00"
			"\x00\x00\x00\x00\x00\x00\x00\x00"
			"\x02\x00\x01\x00\x00\x00\x00\x00\x00\x00\xfb\x00\x00\x00\x00\x00"
			"\x02\x00\x08\x00\xff\xff\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00";
		write(fd, msg, sizeof(msg));
	}

It causes an 8192-byte out-of-bounds read.

Eric
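
[Editorial sketch, not part of the thread.] The 8192-byte figure can be derived from the last 16 bytes of the reproducer's message, which form the key extension. The sketch below decodes that header by hand and compares the key size claimed by the bits field against the space the extension actually provides. The field layout (length in 64-bit words, extension type, key length in bits, each 16 bits wide) follows my reading of the sadb_key structure in RFC 2367, and the bytes are assumed little-endian because the report comes from an x86 machine. The struct and helper names are hypothetical, and the final check only shows the general shape of a length validation; it is not the code from the referenced patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of a PF_KEY key extension header (hypothetical helper type). */
struct key_ext {
	uint16_t len_words;  /* extension length, in 64-bit words */
	uint16_t exttype;    /* extension type */
	uint16_t bits;       /* claimed key length, in bits */
};

/* Read a 16-bit little-endian value (assumes an x86-style host layout). */
static uint16_t le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static struct key_ext parse_key_ext(const unsigned char *p)
{
	struct key_ext k = { le16(p), le16(p + 2), le16(p + 4) };

	return k;
}

/* Bytes the kernel would copy if it trusted the bits field. */
static size_t claimed_key_bytes(const struct key_ext *k)
{
	return ((size_t)k->bits + 7) / 8;
}

/* Bytes of key material that actually fit after the 8-byte header. */
static size_t available_key_bytes(const struct key_ext *k)
{
	size_t total = (size_t)k->len_words * 8;

	return total > 8 ? total - 8 : 0;
}

/* Shape of a sanity check: the claimed key must fit in the extension. */
static int key_len_ok(const struct key_ext *k)
{
	return claimed_key_bytes(k) <= available_key_bytes(k);
}

/* Last 16 bytes of the reproducer's message. */
static const unsigned char repro_key_ext[16] = {
	0x02, 0x00, 0x08, 0x00, 0xff, 0xff, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};
```

For these bytes the extension is 2 words (16 bytes) long, leaving 8 bytes for key material, while the bits field of 0xffff claims (65535 + 7) / 8 = 8192 bytes. That matches the "Read of size 8192" line in the KASAN report.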
Re: [RFC bpf-next] bpf: document eBPF helpers and add a script to generate man page
On Fri, Apr 06, 2018 at 12:11:22PM +0100, Quentin Monnet wrote: > eBPF helper functions can be called from within eBPF programs to perform > a variety of tasks that would be otherwise hard or impossible to do with > eBPF itself. There is a growing number of such helper functions in the > kernel, but documentation is scarce. The main user space header file > does contain a short commented description of most helpers, but it is > somewhat outdated and not complete. It is more a "cheat sheet" than a > real documentation accessible to new eBPF developers. > > This commit attempts to improve the situation by replacing the existing > overview for the helpers with a more developed description. Furthermore, > a Python script is added to generate a manual page for eBPF helpers. The > workflow is the following, and requires the rst2man utility: > > $ ./scripts/bpf_helpers_doc.py \ > --filename include/uapi/linux/bpf.h > /tmp/bpf-helpers.rst > $ rst2man /tmp/bpf-helpers.rst > /tmp/bpf-helpers.7 > $ man /tmp/bpf-helpers.7 > > The objective is to keep all documentation related to the helpers in a > single place, and to be able to generate from here a manual page that > could be packaged in the man-pages repository and shipped with most > distributions [1]. > > Additionally, parsing the prototypes of the helper functions could > hopefully be reused, with a different Printer object, to generate > header files needed in some eBPF-related projects. > > Regarding the description of each helper, it comprises several items: > > - The function prototype. > - A description of the function and of its arguments (except for a > couple of cases, when there are no arguments and the return value > makes the function usage really obvious). > - A description of return values (if not void). > - A listing of eBPF program types (if relevant, map types) compatible > with the helper. > - Information about the helper being restricted to GPL programs, or not. 
> - The kernel version in which the helper was introduced. > - The commit that introduced the helper (this is mostly to have it in > the source of the man page, as it can be used to track changes and > update the page). > > For several helpers, descriptions are inspired (at times, nearly copied) > from the commit logs introducing them in the kernel--Many thanks to > their respective authors! They were completed as much as possible, the > objective being to have something easily accessible even for people just > starting with eBPF. There is probably a bit more work to do in this > direction for some helpers. > > Some RST formatting is used in the descriptions (not in function > prototypes, to keep them readable, but the Python script provided in > order to generate the RST for the manual page does add formatting to > prototypes, to produce something pretty) to get "bold" and "italics" in > manual pages. Hopefully, the descriptions in bpf.h file remains > perfectly readable. Note that the few trailing white spaces are > intentional, removing them would break paragraphs for rst2man. > > The descriptions should ideally be updated each time someone adds a new > helper, or updates the behaviour (compatibility extended to new program > types, new socket option supported...) or the interface (new flags > available, ...) of existing ones. > > [1] I have not contacted people from the man-pages project prior to > sending this RFC, so I can offer no guaranty at this time that they > would accept to take the generated man page. 
> > Cc: linux-...@vger.kernel.org > Cc: linux-...@vger.kernel.org > Signed-off-by: Quentin Monnet> --- > include/uapi/linux/bpf.h | 2237 > > scripts/bpf_helpers_doc.py | 568 +++ > 2 files changed, 2429 insertions(+), 376 deletions(-) > create mode 100755 scripts/bpf_helpers_doc.py > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h > index c5ec89732a8d..f47aeddbbe0a 100644 > --- a/include/uapi/linux/bpf.h > +++ b/include/uapi/linux/bpf.h > @@ -367,394 +367,1879 @@ union bpf_attr { > > /* BPF helper function descriptions: > * > - * void *bpf_map_lookup_elem(, ) > - * Return: Map value or NULL > - * > - * int bpf_map_update_elem(, , , flags) > - * Return: 0 on success or negative error > - * > - * int bpf_map_delete_elem(, ) > - * Return: 0 on success or negative error > - * > - * int bpf_probe_read(void *dst, int size, void *src) > - * Return: 0 on success or negative error > + * void *bpf_map_lookup_elem(struct bpf_map *map, void *key) > + * Description > + * Perform a lookup in *map* for an entry associated to *key*. > + * Return > + * Map value associated to *key*, or **NULL** if no entry was > + * found. > + * For > + * All types of programs. Limited to maps of types > + * **BPF_MAP_TYPE_HASH**, > + * **BPF_MAP_TYPE_ARRAY**, >
Re: [PATCH net] arp: fix arp_filter on l3slave devices
Hi,

[This is an automated email]

This commit has been processed by the -stable helper bot and determined
to be a high probability candidate for -stable trees. (score: 33.5930)

The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92, v4.4.126.

v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!
v4.9.92: Build OK!
v4.4.126: Build OK!

Please let us know if you'd like to have this patch included in a stable tree.

--
Thanks,
Sasha
Re: [PATCH net 4/6] ip6_gre: better validate user provided tunnel names
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: c12b395a4664 gre: Support GRE over IPv6.

The bot has also determined it's probably a bug fixing patch. (score: 52.9896)

The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92, v4.4.126.

v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!
v4.9.92: Build OK!
v4.4.126: Build OK!

--
Thanks,
Sasha
Re: [PATCH net 5/6] ip6_tunnel: better validate user provided tunnel names
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 1da177e4c3f4 Linux-2.6.12-rc2.

The bot has also determined it's probably a bug fixing patch. (score: 24.0820)

The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92, v4.4.126.

v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!
v4.9.92: Build OK!
v4.4.126: Build OK!

--
Thanks,
Sasha
Re: [PATCH net 0/6] net: better validate user provided tunnel names
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: ed1efb2aefbb ipv6: Add support for IPsec virtual tunnel interfaces.

The bot has also determined it's probably a bug fixing patch. (score: 53.6463)

The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92, v4.4.126.

v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!
v4.9.92: Build OK!
v4.4.126: Build OK!

--
Thanks,
Sasha
Re: [PATCH] net: phy: marvell: Enable interrupt function on LED2 pin
Hi,

[This is an automated email]

This commit has been processed by the -stable helper bot and determined
to be a high probability candidate for -stable trees. (score: 7.3040)

The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92, v4.4.126.

v4.16: Build OK!
v4.15.15: Build failed! Errors:
    drivers/net/phy/marvell.c:472:9: error: implicit declaration of function ‘phy_modify’; did you mean ‘pmd_modify’? [-Werror=implicit-function-declaration]

v4.14.32: Build failed! Errors:
    drivers/net/phy/marvell.c:472:9: error: implicit declaration of function ‘phy_modify’; did you mean ‘pmd_modify’? [-Werror=implicit-function-declaration]

v4.9.92: Failed to apply! Possible dependencies:
    864dc729d528 ("net: phy: marvell: Refactor m88e1121 RGMII delay configuration")

v4.4.126: Failed to apply! Possible dependencies:
    864dc729d528 ("net: phy: marvell: Refactor m88e1121 RGMII delay configuration")

Please let us know if you'd like to have this patch included in a stable tree.

--
Thanks,
Sasha
Re: [PATCH net 3/6] ipv6: sit: better validate user provided tunnel names
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 1da177e4c3f4 Linux-2.6.12-rc2.

The bot has also determined it's probably a bug fixing patch. (score: 53.2877)

The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92, v4.4.126.

v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!
v4.9.92: Build OK!
v4.4.126: Build OK!

--
Thanks,
Sasha
Re: [PATCH net 2/6] ip_tunnel: better validate user provided tunnel names
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: c54419321455 GRE: Refactor GRE tunneling code..

The bot has also determined it's probably a bug fixing patch. (score: 46.6256)

The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92, v4.4.126.

v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!
v4.9.92: Build OK!
v4.4.126: Build OK!

--
Thanks,
Sasha
Re: [PATCH net 6/6] vti6: better validate user provided tunnel names
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: ed1efb2aefbb ipv6: Add support for IPsec virtual tunnel interfaces.

The bot has also determined it's probably a bug fixing patch. (score: 65.4654)

The bot has tested the following trees: v4.16, v4.15.15, v4.14.32, v4.9.92, v4.4.126.

v4.16: Build OK!
v4.15.15: Build OK!
v4.14.32: Build OK!
v4.9.92: Build OK!
v4.4.126: Build OK!

--
Thanks,
Sasha
Re: [RFC PATCH bpf-next 2/6] bpf: add bpf_get_stack helper
On 4/6/18 2:48 PM, Yonghong Song wrote:
> Currently, stackmap and bpf_get_stackid helper are provided
> for bpf program to get the stack trace. This approach has
> a limitation though. If two stack traces have the same hash,
> only one will get stored in the stackmap table,
> so some stack traces are missing from user perspective.
>
> This patch implements a new helper, bpf_get_stack, will
> send stack traces directly to bpf program. The bpf program
> is able to see all stack traces, and then can do in-kernel
> processing or send stack traces to user space through
> shared map or bpf_perf_event_output.
>
> Signed-off-by: Yonghong Song
> ---
>  include/linux/bpf.h      |  1 +
>  include/linux/filter.h   |  3 ++-
>  include/uapi/linux/bpf.h | 17 +--
>  kernel/bpf/stackmap.c    | 56
>  kernel/bpf/syscall.c     | 12 ++-
>  kernel/bpf/verifier.c    |  3 +++
>  kernel/trace/bpf_trace.c | 50 +-
>  7 files changed, 137 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 95a7abd..72ccb9a 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -676,6 +676,7 @@ extern const struct bpf_func_proto bpf_get_current_comm_proto;
>  extern const struct bpf_func_proto bpf_skb_vlan_push_proto;
>  extern const struct bpf_func_proto bpf_skb_vlan_pop_proto;
>  extern const struct bpf_func_proto bpf_get_stackid_proto;
> +extern const struct bpf_func_proto bpf_get_stack_proto;
>  extern const struct bpf_func_proto bpf_sock_map_update_proto;
>
>  /* Shared helpers among cBPF and eBPF. */
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index fc4e8f9..9b64f63 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -467,7 +467,8 @@ struct bpf_prog {
>                          dst_needed:1,   /* Do we need dst entry? */
>                          blinded:1,      /* Was blinded */
>                          is_func:1,      /* program is a bpf function */
> -                        kprobe_override:1; /* Do we override a kprobe? */
> +                        kprobe_override:1, /* Do we override a kprobe? */
> +                        need_callchain_buf:1; /* Needs callchain buffer? */
>      enum bpf_prog_type   type;          /* Type of BPF program */
>      enum bpf_attach_type expected_attach_type; /* For some prog types */
>      u32                  len;           /* Number of filter blocks */
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index c5ec897..a4ff5b7 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -517,6 +517,17 @@ union bpf_attr {
>   *     other bits - reserved
>   *     Return: >= 0 stackid on success or negative error
>   *
> + * int bpf_get_stack(ctx, buf, size, flags)
> + *     walk user or kernel stack and store the ips in buf
> + *     @ctx: struct pt_regs*
> + *     @buf: user buffer to fill stack
> + *     @size: the buf size
> + *     @flags: bits 0-7 - number of stack frames to skip
> + *             bit 8 - collect user stack instead of kernel
> + *             bit 11 - get build-id as well if user stack
> + *             other bits - reserved
> + *     Return: >= 0 size copied on success or negative error
> + *
>   * s64 bpf_csum_diff(from, from_size, to, to_size, seed)
>   *     calculate csum diff
>   *     @from: raw from buffer
> @@ -821,7 +832,8 @@ union bpf_attr {
>      FN(msg_apply_bytes),    \
>      FN(msg_cork_bytes),     \
>      FN(msg_pull_data),      \
> -    FN(bind),
> +    FN(bind),               \
> +    FN(get_stack),
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>   * function eBPF program intends to call
> @@ -855,11 +867,12 @@ enum bpf_func_id {
>  /* BPF_FUNC_skb_set_tunnel_key and BPF_FUNC_skb_get_tunnel_key flags. */
>  #define BPF_F_TUNINFO_IPV6      (1ULL << 0)
>
> -/* BPF_FUNC_get_stackid flags. */
> +/* BPF_FUNC_get_stackid and BPF_FUNC_get_stack flags. */
>  #define BPF_F_SKIP_FIELD_MASK   0xffULL
>  #define BPF_F_USER_STACK        (1ULL << 8)
>  #define BPF_F_FAST_STACK_CMP    (1ULL << 9)
>  #define BPF_F_REUSE_STACKID     (1ULL << 10)
> +#define BPF_F_USER_BUILD_ID     (1ULL << 11)

the comment above is not quite correct.
This new flag is only available for new helper.

>  /* BPF_FUNC_skb_set_tunnel_key flags. */
>  #define BPF_F_ZERO_CSUM_TX      (1ULL << 1)
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index 04f6ec1..371c72e 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -402,6 +402,62 @@ const struct bpf_func_proto bpf_get_stackid_proto = {
>      .arg3_type = ARG_ANYTHING,
>  };
>
> +BPF_CALL_4(bpf_get_stack, struct pt_regs *, regs, void *, buf, u32, size,
> +           u64, flags)
> +{
> +    u32 init_nr, trace_nr, copy_len, elem_size, num_elem;
> +    bool user_build_id = flags & BPF_F_USER_BUILD_ID;
> +    u32 skip = flags &
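
As a reading aid for the flags layout documented in the quoted uapi hunk (bits 0-7 hold the skip count, bit 8 selects the user stack, bit 11 asks for build-ids), here is a minimal user-space sketch. The three macro values are copied from the quoted patch. `struct stack_flags` and `decode_stack_flags()` are hypothetical names introduced only for illustration, and the validity rule (reject reserved bits, and reject build-id without user stack) is an assumption read off the doc comment, not something the visible part of the patch confirms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as they appear in the quoted uapi hunk. */
#define BPF_F_SKIP_FIELD_MASK 0xffULL
#define BPF_F_USER_STACK      (1ULL << 8)
#define BPF_F_USER_BUILD_ID   (1ULL << 11)

/* Hypothetical container for the decoded fields (illustration only). */
struct stack_flags {
    uint32_t skip;       /* bits 0-7: number of frames to skip */
    bool user_stack;     /* bit 8: walk the user stack */
    bool user_build_id;  /* bit 11: report build-ids */
    bool valid;          /* assumed rule, see below */
};

/* Hypothetical helper: split a flags word into its documented fields. */
struct stack_flags decode_stack_flags(uint64_t flags)
{
    const uint64_t known = BPF_F_SKIP_FIELD_MASK | BPF_F_USER_STACK |
                           BPF_F_USER_BUILD_ID;
    struct stack_flags out;

    out.skip = (uint32_t)(flags & BPF_F_SKIP_FIELD_MASK);
    out.user_stack = (flags & BPF_F_USER_STACK) != 0;
    out.user_build_id = (flags & BPF_F_USER_BUILD_ID) != 0;
    /* Assumption: "other bits - reserved" means they must be zero, and
     * "build-id ... if user stack" means bit 11 requires bit 8. */
    out.valid = (flags & ~known) == 0 &&
                (!out.user_build_id || out.user_stack);
    return out;
}
```

Under these assumptions, a flags word that sets bit 9 or bit 10 (the stackid-only flags) is reported as invalid, which matches the reviewer's point that the flag sets of the two helpers are not identical.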
Re: kernel BUG at drivers/vhost/vhost.c:LINE! (2)
On Mon, Apr 09, 2018 at 05:44:36AM +0300, Michael S. Tsirkin wrote:
> On Mon, Apr 09, 2018 at 10:37:45AM +0800, Stefan Hajnoczi wrote:
> > On Sat, Apr 7, 2018 at 3:02 AM, syzbot wrote:
> > > syzbot hit the following crash on upstream commit
> > > 38c23685b273cfb4ccf31a199feccce3bdcb5d83 (Fri Apr 6 04:29:35 2018 +)
> > > Merge tag 'armsoc-drivers' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
> > > syzbot dashboard link:
> > > https://syzkaller.appspot.com/bug?extid=65a84dde0214b0387ccd
> >
> > To prevent duplicated work: I am working on this one.
> >
> > Stefan
>
> Do you want to try this patchset:
> https://lkml.org/lkml/2018/4/5/665
>
> ?

Thanks, I'll give it a shot. I also noticed a regression in commit
d65026c6c62e7d9616c8ceb5a53b68bcdc050525 ("vhost: validate log when
IOTLB is enabled") and am currently testing a fix.

Stefan
Re: DPAA TX Issues
On Sun, Apr 8, 2018, at 7:46 PM, Jacob S. Moroni wrote:
> Hello Madalin,
>
> I've been experiencing some issues with the DPAA Ethernet driver,
> specifically related to frame transmission. Hopefully you can point
> me in the right direction.
>
> TLDR: Attempting to transmit faster than a few frames per second causes
> the TX FQ CGR to enter into the congested state and remain there forever,
> even after transmission stops.
>
> The hardware is a T2080RDB, running from the tip of net-next, using
> the standard t2080rdb device tree and corenet64_smp_defconfig kernel
> config. No changes were made to any of the files. The issue occurs
> with 4.16.1 stable as well. In fact, the only time I've been able
> to achieve reliable frame transmission was with the SDK 4.1 kernel.
>
> For my tests, I'm running iperf3 both with and without the -R
> option (send/receive). When using a USB Ethernet adapter, there
> are no issues.
>
> The issue is that it seems like the TX frame queues are getting
> "stuck" when attempting to transmit at rates greater than a few frames
> per second. Ping works fine, but it seems like anything that could
> potentially cause multiple TX frames to be enqueued causes issues.
>
> If I run iperf3 in reverse mode (with the T2080RDB receiving), then
> I can achieve ~940 Mbps, but this is also somewhat unreliable.
>
> If I run it with the T2080RDB transmitting, the test will never
> complete. Sometimes it starts transmitting for a few seconds then stops,
> and other times it never even starts. This also seems to force the
> interface into a bad state.
>
> The ethtool stats show that the interface has entered
> congestion a few times, and that it's currently congested. The fact
> that it's currently congested even after stopping transmission
> indicates that the FQ somehow stopped being drained. I've also
> noticed that whenever this issue occurs, the TX confirmation
> counters are always less than the TX packet counters.
>
> When it gets into this state, I can see that the memory usage is
> climbing, up until about the point of where the CGR threshold
> is (about 100 MB).
>
> Any idea what could prevent the TX FQ from being drained? My first
> guess was flow control, but it's completely disabled.
>
> I tried messing with the egress congestion threshold, workqueue
> assignments, etc., but nothing seemed to have any effect.
>
> If you need any more information or want me to run any tests,
> please let me know.
>
> Thanks,
> --
> Jacob S. Moroni
> m...@jakemoroni.com

It turns out that irqbalance was causing all of the issues. After
disabling it and rebooting, the interfaces worked perfectly.

Perhaps there's an issue with how the qman/bman portals are defined
as per-cpu variables. During the portal's probe, the CPUs are assigned
one-by-one and subsequently passed into request_irq as the argument.
However, it seems like if the IRQ affinity changes, then the ISR could
be passed a reference to a per-cpu variable belonging to another CPU.

At least I know where to look now.

- Jake
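
The suspicion voiced in the reply above can be modelled in a few lines of user-space C. This is only a toy model of the hazard as described (a per-CPU object bound to an IRQ at probe time, then the handler running on a different CPU after an affinity change). `struct portal`, `probe_portals()` and `isr()` are hypothetical names and do not correspond to the qman/bman driver code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a per-CPU portal object. */
struct portal {
    int owner_cpu;     /* CPU whose per-CPU slot this is */
    int foreign_hits;  /* times the handler ran on another CPU */
};

struct portal portals[NR_CPUS];      /* models a per-CPU variable */
struct portal *irq_cookie[NR_CPUS];  /* models the argument fixed at request time */

/* Models probe: IRQ n is registered with a pointer to CPU n's portal. */
void probe_portals(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        portals[cpu].owner_cpu = cpu;
        portals[cpu].foreign_hits = 0;
        irq_cookie[cpu] = &portals[cpu];
    }
}

/* Models the handler for IRQ `irq` executing on `running_cpu`.
 * Returns true when the cookie belongs to the CPU it runs on. */
bool isr(int irq, int running_cpu)
{
    struct portal *p = irq_cookie[irq];

    if (p->owner_cpu != running_cpu) {
        p->foreign_hits++;  /* handler is touching another CPU's data */
        return false;
    }
    return true;
}
```

In this model, as long as IRQ 2 fires on CPU 2 the cookie matches; once something like irqbalance moves it to CPU 0, the handler still receives CPU 2's object. Whether that is what actually happens in the driver is exactly the open question in the message.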
[GIT] Networking
1) The sockmap code has to free socket memory on close if there
   is corked data, from John Fastabend.

2) Tunnel names coming from userspace need to be length validated.
   From Eric Dumazet.

3) arp_filter() has to take VRFs properly into account, from Miguel
   Fadon Perlines.

4) Fix oops in error path of tcf_bpf_init(), from Davide Caratti.

5) Missing idr_remove() in u32_delete_key(), from Cong Wang.

6) More syzbot stuff. Several use of uninitialized value fixes all
   over, from Eric Dumazet.

7) Do not leak kernel memory to userspace in sctp, also from Eric
   Dumazet.

8) Discard frames from unused ports in DSA, from Andrew Lunn.

9) Fix DMA mapping and reset/failover problems in ibmvnic, from
   Thomas Falcon.

10) Do not access dp83640 PHY registers prematurely after reset,
    from Esben Haabendal.

Please pull, thanks a lot!

The following changes since commit 06dd3dfeea60e2a6457a6aedf97afc8e6d2ba497:

  Merge tag 'char-misc-4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc (2018-04-04 20:07:20 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git

for you to fetch changes up to 76327a35caabd1a932e83d6a42b967aa08584e5d:

  dp83640: Ensure against premature access to PHY registers after reset (2018-04-08 19:58:52 -0400)

Anders Roxell (1):
      kernel/bpf/syscall: fix warning defined but not used

Andrew Lunn (1):
      net: dsa: Discard frames from unused ports

Anirudh Venkataramanan (1):
      ice: Bug fixes in ethtool code

Cong Wang (2):
      net_sched: fix a missing idr_remove() in u32_delete_key()
      tipc: use the right skb in tipc_sk_fill_sock_diag()

David S. Miller (7):
      Merge branch 'net-tunnel-name-validate'
      Merge branch 'hv_netvsc-Fix-shutdown-issues-on-older-Windows-hosts'
      Merge branch '100GbE' of git://git.kernel.org/.../jkirsher/net-queue
      Merge branch 'net-fix-uninit-values-in-networking-stack'
      Merge branch 'ibmvnic-Fix-driver-reset-and-DMA-bugs'
      Merge branch 'for-upstream' of git://git.kernel.org/.../bluetooth/bluetooth
      Merge git://git.kernel.org/.../bpf/bpf

Davide Caratti (1):
      net/sched: fix NULL dereference in the error path of tcf_bpf_init()

Eric Dumazet (16):
      net: fool proof dev_valid_name()
      ip_tunnel: better validate user provided tunnel names
      ipv6: sit: better validate user provided tunnel names
      ip6_gre: better validate user provided tunnel names
      ip6_tunnel: better validate user provided tunnel names
      vti6: better validate user provided tunnel names
      crypto: af_alg - fix possible uninit-value in alg_bind()
      netlink: fix uninit-value in netlink_sendmsg
      net: fix rtnh_ok()
      net: initialize skb->peeked when cloning
      net: fix uninit-value in __hw_addr_add_ex()
      dccp: initialize ireq->ir_mark
      ipv4: fix uninit-value in ip_route_output_key_hash_rcu()
      soreuseport: initialise timewait reuseport field
      sctp: do not leak kernel memory to user space
      sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6

Esben Haabendal (4):
      net: phy: marvell: Enable interrupt function on LED2 pin
      net/fsl_pq_mdio: Allow explicit speficition of TBIPA address
      ARM: dts: ls1021a: Specify TBIPA register address
      dp83640: Ensure against premature access to PHY registers after reset

Jeff Barnhill (1):
      net/ipv6: Increment OUTxxx counters after netfilter hook

Jiri Pirko (1):
      devlink: convert occ_get op to separate registration

John Fastabend (2):
      bpf: sockmap, free memory on sock close with cork data
      bpf: sockmap, duplicates release calls may NULL sk_prot

Maxime Chevallier (1):
      net: mvpp2: Fix parser entry init boundary check

Miguel Fadon Perlines (1):
      arp: fix arp_filter on l3slave devices

Mohammed Gamal (4):
      hv_netvsc: Use Windows version instead of NVSP version on GPAD teardown
      hv_netvsc: Split netvsc_revoke_buf() and netvsc_teardown_gpadl()
      hv_netvsc: Ensure correct teardown message sequence order
      hv_netvsc: Pass net_device parameter to revoke and teardown functions

Nathan Fontenot (1):
      ibmvnic: Do not reset CRQ for Mobility driver resets

Szymon Janc (1):
      Bluetooth: Fix connection if directed advertising and privacy is used

Thomas Falcon (4):
      ibmvnic: Fix DMA mapping mistakes
      ibmvnic: Zero used TX descriptor counter on reset
      ibmvnic: Fix reset scheduler error handling
      ibmvnic: Fix failover case for non-redundant configuration

Wei Yongjun (1):
      ice: Fix error return code in ice_init_hw()

 Documentation/devicetree/bindings/net/fsl-tsec-phy.txt |  6 +++-
 arch/arm/boot/dts/ls1021a.dtsi                         |  3 +-
 crypto/af_alg.c                                        |  8 ++---
 drivers/net/ethernet/freescale/fsl_pq_mdio.c           | 50
Re: kernel BUG at drivers/vhost/vhost.c:LINE! (2)
On Mon, Apr 09, 2018 at 10:37:45AM +0800, Stefan Hajnoczi wrote:
> On Sat, Apr 7, 2018 at 3:02 AM, syzbot wrote:
> > syzbot hit the following crash on upstream commit
> > 38c23685b273cfb4ccf31a199feccce3bdcb5d83 (Fri Apr 6 04:29:35 2018 +)
> > Merge tag 'armsoc-drivers' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
> > syzbot dashboard link:
> > https://syzkaller.appspot.com/bug?extid=65a84dde0214b0387ccd
>
> To prevent duplicated work: I am working on this one.
>
> Stefan

Do you want to try this patchset:
https://lkml.org/lkml/2018/4/5/665

?

--
MST
Re: [PATCH] vhost-net: set packet weight of tx polling to 2 * vq size
On Fri, Apr 06, 2018 at 08:22:37AM +, haibinzhang(张海斌) wrote:
> handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
> polling udp packets with small length(e.g. 1byte udp payload), because setting
> VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet
> length.
>
> Ping-Latencies shown below were tested between two Virtual Machines using
> netperf (UDP_STREAM, len=1), and then another machine pinged the client:
>
> Packet-Weight   Ping-Latencies(millisecond)
>                    min      avg      max
> Origin           3.319   18.489   57.303
> 64               1.643    2.021    2.552
> 128              1.825    2.600    3.224
> 256              1.997    2.710    4.295
> 512              1.860    3.171    4.631
> 1024             2.002    4.173    9.056
> 2048             2.257    5.650    9.688
> 4096             2.093    8.508   15.943

And this is with Q size 256 right?

> Ring size is a hint from device about a burst size it can tolerate. Based on
> benchmarks, set the weight to 2 * vq size.
>
> To evaluate this change, another tests were done using netperf(RR, TX) between
> two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was
> tweaked through qemu. Results shown below does not show obvious changes.

What I asked for is ping-latency with different VQ sizes,
streaming below does not show anything.

> vq size=256 TCP_RR                 vq size=512 TCP_RR
> size/sessions/+thu%/+normalize%    size/sessions/+thu%/+normalize%
>    1/ 1/  -7%/ -2%                    1/ 1/   0%/ -2%
>    1/ 4/  +1%/  0%                    1/ 4/  +1%/  0%
>    1/ 8/  +1%/ -2%                    1/ 8/   0%/ +1%
>   64/ 1/  -6%/  0%                   64/ 1/  +7%/ +3%
>   64/ 4/   0%/ +2%                   64/ 4/  -1%/ +1%
>   64/ 8/   0%/  0%                   64/ 8/  -1%/ -2%
>  256/ 1/  -3%/ -4%                  256/ 1/  -4%/ -2%
>  256/ 4/  +3%/ +4%                  256/ 4/  +1%/ +2%
>  256/ 8/  +2%/  0%                  256/ 8/  +1%/ -1%
>
> vq size=256 UDP_RR                 vq size=512 UDP_RR
> size/sessions/+thu%/+normalize%    size/sessions/+thu%/+normalize%
>    1/ 1/  -5%/ +1%                    1/ 1/  -3%/ -2%
>    1/ 4/  +4%/ +1%                    1/ 4/  -2%/ +2%
>    1/ 8/  -1%/ -1%                    1/ 8/  -1%/  0%
>   64/ 1/  -2%/ -3%                   64/ 1/  +1%/ +1%
>   64/ 4/  -5%/ -1%                   64/ 4/  +2%/  0%
>   64/ 8/   0%/ -1%                   64/ 8/  -2%/ +1%
>  256/ 1/  +7%/ +1%                  256/ 1/  -7%/  0%
>  256/ 4/  +1%/ +1%                  256/ 4/  -3%/ -4%
>  256/ 8/  +2%/ +2%                  256/ 8/  +1%/ +1%
>
> vq size=256 TCP_STREAM             vq size=512 TCP_STREAM
> size/sessions/+thu%/+normalize%    size/sessions/+thu%/+normalize%
>   64/ 1/   0%/ -3%                   64/ 1/   0%/  0%
>   64/ 4/  +3%/ -1%                   64/ 4/  -2%/ +4%
>   64/ 8/  +9%/ -4%                   64/ 8/  -1%/ +2%
>  256/ 1/  +1%/ -4%                  256/ 1/  +1%/ +1%
>  256/ 4/  -1%/ -1%                  256/ 4/  -3%/  0%
>  256/ 8/  +7%/ +5%                  256/ 8/  -3%/  0%
>  512/ 1/  +1%/  0%                  512/ 1/  -1%/ -1%
>  512/ 4/  +1%/ -1%                  512/ 4/   0%/  0%
>  512/ 8/  +7%/ -5%                  512/ 8/  +6%/ -1%
> 1024/ 1/   0%/ -1%                 1024/ 1/   0%/ +1%
> 1024/ 4/  +3%/  0%                 1024/ 4/  +1%/  0%
> 1024/ 8/  +8%/ +5%                 1024/ 8/  -1%/  0%
> 2048/ 1/  +2%/ +2%                 2048/ 1/  -1%/  0%
> 2048/ 4/  +1%/  0%                 2048/ 4/   0%/ -1%
> 2048/ 8/  -2%/  0%                 2048/ 8/   5%/ -1%
> 4096/ 1/  -2%/  0%                 4096/ 1/  -2%/  0%
> 4096/ 4/  +2%/  0%                 4096/ 4/   0%/  0%
> 4096/ 8/  +9%/ -2%                 4096/ 8/  -5%/ -1%
>
> Signed-off-by: Haibin Zhang
> Signed-off-by: Yunfang Tai
> Signed-off-by: Lidong Chen

Code is fine but I'd like to see validation of the heuristic
2*vq->num with another vq size.

> ---
>  drivers/vhost/net.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> index 8139bc70ad7d..3563a305cc0a 100644
> --- a/drivers/vhost/net.c
> +++ b/drivers/vhost/net.c
> @@ -44,6 +44,10 @@ MODULE_PARM_DESC(experimental_zcopytx, "Enable Zero Copy TX;"
>   * Using this limit prevents one virtqueue from starving others. */
>  #define VHOST_NET_WEIGHT 0x8
>
> +/* Max number of packets transferred before
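
To make the heuristic under review concrete: the patch description says the existing limit counts only bytes sent, so a stream of very small packets can keep the TX handler running for a long time, and it proposes an additional limit of 2 * vq size packets. The user-space sketch below models one TX pass with both limits. Everything in it is an assumption for illustration: the `BYTE_WEIGHT` value, the function `tx_burst()` and its parameters are hypothetical and are not taken from the patch, and per-packet header overhead is ignored.

```c
#include <assert.h>
#include <stddef.h>

#define BYTE_WEIGHT 0x80000u  /* assumed byte budget, illustration only */

/* Returns how many packets a single TX pass sends before it yields.
 * pkt_len:        payload size of every queued packet
 * queued:         number of packets waiting
 * vq_size:        ring size; the packet budget is 2 * vq_size
 * use_pkt_weight: 0 models the byte-only limit, 1 adds the packet limit */
size_t tx_burst(size_t pkt_len, size_t queued, unsigned vq_size,
                int use_pkt_weight)
{
    size_t total_len = 0, sent = 0;
    size_t pkt_weight = 2u * (size_t)vq_size;

    while (sent < queued) {
        total_len += pkt_len;
        sent++;
        if (total_len >= BYTE_WEIGHT)
            break;  /* byte budget exhausted */
        if (use_pkt_weight && sent >= pkt_weight)
            break;  /* packet budget exhausted */
    }
    return sent;
}
```

With 1-byte payloads and the byte budget alone, the loop in this model runs for hundreds of thousands of packets before yielding; with the packet budget and a 256-entry ring it stops after 512. For 4096-byte packets the byte budget triggers first, so large-packet behaviour is unchanged in the model, which is consistent with the streaming tables showing no obvious difference.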
Re: [PATCH net-next] net/ncsi: Refactor MAC, VLAN filters
The net-next tree is closed at this time, please resend this when the merge window is over and the net-next tree opens back up. Thank you.
Re: kernel BUG at drivers/vhost/vhost.c:LINE! (2)
On Sat, Apr 7, 2018 at 3:02 AM, syzbot wrote:
> syzbot hit the following crash on upstream commit
> 38c23685b273cfb4ccf31a199feccce3bdcb5d83 (Fri Apr 6 04:29:35 2018 +)
> Merge tag 'armsoc-drivers' of
> git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=65a84dde0214b0387ccd

To prevent duplicated work: I am working on this one.

Stefan

>
> So far this crash happened 4 times on upstream.
> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=6586748079439872
> syzkaller reproducer:
> https://syzkaller.appspot.com/x/repro.syz?id=5974272052822016
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=6224632407392256
> Kernel config:
> https://syzkaller.appspot.com/x/.config?id=-5813481738265533882
> compiler: gcc (GCC) 8.0.1 20180301 (experimental)
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+65a84dde0214b0387...@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
>
> [ cut here ]
> kernel BUG at drivers/vhost/vhost.c:1652!
> invalid opcode: [#1] SMP KASAN
> [ cut here ]
> Dumping ftrace buffer:
> kernel BUG at drivers/vhost/vhost.c:1652!
>(ftrace buffer empty) > Modules linked in: > CPU: 1 PID: 4461 Comm: syzkaller684218 Not tainted 4.16.0+ #3 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > RIP: 0010:set_bit_to_user drivers/vhost/vhost.c:1652 [inline] > RIP: 0010:log_write+0x42a/0x4d0 drivers/vhost/vhost.c:1676 > RSP: 0018:8801b256f920 EFLAGS: 00010293 > RAX: 8801adc9e2c0 RBX: dc00 RCX: 85924a0f > RDX: RSI: 85924cea RDI: 0005 > RBP: 8801b256fa58 R08: 8801adc9e2c0 R09: ed003962412d > R10: 8801b256fad8 R11: 8801cb12096f R12: 0001 > R13: ed00364adf36 R14: R15: 8801b256fa30 > FS: 7fdf24b19700() GS:8801db10() knlGS: > CS: 0010 DS: ES: CR0: 80050033 > CR2: 20bf6000 CR3: 0001ae6a7000 CR4: 001406e0 > DR0: DR1: DR2: > DR3: DR6: fffe0ff0 DR7: 0400 > Call Trace: > vhost_update_used_flags+0x3af/0x4a0 drivers/vhost/vhost.c:1723 > vhost_vq_init_access+0x117/0x590 drivers/vhost/vhost.c:1763 > vhost_vsock_start drivers/vhost/vsock.c:446 [inline] > vhost_vsock_dev_ioctl+0x751/0x920 drivers/vhost/vsock.c:678 > vfs_ioctl fs/ioctl.c:46 [inline] > file_ioctl fs/ioctl.c:500 [inline] > do_vfs_ioctl+0x1cf/0x1650 fs/ioctl.c:684 > ksys_ioctl+0xa9/0xd0 fs/ioctl.c:701 > SYSC_ioctl fs/ioctl.c:708 [inline] > SyS_ioctl+0x24/0x30 fs/ioctl.c:706 > do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287 > entry_SYSCALL_64_after_hwframe+0x42/0xb7 > RIP: 0033:0x4456c9 > RSP: 002b:7fdf24b18da8 EFLAGS: 0297 ORIG_RAX: 0010 > RAX: ffda RBX: 006dac24 RCX: 004456c9 > RDX: 20f82ffc RSI: 4004af61 RDI: 001b > RBP: 006dac20 R08: R09: > R10: R11: 0297 R12: 6b636f73762d7473 > R13: 6f68762f7665642f R14: fffc R15: 0007 > Code: e8 7c 5e e4 fb 4c 89 ef e8 e4 16 06 fc 48 8d 85 58 ff ff ff 48 c1 e8 > 03 c6 04 18 f8 e9 46 ff ff ff 45 31 f6 eb 91 e8 56 5e e4 fb <0f> 0b e8 4f 5e > e4 fb 48 c7 c6 a0 a3 24 88 4c 89 ef e8 60 b6 10 > RIP: set_bit_to_user drivers/vhost/vhost.c:1652 [inline] RSP: > 8801b256f920 > RIP: log_write+0x42a/0x4d0 drivers/vhost/vhost.c:1676 RSP: 8801b256f920 > invalid opcode: [#2] SMP 
KASAN > ---[ end trace 0d0ff45aa44d8a23 ]--- > Dumping ftrace buffer: >(ftrace buffer empty) > Modules linked in: > > > --- > This bug is generated by a dumb bot. It may contain errors. > See https://goo.gl/tpsmEJ for details. > Direct all questions to syzkal...@googlegroups.com. > > syzbot will keep track of this bug report. > If you forgot to add the Reported-by tag, once the fix for this bug is > merged > into any tree, please reply to this email with: > #syz fix: exact-commit-title > If you want to test a patch for this bug, please reply with: > #syz test: git://repo/address.git branch > and provide the patch inline or as an attachment. > To mark this as a duplicate of another syzbot report, please reply with: > #syz dup: exact-subject-of-another-report > If it's a one-off invalid bug report, please reply with: > #syz invalid > Note: if the crash happens again, it will cause creation of a new bug > report. > Note: all commands must start from beginning of the line in the email body.
[PATCH net-next] net/ncsi: Refactor MAC, VLAN filters
The NCSI driver defines a generic ncsi_channel_filter struct that can be used to store arbitrarily formatted filters, and several generic methods of accessing data stored in such a filter. However in both the driver and as defined in the NCSI specification there are only two actual filters: VLAN ID filters and MAC address filters. The splitting of the MAC filter into unicast, multicast, and mixed is also technically not necessary as these are stored in the same location in hardware. To save complexity, particularly in the set up and accessing of these generic filters, remove them in favour of two specific structs. These can be acted on directly and do not need several generic helper functions to use. This also fixes a memory error found by KASAN on ARM32 (which is not upstream yet), where response handlers accessing a filter's data field could write past allocated memory. [ 114.926512] == [ 114.933861] BUG: KASAN: slab-out-of-bounds in ncsi_configure_channel+0x4b8/0xc58 [ 114.941304] Read of size 2 at addr 94888558 by task kworker/0:2/546 [ 114.947593] [ 114.949146] CPU: 0 PID: 546 Comm: kworker/0:2 Not tainted 4.16.0-rc6-00119-ge156398bfcad #13 ... 
[ 115.170233] The buggy address belongs to the object at 94888540 [ 115.170233] which belongs to the cache kmalloc-32 of size 32 [ 115.181917] The buggy address is located 24 bytes inside of [ 115.181917] 32-byte region [94888540, 94888560) [ 115.192115] The buggy address belongs to the page: [ 115.196943] page:9eeac100 count:1 mapcount:0 mapping:94888000 index:0x94888fc1 [ 115.204200] flags: 0x100(slab) [ 115.207330] raw: 0100 94888000 94888fc1 003f 0001 9eea2014 9eecaa74 96c003e0 [ 115.215444] page dumped because: kasan: bad access detected [ 115.221036] [ 115.222544] Memory state around the buggy address: [ 115.227384] 94888400: fb fb fb fb fc fc fc fc 04 fc fc fc fc fc fc fc [ 115.233959] 94888480: 00 00 00 fc fc fc fc fc 00 04 fc fc fc fc fc fc [ 115.240529] >94888500: 00 00 04 fc fc fc fc fc 00 00 04 fc fc fc fc fc [ 115.247077] ^ [ 115.252523] 94888580: 00 04 fc fc fc fc fc fc 06 fc fc fc fc fc fc fc [ 115.259093] 94888600: 00 00 06 fc fc fc fc fc 00 00 04 fc fc fc fc fc [ 115.265639] == Reported-by: Joel StanleySigned-off-by: Samuel Mendoza-Jonas --- net/ncsi/internal.h | 34 +++--- net/ncsi/ncsi-manage.c | 226 +--- net/ncsi/ncsi-netlink.c | 20 ++-- net/ncsi/ncsi-rsp.c | 178 +-- 4 files changed, 147 insertions(+), 311 deletions(-) diff --git a/net/ncsi/internal.h b/net/ncsi/internal.h index 8da84312cd3b..8055e3965cef 100644 --- a/net/ncsi/internal.h +++ b/net/ncsi/internal.h @@ -68,15 +68,6 @@ enum { NCSI_MODE_MAX }; -enum { - NCSI_FILTER_BASE= 0, - NCSI_FILTER_VLAN= 0, - NCSI_FILTER_UC, - NCSI_FILTER_MC, - NCSI_FILTER_MIXED, - NCSI_FILTER_MAX -}; - struct ncsi_channel_version { u32 version;/* Supported BCD encoded NCSI version */ u32 alpha2; /* Supported BCD encoded NCSI version */ @@ -98,11 +89,18 @@ struct ncsi_channel_mode { u32 data[8];/* Data entries*/ }; -struct ncsi_channel_filter { - u32 index; /* Index of channel filters */ - u32 total; /* Total entries in the filter table */ - u64 bitmap; /* Bitmap of valid entries */ - u32 data[]; /* Data for the 
valid entries*/ +struct ncsi_channel_mac_filter { + u8 n_uc; + u8 n_mc; + u8 n_mixed; + u64 bitmap; + unsigned char *addrs; +}; + +struct ncsi_channel_vlan_filter { + u8 n_vids; + u64 bitmap; + u16 *vids; }; struct ncsi_channel_stats { @@ -186,7 +184,9 @@ struct ncsi_channel { struct ncsi_channel_version version; struct ncsi_channel_cap caps[NCSI_CAP_MAX]; struct ncsi_channel_modemodes[NCSI_MODE_MAX]; - struct ncsi_channel_filter *filters[NCSI_FILTER_MAX]; + /* Filtering Settings */ + struct ncsi_channel_mac_filter mac_filter; + struct ncsi_channel_vlan_filter vlan_filter; struct ncsi_channel_stats stats; struct { struct timer_list timer; @@ -320,10 +320,6 @@ extern spinlock_t ncsi_dev_lock; list_for_each_entry_rcu(nc, >channels, node) /* Resources */ -u32 *ncsi_get_filter(struct ncsi_channel *nc, int table, int index); -int ncsi_find_filter(struct ncsi_channel *nc, int table, void *data); -int ncsi_add_filter(struct ncsi_channel *nc, int table, void *data); -int ncsi_remove_filter(struct ncsi_channel *nc, int table, int index); void ncsi_start_channel_monitor(struct
[PATCH AUTOSEL for 4.9 160/293] MIPS: Give __secure_computing() access to syscall arguments.
From: David Daney

[ Upstream commit 669c4092225f0ed5df12ebee654581b558a5e3ed ]

KProbes of __seccomp_filter() are not very useful without access to
the syscall arguments.

Do what x86 does, and populate a struct seccomp_data to be passed to
__secure_computing(). This allows samples/bpf/tracex5 to extract a
sensible trace.

Signed-off-by: David Daney
Cc: Alexei Starovoitov
Cc: Daniel Borkmann
Cc: Matt Redfearn
Cc: netdev@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-m...@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/16368/
Signed-off-by: Ralf Baechle
Signed-off-by: Sasha Levin
---
 arch/mips/kernel/ptrace.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index 0c8ae2cc6380..956dae7e6a69 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -1011,8 +1011,26 @@ asmlinkage long syscall_trace_enter(struct pt_regs *regs, long syscall)
 	    tracehook_report_syscall_entry(regs))
 		return -1;
 
-	if (secure_computing(NULL) == -1)
-		return -1;
+#ifdef CONFIG_SECCOMP
+	if (unlikely(test_thread_flag(TIF_SECCOMP))) {
+		int ret, i;
+		struct seccomp_data sd;
+
+		sd.nr = syscall;
+		sd.arch = syscall_get_arch();
+		for (i = 0; i < 6; i++) {
+			unsigned long v, r;
+
+			r = mips_get_syscall_arg(&v, current, regs, i);
+			sd.args[i] = r ? 0 : v;
+		}
+		sd.instruction_pointer = KSTK_EIP(current);
+
+		ret = __secure_computing(&sd);
+		if (ret == -1)
+			return ret;
+	}
+#endif
 
 	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
 		trace_sys_enter(regs, regs->regs[2]);
-- 
2.15.1
Re: KASAN: use-after-free Read in inet_create
#syz dup: KASAN: use-after-free Read in rds_cong_queue_updates

There are a number of manifestations of this bug; basically all suggest
that the connect/reconnect etc. workqs are somehow being scheduled
after the netns is deleted, despite the code refactoring in
commit 3db6e0d172c (and it looks like the WARN_ONs in that commit are
not even being triggered). We've not been able to reproduce this
issue, and without a crash dump (or some hint of other threads that
were running at the time of the problem) are working on figuring
out the root cause by code inspection.

--Sowmini
Re: [PATCH v3] dp83640: Ensure against premature access to PHY registers after reset
From: Esben Haabendal
Date: Sun, 8 Apr 2018 22:17:01 +0200

> From: Esben Haabendal
>
> The datasheet specifies a 3uS pause after performing a software
> reset. The default implementation of genphy_soft_reset() does not
> provide this, so implement soft_reset with the needed pause.
>
> Signed-off-by: Esben Haabendal
> Reviewed-by: Andrew Lunn

Applied, thank you.
Re: pull-request: bpf 2018-04-09
From: Daniel Borkmann
Date: Mon, 9 Apr 2018 00:28:47 +0200

> The following pull-request contains BPF updates for your *net* tree.
>
> The main changes are:
>
> 1) Two sockmap fixes: i) fix a potential warning when a socket with
>    pending cork data is closed by freeing the memory right when the
>    socket is closed instead of seeing still outstanding memory at
>    garbage collector time, ii) fix a NULL pointer deref in case of
>    duplicates release calls, so make sure to only reset the sk_prot
>    pointer when it's in a valid state to do so, both from John.
>
> 2) Fix a compilation warning in bpf_prog_attach_check_attach_type()
>    by moving the function under CONFIG_CGROUP_BPF ifdef since only
>    used there, from Anders.
>
> Please consider pulling these changes from:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Pulled, thanks Daniel.
Re: [RFC] connector: add group_exit_code and signal_flags fields to exit_proc_event
Hi everyone

Sorry for the late reply

01.03.2018, 21:58, "Stefan Strogin":
> So I was thinking to add these two fields to union event_data:
> task->signal->group_exit_code
> task->signal->flags
> This won't increase size of struct proc_event (because of comm_proc_event)
> and shouldn't break backward compatibility for the user-space. But it will
> add some useful information about what caused the process death.
> What do you think, is it an acceptable approach?

As I saw in the other discussion, doesn't it break the userspace API, or
are you sure that no sizes have been increased?

You are using the same structure as is used for plain signals and adding
group status there; how will userspace react if it was compiled with
older headers? What if it uses zero-field alignment, i.e. allocates
exactly the size of the structure with byte precision?
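
The size argument in the quoted proposal (adding fields to one union member does not grow the union while a larger member already exists) can be checked mechanically. The sketch below uses simplified, hypothetical layouts: `exit_event_old`, `exit_event_new`, `comm_event` and the two unions are stand-ins invented for illustration and are not the real `cn_proc.h` definitions, so the equality it demonstrates holds only for these assumed layouts.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified event layouts (illustration only). */
struct exit_event_old {
    int32_t pid, tgid;
    uint32_t exit_code, exit_signal;
};

struct exit_event_new {
    int32_t pid, tgid;
    uint32_t exit_code, exit_signal;
    uint32_t group_exit_code;  /* proposed addition */
    uint32_t signal_flags;     /* proposed addition */
};

struct comm_event {
    int32_t pid, tgid;
    char comm[16];
};

union event_data_old {
    struct exit_event_old exit;
    struct comm_event comm;
};

union event_data_new {
    struct exit_event_new exit;
    struct comm_event comm;
};

/* A union is as large as its largest member, so growing a smaller
 * member does not change the union size while it stays within that
 * bound. This only checks size, not how old readers interpret data. */
_Static_assert(sizeof(union event_data_old) == sizeof(union event_data_new),
               "union size changed");
```

A compile-time check of this kind addresses the "has any size increased" question for a given layout. It does not answer the other concern raised in the reply, namely how a consumer built against older headers behaves when it allocates or copies only the old member's size.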
DPAA TX Issues
Hello Madalin,

I've been experiencing some issues with the DPAA Ethernet driver,
specifically related to frame transmission. Hopefully you can point
me in the right direction.

TLDR: Attempting to transmit faster than a few frames per second causes
the TX FQ CGR to enter into the congested state and remain there forever,
even after transmission stops.

The hardware is a T2080RDB, running from the tip of net-next, using
the standard t2080rdb device tree and corenet64_smp_defconfig kernel
config. No changes were made to any of the files. The issue occurs
with 4.16.1 stable as well. In fact, the only time I've been able
to achieve reliable frame transmission was with the SDK 4.1 kernel.

For my tests, I'm running iperf3 both with and without the -R
option (send/receive). When using a USB Ethernet adapter, there
are no issues.

The issue is that it seems like the TX frame queues are getting
"stuck" when attempting to transmit at rates greater than a few frames
per second. Ping works fine, but it seems like anything that could
potentially cause multiple TX frames to be enqueued causes issues.

If I run iperf3 in reverse mode (with the T2080RDB receiving), then
I can achieve ~940 Mbps, but this is also somewhat unreliable.

If I run it with the T2080RDB transmitting, the test will never
complete. Sometimes it starts transmitting for a few seconds then stops,
and other times it never even starts. This also seems to force the
interface into a bad state.

The ethtool stats show that the interface has entered
congestion a few times, and that it's currently congested. The fact
that it's currently congested even after stopping transmission
indicates that the FQ somehow stopped being drained. I've also
noticed that whenever this issue occurs, the TX confirmation
counters are always less than the TX packet counters.

When it gets into this state, I can see that the memory usage is
climbing, up until about the point of where the CGR threshold
is (about 100 MB).

Any idea what could prevent the TX FQ from being drained? My first
guess was flow control, but it's completely disabled.

I tried messing with the egress congestion threshold, workqueue
assignments, etc., but nothing seemed to have any effect.

If you need any more information or want me to run any tests,
please let me know.

Thanks,
--
Jacob S. Moroni
m...@jakemoroni.com
Re: KASAN: use-after-free Read in inet_create
[+RDS list and maintainer] On Sat, Dec 09, 2017 at 12:50:01PM -0800, syzbot wrote: > Hello, > > syzkaller hit the following crash on > 82bcf1def3b5f1251177ad47c44f7e17af039b4b > git://git.cmpxchg.org/linux-mmots.git/master > compiler: gcc (GCC) 7.1.1 20170620 > .config is attached > Raw console output is attached. > > Unfortunately, I don't have any reproducer for this bug yet. > > > == > BUG: KASAN: use-after-free in inet_create+0xda0/0xf50 net/ipv4/af_inet.c:338 > Read of size 4 at addr 8801bde28554 by task kworker/u4:5/3492 > > CPU: 0 PID: 3492 Comm: kworker/u4:5 Not tainted 4.15.0-rc2-mm1+ #39 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Workqueue: krdsd rds_connect_worker > Call Trace: > __dump_stack lib/dump_stack.c:17 [inline] > dump_stack+0x194/0x257 lib/dump_stack.c:53 > print_address_description+0x73/0x250 mm/kasan/report.c:252 > kasan_report_error mm/kasan/report.c:351 [inline] > kasan_report+0x25b/0x340 mm/kasan/report.c:409 > __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429 > inet_create+0xda0/0xf50 net/ipv4/af_inet.c:338 > __sock_create+0x4d4/0x850 net/socket.c:1265 > sock_create_kern+0x3f/0x50 net/socket.c:1311 > rds_tcp_conn_path_connect+0x26f/0x920 net/rds/tcp_connect.c:108 > rds_connect_worker+0x156/0x1f0 net/rds/threads.c:165 > process_one_work+0xbfd/0x1bc0 kernel/workqueue.c:2113 > worker_thread+0x223/0x1990 kernel/workqueue.c:2247 > kthread+0x37a/0x440 kernel/kthread.c:238 > ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:524 > > Allocated by task 3362: > save_stack+0x43/0xd0 mm/kasan/kasan.c:447 > set_track mm/kasan/kasan.c:459 [inline] > kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551 > kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489 > kmem_cache_alloc+0x12e/0x760 mm/slab.c:3548 > kmem_cache_zalloc include/linux/slab.h:695 [inline] > net_alloc net/core/net_namespace.c:362 [inline] > copy_net_ns+0x196/0x580 net/core/net_namespace.c:402 > create_new_namespaces+0x425/0x880 
kernel/nsproxy.c:107 > unshare_nsproxy_namespaces+0xae/0x1e0 kernel/nsproxy.c:206 > SYSC_unshare kernel/fork.c:2421 [inline] > SyS_unshare+0x653/0xfa0 kernel/fork.c:2371 > entry_SYSCALL_64_fastpath+0x1f/0x96 > > Freed by task 35: > save_stack+0x43/0xd0 mm/kasan/kasan.c:447 > set_track mm/kasan/kasan.c:459 [inline] > kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524 > __cache_free mm/slab.c:3492 [inline] > kmem_cache_free+0x77/0x280 mm/slab.c:3750 > net_free+0xca/0x110 net/core/net_namespace.c:378 > net_drop_ns.part.11+0x26/0x30 net/core/net_namespace.c:385 > net_drop_ns net/core/net_namespace.c:384 [inline] > cleanup_net+0x895/0xb60 net/core/net_namespace.c:502 > process_one_work+0xbfd/0x1bc0 kernel/workqueue.c:2113 > worker_thread+0x223/0x1990 kernel/workqueue.c:2247 > kthread+0x37a/0x440 kernel/kthread.c:238 > ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:524 > > The buggy address belongs to the object at 8801bde28080 > which belongs to the cache net_namespace of size 6272 > The buggy address is located 1236 bytes inside of > 6272-byte region [8801bde28080, 8801bde29900) > The buggy address belongs to the page: > page:df6a4dc0 count:1 mapcount:0 mapping:553659f1 index:0x0 > compound_mapcount: 0 > flags: 0x2fffc008100(slab|head) > raw: 02fffc008100 8801bde28080 00010001 > raw: ea0006f75da0 ea0006f60220 8801d989fe00 > page dumped because: kasan: bad access detected > > Memory state around the buggy address: > 8801bde28400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > 8801bde28480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > > 8801bde28500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > ^ > 8801bde28580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > 8801bde28600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb > == > > > --- > This bug is generated by a dumb bot. It may contain errors. > See https://goo.gl/tpsmEJ for details. > Direct all questions to syzkal...@googlegroups.com. 
> Please credit me with: Reported-by: syzbot
>
> syzbot will keep track of this bug report.
> Once a fix for this bug is merged into any tree, reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug
> report.
> Note: all commands must start from beginning of the line in the email body.
>

This is still happening regularly, though syzbot hasn't been able to generate a reproducer yet. All the reports seem to involve
pull-request: bpf 2018-04-09
Hi David,

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Two sockmap fixes: i) fix a potential warning when a socket with
   pending cork data is closed, by freeing the memory right when the
   socket is closed instead of seeing still outstanding memory at
   garbage collector time, ii) fix a NULL pointer deref in case of
   duplicate release calls, so make sure to only reset the sk_prot
   pointer when it's in a valid state to do so, both from John.

2) Fix a compilation warning in bpf_prog_attach_check_attach_type()
   by moving the function under the CONFIG_CGROUP_BPF ifdef since it
   is only used there, from Anders.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Thanks a lot!

The following changes since commit 4608f064532c28c0ea3c03fe26a3a5909852811a:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next (2018-04-03 14:08:58 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

for you to fetch changes up to 33491588c1fb2c76ed114a211ad0ee76c16b5a0c:

  kernel/bpf/syscall: fix warning defined but not used (2018-04-04 11:08:36 +0200)

Anders Roxell (1):
      kernel/bpf/syscall: fix warning defined but not used

John Fastabend (2):
      bpf: sockmap, free memory on sock close with cork data
      bpf: sockmap, duplicates release calls may NULL sk_prot

 kernel/bpf/sockmap.c | 12 ++--
 kernel/bpf/syscall.c | 24
 2 files changed, 22 insertions(+), 14 deletions(-)
Re: [PATCH] net: bridge: add missing NULL checks
On 08/04/18 20:49, Laszlo Toth wrote:
br_port_get_rtnl() can return NULL

Signed-off-by: Laszlo Toth
---
 net/bridge/br_netlink.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Nacked-by: Nikolay Aleksandrov

More below.

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 015f465c..cbec11f 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -939,14 +939,17 @@ static int br_port_slave_changelink(struct net_device *brdev,
 				    struct nlattr *data[],
 				    struct netlink_ext_ack *extack)
 {
+	struct net_bridge_port *port = br_port_get_rtnl(dev);
 	struct net_bridge *br = netdev_priv(brdev);
 	int ret;
 
 	if (!data)
 		return 0;
+	if (!port)
+		return -EINVAL;

If we're here, it means the master device of dev is a bridge => dev is a bridge port, since we're running with RTNL that cannot change, so this check is unnecessary. Have you actually hit a bug with this code ?

 
 	spin_lock_bh(&br->lock);
-	ret = br_setport(br_port_get_rtnl(dev), data);
+	ret = br_setport(port, data);
 	spin_unlock_bh(&br->lock);
 
 	return ret;
@@ -956,7 +959,12 @@ static int br_port_fill_slave_info(struct sk_buff *skb,
 					const struct net_device *brdev,
 					const struct net_device *dev)
 {
-	return br_port_fill_attrs(skb, br_port_get_rtnl(dev));
+	struct net_bridge_port *port = br_port_get_rtnl(dev);
+
+	if (!port)
+		return -EINVAL;
+
+	return br_port_fill_attrs(skb, port);

Same rationale here, fill_slave_info is called via a master device's ops under RTNL, which means dev is a bridge port and that also cannot change. If you have hit a bug with this code, can we see the trace ? The problem might be elsewhere.

Thanks,
 Nik

 }
 
 static size_t br_port_get_slave_size(const struct net_device *brdev,
Re: [PATCH bpf-next v8 05/11] seccomp,landlock: Enforce Landlock programs per process hierarchy
On 04/08/2018 11:06 PM, Andy Lutomirski wrote: > On Sun, Apr 8, 2018 at 6:13 AM, Mickaël Salaünwrote: >> >> On 02/27/2018 10:48 PM, Mickaël Salaün wrote: >>> >>> On 27/02/2018 17:39, Andy Lutomirski wrote: On Tue, Feb 27, 2018 at 5:32 AM, Alexei Starovoitov wrote: > On Tue, Feb 27, 2018 at 05:20:55AM +, Andy Lutomirski wrote: >> On Tue, Feb 27, 2018 at 4:54 AM, Alexei Starovoitov >> wrote: >>> On Tue, Feb 27, 2018 at 04:40:34AM +, Andy Lutomirski wrote: On Tue, Feb 27, 2018 at 2:08 AM, Alexei Starovoitov wrote: > On Tue, Feb 27, 2018 at 01:41:15AM +0100, Mickaël Salaün wrote: >> The seccomp(2) syscall can be used by a task to apply a Landlock >> program >> to itself. As a seccomp filter, a Landlock program is enforced for >> the >> current task and all its future children. A program is immutable and >> a >> task can only add new restricting programs to itself, forming a list >> of >> programss. >> >> A Landlock program is tied to a Landlock hook. If the action on a >> kernel >> object is allowed by the other Linux security mechanisms (e.g. DAC, >> capabilities, other LSM), then a Landlock hook related to this kind >> of >> object is triggered. The list of programs for this hook is then >> evaluated. Each program return a 32-bit value which can deny the >> action >> on a kernel object with a non-zero value. If every programs of the >> list >> return zero, then the action on the object is allowed. >> >> Multiple Landlock programs can be chained to share a 64-bits value >> for a >> call chain (e.g. evaluating multiple elements of a file path). This >> chaining is restricted when a process construct this chain by >> loading a >> program, but additional checks are performed when it requests to >> apply >> this chain of programs to itself. The restrictions ensure that it is >> not possible to call multiple programs in a way that would imply to >> handle multiple shared values (i.e. cookies) for one chain. 
For now, >> only a fs_pick program can be chained to the same type of program, >> because it may make sense if they have different triggers (cf. next >> commits). This restrictions still allows to reuse Landlock programs >> in >> a safe way (e.g. use the same loaded fs_walk program with multiple >> chains of fs_pick programs). >> >> Signed-off-by: Mickaël Salaün > > ... > >> +struct landlock_prog_set *landlock_prepend_prog( >> + struct landlock_prog_set *current_prog_set, >> + struct bpf_prog *prog) >> +{ >> + struct landlock_prog_set *new_prog_set = current_prog_set; >> + unsigned long pages; >> + int err; >> + size_t i; >> + struct landlock_prog_set tmp_prog_set = {}; >> + >> + if (prog->type != BPF_PROG_TYPE_LANDLOCK_HOOK) >> + return ERR_PTR(-EINVAL); >> + >> + /* validate memory size allocation */ >> + pages = prog->pages; >> + if (current_prog_set) { >> + size_t i; >> + >> + for (i = 0; i < >> ARRAY_SIZE(current_prog_set->programs); i++) { >> + struct landlock_prog_list *walker_p; >> + >> + for (walker_p = current_prog_set->programs[i]; >> + walker_p; walker_p = >> walker_p->prev) >> + pages += walker_p->prog->pages; >> + } >> + /* count a struct landlock_prog_set if we need to >> allocate one */ >> + if (refcount_read(_prog_set->usage) != 1) >> + pages += round_up(sizeof(*current_prog_set), >> PAGE_SIZE) >> + / PAGE_SIZE; >> + } >> + if (pages > LANDLOCK_PROGRAMS_MAX_PAGES) >> + return ERR_PTR(-E2BIG); >> + >> + /* ensure early that we can allocate enough memory for the new >> + * prog_lists */ >> + err = store_landlock_prog(_prog_set, current_prog_set, >> prog); >> + if (err) >> + return ERR_PTR(err); >> + >> + /* >> + * Each task_struct points to an array of prog list pointers. >> These >> + * tables are duplicated when additions are
Re: BUG: please report to d...@vger.kernel.org => prev = 0, last = 0 at net/dccp/ccids/lib/packet_history.c:LINE/tfrc_rx_hist_sample_rtt()
On Thu, Jan 18, 2018 at 01:34:02AM -0800, syzbot wrote: > syzbot has found reproducer for the following crash on linux-next commit > a362f6d2cdbd089dd7040ba66dcb0ad276a20cf7 (Thu Jan 18 07:07:54 2018 +) > Add linux-next specific files for 20180118 > > So far this crash happened 185 times on linux-next, mmots, net-next, > upstream. > C reproducer is attached. > syzkaller reproducer is attached. > Raw console output is attached. > compiler: gcc (GCC) 7.1.1 20170620 > .config is attached. > > IMPORTANT: if you fix the bug, please add the following tag to the commit: > Reported-by: > syzbot+3ca02e1a9272a28e8959b32039154c5605164...@syzkaller.appspotmail.com > It will help syzbot understand when the bug is fixed. > > BUG: please report to d...@vger.kernel.org => prev = 0, last = 0 at > net/dccp/ccids/lib/packet_history.c:425/tfrc_rx_hist_sample_rtt() > CPU: 1 PID: 6246 Comm: syzkaller158939 Not tainted 4.15.0-rc8-next-20180118+ > #100 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > > __dump_stack lib/dump_stack.c:17 [inline] > dump_stack+0x194/0x257 lib/dump_stack.c:53 > tfrc_rx_hist_sample_rtt+0x407/0x4d0 net/dccp/ccids/lib/packet_history.c:422 > ccid3_hc_rx_packet_recv+0x696/0xeb3 net/dccp/ccids/ccid3.c:765 > ccid_hc_rx_packet_recv net/dccp/ccid.h:185 [inline] > dccp_deliver_input_to_ccids+0xd9/0x250 net/dccp/input.c:180 > dccp_rcv_established+0x88/0xb0 net/dccp/input.c:378 > dccp_v4_do_rcv+0x135/0x160 net/dccp/ipv4.c:653 > sk_backlog_rcv include/net/sock.h:908 [inline] > __sk_receive_skb+0x33e/0xc10 net/core/sock.c:513 > dccp_v4_rcv+0xf5f/0x1c80 net/dccp/ipv4.c:874 > ip_local_deliver_finish+0x2f1/0xc50 net/ipv4/ip_input.c:216 > NF_HOOK include/linux/netfilter.h:288 [inline] > ip_local_deliver+0x1ce/0x6e0 net/ipv4/ip_input.c:257 > dst_input include/net/dst.h:449 [inline] > ip_rcv_finish+0x953/0x1e30 net/ipv4/ip_input.c:397 > NF_HOOK include/linux/netfilter.h:288 [inline] > ip_rcv+0xc5a/0x1840 
net/ipv4/ip_input.c:493 > __netif_receive_skb_core+0x1a41/0x3460 net/core/dev.c:4537 > __netif_receive_skb+0x2c/0x1b0 net/core/dev.c:4602 > process_backlog+0x203/0x740 net/core/dev.c:5282 > napi_poll net/core/dev.c:5680 [inline] > net_rx_action+0x792/0x1910 net/core/dev.c:5746 > __do_softirq+0x2d7/0xb85 kernel/softirq.c:285 > do_softirq_own_stack+0x2a/0x40 arch/x86/entry/entry_64.S:1150 > > do_softirq.part.19+0x14d/0x190 kernel/softirq.c:329 > do_softirq kernel/softirq.c:177 [inline] > __local_bh_enable_ip+0x1ee/0x230 kernel/softirq.c:182 > local_bh_enable include/linux/bottom_half.h:32 [inline] > rcu_read_unlock_bh include/linux/rcupdate.h:726 [inline] > ip_finish_output2+0x962/0x1550 net/ipv4/ip_output.c:231 > ip_finish_output+0x864/0xd10 net/ipv4/ip_output.c:317 > NF_HOOK_COND include/linux/netfilter.h:277 [inline] > ip_output+0x1d2/0x860 net/ipv4/ip_output.c:405 > dst_output include/net/dst.h:443 [inline] > ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124 > ip_queue_xmit+0x8c0/0x18e0 net/ipv4/ip_output.c:504 > dccp_transmit_skb+0x9ac/0x10f0 net/dccp/output.c:142 > dccp_xmit_packet+0x215/0x740 net/dccp/output.c:281 > dccp_write_xmit+0x17d/0x1d0 net/dccp/output.c:363 > dccp_sendmsg+0x95f/0xdc0 net/dccp/proto.c:813 > inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:764 > sock_sendmsg_nosec net/socket.c:630 [inline] > sock_sendmsg+0xca/0x110 net/socket.c:640 > ___sys_sendmsg+0x767/0x8b0 net/socket.c:2020 > __sys_sendmsg+0xe5/0x210 net/socket.c:2054 > SYSC_sendmsg net/socket.c:2065 [inline] > SyS_sendmsg+0x2d/0x50 net/socket.c:2061 > entry_SYSCALL_64_fastpath+0x29/0xa0 > RIP: 0033:0x446469 > RSP: 002b:7fcecb23bda8 EFLAGS: 0293 ORIG_RAX: 002e > RAX: ffda RBX: 006dbc3c RCX: 00446469 > RDX: 0080 RSI: 206c8000 RDI: 0005 > RBP: 006dbc38 R08: R09: > R10: R11: 0293 R12: f8e4cbe49e572d45 > R13: 54c1b85d98aba1df R14: a6eaa24dbeb18c29 R15: 000c > This is still happening. It *might* be related to the other bug "suspicious RCU usage at ./include/net/inet_sock.h:LINE". 
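Here's a simplified reproducer for this one:

#include <linux/dccp.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

int main()
{
	struct sockaddr_in addr = { .sin_family = AF_INET };
	socklen_t addrlen = sizeof(addr);
	int fd;

	/* The parent forks forever; each child runs one attempt below. */
	while (fork())
		wait(NULL);

	fd = socket(AF_INET, SOCK_DCCP, 0);
	bind(fd, (void *)&addr, addrlen);
	/* Learn which port the kernel picked so the client can connect to it. */
	getsockname(fd, (void *)&addr, &addrlen);
	listen(fd, 100);

	if (fork()) {
		/* Client side: select CCID 3 before connecting. */
		fd = socket(AF_INET, SOCK_DCCP, 0);
		setsockopt(fd, SOL_DCCP, DCCP_SOCKOPT_CCID, "\x03", 1);
		connect(fd, (void *)&addr, sizeof(addr));
	} else {
		/* Server side: accept the connection from the client. */
		fd = accept(fd, NULL, 0);
	}

	for (int i = 0; i < 1000; i++)
		write(fd, "X", 1);
}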
Re: pull request: bluetooth 2018-04-08
From: Johan Hedberg
Date: Sun, 8 Apr 2018 20:47:02 +0300

> Here's one important Bluetooth fix for the 4.17-rc series that's needed
> to pass several Bluetooth qualification test cases.
>
> Let me know if there are any issues pulling. Thanks.

Pulled, thank you.
Re: [PATCH net 0/8] net: fix uninit-values in networking stack
From: Eric Dumazet
Date: Sun, 8 Apr 2018 09:55:58 -0700

> I also have a report of a WARN() in ip_rt_bug(), added in commit
> c378a9c019cf5e017d1ed24954b54fae7bebd2bc by Dave Jones.
>
> Not sure what to do, maybe revert, since ip_rt_bug() is not catastrophic.

Let's not do the revert, I wouldn't have seen the backtrace which points where this bug is if we had.

icmp_route_lookup(), in one branch, does an input route lookup and uses the result of that to send the icmp message.

That can't be right, input routes should never be used for transmitting traffic and that's how we end up at ip_rt_bug().
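
To make the failure mode above concrete, here is a small user-space model, not kernel code. It assumes, as the reported backtrace suggests (dst_output leading straight into ip_rt_bug), that a route obtained from an input lookup carries a trap function as its output handler. All model_* names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified stand-in for a route entry: only the output handler is modeled. */
struct model_route {
	const char *kind;
	int (*output)(const struct model_route *rt);
};

/* Counts how often the trap handler fired (stands in for the WARN()). */
static int model_warn_count;

/* Normal transmit handler, installed by an output route lookup. */
static int model_ip_output(const struct model_route *rt)
{
	(void)rt;
	return 0;
}

/* Trap handler: in this model, input routes install it as their output hook. */
static int model_ip_rt_bug(const struct model_route *rt)
{
	fprintf(stderr, "WARN: tried to transmit via %s route\n", rt->kind);
	model_warn_count++;
	return -1;
}

/* Hypothetical output lookup: the result is safe to transmit on. */
static struct model_route model_output_lookup(void)
{
	struct model_route rt = { "output", model_ip_output };
	return rt;
}

/* Hypothetical input lookup: the result is meant for receiving only. */
static struct model_route model_input_lookup(void)
{
	struct model_route rt = { "input", model_ip_rt_bug };
	return rt;
}

/* Plays the role of dst_output(): dispatch to the handler the route carries. */
static int model_dst_output(const struct model_route *rt)
{
	assert(rt != NULL && rt->output != NULL);
	return rt->output(rt);
}
```

Calling model_dst_output() on the result of model_input_lookup() hits the trap, which is the situation described for the icmp_route_lookup() branch; the same call on an output-lookup result goes through the normal handler.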
Re: [PATCH bpf-next v8 05/11] seccomp,landlock: Enforce Landlock programs per process hierarchy
On Sun, Apr 8, 2018 at 6:13 AM, Mickaël Salaünwrote: > > On 02/27/2018 10:48 PM, Mickaël Salaün wrote: >> >> On 27/02/2018 17:39, Andy Lutomirski wrote: >>> On Tue, Feb 27, 2018 at 5:32 AM, Alexei Starovoitov >>> wrote: On Tue, Feb 27, 2018 at 05:20:55AM +, Andy Lutomirski wrote: > On Tue, Feb 27, 2018 at 4:54 AM, Alexei Starovoitov > wrote: >> On Tue, Feb 27, 2018 at 04:40:34AM +, Andy Lutomirski wrote: >>> On Tue, Feb 27, 2018 at 2:08 AM, Alexei Starovoitov >>> wrote: On Tue, Feb 27, 2018 at 01:41:15AM +0100, Mickaël Salaün wrote: > The seccomp(2) syscall can be used by a task to apply a Landlock > program > to itself. As a seccomp filter, a Landlock program is enforced for the > current task and all its future children. A program is immutable and a > task can only add new restricting programs to itself, forming a list > of > programss. > > A Landlock program is tied to a Landlock hook. If the action on a > kernel > object is allowed by the other Linux security mechanisms (e.g. DAC, > capabilities, other LSM), then a Landlock hook related to this kind of > object is triggered. The list of programs for this hook is then > evaluated. Each program return a 32-bit value which can deny the > action > on a kernel object with a non-zero value. If every programs of the > list > return zero, then the action on the object is allowed. > > Multiple Landlock programs can be chained to share a 64-bits value > for a > call chain (e.g. evaluating multiple elements of a file path). This > chaining is restricted when a process construct this chain by loading > a > program, but additional checks are performed when it requests to apply > this chain of programs to itself. The restrictions ensure that it is > not possible to call multiple programs in a way that would imply to > handle multiple shared values (i.e. cookies) for one chain. For now, > only a fs_pick program can be chained to the same type of program, > because it may make sense if they have different triggers (cf. 
next > commits). This restrictions still allows to reuse Landlock programs > in > a safe way (e.g. use the same loaded fs_walk program with multiple > chains of fs_pick programs). > > Signed-off-by: Mickaël Salaün ... > +struct landlock_prog_set *landlock_prepend_prog( > + struct landlock_prog_set *current_prog_set, > + struct bpf_prog *prog) > +{ > + struct landlock_prog_set *new_prog_set = current_prog_set; > + unsigned long pages; > + int err; > + size_t i; > + struct landlock_prog_set tmp_prog_set = {}; > + > + if (prog->type != BPF_PROG_TYPE_LANDLOCK_HOOK) > + return ERR_PTR(-EINVAL); > + > + /* validate memory size allocation */ > + pages = prog->pages; > + if (current_prog_set) { > + size_t i; > + > + for (i = 0; i < ARRAY_SIZE(current_prog_set->programs); > i++) { > + struct landlock_prog_list *walker_p; > + > + for (walker_p = current_prog_set->programs[i]; > + walker_p; walker_p = > walker_p->prev) > + pages += walker_p->prog->pages; > + } > + /* count a struct landlock_prog_set if we need to > allocate one */ > + if (refcount_read(_prog_set->usage) != 1) > + pages += round_up(sizeof(*current_prog_set), > PAGE_SIZE) > + / PAGE_SIZE; > + } > + if (pages > LANDLOCK_PROGRAMS_MAX_PAGES) > + return ERR_PTR(-E2BIG); > + > + /* ensure early that we can allocate enough memory for the new > + * prog_lists */ > + err = store_landlock_prog(_prog_set, current_prog_set, > prog); > + if (err) > + return ERR_PTR(err); > + > + /* > + * Each task_struct points to an array of prog list pointers. > These > + * tables are duplicated when additions are made (which means > each > + * table needs to be refcounted for the processes using it). > When a new > + * table is created, all the refcounters on the
[PATCH v3] dp83640: Ensure against premature access to PHY registers after reset
From: Esben Haabendal

The datasheet specifies a 3uS pause after performing a software reset. The default implementation of genphy_soft_reset() does not provide this, so implement soft_reset with the needed pause.

Signed-off-by: Esben Haabendal
Reviewed-by: Andrew Lunn
---
 drivers/net/phy/dp83640.c | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index 654f42d00092..a6c87793d899 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1207,6 +1207,23 @@ static void dp83640_remove(struct phy_device *phydev)
 	kfree(dp83640);
 }
 
+static int dp83640_soft_reset(struct phy_device *phydev)
+{
+	int ret;
+
+	ret = genphy_soft_reset(phydev);
+	if (ret < 0)
+		return ret;
+
+	/* From DP83640 datasheet: "Software driver code must wait 3 us
+	 * following a software reset before allowing further serial MII
+	 * operations with the DP83640."
+	 */
+	udelay(10);		/* Taking udelay inaccuracy into account */
+
+	return 0;
+}
+
 static int dp83640_config_init(struct phy_device *phydev)
 {
 	struct dp83640_private *dp83640 = phydev->priv;
@@ -1501,6 +1518,7 @@ static struct phy_driver dp83640_driver = {
 	.flags		= PHY_HAS_INTERRUPT,
 	.probe		= dp83640_probe,
 	.remove		= dp83640_remove,
+	.soft_reset	= dp83640_soft_reset,
 	.config_init	= dp83640_config_init,
 	.ack_interrupt	= dp83640_ack_interrupt,
 	.config_intr	= dp83640_config_intr,
-- 
2.16.3
Re: WARNING in skb_warn_bad_offload
On Wed, Nov 01, 2017 at 09:50:18PM +0300, 'Dmitry Vyukov' via syzkaller-bugs wrote: > On Wed, Nov 1, 2017 at 9:48 PM, syzbot >> wrote: > > Hello, > > > > syzkaller hit the following crash on > > 720bbe532b7c8f5613b48dea627fc58ed9ace707 > > git://git.cmpxchg.org/linux-mmots.git/master > > compiler: gcc (GCC) 7.1.1 20170620 > > .config is attached > > Raw console output is attached. > > C reproducer is attached > > syzkaller reproducer is attached. See https://goo.gl/kgGztJ > > for information about syzkaller reproducers > > > This also happens on more recent commits, including linux-next > 36ef71cae353f88fd6e095e2aaa3e5953af1685d (Oct 20): > > syz0: caps=(0x040058c1, 0x) len=4203 > data_len=2810 gso_size=8465 gso_type=3 ip_summed=0 > [ cut here ] > WARNING: CPU: 0 PID: 3473 at net/core/dev.c:2618 > skb_warn_bad_offload.cold.139+0x224/0x261 net/core/dev.c:2613 > Kernel panic - not syncing: panic_on_warn set ... > > CPU: 0 PID: 3473 Comm: a.out Not tainted 4.14.0-rc5-next-20171018 #15 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:16 [inline] > dump_stack+0x1a8/0x272 lib/dump_stack.c:52 > panic+0x21e/0x4b7 kernel/panic.c:183 > __warn.cold.6+0x182/0x187 kernel/panic.c:546 > report_bug+0x232/0x330 lib/bug.c:183 > fixup_bug+0x3f/0x90 arch/x86/kernel/traps.c:177 > do_trap_no_signal arch/x86/kernel/traps.c:211 [inline] > do_trap+0x132/0x280 arch/x86/kernel/traps.c:260 > do_error_trap+0x11f/0x390 arch/x86/kernel/traps.c:297 > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310 > invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905 > RIP: 0010:skb_warn_bad_offload.cold.139+0x224/0x261 net/core/dev.c:2613 > RSP: 0018:880064797038 EFLAGS: 00010286 > RAX: 006f RBX: 88006365efe8 RCX: > RDX: 006f RSI: 815c88c1 RDI: ed000c8f2dfd > RBP: 880064797090 R08: 8800686f86c0 R09: 0002 > R10: 8800686f86c0 R11: R12: 8800538b1680 > R13: R14: 8800538b1680 R15: 2111 > __skb_gso_segment+0x69e/0x860 net/core/dev.c:2824 > 
skb_gso_segment include/linux/netdevice.h:3971 [inline] > validate_xmit_skb+0x29f/0xca0 net/core/dev.c:3074 > validate_xmit_skb_list+0xb7/0x120 net/core/dev.c:3125 > sch_direct_xmit+0x5b5/0x710 net/sched/sch_generic.c:181 > __dev_xmit_skb net/core/dev.c:3206 [inline] > __dev_queue_xmit+0x1e41/0x2350 net/core/dev.c:3473 > dev_queue_xmit+0x17/0x20 net/core/dev.c:3538 > packet_snd net/packet/af_packet.c:2956 [inline] > packet_sendmsg+0x487a/0x64b0 net/packet/af_packet.c:2981 > sock_sendmsg_nosec net/socket.c:632 [inline] > sock_sendmsg+0xd2/0x120 net/socket.c:642 > ___sys_sendmsg+0x7cc/0x900 net/socket.c:2048 > __sys_sendmsg+0xe6/0x220 net/socket.c:2082 > SYSC_sendmsg net/socket.c:2093 [inline] > SyS_sendmsg+0x36/0x60 net/socket.c:2089 > entry_SYSCALL_64_fastpath+0x1f/0xbe > RIP: 0033:0x44bab9 > RSP: 002b:007eff18 EFLAGS: 0246 ORIG_RAX: 002e > RAX: ffda RBX: 20001046 RCX: 0044bab9 > RDX: 4010 RSI: 207fcfc8 RDI: 0004 > RBP: 0086 R08: 850b2da14d2a3706 R09: > R10: 1b91126b7f398aaa R11: 0246 R12: > R13: 00407950 R14: 004079e0 R15: > > > > > > > [ cut here ] > > WARNING: CPU: 0 PID: 2986 at net/core/dev.c:2585 > > skb_warn_bad_offload+0x2a9/0x380 net/core/dev.c:2580 > > Kernel panic - not syncing: panic_on_warn set ... 
> > > > CPU: 0 PID: 2986 Comm: syzkaller546001 Not tainted 4.13.0-mm1+ #7 > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > > Google 01/01/2011 > > Call Trace: > > __dump_stack lib/dump_stack.c:16 [inline] > > dump_stack+0x194/0x257 lib/dump_stack.c:52 > > panic+0x1e4/0x417 kernel/panic.c:181 > > __warn+0x1c4/0x1d9 kernel/panic.c:542 > > report_bug+0x211/0x2d0 lib/bug.c:183 > > fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:178 > > do_trap_no_signal arch/x86/kernel/traps.c:212 [inline] > > do_trap+0x260/0x390 arch/x86/kernel/traps.c:261 > > do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:298 > > do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:311 > > invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905 > > RIP: 0010:skb_warn_bad_offload+0x2a9/0x380 net/core/dev.c:2580 > > RSP: 0018:8801ce73f0a0 EFLAGS: 00010282 > > RAX: 006f RBX: 8801cd84cde0 RCX: > > RDX: 006f RSI: 110039ce7dd4 RDI: ed0039ce7e08 > > RBP: 8801ce73f0f8 R08: 8801ce73e790 R09: > > R10: R11: R12: 8801ce7802c0 > > R13: R14: 8801ce7802c0 R15: 2111 > > __skb_gso_segment+0x607/0x7f0 net/core/dev.c:2791 >
Re: WARNING in kcm_exit_net (2)
On Wed, Nov 29, 2017 at 10:08:01PM -0800, syzbot wrote:
> Hello,
>
> syzkaller hit the following crash on
> 1d3b78bbc6e983fabb3fbf91b76339bf66e4a12c
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
>
> Unfortunately, I don't have any reproducer for this bug yet.
>
>
> WARNING: CPU: 1 PID: 4099 at net/kcm/kcmsock.c:2014 kcm_exit_net+0x317/0x360
> net/kcm/kcmsock.c:2014
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 1 PID: 4099 Comm: kworker/u4:9 Not tainted 4.14.0+ #129
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: netns cleanup_net
> device lo entered promiscuous mode
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x257 lib/dump_stack.c:53
>  panic+0x1e4/0x41c kernel/panic.c:183
>  __warn+0x1dc/0x200 kernel/panic.c:547
>  report_bug+0x211/0x2d0 lib/bug.c:184
>  fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:177
>  fixup_bug arch/x86/kernel/traps.c:246 [inline]
>  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:295
>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:314
>  invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:926
> RIP: 0010:kcm_exit_net+0x317/0x360 net/kcm/kcmsock.c:2014
> RSP: :8801d9d27198 EFLAGS: 00010293
> RAX: 8801c0884540 RBX: 11003b3a4e33 RCX: 84a738e7
> RDX: RSI: 0004 RDI: 0286
> RBP: 8801d9d27260 R08: 0003 R09: 11003b3a4e0c
> R10: 8801c0884540 R11: 0003 R12: 11003b3a4e37
> R13: 8801d9d27238 R14: 8801c5fec8a0 R15: 8801c4b62e40
>  ops_exit_list.isra.6+0xae/0x150 net/core/net_namespace.c:142
>  cleanup_net+0x5c7/0xb60 net/core/net_namespace.c:484
>  process_one_work+0xbfd/0x1be0 kernel/workqueue.c:2112
>  worker_thread+0x223/0x1990 kernel/workqueue.c:2246
>  kthread+0x37a/0x440 kernel/kthread.c:238
>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:437
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
>
> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkal...@googlegroups.com.
> Please credit me with: Reported-by: syzbot
>
> syzbot will keep track of this bug report.
> Once a fix for this bug is committed, please reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid

No reproducer, this last occurred on Dec 26 (103 days ago, commit fba961ab29e), and there have been several potentially relevant KCM fixes since then such as 581e7226a5d ("kcm: Only allow TCP sockets to be attached to a KCM mux") and e5571240236 ("kcm: Check if sk_user_data already set in kcm_attach"). So I am invalidating this for syzbot, but if anyone thinks this may still be a bug then feel free to look into it.

#syz invalid

Eric
Re: suspicious RCU usage at ./include/net/inet_sock.h:LINE
On Mon, Dec 25, 2017 at 05:45:00PM -0800, syzbot wrote: > syzkaller has found reproducer for the following crash on > fba961ab29e5ffb055592442808bb0f7962e05da > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/master > compiler: gcc (GCC) 7.1.1 20170620 > .config is attached > Raw console output is attached. > C reproducer is attached > syzkaller reproducer is attached. See https://goo.gl/kgGztJ > for information about syzkaller reproducers > > > Can not set IPV6_FL_F_REFLECT if flowlabel_consistency sysctl is enable > > = > WARNING: suspicious RCU usage > 4.15.0-rc4+ #164 Not tainted > - > ./include/net/inet_sock.h:136 suspicious rcu_dereference_check() usage! > > other info that might help us debug this: > > > rcu_scheduler_active = 2, debug_locks = 1 > 1 lock held by syzkaller667189/5780: > #0: (sk_lock-AF_INET6){+.+.}, at: [<8d7d4e62>] lock_sock > include/net/sock.h:1462 [inline] > #0: (sk_lock-AF_INET6){+.+.}, at: [<8d7d4e62>] > do_ipv6_setsockopt.isra.9+0x23d/0x38f0 net/ipv6/ipv6_sockglue.c:167 > > stack backtrace: > CPU: 0 PID: 5780 Comm: syzkaller667189 Not tainted 4.15.0-rc4+ #164 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:17 [inline] > dump_stack+0x194/0x257 lib/dump_stack.c:53 > lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585 > ireq_opt_deref include/net/inet_sock.h:135 [inline] > inet_csk_route_req+0x824/0xca0 net/ipv4/inet_connection_sock.c:544 > dccp_v4_send_response+0xa7/0x640 net/dccp/ipv4.c:485 > dccp_v4_conn_request+0x9ee/0x11b0 net/dccp/ipv4.c:633 > dccp_v6_conn_request+0xd30/0x1350 net/dccp/ipv6.c:317 > dccp_rcv_state_process+0x574/0x1620 net/dccp/input.c:612 > dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:682 > dccp_v6_do_rcv+0x81a/0x9b0 net/dccp/ipv6.c:578 > sk_backlog_rcv include/net/sock.h:907 [inline] > __release_sock+0x124/0x360 net/core/sock.c:2274 > release_sock+0xa4/0x2a0 net/core/sock.c:2789 > 
do_ipv6_setsockopt.isra.9+0x50f/0x38f0 net/ipv6/ipv6_sockglue.c:898 > ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922 > dccp_setsockopt+0x85/0xd0 net/dccp/proto.c:573 > sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978 > SYSC_setsockopt net/socket.c:1821 [inline] > SyS_setsockopt+0x189/0x360 net/socket.c:1800 > entry_SYSCALL_64_fastpath+0x1f/0x96 > RIP: 0033:0x445ec9 > RSP: 002b:7fa001b58db8 EFLAGS: 0297 ORIG_RAX: 0036 > RAX: ffda RBX: 006dbc24 RCX: 00445ec9 > RDX: 0020 RSI: 0029 RDI: 0004 > RBP: 006dbc20 R08: 0020 R09: > R10: 2030a000 R11: 0297 R12: > R13: 7fff809eec1f R14: 7fa001b599c0 R15: 0001 > > = > WARNING: suspicious RCU usage > 4.15.0-rc4+ #164 Not tainted > - > ./include/net/inet_sock.h:136 suspicious rcu_dereference_check() usage! > > other info that might help us debug this: > > > rcu_scheduler_active = 2, debug_locks = 1 > 1 lock held by syzkaller667189/5780: > #0: (sk_lock-AF_INET6){+.+.}, at: [<8d7d4e62>] lock_sock > include/net/sock.h:1462 [inline] > #0: (sk_lock-AF_INET6){+.+.}, at: [<8d7d4e62>] > do_ipv6_setsockopt.isra.9+0x23d/0x38f0 net/ipv6/ipv6_sockglue.c:167 > > stack backtrace: > CPU: 0 PID: 5780 Comm: syzkaller667189 Not tainted 4.15.0-rc4+ #164 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS > Google 01/01/2011 > Call Trace: > __dump_stack lib/dump_stack.c:17 [inline] > dump_stack+0x194/0x257 lib/dump_stack.c:53 > lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4585 > ireq_opt_deref include/net/inet_sock.h:135 [inline] > dccp_v4_send_response+0x4b0/0x640 net/dccp/ipv4.c:496 > dccp_v4_conn_request+0x9ee/0x11b0 net/dccp/ipv4.c:633 > dccp_v6_conn_request+0xd30/0x1350 net/dccp/ipv6.c:317 > dccp_rcv_state_process+0x574/0x1620 net/dccp/input.c:612 > dccp_v4_do_rcv+0xeb/0x160 net/dccp/ipv4.c:682 > dccp_v6_do_rcv+0x81a/0x9b0 net/dccp/ipv6.c:578 > sk_backlog_rcv include/net/sock.h:907 [inline] > __release_sock+0x124/0x360 net/core/sock.c:2274 > release_sock+0xa4/0x2a0 net/core/sock.c:2789 > 
do_ipv6_setsockopt.isra.9+0x50f/0x38f0 net/ipv6/ipv6_sockglue.c:898 > ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922 > dccp_setsockopt+0x85/0xd0 net/dccp/proto.c:573 > sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978 > SYSC_setsockopt net/socket.c:1821 [inline] > SyS_setsockopt+0x189/0x360 net/socket.c:1800 > entry_SYSCALL_64_fastpath+0x1f/0x96 > RIP: 0033:0x445ec9 > RSP: 002b:7fa001b58db8 EFLAGS: 0297 ORIG_RAX: 0036 > RAX: ffda RBX: 006dbc24 RCX: 00445ec9 > RDX: 0020 RSI: 0029 RDI: 0004 > RBP: 006dbc20 R08: 0020 R09:
Re: [PATCH] make net_gso_ok return false when gso_type is zero(invalid)
2018-04-08 18:51 GMT+02:00 David Miller:
>
> From: Wenhua Shi
> Date: Fri, 6 Apr 2018 03:43:39 +0200
>
> > Signed-off-by: Wenhua Shi
>
> This precondition should be made impossible instead of having to do
> an extra check everywhere that this helper is invoked, many of which
> are in fast paths.

I believe the precondition you describe does hold.

In my situation, I have to disable GSO for some packets, and I notice that this leads to much worse performance (below 1 Mbps, down from almost 800 Mbps).

Here's the hook I use on debian 9.4, kernel version 4.9:

#include #include #include #include #include #include #include #include #include

unsigned int hook_outgoing(void *priv, struct sk_buff *skb,
			   const struct nf_hook_state *state)
{
	/* for some reason I have to disable GSO */
	skb_gso_reset(skb);

	/* After I force sk_can_gso to return false here, the performance comes back normal. */
	// skb->sk->sk_gso_type = ~0;
	return NF_ACCEPT;
}

static struct nf_hook_ops hook = {
	.hook = hook_outgoing,
	.pf = PF_INET,
	.hooknum = NF_INET_POST_ROUTING,
	.priority = NF_IP_PRI_LAST,
};

static int __init init_testing(void)
{
	/* register the hook defined above */
	nf_register_hook(&hook);
	return 0;
}

static void __exit exit_testing(void)
{
	nf_unregister_hook(&hook);
}

module_init(init_testing);
module_exit(exit_testing);

Here are the performance measurements.
Without the previous hook:

root@debian-s-1vcpu-1gb-sfo1-01:~/test# iperf -c myanothernormaldebian -d
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
Client connecting to myanothernormaldebian, TCP port 5001
TCP window size: 255 KByte (default)
[ 3] local 192.241.204.XXX port 60528 connected with 104.131.148.XXX port 5001
[ 5] local 192.241.204.XXX port 5001 connected with 104.131.148.XXX port 58576
[ ID] Interval       Transfer     Bandwidth
[ 3]  0.0-10.0 sec   922 MBytes   773 Mbits/sec
[ 5]  0.0-10.1 sec   1.00 GBytes  849 Mbits/sec

And with the previous hook:

root@debian-s-1vcpu-1gb-sfo1-01:~/test# iperf -c myanothernormaldebian -d
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
Client connecting to myanothernormaldebian, TCP port 5001
TCP window size: 85.0 KByte (default)
[ 3] local 192.241.204.XXX port 60530 connected with 104.131.148.XXX port 5001
[ 5] local 192.241.204.XXX port 5001 connected with 104.131.148.XXX port 58578
[ ID] Interval       Transfer     Bandwidth
[ 5]  0.0-10.2 sec   1.02 GBytes  864 Mbits/sec
[ 3]  0.0-13.5 sec   170 KBytes   103 Kbits/sec

Or is it just that I'm disabling GSO the wrong way?
Re: [PATCH iproute2-next 1/1] tc: jsonify tunnel_key action
On 4/4/18 11:21 AM, Roman Mashak wrote:
> Signed-off-by: Roman Mashak
> ---
>  tc/m_tunnel_key.c | 36 +++++++++++++++++++++++++-----------
>  1 file changed, 25 insertions(+), 11 deletions(-)
>

applied to iproute2-next
Re: [PATCH iproute2-next 1/1] tc: jsonify connmark action
On 4/3/18 7:09 AM, Roman Mashak wrote:
> Signed-off-by: Roman Mashak
> ---
>  tc/m_connmark.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)

applied to iproute2-next
Re: [PATCH iproute2-next 1/1] tc: jsonify skbedit action
On 4/3/18 1:24 PM, Roman Mashak wrote:
>  	if (tb[TCA_SKBEDIT_PTYPE] != NULL) {
> -		ptype = RTA_DATA(tb[TCA_SKBEDIT_PTYPE]);
> -		if (*ptype == PACKET_HOST)
> -			fprintf(f, " ptype host");
> -		else if (*ptype == PACKET_BROADCAST)
> -			fprintf(f, " ptype broadcast");
> -		else if (*ptype == PACKET_MULTICAST)
> -			fprintf(f, " ptype multicast");
> -		else if (*ptype == PACKET_OTHERHOST)
> -			fprintf(f, " ptype otherhost");
> +		ptype = rta_getattr_u16(tb[TCA_SKBEDIT_PTYPE]);
> +		if (ptype == PACKET_HOST)
> +			print_string(PRINT_ANY, "ptype", " %s", "ptype host");
> +		else if (ptype == PACKET_BROADCAST)
> +			print_string(PRINT_ANY, "ptype", " %s",
> +				     "ptype broadcast");
> +		else if (ptype == PACKET_MULTICAST)
> +			print_string(PRINT_ANY, "ptype", " %s",
> +				     "ptype multicast");
> +		else if (ptype == PACKET_OTHERHOST)
> +			print_string(PRINT_ANY, "ptype", " %s",
> +				     "ptype otherhost");

Shouldn't that be:

	print_string(PRINT_ANY, "ptype", "ptype %s", "otherhost");

And ditto for the other strings.

>  		else
> -			fprintf(f, " ptype %d", *ptype);
> +			print_uint(PRINT_ANY, "ptype", " %u", ptype);

And then this one needs 'ptype' before %u
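The review comment above can be illustrated with a small stand-alone model. This is only a sketch: `emit_string` and `enum out_mode` are hypothetical names invented here, not iproute2's `print_string` API. The sketch assumes, as the review implies, that text output applies the format string to the value while JSON output carries only the key and the bare value, so any label placed inside the value ends up duplicated in the JSON.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical output modes for this sketch; not iproute2's own types. */
enum out_mode { OUT_TEXT, OUT_JSON };

/*
 * Hypothetical dual-mode printer. In text mode `fmt` is applied to `value`;
 * in JSON mode only `key` and the bare `value` are written and `fmt` is
 * ignored.
 */
static int emit_string(char *buf, size_t len, enum out_mode mode,
		       const char *key, const char *fmt, const char *value)
{
	assert(buf && key && fmt && value);

	if (mode == OUT_JSON)
		return snprintf(buf, len, "\"%s\":\"%s\"", key, value);
	return snprintf(buf, len, fmt, value);
}
```

With the label in the format string (`" ptype %s"`, `"host"`), text mode yields ` ptype host` and JSON mode yields `"ptype":"host"`. With the label in the value (`" %s"`, `"ptype host"`), the text output is unchanged but the JSON value becomes `"ptype host"`, which is presumably what the reviewer wants to avoid.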
[PATCH] net: bridge: add missing NULL checks
br_port_get_rtnl() can return NULL

Signed-off-by: Laszlo Toth
---
 net/bridge/br_netlink.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 015f465c..cbec11f 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -939,14 +939,17 @@ static int br_port_slave_changelink(struct net_device *brdev,
 				    struct nlattr *data[],
 				    struct netlink_ext_ack *extack)
 {
+	struct net_bridge_port *port = br_port_get_rtnl(dev);
 	struct net_bridge *br = netdev_priv(brdev);
 	int ret;
 
 	if (!data)
 		return 0;
+	if (!port)
+		return -EINVAL;
 
 	spin_lock_bh(&br->lock);
-	ret = br_setport(br_port_get_rtnl(dev), data);
+	ret = br_setport(port, data);
 	spin_unlock_bh(&br->lock);
 
 	return ret;
@@ -956,7 +959,12 @@ static int br_port_fill_slave_info(struct sk_buff *skb,
 				   const struct net_device *brdev,
 				   const struct net_device *dev)
 {
-	return br_port_fill_attrs(skb, br_port_get_rtnl(dev));
+	struct net_bridge_port *port = br_port_get_rtnl(dev);
+
+	if (!port)
+		return -EINVAL;
+
+	return br_port_fill_attrs(skb, port);
 }
 
 static size_t br_port_get_slave_size(const struct net_device *brdev,
-- 
2.7.4
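The pattern this patch applies (look the pointer up once, reject NULL with -EINVAL, then use the cached pointer) can be sketched outside the kernel as follows. All names here (`port_model`, `dev_model`, `port_get`, `port_fill_cost`) are hypothetical stand-ins for illustration and are not bridge code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for a device and its optional port data. */
struct port_model { int path_cost; };
struct dev_model { struct port_model *port; /* NULL when not a port */ };

/* May return NULL, like the accessor discussed in the patch. */
static struct port_model *port_get(const struct dev_model *dev)
{
	return dev->port;
}

/* Look the port up once, reject NULL, then use the cached pointer. */
static int port_fill_cost(const struct dev_model *dev, int *cost)
{
	struct port_model *port;

	assert(dev && cost);

	port = port_get(dev);
	if (!port)
		return -EINVAL;

	*cost = port->path_cost;
	return 0;
}
```

Calling the accessor once also means the check and the use are guaranteed to see the same pointer, which would not be obvious if the accessor were called twice.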
pull request: bluetooth 2018-04-08
Hi Dave,

Here's one important Bluetooth fix for the 4.17-rc series that's needed to
pass several Bluetooth qualification test cases.

Let me know if there are any issues pulling. Thanks.

Johan

---
The following changes since commit b5dbc28762fd3fd40ba76303be0c7f707826f982:

  Merge tag 'kbuild-fixes-v4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild (2018-03-30 18:53:57 -1000)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth.git for-upstream

for you to fetch changes up to 082f2300cfa1a3d9d5221c38c5eba85d4ab98bd8:

  Bluetooth: Fix connection if directed advertising and privacy is used (2018-04-03 16:12:56 +0200)

Szymon Janc (1):
      Bluetooth: Fix connection if directed advertising and privacy is used

 include/net/bluetooth/hci_core.h | 2 +-
 net/bluetooth/hci_conn.c         | 29 +
 net/bluetooth/hci_event.c        | 15 +++
 net/bluetooth/l2cap_core.c       | 2 +-
 4 files changed, 34 insertions(+), 14 deletions(-)
Re: [PATCH net 0/8] net: fix uninit-values in networking stack
On 04/08/2018 09:49 AM, David Miller wrote:
> From: Eric Dumazet
> Date: Sun, 8 Apr 2018 09:38:13 -0700
>
>> On 04/07/2018 07:40 PM, David Miller wrote:
>>> From: Eric Dumazet
>>> Date: Sat, 7 Apr 2018 13:42:35 -0700
>>>
>>>> It seems syzbot got new features enabled, and fired some interesting
>>>> reports. Oh well.
>>>
>>> Series applied, however in patch #7 the condition syzbot detects
>>> cannot happen.
>>>
>>> In all code paths that lead to __mkroute_output() with res->type
>>> uninitialized, __mkroute_output() will reassign the local variable
>>> 'type' before reading it.
>>
>> Well, we have :
>>
>> 	u16 type = res->type;
>> 	...
>>
>> 	if (ipv4_is_lbcast(fl4->daddr))
>> 		type = RTN_BROADCAST;
>> 	else if (ipv4_is_multicast(fl4->daddr))
>> 		type = RTN_MULTICAST;
>> 	else if (ipv4_is_zeronet(fl4->daddr))
>> 		return ERR_PTR(-EINVAL);
>>
>> 	...
>>
>> 	if (type == RTN_BROADCAST) { /* This is where KMSAN complained */
>>
>> So it looks like type could have been random at this point.
>
> Ok, then. It seems that the requirement is:
>
> fl4->flowi4_oif is non-zero
> fl4->daddr is neither local multicast nor lbcast
> fl4->flowi4_proto is IPPROTO_IGMP
>
> Then we can trigger such a sequence of events.
>

OK, maybe some more work then ;)

I also have a report of a WARN() in ip_rt_bug(), added in commit
c378a9c019cf5e017d1ed24954b54fae7bebd2bc by Dave Jones.

Not sure what to do, maybe revert, since ip_rt_bug() is not catastrophic.

WARNING: CPU: 0 PID: 11678 at net/ipv4/route.c:1213 ip_rt_bug+0x15/0x20 net/ipv4/route.c:1212
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 11678 Comm: kworker/u4:7 Not tainted 4.16.0-rc6+ #289 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x1f4/0x2b0 lib/bug.c:186 fixup_bug.part.10+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986 RIP: 0010:ip_rt_bug+0x15/0x20 net/ipv4/route.c:1212 RSP: 0018:8801db007290 EFLAGS: 00010282 RAX: dc00 RBX: 8801d8dda3c0 RCX: 856c31ca RDX: 0100 RSI: 8858c300 RDI: 0282 RBP: 8801db007298 R08: 11003b600de1 R09: R10: R11: R12: 8801d8dda3c0 R13: 88019bdb2200 R14: 88019bdeed80 R15: 8801d8dda418 dst_output include/net/dst.h:444 [inline] ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124 ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1414 ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1434 icmp_push_reply+0x395/0x4f0 net/ipv4/icmp.c:394 icmp_send+0x1136/0x19b0 net/ipv4/icmp.c:741 ipv4_link_failure+0x2a/0x1b0 net/ipv4/route.c:1200 dst_link_failure include/net/dst.h:427 [inline] arp_error_report+0xae/0x180 net/ipv4/arp.c:297 neigh_invalidate+0x225/0x530 net/core/neighbour.c:883 neigh_timer_handler+0x897/0xd60 net/core/neighbour.c:969 call_timer_fn+0x228/0x820 kernel/time/timer.c:1326 expire_timers kernel/time/timer.c:1363 [inline] __run_timers+0x7ee/0xb70 kernel/time/timer.c:1666 run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692 __do_softirq+0x2d7/0xb85 kernel/softirq.c:285 invoke_softirq kernel/softirq.c:365 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:541 [inline] smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:857
Re: [PATCH] vhost-net: set packet weight of tx polling to 2 * vq size
From: haibinzhang(张海斌)
Date: Fri, 6 Apr 2018 08:22:37 +

> handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
> polling udp packets with small length(e.g. 1byte udp payload), because setting
> VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet
> length.
>
> Ping-Latencies shown below were tested between two Virtual Machines using
> netperf (UDP_STREAM, len=1), and then another machine pinged the client:
 ...
> Signed-off-by: Haibin Zhang
> Signed-off-by: Yunfang Tai
> Signed-off-by: Lidong Chen

Michael and Jason, please review.
Re: [PATCH] make net_gso_ok return false when gso_type is zero(invalid)
From: Wenhua Shi
Date: Fri, 6 Apr 2018 03:43:39 +0200

> Signed-off-by: Wenhua Shi

This precondition should be made impossible instead of having to do
an extra check everywhere that this helper is invoked, many of which
are in fast paths.
Re: [PATCH net 0/8] net: fix uninit-values in networking stack
From: Eric Dumazet
Date: Sun, 8 Apr 2018 09:38:13 -0700

> On 04/07/2018 07:40 PM, David Miller wrote:
>> From: Eric Dumazet
>> Date: Sat, 7 Apr 2018 13:42:35 -0700
>>
>>> It seems syzbot got new features enabled, and fired some interesting
>>> reports. Oh well.
>>
>> Series applied, however in patch #7 the condition syzbot detects
>> cannot happen.
>>
>> In all code paths that lead to __mkroute_output() with res->type
>> uninitialized, __mkroute_output() will reassign the local variable
>> 'type' before reading it.
>
> Well, we have :
>
> 	u16 type = res->type;
> 	...
>
> 	if (ipv4_is_lbcast(fl4->daddr))
> 		type = RTN_BROADCAST;
> 	else if (ipv4_is_multicast(fl4->daddr))
> 		type = RTN_MULTICAST;
> 	else if (ipv4_is_zeronet(fl4->daddr))
> 		return ERR_PTR(-EINVAL);
>
> 	...
>
> 	if (type == RTN_BROADCAST) { /* This is where KMSAN complained */
>
> So it looks like type could have been random at this point.

Ok, then. It seems that the requirement is:

	fl4->flowi4_oif is non-zero
	fl4->daddr is neither local multicast nor lbcast
	fl4->flowi4_proto is IPPROTO_IGMP

Then we can trigger such a sequence of events.
Re: [patch net] devlink: convert occ_get op to separate registration
From: Jiri Pirko
Date: Thu, 5 Apr 2018 22:13:21 +0200

> From: Jiri Pirko
>
> This resolves race during initialization where the resources with
> ops are registered before driver and the structures used by occ_get
> op is initialized. So keep occ_get callbacks registered only when
> all structs are initialized.
 ...
> Fixes: d9f9b9a4d05f ("devlink: Add support for resource abstraction")
> Signed-off-by: Jiri Pirko

Applied and queued up for -stable, thanks.
Re: [PATCH] ARM: dts: ls1021a: Specify TBIPA register address
From: Esben Haabendal
Date: Fri, 6 Apr 2018 14:46:35 +0200

> From: Esben Haabendal
>
> The current (mildly evil) fsl_pq_mdio code uses an undocumented shadow of
> the TBIPA register on LS1021A, which happens to be read-only.
> Changing TBI PHY address therefore does not work on LS1021A.
>
> The real (and documented) address of the TBIPA register lies in the eTSEC
> block and not in MDIO/MII, which is read/write, so using that fixes
> the problem.
>
> Signed-off-by: Esben Haabendal

Applied.
Re: [PATCH 1/2] net/fsl_pq_mdio: Allow explicit speficition of TBIPA address
From: Esben Haabendal
Date: Fri, 6 Apr 2018 14:38:34 +0200

> From: Esben Haabendal
>
> This introduces a simpler and generic method for finding (and mapping)
> the TBIPA register.
>
> Instead of relying on complicated logic for finding the TBIPA register
> address based on the MDIO or MII register block base
> address, which even in some cases relies on undocumented shadow registers,
> a second "reg" entry for the mdio bus devicetree node specifies the TBIPA
> register.
>
> Backwards compatibility is kept, as the existing logic is applied when
> only a single "reg" mapping is specified.
>
> Signed-off-by: Esben Haabendal

Applied.
Re: [PATCH v4] net: thunderx: rework mac addresses list to u64 array
From: Vadim Lomovtsev
Date: Fri, 6 Apr 2018 12:53:54 -0700

> @@ -1929,7 +1929,7 @@ static void nicvf_set_rx_mode_task(struct work_struct *work_arg)
>  						  work.work);
>  	struct nicvf *nic = container_of(vf_work, struct nicvf, rx_mode_work);
>  	union nic_mbx mbx = {};
> -	struct xcast_addr *xaddr, *next;
> +	int idx = 0;

No need to initialize idx.

> +	for (idx = 0; idx < vf_work->mc->count; idx++) {

As it is always explicitly initialized at, and only used inside of, this loop.
Re: [PATCH net 0/5] ibmvnic: Fix driver reset and DMA bugs
From: Thomas Falcon
Date: Fri, 6 Apr 2018 18:37:01 -0500

> This patch series introduces some fixes to the driver reset
> routines and a patch that fixes mistakes caught by the kernel
> DMA debugger.
>
> The reset fixes include a fix to reset TX queue counters properly
> after a reset as well as updates to driver reset error-handling code.
> It also provides updates to the reset handling routine for redundant
> backing VF failover and partition migration cases.

Series applied, thanks Thomas.
Re: [PATCH net 0/8] net: fix uninit-values in networking stack
On 04/07/2018 07:40 PM, David Miller wrote:
> From: Eric Dumazet
> Date: Sat, 7 Apr 2018 13:42:35 -0700
>
>> It seems syzbot got new features enabled, and fired some interesting
>> reports. Oh well.
>
> Series applied, however in patch #7 the condition syzbot detects
> cannot happen.
>
> In all code paths that lead to __mkroute_output() with res->type
> uninitialized, __mkroute_output() will reassign the local variable
> 'type' before reading it.

Well, we have :

	u16 type = res->type;
	...

	if (ipv4_is_lbcast(fl4->daddr))
		type = RTN_BROADCAST;
	else if (ipv4_is_multicast(fl4->daddr))
		type = RTN_MULTICAST;
	else if (ipv4_is_zeronet(fl4->daddr))
		return ERR_PTR(-EINVAL);

	...

	if (type == RTN_BROADCAST) { /* This is where KMSAN complained */

So it looks like type could have been random at this point.

>
> Furthermore, by doing a full structure initialization lots of
> unrelated things will be initialized now as well.

fib_result is 40 bytes on 64bit arches.

>
> We explicitly are only setting up the "inputs" of the fib_result
> object before we call fib_lookup(). The prefixlen and other members
> have no business being initialized there.
>

Yep

We might put all inputs at the beginning of the structure, and output at the end.
then replace sizeof() by offsetof(), but this looks a bit convoluted and maybe risky.
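The sizeof()-to-offsetof() idea floated above could look roughly like the following. This is a sketch under stated assumptions: `struct lookup_result` and its field names are hypothetical and do not reflect the real layout of `struct fib_result`; the only point is that placing the caller-initialised members first lets a single memset() stop at the first output member.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout, NOT the kernel's struct fib_result. */
struct lookup_result {
	/* "inputs": must be initialised by the caller before the lookup */
	unsigned int in_flags;
	void *in_table;
	/* "outputs": written by the lookup itself, left untouched here */
	unsigned char out_prefixlen;
	unsigned char out_scope;
	void *out_info;
};

/* Zero only the leading input members instead of sizeof(*res) bytes. */
static void lookup_result_init(struct lookup_result *res)
{
	assert(res != NULL);
	memset(res, 0, offsetof(struct lookup_result, out_prefixlen));
}
```

The fragility mentioned in the mail is visible here: the scheme silently breaks if someone later adds an input member after `out_prefixlen` or reorders the fields.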
Re: [Patch net] tipc: use the right skb in tipc_sk_fill_sock_diag()
From: Cong Wang
Date: Fri, 6 Apr 2018 18:54:52 -0700

> Commit 4b2e6877b879 ("tipc: Fix namespace violation in tipc_sk_fill_sock_diag")
> tried to fix the crash but failed, the crash is still 100% reproducible
> with it.
>
> In tipc_sk_fill_sock_diag(), skb is the diag dump we are filling, it is not
> correct to retrieve its NETLINK_CB(), instead, like other protocol diag,
> we should use NETLINK_CB(cb->skb).sk here.
>
> Reported-by:
> Fixes: 4b2e6877b879 ("tipc: Fix namespace violation in tipc_sk_fill_sock_diag")
> Fixes: c30b70deb5f4 (tipc: implement socket diagnostics for AF_TIPC)
> Cc: GhantaKrishnamurthy MohanKrishna
> Cc: Jon Maloy
> Cc: Ying Xue
> Signed-off-by: Cong Wang

Applied, thank you.
Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice
From: Siwei Liu
Date: Fri, 6 Apr 2018 19:32:05 -0700

> And I assume everyone here understands the use case for live
> migration (in the context of providing cloud service) is very
> different, and we have to hide the netdevs. If not, I'm more than
> happy to clarify.

I think you still need to clarify.

netdevs are netdevs. If they have special attributes, mark them as
such and the tools base their actions upon that.

"Hiding", or changing classes, doesn't make any sense to me still.
Re: [PATCH net] sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6
From: Eric Dumazet
Date: Sun, 8 Apr 2018 07:52:08 -0700

> Check must happen before call to ipv6_addr_v4mapped()
>
> syzbot report was :
 ...
> Signed-off-by: Eric Dumazet
> Cc: Vlad Yasevich
> Cc: Neil Horman
> Reported-by: syzbot

Applied and queued up for -stable, thanks Eric.
Re: possible deadlock in perf_event_detach_bpf_prog
On Thu, Mar 29, 2018 at 2:18 PM, Daniel Borkmannwrote: > On 03/29/2018 11:04 PM, syzbot wrote: >> Hello, >> >> syzbot hit the following crash on upstream commit >> 3eb2ce825ea1ad89d20f7a3b5780df850e4be274 (Sun Mar 25 22:44:30 2018 +) >> Linux 4.16-rc7 >> syzbot dashboard link: >> https://syzkaller.appspot.com/bug?extid=dc5ca0e4c9bfafaf2bae >> >> Unfortunately, I don't have any reproducer for this crash yet. >> Raw console output: >> https://syzkaller.appspot.com/x/log.txt?id=4742532743299072 >> Kernel config: >> https://syzkaller.appspot.com/x/.config?id=-8440362230543204781 >> compiler: gcc (GCC) 7.1.1 20170620 >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit: >> Reported-by: syzbot+dc5ca0e4c9bfafaf2...@syzkaller.appspotmail.com >> It will help syzbot understand when the bug is fixed. See footer for details. >> If you forward the report, please keep this part and the footer. >> >> >> == >> WARNING: possible circular locking dependency detected >> 4.16.0-rc7+ #3 Not tainted >> -- >> syz-executor7/24531 is trying to acquire lock: >> (bpf_event_mutex){+.+.}, at: [<8a849b07>] >> perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854 >> >> but task is already holding lock: >> (>mmap_sem){}, at: [<38768f87>] vm_mmap_pgoff+0x198/0x280 >> mm/util.c:353 >> >> which lock already depends on the new lock. >> >> >> the existing dependency chain (in reverse order) is: >> >> -> #1 (>mmap_sem){}: >>__might_fault+0x13a/0x1d0 mm/memory.c:4571 >>_copy_to_user+0x2c/0xc0 lib/usercopy.c:25 >>copy_to_user include/linux/uaccess.h:155 [inline] >>bpf_prog_array_copy_info+0xf2/0x1c0 kernel/bpf/core.c:1694 >>perf_event_query_prog_array+0x1c7/0x2c0 kernel/trace/bpf_trace.c:891 > > Looks like we should move the two copy_to_user() outside of > bpf_event_mutex section to avoid the deadlock. This is introduced by one of my previous patches. The above suggested fix makes sense. I will craft a patch and send to the mailing list for bpf branch soon. 
> >>_perf_ioctl kernel/events/core.c:4750 [inline] >>perf_ioctl+0x3e1/0x1480 kernel/events/core.c:4770 >>vfs_ioctl fs/ioctl.c:46 [inline] >>do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686 >>SYSC_ioctl fs/ioctl.c:701 [inline] >>SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692 >>do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 >>entry_SYSCALL_64_after_hwframe+0x42/0xb7 >> >> -> #0 (bpf_event_mutex){+.+.}: >>lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920 >>__mutex_lock_common kernel/locking/mutex.c:756 [inline] >>__mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893 >>mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908 >>perf_event_detach_bpf_prog+0x92/0x3d0 kernel/trace/bpf_trace.c:854 >>perf_event_free_bpf_prog kernel/events/core.c:8147 [inline] >>_free_event+0xbdb/0x10f0 kernel/events/core.c:4116 >>put_event+0x24/0x30 kernel/events/core.c:4204 >>perf_mmap_close+0x60d/0x1010 kernel/events/core.c:5172 >>remove_vma+0xb4/0x1b0 mm/mmap.c:172 >>remove_vma_list mm/mmap.c:2490 [inline] >>do_munmap+0x82a/0xdf0 mm/mmap.c:2731 >>mmap_region+0x59e/0x15a0 mm/mmap.c:1646 >>do_mmap+0x6c0/0xe00 mm/mmap.c:1483 >>do_mmap_pgoff include/linux/mm.h:2223 [inline] >>vm_mmap_pgoff+0x1de/0x280 mm/util.c:355 >>SYSC_mmap_pgoff mm/mmap.c:1533 [inline] >>SyS_mmap_pgoff+0x462/0x5f0 mm/mmap.c:1491 >>SYSC_mmap arch/x86/kernel/sys_x86_64.c:100 [inline] >>SyS_mmap+0x16/0x20 arch/x86/kernel/sys_x86_64.c:91 >>do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287 >>entry_SYSCALL_64_after_hwframe+0x42/0xb7 >> >> other info that might help us debug this: >> >> Possible unsafe locking scenario: >> >>CPU0CPU1 >> >> lock(>mmap_sem); >>lock(bpf_event_mutex); >>lock(>mmap_sem); >> lock(bpf_event_mutex); >> >> *** DEADLOCK *** >> >> 1 lock held by syz-executor7/24531: >> #0: (>mmap_sem){}, at: [<38768f87>] >> vm_mmap_pgoff+0x198/0x280 mm/util.c:353 >> >> stack backtrace: >> CPU: 0 PID: 24531 Comm: syz-executor7 Not tainted 4.16.0-rc7+ #3 >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >> 
Google 01/01/2011 >> Call Trace: >> __dump_stack lib/dump_stack.c:17 [inline] >> dump_stack+0x194/0x24d lib/dump_stack.c:53 >> print_circular_bug.isra.38+0x2cd/0x2dc kernel/locking/lockdep.c:1223 >> check_prev_add kernel/locking/lockdep.c:1863 [inline] >> check_prevs_add kernel/locking/lockdep.c:1976 [inline] >> validate_chain kernel/locking/lockdep.c:2417 [inline] >> __lock_acquire+0x30a8/0x3e00
KMSAN: uninit-value in tipc_subscrb_rcv_cb
Hello, syzbot hit the following crash on https://github.com/google/kmsan.git/master commit e2ab7e8abba47a2f2698216258e5d8727ae58717 (Fri Apr 6 16:24:31 2018 +) kmsan: temporarily disable visitAsmInstruction() to help syzbot syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=75e6e042c5bbf691fc82 Unfortunately, I don't have any reproducer for this crash yet. Raw console output: https://syzkaller.appspot.com/x/log.txt?id=5784467448791040 Kernel config: https://syzkaller.appspot.com/x/.config?id=6627248707860932248 compiler: clang version 7.0.0 (trunk 329060) (llvm/trunk 329054) IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+75e6e042c5bbf691f...@syzkaller.appspotmail.com It will help syzbot understand when the bug is fixed. See footer for details. If you forward the report, please keep this part and the footer. == BUG: KMSAN: uninit-value in htohl net/tipc/subscr.c:66 [inline] BUG: KMSAN: uninit-value in tipc_subscrb_rcv_cb+0x418/0xe80 net/tipc/subscr.c:339 CPU: 1 PID: 5017 Comm: kworker/u4:6 Not tainted 4.16.0+ #81 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: tipc_rcv tipc_recv_work Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 htohl net/tipc/subscr.c:66 [inline] tipc_subscrb_rcv_cb+0x418/0xe80 net/tipc/subscr.c:339 tipc_receive_from_sock+0x64c/0x800 net/tipc/server.c:271 tipc_recv_work+0xd8/0x1f0 net/tipc/server.c:618 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2113 worker_thread+0x113c/0x24f0 kernel/workqueue.c:2247 kthread+0x539/0x720 kernel/kthread.c:239 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:406 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 
kmem_cache_alloc+0xaab/0xb90 mm/slub.c:2756 tipc_receive_from_sock+0x15c/0x800 net/tipc/server.c:253 tipc_recv_work+0xd8/0x1f0 net/tipc/server.c:618 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2113 worker_thread+0x113c/0x24f0 kernel/workqueue.c:2247 kthread+0x539/0x720 kernel/kthread.c:239 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:406 == Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 5017 Comm: kworker/u4:6 Tainted: GB4.16.0+ #81 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: tipc_rcv tipc_recv_work Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 panic+0x39d/0x940 kernel/panic.c:183 kmsan_report+0x238/0x240 mm/kmsan/kmsan.c:1083 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 htohl net/tipc/subscr.c:66 [inline] tipc_subscrb_rcv_cb+0x418/0xe80 net/tipc/subscr.c:339 tipc_receive_from_sock+0x64c/0x800 net/tipc/server.c:271 tipc_recv_work+0xd8/0x1f0 net/tipc/server.c:618 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2113 worker_thread+0x113c/0x24f0 kernel/workqueue.c:2247 kthread+0x539/0x720 kernel/kthread.c:239 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:406 Shutting down cpus with NMI Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds.. --- This bug is generated by a dumb bot. It may contain errors. See https://goo.gl/tpsmEJ for details. Direct all questions to syzkal...@googlegroups.com. syzbot will keep track of this bug report. If you forgot to add the Reported-by tag, once the fix for this bug is merged into any tree, please reply to this email with: #syz fix: exact-commit-title To mark this as a duplicate of another syzbot report, please reply with: #syz dup: exact-subject-of-another-report If it's a one-off invalid bug report, please reply with: #syz invalid Note: if the crash happens again, it will cause creation of a new bug report. 
Note: all commands must start from beginning of the line in the email body.
[PATCH net] sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6
Check must happen before call to ipv6_addr_v4mapped()

syzbot report was :

BUG: KMSAN: uninit-value in sctp_sockaddr_af net/sctp/socket.c:359 [inline]
BUG: KMSAN: uninit-value in sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384
CPU: 0 PID: 3576 Comm: syzkaller968804 Not tainted 4.16.0+ #82
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x185/0x1d0 lib/dump_stack.c:53
 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
 sctp_sockaddr_af net/sctp/socket.c:359 [inline]
 sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384
 sctp_bind+0x149/0x190 net/sctp/socket.c:332
 inet6_bind+0x1fd/0x1820 net/ipv6/af_inet6.c:293
 SYSC_bind+0x3f2/0x4b0 net/socket.c:1474
 SyS_bind+0x54/0x80 net/socket.c:1460
 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x3d/0xa2
RIP: 0033:0x43fd49
RSP: 002b:7ffe99df3d28 EFLAGS: 0213 ORIG_RAX: 0031
RAX: ffda RBX: 004002c8 RCX: 0043fd49
RDX: 0010 RSI: 2000 RDI: 0003
RBP: 006ca018 R08: 004002c8 R09: 004002c8
R10: 004002c8 R11: 0213 R12: 00401670
R13: 00401700 R14: R15:
Local variable description: address@SYSC_bind
Variable was created at:
 SYSC_bind+0x6f/0x4b0 net/socket.c:1461
 SyS_bind+0x54/0x80 net/socket.c:1460

Signed-off-by: Eric Dumazet
Cc: Vlad Yasevich
Cc: Neil Horman
Reported-by: syzbot
---
 net/sctp/socket.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 7a10ae3c3d8293abecd955ff6a5a19e60dcc6f95..eb712df7156eda7124cd88b4034359b088c2c475 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -357,11 +357,14 @@ static struct sctp_af *sctp_sockaddr_af(struct sctp_sock *opt,
 	if (!opt->pf->af_supported(addr->sa.sa_family, opt))
 		return NULL;
 
-	/* V4 mapped address are really of AF_INET family */
-	if (addr->sa.sa_family == AF_INET6 &&
-	    ipv6_addr_v4mapped(&addr->v6.sin6_addr) &&
-	    !opt->pf->af_supported(AF_INET, opt))
-		return NULL;
+	if (addr->sa.sa_family == AF_INET6) {
+		if (len < SIN6_LEN_RFC2133)
+			return NULL;
+		/* V4 mapped address are really of AF_INET family */
+		if (ipv6_addr_v4mapped(&addr->v6.sin6_addr) &&
+		    !opt->pf->af_supported(AF_INET, opt))
+			return NULL;
+	}
 
 	/* If we get this far, af is valid. */
 	af = sctp_get_af_specific(addr->sa.sa_family);
-- 
2.17.0.484.g0c8726318c-goog
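The ordering issue fixed by this patch (validate the supplied length before touching sin6_addr) can be reproduced in a user-space sketch. Everything below is illustrative only: `classify_addr`, `enum addr_class` and `SIN6_MIN_LEN` are hypothetical names, with `SIN6_MIN_LEN` merely playing the role that SIN6_LEN_RFC2133 plays in the patch, and the code uses the POSIX socket headers rather than kernel ones.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

enum addr_class { ADDR_INVALID, ADDR_V4, ADDR_V6, ADDR_V4MAPPED };

/* Hypothetical minimum: enough bytes to cover sin6_addr completely. */
#define SIN6_MIN_LEN \
	(offsetof(struct sockaddr_in6, sin6_addr) + sizeof(struct in6_addr))

/* Classify a caller-supplied address without reading past `len` bytes. */
static enum addr_class classify_addr(const struct sockaddr *sa, socklen_t len)
{
	const struct sockaddr_in6 *sin6;

	assert(sa != NULL);

	if ((size_t)len < sizeof(struct sockaddr))
		return ADDR_INVALID;
	if (sa->sa_family == AF_INET)
		return ADDR_V4;
	if (sa->sa_family != AF_INET6)
		return ADDR_INVALID;

	/* The length check must come before sin6_addr is inspected. */
	if ((size_t)len < SIN6_MIN_LEN)
		return ADDR_INVALID;

	sin6 = (const struct sockaddr_in6 *)sa;
	return IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr) ? ADDR_V4MAPPED : ADDR_V6;
}
```

A caller that passes an AF_INET6 family with only sizeof(struct sockaddr) bytes is rejected before the mapped-address test runs, which corresponds to the case the KMSAN report describes.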
Re: [PATCH v2 net] net: dsa: Discard frames from unused ports
From: Andrew Lunn
Date: Sat, 7 Apr 2018 20:37:40 +0200

> The Marvell switches under some conditions will pass a frame to the
> host with the port being the CPU port. Such frames are invalid, and
> should be dropped. Not dropping them can result in a crash when
> incrementing the receive statistics for an invalid port.
>
> Reported-by: Chris Healy
> Fixes: 91da11f870f0 ("net: Distributed Switch Architecture protocol support")
> Signed-off-by: Andrew Lunn
> ---
> v2:
> Use an earlier revision for the fixes tag.
> Add unlikely annotation

Applied and queued up for -stable, thanks.
Re: [PATCH net] sctp: do not leak kernel memory to user space
From: Eric Dumazet
Date: Sat, 7 Apr 2018 17:15:22 -0700

> syzbot produced a nice report [1]
>
> Issue here is that a recvmmsg() managed to leak 8 bytes of kernel memory
> to user space, because sin_zero (padding field) was not properly cleared.
 ...
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Eric Dumazet
> Cc: Vlad Yasevich
> Cc: Neil Horman
> Reported-by: syzbot

Applied and queued up for -stable, thanks Eric.
Re: [PATCH v2] net: phy: marvell10g: add thermal hwmon device
On Tue, Apr 03, 2018 at 10:31:45AM +0100, Russell King wrote:
> Add a thermal monitoring device for the Marvell 88x3310, which updates
> once a second. We also need to hook into the suspend/resume mechanism
> to ensure that the thermal monitoring is reconfigured when we resume.
>
> Suggested-by: Andrew Lunn
> Signed-off-by: Russell King
> ---
> v2: update to apply to net-next
>
> drivers/net/phy/marvell10g.c | 184
> ++-
> 1 file changed, 182 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c
> index 8a0bd98fdec7..db9d66781da6 100644
> --- a/drivers/net/phy/marvell10g.c
> +++ b/drivers/net/phy/marvell10g.c
> @@ -21,8 +21,10 @@
>  * If both the fiber and copper ports are connected, the first to gain
>  * link takes priority and the other port is completely locked out.
>  */
> -#include
> +#include
> +#include
> #include
> +#include
>
> enum {
> 	MV_PCS_BASE_T = 0x,
> @@ -40,6 +42,19 @@ enum {
> 	 */
> 	MV_AN_CTRL1000 = 0x8000, /* 1000base-T control register */
> 	MV_AN_STAT1000 = 0x8001, /* 1000base-T status register */
> +
> +	/* Vendor2 MMD registers */
> +	MV_V2_TEMP_CTRL = 0xf08a,
> +	MV_V2_TEMP_CTRL_MASK = 0xc000,
> +	MV_V2_TEMP_CTRL_SAMPLE = 0x,
> +	MV_V2_TEMP_CTRL_DISABLE = 0xc000,
> +	MV_V2_TEMP = 0xf08c,
> +	MV_V2_TEMP_UNKNOWN = 0x9600, /* unknown function */
> +};
> +
> +struct mv3310_priv {
> +	struct device *hwmon_dev;
> +	char *hwmon_name;
> };
>
> static int mv3310_modify(struct phy_device *phydev, int devad, u16 reg,
> @@ -60,17 +75,180 @@ static int mv3310_modify(struct phy_device *phydev, int
> devad, u16 reg,
> 	return ret < 0 ? ret : 1;
> }
>
> +#ifdef CONFIG_HWMON
> +static umode_t mv3310_hwmon_is_visible(const void *data,
> +				       enum hwmon_sensor_types type,
> +				       u32 attr, int channel)
> +{
> +	if (type == hwmon_chip && attr == hwmon_chip_update_interval)
> +		return 0444;
> +	if (type == hwmon_temp && attr == hwmon_temp_input)
> +		return 0444;
> +	return 0;
> +}
> +
> +static int mv3310_hwmon_read(struct device *dev, enum hwmon_sensor_types
> type,
> +			     u32 attr, int channel, long *value)
> +{
> +	struct phy_device *phydev = dev_get_drvdata(dev);
> +	int temp;
> +
> +	if (type == hwmon_chip && attr == hwmon_chip_update_interval) {
> +		*value = MSEC_PER_SEC;

The update_interval attribute is supposed to be used for setting
an update interval in the chip. Having it return a constant doesn't
really serve a useful purpose.

Guenter

> +		return 0;
> +	}
> +
> +	if (type == hwmon_temp && attr == hwmon_temp_input) {
> +		temp = phy_read_mmd(phydev, MDIO_MMD_VEND2, MV_V2_TEMP);
> +		if (temp < 0)
> +			return temp;
> +
> +		*value = ((temp & 0xff) - 75) * 1000;
> +
> +		return 0;
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
> +static const struct hwmon_ops mv3310_hwmon_ops = {
> +	.is_visible = mv3310_hwmon_is_visible,
> +	.read = mv3310_hwmon_read,
> +};
> +
> +static u32 mv3310_hwmon_chip_config[] = {
> +	HWMON_C_REGISTER_TZ | HWMON_C_UPDATE_INTERVAL,
> +	0,
> +};
> +
> +static const struct hwmon_channel_info mv3310_hwmon_chip = {
> +	.type = hwmon_chip,
> +	.config = mv3310_hwmon_chip_config,
> +};
> +
> +static u32 mv3310_hwmon_temp_config[] = {
> +	HWMON_T_INPUT,
> +	0,
> +};
> +
> +static const struct hwmon_channel_info mv3310_hwmon_temp = {
> +	.type = hwmon_temp,
> +	.config = mv3310_hwmon_temp_config,
> +};
> +
> +static const struct hwmon_channel_info *mv3310_hwmon_info[] = {
> +	&mv3310_hwmon_chip,
> +	&mv3310_hwmon_temp,
> +	NULL,
> +};
> +
> +static const struct hwmon_chip_info mv3310_hwmon_chip_info = {
> +	.ops = &mv3310_hwmon_ops,
> +	.info = mv3310_hwmon_info,
> +};
> +
> +static int mv3310_hwmon_config(struct phy_device *phydev, bool enable)
> +{
> +	u16 val;
> +	int ret;
> +
> +	ret = phy_write_mmd(phydev, MDIO_MMD_VEND2, MV_V2_TEMP,
> +			    MV_V2_TEMP_UNKNOWN);
> +	if (ret < 0)
> +		return ret;
> +
> +	val = enable ? MV_V2_TEMP_CTRL_SAMPLE : MV_V2_TEMP_CTRL_DISABLE;
> +	ret = mv3310_modify(phydev, MDIO_MMD_VEND2, MV_V2_TEMP_CTRL,
> +			    MV_V2_TEMP_CTRL_MASK, val);
> +
> +	return ret < 0 ? ret : 0;
> +}
> +
> +static void mv3310_hwmon_disable(void *data)
> +{
> +	struct phy_device *phydev = data;
> +
> +	mv3310_hwmon_config(phydev, false);
> +}
> +
> +static int mv3310_hwmon_probe(struct phy_device *phydev)
> +{
> +	struct device *dev = &phydev->mdio.dev;
> +	struct mv3310_priv *priv = dev_get_drvdata(&phydev->mdio.dev);
> +	int
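The temperature conversion in the quoted mv3310_hwmon_read() can be tried out in isolation. The sketch below is only a standalone restatement of the expression ((temp & 0xff) - 75) * 1000 from the patch: the low 8 bits of the register value are offset by 75 and scaled to millidegrees Celsius, the unit hwmon uses for temperature inputs. The function name mv3310_raw_to_millicelsius() is hypothetical and is not part of the driver.

```c
#include <assert.h>

/* Hypothetical helper mirroring the expression in the quoted patch:
 * keep the low 8 bits, subtract the 75 degree offset, and scale to
 * millidegrees Celsius. */
static long mv3310_raw_to_millicelsius(int raw)
{
	/* In the patch, negative values are error codes and are returned
	 * before the conversion, so they are not expected here. */
	assert(raw >= 0);
	return ((long)(raw & 0xff) - 75) * 1000;
}
```

For instance, a register value whose low byte is 0x64 (100) maps to 25000, i.e. 25 degrees Celsius, and any upper bits are ignored.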
Re: [PATCH bpf-next v8 05/11] seccomp,landlock: Enforce Landlock programs per process hierarchy
On 02/27/2018 10:48 PM, Mickaël Salaün wrote:
>
> On 27/02/2018 17:39, Andy Lutomirski wrote:
>> On Tue, Feb 27, 2018 at 5:32 AM, Alexei Starovoitov
>> wrote:
>>> On Tue, Feb 27, 2018 at 05:20:55AM +, Andy Lutomirski wrote:
>>>> On Tue, Feb 27, 2018 at 4:54 AM, Alexei Starovoitov wrote:
>>>>> On Tue, Feb 27, 2018 at 04:40:34AM +, Andy Lutomirski wrote:
>>>>>> On Tue, Feb 27, 2018 at 2:08 AM, Alexei Starovoitov
>>>>>> wrote:
>>>>>>> On Tue, Feb 27, 2018 at 01:41:15AM +0100, Mickaël Salaün wrote:
>>>>>>>> The seccomp(2) syscall can be used by a task to apply a Landlock
>>>>>>>> program to itself. As a seccomp filter, a Landlock program is
>>>>>>>> enforced for the current task and all its future children. A
>>>>>>>> program is immutable and a task can only add new restricting
>>>>>>>> programs to itself, forming a list of programs.
>>>>>>>>
>>>>>>>> A Landlock program is tied to a Landlock hook. If the action on a
>>>>>>>> kernel object is allowed by the other Linux security mechanisms
>>>>>>>> (e.g. DAC, capabilities, other LSM), then a Landlock hook related
>>>>>>>> to this kind of object is triggered. The list of programs for this
>>>>>>>> hook is then evaluated. Each program returns a 32-bit value which
>>>>>>>> can deny the action on a kernel object with a non-zero value. If
>>>>>>>> every program in the list returns zero, then the action on the
>>>>>>>> object is allowed.
>>>>>>>>
>>>>>>>> Multiple Landlock programs can be chained to share a 64-bit value
>>>>>>>> for a call chain (e.g. evaluating multiple elements of a file
>>>>>>>> path). This chaining is restricted when a process constructs this
>>>>>>>> chain by loading a program, but additional checks are performed
>>>>>>>> when it requests to apply this chain of programs to itself. The
>>>>>>>> restrictions ensure that it is not possible to call multiple
>>>>>>>> programs in a way that would require handling multiple shared
>>>>>>>> values (i.e. cookies) for one chain. For now, only a fs_pick
>>>>>>>> program can be chained to the same type of program, because it may
>>>>>>>> make sense if they have different triggers (cf. next commits).
>>>>>>>> These restrictions still allow reusing Landlock programs in a safe
>>>>>>>> way (e.g. use the same loaded fs_walk program with multiple chains
>>>>>>>> of fs_pick programs).
>>>>>>>>
>>>>>>>> Signed-off-by: Mickaël Salaün
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>>> +struct landlock_prog_set *landlock_prepend_prog(
>>>>>>>> +		struct landlock_prog_set *current_prog_set,
>>>>>>>> +		struct bpf_prog *prog)
>>>>>>>> +{
>>>>>>>> +	struct landlock_prog_set *new_prog_set = current_prog_set;
>>>>>>>> +	unsigned long pages;
>>>>>>>> +	int err;
>>>>>>>> +	size_t i;
>>>>>>>> +	struct landlock_prog_set tmp_prog_set = {};
>>>>>>>> +
>>>>>>>> +	if (prog->type != BPF_PROG_TYPE_LANDLOCK_HOOK)
>>>>>>>> +		return ERR_PTR(-EINVAL);
>>>>>>>> +
>>>>>>>> +	/* validate memory size allocation */
>>>>>>>> +	pages = prog->pages;
>>>>>>>> +	if (current_prog_set) {
>>>>>>>> +		size_t i;
>>>>>>>> +
>>>>>>>> +		for (i = 0; i < ARRAY_SIZE(current_prog_set->programs); i++) {
>>>>>>>> +			struct landlock_prog_list *walker_p;
>>>>>>>> +
>>>>>>>> +			for (walker_p = current_prog_set->programs[i];
>>>>>>>> +					walker_p; walker_p = walker_p->prev)
>>>>>>>> +				pages += walker_p->prog->pages;
>>>>>>>> +		}
>>>>>>>> +		/* count a struct landlock_prog_set if we need to allocate one */
>>>>>>>> +		if (refcount_read(_prog_set->usage) != 1)
>>>>>>>> +			pages += round_up(sizeof(*current_prog_set), PAGE_SIZE)
>>>>>>>> +				/ PAGE_SIZE;
>>>>>>>> +	}
>>>>>>>> +	if (pages > LANDLOCK_PROGRAMS_MAX_PAGES)
>>>>>>>> +		return ERR_PTR(-E2BIG);
>>>>>>>> +
>>>>>>>> +	/* ensure early that we can allocate enough memory for the new
>>>>>>>> +	 * prog_lists */
>>>>>>>> +	err = store_landlock_prog(_prog_set, current_prog_set, prog);
>>>>>>>> +	if (err)
>>>>>>>> +		return ERR_PTR(err);
>>>>>>>> +
>>>>>>>> +	/*
>>>>>>>> +	 * Each task_struct points to an array of prog list pointers. These
>>>>>>>> +	 * tables are duplicated when additions are made (which means each
>>>>>>>> +	 * table needs to be refcounted for the processes using it). When a new
>>>>>>>> +	 * table is created, all the refcounters on the prog_list are bumped (to
>>>>>>>> +	 * track each table that references the prog). When a new prog is
>>>>>>>> +	 * added, it's just prepended to the list for the new table to point
>>>>>>>> +	 * at.
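The evaluation rule stated in the quoted commit message (a non-zero return value from any program denies the action, all zeros allow it) can be sketched in plain C. Everything below is hypothetical scaffolding for illustration: struct prog_node, action_allowed() and the two sample callbacks are not Landlock types or functions, the walk through prev links only mirrors the list shape visible in the quoted patch, and stopping at the first non-zero verdict is an assumption rather than something the thread states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a loaded program: a callback returning a
 * 32-bit verdict, linked to the program that was added before it. */
struct prog_node {
	uint32_t (*run)(const void *object);
	const struct prog_node *prev;
};

/* Walk the list starting from the most recently added program.
 * Assumption: evaluation stops at the first non-zero verdict.
 * Returns 1 if the action is allowed, 0 if it is denied. */
static int action_allowed(const struct prog_node *newest, const void *object)
{
	const struct prog_node *p;

	for (p = newest; p != NULL; p = p->prev) {
		assert(p->run != NULL);
		if (p->run(object) != 0)
			return 0;
	}
	return 1;
}

/* Sample verdict callbacks, for illustration only. */
static uint32_t allow_all(const void *object)
{
	(void)object;
	return 0;
}

static uint32_t deny_all(const void *object)
{
	(void)object;
	return 1;
}
```

Because programs can only be added, a list that contains one denying program stays denying no matter how many permissive programs are prepended later, which matches the "only add new restricting programs" property described above.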
KMSAN: uninit-value in _decode_session6
Hello,

syzbot hit the following crash on
https://github.com/google/kmsan.git/master commit
e2ab7e8abba47a2f2698216258e5d8727ae58717 (Fri Apr 6 16:24:31 2018 +)
kmsan: temporarily disable visitAsmInstruction() to help syzbot
syzbot dashboard link: https://syzkaller.appspot.com/bug?extid=2974b85346f85b586f4d

Unfortunately, I don't have any reproducer for this crash yet.
Raw console output: https://syzkaller.appspot.com/x/log.txt?id=4871594698604544
Kernel config: https://syzkaller.appspot.com/x/.config?id=6627248707860932248
compiler: clang version 7.0.0 (trunk 329060) (llvm/trunk 329054)

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+2974b85346f85b586...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for details.
If you forward the report, please keep this part and the footer.

==
BUG: KMSAN: uninit-value in _decode_session6+0x6d1/0x1290 net/ipv6/xfrm6_policy.c:151
CPU: 1 PID: 5714 Comm: blkid Not tainted 4.16.0+ #81
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x185/0x1d0 lib/dump_stack.c:53
 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
 _decode_session6+0x6d1/0x1290 net/ipv6/xfrm6_policy.c:151
 __xfrm_decode_session+0x140/0x1c0 net/xfrm/xfrm_policy.c:2368
 xfrm_decode_session_reverse include/net/xfrm.h:1213 [inline]
 icmpv6_route_lookup net/ipv6/icmp.c:372 [inline]
 icmp6_send+0x305f/0x3460 net/ipv6/icmp.c:551
 icmpv6_send+0xe0/0x110 net/ipv6/ip6_icmp.c:43
 ip6_link_failure+0x8f/0x580 net/ipv6/route.c:2034
 dst_link_failure include/net/dst.h:426 [inline]
 ndisc_error_report+0x101/0x1a0 net/ipv6/ndisc.c:695
 neigh_invalidate+0x385/0x930 net/core/neighbour.c:883
 neigh_timer_handler+0xd85/0x12d0 net/core/neighbour.c:969
 call_timer_fn+0x26a/0x5a0 kernel/time/timer.c:1326
 expire_timers kernel/time/timer.c:1363 [inline]
 __run_timers+0xda7/0x11c0 kernel/time/timer.c:1666
 run_timer_softirq+0x43/0x70 kernel/time/timer.c:1692
 __do_softirq+0x56d/0x93d kernel/softirq.c:285
 invoke_softirq kernel/softirq.c:365 [inline]
 irq_exit+0x202/0x240 kernel/softirq.c:405
 exiting_irq+0xe/0x10 arch/x86/include/asm/apic.h:541
 smp_apic_timer_interrupt+0x64/0x90 arch/x86/kernel/apic/apic.c:1055
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:857
RIP: 0010:kmsan_get_origin_address_noruntime+0x8f/0x260 include/linux/mmzone.h:1206
RSP: :880165b0fb40 EFLAGS: 0202 ORIG_RAX: ff12
RAX: 8801e5b0fcc8 RBX: RCX: 88021fff1580
RDX: 0580 RSI: RDI: 880165b0fcc8
RBP: 880165b0fb78 R08: 01080020 R09: 0002
R10: R11: R12: 0068
R13: d3a0004b R14: 880165b0fcc8 R15:
 kmsan_set_origin_inline+0x6b/0x120 mm/kmsan/kmsan_instr.c:585
 __msan_poison_alloca+0x15c/0x1d0 mm/kmsan/kmsan_instr.c:647
 handle_mm_fault+0x1c8/0x7ba0 mm/memory.c:4114
 __do_page_fault+0xec4/0x1a10 arch/x86/mm/fault.c:1423
 do_page_fault+0xd3/0x260 arch/x86/mm/fault.c:1500
 page_fault+0x45/0x50 arch/x86/entry/entry_64.S:1151
RIP: 0033:0x7f93ad8e4789
RSP: 002b:7ffd11b3cf20 EFLAGS: 00010216
RAX: 7f93ad4742a0 RBX: 7f93adaf79a8 RCX: 04a8
RDX: 7f93ad6a9028 RSI: aaab RDI:
RBP: 7ffd11b3d000 R08: 0001 R09: 0010
R10: 7f93ad343a30 R11: 0206 R12: 7f93ad325000
R13: 7f93ad343220 R14: 7f93ad33d748 R15: 7f93adaef740

Uninit was stored to memory at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
 kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
 kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
 kmsan_memcpy_origins+0x11d/0x170 mm/kmsan/kmsan.c:526
 __msan_memcpy+0x19f/0x1f0 mm/kmsan/kmsan_instr.c:470
 skb_copy_bits+0x63a/0xdb0 net/core/skbuff.c:2046
 __pskb_pull_tail+0x483/0x22e0 net/core/skbuff.c:1883
 pskb_may_pull include/linux/skbuff.h:2112 [inline]
 _decode_session6+0x79f/0x1290 net/ipv6/xfrm6_policy.c:152
 __xfrm_decode_session+0x140/0x1c0 net/xfrm/xfrm_policy.c:2368
 xfrm_decode_session_reverse include/net/xfrm.h:1213 [inline]
 icmpv6_route_lookup net/ipv6/icmp.c:372 [inline]
 icmp6_send+0x305f/0x3460 net/ipv6/icmp.c:551
 icmpv6_send+0xe0/0x110 net/ipv6/ip6_icmp.c:43
 ip6_link_failure+0x8f/0x580 net/ipv6/route.c:2034
 dst_link_failure include/net/dst.h:426 [inline]
 ndisc_error_report+0x101/0x1a0 net/ipv6/ndisc.c:695
 neigh_invalidate+0x385/0x930 net/core/neighbour.c:883
 neigh_timer_handler+0xd85/0x12d0 net/core/neighbour.c:969
 call_timer_fn+0x26a/0x5a0 kernel/time/timer.c:1326
 expire_timers kernel/time/timer.c:1363 [inline]
 __run_timers+0xda7/0x11c0 kernel/time/timer.c:1666
 run_timer_softirq+0x43/0x70
Re: [PATCH v2 net-next 06/10] mlxsw: core: Fix arg name of MLXSW_CORE_RES_VALID and MLXSW_CORE_RES_GET
On Thu, Apr 05, 2018 at 01:33:46AM +, Sasha Levin wrote: > Please let us know if you'd like to have this patch included in a stable tree. Patch isn't needed in a stable tree. Thanks!
Re: [PATCH v2 net-next 01/10] mlxsw: spectrum_acl: Fix flex actions header ifndef define construct
On Thu, Apr 05, 2018 at 01:33:48AM +, Sasha Levin wrote: > Please let us know if you'd like to have this patch included in a stable tree. Patch isn't needed in a stable tree. Thanks!