Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wuwrote: > > OK. Here is the original faddr2line output: > > $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40 > vlan_device_event+0x7f5/0xa40: > vlan_device_event at net/8021q/vlan.h:60 > > And below is call trace embedded with full faddr2line output. > > I notice that this trace shows no additional inline files at all. > Is it because I did some kconfig option wrong, so that inline info is > lost? Eg. > > CONFIG_OPTIMIZE_INLINING=y (it looks better set to N) > CONFIG_DEBUG_INFO_REDUCED=y > CONFIG_DEBUG_INFO_SPLIT=y Ok, this annoyed me, so I went back and looked. It's the "CONFIG_DEBUG_INFO_SPLIT" thing that makes faddr2line unable to see the inlining information, Using OPTIMIZE_INLINING is fine. I'm not sure that addr2line could be made to understand the .dwo files that DEBUG_INFO_SPLIT causes (particularly since we munge the vmlinux file itself, who knows how that could confuse things). So can I ask that you make the 0day build scripts always use CONFIG_DEBUG_INFO=y CONFIG_DEBUG_INFO_REDUCED=y # CONFIG_DEBUG_INFO_SPLIT is not set because with that "DEBUG_INFO_REDUCED=y", the use of DEBUG_INFO_SPLIT shouldn't be _that_ big of a deal. Yes, splitting the debug info does help reduce disk usage for the build, and presumably speed it up a bit too due to less IO and reduced copying of the debug info data, but right now it really makes the debug info much less useful. Just to see the difference: - with DEBUG_INFO_SPLIT=y [torvalds@i7 linux]$ ./scripts/faddr2line vmlinux __schedule+0x314 __schedule+0x314/0x840: __schedule at kernel/sched/stats.h:12 - with DEBUG_INFO_SPLIT is not set [torvalds@i7 linux]$ ./scripts/faddr2line vmlinux __schedule+0x314 __schedule+0x314/0x840: rq_sched_info_arrive at kernel/sched/stats.h:12 (inlined by) sched_info_arrive at kernel/sched/stats.h:99 (inlined by) __sched_info_switch at kernel/sched/stats.h:151 (inlined by) sched_info_switch at kernel/sched/stats.h:158 (inlined by) prepare_task_switch at kernel/sched/core.c:2582 (inlined by) context_switch at kernel/sched/core.c:2755 (inlined by) __schedule at kernel/sched/core.c:3366 and while (once again) this is a pretty extreme case, we do use a lot of inlines, and gcc will add its own inlining. Getting this whole information - particularly for the faulting IP - would really help in some situations. I love what the 0day robot is doing, this would be another big step forward. Oh - and talking about "big step forward" - does the 0day robot do any suspend/resume testing at all? Even on non-laptop hardware, it should be possible to do something like echo platform > /sys/power/pm_test echo freeze > /sys/power/state or similar (assuming CONFIG_PM_DEBUG is enabled). Maybe you already do something like this? Anyway, regardless this was a good release for the 0day robot. Thanks. Linus
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Thu, Nov 9, 2017 at 7:51 AM, Girish Moodalbailwrote: > > Upon receiving NETDEV_DOWN event, we are calling > > vlan_vid_del(dev, htons(ETH_P_8021Q), 0); > > which in turn calls call_rcu() to queue vlan_info_free_rcu() to be called at > some point. This free function frees the array[] > (vlan_info.vlan_grp.vn_devices_array). My guess is that > vlan_info_free_rcu() is being called first and then the array[] is being > accessed in vlan_device_event(). > Well yes and no. No, RCU itself is not broken and we clearly unpublish the RCU pointer before calling call_rcu(). Yes, I see where it is broken: the grp pointer still points to old dev->vlan_info, we should re-fetch it after vlan_vid_del(). I will send a fix. Thanks!
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On 11/8/17 10:34 PM, Cong Wang wrote: On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wuwrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet support to see if that is the source of these issues? I don't know if anyone on the Intel Wired Lan team is testing with that enabled so if we can eliminate that as a possible cause that would be useful. Sure and thank you for the suggestion! It looks disabling DECNET still triggers the vlan_device_event BUG. However when looking at the dmesgs, I find another warning just before the vlan_device_event BUG. Not sure if it's related one or independent now-fixed issue. Those decnet symbols are probably noises. Right. This is a 32-bit Kernel compiled with CONFIG_PREEMPT=y (I am guessing that this has exposed some lock bug). Also, VLAN (8021q) is compiled into the kernel, so it registers a vlan_device_event() callback on boot. There may not be a VLAN device per-se. Upon receiving NETDEV_DOWN event, we are calling vlan_vid_del(dev, htons(ETH_P_8021Q), 0); which in turn calls call_rcu() to queue vlan_info_free_rcu() to be called at some point. This free function frees the array[] (vlan_info.vlan_grp.vn_devices_array). My guess is that vlan_info_free_rcu() is being called first and then the array[] is being accessed in vlan_device_event(). The netifd daemon in OpenWRT is marking the interface down and that is why it is generating NETDEV_DOWN event. And it uses ioctl(SIOCSIFFLAGS, ~IFF_UP) on a AF_UNIX socket. This results in a call to dev_ifsioc() in the kernel with only rtnl_lock() held and it is not in RCU read critical section. ~Girish How do you reproduce it? And what is your setup? Vlan device on top of your eth0 (e1000)?
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Thu, Nov 09, 2017 at 02:55:10PM +0800, Fengguang Wu wrote: On Wed, Nov 08, 2017 at 10:34:10PM -0800, Cong Wang wrote: On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wuwrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet support to see if that is the source of these issues? I don't know if anyone on the Intel Wired Lan team is testing with that enabled so if we can eliminate that as a possible cause that would be useful. Sure and thank you for the suggestion! It looks disabling DECNET still triggers the vlan_device_event BUG. However when looking at the dmesgs, I find another warning just before the vlan_device_event BUG. Not sure if it's related one or independent now-fixed issue. Those decnet symbols are probably noises. Yes it's not related to CONFIG_DECNET. How do you reproduce it? And what is your setup? Vlan device on top of your eth0 (e1000)? It can basically be reproduced in one of our test machines -- lkp-wsx03, which is a Westmere EX server. Anyway if you'd like to try, here are the steps. It'll auto download the images and run QEMU. apt-get install lib32gcc-7-dev # or lib32gcc-6-dev git clone https://github.com/intel/lkp-tests.git cd lkp-tests bin/lkp qemu -k job-script # job-script is attached in this email Note that even in our lkp-wsx03 machine, the chance of reproducing it is only 3% (3 out of 100 boots). Thanks, Fengguang #!/bin/sh export_top_env() { export suite='trinity' export testcase='trinity' export runtime=300 export job_origin='/lkp/lkp/src/allot/rand/vm-lkp-wsx03-openwrt-i386/trinity.yaml' export testbox='vm-lkp-wsx03-openwrt-i386-5' export tbox_group='vm-lkp-wsx03-openwrt-i386' export kconfig='i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS' export compiler='gcc-6' export queue='wfg' export branch='linus/master' export commit='c470abd4fde40ea6a0846a2beab642a578c0b8cd' export submit_id='5a03a4550b9a93f7c99708b0' export job_file='/lkp/scheduled/vm-lkp-wsx03-openwrt-i386-5/trinity-300s-openwrt-i386-2016-03-16.cgz-c470abd4fde40ea6a0846a2beab642a578c0b8cd-20171109-63433-kf9gj3-wait_kernel-0.yaml' export id='181954ca4367d475b88dc8de99b2d52ab533a5e1' export model='qemu-system-i386 -enable-kvm' export nr_vm=32 export nr_cpu=1 export memory='320M' export rootfs='openwrt-i386-2016-03-16.cgz' export hdd_partitions='/dev/vda' export swap_partitions='/dev/vdb' export need_kconfig='CONFIG_KVM_GUEST=y' export enqueue_time='2017-11-09 08:41:58 +0800' export _id='5a03a4560b9a93f7c99708bb' export _rt='/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd' export user='lkp' export result_root='/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/0' export LKP_SERVER='inn' export max_uptime=1500 export initrd='/osimage/openwrt/openwrt-i386-2016-03-16.cgz' export bootloader_append='root=/dev/ram0 user=lkp job=/lkp/scheduled/vm-lkp-wsx03-openwrt-i386-5/trinity-300s-openwrt-i386-2016-03-16.cgz-c470abd4fde40ea6a0846a2beab642a578c0b8cd-20171109-63433-kf9gj3-wait_kernel-0.yaml ARCH=i386 kconfig=i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS branch=linus/master commit=c470abd4fde40ea6a0846a2beab642a578c0b8cd BOOT_IMAGE=/pkg/linux/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/vmlinuz-4.10.0 max_uptime=1500 RESULT_ROOT=/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/0 LKP_SERVER=inn debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on panic=-1 softlockup_panic=1 nmi_watchdog=panic oops=panic load_ramdisk=2 prompt_ramdisk=0 drbd.minor_count=8 systemd.log_level=err ignore_loglevel console=tty0 earlyprintk=ttyS0,115200 console=ttyS0,115200 vga=normal rw' export lkp_initrd='/lkp/lkp/lkp-i386.cgz' export bm_initrd='/osimage/pkg/static/trinity-i386.cgz' export site='inn' export LKP_CGI_PORT=80 export LKP_CIFS_PORT=139 export vmlinux_file='/pkg/linux/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/vmlinux' export kernel='/pkg/linux/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/vmlinuz-4.10.0' export dequeue_time='2017-11-09 09:06:15 +0800' export
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Thu, Nov 09, 2017 at 12:09:45PM +0800, Fengguang Wu wrote: On Thu, Nov 09, 2017 at 11:12:06AM +0800, Fengguang Wu wrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet support to see if that is the source of these issues? I don't know if anyone on the Intel Wired Lan team is testing with that enabled so if we can eliminate that as a possible cause that would be useful. Sure and thank you for the suggestion! It looks disabling DECNET still triggers the vlan_device_event BUG. However when looking at the dmesgs, I find another warning just before the vlan_device_event BUG. Not sure if it's related one or independent now-fixed issue. Please press Enter to activate this console. [ 1291.938326] Writes: Total: 2 Max/Min: 0/0 Fail: 0 [ 1297.731690] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1297.828227] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1300.506245] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1302.467460] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX LKP: HOSTNAME vm-lkp-wsx03-openwrt-i386-10, MAC , kernel 4.13.0 1, serial console /dev/ttyS0 [ 1304.161688] Kernel tests: Boot OK! [ 1306.558532] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1308.507499] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1310.526380] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1311.246017] LKP: waiting for network... [ 1312.543432] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1313.985807] [ 1313.991541] = [ 1314.002398] WARNING: bad unlock balance detected! [ 1314.013154] 4.13.0 #1 Not tainted [ 1314.021549] - [ 1314.032505] procd/1244 is trying to release lock (rcu_preempt_state) at: [ 1314.047216] [] rcu_read_unlock_special+0x580/0x5b0 [ 1314.059825] but there are no more locks to release! [ 1314.070546] [ 1314.070546] other info that might help us debug this: [ 1314.085941] 2 locks held by procd/1244: [ 1314.095139] #0: (>cred_guard_mutex){..}, at: [] prepare_bprm_creds+0x28/0xc0 [ 1314.114616] #1: (rcu_read_lock){..}, at: [] path_init+0x490/0x6f0 [ 1314.132155] [ 1314.132155] stack backtrace: [ 1314.144402] CPU: 0 PID: 1244 Comm: procd Not tainted 4.13.0 #1 [ 1314.160197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 1314.179404] Call Trace: [ 1314.186768] dump_stack+0x16/0x1c [ 1314.195387] print_unlock_imbalance_bug+0xb9/0xd0 [ 1314.205753] ? rcu_read_unlock_special+0x580/0x5b0 [ 1314.216381] ? rcu_read_unlock_special+0x580/0x5b0 [ 1314.226982] lock_release+0x1cc/0x490 [ 1314.235602] ? rcu_gp_kthread_wake+0x34/0x50 [ 1314.245262] ? rcu_read_unlock_special+0x580/0x5b0 [ 1314.255724] rt_mutex_unlock+0x1e/0xb0 [ 1314.264610] rcu_read_unlock_special+0x580/0x5b0 [ 1314.274814] __rcu_read_unlock+0xa7/0xb0 [ 1314.283954] unlazy_walk+0xcf/0x1f0 [ 1314.292409] trailing_symlink+0x349/0x4e0 [ 1314.301583] path_openat+0x333/0x1280 [ 1314.310197] do_filp_open+0x67/0x140 [ 1314.318696] ? getname_kernel+0x23/0x1e0 [ 1314.327766] ? cache_alloc_debugcheck_after+0x13a/0x2a0 [ 1314.340076] ? getname_kernel+0x23/0x1e0 [ 1314.349179] do_open_execat+0xab/0x2a0 [ 1314.358063] open_exec+0x57/0x80 [ 1314.366128] load_script+0x33c/0x3d0 [ 1314.374556] ? kvm_sched_clock_read+0x9/0x20 [ 1314.384219] ? sched_clock+0x9/0x10 [ 1314.392611] ? sched_clock_cpu+0x1a/0x1e0 [ 1314.401875] ? _raw_read_unlock+0x55/0x90 [ 1314.411080] search_binary_handler+0xd9/0x160 [ 1314.420799] do_execveat_common+0x8f6/0xb10 [ 1314.430334] SyS_execve+0x1f/0x30 [ 1314.438458] do_int80_syscall_32+0x95/0x1b0 [ 1314.447956] entry_INT80_32+0x2f/0x2f [ 1314.456606] EIP: 0xb7e9ab07 [ 1314.464062] EFLAGS: 0296 CPU: 0 [ 1314.472421] EAX: ffda EBX: 0807b584 ECX: bfb0fd70 EDX: 08061250 [ 1314.485257] ESI: 0807b584 EDI: EBP: bfb0fd58 ESP: bfb0fd28 [ 1314.498024] DS: 007b ES: 007b FS: GS: 0033 SS: 007b [ 1314.613681] hotplug-call (1244) used greatest stack depth: 6384 bytes left [ 1314.957636] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1316.955154] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1318.197800] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1320.222754] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1321.409456] BUG: unable to handle kernel paging request at 6b6b6f4f [ 1321.421942] IP: vlan_device_event+0x7f5/0xa40 [ 1321.431239] *pde = Note that this call trace is different from the ones posted in earlier emails. [ 1321.431267] [ 1321.443356] Oops: [#1] PREEMPT [ 1321.451390] CPU: 0 PID: 798 Comm: netifd Not tainted 4.13.0 #1 [ 1321.462802] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 1321.479743] task: cf8ae5c0
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Wed, Nov 08, 2017 at 10:34:10PM -0800, Cong Wang wrote: On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wuwrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet support to see if that is the source of these issues? I don't know if anyone on the Intel Wired Lan team is testing with that enabled so if we can eliminate that as a possible cause that would be useful. Sure and thank you for the suggestion! It looks disabling DECNET still triggers the vlan_device_event BUG. However when looking at the dmesgs, I find another warning just before the vlan_device_event BUG. Not sure if it's related one or independent now-fixed issue. Those decnet symbols are probably noises. Yes it's not related to CONFIG_DECNET. How do you reproduce it? And what is your setup? Vlan device on top of your eth0 (e1000)? It can basically be reproduced in one of our test machines -- lkp-wsx03, which is a Westmere EX server. The test boots an openwrt image in QEMU and run trinity for minutes. Here is the openwrt image: https://github.com/0day-ci/lkp-qemu/blob/master/osimage/openwrt/openwrt-i386-2016-03-16.cgz I don't know how openwrt deals with VLAN. :) Thanks, Fengguang
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Thu, Nov 09, 2017 at 10:43:08AM +0800, Fengguang Wu wrote: Of course, if it's bisectable, that would be great too. Yes, bisect is on the way. So far it's bisecting in the 4.12 commits. The bisect was unsuccessful due to an unrelated DRM_BOCHS oops in 4.11. Disabling the buggy driver, I managed to reproduce the vlan_device_event bug in 4.11. However only to find the older kernels suffer from different kind of oops, which make the bisect troublesome. % grep -E 'dmesg.(BUG|EIP)' v4.*/matrix.json v4.10/matrix.json: "dmesg.BUG:unable_to_handle_kernel": [ v4.10/matrix.json: "dmesg.EIP:kobject_get": [ v4.10/matrix.json: "dmesg.BUG:kernel_reboot-without-warning_in_test_stage": [ v4.11/matrix.json: "dmesg.BUG:unable_to_handle_kernel": [ v4.11/matrix.json: "dmesg.EIP:vlan_device_event": [ v4.8/matrix.json: "dmesg.BUG:kernel_reboot-without-warning_in_early-boot_stage,last_printk:Decompressing_Linux": [ v4.9/matrix.json: "dmesg.BUG:key_not_in.data": [ I'll try tuning kconfig to get a good bisect base. Hopefully we'll make the coming RC releases free from such bisect stoppers with better regression tracking and bisect automation. However if the situation turns out to be not that optimistic, we might track "bisect stopper" kconfig options over kernel releases. For bisect scripts to auto pick them up. An example kconfig bisect-blacklist file would be: $ cat tools/testing/configs/bisect-4.12.config # CONFIG_DRM_BOCHS is not set That would mean: CONFIG_DRM_BOCHS will reliably stop kernel from booting before 4.12. If ever bisecting into pre-4.12 kernels, better disable it. The best fit would be the uncommon drivers and optional features. Warnings are typically not bisect stoppers so can be ignored. Kernel panics that happen rare enough to impact bisect may also be ignored. Thanks, Fengguang
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wuwrote: > Hi Alex, > >> So looking over the trace the panic seems to be happening after a >> decnet interface is getting deleted. Is there any chance we could try >> compiling the kernel without decnet support to see if that is the >> source of these issues? I don't know if anyone on the Intel Wired Lan >> team is testing with that enabled so if we can eliminate that as a >> possible cause that would be useful. > > > Sure and thank you for the suggestion! > > It looks disabling DECNET still triggers the vlan_device_event BUG. > However when looking at the dmesgs, I find another warning just before > the vlan_device_event BUG. Not sure if it's related one or independent > now-fixed issue. Those decnet symbols are probably noises. How do you reproduce it? And what is your setup? Vlan device on top of your eth0 (e1000)?
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Thu, Nov 09, 2017 at 11:12:06AM +0800, Fengguang Wu wrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet support to see if that is the source of these issues? I don't know if anyone on the Intel Wired Lan team is testing with that enabled so if we can eliminate that as a possible cause that would be useful. Sure and thank you for the suggestion! It looks disabling DECNET still triggers the vlan_device_event BUG. However when looking at the dmesgs, I find another warning just before the vlan_device_event BUG. Not sure if it's related one or independent now-fixed issue. Please press Enter to activate this console. [ 1291.938326] Writes: Total: 2 Max/Min: 0/0 Fail: 0 [ 1297.731690] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1297.828227] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1300.506245] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1302.467460] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX LKP: HOSTNAME vm-lkp-wsx03-openwrt-i386-10, MAC , kernel 4.13.0 1, serial console /dev/ttyS0 [ 1304.161688] Kernel tests: Boot OK! [ 1306.558532] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1308.507499] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1310.526380] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1311.246017] LKP: waiting for network... [ 1312.543432] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1313.985807] [ 1313.991541] = [ 1314.002398] WARNING: bad unlock balance detected! [ 1314.013154] 4.13.0 #1 Not tainted [ 1314.021549] - [ 1314.032505] procd/1244 is trying to release lock (rcu_preempt_state) at: [ 1314.047216] [] rcu_read_unlock_special+0x580/0x5b0 [ 1314.059825] but there are no more locks to release! [ 1314.070546] [ 1314.070546] other info that might help us debug this: [ 1314.085941] 2 locks held by procd/1244: [ 1314.095139] #0: (>cred_guard_mutex){..}, at: [] prepare_bprm_creds+0x28/0xc0 [ 1314.114616] #1: (rcu_read_lock){..}, at: [] path_init+0x490/0x6f0 [ 1314.132155] [ 1314.132155] stack backtrace: [ 1314.144402] CPU: 0 PID: 1244 Comm: procd Not tainted 4.13.0 #1 [ 1314.160197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 1314.179404] Call Trace: [ 1314.186768] dump_stack+0x16/0x1c [ 1314.195387] print_unlock_imbalance_bug+0xb9/0xd0 [ 1314.205753] ? rcu_read_unlock_special+0x580/0x5b0 [ 1314.216381] ? rcu_read_unlock_special+0x580/0x5b0 [ 1314.226982] lock_release+0x1cc/0x490 [ 1314.235602] ? rcu_gp_kthread_wake+0x34/0x50 [ 1314.245262] ? rcu_read_unlock_special+0x580/0x5b0 [ 1314.255724] rt_mutex_unlock+0x1e/0xb0 [ 1314.264610] rcu_read_unlock_special+0x580/0x5b0 [ 1314.274814] __rcu_read_unlock+0xa7/0xb0 [ 1314.283954] unlazy_walk+0xcf/0x1f0 [ 1314.292409] trailing_symlink+0x349/0x4e0 [ 1314.301583] path_openat+0x333/0x1280 [ 1314.310197] do_filp_open+0x67/0x140 [ 1314.318696] ? getname_kernel+0x23/0x1e0 [ 1314.327766] ? cache_alloc_debugcheck_after+0x13a/0x2a0 [ 1314.340076] ? getname_kernel+0x23/0x1e0 [ 1314.349179] do_open_execat+0xab/0x2a0 [ 1314.358063] open_exec+0x57/0x80 [ 1314.366128] load_script+0x33c/0x3d0 [ 1314.374556] ? kvm_sched_clock_read+0x9/0x20 [ 1314.384219] ? sched_clock+0x9/0x10 [ 1314.392611] ? sched_clock_cpu+0x1a/0x1e0 [ 1314.401875] ? _raw_read_unlock+0x55/0x90 [ 1314.411080] search_binary_handler+0xd9/0x160 [ 1314.420799] do_execveat_common+0x8f6/0xb10 [ 1314.430334] SyS_execve+0x1f/0x30 [ 1314.438458] do_int80_syscall_32+0x95/0x1b0 [ 1314.447956] entry_INT80_32+0x2f/0x2f [ 1314.456606] EIP: 0xb7e9ab07 [ 1314.464062] EFLAGS: 0296 CPU: 0 [ 1314.472421] EAX: ffda EBX: 0807b584 ECX: bfb0fd70 EDX: 08061250 [ 1314.485257] ESI: 0807b584 EDI: EBP: bfb0fd58 ESP: bfb0fd28 [ 1314.498024] DS: 007b ES: 007b FS: GS: 0033 SS: 007b [ 1314.613681] hotplug-call (1244) used greatest stack depth: 6384 bytes left [ 1314.957636] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1316.955154] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1318.197800] 8021q: adding VLAN 0 to HW filter on device eth0 [ 1320.222754] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX [ 1321.409456] BUG: unable to handle kernel paging request at 6b6b6f4f [ 1321.421942] IP: vlan_device_event+0x7f5/0xa40 [ 1321.431239] *pde = Note that this call trace is different from the ones posted in earlier emails. [ 1321.431267] [ 1321.443356] Oops: [#1] PREEMPT [ 1321.451390] CPU: 0 PID: 798 Comm: netifd Not tainted 4.13.0 #1 [ 1321.462802] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 1321.479743] task: cf8ae5c0 task.stack: cf114000 [ 1321.489345] EIP: vlan_device_event+0x7f5/0xa40
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
Of course, if it's bisectable, that would be great too. Yes, bisect is on the way. So far it's bisecting in the 4.12 commits. The bisect was unsuccessful due to an unrelated DRM_BOCHS oops in 4.11. Disabling the buggy driver, I managed to reproduce the vlan_device_event bug in 4.11. However only to find the older kernels suffer from different kind of oops, which make the bisect troublesome. % grep -E 'dmesg.(BUG|EIP)' v4.*/matrix.json v4.10/matrix.json: "dmesg.BUG:unable_to_handle_kernel": [ v4.10/matrix.json: "dmesg.EIP:kobject_get": [ v4.10/matrix.json: "dmesg.BUG:kernel_reboot-without-warning_in_test_stage": [ v4.11/matrix.json: "dmesg.BUG:unable_to_handle_kernel": [ v4.11/matrix.json: "dmesg.EIP:vlan_device_event": [ v4.8/matrix.json: "dmesg.BUG:kernel_reboot-without-warning_in_early-boot_stage,last_printk:Decompressing_Linux": [ v4.9/matrix.json: "dmesg.BUG:key_not_in.data": [ I'll try tuning kconfig to get a good bisect base. Hopefully we'll make the coming RC releases free from such bisect stoppers with better regression tracking and bisect automation. Regards, Fengguang
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wuwrote: > Hi Linus, > > > On Wed, Nov 08, 2017 at 08:20:38AM -0800, Linus Torvalds wrote: >> >> On Wed, Nov 8, 2017 at 1:48 AM, Fengguang Wu >> wrote: >>> >>> >>> Now I got the faddr2line output. :) >> >> >> Thank you, but this also shows that you then compress the output too >> much for convenience. >> >>> [ 745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f >>> [ 745.732871] IP: vlan_device_event at net/8021q/vlan.h:60 >> >> >> So this looks more legible than "vlan_device_event+0x7f5/0xa40" (or >> whatever), but in fact it's not. >> >> The reason faddr2line is great is because it gives inlining >> information, so that you can see exactly which access it is, even if >> it's inside some helper inline function (macros still obviously are >> going to be problematic). >> >> And you've cut the whole information down, to the point where there is >> _less_ information than there used to be. >> >> So the full faddr2line output is actually important, and should look >> something like this: >> >>__schedule+0x315/0x840: >>rq_sched_info_arrive at kernel/sched/stats.h:13 >> (inlined by) sched_info_arrive at kernel/sched/stats.h:99 >> (inlined by) __sched_info_switch at kernel/sched/stats.h:151 >> (inlined by) sched_info_switch at kernel/sched/stats.h:158 >> (inlined by) prepare_task_switch at kernel/sched/core.c:2582 >> (inlined by) context_switch at kernel/sched/core.c:2755 >> (inlined by) __schedule at kernel/sched/core.c:3366 >> >> which is admittedly a fairly extreme case, but it shows how involved >> things can be. > > > Indeed! I wasn't aware that faddr2line could have so informative > output! After checking several of its outputs, I decided to pipe its > output to |tail -1 :/ That's clearly due to my lacking of first hand > experience on oops analyzing! > > IMHO this kind of informative demo in your email could be very good > candidates to put in Documentation/. If you search faddr2line under > Documentation there's no single mention of it (addr2line does show up > 7 times). > > >> Note how in my example above the access itself is from a header file >> (that inlined rq_sched_info_arrive() function), and I happened to pick >> the >> >>rq->rq_sched_info.pcount++; >> >> line. But Many inline functions are inlined from different points, >> often many times in the same top-level function (think of the atomic >> helper inlines, for example). >> >> In your case, that net/8021q/vlan.h:60 information is literally not >> helping, because it only shows what I could already figure out from >> looking at the constants in the disassembly: the code comes from the >> >>vlan_group_for_each_dev(...) { >>.. >> >> loop setup (it's the inline __vlan_group_get_device() function, and >> the constants I could recognize are the VLAN_GROUP_ARRAY_PART_LEN >> things. >> >> But exactly like the constants didn't tell me *which* invocation of >> that loop it was, your "net/8021q/vlan.h:60" doesn't tell me which one >> it is. >> >> There's at least _eight_ different "vlan_group_for_each_dev() {" loops >> in vlan_device_event(), and maybe it's not all that meaningful which >> of the eight it is in this case, in other cases it's critical. >> >> And since you've actually replaced the "vlan_device_event+0x7f5/0xa40" >> with that "net/8021q/vlan.h:60" entirely, all the offset information >> that *could* maybe have been used to pick one case over another (but >> likely not) is also gone. >> >> That's why that inlining chain is important - so that we can see how >> it got to that access in __vlan_group_get_device(). >> >> So please do keep _all_ of the faddr2line output, at least for the >> "IP:" line (the stack traces might not be worth it). > > > OK. Here is the original faddr2line output: > > $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40 > vlan_device_event+0x7f5/0xa40: > vlan_device_event at net/8021q/vlan.h:60 > > And below is call trace embedded with full faddr2line output. > > I notice that this trace shows no additional inline files at all. > Is it because I did some kconfig option wrong, so that inline info is > lost? Eg. > > CONFIG_OPTIMIZE_INLINING=y (reading lib/Kconfig.debug, it looks better set > to N) > CONFIG_DEBUG_INFO_REDUCED=y > CONFIG_DEBUG_INFO_SPLIT=y > (full .config attached) > > [ 745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f > [ 745.732871] IP: vlan_device_event+0x7f5/0xa40: > vlan_device_event at net/8021q/vlan.h:60 > [ 745.742106] *pde = > [ 745.748587] Oops: [#1] PREEMPT > [ 745.756104] CPU: 0 PID: 786 Comm: netifd Not tainted 4.14.0-rc8 #1 > [ 745.769171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > 1.10.2-1 04/01/2014 > [ 745.786791] task: cf768780 task.stack: d187a000 > [ 745.796485] EIP: vlan_device_event+0x7f5/0xa40: > vlan_device_event at
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wuwrote: > > OK. Here is the original faddr2line output: > > $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40 > vlan_device_event+0x7f5/0xa40: > vlan_device_event at net/8021q/vlan.h:60 Hmm. Yes, that's not what I hoped for. > I notice that this trace shows no additional inline files at all. > Is it because I did some kconfig option wrong, so that inline info is > lost? Eg. > > CONFIG_OPTIMIZE_INLINING=y (reading lib/Kconfig.debug, it looks better set > to N) > CONFIG_DEBUG_INFO_REDUCED=y > CONFIG_DEBUG_INFO_SPLIT=y Hmm. It might also be a compiler/linker issue. Debug info is sadly often a second class citizen, because it's generally not something that gets tested nearly as much as the actual _code_ that a compiler generates. So we've certainly had issues before with incomplete debug info. That said, I'm not sure how well that CONFIG_DEBUG_INFO_SPLIT works, I've only ever tested with CONFIG_DEBUG_INFO=y CONFIG_DEBUG_INFO_REDUCED=y # CONFIG_DEBUG_INFO_SPLIT is not set # CONFIG_DEBUG_INFO_DWARF4 is not set which was the config that my scheduler example was generated from. It might be an issue with the particular version of 'addr2line' too, for all I know. The one I have is GNU addr2line version 2.27-24.fc26 and who knows what else could influence it. I *thought* that as long as you just had DEBUG_INFO enabled, it would automatically be ok, but I was clearly wrong. > [ 745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f > [ 745.732871] IP: vlan_device_event+0x7f5/0xa40: > vlan_device_event at net/8021q/vlan.h:60 Ok, at least now there's no lost information, even if it doesn't have that nice inlining information that I was hoping for. Linus
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
[...] I notice that this trace shows no additional inline files at all. Is it because I did some kconfig option wrong, so that inline info is lost? Eg. CONFIG_OPTIMIZE_INLINING=y (reading lib/Kconfig.debug, it looks better set to N) CONFIG_DEBUG_INFO_REDUCED=y CONFIG_DEBUG_INFO_SPLIT=y (full .config attached) FYI it's randconfig testing, so may well contain absurd settings -- unless we add rules to force enable/disable some specific options to make life easier. Regards, Fengguang [ 745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f [ 745.732871] IP: vlan_device_event+0x7f5/0xa40: vlan_device_event at net/8021q/vlan.h:60 [ 745.742106] *pde = [ 745.748587] Oops: [#1] PREEMPT [ 745.756104] CPU: 0 PID: 786 Comm: netifd Not tainted 4.14.0-rc8 #1 [ 745.769171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 745.786791] task: cf768780 task.stack: d187a000 [ 745.796485] EIP: vlan_device_event+0x7f5/0xa40: vlan_device_event at net/8021q/vlan.h:60 [ 745.805877] EFLAGS: 00010206 CPU: 0 [ 745.813237] EAX: 00f9 EBX: 0002 ECX: EDX: 6b6b6b6b [ 745.825774] ESI: 02f9 EDI: d1de3700 EBP: d187bdd8 ESP: d187bda4 [ 745.838871] DS: 007b ES: 007b FS: GS: 0033 SS: 0068 [ 745.850218] CR0: 80050033 CR2: 6b6b6f4f CR3: 0f4c8000 CR4: 0690 [ 745.862750] Call Trace: [ 745.868650] ? dn_dev_delete+0x138/0x1d0: dn_dev_delete at net/decnet/dn_dev.c:1224 [ 745.876751] ? dn_dev_down+0x69/0x80: dn_dev_down at net/decnet/dn_dev.c:1240 [ 745.885084] notifier_call_chain+0x4e/0xa0: notifier_call_chain at kernel/notifier.c:95 (discriminator 1) [ 745.894254] raw_notifier_call_chain+0xc/0x10: raw_notifier_call_chain at kernel/notifier.c:402 [ 745.903979] call_netdevice_notifiers_info+0x59/0x90: call_netdevice_notifiers_info at net/core/dev.c:1672 [ 745.914670] __dev_notify_flags+0xea/0x130: __dev_notify_flags at net/core/dev.c:1687 [ 745.923446] dev_change_flags+0x60/0x70: dev_change_flags at net/core/dev.c:6813 [ 745.931679] dev_ifsioc+0x49b/0x610: dev_ifsioc at net/core/dev_ioctl.c:257 [ 745.939102] ? mutex_lock_nested+0x14/0x20: mutex_lock_nested at kernel/locking/mutex.c:909 [ 745.948173] dev_ioctl+0x36f/0xb20: dev_ioctl at net/core/dev_ioctl.c:566 [ 745.956154] sock_ioctl+0x1cd/0x350: sock_ioctl at net/socket.c:968 [ 745.964313] ? sock_fasync+0xb0/0xb0: sock_ioctl at net/socket.c:984 [ 745.972512] vfs_ioctl+0x33/0x70: vfs_ioctl at fs/ioctl.c:47 [ 745.979867] do_vfs_ioctl+0x8d/0xc60: do_vfs_ioctl at fs/ioctl.c:690 [ 745.987782] ? kmem_cache_free+0x186/0x290: kmem_cache_free at include/linux/rcupdate.h:777 [ 745.996138] ? putname+0x9f/0xe0: putname at fs/namei.c:259 [ 746.003434] ? putname+0x9f/0xe0: putname at fs/namei.c:259 [ 746.011240] ? do_sys_open+0x28d/0x420: do_sys_open at fs/open.c:1069 [ 746.019728] ? __fget_light+0xb7/0xf0: __fget_light at fs/file.c:739 (discriminator 2) bad symbol size: base: 0xc1277040 end: 0xc1277040 [ 746.029292] SyS_ioctl+0x98/0xb0 [ 746.036680] do_int80_syscall_32+0x95/0x290: do_int80_syscall_32 at arch/x86/entry/common.c:329 [ 746.045685] entry_INT80_32+0x2f/0x2f: restore_all at arch/x86/entry/entry_32.S:536 [ 746.053427] EIP: 0xb7e97648 [ 746.059907] EFLAGS: 0246 CPU: 0 [ 746.068336] EAX: ffda EBX: 0005 ECX: 8914 EDX: bfcaa350 [ 746.081238] ESI: bfcaa350 EDI: bfcaa370 EBP: bfcaa388 ESP: bfcaa31c [ 746.093449] DS: 007b ES: 007b FS: GS: 0033 SS: 007b [ 746.103600] Code: 8d e0 db b4 c4 8d 56 01 89 14 8d e0 db b4 c4 0f 85 03 02 00 00 89 7d d4 31 f6 8b 7d d8 e9 84 00 00 00 8d 74 26 00 25 ff 01 00 00 <8b> 1c 82 31 d2 85 db 0f 95 c2 8b 04 95 cc db b4 c4 83 c0 01 85 [ 746.140089] EIP: vlan_device_event+0x7f5/0xa40: vlan_device_event at net/8021q/vlan.h:60 SS:ESP: 0068:d187bda4 [ 746.153505] CR2: 6b6b6f4f [ 746.413297] Kernel tests: Boot OK! [ 748.237463] ---[ end trace 40505af7d815b57d ]--- [ 748.281157] Kernel panic - not syncing: Fatal exception Anyway, it does all mean that the use-after-free almost certainly is that "array[]" access, which was the main suspect to begin with. I'm adding Jeff Kirsher and the e1000 list to the cc, since apparently the e1000 driver is involved. Of course, all the stack traces are in generic networking code, so it is possibly a generic networking issue, but it would probably be good to have a few people look at it. But all of this code is ancient. So it's almost certainly not a new bug, whatever it is. But as a test-case for your faddr2line improvements, this is excellent! Linus Regards, Fengguang # # Automatically generated file; DO NOT EDIT. # Linux/i386 4.14.0-rc8 Kernel Configuration # # CONFIG_64BIT is not set CONFIG_X86_32=y CONFIG_X86=y CONFIG_INSTRUCTION_DECODER=y CONFIG_OUTPUT_FORMAT="elf32-i386" CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig" CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_MMU=y CONFIG_ARCH_MMAP_RND_BITS_MIN=8
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
Hi Linus, On Wed, Nov 08, 2017 at 08:20:38AM -0800, Linus Torvalds wrote: On Wed, Nov 8, 2017 at 1:48 AM, Fengguang Wuwrote: Now I got the faddr2line output. :) Thank you, but this also shows that you then compress the output too much for convenience. [ 745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f [ 745.732871] IP: vlan_device_event at net/8021q/vlan.h:60 So this looks more legible than "vlan_device_event+0x7f5/0xa40" (or whatever), but in fact it's not. The reason faddr2line is great is because it gives inlining information, so that you can see exactly which access it is, even if it's inside some helper inline function (macros still obviously are going to be problematic). And you've cut the whole information down, to the point where there is _less_ information than there used to be. So the full faddr2line output is actually important, and should look something like this: __schedule+0x315/0x840: rq_sched_info_arrive at kernel/sched/stats.h:13 (inlined by) sched_info_arrive at kernel/sched/stats.h:99 (inlined by) __sched_info_switch at kernel/sched/stats.h:151 (inlined by) sched_info_switch at kernel/sched/stats.h:158 (inlined by) prepare_task_switch at kernel/sched/core.c:2582 (inlined by) context_switch at kernel/sched/core.c:2755 (inlined by) __schedule at kernel/sched/core.c:3366 which is admittedly a fairly extreme case, but it shows how involved things can be. Indeed! I wasn't aware that faddr2line could have so informative output! After checking several of its outputs, I decided to pipe its output to |tail -1 :/ That's clearly due to my lacking of first hand experience on oops analyzing! IMHO this kind of informative demo in your email could be very good candidates to put in Documentation/. If you search faddr2line under Documentation there's no single mention of it (addr2line does show up 7 times). Note how in my example above the access itself is from a header file (that inlined rq_sched_info_arrive() function), and I happened to pick the rq->rq_sched_info.pcount++; line. But Many inline functions are inlined from different points, often many times in the same top-level function (think of the atomic helper inlines, for example). In your case, that net/8021q/vlan.h:60 information is literally not helping, because it only shows what I could already figure out from looking at the constants in the disassembly: the code comes from the vlan_group_for_each_dev(...) { .. loop setup (it's the inline __vlan_group_get_device() function, and the constants I could recognize are the VLAN_GROUP_ARRAY_PART_LEN things. But exactly like the constants didn't tell me *which* invocation of that loop it was, your "net/8021q/vlan.h:60" doesn't tell me which one it is. There's at least _eight_ different "vlan_group_for_each_dev() {" loops in vlan_device_event(), and maybe it's not all that meaningful which of the eight it is in this case, in other cases it's critical. And since you've actually replaced the "vlan_device_event+0x7f5/0xa40" with that "net/8021q/vlan.h:60" entirely, all the offset information that *could* maybe have been used to pick one case over another (but likely not) is also gone. That's why that inlining chain is important - so that we can see how it got to that access in __vlan_group_get_device(). So please do keep _all_ of the faddr2line output, at least for the "IP:" line (the stack traces might not be worth it). OK. Here is the original faddr2line output: $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40 vlan_device_event+0x7f5/0xa40: vlan_device_event at net/8021q/vlan.h:60 And below is call trace embedded with full faddr2line output. I notice that this trace shows no additional inline files at all. Is it because I did some kconfig option wrong, so that inline info is lost? Eg. CONFIG_OPTIMIZE_INLINING=y (reading lib/Kconfig.debug, it looks better set to N) CONFIG_DEBUG_INFO_REDUCED=y CONFIG_DEBUG_INFO_SPLIT=y (full .config attached) [ 745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f [ 745.732871] IP: vlan_device_event+0x7f5/0xa40: vlan_device_event at net/8021q/vlan.h:60 [ 745.742106] *pde = [ 745.748587] Oops: [#1] PREEMPT [ 745.756104] CPU: 0 PID: 786 Comm: netifd Not tainted 4.14.0-rc8 #1 [ 745.769171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 745.786791] task: cf768780 task.stack: d187a000 [ 745.796485] EIP: vlan_device_event+0x7f5/0xa40: vlan_device_event at net/8021q/vlan.h:60 [ 745.805877] EFLAGS: 00010206 CPU: 0 [ 745.813237] EAX: 00f9 EBX: 0002 ECX: EDX: 6b6b6b6b [ 745.825774] ESI: 02f9 EDI: d1de3700 EBP: d187bdd8 ESP: d187bda4 [ 745.838871] DS: 007b ES: 007b FS: GS: 0033 SS: 0068 [ 745.850218] CR0: 80050033 CR2: 6b6b6f4f CR3: 0f4c8000 CR4: 0690 [ 745.862750] Call Trace: [ 745.868650] ?
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Wed, Nov 8, 2017 at 1:48 AM, Fengguang Wuwrote: > > Now I got the faddr2line output. :) Thank you, but this also shows that you then compress the output too much for convenience. > [ 745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f > [ 745.732871] IP: vlan_device_event at net/8021q/vlan.h:60 So this looks more legible than "vlan_device_event+0x7f5/0xa40" (or whatever), but in fact it's not. The reason faddr2line is great is because it gives inlining information, so that you can see exactly which access it is, even if it's inside some helper inline function (macros still obviously are going to be problematic). And you've cut the whole information down, to the point where there is _less_ information than there used to be. So the full faddr2line output is actually important, and should look something like this: __schedule+0x315/0x840: rq_sched_info_arrive at kernel/sched/stats.h:13 (inlined by) sched_info_arrive at kernel/sched/stats.h:99 (inlined by) __sched_info_switch at kernel/sched/stats.h:151 (inlined by) sched_info_switch at kernel/sched/stats.h:158 (inlined by) prepare_task_switch at kernel/sched/core.c:2582 (inlined by) context_switch at kernel/sched/core.c:2755 (inlined by) __schedule at kernel/sched/core.c:3366 which is admittedly a fairly extreme case, but it shows how involved things can be. Note how in my example above the access itself is from a header file (that inlined rq_sched_info_arrive() function), and I happened to pick the rq->rq_sched_info.pcount++; line. But Many inline functions are inlined from different points, often many times in the same top-level function (think of the atomic helper inlines, for example). In your case, that net/8021q/vlan.h:60 information is literally not helping, because it only shows what I could already figure out from looking at the constants in the disassembly: the code comes from the vlan_group_for_each_dev(...) { .. loop setup (it's the inline __vlan_group_get_device() function, and the constants I could recognize are the VLAN_GROUP_ARRAY_PART_LEN things. But exactly like the constants didn't tell me *which* invocation of that loop it was, your "net/8021q/vlan.h:60" doesn't tell me which one it is. There's at least _eight_ different "vlan_group_for_each_dev() {" loops in vlan_device_event(), and maybe it's not all that meaningful which of the eight it is in this case, in other cases it's critical. And since you've actually replaced the "vlan_device_event+0x7f5/0xa40" with that "net/8021q/vlan.h:60" entirely, all the offset information that *could* maybe have been used to pick one case over another (but likely not) is also gone. That's why that inlining chain is important - so that we can see how it got to that access in __vlan_group_get_device(). So please do keep _all_ of the faddr2line output, at least for the "IP:" line (the stack traces might not be worth it). Anyway, it does all mean that the use-after-free almost certainly is that "array[]" access, which was the main suspect to begin with. I'm adding Jeff Kirsher and the e1000 list to the cc, since apparently the e1000 driver is involved. Of course, all the stack traces are in generic networking code, so it is possibly a generic networking issue, but it would probably be good to have a few people look at it. But all of this code is ancient. So it's almost certainly not a new bug, whatever it is. But as a test-case for your faddr2line improvements, this is excellent! Linus
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Tue, Nov 07, 2017 at 08:25:03AM -0800, Linus Torvalds wrote: On Tue, Nov 7, 2017 at 2:21 AM, Fengguang Wuwrote: FYI this happens in v4.14-rc8 -- it's not necessarily a new bug. Probably not. Looks like a use-after-free bug in vlan_device_event() judging by the base pointer: ECX: 6b6b6b6b this is one of those circumstances where having the faddr2line output for that EIP would make it much easier to see exactly which access it is that causes problems. There's lots of inlining going on, so without that it's a pain to figure out. The code is 0: 31 c0xor%eax,%eax 2: 8d 76 00 lea0x0(%esi),%esi 5: 89 c2mov%eax,%edx 7: 89 c3mov%eax,%ebx 9: 81 e2 ff 0f 00 00and$0xfff,%edx f: 89 d1mov%edx,%ecx 11: c1 fb 0c sar$0xc,%ebx 14: c1 e9 09 shr$0x9,%ecx 17: 8d 0c d9 lea(%ecx,%ebx,8),%ecx 1a: 8b 4c 8e 10 mov0x10(%esi,%ecx,4),%ecx 1e: 85 c9test %ecx,%ecx 20: 74 34je 0x56 22: 81 e2 ff 01 00 00and$0x1ff,%edx 28:* 8b 14 91 mov(%ecx,%edx,4),%edx <-- trapping instruction 2b: 85 d2test %edx,%edx 2d: 74 27je 0x56 2f: f6 82 30 01 00 00 01 testb $0x1,0x130(%edx) 36: 74 1eje 0x56 and just by going by the constants in question (0xfff and 0x1ff), I can see that it's one of vlan_group_for_each_dev(..) { ... } things, but that's pretty much all I can tell. Apparently we'll get that faddr2line output soon. In the meantime, I think this is a real bug report but I don't see enough information to really go on. Got it. I should be able to get faddr2line output tomorrow. Of course, if it's bisectable, that would be great too. It looks reproducible enough to be bisectable. I'll try. Regards, Fengguang
Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf
On Tue, Nov 7, 2017 at 2:21 AM, Fengguang Wuwrote: > > FYI this happens in v4.14-rc8 -- it's not necessarily a new bug. Probably not. Looks like a use-after-free bug in vlan_device_event() judging by the base pointer: ECX: 6b6b6b6b this is one of those circumstances where having the faddr2line output for that EIP would make it much easier to see exactly which access it is that causes problems. There's lots of inlining going on, so without that it's a pain to figure out. The code is 0: 31 c0xor%eax,%eax 2: 8d 76 00 lea0x0(%esi),%esi 5: 89 c2mov%eax,%edx 7: 89 c3mov%eax,%ebx 9: 81 e2 ff 0f 00 00and$0xfff,%edx f: 89 d1mov%edx,%ecx 11: c1 fb 0c sar$0xc,%ebx 14: c1 e9 09 shr$0x9,%ecx 17: 8d 0c d9 lea(%ecx,%ebx,8),%ecx 1a: 8b 4c 8e 10 mov0x10(%esi,%ecx,4),%ecx 1e: 85 c9test %ecx,%ecx 20: 74 34je 0x56 22: 81 e2 ff 01 00 00and$0x1ff,%edx 28:* 8b 14 91 mov(%ecx,%edx,4),%edx <-- trapping instruction 2b: 85 d2test %edx,%edx 2d: 74 27je 0x56 2f: f6 82 30 01 00 00 01 testb $0x1,0x130(%edx) 36: 74 1eje 0x56 and just by going by the constants in question (0xfff and 0x1ff), I can see that it's one of vlan_group_for_each_dev(..) { ... } things, but that's pretty much all I can tell. Apparently we'll get that faddr2line output soon. In the meantime, I think this is a real bug report but I don't see enough information to really go on. Of course, if it's bisectable, that would be great too. Linus