Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-12 Thread Linus Torvalds
On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wu  wrote:
>
> OK. Here is the original faddr2line output:
>
> $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40
> vlan_device_event+0x7f5/0xa40:
> vlan_device_event at net/8021q/vlan.h:60
>
> And below is call trace embedded with full faddr2line output.
>
> I notice that this trace shows no additional inline files at all.
> Is it because I did some kconfig option wrong, so that inline info is
> lost? Eg.
>
> CONFIG_OPTIMIZE_INLINING=y (it looks better set to N)
> CONFIG_DEBUG_INFO_REDUCED=y
> CONFIG_DEBUG_INFO_SPLIT=y

Ok, this annoyed me, so I went back and looked.

It's the "CONFIG_DEBUG_INFO_SPLIT" thing that makes faddr2line unable
to see the inlining information,

Using OPTIMIZE_INLINING is fine.

I'm not sure that addr2line could be made to understand the .dwo files
that DEBUG_INFO_SPLIT causes (particularly since we munge the vmlinux
file itself, who knows how that could confuse things).

So can I ask that you make the 0day build scripts always use

  CONFIG_DEBUG_INFO=y
  CONFIG_DEBUG_INFO_REDUCED=y
  # CONFIG_DEBUG_INFO_SPLIT is not set

because with that "DEBUG_INFO_REDUCED=y", the use of DEBUG_INFO_SPLIT
shouldn't be _that_ big of a deal.

Yes, splitting the debug info does help reduce disk usage for the
build, and presumably speed it up a bit too due to less IO and reduced
copying of the debug info data, but right now it really makes the
debug info much less useful.

Just to see the difference:

 - with DEBUG_INFO_SPLIT=y

[torvalds@i7 linux]$ ./scripts/faddr2line vmlinux __schedule+0x314
__schedule+0x314/0x840:
__schedule at kernel/sched/stats.h:12

 - with DEBUG_INFO_SPLIT is not set

[torvalds@i7 linux]$ ./scripts/faddr2line vmlinux __schedule+0x314
__schedule+0x314/0x840:
rq_sched_info_arrive at kernel/sched/stats.h:12
 (inlined by) sched_info_arrive at kernel/sched/stats.h:99
 (inlined by) __sched_info_switch at kernel/sched/stats.h:151
 (inlined by) sched_info_switch at kernel/sched/stats.h:158
 (inlined by) prepare_task_switch at kernel/sched/core.c:2582
 (inlined by) context_switch at kernel/sched/core.c:2755
 (inlined by) __schedule at kernel/sched/core.c:3366

and while (once again) this is a pretty extreme case, we do use a lot
of inlines, and gcc will add its own inlining. Getting this whole
information - particularly for the faulting IP - would really help in
some situations.

I love what the 0day robot is doing, this would be another big step forward.

Oh - and talking about "big step forward" - does the 0day robot do any
suspend/resume testing at all?

Even on non-laptop hardware, it should be possible to do something like

echo platform > /sys/power/pm_test
echo freeze > /sys/power/state

or similar (assuming CONFIG_PM_DEBUG is enabled).

Maybe you already do something like this?

Anyway, regardless this was a good release for the 0day robot. Thanks.

 Linus


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-09 Thread Cong Wang
On Thu, Nov 9, 2017 at 7:51 AM, Girish Moodalbail
 wrote:
>
> Upon receiving NETDEV_DOWN event, we are calling
>
> vlan_vid_del(dev, htons(ETH_P_8021Q), 0);
>
> which in turn calls call_rcu() to queue vlan_info_free_rcu() to be called at
> some point. This free function frees the array[]
> (vlan_info.vlan_grp.vn_devices_array).  My guess is that
> vlan_info_free_rcu() is being called first and then the array[] is being
> accessed in vlan_device_event().
>

Well yes and no.

No, RCU itself is not broken and we clearly unpublish the RCU pointer
before calling call_rcu().

Yes, I see where it is broken: the grp pointer still points to old
dev->vlan_info, we should re-fetch it after vlan_vid_del().

I will send a fix.

Thanks!


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-09 Thread Girish Moodalbail

On 11/8/17 10:34 PM, Cong Wang wrote:

On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu  wrote:

Hi Alex,


So looking over the trace the panic seems to be happening after a
decnet interface is getting deleted. Is there any chance we could try
compiling the kernel without decnet support to see if that is the
source of these issues? I don't know if anyone on the Intel Wired Lan
team is testing with that enabled so if we can eliminate that as a
possible cause that would be useful.



Sure and thank you for the suggestion!

It looks disabling DECNET still triggers the vlan_device_event BUG.
However when looking at the dmesgs, I find another warning just before
the vlan_device_event BUG. Not sure if it's related one or independent
now-fixed issue.


Those decnet symbols are probably noises.


Right. This is a 32-bit Kernel compiled with CONFIG_PREEMPT=y (I am guessing 
that this has exposed some lock bug). Also, VLAN (8021q) is compiled into the 
kernel, so it registers a vlan_device_event() callback on boot. There may not be 
a VLAN device per-se.


Upon receiving NETDEV_DOWN event, we are calling

vlan_vid_del(dev, htons(ETH_P_8021Q), 0);

which in turn calls call_rcu() to queue vlan_info_free_rcu() to be called at 
some point. This free function frees the array[] 
(vlan_info.vlan_grp.vn_devices_array).  My guess is that vlan_info_free_rcu() is 
being called first and then the array[] is being accessed in vlan_device_event().


The netifd daemon in OpenWRT is marking the interface down and that is why it is 
generating NETDEV_DOWN event. And it uses ioctl(SIOCSIFFLAGS, ~IFF_UP) on a 
AF_UNIX socket. This results in a call to dev_ifsioc() in the kernel with only 
rtnl_lock() held and it is not in RCU read critical section.


~Girish




How do you reproduce it? And what is your setup? Vlan device on
top of your eth0 (e1000)?





Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu

On Thu, Nov 09, 2017 at 02:55:10PM +0800, Fengguang Wu wrote:

On Wed, Nov 08, 2017 at 10:34:10PM -0800, Cong Wang wrote:

On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu  wrote:

Hi Alex,


So looking over the trace the panic seems to be happening after a
decnet interface is getting deleted. Is there any chance we could try
compiling the kernel without decnet support to see if that is the
source of these issues? I don't know if anyone on the Intel Wired Lan
team is testing with that enabled so if we can eliminate that as a
possible cause that would be useful.



Sure and thank you for the suggestion!

It looks disabling DECNET still triggers the vlan_device_event BUG.
However when looking at the dmesgs, I find another warning just before
the vlan_device_event BUG. Not sure if it's related one or independent
now-fixed issue.


Those decnet symbols are probably noises.


Yes it's not related to CONFIG_DECNET.


How do you reproduce it? And what is your setup? Vlan device on
top of your eth0 (e1000)?


It can basically be reproduced in one of our test machines --
lkp-wsx03, which is a Westmere EX server.


Anyway if you'd like to try, here are the steps. It'll auto download
the images and run QEMU.

   apt-get install lib32gcc-7-dev # or lib32gcc-6-dev
   git clone https://github.com/intel/lkp-tests.git
   cd lkp-tests
   bin/lkp qemu -k  job-script  # job-script is attached in this 
email

Note that even in our lkp-wsx03 machine, the chance of reproducing it
is only 3% (3 out of 100 boots).

Thanks,
Fengguang
#!/bin/sh

export_top_env()
{
export suite='trinity'
export testcase='trinity'
export runtime=300
export 
job_origin='/lkp/lkp/src/allot/rand/vm-lkp-wsx03-openwrt-i386/trinity.yaml'
export testbox='vm-lkp-wsx03-openwrt-i386-5'
export tbox_group='vm-lkp-wsx03-openwrt-i386'
export kconfig='i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS'
export compiler='gcc-6'
export queue='wfg'
export branch='linus/master'
export commit='c470abd4fde40ea6a0846a2beab642a578c0b8cd'
export submit_id='5a03a4550b9a93f7c99708b0'
export 
job_file='/lkp/scheduled/vm-lkp-wsx03-openwrt-i386-5/trinity-300s-openwrt-i386-2016-03-16.cgz-c470abd4fde40ea6a0846a2beab642a578c0b8cd-20171109-63433-kf9gj3-wait_kernel-0.yaml'
export id='181954ca4367d475b88dc8de99b2d52ab533a5e1'
export model='qemu-system-i386 -enable-kvm'
export nr_vm=32
export nr_cpu=1
export memory='320M'
export rootfs='openwrt-i386-2016-03-16.cgz'
export hdd_partitions='/dev/vda'
export swap_partitions='/dev/vdb'
export need_kconfig='CONFIG_KVM_GUEST=y'
export enqueue_time='2017-11-09 08:41:58 +0800'
export _id='5a03a4560b9a93f7c99708bb'
export 
_rt='/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd'
export user='lkp'
export 
result_root='/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/0'
export LKP_SERVER='inn'
export max_uptime=1500
export initrd='/osimage/openwrt/openwrt-i386-2016-03-16.cgz'
export bootloader_append='root=/dev/ram0
user=lkp
job=/lkp/scheduled/vm-lkp-wsx03-openwrt-i386-5/trinity-300s-openwrt-i386-2016-03-16.cgz-c470abd4fde40ea6a0846a2beab642a578c0b8cd-20171109-63433-kf9gj3-wait_kernel-0.yaml
ARCH=i386
kconfig=i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS
branch=linus/master
commit=c470abd4fde40ea6a0846a2beab642a578c0b8cd
BOOT_IMAGE=/pkg/linux/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/vmlinuz-4.10.0
max_uptime=1500
RESULT_ROOT=/result/trinity/300s/vm-lkp-wsx03-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/0
LKP_SERVER=inn
debug
apic=debug
sysrq_always_enabled
rcupdate.rcu_cpu_stall_timeout=100
net.ifnames=0
printk.devkmsg=on
panic=-1
softlockup_panic=1
nmi_watchdog=panic
oops=panic
load_ramdisk=2
prompt_ramdisk=0
drbd.minor_count=8
systemd.log_level=err
ignore_loglevel
console=tty0
earlyprintk=ttyS0,115200
console=ttyS0,115200
vga=normal
rw'
export lkp_initrd='/lkp/lkp/lkp-i386.cgz'
export bm_initrd='/osimage/pkg/static/trinity-i386.cgz'
export site='inn'
export LKP_CGI_PORT=80
export LKP_CIFS_PORT=139
export 
vmlinux_file='/pkg/linux/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/vmlinux'
export 
kernel='/pkg/linux/i386-randconfig-b0-11061302-CONFIG_DRM_BOCHS/gcc-6/c470abd4fde40ea6a0846a2beab642a578c0b8cd/vmlinuz-4.10.0'
export dequeue_time='2017-11-09 09:06:15 +0800'
export 

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu

On Thu, Nov 09, 2017 at 12:09:45PM +0800, Fengguang Wu wrote:

On Thu, Nov 09, 2017 at 11:12:06AM +0800, Fengguang Wu wrote:

Hi Alex,


So looking over the trace the panic seems to be happening after a
decnet interface is getting deleted. Is there any chance we could try
compiling the kernel without decnet support to see if that is the
source of these issues? I don't know if anyone on the Intel Wired Lan
team is testing with that enabled so if we can eliminate that as a
possible cause that would be useful.


Sure and thank you for the suggestion!

It looks disabling DECNET still triggers the vlan_device_event BUG.
However when looking at the dmesgs, I find another warning just before
the vlan_device_event BUG. Not sure if it's related one or independent
now-fixed issue.

Please press Enter to activate this console.
[ 1291.938326] Writes:  Total: 2  Max/Min: 0/0   Fail: 0
[ 1297.731690] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1297.828227] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1300.506245] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1302.467460] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
LKP: HOSTNAME vm-lkp-wsx03-openwrt-i386-10, MAC , kernel 4.13.0 1, serial 
console /dev/ttyS0
[ 1304.161688] Kernel tests: Boot OK!
[ 1306.558532] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1308.507499] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1310.526380] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1311.246017] LKP: waiting for network...
[ 1312.543432] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1313.985807]
[ 1313.991541] =
[ 1314.002398] WARNING: bad unlock balance detected!
[ 1314.013154] 4.13.0 #1 Not tainted
[ 1314.021549] -
[ 1314.032505] procd/1244 is trying to release lock (rcu_preempt_state) at:
[ 1314.047216] [] rcu_read_unlock_special+0x580/0x5b0
[ 1314.059825] but there are no more locks to release!
[ 1314.070546]
[ 1314.070546] other info that might help us debug this:
[ 1314.085941] 2 locks held by procd/1244:
[ 1314.095139]  #0:  (>cred_guard_mutex){..}, at: [] 
prepare_bprm_creds+0x28/0xc0
[ 1314.114616]  #1:  (rcu_read_lock){..}, at: [] 
path_init+0x490/0x6f0
[ 1314.132155]
[ 1314.132155] stack backtrace:
[ 1314.144402] CPU: 0 PID: 1244 Comm: procd Not tainted 4.13.0 #1
[ 1314.160197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[ 1314.179404] Call Trace:
[ 1314.186768]  dump_stack+0x16/0x1c
[ 1314.195387]  print_unlock_imbalance_bug+0xb9/0xd0
[ 1314.205753]  ? rcu_read_unlock_special+0x580/0x5b0
[ 1314.216381]  ? rcu_read_unlock_special+0x580/0x5b0
[ 1314.226982]  lock_release+0x1cc/0x490
[ 1314.235602]  ? rcu_gp_kthread_wake+0x34/0x50
[ 1314.245262]  ? rcu_read_unlock_special+0x580/0x5b0
[ 1314.255724]  rt_mutex_unlock+0x1e/0xb0
[ 1314.264610]  rcu_read_unlock_special+0x580/0x5b0
[ 1314.274814]  __rcu_read_unlock+0xa7/0xb0
[ 1314.283954]  unlazy_walk+0xcf/0x1f0
[ 1314.292409]  trailing_symlink+0x349/0x4e0
[ 1314.301583]  path_openat+0x333/0x1280
[ 1314.310197]  do_filp_open+0x67/0x140
[ 1314.318696]  ? getname_kernel+0x23/0x1e0
[ 1314.327766]  ? cache_alloc_debugcheck_after+0x13a/0x2a0
[ 1314.340076]  ? getname_kernel+0x23/0x1e0
[ 1314.349179]  do_open_execat+0xab/0x2a0
[ 1314.358063]  open_exec+0x57/0x80
[ 1314.366128]  load_script+0x33c/0x3d0
[ 1314.374556]  ? kvm_sched_clock_read+0x9/0x20
[ 1314.384219]  ? sched_clock+0x9/0x10
[ 1314.392611]  ? sched_clock_cpu+0x1a/0x1e0
[ 1314.401875]  ? _raw_read_unlock+0x55/0x90
[ 1314.411080]  search_binary_handler+0xd9/0x160
[ 1314.420799]  do_execveat_common+0x8f6/0xb10
[ 1314.430334]  SyS_execve+0x1f/0x30
[ 1314.438458]  do_int80_syscall_32+0x95/0x1b0
[ 1314.447956]  entry_INT80_32+0x2f/0x2f
[ 1314.456606] EIP: 0xb7e9ab07
[ 1314.464062] EFLAGS: 0296 CPU: 0
[ 1314.472421] EAX: ffda EBX: 0807b584 ECX: bfb0fd70 EDX: 08061250
[ 1314.485257] ESI: 0807b584 EDI:  EBP: bfb0fd58 ESP: bfb0fd28
[ 1314.498024]  DS: 007b ES: 007b FS:  GS: 0033 SS: 007b
[ 1314.613681] hotplug-call (1244) used greatest stack depth: 6384 bytes left
[ 1314.957636] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1316.955154] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1318.197800] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1320.222754] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1321.409456] BUG: unable to handle kernel paging request at 6b6b6f4f
[ 1321.421942] IP: vlan_device_event+0x7f5/0xa40
[ 1321.431239] *pde = 


Note that this call trace is different from the ones posted in earlier
emails.

[ 1321.431267]
[ 1321.443356] Oops:  [#1] PREEMPT
[ 1321.451390] CPU: 0 PID: 798 Comm: netifd Not tainted 4.13.0 #1
[ 1321.462802] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[ 1321.479743] task: cf8ae5c0 

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu

On Wed, Nov 08, 2017 at 10:34:10PM -0800, Cong Wang wrote:

On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu  wrote:

Hi Alex,


So looking over the trace the panic seems to be happening after a
decnet interface is getting deleted. Is there any chance we could try
compiling the kernel without decnet support to see if that is the
source of these issues? I don't know if anyone on the Intel Wired Lan
team is testing with that enabled so if we can eliminate that as a
possible cause that would be useful.



Sure and thank you for the suggestion!

It looks disabling DECNET still triggers the vlan_device_event BUG.
However when looking at the dmesgs, I find another warning just before
the vlan_device_event BUG. Not sure if it's related one or independent
now-fixed issue.


Those decnet symbols are probably noises.


Yes it's not related to CONFIG_DECNET.


How do you reproduce it? And what is your setup? Vlan device on
top of your eth0 (e1000)?


It can basically be reproduced in one of our test machines --
lkp-wsx03, which is a Westmere EX server.

The test boots an openwrt image in QEMU and run trinity for minutes.

Here is the openwrt image:

https://github.com/0day-ci/lkp-qemu/blob/master/osimage/openwrt/openwrt-i386-2016-03-16.cgz

I don't know how openwrt deals with VLAN. :)

Thanks,
Fengguang


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu

On Thu, Nov 09, 2017 at 10:43:08AM +0800, Fengguang Wu wrote:

Of course, if it's bisectable, that would be great too.


Yes, bisect is on the way. So far it's bisecting in the 4.12 commits.


The bisect was unsuccessful due to an unrelated DRM_BOCHS oops in 4.11.
Disabling the buggy driver, I managed to reproduce the
vlan_device_event bug in 4.11. However only to find the older kernels
suffer from different kind of oops, which make the bisect troublesome.

% grep -E 'dmesg.(BUG|EIP)' v4.*/matrix.json
v4.10/matrix.json:  "dmesg.BUG:unable_to_handle_kernel": [
v4.10/matrix.json:  "dmesg.EIP:kobject_get": [
v4.10/matrix.json:  "dmesg.BUG:kernel_reboot-without-warning_in_test_stage": [
v4.11/matrix.json:  "dmesg.BUG:unable_to_handle_kernel": [
v4.11/matrix.json:  "dmesg.EIP:vlan_device_event": [
v4.8/matrix.json:  
"dmesg.BUG:kernel_reboot-without-warning_in_early-boot_stage,last_printk:Decompressing_Linux":
 [
v4.9/matrix.json:  "dmesg.BUG:key_not_in.data": [

I'll try tuning kconfig to get a good bisect base.

Hopefully we'll make the coming RC releases free from such bisect
stoppers with better regression tracking and bisect automation.


However if the situation turns out to be not that optimistic, we might
track "bisect stopper" kconfig options over kernel releases. For
bisect scripts to auto pick them up.

An example kconfig bisect-blacklist file would be:

   $ cat tools/testing/configs/bisect-4.12.config
   # CONFIG_DRM_BOCHS is not set

That would mean:

   CONFIG_DRM_BOCHS will reliably stop kernel from booting before 4.12.
   If ever bisecting into pre-4.12 kernels, better disable it.

The best fit would be the uncommon drivers and optional features.

Warnings are typically not bisect stoppers so can be ignored. Kernel
panics that happen rare enough to impact bisect may also be ignored.

Thanks,
Fengguang


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Cong Wang
On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu  wrote:
> Hi Alex,
>
>> So looking over the trace the panic seems to be happening after a
>> decnet interface is getting deleted. Is there any chance we could try
>> compiling the kernel without decnet support to see if that is the
>> source of these issues? I don't know if anyone on the Intel Wired Lan
>> team is testing with that enabled so if we can eliminate that as a
>> possible cause that would be useful.
>
>
> Sure and thank you for the suggestion!
>
> It looks disabling DECNET still triggers the vlan_device_event BUG.
> However when looking at the dmesgs, I find another warning just before
> the vlan_device_event BUG. Not sure if it's related one or independent
> now-fixed issue.

Those decnet symbols are probably noises.

How do you reproduce it? And what is your setup? Vlan device on
top of your eth0 (e1000)?


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu

On Thu, Nov 09, 2017 at 11:12:06AM +0800, Fengguang Wu wrote:

Hi Alex,


So looking over the trace the panic seems to be happening after a
decnet interface is getting deleted. Is there any chance we could try
compiling the kernel without decnet support to see if that is the
source of these issues? I don't know if anyone on the Intel Wired Lan
team is testing with that enabled so if we can eliminate that as a
possible cause that would be useful.


Sure and thank you for the suggestion!

It looks disabling DECNET still triggers the vlan_device_event BUG.
However when looking at the dmesgs, I find another warning just before
the vlan_device_event BUG. Not sure if it's related one or independent
now-fixed issue.

Please press Enter to activate this console.
[ 1291.938326] Writes:  Total: 2  Max/Min: 0/0   Fail: 0
[ 1297.731690] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1297.828227] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1300.506245] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1302.467460] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
LKP: HOSTNAME vm-lkp-wsx03-openwrt-i386-10, MAC , kernel 4.13.0 1, serial 
console /dev/ttyS0
[ 1304.161688] Kernel tests: Boot OK!
[ 1306.558532] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1308.507499] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1310.526380] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1311.246017] LKP: waiting for network...
[ 1312.543432] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1313.985807]
[ 1313.991541] =
[ 1314.002398] WARNING: bad unlock balance detected!
[ 1314.013154] 4.13.0 #1 Not tainted
[ 1314.021549] -
[ 1314.032505] procd/1244 is trying to release lock (rcu_preempt_state) at:
[ 1314.047216] [] rcu_read_unlock_special+0x580/0x5b0
[ 1314.059825] but there are no more locks to release!
[ 1314.070546]
[ 1314.070546] other info that might help us debug this:
[ 1314.085941] 2 locks held by procd/1244:
[ 1314.095139]  #0:  (>cred_guard_mutex){..}, at: [] 
prepare_bprm_creds+0x28/0xc0
[ 1314.114616]  #1:  (rcu_read_lock){..}, at: [] 
path_init+0x490/0x6f0
[ 1314.132155]
[ 1314.132155] stack backtrace:
[ 1314.144402] CPU: 0 PID: 1244 Comm: procd Not tainted 4.13.0 #1
[ 1314.160197] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[ 1314.179404] Call Trace:
[ 1314.186768]  dump_stack+0x16/0x1c
[ 1314.195387]  print_unlock_imbalance_bug+0xb9/0xd0
[ 1314.205753]  ? rcu_read_unlock_special+0x580/0x5b0
[ 1314.216381]  ? rcu_read_unlock_special+0x580/0x5b0
[ 1314.226982]  lock_release+0x1cc/0x490
[ 1314.235602]  ? rcu_gp_kthread_wake+0x34/0x50
[ 1314.245262]  ? rcu_read_unlock_special+0x580/0x5b0
[ 1314.255724]  rt_mutex_unlock+0x1e/0xb0
[ 1314.264610]  rcu_read_unlock_special+0x580/0x5b0
[ 1314.274814]  __rcu_read_unlock+0xa7/0xb0
[ 1314.283954]  unlazy_walk+0xcf/0x1f0
[ 1314.292409]  trailing_symlink+0x349/0x4e0
[ 1314.301583]  path_openat+0x333/0x1280
[ 1314.310197]  do_filp_open+0x67/0x140
[ 1314.318696]  ? getname_kernel+0x23/0x1e0
[ 1314.327766]  ? cache_alloc_debugcheck_after+0x13a/0x2a0
[ 1314.340076]  ? getname_kernel+0x23/0x1e0
[ 1314.349179]  do_open_execat+0xab/0x2a0
[ 1314.358063]  open_exec+0x57/0x80
[ 1314.366128]  load_script+0x33c/0x3d0
[ 1314.374556]  ? kvm_sched_clock_read+0x9/0x20
[ 1314.384219]  ? sched_clock+0x9/0x10
[ 1314.392611]  ? sched_clock_cpu+0x1a/0x1e0
[ 1314.401875]  ? _raw_read_unlock+0x55/0x90
[ 1314.411080]  search_binary_handler+0xd9/0x160
[ 1314.420799]  do_execveat_common+0x8f6/0xb10
[ 1314.430334]  SyS_execve+0x1f/0x30
[ 1314.438458]  do_int80_syscall_32+0x95/0x1b0
[ 1314.447956]  entry_INT80_32+0x2f/0x2f
[ 1314.456606] EIP: 0xb7e9ab07
[ 1314.464062] EFLAGS: 0296 CPU: 0
[ 1314.472421] EAX: ffda EBX: 0807b584 ECX: bfb0fd70 EDX: 08061250
[ 1314.485257] ESI: 0807b584 EDI:  EBP: bfb0fd58 ESP: bfb0fd28
[ 1314.498024]  DS: 007b ES: 007b FS:  GS: 0033 SS: 007b
[ 1314.613681] hotplug-call (1244) used greatest stack depth: 6384 bytes left
[ 1314.957636] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1316.955154] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1318.197800] 8021q: adding VLAN 0 to HW filter on device eth0
[ 1320.222754] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: 
RX
[ 1321.409456] BUG: unable to handle kernel paging request at 6b6b6f4f
[ 1321.421942] IP: vlan_device_event+0x7f5/0xa40
[ 1321.431239] *pde = 


Note that this call trace is different from the ones posted in earlier
emails.

[ 1321.431267]
[ 1321.443356] Oops:  [#1] PREEMPT
[ 1321.451390] CPU: 0 PID: 798 Comm: netifd Not tainted 4.13.0 #1
[ 1321.462802] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[ 1321.479743] task: cf8ae5c0 task.stack: cf114000
[ 1321.489345] EIP: vlan_device_event+0x7f5/0xa40

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu

Of course, if it's bisectable, that would be great too.


Yes, bisect is on the way. So far it's bisecting in the 4.12 commits.


The bisect was unsuccessful due to an unrelated DRM_BOCHS oops in 4.11.
Disabling the buggy driver, I managed to reproduce the
vlan_device_event bug in 4.11. However only to find the older kernels
suffer from different kind of oops, which make the bisect troublesome.

% grep -E 'dmesg.(BUG|EIP)' v4.*/matrix.json
v4.10/matrix.json:  "dmesg.BUG:unable_to_handle_kernel": [
v4.10/matrix.json:  "dmesg.EIP:kobject_get": [
v4.10/matrix.json:  "dmesg.BUG:kernel_reboot-without-warning_in_test_stage": [
v4.11/matrix.json:  "dmesg.BUG:unable_to_handle_kernel": [
v4.11/matrix.json:  "dmesg.EIP:vlan_device_event": [
v4.8/matrix.json:  
"dmesg.BUG:kernel_reboot-without-warning_in_early-boot_stage,last_printk:Decompressing_Linux":
 [
v4.9/matrix.json:  "dmesg.BUG:key_not_in.data": [

I'll try tuning kconfig to get a good bisect base.

Hopefully we'll make the coming RC releases free from such bisect
stoppers with better regression tracking and bisect automation.

Regards,
Fengguang


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Alexander Duyck
On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wu  wrote:
> Hi Linus,
>
>
> On Wed, Nov 08, 2017 at 08:20:38AM -0800, Linus Torvalds wrote:
>>
>> On Wed, Nov 8, 2017 at 1:48 AM, Fengguang Wu 
>> wrote:
>>>
>>>
>>> Now I got the faddr2line output. :)
>>
>>
>> Thank you, but this also shows that you then compress the output too
>> much for convenience.
>>
>>> [  745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f
>>> [  745.732871] IP: vlan_device_event at net/8021q/vlan.h:60
>>
>>
>> So this looks more legible than "vlan_device_event+0x7f5/0xa40" (or
>> whatever), but in fact it's not.
>>
>> The reason faddr2line is great is because it gives inlining
>> information, so that you can see exactly which access it is, even if
>> it's inside some helper inline function (macros still obviously are
>> going to be problematic).
>>
>> And you've cut the whole information down, to the point where there is
>> _less_ information than there used to be.
>>
>> So the full faddr2line output is actually important, and should look
>> something like this:
>>
>>__schedule+0x315/0x840:
>>rq_sched_info_arrive at kernel/sched/stats.h:13
>> (inlined by) sched_info_arrive at kernel/sched/stats.h:99
>> (inlined by) __sched_info_switch at kernel/sched/stats.h:151
>> (inlined by) sched_info_switch at kernel/sched/stats.h:158
>> (inlined by) prepare_task_switch at kernel/sched/core.c:2582
>> (inlined by) context_switch at kernel/sched/core.c:2755
>> (inlined by) __schedule at kernel/sched/core.c:3366
>>
>> which is admittedly a fairly extreme case, but it shows how involved
>> things can be.
>
>
> Indeed! I wasn't aware that faddr2line could have so informative
> output! After checking several of its outputs, I decided to pipe its
> output to |tail -1 :/  That's clearly due to my lacking of first hand
> experience on oops analyzing!
>
> IMHO this kind of informative demo in your email could be very good
> candidates to put in Documentation/. If you search faddr2line under
> Documentation there's no single mention of it (addr2line does show up
> 7 times).
>
>
>> Note how in my example above the access itself is from a header file
>> (that inlined rq_sched_info_arrive() function), and I happened to pick
>> the
>>
>>rq->rq_sched_info.pcount++;
>>
>> line. But Many inline functions are inlined from different points,
>> often many times in the same top-level function (think of the atomic
>> helper inlines, for example).
>>
>> In your case, that net/8021q/vlan.h:60 information is literally not
>> helping, because it only shows what I could already figure out from
>> looking at the constants in the disassembly: the code comes from the
>>
>>vlan_group_for_each_dev(...) {
>>..
>>
>> loop setup (it's the inline __vlan_group_get_device() function, and
>> the constants I could recognize are the VLAN_GROUP_ARRAY_PART_LEN
>> things.
>>
>> But exactly like the constants didn't tell me *which* invocation of
>> that loop it was, your "net/8021q/vlan.h:60" doesn't tell me which one
>> it is.
>>
>> There's at least _eight_ different "vlan_group_for_each_dev() {" loops
>> in vlan_device_event(), and maybe it's not all that meaningful which
>> of the eight it is in this case, in other cases it's critical.
>>
>> And since you've actually replaced the "vlan_device_event+0x7f5/0xa40"
>> with that "net/8021q/vlan.h:60" entirely, all the offset information
>> that *could* maybe have been used to pick one case over another (but
>> likely not) is also gone.
>>
>> That's why that inlining chain is important - so that we can see how
>> it got to that access in __vlan_group_get_device().
>>
>> So please do keep _all_ of the faddr2line output, at least for the
>> "IP:" line (the stack traces might not be worth it).
>
>
> OK. Here is the original faddr2line output:
>
> $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40
> vlan_device_event+0x7f5/0xa40:
> vlan_device_event at net/8021q/vlan.h:60
>
> And below is call trace embedded with full faddr2line output.
>
> I notice that this trace shows no additional inline files at all.
> Is it because I did some kconfig option wrong, so that inline info is
> lost? Eg.
>
> CONFIG_OPTIMIZE_INLINING=y (reading lib/Kconfig.debug, it looks better set
> to N)
> CONFIG_DEBUG_INFO_REDUCED=y
> CONFIG_DEBUG_INFO_SPLIT=y
> (full .config attached)
>
> [  745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f
> [  745.732871] IP: vlan_device_event+0x7f5/0xa40:
> vlan_device_event at net/8021q/vlan.h:60
> [  745.742106] *pde = 
> [  745.748587] Oops:  [#1] PREEMPT
> [  745.756104] CPU: 0 PID: 786 Comm: netifd Not tainted 4.14.0-rc8 #1
> [  745.769171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.10.2-1 04/01/2014
> [  745.786791] task: cf768780 task.stack: d187a000
> [  745.796485] EIP: vlan_device_event+0x7f5/0xa40:
> vlan_device_event at 

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Linus Torvalds
On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wu  wrote:
>
> OK. Here is the original faddr2line output:
>
> $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40
> vlan_device_event+0x7f5/0xa40:
> vlan_device_event at net/8021q/vlan.h:60

Hmm. Yes, that's not what I hoped for.

> I notice that this trace shows no additional inline files at all.
> Is it because I did some kconfig option wrong, so that inline info is
> lost? Eg.
>
> CONFIG_OPTIMIZE_INLINING=y (reading lib/Kconfig.debug, it looks better set  
> to N)
> CONFIG_DEBUG_INFO_REDUCED=y
> CONFIG_DEBUG_INFO_SPLIT=y

Hmm. It might also be a compiler/linker issue. Debug info is sadly
often a second class citizen, because it's generally not something
that gets tested nearly as much as the actual _code_ that a compiler
generates.

So we've certainly had issues before with incomplete debug info.

That said, I'm not sure how well that CONFIG_DEBUG_INFO_SPLIT works,
I've only ever tested with

  CONFIG_DEBUG_INFO=y
  CONFIG_DEBUG_INFO_REDUCED=y
  # CONFIG_DEBUG_INFO_SPLIT is not set
  # CONFIG_DEBUG_INFO_DWARF4 is not set

which was the config that my scheduler example was generated from.

It might be an issue with the particular version of 'addr2line' too,
for all I know. The one I have is

GNU addr2line version 2.27-24.fc26

and who knows what else could influence it. I *thought* that as long
as you just had DEBUG_INFO enabled, it would automatically be ok, but
I was clearly wrong.

> [  745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f
> [  745.732871] IP: vlan_device_event+0x7f5/0xa40:
> vlan_device_event at net/8021q/vlan.h:60

Ok, at least now there's no lost information, even if it doesn't have
that nice inlining information that I was hoping for.

  Linus


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu

[...]

I notice that this trace shows no additional inline files at all.
Is it because I did some kconfig option wrong, so that inline info is
lost? Eg.

CONFIG_OPTIMIZE_INLINING=y (reading lib/Kconfig.debug, it looks better set to N)
CONFIG_DEBUG_INFO_REDUCED=y
CONFIG_DEBUG_INFO_SPLIT=y
(full .config attached)


FYI it's randconfig testing, so may well contain absurd settings --
unless we add rules to force enable/disable some specific options to
make life easier.

Regards,
Fengguang


[  745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f
[  745.732871] IP: vlan_device_event+0x7f5/0xa40:
vlan_device_event at net/8021q/vlan.h:60
[  745.742106] *pde = 
[  745.748587] Oops:  [#1] PREEMPT
[  745.756104] CPU: 0 PID: 786 Comm: netifd Not tainted 4.14.0-rc8 #1
[  745.769171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[  745.786791] task: cf768780 task.stack: d187a000
[  745.796485] EIP: vlan_device_event+0x7f5/0xa40:
vlan_device_event at net/8021q/vlan.h:60
[  745.805877] EFLAGS: 00010206 CPU: 0
[  745.813237] EAX: 00f9 EBX: 0002 ECX:  EDX: 6b6b6b6b
[  745.825774] ESI: 02f9 EDI: d1de3700 EBP: d187bdd8 ESP: d187bda4
[  745.838871]  DS: 007b ES: 007b FS:  GS: 0033 SS: 0068
[  745.850218] CR0: 80050033 CR2: 6b6b6f4f CR3: 0f4c8000 CR4: 0690
[  745.862750] Call Trace:
[  745.868650]  ? dn_dev_delete+0x138/0x1d0:
dn_dev_delete at net/decnet/dn_dev.c:1224
[  745.876751]  ? dn_dev_down+0x69/0x80:
dn_dev_down at net/decnet/dn_dev.c:1240
[  745.885084]  notifier_call_chain+0x4e/0xa0:
notifier_call_chain at kernel/notifier.c:95 (discriminator 1)
[  745.894254]  raw_notifier_call_chain+0xc/0x10:
raw_notifier_call_chain at kernel/notifier.c:402
[  745.903979]  call_netdevice_notifiers_info+0x59/0x90:
call_netdevice_notifiers_info at net/core/dev.c:1672
[  745.914670]  __dev_notify_flags+0xea/0x130:
__dev_notify_flags at net/core/dev.c:1687
[  745.923446]  dev_change_flags+0x60/0x70:
dev_change_flags at net/core/dev.c:6813
[  745.931679]  dev_ifsioc+0x49b/0x610:
dev_ifsioc at net/core/dev_ioctl.c:257
[  745.939102]  ? mutex_lock_nested+0x14/0x20:
mutex_lock_nested at kernel/locking/mutex.c:909
[  745.948173]  dev_ioctl+0x36f/0xb20:
dev_ioctl at net/core/dev_ioctl.c:566
[  745.956154]  sock_ioctl+0x1cd/0x350:
sock_ioctl at net/socket.c:968
[  745.964313]  ? sock_fasync+0xb0/0xb0:
sock_ioctl at net/socket.c:984
[  745.972512]  vfs_ioctl+0x33/0x70:
vfs_ioctl at fs/ioctl.c:47
[  745.979867]  do_vfs_ioctl+0x8d/0xc60:
do_vfs_ioctl at fs/ioctl.c:690
[  745.987782]  ? kmem_cache_free+0x186/0x290:
kmem_cache_free at include/linux/rcupdate.h:777
[  745.996138]  ? putname+0x9f/0xe0:
putname at fs/namei.c:259
[  746.003434]  ? putname+0x9f/0xe0:
putname at fs/namei.c:259
[  746.011240]  ? do_sys_open+0x28d/0x420:
do_sys_open at fs/open.c:1069
[  746.019728]  ? __fget_light+0xb7/0xf0:
__fget_light at fs/file.c:739 (discriminator 2)
bad symbol size: base: 0xc1277040 end: 0xc1277040
[  746.029292]  SyS_ioctl+0x98/0xb0
[  746.036680]  do_int80_syscall_32+0x95/0x290:
do_int80_syscall_32 at arch/x86/entry/common.c:329
[  746.045685]  entry_INT80_32+0x2f/0x2f:
restore_all at arch/x86/entry/entry_32.S:536
[  746.053427] EIP: 0xb7e97648
[  746.059907] EFLAGS: 0246 CPU: 0
[  746.068336] EAX: ffda EBX: 0005 ECX: 8914 EDX: bfcaa350
[  746.081238] ESI: bfcaa350 EDI: bfcaa370 EBP: bfcaa388 ESP: bfcaa31c
[  746.093449]  DS: 007b ES: 007b FS:  GS: 0033 SS: 007b
[  746.103600] Code: 8d e0 db b4 c4 8d 56 01 89 14 8d e0 db b4 c4 0f 85 03 02 00 00 
89 7d d4 31 f6 8b 7d d8 e9 84 00 00 00 8d 74 26 00 25 ff 01 00 00 <8b> 1c 82 31 
d2 85 db 0f 95 c2 8b 04 95 cc db b4 c4 83 c0 01 85
[  746.140089] EIP: vlan_device_event+0x7f5/0xa40:
vlan_device_event at net/8021q/vlan.h:60 SS:ESP: 0068:d187bda4
[  746.153505] CR2: 6b6b6f4f
[  746.413297] Kernel tests: Boot OK!
[  748.237463] ---[ end trace 40505af7d815b57d ]---
[  748.281157] Kernel panic - not syncing: Fatal exception


Anyway, it does all mean that the use-after-free almost certainly is
that "array[]" access, which was the main suspect to begin with.

I'm adding Jeff Kirsher and the e1000 list to the cc, since apparently
the e1000 driver is involved. Of course, all the stack traces are in
generic networking code, so it is possibly a generic networking issue,
but it would probably be good to have a few people look at it.

But all of this code is ancient. So it's almost certainly not a new
bug, whatever it is.

But as a test-case for your faddr2line improvements, this is excellent!

  Linus



Regards,
Fengguang



#
# Automatically generated file; DO NOT EDIT.
# Linux/i386 4.14.0-rc8 Kernel Configuration
#
# CONFIG_64BIT is not set
CONFIG_X86_32=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf32-i386"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=8

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu

Hi Linus,

On Wed, Nov 08, 2017 at 08:20:38AM -0800, Linus Torvalds wrote:

On Wed, Nov 8, 2017 at 1:48 AM, Fengguang Wu  wrote:


Now I got the faddr2line output. :)


Thank you, but this also shows that you then compress the output too
much for convenience.


[  745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f
[  745.732871] IP: vlan_device_event at net/8021q/vlan.h:60


So this looks more legible than "vlan_device_event+0x7f5/0xa40" (or
whatever), but in fact it's not.

The reason faddr2line is great is because it gives inlining
information, so that you can see exactly which access it is, even if
it's inside some helper inline function (macros still obviously are
going to be problematic).

And you've cut the whole information down, to the point where there is
_less_ information than there used to be.

So the full faddr2line output is actually important, and should look
something like this:

   __schedule+0x315/0x840:
   rq_sched_info_arrive at kernel/sched/stats.h:13
(inlined by) sched_info_arrive at kernel/sched/stats.h:99
(inlined by) __sched_info_switch at kernel/sched/stats.h:151
(inlined by) sched_info_switch at kernel/sched/stats.h:158
(inlined by) prepare_task_switch at kernel/sched/core.c:2582
(inlined by) context_switch at kernel/sched/core.c:2755
(inlined by) __schedule at kernel/sched/core.c:3366

which is admittedly a fairly extreme case, but it shows how involved
things can be.


Indeed! I wasn't aware that faddr2line could have so informative
output! After checking several of its outputs, I decided to pipe its
output to |tail -1 :/  That's clearly due to my lacking of first hand
experience on oops analyzing!

IMHO this kind of informative demo in your email could be very good
candidates to put in Documentation/. If you search faddr2line under
Documentation there's no single mention of it (addr2line does show up
7 times).


Note how in my example above the access itself is from a header file
(that inlined rq_sched_info_arrive() function), and I happened to pick
the

   rq->rq_sched_info.pcount++;

line. But Many inline functions are inlined from different points,
often many times in the same top-level function (think of the atomic
helper inlines, for example).

In your case, that net/8021q/vlan.h:60 information is literally not
helping, because it only shows what I could already figure out from
looking at the constants in the disassembly: the code comes from the

   vlan_group_for_each_dev(...) {
   ..

loop setup (it's the inline __vlan_group_get_device() function, and
the constants I could recognize are the VLAN_GROUP_ARRAY_PART_LEN
things.

But exactly like the constants didn't tell me *which* invocation of
that loop it was, your "net/8021q/vlan.h:60" doesn't tell me which one
it is.

There's at least _eight_ different "vlan_group_for_each_dev() {" loops
in vlan_device_event(), and maybe it's not all that meaningful which
of the eight it is in this case, in other cases it's critical.

And since you've actually replaced the "vlan_device_event+0x7f5/0xa40"
with that "net/8021q/vlan.h:60" entirely, all the offset information
that *could* maybe have been used to pick one case over another (but
likely not) is also gone.

That's why that inlining chain is important - so that we can see how
it got to that access in __vlan_group_get_device().

So please do keep _all_ of the faddr2line output, at least for the
"IP:" line (the stack traces might not be worth it).


OK. Here is the original faddr2line output:

$ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40
vlan_device_event+0x7f5/0xa40:
vlan_device_event at net/8021q/vlan.h:60

And below is call trace embedded with full faddr2line output.

I notice that this trace shows no additional inline files at all.
Is it because I did some kconfig option wrong, so that inline info is
lost? Eg.

CONFIG_OPTIMIZE_INLINING=y (reading lib/Kconfig.debug, it looks better set to N)
CONFIG_DEBUG_INFO_REDUCED=y
CONFIG_DEBUG_INFO_SPLIT=y
(full .config attached)

[  745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f
[  745.732871] IP: vlan_device_event+0x7f5/0xa40:
vlan_device_event at net/8021q/vlan.h:60
[  745.742106] *pde = 
[  745.748587] Oops:  [#1] PREEMPT
[  745.756104] CPU: 0 PID: 786 Comm: netifd Not tainted 4.14.0-rc8 #1
[  745.769171] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
1.10.2-1 04/01/2014
[  745.786791] task: cf768780 task.stack: d187a000
[  745.796485] EIP: vlan_device_event+0x7f5/0xa40:
vlan_device_event at net/8021q/vlan.h:60
[  745.805877] EFLAGS: 00010206 CPU: 0
[  745.813237] EAX: 00f9 EBX: 0002 ECX:  EDX: 6b6b6b6b
[  745.825774] ESI: 02f9 EDI: d1de3700 EBP: d187bdd8 ESP: d187bda4
[  745.838871]  DS: 007b ES: 007b FS:  GS: 0033 SS: 0068
[  745.850218] CR0: 80050033 CR2: 6b6b6f4f CR3: 0f4c8000 CR4: 0690
[  745.862750] Call Trace:
[  745.868650]  ? 

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Linus Torvalds
On Wed, Nov 8, 2017 at 1:48 AM, Fengguang Wu  wrote:
>
> Now I got the faddr2line output. :)

Thank you, but this also shows that you then compress the output too
much for convenience.

> [  745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f
> [  745.732871] IP: vlan_device_event at net/8021q/vlan.h:60

So this looks more legible than "vlan_device_event+0x7f5/0xa40" (or
whatever), but in fact it's not.

The reason faddr2line is great is because it gives inlining
information, so that you can see exactly which access it is, even if
it's inside some helper inline function (macros still obviously are
going to be problematic).

And you've cut the whole information down, to the point where there is
_less_ information than there used to be.

So the full faddr2line output is actually important, and should look
something like this:

__schedule+0x315/0x840:
rq_sched_info_arrive at kernel/sched/stats.h:13
 (inlined by) sched_info_arrive at kernel/sched/stats.h:99
 (inlined by) __sched_info_switch at kernel/sched/stats.h:151
 (inlined by) sched_info_switch at kernel/sched/stats.h:158
 (inlined by) prepare_task_switch at kernel/sched/core.c:2582
 (inlined by) context_switch at kernel/sched/core.c:2755
 (inlined by) __schedule at kernel/sched/core.c:3366

which is admittedly a fairly extreme case, but it shows how involved
things can be.

Note how in my example above the access itself is from a header file
(that inlined rq_sched_info_arrive() function), and I happened to pick
the

rq->rq_sched_info.pcount++;

line. But Many inline functions are inlined from different points,
often many times in the same top-level function (think of the atomic
helper inlines, for example).

In your case, that net/8021q/vlan.h:60 information is literally not
helping, because it only shows what I could already figure out from
looking at the constants in the disassembly: the code comes from the

vlan_group_for_each_dev(...) {
..

loop setup (it's the inline __vlan_group_get_device() function, and
the constants I could recognize are the VLAN_GROUP_ARRAY_PART_LEN
things.

But exactly like the constants didn't tell me *which* invocation of
that loop it was, your "net/8021q/vlan.h:60" doesn't tell me which one
it is.

There's at least _eight_ different "vlan_group_for_each_dev() {" loops
in vlan_device_event(), and maybe it's not all that meaningful which
of the eight it is in this case, in other cases it's critical.

And since you've actually replaced the "vlan_device_event+0x7f5/0xa40"
with that "net/8021q/vlan.h:60" entirely, all the offset information
that *could* maybe have been used to pick one case over another (but
likely not) is also gone.

That's why that inlining chain is important - so that we can see how
it got to that access in __vlan_group_get_device().

So please do keep _all_ of the faddr2line output, at least for the
"IP:" line (the stack traces might not be worth it).

Anyway, it does all mean that the use-after-free almost certainly is
that "array[]" access, which was the main suspect to begin with.

I'm adding Jeff Kirsher and the e1000 list to the cc, since apparently
the e1000 driver is involved. Of course, all the stack traces are in
generic networking code, so it is possibly a generic networking issue,
but it would probably be good to have a few people look at it.

But all of this code is ancient. So it's almost certainly not a new
bug, whatever it is.

But as a test-case for your faddr2line improvements, this is excellent!

   Linus


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-07 Thread Fengguang Wu

On Tue, Nov 07, 2017 at 08:25:03AM -0800, Linus Torvalds wrote:

On Tue, Nov 7, 2017 at 2:21 AM, Fengguang Wu  wrote:


FYI this happens in v4.14-rc8 -- it's not necessarily a new bug.


Probably not.

Looks like a use-after-free bug in vlan_device_event() judging by the
base pointer:

   ECX: 6b6b6b6b

this is one of those circumstances where having the faddr2line output
for that EIP would make it much easier to see exactly which access it
is that causes problems. There's lots of inlining going on, so without
that it's a pain to figure out.

The code is

  0: 31 c0xor%eax,%eax
  2: 8d 76 00  lea0x0(%esi),%esi
  5: 89 c2mov%eax,%edx
  7: 89 c3mov%eax,%ebx
  9: 81 e2 ff 0f 00 00and$0xfff,%edx
  f: 89 d1mov%edx,%ecx
 11: c1 fb 0c  sar$0xc,%ebx
 14: c1 e9 09  shr$0x9,%ecx
 17: 8d 0c d9  lea(%ecx,%ebx,8),%ecx
 1a: 8b 4c 8e 10  mov0x10(%esi,%ecx,4),%ecx
 1e: 85 c9test   %ecx,%ecx
 20: 74 34je 0x56
 22: 81 e2 ff 01 00 00and$0x1ff,%edx
 28:* 8b 14 91  mov(%ecx,%edx,4),%edx <-- trapping instruction
 2b: 85 d2test   %edx,%edx
 2d: 74 27je 0x56
 2f: f6 82 30 01 00 00 01 testb  $0x1,0x130(%edx)
 36: 74 1eje 0x56

and just by going by the constants in question (0xfff and 0x1ff), I
can see that it's one of

   vlan_group_for_each_dev(..) {
   ...
   }

things, but that's pretty much all I can tell.

Apparently we'll get that faddr2line output soon. In the meantime, I
think this is a real bug report but I don't see enough information to
really go on.


Got it. I should be able to get faddr2line output tomorrow.


Of course, if it's bisectable, that would be great too.


It looks reproducible enough to be bisectable. I'll try.

Regards,
Fengguang


Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-07 Thread Linus Torvalds
On Tue, Nov 7, 2017 at 2:21 AM, Fengguang Wu  wrote:
>
> FYI this happens in v4.14-rc8 -- it's not necessarily a new bug.

Probably not.

Looks like a use-after-free bug in vlan_device_event() judging by the
base pointer:

ECX: 6b6b6b6b

this is one of those circumstances where having the faddr2line output
for that EIP would make it much easier to see exactly which access it
is that causes problems. There's lots of inlining going on, so without
that it's a pain to figure out.

The code is

   0: 31 c0xor%eax,%eax
   2: 8d 76 00  lea0x0(%esi),%esi
   5: 89 c2mov%eax,%edx
   7: 89 c3mov%eax,%ebx
   9: 81 e2 ff 0f 00 00and$0xfff,%edx
   f: 89 d1mov%edx,%ecx
  11: c1 fb 0c  sar$0xc,%ebx
  14: c1 e9 09  shr$0x9,%ecx
  17: 8d 0c d9  lea(%ecx,%ebx,8),%ecx
  1a: 8b 4c 8e 10  mov0x10(%esi,%ecx,4),%ecx
  1e: 85 c9test   %ecx,%ecx
  20: 74 34je 0x56
  22: 81 e2 ff 01 00 00and$0x1ff,%edx
  28:* 8b 14 91  mov(%ecx,%edx,4),%edx <-- trapping instruction
  2b: 85 d2test   %edx,%edx
  2d: 74 27je 0x56
  2f: f6 82 30 01 00 00 01 testb  $0x1,0x130(%edx)
  36: 74 1eje 0x56

and just by going by the constants in question (0xfff and 0x1ff), I
can see that it's one of

vlan_group_for_each_dev(..) {
...
}

things, but that's pretty much all I can tell.

Apparently we'll get that faddr2line output soon. In the meantime, I
think this is a real bug report but I don't see enough information to
really go on.

Of course, if it's bisectable, that would be great too.

Linus