Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-10-22 Thread Kees Cook
On Sat, Sep 3, 2016 at 5:35 PM, kernel test robot  wrote:
>
> FYI, we noticed the following commit:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
> dev.2016.08.19a
> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
> debug checking into separate function")
>
> in testcase: boot
>
> on test machine: qemu-system-i386 -enable-kvm -m 360M
>
> caused below changes:

Okay, I've finally decoded this test result. There is nothing scary
here. I will explain below...

> ++++
> || 0a33d912a3 | 
> 8052ce2d97 |
> ++++
> | boot_successes | 0  | 0 
>  |
> | boot_failures  | 211| 
> 18 |

First, the prior commit (before my change) also never successfully booted.

> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211| 
> 18 |

In both cases, list_del threw a warning every time.

> | calltrace:init | 209| 
> 18 |
> | WARNING:at_lib/list_debug.c:#__list_add| 209|   
>  |

Before the change, subsequent list_adds would also throw a warning
(since memory had gone ahead and gotten corrupted in the prior
list_del calls). After the change, this stopped happening since the
corruption got isolated.

> | calltrace:i8042_init   | 209| 
> 17 |
> | calltrace:SyS_reboot   | 192| 4 
>  |
> | EIP_is_at__mutex_unlock_slowpath   | 29 |   
>  |
> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138|   
>  |
> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 |   
>  |
> | EIP_is_at_lock_release | 7  |   
>  |
> | EIP_is_at_lock_acquire | 28 |   
>  |
> | EIP_is_at_raw_spin_unlock_irq  | 30 |   
>  |
> | EIP_is_at_mutex_lock_nested| 21 |   
>  |
> | EIP_is_at_lock_is_held | 10 |   
>  |
> | EIP_is_at_device_shutdown  | 3  |   
>  |
> | EIP_is_at___might_sleep| 5  |   
>  |
> | EIP_is_at__list_del_entry  | 4  | 1 
>  |
> | EIP_is_at_get_device   | 1  |   
>  |
> | BUG:unable_to_handle_kernel| 7  | 5 
>  |
> | Oops   | 7  | 5 
>  |
> | calltrace:SyS_ipc  | 4  | 4 
>  |
> | Kernel_panic-not_syncing:Fatal_exception   | 7  | 5 
>  |
> | EIP_is_at_pm_runtime_barrier   | 1  |   
>  |
> | EIP_is_at_lock_acquired| 1  |   
>  |
> | EIP_is_at__might_sleep | 1  |   
>  |
> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  |   
>  |
> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 |   
>  |
> | calltrace:mark_rodata_ro   | 26 |   
>  |
> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  | 
> 17 |
> | calltrace:serio_handle_event   | 0  | 
> 17 |
> | calltrace:userio_misc_init | 0  | 
> 17 |
> | calltrace:mousedev_init| 0  | 
> 17 |
> | calltrace:evdev_init   | 0  | 
> 17 |
> | calltrace:cmos_init| 0  | 
> 17 |
> | calltrace:test_init| 0  | 
> 17 |
> | calltrace:pca_isa_driver_init  | 0  | 
> 17 |
> | calltrace:vim2m_init   | 0  | 
> 17 |
> | calltrace:fm_drv_init  | 0  | 
> 17 |
> | calltrace:test_power_init  | 0  | 
> 17 |
> | calltrace:raid#_init   | 0 

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-10-22 Thread Kees Cook
On Sat, Sep 3, 2016 at 5:35 PM, kernel test robot  wrote:
>
> FYI, we noticed the following commit:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
> dev.2016.08.19a
> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
> debug checking into separate function")
>
> in testcase: boot
>
> on test machine: qemu-system-i386 -enable-kvm -m 360M
>
> caused below changes:

Okay, I've finally decoded this test result. There is nothing scary
here. I will explain below...

> ++++
> || 0a33d912a3 | 
> 8052ce2d97 |
> ++++
> | boot_successes | 0  | 0 
>  |
> | boot_failures  | 211| 
> 18 |

First, the prior commit (before my change) also never successfully booted.

> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211| 
> 18 |

In both cases, list_del threw a warning every time.

> | calltrace:init | 209| 
> 18 |
> | WARNING:at_lib/list_debug.c:#__list_add| 209|   
>  |

Before the change, subsequent list_adds would also throw a warning
(since memory had gone ahead and gotten corrupted in the prior
list_del calls). After the change, this stopped happening since the
corruption got isolated.

> | calltrace:i8042_init   | 209| 
> 17 |
> | calltrace:SyS_reboot   | 192| 4 
>  |
> | EIP_is_at__mutex_unlock_slowpath   | 29 |   
>  |
> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138|   
>  |
> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 |   
>  |
> | EIP_is_at_lock_release | 7  |   
>  |
> | EIP_is_at_lock_acquire | 28 |   
>  |
> | EIP_is_at_raw_spin_unlock_irq  | 30 |   
>  |
> | EIP_is_at_mutex_lock_nested| 21 |   
>  |
> | EIP_is_at_lock_is_held | 10 |   
>  |
> | EIP_is_at_device_shutdown  | 3  |   
>  |
> | EIP_is_at___might_sleep| 5  |   
>  |
> | EIP_is_at__list_del_entry  | 4  | 1 
>  |
> | EIP_is_at_get_device   | 1  |   
>  |
> | BUG:unable_to_handle_kernel| 7  | 5 
>  |
> | Oops   | 7  | 5 
>  |
> | calltrace:SyS_ipc  | 4  | 4 
>  |
> | Kernel_panic-not_syncing:Fatal_exception   | 7  | 5 
>  |
> | EIP_is_at_pm_runtime_barrier   | 1  |   
>  |
> | EIP_is_at_lock_acquired| 1  |   
>  |
> | EIP_is_at__might_sleep | 1  |   
>  |
> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  |   
>  |
> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 |   
>  |
> | calltrace:mark_rodata_ro   | 26 |   
>  |
> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  | 
> 17 |
> | calltrace:serio_handle_event   | 0  | 
> 17 |
> | calltrace:userio_misc_init | 0  | 
> 17 |
> | calltrace:mousedev_init| 0  | 
> 17 |
> | calltrace:evdev_init   | 0  | 
> 17 |
> | calltrace:cmos_init| 0  | 
> 17 |
> | calltrace:test_init| 0  | 
> 17 |
> | calltrace:pca_isa_driver_init  | 0  | 
> 17 |
> | calltrace:vim2m_init   | 0  | 
> 17 |
> | calltrace:fm_drv_init  | 0  | 
> 17 |
> | calltrace:test_power_init  | 0  | 
> 17 |
> | calltrace:raid#_init   | 0  | 
> 17   

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-08 Thread Kees Cook
On Thu, Sep 8, 2016 at 2:27 AM, Paul E. McKenney
 wrote:
> On Wed, Sep 07, 2016 at 10:25:58AM -0700, Kees Cook wrote:
>> On Wed, Sep 7, 2016 at 12:31 AM, Paul E. McKenney
>>  wrote:
>> > On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote:
>> >> On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
>> >>  wrote:
>> >> > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
>> >> >>
>> >> >> FYI, we noticed the following commit:
>> >> >>
>> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
>> >> >> dev.2016.08.19a
>> >> >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split 
>> >> >> list_add() debug checking into separate function")
>> >> >>
>> >> >> in testcase: boot
>> >> >>
>> >> >> on test machine: qemu-system-i386 -enable-kvm -m 360M
>> >> >>
>> >> >> caused below changes:
>> >> >
>> >> > Hello, Kees,
>> >> >
>> >> > Looks like your change might have actually found a bug.  ;-)
>> >> >
>> >> > How would you like to handle this?
>> >>
>> >> We should fix the bug! :)
>> >
>> > But of course...
>> >
>> > The question is whether I am considered to be fully responsible for
>> > chasing down any bugs that arise.  It is looking like I am, in which
>> > case I must defer this patch to the 4.10 merge window.  Things are a
>> > bit crazy now due to prior overcommitments...
>>
>> I'm going to double-check the code, but the changes I made shouldn't
>> have changed any of the logical checks under CONFIG_DEBUG_LIST... just
>> the manifestation of WARN vs BUG. So, since both this and the BPF RCU
>> bug have appeared, something has changed... I will go digging...
>
> So maybe the boot failures are a result of the WARN->BUG transition?
>
>> >> >> [  152.698509] [ cut here ]
>> >> >> [  152.709786] WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 
>> >> >> __list_del_entry+0x4d/0x71
>> >> >> [  152.722785] list_del corruption. next->prev should be cc0230d4, but 
>> >> >> was 6b6b6b6b
>> >>
>> >> That's the poison value, so this is a remove-after-remove bug?

Unfortunately, the 0-day backtrace shows WARN :(

-Kees

-- 
Kees Cook
Nexus Security


Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-08 Thread Kees Cook
On Thu, Sep 8, 2016 at 2:27 AM, Paul E. McKenney
 wrote:
> On Wed, Sep 07, 2016 at 10:25:58AM -0700, Kees Cook wrote:
>> On Wed, Sep 7, 2016 at 12:31 AM, Paul E. McKenney
>>  wrote:
>> > On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote:
>> >> On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
>> >>  wrote:
>> >> > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
>> >> >>
>> >> >> FYI, we noticed the following commit:
>> >> >>
>> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
>> >> >> dev.2016.08.19a
>> >> >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split 
>> >> >> list_add() debug checking into separate function")
>> >> >>
>> >> >> in testcase: boot
>> >> >>
>> >> >> on test machine: qemu-system-i386 -enable-kvm -m 360M
>> >> >>
>> >> >> caused below changes:
>> >> >
>> >> > Hello, Kees,
>> >> >
>> >> > Looks like your change might have actually found a bug.  ;-)
>> >> >
>> >> > How would you like to handle this?
>> >>
>> >> We should fix the bug! :)
>> >
>> > But of course...
>> >
>> > The question is whether I am considered to be fully responsible for
>> > chasing down any bugs that arise.  It is looking like I am, in which
>> > case I must defer this patch to the 4.10 merge window.  Things are a
>> > bit crazy now due to prior overcommitments...
>>
>> I'm going to double-check the code, but the changes I made shouldn't
>> have changed any of the logical checks under CONFIG_DEBUG_LIST... just
>> the manifestation of WARN vs BUG. So, since both this and the BPF RCU
>> bug have appeared, something has changed... I will go digging...
>
> So maybe the boot failures are a result of the WARN->BUG transition?
>
>> >> >> [  152.698509] [ cut here ]
>> >> >> [  152.709786] WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 
>> >> >> __list_del_entry+0x4d/0x71
>> >> >> [  152.722785] list_del corruption. next->prev should be cc0230d4, but 
>> >> >> was 6b6b6b6b
>> >>
>> >> That's the poison value, so this is a remove-after-remove bug?

Unfortunately, the 0-day backtrace shows WARN :(

-Kees

-- 
Kees Cook
Nexus Security


Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-08 Thread Paul E. McKenney
On Wed, Sep 07, 2016 at 10:25:58AM -0700, Kees Cook wrote:
> On Wed, Sep 7, 2016 at 12:31 AM, Paul E. McKenney
>  wrote:
> > On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote:
> >> On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
> >>  wrote:
> >> > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
> >> >>
> >> >> FYI, we noticed the following commit:
> >> >>
> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
> >> >> dev.2016.08.19a
> >> >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split 
> >> >> list_add() debug checking into separate function")
> >> >>
> >> >> in testcase: boot
> >> >>
> >> >> on test machine: qemu-system-i386 -enable-kvm -m 360M
> >> >>
> >> >> caused below changes:
> >> >
> >> > Hello, Kees,
> >> >
> >> > Looks like your change might have actually found a bug.  ;-)
> >> >
> >> > How would you like to handle this?
> >>
> >> We should fix the bug! :)
> >
> > But of course...
> >
> > The question is whether I am considered to be fully responsible for
> > chasing down any bugs that arise.  It is looking like I am, in which
> > case I must defer this patch to the 4.10 merge window.  Things are a
> > bit crazy now due to prior overcommitments...
> 
> I'm going to double-check the code, but the changes I made shouldn't
> have changed any of the logical checks under CONFIG_DEBUG_LIST... just
> the manifestation of WARN vs BUG. So, since both this and the BPF RCU
> bug have appeared, something has changed... I will go digging...

So maybe the boot failures are a result of the WARN->BUG transition?

Thanx, Paul

> -Kees
> 
> > Yes, yes, they will be crazy then as well, but this would be higher
> > on the priority list.
> >
> > Thanx, Paul
> >
> >> >> ++++
> >> >> || 
> >> >> 0a33d912a3 | 8052ce2d97 |
> >> >> ++++
> >> >> | boot_successes | 0
> >> >>   | 0  |
> >> >> | boot_failures  | 211  
> >> >>   | 18 |
> >> >> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211  
> >> >>   | 18 |
> >> >> | calltrace:init | 209  
> >> >>   | 18 |
> >> >> | WARNING:at_lib/list_debug.c:#__list_add| 209  
> >> >>   ||
> >> >> | calltrace:i8042_init   | 209  
> >> >>   | 17 |
> >> >> | calltrace:SyS_reboot   | 192  
> >> >>   | 4  |
> >> >> | EIP_is_at__mutex_unlock_slowpath   | 29   
> >> >>   ||
> >> >> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138  
> >> >>   ||
> >> >> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56   
> >> >>   ||
> >> >> | EIP_is_at_lock_release | 7
> >> >>   ||
> >> >> | EIP_is_at_lock_acquire | 28   
> >> >>   ||
> >> >> | EIP_is_at_raw_spin_unlock_irq  | 30   
> >> >>   ||
> >> >> | EIP_is_at_mutex_lock_nested| 21   
> >> >>   ||
> >> >> | EIP_is_at_lock_is_held | 10   
> >> >>   ||
> >> >> | EIP_is_at_device_shutdown  | 3
> >> >>   ||
> >> >> | EIP_is_at___might_sleep| 5
> >> >>   ||
> >> >> | EIP_is_at__list_del_entry  | 4
> >> >>   | 1  |
> >> >> | EIP_is_at_get_device   | 1
> >> >>   ||
> >> >> | BUG:unable_to_handle_kernel| 7
> >> >>   | 5  |
> >> >> | Oops   | 7
> >> >>   | 5  |
> >> >> | calltrace:SyS_ipc  | 4
> >> >>   | 4  |
> >> >> | Kernel_panic-not_syncing:Fatal_exception   | 7
> >> >>   | 5  |
> >> >> | EIP_is_at_pm_runtime_barrier   | 1
> >> >>   ||
> >> >> | EIP_is_at_lock_acquired| 1
> >> >>   ||
> >> >> | EIP_is_at__might_sleep | 1
> >> >>   ||
> >> >> | 

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-08 Thread Paul E. McKenney
On Wed, Sep 07, 2016 at 10:25:58AM -0700, Kees Cook wrote:
> On Wed, Sep 7, 2016 at 12:31 AM, Paul E. McKenney
>  wrote:
> > On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote:
> >> On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
> >>  wrote:
> >> > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
> >> >>
> >> >> FYI, we noticed the following commit:
> >> >>
> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
> >> >> dev.2016.08.19a
> >> >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split 
> >> >> list_add() debug checking into separate function")
> >> >>
> >> >> in testcase: boot
> >> >>
> >> >> on test machine: qemu-system-i386 -enable-kvm -m 360M
> >> >>
> >> >> caused below changes:
> >> >
> >> > Hello, Kees,
> >> >
> >> > Looks like your change might have actually found a bug.  ;-)
> >> >
> >> > How would you like to handle this?
> >>
> >> We should fix the bug! :)
> >
> > But of course...
> >
> > The question is whether I am considered to be fully responsible for
> > chasing down any bugs that arise.  It is looking like I am, in which
> > case I must defer this patch to the 4.10 merge window.  Things are a
> > bit crazy now due to prior overcommitments...
> 
> I'm going to double-check the code, but the changes I made shouldn't
> have changed any of the logical checks under CONFIG_DEBUG_LIST... just
> the manifestation of WARN vs BUG. So, since both this and the BPF RCU
> bug have appeared, something has changed... I will go digging...

So maybe the boot failures are a result of the WARN->BUG transition?

Thanx, Paul

> -Kees
> 
> > Yes, yes, they will be crazy then as well, but this would be higher
> > on the priority list.
> >
> > Thanx, Paul
> >
> >> >> ++++
> >> >> || 
> >> >> 0a33d912a3 | 8052ce2d97 |
> >> >> ++++
> >> >> | boot_successes | 0
> >> >>   | 0  |
> >> >> | boot_failures  | 211  
> >> >>   | 18 |
> >> >> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211  
> >> >>   | 18 |
> >> >> | calltrace:init | 209  
> >> >>   | 18 |
> >> >> | WARNING:at_lib/list_debug.c:#__list_add| 209  
> >> >>   ||
> >> >> | calltrace:i8042_init   | 209  
> >> >>   | 17 |
> >> >> | calltrace:SyS_reboot   | 192  
> >> >>   | 4  |
> >> >> | EIP_is_at__mutex_unlock_slowpath   | 29   
> >> >>   ||
> >> >> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138  
> >> >>   ||
> >> >> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56   
> >> >>   ||
> >> >> | EIP_is_at_lock_release | 7
> >> >>   ||
> >> >> | EIP_is_at_lock_acquire | 28   
> >> >>   ||
> >> >> | EIP_is_at_raw_spin_unlock_irq  | 30   
> >> >>   ||
> >> >> | EIP_is_at_mutex_lock_nested| 21   
> >> >>   ||
> >> >> | EIP_is_at_lock_is_held | 10   
> >> >>   ||
> >> >> | EIP_is_at_device_shutdown  | 3
> >> >>   ||
> >> >> | EIP_is_at___might_sleep| 5
> >> >>   ||
> >> >> | EIP_is_at__list_del_entry  | 4
> >> >>   | 1  |
> >> >> | EIP_is_at_get_device   | 1
> >> >>   ||
> >> >> | BUG:unable_to_handle_kernel| 7
> >> >>   | 5  |
> >> >> | Oops   | 7
> >> >>   | 5  |
> >> >> | calltrace:SyS_ipc  | 4
> >> >>   | 4  |
> >> >> | Kernel_panic-not_syncing:Fatal_exception   | 7
> >> >>   | 5  |
> >> >> | EIP_is_at_pm_runtime_barrier   | 1
> >> >>   ||
> >> >> | EIP_is_at_lock_acquired| 1
> >> >>   ||
> >> >> | EIP_is_at__might_sleep | 1
> >> >>   ||
> >> >> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-07 Thread Kees Cook
On Wed, Sep 7, 2016 at 12:31 AM, Paul E. McKenney
 wrote:
> On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote:
>> On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
>>  wrote:
>> > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
>> >>
>> >> FYI, we noticed the following commit:
>> >>
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
>> >> dev.2016.08.19a
>> >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
>> >> debug checking into separate function")
>> >>
>> >> in testcase: boot
>> >>
>> >> on test machine: qemu-system-i386 -enable-kvm -m 360M
>> >>
>> >> caused below changes:
>> >
>> > Hello, Kees,
>> >
>> > Looks like your change might have actually found a bug.  ;-)
>> >
>> > How would you like to handle this?
>>
>> We should fix the bug! :)
>
> But of course...
>
> The question is whether I am considered to be fully responsible for
> chasing down any bugs that arise.  It is looking like I am, in which
> case I must defer this patch to the 4.10 merge window.  Things are a
> bit crazy now due to prior overcommitments...

I'm going to double-check the code, but the changes I made shouldn't
have changed any of the logical checks under CONFIG_DEBUG_LIST... just
the manifestation of WARN vs BUG. So, since both this and the BPF RCU
bug have appeared, something has changed... I will go digging...

-Kees

>
> Yes, yes, they will be crazy then as well, but this would be higher
> on the priority list.
>
> Thanx, Paul
>
>> >> ++++
>> >> || 0a33d912a3 
>> >> | 8052ce2d97 |
>> >> ++++
>> >> | boot_successes | 0  
>> >> | 0  |
>> >> | boot_failures  | 211
>> >> | 18 |
>> >> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211
>> >> | 18 |
>> >> | calltrace:init | 209
>> >> | 18 |
>> >> | WARNING:at_lib/list_debug.c:#__list_add| 209
>> >> ||
>> >> | calltrace:i8042_init   | 209
>> >> | 17 |
>> >> | calltrace:SyS_reboot   | 192
>> >> | 4  |
>> >> | EIP_is_at__mutex_unlock_slowpath   | 29 
>> >> ||
>> >> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138
>> >> ||
>> >> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 
>> >> ||
>> >> | EIP_is_at_lock_release | 7  
>> >> ||
>> >> | EIP_is_at_lock_acquire | 28 
>> >> ||
>> >> | EIP_is_at_raw_spin_unlock_irq  | 30 
>> >> ||
>> >> | EIP_is_at_mutex_lock_nested| 21 
>> >> ||
>> >> | EIP_is_at_lock_is_held | 10 
>> >> ||
>> >> | EIP_is_at_device_shutdown  | 3  
>> >> ||
>> >> | EIP_is_at___might_sleep| 5  
>> >> ||
>> >> | EIP_is_at__list_del_entry  | 4  
>> >> | 1  |
>> >> | EIP_is_at_get_device   | 1  
>> >> ||
>> >> | BUG:unable_to_handle_kernel| 7  
>> >> | 5  |
>> >> | Oops   | 7  
>> >> | 5  |
>> >> | calltrace:SyS_ipc  | 4  
>> >> | 4  |
>> >> | Kernel_panic-not_syncing:Fatal_exception   | 7  
>> >> | 5  |
>> >> | EIP_is_at_pm_runtime_barrier   | 1  
>> >> ||
>> >> | EIP_is_at_lock_acquired| 1  
>> >> ||
>> >> | EIP_is_at__might_sleep | 1  
>> >> ||
>> >> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  
>> >> ||
>> >> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 
>> >> ||
>> >> | calltrace:mark_rodata_ro   | 26 
>> >> ||
>> >> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  
>> >> | 17 |
>> >> | 

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-07 Thread Kees Cook
On Wed, Sep 7, 2016 at 12:31 AM, Paul E. McKenney
 wrote:
> On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote:
>> On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
>>  wrote:
>> > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
>> >>
>> >> FYI, we noticed the following commit:
>> >>
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
>> >> dev.2016.08.19a
>> >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
>> >> debug checking into separate function")
>> >>
>> >> in testcase: boot
>> >>
>> >> on test machine: qemu-system-i386 -enable-kvm -m 360M
>> >>
>> >> caused below changes:
>> >
>> > Hello, Kees,
>> >
>> > Looks like your change might have actually found a bug.  ;-)
>> >
>> > How would you like to handle this?
>>
>> We should fix the bug! :)
>
> But of course...
>
> The question is whether I am considered to be fully responsible for
> chasing down any bugs that arise.  It is looking like I am, in which
> case I must defer this patch to the 4.10 merge window.  Things are a
> bit crazy now due to prior overcommitments...

I'm going to double-check the code, but the changes I made shouldn't
have changed any of the logical checks under CONFIG_DEBUG_LIST... just
the manifestation of WARN vs BUG. So, since both this and the BPF RCU
bug have appeared, something has changed... I will go digging...

-Kees

>
> Yes, yes, they will be crazy then as well, but this would be higher
> on the priority list.
>
> Thanx, Paul
>
>> >> ++++
>> >> || 0a33d912a3 
>> >> | 8052ce2d97 |
>> >> ++++
>> >> | boot_successes | 0  
>> >> | 0  |
>> >> | boot_failures  | 211
>> >> | 18 |
>> >> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211
>> >> | 18 |
>> >> | calltrace:init | 209
>> >> | 18 |
>> >> | WARNING:at_lib/list_debug.c:#__list_add| 209
>> >> ||
>> >> | calltrace:i8042_init   | 209
>> >> | 17 |
>> >> | calltrace:SyS_reboot   | 192
>> >> | 4  |
>> >> | EIP_is_at__mutex_unlock_slowpath   | 29 
>> >> ||
>> >> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138
>> >> ||
>> >> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 
>> >> ||
>> >> | EIP_is_at_lock_release | 7  
>> >> ||
>> >> | EIP_is_at_lock_acquire | 28 
>> >> ||
>> >> | EIP_is_at_raw_spin_unlock_irq  | 30 
>> >> ||
>> >> | EIP_is_at_mutex_lock_nested| 21 
>> >> ||
>> >> | EIP_is_at_lock_is_held | 10 
>> >> ||
>> >> | EIP_is_at_device_shutdown  | 3  
>> >> ||
>> >> | EIP_is_at___might_sleep| 5  
>> >> ||
>> >> | EIP_is_at__list_del_entry  | 4  
>> >> | 1  |
>> >> | EIP_is_at_get_device   | 1  
>> >> ||
>> >> | BUG:unable_to_handle_kernel| 7  
>> >> | 5  |
>> >> | Oops   | 7  
>> >> | 5  |
>> >> | calltrace:SyS_ipc  | 4  
>> >> | 4  |
>> >> | Kernel_panic-not_syncing:Fatal_exception   | 7  
>> >> | 5  |
>> >> | EIP_is_at_pm_runtime_barrier   | 1  
>> >> ||
>> >> | EIP_is_at_lock_acquired| 1  
>> >> ||
>> >> | EIP_is_at__might_sleep | 1  
>> >> ||
>> >> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  
>> >> ||
>> >> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 
>> >> ||
>> >> | calltrace:mark_rodata_ro   | 26 
>> >> ||
>> >> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  
>> >> | 17 |
>> >> | calltrace:serio_handle_event   | 0

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-07 Thread Paul E. McKenney
On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote:
> On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
>  wrote:
> > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
> >>
> >> FYI, we noticed the following commit:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
> >> dev.2016.08.19a
> >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
> >> debug checking into separate function")
> >>
> >> in testcase: boot
> >>
> >> on test machine: qemu-system-i386 -enable-kvm -m 360M
> >>
> >> caused below changes:
> >
> > Hello, Kees,
> >
> > Looks like your change might have actually found a bug.  ;-)
> >
> > How would you like to handle this?
> 
> We should fix the bug! :)

But of course...

The question is whether I am considered to be fully responsible for
chasing down any bugs that arise.  It is looking like I am, in which
case I must defer this patch to the 4.10 merge window.  Things are a
bit crazy now due to prior overcommitments...

Yes, yes, they will be crazy then as well, but this would be higher
on the priority list.

Thanx, Paul

> >> ++++
> >> || 0a33d912a3 
> >> | 8052ce2d97 |
> >> ++++
> >> | boot_successes | 0  
> >> | 0  |
> >> | boot_failures  | 211
> >> | 18 |
> >> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211
> >> | 18 |
> >> | calltrace:init | 209
> >> | 18 |
> >> | WARNING:at_lib/list_debug.c:#__list_add| 209
> >> ||
> >> | calltrace:i8042_init   | 209
> >> | 17 |
> >> | calltrace:SyS_reboot   | 192
> >> | 4  |
> >> | EIP_is_at__mutex_unlock_slowpath   | 29 
> >> ||
> >> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138
> >> ||
> >> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 
> >> ||
> >> | EIP_is_at_lock_release | 7  
> >> ||
> >> | EIP_is_at_lock_acquire | 28 
> >> ||
> >> | EIP_is_at_raw_spin_unlock_irq  | 30 
> >> ||
> >> | EIP_is_at_mutex_lock_nested| 21 
> >> ||
> >> | EIP_is_at_lock_is_held | 10 
> >> ||
> >> | EIP_is_at_device_shutdown  | 3  
> >> ||
> >> | EIP_is_at___might_sleep| 5  
> >> ||
> >> | EIP_is_at__list_del_entry  | 4  
> >> | 1  |
> >> | EIP_is_at_get_device   | 1  
> >> ||
> >> | BUG:unable_to_handle_kernel| 7  
> >> | 5  |
> >> | Oops   | 7  
> >> | 5  |
> >> | calltrace:SyS_ipc  | 4  
> >> | 4  |
> >> | Kernel_panic-not_syncing:Fatal_exception   | 7  
> >> | 5  |
> >> | EIP_is_at_pm_runtime_barrier   | 1  
> >> ||
> >> | EIP_is_at_lock_acquired| 1  
> >> ||
> >> | EIP_is_at__might_sleep | 1  
> >> ||
> >> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  
> >> ||
> >> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 
> >> ||
> >> | calltrace:mark_rodata_ro   | 26 
> >> ||
> >> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  
> >> | 17 |
> >> | calltrace:serio_handle_event   | 0  
> >> | 17 |
> >> | calltrace:userio_misc_init | 0  
> >> | 17 |
> >> | calltrace:mousedev_init| 0  
> >> | 17 |
> >> | calltrace:evdev_init   | 0  
> >> | 17 |
> >> | calltrace:cmos_init| 0  
> >> | 17 

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-07 Thread Paul E. McKenney
On Tue, Sep 06, 2016 at 11:49:04AM -0400, Kees Cook wrote:
> On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
>  wrote:
> > On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
> >>
> >> FYI, we noticed the following commit:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
> >> dev.2016.08.19a
> >> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
> >> debug checking into separate function")
> >>
> >> in testcase: boot
> >>
> >> on test machine: qemu-system-i386 -enable-kvm -m 360M
> >>
> >> caused below changes:
> >
> > Hello, Kees,
> >
> > Looks like your change might have actually found a bug.  ;-)
> >
> > How would you like to handle this?
> 
> We should fix the bug! :)

But of course...

The question is whether I am considered to be fully responsible for
chasing down any bugs that arise.  It is looking like I am, in which
case I must defer this patch to the 4.10 merge window.  Things are a
bit crazy now due to prior overcommitments...

Yes, yes, they will be crazy then as well, but this would be higher
on the priority list.

Thanx, Paul

> >> ++++
> >> || 0a33d912a3 
> >> | 8052ce2d97 |
> >> ++++
> >> | boot_successes | 0  
> >> | 0  |
> >> | boot_failures  | 211
> >> | 18 |
> >> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211
> >> | 18 |
> >> | calltrace:init | 209
> >> | 18 |
> >> | WARNING:at_lib/list_debug.c:#__list_add| 209
> >> ||
> >> | calltrace:i8042_init   | 209
> >> | 17 |
> >> | calltrace:SyS_reboot   | 192
> >> | 4  |
> >> | EIP_is_at__mutex_unlock_slowpath   | 29 
> >> ||
> >> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138
> >> ||
> >> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 
> >> ||
> >> | EIP_is_at_lock_release | 7  
> >> ||
> >> | EIP_is_at_lock_acquire | 28 
> >> ||
> >> | EIP_is_at_raw_spin_unlock_irq  | 30 
> >> ||
> >> | EIP_is_at_mutex_lock_nested| 21 
> >> ||
> >> | EIP_is_at_lock_is_held | 10 
> >> ||
> >> | EIP_is_at_device_shutdown  | 3  
> >> ||
> >> | EIP_is_at___might_sleep| 5  
> >> ||
> >> | EIP_is_at__list_del_entry  | 4  
> >> | 1  |
> >> | EIP_is_at_get_device   | 1  
> >> ||
> >> | BUG:unable_to_handle_kernel| 7  
> >> | 5  |
> >> | Oops   | 7  
> >> | 5  |
> >> | calltrace:SyS_ipc  | 4  
> >> | 4  |
> >> | Kernel_panic-not_syncing:Fatal_exception   | 7  
> >> | 5  |
> >> | EIP_is_at_pm_runtime_barrier   | 1  
> >> ||
> >> | EIP_is_at_lock_acquired| 1  
> >> ||
> >> | EIP_is_at__might_sleep | 1  
> >> ||
> >> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  
> >> ||
> >> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 
> >> ||
> >> | calltrace:mark_rodata_ro   | 26 
> >> ||
> >> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  
> >> | 17 |
> >> | calltrace:serio_handle_event   | 0  
> >> | 17 |
> >> | calltrace:userio_misc_init | 0  
> >> | 17 |
> >> | calltrace:mousedev_init| 0  
> >> | 17 |
> >> | calltrace:evdev_init   | 0  
> >> | 17 |
> >> | calltrace:cmos_init| 0  
> >> | 17 |
> >> | 

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-06 Thread Kees Cook
On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
 wrote:
> On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
>>
>> FYI, we noticed the following commit:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
>> dev.2016.08.19a
>> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
>> debug checking into separate function")
>>
>> in testcase: boot
>>
>> on test machine: qemu-system-i386 -enable-kvm -m 360M
>>
>> caused below changes:
>
> Hello, Kees,
>
> Looks like your change might have actually found a bug.  ;-)
>
> How would you like to handle this?

We should fix the bug! :)

>
> Thanx, Paul
>
>> ++++
>> || 0a33d912a3 | 
>> 8052ce2d97 |
>> ++++
>> | boot_successes | 0  | 
>> 0  |
>> | boot_failures  | 211| 
>> 18 |
>> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211| 
>> 18 |
>> | calltrace:init | 209| 
>> 18 |
>> | WARNING:at_lib/list_debug.c:#__list_add| 209|  
>>   |
>> | calltrace:i8042_init   | 209| 
>> 17 |
>> | calltrace:SyS_reboot   | 192| 
>> 4  |
>> | EIP_is_at__mutex_unlock_slowpath   | 29 |  
>>   |
>> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138|  
>>   |
>> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 |  
>>   |
>> | EIP_is_at_lock_release | 7  |  
>>   |
>> | EIP_is_at_lock_acquire | 28 |  
>>   |
>> | EIP_is_at_raw_spin_unlock_irq  | 30 |  
>>   |
>> | EIP_is_at_mutex_lock_nested| 21 |  
>>   |
>> | EIP_is_at_lock_is_held | 10 |  
>>   |
>> | EIP_is_at_device_shutdown  | 3  |  
>>   |
>> | EIP_is_at___might_sleep| 5  |  
>>   |
>> | EIP_is_at__list_del_entry  | 4  | 
>> 1  |
>> | EIP_is_at_get_device   | 1  |  
>>   |
>> | BUG:unable_to_handle_kernel| 7  | 
>> 5  |
>> | Oops   | 7  | 
>> 5  |
>> | calltrace:SyS_ipc  | 4  | 
>> 4  |
>> | Kernel_panic-not_syncing:Fatal_exception   | 7  | 
>> 5  |
>> | EIP_is_at_pm_runtime_barrier   | 1  |  
>>   |
>> | EIP_is_at_lock_acquired| 1  |  
>>   |
>> | EIP_is_at__might_sleep | 1  |  
>>   |
>> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  |  
>>   |
>> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 |  
>>   |
>> | calltrace:mark_rodata_ro   | 26 |  
>>   |
>> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  | 
>> 17 |
>> | calltrace:serio_handle_event   | 0  | 
>> 17 |
>> | calltrace:userio_misc_init | 0  | 
>> 17 |
>> | calltrace:mousedev_init| 0  | 
>> 17 |
>> | calltrace:evdev_init   | 0  | 
>> 17 |
>> | calltrace:cmos_init| 0  | 
>> 17 |
>> | calltrace:test_init| 0  | 
>> 17 |
>> | calltrace:pca_isa_driver_init  | 0  | 
>> 17 |
>> | calltrace:vim2m_init   | 0  | 
>> 17 |
>> | calltrace:fm_drv_init  | 0  | 
>> 17 |
>> | calltrace:test_power_init  | 0  | 
>> 17 |
>> | calltrace:raid#_init   | 0  | 
>> 17 |
>> | calltrace:dm_init

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-06 Thread Kees Cook
On Sun, Sep 4, 2016 at 5:48 AM, Paul E. McKenney
 wrote:
> On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
>>
>> FYI, we noticed the following commit:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
>> dev.2016.08.19a
>> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
>> debug checking into separate function")
>>
>> in testcase: boot
>>
>> on test machine: qemu-system-i386 -enable-kvm -m 360M
>>
>> caused below changes:
>
> Hello, Kees,
>
> Looks like your change might have actually found a bug.  ;-)
>
> How would you like to handle this?

We should fix the bug! :)

>
> Thanx, Paul
>
>> ++++
>> || 0a33d912a3 | 
>> 8052ce2d97 |
>> ++++
>> | boot_successes | 0  | 
>> 0  |
>> | boot_failures  | 211| 
>> 18 |
>> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211| 
>> 18 |
>> | calltrace:init | 209| 
>> 18 |
>> | WARNING:at_lib/list_debug.c:#__list_add| 209|  
>>   |
>> | calltrace:i8042_init   | 209| 
>> 17 |
>> | calltrace:SyS_reboot   | 192| 
>> 4  |
>> | EIP_is_at__mutex_unlock_slowpath   | 29 |  
>>   |
>> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138|  
>>   |
>> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 |  
>>   |
>> | EIP_is_at_lock_release | 7  |  
>>   |
>> | EIP_is_at_lock_acquire | 28 |  
>>   |
>> | EIP_is_at_raw_spin_unlock_irq  | 30 |  
>>   |
>> | EIP_is_at_mutex_lock_nested| 21 |  
>>   |
>> | EIP_is_at_lock_is_held | 10 |  
>>   |
>> | EIP_is_at_device_shutdown  | 3  |  
>>   |
>> | EIP_is_at___might_sleep| 5  |  
>>   |
>> | EIP_is_at__list_del_entry  | 4  | 
>> 1  |
>> | EIP_is_at_get_device   | 1  |  
>>   |
>> | BUG:unable_to_handle_kernel| 7  | 
>> 5  |
>> | Oops   | 7  | 
>> 5  |
>> | calltrace:SyS_ipc  | 4  | 
>> 4  |
>> | Kernel_panic-not_syncing:Fatal_exception   | 7  | 
>> 5  |
>> | EIP_is_at_pm_runtime_barrier   | 1  |  
>>   |
>> | EIP_is_at_lock_acquired| 1  |  
>>   |
>> | EIP_is_at__might_sleep | 1  |  
>>   |
>> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  |  
>>   |
>> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 |  
>>   |
>> | calltrace:mark_rodata_ro   | 26 |  
>>   |
>> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  | 
>> 17 |
>> | calltrace:serio_handle_event   | 0  | 
>> 17 |
>> | calltrace:userio_misc_init | 0  | 
>> 17 |
>> | calltrace:mousedev_init| 0  | 
>> 17 |
>> | calltrace:evdev_init   | 0  | 
>> 17 |
>> | calltrace:cmos_init| 0  | 
>> 17 |
>> | calltrace:test_init| 0  | 
>> 17 |
>> | calltrace:pca_isa_driver_init  | 0  | 
>> 17 |
>> | calltrace:vim2m_init   | 0  | 
>> 17 |
>> | calltrace:fm_drv_init  | 0  | 
>> 17 |
>> | calltrace:test_power_init  | 0  | 
>> 17 |
>> | calltrace:raid#_init   | 0  | 
>> 17 |
>> | calltrace:dm_init

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-04 Thread Paul E. McKenney
On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
> dev.2016.08.19a
> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
> debug checking into separate function")
> 
> in testcase: boot
> 
> on test machine: qemu-system-i386 -enable-kvm -m 360M
> 
> caused below changes:

Hello, Kees,

Looks like your change might have actually found a bug.  ;-)

How would you like to handle this?

Thanx, Paul

> ++++
> || 0a33d912a3 | 
> 8052ce2d97 |
> ++++
> | boot_successes | 0  | 0 
>  |
> | boot_failures  | 211| 
> 18 |
> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211| 
> 18 |
> | calltrace:init | 209| 
> 18 |
> | WARNING:at_lib/list_debug.c:#__list_add| 209|   
>  |
> | calltrace:i8042_init   | 209| 
> 17 |
> | calltrace:SyS_reboot   | 192| 4 
>  |
> | EIP_is_at__mutex_unlock_slowpath   | 29 |   
>  |
> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138|   
>  |
> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 |   
>  |
> | EIP_is_at_lock_release | 7  |   
>  |
> | EIP_is_at_lock_acquire | 28 |   
>  |
> | EIP_is_at_raw_spin_unlock_irq  | 30 |   
>  |
> | EIP_is_at_mutex_lock_nested| 21 |   
>  |
> | EIP_is_at_lock_is_held | 10 |   
>  |
> | EIP_is_at_device_shutdown  | 3  |   
>  |
> | EIP_is_at___might_sleep| 5  |   
>  |
> | EIP_is_at__list_del_entry  | 4  | 1 
>  |
> | EIP_is_at_get_device   | 1  |   
>  |
> | BUG:unable_to_handle_kernel| 7  | 5 
>  |
> | Oops   | 7  | 5 
>  |
> | calltrace:SyS_ipc  | 4  | 4 
>  |
> | Kernel_panic-not_syncing:Fatal_exception   | 7  | 5 
>  |
> | EIP_is_at_pm_runtime_barrier   | 1  |   
>  |
> | EIP_is_at_lock_acquired| 1  |   
>  |
> | EIP_is_at__might_sleep | 1  |   
>  |
> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  |   
>  |
> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 |   
>  |
> | calltrace:mark_rodata_ro   | 26 |   
>  |
> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  | 
> 17 |
> | calltrace:serio_handle_event   | 0  | 
> 17 |
> | calltrace:userio_misc_init | 0  | 
> 17 |
> | calltrace:mousedev_init| 0  | 
> 17 |
> | calltrace:evdev_init   | 0  | 
> 17 |
> | calltrace:cmos_init| 0  | 
> 17 |
> | calltrace:test_init| 0  | 
> 17 |
> | calltrace:pca_isa_driver_init  | 0  | 
> 17 |
> | calltrace:vim2m_init   | 0  | 
> 17 |
> | calltrace:fm_drv_init  | 0  | 
> 17 |
> | calltrace:test_power_init  | 0  | 
> 17 |
> | calltrace:raid#_init   | 0  | 
> 17 |
> | calltrace:dm_init  | 0  | 
> 17 |
> | calltrace:vhci_init| 0  | 
> 17 |
> | calltrace:dcdrbu_init  | 0  

Re: [list] 8052ce2d97: WARNING: CPU: 0 PID: 1 at lib/list_debug.c:60 __list_del_entry+0x4d/0x71

2016-09-04 Thread Paul E. McKenney
On Sun, Sep 04, 2016 at 08:35:40AM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git 
> dev.2016.08.19a
> commit 8052ce2d9771ab5a818307f3abbaf33bba82a631 ("list: Split list_add() 
> debug checking into separate function")
> 
> in testcase: boot
> 
> on test machine: qemu-system-i386 -enable-kvm -m 360M
> 
> caused below changes:

Hello, Kees,

Looks like your change might have actually found a bug.  ;-)

How would you like to handle this?

Thanx, Paul

> ++++
> || 0a33d912a3 | 
> 8052ce2d97 |
> ++++
> | boot_successes | 0  | 0 
>  |
> | boot_failures  | 211| 
> 18 |
> | WARNING:at_lib/list_debug.c:#__list_del_entry  | 211| 
> 18 |
> | calltrace:init | 209| 
> 18 |
> | WARNING:at_lib/list_debug.c:#__list_add| 209|   
>  |
> | calltrace:i8042_init   | 209| 
> 17 |
> | calltrace:SyS_reboot   | 192| 4 
>  |
> | EIP_is_at__mutex_unlock_slowpath   | 29 |   
>  |
> | Kernel_panic-not_syncing:softlockup:hung_tasks | 138|   
>  |
> | BUG:workqueue_lockup-pool_cpus=#flags=#nice=#stuck_for#s   | 56 |   
>  |
> | EIP_is_at_lock_release | 7  |   
>  |
> | EIP_is_at_lock_acquire | 28 |   
>  |
> | EIP_is_at_raw_spin_unlock_irq  | 30 |   
>  |
> | EIP_is_at_mutex_lock_nested| 21 |   
>  |
> | EIP_is_at_lock_is_held | 10 |   
>  |
> | EIP_is_at_device_shutdown  | 3  |   
>  |
> | EIP_is_at___might_sleep| 5  |   
>  |
> | EIP_is_at__list_del_entry  | 4  | 1 
>  |
> | EIP_is_at_get_device   | 1  |   
>  |
> | BUG:unable_to_handle_kernel| 7  | 5 
>  |
> | Oops   | 7  | 5 
>  |
> | calltrace:SyS_ipc  | 4  | 4 
>  |
> | Kernel_panic-not_syncing:Fatal_exception   | 7  | 5 
>  |
> | EIP_is_at_pm_runtime_barrier   | 1  |   
>  |
> | EIP_is_at_lock_acquired| 1  |   
>  |
> | EIP_is_at__might_sleep | 1  |   
>  |
> | BUG:workqueue_lockup-pool_cpus=#-#flags=#nice=#stuck_for#s | 1  |   
>  |
> | WARNING:at_arch/x86/mm/dump_pagetables.c:#note_page| 26 |   
>  |
> | calltrace:mark_rodata_ro   | 26 |   
>  |
> | WARNING:at_lib/list_debug.c:#__list_add_valid  | 0  | 
> 17 |
> | calltrace:serio_handle_event   | 0  | 
> 17 |
> | calltrace:userio_misc_init | 0  | 
> 17 |
> | calltrace:mousedev_init| 0  | 
> 17 |
> | calltrace:evdev_init   | 0  | 
> 17 |
> | calltrace:cmos_init| 0  | 
> 17 |
> | calltrace:test_init| 0  | 
> 17 |
> | calltrace:pca_isa_driver_init  | 0  | 
> 17 |
> | calltrace:vim2m_init   | 0  | 
> 17 |
> | calltrace:fm_drv_init  | 0  | 
> 17 |
> | calltrace:test_power_init  | 0  | 
> 17 |
> | calltrace:raid#_init   | 0  | 
> 17 |
> | calltrace:dm_init  | 0  | 
> 17 |
> | calltrace:vhci_init| 0  | 
> 17 |
> | calltrace:dcdrbu_init  | 0