Re: rtc: hang on boot during unregistration

2014-04-08 Thread Tejun Heo
Hello,

On Mon, Apr 07, 2014 at 12:33:28PM -0700, Andrew Morton wrote:
> > [   16.625261] kworker/0:1/1069 is trying to acquire lock:
> > [   16.626300] ((&(>release)->work)){+.+.+.}, at: flush_work 
> > (kernel/workqueue.c:2880)
> > [   16.628258]
> > [   16.628258] but task is already holding lock:
> > [   16.629415] ((&(>release)->work)){+.+.+.}, at: process_one_work 
> > (include/linux/workqueue.h:186 kernel/workqueue.c:611 
> > kernel/workqueue.c:638 kernel/workqueue.c:2214)
> 
> This is making my head spin a bit.  Why is lockdep babbling about
> include/linux/workqueue.h:186 and similar sites which appear to have
> nothing to do with locking?  Later it points a finger at
> include/linux/jump_label.h:105, wtf.

flush_work() and flush_workqueue() make use of lockdep anntations so
that dependency problems involving them can be detected through
lockdep.

> > [   16.630106] other info that might help us debug this:
> > [   16.630106]  Possible unsafe locking scenario:
> > [   16.630106]
> > [   16.630106]CPU0
> > [   16.630106]
> > [   16.630106]   lock((&(>release)->work));
> > [   16.630106]   lock((&(>release)->work));

Looks like a work item is trying to flush itself.

> > [   16.630106] Call Trace:
> > [   16.630106] dump_stack (lib/dump_stack.c:52)
> > [   16.630106] __lock_acquire (kernel/locking/lockdep.c:1740 
> > kernel/locking/lockdep.c:1783 kernel/locking/lockdep.c:2115 
> > kernel/locking/lockdep.c:3182)
> > [   16.630106] lock_acquire (arch/x86/include/asm/current.h:14 
> > kernel/locking/lockdep.c:3602)
> > [   16.630106] flush_work (kernel/workqueue.c:2884)
> > [   16.630106] __cancel_work_timer (arch/x86/include/asm/bitops.h:313 
> > kernel/workqueue.c:610 kernel/workqueue.c:644 kernel/workqueue.c:2916)
> > [   16.630106] cancel_work_sync (kernel/workqueue.c:2941)
> > [   16.630106] work_fixup_free (kernel/workqueue.c:479)

Hah, this is debugobj fixup code trying to perform cancel_work_sync()
to make the object unused because it's being freed while active;
however, because free() is being called from the work item itself,
it's falling into deadlock and triggering the lockdep anntation.

> > [   16.630106] __debug_check_no_obj_freed (lib/debugobjects.c:277 
> > lib/debugobjects.c:700)
> > [   16.630106] debug_check_no_obj_freed (lib/debugobjects.c:726)
> > [   16.630106] kfree (mm/slub.c:2679 mm/slub.c:3401)
> > [   16.630106] rtc_device_release (drivers/rtc/class.c:35)

IOW, rtc_device_release() is freeing the memory area this work item is
embedded in while the work item is queued on a workqueue.

> > [   16.630106] device_release (drivers/base/core.c:246)
> > [   16.630106] kobject_delayed_cleanup (lib/kobject.c:629 lib/kobject.c:638)
> > [   16.630106] process_one_work (kernel/workqueue.c:2221 
> > include/linux/jump_label.h:105 include/trace/events/workqueue.h:111 
> > kernel/workqueue.c:2226)
> > [   16.630106] worker_thread (kernel/workqueue.c:2348)
> > [   16.630106] kthread (kernel/kthread.c:219)
> > [   16.630106] ret_from_fork (arch/x86/kernel/entry_64.S:555)
> 
> I think rtc's role in this is to trigger case ODEBUG_STATE_ACTIVE in
> __debug_check_no_obj_freed(), which might be due to an RTC bug.  This
> then triggers problems in the workqueue core - calling flush_work() and
> cancel_work_sync() within work_fixup_free() seems, umm, optimistic.

It's debugobj fixup code which is opportunistically trying to take the
work item off workqueue to avoid use-after-free.  As free() is being
called from work item itself, it falls into A-A deadlock.

> Maybe the semi-recursive code is OK and we're just missing lockdep
> annotations, and the hang you saw was for other reasons.

I've been always a bit suspicious about the fixup parts of debugobj.
I suppose they aren't too harmful as these types of chained failures
always follow proper error messages from debugobj itself.  Anyways,
the lockdep annotation is only because debugobj fixup isn't too smart.
I'm not quite sure we wanna try harder tho.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: rtc: hang on boot during unregistration

2014-04-08 Thread Tejun Heo
Hello,

On Mon, Apr 07, 2014 at 12:33:28PM -0700, Andrew Morton wrote:
  [   16.625261] kworker/0:1/1069 is trying to acquire lock:
  [   16.626300] (((kobj-release)-work)){+.+.+.}, at: flush_work 
  (kernel/workqueue.c:2880)
  [   16.628258]
  [   16.628258] but task is already holding lock:
  [   16.629415] (((kobj-release)-work)){+.+.+.}, at: process_one_work 
  (include/linux/workqueue.h:186 kernel/workqueue.c:611 
  kernel/workqueue.c:638 kernel/workqueue.c:2214)
 
 This is making my head spin a bit.  Why is lockdep babbling about
 include/linux/workqueue.h:186 and similar sites which appear to have
 nothing to do with locking?  Later it points a finger at
 include/linux/jump_label.h:105, wtf.

flush_work() and flush_workqueue() make use of lockdep anntations so
that dependency problems involving them can be detected through
lockdep.

  [   16.630106] other info that might help us debug this:
  [   16.630106]  Possible unsafe locking scenario:
  [   16.630106]
  [   16.630106]CPU0
  [   16.630106]
  [   16.630106]   lock(((kobj-release)-work));
  [   16.630106]   lock(((kobj-release)-work));

Looks like a work item is trying to flush itself.

  [   16.630106] Call Trace:
  [   16.630106] dump_stack (lib/dump_stack.c:52)
  [   16.630106] __lock_acquire (kernel/locking/lockdep.c:1740 
  kernel/locking/lockdep.c:1783 kernel/locking/lockdep.c:2115 
  kernel/locking/lockdep.c:3182)
  [   16.630106] lock_acquire (arch/x86/include/asm/current.h:14 
  kernel/locking/lockdep.c:3602)
  [   16.630106] flush_work (kernel/workqueue.c:2884)
  [   16.630106] __cancel_work_timer (arch/x86/include/asm/bitops.h:313 
  kernel/workqueue.c:610 kernel/workqueue.c:644 kernel/workqueue.c:2916)
  [   16.630106] cancel_work_sync (kernel/workqueue.c:2941)
  [   16.630106] work_fixup_free (kernel/workqueue.c:479)

Hah, this is debugobj fixup code trying to perform cancel_work_sync()
to make the object unused because it's being freed while active;
however, because free() is being called from the work item itself,
it's falling into deadlock and triggering the lockdep anntation.

  [   16.630106] __debug_check_no_obj_freed (lib/debugobjects.c:277 
  lib/debugobjects.c:700)
  [   16.630106] debug_check_no_obj_freed (lib/debugobjects.c:726)
  [   16.630106] kfree (mm/slub.c:2679 mm/slub.c:3401)
  [   16.630106] rtc_device_release (drivers/rtc/class.c:35)

IOW, rtc_device_release() is freeing the memory area this work item is
embedded in while the work item is queued on a workqueue.

  [   16.630106] device_release (drivers/base/core.c:246)
  [   16.630106] kobject_delayed_cleanup (lib/kobject.c:629 lib/kobject.c:638)
  [   16.630106] process_one_work (kernel/workqueue.c:2221 
  include/linux/jump_label.h:105 include/trace/events/workqueue.h:111 
  kernel/workqueue.c:2226)
  [   16.630106] worker_thread (kernel/workqueue.c:2348)
  [   16.630106] kthread (kernel/kthread.c:219)
  [   16.630106] ret_from_fork (arch/x86/kernel/entry_64.S:555)
 
 I think rtc's role in this is to trigger case ODEBUG_STATE_ACTIVE in
 __debug_check_no_obj_freed(), which might be due to an RTC bug.  This
 then triggers problems in the workqueue core - calling flush_work() and
 cancel_work_sync() within work_fixup_free() seems, umm, optimistic.

It's debugobj fixup code which is opportunistically trying to take the
work item off workqueue to avoid use-after-free.  As free() is being
called from work item itself, it falls into A-A deadlock.

 Maybe the semi-recursive code is OK and we're just missing lockdep
 annotations, and the hang you saw was for other reasons.

I've been always a bit suspicious about the fixup parts of debugobj.
I suppose they aren't too harmful as these types of chained failures
always follow proper error messages from debugobj itself.  Anyways,
the lockdep annotation is only because debugobj fixup isn't too smart.
I'm not quite sure we wanna try harder tho.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: rtc: hang on boot during unregistration

2014-04-07 Thread Andrew Morton
On Fri, 04 Apr 2014 23:26:42 -0400 Sasha Levin  wrote:

> Hi all,
> 
> I got the following spew while booting. My VM proceeded to hang and die.
> 
> [   16.620530] =
> [   16.621059] [ INFO: possible recursive locking detected ]
> [   16.622024] 3.14.0-next-20140403-sasha-00019-g7474aa9-dirty #376 Not 
> tainted
> [   16.624145] -
> [   16.625261] kworker/0:1/1069 is trying to acquire lock:
> [   16.626300] ((&(>release)->work)){+.+.+.}, at: flush_work 
> (kernel/workqueue.c:2880)
> [   16.628258]
> [   16.628258] but task is already holding lock:
> [   16.629415] ((&(>release)->work)){+.+.+.}, at: process_one_work 
> (include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
> kernel/workqueue.c:2214)

This is making my head spin a bit.  Why is lockdep babbling about
include/linux/workqueue.h:186 and similar sites which appear to have
nothing to do with locking?  Later it points a finger at
include/linux/jump_label.h:105, wtf.

> [   16.630106] other info that might help us debug this:
> [   16.630106]  Possible unsafe locking scenario:
> [   16.630106]
> [   16.630106]CPU0
> [   16.630106]
> [   16.630106]   lock((&(>release)->work));
> [   16.630106]   lock((&(>release)->work));
> [   16.630106]
> [   16.630106]  *** DEADLOCK ***
> [   16.630106]
> [   16.630106]  May be due to missing lock nesting notation
> [   16.630106]
> [   16.630106] 2 locks held by kworker/0:1/1069:
> [   16.630106] #0: ("events"){.+.+.+}, at: process_one_work 
> (include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
> kernel/workqueue.c:2214)
> [   16.630106] #1: ((&(>release)->work)){+.+.+.}, at: process_one_work 
> (include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
> kernel/workqueue.c:2214)
> [   16.630106]
> [   16.630106] stack backtrace:
> [   16.630106] CPU: 0 PID: 1069 Comm: kworker/0:1 Not tainted 
> 3.14.0-next-20140403-sasha-00019-g7474aa9-dirty #376
> [   16.630106] Workqueue: events kobject_delayed_cleanup
> [   16.630106]  b59d09d0 88007c6a58d8 b24bfb2f 
> 
> [   16.630106]  b59d09d0 88007c6a59d8 af1c4c8a 
> 88007c5a3d98
> [   16.630106]  88007c5a3000 8801 af1c0e18 
> 88007c5a3d60
> [   16.630106] Call Trace:
> [   16.630106] dump_stack (lib/dump_stack.c:52)
> [   16.630106] __lock_acquire (kernel/locking/lockdep.c:1740 
> kernel/locking/lockdep.c:1783 kernel/locking/lockdep.c:2115 
> kernel/locking/lockdep.c:3182)
> [   16.630106] ? check_irq_usage (kernel/locking/lockdep.c:1638)
> [   16.630106] ? __lock_acquire (kernel/locking/lockdep.c:3189)
> [   16.630106] lock_acquire (arch/x86/include/asm/current.h:14 
> kernel/locking/lockdep.c:3602)
> [   16.630106] ? flush_work (kernel/workqueue.c:2880)
> [   16.630106] flush_work (kernel/workqueue.c:2884)
> [   16.630106] ? flush_work (kernel/workqueue.c:2880)
> [   16.630106] ? debug_smp_processor_id (lib/smp_processor_id.c:57)
> [   16.630106] ? put_lock_stats.isra.12 (arch/x86/include/asm/preempt.h:98 
> kernel/locking/lockdep.c:254)
> [   16.630106] ? try_to_grab_pending (kernel/workqueue.c:1260 
> kernel/workqueue.c:1259)
> [   16.630106] ? __cancel_work_timer (arch/x86/include/asm/paravirt.h:809 
> (discriminator 2) kernel/workqueue.c:2913 (discriminator 2))
> [   16.630106] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> [   16.630106] __cancel_work_timer (arch/x86/include/asm/bitops.h:313 
> kernel/workqueue.c:610 kernel/workqueue.c:644 kernel/workqueue.c:2916)
> [   16.630106] cancel_work_sync (kernel/workqueue.c:2941)
> [   16.630106] work_fixup_free (kernel/workqueue.c:479)
> [   16.630106] __debug_check_no_obj_freed (lib/debugobjects.c:277 
> lib/debugobjects.c:700)
> [   16.630106] debug_check_no_obj_freed (lib/debugobjects.c:726)
> [   16.630106] kfree (mm/slub.c:2679 mm/slub.c:3401)
> [   16.630106] ? rtc_device_release (drivers/rtc/class.c:35)
> [   16.630106] rtc_device_release (drivers/rtc/class.c:35)
> [   16.630106] device_release (drivers/base/core.c:246)
> [   16.630106] kobject_delayed_cleanup (lib/kobject.c:629 lib/kobject.c:638)
> [   16.630106] process_one_work (kernel/workqueue.c:2221 
> include/linux/jump_label.h:105 include/trace/events/workqueue.h:111 
> kernel/workqueue.c:2226)
> [   16.630106] ? process_one_work (include/linux/workqueue.h:186 
> kernel/workqueue.c:611 kernel/workqueue.c:638 kernel/workqueue.c:2214)
> [   16.630106] worker_thread (kernel/workqueue.c:2348)
> [   16.630106] ? rescuer_thread (kernel/workqueue.c:2297)
> [   16.630106] kthread (kernel/kthread.c:219)
> [   16.630106] ? kthread_create_on_node (kernel/kthread.c:185)
> [   16.630106] ret_from_fork (arch/x86/kernel/entry_64.S:555)
> [   16.630106] ? kthread_create_on_node (kernel/kthread.c:185)

I think rtc's role in this is to trigger case ODEBUG_STATE_ACTIVE in
__debug_check_no_obj_freed(), 

Re: rtc: hang on boot during unregistration

2014-04-07 Thread Andrew Morton
On Fri, 04 Apr 2014 23:26:42 -0400 Sasha Levin sasha.le...@oracle.com wrote:

 Hi all,
 
 I got the following spew while booting. My VM proceeded to hang and die.
 
 [   16.620530] =
 [   16.621059] [ INFO: possible recursive locking detected ]
 [   16.622024] 3.14.0-next-20140403-sasha-00019-g7474aa9-dirty #376 Not 
 tainted
 [   16.624145] -
 [   16.625261] kworker/0:1/1069 is trying to acquire lock:
 [   16.626300] (((kobj-release)-work)){+.+.+.}, at: flush_work 
 (kernel/workqueue.c:2880)
 [   16.628258]
 [   16.628258] but task is already holding lock:
 [   16.629415] (((kobj-release)-work)){+.+.+.}, at: process_one_work 
 (include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
 kernel/workqueue.c:2214)

This is making my head spin a bit.  Why is lockdep babbling about
include/linux/workqueue.h:186 and similar sites which appear to have
nothing to do with locking?  Later it points a finger at
include/linux/jump_label.h:105, wtf.

 [   16.630106] other info that might help us debug this:
 [   16.630106]  Possible unsafe locking scenario:
 [   16.630106]
 [   16.630106]CPU0
 [   16.630106]
 [   16.630106]   lock(((kobj-release)-work));
 [   16.630106]   lock(((kobj-release)-work));
 [   16.630106]
 [   16.630106]  *** DEADLOCK ***
 [   16.630106]
 [   16.630106]  May be due to missing lock nesting notation
 [   16.630106]
 [   16.630106] 2 locks held by kworker/0:1/1069:
 [   16.630106] #0: (events){.+.+.+}, at: process_one_work 
 (include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
 kernel/workqueue.c:2214)
 [   16.630106] #1: (((kobj-release)-work)){+.+.+.}, at: process_one_work 
 (include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
 kernel/workqueue.c:2214)
 [   16.630106]
 [   16.630106] stack backtrace:
 [   16.630106] CPU: 0 PID: 1069 Comm: kworker/0:1 Not tainted 
 3.14.0-next-20140403-sasha-00019-g7474aa9-dirty #376
 [   16.630106] Workqueue: events kobject_delayed_cleanup
 [   16.630106]  b59d09d0 88007c6a58d8 b24bfb2f 
 
 [   16.630106]  b59d09d0 88007c6a59d8 af1c4c8a 
 88007c5a3d98
 [   16.630106]  88007c5a3000 8801 af1c0e18 
 88007c5a3d60
 [   16.630106] Call Trace:
 [   16.630106] dump_stack (lib/dump_stack.c:52)
 [   16.630106] __lock_acquire (kernel/locking/lockdep.c:1740 
 kernel/locking/lockdep.c:1783 kernel/locking/lockdep.c:2115 
 kernel/locking/lockdep.c:3182)
 [   16.630106] ? check_irq_usage (kernel/locking/lockdep.c:1638)
 [   16.630106] ? __lock_acquire (kernel/locking/lockdep.c:3189)
 [   16.630106] lock_acquire (arch/x86/include/asm/current.h:14 
 kernel/locking/lockdep.c:3602)
 [   16.630106] ? flush_work (kernel/workqueue.c:2880)
 [   16.630106] flush_work (kernel/workqueue.c:2884)
 [   16.630106] ? flush_work (kernel/workqueue.c:2880)
 [   16.630106] ? debug_smp_processor_id (lib/smp_processor_id.c:57)
 [   16.630106] ? put_lock_stats.isra.12 (arch/x86/include/asm/preempt.h:98 
 kernel/locking/lockdep.c:254)
 [   16.630106] ? try_to_grab_pending (kernel/workqueue.c:1260 
 kernel/workqueue.c:1259)
 [   16.630106] ? __cancel_work_timer (arch/x86/include/asm/paravirt.h:809 
 (discriminator 2) kernel/workqueue.c:2913 (discriminator 2))
 [   16.630106] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
 [   16.630106] __cancel_work_timer (arch/x86/include/asm/bitops.h:313 
 kernel/workqueue.c:610 kernel/workqueue.c:644 kernel/workqueue.c:2916)
 [   16.630106] cancel_work_sync (kernel/workqueue.c:2941)
 [   16.630106] work_fixup_free (kernel/workqueue.c:479)
 [   16.630106] __debug_check_no_obj_freed (lib/debugobjects.c:277 
 lib/debugobjects.c:700)
 [   16.630106] debug_check_no_obj_freed (lib/debugobjects.c:726)
 [   16.630106] kfree (mm/slub.c:2679 mm/slub.c:3401)
 [   16.630106] ? rtc_device_release (drivers/rtc/class.c:35)
 [   16.630106] rtc_device_release (drivers/rtc/class.c:35)
 [   16.630106] device_release (drivers/base/core.c:246)
 [   16.630106] kobject_delayed_cleanup (lib/kobject.c:629 lib/kobject.c:638)
 [   16.630106] process_one_work (kernel/workqueue.c:2221 
 include/linux/jump_label.h:105 include/trace/events/workqueue.h:111 
 kernel/workqueue.c:2226)
 [   16.630106] ? process_one_work (include/linux/workqueue.h:186 
 kernel/workqueue.c:611 kernel/workqueue.c:638 kernel/workqueue.c:2214)
 [   16.630106] worker_thread (kernel/workqueue.c:2348)
 [   16.630106] ? rescuer_thread (kernel/workqueue.c:2297)
 [   16.630106] kthread (kernel/kthread.c:219)
 [   16.630106] ? kthread_create_on_node (kernel/kthread.c:185)
 [   16.630106] ret_from_fork (arch/x86/kernel/entry_64.S:555)
 [   16.630106] ? kthread_create_on_node (kernel/kthread.c:185)

I think rtc's role in this is to trigger case ODEBUG_STATE_ACTIVE in
__debug_check_no_obj_freed(), which might be due to an RTC bug.  This
then triggers problems 

rtc: hang on boot during unregistration

2014-04-04 Thread Sasha Levin
Hi all,

I got the following spew while booting. My VM proceeded to hang and die.

[   16.620530] =
[   16.621059] [ INFO: possible recursive locking detected ]
[   16.622024] 3.14.0-next-20140403-sasha-00019-g7474aa9-dirty #376 Not tainted
[   16.624145] -
[   16.625261] kworker/0:1/1069 is trying to acquire lock:
[   16.626300] ((&(>release)->work)){+.+.+.}, at: flush_work 
(kernel/workqueue.c:2880)
[   16.628258]
[   16.628258] but task is already holding lock:
[   16.629415] ((&(>release)->work)){+.+.+.}, at: process_one_work 
(include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
kernel/workqueue.c:2214)
[   16.630106]
[   16.630106] other info that might help us debug this:
[   16.630106]  Possible unsafe locking scenario:
[   16.630106]
[   16.630106]CPU0
[   16.630106]
[   16.630106]   lock((&(>release)->work));
[   16.630106]   lock((&(>release)->work));
[   16.630106]
[   16.630106]  *** DEADLOCK ***
[   16.630106]
[   16.630106]  May be due to missing lock nesting notation
[   16.630106]
[   16.630106] 2 locks held by kworker/0:1/1069:
[   16.630106] #0: ("events"){.+.+.+}, at: process_one_work 
(include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
kernel/workqueue.c:2214)
[   16.630106] #1: ((&(>release)->work)){+.+.+.}, at: process_one_work 
(include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
kernel/workqueue.c:2214)
[   16.630106]
[   16.630106] stack backtrace:
[   16.630106] CPU: 0 PID: 1069 Comm: kworker/0:1 Not tainted 
3.14.0-next-20140403-sasha-00019-g7474aa9-dirty #376
[   16.630106] Workqueue: events kobject_delayed_cleanup
[   16.630106]  b59d09d0 88007c6a58d8 b24bfb2f 

[   16.630106]  b59d09d0 88007c6a59d8 af1c4c8a 
88007c5a3d98
[   16.630106]  88007c5a3000 8801 af1c0e18 
88007c5a3d60
[   16.630106] Call Trace:
[   16.630106] dump_stack (lib/dump_stack.c:52)
[   16.630106] __lock_acquire (kernel/locking/lockdep.c:1740 
kernel/locking/lockdep.c:1783 kernel/locking/lockdep.c:2115 
kernel/locking/lockdep.c:3182)
[   16.630106] ? check_irq_usage (kernel/locking/lockdep.c:1638)
[   16.630106] ? __lock_acquire (kernel/locking/lockdep.c:3189)
[   16.630106] lock_acquire (arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:3602)
[   16.630106] ? flush_work (kernel/workqueue.c:2880)
[   16.630106] flush_work (kernel/workqueue.c:2884)
[   16.630106] ? flush_work (kernel/workqueue.c:2880)
[   16.630106] ? debug_smp_processor_id (lib/smp_processor_id.c:57)
[   16.630106] ? put_lock_stats.isra.12 (arch/x86/include/asm/preempt.h:98 
kernel/locking/lockdep.c:254)
[   16.630106] ? try_to_grab_pending (kernel/workqueue.c:1260 
kernel/workqueue.c:1259)
[   16.630106] ? __cancel_work_timer (arch/x86/include/asm/paravirt.h:809 
(discriminator 2) kernel/workqueue.c:2913 (discriminator 2))
[   16.630106] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[   16.630106] __cancel_work_timer (arch/x86/include/asm/bitops.h:313 
kernel/workqueue.c:610 kernel/workqueue.c:644 kernel/workqueue.c:2916)
[   16.630106] cancel_work_sync (kernel/workqueue.c:2941)
[   16.630106] work_fixup_free (kernel/workqueue.c:479)
[   16.630106] __debug_check_no_obj_freed (lib/debugobjects.c:277 
lib/debugobjects.c:700)
[   16.630106] debug_check_no_obj_freed (lib/debugobjects.c:726)
[   16.630106] kfree (mm/slub.c:2679 mm/slub.c:3401)
[   16.630106] ? rtc_device_release (drivers/rtc/class.c:35)
[   16.630106] rtc_device_release (drivers/rtc/class.c:35)
[   16.630106] device_release (drivers/base/core.c:246)
[   16.630106] kobject_delayed_cleanup (lib/kobject.c:629 lib/kobject.c:638)
[   16.630106] process_one_work (kernel/workqueue.c:2221 
include/linux/jump_label.h:105 include/trace/events/workqueue.h:111 
kernel/workqueue.c:2226)
[   16.630106] ? process_one_work (include/linux/workqueue.h:186 
kernel/workqueue.c:611 kernel/workqueue.c:638 kernel/workqueue.c:2214)
[   16.630106] worker_thread (kernel/workqueue.c:2348)
[   16.630106] ? rescuer_thread (kernel/workqueue.c:2297)
[   16.630106] kthread (kernel/kthread.c:219)
[   16.630106] ? kthread_create_on_node (kernel/kthread.c:185)
[   16.630106] ret_from_fork (arch/x86/kernel/entry_64.S:555)
[   16.630106] ? kthread_create_on_node (kernel/kthread.c:185)


Thanks,
Sasha
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


rtc: hang on boot during unregistration

2014-04-04 Thread Sasha Levin
Hi all,

I got the following spew while booting. My VM proceeded to hang and die.

[   16.620530] =
[   16.621059] [ INFO: possible recursive locking detected ]
[   16.622024] 3.14.0-next-20140403-sasha-00019-g7474aa9-dirty #376 Not tainted
[   16.624145] -
[   16.625261] kworker/0:1/1069 is trying to acquire lock:
[   16.626300] (((kobj-release)-work)){+.+.+.}, at: flush_work 
(kernel/workqueue.c:2880)
[   16.628258]
[   16.628258] but task is already holding lock:
[   16.629415] (((kobj-release)-work)){+.+.+.}, at: process_one_work 
(include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
kernel/workqueue.c:2214)
[   16.630106]
[   16.630106] other info that might help us debug this:
[   16.630106]  Possible unsafe locking scenario:
[   16.630106]
[   16.630106]CPU0
[   16.630106]
[   16.630106]   lock(((kobj-release)-work));
[   16.630106]   lock(((kobj-release)-work));
[   16.630106]
[   16.630106]  *** DEADLOCK ***
[   16.630106]
[   16.630106]  May be due to missing lock nesting notation
[   16.630106]
[   16.630106] 2 locks held by kworker/0:1/1069:
[   16.630106] #0: (events){.+.+.+}, at: process_one_work 
(include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
kernel/workqueue.c:2214)
[   16.630106] #1: (((kobj-release)-work)){+.+.+.}, at: process_one_work 
(include/linux/workqueue.h:186 kernel/workqueue.c:611 kernel/workqueue.c:638 
kernel/workqueue.c:2214)
[   16.630106]
[   16.630106] stack backtrace:
[   16.630106] CPU: 0 PID: 1069 Comm: kworker/0:1 Not tainted 
3.14.0-next-20140403-sasha-00019-g7474aa9-dirty #376
[   16.630106] Workqueue: events kobject_delayed_cleanup
[   16.630106]  b59d09d0 88007c6a58d8 b24bfb2f 

[   16.630106]  b59d09d0 88007c6a59d8 af1c4c8a 
88007c5a3d98
[   16.630106]  88007c5a3000 8801 af1c0e18 
88007c5a3d60
[   16.630106] Call Trace:
[   16.630106] dump_stack (lib/dump_stack.c:52)
[   16.630106] __lock_acquire (kernel/locking/lockdep.c:1740 
kernel/locking/lockdep.c:1783 kernel/locking/lockdep.c:2115 
kernel/locking/lockdep.c:3182)
[   16.630106] ? check_irq_usage (kernel/locking/lockdep.c:1638)
[   16.630106] ? __lock_acquire (kernel/locking/lockdep.c:3189)
[   16.630106] lock_acquire (arch/x86/include/asm/current.h:14 
kernel/locking/lockdep.c:3602)
[   16.630106] ? flush_work (kernel/workqueue.c:2880)
[   16.630106] flush_work (kernel/workqueue.c:2884)
[   16.630106] ? flush_work (kernel/workqueue.c:2880)
[   16.630106] ? debug_smp_processor_id (lib/smp_processor_id.c:57)
[   16.630106] ? put_lock_stats.isra.12 (arch/x86/include/asm/preempt.h:98 
kernel/locking/lockdep.c:254)
[   16.630106] ? try_to_grab_pending (kernel/workqueue.c:1260 
kernel/workqueue.c:1259)
[   16.630106] ? __cancel_work_timer (arch/x86/include/asm/paravirt.h:809 
(discriminator 2) kernel/workqueue.c:2913 (discriminator 2))
[   16.630106] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[   16.630106] __cancel_work_timer (arch/x86/include/asm/bitops.h:313 
kernel/workqueue.c:610 kernel/workqueue.c:644 kernel/workqueue.c:2916)
[   16.630106] cancel_work_sync (kernel/workqueue.c:2941)
[   16.630106] work_fixup_free (kernel/workqueue.c:479)
[   16.630106] __debug_check_no_obj_freed (lib/debugobjects.c:277 
lib/debugobjects.c:700)
[   16.630106] debug_check_no_obj_freed (lib/debugobjects.c:726)
[   16.630106] kfree (mm/slub.c:2679 mm/slub.c:3401)
[   16.630106] ? rtc_device_release (drivers/rtc/class.c:35)
[   16.630106] rtc_device_release (drivers/rtc/class.c:35)
[   16.630106] device_release (drivers/base/core.c:246)
[   16.630106] kobject_delayed_cleanup (lib/kobject.c:629 lib/kobject.c:638)
[   16.630106] process_one_work (kernel/workqueue.c:2221 
include/linux/jump_label.h:105 include/trace/events/workqueue.h:111 
kernel/workqueue.c:2226)
[   16.630106] ? process_one_work (include/linux/workqueue.h:186 
kernel/workqueue.c:611 kernel/workqueue.c:638 kernel/workqueue.c:2214)
[   16.630106] worker_thread (kernel/workqueue.c:2348)
[   16.630106] ? rescuer_thread (kernel/workqueue.c:2297)
[   16.630106] kthread (kernel/kthread.c:219)
[   16.630106] ? kthread_create_on_node (kernel/kthread.c:185)
[   16.630106] ret_from_fork (arch/x86/kernel/entry_64.S:555)
[   16.630106] ? kthread_create_on_node (kernel/kthread.c:185)


Thanks,
Sasha
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/