Hi, Fengguang,

We already fixed that in f7b8a47da17c "sched: Remove lockdep check in 
sched_move_task()".

В Вс, 01/02/2015 в 20:22 -0800, Fengguang Wu пишет:
> Hi Kirill,
> 
> FYI the RCU warning shows up since this commit, however there are
> several other BUGs, so it's not necessarily a problem of this patch.
> 
> commit eeb61e53ea19be0c4015b00b2e8b3b2185436f2b
> Author:     Kirill Tkhai <ktk...@parallels.com>
> AuthorDate: Mon Oct 27 14:18:25 2014 +0400
> Commit:     Ingo Molnar <mi...@kernel.org>
> CommitDate: Tue Oct 28 10:45:59 2014 +0100
> 
>     sched: Fix race between task_group and sched_task_group
>     
>     The race may happen when somebody is changing task_group of a forking 
> task.
>     Child's cgroup is the same as parent's after dup_task_struct() (there just
>     memory copying). Also, cfs_rq and rt_rq are the same as parent's.
>     
>     But if parent changes its task_group before it's called 
> cgroup_post_fork(),
>     we do not reflect this situation on child. Child's cfs_rq and rt_rq remain
>     the same, while child's task_group changes in cgroup_post_fork().
>     
>     To fix this we introduce fork() method, which calls sched_move_task() 
> directly.
>     This function changes sched_task_group on appropriate (also its logic has
>     no problem with freshly created tasks, so we shouldn't introduce something
>     special; we are able just to use it).
>     
>     Possibly, this decides the Burke Libbey's problem: 
> https://lkml.org/lkml/2014/10/24/456
>     
>     Signed-off-by: Kirill Tkhai <ktk...@parallels.com>
>     Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>
>     Cc: Linus Torvalds <torva...@linux-foundation.org>
>     Link: http://lkml.kernel.org/r/1414405105.19914.169.camel@tkhai
>     Signed-off-by: Ingo Molnar <mi...@kernel.org>
> 
> ===================================================
> PARENT COMMIT NOT CLEAN. LOOK OUT FOR WRONG BISECT!
> ===================================================
> 
> Attached dmesg for the parent commit, too, to help confirm whether it is a 
> noise error.
> 
> +--------------------------------------------------------+------------+------------+------------+
> |                                                        | xfs-for-li | 
> eeb61e53ea | 28ff344e1d |
> +--------------------------------------------------------+------------+------------+------------+
> | boot_successes                                         | 0          | 0     
>      | 0          |
> | boot_failures                                          | 60         | 20    
>      | 11         |
> | WARNING:at_include/linux/kref.h:#kobject_get()         | 60         | 20    
>      | 11         |
> | WARNING:at_fs/kernfs/dir.c:#kernfs_remove_by_name_ns() | 1          | 1     
>      | 1          |
> | BUG:unable_to_handle_kernel                            | 59         | 20    
>      | 11         |
> | Oops                                                   | 59         | 20    
>      | 11         |
> | RIP:string                                             | 1          | 1     
>      | 1          |
> | Kernel_panic-not_syncing:Fatal_exception               | 59         | 20    
>      | 11         |
> | backtrace:platform_driver_unregister                   | 60         | 20    
>      | 11         |
> | backtrace:init_mac80211_hwsim                          | 60         | 20    
>      | 11         |
> | backtrace:kernel_init_freeable                         | 60         | 20    
>      | 11         |
> | backtrace:bus_register                                 | 1          | 1     
>      | 1          |
> | backtrace:i2o_driver_init                              | 1          | 1     
>      | 1          |
> | backtrace:i2o_iop_init                                 | 1          | 1     
>      | 1          |
> | INFO:trying_to_register_non-static_key                 | 59         | 18    
>      | 10         |
> | RIP:mutex_lock_nested                                  | 58         | 18    
>      | 10         |
> | INFO:rcu_sched_detected_stalls_on_CPUs/tasks           | 1          |       
>      |            |
> | INFO:task_blocked_for_more_than#seconds                | 1          |       
>      |            |
> | INFO:lockdep_is_turned_off                             | 1          |       
>      |            |
> | RIP:native_apic_mem_write                              | 1          |       
>      |            |
> | RIP:rep_nop                                            | 1          |       
>      |            |
> | Kernel_panic-not_syncing:hung_task:blocked_tasks       | 1          |       
>      |            |
> | backtrace:ring_buffer_producer_thread                  | 1          |       
>      |            |
> | backtrace:watchdog                                     | 1          |       
>      |            |
> | INFO:suspicious_RCU_usage                              | 0          | 20    
>      | 11         |
> | WARNING:at_kernel/locking/lockdep.c:#__bfs()           | 0          | 1     
>      |            |
> | RIP:__bfs                                              | 0          | 1     
>      |            |
> | backtrace:do_fork                                      | 0          | 20    
>      | 11         |
> +--------------------------------------------------------+------------+------------+------------+
> 
> [    0.014295] debug: unmapping init [mem 
> 0xffffffff82820000-0xffffffff82823fff]
> [    0.017254] 
> [    0.017433] ===============================
> [    0.017885] [ INFO: suspicious RCU usage. ]
> [    0.018000] 3.18.0-rc2-00001-geeb61e5 #6 Not tainted
> [    0.018000] -------------------------------
> [    0.018000] kernel/sched/core.c:7408 suspicious rcu_dereference_check() 
> usage!
> [    0.018000] 
> [    0.018000] other info that might help us debug this:
> [    0.018000] 
> [    0.018000] 
> [    0.018000] rcu_scheduler_active = 1, debug_locks = 1
> [    0.018000] 2 locks held by swapper/0/0:
> [    0.018000]  #0:  (&p->pi_lock){......}, at: [<ffffffff810dbbdf>] 
> task_rq_lock+0x38/0xe7
> [    0.018000]  #1:  (&rq->lock){-.....}, at: [<ffffffff810dbbfa>] 
> task_rq_lock+0x53/0xe7
> [    0.018000] 
> [    0.018000] stack backtrace:
> [    0.018000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
> 3.18.0-rc2-00001-geeb61e5 #6
> [    0.018000] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    0.018000]  0000000000000001 ffffffff82403cc8 ffffffff81b9e83f 
> ffffffff82425580
> [    0.018000]  0000000000000001 ffffffff82403cf8 ffffffff810f3d12 
> ffff880012068000
> [    0.018000]  ffff880012422040 0000000000000000 0000000000000000 
> ffffffff82403d58
> [    0.018000] Call Trace:
> [    0.018000]  [<ffffffff81b9e83f>] dump_stack+0x51/0xaa
> [    0.018000]  [<ffffffff810f3d12>] lockdep_rcu_suspicious+0xfa/0x103
> [    0.018000]  [<ffffffff810e2c04>] sched_move_task+0xe5/0x153
> [    0.018000]  [<ffffffff810e2c8f>] cpu_cgroup_fork+0x9/0xb
> [    0.018000]  [<ffffffff81131f85>] cgroup_post_fork+0x9b/0xaa
> [    0.018000]  [<ffffffff810ba1f0>] copy_process+0x178a/0x18bf
> [    0.018000]  [<ffffffff81b9776c>] ? rest_init+0x130/0x130
> [    0.018000]  [<ffffffff8119f86c>] ? vm_unmap_aliases+0x196/0x1a5
> [    0.018000]  [<ffffffff810ba4ff>] do_fork+0xec/0x3e1
> [    0.018000]  [<ffffffff81b9776c>] ? rest_init+0x130/0x130
> [    0.018000]  [<ffffffff810ba815>] kernel_thread+0x21/0x23
> [    0.018000]  [<ffffffff81b9765a>] rest_init+0x1e/0x130
> [    0.018000]  [<ffffffff82728f3c>] start_kernel+0x433/0x440
> [    0.018000]  [<ffffffff82728120>] ? early_idt_handlers+0x120/0x120
> [    0.018000]  [<ffffffff827284cb>] x86_64_start_reservations+0x2a/0x2c
> [    0.018000]  [<ffffffff82728608>] x86_64_start_kernel+0x13b/0x14a
> [    0.018079] Getting VERSION: 1050014
> [    0.019012] Getting VERSION: 1050014
> [    0.019403] Getting ID: 0
> 
> git bisect start 5e62863aea463c9bf24538315327d0922be35d64 v3.17 --
> git bisect good f207671f402b9aa7367ee21dbab2e450d16c9d7e  # 01:43     20+     
> 20  staging: unisys: refactor ULTRA_DIAG_CHANNEL_PROTOCOL
> git bisect good 7c0496d19159df8b5a537154194a75dcdfca873a  # 01:50     20+     
> 20  staging: vt6655: mac80211 conversion: s_vFillRTSHead convert to using 
> struct ieee80211_hdr
> git bisect  bad 33192fa1d173b91ba98c29cb8ab414c76ae3ff0b  # 01:53      0-     
> 20  staging: unisys: refactor VISORCHIPSET_BUS_INFO
> git bisect good 146761058f63d8530d150bef00c1d3daaa53dbba  # 01:58     20+     
> 20  staging: vt6655: struct vnt_private remove dead members
> git bisect good 037a449ac71fc5c70d2b0af28ce6792065daae55  # 02:04     20+     
> 20  staging: rtl8188eu: os_dep: Remove useless cast on void pointer
> git bisect good 425ebd0aa35dcfc5fe2449ef3e1194e422f6f9b4  # 02:14     20+     
> 20  staging: unisys: virtpci: Add a blank line after the definition of 
> driver_private
> git bisect good 9056be30542bfff51190bdda67088f319cf4c9f5  # 02:20     20+     
> 20  Staging: lustre: lnet: lnet: Better cookie gen
> git bisect  bad 2c683cde06c1d57a9ecbacc484540e259e7255fa  # 02:26      0-     
> 20  staging: unisys: refactor CRASH_OBJ_TYPE
> git bisect  bad 37951dabd899587afc31ddd508b9cdddc9c24524  # 02:33      0-     
> 20  staging: unisys: get rid of VISORCHIPSET_STATE typedef
> git bisect  bad 7cbe010a5ea728d7c4440b11a1a3997faca0e46d  # 02:40      0-     
> 20  Merge 3.18-rc3 into staging-next
> git bisect good f7e87a44ef60ad379e39b45437604141453bf0ec  # 02:48     20+     
> 20  Merge tag 'media/v3.18-rc2' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
> git bisect  bad 89453379aaf0608253124057df6cd8ac63948135  # 02:56      0-     
> 20  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
> git bisect  bad 19e0d5f16ad9759d92f5739b53acdebfe22fa2a2  # 03:04      0-     
> 20  Merge branch 'x86-urgent-for-linus' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good 94712927d06c19b785bae92db07ef84b89f156c8  # 03:11     20+     
> 20  Merge tag 'sound-3.18-rc3' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> git bisect good aea4869f68b5869afbb308bfb7d777d725df8900  # 03:14     20+     
> 20  Merge branch 'core-urgent-for-linus' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good 5656b408ff2696551c0f572689edcad3113e3a32  # 03:17     20+     
> 20  Merge branch 'perf-urgent-for-linus' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good d1cd1210834649ce1ca6bafe5ac25d2f40331343  # 03:31     20+     
> 20  x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE
> git bisect  bad 2847c90e1b3ae95379af24894fc4f98e7f2fd705  # 03:34      0-     
> 20  sched/fair: Care divide error in update_task_scan_period()
> git bisect  bad 64be6f1f5f710f5995d41caf8a1767fe6d2b5a87  # 03:37      0-     
> 20  sched/deadline: Don't replenish from a !SCHED_DEADLINE entity
> git bisect  bad eeb61e53ea19be0c4015b00b2e8b3b2185436f2b  # 03:39      0-     
> 20  sched: Fix race between task_group and sched_task_group
> # first bad commit: [eeb61e53ea19be0c4015b00b2e8b3b2185436f2b] sched: Fix 
> race between task_group and sched_task_group
> git bisect good cac7f2429872d3733dc3f9915857b1691da2eb2f  # 03:45     60+     
> 60  Linux 3.18-rc2
> # extra tests with DEBUG_INFO
> git bisect good eeb61e53ea19be0c4015b00b2e8b3b2185436f2b  # 03:52     60+     
> 49  sched: Fix race between task_group and sched_task_group
> # extra tests on HEAD of next/master
> git bisect good 827e3bdf1bb2401c1a1e5586eb3977d228d298b2  # 03:59     60+     
>  0  Add linux-next specific files for 20150130
> # extra tests on tree/branch linus/master
> git bisect  bad 788807d7ca3821b5ea835a588a52d55631c17e18  # 04:01     58-     
>  3  Merge branch 'i2c/for-current' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
> # extra tests with first bad commit reverted
> # extra tests on tree/branch linus/master
> git bisect  bad 788807d7ca3821b5ea835a588a52d55631c17e18  # 04:03      0-     
>  3  Merge branch 'i2c/for-current' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
> # extra tests on tree/branch next/master
> git bisect  bad 827e3bdf1bb2401c1a1e5586eb3977d228d298b2  # 04:07    295-     
> 40  Add linux-next specific files for 20150130
> 
> 
> This script may reproduce the error.
> 
> ----------------------------------------------------------------------------
> #!/bin/bash
> 
> kernel=$1
> initrd=quantal-core-x86_64.cgz
> 
> wget --no-clobber 
> https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd
> 
> kvm=(
>       qemu-system-x86_64
>       -enable-kvm
>       -cpu Haswell,+smep,+smap
>       -kernel $kernel
>       -initrd $initrd
>       -m 320
>       -smp 2
>       -net nic,vlan=1,model=e1000
>       -net user,vlan=1
>       -boot order=nc
>       -no-reboot
>       -watchdog i6300esb
>       -rtc base=localtime
>       -serial stdio
>       -display none
>       -monitor null 
> )
> 
> append=(
>       hung_task_panic=1
>       earlyprintk=ttyS0,115200
>       debug
>       apic=debug
>       sysrq_always_enabled
>       rcupdate.rcu_cpu_stall_timeout=100
>       panic=-1
>       softlockup_panic=1
>       nmi_watchdog=panic
>       oops=panic
>       load_ramdisk=2
>       prompt_ramdisk=0
>       console=ttyS0,115200
>       console=tty0
>       vga=normal
>       root=/dev/ram0
>       rw
>       drbd.minor_count=8
> )
> 
> "${kvm[@]}" --append "${append[*]}"
> ----------------------------------------------------------------------------

Regards/С уважением,
Kirill Tkhai/Кирилл Тхай

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to