I have traced all execution paths that plausibly increment "perf_count_sw_cpu_migrations" (i.e. `((struct task_struct *)p)->se.nr_migrations++`), which would show us that the task migrated between CPUs.
____________________________________________________________

(Execution Path 1 - Kernel swaps 2 tasks from 2 different CPUs)

[execution interrupted by a page fault]
handle_pte_fault ->
  numa_migrate_preferred ->
    task_numa_migrate ->
      migrate_swap ->
        stop_two_cpus ... (schedules the task-migration state machine)

<asynchronous handler on CPU IPIs>
[the kernel migration thread handles the task switch]
[there is 1 migration thread running per CPU]
[the new CPU comes from the NUMA-balancing/SMP logic]

migrate_swap_stop ->
  migrate_swap_task -> (kernel swaps 2 tasks from different CPUs)
    set_task_cpu -> **** update to perf counter ****

[process is scheduled on a new CPU]
____________________________________________________________

(Execution Path 2 - Similar to 1 but, instead of swapping, it sends the task)

[execution interrupted by a page fault]
handle_pte_fault ->
  migrate_task_to ->
    stop_one_cpu ... (schedules the task submission to another CPU)

<asynchronous handler on CPU IPIs>
[the kernel migration thread handles the task switch]
[there is 1 migration thread running per CPU]
[the new CPU comes from the NUMA-balancing/SMP logic]

migration_cpu_stop ->
  migrate_task -> (moves the task from one CPU to another)
    move_queued_task ->
      set_task_cpu -> **** update to perf counter ****

[process is scheduled on a new CPU]
____________________________________________________________

(Execution Path 3 - New executions)

[fork / exec]
sched_exec ->
  stop_one_cpu ... (schedules the task submission to another CPU)

<asynchronous handler on CPU IPIs>
[the kernel migration thread handles the task switch]
[there is 1 migration thread running per CPU]
[the new CPU comes from the sched_class - fair/deadline/rt - select_task_rq logic]
[the new CPU can also come from select_fallback_rq]
  --> the fallback might not take the cpumask into consideration

migration_cpu_stop ->
  migrate_task -> (moves the task from one CPU to another)
    move_queued_task ->
      set_task_cpu -> **** update to perf counter ****

[process is scheduled on a new CPU]
____________________________________________________________

(Execution Path 4 - Regular Scheduling)

[wake-up of a process]
try_to_wake_up ->
  select_task_rq ->
    set_task_cpu -> **** update to perf counter ****

[the new CPU comes from the sched_class - fair/deadline/rt - select_task_rq logic]
[the new CPU can also come from select_fallback_rq]
  --> the fallback might not take the cpumask into consideration

[process is scheduled on a new CPU]
____________________________________________________________

****** note for execution paths 3 & 4 ********

-> select_fallback_rq is responsible for the messages:

[255688.556945] process 1 (init) no longer affine to cpu1
[266710.938490] process 1 (init) no longer affine to cpu1
[275071.280189] process 1 (init) no longer affine to cpu1
[286088.372647] process 1 (init) no longer affine to cpu1
[355886.470777] process 1 (init) no longer affine to cpu1
[358415.046246] process 1 (init) no longer affine to cpu1

seen in dmesg. They show that the fallback mechanism for picking the CPU run queue was used. That fallback mechanism might be doing something wrong.
______________________________

PS: There are a few other paths, coming from the deadline & realtime schedulers, that are not shown here.

My idea is to get user & kernel stack traces on a probe at "set_task_cpu". This will tell us whether it is being called, by which function, and whether all calls come from the same execution path (e.g. from select_fallback_rq instead of the p->sched_class->select_task_rq() functions of the fair scheduler).
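The probing step above could be sketched roughly like this (assumptions: root access, a kernel with kprobes enabled, and a perf userspace matching the running kernel; set_task_cpu may be inlined on some builds, in which case the probe would have to be placed on a caller instead):

```shell
# Place a kprobe on set_task_cpu (requires root).
perf probe --add set_task_cpu

# Record system-wide with callchains for a while; -g gives us the
# kernel stack leading into set_task_cpu, -a records on all CPUs.
perf record -e probe:set_task_cpu -a -g -- sleep 15

# Inspect which callers reach set_task_cpu, e.g. whether the chains
# go through select_fallback_rq rather than select_task_rq.
perf report --stdio

# Remove the probe when done.
perf probe --del set_task_cpu
```

If the recorded callchains all funnel through select_fallback_rq, that would point at the fallback run-queue selection as the source of the bogus migration counts.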
-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1575407

Title:
  Trusty + 3.19 (lts-vivid) PERF wrong cpu-migration counter

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  It was brought to my attention that, on a PowerPC based server, PERF
  seems to accuse cpu-migrations when only a single CPU is activated.

  ## perf

   Performance counter stats for 'CPU(s) 0':

       15027.888988  task-clock (msec)        #  1.000 CPUs utilized   [100.00%]
             25,206  context-switches         #  0.002 M/sec           [100.00%]
              3,518  cpu-migrations           #  0.234 K/sec           [100.00%]
                639  page-faults              #  0.043 K/sec
     41,545,780,384  cycles                   #  2.765 GHz             [66.68%]
      2,868,753,319  stalled-cycles-frontend  #  6.91% frontend cycles idle  [50.01%]
     30,162,193,535  stalled-cycles-backend   # 72.60% backend cycles idle   [50.01%]
     11,161,722,533  instructions             #  0.27 insns per cycle
                                              #  2.70 stalled cycles per insn [66.68%]
      1,544,072,679  branches                 # 102.747 M/sec          [49.99%]
         52,536,867  branch-misses            #  3.40% of all branches [49.99%]

       15.027768835 seconds time elapsed

  ## lscpu

  Architecture:          ppc64le
  Byte Order:            Little Endian
  CPU(s):                128
  On-line CPU(s) list:   0
  Off-line CPU(s) list:  1-127
  Thread(s) per core:    1
  Core(s) per socket:    1
  Socket(s):             1
  NUMA node(s):          2
  Model:                 8335-GCA
  L1d cache:             64K
  L1i cache:             32K
  L2 cache:              512K
  L3 cache:              8192K
  NUMA node0 CPU(s):     0
  NUMA node8 CPU(s):

  So either task migrations are being done to offline CPUs, or perf is
  accounting them wrong.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1575407/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp