Re: [PATCH] cpuidle: coupled: fix the potential race condition and deadlock
On Sun, Dec 2, 2012 at 6:59 PM, Joseph Lo wrote:
> Consider the chance that two CPUs enter cpuidle_enter_state_coupled at
> very close times. The first CPU increments the waiting count and the
> second CPU does the same right away. The second CPU finds the first CPU
> already waiting and prepares to poke it.
>
> Before the second CPU can poke the first, the first finds the waiting
> count already equal to the online CPU count. So the first CPU does not
> go into the waiting loop, even though the second CPU has not poked it
> yet; the first CPU goes straight into the ready loop.
>
> The second CPU then sets up the coupled_cpuidle_mask for the first CPU
> and pokes it. But the first CPU has already lost the chance to handle
> the poke and clear the coupled_cpuidle_mask. Now the whole MPCore is
> coupled, and the MPCore goes into the power-saving idle mode.
>
> Because the poke is implemented as an "smp_call_function_single", it is
> a software interrupt of type "IPI_SINGLE_FUNC". If the platform's
> power-saving idle mode cannot retain software interrupts (e.g. Tegra's
> powered-down idle mode shuts off CPU power, including the power of the
> GIC), the software interrupt is lost.

This is the root of your problem. The cpu should never go to idle while an
IPI is pending. I thought we already had a patch to return an error from
gic_cpu_save when an IPI was pending and abort the idle transition, but
apparently not and I can't find any references to it.

> When the CPU resumes from the power-saving idle mode and the system is
> still idle, it goes into idle mode again immediately. Because
> "smp_call_function_single" does not allow the same function to be
> called for the same CPU twice, or it deadlocks, the
> "coupled_cpuidle_mask" can never be cleared by the first CPU, and the
> second CPU can never poke it again. That is where the deadlock happens.
>
> The fix here uses a different wake-up mechanism. Because there are
> already two loops and a global variable, "ready_waiting_counts", to
> synchronize the MPCore's coupled state, the "coupled_cpuidle_mask" is
> not really necessary. Waking the CPU from waiting and letting the
> outside world check whether the CPU needs to reschedule are enough to
> take the CPU out of idle. This fix does not modify the original
> behavior of the coupled cpuidle framework; it stays compatible with the
> original, and cpuidle drivers that already use coupled cpuidle do not
> need to change.

I don't like using the arch IPI functions directly, especially not reusing
arch_send_call_function_single_ipi without a function to call.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[PATCH] cpuidle: coupled: fix the potential race condition and deadlock
Consider the chance that two CPUs enter cpuidle_enter_state_coupled at very
close times. The first CPU increments the waiting count and the second CPU
does the same right away. The second CPU finds the first CPU already waiting
and prepares to poke it.

Before the second CPU can poke the first, the first finds the waiting count
already equal to the online CPU count. So the first CPU does not go into the
waiting loop, even though the second CPU has not poked it yet; the first CPU
goes straight into the ready loop.

The second CPU then sets up the coupled_cpuidle_mask for the first CPU and
pokes it. But the first CPU has already lost the chance to handle the poke
and clear the coupled_cpuidle_mask. Now the whole MPCore is coupled, and the
MPCore goes into the power-saving idle mode.

Because the poke is implemented as an "smp_call_function_single", it is a
software interrupt of type "IPI_SINGLE_FUNC". If the platform's power-saving
idle mode cannot retain software interrupts (e.g. Tegra's powered-down idle
mode shuts off CPU power, including the power of the GIC), the software
interrupt is lost.

When the CPU resumes from the power-saving idle mode and the system is still
idle, it goes into idle mode again immediately. Because
"smp_call_function_single" does not allow the same function to be called for
the same CPU twice, or it deadlocks, the "coupled_cpuidle_mask" can never be
cleared by the first CPU, and the second CPU can never poke it again. That
is where the deadlock happens.

The fix here uses a different wake-up mechanism. Because there are already
two loops and a global variable, "ready_waiting_counts", to synchronize the
MPCore's coupled state, the "coupled_cpuidle_mask" is not really necessary.
Waking the CPU from waiting and letting the outside world check whether the
CPU needs to reschedule are enough to take the CPU out of idle. This fix
does not modify the original behavior of the coupled cpuidle framework; it
stays compatible with the original.
The cpuidle drivers that already use coupled cpuidle do not need to change
either.

Cc: Colin Cross
Signed-off-by: Joseph Lo
---
Please check the detailed sequence below for how the race condition and
deadlock happen. It can be reproduced on Tegra20 and Tegra30.

(Suppose the last two CPUs come into the coupled cpuidle state at very
close times.)

CPU_0                                    CPU_1
cpuidle_enter_state_coupled              cpuidle_enter_state_coupled
1. increase the waiting count            1. increase the waiting count
--- same time frame ---
2. before going into the waiting         2. in the procedure of preparing
   loop, check the waiting count            to send an
   first                                    "smp_call_function_single"
2.1 the waiting count equals the            to wake up CPU_0
    online count
2.2 so it won't go into the
    waiting loop
--- next time frame ---
3. before going into the ready loop      3. trigger the IPI to CPU_0
3.1 enable irqs                          3.1 set up the coupled_mask for
3.2 check if there is any IPI                CPU_0
3.3 if yes, clear the coupled_mask       3.2 send the IPI to CPU_0
3.4 then disable irqs
4. the IPI_SINGLE_FUNC from CPU_1 to CPU_0 is triggered and the
   coupled_mask of CPU_0 is set up
4.1 but CPU_0 misses the IPI and the chance to clear the coupled_mask
--- next time frame ---
5. the MPCores go into the ready loop and are coupled
6. they go into the power-saving idle mode (for Tegra, this means the
   vdd_cpu rail is shut off; the GIC is powered off and the SW_INT (IPI)
   cannot be retained)
7. after resuming from the power-saving idle mode, the coupled_mask of
   CPU_0 has not been cleared
8. return to the kernel

If the system is still idle, it comes back to idle immediately:

CPU_0                                    CPU_1
cpuidle_enter_state_coupled              cpuidle_enter_state_coupled
1. set itself to waiting                 1. find CPU_0 in waiting
2. go into the waiting loop              2. prepare an IPI to poke CPU_0
3. check whether the coupled_mask        2.1 check the coupled_mask of
   is set                                    CPU_0
3.1 enable irqs                          2.2 because it has already been
3.2 wait for the coupled_mask to             set,
    be cleared                           2.3 it does not trigger another
                                             "smp_call_function_single"
                                             to CPU_0
                                         3. then go into the ready loop
4. because there is no "IPI_SINGLE_FUNC" irq triggered for CPU_0
4.1