Tyrel Datwyler <tyr...@linux.vnet.ibm.com> writes:
> On 01/31/2019 02:21 PM, Tyrel Datwyler wrote:
>> On 01/31/2019 01:53 PM, Michael Bringmann wrote:
>>> On 1/30/19 11:38 PM, Michael Ellerman wrote:
>>>> Michael Bringmann <m...@linux.vnet.ibm.com> writes:
>>>>> This patch is to check for cede'ed CPUs during LPM.  Some extreme
>>>>> tests encountered a problem where Linux has put some threads to
>>>>> sleep (possibly to save energy or something), LPM was attempted,
>>>>> and the Linux kernel didn't awaken the sleeping threads, but issued
>>>>> the H_JOIN for the active threads.  Since the sleeping threads
>>>>> are not awake, they can not issue the expected H_JOIN, and the
>>>>> partition would never suspend.  This patch wakes the sleeping
>>>>> threads back up.
>>>>
>>>> I don't think this is the right solution.
>>>>
>>>> Just after your for loop we do an on_each_cpu() call, which sends an IPI
>>>> to every CPU, and that should wake all CPUs up from CEDE.
>>>>
>>>> If that's not happening then there is a bug somewhere, and we need to
>>>> work out where.
>>>
>>> Let me explain the scenario of the LPM case that Pete Heyrman found, and
>>> that Nathan F. was working upon, previously.
>>>
>>> In the scenario, the partition has 5 dedicated processors each with 8
>>> threads running.
>>
>> Do we CEDE processors when running dedicated? I thought H_CEDE was part of
>> the Shared Processor LPAR option.
>
> Looks like the cpuidle-pseries driver uses CEDE with dedicated processors as
> long as firmware supports SPLPAR option.
>
>>
>>>
>>> From the PHYP data we can see that on VP 0, threads 3, 4, 5, 6 and 7 issued
>>> a H_CEDE requesting to save energy by putting the requesting thread into
>>> sleep mode.  In this state, the thread will only be awakened by H_PROD from
>>> another running thread or from an external user action (power off, reboot
>>> and such).  Timers and external interrupts are disabled in this mode.
>>
>> Not according to PAPR. A CEDE'd processor should awaken if signaled by
>> external interrupt such as decrementer or IPI as well.
>
> This statement should still apply though. From PAPR:
>
> 14.11.3.3 H_CEDE
> The architectural intent of this hcall() is to have the virtual processor,
> which has no useful work to do, enter a wait state ceding its processor
> capacity to other virtual processors until some useful work appears, signaled
> either through an interrupt or a prod hcall(). To help the caller reduce race
> conditions, this call may be made with interrupts disabled but the semantics
> of the hcall() enable the virtual processor’s interrupts so that it may
> always receive wake up interrupt signals.

Thanks for digging that out of PAPR.

H_CEDE must respond to IPIs; we have no logic to H_PROD CPUs that are idle in
order to wake them up.

There must be something else going on here.

cheers