On Thu, Oct 3, 2019 at 7:36 PM Amit Kucheria <amit.kuche...@linaro.org> wrote:
>
> On Wed, Oct 2, 2019 at 11:48 PM Jeffrey Hugo <jeffrey.l.h...@gmail.com> wrote:
> >
> > On Wed, Oct 2, 2019 at 3:27 AM Niklas Cassel <niklas.cas...@linaro.org> wrote:
> > >
> > > On Wed, Oct 02, 2019 at 11:19:50AM +0200, Niklas Cassel wrote:
> > > > On Mon, Sep 30, 2019 at 04:20:15PM -0600, Jeffrey Hugo wrote:
> > > > > Amit, the merged version of the below change causes a boot failure
> > > > > (nasty hang, sometimes with RCU stalls) on the msm8998 laptops. Oddly
> > > > > enough, it seems to be resolved if I remove the cpu-idle-states
> > > > > property from one of the cpu nodes.
> > > > >
> > > > > I see no issues with the msm8998 MTP.
> > > >
> > > > Hello Jeffrey, Amit,
> > > >
> > > > If the PSCI idle states work properly on the msm8998 devboard (MTP),
> > > > but causes crashes on msm8998 laptops, the only logical change is
> > > > that the PSCI firmware is different between the two devices.
> > >
> > > Since the msm8998 laptops boot using ACPI, perhaps these laptops
> > > doesn't support PSCI/have any PSCI firmware at all.
> >
> > They have PSCI. If there was no PSCI, I would expect the PSCI
> > get_version request from Linux to fail, and all PSCI functionality to
> > be disabled.
> >
> > However, your mention about ACPI sparked a thought. ACPI describes
> > the idle states, along with the PSCI info, in the ACPI0007 devices.
> > Those exist on the laptops, and the info mostly correlates with Amit's
> > patch (ACPI seems to be a bit more conservative about the latencies,
> > and describes one additional deeper state). However, upon a detailed
> > analysis of the ACPI description, I did find something relevant - the
> > retention state is not enabled.
> >
> > So, I hacked out the retention state from Amit's patch, and I did not
> > observe a hang. I used sysfs, and appeared able to validate that the
> > power collapse state was being used successfully.
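
For anyone wanting to repeat the sysfs check quoted above, here is a minimal sketch of one way it could be done. It is not the exact procedure used in this thread. It assumes the standard cpuidle sysfs layout (/sys/devices/system/cpu/cpuN/cpuidle/stateM/ with "name" and "usage" files), and the function names are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def read_cpuidle_usage(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu: {state_name: usage_count}} read from cpuidle sysfs.

    Assumes the standard layout cpuN/cpuidle/stateM/{name,usage}.
    """
    result = {}
    for cpu_dir in sorted(Path(sysfs_root).glob("cpu[0-9]*")):
        idle_dir = cpu_dir / "cpuidle"
        if not idle_dir.is_dir():
            continue  # this CPU exposes no cpuidle information
        states = {}
        for state_dir in sorted(idle_dir.glob("state[0-9]*")):
            try:
                name = (state_dir / "name").read_text().strip()
                usage = int((state_dir / "usage").read_text().strip())
            except (OSError, ValueError):
                continue  # skip states that cannot be read or parsed
            states[name] = usage
        if states:
            result[cpu_dir.name] = states
    return result


def entered_states(before, after):
    """Return {cpu: [state names whose usage count grew between snapshots]}."""
    grown = {}
    for cpu, states in after.items():
        names = [
            name
            for name, count in states.items()
            if count > before.get(cpu, {}).get(name, 0)
        ]
        if names:
            grown[cpu] = names
    return grown
```

Taking one snapshot with read_cpuidle_usage(), waiting a second or so, taking another, and passing both to entered_states() shows which idle states each CPU actually entered in between. A state whose usage count never grows is either never selected by the governor or not working as described.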
>
> Interesting that the shallower sleep state was causing problems.
> Usually, it is the deeper states that cause problems. So you plan to
> override the idle states table in the board-specific DT?

Yes. Already posted.

>
> Why does the platform even rely on DT? Shouldn't we use the ACPI tables
> instead?

In theory, yes. However, the ACPI seems to be incomplete (assumes
things are just hardcoded in the driver, maybe?) and has tons of
non-standard things in it. DT seems to be the easy path to
enablement.

>
> > I'm guessing that something is weird with the laptops, where the CPUs
> > can go into retention, but not come out, thus causing issues.
> >
> > I'll post a patch to fix up the laptops. Thanks for all the help.