Hi, On Mon, Jan 16, 2017 at 05:51:30PM +0100, Ondřej Jirman wrote: > Hi Maxime, > > Dne 16.1.2017 v 17:43 Maxime Ripard napsal(a): > >> It does lock up quickly with mainline ccu_nkmp_find_best algorithm > >> for finding factors. > >> > >> Even with linux kernel, it breaks. It's just more difficult to hit the > >> right conditions. I got oops only right after boot when running cpuburn > >> to trigger thermal_zone issued OPP change, if I first run some cpupower > >> commands. That's why I wrote this program to stress test various CPU_PLL > >> change/factor selection algorithms independently of everything else, to > >> get more predictable and quicker testing results. > > > > Understood. Do you have the code available somewhere? > > It's a bit messy, but it's here: > > https://github.com/megous/h3-firmware/blob/master/main.c
Awesome, thanks! > >>>> What works is selecting NKMP factors so that M is always 1 and P is > >>>> anything other than /1 only for frequencies under 288MHz. As mandated by > >>>> the H3 datasheet. Mainline ccu_nkmp_find_best doesn't respect these > >>>> conditions. With that I can change CPUX frequencies randomly 20x a > >>>> second so far indefinitely without the main CPU ever locking up. > >>>> > >>>> Please drop or revert this patch. It is not a correct approach to the > >>>> problem. I'd suggest dropping the entire clock notifier mechanism, too, > >>>> unless it can be proven to work reliably. > >>> > >>> It has been proven to work reliably on a number of other SoCs. > >> > >> Unless it was stress tested like this with randomy changed settings, I > >> doubt you can call it reliable. It may just be very hard to hit the > >> issue on linux with particular OPP/thermal zone configuration. That's > >> because the issue is dependent on before and after NKMP values. People > >> may have just been lucky so far. > > > > Yes, or maybe we just have OPPs that just don't trigger a low enough P > > factor. > > > > There's no rush anyway, the H3 cpufreq support is not enabled at the > > moment, so that code basically does nothing for the moment. > > > > What's your current plan to fix that? I guess the easiest (and most > > likely to be reusable) would be to allow for clock tables, instead of > > using the generic approach. We might have some other clocks (like > > audio or video) that would need such a precise tuning in the future > > too. > > My proposed solution is this for M factor (H3 specific solution): > > https://github.com/megous/linux/commit/88be3d421e958579026135d8acec4b3983958738 This one is fine. > and this for P factor: > > https://github.com/megous/linux/commit/d7f274ed0c13fa9b4099445cb6bf9b2f8f2cfa8a > > Perhaps it should be configurable if the P limitation is not universal > for all NKMP clocks. But I haven't read all the datasheets. And this is exactly what I wanted to avoid, for the reason you're giving :) This is some generic code, and yet you're putting a clock and SoC specific limit in there. You have two ways to deal with that. Either come up with some generic throttling mecanism to force a particular value above or below a given threshold, but that might be tedious to do, and require some significant rework. Or you can use a table, which is generic and should be relatively easy. I really think this is the most straightforward solution, and even more so since we just want to support a limited number of frequencies in this case. Maxime -- Maxime Ripard, Free Electrons Embedded Linux and Kernel engineering http://free-electrons.com
signature.asc
Description: PGP signature

