Hi Marek,

> From: Marek Vasut <ma...@denx.de>
> Sent: jeudi 26 mars 2020 17:28
>
> On 3/26/20 5:19 PM, Simon Glass wrote:
> > Hi Patrick,
>
> Hi,
>
> > On Wed, 25 Mar 2020 at 09:57, Patrick DELAUNAY <patrick.delau...@st.com> wrote:
> >>
> >> Hi,
> >>
> >>> From: Marek Vasut <ma...@denx.de>
> >>> Sent: mercredi 25 mars 2020 00:39
> >>>
> >>> Hi,
> >>>
> >>> I was looking at the STM32MP1 boot time and I noticed it takes about
> >>> 2 seconds to get to U-Boot.
> >>
> >> Thanks for the feedback.
> >>
> >> To be clear, SPL is not the ST priority, as we have many
> >> limitations (mainly on power management) for the SPL boot chain
> >> (stm32mp15_basic_defconfig):
> >> Rom code => SPL => U-Boot
> >>
> >> The recommended boot chain for STM32MP1 is Rom code => TF-A => U-Boot
> >> (stm32mp15_trusted_defconfig).
> >>
> >>> One problem is the insane I2C timing calculation in stm32f7 i2c
> >>> driver, which is almost a mallocator and CPU stress test and takes
> >>> about 1 second to complete in SPL -- we need some simpler
> >>> replacement for that, possibly the one in DWC I2C driver might do?
> >>
> >> Our first idea for managing these I2C settings (prescaler/timing
> >> settings) was to set the values in the device tree, but this binding
> >> was refused, so the function stm32_i2c_choose_solution()
> >
> > Was the binding refused in linux? Could we add something
> > U-Boot-specific then? I think having 'early' timings, etc. is very
> > handy. We are doing this on x86.
> >
> > Of course it has traditionally been impossible to convince Linux
> > people to add this sort of thing. Still, I think we should do it. Our
> > U-Boot-specific files allow this.
>
> Or reuse the DWC I2C driver timing calculation, which is real simple,
> fast, and should be accurate enough.

Yes, I checked __dw_i2c_set_bus_speed() in ./drivers/i2c/designware_i2c.c.

I agree that something simple should be able to find a 'good enough'
setting, but I have not dug into the ST I2C specification yet.
I am waiting for internal feedback.

> >> provides the best settings for any input clock and I2C frequency
> >> (it is called for each probe).
> >>
> >> But it is a brute-force and non-optimal solution: it tries all the
> >> candidate solutions to find the best one.
> >> The performance problem of this loop (code shared between the Linux /
> >> U-Boot / TF-A drivers) had already been seen and checked on the ST
> >> side in the TF-A context.
> >
> > We should be able to calculate it, like with dw-i2c.
>
> Yes
>
> >> We tried to improve the solution, without success; in the end the
> >> performance issue was solved by activating the dcache in TF-A before
> >> executing this loop.
> >
> > I would like to see patches to enable the cache. We did this some
> > years ago in a Chromebook and it made a big difference. It is not that
> > hard.
>
> ACK. Why did the chromebook patches never make it upstream ?

Work in progress, with improvements:
https://gitlab.denx.de/u-boot/custodians/u-boot-stm/-/commit/3399fb37c3b7db6e99118766c4d1cd5e742ecc8f

For example, the result on the STM32MP157C-DK2 board is:
- 1.6 s gain for the trusted boot chain with TF-A
- 2.2 s gain for the basic boot chain with SPL

I will push this patch after sanity checks (ARM requirements on TLB /
cache updates with the MMU activated).

> >> But as the data cache is not activated in SPL, this loop has
> >> terrible performance there.
> >>
> >> We need to dig into this topic again from the U-Boot point of view
> >> (SPL and also U-Boot, before and after relocation).
> >>
> >> I have also shared this issue with the ST owner of this code.
> >>
> >> For information, I added some traces and got, for the same code
> >> execution on the DK2 board:
> >> - 440ms in SPL (dcache OFF)
> >> - 36ms in U-Boot (dcache ON)
> >>
> >>> Another item I found is that, in U-Boot, initf_dm() takes about half
> >>> a second and so does serial_init(). I didn't dig into it to find out
> >>> why, but I suspect it has to do with the massive amount of UCLASSes
> >>> the DM has to traverse OR with the CPU being slow at that point, as
> >>> the clock driver didn't get probed just yet.
> >>>
> >>> Thoughts ?
> >>
> >> Yes, it is the first parsing of device tree, and it is really slow...
> >> directly linked to device tree size and libfdt.
> >
> > I wonder if we can improve this. There was a change to how the drivers
> > were bound (changing the ordering). We could perhaps revert that for
> > SPL.
>
> Link ?
>
> [...]
>
> --
> Best regards,
> Marek Vasut
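
On the I2C timing point discussed above, a direct (non-search) calculation
could look roughly like the sketch below. This is only an illustration under
stated assumptions: it is not the code of __dw_i2c_set_bus_speed() and not a
drop-in replacement for stm32_i2c_choose_solution(). The names
i2c_scl_counts, ns_to_cycles_ceil() and i2c_scl_counts_direct() are
hypothetical. The minimum SCL high/low times are the standard-mode
(4.0 us / 4.7 us) and fast-mode (0.6 us / 1.3 us) values from the I2C-bus
specification. The prescaler, filter delays and rise/fall times are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: SCL high and low phases in input-clock cycles. */
struct i2c_scl_counts {
	uint32_t high;
	uint32_t low;
};

/* Convert a duration in ns to clock cycles, rounding up (64-bit intermediate). */
uint32_t ns_to_cycles_ceil(uint32_t clk_hz, uint32_t ns)
{
	return (uint32_t)(((uint64_t)clk_hz * ns + 999999999ULL) / 1000000000ULL);
}

/*
 * Direct calculation of the SCL high/low cycle counts, without any search.
 * Assumption: only standard mode (<= 100 kHz) and fast mode (<= 400 kHz)
 * are handled; hardware-specific offsets are not modelled.
 */
struct i2c_scl_counts i2c_scl_counts_direct(uint32_t clk_hz, uint32_t bus_hz)
{
	struct i2c_scl_counts c;
	uint32_t t_high_min_ns, t_low_min_ns, period;

	assert(clk_hz > 0 && bus_hz > 0 && bus_hz <= 400000);

	/* Minimum SCL high/low times from the I2C-bus specification. */
	t_high_min_ns = (bus_hz <= 100000) ? 4000 : 600;
	t_low_min_ns = (bus_hz <= 100000) ? 4700 : 1300;

	/* Round the period up so the bus never runs faster than requested. */
	period = (clk_hz + bus_hz - 1) / bus_hz;

	c.high = ns_to_cycles_ceil(clk_hz, t_high_min_ns);
	c.low = ns_to_cycles_ceil(clk_hz, t_low_min_ns);

	/*
	 * Give any remaining cycles to the low phase. If the minimums already
	 * exceed the period, they win and the bus simply runs slower.
	 */
	if (c.high + c.low < period)
		c.low = period - c.high;

	return c;
}
```

For example, with a 24 MHz input clock and a 100 kHz bus this gives 96 high
cycles and 144 low cycles, using a handful of integer operations instead of
a loop over all candidate solutions.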