Hi Tom, On Sat, 11 Jan 2025 at 15:54, Tom Rini <[email protected]> wrote: > > On Thu, Jan 09, 2025 at 05:29:57AM -0700, Simon Glass wrote: > > > Loading a FIT is useful for other VBE methods, such as ABrec. Create a > > new function to handling reading it. > > > > Signed-off-by: Simon Glass <[email protected]> > > This causes a bunch of growth: > a3y17lte : all +1328 text +1328 > u-boot: add: 8/0, grow: 1/0 bytes: 1328/0 (1328) > function old new > delta > blkcache_fill - 332 > +332 > blkcache_read - 240 > +240 > blk_read - 188 > +188 > vbe_read_nvdata - 156 > +156 > vbe_read_version - 140 > +140 > vbe_get_blk - 100 > +100 > simple_read_nvdata - 96 > +96 > crc8 - 72 > +72 > vbe_simple_read_state 108 112 > +4 > > Which is unexpected for just moving code around that's not newly used.
I hadn't noticed that on the boards I was trying, so thank you for spotting it. This is because it now uses blk_read() instead of blk_dread(), so if BLOCK_CACHE is enabled, it will use the block cache. We could disable BLOCK_CACHE on those boards perhaps? It is a speed optimisation so shouldn't be used by boards which care about code size. > And even when it's just a move it's still growing: > xilinx_zynqmp_virt: all +128 bss -72 text +200 > u-boot: add: 4/0, grow: 0/-1 bytes: 540/-340 (200) > function old new > delta > vbe_read_nvdata - 156 > +156 > vbe_get_blk - 148 > +148 > vbe_read_version - 140 > +140 > simple_read_nvdata - 96 > +96 > vbe_simple_read_state 452 112 > -340 Unfortunately this one is hard to fix. As you know, whenever you take code from a single module and put it into another, the compiler cannot optimise away the function-call overhead. I'll note that there is no increase when LTO is used, e.g. with xilinx_versal_net_mini_qspi So let me know what you think. Regards, Simon

