Hi Tom, On Wed, 15 Jan 2025 at 07:43, Tom Rini <[email protected]> wrote: > > On Mon, Jan 13, 2025 at 07:22:19PM -0600, Tom Rini wrote: > > On Mon, Jan 13, 2025 at 05:13:46PM -0700, Simon Glass wrote: > > > Hi Tom, > > > > > > On Mon, 13 Jan 2025 at 13:44, Tom Rini <[email protected]> wrote: > > > > > > > > On Mon, Jan 13, 2025 at 01:03:52PM -0700, Simon Glass wrote: > > > > > Hi Tom, > > > > > > > > > > On Sat, 11 Jan 2025 at 15:54, Tom Rini <[email protected]> wrote: > > > > > > > > > > > > On Thu, Jan 09, 2025 at 05:29:57AM -0700, Simon Glass wrote: > > > > > > > > > > > > > Loading a FIT is useful for other VBE methods, such as ABrec. > > > > > > > Create a > > > > > > > new function to handling reading it. > > > > > > > > > > > > > > Signed-off-by: Simon Glass <[email protected]> > > > > > > > > > > > > This causes a bunch of growth: > > > > > > a3y17lte : all +1328 text +1328 > > > > > > u-boot: add: 8/0, grow: 1/0 bytes: 1328/0 (1328) > > > > > > function old > > > > > > new delta > > > > > > blkcache_fill - > > > > > > 332 +332 > > > > > > blkcache_read - > > > > > > 240 +240 > > > > > > blk_read - > > > > > > 188 +188 > > > > > > vbe_read_nvdata - > > > > > > 156 +156 > > > > > > vbe_read_version - > > > > > > 140 +140 > > > > > > vbe_get_blk - > > > > > > 100 +100 > > > > > > simple_read_nvdata - > > > > > > 96 +96 > > > > > > crc8 - > > > > > > 72 +72 > > > > > > vbe_simple_read_state 108 > > > > > > 112 +4 > > > > > > > > > > > > Which is unexpected for just moving code around that's not newly > > > > > > used. > > > > > > > > > > I hadn't noticed that on the boards I was trying, so thank you for > > > > > spotting it. > > > > > > > > > > This is because it now uses blk_read() instead of blk_dread(), so if > > > > > > > > That's not what this patch does? There's no caller before or after in > > > > this patch of "blk_dread". Just moving functions around should not > > > > increase size on platforms that weren't using the existing > > > > functionality. > > > > > > Firstly, are we looking at the same patch? Here is the one I am looking > > > at: > > > > > > https://patchwork.ozlabs.org/project/uboot/patch/[email protected]/ > > > > You're right, I replied to the wrong patch here, sorry for the > > confusion. I'll move some of my comments in reply to the correct patch > > now. > > > > [snip] > > > > > > And even when it's just a move it's still growing: > > > > > > xilinx_zynqmp_virt: all +128 bss -72 text +200 > > > > > > u-boot: add: 4/0, grow: 0/-1 bytes: 540/-340 (200) > > > > > > function old > > > > > > new delta > > > > > > vbe_read_nvdata - > > > > > > 156 +156 > > > > > > vbe_get_blk - > > > > > > 148 +148 > > > > > > vbe_read_version - > > > > > > 140 +140 > > > > > > simple_read_nvdata - > > > > > > 96 +96 > > > > > > vbe_simple_read_state 452 > > > > > > 112 -340 > > > > > > > > > > Unfortunately this one is hard to fix. As you know, whenever you take > > > > > code from a single module and put it into another, the compiler cannot > > > > > optimise away the function-call overhead. I'll note that there is no > > > > > increase when LTO is used, e.g. with xilinx_versal_net_mini_qspi > > > > Yes, but 200 bytes isn't just function call overhead. Some of that might > > be from going from one ALLOC_CACHE_ALIGN_BUFFER(u8, buf, > > MMC_MAX_BLOCK_LEN) to two? > > This is the double buffer I was referring to.
OK, well that is just a declaration, so doesn't add any code, so far as I understand it. I ran out of time this morning, but did find that the compiler cannot (of course) optimise the common range-checks when the code is in different files. So I fiddled with removing some of them and that gets the growth down, about 40 bytes on Thumb2 and 130 on aarch64, both without LTO. I'll send a new series out later today. Regards, Simon

