Hi Przemyslaw,

On 16 February 2015 at 08:21, Przemyslaw Marczak <p.marc...@samsung.com> wrote:
> Hello,
>
>
> On 02/16/2015 04:13 PM, Przemyslaw Marczak wrote:
>>
>> For the ARM architecture, enabling CONFIG_USE_ARCH_MEMSET/MEMCPY
>> greatly increases the memset/memcpy performance. This is possible
>> thanks to the ARM multiple register instructions.
>>
>> Unfortunately the relocation is done without the cache enabled,
>> so it takes some time, but zeroing the BSS memory takes much
>> longer, especially for the configs with big static buffers.
>>
>> A quick test confirms that the boot time improvement after using
>> the arch memcpy for relocation is not significant.
>> The same test confirms that enabling the memset for zeroing BSS
>> reduces the boot time.
>>
>> So this patch enables the arch memset for zeroing the BSS after
>> the relocation process. For ARM boards, this can be enabled
>> in board configs by defining: 'CONFIG_USE_ARCH_MEMSET'.
>>
>> This was tested on Trats2.
>> A quick test with trace. Boot time from start to main_loop() entry:
>> - ~1384ms - before this change
>> - ~888ms - after this change
>>
>> Signed-off-by: Przemyslaw Marczak <p.marc...@samsung.com>
>> Cc: Albert Aribaud <albert.u.b...@aribaud.net>
>> Cc: Tom Rini <tr...@ti.com>
>> ---
>>   arch/arm/lib/crt0.S | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm/lib/crt0.S b/arch/arm/lib/crt0.S
>> index 22df3e5..fab3d2c 100644
>> --- a/arch/arm/lib/crt0.S
>> +++ b/arch/arm/lib/crt0.S
>> @@ -115,14 +115,22 @@ here:
>>         bl      c_runtime_cpu_setup     /* we still call old routine here */
>>
>>         ldr     r0, =__bss_start        /* this is auto-relocated! */
>> -       ldr     r1, =__bss_end          /* this is auto-relocated! */
>>
>> +#ifdef CONFIG_USE_ARCH_MEMSET
>> +       ldr     r3, =__bss_end          /* this is auto-relocated! */
>> +       mov     r1, #0x00000000         /* prepare zero to clear BSS */
>> +
>> +       subs    r2, r3, r0              /* r2 = memset len */
>> +       bl      memset
>> +#else
>> +       ldr     r1, =__bss_end          /* this is auto-relocated! */
>>         mov     r2, #0x00000000         /* prepare zero to clear BSS */
>>
>>  clbss_l:cmp     r0, r1                  /* while not at end of BSS */
>>         strlo   r2, [r0]                /* clear 32-bit BSS word */
>>         addlo   r0, r0, #4              /* move to next */
>>         blo     clbss_l
>> +#endif
>>
>>         bl      coloured_LED_init
>>         bl      red_led_on
>>
>
> This commit left unchanged. After a boot time test using an oscilloscope
> and the clock cycle counter I didn't notice a time difference of more
> than one ms. In this case I think that inserting duplicated code here
> makes no sense.

I don't understand this comment, sorry.

Regards,
Simon
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot
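
For readers following the quoted patch, the two BSS-clearing paths can be sketched in host-side C. This is only an illustration, not code from U-Boot: the function names clear_bss_loop() and clear_bss_memset() are hypothetical, and the start/end pointers stand in for __bss_start and __bss_end. The memset() arguments follow the register setup in the patch (r0 = start, r1 = 0, r2 = end - start), assuming the usual ARM calling convention where the first arguments are passed in r0 to r2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for the clbss_l loop: store zero to one 32-bit
 * word at a time while the cursor is still below the end address.
 */
static void clear_bss_loop(uint32_t *start, uint32_t *end)
{
	while (start < end)
		*start++ = 0;
}

/*
 * Hypothetical stand-in for the CONFIG_USE_ARCH_MEMSET path: compute the
 * length in bytes (the "subs r2, r3, r0" step) and hand the whole range
 * to memset() in a single call.
 */
static void clear_bss_memset(uint32_t *start, uint32_t *end)
{
	size_t len;

	assert(start <= end);	/* the length must not be negative */
	len = (size_t)((uint8_t *)end - (uint8_t *)start);
	memset(start, 0, len);
}
```

Both functions leave the range [start, end) zeroed and do not touch memory outside it. Any speed difference would come from the memset() implementation itself, which is the point of the boot time figures quoted in the patch description.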