Re: [PATCH v1 2/5] board_init: Use memset_simple() in board_init_f_init_reserve()

2021-08-09 Thread Stefan Roese

Hi Rasmus,

On 09.08.21 09:52, Rasmus Villemoes wrote:

On 06/08/2021 15.38, Stefan Roese wrote:

board_init_f_init_reserve() is called very early in the boot process,
before the caches are enabled. Because of this, the optimized memset()
version can't be used here on ARM64. With this patch, the simple memset
version memset_simple() is used here instead.


I'm afraid this approach is pretty much whack-a-mole, and it's also a
pity that one arch's limitations/special needs affects common code.


I fully agree - I was (am) also not really happy with the suggested
implemented.


It
also means we lose the compiler's knowledge of "ah, here you're calling
a standard library function whose semantics I know".

The linux kernel has a lot of instances of self-modifying code
(alternatives, static keys, etc.). Could we do something like that in
U-Boot? That is, have the optimized asm version of memset start with an
unconditional jump to memset_simple, then when the arch is ready,
overwrite that jump instruction with a nop. Compared to the kernel, we
don't have any complications from multiple CPUs, so it shouldn't be too
hard.

That would allow all callers to just call memset(), which is better both
for the compiler and the human brain. Also, when somebody modifies the
common code later, they won't know that they need to spell memset
"memset_simple", as their modifications probably work just fine on their
platforms, but then it'll explode on ARM64. And finally, what happens if
somebody disables caches at run-time?


U-Boot crashes! I forgot about this problem.


Shouldn't we then also have a
mechanism to switch all mem* calls over to the simple versions?


AFAICT, it's only the "dc" opcode which needs caching enabled. And "dc"
is only used in memset() and not memcpy().

Let me think again about either using your suggestion with the self
modifying code or perhaps by adding a "cache enable" check in the
new memset() function.

Thanks for your valuable feedback.

Thanks,
Stefan


Re: [PATCH v1 2/5] board_init: Use memset_simple() in board_init_f_init_reserve()

2021-08-09 Thread Rasmus Villemoes
On 06/08/2021 15.38, Stefan Roese wrote:
> board_init_f_init_reserve() is called very early in the boot process,
> before the caches are enabled. Because of this, the optimized memset()
> version can't be used here on ARM64. With this patch, the simple memset
> version memset_simple() is used here instead.

I'm afraid this approach is pretty much whack-a-mole, and it's also a
pity that one arch's limitations/special needs affects common code. It
also means we lose the compiler's knowledge of "ah, here you're calling
a standard library function whose semantics I know".

The linux kernel has a lot of instances of self-modifying code
(alternatives, static keys, etc.). Could we do something like that in
U-Boot? That is, have the optimized asm version of memset start with an
unconditional jump to memset_simple, then when the arch is ready,
overwrite that jump instruction with a nop. Compared to the kernel, we
don't have any complications from multiple CPUs, so it shouldn't be too
hard.

That would allow all callers to just call memset(), which is better both
for the compiler and the human brain. Also, when somebody modifies the
common code later, they won't know that they need to spell memset
"memset_simple", as their modifications probably work just fine on their
platforms, but then it'll explode on ARM64. And finally, what happens if
somebody disables caches at run-time? Shouldn't we then also have a
mechanism to switch all mem* calls over to the simple versions?

Rasmus


[PATCH v1 2/5] board_init: Use memset_simple() in board_init_f_init_reserve()

2021-08-06 Thread Stefan Roese
board_init_f_init_reserve() is called very early in the boot process,
before the caches are enabled. Because of this, the optimized memset()
version can't be used here on ARM64. With this patch, the simple memset
version memset_simple() is used here instead.

Signed-off-by: Stefan Roese 
---

 common/init/board_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/common/init/board_init.c b/common/init/board_init.c
index 0965b96fa3ad..9996aff74373 100644
--- a/common/init/board_init.c
+++ b/common/init/board_init.c
@@ -140,7 +140,7 @@ void board_init_f_init_reserve(ulong base)
 
gd_ptr = (struct global_data *)base;
/* zero the area */
-   memset(gd_ptr, '\0', sizeof(*gd));
+   memset_simple(gd_ptr, '\0', sizeof(*gd));
/* set GD unless architecture did it already */
 #if !defined(CONFIG_ARM)
arch_setup_gd(gd_ptr);
-- 
2.32.0