On 9/12/23 23:28, Yeqi Fu wrote:
This commit implements a shared library, where native functions are rewritten as special instructions. At runtime, user programs load the shared library, and special instructions are executed when native functions are called.
Hello Yeqi,

I like the idea of speeding up linux-user with your approach.
Do you have a git tree which I can pull for testing (or please mention the base commit your patches are based on)?

How does the emulation behave if a guest has bugs and accesses wrong memory locations, e.g.:
    memcpy(NULL, "Hello", 6)
Will it segfault the same way as if it had run natively? At least I think the signal IP addresses will be different.

Regarding your implementation:
diff --git a/common-user/native/libnative.S b/common-user/native/libnative.S
new file mode 100644
index 0000000000..bc51dabedf
--- /dev/null
+++ b/common-user/native/libnative.S
@@ -0,0 +1,51 @@
+.macro special_instr sym
+#if defined(__i386__)
you use here #ifdefs,
+    ud0     \sym-1f, %eax; 1:
+#elif defined(__x86_64__)
+    ud0     \sym(%rip), %eax
+#elif defined(__arm__) || defined(__aarch64__)
+    hlt     0xffff
+1:  .word   \sym - 1b
+#elif defined(__mips__)
+    syscall 0xffff
+1:  .word   \sym - 1b
+#else
+# error
+#endif
+.endm
+
+.macro ret_instr
+#if defined(__i386__) || defined(__x86_64__) || defined(__aarch64__)
and here again,
+    ret
+#elif defined(__arm__)
+    bx      lr
+#elif defined(__mips__)
+    jr      $ra
+#else
+# error
+#endif
+.endm
+
+/* Symbols of native functions */
+
+.macro define_function name
+    .text
+\name:
+    special_instr 9f
and here the pointer to the string.
+    ret_instr
+    .globl \name
+    .type \name, %function
+    .size \name, . - \name
+
+    .section .rodata
+9:  .asciz  "\name"
+.endm
IMHO, it would be easier if you just do:

+/* wrapper for native functions */
+
+.macro define_function name
+    .text
+    .align 8  /* function is 8-byte aligned */
+\name:
+/* every arch has up to 8 bytes for trigger and return instruction */
+#if defined(__i386__)
+    ud0     0, %eax
+    ret
+#elif defined(__x86_64__)
+    ud0     0, %eax
+    ret
+#elif defined(__mips__)
+    syscall 0xffff
+    jr      $ra
+#<...more ifdef for arches...>
+#endif
+
+/* the native function name is stored 8 bytes behind \name symbol: */
+    .align 8
+    .asciz  "\name"
+
+    .globl \name
+    .type \name, %function
+    .size \name, . - \name
+.endm

With that you
- save some bytes, code & pointers
- don't need to load the pointer to the native function string
  (as it's always stored as ASCII 8 bytes behind the function itself)
- don't need to adjust the IP
- simplify the asm code and reduce the ifdefs
- ...

Helge
+
+define_function memcmp
+define_function memcpy
+define_function memset
+define_function strcat
+define_function strcmp
+define_function strcpy
+define_function strncmp
+define_function strncpy
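
To illustrate the fixed-offset layout suggested above, here is a rough C sketch of how the emulator side might find the function name once the trigger instruction traps. It is only a sketch under assumptions: native_name_at(), native_index() and the fake stub bytes are hypothetical names and data made up for this example, not code from the patch, and it reads host memory directly, whereas linux-user would have to read the string from guest memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout from the suggestion above: 8 bytes are reserved for the
 * trigger + return instructions, then the NUL-terminated name follows. */
#define STUB_CODE_BYTES 8

/* Hypothetical helper: given the address of the stub (the trapping PC),
 * return the name stored right behind the 8-byte code slot. */
static const char *native_name_at(const uint8_t *stub)
{
    return (const char *)(stub + STUB_CODE_BYTES);
}

/* Names taken from the define_function list in the patch. */
static const char *const native_names[] = {
    "memcmp", "memcpy", "memset", "strcat",
    "strcmp", "strcpy", "strncmp", "strncpy",
};

/* Hypothetical lookup: index of the name in the table, or -1 if unknown. */
static int native_index(const char *name)
{
    size_t n = sizeof(native_names) / sizeof(native_names[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(native_names[i], name) == 0) {
            return (int)i;
        }
    }
    return -1;
}

/* Fake stub for illustration only: the code-slot bytes are placeholders,
 * not a real encoding of the trigger instruction. */
static const uint8_t fake_memcpy_stub[16] = {
    0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0x90, 0xc3, /* 8-byte code slot */
    'm', 'e', 'm', 'c', 'p', 'y', 0, 0,             /* name, NUL-padded */
};
```

The point of the layout is that no per-architecture operand decoding is needed: the trapping address plus a constant gives the string, and the table lookup is the same for every target.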