This patchset improves support for sched_* syscalls under user
emulation.

The first commit adds support for sched_getattr/sched_setattr, which
were previously not implemented. These syscalls are not exposed by
glibc. The struct type needs to be redefined locally, as it cannot be
included directly from headers that predate
https://lkml.org/lkml/2020/5/28/810.
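
For readers unfamiliar with these syscalls, the following is a minimal
sketch of how a caller without a libc wrapper might invoke them with a
locally defined struct. It is not the code from this series: the names
`local_sched_attr`, `local_sched_getattr` and `local_sched_setattr` are
hypothetical, and the field layout is assumed to match the kernel's
`struct sched_attr` including the util_min/util_max fields.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical local copy of the kernel's struct sched_attr. It is
 * declared under a different name so it cannot clash with a definition
 * that a libc or kernel header may provide.
 */
struct local_sched_attr {
    uint32_t size;            /* size of this struct as seen by the caller */
    uint32_t sched_policy;
    uint64_t sched_flags;
    int32_t  sched_nice;
    uint32_t sched_priority;
    uint64_t sched_runtime;
    uint64_t sched_deadline;
    uint64_t sched_period;
    uint32_t sched_util_min;  /* assumed present; absent in older kernels */
    uint32_t sched_util_max;  /* assumed present; absent in older kernels */
};

/* Fixed-width fields only, so the layout should have no hidden padding. */
static_assert(sizeof(struct local_sched_attr) == 56,
              "unexpected local_sched_attr size");

/* Raw sched_getattr: returns 0 on success, -1 with errno set on failure. */
static long local_sched_getattr(pid_t pid, struct local_sched_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    return syscall(SYS_sched_getattr, pid, attr,
                   (unsigned int)sizeof(*attr), 0u);
}

/* Raw sched_setattr: the caller fills in attr, including attr->size. */
static long local_sched_setattr(pid_t pid, const struct local_sched_attr *attr)
{
    return syscall(SYS_sched_setattr, pid, attr, 0u);
}
```

Calling `local_sched_getattr(0, &attr)` queries the calling thread. On
success the kernel writes back in `attr.size` how many bytes it filled
in, which may be smaller than the local struct on an older kernel.
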
The sched_attr type can grow in future kernel versions. When a client
sends values that QEMU does not understand, QEMU returns E2BIG with the
same semantics as an older kernel would, so the client can retry with
smaller inputs.

The second commit fixes sched_getscheduler/sched_setscheduler and
sched_getparam/sched_setparam when QEMU is built with musl. Musl does
not implement these because of a conflict between what these functions
should do as syscalls and what they should do in libc:
https://git.musl-libc.org/cgit/musl/commit/?id=1e21e78bf7a5c24c217446d8760be7b7188711c2
I've changed QEMU to call the syscalls directly, which should always be
the behavior the user expects.

Via https://github.com/tonistiigi/binfmt/pull/70 and
https://github.com/tonistiigi/binfmt/pull/73, with additional tests.

Changes v3->v4:
- The host `sched_param` type is used for local syscalls.
- Added a check_zeroed_user() helper. This function takes the kernel
  and userspace sizes and checks that the extra padding is empty, using
  the same rules as the kernel. I also tried a version with only one
  size parameter, but doing all the `size-sizeof()` calculations on the
  caller side made the function quite useless.
- Defined a local host version of `sched_param` so that target and host
  types are defined separately.
- Moved the size type declaration to the beginning of the function.

Changes v2->v3:
- Fixed the wrong property name for sched_flags.
- Validate the size parameter and handle E2BIG errors the same way the
  kernel does. There is one case where this can't be done completely
  correctly, but clients should still be able to handle it: when a
  client sends a non-zero structure bigger than the current kernel
  definition, we return E2BIG with the struct size known to QEMU. If
  the client then sends a structure of that size, it may still get
  another E2BIG error from the kernel if the kernel is old and doesn't
  implement util_min/util_max. I don't think this can be handled
  without making additional syscalls to the kernel.

Changes v1->v2:
- Locking guest addresses for sched_attr is now based on the size
  inputs, not the local struct size.
  Also did the same for the setter, where I now read only the size
  field of the struct first.
- Use offsetof() when checking whether optional fields are supported.
- `target_sched_attr` now uses aligned types as requested. I didn't
  quite understand why this is needed, as I don't see the same in the
  kernel headers, but since this type uses only constant-width fields
  and is already aligned by default, it can't break anything.
- Fixed formatting.
- Defined a separate `target_sched_param` struct as requested.

Tonis Tiigi (2):
  linux-user: add sched_getattr support
  linux-user: call set/getscheduler set/getparam directly

 linux-user/syscall.c      | 157 +++++++++++++++++++++++++++++++++++---
 linux-user/syscall_defs.h |  18 +++++
 2 files changed, 165 insertions(+), 10 deletions(-)

-- 
2.32.0 (Apple Git-132)