Thomas Gleixner <[email protected]> writes:

> On Fri, Jul 03 2026 at 08:26, Sven Schnelle wrote:
>> Thomas Gleixner <[email protected]> writes:
>>> It's less than obvious and I have no objections to clean that up and
>>> make it more intuitive, but I still fail to see what Michal is actually
>>> trying to solve and what the magic flag is for. If s390 requires it,
>>> then that's an s390 problem, but definitely x86 does not.
>>
>> The difference between x86 and s390 is that on s390, regs->gprs[2] is
>> used for both the syscall number and the syscall return value.
>> That was a design mistake made early on, about 25 years ago, but
>> it's ABI now, so it cannot be changed.
>
> Cute.
>
>> When seccomp decides to skip a syscall, it writes a return value into
>> regs->gprs[2]. When syscall_enter_from_user_mode_work() returns, it
>> returns this number. If it's negative all is good - the 'if (likely(nr <
>> NR_syscalls))' condition would just catch it and skip the syscall.
>>
>> But if it's a positive number, the code cannot distinguish whether
>> that's a return value or a syscall number.
>>
>> So I introduced PIF_SYSCALL_RET_SET when converting s390 to generic
>> entry. This flag tells the syscall code that a return value was set in
>> pt_regs and the syscall should be skipped.
>
> You also could have added a 'syscall_ret' member to pt_regs, operate
> on that for the return values (seccomp, syscall...) and swap it into
> gprs[2] right before returning to user space.

That would likely also work, but I found an additional flag with a
descriptive name easier to read and understand than yet another
'somehow-related-to-gpr2' member in pt_regs.
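
To make the ambiguity concrete, here is a minimal user-space sketch. It
is not the s390 entry code: struct fake_regs, NR_SYSCALLS_DEMO,
filter_skip() and both would_dispatch_*() helpers are hypothetical names
made up for illustration, and the unsigned comparison is an assumption
about how the 'nr < NR_syscalls' check rejects negative values.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SYSCALLS_DEMO 512	/* hypothetical syscall table size */

/* Hypothetical stand-in for the relevant part of pt_regs. */
struct fake_regs {
	long gpr2;	/* syscall number on entry, return value on exit */
	bool ret_set;	/* plays the role of PIF_SYSCALL_RET_SET */
};

/* A filter decides to skip the syscall and injects a return value. */
static void filter_skip(struct fake_regs *regs, long ret)
{
	regs->gpr2 = ret;
	regs->ret_set = true;
}

/*
 * Range check only. The unsigned cast turns negative values into huge
 * numbers, so injected errors are rejected, but a small positive
 * return value still looks like a valid syscall number.
 */
static bool would_dispatch_naive(const struct fake_regs *regs)
{
	return (unsigned long)regs->gpr2 < NR_SYSCALLS_DEMO;
}

/* Range check plus flag: an injected return value always skips. */
static bool would_dispatch_flag(const struct fake_regs *regs)
{
	return !regs->ret_set &&
	       (unsigned long)regs->gpr2 < NR_SYSCALLS_DEMO;
}

/* Walks through the negative and the positive case. */
static void self_check(void)
{
	struct fake_regs neg = { .gpr2 = 0, .ret_set = false };
	struct fake_regs pos = { .gpr2 = 0, .ret_set = false };

	filter_skip(&neg, -1);
	assert(!would_dispatch_naive(&neg));	/* negative: caught anyway */
	assert(!would_dispatch_flag(&neg));

	filter_skip(&pos, 5);
	assert(would_dispatch_naive(&pos));	/* 5 is taken for syscall 5 */
	assert(!would_dispatch_flag(&pos));	/* the flag resolves it */
}
```

With the range check alone, an injected return value of 5 would be
dispatched as syscall number 5; with the flag, the syscall is skipped
regardless of the value's sign.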
