Re: syscall_get_error() && TS_ checks
On Thu, Mar 30, 2017 at 12:21 PM, Andy Lutomirski wrote: > > Huh? Aren't those REX prefixes interpreted as INC instructions or > similar in compat mode? Hmm. I think you're right. So it's not like x32 that runs in long mode but then would use the 64-bit ptrace interface anyway. Your statement that 64-bit gdb sometimes uses 32-bit compat ptrace makes me shudder. Why? Don't even tell me. I suspect I'm happier not knowing. Linus
Re: syscall_get_error() && TS_ checks
On Thu, Mar 30, 2017 at 12:11 PM, Linus Torvalds wrote: > On Thu, Mar 30, 2017 at 11:59 AM, Andy Lutomirski wrote: >>> >>> And then actually run such a kernel on a 32-bit distro, and verifying >>> that things like gdb and strace really work. But it needs real >>> testing, not some kind of handwaving. It's a *big* change. >> >> I'll offer the following handwave: if there are problems, I'd expect >> to see them in mixed-bitness uses, not 32-bit distros. But the 32-bit >> case is worth testing, too. > > I wouldn't worry too much about the mixed case, simply because you > clearly cannot use a 32-bit gdb on a 64-bit process. > > So the mixed case already needs to use a 64-bit gdb, which presumably > would never use the 32-bit ptrace paths in the first place, so this > code never triggers. > Hah. Hah hah. IIRC 64-bit gdb *does* use the 32-bit paths, or at least it uses some path that can't see the high regs. I don't fully recall, but this is the case that seems more likely to break to me. It's a great big mess. > Of course, the mroe testing the better, but the thing I'd really want > to check is that there isn't some 32-bit distro that might have a > library that is optimized and notices when it's running on a 64-bit > capable CPU and uses REX prefixes to use special optimized versions. Huh? Aren't those REX prefixes interpreted as INC instructions or similar in compat mode? You can't just run 64-bit instructions in a compat code segment. You *can* use LAR to find a 64-bit code segment and long-jump to it (and I've written code to do exactly that, and it's even snuck it's way into linux.git, muahaha), but code like this is terminally screwed under 32-bit gdb.
Re: syscall_get_error() && TS_ checks
On Thu, Mar 30, 2017 at 11:59 AM, Andy Lutomirski wrote: >> >> And then actually run such a kernel on a 32-bit distro, and verifying >> that things like gdb and strace really work. But it needs real >> testing, not some kind of handwaving. It's a *big* change. > > I'll offer the following handwave: if there are problems, I'd expect > to see them in mixed-bitness uses, not 32-bit distros. But the 32-bit > case is worth testing, too. I wouldn't worry too much about the mixed case, simply because you clearly cannot use a 32-bit gdb on a 64-bit process. So the mixed case already needs to use a 64-bit gdb, which presumably would never use the 32-bit ptrace paths in the first place, so this code never triggers. Of course, the mroe testing the better, but the thing I'd really want to check is that there isn't some 32-bit distro that might have a library that is optimized and notices when it's running on a 64-bit capable CPU and uses REX prefixes to use special optimized versions. That would probably already break with a 32-bit gdb, but I can at least in theory imagine code that simply depends on zero extension. >> And in the signal handling path, the sign-extension and test of %eax >> is reivially ok. Not because it's ok in general, but because we've >> verified using %orig_eax that we're in a system call return path. We >> could happily delete the whole TS_I386_REGS_POKED thing, I think. > > Then how do we know whether to sign extend? TS_COMPAT does *not* work > because it isn't set on signal delivery. Ok, so we'd still need that nasty TS_I386_REGS_POKED. I still think that's "ok" within the particular confines of just signal delivery. Of course, I don't actually know why we don't set that flag unconditionally when we poke things using that 32-bit ptrace interface, but that's just more of the "yeah, this code is crazy hacky" Cleaning things up is good. I just don't want people to dismiss that the current code seems to work well enough in practice. That's *probably* because nobody actually uses it, but I do know that people used to run 64-bit kernels with 32-bit user land exactly because they needed the kernel for large memory machines (PAE really doesn't work for shit) but were carrying 32-bit userland along for whatever lazy legacy reasons. So it has gotten *some* use. It's probably (hopefully) fading. .. there's Wine and dosemu that do crazy things too. Things like "we only set TS_I386_REGS_POKED when modifying orig_eax" might be something that those kinds of programs have actually noticed and are actively working around etc. Because while *normal* programs hopefully don't do insane things on purpose, both wine and dosemu definitely have been known to very much intentionally do some truly crazy stuff. Linus
Re: syscall_get_error() && TS_ checks
On Thu, Mar 30, 2017 at 11:35 AM, Linus Torvalds wrote: > On Thu, Mar 30, 2017 at 11:23 AM, Andy Lutomirski wrote: >> On Thu, Mar 30, 2017 at 10:46 AM, Linus Torvalds >> wrote: >>> >>> We *really* shouldn't sign-extend that value if the debugger ends up >>> updating the pointer (or maybe the debugger just reloads previous >>> values, not really "updating" anything - I think that's what gdb does >>> when you do a call within the context of the debugged program from >>> within gdb, for example) >> >> Can you think of a case where this would actually matter? > > I'd actually be willing to have people try, but then you'd have to > sign-extend *all* registers. No way in hell will I accept a patch that > randomly sign-extends just %eax. > > And then actually run such a kernel on a 32-bit distro, and verifying > that things like gdb and strace really work. But it needs real > testing, not some kind of handwaving. It's a *big* change. I'll offer the following handwave: if there are problems, I'd expect to see them in mixed-bitness uses, not 32-bit distros. But the 32-bit case is worth testing, too. > >> --- issue 1 --- >> >> get_nr_restart_syscall() does: >> >>if (current->thread.status & (TS_COMPAT|TS_I386_REGS_POKED)) >>return __NR_ia32_restart_syscall; >> >> This is very, very buggy > > Quite frankly, a bug somewhere else is not an excuse for then making > other code buggier. I don't see what "issue 1" has to do with anything > what-so-ever,. > >> --- issue 2 --- >> >> syscall_get_error(). This is available on all arches, but it appears >> to be used *only* on x86. > > So I think that one is a red herring. We can trivially fix that by > just (a) removing everybody elses syscall_get_error(), and just > inlining the x86 case into the x86 signal handling. Boom, gone. > > And in the signal handling path, the sign-extension and test of %eax > is reivially ok. Not because it's ok in general, but because we've > verified using %orig_eax that we're in a system call return path. We > could happily delete the whole TS_I386_REGS_POKED thing, I think. Then how do we know whether to sign extend? TS_COMPAT does *not* work because it isn't set on signal delivery. > > So don't even _try_ to equate this code with the general sign > extension code by ptrace. That is a totally different animal > altogether. Of course it's different. But doing sign extension in general would avoid the need for this special case, and it seems that the special case isn't actually correct in cases that people try to use. Oleg, can you clarify what's broken? --Andy
Re: syscall_get_error() && TS_ checks
On Thu, Mar 30, 2017 at 11:23 AM, Andy Lutomirski wrote: > On Thu, Mar 30, 2017 at 10:46 AM, Linus Torvalds > wrote: >> >> We *really* shouldn't sign-extend that value if the debugger ends up >> updating the pointer (or maybe the debugger just reloads previous >> values, not really "updating" anything - I think that's what gdb does >> when you do a call within the context of the debugged program from >> within gdb, for example) > > Can you think of a case where this would actually matter? I'd actually be willing to have people try, but then you'd have to sign-extend *all* registers. No way in hell will I accept a patch that randomly sign-extends just %eax. And then actually run such a kernel on a 32-bit distro, and verifying that things like gdb and strace really work. But it needs real testing, not some kind of handwaving. It's a *big* change. > --- issue 1 --- > > get_nr_restart_syscall() does: > >if (current->thread.status & (TS_COMPAT|TS_I386_REGS_POKED)) >return __NR_ia32_restart_syscall; > > This is very, very buggy Quite frankly, a bug somewhere else is not an excuse for then making other code buggier. I don't see what "issue 1" has to do with anything what-so-ever,. > --- issue 2 --- > > syscall_get_error(). This is available on all arches, but it appears > to be used *only* on x86. So I think that one is a red herring. We can trivially fix that by just (a) removing everybody elses syscall_get_error(), and just inlining the x86 case into the x86 signal handling. Boom, gone. And in the signal handling path, the sign-extension and test of %eax is reivially ok. Not because it's ok in general, but because we've verified using %orig_eax that we're in a system call return path. We could happily delete the whole TS_I386_REGS_POKED thing, I think. But then it's very much about the fact that within the particular case of the signal handling code and system call return detection, the whole sign extension checking makes sense. So don't even _try_ to equate this code with the general sign extension code by ptrace. That is a totally different animal altogether. Linus
Re: syscall_get_error() && TS_ checks
On Thu, Mar 30, 2017 at 10:46 AM, Linus Torvalds wrote: > For example, let's assume that %eax contains a 32-bit pointer with the > high bit set, and we're using a 32-bit debugger on a 32-bit program > (ie you're just running a 32-bit distro on a 64-bit kernel, which > people have definitely done). > > We *really* shouldn't sign-extend that value if the debugger ends up > updating the pointer (or maybe the debugger just reloads previous > values, not really "updating" anything - I think that's what gdb does > when you do a call within the context of the debugged program from > within gdb, for example) Can you think of a case where this would actually matter? > > So I really *really* don't think you can just sign-extend %eax. Which > is exactly why we have that nasty odd sign-extension in the signal > path instead, but then have to make it conditional on running a 32-bit > program. > > But maybe there is still something I'm not understanding in your > argument. This thread has been a series of mis-understandings. As the daft kernel hacker who introduced TS_I386_REGS_POKED in the first place, I'll try to explain what I think is going on. TS_I386_REGS_POKED is an enormous kludge, and it serves two purposes. It avoids a potential security bug that the old code had, and it at least documents the code paths that are thoroughly broken. (Before they were TS_COMPAT instead, but most of the TS_COMPAT users are fine.) It's used in two places: --- issue 1 --- get_nr_restart_syscall() does: if (current->thread.status & (TS_COMPAT|TS_I386_REGS_POKED)) return __NR_ia32_restart_syscall; This is very, very buggy. Fixing this appears to require somewhat some surgery. Proposals include adding new restart_syscall numbers that match across 32-bit and 64-bit (interacts quite awkwardly with seccomp) or trying to store syscall bitness along with restart_block (ick, not actually 100% reliable depending on just how abusing the debugger is). --- issue 2 --- syscall_get_error(). This is available on all arches, but it appears to be used *only* on x86. It's used to figure out whether we're restarting a syscall. It could plausibly matter if we have a buggy compat syscall that returns int instead of long, but the main purpose is for compatibility with 32-bit debuggers. Neither Oleg nor I have thought of anything other than this code path that cares at all about the high bits of RAX on a process that's being poked using 32-bit ptrace. Sign-extending RAX seems like it would get rid of this code path entirely to me. --Andy
Re: syscall_get_error() && TS_ checks
On Thu, Mar 30, 2017 at 8:49 AM, Oleg Nesterov wrote: >> >> case offsetof(struct user32, regs.orig_eax): > ^^ > damn, just in case, of course this is typo, should be > > case offsetof(struct user32, regs.eax); So honestly, the first version of your patch I went "Ok, that looks perfectly reasonable", because orig_eax is special, and always sign-extending it sounds fine. The *fixed* version of your patch I go "no, that's obviously complete garbage, that's completely bogus". Because the regular eax register is *not* special, and sign-extending it sounds very dangerous indeed. The user will actually *use* that register as a register, unlike orig_eax. And eax is no different from all the other random registers. Yes, it happens to be the return value for system calls, but this codepath is not at all limited to system calls, it's absolutely *any* context when you use gdb on a 32-bit binary. Now, you may well say that "the upper bits aren't well-defined in that case anyway". And you'd be mostly right. Except the semantics of x86-64 is that 32-bit operations zero-extend the upper bits, so zero-extension really *is* the right thing to do. For example, let's assume that %eax contains a 32-bit pointer with the high bit set, and we're using a 32-bit debugger on a 32-bit program (ie you're just running a 32-bit distro on a 64-bit kernel, which people have definitely done). We *really* shouldn't sign-extend that value if the debugger ends up updating the pointer (or maybe the debugger just reloads previous values, not really "updating" anything - I think that's what gdb does when you do a call within the context of the debugged program from within gdb, for example) So I really *really* don't think you can just sign-extend %eax. Which is exactly why we have that nasty odd sign-extension in the signal path instead, but then have to make it conditional on running a 32-bit program. But maybe there is still something I'm not understanding in your argument. This thread has been a series of mis-understandings. Linus
Re: syscall_get_error() && TS_ checks
On 03/30, Oleg Nesterov wrote: > > On 03/29, Linus Torvalds wrote: > > > > On Wed, Mar 29, 2017 at 11:50 AM, Oleg Nesterov wrote: > > > > > > Again, afaics we only need these compat checks because regs->ax could be > > > changed by 32-bit debugger without sign-extension. > > > > You don't explain how you were planning on *fixing* that code. You > > know why it exists, but then you just say "let's remove it", without > > any explanation of what you'd replace it with. > > Hmm. I tried to explain... Let me quote my initial email, > > So why we can't simply change putreg32() to always sign-extend regs->ax > regs->orig_ax and just do > > static inline long syscall_get_error(struct task_struct *task, >struct pt_regs *regs) > { > return regs-ax; > } > > ? Or, better, simply kill it and use syscall_get_return_value() in > arch/x86/kernel/signal.c. > > Of course, if the tracee is 64-bit and debugger is 32-bit then the > unconditional sign-extend can be wrong, but do we really care about > this case? This can't really work anyway. And the current code is not > right too. Say, debugger nacks the signal which interrupted syscall > and sets regs->ax = -ERESTARTSYS to force the restart, this won't work > because TS_COMPAT|TS_I386_REGS_POKED are not set. > > In short. can the patch below work? > > Oleg. > > diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c > index 9cc7d5a..96f21fc 100644 > --- a/arch/x86/kernel/ptrace.c > +++ b/arch/x86/kernel/ptrace.c > @@ -917,11 +917,14 @@ static int putreg32(struct task_struct *child, unsigned > regno, u32 value) > R32(edi, di); > R32(esi, si); > R32(ebp, bp); > - R32(eax, ax); > R32(eip, ip); > R32(esp, sp); > > case offsetof(struct user32, regs.orig_eax): ^^ damn, just in case, of course this is typo, should be case offsetof(struct user32, regs.eax); > + regs->ax = (long) (int) value; > + break; > + > + case offsetof(struct user32, regs.orig_eax): > /* >* Warning: bizarre corner case fixup here. A 32-bit >* debugger setting orig_eax to -1 wants to disable > diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c > index b3b98ff..41023f8 100644 > --- a/arch/x86/kernel/signal.c > +++ b/arch/x86/kernel/signal.c > @@ -710,7 +710,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) > /* Are we from a system call? */ > if (syscall_get_nr(current, regs) >= 0) { > /* If so, check system call restarting.. */ > - switch (syscall_get_error(current, regs)) { > + switch (syscall_get_return_value(current, regs)) { > case -ERESTART_RESTARTBLOCK: > case -ERESTARTNOHAND: > regs->ax = -EINTR; > @@ -813,7 +813,7 @@ void do_signal(struct pt_regs *regs) > /* Did we come from a system call? */ > if (syscall_get_nr(current, regs) >= 0) { > /* Restart the system call - no handlers present */ > - switch (syscall_get_error(current, regs)) { > + switch (syscall_get_return_value(current, regs)) { > case -ERESTARTNOHAND: > case -ERESTARTSYS: > case -ERESTARTNOINTR:
Re: syscall_get_error() && TS_ checks
On 03/29, Linus Torvalds wrote: > > On Wed, Mar 29, 2017 at 11:50 AM, Oleg Nesterov wrote: > > > > Again, afaics we only need these compat checks because regs->ax could be > > changed by 32-bit debugger without sign-extension. > > You don't explain how you were planning on *fixing* that code. You > know why it exists, but then you just say "let's remove it", without > any explanation of what you'd replace it with. Hmm. I tried to explain... Let me quote my initial email, So why we can't simply change putreg32() to always sign-extend regs->ax regs->orig_ax and just do static inline long syscall_get_error(struct task_struct *task, struct pt_regs *regs) { return regs-ax; } ? Or, better, simply kill it and use syscall_get_return_value() in arch/x86/kernel/signal.c. Of course, if the tracee is 64-bit and debugger is 32-bit then the unconditional sign-extend can be wrong, but do we really care about this case? This can't really work anyway. And the current code is not right too. Say, debugger nacks the signal which interrupted syscall and sets regs->ax = -ERESTARTSYS to force the restart, this won't work because TS_COMPAT|TS_I386_REGS_POKED are not set. In short. can the patch below work? Oleg. diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c index 9cc7d5a..96f21fc 100644 --- a/arch/x86/kernel/ptrace.c +++ b/arch/x86/kernel/ptrace.c @@ -917,11 +917,14 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 value) R32(edi, di); R32(esi, si); R32(ebp, bp); - R32(eax, ax); R32(eip, ip); R32(esp, sp); case offsetof(struct user32, regs.orig_eax): + regs->ax = (long) (int) value; + break; + + case offsetof(struct user32, regs.orig_eax): /* * Warning: bizarre corner case fixup here. A 32-bit * debugger setting orig_eax to -1 wants to disable diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c index b3b98ff..41023f8 100644 --- a/arch/x86/kernel/signal.c +++ b/arch/x86/kernel/signal.c @@ -710,7 +710,7 @@ handle_signal(struct ksignal *ksig, struct pt_regs *regs) /* Are we from a system call? */ if (syscall_get_nr(current, regs) >= 0) { /* If so, check system call restarting.. */ - switch (syscall_get_error(current, regs)) { + switch (syscall_get_return_value(current, regs)) { case -ERESTART_RESTARTBLOCK: case -ERESTARTNOHAND: regs->ax = -EINTR; @@ -813,7 +813,7 @@ void do_signal(struct pt_regs *regs) /* Did we come from a system call? */ if (syscall_get_nr(current, regs) >= 0) { /* Restart the system call - no handlers present */ - switch (syscall_get_error(current, regs)) { + switch (syscall_get_return_value(current, regs)) { case -ERESTARTNOHAND: case -ERESTARTSYS: case -ERESTARTNOINTR:
Re: syscall_get_error() && TS_ checks
On Wed, Mar 29, 2017 at 11:50 AM, Oleg Nesterov wrote: > > Again, afaics we only need these compat checks because regs->ax could be > changed by 32-bit debugger without sign-extension. You don't explain how you were planning on *fixing* that code. You know why it exists, but then you just say "let's remove it", without any explanation of what you'd replace it with. If your suggestion is just that "let's remove it, breaking the known reason it's there", I really really don't see the upside. It may be hacky, but it *works*. You seem to be advocating replacing it with something simpler - "cleaner, but broken". I really don't see the point of "cleaner, but broken". The fact is, reality is not "clean". But reality trumps :I wish" and "make-believe" every single time. Linus
Re: syscall_get_error() && TS_ checks
On 03/29, Linus Torvalds wrote: > > That said, I'm not sure why you want to change this in the first > place? I think the current syscall_get_error() - with explicit compat > handling and all - is fine. To simplify this logic. To kill TS_I386_REGS_POKED (which doesn't really work and can't) and to remove the subtle dependency on TS_COMPAT in ret- with-signal paths. Again, afaics we only need these compat checks because regs->ax could be changed by 32-bit debugger without sign-extension. And TS_I386_REGS_POKED means that if TS_COMPAT is not set, then the debugger should have also changed regs->orig_ax. This mostly works, but imo too fragile. Currently we have the same check in get_nr_restart_syscall() but it is even more broken (see the patch/changelog), so it should go away and in this case it would be nice to avoid these checks in do_signal() path too. Oleg.
Re: syscall_get_error() && TS_ checks
On Wed, Mar 29, 2017 at 10:04 AM, Oleg Nesterov wrote: > > Oh, I agree, and let me repeat the 3rd time that I suggest to kill this > helper and use syscall_get_return_value() in arch/x86/kernel/signal.c, > it has no other callers. That is probably fine, I'm just arguing against the suggested changes to syscall_get_error(). That said, I'm not sure why you want to change this in the first place? I think the current syscall_get_error() - with explicit compat handling and all - is fine. But if the aim is to just remove syscall_get_error() entirely because it's so unused, then I'm ok with that. Linus
Re: syscall_get_error() && TS_ checks
On 03/29, Linus Torvalds wrote: > > On Wed, Mar 29, 2017 at 9:55 AM, Oleg Nesterov wrote: > > > > Once again, it is only used in arch/x86/kernel/signal.c by do_signal() and > > handle_signal(). We do not care if mmap() returns a valid pointer with the > > high bit set, regs-ax can't be confused with -ERESTART code. > > Immaterial. If the function is called "get_error()", it sure as hell > shouldn't return a random non-error value. Oh, I agree, and let me repeat the 3rd time that I suggest to kill this helper and use syscall_get_return_value() in arch/x86/kernel/signal.c, it has no other callers. Oleg.
Re: syscall_get_error() && TS_ checks
On Wed, Mar 29, 2017 at 9:55 AM, Oleg Nesterov wrote: > > Once again, it is only used in arch/x86/kernel/signal.c by do_signal() and > handle_signal(). We do not care if mmap() returns a valid pointer with the > high bit set, regs-ax can't be confused with -ERESTART code. Immaterial. If the function is called "get_error()", it sure as hell shouldn't return a random non-error value. Code should make sense, otherwise it's not going to be maintainable. Naming matters. If the code doesn't match the name of the function, that's a bug regardless of whether it has semantic effects or not in the end - because somebody will eventually depend on the _expected_ semantics. Linus
Re: syscall_get_error() && TS_ checks
On Wed, Mar 29, 2017 at 9:45 AM, Linus Torvalds wrote: > On Wed, Mar 29, 2017 at 9:33 AM, Oleg Nesterov wrote: >> >> Firstly, why do we need the IS_ERR_VALUE() check? This is only used by >> do_signal/handle_signal, we do not care if it returns non-zero as long >> as the value can't be confused with -ERESTART.* codes. > > There are system calls that can return "negative" values that aren't errors. > > Notably mmap() can return a valid pointer with the high bit set. > > So syscall_get_error() should return 0 for not just positive return > values, but for those kinds of negative non-error values. > >> And why do we need the TS_ checks? > > Those may be bogus. > >> So why we can't simply change putreg32() to always sign-extend regs->ax >> regs->orig_ax and just do >> >> static inline long syscall_get_error(struct task_struct *task, >> struct pt_regs *regs) >> { >> return regs-ax; >> } > > That would be *complete* garbage. Lots of system calls return positive > values that sure as hell aren't errors. Does this cause an observable problem? The only things that care are: a) 32-bit debugger pokes some value with the high bit and a 64-bit debugger reads it back. I seriously doubt we care. b) 32-bit debugger pokes some value with the high bit set and the user code switches to 64-bit mode and reads RAX. This case is so terminally broken anyway that we definitely don't care. c) 32-bit debugger pokes some value with the high bit set and syscall_get_error happens. Oleg's proposed change won't change what we do, but it will dramatically simplify the code.
Re: syscall_get_error() && TS_ checks
On 03/29, Linus Torvalds wrote: > > On Wed, Mar 29, 2017 at 9:33 AM, Oleg Nesterov wrote: > > > > Firstly, why do we need the IS_ERR_VALUE() check? This is only used by > > do_signal/handle_signal, we do not care if it returns non-zero as long > > as the value can't be confused with -ERESTART.* codes. > > There are system calls that can return "negative" values that aren't errors. > > Notably mmap() can return a valid pointer with the high bit set. > > So syscall_get_error() should return 0 for not just positive return > values, but for those kinds of negative non-error values. Once again, it is only used in arch/x86/kernel/signal.c by do_signal() and handle_signal(). We do not care if mmap() returns a valid pointer with the high bit set, regs-ax can't be confused with -ERESTART code. > > And why do we need the TS_ checks? > > Those may be bogus. > > > So why we can't simply change putreg32() to always sign-extend regs->ax > > regs->orig_ax and just do > > > > static inline long syscall_get_error(struct task_struct *task, > > struct pt_regs *regs) > > { > > return regs-ax; > > } > > That would be *complete* garbage. Lots of system calls return positive > values that sure as hell aren't errors. See above. And please note that I actually suggest to kill this helper and just use syscall_get_return_value() in arch/x86/kernel/signal.c. Oleg.
Re: syscall_get_error() && TS_ checks
On Wed, Mar 29, 2017 at 9:33 AM, Oleg Nesterov wrote: > > Firstly, why do we need the IS_ERR_VALUE() check? This is only used by > do_signal/handle_signal, we do not care if it returns non-zero as long > as the value can't be confused with -ERESTART.* codes. There are system calls that can return "negative" values that aren't errors. Notably mmap() can return a valid pointer with the high bit set. So syscall_get_error() should return 0 for not just positive return values, but for those kinds of negative non-error values. > And why do we need the TS_ checks? Those may be bogus. > So why we can't simply change putreg32() to always sign-extend regs->ax > regs->orig_ax and just do > > static inline long syscall_get_error(struct task_struct *task, > struct pt_regs *regs) > { > return regs-ax; > } That would be *complete* garbage. Lots of system calls return positive values that sure as hell aren't errors. Linus