On Wed, May 28, 2014 at 2:01 PM, H. Peter Anvin <h...@linux.intel.com> wrote: > On 05/28/2014 01:47 PM, Andy Lutomirski wrote: >> On 05/28/2014 05:19 AM, Philipp Kern wrote: >>> audit_filter_syscall uses the syscall number to reference into a >>> bitmask (e->rule.mask[word]). Not removing the x32 bit before passing >>> the number to this architecture independent codepath will fail to >>> lookup the proper audit bit. Furthermore it will cause an invalid memory >>> access in the kernel if the out of bound location is not mapped: >>> >>> BUG: unable to handle kernel paging request at ffff8800e5446630 >>> IP: [<ffffffff810fcdd0>] audit_filter_syscall+0x90/0xf0 >>> >>> Together with the entrypoint in entry_64.S this change causes x32 >>> programs to pass in both AUDIT_ARCH_X86_64 and AUDIT_ARCH_I386 depending >>> on the syscall path. >>> >>> Cc: linux-kernel@vger.kernel.org >>> Cc: H. J. Lu <hjl.to...@gmail.com> >>> Cc: Eric Paris <epa...@redhat.com> >>> Signed-off-by: Philipp Kern <pk...@google.com> >>> --- >>> arch/x86/kernel/ptrace.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c >>> index 678c0ad..166a3c7 100644 >>> --- a/arch/x86/kernel/ptrace.c >>> +++ b/arch/x86/kernel/ptrace.c >>> @@ -1489,7 +1489,7 @@ long syscall_trace_enter(struct pt_regs *regs) >>> >>> if (IS_IA32) >>> audit_syscall_entry(AUDIT_ARCH_I386, >>> - regs->orig_ax, >>> + regs->orig_ax & __SYSCALL_MASK, >> >> This is weird. Three questions: >> >> 1. How can this happen? I thought that x32 syscalls always came in >> through the syscall path, which doesn't set is_compat_task. (Can >> someone rename is_compat_task to in_compat_syscall? Pretty please?) > > The SYSCALL path doesn't set TS_COMPAT, but is_compat_task() looks both > as TS_COMPAT and bit 30 of orig_ax. > > I think what is really needed here is IS_IA32 should use is_ia32_task() > instead, and *that* is the context we can mask off the x32 bit in at > all. However, does audit not need that information?
This is thoroughly fscked up. What *should* have happened is that there should have been an AUDIT_ARCH_X32. Unfortunately no one caught that in time, so the current enshrined ABI is that, as far as seccomp is concerned, x32 is AUDIT_ARCH_X86_64 (see syscall_get_arch) but nr has the x32 bit set. But the audit code is different: x32 is AUDIT_ARCH_I386 and the x32 bit is set. This may be changeable, since fixes to the audit ABI may not break anything. Can we just use syscall_get_arch to determine the token to pass to the audit code? If it makes everyone feel better, I think that every single architecture has screwed this up. We actually had to prevent seccomp and audit from being configured in on any OABI-supporting ARM kernel, and MIPS almost did the same thing that x32 did. We caught that one on time, and it's now fixed in Linus' tree. > > (And why the frakk does audit receive the first four syscall arguments? > Audit seems like the worst turd ever...) If you ever benchmark the performance impact of merely running auditd, you might have to upgrade that insult :-/ > >> 2. Now audit can't tell whether a syscall is x32 or i386. And audit is >> inconsistent with seccomp. This seems wrong. > > This is completely and totally bogus, indeed. I'd suggest just fixing the bug in auditsc.c. > >> 3. The OOPS you're fixing doesn't seem like it's fixed. What if some >> other random high bits are set? > > There is a range check in entry_*.S for the system call. I can imagine that causing a certain amount of confusion to fancy seccomp users. Oh, well. No one that I know of has complained yet. --Andy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/