Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
On Tue, Dec 01, 2020 at 02:21:45AM +0900, Masami Hiramatsu wrote: > Because it overruns the buffer. Maybe -E2BIG/ENODATA or any other > error (except for -EINVAL) is OK :) ENODATA it is. :) And propagating that error value is easy because the err_out: labels are all coming from the validate_next() error path so we basically know that the buffer has ended. ./insn_sanity: Success: decoded and checked 1 random instructions with 0 errors (seed:0x7bdfa56e) insn buffer: 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 supplied buf size: 15, ret 0 supplied buf size: 2, ret 0 supplied buf size: 3, ret 0 supplied buf size: 4, ret 0 insn_decode: entry insn_decode: will get_length insn_get_immediate: getting immediate insn_get_displacement: getting displacement insn_get_sib: getting sib insn_get_modrm: entry insn_get_opcode: entry insn_get_prefixes: entry, prefixes->got: 0 insn_get_prefixes: 1 insn_get_prefixes: REX insn_get_prefixes: VEX insn_get_prefixes: validate_next: 0 insn_get_prefixes: insn->next_byte: 0x7ffc211eb661, insn->end_kaddr: 0x7ffc211eb661 insn_get_prefixes: errored out supplied buf size: 1, ret -61 -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
On Mon, 30 Nov 2020 14:44:42 +0100 Borislav Petkov wrote: > On Sun, Nov 29, 2020 at 05:50:05PM +0900, Masami Hiramatsu wrote: > > Good point. I think we can return, e.g. -EFAULT if we failed in > > get_next(). Then, we can read out next page, for example. > > Why -EFAULT? Because it overruns the buffer. Maybe -E2BIG/ENODATA or any other error (except for -EINVAL) is OK :) > Running this > > size = 1; > ret = insn_decode(, b, size, INSN_MODE_64) > > i.e., buffer size is 1: > > ./arch/x86/tools/insn_sanity: Success: decoded and checked 1 random > instructions with 0 errors (seed:0x9994a137) > insn buffer: > 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 > supplied buf size: 15, ret 0 > supplied buf size: 2, ret 0 > supplied buf size: 3, ret 0 > supplied buf size: 4, ret 0 > insn_decode: entry > insn_decode: will get_length > insn_get_immediate: getting immediate > insn_get_displacement: getting displacement > insn_get_sib: getting sib > insn_get_modrm: entry > insn_get_opcode: entry > insn_get_prefixes: entry, prefixes->got: 0 > insn_get_prefixes: 1 > insn_get_prefixes: REX > insn_get_prefixes: VEX > insn_get_prefixes: validate_next: 0 > insn_get_prefixes: insn->next_byte: 0x7ffec297c3e1, insn->end_kaddr: > 0x7ffec297c3e1 > insn_get_prefixes: errored out > supplied buf size: 1, ret -22 > > is caught in validate_next() where ->next_byte == ->end_kaddr. > > I'm thinking we should return EOF here, to denote that we're reached the > end and then propagate that error up the callchain. EOF means the end of file and it is not an error. Here, since the decoder fails to decode the data, so it should return an error code. > > We don't have "define EOF" in the kernel but we can designate one for > the insn decoder, perhaps > > #define EOF -1 > > as stdio.h does: > > /* The value returned by fgetc and similar functions to indicate the >end of the file. */ > #define EOF (-1) It is because libc fgetc() returns -1 in any error case... > > Hmm, but then the callers would need to know EOF too so maybe EIO or > something. > > In any case, it should be a value which callers should be able to use to > get told that input buffer is truncated... Yes! Thank you, > > Thx. > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette -- Masami Hiramatsu
Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
On Sun, Nov 29, 2020 at 05:50:05PM +0900, Masami Hiramatsu wrote: > Good point. I think we can return, e.g. -EFAULT if we failed in > get_next(). Then, we can read out next page, for example. Why -EFAULT? Running this size = 1; ret = insn_decode(, b, size, INSN_MODE_64) i.e., buffer size is 1: ./arch/x86/tools/insn_sanity: Success: decoded and checked 1 random instructions with 0 errors (seed:0x9994a137) insn buffer: 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 supplied buf size: 15, ret 0 supplied buf size: 2, ret 0 supplied buf size: 3, ret 0 supplied buf size: 4, ret 0 insn_decode: entry insn_decode: will get_length insn_get_immediate: getting immediate insn_get_displacement: getting displacement insn_get_sib: getting sib insn_get_modrm: entry insn_get_opcode: entry insn_get_prefixes: entry, prefixes->got: 0 insn_get_prefixes: 1 insn_get_prefixes: REX insn_get_prefixes: VEX insn_get_prefixes: validate_next: 0 insn_get_prefixes: insn->next_byte: 0x7ffec297c3e1, insn->end_kaddr: 0x7ffec297c3e1 insn_get_prefixes: errored out supplied buf size: 1, ret -22 is caught in validate_next() where ->next_byte == ->end_kaddr. I'm thinking we should return EOF here, to denote that we're reached the end and then propagate that error up the callchain. We don't have "define EOF" in the kernel but we can designate one for the insn decoder, perhaps #define EOF -1 as stdio.h does: /* The value returned by fgetc and similar functions to indicate the end of the file. */ #define EOF (-1) Hmm, but then the callers would need to know EOF too so maybe EIO or something. In any case, it should be a value which callers should be able to use to get told that input buffer is truncated... Thx. -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
On Fri, 27 Nov 2020 09:45:39 -0800 Andy Lutomirski wrote: > On Tue, Nov 24, 2020 at 9:46 AM Borislav Petkov wrote: > > > > On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote: > > > In any case, at least the case where I give it > > > > > > 0x48 0xcf 0x48 0x83 > > > > > > and say that buf size is 4, should return an error because the second > > > insn is incomplete. So I need to go look at that now. > > > > Ok, got it: > > > > ./arch/x86/tools/insn_sanity: Success: decoded and checked 1 random > > instructions with 0 errors (seed:0x826fdf9c) > > insn buffer: > > 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 > > supplied buf size: 15, ret 0 > > supplied buf size: 2, ret 0 > > supplied buf size: 3, ret 0 > > supplied buf size: 4, ret 0 > > supplied buf size: 1, ret -22 > > > > the current decoder simply decodes the *first* insn in the buffer it > > encounters and that's it. > > > > When you give it a buffer of size smaller than the first instruction: > > > > supplied buf size: 1, ret -22 > > > > while the first insn is 2 bytes long: > > > > 0x48 0xcf (IRETQ) > > > > then it signals an error. > > > > Andy, does that work for your use cases? > > Is -22 (-EINVAL) the same error it returns if you pass in garbage? > How hard would it be to teach it to return a different error code when > the buffer is too small? > Good point. I think we can return, e.g. -EFAULT if we failed in get_next(). Then, we can read out next page, for example. Thank you, -- Masami Hiramatsu
Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
On Fri, Nov 27, 2020 at 09:45:39AM -0800, Andy Lutomirski wrote: > Is -22 (-EINVAL) the same error it returns if you pass in garbage? Define garbage. Yes, if you have a sequence of bytes which you can unambiguously determine to be - an invalid instruction in some of the tables - REX prefix with the wrong bits set - a byte says that some insn part like ModRM or SIB is following but the buffer falls short - ... other error condition then maybe you can say, yes, I'm looking at garbage and can error out right then and there. But you need to have enough bytes of information to determine that. For example (those are random bytes): 11ff <.asm_start>: 11ff: 95 xchg %eax,%ebp 1200: 14 60 adc$0x60,%al 1202: 77 74 ja 1278 <__libc_csu_init+0x28> 1204: 82 (bad) 1205: 67 dc 55 35 fcoml 0x35(%ebp) The 0x82 is usually in opcode group 1 but that opcode is invalid in 64-bit mode. So if this is a 64-bit executable, you know that that is an invalid insn. Another example: 18: a0 .byte 0xa0 19: 17 (bad) 1a: 27 (bad) 1b: ea (bad) 1c: 90 nop 1d: 90 nop 1e: 90 nop 1f: 90 nop 0xa0 is the opcode for MOV AL, moffset8 where moffset8 is an address-sized memory offset, which in 64-bit mode is 64-bit. But we have only 7 bytes after the 0xa0 thus we know that the buffer is truncated. If it had one byte more, it would be a valid insn: 18: a0 17 27 ea 90 90 90movabs 0x9090909090ea2717,%al 20: 90 90 I'm sure you get the idea: if you have enough unambiguous bits which tell you that this cannot be a valid insn, then you can return early from the decoder and signal that fact. I'm not sure that you can do that for all possible byte combinations and also I'm not sure that it won't ever happen that it per chance misinterprets garbage data as valid instructions. > How hard would it be to teach it to return a different error code when > the buffer is too small? Yap, see above. Unambiguous cases are clear but I don't know it would work in all cases. For example, let's say you give it a zeroed out buffer of 8 bytes which doesn't contain anything - you've just zeroed it out and haven't put any insns in there yet. But those are perfectly valid insns: 0: 00 00 add %al,(%rax) 2: 00 00 add %al,(%rax) 4: 00 00 add %al,(%rax) 6: 00 00 add %al,(%rax) So now you go about your merry way working with those although they're not real instructions which some tool generated. See what I mean? -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
On Tue, Nov 24, 2020 at 9:46 AM Borislav Petkov wrote: > > On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote: > > In any case, at least the case where I give it > > > > 0x48 0xcf 0x48 0x83 > > > > and say that buf size is 4, should return an error because the second > > insn is incomplete. So I need to go look at that now. > > Ok, got it: > > ./arch/x86/tools/insn_sanity: Success: decoded and checked 1 random > instructions with 0 errors (seed:0x826fdf9c) > insn buffer: > 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 > supplied buf size: 15, ret 0 > supplied buf size: 2, ret 0 > supplied buf size: 3, ret 0 > supplied buf size: 4, ret 0 > supplied buf size: 1, ret -22 > > the current decoder simply decodes the *first* insn in the buffer it > encounters and that's it. > > When you give it a buffer of size smaller than the first instruction: > > supplied buf size: 1, ret -22 > > while the first insn is 2 bytes long: > > 0x48 0xcf (IRETQ) > > then it signals an error. > > Andy, does that work for your use cases? Is -22 (-EINVAL) the same error it returns if you pass in garbage? How hard would it be to teach it to return a different error code when the buffer is too small? > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette -- Andy Lutomirski AMA Capital Management, LLC
Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
On Tue, 24 Nov 2020 18:46:47 +0100 Borislav Petkov wrote: > On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote: > > In any case, at least the case where I give it > > > > 0x48 0xcf 0x48 0x83 > > > > and say that buf size is 4, should return an error because the second > > insn is incomplete. So I need to go look at that now. > > Ok, got it: > > ./arch/x86/tools/insn_sanity: Success: decoded and checked 1 random > instructions with 0 errors (seed:0x826fdf9c) > insn buffer: > 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 > supplied buf size: 15, ret 0 > supplied buf size: 2, ret 0 > supplied buf size: 3, ret 0 > supplied buf size: 4, ret 0 > supplied buf size: 1, ret -22 > > the current decoder simply decodes the *first* insn in the buffer it > encounters and that's it. Yes, currently the buf_size is only for checking the maximum length of the buffer, because we expect the user doesn't know the actual length of the instruction before calling insn_get_length(). But yes, for the insn_sanity.c, the return length should be compared. Thank you, > > When you give it a buffer of size smaller than the first instruction: > > supplied buf size: 1, ret -22 > > while the first insn is 2 bytes long: > > 0x48 0xcf (IRETQ) > > then it signals an error. > > Andy, does that work for your use cases? > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette -- Masami Hiramatsu
Re: [RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
On Tue, Nov 24, 2020 at 11:19:33AM +0100, Borislav Petkov wrote: > In any case, at least the case where I give it > > 0x48 0xcf 0x48 0x83 > > and say that buf size is 4, should return an error because the second > insn is incomplete. So I need to go look at that now. Ok, got it: ./arch/x86/tools/insn_sanity: Success: decoded and checked 1 random instructions with 0 errors (seed:0x826fdf9c) insn buffer: 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 supplied buf size: 15, ret 0 supplied buf size: 2, ret 0 supplied buf size: 3, ret 0 supplied buf size: 4, ret 0 supplied buf size: 1, ret -22 the current decoder simply decodes the *first* insn in the buffer it encounters and that's it. When you give it a buffer of size smaller than the first instruction: supplied buf size: 1, ret -22 while the first insn is 2 bytes long: 0x48 0xcf (IRETQ) then it signals an error. Andy, does that work for your use cases? -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette
[RFC PATCH v0 00/19] x86/insn: Add an insn_decode() API
From: Borislav Petkov Hi, here's what I had in mind, finally split into proper patches. The final goal is for all users of the decoder to simply call insn_decode() and look at its retval. Simple. As to amluto's question about detecting partial insns, see the diff below. Running that gives: insn buffer: 0x48 0xcf 0x48 0x83 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 0x90 supplied buf size: 15, ret 0 supplied buf size: 2, ret 0 supplied buf size: 3, ret 0 supplied buf size: 4, ret 0 and the return value is always success. Which means that that buf_len that gets supplied to the decoder functions doesn't really work and I need to look into it. That is, provided this is how we want to control what the instruction decoder decodes - by supplying the length of the buffer it should look at. We could also say that probably there should be a way to say "decode only the first insn in the buffer and ignore the rest". That is all up to the use cases so I'm looking for suggestions here. In any case, at least the case where I give it 0x48 0xcf 0x48 0x83 and say that buf size is 4, should return an error because the second insn is incomplete. So I need to go look at that now. Thx. --- diff --git a/arch/x86/tools/insn_sanity.c b/arch/x86/tools/insn_sanity.c index 51309df285b4..41e1ae0cd833 100644 --- a/arch/x86/tools/insn_sanity.c +++ b/arch/x86/tools/insn_sanity.c @@ -220,6 +220,45 @@ static void parse_args(int argc, char **argv) } } +static void run_insn_limits_test(void) +{ + unsigned char b[MAX_INSN_SIZE]; + struct insn insn; + int ret, i, size; + + memset(b, INSN_NOP, MAX_INSN_SIZE); + + /* IRETQ */ + b[0] = 0x48; + b[1] = 0xcf; + + /* partial add $0x8, %rsp */ + b[2] = 0x48; + b[3] = 0x83; + + printf("insn buffer:\n"); + + for (i = 0; i < MAX_INSN_SIZE; i++) + printf("0x%hhx ", b[i]); + printf("\n"); + + size = MAX_INSN_SIZE; + ret = insn_decode(, b, size, INSN_MODE_64); + printf("supplied buf size: %d, ret %d\n", size, ret); + + size = 2; + ret = insn_decode(, b, size, INSN_MODE_64); + printf("supplied buf size: %d, ret %d\n", size, ret); + + size = 3; + ret = insn_decode(, b, size, INSN_MODE_64); + printf("supplied buf size: %d, ret %d\n", size, ret); + + size = 4; + ret = insn_decode(, b, size, INSN_MODE_64); + printf("supplied buf size: %d, ret %d\n", size, ret); +} + int main(int argc, char **argv) { int insns = 0, ret; @@ -265,5 +304,7 @@ int main(int argc, char **argv) errors, seed); + run_insn_limits_test(); + return errors ? 1 : 0; } Borislav Petkov (19): x86/insn: Rename insn_decode() to insn_decode_regs() x86/insn: Add @buf_len param to insn_init() kernel-doc comment x86/insn: Add an insn_decode() API x86/insn-eval: Handle return values from the decoder x86/boot/compressed/sev-es: Convert to insn_decode() perf/x86/intel/ds: Check insn_get_length() retval perf/x86/intel/ds: Check return values of insn decoder functions x86/alternative: Use insn_decode() x86/mce: Convert to insn_decode() x86/kprobes: Convert to insn_decode() x86/sev-es: Convert to insn_decode() x86/traps: Convert to insn_decode() x86/uprobes: Convert to insn_decode() x86/tools/insn_decoder_test: Convert to insn_decode() tools/objtool: Convert to insn_decode() x86/tools/insn_sanity: Convert to insn_decode() tools/perf: Convert to insn_decode() x86/insn: Remove kernel_insn_init() x86/insn: Make insn_complete() static arch/x86/boot/compressed/sev-es.c | 11 +- arch/x86/events/intel/ds.c| 4 +- arch/x86/events/intel/lbr.c | 10 +- arch/x86/include/asm/insn-eval.h | 4 +- arch/x86/include/asm/insn.h | 42 ++-- arch/x86/kernel/alternative.c | 6 +- arch/x86/kernel/cpu/mce/severity.c| 12 +- arch/x86/kernel/kprobes/core.c| 17 +- arch/x86/kernel/kprobes/opt.c | 9 +- arch/x86/kernel/sev-es.c | 15 +- arch/x86/kernel/traps.c | 7 +- arch/x86/kernel/umip.c| 2 +- arch/x86/kernel/uprobes.c | 8 +- arch/x86/lib/insn-eval.c | 20 +- arch/x86/lib/insn.c | 190 ++ arch/x86/tools/insn_decoder_test.c| 10 +- arch/x86/tools/insn_sanity.c | 8 +- tools/arch/x86/include/asm/insn.h | 42 ++-- tools/arch/x86/lib/insn.c | 190 ++ tools/include/linux/kconfig.h | 73 +++ tools/objtool/arch/x86/decode.c | 9 +- tools/perf/arch/x86/tests/insn-x86.c | 9 +- tools/perf/arch/x86/util/archinsn.c | 9 +-