Re: Alignment in BPF verifier
From: Daniel BorkmannDate: Tue, 23 May 2017 23:27:20 +0200 > On 05/23/2017 09:45 PM, Alexei Starovoitov wrote: >> On 5/23/17 7:41 AM, Edward Cree wrote: >>> Hmm, that means that we can't do arithmetic on a >>> PTR_TO_MAP_VALUE_OR_NULL, we have to convert it to a PTR_TO_MAP_VALUE >>> first by NULL-checking it. That's probably fine, but I can just about >>> imagine some compiler optimisation reordering them. Any reason not to >>> split this out into a different reg->field, rather than overloading >>> id? >> >> 'id' is sort of like 'version' of a pointer and has the same meaning >> in >> both cases. How exactly do you see this split? > > Also, same id is never reused once generated and later propagated > through regs. So far we haven't run into this kind of optimization > from llvm side yet, but others which led to requiring the id marker > (see 57a09bf0a416). I could imagine it might be needed at some point, > though where we later transition directly to PTR_TO_MAP_VALUE_ADJ > after NULL check. Out of curiosity, did you run into it with llvm? We could handle this issue in find_good_pkt_pointers(), nothing prevents us from advancing state there for cases like Edward notes above.
Re: Alignment in BPF verifier
On 5/24/17 6:46 AM, Edward Cree wrote: On 23/05/17 22:27, Daniel Borkmann wrote: On 05/23/2017 09:45 PM, Alexei Starovoitov wrote: On 5/23/17 7:41 AM, Edward Cree wrote: Hmm, that means that we can't do arithmetic on a PTR_TO_MAP_VALUE_OR_NULL, we have to convert it to a PTR_TO_MAP_VALUE first by NULL-checking it. That's probably fine, but I can just about imagine some compiler optimisation reordering them. Any reason not to split this out into a different reg->field, rather than overloading id? 'id' is sort of like 'version' of a pointer and has the same meaning in both cases. How exactly do you see this split? I was thinking there would be reg->id and reg->map_id. Both could share the env->id_gen, since that's not likely to run out, but they'd be separate fields so that a PTR_TO_MAP_VALUE_OR_NULL could say "this is either map_value plus a 4-byte-aligned offset less than 24, or NULL plus that same offset", and then if another pointer with the same map_id and no variable-offset part was NULL-checked, we could convert both pointers to PTR_TO_MAP_VALUE. (I'm getting rid of PTR_TO_MAP_VALUE_ADJ in my patch, along with several other types, by taking the 'we have an offset' part out of the bpf_reg_type.) got it. makes sense. So far we haven't run into this kind of optimization from llvm side yet[...] Out of curiosity, did you run into it with llvm? No, purely theoretical. I haven't even built/installed llvm yet, I'm just working with the bytecode in test_verifier.c for now. I'm merely trying to not have restrictions that are unnecessary; but since allowing this kind of construct would take a non-zero amount of work, I'll file it for later. modern fedora/ubuntu come with llvm that has bpf backend by default.
Re: Alignment in BPF verifier
On 23/05/17 22:27, Daniel Borkmann wrote: > On 05/23/2017 09:45 PM, Alexei Starovoitov wrote: >> On 5/23/17 7:41 AM, Edward Cree wrote: >>> Hmm, that means that we can't do arithmetic on a >>> PTR_TO_MAP_VALUE_OR_NULL, we have to convert it to a PTR_TO_MAP_VALUE >>> first by NULL-checking it. That's probably fine, but I can just about >>> imagine some compiler optimisation reordering them. Any reason not to >>> split this out into a different reg->field, rather than overloading id? >> >> 'id' is sort of like 'version' of a pointer and has the same meaning in >> both cases. How exactly do you see this split? I was thinking there would be reg->id and reg->map_id. Both could share the env->id_gen, since that's not likely to run out, but they'd be separate fields so that a PTR_TO_MAP_VALUE_OR_NULL could say "this is either map_value plus a 4-byte-aligned offset less than 24, or NULL plus that same offset", and then if another pointer with the same map_id and no variable-offset part was NULL-checked, we could convert both pointers to PTR_TO_MAP_VALUE. (I'm getting rid of PTR_TO_MAP_VALUE_ADJ in my patch, along with several other types, by taking the 'we have an offset' part out of the bpf_reg_type.) > So far we haven't run into this kind of optimization > from llvm side yet[...] Out of curiosity, did you run into it with llvm? No, purely theoretical. I haven't even built/installed llvm yet, I'm just working with the bytecode in test_verifier.c for now. I'm merely trying to not have restrictions that are unnecessary; but since allowing this kind of construct would take a non-zero amount of work, I'll file it for later. -Ed
Re: Alignment in BPF verifier
On 5/23/17 10:43 AM, Edward Cree wrote: Another issue: it looks like the min/max_value handling for subtraction is bogus. In adjust_reg_min_max_vals() we have if (dst_reg->min_value != BPF_REGISTER_MIN_RANGE) dst_reg->min_value -= min_val; if (dst_reg->max_value != BPF_REGISTER_MAX_RANGE) dst_reg->max_value -= max_val; where min_val and max_val refer to the src_reg. But surely they should be used the other way round; if (say) 2 <= R1 <= 6 and 1 <= R2 <= 4, then this will claim 1 <= (R1 - R2) <= 2, whereas really (R1 - R2) could be anything from -2 to 5. This also means that the code just above the switch, if (min_val == BPF_REGISTER_MIN_RANGE) dst_reg->min_value = BPF_REGISTER_MIN_RANGE; if (max_val == BPF_REGISTER_MAX_RANGE) dst_reg->max_value = BPF_REGISTER_MAX_RANGE; is wrong, since e.g. subtracting MAX_RANGE needs to blow our min_value, not our max_value. right. good catch. I have a feeling we discussed similar thing before. May be some patch felt through the cracks. That's the reason the fancy verifier analysis is root only. I'm assuming you're going to send a fix? Thanks!
Re: Alignment in BPF verifier
On 05/23/2017 09:45 PM, Alexei Starovoitov wrote: On 5/23/17 7:41 AM, Edward Cree wrote: I'm still plugging away at this... it's going to be quite a big patch and rewrite a lot of stuff (and I'm not sure I'll be able to break it into smaller bisectable patches). And of course I have more questions. In check_packet_ptr_add(), we forbid adding a negative constant to a packet ptr. Is there some principled reason for that, or is it just because the bounds checking is hard? It seems like, if imm + reg->off > 0 (suitably carefully checked to avoid overflow etc.), then the subtraction should be legal. Indeed, even if the reg->off (fixed part of offset) is zero, if the variable part is known (min_value) to be >= -imm, the subtraction should be safe. adding negative imm to pkt_ptr is ok, but what is the use case? Do you see llvm generating such code? Btw, currently, you kind of have it in a limited way via BPF_STX_MEM() and BPF_LDX_MEM() when accessing pkt data through the offset, which can be negative on the insn level. I think if we try to track everything with the current shape of state pruning, the verifier will stop accepting old programs because it reaches complexity limit. I think we need to rearchitect the whole thing. I was thinking of doing it compiler-style. Convert to ssa and do traditional data flow analysis, use-def chains, register liveness then pruning heuristics won't be necessary and verifier should be able to check everything in more or less single pass. Things like register liveness can be done without ssa. It can be used to augment existing pruning, since it will know which registers are dead, so they don't have to be compared, but it feels half-way. I'd rather go all the way. On 20/05/17 00:05, Daniel Borkmann wrote: Besides PTR_TO_PACKET also PTR_TO_MAP_VALUE_OR_NULL uses it to track all registers (incl. spilled ones) with the same reg->id that originated from the same map lookup. After the reg type is then migrated to either PTR_TO_MAP_VALUE (resp. CONST_PTR_TO_MAP for map in map) or UNKNOWN_VALUE depending on the branch, the reg->id is then reset to 0 again. Whole reason for this is that LLVM generates code where it can move and/or spill a reg of type PTR_TO_MAP_VALUE_OR_NULL to other regs before we do the NULL test on it, and later on it expects that the spilled or moved regs work wrt access. So they're marked with an id and then all of them are type migrated. So here meaning of reg->id is different than in PTR_TO_PACKET case. Hmm, that means that we can't do arithmetic on a PTR_TO_MAP_VALUE_OR_NULL, we have to convert it to a PTR_TO_MAP_VALUE first by NULL-checking it. That's probably fine, but I can just about imagine some compiler optimisation reordering them. Any reason not to split this out into a different reg->field, rather than overloading id? 'id' is sort of like 'version' of a pointer and has the same meaning in both cases. How exactly do you see this split? Also, same id is never reused once generated and later propagated through regs. So far we haven't run into this kind of optimization from llvm side yet, but others which led to requiring the id marker (see 57a09bf0a416). I could imagine it might be needed at some point, though where we later transition directly to PTR_TO_MAP_VALUE_ADJ after NULL check. Out of curiosity, did you run into it with llvm? Of course that would need (more) caution wrt. states_equal(), but it looks like I'll be mangling that a lot anyway - for instance, we don't want to just use memcmp() to compare alignments, we want to check that our alignment is stricter than the old alignment. (Of course memcmp() is a conservative check, so the "memcmp() the whole reg_state" fast path can remain.) yes. that would be good improvement. Not sure how much it will help the pruning though.
Re: Alignment in BPF verifier
On 5/23/17 7:41 AM, Edward Cree wrote: I'm still plugging away at this... it's going to be quite a big patch and rewrite a lot of stuff (and I'm not sure I'll be able to break it into smaller bisectable patches). And of course I have more questions. In check_packet_ptr_add(), we forbid adding a negative constant to a packet ptr. Is there some principled reason for that, or is it just because the bounds checking is hard? It seems like, if imm + reg->off > 0 (suitably carefully checked to avoid overflow etc.), then the subtraction should be legal. Indeed, even if the reg->off (fixed part of offset) is zero, if the variable part is known (min_value) to be >= -imm, the subtraction should be safe. adding negative imm to pkt_ptr is ok, but what is the use case? Do you see llvm generating such code? I think if we try to track everything with the current shape of state pruning, the verifier will stop accepting old programs because it reaches complexity limit. I think we need to rearchitect the whole thing. I was thinking of doing it compiler-style. Convert to ssa and do traditional data flow analysis, use-def chains, register liveness then pruning heuristics won't be necessary and verifier should be able to check everything in more or less single pass. Things like register liveness can be done without ssa. It can be used to augment existing pruning, since it will know which registers are dead, so they don't have to be compared, but it feels half-way. I'd rather go all the way. On 20/05/17 00:05, Daniel Borkmann wrote: Besides PTR_TO_PACKET also PTR_TO_MAP_VALUE_OR_NULL uses it to track all registers (incl. spilled ones) with the same reg->id that originated from the same map lookup. After the reg type is then migrated to either PTR_TO_MAP_VALUE (resp. CONST_PTR_TO_MAP for map in map) or UNKNOWN_VALUE depending on the branch, the reg->id is then reset to 0 again. Whole reason for this is that LLVM generates code where it can move and/or spill a reg of type PTR_TO_MAP_VALUE_OR_NULL to other regs before we do the NULL test on it, and later on it expects that the spilled or moved regs work wrt access. So they're marked with an id and then all of them are type migrated. So here meaning of reg->id is different than in PTR_TO_PACKET case. Hmm, that means that we can't do arithmetic on a PTR_TO_MAP_VALUE_OR_NULL, we have to convert it to a PTR_TO_MAP_VALUE first by NULL-checking it. That's probably fine, but I can just about imagine some compiler optimisation reordering them. Any reason not to split this out into a different reg->field, rather than overloading id? 'id' is sort of like 'version' of a pointer and has the same meaning in both cases. How exactly do you see this split? Of course that would need (more) caution wrt. states_equal(), but it looks like I'll be mangling that a lot anyway - for instance, we don't want to just use memcmp() to compare alignments, we want to check that our alignment is stricter than the old alignment. (Of course memcmp() is a conservative check, so the "memcmp() the whole reg_state" fast path can remain.) yes. that would be good improvement. Not sure how much it will help the pruning though.
Re: Alignment in BPF verifier
Another issue: it looks like the min/max_value handling for subtraction is bogus. In adjust_reg_min_max_vals() we have if (dst_reg->min_value != BPF_REGISTER_MIN_RANGE) dst_reg->min_value -= min_val; if (dst_reg->max_value != BPF_REGISTER_MAX_RANGE) dst_reg->max_value -= max_val; where min_val and max_val refer to the src_reg. But surely they should be used the other way round; if (say) 2 <= R1 <= 6 and 1 <= R2 <= 4, then this will claim 1 <= (R1 - R2) <= 2, whereas really (R1 - R2) could be anything from -2 to 5. This also means that the code just above the switch, if (min_val == BPF_REGISTER_MIN_RANGE) dst_reg->min_value = BPF_REGISTER_MIN_RANGE; if (max_val == BPF_REGISTER_MAX_RANGE) dst_reg->max_value = BPF_REGISTER_MAX_RANGE; is wrong, since e.g. subtracting MAX_RANGE needs to blow our min_value, not our max_value. -Ed
Re: Alignment in BPF verifier
I'm still plugging away at this... it's going to be quite a big patch and rewrite a lot of stuff (and I'm not sure I'll be able to break it into smaller bisectable patches). And of course I have more questions. In check_packet_ptr_add(), we forbid adding a negative constant to a packet ptr. Is there some principled reason for that, or is it just because the bounds checking is hard? It seems like, if imm + reg->off > 0 (suitably carefully checked to avoid overflow etc.), then the subtraction should be legal. Indeed, even if the reg->off (fixed part of offset) is zero, if the variable part is known (min_value) to be >= -imm, the subtraction should be safe. On 20/05/17 00:05, Daniel Borkmann wrote: > Besides PTR_TO_PACKET also PTR_TO_MAP_VALUE_OR_NULL uses it to > track all registers (incl. spilled ones) with the same reg->id > that originated from the same map lookup. After the reg type is > then migrated to either PTR_TO_MAP_VALUE (resp. CONST_PTR_TO_MAP > for map in map) or UNKNOWN_VALUE depending on the branch, the > reg->id is then reset to 0 again. Whole reason for this is that > LLVM generates code where it can move and/or spill a reg of type > PTR_TO_MAP_VALUE_OR_NULL to other regs before we do the NULL > test on it, and later on it expects that the spilled or moved > regs work wrt access. So they're marked with an id and then all > of them are type migrated. So here meaning of reg->id is different > than in PTR_TO_PACKET case. Hmm, that means that we can't do arithmetic on a PTR_TO_MAP_VALUE_OR_NULL, we have to convert it to a PTR_TO_MAP_VALUE first by NULL-checking it. That's probably fine, but I can just about imagine some compiler optimisation reordering them. Any reason not to split this out into a different reg->field, rather than overloading id? Of course that would need (more) caution wrt. states_equal(), but it looks like I'll be mangling that a lot anyway - for instance, we don't want to just use memcmp() to compare alignments, we want to check that our alignment is stricter than the old alignment. (Of course memcmp() is a conservative check, so the "memcmp() the whole reg_state" fast path can remain.) -Ed
Re: Alignment in BPF verifier
On 05/19/2017 10:39 PM, David Miller wrote: From: Edward CreeDate: Fri, 19 May 2017 21:00:13 +0100 Well, I've managed to get somewhat confused by reg->id. In particular, I'm unsure which bpf_reg_types can have an id, and what exactly it means. There seems to be some code that checks around map value pointers, which seems strange as maps have fixed sizes (and the comments in enum bpf_reg_type make it seem like id is a PTR_TO_PACKET thing) - is this Besides PTR_TO_PACKET also PTR_TO_MAP_VALUE_OR_NULL uses it to track all registers (incl. spilled ones) with the same reg->id that originated from the same map lookup. After the reg type is then migrated to either PTR_TO_MAP_VALUE (resp. CONST_PTR_TO_MAP for map in map) or UNKNOWN_VALUE depending on the branch, the reg->id is then reset to 0 again. Whole reason for this is that LLVM generates code where it can move and/or spill a reg of type PTR_TO_MAP_VALUE_OR_NULL to other regs before we do the NULL test on it, and later on it expects that the spilled or moved regs work wrt access. So they're marked with an id and then all of them are type migrated. So here meaning of reg->id is different than in PTR_TO_PACKET case. Example: 0: (b7) r1 = 10 1: (7b) *(u64 *)(r10 -8) = r1 2: (bf) r2 = r10 3: (07) r2 += -8 4: (18) r1 = 0x59c0 6: (85) call 1 //map lookup 7: (bf) r4 = r0 8: (15) if r0 == 0x0 goto pc+1 R0=map_value(ks=8,vs=8) R4=map_value_or_null(ks=8,vs=8) R10=fp 9: (7a) *(u64 *)(r4 +0) = 0 maybe because of map-of-maps support, can the contained maps have differing element sizes? Or do we allow *(map_value + var + imm), if map_value + var was appropriately bounds-checked? Does the 'id' identify the variable that was added to an object pointer, or the object itself? Or does it blur these and identify (what the comment in enum bpf_reg_type calls) "skb->data + (u16) var"? The reg->id value changes any time a variable gets added to a packet pointer. You will also notice right now that only packet pointers have their alignment tracked. I have changes pending that will do that for MAP pointers too, but it needs more work.
Re: Alignment in BPF verifier
From: Edward CreeDate: Fri, 19 May 2017 21:00:13 +0100 > Here's what I'm thinking of doing: > struct bpf_reg_state { > enum bpf_reg_type type; > union { > /* valid when type == PTR_TO_PACKET */ > u16 range; > > /* valid when type == CONST_PTR_TO_MAP | PTR_TO_MAP_VALUE | > * PTR_TO_MAP_VALUE_OR_NULL > */ > struct bpf_map *map_ptr; > }; > /* Used to find other pointers with the same variable base, so they > * can share range and align knowledge. > */ > u32 id; > u32 off; /* fixed part of pointer offset */ > /* For scalar types (CONST_IMM | UNKNOWN_VALUE), this represents our > * knowledge of the actual value. > * For pointer types, this represents the variable part of the offset > * from the pointed-to object, and is shared with all bpf_reg_states > * with the same id as us. > */ > struct tnum align; > /* Used to determine if any memory access using this register will > * result in a bad access. These two fields must be last. > * See states_equal() > * These refer to the same value as align, not necessarily the actual > * contents of the register. > */ > s64 min_value; > u64 max_value; > }; Be very careful with the layout of bpf_reg_state. There are layout dependencies in the state pruning. Please take a look at states_equal(). It is walking the set of registers at two snapshot locations and trying to see if they are "equivalent". What's happening here is that the verifier makes a stack of all branch points in the program. On the first pass it analyzes the register values taking one of the two paths a branch takes. Then when it hits the end of the program, on that path, to BPF_EXIT it starts popping the entries on the stack. The naive implementation would pop each stack entry, and then traverse the other arm of the branch. But for programs with lots of branches this gets very expensive. So at each stack pop, the verifier tries to determine if it can skip traversing the branch's other path. And it does this by analyzing register state. The tests are basically: if (memcmp(rold, rcur, sizeof(*rold)) == 0) continue; exact equivalent, then we're fine. /* If the ranges were not the same, but everything else was and * we didn't do a variable access into a map then we are a-ok. */ if (!varlen_map_access && memcmp(rold, rcur, offsetofend(struct bpf_reg_state, id)) == 0) continue; We didn't do any variable MAP accesses, and everything in the register "up to and including member ID" is the same, we're fine. And then we drop down into some packet pointer specific tests to try and optimize things further. So you have to be careful what you place before and/or after 'id'. Probably we need to put the alignment stuff before 'id' so that it is considered by the offsetofend() length memcmp(). Hope that helps.
Re: Alignment in BPF verifier
From: Edward CreeDate: Fri, 19 May 2017 21:00:13 +0100 > Well, I've managed to get somewhat confused by reg->id. > In particular, I'm unsure which bpf_reg_types can have an id, and what > exactly it means. There seems to be some code that checks around map value > pointers, which seems strange as maps have fixed sizes (and the comments in > enum bpf_reg_type make it seem like id is a PTR_TO_PACKET thing) - is this > maybe because of map-of-maps support, can the contained maps have differing > element sizes? Or do we allow *(map_value + var + imm), if map_value + var > was appropriately bounds-checked? > > Does the 'id' identify the variable that was added to an object pointer, or > the object itself? Or does it blur these and identify (what the comment in > enum bpf_reg_type calls) "skb->data + (u16) var"? The reg->id value changes any time a variable gets added to a packet pointer. You will also notice right now that only packet pointers have their alignment tracked. I have changes pending that will do that for MAP pointers too, but it needs more work.