Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 5/19/17 4:16 PM, David Miller wrote:
> From: Alexei Starovoitov
> Date: Fri, 19 May 2017 14:37:56 -0700
>> On 5/19/17 1:41 PM, David Miller wrote:
>>> From: Edward Cree
>>> Date: Fri, 19 May 2017 18:17:42 +0100
>>>> One question: is there a way to build the verifier as userland code
>>>> (or at least as a module), or will I have to reboot every time I
>>>> want to test a change?
>>>
>>> There currently is no such mechanism, you will have to reboot every
>>> time.
>>>
>>> I have considered working on making the code buildable outside of the
>>> kernel. It shouldn't be too hard.
>>
>> it's not hard.
>> We did it twice and both times abandoned.
>> First time to have 'user space verifier' to check programs before
>> loading and second time for fuzzing via llvm.
>> Abandoned since it diverges very quickly from kernel.
>
> Well, my idea was to create an environment in which kernel verifier.c
> could be built as-is. Maybe there would be some small compromises in
> verifier.c such as an ifdef test or two, but that should be it.

that's exactly what we did the first time. Added few ifdef to verifier.c
Second time we went even further by compiling kernel/bpf/verifier.c
as-is and linking everything magically via user space hooks all the way
that test_verifier.c runs as-is but calling bpf_check() function that
was compiled for user space via clang.
That code is here:
https://github.com/iovisor/bpf-fuzzer
It's definitely possible to refresh it and make it work again.
My point that unless we put such 'lets build verifier.c for user space'
as part of tools/testing/selftests/ or something, such project is
destined to bit rot.
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
From: Alexei Starovoitov
Date: Fri, 19 May 2017 14:37:56 -0700

> On 5/19/17 1:41 PM, David Miller wrote:
>> From: Edward Cree
>> Date: Fri, 19 May 2017 18:17:42 +0100
>>
>>> One question: is there a way to build the verifier as userland code
>>> (or at least as a module), or will I have to reboot every time I
>>> want to test a change?
>>
>> There currently is no such mechanism, you will have to reboot every
>> time.
>>
>> I have considered working on making the code buildable outside of the
>> kernel. It shouldn't be too hard.
>
> it's not hard.
> We did it twice and both times abandoned.
> First time to have 'user space verifier' to check programs before
> loading and second time for fuzzing via llvm.
> Abandoned since it diverges very quickly from kernel.

Well, my idea was to create an environment in which kernel verifier.c
could be built as-is. Maybe there would be some small compromises in
verifier.c such as an ifdef test or two, but that should be it.

It really is just a piece of what amounts to compiler infrastructure
and not very kernel specific.
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 5/19/17 1:41 PM, David Miller wrote:
> From: Edward Cree
> Date: Fri, 19 May 2017 18:17:42 +0100
>> One question: is there a way to build the verifier as userland code
>> (or at least as a module), or will I have to reboot every time I
>> want to test a change?
>
> There currently is no such mechanism, you will have to reboot every
> time.
>
> I have considered working on making the code buildable outside of the
> kernel. It shouldn't be too hard.

it's not hard.
We did it twice and both times abandoned.
First time to have 'user space verifier' to check programs before
loading and second time for fuzzing via llvm.
Abandoned since it diverges very quickly from kernel.
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
From: Edward Cree
Date: Fri, 19 May 2017 18:17:42 +0100

> One question: is there a way to build the verifier as userland code
> (or at least as a module), or will I have to reboot every time I
> want to test a change?

There currently is no such mechanism, you will have to reboot every
time.

I have considered working on making the code buildable outside of the
kernel. It shouldn't be too hard.
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 19/05/17 15:55, Alexei Starovoitov wrote:
> On 5/19/17 7:21 AM, Edward Cree wrote:
>> I'm currently translating the algos to C. But for the kernel patch,
>> I'll need to read & understand the existing verifier code, so it
>> might take a while :) (I don't suppose there's any design document
>> or hacking-notes you could point me at?)
>
> Dave just gave a good overview of the verifier:
> https://www.spinics.net/lists/xdp-newbies/msg00185.html
>
> Few more details in
> Documentation/networking/filter.txt
> and in top comments in kernel/bpf/verifier.c
>
> General bpf arch overview:
> http://docs.cilium.io/en/latest/bpf/#instruction-set

Thanks for the links! I'm digging into implementation now.
One question: is there a way to build the verifier as userland code
(or at least as a module), or will I have to reboot every time I
want to test a change?

-Ed
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 5/19/17 7:21 AM, Edward Cree wrote:
> I'm currently translating the algos to C. But for the kernel patch,
> I'll need to read & understand the existing verifier code, so it
> might take a while :) (I don't suppose there's any design document
> or hacking-notes you could point me at?)

Dave just gave a good overview of the verifier:
https://www.spinics.net/lists/xdp-newbies/msg00185.html

Few more details in
Documentation/networking/filter.txt
and in top comments in kernel/bpf/verifier.c

General bpf arch overview:
http://docs.cilium.io/en/latest/bpf/#instruction-set

> But I'll give it a go for sure.

Thanks!
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 19/05/17 02:22, Alexei Starovoitov wrote:
> In your .py I'd only change __str__(self) to print them in mask,value
> as the order they're passed into constructor to make it easier to read.

Actually I was going to go the other way and change the ctor to take
value,mask. But I agree they're inconsistent right now.

> this mul algo I don't completely understand. It feels correct,
> but I'm not sure we really need it for the kernel.

You're probably right; I was just driven by a completionist desire to
cover everything I could.

> What I love about the whole thing that it works for access into
> packet, access into map values and in the future for any other
> variable length access.

Sure, but don't start thinking it subsumes all the other checks. We
will still need e.g. max/min tracking, because packet length isn't
always a power of 2.

> Are you planning to work on the kernel patch for this algo?
> Once we have it the verifier will be smarter regarding
> alignment tracking than any compiler i know :)

I'm currently translating the algos to C. But for the kernel patch,
I'll need to read & understand the existing verifier code, so it
might take a while :) (I don't suppose there's any design document
or hacking-notes you could point me at?)

But I'll give it a go for sure.

-Ed
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 5/18/17 9:38 AM, Edward Cree wrote:
> On 18/05/17 15:49, Edward Cree wrote:
>> Here's one idea that seemed to work when I did a couple of experiments:
>> let A = (a;am), B = (b;bm) where the m are the masks
>> Σ = am + bm + a + b
>> χ = Σ ^ (a + b) /* unknown carries */
>> μ = χ | am | bm /* mask of result */
>> then A + B = ((a + b) & ~μ; μ)
>> The idea is that we find which bits change between the case "all x
>> are 1" and "all x are 0", and those become xs too.
> https://gist.github.com/ecree-solarflare/0665d5b46c2d8d08de2377fbd527de8d

I played with it quite a bit trying to break it and have to agree that
the above algorithm works. At least for add and sub I think it's solid.
Still feels a bit magical, since it gave me better results than I could
envision for my test vectors.
In your .py I'd only change __str__(self) to print them in mask,value
as the order they're passed into constructor to make it easier to read.
The bin(self) output is the most useful, of course. We should carry it
into the kernel too for debugging.

> And now I've found a similar algorithm for subtraction, which (again)
> I can't prove but it seems to work.
> α = a + am - b
> β = a - b - bm
> χ = α ^ β
> μ = χ | am | bm
> then A - B = ((a - b) & ~μ; μ)
> Again we're effectively finding the max. and min. values, and XORing
> them to find unknown carries.
> Bitwise operations are easy, of course;
> /* By assumption, a & am == b & bm == 0 */
> A & B = (a & b; (a | am) & (b | bm) & ~(a & b))
> A | B = (a | b; (am | bm) & ~(a | b))
> /* It bothers me that & and | aren't symmetric, but I can't fix it */
> A ^ B = (a ^ b; am | bm)
> as are shifts by a constant (just shift 0s into both number and mask).
> Multiplication by a constant can be done by decomposing into shifts
> and adds; but it can also be done directly; here we find (a;am) * k.
> π = a * k
> γ = am * k
> then A * k = (π; 0) + (0; γ), for which we use our addition algo.
> Multiplication of two unknown values is a nightmare, as unknown bits
> can propagate all over the place. We can do a shift-add decomposition
> where the adds for unknown bits have all the 1s in the addend replaced
> with xs. A few experiments suggest that this works, regardless of the
> order of operands.
> We can slightly optimise this by handling all the 1 bits in one go;
> that is, for (a;am) * (b;bm) we first find (a;am) * b using our
> multiplication-by-a-constant algo above, then for each bit in bm we
> find (a;am) * bit and force all its nonzero bits to unknown; finally
> we add all our components.

this mul algo I don't completely understand. It feels correct,
but I'm not sure we really need it for the kernel.
For all practical cases llvm will likely emit shifts or sequence of
adds and shifts, so multiplies by crazy non-optimizable constant or
variable are rare and likely the end result is going to be outside of
packet boundary, so it will be rejected anyway and precise alignment
tracking doesn't matter much.

What I love about the whole thing that it works for access into
packet, access into map values and in the future for any other
variable length access.

> Don't even ask about division; that scrambles bits so hard that the

yeah screw div and mod. We have an option to disable div/mod altogether
under some new 'prog_flags', since it has this ugly 'div by 0' exception
path. We don't even have 'signed division' instruction and llvm errors
like:
  errs() << "Unsupport signed division for DAG: ";
  errs() << "Please convert to unsigned div/mod.\n";
and no one complained. It just means that division is extremely rare.

Are you planning to work on the kernel patch for this algo?
Once we have it the verifier will be smarter regarding
alignment tracking than any compiler i know :)
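The shift-add decomposition discussed above (per-bit partial products, with the addend for an unknown multiplier bit having every nonzero bit forced unknown) can be sketched in a few lines. This is my own Python, not code from the thread; the names `tnum_add`/`tnum_mul` and the 8-bit wordsize are mine:

```python
W = 0xff  # 8-bit words keep exhaustive checks cheap

def tnum_add(a, am, b, bm):
    """Addition algorithm from earlier in the thread: (a;am) + (b;bm)."""
    sigma = (am + bm + a + b) & W   # the sum with every unknown bit = 1
    chi = sigma ^ ((a + b) & W)     # carry bits that depend on unknowns
    mu = chi | am | bm              # unknown bits of the result
    return ((a + b) & ~mu & W, mu)

def tnum_mul(a, am, b, bm):
    """Shift-add decomposition: walk the bits of (a;am); a known-1 bit
    contributes B shifted, an unknown bit contributes B shifted with
    every nonzero bit forced unknown."""
    acc = (0, 0)
    shift = 0
    while a or am:
        if a & 1:
            acc = tnum_add(*acc, (b << shift) & W, (bm << shift) & W)
        elif am & 1:
            acc = tnum_add(*acc, 0, ((b | bm) << shift) & W)
        a >>= 1
        am >>= 1
        shift += 1
    return acc

# Brute-force check: x in {0..3} (i.e. (0;3)) times the constant 3.
# Every concrete product must agree with the result on its known bits.
v, mu = tnum_mul(0, 3, 3, 0)
for x in range(4):
    assert (x * 3) & ~mu & W == v
```

Note the unknown-bit case adds (0; (b|bm) << shift) rather than shifting the masks separately, which is what keeps carries sound when the multiplier bit might be 0 or 1.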
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
Implementations (still in Python for now) at
https://gist.github.com/ecree-solarflare/0665d5b46c2d8d08de2377fbd527de8d
(I left out division, because it's so weak.)

I still can't prove + and - are correct, but they've passed every test
case I've come up with so far. * seems pretty obviously correct as long
as the + it uses is. Bitwise ops and shifts are trivial to prove.

-Ed
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 18/05/17 15:49, Edward Cree wrote:
> Here's one idea that seemed to work when I did a couple of experiments:
> let A = (a;am), B = (b;bm) where the m are the masks
> Σ = am + bm + a + b
> χ = Σ ^ (a + b) /* unknown carries */
> μ = χ | am | bm /* mask of result */
> then A + B = ((a + b) & ~μ; μ)
>
> The idea is that we find which bits change between the case "all x are
> 1" and "all x are 0", and those become xs too.

And now I've found a similar algorithm for subtraction, which (again) I
can't prove but it seems to work.
α = a + am - b
β = a - b - bm
χ = α ^ β
μ = χ | am | bm
then A - B = ((a - b) & ~μ; μ)
Again we're effectively finding the max. and min. values, and XORing
them to find unknown carries.

Bitwise operations are easy, of course;
/* By assumption, a & am == b & bm == 0 */
A & B = (a & b; (a | am) & (b | bm) & ~(a & b))
A | B = (a | b; (am | bm) & ~(a | b))
/* It bothers me that & and | aren't symmetric, but I can't fix it */
A ^ B = (a ^ b; am | bm)
as are shifts by a constant (just shift 0s into both number and mask).

Multiplication by a constant can be done by decomposing into shifts and
adds; but it can also be done directly; here we find (a;am) * k.
π = a * k
γ = am * k
then A * k = (π; 0) + (0; γ), for which we use our addition algo.

Multiplication of two unknown values is a nightmare, as unknown bits
can propagate all over the place. We can do a shift-add decomposition
where the adds for unknown bits have all the 1s in the addend replaced
with xs. A few experiments suggest that this works, regardless of the
order of operands.
We can slightly optimise this by handling all the 1 bits in one go;
that is, for (a;am) * (b;bm) we first find (a;am) * b using our
multiplication-by-a-constant algo above, then for each bit in bm we
find (a;am) * bit and force all its nonzero bits to unknown; finally we
add all our components.

Don't even ask about division; that scrambles bits so hard that the
only thing you can say for sure is that the leading 0s in the numerator
stay 0 in the result. The only exception is divisions by a constant
which can be converted into a shift, or divisions of a constant by
another constant; if the numerator has any xs and the denominator has
more than one 1, everything to the right of the first x is totally
unknown in general.

-Ed
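The subtraction and bitwise rules described in this email are easy to transcribe and brute-force in a few lines. This is my own sketch (8-bit wordsize, helper names mine), not code from the thread:

```python
W = 0xff  # 8-bit words keep exhaustive checks cheap

def tnum_sub(a, am, b, bm):
    """(a;am) - (b;bm): XOR the extreme differences to find borrows."""
    alpha = (a + am - b) & W    # maximal a minus minimal b
    beta = (a - b - bm) & W     # minimal a minus maximal b
    chi = alpha ^ beta          # borrow bits that depend on unknowns
    mu = chi | am | bm          # unknown bits of the result
    return ((a - b) & ~mu & W, mu)

def tnum_and(a, am, b, bm):
    v = a & b
    return (v, (a | am) & (b | bm) & ~v)

def tnum_or(a, am, b, bm):
    v = a | b
    return (v, (am | bm) & ~v)

def concretes(v, m):
    """Every concrete value matching (v;m): unknowns take both values."""
    xs = [v]
    for i in range(8):
        if m & (1 << i):
            xs = [x | bit for x in xs for bit in (0, 1 << i)]
    return xs

# Every concrete result must agree with the computed (value, mask)
# on all known bits (bits clear in mu).
for op, ref in ((tnum_sub, lambda x, y: (x - y) & W),
                (tnum_and, lambda x, y: x & y),
                (tnum_or, lambda x, y: x | y)):
    v, mu = op(0b1000, 0b0110, 0b100, 0b011)
    for x in concretes(0b1000, 0b0110):
        for y in concretes(0b100, 0b011):
            assert ref(x, y) & ~mu == v
```

The test operands (1xx0 and 1xx) are my own; the loop checks that no concrete subtraction, AND, or OR ever contradicts a bit the tnum claims to know.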
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 18/05/17 03:48, Alexei Starovoitov wrote:
> Would it be easier to represent this logic via (mask_of_unknown, value)
> instead of (mask0, mask1) ?

Yes, I like this.

> As far as upper bits we can tweak the algorithm to eat into
> one or more bits of known bits due to carry.
> Like
> 00xx11 + 00xx11 = 0xxx10
> we will eat only one bit (second from left) and the highest bit
> is known to stay zero, since carry can only compromise 2nd from left.
> Such logic should work for sparse representation of unknown bits too
> Like:
> 10xx01xx10 +
> 01xx01xx00 =
> xxxx1xxx10
> both upper two bits would be unknown, but only one middle bit becomes
> unknown.

Yes, that is the behaviour we want. But it's unclear how to efficiently
compute it, without just iterating over the bits and computing carry
possibilities.

Here's one idea that seemed to work when I did a couple of experiments:
let A = (a;am), B = (b;bm) where the m are the masks
Σ = am + bm + a + b
χ = Σ ^ (a + b) /* unknown carries */
μ = χ | am | bm /* mask of result */
then A + B = ((a + b) & ~μ; μ)

The idea is that we find which bits change between the case "all x are
1" and "all x are 0", and those become xs too. But I'm not certain that
that's always going to cover all possible values in between. It worked
on the tests I came up with, and also your example above, but I can't
quite prove it'll always work.

-Ed
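For readers who want to poke at the addition rule in this email, here is a small self-checking sketch. It is my own Python, not part of the thread (the names `tnum_add` and `concretes`, and the 8-bit wordsize, are mine); it brute-forces all concrete values of each operand and confirms the result's known bits always agree:

```python
W = 0xff  # 8-bit words keep exhaustive checking cheap

def tnum_add(a, am, b, bm):
    """(a;am) + (b;bm), where am/bm flag the unknown bits of a/b."""
    sigma = (am + bm + a + b) & W   # the sum with every unknown bit = 1
    chi = sigma ^ ((a + b) & W)     # carry bits that depend on unknowns
    mu = chi | am | bm              # unknown bits of the result
    return ((a + b) & ~mu & W, mu)

def concretes(v, m):
    """Every concrete value matching (v;m): unknowns take both values."""
    xs = [v]
    for i in range(8):
        if m & (1 << i):
            xs = [x | bit for x in xs for bit in (0, 1 << i)]
    return xs

# Every concrete sum must agree with the computed result on all its
# known bits (bits clear in mu).
pairs = [((2, 0), (0, 0xfc)),                 # 2 + (x << 2)
         ((0b100, 0b011), (0b1000, 0b0110))]  # 1xx + 1xx0
for (a, am), (b, bm) in pairs:
    v, mu = tnum_add(a, am, b, bm)
    for x in concretes(a, am):
        for y in concretes(b, bm):
            assert ((x + y) & W) & ~mu == v
```

The second test pair is my own; the first reproduces the 2 + (x << 2) step of the worked example earlier in the thread.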
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 18/05/17 01:16, David Miller wrote:
> So, in C, addition (a += b) is something like:
>
> struct bpf_reg_bits {
>         u64 zero_bits;
>         u64 one_bits;
> };
>
> static void add_update_bits(struct bpf_reg_bits *a, struct bpf_reg_bits *b)
> {
>         u64 m_zeros, m_ones, m_all;
>
>         m_zeros = a->zero_bits ^ b->zero_bits;
>         m_ones = a->one_bits ^ b->one_bits;
>         m_all = m_zeros | m_ones;

No, this should be

        u64 m_a, m_b, m_all;

        m_a = a->zero_bits ^ a->one_bits; /* unknown bits in a */
        m_b = b->zero_bits ^ b->one_bits; /* unknown bits in b */
        m_all = m_a | m_b; /* unknown bits in result */

>         a->zero_bits = (a->zero_bits + b->zero_bits) | m_all;
>         a->one_bits = (a->one_bits + b->zero_bits) & ~m_all;
> }
>
> Then, is subtraction merely:
>
> static void sub_update_bits(struct bpf_reg_bits *a, struct bpf_reg_bits *b)
> {
>         u64 m_zeros, m_ones, m_all;
>
>         m_zeros = a->zero_bits ^ b->zero_bits;
>         m_ones = a->one_bits ^ b->one_bits;
>         m_all = m_zeros | m_ones;
>
>         a->zero_bits = (a->zero_bits - b->zero_bits) | m_all;
>         a->one_bits = (a->one_bits - b->zero_bits) & ~m_all;
> }
>
> Or is something different needed?

I suspect it's something different, just because I worry about what
carries will do. But I think Alexei's idea (mask and value) is better
anyway; at the least it's easier to think about.

-Ed
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 5/17/17 8:33 AM, Edward Cree wrote:
> On 17/05/17 15:00, Edward Cree wrote:
>> OTOH the 'track known 1s as well' might work in a nice generic way
>> and cover all bases, I'll have to experiment a bit with that.
>> -Ed
> So I did some experiments (in Python, script follows) and found that
> indeed this does appear to work, at least for addition and shifts.
> The idea is that we have a 0s mask and a 1s mask; for bits that are
> unknown, the 0s mask is set and the 1s mask is cleared. So a
> completely unknown variable has masks (~0, 0), then if you shift it
> left 2 you get (~3, 0) - just shift both masks. A constant x has
> masks (x, x) - all the 0s are known 0s and all the 1s are known 1s.
> Addition is a bit more complicated: we compute the 'unknown bits'
> mask, by XORing the 0s and 1s masks together, of each addend. Then
> we add the corresponding masks from each addend together, and force
> the 'unknown' bits to the appropriate values in each mask.
> So given (a1, b1) and (a2, b2), we compute m1 = a1 ^ b1,
> m2 = a2 ^ b2, and m = m1 | m2. Then a = (a1 + a2) | m, and
> b = (b1 + b2) & ~m.
> As a worked example, 2 + (x << 2) + 14:
> 2 => (2, 2) constant
> x => (~0, 0) unknown
> x << 2 => (~3, 0)
> 2 + (x << 2): add (2, 2) with (~3, 0)
> m1 = 0, m2 = ~3, m = ~3
> a = (2 + ~3) | ~3 = ~1 | ~3 = ~1
> b = (2 + 0) & ~~3 = 2 & 3 = 2
> so (~1, 2), which means "...xx10"
> now add 14: add (~1, 2) with (14, 14)
> m1 = ~3, m2 = 0, m = ~3
> a = (~1 + 14) | ~3 = 12 | ~3 = ~3
> b = (2 + 14) & ~~3 = 16 & 3 = 0
> so (~3, 0), which means "...xx00"
> and the result is 4-byte aligned.
> -Ed
> PS. Beware of bugs in the following code; I have only tested it, not
> proved it correct.
> --8<--
> #!/usr/bin/python2
>
> def cpl(x):
>     return (~x)&0xff
>
> class AlignedNumber(object):
>     def __init__(self, mask0=0xff, mask1=0):
>         """mask0 has 0s for bits known to be 0, 1s otherwise.
>         mask1 has 1s for bits known to be 1, 0s otherwise.
>         Thus a bit which is set in mask0 and cleared in mask1 is an
>         'unknown' bit, while a bit which is cleared in mask0 and set
>         in mask1 is a bug.
>         """
>         self.masks = (mask0 & 0xff, mask1 & 0xff)
>         self.validate()
>     @classmethod
>     def const(cls, value):
>         """All bits are known, so both masks equal value."""
>         return cls(value, value)
>     def validate(self):
>         # Check for bits 'known' to be both 0 and 1
>         assert not (cpl(self.masks[0]) & self.masks[1]), self.masks
>         # Check unknown bits don't follow known bits
>         assert self.mx | ((self.mx - 1) & 0xff) == 0xff, self.masks
>     def __str__(self):
>         return ':'.join(map(bin, self.masks))
>     def __lshift__(self, sh):
>         """Shift both masks left, low bits become 'known 0'"""
>         return self.__class__(self.masks[0] << sh, self.masks[1] << sh)
>     def __rshift__(self, sh):
>         """Shift 1s into mask0; high bits become 'unknown'. While
>         strictly speaking they may be known 0 if we're tracking the
>         full word and doing unsigned shifts, having known bits before
>         unknown bits breaks the addition code."""
>         return self.__class__(cpl(cpl(self.masks[0]) >> sh),
>                               self.masks[1] >> sh)
>     @property
>     def mx(self):
>         """Mask of unknown bits"""
>         return self.masks[0] ^ self.masks[1]
>     def __add__(self, other):
>         """OR the mx values together. Unknown bits could cause
>         carries, so we just assume that they can carry all the way to
>         the left (thus we keep our mx masks in the form 1...10...0).
>         Then, add our 0- and 1-masks, and force the bits of the
>         combined mx mask to the unknown state."""
>         if isinstance(other, int):
>             return self + AlignedNumber.const(other)
>         assert isinstance(other, AlignedNumber), other
>         m = self.mx | other.mx
>         return self.__class__((self.masks[0] + other.masks[0]) | m,
>                               (self.masks[1] + other.masks[1]) & cpl(m))
>     def is_aligned(self, bits):
>         """We are 2^n-aligned iff the bottom n bits are known-0."""
>         mask = (1 << bits) - 1
>         return not (self.masks[0] & mask)
>
> if __name__ == '__main__':
>     a = AlignedNumber.const(2)
>     b = AlignedNumber() << 2
>     c = AlignedNumber.const(14)
>     print a, b, c
>     print a + b, a + b + c
>     assert (a + b + c).is_aligned(2)

that bit was confusing to me.
is_aligned(2) checks that number is 4 byte aligned :)

Would it be easier to represent this logic via (mask_of_unknown, value)
instead of (mask0, mask1) ?
Where mask_of_unknown has 1s for unknown bits and 0 for known bits in
the value. Then arithmetic operations will be easier to visualize.
Like:
(mask1, val1) + (mask2, val2) = (mask1 | mask2, val1 + val2)

Here is your modified script:

#!/usr/bin/python2

class AlignedNumber(object):
    def __init__(self, mask=0xff, value=0):
        # mask has 1s for unknown bits, 0s for known bits in value
        self.masks = (mask & 0xff, value & 0xff)
    @classmethod
    def const(cls,
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 5/17/17 5:16 PM, David Miller wrote:
> BTW, we track something similar already in reg->imm which tracks how
> many high order bits are known to be cleared in the register. It is
> used to avoid potential overflow for packet pointer accesses. I bet
> we can subsume that into this facility as well.

yeah, reg->imm tracks zero upper bits in very simplistic way.
For alignment checking it seems we need to track lower zeros and ones.
If Edward's algorithm can be adopted to track both that would be
double win indeed.
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
From: Edward Cree
Date: Wed, 17 May 2017 16:33:27 +0100

> So I did some experiments (in Python, script follows) and found that
> indeed this does appear to work, at least for addition and shifts.
> The idea is that we have a 0s mask and a 1s mask; for bits that are
> unknown, the 0s mask is set and the 1s mask is cleared. So a
> completely unknown variable has masks (~0, 0), then if you shift it
> left 2 you get (~3, 0) - just shift both masks. A constant x has
> masks (x, x) - all the 0s are known 0s and all the 1s are known 1s.
> Addition is a bit more complicated: we compute the 'unknown bits'
> mask, by XORing the 0s and 1s masks together, of each addend. Then
> we add the corresponding masks from each addend together, and force
> the 'unknown' bits to the appropriate values in each mask.
> So given (a1, b1) and (a2, b2), we compute m1 = a1 ^ b1,
> m2 = a2 ^ b2, and m = m1 | m2. Then a = (a1 + a2) | m, and
> b = (b1 + b2) & ~m.
> As a worked example, 2 + (x << 2) + 14:
> 2 => (2, 2) constant
> x => (~0, 0) unknown
> x << 2 => (~3, 0)
> 2 + (x << 2): add (2, 2) with (~3, 0)
> m1 = 0, m2 = ~3, m = ~3
> a = (2 + ~3) | ~3 = ~1 | ~3 = ~1
> b = (2 + 0) & ~~3 = 2 & 3 = 2
> so (~1, 2), which means "...xx10"
> now add 14: add (~1, 2) with (14, 14)
> m1 = ~3, m2 = 0, m = ~3
> a = (~1 + 14) | ~3 = 12 | ~3 = ~3
> b = (2 + 14) & ~~3 = 16 & 3 = 0
> so (~3, 0), which means "...xx00"
> and the result is 4-byte aligned.

Ok, I'm starting to digest this. If it works it's a nice universal way
to handle alignment tracking, that's for sure.

So, in C, addition (a += b) is something like:

struct bpf_reg_bits {
        u64 zero_bits;
        u64 one_bits;
};

static void add_update_bits(struct bpf_reg_bits *a, struct bpf_reg_bits *b)
{
        u64 m_zeros, m_ones, m_all;

        m_zeros = a->zero_bits ^ b->zero_bits;
        m_ones = a->one_bits ^ b->one_bits;
        m_all = m_zeros | m_ones;

        a->zero_bits = (a->zero_bits + b->zero_bits) | m_all;
        a->one_bits = (a->one_bits + b->zero_bits) & ~m_all;
}

Then, is subtraction merely:

static void sub_update_bits(struct bpf_reg_bits *a, struct bpf_reg_bits *b)
{
        u64 m_zeros, m_ones, m_all;

        m_zeros = a->zero_bits ^ b->zero_bits;
        m_ones = a->one_bits ^ b->one_bits;
        m_all = m_zeros | m_ones;

        a->zero_bits = (a->zero_bits - b->zero_bits) | m_all;
        a->one_bits = (a->one_bits - b->zero_bits) & ~m_all;
}

Or is something different needed? What about multiplication? AND
should be easy too.

BTW, we track something similar already in reg->imm which tracks how
many high order bits are known to be cleared in the register. It is
used to avoid potential overflow for packet pointer accesses. I bet
we can subsume that into this facility as well.

Thanks for all of your suggestions so far.
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
From: Edward Cree
Date: Wed, 17 May 2017 18:00:03 +0100

> Did you see the algorithms I posted with two masks?

I'll take a look.
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 17/05/17 17:13, David Miller wrote:
> Both cases are common in real BPF programs. The offsets really are
> necessary. It's funny because initially I tried to implement this
> without the auxiliary offset and it simply doesn't work. :-)
>
> We always have to track when you've seen the offset that cancels out
> the NET_IP_ALIGN. And as stated it can occur both before and after
> variable offsets have been introduced.
>
> You have to catch both:
>
>        ptr += variable;
>        ptr += 14;
>
> and:
>
>        ptr += 14;
>        ptr += variable; /* align = 4 */
>
> And always see at the end that "NET_IP_ALIGN + offsets" will
> be properly 4 byte aligned.

Did you see the algorithms I posted with two masks? Effectively they
are tracking the 'quotient' and 'remainder', but in a way that might
be easier to manipulate (bit-twiddly etc.) than the aux offset. I
think they handle both the above cases, in a nicely general way.

-Ed
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
From: Edward Cree
Date: Wed, 17 May 2017 15:00:04 +0100

> On 16/05/17 23:53, Alexei Starovoitov wrote:
>> following this line of thinking it feels that it should be possible
>> to get rid of 'aux_off' and 'aux_off_align' and simplify the code.
>> I mean we can always do
>> dst_reg->min_align = min(dst_reg->min_align, src_reg->min_align);
>>
>> and don't use 'off' as part of alignment checks at all.
> Problem with this approach, of course, is that (say) NET_IP_ALIGN +
> sizeof(ethhdr) = 16 is muchly aligned, whereas if you turn all
> constants into alignments you think you're only 2-byte aligned.
> I think you have to track exact offsets when you can, and only turn
> into an alignment when you introduce a variable.
> Of course it can still be fooled by e.g. 2 + (x << 2) + 14, which it
> will think is only 2-aligned when really it's 4-aligned, but unless
> you want to start tracking 'bits known to be 1' as well as 'bits
> known to be 0', I think you just accept that alignment tracking
> isn't commutative. The obvious cases (ihl << 2 and so) will work
> when written the obvious way, unless the compiler does something
> perverse.
> OTOH the 'track known 1s as well' might work in a nice generic way
> and cover all bases, I'll have to experiment a bit with that.

Both cases are common in real BPF programs. The offsets really are
necessary. It's funny because initially I tried to implement this
without the auxiliary offset and it simply doesn't work. :-)

We always have to track when you've seen the offset that cancels out
the NET_IP_ALIGN. And as stated it can occur both before and after
variable offsets have been introduced.

You have to catch both:

       ptr += variable;
       ptr += 14;

and:

       ptr += 14;
       ptr += variable; /* align = 4 */

And always see at the end that "NET_IP_ALIGN + offsets" will
be properly 4 byte aligned.
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 17/05/17 15:00, Edward Cree wrote:
> OTOH the 'track known 1s as well' might work in a nice generic way
> and cover all bases, I'll have to experiment a bit with that.
>
> -Ed

So I did some experiments (in Python, script follows) and found that
indeed this does appear to work, at least for addition and shifts.
The idea is that we have a 0s mask and a 1s mask; for bits that are
unknown, the 0s mask is set and the 1s mask is cleared. So a
completely unknown variable has masks (~0, 0), then if you shift it
left 2 you get (~3, 0) - just shift both masks. A constant x has
masks (x, x) - all the 0s are known 0s and all the 1s are known 1s.
Addition is a bit more complicated: we compute the 'unknown bits'
mask, by XORing the 0s and 1s masks together, of each addend. Then
we add the corresponding masks from each addend together, and force
the 'unknown' bits to the appropriate values in each mask.
So given (a1, b1) and (a2, b2), we compute m1 = a1 ^ b1,
m2 = a2 ^ b2, and m = m1 | m2. Then a = (a1 + a2) | m, and
b = (b1 + b2) & ~m.
As a worked example, 2 + (x << 2) + 14:
2 => (2, 2) constant
x => (~0, 0) unknown
x << 2 => (~3, 0)
2 + (x << 2): add (2, 2) with (~3, 0)
 m1 = 0, m2 = ~3, m = ~3
 a = (2 + ~3) | ~3 = ~1 | ~3 = ~1
 b = (2 + 0) & ~~3 = 2 & 3 = 2
so (~1, 2), which means "...xx10"
now add 14: add (~1, 2) with (14, 14)
 m1 = ~3, m2 = 0, m = ~3
 a = (~1 + 14) | ~3 = 12 | ~3 = ~3
 b = (2 + 14) & ~~3 = 16 & 3 = 0
so (~3, 0), which means "...xx00"
and the result is 4-byte aligned.

-Ed

PS. Beware of bugs in the following code; I have only tested it, not
proved it correct.

--8<--
#!/usr/bin/python2

def cpl(x):
    return (~x)&0xff

class AlignedNumber(object):
    def __init__(self, mask0=0xff, mask1=0):
        """mask0 has 0s for bits known to be 0, 1s otherwise.
        mask1 has 1s for bits known to be 1, 0s otherwise.
        Thus a bit which is set in mask0 and cleared in mask1 is an
        'unknown' bit, while a bit which is cleared in mask0 and set in
        mask1 is a bug.
        """
        self.masks = (mask0 & 0xff, mask1 & 0xff)
        self.validate()
    @classmethod
    def const(cls, value):
        """All bits are known, so both masks equal value."""
        return cls(value, value)
    def validate(self):
        # Check for bits 'known' to be both 0 and 1
        assert not (cpl(self.masks[0]) & self.masks[1]), self.masks
        # Check unknown bits don't follow known bits
        assert self.mx | ((self.mx - 1) & 0xff) == 0xff, self.masks
    def __str__(self):
        return ':'.join(map(bin, self.masks))
    def __lshift__(self, sh):
        """Shift both masks left, low bits become 'known 0'"""
        return self.__class__(self.masks[0] << sh, self.masks[1] << sh)
    def __rshift__(self, sh):
        """Shift 1s into mask0; high bits become 'unknown'.
        While strictly speaking they may be known 0 if we're tracking
        the full word and doing unsigned shifts, having known bits
        before unknown bits breaks the addition code."""
        return self.__class__(cpl(cpl(self.masks[0]) >> sh),
                              self.masks[1] >> sh)
    @property
    def mx(self):
        """Mask of unknown bits"""
        return self.masks[0] ^ self.masks[1]
    def __add__(self, other):
        """OR the mx values together. Unknown bits could cause carries,
        so we just assume that they can carry all the way to the left
        (thus we keep our mx masks in the form 1...10...0). Then, add
        our 0- and 1-masks, and force the bits of the combined mx mask
        to the unknown state."""
        if isinstance(other, int):
            return self + AlignedNumber.const(other)
        assert isinstance(other, AlignedNumber), other
        m = self.mx | other.mx
        return self.__class__((self.masks[0] + other.masks[0]) | m,
                              (self.masks[1] + other.masks[1]) & cpl(m))
    def is_aligned(self, bits):
        """We are 2^n-aligned iff the bottom n bits are known-0."""
        mask = (1 << bits) - 1
        return not (self.masks[0] & mask)

if __name__ == '__main__':
    a = AlignedNumber.const(2)
    b = AlignedNumber() << 2
    c = AlignedNumber.const(14)
    print a, b, c
    print a + b, a + b + c
    assert (a + b + c).is_aligned(2)
    d = (AlignedNumber() << 4) >> 2
    print d
    assert d.is_aligned(2)
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 16/05/17 23:53, Alexei Starovoitov wrote:
> following this line of thinking it feels that it should be possible
> to get rid of 'aux_off' and 'aux_off_align' and simplify the code.
> I mean we can always do
> dst_reg->min_align = min(dst_reg->min_align, src_reg->min_align);
>
> and don't use 'off' as part of alignment checks at all.

Problem with this approach, of course, is that (say) NET_IP_ALIGN +
sizeof(ethhdr) = 16 is muchly aligned, whereas if you turn all
constants into alignments you think you're only 2-byte aligned.
I think you have to track exact offsets when you can, and only turn
into an alignment when you introduce a variable.

Of course it can still be fooled by e.g. 2 + (x << 2) + 14, which it
will think is only 2-aligned when really it's 4-aligned, but unless
you want to start tracking 'bits known to be 1' as well as 'bits
known to be 0', I think you just accept that alignment tracking isn't
commutative. The obvious cases (ihl << 2 and so) will work when
written the obvious way, unless the compiler does something perverse.

OTOH the 'track known 1s as well' might work in a nice generic way
and cover all bases, I'll have to experiment a bit with that.

-Ed
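To make the precision loss described in this email concrete, here is a tiny illustration (my own Python, not part of the thread): a scalar tracker that collapses every quantity to the largest power of two dividing it concludes 2 + (x << 2) + 14 is only 2-aligned, while brute force shows every concrete value is 4-aligned:

```python
# Naive per-register alignment tracking: each quantity is collapsed to
# the largest power of two dividing it, and a sum is only as aligned
# as its worst operand.
def align(x):
    return x & -x if x else 1 << 32   # align(0) treated as maximal

def add_align(a, b):
    return min(a, b)

# Track 2 + (x << 2) + 14 left to right.
acc = add_align(align(2), 1 << 2)   # 2 + (x << 2): min(2, 4) = 2
acc = add_align(acc, align(14))     # ... + 14:     min(2, 2) = 2
assert acc == 2                     # tracker's verdict: only 2-aligned

# But every concrete value of the expression is really 4-aligned:
# 2 + 4x + 14 = 16 + 4x.
assert all((2 + (x << 2) + 14) % 4 == 0 for x in range(1024))
```

This is exactly the gap the two-mask ('track known 1s as well') idea closes: the known low bits of the constants carry through, instead of being rounded down to an alignment immediately.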
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 5/16/17 5:37 AM, Edward Cree wrote:
> On 15/05/17 17:04, David Miller wrote:
>> If we use 1<<31, then sequences like:
>>
>> R1 = 0
>> R1 <<= 2
>>
>> do silly things.
> Hmm.  It might be a bit late for this, but I wonder if, instead of handling
> alignments as (1 << align), you could store them as -(1 << align), i.e.
> leading 1s followed by 'align' 0s.
> Now the alignment of 0 is 0 (really 1 << 32), which doesn't change when
> left-shifted some more.  Shifts of other numbers' alignments also do the
> right thing, e.g. align(6) << 2 = (-2) << 2 = -8 = align(6 << 2).  Of
> course you do all this in unsigned, to make sure right shifts work.
> This also makes other arithmetic simple to track; for instance, align(a + b)
> is at worst align(a) | align(b).  (Of course, this bound isn't tight.)
> A number is 2^(n+1)-aligned if the 2^n bit of its alignment is cleared.
> Considered as unsigned numbers, smaller values are stricter alignments.

following this line of thinking it feels that it should be possible
to get rid of 'aux_off' and 'aux_off_align' and simplify the code.
I mean we can always do
dst_reg->min_align = min(dst_reg->min_align, src_reg->min_align);

and don't use 'off' as part of alignment checks at all.
So this bit:
  if ((ip_align + reg_off + off) % size != 0) {
can be removed and replaced with
  a = alignof(ip_align)
  a = min(a, reg->align)
  if (a % size != 0)
and do this check always and not only after if (reg->id)

In check_packet_ptr_add():
-               if (had_id)
-                       dst_reg->aux_off_align = min(dst_reg->aux_off_align,
-                                                    src_reg->min_align);
-               else
-                       dst_reg->aux_off_align = src_reg->min_align;
+               if (had_id)
+                       dst_reg->min_align = min(dst_reg->min_align,
+                                                src_reg->min_align);
+               else
+                       dst_reg->min_align = src_reg->min_align;

in that sense packet_ptr_add() will be no different than align logic
we do in adjust_reg_min_max_vals()

Thoughts?
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
From: Edward Cree
Date: Tue, 16 May 2017 13:37:42 +0100

> On 15/05/17 17:04, David Miller wrote:
>> If we use 1<<31, then sequences like:
>>
>> R1 = 0
>> R1 <<= 2
>>
>> do silly things.
> Hmm.  It might be a bit late for this, but I wonder if, instead of handling
> alignments as (1 << align), you could store them as -(1 << align), i.e.
> leading 1s followed by 'align' 0s.
> Now the alignment of 0 is 0 (really 1 << 32), which doesn't change when
> left-shifted some more.  Shifts of other numbers' alignments also do the
> right thing, e.g. align(6) << 2 = (-2) << 2 = -8 = align(6 << 2).  Of
> course you do all this in unsigned, to make sure right shifts work.
> This also makes other arithmetic simple to track; for instance, align(a + b)
> is at worst align(a) | align(b).  (Of course, this bound isn't tight.)
> A number is 2^(n+1)-aligned if the 2^n bit of its alignment is cleared.
> Considered as unsigned numbers, smaller values are stricter alignments.

Thanks for the bit twiddling suggestion, I'll take a look!
Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
On 15/05/17 17:04, David Miller wrote:
> If we use 1<<31, then sequences like:
>
> R1 = 0
> R1 <<= 2
>
> do silly things.
Hmm.  It might be a bit late for this, but I wonder if, instead of handling
alignments as (1 << align), you could store them as -(1 << align), i.e.
leading 1s followed by 'align' 0s.
Now the alignment of 0 is 0 (really 1 << 32), which doesn't change when
left-shifted some more.  Shifts of other numbers' alignments also do the
right thing, e.g. align(6) << 2 = (-2) << 2 = -8 = align(6 << 2).  Of
course you do all this in unsigned, to make sure right shifts work.
This also makes other arithmetic simple to track; for instance, align(a + b)
is at worst align(a) | align(b).  (Of course, this bound isn't tight.)
A number is 2^(n+1)-aligned if the 2^n bit of its alignment is cleared.
Considered as unsigned numbers, smaller values are stricter alignments.

-Ed
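[Editor's note: the -(1 << align) encoding can be checked in a few lines of Python. A sketch only; the kernel would do this in u32 arithmetic, emulated here with an explicit mask:]

```python
MASK = 0xffffffff   # emulate 32-bit unsigned arithmetic

def align_of(v):
    # -(1 << n) for a 2^n-aligned value: leading 1s, then n zeros.
    # align(0) encodes as 0 ("really 1 << 32") and stays 0 when shifted.
    return -(v & -v) & MASK if v else 0

assert align_of(6) == 0xfffffffe                      # 6 is 2-aligned
assert (align_of(6) << 2) & MASK == align_of(6 << 2) == 0xfffffff8
assert (align_of(0) << 5) & MASK == align_of(0)       # 0 stays maximally aligned
assert align_of(4) | align_of(6) == align_of(4 + 6)   # align(a+b) bound holds
```

[Note the last line is only an "at worst" bound in general; it happens to be exact for these operands.]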
[PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
If we use 1<<31, then sequences like:

R1 = 0
R1 <<= 2

do silly things.  Examples of this actually exist in the MAP tests of
test_verifier.c

Update test_align.c expectation strings.

Signed-off-by: David S. Miller
---
 kernel/bpf/verifier.c                    |  2 +-
 tools/testing/selftests/bpf/test_align.c | 12 ++--
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 39f2dcb..0f33db0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1715,7 +1715,7 @@ static void check_reg_overflow(struct bpf_reg_state *reg)
 static u32 calc_align(u32 imm)
 {
 	if (!imm)
-		return 1U << 31;
+		return 1U << 16;
 	return imm - ((imm - 1) & imm);
 }
 
diff --git a/tools/testing/selftests/bpf/test_align.c b/tools/testing/selftests/bpf/test_align.c
index 9644d4e..afaa297 100644
--- a/tools/testing/selftests/bpf/test_align.c
+++ b/tools/testing/selftests/bpf/test_align.c
@@ -235,12 +235,12 @@ static struct bpf_align_test tests[] = {
 	},
 	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	.matches = {
-		"4: R0=imm0,min_value=0,max_value=0,min_align=2147483648 R1=ctx R2=pkt(id=0,off=0,r=0) R3=pkt_end R5=pkt(id=0,off=0,r=0) R10=fp",
-		"5: R0=imm0,min_value=0,max_value=0,min_align=2147483648 R1=ctx R2=pkt(id=0,off=0,r=0) R3=pkt_end R5=pkt(id=0,off=14,r=0) R10=fp",
-		"6: R0=imm0,min_value=0,max_value=0,min_align=2147483648 R1=ctx R2=pkt(id=0,off=0,r=0) R3=pkt_end R4=pkt(id=0,off=14,r=0) R5=pkt(id=0,off=14,r=0) R10=fp",
-		"10: R0=imm0,min_value=0,max_value=0,min_align=2147483648 R1=ctx R2=pkt(id=0,off=0,r=18) R3=pkt_end R4=inv56 R5=pkt(id=0,off=14,r=18) R10=fp",
-		"14: R0=imm0,min_value=0,max_value=0,min_align=2147483648 R1=ctx R2=pkt(id=0,off=0,r=18) R3=pkt_end R4=inv48 R5=pkt(id=0,off=14,r=18) R10=fp",
-		"15: R0=imm0,min_value=0,max_value=0,min_align=2147483648 R1=ctx R2=pkt(id=0,off=0,r=18) R3=pkt_end R4=inv48 R5=pkt(id=0,off=14,r=18) R10=fp",
+		"4: R0=imm0,min_value=0,max_value=0,min_align=65536 R1=ctx R2=pkt(id=0,off=0,r=0) R3=pkt_end R5=pkt(id=0,off=0,r=0) R10=fp",
+		"5: R0=imm0,min_value=0,max_value=0,min_align=65536 R1=ctx R2=pkt(id=0,off=0,r=0) R3=pkt_end R5=pkt(id=0,off=14,r=0) R10=fp",
+		"6: R0=imm0,min_value=0,max_value=0,min_align=65536 R1=ctx R2=pkt(id=0,off=0,r=0) R3=pkt_end R4=pkt(id=0,off=14,r=0) R5=pkt(id=0,off=14,r=0) R10=fp",
+		"10: R0=imm0,min_value=0,max_value=0,min_align=65536 R1=ctx R2=pkt(id=0,off=0,r=18) R3=pkt_end R4=inv56 R5=pkt(id=0,off=14,r=18) R10=fp",
+		"14: R0=imm0,min_value=0,max_value=0,min_align=65536 R1=ctx R2=pkt(id=0,off=0,r=18) R3=pkt_end R4=inv48 R5=pkt(id=0,off=14,r=18) R10=fp",
+		"15: R0=imm0,min_value=0,max_value=0,min_align=65536 R1=ctx R2=pkt(id=0,off=0,r=18) R3=pkt_end R4=inv48 R5=pkt(id=0,off=14,r=18) R10=fp",
 	},
 },
 {
-- 
2.1.2.532.g19b5d50