On Fri, Dec 07, 2018 at 05:19:21PM +0000, Jiong Wang wrote: > On 06/12/2018 03:13, Alexei Starovoitov wrote: > > On Wed, Dec 05, 2018 at 03:32:50PM +0000, Jiong Wang wrote: > > > On 05/12/2018 14:52, Edward Cree wrote: > > > > On 05/12/18 09:46, Jiong Wang wrote: > > > > > There is NO processed instruction number regression, either with or > > > > > without > > > > > -mattr=+alu32. > > > > <snip> > > > > > Cilium bpf > > > > > === > > > > > bpf_lb-DLB_L3.o 2110/2110 1730/1733 > > > > That looks like a regression of 3 insns in the 32-bit case. > > > > May be worth investigating why. > > > > > > Will look into this. > > > > > > > > > > > > + dst_reg = insn->dst_reg; > > > > > + regs[dst_reg] = regs[src_reg]; > > > > > + if (BPF_CLASS(insn->code) == BPF_ALU) { > > > > > + /* Update type and range info. */ > > > > > + regs[dst_reg].type = SCALAR_VALUE; > > > > > + coerce_reg_to_size(®s[dst_reg], 4); > > > > Won't this break when handed a pointer (as root, so allowed to leak > > > > it)? E.g. (pointer + x) gets turned into scalar x, rather than > > > > unknown scalar in range [0, 0xffffffff]. > > > > > > Initially I was gating this to scalar_value only, later was thinking it > > > could be extended to ptr case if ptr leak is allowed. > > > > > > But, your comment remind me min/max value doesn't mean real min/max value > > > for ptr types value, it means the offset only if I am understanding the > > > issue correctly. So, it will break pointer case. > > > > correct. In case of is_pointer_value() && root -> mark_reg_unknown() has to > > be called > > > > The explanation of additional 3 steps from another email makes sense to me. > > > > Can you take a look why it helps default (llvm alu64) case too ? > > bpf_overlay.o 3096/2898 > > It is embarrassing that I am not able to reproduce this number after tried > quite a few env configurations. I think the number must be wrong because > llvm alu64 binary doesn't contain alu32 move so shouldn't be impacted by > this patch even though I double checked the raw data I collected on llvm > alu64, re-calculated the number before this patch, it is still 3096. I > guess there must be something wrong with the binary I was loading. > > I improved my benchmarking methodology to build all alu64 and alu32 > binaries first, and never change them later. Then used a script to load and > collect the processed number. (borrowed the script from > https://github.com/4ast/bpf_cilium_test/, only my binaries are built from > latest Cilium repo and contains alu32 version as well) > > I ran this new benchmarking env for several times, and could get the > following new results consistently: > > bpf_lb-DLB_L3.o: 2085/2085 1685/1687 > bpf_lb-DLB_L4.o: 2287/2287 1986/1982 > bpf_lb-DUNKNOWN.o: 690/690 622/622 > bpf_lxc.o: 95033/95033 N/A > bpf_netdev.o: 7245/7245 N/A > bpf_overlay.o: 2898/2898 3085/2947 > > No change on alu64 binary. > > For alu32, bpf_overlay.o still get fewer processed instruction number, this > is because there is the following sequence (and another similar one). > Before this patch, r2 at insn 139 is unknown, so verifier always explore > both path-taken and path-fall_through. After this patch, it explores > path-fall_through only, so saved some insns. > > 129: (b4) (u32) r7 = (u32) -140 > ... > 136: (bc) (u32) r2 = (u32) r7 > 137: (74) (u32) r2 >>= (u32) 31 > 138: (4c) (u32) r2 |= (u32) r1 > 139: (15) if r2 == 0x0 goto pc+342 > 140: (b4) (u32) r1 = (u32) 2 > > And a permissive register value for r2 hasn't released more path prune for > this test, so in all, after this patch, there is fewer processed insn. > > I have sent out a v2, gated this change under SCALAR_VALUE, and also > updated the patch description.
Thanks for the update. Makes sense.