16] bpf: mark sub-register writes that really need zero extension to high bits

Jiong Wang Fri, 05 Apr 2019 13:45:22 -0700


> On 26 Mar 2019, at 18:44, Edward Cree <ec...@solarflare.com> wrote:
> 
> On 26/03/2019 18:05, Jiong Wang wrote:
>> eBPF ISA specification requires high 32-bit cleared when low 32-bit
>> sub-register is written. This applies to destination register of ALU32 etc.
>> JIT back-ends must guarantee this semantic when doing code-gen.
>> 
>> x86-64 and arm64 ISAs have the same semantic, so the corresponding JIT
>> back-ends don't need to do extra work. However, 32-bit arches (arm, nfp
>> etc.) and some other 64-bit arches (powerpc, sparc etc.) need an explicit
>> zero extension sequence to meet such semantic.
>> 
>> This is important, because for code like the following:
>> 
>>  u64_value = (u64) u32_value
>>  ... other uses of u64_value
>> 
>> compiler could exploit the semantic described above and save those zero
>> extensions for extending u32_value to u64_value. Hardware, runtime, or BPF
>> JIT back-ends, are responsible for guaranteeing this. Some benchmarks show
>> ~40% sub-register writes out of total insns, meaning ~40% extra code-gen
>> (could go up to more for some arches which require two shifts for zero
>> extension) because JIT back-end needs to do extra code-gen for all such
>> instructions.
>> 
>> However this is not always necessary in case u32_value is never cast into
>> a u64, which is quite normal in real life program. So, it would be really
>> good if we could identify those places where such type cast happened, and
>> only do zero extensions for them, not for the others. This could save a lot
>> of BPF code-gen.
>> 
>> Algo:
>> - Record indices of instructions that do sub-register def (write). And
>>   these indices need to stay with function state so path pruning and bpf
>>   to bpf function call could be handled properly.
>> 
>>   These indices are kept up to date while doing insn walk.
>> 
>> - A full register read on an active sub-register def marks the def insn as
>>   needing zero extension on dst register.
>> 
>> - A new sub-register write overrides the old one.
>> 
>>   A new full register write makes the register free of zero extension on
>>   dst register.
>> 
>> - When propagating register read64 during path pruning, it also marks def
>>   insns whose defs are hanging active sub-register, if there is any read64
>>   shown from the equal state.
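
The bookkeeping in the Algo list above can be sketched roughly as follows.
This is only a toy model under stated assumptions, not the verifier code from
the patch: every name here (toy_env, toy_func_state, subreg_def, zext_dst,
DEF_NOT_SUBREG, toy_write, toy_read) is hypothetical, and path pruning and
bpf to bpf calls are not modeled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_REGS        11   /* r0..r10 */
#define MAX_INSNS       64   /* arbitrary bound for this sketch */
#define DEF_NOT_SUBREG  (-1) /* hypothetical marker: no active sub-register def */

/* Hypothetical, simplified stand-in for the per-function verifier state. */
struct toy_func_state {
	/* insn index of the active 32-bit def of each reg, or DEF_NOT_SUBREG */
	int subreg_def[MAX_REGS];
};

struct toy_env {
	struct toy_func_state cur;
	/* true if the insn at that index needs zero extension on its dst reg */
	bool zext_dst[MAX_INSNS];
};

static void toy_init(struct toy_env *env)
{
	for (int i = 0; i < MAX_REGS; i++)
		env->cur.subreg_def[i] = DEF_NOT_SUBREG;
	for (int i = 0; i < MAX_INSNS; i++)
		env->zext_dst[i] = false;
}

/*
 * Register write at insn_idx. A sub-register write records (and overrides)
 * the active def; a full 64-bit write frees the register of any def.
 */
static void toy_write(struct toy_env *env, int regno, int insn_idx, bool is_64bit)
{
	assert(regno >= 0 && regno < MAX_REGS);
	assert(insn_idx >= 0 && insn_idx < MAX_INSNS);
	env->cur.subreg_def[regno] = is_64bit ? DEF_NOT_SUBREG : insn_idx;
}

/*
 * Register read. A full 64-bit read of a register with an active
 * sub-register def marks the defining insn as needing zero extension.
 */
static void toy_read(struct toy_env *env, int regno, bool is_64bit)
{
	assert(regno >= 0 && regno < MAX_REGS);
	int def = env->cur.subreg_def[regno];

	if (is_64bit && def != DEF_NOT_SUBREG)
		env->zext_dst[def] = true;
}
```

With this model, a 32-bit write to r1 at insn 0 followed by a 64-bit read of
r1 marks insn 0, while a 32-bit write that is only read as 32-bit, or that is
overwritten by a 64-bit write before being read, stays unmarked.
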
>> 
>> Reviewed-by: Jakub Kicinski <jakub.kicin...@netronome.com>
>> Signed-off-by: Jiong Wang <jiong.w...@netronome.com>
>> ---
>> include/linux/bpf_verifier.h |  4 +++
>> kernel/bpf/verifier.c        | 85 +++++++++++++++++++++++++++++++++++++++++---
>> 2 files changed, 84 insertions(+), 5 deletions(-)
>> 
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index 27761ab..0ae9a3f 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -181,6 +181,9 @@ struct bpf_func_state {
>>       */
>>      u32 subprogno;
>> 
>> +    /* tracks subreg definition. */
> Ideally this comment should mention that the stored value is the insn_idx
>  of the writing insn.  Perhaps also that this is safe because patching
>  (bpf_patch_insn_data()) only happens after main verification completes.


During full x86_64 host tests, I found one new issue.

"convert_ctx_accesses" can change the load size: a BPF_W load could be
transformed into BPF_DW or kept as BPF_W, depending on the underlying ctx
field size. And "convert_ctx_accesses" happens after zero extension insertion.

So a BPF_W load could have been marked and had zero extensions inserted after
it, but "convert_ctx_accesses", which runs later, then figures out that its
transformed load size is actually BPF_DW and rewrites the load to that. The
previously inserted zero extensions then break things: the high 32 bits are
wrongly cleared. For example:

1: r2 = *(u32 *)(r1 + 80)
2: r1 = *(u32 *)(r1 + 76)
3: r3 = r1
4: r3 += 14
5: if r3 > r2 goto +35

Insns 1 and 2 could be turned into BPF_DW loads if they are loading xdp "data"
and "data_end". No zero extension should be inserted after them, as it would
destroy the pointer. However, they are treated as 32-bit loads initially,
and later, due to the 64-bit uses at insns 3 and 5, they are marked as needing
zero extension.
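
To make the failure concrete, here is a small user-space sketch of what a
stale zero extension does to a 64-bit pointer value. It is an illustration
only: zext32, toy_load_data and the address TOY_DATA_PTR are made-up names
and values, not anything from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical kernel-style address, for illustration only. */
#define TOY_DATA_PTR UINT64_C(0xffff888012345678)

/* Effect of a zero extension on a 64-bit register: the high 32 bits are cleared. */
static uint64_t zext32(uint64_t reg)
{
	return (uint64_t)(uint32_t)reg;
}

/*
 * Models a ctx load after it has been rewritten to a 64-bit (BPF_DW) load
 * of a pointer field, optionally followed by the zero extension that was
 * inserted earlier, while the load was still seen as a 32-bit one.
 */
static uint64_t toy_load_data(const uint64_t *ctx_field, int stale_zext)
{
	uint64_t r1 = *ctx_field;	/* the rewritten BPF_DW load */

	return stale_zext ? zext32(r1) : r1;
}
```

Loading a field that holds TOY_DATA_PTR with stale_zext set returns
0x12345678, i.e. the upper half of the pointer is lost; without it the full
value survives.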
  
                                                                                
         
I think the field sizes in the *_md structures in uapi/linux/bpf.h are
normally the same as those in the real underlying context; only when a field
is of pointer type could there be a u32 to u64 conversion. So I guess we just
need to mark the dst register as a full 64-bit register write inside
check_mem_access when, for PTR_TO_CTX, the reg type of the dst reg returned
by check_ctx_access is a pointer type.
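
The decision described above could look roughly like the sketch below. It is
a toy under stated assumptions, not the real check_mem_access: the enums, the
DEF_NOT_SUBREG marker and the toy_load_subreg_def helper are all hypothetical
names invented for this illustration, and "pointer type" is approximated as
"anything that is not a scalar".

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_NOT_SUBREG (-1)	/* hypothetical marker: treat dst as a full 64-bit write */

/* Hypothetical, cut-down register types for this sketch. */
enum toy_reg_type {
	TOY_SCALAR_VALUE,
	TOY_PTR_TO_CTX,
	TOY_PTR_TO_PACKET,
	TOY_PTR_TO_PACKET_END,
};

/* Load sizes in bytes, as written in the program. */
enum toy_size { TOY_BPF_W = 4, TOY_BPF_DW = 8 };

static bool toy_is_pointer(enum toy_reg_type t)
{
	return t != TOY_SCALAR_VALUE;
}

/*
 * Decide which sub-register def to record for the dst register of a load.
 * base_type is the type of the register being dereferenced, loaded_type is
 * the type the ctx access check reports for the loaded value.
 */
static int toy_load_subreg_def(enum toy_reg_type base_type,
			       enum toy_reg_type loaded_type,
			       enum toy_size size, int insn_idx)
{
	/* A 64-bit load is a full register write anyway. */
	if (size == TOY_BPF_DW)
		return DEF_NOT_SUBREG;
	/*
	 * A narrow ctx load that yields a pointer may later be widened to
	 * BPF_DW, so treat it as a full write and never zero extend it.
	 */
	if (base_type == TOY_PTR_TO_CTX && toy_is_pointer(loaded_type))
		return DEF_NOT_SUBREG;
	/* Ordinary 32-bit load: record the defining insn. */
	return insn_idx;
}
```

For the example program, insns 1 and 2 would fall into the second case and
would no longer be candidates for zero extension, while a 32-bit ctx load of
a scalar field would still record its insn index as before.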

Please let me know if my thinking is wrong.

Thanks.

Regards,
Jiong

Re: [PATCH/RFC bpf-next 04/16] bpf: mark sub-register writes that really need zero extension to high bits
