Re: [iovisor-dev] [oss-drivers] Re: [PATCH RFC 0/4] Initial 32-bit eBPF encoding support

2017-09-22 Thread Yonghong Song via iovisor-dev



On 9/22/17 9:24 AM, Jakub Kicinski wrote:

On Thu, 21 Sep 2017 11:56:55 -0700, Alexei Starovoitov wrote:

On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:

On 18/09/2017 22:29, Daniel Borkmann wrote:

On 09/18/2017 10:47 PM, Jiong Wang wrote:

Hi,

    Currently, the LLVM eBPF backend always generates code in 64-bit mode,
which may cause trouble when JITing to 32-bit targets.

    For example, it is quite common for an XDP eBPF program to access some
packet fields through base + offset, for which the default eBPF backend will
generate BPF_ALU64 for the address formation. Later, when JITing to 32-bit
hardware, BPF_ALU64 needs to be expanded into 32-bit ALU sequences even
though the address space is 32-bit and the high bits are not significant.
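
To make the cost concrete, here is a minimal sketch in C of what a 32-bit target has to do for one 64-bit add versus one 32-bit add. The struct reg64 type and the function names are hypothetical, introduced only for illustration; they do not come from the patch set or from any JIT. The 32-bit case assumes the upper half is zeroed, as the interpreter does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 64-bit eBPF register kept as two 32-bit halves,
 * the way a 32-bit target would have to hold it. */
struct reg64 {
    uint32_t lo;
    uint32_t hi;
};

/* BPF_ALU64 add on 32-bit hardware: the low add may carry into the
 * high half, so two adds plus carry handling are needed. */
struct reg64 add64_on_32bit(struct reg64 dst, struct reg64 src)
{
    struct reg64 out;

    out.lo = dst.lo + src.lo;
    /* Unsigned wrap-around of the low half means a carry occurred. */
    out.hi = dst.hi + src.hi + (out.lo < dst.lo ? 1u : 0u);
    return out;
}

/* BPF_ALU add: a single native add; the high half is simply zero
 * (assumption: upper half cleared, matching the interpreter). */
struct reg64 add32_on_32bit(struct reg64 dst, struct reg64 src)
{
    struct reg64 out = { dst.lo + src.lo, 0 };

    return out;
}

/* Small self-check the reader can call from main(). */
void check_add_models(void)
{
    struct reg64 a = { 0xffffffffu, 0 };
    struct reg64 b = { 1, 0 };

    assert(add64_on_32bit(a, b).lo == 0 && add64_on_32bit(a, b).hi == 1);
    assert(add32_on_32bit(a, b).lo == 0 && add32_on_32bit(a, b).hi == 0);
}
```

When the address space is 32-bit, the carry handling in add64_on_32bit is wasted work, which is the overhead the paragraph above describes.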

    While a complete 32-bit mode implementation may need a new ABI
(something like -target-abi=ilp32), this patch set first adds some initial
code so that we can construct 32-bit eBPF tests through hand-written
assembly.

    A new 32-bit register set is introduced, named with a "w" prefix, and
the LLVM assembler will encode statements like "w1 += w2" into the following
8-bit code field:

  BPF_ADD | BPF_X | BPF_ALU

BPF_ALU will be used instead of BPF_ALU64.
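
As an illustration of the encoding above, the following C sketch builds the 8-bit code field for an ALU instruction. The constant values are the ones defined in the kernel UAPI headers (linux/bpf_common.h and linux/bpf.h); the helper name alu_opcode is hypothetical and not part of LLVM or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Instruction classes (low 3 bits of the code field). */
#define BPF_ALU   0x04  /* 32-bit ALU */
#define BPF_ALU64 0x07  /* 64-bit ALU */

/* Source operand flag (bit 3): register rather than immediate. */
#define BPF_X     0x08

/* Operation (high 4 bits). */
#define BPF_ADD   0x00

/* Hypothetical helper: combine operation, source flag and class. */
uint8_t alu_opcode(uint8_t op, uint8_t src, int is_64bit)
{
    assert((op & 0x0f) == 0);    /* operation occupies the top four bits */
    assert((src & ~0x08) == 0);  /* source flag is bit 3 only */
    return (uint8_t)(op | src | (is_64bit ? BPF_ALU64 : BPF_ALU));
}

/* Small self-check the reader can call from main(). */
void check_alu_opcode(void)
{
    assert(alu_opcode(BPF_ADD, BPF_X, 0) == 0x0c);  /* w1 += w2 */
    assert(alu_opcode(BPF_ADD, BPF_X, 1) == 0x0f);  /* r1 += r2 */
}
```

With these values, "w1 += w2" yields code 0x0c and "r1 += r2" yields 0x0f; only the class bits differ.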

    NOTE: currently you can only use "w" registers with ALU statements,
not with others such as branches, as they don't have a different encoding
for 32-bit targets.


Great to see work in this direction! Can we also enable the use and emission
of all the 32-bit BPF_ALU instructions whenever possible for the currently
available bpf targets while at it (they only use BPF_ALU64 right now)?


Hi Daniel,

    Thanks for the feedback.

    I think we could also enable the use of all the 32-bit BPF_ALU
instructions under the currently available bpf targets.  As we now have
32-bit register set support, we could make i32 a legal type to prevent it
from being promoted to i64, then hook it up with i32 ALU patterns.  I will
look into this.


I don't think we need to gate 32-bit ALU generation with a flag.
Though the interpreter and JITs have supported 32-bit since day one, the
verifier has never seen such programs before, so some valid programs may
get rejected. After some time passes and we're sure that all progs
still work fine when they're optimized with 32-bit ALU, we can flip
the switch in llvm and make it default.


Thinking about next steps - do we expect the 32b operations to clear the
upper halves of the registers?  The interpreter does it, and so does
x86.  I don't think we can load 32bit-only programs on 64bit hosts, so
we would need some form of data flow analysis in the kernel to prune
the zeroing for 32bit offload targets.  Is that correct?
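
The behaviour being asked about can be modelled in a few lines of C. This is only a sketch of the semantics described above (a 32-bit operation whose result clears the upper half of the 64-bit register); the function names are hypothetical and this is not the interpreter's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a BPF_ALU (32-bit) add on a 64-bit host: only the low
 * halves take part, and the result is zero-extended to 64 bits. */
uint64_t alu32_add(uint64_t dst, uint64_t src)
{
    uint32_t lo = (uint32_t)dst + (uint32_t)src;

    return (uint64_t)lo;  /* upper 32 bits are cleared */
}

/* Model of a BPF_ALU64 add for comparison: all 64 bits take part. */
uint64_t alu64_add(uint64_t dst, uint64_t src)
{
    return dst + src;
}

/* Small self-check the reader can call from main(). */
void check_alu32_model(void)
{
    /* Stale upper bits in dst do not survive a 32-bit operation. */
    assert(alu32_add(0xdeadbeef00000001ULL, 2) == 3);
    /* The 32-bit result wraps instead of carrying into bit 32. */
    assert(alu32_add(0xffffffffULL, 1) == 0);
    assert(alu64_add(0xffffffffULL, 1) == 0x100000000ULL);
}
```

On a 32-bit offload target the explicit clearing in alu32_add is extra work whenever nothing later reads the upper half, which is what the data flow analysis would need to establish.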


Could you contrive an example to show the problem? If I understand
correctly, you are mostly worried that some natural sign extension is lost
by "clearing the upper 32 bits of the register", and that such clearing may
make some operations, especially memory operations, incorrect on a 64-bit
machine?
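
One possible reading of that concern, sketched in C: a negative 32-bit offset gives a different 64-bit address depending on whether it was sign-extended or zero-extended before the add. This is an assumption about what is being asked, not something confirmed in the thread, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Zero extension: what clearing the upper 32 bits produces. */
uint64_t zext32(int32_t v)
{
    return (uint64_t)(uint32_t)v;
}

/* Sign extension: what a 64-bit computation of the same value keeps. */
uint64_t sext32(int32_t v)
{
    return (uint64_t)(int64_t)v;
}

/* 64-bit address formation: base + offset. */
uint64_t form_addr(uint64_t base, uint64_t off)
{
    return base + off;
}

/* Small self-check the reader can call from main(). */
void check_extension(void)
{
    /* The same 32-bit bit pattern, two different 64-bit values. */
    assert(zext32(-4) == 0x00000000fffffffcULL);
    assert(sext32(-4) == 0xfffffffffffffffcULL);
    /* Sign-extended: base - 4. Zero-extended: base + 4 GiB - 4. */
    assert(form_addr(0x1000, sext32(-4)) == 0x0ffcULL);
    assert(form_addr(0x1000, zext32(-4)) == 0x100000ffcULL);
}
```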

___
iovisor-dev mailing list
iovisor-dev@lists.iovisor.org
https://lists.iovisor.org/mailman/listinfo/iovisor-dev


Re: [iovisor-dev] [oss-drivers] Re: [PATCH RFC 0/4] Initial 32-bit eBPF encoding support

2017-09-22 Thread Jakub Kicinski via iovisor-dev
On Thu, 21 Sep 2017 11:56:55 -0700, Alexei Starovoitov wrote:
> On Wed, Sep 20, 2017 at 12:20:40AM +0100, Jiong Wang via iovisor-dev wrote:
> > On 18/09/2017 22:29, Daniel Borkmann wrote:  
> > > On 09/18/2017 10:47 PM, Jiong Wang wrote:  
> > > > Hi,
> > > > 
> > > >    Currently, the LLVM eBPF backend always generates code in 64-bit
> > > > mode, which may cause trouble when JITing to 32-bit targets.
> > > > 
> > > >    For example, it is quite common for an XDP eBPF program to access
> > > > some packet fields through base + offset, for which the default eBPF
> > > > backend will generate BPF_ALU64 for the address formation. Later,
> > > > when JITing to 32-bit hardware, BPF_ALU64 needs to be expanded into
> > > > 32-bit ALU sequences even though the address space is 32-bit and the
> > > > high bits are not significant.
> > > > 
> > > >    While a complete 32-bit mode implementation may need a new ABI
> > > > (something like -target-abi=ilp32), this patch set first adds some
> > > > initial code so that we can construct 32-bit eBPF tests through
> > > > hand-written assembly.
> > > > 
> > > >    A new 32-bit register set is introduced, named with a "w" prefix,
> > > > and the LLVM assembler will encode statements like "w1 += w2" into
> > > > the following 8-bit code field:
> > > > 
> > > >  BPF_ADD | BPF_X | BPF_ALU
> > > > 
> > > > BPF_ALU will be used instead of BPF_ALU64.
> > > > 
> > > >    NOTE: currently you can only use "w" registers with ALU
> > > > statements, not with others such as branches, as they don't have a
> > > > different encoding for 32-bit targets.
> > > 
> > > Great to see work in this direction! Can we also enable the use and
> > > emission of all the 32-bit BPF_ALU instructions whenever possible for
> > > the currently available bpf targets while at it (they only use
> > > BPF_ALU64 right now)?
> > 
> > Hi Daniel,
> > 
> >    Thanks for the feedback.
> > 
> >    I think we could also enable the use of all the 32-bit BPF_ALU
> > instructions under the currently available bpf targets.  As we now have
> > 32-bit register set support, we could make i32 a legal type to prevent
> > it from being promoted to i64, then hook it up with i32 ALU patterns.
> > I will look into this.
> 
> I don't think we need to gate 32-bit ALU generation with a flag.
> Though the interpreter and JITs have supported 32-bit since day one, the
> verifier has never seen such programs before, so some valid programs may
> get rejected. After some time passes and we're sure that all progs
> still work fine when they're optimized with 32-bit ALU, we can flip
> the switch in llvm and make it default.

Thinking about next steps - do we expect the 32b operations to clear the
upper halves of the registers?  The interpreter does it, and so does
x86.  I don't think we can load 32bit-only programs on 64bit hosts, so
we would need some form of data flow analysis in the kernel to prune
the zeroing for 32bit offload targets.  Is that correct?