Re: [PATCH stable 4.9 1/8] x86: bpf_jit: small optimization in emit_bpf_tail_call()
Hi Eric,

On Mon, Jan 29, 2018 at 06:04:30AM -0800, Eric Dumazet wrote:
> > If these 4 bytes matter, why not use
> > cmpq with an immediate value instead, which saves 2 extra bytes ? :
> >
> > - the mov above is 11 bytes total :
> >
> >    0:   48 8b 84 d6 78 56 34    mov    0x12345678(%rsi,%rdx,8),%rax
> >    7:   12
> >    8:   48 85 c0                test   %rax,%rax
> >
> > - the equivalent cmp is only 9 bytes :
> >
> >    0:   48 83 bc d6 78 56 34    cmpq   $0x0,0x12345678(%rsi,%rdx,8)
> >    7:   12 00
> >
> > And as a bonus, it doesn't even clobber rax.
> >
> > Just my two cents,
>
> Hi Willy
>
> Please look more closely at following instructions.
>
> We need the value later, not only testing it being zero :)

Ah OK that makes total sense then ;-)

Thanks,
willy
Re: [PATCH stable 4.9 1/8] x86: bpf_jit: small optimization in emit_bpf_tail_call()
On Sun, Jan 28, 2018 at 10:39 PM, Willy Tarreau wrote:
> Hi,
>
> [ replaced stable@ and greg@ by netdev@ as my question below is not
>   relevant to stable ]
>
> On Mon, Jan 29, 2018 at 02:48:54AM +0100, Daniel Borkmann wrote:
>> From: Eric Dumazet
>>
>> [ upstream commit 84ccac6e7854ebbfb56d2fc6d5bef9be49bb304c ]
>>
>> Saves 4 bytes replacing following instructions :
>>
>>   lea rax, [rsi + rdx * 8 + offsetof(...)]
>>   mov rax, qword ptr [rax]
>>   cmp rax, 0
>>
>> by :
>>
>>   mov rax, [rsi + rdx * 8 + offsetof(...)]
>>   test rax, rax
>
> I've just noticed this on stable@. If these 4 bytes matter, why not use
> cmpq with an immediate value instead, which saves 2 extra bytes ? :
>
> - the mov above is 11 bytes total :
>
>    0:   48 8b 84 d6 78 56 34    mov    0x12345678(%rsi,%rdx,8),%rax
>    7:   12
>    8:   48 85 c0                test   %rax,%rax
>
> - the equivalent cmp is only 9 bytes :
>
>    0:   48 83 bc d6 78 56 34    cmpq   $0x0,0x12345678(%rsi,%rdx,8)
>    7:   12 00
>
> And as a bonus, it doesn't even clobber rax.
>
> Just my two cents,

Hi Willy

Please look more closely at following instructions.

We need the value later, not only testing it being zero :)
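
[ Editorial sketch, not part of the original message. A rough C rendering
  of why the loaded pointer has to stay in a register: after the NULL
  test, the same value is used to reach the target program. The struct
  layouts, the names prog, prog_array, tail_call and add_one, and the -1
  fall-through value are all hypothetical simplifications, and the
  tail-call count check is left out. The real JIT jumps into the target
  rather than calling it. ]

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a BPF program. */
struct prog {
    int (*func)(int);
};

/* Hypothetical, simplified stand-in for the program array map. */
struct prog_array {
    unsigned int max_entries;
    struct prog *ptrs[4];
};

/* Example target, used only to exercise the sketch. */
static int add_one(int x)
{
    return x + 1;
}

/* Returns the target's result, or -1 when the tail call falls through. */
static int tail_call(const struct prog_array *array, unsigned int index, int arg)
{
    if (index >= array->max_entries)
        return -1;

    /* Corresponds to: mov rax, [rsi + rdx * 8 + offsetof(...)] */
    struct prog *prog = array->ptrs[index];

    /* Corresponds to: test rax, rax */
    if (prog == NULL)
        return -1;

    /* The loaded pointer is needed again here. A cmpq against memory
     * would set the flags but leave nothing in a register to follow. */
    return prog->func(arg);
}
```

[ With a cmpq on the memory operand, the pointer would have to be loaded
  a second time before this last step, which would likely cost more
  bytes than the 2 it saves. ]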
Re: [PATCH stable 4.9 1/8] x86: bpf_jit: small optimization in emit_bpf_tail_call()
Hi,

[ replaced stable@ and greg@ by netdev@ as my question below is not
  relevant to stable ]

On Mon, Jan 29, 2018 at 02:48:54AM +0100, Daniel Borkmann wrote:
> From: Eric Dumazet
>
> [ upstream commit 84ccac6e7854ebbfb56d2fc6d5bef9be49bb304c ]
>
> Saves 4 bytes replacing following instructions :
>
>   lea rax, [rsi + rdx * 8 + offsetof(...)]
>   mov rax, qword ptr [rax]
>   cmp rax, 0
>
> by :
>
>   mov rax, [rsi + rdx * 8 + offsetof(...)]
>   test rax, rax

I've just noticed this on stable@. If these 4 bytes matter, why not use
cmpq with an immediate value instead, which saves 2 extra bytes ? :

- the mov above is 11 bytes total :

   0:   48 8b 84 d6 78 56 34    mov    0x12345678(%rsi,%rdx,8),%rax
   7:   12
   8:   48 85 c0                test   %rax,%rax

- the equivalent cmp is only 9 bytes :

   0:   48 83 bc d6 78 56 34    cmpq   $0x0,0x12345678(%rsi,%rdx,8)
   7:   12 00

And as a bonus, it doesn't even clobber rax.

Just my two cents,
Willy
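
[ Editorial sketch, not part of the original message. The byte counts
  quoted above can be checked by writing the two encodings out as arrays
  and letting the compiler measure them. The bytes are copied from the
  disassembly in the message, with 0x12345678 as its placeholder
  displacement; the names mov_test, cmp_imm and bytes_saved are
  hypothetical. ]

```c
#include <stddef.h>

/* mov 0x12345678(%rsi,%rdx,8),%rax followed by test %rax,%rax */
static const unsigned char mov_test[] = {
    0x48, 0x8b, 0x84, 0xd6, 0x78, 0x56, 0x34, 0x12, /* mov: 8 bytes  */
    0x48, 0x85, 0xc0,                               /* test: 3 bytes */
};

/* cmpq $0x0,0x12345678(%rsi,%rdx,8) */
static const unsigned char cmp_imm[] = {
    0x48, 0x83, 0xbc, 0xd6, 0x78, 0x56, 0x34, 0x12, /* opcode, ModRM, SIB, disp32 */
    0x00,                                           /* 8-bit immediate            */
};

/* Size difference between the two sequences, in bytes. */
static size_t bytes_saved(void)
{
    return sizeof(mov_test) - sizeof(cmp_imm);
}
```

[ This only confirms the size arithmetic (11 versus 9 bytes); it says
  nothing about whether the two sequences are interchangeable in the
  JIT. ]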