On 04/30/2015 05:13 PM, Eric Dumazet wrote:
On Thu, 2015-04-30 at 16:24 -0700, Alexander Duyck wrote:

Actually a byte operation itself is not faster.  Note in the next line
we are returning the value.  So what you typically end up with by doing
it that way would be 2 reads, one for the u8 and one for the u16 return
value.  That is actually what I am trying to address in the second patch
in the set since we were doing a 8b test on the first byte of the
address followed by a 64b read.

The advantage with the way I wrote this is that the compiler itself
should be able to sort out how it wants to test the value while
accessing it in a 16b size.  So at worst case it is a mask and compare,
followed by a return of the value.  From what I have seen the compiler
seems to be smart enough on x86 anyway to just convert this into a one
byte compare on AL and then return the result in AX.  I would suspect
that for bit-endian systems it would likely just perform the compare.


My compiler (4.8.2 (Ubuntu 4.8.2-19ubuntu1)) does the following :

  62d:  0f b7 42 0c             movzwl 0xc(%rdx),%eax
  631:  0f b6 d0                movzbl %al,%edx
  634:  83 fa 05                cmp    $0x5,%edx
  637:  7e 02                   jle    63b <eth_type_trans+0x8b>
  639:  c9                      leaveq
  63a:  c3                      retq

Presumably this would be possible

                movzwl 0xc(%rdx),%eax
                cmp    $0x5,%al
                        jle    63b <eth_type_trans+0x8b>
                        leaveq
                        retq



My compiler (5.0.1 (Red Hat 5.0.1-0.1)) does like what you have in the "would be possible" example. What I end up with is something like this:
 648:   0f b7 42 0c             movzwl 0xc(%rdx),%eax
 64c:   3c 05                   cmp    $0x5,%al
 64e:   76 40                   jbe    690 <eth_type_trans+0xc0>

The assembler before my patch was:
 652:   0f b7 40 0c             movzwl 0xc(%rax),%eax
 656:   89 c2                   mov    %eax,%edx
 658:   66 c1 c2 08             rol    $0x8,%dx
 65c:   0f b7 d2                movzwl %dx,%edx
 65f:   81 fa ff 05 00 00       cmp    $0x5ff,%edx
 665:   7e 41                   jle    6a8 <eth_type_trans+0xd8>

The savings isn't meant to be anything huge for the patch, maybe a cycle or two. I suspect the before on your system is probably something similar to what I had so we are still probably dropping at least 2 instructions.

- Alex


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to