-------- Original Message --------
Subject: [RFC] Idea to speedup tcp lookups
Date: Tue, 02 Aug 2005 11:53:12 +0200
From: Eric Dumazet <[EMAIL PROTECTED]>
To: David S. Miller <[email protected]>
Cc: [EMAIL PROTECTED],  [EMAIL PROTECTED]
References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>

Hi David, Hi all

I would like to provide a patch to speed up TCP lookups, but I need your 
comments first.

1) First, some performance data
-------------------------------

tcp_v4_rcv() wastes a lot of time in __tcp_v4_lookup_established().

The most critical code is:

sk_for_each(sk, node, &head->chain) {
        if (TCP_IPV4_MATCH(sk, acookie, saddr, daddr, ports, dif))
                goto hit; /* You sunk my battleship! */
}

sk_for_each() does use prefetch() hints, but only the beginning of "struct 
sock" is prefetched.
So TCP_IPV4_MATCH() has to bring cold cache lines into the CPU cache.
Each iteration has to touch at least 2 cache lines.
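
To make the cache-line argument concrete, here is a minimal user-space sketch.
It is not kernel code: "struct mock_sock", its field order, the 128 bytes of
filler and the 64-byte line size are all assumptions chosen for illustration,
and the object is assumed to start on a cache-line boundary.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64 /* assumption: 64-byte cache lines */

/* Hypothetical stand-in for struct hlist_node. */
struct mock_node {
        struct mock_node *next;
        struct mock_node **pprev;
};

/* Hypothetical, simplified socket. This is NOT the real kernel layout. */
struct mock_sock {
        struct mock_node node;     /* chain linkage, read by the list walk */
        unsigned char other[128];  /* stand-in for the fields in between */
        unsigned int daddr;        /* fields read by the match test */
        unsigned int rcv_saddr;
        unsigned short dport;
        unsigned short num;
};

/* Index of the cache line that holds a given byte offset. */
static size_t cache_line_of(size_t offset)
{
        return offset / CACHE_LINE_SIZE;
}

/* Returns 1 when the match fields live in another line than the node. */
static int match_needs_extra_line(void)
{
        size_t node_line = cache_line_of(offsetof(struct mock_sock, node));
        size_t match_line = cache_line_of(offsetof(struct mock_sock, daddr));

        assert(node_line == 0); /* the node is the first member */
        return match_line != node_line;
}
```

With this mock layout, walking the chain touches line 0 for the node, and the
match test has to pull in a second, probably cold, line for the addresses and
ports.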

2) The goal
-----------

The idea is to change things so that TCP_IPV4_MATCH() can return FALSE 
in 95% of cases using only data already in the CPU cache,
touching one cache line per iteration.

3) Description of what is planned
----------------------------------

The layout change is to move "__u16 dport; __u16 num" from "struct 
inet_sock" to the end of "struct sock_common",
where there is some padding (at least on 64-bit platforms).

File include/net/sock.h

struct sock_common {
        unsigned short      skc_family;
        volatile unsigned char  skc_state;
        unsigned char       skc_reuse;
        int         skc_bound_dev_if;
        struct hlist_node   skc_node;
        struct hlist_node   skc_bind_node;
        atomic_t        skc_refcnt;
+       union {
+               unsigned int key; /* hash key for fast lookups, or protocol private data */
+               unsigned short us[2];
+       } skc_u;
        };
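
The padding claim can be checked in user space with a rough sketch like the
one below. The two structs mirror the fields listed above, but
"mock_hlist_node" and the plain int standing in for atomic_t are assumptions,
so the numbers only hold for ABIs where those stand-ins have the same size and
alignment as the real types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct hlist_node (two pointers). */
struct mock_hlist_node {
        struct mock_hlist_node *next, **pprev;
};

/* Layout before the change; int stands in for atomic_t. */
struct common_before {
        unsigned short skc_family;
        volatile unsigned char skc_state;
        unsigned char skc_reuse;
        int skc_bound_dev_if;
        struct mock_hlist_node skc_node;
        struct mock_hlist_node skc_bind_node;
        int skc_refcnt;
};

/* Same layout with the proposed union appended. */
struct common_after {
        unsigned short skc_family;
        volatile unsigned char skc_state;
        unsigned char skc_reuse;
        int skc_bound_dev_if;
        struct mock_hlist_node skc_node;
        struct mock_hlist_node skc_bind_node;
        int skc_refcnt;
        union {
                unsigned int key;
                unsigned short us[2];
        } skc_u;
};

/* How many bytes the union adds to the struct. */
static size_t growth_in_bytes(void)
{
        assert(sizeof(struct common_after) >= sizeof(struct common_before));
        return sizeof(struct common_after) - sizeof(struct common_before);
}

/* Returns 1 when key and skc_node fall in the same 64-byte line. */
static int key_shares_line_with_node(void)
{
        return offsetof(struct common_after, skc_u) / 64 ==
               offsetof(struct common_after, skc_node) / 64;
}
```

With 8-byte pointers the union should land in the 4 bytes of tail padding, so
growth_in_bytes() is expected to be 0; with 4-byte pointers there is no such
padding and the struct should grow by 4 bytes.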

File include/linux/ip.h

struct inet_sock {
        ...
        __u32           rcv_saddr;  /* Bound local IPv4 addr */
-       __u16           dport;      /* Destination port */
-       __u16           num;        /* Local port */
        ...
+#define inetsk_dport sk.skc_u.us[0]
+#define inetsk_num sk.skc_u.us[1]

Then change every sk->dport to sk->inetsk_dport, and every sk->num to 
sk->inetsk_num.

Doing so even saves 8 bytes in sizeof(inet_sock) on 64-bit platforms :)
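
As a sketch of how the two 16-bit ports alias the 32-bit key, the user-space
code below fills us[0]/us[1] and reads the result back through key. The names
"port_key", make_key() and combine_ports() are hypothetical; combine_ports()
only illustrates that the caller has to build its __ports value with the same
byte order as the union.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space copy of the proposed skc_u union. */
union port_key {
        uint32_t key;
        uint16_t us[2];
};

/* Store dport in us[0] and num in us[1], then read both as one key. */
static uint32_t make_key(uint16_t dport, uint16_t num)
{
        union port_key u;

        assert(sizeof(u) == sizeof(uint32_t)); /* both views cover 4 bytes */
        u.us[0] = dport;
        u.us[1] = num;
        return u.key;
}

/* Returns 1 on a little-endian machine. */
static int is_little_endian(void)
{
        union port_key u;

        u.key = 1;
        return u.us[0] == 1;
}

/* Build the same 32-bit value with shifts, as a lookup caller would. */
static uint32_t combine_ports(uint16_t dport, uint16_t num)
{
        if (is_little_endian())
                return ((uint32_t)num << 16) | dport;
        return ((uint32_t)dport << 16) | num;
}
```

The point is that one 32-bit compare against key replaces two 16-bit compares,
provided the lookup side combines the ports in the matching order.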

Then change the TCP_IPV4_MATCH macro to:

File include/net/tcp.h

64-bit platforms:
#define TCP_IPV4_MATCH(__sk, __cookie, __saddr, __daddr, __ports, __dif)\
        (((__sk)->skc_u.key == (__ports))    &&  \
        ((*((__u64 *)&(inet_sk(__sk)->daddr)))== (__cookie))   &&  \
        (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))

32-bit platforms:
#define TCP_IPV4_MATCH(__sk, __cookie, __saddr, __daddr, __ports, __dif)\
        (((__sk)->skc_u.key == (__ports))    &&  \
        (inet_sk(__sk)->daddr          == (__saddr))   &&  \
        (inet_sk(__sk)->rcv_saddr      == (__daddr))   &&  \
        (!((__sk)->sk_bound_dev_if) || ((__sk)->sk_bound_dev_if == (__dif))))


This way, the comparison with (__sk)->skc_u.key should reference data that is 
already in the CPU cache, or at least in the same cache line as
__sk->skc_node (the next element in the hash chain).
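
The intended effect on the lookup loop can be sketched in user space as
follows. Everything here is a mock: "struct mock_sock", mock_lookup() and the
cold-compare counter are hypothetical and only model which comparisons would
touch the colder part of the socket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical socket: next and key are meant to share one cache line. */
struct mock_sock {
        struct mock_sock *next;  /* stand-in for skc_node */
        uint32_t key;            /* stand-in for skc_u.key */
        uint32_t daddr;          /* imagine these on a colder line */
        uint32_t rcv_saddr;
        int bound_dev_if;
};

/* Walk the chain, testing the key first; count the "cold" compares. */
static struct mock_sock *mock_lookup(struct mock_sock *head, uint32_t ports,
                                     uint32_t saddr, uint32_t daddr, int dif,
                                     unsigned int *cold)
{
        struct mock_sock *sk;

        for (sk = head; sk != NULL; sk = sk->next) {
                if (sk->key != ports)
                        continue;       /* rejected using hot data only */
                (*cold)++;              /* now the address fields are read */
                if (sk->daddr == saddr && sk->rcv_saddr == daddr &&
                    (!sk->bound_dev_if || sk->bound_dev_if == dif))
                        return sk;
        }
        return NULL;
}

/* Three sockets in one chain; the wanted one is last. */
static int demo_cold_compares(void)
{
        struct mock_sock c = { NULL, 0x00500003u, 30, 99, 0 };
        struct mock_sock b = { &c, 0x00500002u, 20, 99, 0 };
        struct mock_sock a = { &b, 0x00500001u, 10, 99, 0 };
        unsigned int cold = 0;
        struct mock_sock *hit = mock_lookup(&a, 0x00500003u, 30, 99, 1, &cold);

        assert(hit == &c);
        return (int)cold;
}
```

In this mock chain the first two sockets are skipped on the key alone, and
only the final candidate needs the full address comparison.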

Discussion:

Instead of using (dport,num) as a key, we could use the tcp_hashfn() value to 
get a better fast path, but we would use more memory.
The patch would be nicer, since it would not change "struct inet_sock".
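
A rough sketch of why a hash-based key could reject more candidates than a
port-based key: two connections with identical ports but different peers can
share a bucket, and only a key that mixes in the addresses tells them apart.
mock_hash() is a hypothetical mixing function, not the kernel's tcp_hashfn(),
and the 128-bucket mask is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define BUCKET_MASK 0x7fu /* assumption: 128 hash buckets */

/* Hypothetical mixing function, NOT the kernel's tcp_hashfn(). */
static uint32_t mock_hash(uint32_t laddr, uint16_t lport,
                          uint32_t faddr, uint16_t fport)
{
        uint32_t h = (laddr ^ lport) ^ (faddr ^ fport);

        h ^= h >> 16;
        h ^= h >> 8;
        return h;
}

/* Port-only key: remote port in the high half, local port in the low. */
static uint32_t port_key(uint16_t dport, uint16_t num)
{
        return ((uint32_t)dport << 16) | num;
}

/* Returns 1 when only the hash key separates two chain neighbours. */
static int demo_hash_rejects_what_ports_cannot(void)
{
        uint32_t laddr = 0x0a000001u;
        uint32_t peer1 = 0x0a000002u;
        uint32_t peer2 = 0x8a000002u;   /* differs from peer1 in the top bit */
        uint16_t lport = 80, fport = 40000;
        uint32_t h1 = mock_hash(laddr, lport, peer1, fport);
        uint32_t h2 = mock_hash(laddr, lport, peer2, fport);
        int ports_equal = port_key(fport, lport) == port_key(fport, lport);
        int same_bucket = (h1 & BUCKET_MASK) == (h2 & BUCKET_MASK);

        assert(ports_equal);            /* the port key cannot reject here */
        return same_bucket && h1 != h2; /* the full hash value can */
}
```

Here both connections fall into the same bucket and carry the same port key,
so only a stored full hash value would let the fast test skip the wrong
socket.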

Thank you for your comments and ideas.

Eric Dumazet
