On Thu, Jan 08, 2009 at 11:40:31AM -0500, Sebastien Roy wrote:
> Hi Crossbow team,
> 
> I'm making some progress with the Crossbow/Clearview iptun merge, but am
> running into a problem receiving packets.
> 
> When the first IP interface above a tunnel binds, a flow is created, and
> that flow is inserted into the flow hash table using the
> flow_l2_hash_fe() hashing function.  When a packet is received, it
> attempts to match a flow by calling flow_l2_hash().  Both of these
> functions use the HASH_MAC_VID() hashing macro, which is passed the
> destination MAC address as its first argument.  None of this is news to
> you, every l2 flow behaves this way.
> 
> This HASH_MAC_VID() macro looks like this:
> 
> #define       HASH_MAC_VID(a, v, s) \
>       ((((uint32_t)(a)[3] + (a)[4] + (a)[5]) ^ (v)) % (s))
> 
> I'm sure you see the problem.  It accesses data that may be off of the
> end of the MAC address is the MAC address is less than 6 bytes, and it
> also assumes that this 3-byte sample of the MAC address will represent
> an adequately variable sample.  Perhaps this macro is the reason why
> there's a check that MAC addresses must be greater than or equal to 6
> bytes long...
>

sorry I forgot about this case when I said the address length
doesn't matter.

 
> In the interest of removing this MAC address length limitation (because
> I have to to get iptun to work), this hashing algorithm needs to be more
> flexible and take into account the length of the MAC address.  Perhaps
> the MAC-type plugins could provide their hashing function?  Alternately,
> the hashing function could take the MAC address length into account and
> only use a sample of bytes within the actual address.
> 
> Any thoughts on how you want to see this fixed?
>

I would prefer to keep the existing macro and use it in flow_ether_hash().
for flow_l2_hash() and flow_l2_hash_fe(), we could do something like this:

#define HASH_L2(a, v, l, s, h) {
        uint32_t i, sum = 0, nbytes = ((l) < 3) ? (l) : 3;      \
        for (i = (l) - nbytes; i < (l); i++)                    \
                sum += (a)[i];                                  \
        h = (sum ^ (v)) % (s);                                  \
}

static uint32_t
flow_l2_hash(flow_tab_t *ft, flow_state_t *s)
{
        flow_l2info_t   *l2 = &s->fs_l2info;
        uint32_t        h;

        HASH_l2(l2->l2_daddr, l2->l2_vid, l2->l2_addrlen, ft->ft_size, h);
        return (h);
}

note that the new l2_addrlen will have to be filled in at
flow_l2_accept.


regarding why we do a sum instead of using the last 4 bytes as an integer,
I think the reason was we can't be sure of the alignment of the mac address
so we can't just use *((uint32_t *)&l2->l2_daddr[2]) directly.

eric

Reply via email to