On Thu, Jan 08, 2009 at 11:40:31AM -0500, Sebastien Roy wrote:
> Hi Crossbow team,
>
> I'm making some progress with the Crossbow/Clearview iptun merge, but am
> running into a problem receiving packets.
>
> When the first IP interface above a tunnel binds, a flow is created, and
> that flow is inserted into the flow hash table using the
> flow_l2_hash_fe() hashing function. When a packet is received, it
> attempts to match a flow by calling flow_l2_hash(). Both of these
> functions use the HASH_MAC_VID() hashing macro, which is passed the
> destination MAC address as its first argument. None of this is news to
> you, every l2 flow behaves this way.
>
> This HASH_MAC_VID() macro looks like this:
>
> #define HASH_MAC_VID(a, v, s) \
> ((((uint32_t)(a)[3] + (a)[4] + (a)[5]) ^ (v)) % (s))
>
> I'm sure you see the problem. It accesses data that may be off of the
> end of the MAC address is the MAC address is less than 6 bytes, and it
> also assumes that this 3-byte sample of the MAC address will represent
> an adequately variable sample. Perhaps this macro is the reason why
> there's a check that MAC addresses must be greater than or equal to 6
> bytes long...
>
sorry I forgot about this case when I said the address length
doesn't matter.
> In the interest of removing this MAC address length limitation (because
> I have to to get iptun to work), this hashing algorithm needs to be more
> flexible and take into account the length of the MAC address. Perhaps
> the MAC-type plugins could provide their hashing function? Alternately,
> the hashing function could take the MAC address length into account and
> only use a sample of bytes within the actual address.
>
> Any thoughts on how you want to see this fixed?
>
I would prefer to keep the existing macro and use it in flow_ether_hash().
for flow_l2_hash() and flow_l2_hash_fe(), we could do something like this:
#define HASH_L2(a, v, l, s, h) {
uint32_t i, sum = 0, nbytes = ((l) < 3) ? (l) : 3; \
for (i = (l) - nbytes; i < (l); i++) \
sum += (a)[i]; \
h = (sum ^ (v)) % (s); \
}
static uint32_t
flow_l2_hash(flow_tab_t *ft, flow_state_t *s)
{
flow_l2info_t *l2 = &s->fs_l2info;
uint32_t h;
HASH_l2(l2->l2_daddr, l2->l2_vid, l2->l2_addrlen, ft->ft_size, h);
return (h);
}
note that the new l2_addrlen will have to be filled in at
flow_l2_accept.
regarding why we do a sum instead of using the last 4 bytes as an integer,
I think the reason was we can't be sure of the alignment of the mac address
so we can't just use *((uint32_t *)&l2->l2_daddr[2]) directly.
eric