On 07/02/2012 12:42 AM, Darren Reed wrote:
> On 2/07/2012 8:10 AM, Sašo Kiselkov wrote:
>> I sorted the issue out by modifying the proposed changes you sent me
>> last time. I figured out that mp->b_rptr must only be adjusted if there
>> are higher-level transport protocols in use (TCP, UDP, SCTP, etc.), as
>> is done for example in mac_rx_srs_proto_fanout like so:
> ...
> 
> Ok, yup. I was wondering which approach to take (adjust
> b_rptr inside mac_rx_srs_long_fanout() or after it was called)
> and it seemed the inside one works better because b_rptr
> is only expected to be different for non "OTH" types on return.

I'm not entirely clear what the b_rptr adjustment is for, so I only
guessed how it's supposed to be done judging from the other code. What
exactly is an mblk_t structure? (This is my first time hacking the
kernel and the acronyms are somewhat cryptic to me...)

>> To better deal with fanout of multicast traffic which might originate in
>> professional IRDs and various other hardware appliances (which often
>> stream everything from a single source addr+port combo and only
>> differentiate streams by destination multicast address), I also
>> implemented a new type of mac_fanout_type, MAC_FANOUT_SRC_DST, which
>> does an XOR of the source and destination addresses in hash computation.
>> That way, the default behavior of src-addr + src-port fanout is
>> unchanged and the new behavior is selected only if the user wants it.
> 
> Whilst I applaud the idea of a different fanout being available,
> I think that the interface for selecting which one to use and
> how it is selected needs more careful consideration.
> 
> It would seem to me that in certain scenarios, it might be
> sufficient (or even better) to use MAC_FANOUT_DST.

Of course a per-link or per-flow tuning would probably be best, but that
was above my skills to implement at this time (and as far as I'm aware,
flow classification takes place after IP fanout).

>> I've compiled my changes in the attached src_dst.patch file - you'll
>> want to apply this patch on top of the 918.patch from Nexenta (which
>> fixes most of the fanout problems, but, apparently, breaks multicast, at
>> least for me).
> 
> Just to be clear, 918.patch alone from Nexenta broke multicast, but with
> the additional patch attached to your email, it works fine?

Correct. The actual fix is simply the adjustment of mp->b_rptr as you
suggested. The remainder of the patch is just the implementation of
src-dst fanout.

>> It's also possible to change the code in such a fashion so as to always
>> do src-dst based fanout on multicast packets - for multicast that really
>> makes sense (since the machine may at any one time be a member of a
>> large number of multicast groups all originating at a single source).
> 
> The problem there is that this would override the policy of either
> "default" (source address based fanout) or round-robin.
> 
> But I suppose that begs the question of why you would ever
> want to use round-robin and the answer to that would seem
> to be only when the default algorithm (src based) is not providing
> a very good spread (small number of senders/connections.)
> 
> So it may be that it is simpler to change the default hash to be on
> src^dst instead of just src? (Makes sense to me...)

Well, as far as I'm concerned, I always wondered why it was
"src_addr ^ src_port" to begin with, other than the author assuming that
the bulk of all traffic being unicast (which naturally tends towards
unique src-port tuples). If I had a say in that, I'd always advocate for
"src_addr ^ src_port ^ dst_addr ^ dst_port" to give the maximum amount
of entropy for the fanout and then have a per-link tunable (settable via
dladm set-linkprop) to allow configuring other fanout strategies, like
is possible for link-aggregation (L2, L3, L4, etc.).

--
Saso


-------------------------------------------
illumos-discuss
Archives: https://www.listbox.com/member/archive/182180/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182180/21175430-2e6923be
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=21175430&id_secret=21175430-6a77cda4
Powered by Listbox: http://www.listbox.com

Reply via email to