Hi,

My comments are on the discussion of flow IDs and hashing. I'm not
commenting at all on the overall proposal, because I can't judge
whether the problem is real or the solution is practical.

> A large space of the flow identifications, i.e. finer 
> granularity of the flows, conducts more random in spreading the flows 
> over a set of component links. 

That isn't accurate. What matters is not the size of the ID space but
that the IDs are uniformly distributed across it. Technically speaking,
if you have two links, a one-bit flow ID is sufficient, as long as the
values 0 and 1 are equally likely to appear.
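
To illustrate the point, here is a minimal sketch, not taken from the
draft. It assumes the link is chosen as ID mod number-of-links, and the
skewed 20-bit ID source is a made-up worst case (every ID even). The
names spread, one_bit_uniform and wide_skewed are mine.

```python
import random
from collections import Counter

def spread(ids, n_links):
    """Count how many flows land on each link when link = id mod n_links."""
    counts = Counter(i % n_links for i in ids)
    return [counts.get(link, 0) for link in range(n_links)]

rng = random.Random(1)  # fixed seed, for repeatability only
n_flows = 10_000

# One-bit ID space, values 0 and 1 equally likely.
one_bit_uniform = [rng.getrandbits(1) for _ in range(n_flows)]

# 20-bit ID space, but skewed (hypothetical): every ID happens to be even.
wide_skewed = [rng.getrandbits(19) << 1 for _ in range(n_flows)]

print("1-bit uniform IDs :", spread(one_bit_uniform, 2))
print("20-bit skewed IDs :", spread(wide_skewed, 2))
```

The one-bit IDs split the flows roughly evenly over the two links,
while the much larger but skewed space puts every flow on link 0.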

Therefore, the practical issue is not the size of the ID space but the
quality of the hash function used to generate the ID of each flow.
However, whatever the initial ID space, the final hash has to be reduced
to the range 0..N if you have N+1 alternative paths.
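
As a sketch of that last step, assuming a simple modulo reduction (the
draft does not say which reduction it intends, and pick_path is a
hypothetical name):

```python
def pick_path(flow_hash: int, n_paths: int) -> int:
    """Reduce a flow hash to a path index in 0..n_paths-1.

    With N+1 alternative paths, n_paths = N+1 and the result is in 0..N.
    """
    if n_paths < 1:
        raise ValueError("need at least one path")
    return flow_hash % n_paths

# Example: a 20-bit flow ID spread over N+1 = 4 paths.
n_paths = 4
for flow_hash in (0x00000, 0x12345, 0xFFFFF):
    print(hex(flow_hash), "->", pick_path(flow_hash, n_paths))
```

The result is only as evenly spread as the hash values going in. When
the size of the ID space is not a multiple of the number of paths, the
modulo adds a small bias of its own.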

I think the reason that your model needs a larger ID space is to
reduce the probability of two flows colliding by chance in the ID space.
That would defeat your wish to separate out large flows.
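
One way to put numbers on that is the usual birthday calculation,
sketched below. It assumes the IDs are uniform and independent, and it
gives the chance that at least two of the flows share an ID, which is
not necessarily the exact metric your model needs. The function name
is mine.

```python
def collision_probability(n_flows: int, id_space: int) -> float:
    """Probability that at least two of n_flows share an ID, assuming
    each flow draws its ID uniformly and independently from id_space
    possible values."""
    if n_flows > id_space:
        return 1.0  # pigeonhole: a collision is certain
    p_all_distinct = 1.0
    for i in range(n_flows):
        p_all_distinct *= (id_space - i) / id_space
    return 1.0 - p_all_distinct

# Compare a few ID space sizes for the same number of active flows.
for bits in (8, 16, 20, 32):
    p = collision_probability(100, 2 ** bits)
    print(f"{bits:2d}-bit ID space, 100 flows: P(collision) = {p:.6f}")
```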

> The advantages of hashing based load  
> distribution are the preservation of the packet sequence in a flow    
> and the real time distribution with the stateless of individual       
> flows. If the traffic flows randomly spread in the flow       
> identification space, the flow rates are much smaller compared to the 
> link capacity, 

That sounds like magic. I don't think you mean that at all.

> and the rate differences are not dramatic, 

Do you mean that the total traffic rate is more fairly distributed
across the links? In any case, "dramatic" isn't an engineering term.

> the hashing   
> algorithm works very well in general.

How can you say that without specifying a particular algorithm? Also,
"very well in general" isn't an engineering term either.

> There may be some false positives due to multiple other flows 
> masquerading as a large flow; the amount of false positives is        
> reduced by parallel hashing using different hash functions

To give you some data, with a 20-bit ID space, the FNV1a-32 hash
algorithm gives at most 5% collisions, based on IPv6 headers in real
packet traces.
[https://researchspace.auckland.ac.nz/handle/2292/13240]
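
For reference, a minimal sketch of FNV-1a (32 bit) reduced to a 20-bit
ID. The hash constants are the standard FNV ones. The xor-folding step
and the choice of header fields in the key (source address, destination
address, flow label) are my assumptions for illustration, not
necessarily what was used for the measurement cited above.

```python
import ipaddress

FNV32_OFFSET_BASIS = 0x811C9DC5  # standard FNV 32-bit offset basis
FNV32_PRIME = 0x01000193         # standard FNV 32-bit prime

def fnv1a_32(data: bytes) -> int:
    """FNV-1a, 32 bit: xor each byte in, then multiply by the prime."""
    h = FNV32_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h

def fold_to_bits(h: int, bits: int) -> int:
    """Xor-fold a 32-bit hash down to `bits` bits (for 16 <= bits < 32)."""
    mask = (1 << bits) - 1
    return ((h >> bits) ^ h) & mask

def flow_id_20(src: str, dst: str, flow_label: int) -> int:
    """Hypothetical 20-bit flow ID built from three IPv6 header fields."""
    key = (ipaddress.IPv6Address(src).packed
           + ipaddress.IPv6Address(dst).packed
           + flow_label.to_bytes(3, "big"))  # the flow label is 20 bits
    return fold_to_bits(fnv1a_32(key), 20)

print(hex(flow_id_20("2001:db8::1", "2001:db8::2", 0x12345)))
```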

I wonder whether the overhead of running several hashes in parallel
is justified by this collision rate?

Regards
   Brian Carpenter

_______________________________________________
OPSAWG mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/opsawg
