Again, I am not challenging your logic. I am challenging your statistics:
"Let us assume the probability of a short-lived flow matching a single hash table entry is .25"
That would be a very poor hash function. As I said, FNV gave me a worst-case
result close to 0.05 (5%) with real data.

Regards
   Brian

On 15/01/2013 18:17, ramki Krishnan wrote:
> Hi Brian,
>
> A potential long-lived large flow is declared if it exceeds a pre-programmed
> threshold in *all the matching hash table entries*.
>
> Through the use of multiple hashes, the benefit of avoiding false positives
> is much more than a small fraction; see an approximate sample probabilistic
> calculation below.
>
> Let us assume the number of hash stages is m (4). Let us assume the
> probability of a short-lived flow matching a single hash table entry is .25.
> For a short-lived flow to be declared a long-lived large flow, it needs to
> match the same hash table entries as the long-lived large flow. The
> approximate probability of this event is (.25)^m = (.25)^4 ≈ .004
>
> Thanks,
> ramki
>
> -----Original Message-----
> From: Brian E Carpenter [mailto:[email protected]]
> Sent: Tuesday, January 15, 2013 12:13 AM
> To: ramki Krishnan
> Cc: [email protected]; [email protected]
> Subject: Re: [OPSAWG] I-D Action: draft-krishnan-opsawg-large-flow-load-balancing-02.txt
>
> Ram,
>
> I understand everything you say below. But you aren't commenting on my main
> point: why is it worth the *extra* cost of running several hashes in parallel
> to exclude a *small* fraction of mis-identified large flows?
> To me, that is not a useful engineering trade-off.
>
> Regards
>    Brian
>
> On 14/01/2013 17:48, ramki Krishnan wrote:
>> Hi Brian,
>>
>>>> I think I didn't explain my point sufficiently. I understand why you suggest
>>>> using several hashes. But this is statistical load balancing we are talking
>>>> about. If a few % of the time, you mistakenly treat several
>>>> medium flows as one large flow, and therefore rebalance them as a single
>>>> unit, so what? You will still balance the traffic reasonably well.
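As an aside for the archive, the joint-probability arithmetic in ramki's message is easy to check. A minimal Python sketch follows; the per-stage probability p and stage count m are the thread's assumed numbers (not measurements), the independence of stages is also an assumption, and the function name is illustrative:

```python
# Sketch of the joint-probability claim from the thread. The per-stage
# match probability p and stage count m are assumed numbers under
# discussion, not measured values; stage independence is also assumed.
def false_positive(p: float, m: int) -> float:
    """Probability that a short-lived flow exceeds the threshold in
    all m hash-table entries, assuming independent stages."""
    return p ** m

# ramki's assumed numbers: p = .25, m = 4
print(false_positive(0.25, 4))  # 0.00390625, i.e. roughly 0.4%
# Brian's measured single-hash collision rate, about 5%:
print(false_positive(0.05, 1))
```

Note that (.25)^4 = 0.0039, which is where the figure in ramki's message comes from; whether .25 is a realistic per-stage probability is exactly the point Brian disputes.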
>>
>> Each of the flows is eventually learnt/programmed in a flow table
>> (hardware table resource where the flow is finally committed); there
>> is never a case of bundling multiple flows. More details below.
>>
>> 1) Scalable detection of long-lived large flows
>>    a. The goal is to keep the size of the flow table (hardware table
>>       resource where the long-lived large flow is finally committed) bounded
>>       and the processing requirements (CPU utilization) for flow learning
>>       bounded.
>>    b. For satisfying the above goal, several hashes help.
>>
>> 2) Scalable load-balancing of long-lived large flows
>>    a. The goal is to have a scalable load-balancing solution which produces
>>       meaningful results while keeping the processing requirements (CPU
>>       utilization) for load balancing bounded.
>>    b. For satisfying the above goal, it is not worthwhile to load-balance
>>       medium/small flows.
>>
>> Thanks,
>> ram
>>
>> -----Original Message-----
>> From: Brian E Carpenter [mailto:[email protected]]
>> Sent: Sunday, January 13, 2013 12:06 AM
>> To: ramki Krishnan
>> Cc: [email protected]; [email protected]
>> Subject: Re: [OPSAWG] I-D Action: draft-krishnan-opsawg-large-flow-load-balancing-02.txt
>>
>> Ram,
>>
>> On 13/01/2013 05:14, ramki Krishnan wrote:
>>> Hi Brian,
>>>
>>> Thanks a lot for your comments. Please find answers to some of your
>>> comments. We will respond to your other comments shortly.
>>>
>>>> There may be some false positives due to multiple other flows
>>>> masquerading as a large flow; the amount of false positives is
>>>> reduced by parallel hashing using different hash functions
>>>
>>> Brian:
>>>
>>>> To give you some data, with a 20 bit ID space, the FNV1a-32 hash algorithm
>>>> gives at most 5% collisions, based on IPv6 headers in real packet traces.
>>>> [https://researchspace.auckland.ac.nz/handle/2292/13240]. I wonder whether
>>>> the overhead of running several hashes in parallel is justified by this
>>>> collision rate?
>>>
>>> Ram:
>>>
>>> The need for multiple hashes is specific to the suggested algorithm on
>>> automatic hardware identification - this algorithm is similar to a Bloom
>>> filter, which uses multiple hash functions.
>>
>> I think I didn't explain my point sufficiently. I understand why you suggest
>> using several hashes. But this is statistical load balancing we are talking
>> about. If a few % of the time, you mistakenly treat several medium flows as
>> one large flow, and therefore rebalance them as a single unit, so what? You
>> will still balance the traffic reasonably well.
>>
>>    Brian
>>
>>> "On packet arrival, a new flow is looked up in parallel in all the hash
>>> tables and the corresponding counter is incremented. If the counter exceeds
>>> a programmed threshold in a given time interval in all the hash table
>>> entries, a candidate large flow is learnt and programmed in a hardware
>>> table resource like TCAM.
>>>
>>> For a short-lived flow to masquerade as a long-lived flow, it needs
>>> to match all the hash table entries, which is a joint-probability event -
>>> thus, the amount of false positives due to short-lived flows is reduced.
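The detection scheme quoted above (parallel hash tables of per-entry counters, similar to a multistage/Bloom filter) could be sketched roughly as below. The stage count, table size, threshold, and the use of a salted SHA-256 as the per-stage hash are all illustrative assumptions, not anything the draft specifies:

```python
import hashlib

class MultistageFilter:
    """Rough sketch of the quoted detection scheme: m parallel hash
    tables of counters; a flow becomes a candidate large flow when it
    exceeds the threshold in *all* the matching entries. Parameters
    and the salted-SHA-256 stage hash are illustrative choices only."""

    def __init__(self, stages=4, size=1024, threshold=100):
        self.stages = stages
        self.size = size
        self.threshold = threshold
        self.tables = [[0] * size for _ in range(stages)]

    def _index(self, stage, flow_id):
        # Derive a different hash per stage by salting with the stage number.
        digest = hashlib.sha256(b"%d:%s" % (stage, flow_id)).digest()
        return int.from_bytes(digest[:4], "big") % self.size

    def packet(self, flow_id):
        """Count one packet; return True once the flow exceeds the
        threshold in all the matching hash-table entries."""
        over_everywhere = True
        for stage in range(self.stages):
            i = self._index(stage, flow_id)
            self.tables[stage][i] += 1
            if self.tables[stage][i] <= self.threshold:
                over_everywhere = False
        return over_everywhere
```

A real implementation would also reset the counters each measurement interval and program detected flows into a TCAM-like table, as the quoted text describes. A short-lived flow is falsely promoted only if it collides with heavy traffic in all m stages, which is the joint-probability event under discussion.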
>>>
>>> Thanks,
>>> ram
>>>
>>> -----Original Message-----
>>> From: [email protected] [mailto:[email protected]] On Behalf Of Brian E Carpenter
>>> Sent: Saturday, January 12, 2013 8:40 AM
>>> To: [email protected]
>>> Cc: [email protected]
>>> Subject: Re: [OPSAWG] I-D Action: draft-krishnan-opsawg-large-flow-load-balancing-02.txt
>>>
>>> Hi,
>>>
>>> My comments are on the discussion of flow IDs and hashing. I'm not
>>> commenting at all on the overall proposal, because I can't judge whether
>>> the problem is real or the solution is practical.
>>>
>>>> A large space of the flow identifications, i.e. finer
>>>> granularity of the flows, conducts more random in spreading the flows
>>>> over a set of component links.
>>>
>>> That isn't accurate. The requirement is an ID space in which the IDs belong
>>> to a uniform distribution. Technically speaking, if you have two links, a
>>> one-bit flow ID is sufficient, as long as the values 0 and 1 are equally
>>> likely to appear.
>>> Therefore, the practical issue is not the size of the ID space but the
>>> quality of the hash function used to generate the ID of each flow.
>>> However, whatever the initial ID space, the final hash has to be down to
>>> 0..N if you have N+1 alternative paths.
>>> I think the reason that your model needs a larger ID space is to reduce the
>>> probability of two flows colliding by chance in the ID space.
>>> That would defeat your wish to separate out large flows.
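For reference, the FNV-1a 32-bit hash Brian measured, reduced "down to 0..N" for path selection as the comment above describes, takes only a few lines. The `pick_link` helper is an illustrative sketch, not anything specified in the draft:

```python
# FNV-1a (32-bit): the standard offset basis and prime.
FNV32_OFFSET = 0x811C9DC5
FNV32_PRIME = 0x01000193

def fnv1a_32(data: bytes) -> int:
    """FNV-1a hash over the bytes of a flow key (e.g. header fields)."""
    h = FNV32_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV32_PRIME) & 0xFFFFFFFF
    return h

def pick_link(flow_key: bytes, n_links: int) -> int:
    """Illustrative helper: reduce the hash to a link index 0..N
    when there are N+1 alternative paths."""
    return fnv1a_32(flow_key) % n_links
```

As the comments above note, what matters for balance is not the size of the intermediate ID space but how uniformly the hash distributes real flow keys over the final 0..N range.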
>>>
>>>> The advantages of hashing based load
>>>> distribution are the preservation of the packet sequence in a flow
>>>> and the real time distribution with the stateless of individual
>>>> flows. If the traffic flows randomly spread in the flow
>>>> identification space, the flow rates are much smaller compared to the
>>>> link capacity,
>>>
>>> That sounds like magic. I don't think you mean that at all.
>>>
>>>> and the rate differences are not dramatic,
>>>
>>> Do you mean that the total traffic rate is more fairly distributed across
>>> the links? In any case, "dramatic" isn't an engineering term.
>>>
>>>> the hashing
>>>> algorithm works very well in general.
>>>
>>> How can you say that without specifying a particular algorithm? Also, "very
>>> well in general" isn't an engineering term either.
>>>
>>>> There may be some false positives due to multiple other flows
>>>> masquerading as a large flow; the amount of false positives is
>>>> reduced by parallel hashing using different hash functions
>>>
>>> To give you some data, with a 20 bit ID space, the FNV1a-32 hash algorithm
>>> gives at most 5% collisions, based on IPv6 headers in real packet traces.
>>> [https://researchspace.auckland.ac.nz/handle/2292/13240]
>>> I wonder whether the overhead of running several hashes in parallel is
>>> justified by this collision rate?
>>>
>>> Regards
>>>    Brian Carpenter
>>> _______________________________________________
>>> OPSAWG mailing list
>>> [email protected]
>>> https://www.ietf.org/mailman/listinfo/opsawg

_______________________________________________
OPSAWG mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/opsawg
