Re: [ovs-dev] [PATCH v2] bond: Fix broken rebalancing after link state changes.
On 7/16/21 1:51 AM, Ben Pfaff wrote: > On Tue, Jul 13, 2021 at 05:32:06PM +0200, Ilya Maximets wrote: >> There are 3 constraints for moving hashes from one member to another: >> >> 1. The load difference exceeds ~ 3% of the load of one member. >> 2. The difference in load between members exceeds 100,000 bytes. >> 3. Moving the hash reduces the load difference by more than 10%. >> >> In the current implementation, if one of the members transitions to >> the DOWN state, all hashes assigned to it will be moved to the other >> members. After that, if the member goes UP, it will wait for >> rebalancing to get hashes. But in case we have more than 10 equally >> loaded hashes, it will never meet constraint # 3, because each hash >> will handle less than 10% of the load. The situation gets worse when >> the number of flows grows and it is almost impossible to transfer any >> hash when all 256 hash records are used, which is very likely when we >> have few hundred/thousand flows. >> >> As a result, if one of the members goes down and back up while traffic >> flows, it will never be used to transmit packets again. This will not >> be fixed even if we completely stop the traffic and start it again, >> because the first two constraints will block rebalancing in the >> earlier stages, while we have low traffic volume. >> >> Moving a single hash if the destination does not have any hashes, >> as it was before commit c460a6a7bc75 ("ofproto/bond: simplifying the >> rebalancing logic"), will not help, because a single hash is not >> enough to make the difference in load less than 10% of the total load, >> and this member will handle only that one hash forever. >> >> To fix this, let's try to move multiple hashes at the same time to >> meet constraint # 3. >> >> The implementation includes sorting the "records" to be able to >> collect records with a cumulative load close enough to the ideal value. >> >> Signed-off-by: Ilya Maximets > > I reread this and it still looks good to me. > > I spotted one typo in a comment: > > diff --git a/ofproto/bond.c b/ofproto/bond.c > index b9bfa45493b8..c3e2083575b0 100644 > --- a/ofproto/bond.c > +++ b/ofproto/bond.c > @@ -1216,7 +1216,7 @@ choose_entries_to_migrate(const struct bond_member > *from, uint64_t to_tx_bytes, > } > > if (!cnt) { > -/* There is no entry which load less than or equal to 'ideal_delta'. > +/* There is no entry with load less than or equal to 'ideal_delta'. > * Lets try closest one. The closest is the last in sorted list. */ > struct bond_entry *closest; > > > Acked-by: Ben Pfaff > Thanks! I fixed that and a repeated 'to to' in the other comment and applied to master. I also backported to 2.15 as it applies cleanly there. Not sure about backporting to 2.13 though, because it's kind of an algorithmic change. I can do the backport later if someone thinks that it's needed. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH v2] bond: Fix broken rebalancing after link state changes.
On Tue, Jul 13, 2021 at 05:32:06PM +0200, Ilya Maximets wrote: > There are 3 constraints for moving hashes from one member to another: > > 1. The load difference exceeds ~ 3% of the load of one member. > 2. The difference in load between members exceeds 100,000 bytes. > 3. Moving the hash reduces the load difference by more than 10%. > > In the current implementation, if one of the members transitions to > the DOWN state, all hashes assigned to it will be moved to the other > members. After that, if the member goes UP, it will wait for > rebalancing to get hashes. But in case we have more than 10 equally > loaded hashes, it will never meet constraint # 3, because each hash > will handle less than 10% of the load. The situation gets worse when > the number of flows grows and it is almost impossible to transfer any > hash when all 256 hash records are used, which is very likely when we > have few hundred/thousand flows. > > As a result, if one of the members goes down and back up while traffic > flows, it will never be used to transmit packets again. This will not > be fixed even if we completely stop the traffic and start it again, > because the first two constraints will block rebalancing in the > earlier stages, while we have low traffic volume. > > Moving a single hash if the destination does not have any hashes, > as it was before commit c460a6a7bc75 ("ofproto/bond: simplifying the > rebalancing logic"), will not help, because a single hash is not > enough to make the difference in load less than 10% of the total load, > and this member will handle only that one hash forever. > > To fix this, let's try to move multiple hashes at the same time to > meet constraint # 3. > > The implementation includes sorting the "records" to be able to > collect records with a cumulative load close enough to the ideal value. > > Signed-off-by: Ilya Maximets I reread this and it still looks good to me. I spotted one typo in a comment: diff --git a/ofproto/bond.c b/ofproto/bond.c index b9bfa45493b8..c3e2083575b0 100644 --- a/ofproto/bond.c +++ b/ofproto/bond.c @@ -1216,7 +1216,7 @@ choose_entries_to_migrate(const struct bond_member *from, uint64_t to_tx_bytes, } if (!cnt) { -/* There is no entry which load less than or equal to 'ideal_delta'. +/* There is no entry with load less than or equal to 'ideal_delta'. * Lets try closest one. The closest is the last in sorted list. */ struct bond_entry *closest; Acked-by: Ben Pfaff ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2] bond: Fix broken rebalancing after link state changes.
There are 3 constraints for moving hashes from one member to another: 1. The load difference exceeds ~ 3% of the load of one member. 2. The difference in load between members exceeds 100,000 bytes. 3. Moving the hash reduces the load difference by more than 10%. In the current implementation, if one of the members transitions to the DOWN state, all hashes assigned to it will be moved to the other members. After that, if the member goes UP, it will wait for rebalancing to get hashes. But in case we have more than 10 equally loaded hashes, it will never meet constraint # 3, because each hash will handle less than 10% of the load. The situation gets worse when the number of flows grows and it is almost impossible to transfer any hash when all 256 hash records are used, which is very likely when we have few hundred/thousand flows. As a result, if one of the members goes down and back up while traffic flows, it will never be used to transmit packets again. This will not be fixed even if we completely stop the traffic and start it again, because the first two constraints will block rebalancing in the earlier stages, while we have low traffic volume. Moving a single hash if the destination does not have any hashes, as it was before commit c460a6a7bc75 ("ofproto/bond: simplifying the rebalancing logic"), will not help, because a single hash is not enough to make the difference in load less than 10% of the total load, and this member will handle only that one hash forever. To fix this, let's try to move multiple hashes at the same time to meet constraint # 3. The implementation includes sorting the "records" to be able to collect records with a cumulative load close enough to the ideal value. Signed-off-by: Ilya Maximets --- Version 2: * Rebased. * Few small style adjustments. * Combined the test with the main change in a single patch. I don't remember why I sent them separately 4 years ago. :) * I tried to play around with migration_threshold, but different values leads to exctly the same end result in my testing but with larger number of calls to choose_entries_to_migrate(). For example, in the unit test in this patch, with 10% threshold, it migrates 48 hashes at the first call and 5 more on a second, 53 hashes in total. If the threshold is 50% of ideal, it moves same 53 hashes but in smaller chunks: 22, 16, 8, 4, 2, 1 (hashes per call). This just takes a bit more of processing time and doesn't change the result. Note that all these migrations done within a single bond_rebalance() run, so there is no difference for actual mogration of packet flows. Therefore, I kept threshold as it was in the first versin of a patch, i.e. 10% difference with the ideal delta. ofproto/bond.c| 127 +- tests/ofproto-dpif.at | 104 +- 2 files changed, 179 insertions(+), 52 deletions(-) diff --git a/ofproto/bond.c b/ofproto/bond.c index 35b9caac0..b9bfa4549 100644 --- a/ofproto/bond.c +++ b/ofproto/bond.c @@ -1173,49 +1173,72 @@ bond_shift_load(struct bond_entry *hash, struct bond_member *to) bond->bond_revalidate = true; } -/* Picks and returns a bond_entry to migrate from 'from' (the most heavily +/* Picks and returns 'bond_entry's to migrate from 'from' (the most heavily * loaded bond member) to a bond member that has 'to_tx_bytes' bytes of load, * given that doing so must decrease the ratio of the load on the two members - * by at least 0.1. Returns NULL if there is no appropriate entry. + * by at least 0.1. Returns number of entries filled in 'to_migrate'. * - * The list of entries isn't sorted. I don't know of a reason to prefer to - * shift away small hashes or large hashes. */ -static struct bond_entry * -choose_entry_to_migrate(const struct bond_member *from, uint64_t to_tx_bytes) + * The list of entries is sorted in descending order of load. This allows us + * to collect subset of entries with accumulated load close to ideal. */ +static size_t +choose_entries_to_migrate(const struct bond_member *from, uint64_t to_tx_bytes, + struct bond_entry **to_migrate) OVS_REQ_WRLOCK(rwlock) { struct bond_entry *e; +/* Note, the ideal traffic is the mid point between 'from' and 'to'. + * This value does not change by rebalancing. */ +uint64_t ideal_tx_bytes = (from->tx_bytes + to_tx_bytes) / 2; +uint64_t ideal_delta = ideal_tx_bytes - to_tx_bytes; +uint64_t delta = 0; /* The amount to rebalance. */ +uint64_t new_low; /* The lower bandwidth between 'to' and 'from' + * after rebalancing. */ +uint64_t migration_threshold = ideal_delta / 10; /* 10% */ +size_t cnt = 0; if (ovs_list_is_short(>entries)) { /* 'from' carries no more than one MAC hash, so shifting load away from * it would be pointless. */ -return NULL;