Re: [ovs-dev] [PATCH v2] bond: Fix broken rebalancing after link state changes.

2021-07-16 Thread Ilya Maximets
On 7/16/21 1:51 AM, Ben Pfaff wrote:
> On Tue, Jul 13, 2021 at 05:32:06PM +0200, Ilya Maximets wrote:
>> There are 3 constraints for moving hashes from one member to another:
>>
>>   1. The load difference exceeds ~ 3% of the load of one member.
>>   2. The difference in load between members exceeds 100,000 bytes.
>>   3. Moving the hash reduces the load difference by more than 10%.
>>
>> In the current implementation, if one of the members transitions to
>> the DOWN state, all hashes assigned to it will be moved to the other
>> members.  After that, if the member goes UP, it will wait for
>> rebalancing to get hashes.  But in case we have more than 10 equally
>> loaded hashes, it will never meet constraint # 3, because each hash
>> will handle less than 10% of the load.  The situation gets worse when
>> the number of flows grows and it is almost impossible to transfer any
>> hash when all 256 hash records are used, which is very likely when we
>> have few hundred/thousand flows.
>>
>> As a result, if one of the members goes down and back up while traffic
>> flows, it will never be used to transmit packets again.  This will not
>> be fixed even if we completely stop the traffic and start it again,
>> because the first two constraints will block rebalancing in the
>> earlier stages, while we have low traffic volume.
>>
>> Moving a single hash if the destination does not have any hashes,
>> as it was before commit c460a6a7bc75 ("ofproto/bond: simplifying the
>> rebalancing logic"), will not help, because a single hash is not
>> enough to make the difference in load less than 10% of the total load,
>> and this member will handle only that one hash forever.
>>
>> To fix this, let's try to move multiple hashes at the same time to
>> meet constraint # 3.
>>
>> The implementation includes sorting the "records" to be able to
>> collect records with a cumulative load close enough to the ideal value.
>>
>> Signed-off-by: Ilya Maximets 
> 
> I reread this and it still looks good to me.
> 
> I spotted one typo in a comment:
> 
> diff --git a/ofproto/bond.c b/ofproto/bond.c
> index b9bfa45493b8..c3e2083575b0 100644
> --- a/ofproto/bond.c
> +++ b/ofproto/bond.c
> @@ -1216,7 +1216,7 @@ choose_entries_to_migrate(const struct bond_member 
> *from, uint64_t to_tx_bytes,
>  }
>  
>  if (!cnt) {
> -/* There is no entry which load less than or equal to 'ideal_delta'.
> +/* There is no entry with load less than or equal to 'ideal_delta'.
>   * Lets try closest one. The closest is the last in sorted list. */
>  struct bond_entry *closest;
>  
> 
> Acked-by: Ben Pfaff 
> 

Thanks!  I fixed that and a repeated 'to to' in the other comment
and applied to master.  I also backported to 2.15 as it applies
cleanly there.  Not sure about backporting to 2.13 though, because
it's kind of an algorithmic change.  I can do the backport later if
someone thinks that it's needed.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] bond: Fix broken rebalancing after link state changes.

2021-07-15 Thread Ben Pfaff
On Tue, Jul 13, 2021 at 05:32:06PM +0200, Ilya Maximets wrote:
> There are 3 constraints for moving hashes from one member to another:
> 
>   1. The load difference exceeds ~ 3% of the load of one member.
>   2. The difference in load between members exceeds 100,000 bytes.
>   3. Moving the hash reduces the load difference by more than 10%.
> 
> In the current implementation, if one of the members transitions to
> the DOWN state, all hashes assigned to it will be moved to the other
> members.  After that, if the member goes UP, it will wait for
> rebalancing to get hashes.  But in case we have more than 10 equally
> loaded hashes, it will never meet constraint # 3, because each hash
> will handle less than 10% of the load.  The situation gets worse when
> the number of flows grows and it is almost impossible to transfer any
> hash when all 256 hash records are used, which is very likely when we
> have few hundred/thousand flows.
> 
> As a result, if one of the members goes down and back up while traffic
> flows, it will never be used to transmit packets again.  This will not
> be fixed even if we completely stop the traffic and start it again,
> because the first two constraints will block rebalancing in the
> earlier stages, while we have low traffic volume.
> 
> Moving a single hash if the destination does not have any hashes,
> as it was before commit c460a6a7bc75 ("ofproto/bond: simplifying the
> rebalancing logic"), will not help, because a single hash is not
> enough to make the difference in load less than 10% of the total load,
> and this member will handle only that one hash forever.
> 
> To fix this, let's try to move multiple hashes at the same time to
> meet constraint # 3.
> 
> The implementation includes sorting the "records" to be able to
> collect records with a cumulative load close enough to the ideal value.
> 
> Signed-off-by: Ilya Maximets 

I reread this and it still looks good to me.

I spotted one typo in a comment:

diff --git a/ofproto/bond.c b/ofproto/bond.c
index b9bfa45493b8..c3e2083575b0 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -1216,7 +1216,7 @@ choose_entries_to_migrate(const struct bond_member *from, 
uint64_t to_tx_bytes,
 }
 
 if (!cnt) {
-/* There is no entry which load less than or equal to 'ideal_delta'.
+/* There is no entry with load less than or equal to 'ideal_delta'.
  * Lets try closest one. The closest is the last in sorted list. */
 struct bond_entry *closest;
 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] bond: Fix broken rebalancing after link state changes.

2021-07-13 Thread Ilya Maximets
There are 3 constraints for moving hashes from one member to another:

  1. The load difference exceeds ~ 3% of the load of one member.
  2. The difference in load between members exceeds 100,000 bytes.
  3. Moving the hash reduces the load difference by more than 10%.

In the current implementation, if one of the members transitions to
the DOWN state, all hashes assigned to it will be moved to the other
members.  After that, if the member goes UP, it will wait for
rebalancing to get hashes.  But in case we have more than 10 equally
loaded hashes, it will never meet constraint # 3, because each hash
will handle less than 10% of the load.  The situation gets worse when
the number of flows grows and it is almost impossible to transfer any
hash when all 256 hash records are used, which is very likely when we
have few hundred/thousand flows.

As a result, if one of the members goes down and back up while traffic
flows, it will never be used to transmit packets again.  This will not
be fixed even if we completely stop the traffic and start it again,
because the first two constraints will block rebalancing in the
earlier stages, while we have low traffic volume.

Moving a single hash if the destination does not have any hashes,
as it was before commit c460a6a7bc75 ("ofproto/bond: simplifying the
rebalancing logic"), will not help, because a single hash is not
enough to make the difference in load less than 10% of the total load,
and this member will handle only that one hash forever.

To fix this, let's try to move multiple hashes at the same time to
meet constraint # 3.

The implementation includes sorting the "records" to be able to
collect records with a cumulative load close enough to the ideal value.

Signed-off-by: Ilya Maximets 
---

Version 2:
  * Rebased.
  * Few small style adjustments.
  * Combined the test with the main change in a single patch.
I don't remember why I sent them separately 4 years ago. :)

  * I tried to play around with migration_threshold, but different
values leads to exctly the same end result in my testing but with
larger number of calls to choose_entries_to_migrate(). For example,
in the unit test in this patch, with 10% threshold, it migrates 48
hashes at the first call and 5 more on a second, 53 hashes in total.
If the threshold is 50% of ideal, it moves same 53 hashes but in
smaller chunks: 22, 16, 8, 4, 2, 1 (hashes per call).  This just
takes a bit more of processing time and doesn't change the result.
Note that all these migrations done within a single bond_rebalance()
run, so there is no difference for actual mogration of packet flows.

Therefore, I kept threshold as it was in the first versin of a patch,
i.e. 10% difference with the ideal delta.

 ofproto/bond.c| 127 +-
 tests/ofproto-dpif.at | 104 +-
 2 files changed, 179 insertions(+), 52 deletions(-)

diff --git a/ofproto/bond.c b/ofproto/bond.c
index 35b9caac0..b9bfa4549 100644
--- a/ofproto/bond.c
+++ b/ofproto/bond.c
@@ -1173,49 +1173,72 @@ bond_shift_load(struct bond_entry *hash, struct 
bond_member *to)
 bond->bond_revalidate = true;
 }
 
-/* Picks and returns a bond_entry to migrate from 'from' (the most heavily
+/* Picks and returns 'bond_entry's to migrate from 'from' (the most heavily
  * loaded bond member) to a bond member that has 'to_tx_bytes' bytes of load,
  * given that doing so must decrease the ratio of the load on the two members
- * by at least 0.1.  Returns NULL if there is no appropriate entry.
+ * by at least 0.1.  Returns number of entries filled in 'to_migrate'.
  *
- * The list of entries isn't sorted.  I don't know of a reason to prefer to
- * shift away small hashes or large hashes. */
-static struct bond_entry *
-choose_entry_to_migrate(const struct bond_member *from, uint64_t to_tx_bytes)
+ * The list of entries is sorted in descending order of load.  This allows us
+ * to collect subset of entries with accumulated load close to ideal.  */
+static size_t
+choose_entries_to_migrate(const struct bond_member *from, uint64_t to_tx_bytes,
+  struct bond_entry **to_migrate)
 OVS_REQ_WRLOCK(rwlock)
 {
 struct bond_entry *e;
+/* Note, the ideal traffic is the mid point between 'from' and 'to'.
+ * This value does not change by rebalancing.  */
+uint64_t ideal_tx_bytes = (from->tx_bytes + to_tx_bytes) / 2;
+uint64_t ideal_delta = ideal_tx_bytes - to_tx_bytes;
+uint64_t delta = 0; /* The amount to rebalance. */
+uint64_t new_low;   /* The lower bandwidth between 'to' and 'from'
+ * after rebalancing. */
+uint64_t migration_threshold = ideal_delta / 10; /* 10% */
+size_t cnt = 0;
 
 if (ovs_list_is_short(>entries)) {
 /* 'from' carries no more than one MAC hash, so shifting load away from
  * it would be pointless. */
-return NULL;