On 3/28/25 11:00, Vladislav Odintsov wrote: > On 26.03.2025 15:30, Ilya Maximets wrote: >> On 3/25/25 00:21, Vladislav Odintsov wrote: >>> With this commit the rebalance of hash entries between bonding members >>> becomes less frequent. If we know all bond members' speed, we do not >>> move hash entries between them if load difference is less than 1.5% of >>> minimum link speed of bond members. >>> >>> Reported-at: >>> https://mail.openvswitch.org/pipermail/ovs-dev/2025-March/422028.html >>> Suggested-by: Ilya Maximets <i.maxim...@ovn.org> >>> Signed-off-by: Vladislav Odintsov <vlodintsov@k2.cloud> >>> --- >>> v1 -> v2: >>> - Addressed Ilya's, Eelco's, Mike's review comments. >>> - Docs updated. >>> - NEWS entry added. >>> --- >>> Documentation/topics/bonding.rst | 3 ++- >>> NEWS | 4 ++++ >>> ofproto/bond.c | 25 +++++++++++++++++++------ >>> 3 files changed, 25 insertions(+), 7 deletions(-) >> Hi, Vladislav. Thanks for v2! >> >> Have you seen my reply for your observation about comparing speed to raw >> byte counters here: >> https://mail.openvswitch.org/pipermail/ovs-dev/2025-March/422255.html >> >> Tl;DR; I think you're right and we need to change the way we count the load >> in order to be able to compare the load to a link speed. >> >> Best regards, Ilya Maximets. > > Hi Ilya, > > Thanks for review! > > I changed the condition before calling a bond_shift_load() function, so > now Bps with Bps instead of B with Bps should be compared. It seems to > me that this is good enough. > > Do I understand you correctly, that you proposed to change the movement > logic inside bond_shift_load() function? If yes, I'm wondering why do > we care, whether we compare total carried bytes during last rebalance > interval or an average bytes per second during the same interval? In > both cases we seem to fairly compare same units: B vs B or Bps vs Bps, > so the current logic of this function looks correct for me. And in both > cases the result should be the same. Am I missing something? >
Hmm. It's complicated. Yes, you're comparing apples to apples here, i.e. this version compares speed to speed. The problem, I think, is that our total amount of bytes divided by the rebalance interval doesn't give a correct or accurate enough approximation of the average speed. Let's say we have a constant 100Bps traffic on the port. And let's say we have a standard 10sec rebalance interval. After the first 10 seconds, our tx_bytes on this port will be 1000B. Then we'll divide it by 2 for our ad-hoc exponentially weighted moving average implementation. So, it becomes 500B. After the next interval we'll have another 1000B plus our 500B from the previous interval, so 1500B in total. That, divided by the rebalancing interval, gives us 150Bps "average speed", which is 50% higher than the actual average speed on this interface. Next time it will be 175, and so on. In a long run this is 100 * (1 + 1/2 + 1/4 + ...) = 200, which 2x of our actual average speed. So, on a constant packet rate, this way of measuring is not accurate at all. I'm not doing the math for the non-constant speed, but I'd guess it will not be accurate enough either, even if it may average better in some scenarios it might be much worse in others. And I'm not sure if we can make this style of calculation accurate enough in all cases by just choosing a different divisor. All in all, I think, we need to change the way we calculate the load in order to have a more or less accurate idea about the average speed. The Exponential Moving Average from lib/mov-avg.h should give us much more accurate result. The problem, however, is that we can't easily shift the load if we calculate the average on a sum. So, we'll need to calculate an average speed per hash and then compare sums of averages with the link speed. Does that make sense? Best regards, Ilya Maximets. _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev