Thanks for the comments. Got it. I updated the implementation & sent a new patch. Please let me know if there is any additional changes needed.
--Mehrdad On Wed, Feb 4, 2026 at 4:33 AM Dumitru Ceara <[email protected]> wrote: > On 2/4/26 8:30 AM, Ales Musil wrote: > > On Tue, Feb 3, 2026 at 8:40 PM Mehrdad Moradi via dev < > > [email protected]> wrote: > > > >> When ovn-controller returns DNS responses with multiple IP addresses, > >> clients typically use the first IP in the list. Without randomization, > >> all clients direct traffic to the same backend, leading to uneven load > >> distribution. > >> > >> This patch implements Fisher-Yates shuffle on IPv4 and IPv6 address > >> arrays before building DNS answers, ensuring each query returns IPs > >> in a randomized order for better traffic distribution across backends. > >> > >> Signed-off-by: Mehrdad Moradi <[email protected]> > >> --- > >> > > > > Hi Mehrdad, > > > > Hi Mehrdad, Ales, > > > I think there might have been slight misunderstanding. > > Dumitru suggested two things, generic shuffle and shuffled > > indices. If we are going the way of shuffled indices we don't > > need the generic shuffle. So I would put the shuffle into the > > shuffle_indices() function directly. I have also one more comment > > down below. > > > > Yes, that's what I meant, sorry if it wasn't too clear. > > > controller/pinctrl.c | 49 ++++++++++++++++++++++++++++++++++++-- > >> tests/ovn.at | 56 +++++++++++++++++++++++++++++++++++++++----- > >> 2 files changed, 97 insertions(+), 8 deletions(-) > >> > >> diff --git a/controller/pinctrl.c b/controller/pinctrl.c > >> index ab8a0a37c..25a4de6a2 100644 > >> --- a/controller/pinctrl.c > >> +++ b/controller/pinctrl.c > >> @@ -3365,6 +3365,44 @@ dns_build_ptr_answer( > >> free(encoded); > >> } > >> > >> +/* Shuffle an array in-place using Fisher-Yates algorithm. */ > >> +static void > >> +array_shuffle(void *array, size_t elem_size, size_t n) > >> +{ > >> + if (n <= 1) { > >> + return; > >> + } > >> + > >> + uint8_t *arr = array; > >> + for (size_t i = n - 1; i > 0; i--) { > >> + size_t j = random_range(i + 1); > >> + uint8_t *a = arr + i * elem_size; > >> + uint8_t *b = arr + j * elem_size; > >> + for (size_t k = 0; k < elem_size; k++) { > >> + uint8_t tmp = a[k]; > >> + a[k] = b[k]; > >> + b[k] = tmp; > >> + } > > Swapping byte by byte is not really the right way to do it. But if we > move it inside the "shuffle_indices" function we don't need to do this > anymore. We can just use a size_t temporary instead. > > >> + } > >> +} > >> + > >> +/* Return an array of shuffled indices [0, n-1] for DNS round-robin. > > This is a generic function, it just creates a nicely randomized > permutation of indices. It shouldn't mention DNS. It could also go in > ovn-util.[ch] while we're at it. > > >> + * Caller must free the returned array. Returns NULL if n == 0. */ > >> +static size_t * > >> +shuffle_indices(size_t n) > >> +{ > >> + if (!n) { > >> + return NULL; > >> + } > >> + > >> + size_t *indices = xmalloc(n * sizeof *indices); > >> + for (size_t i = 0; i < n; i++) { > >> + indices[i] = i; > >> + } > >> + array_shuffle(indices, sizeof *indices, n); > >> > > > > As said above, let's put the shuffle loop directly here. > > > > + return indices; > >> +} > >> + > >> #define DNS_RCODE_SERVER_REFUSE 0x5 > >> #define DNS_QUERY_TYPE_CLASS_LEN (2 * sizeof(ovs_be16)) > >> > >> @@ -3530,11 +3568,15 @@ pinctrl_handle_dns_lookup( > >> goto exit; > >> } > >> > >> + /* Shuffle indices for round-robin load balancing. */ > >> + size_t *ipv4_order = shuffle_indices(ip_addrs.n_ipv4_addrs); > >> + size_t *ipv6_order = shuffle_indices(ip_addrs.n_ipv6_addrs); > >> + > >> if (query_type == DNS_QUERY_TYPE_A || > >> query_type == DNS_QUERY_TYPE_ANY) { > >> for (size_t i = 0; i < ip_addrs.n_ipv4_addrs; i++) { > >> dns_build_a_answer(&dns_answer, in_queryname, idx, > >> - ip_addrs.ipv4_addrs[i].addr); > >> + > >> ip_addrs.ipv4_addrs[ipv4_order[i]].addr); > >> > > > > nit: This is a bit hard to read, I would make the addr separate variable. > > > > ancount++; > >> } > >> } > >> @@ -3543,11 +3585,14 @@ pinctrl_handle_dns_lookup( > >> query_type == DNS_QUERY_TYPE_ANY) { > >> for (size_t i = 0; i < ip_addrs.n_ipv6_addrs; i++) { > >> dns_build_aaaa_answer(&dns_answer, in_queryname, idx, > >> - &ip_addrs.ipv6_addrs[i].addr); > >> + > >> &ip_addrs.ipv6_addrs[ipv6_order[i]].addr); > >> > > > > nit: Same here. > > > > > >> ancount++; > >> } > >> } > >> > >> + free(ipv4_order); > >> + free(ipv6_order); > >> + > >> /* DNS is configured with a record for this domain with > >> * an IPv4/IPV6 only, so instead of ignoring this A/AAAA query, > >> * we can reply with RCODE = 5 (server refuses) and that > >> diff --git a/tests/ovn.at b/tests/ovn.at > >> index 4d15d4a62..d5e51c60b 100644 > >> --- a/tests/ovn.at > >> +++ b/tests/ovn.at > >> @@ -11636,6 +11636,7 @@ set_dns_params() { > >> local ttl=00000e10 > >> an_count=0001 > >> type=0001 > >> + expected_dns_answer_alt="" > >> case $hname in > >> vm1) > >> # vm1.ovn.org > >> @@ -11652,10 +11653,13 @@ set_dns_params() { > >> vm2) > >> # vm2.ovn.org > >> query_name=03766d32036f766e036f726700 > >> + # IPv4 addresses may be returned in any order due to DNS > >> round-robin. > >> # IPv4 address - 10.0.0.6 > >> - expected_dns_answer=${query_name}00010001${ttl}00040a000006 > >> + local ip1=${query_name}00010001${ttl}00040a000006 > >> # IPv4 address - 20.0.0.4 > >> - > >> > expected_dns_answer=${expected_dns_answer}${query_name}00010001${ttl}000414000004 > >> + local ip2=${query_name}00010001${ttl}000414000004 > >> + expected_dns_answer=${ip1}${ip2} > >> + expected_dns_answer_alt=${ip2}${ip1} > >> an_count=0002 > >> ;; > >> vm3) > >> @@ -11712,6 +11716,12 @@ set_dns_params() { > >> local dns_resp_header=010281200001${an_count}00000000 > >> dns_req_data=${dns_req_header}${query_name}${type}0001 > >> > >> > dns_resp_data=${dns_resp_header}${query_name}${type}0001${expected_dns_answer} > >> + # Alternative response for cases with multiple IPs (DNS > round-robin). > >> + if test -n "$expected_dns_answer_alt"; then > >> + > >> > dns_resp_data_alt=${dns_resp_header}${query_name}${type}0001${expected_dns_answer_alt} > >> + else > >> + dns_resp_data_alt="" > >> + fi > >> } > >> > >> # This shell function sends a DNS request packet > >> @@ -11789,17 +11799,51 @@ src_ip=`ip_to_hex 10 0 0 4` > >> dst_ip=`ip_to_hex 10 0 0 1` > >> dns_reply=1 > >> test_dns 1 f00000000001 f000000000f0 $src_ip $dst_ip $dns_reply > >> $dns_req_data $dns_resp_data > >> +# Generate alternative expected response for DNS round-robin (without > >> sending packet). > >> +if test -n "$dns_resp_data_alt"; then > >> + # Build expected reply packet with alternative IP ordering. > >> + ip_len_val=`expr 28 + ${#dns_resp_data_alt} / 2` > >> + udp_len_val=`expr $ip_len_val - 20` > >> + ip_len_hex=$(printf "%x" $ip_len_val) > >> + udp_len_hex=$(printf "%x" $udp_len_val) > >> + > >> > reply_alt=f00000000001f000000000f00800450000${ip_len_hex}0000000080110000 > >> + > >> > reply_alt=${reply_alt}${dst_ip}${src_ip}0035923400${udp_len_hex}0000${dns_resp_data_alt} > >> + echo $reply_alt > 1.expected.alt > >> +fi > >> > >> # NXT_RESUMEs should be 1. > >> OVS_WAIT_UNTIL([test 1 = `cat ofctl_monitor*.log | grep -c > NXT_RESUME`]) > >> > >> -OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [1.expected], ["cut -c -48"]) > >> -# Skipping the IPv4 checksum. > >> -OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [1.expected], ["cut -c 53-"]) > >> +# Check packets - for vm2, IPs may be in either order due to DNS > >> round-robin. > >> +if test -f 1.expected.alt; then > >> + $PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/vif1-tx.pcap > > >> 1.received > >> + cut -c 53- 1.expected > 1.expected.tail > >> + cut -c 53- 1.expected.alt > 1.expected.alt.tail > >> + cut -c 53- 1.received > 1.received.tail > >> + # Accept either ordering of IPs. > >> + if diff -q 1.expected.tail 1.received.tail > /dev/null 2>&1 || \ > >> + diff -q 1.expected.alt.tail 1.received.tail > /dev/null 2>&1; > then > >> + : > >> + else > >> + echo "Expected (order 1):" && cat 1.expected.tail > >> + echo "Expected (order 2):" && cat 1.expected.alt.tail > >> + echo "Received:" && cat 1.received.tail > >> + AT_FAIL_IF([true]) > >> + fi > >> + # Check header (first 48 chars). > >> + cut -c -48 1.expected > 1.expected.head > >> + cut -c -48 1.received > 1.received.head > >> + AT_CHECK([diff 1.expected.head 1.received.head]) > >> +else > >> + OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [1.expected], ["cut -c -48"]) > >> + # Skipping the IPv4 checksum. > >> + OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [1.expected], ["cut -c 53-"]) > >> +fi > >> > >> reset_pcap_file hv1-vif1 hv1/vif1 > >> reset_pcap_file hv1-vif2 hv1/vif2 > >> -rm -f 1.expected > >> +rm -f 1.expected 1.expected.alt 1.received > >> +rm -f 1.expected.head 1.expected.tail 1.expected.alt.tail > 1.received.head > >> 1.received.tail > >> rm -f 2.expected > >> > >> > >> -- > >> 2.52.0 > >> > >> _______________________________________________ > >> dev mailing list > >> [email protected] > >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >> > >> > > If Dumitru agrees I can make the changes during merge. > > > > At this point, I think it might be clearer if Mehrdad posts a v3 > addressing all remaining comments. We're not in a hurry to get this > merged, we already agreed to accept it in 26.03. > > Regards, > Dumitru > > > Regards, > > Ales > > > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
