[kudu-CR] [net] KUDU-1457 [1/n] Add support for IPv6

Ashwani Raina (Code Review) Thu, 24 Jul 2025 08:40:13 -0700

Ashwani Raina has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22984 )


Change subject: [net] KUDU-1457 [1/n] Add support for IPv6
......................................................................


Patch Set 6:

(29 comments)

http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/rpc/server_negotiation.cc
File src/kudu/rpc/server_negotiation.cc:

http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/rpc/server_negotiation.cc@104
PS6, Line 104: fc00::/7
> Isn't this the _reserved_ range where no IPv6 address might be actually fro
Correct - fc00::/8 is not defined so no address is expected from the 
corresponding range.

I guess you meant fd00::/8 instead of fd00::/7 representing private networks. 
fc00::/7 is the overarching block for unique local addresses. While the full 
range of fc00::/7 subsumes fd00::/8 (which is defined per 
https://datatracker.ietf.org/doc/html/rfc4193#section-3.1) and fc00::/8 (which 
is undefined part), I thought it would be safe to consider all of fc00::/7 as 
trusted subnet.

Since fc00::/8 is not defined and may serve different purpose in future, I 
guess this can be changed to fd00::/8 to restrict the checks for IPv6 address 
range that is define per rfc.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/rpc/server_negotiation.cc@104
PS6, Line 104: fe80::/10
> nit: it would be great to verify that this results in fe80::/64 network sta
Not sure if I understand your point. Do you want to see whether fe80::/10 
starting address falls in fe80::/64 network?
Something like this?

Network network1;                                               
Network network2;
ASSERT_OK(network1.ParseCIDRString("fe80::/10"));   
ASSERT_OK(network2.ParseCIDRString("fe80::/64"));
ASSERT_EQ(network1.GetAddrAsString(),network2.GetAddrAsString());

Perhaps, it would make sense to add a unit test that uses 
ServerNegotiation::IsTrustedConnection() to ensure input Sockaddr belongs to a 
trusted network. I didn't any existing test that does it explicitly. I do want 
to add that there are a couple of unit tests (in 
src/kudu/rpc/negotiation-test.cc) that make use of 
ServerNegotiation::Negotiate() to achieve similar goal under specific 
conditions.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/rpc/server_negotiation.cc@104
PS6, Line 104:               "fe80::/10,fc00::/7",
> What about the loopback address ::1/128?  Shouldn't we include it into the
Done.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/diagnostic_socket.cc
File src/kudu/util/net/diagnostic_socket.cc:

http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/diagnostic_socket.cc@176
PS6, Line 176:   // TODO(araina): Remove this once diagnostic socket handling 
is added for IPv6.
> I see there https://gerrit.cloudera.org/#/c/23021 has already been posted f
Correct!


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util-test.cc
File src/kudu/util/net/net_util-test.cc:

http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util-test.cc@188
PS6, Line 188:   // Test some invalid addresses.
             :   Status s = DoParseBindAddresses("0.0.0.0:xyz", &ret);
             :   ASSERT_STR_CONTAINS(s.ToString(), "invalid port");
             :
             :   s = DoParseBindAddresses("0.0.0.0:100000", &ret);
             :   ASSERT_STR_CONTAINS(s.ToString(), "invalid port");
             :
             :   s = DoParseBindAddresses("0.0.0.0:", &ret);
             :   ASSERT_STR_CONTAINS(s.ToString(), "invalid port");
> Add similar negative test cases like these for IPv6 wildcard addresses as w
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util-test.cc@321
PS6, Line 321: IsNetworkError()
> nit for here and elsewhere in this file: when using Status::IsXxx(), please
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util-test.cc@326
PS6, Line 326: fd00:1234:5678:9abc::1
> Maybe not in this particular scenario, but in more appropriate one please a
The result should be the same as for lowercase representation.

Hexadecimal characters in an IPv6 address are case-insensitive i.e. 
"FD00:1234:5678:9ABC::1" is same as "fd00:1234:5678:9abc::1"
However, it is recommended to use lowercase to avoid any reading confusion 
between characters like 'D' and '0', '8' and 'B'.

Refer these sections in the rfc (https://www.ietf.org/rfc/rfc5952.txt):
- 1.  Introduction
- 4.3.  Lowercase

Since, ParseString() and ParseCIDRString() don't do any additional checks on 
the character case, there is not much value in adding verification on parsing 
addresses expressed in uppercase unless I am missing something.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.h
File src/kudu/util/net/net_util.h:

http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.h@121
PS6, Line 121: common representation
> What's 'a common representation'?  Is this the term used in corresponding R
This does sound vague. Replaced with apt notation for both IPv4 and IPv6.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.h@156
PS6, Line 156: class Network {
             :  public:
             :   Network();
             :   Network(sa_family_t family, uint128_t addr, uint128_t netmask);
> Wouldn't it be better to templatize this on the type of IP address family?
Maybe we can discuss this in subsequent changes. I was thinking of using 
std::variant for addr_ and netmask_ along with templatizing the ctor. What do 
you think?


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.h@164
PS6, Line 164: Returns true if the network is within 127.0.0.0/8 range.
> Update the comment?
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc
File src/kudu/util/net/net_util.cc:

http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@173
PS6, Line 173:
> Instead of introducing this helper function, consider adopting the code in
I changed logic for IPv6 parsing a bit which renders this function unnecessary. 
For both IPv4 and IPv6 splitting, I am now using string_view::find, 
string_view::substr and std::pair.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@174
PS6, Line 174: std::
> nit: add 'using std::string_view;' into the corresponding section in the be
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@177
PS6, Line 177: size_t
> nit: use 'auto' instead?
Not a big fan of auto when it comes to correctness and readability. Given 
substr() expects arguments of type size_t, keeping this as is makes more sense.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@202
PS6, Line 202: Substitute("invalid address $0", addr
> This isn't going to help to see the invalid input since 'addr' is already e
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@206
PS6, Line 206: addr.find(']');
> If you want to find the rightmost closing bracket, maybe use 'rfind' instea
I was not really looking for the rightmost closing bracket but I guess it would 
make sense to include everything in between first opening and last closing 
bracket and treat that IP address and let subsequent checks take care of its 
validity.

I have added a check for early detection of multiple occurrence of opening or 
closing brackets. Corresponding unit tests are added as well.

I could have relied on subsequent network call i.e. inet_pton (applicable for 
most of the cases) to validate the address but in small number of cases (hms, 
unit tests, etc), there is no immediate subsequent network call. So, in such 
cases, there are chances that we are letting invalid address to roam free for 
more time until it gets busted by some network eventually.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@215
PS6, Line 215: validate_host
> Consider renaming this: 'validate' isn't descriptive since this functor not
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@216
PS6, Line 216: host == ""
> Use host.empty() instead?
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@217
PS6, Line 217: Substitute(
> nit: there is nothing to substitute in the formatted message -- is somethin
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@227
PS6, Line 227: Substitute("$0", addr)
> Could we get away by not using Substitute() here at all?
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@241
PS6, Line 241: std::string
> This is a bit funny: creating a std::string instance with all the memory al
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@264
PS6, Line 264:     RETURN_NOT_OK_PREPEND(validate_host(addr.substr(1, 
addr.length() - 2)),
> Does it make sense to add extra sanity check of addr.front() == '[' in this
My intention was to rely on subsequent network call for any further validation 
check. However, given the need to detect errors early and in some cases no 
immediate network call is to be seen, it would make sense to add the check.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@272
PS6, Line 272:   vector<std::string_view> parts = SplitStringView(addr, ']');
             :   DCHECK(parts.size() == 2);
> If you are expecting just one ']' symbol in the string at this point, why n
I reorganised the logic. No to explicitly do a rfind. Used bracket_right 
instead.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@275
PS6, Line 275: after opening bracket
> Does it make sense to check whether the first symbol is indeed the opening
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@280
PS6, Line 280: SimpleAtoi
> This seems like it's time to introduce SimpleAtoi() that accepts std::strin
I have changed logic above that doesn't require string_view. Instead, I use 
string_view::data() to point to start index.


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/net_util.cc@432
PS6, Line 432: strcount(host_, ':')
> Use find()/rfind() instead -- it's enough to find just one instance of ':'
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/sockaddr.h
File src/kudu/util/net/sockaddr.h:

http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/sockaddr.h@87
PS6, Line 87: A.B.C.D:port
> nit: Update comment?
Done


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/sockaddr.h@203
PS6, Line 203: sockaddr_storage
> Please add static_assert() on the size of sockaddr_storage (in the Sockaddr
Done


http://gerrit.cloudera.org:8080/#/c/22984/1/src/kudu/util/net/sockaddr.cc
File src/kudu/util/net/sockaddr.cc:

http://gerrit.cloudera.org:8080/#/c/22984/1/src/kudu/util/net/sockaddr.cc@225
PS1, Line 225: Sockaddr& Sockaddr::operator=(const struct sockaddr_in &addr) {
             :   // Do not count the padding sin
> Right -- there isn't sin_zero field in sockaddr_in6, but there is another f
Actually, this code doesn't need to change as long as we are setting length to 
full size of sockaddr_in6 instead of just considering the partial struct and 
excluding sin6_scope_id (which is what I am doing at 
https://gerrit.cloudera.org/#/c/22984/6/src/kudu/util/net/sockaddr.cc@158)

I think it would make sense to just set both fields i.e. sin6_flowinfo and 
sin6_scope_id to 0 inside ParseFromNumericHostPort() and include size of 
sin6_scope_id as well in the length. If there is some purpose/use-case for 
these two fields that arises in future, we can make necessary changes in the 
parse function.

With that, there is no need to do all this circus of setting length to 
partially. Also, there is no need to worry about ordering as well. Irrespective 
of what order these two unused fields have, they will always be 0 and have 
uniform behaviour for hash function, comparison, etc.

Thoughts?


http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/sockaddr.cc
File src/kudu/util/net/sockaddr.cc:

http://gerrit.cloudera.org:8080/#/c/22984/6/src/kudu/util/net/sockaddr.cc@138
PS6, Line 138: static_assert
> nit: Do we need the corresponding static asserts in case of IPv6 too?
For IPv4, the static check is to ensure that length corresponds to only the IP 
address part of 'struct sockaddr_in' and ignore padding (that is the last field 
inside struct). Hence, all the relevant fields are within the limit and don't 
cross into padding.

For IPv6, there is no padding. However, there are other fields (sin6_flowinfo 
and sin6_scope_id) that have specific purposes but we may not have any use for 
those fields as of now. Maybe in future, we do. Since these unused fields are 
there for purpose and might have some application in future, it is better to 
set those to '0' and include them in the length. With that logic, there is no 
need to check whether any of the fields inside the structure goes beyond the 
length because all of them have been considered while setting the length.

For more info: 
https://gerrit.cloudera.org/#/c/22984/1/src/kudu/util/net/sockaddr.cc@226



--
To view, visit http://gerrit.cloudera.org:8080/22984
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I22c773ffb2ff44b9cd765b546d6724ec5543586f
Gerrit-Change-Number: 22984
Gerrit-PatchSet: 6
Gerrit-Owner: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Ashwani Raina <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 24 Jul 2025 15:39:57 +0000
Gerrit-HasComments: Yes

[kudu-CR] [net] KUDU-1457 [1/n] Add support for IPv6

Reply via email to