On 3/30/22 16:07, Dumitru Ceara wrote:
> On 3/30/22 15:08, Ilya Maximets wrote:
>> On 3/21/22 11:45, Dumitru Ceara wrote:
>>> Hi Ilya.
>>>
>>> On 3/10/22 23:33, Ilya Maximets wrote:
>>>> Clustered OVSDB allows to use DNS names as addresses of raft members.
>>>> However, if DNS resolution fails during the initial database read,
>>>> this causes a fatal failure and exit of the ovsdb-server process.
>>>>
>>>> Also, if DNS name of a joining server is not resolvable for one of the
>>>> followers, this follower will reject append requests for a new server
>>>> to join until the name is successfully resolved. This makes a follower
>>>> effectively non-functional while DNS is unavailable.
>>>>
>>>> To fix the problem relaxing the address verification. Allowing
>>>
>>> s/relaxing/relax/
>>>
>>>> validation to pass if only name resolution failed and the address
>>>> is valid otherwise. This will allow addresses to be added to the
>>>> database, so connections could be established later when the DNS
>>>> is available.
>>>
>>> This sounds reasonable to me.
>>>
>>>>
>>>> Additionally fixing missed initialization of the dns-resolve module.
>>>> Wihtout it, DNS requests are blocking. This causes unexpected delays
>>>
>>> Typo: Wihtout
>>>
>>>> in runtime.
>>>>
>>>> Fixes: 771680d96fb6 ("DNS: Add basic support for asynchronous DNS
>>>> resolving")
>>>> Reported-at: https://bugzilla.redhat.com/2055097
>>>> Signed-off-by: Ilya Maximets <[email protected]>
>>>> ---
>>>> lib/dns-resolve.c | 2 +-
>>>> lib/socket-util.c | 39 ++++++++++++++++++++++++------------
>>>> lib/socket-util.h | 3 ++-
>>>> lib/stream.c | 2 +-
>>>> ofproto/ofproto-dpif-sflow.c | 3 ++-
>>>> ovsdb/ovsdb-server.c | 3 +++
>>>> ovsdb/raft-private.c | 5 ++++-
>>>> 7 files changed, 39 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/lib/dns-resolve.c b/lib/dns-resolve.c
>>>> index d34451434..8bcecb90c 100644
>>>> --- a/lib/dns-resolve.c
>>>> +++ b/lib/dns-resolve.c
>>>> @@ -265,7 +265,7 @@ resolve_callback__(void *req_, int err, struct
>>>> ub_result *result)
>>>> if (err != 0 || (result->qtype == ns_t_aaaa && !result->havedata)) {
>>>> ub_resolve_free(result);
>>>> req->state = RESOLVE_ERROR;
>>>> - VLOG_ERR_RL(&rl, "%s: failed to resolve", req->name);
>>>> + VLOG_WARN_RL(&rl, "%s: failed to resolve", req->name);
>>>> return;
>>>> }
>>>>
>>>> diff --git a/lib/socket-util.c b/lib/socket-util.c
>>>> index 4f1ffecf5..38705cc51 100644
>>>> --- a/lib/socket-util.c
>>>> +++ b/lib/socket-util.c
>>>> @@ -62,7 +62,8 @@ static bool parse_sockaddr_components(struct
>>>> sockaddr_storage *ss,
>>>> const char *port_s,
>>>> uint16_t default_port,
>>>> const char *s,
>>>> - bool resolve_host);
>>>> + bool resolve_host,
>>>> + bool *dns_failure);
>>>>
>>>> /* Sets 'fd' to non-blocking mode. Returns 0 if successful, otherwise a
>>>> * positive errno value. */
>>>> @@ -438,7 +439,7 @@ parse_sockaddr_components_dns(struct sockaddr_storage
>>>> *ss OVS_UNUSED,
>>>> dns_resolve(host_s, &tmp_host_s);
>>>> if (tmp_host_s != NULL) {
>>>> parse_sockaddr_components(ss, tmp_host_s, port_s,
>>>> - default_port, s, false);
>>>> + default_port, s, false, NULL);
>>>> free(tmp_host_s);
>>>> return true;
>>>> }
>>>> @@ -450,11 +451,15 @@ parse_sockaddr_components(struct sockaddr_storage
>>>> *ss,
>>>> char *host_s,
>>>> const char *port_s, uint16_t default_port,
>>>> const char *s,
>>>> - bool resolve_host)
>>>> + bool resolve_host, bool *dns_failure)
>>>> {
>>>> struct sockaddr_in *sin = sin_cast(sa_cast(ss));
>>>> int port;
>>>>
>>>> + if (dns_failure) {
>>>> + *dns_failure = false;
>>>> + }
>>>> +
>>>> if (port_s && port_s[0]) {
>>>> if (!str_to_int(port_s, 10, &port) || port < 0 || port > 65535) {
>>>> VLOG_ERR("%s: bad port number \"%s\"", s, port_s);
>>>> @@ -501,10 +506,15 @@ parse_sockaddr_components(struct sockaddr_storage
>>>> *ss,
>>>> return true;
>>>>
>>>> resolve:
>>>> - if (resolve_host && parse_sockaddr_components_dns(ss, host_s, port_s,
>>>> - default_port, s)) {
>>>> - return true;
>>>> - } else if (!resolve_host) {
>>>> + if (resolve_host) {
>>>> + if (parse_sockaddr_components_dns(ss, host_s, port_s,
>>>> + default_port, s)) {
>>>> + return true;
>>>> + }
>>>> + if (dns_failure) {
>>>> + *dns_failure = true;
>>>> + }
>>>> + } else {
>>>> VLOG_ERR("%s: bad IP address \"%s\"", s, host_s);
>>>> }
>>>> exit:
>>>> @@ -521,10 +531,12 @@ exit:
>>>> * It resolves the host if 'resolve_host' is true.
>>>> *
>>>> * On success, returns true and stores the parsed remote address into
>>>> '*ss'.
>>>> - * On failure, logs an error, stores zeros into '*ss', and returns false.
>>>> */
>>>> + * On failure, logs an error, stores zeros into '*ss', and returns false,
>>>> + * '*dns_failure' indicates if the host resolution failed. */
>>>> bool
>>>> inet_parse_active(const char *target_, int default_port,
>>>> - struct sockaddr_storage *ss, bool resolve_host)
>>>> + struct sockaddr_storage *ss,
>>>> + bool resolve_host, bool *dns_failure)
>>>
>>> It's not really a fair ask I guess, but would it be possible to avoid
>>> changing this signature?
>>>
>>> OVN uses it in a few places. I know there's no real contract for
>>> functions in lib/*.h but OVN does use them directly.
>>>
>>> If you don't see a simple way of avoiding the signature change, it's not
>>> a big deal I guess, we can always just bump the OVS submodule version in
>>> OVN.
>>
>> Yeah, I'm not sure we can actually keep the signature intact.
>> In general, DNS resolve is a bit hacked into the process and
>> it's hard to manage it in a good way. I think, this API should
>> be split in parts or changed to make error reporting more
>> clear, e.g. by returning a enum instead of bool. But none of
>> these options will actually preserve the signature.
>>
>> Does that make sense?
>>
>> Also, the next submodule update in OVN will likely require
>> code changes due to UB fixes for loops, so it doesn't seem
>> important to keep update seamless for this part.
>>
>
> Good point. Let's keep it like in your original patch.
>
>>>
>>>> {
>>>> char *target = xstrdup(target_);
>>>> char *port, *host;
>>>> @@ -539,7 +551,7 @@ inet_parse_active(const char *target_, int
>>>> default_port,
>>>> ok = false;
>>>> } else {
>>>> ok = parse_sockaddr_components(ss, host, port, default_port,
>>>> - target_, resolve_host);
>>>> + target_, resolve_host,
>>>> dns_failure);
>>>> }
>>>> if (!ok) {
>>>> memset(ss, 0, sizeof *ss);
>>>> @@ -576,7 +588,7 @@ inet_open_active(int style, const char *target, int
>>>> default_port,
>>>> int error;
>>>>
>>>> /* Parse. */
>>>> - if (!inet_parse_active(target, default_port, &ss, true)) {
>>>> + if (!inet_parse_active(target, default_port, &ss, true, NULL)) {
>>>> error = EAFNOSUPPORT;
>>>> goto exit;
>>>> }
>>>> @@ -660,7 +672,7 @@ inet_parse_passive(const char *target_, int
>>>> default_port,
>>>> ok = false;
>>>> } else {
>>>> ok = parse_sockaddr_components(ss, host, port, default_port,
>>>> - target_, true);
>>>> + target_, true, NULL);
>>>> }
>>>> if (!ok) {
>>>> memset(ss, 0, sizeof *ss);
>>>> @@ -783,7 +795,8 @@ inet_parse_address(const char *target_, struct
>>>> sockaddr_storage *ss)
>>>> {
>>>> char *target = xstrdup(target_);
>>>> char *host = unbracket(target);
>>>> - bool ok = parse_sockaddr_components(ss, host, NULL, 0, target_,
>>>> false);
>>>> + bool ok = parse_sockaddr_components(ss, host, NULL, 0,
>>>> + target_, false, NULL);
>>>> if (!ok) {
>>>> memset(ss, 0, sizeof *ss);
>>>> }
>>>> diff --git a/lib/socket-util.h b/lib/socket-util.h
>>>> index 9ccb7d4cc..bf66393df 100644
>>>> --- a/lib/socket-util.h
>>>> +++ b/lib/socket-util.h
>>>> @@ -49,7 +49,8 @@ ovs_be32 guess_netmask(ovs_be32 ip);
>>>> void inet_parse_host_port_tokens(char *s, char **hostp, char **portp);
>>>> void inet_parse_port_host_tokens(char *s, char **portp, char **hostp);
>>>> bool inet_parse_active(const char *target, int default_port,
>>>> - struct sockaddr_storage *ssp, bool resolve_host);
>>>> + struct sockaddr_storage *ssp,
>>>> + bool resolve_host, bool *dns_failure);
>>>> int inet_open_active(int style, const char *target, int default_port,
>>>> struct sockaddr_storage *ssp, int *fdp, uint8_t
>>>> dscp);
>>>>
>>>> diff --git a/lib/stream.c b/lib/stream.c
>>>> index fcaddf10a..71039e24f 100644
>>>> --- a/lib/stream.c
>>>> +++ b/lib/stream.c
>>>> @@ -788,7 +788,7 @@ stream_parse_target_with_default_port(const char
>>>> *target, int default_port,
>>>> struct sockaddr_storage *ss)
>>>> {
>>>> return ((!strncmp(target, "tcp:", 4) || !strncmp(target, "ssl:", 4))
>>>> - && inet_parse_active(target + 4, default_port, ss, true));
>>>> + && inet_parse_active(target + 4, default_port, ss, true,
>>>> NULL));
>>>> }
>>>>
>>>> /* Attempts to guess the content type of a stream whose first few bytes
>>>> were
>>>> diff --git a/ofproto/ofproto-dpif-sflow.c b/ofproto/ofproto-dpif-sflow.c
>>>> index 30e7caf54..36b5c1768 100644
>>>> --- a/ofproto/ofproto-dpif-sflow.c
>>>> +++ b/ofproto/ofproto-dpif-sflow.c
>>>> @@ -468,7 +468,8 @@ sflow_choose_agent_address(const char *agent_device,
>>>> const char *target;
>>>> SSET_FOR_EACH (target, targets) {
>>>> struct sockaddr_storage ss;
>>>> - if (inet_parse_active(target, SFL_DEFAULT_COLLECTOR_PORT, &ss,
>>>> true)) {
>>>> + if (inet_parse_active(target, SFL_DEFAULT_COLLECTOR_PORT,
>>>> + &ss, true, NULL)) {
>>>> /* sFlow only supports target in default routing table with
>>>> * packet mark zero.
>>>> */
>>>> diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
>>>> index b975c17fc..a4f6bca73 100644
>>>> --- a/ovsdb/ovsdb-server.c
>>>> +++ b/ovsdb/ovsdb-server.c
>>>> @@ -26,6 +26,7 @@
>>>> #include "command-line.h"
>>>> #include "daemon.h"
>>>> #include "dirs.h"
>>>> +#include "dns-resolve.h"
>>>> #include "openvswitch/dynamic-string.h"
>>>> #include "fatal-signal.h"
>>>> #include "file.h"
>>>> @@ -329,6 +330,7 @@ main(int argc, char *argv[])
>>>> service_start(&argc, &argv);
>>>> fatal_ignore_sigpipe();
>>>> process_init();
>>>> + dns_resolve_init(true);
>>>>
>>>> bool active = false;
>>>> parse_options(argc, argv, &db_filenames, &remotes, &unixctl_path,
>>>> @@ -511,6 +513,7 @@ main(int argc, char *argv[])
>>>> run_command, process_status_msg(status));
>>>> }
>>>> }
>>>> + dns_resolve_destroy();
>>>> perf_counters_destroy();
>>>> service_stop();
>>>> return 0;
>>>> diff --git a/ovsdb/raft-private.c b/ovsdb/raft-private.c
>>>> index 30760233e..970cf3b54 100644
>>>> --- a/ovsdb/raft-private.c
>>>> +++ b/ovsdb/raft-private.c
>>>> @@ -36,7 +36,10 @@ raft_address_validate(const char *address)
>>>> return NULL;
>>>> } else if (!strncmp(address, "ssl:", 4) || !strncmp(address, "tcp:",
>>>> 4)) {
>>>> struct sockaddr_storage ss;
>>>> - if (!inet_parse_active(address + 4, -1, &ss, true)) {
>>>> + bool dns_failure;
>>>
>>> Not really an issue, but just to make it more clear, could we explicitly
>>> initialize dns_failure here?
>>
>> ok.
>>
>>>
>>>> +
>>>> + if (!inet_parse_active(address + 4, -1, &ss, true, &dns_failure)
>>>> + && !dns_failure) {
>>>> return ovsdb_error(NULL, "%s: syntax error in address",
>>>> address);
>>>> }
>>>> return NULL;
>>>
>>> Not really mandatory for this patch but we might as well update some
>>> comments in raft.c to make it clear that it's allowed to refer to
>>> servers by name, e.g., the comment for raft_join_cluster() mentions:
>>>
>>> * The new server is located at 'local_address', which must take one of the
>>> * forms "tcp:IP:PORT" or "ssl:IP:PORT", where IP is an IPv4 address or a
>>> * square bracket enclosed IPv6 address and PORT is a TCP port number.
>>>
>>> The same thing applies to raft_create_cluster().
>>
>> As I said, DNS resolve is a bit hacked into the logic here, so there
>> are lots of fairly confusing comments in both raft and socket-util
>> modules. When I worked on the patch I started re-writing and clarifying
>> all of them but stopped in the middle. It's just a lot of comment
>> changes and it's not really relevant to the current patch, so deserves
>> a separate one. What do you think?
>>
>
> Sounds good to me, therefore:
>
> Acked-by: Dumitru Ceara <[email protected]>
Thanks! I addressed all other minor comments and applied the fix.
Also backported down to 2.13.
Best regards, Ilya Maximets.
>
>>>
>>> Most of the minor comments above can be addressed at apply time so
>>> there's no need for a v2 but I'd like to see if there's a way to avoid
>>> changing the signature of inet_parse_active() before ACKing the patch.
>>>
>>> Regards,
>>> Dumitru
>>>
>>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev