On 3/30/22 15:08, Ilya Maximets wrote:
> On 3/21/22 11:45, Dumitru Ceara wrote:
>> Hi Ilya.
>>
>> On 3/10/22 23:33, Ilya Maximets wrote:
>>> Clustered OVSDB allows to use DNS names as addresses of raft members.
>>> However, if DNS resolution fails during the initial database read,
>>> this causes a fatal failure and exit of the ovsdb-server process.
>>>
>>> Also, if DNS name of a joining server is not resolvable for one of the
>>> followers, this follower will reject append requests for a new server
>>> to join until the name is successfully resolved. This makes a follower
>>> effectively non-functional while DNS is unavailable.
>>>
>>> To fix the problem relaxing the address verification. Allowing
>>
>> s/relaxing/relax/
>>
>>> validation to pass if only name resolution failed and the address
>>> is valid otherwise. This will allow addresses to be added to the
>>> database, so connections could be established later when the DNS
>>> is available.
>>
>> This sounds reasonable to me.
>>
>>>
>>> Additionally fixing missed initialization of the dns-resolve module.
>>> Wihtout it, DNS requests are blocking. This causes unexpected delays
>>
>> Typo: Wihtout
>>
>>> in runtime.
>>>
>>> Fixes: 771680d96fb6 ("DNS: Add basic support for asynchronous DNS
>>> resolving")
>>> Reported-at: https://bugzilla.redhat.com/2055097
>>> Signed-off-by: Ilya Maximets <[email protected]>
>>> ---
>>> lib/dns-resolve.c | 2 +-
>>> lib/socket-util.c | 39 ++++++++++++++++++++++++------------
>>> lib/socket-util.h | 3 ++-
>>> lib/stream.c | 2 +-
>>> ofproto/ofproto-dpif-sflow.c | 3 ++-
>>> ovsdb/ovsdb-server.c | 3 +++
>>> ovsdb/raft-private.c | 5 ++++-
>>> 7 files changed, 39 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/lib/dns-resolve.c b/lib/dns-resolve.c
>>> index d34451434..8bcecb90c 100644
>>> --- a/lib/dns-resolve.c
>>> +++ b/lib/dns-resolve.c
>>> @@ -265,7 +265,7 @@ resolve_callback__(void *req_, int err, struct
>>> ub_result *result)
>>> if (err != 0 || (result->qtype == ns_t_aaaa && !result->havedata)) {
>>> ub_resolve_free(result);
>>> req->state = RESOLVE_ERROR;
>>> - VLOG_ERR_RL(&rl, "%s: failed to resolve", req->name);
>>> + VLOG_WARN_RL(&rl, "%s: failed to resolve", req->name);
>>> return;
>>> }
>>>
>>> diff --git a/lib/socket-util.c b/lib/socket-util.c
>>> index 4f1ffecf5..38705cc51 100644
>>> --- a/lib/socket-util.c
>>> +++ b/lib/socket-util.c
>>> @@ -62,7 +62,8 @@ static bool parse_sockaddr_components(struct
>>> sockaddr_storage *ss,
>>> const char *port_s,
>>> uint16_t default_port,
>>> const char *s,
>>> - bool resolve_host);
>>> + bool resolve_host,
>>> + bool *dns_failure);
>>>
>>> /* Sets 'fd' to non-blocking mode. Returns 0 if successful, otherwise a
>>> * positive errno value. */
>>> @@ -438,7 +439,7 @@ parse_sockaddr_components_dns(struct sockaddr_storage
>>> *ss OVS_UNUSED,
>>> dns_resolve(host_s, &tmp_host_s);
>>> if (tmp_host_s != NULL) {
>>> parse_sockaddr_components(ss, tmp_host_s, port_s,
>>> - default_port, s, false);
>>> + default_port, s, false, NULL);
>>> free(tmp_host_s);
>>> return true;
>>> }
>>> @@ -450,11 +451,15 @@ parse_sockaddr_components(struct sockaddr_storage *ss,
>>> char *host_s,
>>> const char *port_s, uint16_t default_port,
>>> const char *s,
>>> - bool resolve_host)
>>> + bool resolve_host, bool *dns_failure)
>>> {
>>> struct sockaddr_in *sin = sin_cast(sa_cast(ss));
>>> int port;
>>>
>>> + if (dns_failure) {
>>> + *dns_failure = false;
>>> + }
>>> +
>>> if (port_s && port_s[0]) {
>>> if (!str_to_int(port_s, 10, &port) || port < 0 || port > 65535) {
>>> VLOG_ERR("%s: bad port number \"%s\"", s, port_s);
>>> @@ -501,10 +506,15 @@ parse_sockaddr_components(struct sockaddr_storage *ss,
>>> return true;
>>>
>>> resolve:
>>> - if (resolve_host && parse_sockaddr_components_dns(ss, host_s, port_s,
>>> - default_port, s)) {
>>> - return true;
>>> - } else if (!resolve_host) {
>>> + if (resolve_host) {
>>> + if (parse_sockaddr_components_dns(ss, host_s, port_s,
>>> + default_port, s)) {
>>> + return true;
>>> + }
>>> + if (dns_failure) {
>>> + *dns_failure = true;
>>> + }
>>> + } else {
>>> VLOG_ERR("%s: bad IP address \"%s\"", s, host_s);
>>> }
>>> exit:
>>> @@ -521,10 +531,12 @@ exit:
>>> * It resolves the host if 'resolve_host' is true.
>>> *
>>> * On success, returns true and stores the parsed remote address into
>>> '*ss'.
>>> - * On failure, logs an error, stores zeros into '*ss', and returns false.
>>> */
>>> + * On failure, logs an error, stores zeros into '*ss', and returns false,
>>> + * '*dns_failure' indicates if the host resolution failed. */
>>> bool
>>> inet_parse_active(const char *target_, int default_port,
>>> - struct sockaddr_storage *ss, bool resolve_host)
>>> + struct sockaddr_storage *ss,
>>> + bool resolve_host, bool *dns_failure)
>>
>> It's not really a fair ask I guess, but would it be possible to avoid
>> changing this signature?
>>
>> OVN uses it in a few places. I know there's no real contract for
>> functions in lib/*.h but OVN does use them directly.
>>
>> If you don't see a simple way of avoiding the signature change, it's not
>> a big deal I guess, we can always just bump the OVS submodule version in
>> OVN.
>
> Yeah, I'm not sure we can actually keep the signature intact.
> In general, DNS resolve is a bit hacked into the process and
> it's hard to manage it in a good way. I think, this API should
> be split in parts or changed to make error reporting more
> clear, e.g. by returning a enum instead of bool. But none of
> these options will actually preserve the signature.
>
> Does that make sense?
>
> Also, the next submodule update in OVN will likely require
> code changes due to UB fixes for loops, so it doesn't seem
> important to keep update seamless for this part.
>
Good point. Let's keep it like in your original patch.
>>
>>> {
>>> char *target = xstrdup(target_);
>>> char *port, *host;
>>> @@ -539,7 +551,7 @@ inet_parse_active(const char *target_, int default_port,
>>> ok = false;
>>> } else {
>>> ok = parse_sockaddr_components(ss, host, port, default_port,
>>> - target_, resolve_host);
>>> + target_, resolve_host, dns_failure);
>>> }
>>> if (!ok) {
>>> memset(ss, 0, sizeof *ss);
>>> @@ -576,7 +588,7 @@ inet_open_active(int style, const char *target, int
>>> default_port,
>>> int error;
>>>
>>> /* Parse. */
>>> - if (!inet_parse_active(target, default_port, &ss, true)) {
>>> + if (!inet_parse_active(target, default_port, &ss, true, NULL)) {
>>> error = EAFNOSUPPORT;
>>> goto exit;
>>> }
>>> @@ -660,7 +672,7 @@ inet_parse_passive(const char *target_, int
>>> default_port,
>>> ok = false;
>>> } else {
>>> ok = parse_sockaddr_components(ss, host, port, default_port,
>>> - target_, true);
>>> + target_, true, NULL);
>>> }
>>> if (!ok) {
>>> memset(ss, 0, sizeof *ss);
>>> @@ -783,7 +795,8 @@ inet_parse_address(const char *target_, struct
>>> sockaddr_storage *ss)
>>> {
>>> char *target = xstrdup(target_);
>>> char *host = unbracket(target);
>>> - bool ok = parse_sockaddr_components(ss, host, NULL, 0, target_, false);
>>> + bool ok = parse_sockaddr_components(ss, host, NULL, 0,
>>> + target_, false, NULL);
>>> if (!ok) {
>>> memset(ss, 0, sizeof *ss);
>>> }
>>> diff --git a/lib/socket-util.h b/lib/socket-util.h
>>> index 9ccb7d4cc..bf66393df 100644
>>> --- a/lib/socket-util.h
>>> +++ b/lib/socket-util.h
>>> @@ -49,7 +49,8 @@ ovs_be32 guess_netmask(ovs_be32 ip);
>>> void inet_parse_host_port_tokens(char *s, char **hostp, char **portp);
>>> void inet_parse_port_host_tokens(char *s, char **portp, char **hostp);
>>> bool inet_parse_active(const char *target, int default_port,
>>> - struct sockaddr_storage *ssp, bool resolve_host);
>>> + struct sockaddr_storage *ssp,
>>> + bool resolve_host, bool *dns_failure);
>>> int inet_open_active(int style, const char *target, int default_port,
>>> struct sockaddr_storage *ssp, int *fdp, uint8_t dscp);
>>>
>>> diff --git a/lib/stream.c b/lib/stream.c
>>> index fcaddf10a..71039e24f 100644
>>> --- a/lib/stream.c
>>> +++ b/lib/stream.c
>>> @@ -788,7 +788,7 @@ stream_parse_target_with_default_port(const char
>>> *target, int default_port,
>>> struct sockaddr_storage *ss)
>>> {
>>> return ((!strncmp(target, "tcp:", 4) || !strncmp(target, "ssl:", 4))
>>> - && inet_parse_active(target + 4, default_port, ss, true));
>>> + && inet_parse_active(target + 4, default_port, ss, true,
>>> NULL));
>>> }
>>>
>>> /* Attempts to guess the content type of a stream whose first few bytes
>>> were
>>> diff --git a/ofproto/ofproto-dpif-sflow.c b/ofproto/ofproto-dpif-sflow.c
>>> index 30e7caf54..36b5c1768 100644
>>> --- a/ofproto/ofproto-dpif-sflow.c
>>> +++ b/ofproto/ofproto-dpif-sflow.c
>>> @@ -468,7 +468,8 @@ sflow_choose_agent_address(const char *agent_device,
>>> const char *target;
>>> SSET_FOR_EACH (target, targets) {
>>> struct sockaddr_storage ss;
>>> - if (inet_parse_active(target, SFL_DEFAULT_COLLECTOR_PORT, &ss,
>>> true)) {
>>> + if (inet_parse_active(target, SFL_DEFAULT_COLLECTOR_PORT,
>>> + &ss, true, NULL)) {
>>> /* sFlow only supports target in default routing table with
>>> * packet mark zero.
>>> */
>>> diff --git a/ovsdb/ovsdb-server.c b/ovsdb/ovsdb-server.c
>>> index b975c17fc..a4f6bca73 100644
>>> --- a/ovsdb/ovsdb-server.c
>>> +++ b/ovsdb/ovsdb-server.c
>>> @@ -26,6 +26,7 @@
>>> #include "command-line.h"
>>> #include "daemon.h"
>>> #include "dirs.h"
>>> +#include "dns-resolve.h"
>>> #include "openvswitch/dynamic-string.h"
>>> #include "fatal-signal.h"
>>> #include "file.h"
>>> @@ -329,6 +330,7 @@ main(int argc, char *argv[])
>>> service_start(&argc, &argv);
>>> fatal_ignore_sigpipe();
>>> process_init();
>>> + dns_resolve_init(true);
>>>
>>> bool active = false;
>>> parse_options(argc, argv, &db_filenames, &remotes, &unixctl_path,
>>> @@ -511,6 +513,7 @@ main(int argc, char *argv[])
>>> run_command, process_status_msg(status));
>>> }
>>> }
>>> + dns_resolve_destroy();
>>> perf_counters_destroy();
>>> service_stop();
>>> return 0;
>>> diff --git a/ovsdb/raft-private.c b/ovsdb/raft-private.c
>>> index 30760233e..970cf3b54 100644
>>> --- a/ovsdb/raft-private.c
>>> +++ b/ovsdb/raft-private.c
>>> @@ -36,7 +36,10 @@ raft_address_validate(const char *address)
>>> return NULL;
>>> } else if (!strncmp(address, "ssl:", 4) || !strncmp(address, "tcp:",
>>> 4)) {
>>> struct sockaddr_storage ss;
>>> - if (!inet_parse_active(address + 4, -1, &ss, true)) {
>>> + bool dns_failure;
>>
>> Not really an issue, but just to make it more clear, could we explicitly
>> initialize dns_failure here?
>
> ok.
>
>>
>>> +
>>> + if (!inet_parse_active(address + 4, -1, &ss, true, &dns_failure)
>>> + && !dns_failure) {
>>> return ovsdb_error(NULL, "%s: syntax error in address",
>>> address);
>>> }
>>> return NULL;
>>
>> Not really mandatory for this patch but we might as well update some
>> comments in raft.c to make it clear that it's allowed to refer to
>> servers by name, e.g., the comment for raft_join_cluster() mentions:
>>
>> * The new server is located at 'local_address', which must take one of the
>> * forms "tcp:IP:PORT" or "ssl:IP:PORT", where IP is an IPv4 address or a
>> * square bracket enclosed IPv6 address and PORT is a TCP port number.
>>
>> The same thing applies to raft_create_cluster().
>
> As I said, DNS resolve is a bit hacked into the logic here, so there
> are lots of fairly confusing comments in both raft and socket-util
> modules. When I worked on the patch I started re-writing and clarifying
> all of them but stopped in the middle. It's just a lot of comment
> changes and it's not really relevant to the current patch, so deserves
> a separate one. What do you think?
>
Sounds good to me, therefore:
Acked-by: Dumitru Ceara <[email protected]>
>>
>> Most of the minor comments above can be addressed at apply time so
>> there's no need for a v2 but I'd like to see if there's a way to avoid
>> changing the signature of inet_parse_active() before ACKing the patch.
>>
>> Regards,
>> Dumitru
>>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev