Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-12 Thread Guillermo
On Wed, 12 Oct 2022 at 23:22, Amelia Bjornsdottir wrote:
>
> To clarify, I'm referring to the ->target member (in srv) or the
> ->exchange member (in mx).
>
> Are those not the same as the input format for skadns_send?

Oooh. If you mean the "target" member of a s6dns_message_rr_srv_t
object filled by a call to s6dns_message_get_srv(), or the "exchange"
member of a s6dns_message_rr_mx_t object filled by a call to
s6dns_message_get_mx(), then no, looking at the code from libs6dns
from s6-dns-2.3.5.4, it seems that they are in string format, so you'd
have to convert them to packet format using s6dns_domain_encode()
before using them for a subsequent call to skadns_send().

G.


Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-12 Thread Laurent Bercot

> To clarify, I'm referring to the ->target member (in srv) or the ->exchange
> member (in mx).
>
> Are those not the same as the input format for skadns_send?


 When parsed by s6dns_message_parse_answer_srv() and
s6dns_message_parse_answer_mx(), the domains are obtained from the
packet via s6dns_message_get_domain(), which calls s6dns_domain_decode().

 In other words, when you obtain a s6dns_message_rr_srv_t or a
s6dns_message_rr_mx_t, the domains in these structures are in string
format. (Because usually they're destined to be returned to the
application and displayed, not used in another packet right away.)
 So if you want to reuse these domains for another skadns_send()
query, you need to re-encode them first via s6dns_domain_encode().
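
To make the difference between the two formats concrete, here is a minimal sketch in plain C of what turning a dotted name into packet (wire) format involves, per RFC 1035 section 3.1. It is an illustration only: wire_encode_name() is a hypothetical helper, not the s6-dns implementation, and real code should call s6dns_domain_encode() on the s6dns_domain_t instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode a dotted name such as "perihelion.ultradian.club" into DNS wire
 * format: each label is preceded by a one-byte length, and the name ends
 * with a zero byte (the root label).
 * Returns the number of bytes written to out, or 0 on error (empty label,
 * label longer than 63 bytes, name longer than 255 bytes, buffer too small).
 * Hypothetical helper for illustration; this is not s6dns_domain_encode(). */
size_t wire_encode_name(const char *name, unsigned char *out, size_t outmax)
{
    size_t w = 0;
    const char *p = name;

    assert(name != NULL && out != NULL);
    if (*p == '.') p++;                 /* tolerate a leading dot */
    while (*p) {
        const char *dot = strchr(p, '.');
        size_t len = dot ? (size_t)(dot - p) : strlen(p);

        if (len == 0 || len > 63) return 0;
        if (w + 1 + len + 1 > outmax) return 0;  /* label plus final zero byte must fit */
        out[w++] = (unsigned char)len;
        memcpy(out + w, p, len);
        w += len;
        p += len;
        if (*p == '.') p++;             /* skip the separator */
    }
    if (w + 1 > outmax || w + 1 > 255) return 0;
    out[w++] = 0;                       /* root label terminates the name */
    return w;
}
```

For "perihelion.ultradian.club" this writes 27 bytes: a length byte of 10, the label "perihelion", a length byte of 9, "ultradian", a length byte of 4, "club", and a final zero byte.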

 Thanks Guillermo for getting to the bottom of this! :)

--
 Laurent



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-12 Thread Amelia Bjornsdottir
To clarify, I'm referring to the ->target member (in srv) or the 
->exchange member (in mx).


Are those not the same as the input format for skadns_send?


On 10/13/22 00:10, Amelia Bjornsdottir wrote:
I'm passing skadns_send an s6dns_domain_t straight out of an 
s6dns_message_rr_srv_t (case 1) or a s6dns_message_rr_mx_t (case 2). Is
that in packet format or in string format? The documentation claimed 
that it would be in packet format. Should it be in string format?


On 10/12/22 22:32, Guillermo wrote:

On Fri, 7 Oct 2022 at 20:29, Amelia Bjornsdottir wrote:
I link truss -f of my application piped through grep skadns' PID to show
only skadns, on OmniOS and HardenedBSD.

After further analysis, I see a pattern and have a hypothesis.

Amelia, how's the program constructing the s6dns_domain_t object that
it passes to skadns_send() for A and AAAA queries? Is it calling
s6dns_domain_encode() or s6dns_domain_encode_list(), i.e., is the
object passed to skadns_send() in packet form instead of string form?

G.



--
Amelia Bjornsdottir (she, they)
sysadmin umbrellix.net, deputy sysadmin chatspeed.net
jabber: eamon.aka.amy.malik ~on~ umbrellix.net



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH? [manually resent to list]

2022-10-12 Thread Amelia Bjornsdottir

about those queries, you are exactly correct.

I guess I should change my methodology then.

Thanks...

On 10/13/22 01:39, Guillermo wrote:

On Wed, 12 Oct 2022 at 21:10, Amelia Bjornsdottir wrote:

I'm passing skadns_send an s6dns_domain_t straight out of an
s6dns_message_rr_srv_t (case 1) or a s6dns_message_rr_mx_t (case 2). Is
that in packet format or in string format?

Um, neither? As far as I can tell, skadns_send() always takes a domain
name encoded in a s6dns_domain_t object, and the type of resource
record that you want as the "qtype" argument, which go straight to the
"question" section of a DNS query. Objects of types
s6dns_message_rr_srv_t and s6dns_message_rr_mx_t are used for parsing
RRs in the DNS response that skadns_packet() gives you after the
client gets it from skadnsd using skadns_update().

After learning a bit about skadnsd's textclient protocol and looking at
HardenedBSD's truss output, it looks like your program does 3 queries
for SRV RRs, 1 query for an MX RR, 9 queries for A RRs, and 9 queries
for AAAA RRs. I suppose that on OmniOS, the program does the exact
same 22 queries. In both cases you get responses with no error for the
SRV and MX queries. On Vultr's network, the A and AAAA queries all seem
to get a response with a "format error" RCODE, presumably because the
resulting DNS packet is malformed, and on Shaw's network they don't
seem to get a response at all. One possible explanation is that, if
packets are really malformed, Shaw's caches might just not bother
responding to them. This:

sendto(17,"\^?!\^A\0\0\^A\0\0\0\0\0\0.perih"...,44,0,NULL,0) = 44 (0x2c)

makes me very suspicious. That looks like a dot followed by the label
"perihelion", i.e. like coming from a s6dns_domain_t object in string
form.

G.


--
Amelia Bjornsdottir (she, they)
sysadmin umbrellix.net, deputy sysadmin chatspeed.net
jabber: eamon.aka.amy.malik ~on~ umbrellix.net



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH? [manually resent to list]

2022-10-12 Thread Guillermo
On Wed, 12 Oct 2022 at 21:10, Amelia Bjornsdottir wrote:
>
> I'm passing skadns_send an s6dns_domain_t straight out of an
> s6dns_message_rr_srv_t (case 1) or a s6dns_message_rr_mx_t (case 2). Is
> that in packet format or in string format?

Um, neither? As far as I can tell, skadns_send() always takes a domain
name encoded in a s6dns_domain_t object, and the type of resource
record that you want as the "qtype" argument, which go straight to the
"question" section of a DNS query. Objects of types
s6dns_message_rr_srv_t and s6dns_message_rr_mx_t are used for parsing
RRs in the DNS response that skadns_packet() gives you after the
client gets it from skadnsd using skadns_update().

After learning a bit about skadnsd's textclient protocol and looking at
HardenedBSD's truss output, it looks like your program does 3 queries
for SRV RRs, 1 query for an MX RR, 9 queries for A RRs, and 9 queries
for AAAA RRs. I suppose that on OmniOS, the program does the exact
same 22 queries. In both cases you get responses with no error for the
SRV and MX queries. On Vultr's network, the A and AAAA queries all seem
to get a response with a "format error" RCODE, presumably because the
resulting DNS packet is malformed, and on Shaw's network they don't
seem to get a response at all. One possible explanation is that, if
packets are really malformed, Shaw's caches might just not bother
responding to them. This:

sendto(17,"\^?!\^A\0\0\^A\0\0\0\0\0\0.perih"...,44,0,NULL,0) = 44 (0x2c)

makes me very suspicious. That looks like a dot followed by the label
"perihelion", i.e. like coming from a s6dns_domain_t object in string
form.
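
One way to check that suspicion offline is to walk the name in a captured query the way a server would. The sketch below is a hypothetical diagnostic helper in plain C, not part of s6-dns, and it assumes the buffer holds a whole DNS query with a single question. A name left in string form usually fails the walk, because its first byte (a dot, value 46) is read as the length of a 46-byte label. It is only a heuristic: a string-form name could pass by coincidence if its bytes happen to line up.

```c
#include <assert.h>
#include <stddef.h>

/* Walk the QNAME that starts right after the 12-byte DNS header and
 * report whether it is a well-formed sequence of length-prefixed labels
 * ending in a zero byte, followed by at least 4 bytes (QTYPE and QCLASS).
 * Bytes above 63 are rejected: compression pointers do not belong in a
 * freshly built question.
 * Returns 1 if the name looks well formed, 0 otherwise. */
int question_name_is_wellformed(const unsigned char *pkt, size_t len)
{
    size_t pos = 12;

    assert(pkt != NULL);
    for (;;) {
        unsigned char l;

        if (pos >= len) return 0;   /* ran off the end of the packet */
        l = pkt[pos++];
        if (l == 0) break;          /* root label: end of the name */
        if (l > 63) return 0;       /* not a plain label length */
        pos += l;                   /* skip the label itself */
    }
    return len - pos >= 4 && pos - 12 <= 255;
}
```

Feeding it the bytes printed by truss (with full strings enabled) or taken from a packet capture shows which queries carry a string-form name.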

G.


Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH? [manually resent to list]

2022-10-12 Thread Amelia Bjornsdottir
I'm passing skadns_send an s6dns_domain_t straight out of an 
s6dns_message_rr_srv_t (case 1) or a s6dns_message_rr_mx_t (case 2). Is
that in packet format or in string format? The documentation claimed 
that it would be in packet format. Should it be in string format?


On 10/12/22 22:32, Guillermo wrote:

On Fri, 7 Oct 2022 at 20:29, Amelia Bjornsdottir wrote:

I link truss -f of my application piped through grep skadns' PID to show
only skadns, on OmniOS and HardenedBSD.

After further analysis, I see a pattern and have a hypothesis.

Amelia, how's the program constructing the s6dns_domain_t object that
it passes to skadns_send() for A and AAAA queries? Is it calling
s6dns_domain_encode() or s6dns_domain_encode_list(), i.e., is the
object passed to skadns_send() in packet form instead of string form?

G.


--
Amelia Bjornsdottir (she, they)
sysadmin umbrellix.net, deputy sysadmin chatspeed.net
jabber: eamon.aka.amy.malik ~on~ umbrellix.net



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-12 Thread Guillermo
On Fri, 7 Oct 2022 at 20:29, Amelia Bjornsdottir wrote:
>
> I link truss -f of my application piped through grep skadns' PID to show
> only skadns, on OmniOS and HardenedBSD.

After further analysis, I see a pattern and have a hypothesis.

Amelia, how's the program constructing the s6dns_domain_t object that
it passes to skadns_send() for A and AAAA queries? Is it calling
s6dns_domain_encode() or s6dns_domain_encode_list(), i.e., is the
object passed to skadns_send() in packet form instead of string form?

G.


Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-11 Thread Ellenor Bjornsdottir
Shaw's cache blocking me would be an interesting hypothesis. However, 
wouldn't my first query respond and the others block me? I got no 
response from any of the queries, suggesting that it is this rd-bit 
issue Ermine raised.


I should run a DNS cache locally, though.

On 10/10/22 19:23, Guillermo wrote:

On Mon, 10 Oct 2022 at 13:28, Laurent Bercot wrote:

   s6dns_engine filters answers that do not seem relevant to in-flight
queries. That includes malformed answers or ones that do not follow
RFC 1035.
   I was made aware (thanks, Ermine) that some caches fail to set the
RD bit in their responses to queries containing the RD bit; these
answers were ignored.

However, the OS would still deliver them to skadnsd in a recv() /
recvfrom() call, right? If my reading of the truss outputs is correct,
the HardenedBSD system isn't getting a response at all, and whatever
error happens with the program running on the OmniOS system, if any,
does not involve the network (I can't tell if skadnsd is delivering
all received answers to the client).

I feel that packet capture tools like tcpdump(1) or OmniOS' snoop(8)
would be better suited for answering the questions that have been
raised so far (malformed packets, ignored responses, lack of
responses, etc.). Also, aren't 18 outstanding queries in a short
amount of time from one single host, like, a lot? Couldn't Shaw's
caches think that they are being DoS'ed :P ?

G.


--
Ellenor Agnes Bjornsdottir (she)
sysadmin umbrellix.net
jabber: ellenor ~on~ umbrellix.net




Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-10 Thread Laurent Bercot




> However, the OS would still deliver them to skadnsd in a recv() /
> recvfrom() call, right? If my reading of the truss outputs is correct,
> the HardenedBSD system isn't getting a response at all,


 That's right, which is why my hypothesis of the RD bit filter only
applied to OmniOS, which did get responses but these got ignored
by skadnsd. On HardenedBSD, 18 queries getting no answers from the
caches is absolutely a different problem.



> and whatever
> error happens with the program running on the OmniOS system, if any,
> does not involve the network


 It involves the relevance test:
 
https://github.com/skarnet/s6-dns/blob/master/src/libs6dns/s6dns_engine.c#L32

 This function is called on every incoming message that is a potential
response. If it returns 0, the message is deemed irrelevant to the
current query, and ignored. When you see a recv() (or recvfrom()) from
a UDP socket, but no answer is reported to the client and the socket is
still polled until it times out, it means that the relevant() test 
failed.


 Until tonight, the "h.rd != (q[2] & 1)" test, i.e. "is the rd bit of
the response different from the rd bit of the query", was performed
outside of the "strict" guard. As a result, responses from caches that
do not echo the RD bit, in violation of the RFC, were ignored as
malformed; it is quite possible that this is what happened on OmniOS here.
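
As a rough illustration of the check being discussed, here is a minimal sketch in plain C of a header-level relevance test in which the RD comparison only runs when a strict flag is set. It is not the relevant() function from s6dns_engine.c: response_matches_query() is a hypothetical name, and it looks at nothing beyond the query ID, the QR bit and the RD bit of the 12-byte header defined in RFC 1035 section 4.1.1.

```c
#include <assert.h>
#include <stddef.h>

/* Decide whether a response header plausibly answers a query.
 * Both buffers must hold at least the 12-byte DNS header.
 *   bytes 0-1 : query ID, must be identical
 *   byte 2    : bit 0x80 is QR (set in responses), bit 0x01 is RD
 * The RD comparison is skipped unless strict is nonzero, so a cache that
 * fails to echo the RD bit is still accepted in the default mode.
 * Returns 1 if the response is accepted, 0 if it should be ignored. */
int response_matches_query(const unsigned char *q, const unsigned char *r, int strict)
{
    assert(q != NULL && r != NULL);
    if (q[0] != r[0] || q[1] != r[1]) return 0;          /* different query ID */
    if (!(r[2] & 0x80)) return 0;                        /* QR clear: not a response */
    if (strict && ((r[2] & 1) != (q[2] & 1))) return 0;  /* RD bit not echoed back */
    return 1;
}
```

With strict set to 0, a response whose RD bit is clear is accepted even though the query had RD set; with strict set to 1, the same response is dropped, and the socket keeps being polled until it times out.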



> (I can't tell if skadnsd is delivering
> all received answers to the client).


 After the first one which is a connection/synchronization marker,
a write() to the async pipe to the client (10 on HardenedBSD, 9 on
OmniOS) is an answer or a sequence of answers. (skadnsd buffers the
answers into a textmessage_sender, i.e. a bufalloc, which is flushed
at the next ppoll() invocation.) Writes of length 7 are failures
(4 bytes length, 2 bytes query id, 1 byte errno); writes of length
14 are 2 reports of failure, you can see it in the string. 28 is
4 failures; 95 and 140 are likely 1 success (length, query id, 0
for success, then the response packet); 279 is likely two successes.
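
The arithmetic above can be reproduced with a small parser. This is a sketch in plain C under stated assumptions: the layout (4 bytes length, 2 bytes query id, 1 status byte, then the packet on success) comes from the description above, the length is taken to be in network byte order as mentioned elsewhere in this thread, and tally_answers() and struct answer_tally are hypothetical names, not skalibs or s6-dns API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct answer_tally { unsigned int successes; unsigned int failures; };

/* Split one write() payload into length-prefixed messages and count them.
 * Assumed layout of each message:
 *   4 bytes  payload length, network byte order
 *   2 bytes  query id
 *   1 byte   status (0 = success, otherwise an errno value)
 *   N bytes  response packet, present only on success
 * Returns 0 on success, -1 if the buffer is truncated or a length is < 3. */
int tally_answers(const unsigned char *buf, size_t len, struct answer_tally *t)
{
    size_t pos = 0;

    assert(buf != NULL && t != NULL);
    t->successes = t->failures = 0;
    while (pos < len) {
        uint32_t n;

        if (len - pos < 4) return -1;           /* no room for the length field */
        n = ((uint32_t)buf[pos] << 24) | ((uint32_t)buf[pos + 1] << 16)
          | ((uint32_t)buf[pos + 2] << 8) | (uint32_t)buf[pos + 3];
        pos += 4;
        if (n < 3 || n > len - pos) return -1;  /* too short, or runs past the buffer */
        if (buf[pos + 2] == 0) t->successes++;  /* status byte follows the 2-byte id */
        else t->failures++;
        pos += n;
    }
    return 0;
}
```

A 7-byte write then counts as one failure and a 14-byte write as two, matching the reading of the traces given above.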

 At the end of the traces, we get EOF on 0 while there are still a
lot of sockets being polled. That's the client exiting - or at least
closing the skadns connection - while some queries are in-flight.
The bro math checks out, it definitely looks like all received
answers, positive and negative, have been delivered.



> I feel that packet capture tools like tcpdump(1) or OmniOS' snoop(8)
> would be better suited for answering the questions that have been
> raised so far (malformed packets, ignored responses, lack of
> responses, etc.).


 strace has an option to print full strings. truss should have a
similar option (if its display can be trusted...) You're right that
packet capture tools would be good to use in this situation, but since
I personally loathe using them, I don't want to ask other people to
use them, and I can work with what we have. On HardenedBSD at least,
the traces are readable.



> Also, aren't 18 outstanding queries in a short
> amount of time from one single host, like, a lot? Couldn't Shaw's
> caches think that they are being DoS'ed :P ?


 That's definitely possible, and I would say likely, but I don't want
to lay the blame on others before making sure we're in the clear. :)

--
 Laurent



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-10 Thread Amelia Bjornsdottir
It seems like on both Shaw's and Vultr's network I can hammer the cache 
all day with s6-dnsip[46]-filter. I send 13 requests, I get 13 
responses. Read more at 


On 10/10/22 22:07, Laurent Bercot wrote:
Anyway. Pre-update `/package/web/s6-dns/command/s6-dnsip[46] 
perihelion.ultradian.club` returns the correct response on both 
machines, even if run after doing the SRV and MX lookups.


 Wilder and wilder. Can you test s6-dnsip[46]-filter?
{ echo domain1.org ; echo domain2.org ; ... } | s6-dnsip4-filter
These do A and AAAA queries, but via skadns. If skadnsd is the culprit,
the -filter programs should fail.


(side note: I'm realizing that my program makes duplicate queries. 
This shouldn't impact the accuracy of the responses, but it does mean 
the caches could be blocking me or something, but not blocking me 
when I use /package/web/s6-dns/command/s6-dnsip[46].)


 Could be. We're trying to build a simple test case that fails. If our
simple test cases all pass and your program fails, the cause may be in
the way your program is spamming the cache - but you'd have to ask the
cache administrators about querying policies to test that hypothesis.

--
 Laurent


--
Amelia Bjornsdottir (she, they)
sysadmin umbrellix.net, deputy sysadmin chatspeed.net
jabber: eamon.aka.amy.malik ~on~ umbrellix.net



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-10 Thread Laurent Bercot

> Anyway. Pre-update `/package/web/s6-dns/command/s6-dnsip[46]
> perihelion.ultradian.club` returns the correct response on both machines, even
> if run after doing the SRV and MX lookups.


 Wilder and wilder. Can you test s6-dnsip[46]-filter?
{ echo domain1.org ; echo domain2.org ; ... } | s6-dnsip4-filter
These do A and AAAA queries, but via skadns. If skadnsd is the culprit,
the -filter programs should fail.



> (side note: I'm realizing that my program makes duplicate queries. This
> shouldn't impact the accuracy of the responses, but it does mean the caches
> could be blocking me or something, but not blocking me when I use
> /package/web/s6-dns/command/s6-dnsip[46].)


 Could be. We're trying to build a simple test case that fails. If our
simple test cases all pass and your program fails, the cause may be in
the way your program is spamming the cache - but you'd have to ask the
cache administrators about querying policies to test that hypothesis.

--
 Laurent



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-10 Thread Guillermo
On Mon, 10 Oct 2022 at 13:28, Laurent Bercot wrote:
>
>   s6dns_engine filters answers that do not seem relevant to in-flight
> queries. That includes malformed answers or ones that do not follow
> RFC 1035.
>   I was made aware (thanks, Ermine) that some caches fail to set the
> RD bit in their responses to queries containing the RD bit; these
> answers were ignored.

However, the OS would still deliver them to skadnsd in a recv() /
recvfrom() call, right? If my reading of the truss outputs is correct,
the HardenedBSD system isn't getting a response at all, and whatever
error happens with the program running on the OmniOS system, if any,
does not involve the network (I can't tell if skadnsd is delivering
all received answers to the client).

I feel that packet capture tools like tcpdump(1) or OmniOS' snoop(8)
would be better suited for answering the questions that have been
raised so far (malformed packets, ignored responses, lack of
responses, etc.). Also, aren't 18 outstanding queries in a short
amount of time from one single host, like, a lot? Couldn't Shaw's
caches think that they are being DoS'ed :P ?

G.


Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-10 Thread Amelia Bjornsdottir
Both machines are AMD64, so endianness should not be a problem. The 
truss tool may be displaying them incorrectly.


Anyway. Pre-update `/package/web/s6-dns/command/s6-dnsip[46] 
perihelion.ultradian.club` returns the correct response on both 
machines, even if run after doing the SRV and MX lookups. Linked below 
is truss output for all four invocations.







(side note: I'm realizing that my program makes duplicate queries. This 
shouldn't impact the accuracy of the responses, but it does mean the 
caches could be blocking me or something, but not blocking me when I use 
/package/web/s6-dns/command/s6-dnsip[46].)


On 10/10/22 11:12, Laurent Bercot wrote:



On OmniOS, all the DNS queries (apparently 58) received a response. On
HardenedBSD, only the first 4 queries received a response, the next 18
timed out. They were retried 4 additional times, as expected, again
timing out without receiving a response.


 The fd of the async pipe to the client isn't the same in both outputs:
it's 9 on OmniOS and 10 on HardenedBSD, which means the client uses one
more fd on HardenedBSD for some reason. (Does OmniOS support signalfd()?
That would explain it.)

 On HardenedBSD, 4 queries received responses, that were properly
reported to the client. The others were pending and retried with longer
timeouts, but only 6 of them reported a full timeout to the client.
The client exited while 12 queries were technically still in flight.

 On OmniOS, I can't even make sense of some of the strings, typically
in the async responses to the client. What is the endianness of this
machine? A network byte order 32-bit number equal to 3 seems to be
encoded as { 0, 0, 3, 0 }, which doesn't look right. (I did check my
uint32_bswap() primitive.) If the client isn't complaining very loudly
when it receives such strings, it means the strings are correct and the
truss tool displays them incorrectly, which doesn't help me diagnose
what's going on.

 In any case the problems look unrelated to skadnsd and come from the
interaction between the s6-dns library and the caches: either the
packets are correct and the caches are not sending the responses they
should, and that's not an s6-dns problem, or the packets are malformed
and that's why the servers are ignoring them, and I need to fix that.
 Amelia, could you do some tests (with the same caches) from s6-dns
command-line clients such as s6-dnsip4? That will bypass the skadns
layer, and will be easier to trace and understand. Thanks :)

--
 Laurent


--
Amelia Bjornsdottir (she, they)
sysadmin umbrellix.net, deputy sysadmin chatspeed.net
jabber: eamon.aka.amy.malik ~on~ umbrellix.net



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-10 Thread Laurent Bercot



 s6dns_engine filters answers that do not seem relevant to in-flight
queries. That includes malformed answers or ones that do not follow
RFC 1035.
 I was made aware (thanks, Ermine) that some caches fail to set the
RD bit in their responses to queries containing the RD bit; these
answers were ignored.
 I just pushed a workaround to the s6-dns git, to only perform the
RD check on answers when a "strict" flag is given, which it's not
in any of the command-line wrappers or in skadnsd.

 Can you please try with the latest s6-dns git and see if the answers
you're getting on OmniOS are accepted this time?

--
 Laurent



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-10 Thread Laurent Bercot




> On OmniOS, all the DNS queries (apparently 58) received a response. On
> HardenedBSD, only the first 4 queries received a response, the next 18
> timed out. They were retried 4 additional times, as expected, again
> timing out without receiving a response.


 The fd of the async pipe to the client isn't the same in both outputs:
it's 9 on OmniOS and 10 on HardenedBSD, which means the client uses one
more fd on HardenedBSD for some reason. (Does OmniOS support signalfd()?
That would explain it.)

 On HardenedBSD, 4 queries received responses, that were properly
reported to the client. The others were pending and retried with longer
timeouts, but only 6 of them reported a full timeout to the client.
The client exited while 12 queries were technically still in flight.

 On OmniOS, I can't even make sense of some of the strings, typically
in the async responses to the client. What is the endianness of this
machine? A network byte order 32-bit number equal to 3 seems to be
encoded as { 0, 0, 3, 0 }, which doesn't look right. (I did check my
uint32_bswap() primitive.) If the client isn't complaining very loudly
when it receives such strings, it means the strings are correct and the
truss tool displays them incorrectly, which doesn't help me diagnose
what's going on.

 In any case the problems look unrelated to skadnsd and come from the
interaction between the s6-dns library and the caches: either the
packets are correct and the caches are not sending the responses they
should, and that's not an s6-dns problem, or the packets are malformed
and that's why the servers are ignoring them, and I need to fix that.
 Amelia, could you do some tests (with the same caches) from s6-dns
command-line clients such as s6-dnsip4? That will bypass the skadns
layer, and will be easier to trace and understand. Thanks :)

--
 Laurent



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-08 Thread Guillermo
On Fri, 7 Oct 2022 at 20:29, Amelia Bjornsdottir wrote:
>
> On OmniOS, in Vultr's network, my A and AAAA lookups check in (skadns_t
> *)->list after skadns_update(...), with failure, quickly. On
> HardenedBSD, in Shaw's network (so I'm aware that I'm not controlling
> for different DNS recursors here, and I should be), my A and AAAA
> lookups check with failure after the 45 seconds.
>
> I link truss -f of my application piped through grep skadns' PID to show
> only skadns, on OmniOS and HardenedBSD.

On OmniOS, all the DNS queries (apparently 58) received a response. On
HardenedBSD, only the first 4 queries received a response, the next 18
timed out. They were retried 4 additional times, as expected, again
timing out without receiving a response.

G.


Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-07 Thread Amelia Bjornsdottir
With my memory management a little less impeachable, we turn now to 
different behavior on different OSes.


On OmniOS, in Vultr's network, my A and AAAA lookups check in (skadns_t
*)->list after skadns_update(...), with failure, quickly. On
HardenedBSD, in Shaw's network (so I'm aware that I'm not controlling
for different DNS recursors here, and I should be), my A and AAAA
lookups check with failure after the 45 seconds.


I link truss -f of my application piped through grep skadns' PID to show
only skadns, on OmniOS and HardenedBSD. If my application's truss output
is required, I can supply that too.


On 10/7/22 08:27, Ellenor Bjornsdottir wrote:

While chasing this bug, I found out that I screwed up memory management in my 
program.

This is no longer skaware@'s problem. I'll be back with a truss run once I've 
fixed /that/.

On 6 October 2022 16:45:50 UTC, Laurent Bercot wrote:

Neither of those conditions actually apply - my network is up and my resolver 
is responding (albeit slowly - it takes about a second). I get the expected 
response on the first batch of queries I fire off, but then the second batch 
gets ENETUNREACH. This happens every time I run my program (albeit on special 
snowflake illumos; I have not tried on other OSes).

If you think s6-dns is behaving incorrectly, please pastebin a strace
(or local equivalent) of skadnsd somewhere, so we can check what it is
doing.

--
Laurent


--
Amelia Bjornsdottir (she, they)
sysadmin umbrellix.net, deputy sysadmin chatspeed.net
jabber: eamon.aka.amy.malik ~on~ umbrellix.net



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-07 Thread Ellenor Bjornsdottir
While chasing this bug, I found out that I screwed up memory management in my 
program.

This is no longer skaware@'s problem. I'll be back with a truss run once I've 
fixed /that/.

On 6 October 2022 16:45:50 UTC, Laurent Bercot wrote:
>> Neither of those conditions actually apply - my network is up and my 
>> resolver is responding (albeit slowly - it takes about a second). I get the 
>> expected response on the first batch of queries I fire off, but then the 
>> second batch gets ENETUNREACH. This happens every time I run my program 
>> (albeit on special snowflake illumos; I have not tried on other OSes).
>
> If you think s6-dns is behaving incorrectly, please pastebin a strace
>(or local equivalent) of skadnsd somewhere, so we can check what it is
>doing.
>
>--
> Laurent
>

-- 
Ellenor Bjornsdottir (she)
sysadmin umbrellix.net

Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-06 Thread Laurent Bercot

> Neither of those conditions actually apply - my network is up and my resolver
> is responding (albeit slowly - it takes about a second). I get the expected
> response on the first batch of queries I fire off, but then the second batch
> gets ENETUNREACH. This happens every time I run my program (albeit on special
> snowflake illumos; I have not tried on other OSes).


 If you think s6-dns is behaving incorrectly, please pastebin a strace
(or local equivalent) of skadnsd somewhere, so we can check what it is
doing.

--
 Laurent



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-06 Thread Ellenor Bjornsdottir
Neither of those conditions actually apply - my network is up and my resolver 
is responding (albeit slowly - it takes about a second). I get the expected 
response on the first batch of queries I fire off, but then the second batch 
gets ENETUNREACH. This happens every time I run my program (albeit on special 
snowflake illumos; I have not tried on other OSes).

On 6 October 2022 11:23:20 UTC, Laurent Bercot wrote:
>
>> i source spelunked and the story is that, if the error is coming from 
>> s6dns_engine_prepare, dt->protostate exceeds or equals 4. I chased that 
>> struct member around a few times and I couldn't figure out what it means to 
>> s6dns.
>
> dt->protostate is used for two things:
>
> - in UDP mode, to track how many times the query has been sent to the
>whole list of caches and all of them have failed to answer within a
>given timeout. The timeout increases for each round.
>
> - in TCP mode, to track how many bytes of the query have been written
>and how many bytes of the answer have been received (a congested
>network may result in short writes or reads).
>
> The error you got indeed happens when you're in UDP mode (the starting
>default for every query), dt->protostate has reached 4 and
>s6dns_engine_prepare() returns 0 ENETUNREACH, which
>s6dns_engine_timeout() stores into dt->status and skadnsd then sends
>back to your client.
>
> What it means is that your query was sent in succession to every
>cache listed in dt->servers (most likely, the list of "nameserver"
>entries in your /etc/resolv.conf, unless you overrode it with the
>DNSCACHEIP environment variable), and every one of them failed to
>answer within 1 second, then within 3 seconds, then within 11
>seconds, then within 45 seconds. That sounds like either your
>nameserver list is bad, or your own network is down; and s6-dns reports
>this as ENETUNREACH.
>
>--
> Laurent
>

-- 
Ellenor Bjornsdottir (she)
sysadmin umbrellix.net

Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-06 Thread Laurent Bercot




> i source spelunked and the story is that, if the error is coming from
> s6dns_engine_prepare, dt->protostate exceeds or equals 4. I chased that struct
> member around a few times and I couldn't figure out what it means to s6dns.


 dt->protostate is used for two things:

 - in UDP mode, to track how many times the query has been sent to the
whole list of caches and all of them have failed to answer within a
given timeout. The timeout increases for each round.

 - in TCP mode, to track how many bytes of the query have been written
and how many bytes of the answer have been received (a congested
network may result in short writes or reads).

 The error you got indeed happens when you're in UDP mode (the starting
default for every query), dt->protostate has reached 4 and
s6dns_engine_prepare() returns 0 with errno set to ENETUNREACH, which
s6dns_engine_timeout() stores into dt->status and skadnsd then sends
back to your client.

 What it means is that your query was sent in succession to every
cache listed in dt->servers (most likely, the list of "nameserver"
entries in your /etc/resolv.conf, unless you overrode it with the
DNSCACHEIP environment variable), and every one of them failed to
answer within 1 second, then within 3 seconds, then within 11
seconds, then within 45 seconds. That sounds like either your
nameserver list is bad, or your own network is down; and s6-dns reports
this as ENETUNREACH.
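
The retry schedule described above can be written down as a small table. This is a sketch in plain C that only imitates the described behavior: next_round_timeout() and total_wait() are hypothetical helpers, not s6-dns functions, and the total assumes a single cache in the list (with several caches the overall wait may be longer).

```c
#include <assert.h>
#include <errno.h>

/* Per-round UDP timeouts in seconds, as given in the explanation above. */
static const unsigned int round_timeout[4] = { 1, 3, 11, 45 };

/* Return the timeout for the given round (0-based). Once all four rounds
 * are used up, return 0 with errno set to ENETUNREACH, which is the error
 * the client eventually sees. */
unsigned int next_round_timeout(unsigned int round)
{
    if (round >= 4) {
        errno = ENETUNREACH;
        return 0;
    }
    return round_timeout[round];
}

/* Worst-case wait, in seconds, before a query is reported as failed,
 * assuming a single cache that never answers. */
unsigned int total_wait(void)
{
    unsigned int sum = 0;

    for (unsigned int i = 0; i < 4; i++) sum += round_timeout[i];
    assert(sum == 60);  /* sanity check: 1 + 3 + 11 + 45 */
    return sum;
}
```

Under that assumption, a query that never gets an answer is reported as failed roughly a minute after it was first sent.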

--
 Laurent



Re: [s6-dns] is there a particular reason skadns_packet would return NULL errno ENETUNREACH?

2022-10-05 Thread Amelia Bjornsdottir
i source spelunked and the story is that, if the error is coming from 
s6dns_engine_prepare, dt->protostate exceeds or equals 4. I chased that 
struct member around a few times and I couldn't figure out what it means 
to s6dns.


On 10/6/22 02:30, Amelia Bjornsdottir wrote:
I have to assume that this is because the DNS server that it's trying 
to reach out to is unreachable - is that so?



--
Amelia Bjornsdottir (she, they)
sysadmin umbrellix.net, deputy sysadmin chatspeed.net
jabber: eamon.aka.amy.malik ~on~ umbrellix.net