On 2023-04-13 21:01 +02, "Peter J. Philipp" <p...@delphinusdns.org> wrote:
> On Mon, Apr 10, 2023 at 10:17:08AM +0200, Peter J. Philipp wrote:
>> On Sat, Apr 08, 2023 at 08:28:05PM +0200, Peter J. Philipp wrote:
>> /cut
>> > Apr  6 14:43:05 polarstern unwind[97893]: bad packet: too large: 65552 - 
>> > pool.ntp.org. IN AAAA
>> > Apr  6 14:46:25 polarstern unwind[97893]: bad packet: too large: 65552 - 
>> > pool.ntp.org.mainrechner.de. IN AAAA
>> 
>> /cut
>> > Any thoughts?
>> > 
>> > Best Regards,
>> > -peter
>> 
>> OK I traced this log down, by mere accident.  It is a condition when the
>> forwarder isn't listening on TCP (or UDP).  Perhaps we need a better syslog
>> for this, I'll see later today if I can produce a patch.
>> 
>> Best Regards,
>> -peter
>
> Alright, I have a patch together with the first patch that I sent.  I tested 
> this by killing my forwarder and pinging centroid.eu (my site), the error 
> message is correct in my view.   But you're the judge of that.

Sorry, this doesn't make any sense.

I could never reproduce the -1 or > 65535 case reliably, I see it once
in a while, but I can't trigger it. I also can't reproduce it with your
instructions.

As far as I can work out these weird answer_len numbers come from
libworker_event_done_cb() in libworker.c, specifically:

                (*cb)(cb_arg, rcode, (buf?(void*)sldns_buffer_begin(buf):NULL),
                        (buf?(int)sldns_buffer_limit(buf):0), sec, why_bogus, 
was_ratelimited);

So you are comparing against sldns_buffer_limit(buf), that can just be
anything, you can't derive from that "remote server had no listening
port".

I get "rcode: 2, answer_len: 128" or something else sensibly if I point
unwind at a non-responsive forwarder.

Since you can reproduce it, can you try this diff please and report
back?

Does it hit that code path when answer_len is -1 or when answer_len
is 65552?

diff --git libunbound/libunbound/libworker.c libunbound/libunbound/libworker.c
index 11bf5f9db55..6bb34001ee9 100644
--- libunbound/libunbound/libworker.c
+++ libunbound/libunbound/libworker.c
@@ -680,6 +680,9 @@ libworker_event_done_cb(void* arg, int rcode, sldns_buffer* 
buf,
        context_query_delete(q);
        lock_basic_unlock(&ctx->cfglock);
 
+       log_warn("%s: rcode: %d, answer_len: %d\n", __func__,
+           rcode, buf?(int)sldns_buffer_limit(buf):0);
+
        if(!cancelled) {
                /* call callback */
                int sec = 0;


>
> ---------------->
> Apr 13 20:53:26 polarstern unwind[73363]: terminating
> Apr 13 20:53:27 polarstern resolvd[81113]: rebuilding: new unwind
> Apr 13 20:54:12 polarstern unwind[16252]: remote nameserver had no listening 
> port: centroid.eu. IN A
> ^Z[1] + Suspended            tail -f /var/log/daemon 
> root@polarstern# dddctl start
> starting delphinusdnsd
> root@polarstern# fg
> tail -f /var/log/daemon 
> <----------------
> (after this restart of delphinusdnsd forwarder this pinged fine, no more logs)
>
> The patch is below my .signature,
>
> Best Regards,
> -peter
>
> Index: resolver.c
> ===================================================================
> RCS file: /cvs/src/sbin/unwind/resolver.c,v
> retrieving revision 1.158
> diff -u -p -u -r1.158 resolver.c
> --- resolver.c        8 Feb 2023 08:01:25 -0000       1.158
> +++ resolver.c        13 Apr 2023 18:51:01 -0000
> @@ -954,13 +954,19 @@ resolve_done(struct uw_resolver *res, vo
>       running_res = --rq->running;
>  
>       if (answer_len < LDNS_HEADER_SIZE) {
> -             log_warnx("bad packet: too short");
> +             if (answer_len != -1)
> +                     log_warnx("bad packet: too short");
>               goto servfail;
>       }
>  
>       if (answer_len > UINT16_MAX) {
> -             log_warnx("bad packet: too large: %d - %s", answer_len,
> -                 query_imsg2str(query_imsg));
> +             if (answer_len == 65552) {
> +                     log_warnx("remote nameserver had no listening port: %s",
> +                             query_imsg2str(query_imsg);
> +             } else {
> +                     log_warnx("bad packet: too large: %d - %s", answer_len,
> +                             query_imsg2str(query_imsg));
> +             }
>               goto servfail;
>       }
>       answer_header->answer_len = answer_len;
>

-- 
In my defence, I have been left unsupervised.

Reply via email to