Re: [Dnsmasq-discuss] [PATCH] Fix buffer overflow in TCP requests

2020-06-30 Thread Frank
Resending this because I realized I sent it to Simon rather than the list:

Hi Simon,

This bug is fairly easy to reproduce. It can take 10 mins or more to
reproduce a crash so I suggest checking the length you get in
cache_recv_insert. If the domains being used are > 4 characters, this
will catch the bug the first time it occurs. That usually happens
within a minute. Something like this:

if (m > MAXDNAME) abort();

Then bombard dnsmasq with TCP queries:

docker run --rm -v /root/dns_queries:/dns_queries
aiys0211/flamethrower_v0.10.2 -f
/dns_queries/queryfile-example-100thousand -l 1000 -c 20 -q 10 -d 300
-p 53 -P tcp 

I used the first 100 thousand entries from here:
https://www.dns-oarc.net/files/dnsperf/data/queryfile-example-10million-201202.gz

Frank

On Sun, Jun 28, 2020 at 1:58 PM Simon Kelley  wrote:
>
> That's a nasty bug, and could explain what pi-hole users are seeing.
>
> If I understand things correctly, this bug will only manifest itself
> when the write() or read() syscalls return EINTR ir EAGAIN, which is
> possible, but not common, hence the bugs wasn't detected earlier.
>
> Frank, did you find a way to reproduce this on demand, or just capture
> sporadic instances?
>
> I've applied Frank's patch, but immediately superseded it with a more
> general clean-up which should have the same effect. I've tagged a
> 2.82test1 release with this. If that fixes the problem for pi-hole, I
> plan to go to 2.82 fairly quickly: I don't want this hanging around.
>
> It's slightly ironic that the original change in 2.81 is to fix a
> theoretical hang which I've never actually seen and cannot reproduce,
> but it introduced a really crash bug.
>
>
> Cheers,
>
> Simon.
>
>
>
>
> On 18/06/2020 22:42, Dominik wrote:
> > Hey Frank, dear list-members,
> >
> > thanks for your proposed fix. The Pi-hole team adopted dnsmasq v2.81
> > early on and we're seeing reports for mysterious crashes scattered all
> > over the dnsmasq code since the very first days of releasing our latest
> > version. Crashes reports show several locations, e.g., in_zone(),
> > iface_check() and even main() in a few places. TCP forks were always
> > happening sometime (not necessarily immediately) before a crash.
> > Running dnsmasq in debug mode (prevents forking) stopped any crashes.
> >
> > So far, we haven't had the time to dig into this ourselves properly,
> > however, your description perfectly matches with my preliminary
> > conclusion for one week ago:
> >> the dangling pointer is in **all** bug reports known so far in the
> > daemon pointer
> > (https://github.com/pi-hole/FTL/issues/805#issuecomment-643157367)
> >
> > The addition of TCP forks being able to add cache replies in dnsmasq
> > v2.81 is what is causing this code (which has been like this for along
> > time) to become problematic. I recently looked at the same code but
> > missed the chance that the byte may be still in the pipe. This is a
> > very good catch!
> >
> > I'm writing this mail to raise awareness that this appears to be a high
> > severity bug as it can easily lead to anything between indeterminable
> > behavior to fatal failures (typically the case). Even when the bug can
> > only be triggered under certain conditions, it recently caused over 200
> > posts on the Pi-hole issue ticket system (Github). Crashes due to this
> > have been reported independently by several (more than 20) individual
> > users.
> > Best
> > Dominik
> >
> > On Wed, 2020-06-17 at 19:52 -0700, Frank wrote:
> >> Hello,
> >>
> >> This patch fixes a buffer overflow in TCP requests. Since the read is
> >> not actually being retried, the byte written by the child can be left
> >> in the pipe. When that happens, cache_recv_insert() reads the length
> >> of the name, which is now multiplied by 256 due to the extra 0 byte
> >> (8 bit shift) and results in daemon->namebuff being overflowed.
> >> Namebuff is immediately before the daemon struct in memory so it ends
> >> up corrupting the beginning of the daemon struct.
> >>
> >> diff --git a/src/dnsmasq.c b/src/dnsmasq.c
> >> index 6481de8..457dea3 100644
> >> --- a/src/dnsmasq.c
> >> +++ b/src/dnsmasq.c
> >> @@ -1887,7 +1887,7 @@ static void check_dns_listeners(time_t now)
> >>single byte comes back up the pipe, which
> >>is sent by the child after it has closed
> >> the
> >>netlink socket. */
> >> -   retry_send(read(pipefd[0], , 1));
> >> +   while (retry_send(read(pipefd[0], , 1)));
> >>  #endif
> >> break;
> >>   }
> >> @@ -1928,7 +1928,7 @@ static void check_dns_listeners(time_t now)
> >>  #ifdef HAVE_LINUX_NETWORK
> >>   /* See comment above re netlink socket. */
> >>   close(daemon->netlinkfd);
> >> - retry_send(write(pipefd[1], , 1));
> >> + while (retry_send(write(pipefd[1], , 1)));
> >>  #endif
> >>   

Re: [Dnsmasq-discuss] Static leases issues

2020-06-30 Thread Olaf Hering
Am Mon, 29 Jun 2020 15:45:16 +0100
schrieb Simon Kelley :

> the DHCP server has to comply

I think this is the question: comply to what?

Clearly the DHCP admin is the _only_ authority to decide which MAC gets what IP 
address, the DHCP client is only allowed to make suggestions. As such every 
DHCP server has to follow the source of authority, it should rather not follow 
what is written in some silly RFC. (I assume the implemented behavior comes 
from some RFC, not from real life expectations)

It might well be that the IP for the DHCP client will change in the future, 
once the lease expires. Why would it matter if the client gets the expected IP 
address "now" rather than "later"?


Olaf


pgpwjJzlRxHAs.pgp
Description: Digitale Signatur von OpenPGP
___
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
http://lists.thekelleys.org.uk/mailman/listinfo/dnsmasq-discuss