Re: [Dnsmasq-discuss] About resolution performance and adblock

2024-11-25 Thread Ercolino de Spiacico
The resources we have on a router are very limited. That is 
the main reason why we used dnsmasq initially.


Wireless drivers keep us jailed to kernel 2.6 for the time being, and 
this is another limitation that could prevent Pi-hole from even 
being considered, as backporting is usually a fair bit of work.


Our devices have 32-256MB of RAM, with the majority sitting at 128MB.

Making adblock work without crashing anything else was not a simple 
task. We now have something fully operational, but we are 
pondering how it actually performs past a certain limit.


If there was a simpler internal dnsmasq modification possible, like the 
one suggested in the other post of this thread, with direct file 
mapping, we would definitely support this instead.
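
As a rough illustration of what a lookup against a directly mapped file could look like, here is a minimal Python sketch. It is not dnsmasq code and not the patch discussed in this thread. The function name mapped_contains and the file layout (one lowercase domain per line, each line newline-terminated) are assumptions made for the example. Like grep, it scans the file linearly, so it shows the mechanism rather than a tuned lookup.

```python
import mmap
import os
import tempfile


def mapped_contains(path, domain):
    """Return True if `domain` appears as a whole line in the file at `path`.

    Assumes one lowercase domain per line, each terminated by a newline.
    """
    if os.path.getsize(path) == 0:
        return False  # an empty file cannot be memory-mapped
    needle = domain.lower().encode("ascii") + b"\n"
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # The first line has no preceding newline, so check it separately.
            if mm[:len(needle)] == needle:
                return True
            # Anchoring on the preceding newline avoids partial-line matches.
            return mm.find(b"\n" + needle) != -1


# Small demonstration with a throwaway file (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "blockfile")
    with open(path, "w", newline="\n") as f:
        f.write("ads.example.com\nmytestdomain.com\n")
    print(mapped_contains(path, "mytestdomain.com"))
    print(mapped_contains(path, "testdomain.com"))
```

The file is mapped read-only, so the kernel page cache holds the data rather than the resolver's own heap; whether that helps on a given router depends on the filesystem and available RAM.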


Looking at the bright side, there is a relatively new version of Tomato 
firmware called Tomato64 and this runs on x86 hardware. I'll make sure 
to bring up the pi-hole topic with the relevant people for this version 
at least.


Thanks again for all the info you brought up!

___
Dnsmasq-discuss mailing list
Dnsmasq-discuss@lists.thekelleys.org.uk
https://lists.thekelleys.org.uk/cgi-bin/mailman/listinfo/dnsmasq-discuss


Re: [Dnsmasq-discuss] About resolution performance and adblock

2024-11-25 Thread Buck Horn via Dnsmasq-discuss

Hi Ercolino,

On 24.11.24 17:26, Ercolino de Spiacico wrote:

> (...) Also, Pi-Hole requires additional HW, where FreshTomato runs on
> a simple router with optional USB storage.


pihole-FTL may require more resources to run than plain dnsmasq (mostly
memory), but it is not tied to any specific hardware.
It can be used as a drop-in replacement for dnsmasq, and it runs on any
supported OS (Debian in different flavours, Ubuntu, Fedora, CentOS
Stream). While router OSs are not among those, there are some
third-party efforts to get it running on some of them, e.g.
https://github.com/jacklul/entware-pi-hole/ for an Asus router.



> The modifications you mentioned the Pi-Hole team applied to dnsmasq are
> very interesting for sure; are they public, or can they be made public?


Pi-hole's code is published on GitHub.
The link I've shared in my previous post would also point you to
Pi-hole's GitHub repository. ;)

Kind regards,
    Buck



Re: [Dnsmasq-discuss] About resolution performance and adblock

2024-11-24 Thread Ercolino de Spiacico

Hi Buck, thank you for the new info you brought up in your message

I should have mentioned: I'm not an adblock user reporting performance 
issues here or looking for alternatives, but rather the developer of 
adblock (currently at version 2.78m) for FreshTomato, so switching to 
Pi-hole would somewhat defeat the initiative we embraced over 3 years 
ago. Also, Pi-Hole requires additional HW, where FreshTomato runs on a 
simple router with optional USB storage.


The modifications you mentioned the Pi-Hole team applied to dnsmasq are 
very interesting for sure; are they public, or can they be made public?


Still, I would like to stick to my grep test, as I believe it to be 
very significant. SQLite "might" be great, as well as regex, but I do 
believe that an optimized handling of blocked domains might be the way 
to go and a quick win too.


Once again, thank you so much for your interesting message.



Re: [Dnsmasq-discuss] About resolution performance and adblock

2024-11-22 Thread Buck Horn via Dnsmasq-discuss

Hi Ercolino,

On 19.11.24 17:31, Ercolino de Spiacico wrote:

> In the context of Adblock, I noticed that our adblock script can
> handle relatively well about 10MB of blockfile which is about 7.8% of
> the device RAM (128MB), after that the resolution time increases
> exponentially to the point where the DNS resolution times-out and more
> importantly the device becomes unstable.
> (...)
> Then, I'm not suggesting we should re-invent the wheel, but perhaps
> there's a margin for a new directive whose behavior is a simple grep
> against a mapped file to be used as an authority for those domains?
> Might be restricted to blocking only (returning NX or 0.0.0.0 or
> 127.0.0.1)? Not sure what the secondary implications of such an idea
> would be, but I'll be glad to hear some comments/opinions on this topic.



You may want to take a look at Pi-hole (https://docs.pi-hole.net).

Its DNS resolver, pihole-FTL, is a dnsmasq fork that combines dnsmasq with a
sqlite3 database for blocked domains and a B-tree algorithm for domain
matching, and also employs some advanced steps like regex matching and
deep CNAME inspection to thwart CNAME cloaking.
It also provides a web UI for management and some statistics, but that is
optional.

All of dnsmasq's configuration options are still available and fully
operational, though you may have to pay attention in places not to
conflict with Pi-hole's default options.

Pi-hole's developers are active on dnsmasq's mailing lists as well,
giving back by committing code improvements to dnsmasq, and Pi-hole team
members sometimes offer a piece of advice here as well (including me).

I've been running that (including web server, web UI and unbound as
upstream, plus wireguard) on a quad core Cortex-A7 SBC with 256MB RAM
(115MB used) and about 750,000 blocked domains (weighing about 17MB in
hosts format) plus a few regex blocks without issues for years, with
reply times averaging ~1 ms for blocked domains and ~4 ms for regex
matches.

Kind regards,
    Buck




Re: [Dnsmasq-discuss] About resolution performance and adblock

2024-11-21 Thread Donald Muller
I would also like to see a 'block-file=' directive, with support for multiple files. There 
would also have to be a way for dnsmasq to re-read the file(s), either on a 
signal or by watching the file(s).
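
To make the "watching the file(s)" option concrete, here is a minimal Python sketch that polls several block files and reloads them when any of them changes. The class name BlockFiles, the one-domain-per-line file format and the file names are assumptions for illustration only. Nothing here reflects how dnsmasq itself handles re-reading, and a real implementation might prefer a signal handler or inotify over polling.

```python
import os
import tempfile


class BlockFiles:
    """Holds the union of several domain lists and reloads them on change."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.stamps = {}
        self.domains = set()
        self.reload_if_changed()

    @staticmethod
    def _stamp(path):
        # (mtime, size) is a cheap change indicator; None means "missing".
        try:
            st = os.stat(path)
            return (st.st_mtime_ns, st.st_size)
        except FileNotFoundError:
            return None

    def reload_if_changed(self):
        """Re-read all files if any stamp changed. Returns True on reload."""
        stamps = {p: self._stamp(p) for p in self.paths}
        if stamps == self.stamps:
            return False
        domains = set()
        for p in self.paths:
            try:
                with open(p) as f:
                    for line in f:
                        name = line.strip().lower()
                        if name and not name.startswith("#"):
                            domains.add(name)
            except FileNotFoundError:
                continue  # a missing list simply contributes nothing
        self.stamps, self.domains = stamps, domains
        return True


with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "ads.list")
    b = os.path.join(tmp, "trackers.list")
    with open(a, "w") as f:
        f.write("ads.example.com\n")
    with open(b, "w") as f:
        f.write("# comment\ntracker.example.net\n")
    files = BlockFiles([a, b])
    print(sorted(files.domains))
    with open(a, "a") as f:
        f.write("more-ads.example.com\n")
    print(files.reload_if_changed(), len(files.domains))
```

Building the new set before swapping it in means lookups never see a half-loaded list, at the cost of briefly holding two copies in memory.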

Lists that I know of are:

https://raw.githubusercontent.com/notracking/hosts-blocklists/master/dnsmasq/dnsmasq.blacklist.txt

https://pgl.yoyo.org/adservers/ - there are many options to choose from to 
tailor the format you want to download the file in.




Re: [Dnsmasq-discuss] About resolution performance and adblock

2024-11-20 Thread Ercolino de Spiacico
Indeed, I think the point is straightforward: there are parts of dnsmasq 
where we do want to comply with the RFCs, and others that are locally 
significant only and can bypass certain checks, adblock being one of those.


There are a number of lists I can suggest, see this link we maintain:

https://wiki.freshtomato.org/doku.php/adblock_dns_filtering

However, unless you have de-duplication running internally at code level, 
you can simply pick any list and append it (>>) multiple times to a 
temp file. That's what I did in my test, and then I echoed a bogus domain 
into the bottom of the file to satisfy the grep test. This gives you great 
control over the file size and number of records.
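
The procedure described above (append a list several times, then add one marker domain at the very end) can be sketched in Python as follows. The function names, the sample domains and the output file name are placeholders chosen for the example. The marker mytestdomain.com is the one used in the grep test quoted elsewhere in this thread.

```python
import os
import tempfile


def build_test_blocklist(source_domains, path, copies, marker="mytestdomain.com"):
    """Write `copies` repetitions of the list, then the marker as last line.

    Returns the resulting file size in bytes.
    """
    with open(path, "w", newline="\n") as f:
        for _ in range(copies):
            for domain in source_domains:
                f.write(domain + "\n")
        f.write(marker + "\n")
    return os.path.getsize(path)


def contains_line(path, needle):
    """Linear scan for an exact line match, similar in spirit to grep -x."""
    with open(path) as f:
        return any(line.rstrip("\n") == needle for line in f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "test2")
    size = build_test_blocklist(
        ["ads.example.com", "tracker.example.net"], path, copies=1000
    )
    print(size, contains_line(path, "mytestdomain.com"))
```

Because the marker is the last line, a linear scan has to read the whole file before it matches, which makes this a worst-case timing for that kind of lookup.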


I'll have to see what it takes to pull the patch in, but I can ask our 
community for help. So yes, it is of interest for sure!


In my mind I see the margin for a new directive, e.g. named block-file or 
something similar, where based on the directive syntax each domain in that file 
would return the very same result, e.g.


block-file=dnsmasq.adblockme/#
would return NX for its content. BTW, this special file would only need 
the domains defined, not the full address/local syntax.


Likewise
block-file=dnsmasq.adblockme/
would return 0.0.0.0

Pretty much the same syntax as we currently have for individual domains.
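
A minimal Python sketch of how such a directive could be interpreted is shown below. block-file is only a proposal in this message and does not exist in dnsmasq. The function names are hypothetical, and the mapping of a trailing "/#" to NXDOMAIN and a trailing "/" to 0.0.0.0 simply follows the proposal as written here. It is worth comparing against the dnsmasq man page, which, as far as I recall, documents the existing address= option the other way round (an empty address gives NXDOMAIN, "#" gives the null address).

```python
def parse_block_file(directive):
    """Split a hypothetical 'block-file=<path>/[#]' line into (path, answer)."""
    key, sep, value = directive.strip().partition("=")
    if key != "block-file" or not sep:
        raise ValueError(f"not a block-file directive: {directive!r}")
    if value.endswith("/#"):
        return value[:-2], "NXDOMAIN"   # mapping taken from the proposal
    if value.endswith("/"):
        return value[:-1], "0.0.0.0"    # mapping taken from the proposal
    raise ValueError(f"missing trailing '/' or '/#': {directive!r}")


def answer(name, blocked, action):
    """Return `action` if `name` or any parent domain is blocked, else None.

    None stands for "not handled here, continue with normal resolution".
    """
    labels = name.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        if ".".join(labels[i:]) in blocked:
            return action
    return None


path, action = parse_block_file("block-file=dnsmasq.adblockme/#")
blocked = {"rtk.io", "runative-syndicate.com"}  # stand-in for the file content
print(path, action)
print(answer("www.rtk.io", blocked, action))
print(answer("example.org", blocked, action))
```

Checking the block list first and falling through on a miss corresponds to the "highest priority, strict-order" behaviour described in the next paragraph.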

Somehow, at code level I do see how this could be treated as an upstream 
server with "special file operation" and queried with the highest 
priority in a hard-coded strict-order leaving unresolved domains to the 
standard DNS operation (strict,no-fail,round-robin)


Thanks



On 20/11/2024 15:06, Leonid Evdokimov wrote:
> On Tue, Nov 19, 2024 at 8:05 PM Ercolino de Spiacico wrote:
>> If given the possibility, I would be very happy to map a file in RAM knowing that
>> this is handled differently from the "standard" conf-file.
>
> I agree with this point and I'm developing libddt (dense domain table)
> that is basically a mmap()'able trie representing a list of domains.
> The data structure resembles the one libpsl uses to store
> publicsuffix.org database.
>
> Preliminary results for a test-case of 500k domains were ~2 MiB of RAM
> usage and sub-10ms resolution latency.
>
> However, I got no replies for my call-for-test-cases[1] a few months
> ago, so I moved my focus to other sub-projects of that project for a
> while.
>
> I would be grateful if you can share your block-lists with me, so I
> can test my code with more cases.
>
> Also, please tell me, if you have any interest in testing the
> patch-set. We can't know if it'll be merged to the main dnsmasq repo,
> but extra testing and feedback kinda increases chances of that
> happening :-)
>
> [1] https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2024q3/017627.html






Re: [Dnsmasq-discuss] About resolution performance and adblock

2024-11-20 Thread Chris Green
On Wed, Nov 20, 2024 at 04:06:08PM +0300, Leonid Evdokimov wrote:
> On Tue, Nov 19, 2024 at 8:05 PM Ercolino de Spiacico
>  wrote:
> > If given the possibility, I would be very happy to map a file in RAM 
> > knowing that
> > this is handled differently from the "standard" conf-file.
> 
> I agree with this point and I'm developing libddt (dense domain table)
> that is basically a mmap()'able trie representing a list of domains.
> The data structure resembles the one libpsl uses to store
> publicsuffix.org database.
> 
> Preliminary results for a test-case of 500k domains were ~2 MiB of RAM
> usage and sub-10ms resolution latency.
> 
> However, I got no replies for my call-for-test-cases[1] a few months
> ago, so I moved my focus to other sub-projects of that project for a
> while.
> 
> I would be grateful if you can share your block-lists with me, so I
> can test my code with more cases.
> 
> Also, please tell me, if you have any interest in testing the
> patch-set. We can't know if it'll be merged to the main dnsmasq repo,
> but extra testing and feedback kinda increases chances of that
> happening :-)
> 
> [1] 
> https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2024q3/017627.html
> 
This might interest me too.  I run dnsmasq with a 'blacklist' on two
systems.  One is a ThinkPad T470 laptop with 8GB of memory and more disk
space than I know what to do with, so I doubt that reducing dnsmasq's
memory footprint will make much difference there.  However, I'm also
running dnsmasq with a blacklist on an Asus DSL-AC68U router (running
ASUSWRT-Merlin) and that has only 256MB of memory, so reducing the amount
that dnsmasq uses could well be a good thing.

The blacklist I use comes from:-

https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts

I run a trivial awk script to convert it so I can just put the
converted file into /etc/dnsmasq.d.  An example of the converted file
is:-

address=/uk2.thor.rtk.io/
address=/www.rtk.io/
address=/mt.rtmark.net/
address=/my.rtmark.net/
address=/token.rubiconproject.com/
address=/runative-syndicate.com/
address=/pixel.runative-syndicate.com/
address=/s.sh/
address=/log-1.samsungacr.com/
address=/log-2.samsungacr.com/

Currently the file has a bit over 121k entries and is about 3.7MB.
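
The awk script itself is not shown in this message. As an illustration of the same kind of conversion, here is a Python sketch that turns hosts-format lines into address=/domain/ lines. The function name, the set of sink addresses and the list of skipped names are assumptions and may differ from what the actual script does.

```python
def hosts_to_dnsmasq(lines):
    """Convert hosts-format lines into dnsmasq 'address=/domain/' lines."""
    sink_addresses = {"0.0.0.0", "127.0.0.1"}
    # Names that hosts files commonly define but that should not be blocked.
    skip_names = {"localhost", "localhost.localdomain", "local",
                  "broadcasthost", "0.0.0.0"}
    out = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or fields[0] not in sink_addresses:
            continue
        for name in fields[1:]:
            if name not in skip_names:
                out.append(f"address=/{name}/")
    return out


sample = [
    "# comment",
    "127.0.0.1 localhost",
    "0.0.0.0 0.0.0.0",
    "0.0.0.0 www.rtk.io",
    "0.0.0.0 s.sh  # trailing comment",
]
print("\n".join(hosts_to_dnsmasq(sample)))
```

For the sample above this prints address=/www.rtk.io/ and address=/s.sh/, matching the format of the converted file shown in the message.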

At the moment I can't see a way to monitor memory usage on the
DSL-AC68U, top just says 'no process info in /proc'.  I can install
things via opkg on it.  There's 'collectd-mod-memory' which says it's
a "physical memory usage input plugin".


-- 
Chris Green



Re: [Dnsmasq-discuss] About resolution performance and adblock

2024-11-20 Thread Leonid Evdokimov
On Tue, Nov 19, 2024 at 8:05 PM Ercolino de Spiacico
 wrote:
> If given the possibility, I would be very happy to map a file in RAM knowing 
> that
> this is handled differently from the "standard" conf-file.

I agree with this point and I'm developing libddt (dense domain table)
that is basically a mmap()'able trie representing a list of domains.
The data structure resembles the one libpsl uses to store
publicsuffix.org database.
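
For readers unfamiliar with the idea, here is a small Python sketch of a trie over DNS labels. It is not libddt and it is not mmap()'able: it uses ordinary nested dictionaries purely to show why a trie suits domain lists (shared suffixes such as "com" are stored once, and a lookup costs one step per label). The class and method names are made up for this example.

```python
_END = object()  # sentinel key marking the end of a listed domain


class DomainTrie:
    """Trie keyed on DNS labels, stored right to left (TLD first)."""

    def __init__(self):
        self.root = {}

    @staticmethod
    def _labels(name):
        return reversed(name.lower().rstrip(".").split("."))

    def add(self, domain):
        node = self.root
        for label in self._labels(domain):
            node = node.setdefault(label, {})
        node[_END] = True

    def is_blocked(self, name):
        """True if `name` equals a listed domain or is a subdomain of one."""
        node = self.root
        for label in self._labels(name):
            node = node.get(label)
            if node is None:
                return False
            if _END in node:
                return True
        return False


trie = DomainTrie()
for d in ("rtk.io", "log-1.samsungacr.com"):
    trie.add(d)
print(trie.is_blocked("www.rtk.io"))
print(trie.is_blocked("samsungacr.com"))
```

A compact on-disk encoding of the same structure is what would make it mappable with mmap(); that part is specific to the library described above and is not attempted here.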

Preliminary results for a test-case of 500k domains were ~2 MiB of RAM
usage and sub-10ms resolution latency.

However, I got no replies for my call-for-test-cases[1] a few months
ago, so I moved my focus to other sub-projects of that project for a
while.

I would be grateful if you can share your block-lists with me, so I
can test my code with more cases.

Also, please tell me, if you have any interest in testing the
patch-set. We can't know if it'll be merged to the main dnsmasq repo,
but extra testing and feedback kinda increases chances of that
happening :-)

[1] https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2024q3/017627.html

-- 
WBRBW, Leonid Evdokimov, https://darkk.net.ru tel:+79816800702
PGP: 6691 DE6B 4CCD C1C1 76A0  0D4A E1F2 A980 7F50 FAB2



Re: [Dnsmasq-discuss] About resolution performance and adblock

2024-11-19 Thread imnozi
FWIW, I use (on a 32-bit i686 appliance) a 33MiB ads/pron/warez blocklist of 
1.2M domains in the form "local=/FQDN/" (that is, the domains do not exist at 
all for me; I'm OK with seeing whitespace). The virtual size (from 'ps aux') of 
the running dnsmasq is 175MiB. Resolution time is a hair slower, but still very 
acceptable.

Most of the domains in the list are longer than 10 bytes. There are no 
SHA1/SHA256 collisions in the first 10 bytes of the FQDN hashes; but there are 
in the first 9 bytes, so it's close.

Does dnsmasq store the actual FQDN in RAM? If so, could it instead store the 
first 10 bytes of a hash of the domain? This would help reduce RAM usage by 
maybe 40%.
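
The idea of keeping only a truncated hash per domain can be sketched in Python as below. This says nothing about how dnsmasq stores names internally; the function names are hypothetical. Note that nbytes counts raw digest bytes (10 bytes are 20 hex characters), that only exact-name matches are possible with such an index, and that a Python set carries far more per-entry overhead than the 10 bytes themselves (for 1.2M domains, 10-byte keys amount to about 12 MB of raw key data).

```python
import hashlib


def short_key(fqdn, nbytes=10):
    """First `nbytes` bytes of the SHA-256 digest of the lowercased name."""
    return hashlib.sha256(fqdn.lower().encode("utf-8")).digest()[:nbytes]


def build_index(domains, nbytes=10):
    """Return (set of truncated hashes, number of collisions observed)."""
    index = set()
    collisions = 0
    for name in {d.lower() for d in domains}:  # de-duplicate names first
        key = short_key(name, nbytes)
        if key in index:
            collisions += 1  # two different names share a truncated hash
        index.add(key)
    return index, collisions


index, collisions = build_index(
    ["Ads.Example.com", "ads.example.com", "tracker.example.net"]
)
print(len(index), collisions)
print(short_key("ads.example.com") in index)
```

A collision here means a non-listed name could be blocked by mistake, so the truncation length trades memory against the risk of false positives.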

Huge blocklists are going to require lots of RAM. If your system has limited 
RAM, you really have two options. (1) Plead with the device manufacturers to 
boost RAM to 1GiB. (2) Use an RPi to run dnsmasq; a personal challenge for me 
would be to run it transparently between my F/W and ISP.

One question remains. What does dnsmasq do that increases RAM usage by about 
160MiB for 25MiB of FQDNs? Though it may be completely reasonable, it seems a 
trifle much from a soaring vulture.

Neal


On Tue, 19 Nov 2024 18:31:42 +0200
Ercolino de Spiacico  wrote:

> In the context of Adblock, I noticed that our adblock script can handle 
> relatively well about 10MB of blockfile which is about 7.8% of the 
> device RAM (128MB), after that the resolution time increases 
> exponentially to the point where the DNS resolution times-out and more 
> importantly the device becomes unstable.
> 
> I was trying to understand the root cause of why we couldn't have a 
> larger blockfile, which is compiled in local=/example.com/# format. The 
> RAM, cache and buffer demand stay relatively low when the system becomes 
> unstable.
> 
> With the aim of investigating this further, I ran a test creating a bogus 
> blockfile of about 100MB in size (and 3.6M lines/domains). Here are 
> the grep results for fetching info from it:
> 
> root@router:/mnt/USB/adblock# time grep 'mytestdomain.com' /mnt/USB/adblock/test2
> mytestdomain.com
> real    0m 2.44s
> user    0m 0.49s
> sys     0m 0.43s
> 
> root@router:/mnt/USB/adblock# cp /mnt/USB/adblock/test2 /tmp/test2
> 
> root@router:/mnt/USB/adblock# time grep 'mytestdomain.com' /tmp/test2
> mytestdomain.com
> real    0m 0.65s
> user    0m 0.39s
> sys     0m 0.20s
> 
> What I'm trying to demonstrate here is that a USB2 device can extract a 
> domain via grep in 2.44sec, and if that file was to be placed in RAM 
> (/tmp is mapped in RAM on devices with squashfs) it's just 0.65sec. 
> Admittedly /tmp compresses the content so the 100MB uses about 38MB, 
> still the point on performance is valid and tells me we could fit 
> 200-250MB blockfile if ever needed, looking at RAM capacity only.
> 
> As a point of discussion/improvement: I believe dnsmasq loads the 
> custom config (so the blockfile in this case) into RAM, so why do we 
> experience poor resolution performance and system instability at just 10MB?
> 
> Considering the system grep is so fast, could this be an alternative 
> method for dnsmasq to address locally defined domains? If given the 
> possibility, I would be very happy to map a file in RAM knowing that 
> this is handled differently from the "standard" conf-file.
> 
> I suppose the first step would be to fully understand where the 
> limitation we currently have comes from.
> 
> Then, I'm not suggesting we should re-invent the wheel, but perhaps 
> there's a margin for a new directive whose behavior is a simple grep 
> against a mapped file to be used as an authority for those domains? 
> Might be restricted to blocking only (returning NX or 0.0.0.0 or 
> 127.0.0.1)? Not sure what the secondary implications of such an idea 
> would be, but I'll be glad to hear some comments/opinions on this topic.
> 
> 
> Thanks
> 

