On 22.01.2026 12:33, Michael Tremer wrote:
> Hello everyone,

Hi,
short feedback from me:

- I activated both the Suricata (IPFire DBL - Domain Blocklist)
- and the URL filter lists from 'dbl.ipfire.org'.
- I even took the 'smart-tv' domains from the IPFire DBL blacklist and copied/pasted them into my Fritzbox filter lists.

Everything works as expected. Besides, the IPFire DBL list downloads a lot faster than the list from 'Univ. Toulouse'... ;-)

Functionality is good - no false positives or other problems seen.

Good work - thanks!

Best,
Matthias

> Over the past few weeks I have made significant progress on all of this, and I think we're getting close to something the community will be really happy with. I'd love to get feedback from the team before we finalise things.
>
> So what has happened?
>
> First of all, the entire project has been renamed. DNSBL is not entirely what this is. Although the lists can be thrown into DNS, they have much more use outside of it, so I thought we should simply go with DBL, short for Domain Blocklist. After all, we are only importing domains. The new home of the project therefore is https://www.ipfire.org/dbl
>
> I have added a couple more lists that I thought interesting and a couple more sources that I considered a good start. Hopefully, we will soon gather some more feedback on how well this is all holding up. My main focus has however been on the technology that will power this project.
>
> One of the bigger challenges was to create Suricata rules from the lists. Initially I tried to create a ton of rules, but since our lists are so large, this quickly became too complicated. I have now settled on using a feature that is only available in more recent versions of Suricata (I believe 7 and later), but since we are already on Suricata 8 in IPFire this won't be a problem for us. All domains for each list are basically compiled into one massively large dataset, and one single rule refers to that dataset.
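[Editor's note: the dataset approach described above can be sketched as a single rule per list. The rule text below is only an illustration of Suricata's dataset feature - the dataset name, file name, sid and classtype are made up and are not the rules shipped in Core Update 200.]

```text
# One rule covers an entire list by matching DNS queries against a
# pre-compiled dataset (illustrative names/sid, not the shipped rules):
alert dns any any -> any any (msg:"IPFire DBL: query for listed malware domain"; \
    dns.query; to_lowercase; \
    dataset:isset,dbl-malware,type string,load dbl-malware.data; \
    classtype:bad-unknown; sid:9100001; rev:1;)
```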
> This way, we won't have the option to remove any false positives, but at least Suricata and the GUI won't die a really bad death when loading millions of rules.
>
> Suricata will now be able to use our rules to block access to any listed domain of each of the categories over DNS, HTTP, TLS or QUIC. Although I don't expect many users to use Suricata to block porn or other things, this is a great backstop to enforce any such policy. For example, if a user on the network tries to circumvent a DNS server that filters out certain domains, then even after getting an IP address resolved through other means, they won't be able to open a TLS/QUIC connection or send an HTTP request to any blocked domain. Some people have said they were interested in blocking DNS-over-HTTPS, and this is a perfect way to do that and be sure that any server on the list will actually be completely inaccessible.
>
> Those Suricata rules are already available for testing in Core Update 200:
> https://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=9eb8751487d23dd354a105c28bdbbb0398fe6e85
>
> I have chosen various severities for the lists. If someone were to block advertising using DBL, that is fine, but not a very severe alert. If someone chooses to block malware and there is a system on the network trying to access those domains, that is an alert worth being investigated by an admin. Our new Suricata Reporter will show those violations in different colours based on the severity, which helps to identify the right alerts to investigate further.
>
> Previously I asked you to test the lists using URL Filter. Those rules are now available as well in Core Update 200:
> https://git.ipfire.org/?p=ipfire-2.x.git;a=commitdiff;h=db160694279a4b10378447f775dd536fdfcfb02a
>
> I talked about a method to remove any dead domains from the sources, which is a great way to keep our lists smaller.
> The sheer size of them is a problem in so many ways. That check was however a little too ambitious and I had to make it less eager. Basically, if we are in doubt, we still need to list the domain because it might be resolvable by a user.
>
> https://git.ipfire.org/?p=dbl.git;a=commitdiff;h=bb5b6e33b731501d45dea293505f7d42a61d5ce7
>
> So how else could we make the lists smaller without losing any actual data? Since we sometimes list a whole TLD (e.g. .xxx or .porn), there is very little point in listing any domains under that TLD. They will always be caught anyway. So I built a check that marks all domains that don't need to be included on the exported lists because they will never be needed, and was able to shrink the lists by a lot again.
>
> The website does not show this data, but the API returns the number of "subsumed" domains (I didn't have a better name):
>
> curl https://api.dbl.ipfire.org/lists | jq .
>
> The number shown comes on top of the total number of domains, and it usually amounts to another 50-200% of it - so subsumption cuts the exported lists down considerably.
>
> Those stats will now also be stored in a history table so that we will be able to track the growth of all lists.
>
> Furthermore, the application will now send email notifications for any incoming reports. This way, we will be able to stay in close touch with the reporters and keep them up to date on their submissions, as well as inform moderators that there is something to have a look at.
>
> The search has been refactored as well, so that we can clearly show at a glance whether something is blocked or not: https://www.ipfire.org/dbl/search?q=github.com. There is detailed information available on all domains and what happened to them.
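[Editor's note: the "subsumed" check described earlier can be sketched as follows. A domain does not need to be exported when one of its parent domains, or its whole TLD, is already on the same list. Function names are illustrative; this is not the actual DBL implementation.]

```python
def parents(domain):
    """Yield every parent of a domain: a.b.c -> b.c, c."""
    labels = domain.split(".")
    for i in range(1, len(labels)):
        yield ".".join(labels[i:])

def split_subsumed(listed):
    """Split a set of listed domains into (exported, subsumed)."""
    exported, subsumed = set(), set()
    for domain in listed:
        if any(parent in listed for parent in parents(domain)):
            subsumed.add(domain)  # already covered by a listed parent/TLD
        else:
            exported.add(domain)
    return exported, subsumed

exported, subsumed = split_subsumed(
    {"porn", "example.porn", "cdn.example.porn", "badsite.com"}
)
print(sorted(exported))  # ['badsite.com', 'porn']
print(sorted(subsumed))  # ['cdn.example.porn', 'example.porn']
```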
> In the case of github.com, it seems to be blocked and unblocked by someone all the time, and we can see a clear audit trail of that:
> https://www.ipfire.org/dbl/lists/malware/domains/github.com
>
> On the DNS front, I have added some metadata to the zones so that people can programmatically request some data, like when the zone was last updated (as a human-friendly timestamp and not only the serial), license, description and so on:
>
> # dig +short ANY _info.ads.dbl.ipfire.org @primary.dbl.ipfire.org
> "total-domains=42226"
> "license=CC BY-SA 4.0"
> "updated-at=2026-01-20T22:17:02.409933+00:00"
> "description=Blocks domains used for ads, tracking, and ad delivery"
>
> Now, I would like to hear more feedback from you. I know we've all been stretched thin lately, so I especially appreciate anyone who has time to review and provide input. Ideas, a quick note on whether you like it or not, thoughts on where this could go in the future - anything helps.
>
> Looking ahead, I would like us to start thinking about the RPZ feature that has been on the wishlist. IPFire DBL has been a bigger piece of work, and I think it's worth having a conversation about sustainability. Resources for this need to be allocated and paid for. Open source is about freedom, not free beer - and to keep building features like this, we will need to explore some funding options. I would be interested to hear any ideas you might have that could work for IPFire.
>
> Please share your thoughts on the mailing list when you can - even a quick 'looks good' or 'I have concerns about X' is valuable. Public discussion helps everyone stay in the loop and contribute.
>
> I am aiming to move forward with this in a week's time, so if you have input, now would be a good time to share it.
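[Editor's note: the key=value TXT strings served under `_info.<list>.dbl.ipfire.org`, as shown in the dig output above, are easy to consume programmatically. A minimal sketch - the field names follow that example, everything else is an assumption:]

```python
def parse_zone_info(txt_records):
    """Parse '"key=value"' TXT strings into a plain dict."""
    info = {}
    for record in txt_records:
        # Strip the surrounding quotes, then split on the first '='
        key, _, value = record.strip('"').partition("=")
        info[key] = value
    return info

info = parse_zone_info([
    '"total-domains=42226"',
    '"license=CC BY-SA 4.0"',
    '"updated-at=2026-01-20T22:17:02.409933+00:00"',
])
print(info["total-domains"])  # 42226
```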
>
> Best,
> -Michael
>
>> On 6 Jan 2026, at 10:20, Michael Tremer <[email protected]> wrote:
>>
>> Good Morning Adolf,
>>
>> I had a look at this problem yesterday and it seems that parsing the format is becoming a little difficult this way. Since this only affects very few domains, I have simply whitelisted them all manually, and duckduckgo.com and others should now be easily reachable again.
>>
>> Please let me know if you have any more findings.
>>
>> All the best,
>> -Michael
>>
>>> On 5 Jan 2026, at 11:48, Michael Tremer <[email protected]> wrote:
>>>
>>> Hello Adolf,
>>>
>>> This is a good find.
>>>
>>> But if duckduckgo.com is blocked, there must be a source somewhere that blocks that domain - not only a sub-domain of it. Otherwise we have a bug somewhere.
>>>
>>> This is most likely because the domain is listed here, but with some stuff afterwards:
>>>
>>> https://raw.githubusercontent.com/mtxadmin/ublock/refs/heads/master/hosts/_malware_typo
>>>
>>> We strip everything after a # away because we consider it a comment. However, that leaves a line containing only the bare domain, which causes it to be listed.
>>>
>>> The # sign is used there as a special character, but at the same time it is used for comments.
>>>
>>> I will fix this and then refresh the list.
>>>
>>> -Michael
>>>
>>>> On 5 Jan 2026, at 11:31, Adolf Belka <[email protected]> wrote:
>>>>
>>>> Hi Michael,
>>>>
>>>>
>>>> On 05/01/2026 12:11, Adolf Belka wrote:
>>>>> Hi Michael,
>>>>>
>>>>> I have found that the malware list includes duckduckgo.com
>>>>>
>>>> I have checked through the various sources used for the malware list.
>>>>
>>>> The ShadowWhisperer (Tracking) list has improving.duckduckgo.com in its list. I suspect that this one is the one causing the problem.
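[Editor's note: the "#" pitfall discussed above can be sketched like this. Naively cutting a line at the first "#" turns an entry such as "duckduckgo.com#something" into a bare "duckduckgo.com", which then gets listed. One possible fix - an assumption, not the actual DBL code - is to treat "#" as starting a comment only at the beginning of a line or after whitespace:]

```python
def extract_entry(line):
    """Return the entry on a source-list line, or None for comment/blank."""
    for i, ch in enumerate(line):
        # A real comment either starts the line or follows whitespace
        if ch == "#" and (i == 0 or line[i - 1].isspace()):
            line = line[:i]
            break
    return line.strip() or None

print(extract_entry("example.com  # known tracker"))   # example.com
print(extract_entry("duckduckgo.com#some-reference"))  # duckduckgo.com#some-reference
print(extract_entry("# just a comment"))               # None
```

An entry that still carries a "#" fragment, like the second case, would then fail domain validation instead of being listed as a bare domain.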
>>>>
>>>> The mtxadmin (_malware_typo) list has duckduckgo.com mentioned 3 times, but not directly as a domain name - it looks more like a reference.
>>>>
>>>> Regards,
>>>>
>>>> Adolf.
>>>>
>>>>
>>>>> Regards,
>>>>> Adolf.
>>>>>
>>>>>
>>>>> On 02/01/2026 14:02, Adolf Belka wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 02/01/2026 12:09, Michael Tremer wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>>> On 30 Dec 2025, at 14:05, Adolf Belka <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hi Michael,
>>>>>>>>
>>>>>>>> On 29/12/2025 13:05, Michael Tremer wrote:
>>>>>>>>> Hello everyone,
>>>>>>>>>
>>>>>>>>> I hope everyone had a great Christmas and a couple of quiet days to relax from all the stress that was the year 2025.
>>>>>>>> Still relaxing.
>>>>>>>
>>>>>>> Very good, so let's have a strong start into 2026 now!
>>>>>>
>>>>>> Starting next week, yes.
>>>>>>
>>>>>>>
>>>>>>>>> Having a couple of quieter days, I have been working on a new, little (hopefully) side project that has probably been high up on our radar since the Shalla list shut down in 2020, or maybe even earlier. The goal of the project is to provide good lists with categories of domain names, which are usually used to block access to these domains.
>>>>>>>>>
>>>>>>>>> I simply call this IPFire DNSBL, which is short for IPFire DNS Blocklists.
>>>>>>>>>
>>>>>>>>> How did we get here?
>>>>>>>>>
>>>>>>>>> As stated before, the URL filter feature in IPFire has the problem that there are not many good blocklists available any more. There used to be a couple more - most famously the Shalla list - but we are now down to a single list from the University of Toulouse. It is a great list, but it is not always the best fit for all users.
>>>>>>>>>
>>>>>>>>> Then there has been talk about whether we could implement more blocking features into IPFire that don't involve the proxy - most famously, blocking over DNS.
>>>>>>>>> The problem here remains that the blocking feature is only as good as the data that is fed into it. Some people have been putting forward a number of lists that were suitable for them, but they would not have replaced the blocking functionality as we know it. Their aim is to provide "one list for everything", but that is not what people usually want. They are targeted at a classic home user, and the only separation being made is for any adult/porn/NSFW content, which usually is put into a separate list.
>>>>>>>>>
>>>>>>>>> It would have been technically possible to include these lists and let the users decide, but that is not the aim of IPFire. We want to do the job for the user so that their job gets easier. Including obscure lists that don't have a clear outline of what they actually want to block ("bad content" is not a category) and passing on the burden of figuring out whether they need the "Light", "Normal", "Pro", "Pro++", "Ultimate" or even a "Venti" list with cream on top is really not going to work. It is all confusing and will lead to a bad user experience.
>>>>>>>>>
>>>>>>>>> An even bigger problem, however, which is completely impossible to solve, is the bad licensing of these lists. A user has asked the publisher of the HaGeZi list whether it could be included in IPFire and under what terms. The response was that the list is available under the terms of the GNU General Public License v3, but that does not seem to be true. The list contains data from various sources, many of which are licensed under incompatible licenses (CC BY-SA 4.0, MPL, Apache2, ...), and unless there is a non-public agreement that this data may be redistributed, there is a huge legal issue here.
>>>>>>>>> We would expose our users to potential copyright infringement, which we cannot do under any circumstances. Furthermore, many lists are available under a non-commercial license, which excludes them from being used in any kind of business. Plenty of IPFire systems are running in businesses - if not even the vast majority.
>>>>>>>>>
>>>>>>>>> In short, these lists are completely unusable for us. Apart from HaGeZi, I consider OISD to have the same problem.
>>>>>>>>>
>>>>>>>>> Enough about all the things that are bad. Let's talk about the new, good things:
>>>>>>>>>
>>>>>>>>> Many blacklists on the internet are an amalgamation of other lists. These lists vary in quality, with some of them being not that good and without a clear focus, and others being excellent data. Since we don't have the manpower to start from scratch, I felt that we could copy the concept that HaGeZi and OISD have started and simply create a new list that is based on other lists at the beginning, to have a good starting point. That way, we have much better control over what is going on these lists, and we can shape and mould them as we need. Most importantly, we don't create a single list, but many lists that have a clear focus and allow users to choose what they want to block and what not.
>>>>>>>>>
>>>>>>>>> So the current experimental stage that I am in has these lists:
>>>>>>>>>
>>>>>>>>> * Ads
>>>>>>>>> * Dating
>>>>>>>>> * DoH
>>>>>>>>> * Gambling
>>>>>>>>> * Malware
>>>>>>>>> * Porn
>>>>>>>>> * Social
>>>>>>>>> * Violence
>>>>>>>>>
>>>>>>>>> The categories have been determined by what source lists we have available with good data that is compatible with our chosen license, CC BY-SA 4.0. This is the same license that we are using for the IPFire Location database, too.
>>>>>>>>>
>>>>>>>>> The main use-cases for any kind of blocking are to comply with legal requirements in networks with children (i.e. schools) to remove any kind of pornographic content, and sometimes to block social media as well. Gambling and violence are commonly blocked, too. Even more common would be filtering advertising and any malicious content.
>>>>>>>>>
>>>>>>>>> The latter is especially difficult because so many source lists throw phishing, spyware, malvertising, tracking and other things into the same bucket. Here this is currently all in the malware list, which has therefore become quite large. I am not sure whether this will stay like this in the future or if we will have to make some adjustments, but that is exactly why this is now entering some larger testing.
>>>>>>>>>
>>>>>>>>> What has been built so far? In order to put these lists together properly and track where all the data is coming from, I have built a tool in Python, available here:
>>>>>>>>>
>>>>>>>>> https://git.ipfire.org/?p=dnsbl.git;a=summary
>>>>>>>>>
>>>>>>>>> This tool will automatically update all lists once an hour if there have been any changes and export them in various formats. The exported lists are available for download here:
>>>>>>>>>
>>>>>>>>> https://dnsbl.ipfire.org/lists/
>>>>>>>> The download using dnsbl.ipfire.org/lists/squidguard.tar.gz as the custom URL works fine.
>>>>>>>>
>>>>>>>> However, you need to remember not to put the https:// at the front of the URL, otherwise the WUI page completes without any error messages but leaves an error message in the system logs saying:
>>>>>>>>
>>>>>>>> URL filter blacklist - ERROR: Not a valid URL filter blacklist
>>>>>>>>
>>>>>>>> I found this out the hard way.
>>>>>>> Oh yes, I forgot that there is a field on the web UI.
>>>>>>> If that does not accept https:// as a prefix, please file a bug and we will fix it.
>>>>>>
>>>>>> I will confirm it and raise a bug.
>>>>>>
>>>>>>>
>>>>>>>> The other thing I noticed is that if you already have the Toulouse University list downloaded and you then change to the IPFire custom URL, all the existing Toulouse blocklists stay in the directory on IPFire. You end up with a huge number of category tick boxes, most of which are the old Toulouse ones, still available to select, and it is not clear which ones are from Toulouse and which ones from IPFire.
>>>>>>> Yes, I got the same thing, too. I think this is a bug as well, because otherwise you would have a lot of unused categories lying around that will never be updated. You cannot even tell which ones are from the current list and which ones from the old list.
>>>>>>>
>>>>>>> Long-term, we could even consider removing the Univ. Toulouse list entirely and only have our own lists available, which would make the problem go away.
>>>>>>>
>>>>>>>> I think if the blocklist URL source is changed or a custom URL is provided, the first step should be to remove the old ones already existing. That might be a problem because users can also create their own blocklists, and I believe those go into the same directory.
>>>>>>> Good thought. We of course cannot delete the custom lists.
>>>>>>>
>>>>>>>> Without clearing out the old blocklists you end up with a huge number of checkboxes for lists, but it is not clear what happens if there is a category that has the same name in both the Toulouse list and the IPFire list, such as gambling. I will have a look at that and see what happens.
>>>>>>>>
>>>>>>>> Not sure what the best approach to this is.
>>>>>>> I believe it is removing all old content.
>>>>>>>
>>>>>>>> Manually deleting all contents of the urlfilter/blacklists/ directory and then selecting the IPFire blocklist URL as the custom URL, I end up with only the 8 categories from the IPFire list.
>>>>>>>>
>>>>>>>> I have tested some gambling sites from the IPFire list and the block worked on some. On others, the site no longer exists so there is nothing to block, or it has been changed to an https site, and in that case it went straight through. Also, if I chose the http version of the link, it was automatically changed to https and went through without being blocked.
>>>>>>> The entire IPFire infrastructure always requires HTTPS. If you start using HTTP, you will be automatically redirected. It is 2026 and we don't need to talk HTTP any more :)
>>>>>> Some of the domains in the gambling list (maybe quite a lot) seem to only have http access. If I tried https, it came back saying it couldn't be found.
>>>>>>
>>>>>>> I am glad to hear that the list is actually blocking. It would have been bad if it didn't. Now we have the big task of checking out the "quality" - however that can be determined. I think this is what needs some time...
>>>>>>>
>>>>>>> In the meantime I have set up a small page on our website:
>>>>>>>
>>>>>>> https://www.ipfire.org/dnsbl
>>>>>>>
>>>>>>> I would like to run this as a first-class project inside IPFire, like we are doing with IPFire Location. That means that we need to tell people about what we are doing. Hopefully this page is a little start.
>>>>>>>
>>>>>>> Initially it has a couple of high-level bullet points about what we are trying to achieve. I don't think the text is very good yet, but it is the best I had in that moment. There is then also a list of the lists that we currently offer.
>>>>>>> For each list, a detailed page will tell you about the license, how many domains are listed, when the last update was, and the sources - and there is even a history page that shows all the changes whenever they have happened.
>>>>>>>
>>>>>>> Finally there is a section that explains "How To Use?" the list, which I would love to extend to include AdGuard Plus and things like that, as well as Pi-Hole and whatever else could use the list. In a later step we should go ahead and talk to those projects about including our list(s) in their dropdowns so that people can enable them nice and easy.
>>>>>>>
>>>>>>> Behind the web page there is an API service running on the host that runs the DNSBL. The frontend web app that runs www.ipfire.org connects to that API service to fetch the current lists, any details and so on. That way, we can split the logic and avoid creating a huge monolith of a web app. This also means that the page could be down a little, as I am still working on the entire thing and will frequently restart it.
>>>>>>>
>>>>>>> The API documentation is available here, and the API is publicly available: https://api.dnsbl.ipfire.org/docs
>>>>>>>
>>>>>>> The website/API allows filing reports for anything that does not seem to be right on any of the lists. I would like to keep it an open process; however, long-term, this cannot cost us any time. In the current stage, the reports are getting filed and that is about it. I still need to build out some way for admins or moderators (I am not sure what kind of roles I want to have here) to accept or reject those reports.
>>>>>>>
>>>>>>> In the case of us receiving a domain from a source list, I would rather like to submit a report upstream for them to de-list it.
>>>>>>> That way, we don't have any admin work to do, and we are contributing back to the other lists. That would be a very good thing to do. We cannot, however, throw tons of emails at some random upstream projects without co-ordinating this first. By not reporting upstream, we will probably over time create large whitelists, and I am not sure if that is a good thing to do.
>>>>>>>
>>>>>>> Finally, there is a search box that can be used to find out if a domain is listed on any of the lists.
>>>>>>>
>>>>>>>>> If you download and open any of the files, you will see a large header that includes copyright information and lists all sources that have been used to create the individual lists. This way we ensure maximum transparency, comply with the terms of the individual licenses of the source lists, and give credit to the people who help us put together the most perfect list for our users.
>>>>>>>>>
>>>>>>>>> I would like this to become a project that is not only being used in IPFire. We can and will be compatible with other solutions like AdGuard and Pi-hole, so that people can use our lists if they would like to, even though they are not using IPFire. Hopefully, these users will also feed back to us so that we can improve our lists over time and make them one of the best options out there.
>>>>>>>>>
>>>>>>>>> All lists are available as a simple text file that lists the domains. Then there is a hosts file available, as well as a DNS zone file and an RPZ file. Each list is individually available to be used in squidGuard, and there is a larger tarball available with all lists that can be used in IPFire's URL Filter. I am planning to add Suricata/Snort signatures whenever I have time to do so.
>>>>>>>>> Even though it is not a good idea to filter pornographic content this way, I suppose that catching malware and blocking DoH are good use-cases for an IPS. Time will tell...
>>>>>>>>>
>>>>>>>>> As a start, we will make these lists available in IPFire's URL Filter and collect some feedback about how we are doing. Afterwards, we can see where else we can take this project.
>>>>>>>>>
>>>>>>>>> If you want to enable this on your system, simply add the URL to your autoupdate.urls file like here:
>>>>>>>>>
>>>>>>>>> https://git.ipfire.org/?p=people/ms/ipfire-2.x.git;a=commitdiff;h=bf675bb937faa7617474b3cc84435af3b1f7f45f
>>>>>>>> I also tested adding the IPFire URL to autoupdate.urls and that also worked fine for me.
>>>>>>> Very good. Should we include this already with Core Update 200? I don't think we would break anything, but we might already gain a couple more people who are helping us test this all?
>>>>>> I think that would be a good idea.
>>>>>>
>>>>>>> The next step would be to build and test our DNS infrastructure. In the "How To Use?" section on the pages of the individual lists, you can already see some instructions on how to use the lists as an RPZ. In comparison to other "providers", I would prefer if people were using DNS to fetch the lists. This is simply to push out updates in a way that is cheap for us, and also to do it very regularly.
>>>>>>>
>>>>>>> Initially, clients will pull the entire list using AXFR. There is no way around this, as they need to have the data in the first place. After that, clients will only need the changes. As you can see in the history, the lists don't actually change that often - sometimes only once a day - and therefore downloading the entire list again would be a huge waste of data, both on the client side, but also for us hosting them.
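[Editor's note: the RPZ-over-DNS setup described here can be sketched as an Unbound configuration fragment. The zone and server names follow the `_info.ads.dbl.ipfire.org` dig example elsewhere in the thread; the file paths and log names are assumptions, not the instructions from the project page. A transfer can also be inspected by hand with `dig AXFR ads.dbl.ipfire.org @primary.dbl.ipfire.org`.]

```text
# unbound.conf sketch - Unbound needs the respip module for RPZ:
server:
    module-config: "respip validator iterator"

rpz:
    name: ads.dbl.ipfire.org
    primary: primary.dbl.ipfire.org
    zonefile: "/var/lib/unbound/ads.dbl.ipfire.org.rpz"
    rpz-log: yes
    rpz-log-name: ipfire-dbl-ads
```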
>>>>>>>
>>>>>>> Some other providers update their lists "every 10 minutes" even when there won't be any changes whatsoever. We don't do that. We will only export the lists again when they have actually changed. The timestamps on the files that we offer over HTTPS can be checked by clients so that they won't re-download a list that has not changed. But using HTTPS still means that we would have to re-download the entire list and not only the changes.
>>>>>>>
>>>>>>> Using DNS and IXFR will update the lists by only transferring a few kilobytes, and therefore we can have clients check once an hour whether a list has actually changed and only send out the raw changes. That way, we will be able to serve millions of clients at very cheap cost, and they will always have a very up-to-date list.
>>>>>>>
>>>>>>> As far as I can see, any DNS software that supports RPZs supports AXFR/IXFR, with the exception of Knot Resolver, which expects the zone to be downloaded externally. There is a ticket for AXFR/IXFR support (https://gitlab.nic.cz/knot/knot-resolver/-/issues/195).
>>>>>>>
>>>>>>> Initially, some of the lists were *huge*, which is why a simple HTTP download is not feasible. The porn list was over 100 MiB. We could have spent thousands on traffic alone, which I don't have for this kind of project. It would also be unnecessary money being spent; there are simply better solutions out there. But then I built something that basically tests the data that we are receiving from upstream, by simply checking whether a listed domain still exists. The result was very astonishing to me.
>>>>>>>
>>>>>>> So whenever someone adds a domain to the list, we will (eventually, but not immediately) check if we can resolve the domain's SOA record.
>>>>>>> If not, we mark the domain as non-active and will no longer include it in the exported data. This brought the porn list down from just under 5 million domains to just 421k. On the sources page (https://www.ipfire.org/dnsbl/lists/porn/sources) I am listing the percentage of dead domains from each of them, and the UT1 list has 94% dead domains. Wow.
>>>>>>>
>>>>>>> If we cannot resolve the domain, neither can our users. So we would otherwise fill the lists with tons of domains that simply could never be reached. And if they cannot be reached, why would we block them? We would waste bandwidth and a lot of memory on every single client.
>>>>>>>
>>>>>>> The other sources have similarly high ratios of dead domains; most of them are in the 50-80% range. Therefore I am happy that we are doing some extra work here to give our users much better data for their filtering.
>>>>>> Removing all dead entries sounds like an excellent step.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Adolf.
>>>>>>
>>>>>>> So, if you like, please go and check out the RPZ blocking with Unbound. Instructions are on the page. I would be happy to hear how this is turning out.
>>>>>>>
>>>>>>> Please let me know if there are any more questions, and I would be glad to answer them.
>>>>>>>
>>>>>>> Happy New Year,
>>>>>>> -Michael
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Adolf.
>>>>>>>>> This email is just a brain dump from me to this list. I would be happy to answer any questions about implementation details, etc. if people are interested. Right now, this email is long enough already...
>>>>>>>>>
>>>>>>>>> All the best,
>>>>>>>>> -Michael
>>>>>>>>
>>>>>>>> --
>>>>>>>> Sent from my laptop
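[Editor's note: the dead-domain pruning discussed in the thread combines two rules - drop a domain only when it positively no longer resolves, and keep it whenever the check itself is in doubt. A minimal sketch; the real service checks SOA records over DNS, while here the resolver is injected so the logic stays testable. All names are illustrative.]

```python
def prune_dead(domains, soa_resolves):
    """Return the domains to keep exporting.

    soa_resolves(domain) -> bool; it may raise on lookup errors.
    """
    kept = []
    for domain in domains:
        try:
            alive = soa_resolves(domain)
        except Exception:
            alive = True  # in doubt: a user might still be able to resolve it
        if alive:
            kept.append(domain)
    return kept

# Fake resolver standing in for real DNS lookups:
status = {"alive.example": True, "gone.example": False}
kept = prune_dead(
    ["alive.example", "gone.example", "flaky.example"],
    lambda d: status[d],  # raises KeyError for flaky.example
)
print(kept)  # ['alive.example', 'flaky.example']
```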
