Re: [systemd-devel] mDNS resolution with systemd
Hi, I am one of the avahi maintainers. While it is nice to have it configurable, it SHOULD work best with default settings. It SHOULD always offer all interface addresses, unless there is a well documented reason to avoid that. Avahi does not implement NSEC and does not work well with both IPv6 and IPv4 queries if the receiver supports only one family. Making it well behaved and working with other implementations too is not a trivial task. I haven't read the whole history of this discussion, but not following the RFC should be well justified. I haven't seen such justification here. Apple engineers working on the mDNS specs are experts in the area; they very likely know why they demand MUST words in their RFCs. Regards, Petr

On 1/19/24 10:31, Jean-Marie Delapierre wrote:

On 20/12/2023 at 14:30, Belanger, Martin wrote:

Hi, On one of my servers, I use avahi to perform mDNS resolution. With avahi I am able to choose on which IP version I want avahi to answer mDNS requests (IPv4 or IPv6). In my opinion, this is convenient on local networks with both stacks, especially for tracking purposes on the network.

This is indeed convenient, however according to RFC 6762, paragraph 6.2 [1]: "When a Multicast DNS responder sends a Multicast DNS response message containing its own address records, it MUST include all addresses that are valid on the interface on which it is sending the message". I interpret this as MUST include both IPv4 and IPv6 addresses (i.e. per standard the IP family should not be configurable). One way to solve this would be to disable IPv4 (or IPv6) on an interface so that the interface only has IPv4 or IPv6 addresses assigned to it. [1] https://datatracker.ietf.org/doc/html/rfc6762#section-6.2

I have understood that avahi has to be replaced by systemd-networkd and/or systemd-resolved and I have tried to implement the same behavior with it... without success (maybe I have not found the correct place to adjust it).

Now here I have to intervene. I do not know why avahi would have to be replaced by systemd-resolved. We are working on fixing bugs in avahi. While the code is far from perfect, it has been used for years. Afaik resolved implements just hostname resolution; it does not offer service registration or browsing of network services. I know there is some ongoing work on that, but it would still need to persuade applications to switch to it. That would be much more work than just fixing bugs in the existing avahi, which I would propose instead. We are accepting pull requests (again).

Following are the capabilities I would like to find in systemd for mDNS resolution (especially on a server):

- One can specify if he wants systemd to respond only in IPv4 or IPv6 (or both, by default?);
- In IPv4, one can specify the sub-network on which he wants systemd to respond to mDNS requests (in my opinion, the full IPv4 address is to be determined by systemd-networkd);
- In IPv6, one can specify the prefix on which he wants systemd to respond to mDNS requests (in my opinion, the full IPv6 address is to be determined by systemd-networkd).

Thank you in advance for reading me. Regards. Jean-Marie Delapierre

I agree with your answer, but...

- The goal of mDNS is to resolve addresses only on a local network, not the internet. On the internet, one has to respect the standards, but on his own local network, it can be admitted that he does what he wants.

You can implement your own protocols, but is it a good idea? I do not think so. The standard has good reasons to demand all addresses.
It then makes it obvious when reflectors create conflicts with your registered name.

- The goal of any software is not only to respect a standard. It also has to allow the users to do what they want to do. As much as I understand that the default behavior of systemd HAS to be the full respect of the standard, it also has to allow fine tuning for the user to do what he wants on his private local network. For example, look at all the options you have to configure your local ethernet network. Most people don't use them and are happy with the default configuration, but some...

Can you instead explain what exactly is your use case? Why do you want to hide something and not propagate it? It is a request for additional work; why should it be done? Are you willing to contribute code for it?

So, I suggest systemd-resolved propose two options for mDNS resolution:

- in IPv4: none (IPv4 answering is disabled as if the interface had no IPv4 address), all (default), or a list of IPv4 subnetworks (answering with the address on each subnetwork if available);
- in IPv6: none (IPv6 answering is disabled as if the interface had no IPv6 address), all (default), or a list of IPv6 prefixes (answering with the address on each prefix if available).

Thank you in advance for reading me. Regards. Jean-Marie Delapierre

-- Petr Menšík Software Engineer
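For reference, systemd today offers a per-link mDNS switch (on/off only; neither resolved nor networkd filters mDNS answers by address family or subnet, which is what this thread asks for). A minimal sketch, assuming an interface named eth0:

    # one-off, at runtime:
    resolvectl mdns eth0 yes

    # persistent, in the matching systemd.network file:
    [Network]
    MulticastDNS=yes

The global default lives in resolved.conf ([Resolve] MulticastDNS=); the per-link value overrides it.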
Re: CVE-2023-7008 Christmas drama notes
Hello Luca, I did not expect a normal and honest apology from you right from the start, but I did at least expect *some* reflection. I see none in that long text. You seem to have very minimal insight into how our internal vulnerability process works, but that does not prevent you from judging my guilt. I will leave further discussion until after I return to the office. It seems a face to face discussion would bring less emotion. I want the opinion of people who know how it happened. Not people who think they know, but have no direct way to know it.

On 12/26/23 11:37, Luca Boccassi wrote:

On Tue, 26 Dec 2023 at 02:30, Petr Menšík wrote:

Here's what's really going on: you have found yourself in a position where, as a RH employee, you could abuse the internal CVE process to promote your own projects, and that's exactly what you did: without consulting or notifying anybody who is involved in this project, you went directly to the security team to raise a CVE while we all were on holiday, and then promptly went on social media to use the CVE to bash the project and promote your own instead: https://imgur.com/3eqRQcW You even lied about others in RH being aware that a CVE was raised, which is obviously not true - those referenced comments were made months before the CVE was opened. You ignored all processes, went behind the back of all maintainers - upstream and downstream - in order to inflict maximum damage at the worst time possible, and then bragged on social media about it. This is a blatant abuse of Redhat's CNA position; it puts the whole company in a bad light and casts doubts over its trustworthiness as the CNA for the project, all because of your reckless and needless actions.

Not content, you even intentionally avoided mentioning in the CVE that this feature is off by default everywhere, and thus very few users are actually affected - when CVEs are raised, hardly anybody goes to look for related bug trackers or issues, and the CVE advisory is all that is used to establish impact and decide whether action is needed, and there was no mention anywhere that this requires a local administrator to manually enable it for a machine to be affected. A _lot_ of work for a _lot_ of people kicks off every time a CVE is raised, due to automation, and the correctness of the advisory is fundamental to avoid triggering unneeded work. You made sure it was worded to give the idea that every installation was affected, so that it could cause the maximum amount of panic and damage possible, again so that you could then brag on social media about it, showing a reckless disregard for the wellbeing of your colleagues at Redhat, Redhat's customers and all other downstream users and developers during their holidays.

...

I will ask my manager to read that issue and tell me if I did anything wrong or harmed anyone. I will ask Lukáš for his opinion as well. I do not care about your opinion of me, as I doubt you know me, or what or how I do anything.

Given such a record, the Github org owners (plural) collectively decided that, as the very first and immediate consequence, your membership of the Github project is not compatible with your behaviour, and removed you.

Thank you for this part; I did not have to leave it myself. I have had enough and I think I have been very patient. I would like to know which people voted for my membership termination and who against.
Best Regards, Petr

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
CVE-2023-7008 Christmas drama notes
offer some help. A bad mistake in my opinion, but even that deserves to have its issues known and reported (and ignored). I will try to keep my reports to unemotional facts as much as I am able to. I think I deserve an apology from Luca, but I doubt I will receive one. Thank you for reading this far. Happy new year everyone, and less drama in it! Best Regards, Petr Menšík

1. https://github.com/systemd/systemd/issues/25676
2. https://bugzilla.redhat.com/show_bug.cgi?id=261
3. https://bugzilla.redhat.com/show_bug.cgi?id=672#c3
4. https://github.com/systemd/systemd/blob/main/docs/CODE_OF_CONDUCT.md

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
Re: [systemd-devel] Securing bind with systemd methods (was: bind-mount of /run/systemd for chrooted bind9/named)
but I think this is
# hard to understand for people who haven't had formal training. But I
# also understand that this is hard to change without changing semantics
# for existing units, so maybe a few examples in systemd.exec(5) might ease
# this - the SystemCallFilter chapter in systemd.exec(5) is already long
# though. @raw-ip isn't available in systemd 252, so I had to template
# that in my ansible. And setuid is setuid32 on 32-bit archs like armhf,
# so I had to template _that_ for my Banana Pi.
SystemCallFilter=~@mount @swap @raw-ip @resources @reboot @privileged @obsolete @module @debug @cpu-emulation @clock
SystemCallFilter=chroot setuid
SystemCallArchitectures=native

[Install]
WantedBy=multi-user.target
# strangely, this alias only holds if the unit is enabled. If the unit
# is disabled, the alias is not available, which was kind of a surprise.
Alias=bind9.service

Generally, the error messages I received during the debugging phase were not very helpful. I frequently had to resort to strace -p 1 to find out what exactly went wrong trying to start named. For example, there is no exact feedback when the daemon is being terminated because of a SystemCallFilter violation; I'd like the system call in question to be part of the log. The same applies to directives regarding sandboxing, when paths are given in the directive. My way to debug was either randomly removing some of the directives to narrow down the possible error range, or stracing again to find out what my daemon tried before it was terminated. Those things might be out of scope for systemd, I simply don't know.

With this unit, systemd-analyze security named is now down to "1.9 OK"; I think it was > 9 with the standard unit. Thanks for your help, I wanted to give something back. I'll probably suggest this unit for the Debian package once it has reached some stability. Greetings Marc

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
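A hedged follow-up to the logging complaint above: newer systemd releases (v247 and later) have a SystemCallLog= directive in systemd.exec(5) that records matching system calls to the audit log, which helps identify exactly which call a SystemCallFilter= would kill. A sketch of a temporary debugging drop-in, assuming the unit is called named.service:

    # /etc/systemd/system/named.service.d/syscall-debug.conf (temporary)
    [Service]
    # log every call that falls outside the broad service set:
    SystemCallLog=~@system-service

    $ systemctl daemon-reload && systemctl restart named
    $ journalctl -b _TRANSPORT=audit | tail

Remove the drop-in again once the offending call is identified.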
Re: [systemd-devel] IPv6AcceptRA: RDNSS Lifetime is not expiring
I would suggest creating an issue at the github.com/systemd/systemd repository. I have not tested it, but it sounds like something that should be fixed.

On 12. 07. 23 0:39, Muggeridge, Matt wrote:

Hello there! In our IPv6 network, the address of a Recursive DNS Server (RDNSS) is supplied in a Router Advertisement (RA), with a lifetime of 60 seconds. It appears that the RDNSS lifetime is not being honoured (RFC 8106, section 5.1 <https://www.rfc-editor.org/rfc/rfc8106.html#section-5.1>). I reviewed the code and can see where the RDNSS lifetime is being saved <https://github.com/systemd/systemd-stable/blob/4a31fa2fb040005b73253da75cf84949b8485175/src/network/networkd-ndisc.c#L712>, though I was unable to determine how it was being handled upon expiry. How do I configure networkd so that the RA's RDNSS lifetime is honoured?

Here is a summary of the simple protocol exchange:

1. Router: Send RA [RDNSS address of "nameserver60s", lifetime: "60"]
2. Host: "resolvectl" shows the link's DNS server now lists the RDNSS address of "nameserver60s"
3. ** Wait for more than 60 seconds - the RDNSS entry should expire **
4. Host:
   1. "resolvectl" continues to list the address of "nameserver60s" on the link.
   2. Using tcpdump to trace "ping test.example.com", the "nameserver60s" is still being used. It never timed out.

Here is my network configuration, showing UseDNS and UseDomains both set to "yes":

$ cat /etc/systemd/network/10-eno0.network
[Match]
KernelCommandLine=!nfsroot
Name=eno0

[DHCP]
ClientIdentifier=mac
RouteMetric=10
UseDomains=yes
UseHostname=yes
UseMTU=yes

[Network]
#DHCP=ipv6
Address=10.1.1.1/24
#DNS=1.2.3.6
Gateway=1.1.1.2
IPv6AcceptRA=yes

[IPv6AcceptRA]
UseDNS=yes
UseDomains=yes

Grateful for any suggestions. Kind regards, Matt.

PS: We're on systemd 250. I've searched later versions of the release notes <https://github.com/systemd/systemd/releases> and it seems there have been no changes in this area.

-- Petr Menšík Software Engineer, RHEL Red Hat, http://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
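For anyone trying to reproduce this, the router side can be emulated with radvd; a sketch, where the interface name and DNS address are placeholders:

    # /etc/radvd.conf on the router
    interface eth0 {
        AdvSendAdvert on;
        RDNSS 2001:db8::53 {
            # advertise the resolver with a 60-second lifetime,
            # matching the report above
            AdvRDNSSLifetime 60;
        };
    };

After more than 60 seconds without a refreshing RA, resolvectl on the host should no longer list 2001:db8::53; per the report, it keeps listing it.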
Re: [systemd-devel] bind-mount of /run/systemd for chrooted bind9/named
I would not recommend using one's own chroot to anyone who has SELinux or a similar security technology enabled. We still offer the subpackage bind-chroot, which has a prepared named-chroot.service for doing just that. But SELinux provides better enforcement, while not complicating deployment and usage of named. I kindly disagree that it is still suggested.

Also, BIND 9 is full of assertions ensuring unexpected code paths are reported. This is a defensive coding style, which makes it difficult to succeed in a remote code execution attack. I have been a maintainer of BIND for 6 years, but I am not aware of any successful remote code execution in the last decade. Maybe not ever. I think the more important protection you can deploy is simple:

Restart=on-abnormal

I think good enough systemd checks are a sufficient replacement for custom-tailored chroots. Cheers, Petr

On 7/4/23 08:40, Marc Haber wrote:

On Mon, Jul 03, 2023 at 11:21:22PM +0200, Silvio Knizek wrote:

why is it suggested to run `named` within its own chroot? For security reasons? This can be achieved much more easily with systemd native options.

That feature is two decades older than systemd, and name server operators are darn conservative. Greetings Marc

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
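The Restart= suggestion above fits in a small drop-in rather than a modified unit; a sketch, assuming the stock unit is called named.service:

    # /etc/systemd/system/named.service.d/restart.conf
    [Service]
    # restart after crashes, watchdog failures or unclean signals,
    # but not after clean exits or an ordinary service stop
    Restart=on-abnormal

    $ systemctl daemon-reload && systemctl restart named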
[systemd-devel] LLMNR should be disabled on new deployments
Hello everyone, I would like to request disabling the LLMNR protocol in new releases by default. The protocol itself is deprecated even by Microsoft, who disabled it in Windows 10. I think Multicast DNS is superior, and MS thinks so too [1]. Because LLMNR is not implemented well in systemd-resolved, it has been causing regressions. Because it won't work with the primary system it was created for, I think it is a good time to disable it in default installations. If someone needs it, it can still be enabled manually. But because it is breaking single-label queries, I think it should not be enabled unless requested.

It is enabled even on Fedora Server, which I consider a serious mistake. Since even Windows desktops do not enable it anymore, I think the Workstation edition should also disable it by default. I have created pull request [2] for that. Examples of how it breaks correct DNS are in issue [3]. I want to request disabling LLMNR by default in the upcoming Fedora 39. I would recommend doing the same in any other distribution using systemd-resolved in the default installation. Any opinions or comments? Regards, Petr

1. https://techcommunity.microsoft.com/t5/networking-blog/aligning-on-mdns-ramping-down-netbios-name-resolution-and-llmnr/ba-p/3290816
2. https://github.com/systemd/systemd/pull/28263
3. https://github.com/systemd/systemd/issues/23622

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
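Until distributions flip the default, the same effect is available locally through a resolved.conf drop-in; a minimal sketch:

    # /etc/systemd/resolved.conf.d/no-llmnr.conf
    [Resolve]
    LLMNR=no

    $ systemctl restart systemd-resolved
    $ resolvectl status | grep -i llmnr   # the Protocols line should now show -LLMNR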
Re: [systemd-devel] systemd-resolved not working in a realistic scenario
Hi Farkas, I think you need a more traditional DNS cache like dnsmasq or unbound. I don't think there is a good syntax to configure per-domain nameservers that would be implementation independent. I think the way to go would be to make the forwarding declarations on the VPN-provided nameservers. That would work well with autoconfiguration; systemd-resolved would be able to configure it.

In other words, do not bring the complexity to clients with resolved. Instead, just list all domains with special nameservers in the VPN dns.search. Prefix them with ~ to avoid using them for search. Just ensure VPN-only domain names are sent to the VPN-provided DNS. Let proper DNS implementations handle forwarding internally, with possible caching. That way, on the client you will have just the set of domains coming from the interface and the set of servers handling them. Forwarding to specific nameservers for every domain would be handled only at the DNS resolvers of the VPN, not client side. Just configure both 1.2.3.4 and 5.6.7.8 to forward all names valid for the VPN; domains not authoritative on them get forwarded away. If that is not possible, create one cache instance with all the forwards, which would be offered only by the VPN connection (see the sketch after this thread).

Even if dnsmasq or unbound were able to configure those domains, there is no good auto-configuration protocol to make it work out of the box. You would need to prepare a manual configuration solution, which would be difficult to handle in all cases, especially for mobile devices. I think my suggestion would be easier to manage, especially long term. Just my 2 cents, Petr

On 27. 04. 23 14:46, Farkas Levente wrote:

Hi, I have already read many docs about systemd-resolved, but am still unable to solve my very simple use case for name resolution. Suppose we have two (not only one!) private/internal domains and both of them have a separate DNS server, e.g.:

- domain: a.com, dns: 1.2.3.4
- domain: b.com, dns: 5.6.7.8

Assume we use a VPN and we only have access to these domains and DNS servers when we are connected to the VPN. What's more, we use a wireguard VPN solution like tailscale, netbird or netmaker. This is important since these have their own internal DNS servers and domains. The DNS server usually listens on the local IP address of the wireguard interface, e.g.:

- domain: netbird.cloud, dns: 100.76.1.2

This means I have 2 interfaces:

- enp6s0 (lan)
- wt0 (wireguard)

What I'd like to achieve:

- if the VPN is not connected: all DNS queries go to 8.8.8.8
- if the VPN is connected:
  - netbird.cloud goes to 100.76.1.2
  - a.com goes through wt0 to 1.2.3.4
  - b.com goes through wt0 to 5.6.7.8
  - all other domains go to 8.8.8.8
  - and of course my search domains are netbird.cloud a.com b.com

As far as I can see, systemd-resolved assumes all DNS servers on the same interface have the same role and serve the same domains (which is not true in this case). If I use systemd-networkd I have only one list for DNS and one for Domains. Even though I can define DNS=100.76.1.2%wt0#netbird.cloud 1.2.3.4%wt0#a.com 5.6.7.8%wt0#b.com, it's not working. And if I also add Domains=netbird.cloud a.com b.com, that does not help either. So can I configure different domains to go to different DNS server(s) on the same interface or not? How can I configure more DNS servers for one domain? E.g. can a.com have 2 DNS servers?
What's more, if it's a laptop and I go to the office, where the a.com and b.com domains no longer need to be routed (since these are the internal domains) BUT the VPN connection is still required and the netbird.cloud domain still has to be resolved on the wt0 interface, then I can't put this into an interface-specific .network config file, since wt0 is up both at the office and at home. IMHO this is a very realistic setup where I have more than one domain and more than one DNS server. Is it currently not possible, or just way too complicated? Thanks in advance. Regards.

ps. anyway, this 1.2.3.4%wt0#a.com configuration reminds me of the old sendmail config files, which were so cryptic that no one could configure them.

-- Petr Menšík Software Engineer, RHEL Red Hat, http://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
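To make the advice above concrete, here is a sketch of the split: the per-domain forwarding lives on the VPN-side resolver (unbound syntax, reusing the thread's example addresses), and the client only routes the VPN domains to that one resolver:

    # unbound.conf on the VPN-provided resolver (100.76.1.2):
    forward-zone:
        name: "a.com"
        forward-addr: 1.2.3.4
    forward-zone:
        name: "b.com"
        forward-addr: 5.6.7.8

    # client side, in the systemd.network file matching wt0; the ~ prefix
    # makes these routing-only domains, not search domains:
    [Network]
    DNS=100.76.1.2
    Domains=~netbird.cloud ~a.com ~b.com

With that, resolved sends only the three listed domains to 100.76.1.2 and everything else to the LAN/default DNS.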
Re: [systemd-devel] Resolver times out resending with same transaction ID
This report led me to a few checks, and indeed: what systemd-resolved is doing with NXDOMAIN responses from clearly proper servers is plain terrible. It should stop doing it the current way ASAP. Instead of caching the negative response, it doubles every query that results in an NXDOMAIN response. Not once, to detect whether the workaround is required, but for every single name that does not exist. Even for repeated queries. I created issue 26967 [1] requesting that it stop doing such weird things. Aruba support were able to identify the failing software versions and when they were fixed. I think this is exactly the kind of workaround DNS Flag Day 2019 was about. Please stop doing it by default. Regards, Petr

[1] https://github.com/systemd/systemd/issues/26967

On 3/21/23 06:32, Vince Del Vecchio wrote:

Hi all, I recently observed reverse IPv4 address lookups timing out on a newly configured host (Ubuntu 22.04 LTS, systemd 249.11-0ubuntu3.7). I tracked the problem to the DVE-2018-0001 mitigation code. An example:

$ resolvectl query 151.101.1.164
151.101.1.164: resolve call failed: All attempts to contact name servers or networks failed

tcpdump shows (in relevant part):

00:00:00.00 IP 192.168.1.48.35911 > 8.8.8.8.53: 26417+ [1au] PTR? 164.1.101.151.in-addr.arpa. (55)
00:00:00.021127 IP 8.8.8.8.53 > 192.168.1.48.35911: 26417 NXDomain 0/1/1 (115)
00:00:00.021252 IP 192.168.1.48.35911 > 8.8.8.8.53: 26417+ PTR? 164.1.101.151.in-addr.arpa. (44)

The first query gets an "NXDOMAIN", which is the correct answer for this address. However, NXDOMAIN triggers the DVE-2018-0001 mitigation code to send a revised query without the EDNS OPT (confirmed in the debug log). I **never see a response to this revised query**. If there is only a single DNS server, the resolver resends the OPT-less query after a timeout, and *that* gets an NXDOMAIN which is returned. However, if there are multiple DNS servers (e.g. 8.8.8.8 8.8.4.4), on timing out, it sends another query with EDNS to the next server, and the three-packet sequence repeats several times until it gives up. Since the server *will* respond to a retransmit after 5s, my guess is that the server, or maybe something in the network, is dropping close-in-time requests with the same transaction id. I tried a few public DNSs that (surprisingly?) all behaved the same. I haven't found a simple way to rule out a firewall, router or my ISP. Regardless, my thought is that resending a slightly different query after we did get a response should not use the same transaction id. I patched systemd as follows and the problem goes away:

--- a/src/resolve/resolved-dns-transaction.c
+++ b/src/resolve/resolved-dns-transaction.c
@@ -1312,6 +1312,7 @@ void dns_transaction_process_reply(DnsTransaction *t, DnsPacket *p, bool encrypt
                         FORMAT_DNS_RCODE(DNS_PACKET_RCODE(p)),
                         dns_server_feature_level_to_string(t->clamp_feature_level_nxdomain));
+                dns_transaction_shuffle_id(t);
                 dns_transaction_retry(t, false /* use the same server */);
                 return;
         }

A few questions:
- Does anyone else see this?
- Does this look like a reasonable fix? Any thoughts on whether the one other place where dns_transaction_retry(..., false) is called to retry the same server with a lower feature level (SERVFAIL etc) should do the same?
- Any other issues with the patch? Or would it be reasonable to (add comments and) submit a pull request?

-Vince Del Vecchio

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
Re: [systemd-devel] systemd-resolved: performance question
I think it is fairly easy. If /etc/resolv.conf changes through something other than systemd-resolved, it is very likely the address specified in it does not point to resolved anymore. In that sense it does not matter what systemd-resolved does with such information, or how quickly. Does it update its own configured forwarders if I overwrite /etc/resolv.conf with something unspecified in resolved.conf or resolvectl runtime changes? I have tried appending nameserver 8.8.8.8, but it has not reported any change in resolvectl.

I would not expect a third party to overwrite /etc/resolv.conf and use the address of systemd-resolved in it. If you know of such a case, please share. Is there a real-world example where such behaviour is needed? I think resolved can just queue incoming requests, even from the resolve NSS plugin. If it detects a change in the /etc/resolv.conf link, it can then restart the waiting requests (or do so after a short timeout; 1s would work). That would be much better than checking the link state before every single query. It would conserve CPU work on battery-operated devices. I would set the resolv.conf files in /run/systemd/resolve/ immutable to prevent other software from writing into them.

On 3/24/23 11:41, Lennart Poettering wrote:

On Fr, 24.03.23 03:16, Petr Menšík (pemen...@redhat.com) wrote:

Even if it could not use filesystem monitoring, I guess it could check those files only once per second or so. It should not depend on the number of queries done.

It's not so easy. We generally want to give the guarantee that if people from some script patch around in /etc/resolv.conf and immediately fire a DNS request afterwards, we'll handle this and give you answers with the new config. There are conflicting goals here: nice, reliable behaviour where config changes are guaranteed to be taken into account, and a simple goal of performance to reduce these stat calls. Lennart

-- Lennart Poettering, Berlin

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
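The immutability idea at the end of Petr's message can be tried by hand; a sketch (experiment only; note that resolved itself would then also fail to rewrite the file):

    # forbid all writers, including root processes
    # (needs a kernel where tmpfs supports file flags, since /run is tmpfs):
    chattr +i /run/systemd/resolve/resolv.conf

    # and to see who rewrites the files in the meantime:
    inotifywait -m /etc/resolv.conf /run/systemd/resolve/

    # undo:
    chattr -i /run/systemd/resolve/resolv.conf

(inotifywait comes from the inotify-tools package.)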
Re: [systemd-devel] Resolver times out resending with same transaction ID
On 3/21/23 06:32, Vince Del Vecchio wrote:

Hi all, I recently observed reverse IPv4 address lookups timing out on a newly configured host (Ubuntu 22.04 LTS, systemd 249.11-0ubuntu3.7). I tracked the problem to the DVE-2018-0001 mitigation code. An example:

$ resolvectl query 151.101.1.164
151.101.1.164: resolve call failed: All attempts to contact name servers or networks failed

tcpdump shows (in relevant part):

00:00:00.00 IP 192.168.1.48.35911 > 8.8.8.8.53: 26417+ [1au] PTR? 164.1.101.151.in-addr.arpa. (55)
00:00:00.021127 IP 8.8.8.8.53 > 192.168.1.48.35911: 26417 NXDomain 0/1/1 (115)
00:00:00.021252 IP 192.168.1.48.35911 > 8.8.8.8.53: 26417+ PTR? 164.1.101.151.in-addr.arpa. (44)

The first query gets an "NXDOMAIN", which is the correct answer for this address. However, NXDOMAIN triggers the DVE-2018-0001 mitigation code to send a revised query without the EDNS OPT (confirmed in the debug log). I **never see a response to this revised query**.

Frankly, it is wrong of systemd-resolved to try working around clearly broken resolvers. In this case, it delays the correct response from a well-behaving server, just because some really broken servers send wrong replies. This should be enabled ONLY by manual configuration, if at all. Every user should know he has broken DNS servers if this (mis)feature helps. Anyway, it should not require a timeout. If the response has the correct name and type in the question section and a matching transaction id, it is clearly the response to our query. If it insists on those kinds of workarounds, it should apply them right away, not after a no-response timeout. Better though to do that only if requested. NXDOMAIN is a valid response, and DNS folks are serious about delivering it only when the requested name does not exist. The proper way to signal that the server does not understand something in the query is only a FORMERR response. It is a shame ResolveUnicastSingleLabel=yes has to be configured manually to avoid some failures on correct names, yet tricks like this are enabled by default and cannot even be turned off manually. Please correct that!

If there is only a single DNS server, the resolver resends the OPT-less query after a timeout, and *that* gets an NXDOMAIN which is returned. However, if there are multiple DNS servers (e.g. 8.8.8.8 8.8.4.4), on timing out, it sends another query with EDNS to the next server, and the three-packet sequence repeats several times until it gives up. Since the server *will* respond to a retransmit after 5s, my guess is that the server, or maybe something in the network, is dropping close-in-time requests with the same transaction id. I tried a few public DNSs that (surprisingly?) all behaved the same. I haven't found a simple way to rule out a firewall, router or my ISP.

Does the retransmit keep the same source port and transaction id?

Regardless, my thought is that resending a slightly different query after we did get a response should not use the same transaction id. I patched systemd as follows and the problem goes away:

--- a/src/resolve/resolved-dns-transaction.c
+++ b/src/resolve/resolved-dns-transaction.c
@@ -1312,6 +1312,7 @@ void dns_transaction_process_reply(DnsTransaction *t, DnsPacket *p, bool encrypt
                         FORMAT_DNS_RCODE(DNS_PACKET_RCODE(p)),
                         dns_server_feature_level_to_string(t->clamp_feature_level_nxdomain));
+                dns_transaction_shuffle_id(t);
                 dns_transaction_retry(t, false /* use the same server */);
                 return;
         }

A few questions:
- Does anyone else see this?
- Does this look like a reasonable fix?
Any thoughts on whether the one other place where dns_transaction_retry(..., false) is called to retry the same server with a lower feature level (SERVFAIL etc) should do the same?

Yes, to me it is. Only unmodified retries should keep their original transaction ids. If it modifies the sent query, it should get a new id for it. That also ensures it was the EDNS removal which helped, not just the pure retransmit. I think it should change the transaction id every time it gets any response; a SERVFAIL is a response too.

- Any other issues with the patch? Or would it be reasonable to (add comments and) submit a pull request?

I think pull requests are in general a better way to request a code change. They make commenting easier, and linking related issues too.

-Vince Del Vecchio

Just my 2 cents. Cheers, Petr

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
Re: [systemd-devel] systemd-resolved: performance question
Hi Robert, interesting. It seems resolved is not expecting such heavy usage. Consider another cache until this is fixed; unbound or dnsmasq might be a good choice. Please create an issue for it at https://github.com/systemd/systemd.

Especially since it can use the fact that it is a daemon, it should be able to use notifications of /etc/resolv.conf symlink changes. It should not require a stat of that file on each query. It definitely should save that when the resolve NSS plugin is the first one, even before files. It should watch the file with some filesystem notification system, especially when it already knows it is the service supposed to maintain that file. I don't understand why it is checking both resolv.conf and stub-resolv.conf; only one of them should be in active use. Even if it could not use filesystem monitoring, I guess it could check those files only once per second or so. It should not depend on the number of queries done.

Please share your distribution and the systemd version in it; it is unclear what version we are talking about. The output of the resolvectl status command would also be useful.

On 3/14/23 19:25, Robert Ayrapetyan wrote:

Hello, I'm using systemd-resolved on a server which performs a lot of DNS queries (~20K per second) and systemd-resolved helps a lot by providing a cache:

Cache
  Current Cache Size: 263
  Cache Hits: 30928976
  Cache Misses: 2961

However, the systemd-resolved process almost constantly utilizes 100% of a CPU (I don't have any other DNS services like dnsmasq installed). strace shows this:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------
 48.73    0.966765           4    194141           stat
 15.06    0.298804          12     24106           sendmsg
 13.59    0.269703           8     32346     32346 openat
  6.42    0.127381           5     24118           recvmsg
  5.49    0.108829           4     24118           recvfrom
  5.34    0.106037           4     24118           epoll_wait
  5.33    0.105838           4     24118           gettid
  0.01    0.000151           6        24           epoll_ctl

And in particular for stat, it queries just 3 files in a loop:

stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=738, ...}) = 0 <0.10>
stat("/run/systemd/resolve/resolv.conf", {st_mode=S_IFREG|0644, st_size=624, ...}) = 0 <0.09>
stat("/run/systemd/resolve/stub-resolv.conf", {st_mode=S_IFREG|0644, st_size=738, ...}) = 0 <0.13>

My question is: why does it "stat" so often? Is this a result of my misconfiguration?

/etc/systemd/resolved.conf:
...
FallbackDNS=1.1.1.1 1.0.0.1

/etc/nsswitch.conf:
...
hosts: resolve [!UNAVAIL=return] files dns

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
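For reference, the summary table above looks like output of strace's counting mode; a sketch of how to gather the same numbers, assuming a single resolved process:

    # attach for 30 seconds and print a per-syscall summary on exit:
    sudo timeout 30 strace -c -p "$(pidof systemd-resolved)"

    # or watch the stat storm live:
    sudo strace -e trace=stat -p "$(pidof systemd-resolved)"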
Re: [systemd-devel] systemd-resolved/NetworkManager resolv.conf handling
Oh, understood. Then it is a problem specific to Fedora, because I think other distributions do not use systemd's implementation of the resolvconf binary. I think the original Debian resolvconf package does not use the -a interface parameter for anything serious; it just uses the same interface identifier to pair -a and -d for the same connection. On the other hand, systemd's resolvconf tracks settings per interface, and it requires the parameter to point to a real interface on the system. Of course the F5 client should use the real interface name it is going to use. I am not sure what can be done for it on the systemd side. Perhaps systemd could allow configuration of aliases, so it could map eth0.f5 to the tun0 interface. But that seems a mere workaround; the F5 client should be modified to call resolvconf once it knows the used interface name, not before that. It would be nice to file a bug on the F5 site. I haven't found an issue matching your description; it would be worth filing. https://support.f5.com/csp/bug-tracker

On 11/2/22 16:20, Thomas HUMMEL wrote:

On 10/31/22 12:19, Petr Menšík wrote:

Hello, thank you and Barry as well for your answers.

I would suggest using strace to find what exactly it does and what it tries to modify. I expect sources for that client are not available.

Well, digging a little deeper, here's what I've found out:

1) in the default case (described in my initial post), i.e. /etc/resolv.conf symlinked to systemd-resolved's /run/systemd/resolve/stub-resolv.conf, no dns nor rc.manager directives in the NM config, and no F5 client NM profile, the vpn client:

a) backs up /etc/resolv.conf to /etc/resolv.conf.fp-saved
b) readlinks the symlink
c) execve's /sbin/resolvconf providing nameservers (thus trying to play along with systemd-resolved) but on the wrong interface on my Fedora (eth0.f5 instead of tun0) [besides, with a deprecated and unused arg (-m)]

execve("/sbin/resolvconf", ["/sbin/resolvconf", "-a", "eth0.f5", "-m 0"], 0x7ffd13bf8568 /* 30 vars */

d) sets up the tun0 interface and brings it up

-> hence we end up with:

a) /etc/resolv.conf.fp-saved as a regular file, a copy of /run/systemd/resolve/stub-resolv.conf
b) an NM-managed tun0 interface without any dns property in its profile, nor any disk-persistent profile
c) an unchanged /etc/resolv.conf (still linked to /run/systemd/resolve/stub-resolv.conf)

so systemd-resolved does not know about the vpn nameservers, and vpn name resolution fails without a workaround (like resolvectl dns adding the tun0 nameserver, for instance)

2) with NM handling /etc/resolv.conf as a regular file, i.e. the /etc symlink rm-ed, dns=default, rc.manager=file, the F5 client considers it a 'legacy' setting and overwrites (which is wrong to me) the NM-managed /etc/resolv.conf regular file. It restores it when stopped by copying back /etc/resolv.conf.fp-saved.

That is exactly what it should do for a VPN, unless it knows a more proper way to configure system DNS. Some packages like dnssec-trigger prevent that by setting an additional attribute on /etc/resolv.conf, making it non-writeable even by a root process. There is no generic and better way other than resolvconf. On Fedora resolvconf is provided just by systemd-resolved in the default installation. But it needs the precise interface used, unlike the original implementation.
So, basically I'd say there are 2 bugs:

1) the legacy handling, which seems to consider pre-NM-era legacy
2) the resolvconf call when systemd-resolved is used (at least on Fedora)

In any case, I don't understand why it does not set the NM profile's ipv4.dns property, which would give much better chances for NM and/or resolved to work.

Does the "nmcli c" command show the F5 profile in green when it is connected? "nmcli c show <name>" would provide all the details it knows. I am not sure how information obtained from VPN plugins should work. I think the NetworkManager mailing list has to be used for a qualified response. It seems to me only VPNs configured by an NM plugin know the configuration details.

Anyway, this leaves 2 unanswered questions, the first of which was my initial one:

1) how could systemd-resolved, when all resolv.conf-as-a-file-by-NM conf has been removed (by me) and the symlink to the stub has been restored (by me), with *no trace* of the vpn nameservers in its own /run/systemd/resolve/resolv.conf nor seemingly anywhere else, still be aware of the vpn nameservers (as described in my initial post scenario)?

-> is there a persistent systemd-resolved cache on disk somewhere?

I don't think any persistent cache was ever on disk, or that it would be a good idea. Most DNS caches are able to dump the contents of the cache somewhere on request, but I haven't found a way to do that with resolvectl.

2) when running resolvconf by hand (resolvconf ) providing specific interface specific nameservers (on stdin), it seems to update the **global** /run/systemd/resol
Re: [systemd-devel] systemd-resolved/NetworkManager resolv.conf handling
he initial state?

I just guess systemd-resolved might have detected an outside change of resolv.conf and added the values provided by the F5 client to its server set. I think systemd-resolved detects that the file was modified by another process and rewrites it again, but first obtains the nameservers from the changed file. Does it change the resolvectl status output?

In any case please contact F5 client support and ask for at least working NM integration, including DNS server provisioning. It would have the same problem with the dns=dnsmasq plugin in NM, so it is not just systemd-resolved specific. Does it show DNS servers with this command, when the F5 client is connected?

nmcli connection show <name> | grep .DNS

Thanks for your help -- Thomas HUMMEL

-- Petr Menšík Software Engineer, RHEL Red Hat, https://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
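For reference, the calling convention the client gets wrong is small; a sketch of what a VPN up/down script should do once the real tunnel interface exists (tun0 and the addresses here are placeholders):

    # on connect: feed resolv.conf-style data for tun0 on stdin
    printf 'nameserver 10.8.0.1\nsearch corp.example\n' | resolvconf -a tun0

    # on disconnect:
    resolvconf -d tun0

With systemd's resolvconf shim this translates into per-link DNS settings in resolved, which is why the interface argument must name the actual link.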
[systemd-devel] How is supposed DNS over TLS with NM supposed to work?
Hi, I have noticed recent NM has a connection.dnsovertls property. So far only systemd-resolved can use such a property. But I am lost somehow. DNS over TLS requires two things to connect securely: the IP address of the target and also the SNI name of the TLS certificate. That is needed to ensure I am not connecting to a man in the middle, but to the service I want. Of course a trusted CA must provide such a certificate.

Recently I traveled on a train and realized everyone in the same carriage could see all my DNS queries. So I would like to use DNS over TLS at airports or on mass transit, in any public place in general. But I don't think it is necessary on my home or work networks, where I trust that no unwanted observer watches all my steps. So a per-connection setting would be great. However, what servers should it use when I set the per-connection setting to true? I think NM does not accept a manual setting of the TLS name for each IP, so I am unable to enter it in the NM connection settings.

Is there some way to tell systemd-resolved to sometimes use a predefined set of DNS over TLS servers, including the service name, but at other times accept whatever DHCP supplies and not insist on using DNS over TLS? Of course there has to be a way to direct network-specific domains to local servers from DHCP (or manual configuration), not to the global DoT upstream.

Is anything like that already implemented? Is the current state in NetworkManager-1.38.4 known to be incomplete and only work in progress? Is it already formulated somewhere as a vision of how it should work once it is finished? Cheers, Petr

-- Petr Menšík Software Engineer, RHEL Red Hat, http://www.redhat.com/ PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
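Parts of this exist already, though only globally rather than per connection; a sketch of the resolved.conf side, assuming Cloudflare as the DoT upstream:

    [Resolve]
    # the address#name syntax pins the TLS server name (SNI/certificate):
    DNS=1.1.1.1#cloudflare-dns.com
    # 'opportunistic' uses TLS when the server supports it and falls back
    # to plain DNS otherwise; 'yes' refuses to fall back
    DNSOverTLS=opportunistic

What is missing, as the mail says, is switching such a profile on and off per network connection.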
[systemd-devel] What is purpose of new DNS proxy address?
Hi! I would like to know the purpose of the DNS proxy listener at the 127.0.0.54 address. What was the primary motivation for its creation? Would it be possible to have just the (cached) DNS protocol on the default stub 127.0.0.53? LLMNR could be handled by the nss-resolve plugin, which provides everything required. It would make it similar to the Windows implementation and avoid problems with DNS on the default listener. Why is a new listener created instead of fixing the original one? Why should there be a different server address? Is the already existing stub offered to the host unfixable? Regards, Petr
Re: [systemd-devel] certificate and trust store feature for systemd
> truststores akin to the keychain in Windows and OS X?

But these are solved problems on modern Linux systems, aren't they? At least with RHEL and Fedora they have a trust store and keychains.

I still find the management of PKIs in /etc/pki to be problematic.

For my home network I have my own DNS domain and CA setup. It was easy to add the CA to Fedora's trust store.

Having this available as a core service within systemd, using APIs like either the (mostly deprecated) CAPI or the new CNG

Barry

Scott Fields
IBM/Kyndryl
SRE – BNSF
817-593-5038 (BNSF)
scott.fie...@kyndryl.com
scott.fie...@bnsf.com

-- Petr Menšík Software Engineer Red Hat, http://www.redhat.com/ email: pemen...@redhat.com PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
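The Fedora trust-store step mentioned above is indeed short; a sketch, with myca.pem as a placeholder for the private CA certificate:

    # add a private CA to the system-wide trust store (Fedora/RHEL):
    sudo cp myca.pem /etc/pki/ca-trust/source/anchors/
    sudo update-ca-trust

    # verify it was picked up (filter on your CA's subject):
    trust list | grep -i -A2 'label:'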
[systemd-devel] LLMNR priority over DNS
Hi, I have just filed issue #23494 [1]. I think LLMNR should not be offered at all on the DNS stub if the resolve NSS plugin is used. Let's take the Windows implementation as an example. I have found Microsoft's change to the protocol describing the Windows variant, MS-LLMNR [2]. That protocol reduces the queried types only to A, AAAA, and PTR records; no other record type is queried over this protocol. But systemd-resolved breaks iterative questions over the local stub. I tried src.fedoraproject.org as an example, but it would fail for any domain except the root.

I would like to propose serving LLMNR names just over getaddrinfo() and similar functions. That means NSS plugins in glibc. Resolved already has a plugin for it, and that is enabled by default on Fedora. It is not enabled by default on Ubuntu. The protocol is useful especially in cases where no central server providing DNS is used. But for some reason it favors the unreliable multicast protocol over reliable unicast DNS. What was my surprise when dig org. dnskey told me the record does not exist. That is nonsense, yet it is the default result with systemd-resolved enabled.

I would like to request similar behaviour as on Windows. Only single-label queries coming from a non-DNS source, such as getaddrinfo(), would be resolved by LLMNR first. DNS queries with a single label would be resolved over DNS first; if DNS replies with NXDOMAIN, then LLMNR can be tried for backward compatibility. It would fix weird failures in resolved and speed up single-label queries for non-address types. Such behaviour would be closer to the only other LLMNR implementation, the one on Windows. What would you think about such a change? Regards, Petr

1. https://github.com/systemd/systemd/issues/23494
2. https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-llmnrp/eed7fe96-9013-4dec-b14f-5abf85545385

-- Petr Menšík Software Engineer Red Hat, http://www.redhat.com/ email: pemen...@redhat.com PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
Re: [systemd-devel] resolved vs. DNS servers listening on Linux dummy interfaces
Hi Peter, is there a reason why you want resolved to serve the whole LAN? It has its problems, and I think its authors meant it as a localhost cache. I don't think resolved considers it common to have more than one DNS server on the localhost. Is there a reason why you wouldn't use dnsmasq, unbound or knot-resolver instead, which are more common on routers? Do you need any systemd-resolved specific features, like mDNS or LLMNR resolution?

On 5/8/22 15:00, Peter Mattern wrote:

> Hello.
>
> Apparently resolved is ignoring DNS servers which are listening on
> Linux dummy interfaces.
>
> When directive "Domains" in section [Network] of the dummy interface's
> *.network unit is set as usual, "resolvectl status <interface>" still
> shows "Current Scopes: none" and "resolvectl query <name handled by
> the server>" fails. Seen on up to date Arch Linux with the network
> setup handled completely by networkd/resolved. As DNS servers dnsmasq
> and Knot were tested; both were working as expected on that interface
> type according to drill queries pointing to the interface's IP.
>
> Use case is a router on which I'd like to use Knot to serve a
> subdomain used in the LAN only, while leaving the upstream interface
> to the ISP's DNS server and having resolved's stub resolver provide
> DNS to the LAN on the downstream interface.
> Tbh. I'm not even sure whether Linux dummy interfaces are meant for a
> purpose like this. But given that both servers (as well as nginx,
> btw.) seem to work well on the interface, I'd actually expect resolved
> to pick them up.
>
> So can anybody tell me what's the matter here, in particular whether
> this may be a problem of resolved or whether there's a way to get this
> working somehow?
>
> Regards
>
> Peter Mattern

-- Petr Menšík Software Engineer Red Hat, http://www.redhat.com/ email: pemen...@redhat.com PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
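For completeness, the dummy interface itself is easy to express in networkd; a sketch (name, address and domain are placeholders):

    # /etc/systemd/network/20-dns0.netdev
    [NetDev]
    Name=dns0
    Kind=dummy

    # /etc/systemd/network/20-dns0.network
    [Match]
    Name=dns0

    [Network]
    Address=192.0.2.53/32
    # the DNS server (Knot, dnsmasq, ...) listens on this address;
    # the ~ prefix routes only the LAN zone there
    DNS=192.0.2.53
    Domains=~lan.example

Whether resolved then actually creates a scope for the link is exactly the problem reported above.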
Re: [systemd-devel] systemd-resolved namespacing
Hi Andrew, I think that kind of separation works well if your containers use the plain DNS protocol over IP. If you do not use systemd-resolved in a container, it just sends queries to whatever servers it reads from /etc/resolv.conf. If you need different nameservers, mount --bind should allow custom files for selected instances; not via the netns namespace, but the file namespace.

When a local resolver is involved, it is more complicated. I think it is usually not required to deliver different results for internet names in containers. Usually a container machine uses just local names and expects any name resolution to be sent to the host's provided resolver. I think it usually should be a host-level cache, where all containers take advantage of the shared cache. I think the DNS cache does not belong in the containers themselves. Because I have libvirt installed anyway, I use libvirt's interface for my systemd-nspawn containers, with a dnsmasq-provided cache/DNS/DHCP. That ensures any container receives a proper network. If multiple separate namespaces were required, separate virbrX interfaces would be used. I don't think systemd should also reimplement the whole network configuration feature set of libvirt. For example, podman also configures dnsmasq and provides an /etc/resolv.conf pointing to that instance. I think that solution does not belong in netns itself. Any NSS plugin would depend on the filesystem namespace available. systemd-resolved cannot provide it by default, because it mixes in different non-DNS protocols as well. Read the "Networking in a systemd-nspawn container" thread for an explanation.

In any case, a DNS cache listening on a non-localhost address reachable from the netns network would be required. systemd-nspawn -b allows use of systemd-networkd or any other network configuration via DHCP. Unless you want to provide Wireguard on the default netns, I guess you should run the DNS cache for the split-dns feature in the netns itself. I guess a netns-aware nss_dns would have to be implemented, which would try a netns-specific resolv.conf file before the default /etc/resolv.conf. But not all programs use libc functions, and those would fail. Wouldn't running a full container solve your problems? Cheers, Petr

On 12/1/21 10:39, Andrew Athan wrote:

> I'm not sure this is the right place to pose this question, nor that
> I'm asking the "right" question, so kindly direct me if I "have it all
> wrong."
>
> Question:
>
> Having looked at the "namespace" features such as those of `ip netns`
> and/or those available via `unshare` or even `systemd-nspawn`, it seems
> there is a rather large hole in that DNS resolution and the associated
> caches cross namespace boundaries. I suppose this is a general problem
> faced by any system/node level caching service accessed by APIs from
> within namespaces.
>
> Maybe I'm thinking about this wrong, but it would seem to me that
> network services such as the DNS cache should respect namespace
> boundaries. Otherwise, a container that has (for example) set an
> /etc/netns/othernamespace/resolv.conf pointing to a different DNS
> server than the node's main resolv.conf will receive cached responses
> from queries made outside its namespace.
>
> Probably this is an issue that goes beyond systemd-resolved and should
> also be addressed in glibc's "nss" helpers such as nss-resolve and
> nss-dns and/or any associated caches.
>
> Are there plans to address this issue?
> I'm assuming there's enough
> information about the context of a resolution request at the time
> systemd-resolved receives that request, for it to know the namespace
> into which it is vending its response? Perhaps this would not be the
> case for queries sent to the stub 127.0.0.53 address, but I imagine
> even this could be dealt with by providing multiple stub responders
> on separate IPs that can be targeted appropriately from within each
> network namespace.
>
> It's possible the "safe" solution is to turn off name resolution and
> other caches, or to use a more complete container solution (e.g. a more
> complete virtual OS instance), and that pushing namespace support into
> the resolver is some kind of slippery slope -- but it seems like a
> clear and present (and common) need, if not danger.
>
> Things I've read before posting this:
>
> man 8 nsenter
> man 8 ip-netns
> man 8 systemd-resolved
> man 1 systemd-nspawn
> man nss-resolve
> https://gist.github.com/zoilomora/f7d264cefbb589f3f1b1fc2cea2c844c
>
> The motivating use case:
>
> client apps operating in a namespace, through a Wireguard VPN device
> with default routes and DNS via the far end, should resolve names
> always as if the DNS server configured in the namespace's resolv.conf
> sourced the response.
>
> Thanks!
> Andrew

-- Petr Menšík Software Engineer Red Hat, http://www.redhat.com/ email: pemen...@redhat.com PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
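The /etc/netns convention Andrew refers to does work for plain DNS, via the mount namespace Petr mentions: ip netns exec bind-mounts /etc/netns/<name>/resolv.conf over /etc/resolv.conf. A minimal sketch (addresses are placeholders):

    ip netns add vpn
    mkdir -p /etc/netns/vpn
    echo 'nameserver 10.8.0.1' > /etc/netns/vpn/resolv.conf
    # inside the namespace, nss-dns reads the bind-mounted file:
    ip netns exec vpn getent hosts example.com

The caveat from the thread applies: this only helps when nss-dns (plain DNS) answers; nss-resolve would still talk to the host's resolved, bypassing the per-namespace file.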
Re: [systemd-devel] sibling DNS lookup of nspawn containers
I am solving this by using libvirt's network, which uses dnsmasq for network resolution. That dnsmasq should be able to resolve the names of the containers, because they were registered by their DHCP requests. Because they share the DNS cache on the host, there is a single place where they can register. Not directly related to systemd, however.

On 6/18/21 5:26 AM, Johannes Ernst wrote:

> I’d like to be able to DNS lookup container b from within container a, if
> both were started with systemd-nspawn as siblings of each other, and shown as
> a and b in machinectl list.
>
> man nss-mymachines specifically notes it won’t do that.
>
> What’s the proper way of doing this?
>
> Thanks,
>
> Johannes.

-- Petr Menšík Software Engineer Red Hat, http://www.redhat.com/ email: pemen...@redhat.com PGP: DFCF908DB7C87E8E529925BC4931CA5B6C9FC5CB
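A sketch of that setup, assuming libvirt's default network (bridge virbr0 with its dnsmasq instance) is up and the container trees live under /var/lib/machines:

    # boot both containers attached to libvirt's bridge; each one
    # gets its address via DHCP from the dnsmasq bound to virbr0
    sudo systemd-nspawn -b -M a -D /var/lib/machines/a --network-bridge=virbr0
    sudo systemd-nspawn -b -M b -D /var/lib/machines/b --network-bridge=virbr0

    # inside container a, dnsmasq answers for DHCP-registered hostnames:
    getent hosts b

dnsmasq resolves the names its DHCP clients register, so the sibling lookup works as long as both containers send their hostnames in the DHCP request.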