Thanks for looking into this, definitely appreciate it! We'll upgrade and give it a go, and also look forward to trying out 3.5.0 when it arrives as well.

Regards

Rob

On Tue, 29 Jul 2025, at 18:27, Daniel Salzman wrote:
> Hi,
>
> The currently released 3.4.8 improves the zone-commit performance.
> Further optimizations (including catalog commit) will be released in
> 3.5.0 in September.
>
> Daniel
>
> On 7/17/25 07:47, [email protected] wrote:
>> Hi
>>
>> We're noticing that as our list of zones gets larger (about 480k right now),
>> adding a new zone or deleting an existing zone seems to continue to get
>> slower. We are always doing our modifications as part of a transaction, and
>> the time appears to occur in the commit phase.
>>
>> An example timing.
>>
>> # time /opt/knot/sbin/knotc ... conf-begin
>> OK
>>
>> real    0m0.010s
>> user    0m0.000s
>> sys     0m0.010s
>> # time /opt/knot/sbin/knotc ... conf-unset zone.domain example.com
>> OK
>>
>> real    0m0.010s
>> user    0m0.000s
>> sys     0m0.010s
>> # time /opt/knot/sbin/knotc ... conf-commit
>> OK
>>
>> real    0m2.330s
>> user    0m0.000s
>> sys     0m0.009s
>> #
>>
>> As you can see, it took > 2 seconds to commit the transaction that removes
>> just the example.com zone. Similarly, it takes > 2 seconds to commit the
>> transaction that adds the zone back.
>>
>> Given the time is real time and not sys/user, I presume knotc is waiting on
>> knotd to complete the work. I used perf to record a CPU profile of knotd
>> while the commit was running, but nothing hugely stuck out at me.
>>
>>   10.75%  knotd  libc.so.6          [.] __memcmp_avx2_movbe
>>    6.03%  knotd  knotd              [.] __popcountdi2
>>    5.89%  knotd  knotd              [.] ns_first_leaf
>>    5.25%  knotd  libc.so.6          [.] pthread_mutex_lock@@GLIBC_2.2.5
>>    3.85%  knotd  liblmdb.so.0.0.0   [.] 0x0000000000003706
>>    3.72%  knotd  knotd              [.] ns_find_branch.part.0
>>    2.76%  knotd  knotd              [.] trie_get_try
>>    2.63%  knotd  liblmdb.so.0.0.0   [.] 0x00000000000069d2
>>    2.34%  knotd  libknot.so.14.0.0  [.] knot_dname_lf
>>    1.92%  knotd  liblmdb.so.0.0.0   [.] mdb_cursor_get
>>    1.72%  knotd  knotd              [.] create_zonedb
>>    1.68%  knotd  knotd              [.] twigbit.isra.0
>>    1.68%  knotd  knotd              [.] catalogs_generate
>>    1.36%  knotd  knotd              [.] twigoff.isra.0
>>    1.28%  knotd  knotd              [.] hastwig.isra.0
>>    1.28%  knotd  knotd              [.] db_code
>>    1.27%  knotd  libknot.so.14.0.0  [.] find_item
>>    1.11%  knotd  libknot.so.14.0.0  [.] knot_dname_size
>>    1.04%  knotd  knotd              [.] zonedb_reload
>>    0.99%  knotd  libc.so.6          [.] _int_free
>>    0.99%  knotd  liblmdb.so.0.0.0   [.] 0x0000000000003ce8
>>    0.96%  knotd  liblmdb.so.0.0.0   [.] memcmp@plt
>>    0.95%  knotd  liblmdb.so.0.0.0   [.] mdb_cursor_open
>>    0.88%  knotd  libc.so.6          [.] malloc
>>    0.88%  knotd  knotd              [.] conf_db_get
>>    0.87%  knotd  knotd              [.] ns_next_leaf
>>    0.82%  knotd  libknot.so.14.0.0  [.] iter_set
>>    0.75%  knotd  knotd              [.] evsched_cancel
>>    0.73%  knotd  libknot.so.14.0.0  [.] find
>>   ...
>>
>> Our config is pretty simple, conf-export looks like:
>>
>> server:
>>     rundir: "/local/knot_dns/run/"
>>     user: "nobody"
>>     pidfile: "/local/knot_dns/run/knot.pid"
>>     listen: [ ... ]
>>
>> log:
>>   - target: "syslog"
>>     any: "info"
>>
>> statistics:
>>     timer: "10"
>>     file: "/tmpfs/knot_dns_stats.yaml"
>>
>> database:
>>     storage: "/local/knot_dns/data"
>>
>> mod-stats:
>>   - id: "default"
>>     request-protocol: "on"
>>     server-operation: "on"
>>     request-bytes: "on"
>>     response-bytes: "on"
>>     edns-presence: "on"
>>     flag-presence: "on"
>>     response-code: "on"
>>     request-edns-option: "on"
>>     response-edns-option: "on"
>>     reply-nodata: "on"
>>     query-type: "on"
>>     query-size: "on"
>>     reply-size: "on"
>>
>> template:
>>   - id: "default"
>>     global-module: "mod-stats/default"
>>     storage: "/local/knot_dns/zones/"
>>
>> zone:
>>   - domain: "example.com."
>>     template: "default"
>>
>> ... 478,000 more domains all the same ...
>>
>> Current files on disk are:
>>
>> # ls -l /local/knot_dns/data/*
>> /local/knot_dns/data/catalog:
>> total 0
>>
>> /local/knot_dns/data/journal:
>> total 0
>>
>> /local/knot_dns/data/keys:
>> total 0
>>
>> /local/knot_dns/data/timers:
>> total 75880
>> -rw-rw---- 1 root root 77697024 Jun 24 09:26 data.mdb
>> -rw-rw---- 1 root root     2432 Jul 17 01:05 lock.mdb
>>
>> /local/knot_dns/data/timing:
>> total 0
>>
>>
>> This machine is not slow or constrained in any way. It's 24 core, 3.6Ghz,
>> 64Gb, NVMe drives, etc. Load is very low (<1) with plenty of free resources.
>>
>> So what I'm wondering is:
>> 1. Is this normal? It doesn't feel right that adding/removing a single
>>    domain takes > 2 seconds regardless of the size of the existing zone database
>> 2. Is there any way to improve this? Doing multiple adds/deletes at once
>>    within a transaction works and we do that where we can, but there are cases
>>    where we can't do that and I'd really like to understand why this is as slow
>>    as it is.
>>
>> Thanks in advance
>>
>> Rob
>> --

--
  Rob Mueller
  [email protected]
--
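
For reference, the batching workaround mentioned in question 2 (several adds or deletes inside one transaction, so the commit cost is paid once per batch) might be sketched roughly as below. This is only an illustration under assumptions: the `batch_zones` and `run` helpers and the `KNOTC` and `DRY_RUN` variables are hypothetical names, not part of Knot DNS; the extra knotc options elided as "..." in the thread are not reproduced; and the `conf-set zone.domain` form is assumed to mirror the `conf-unset zone.domain` form shown in the timings.

```shell
#!/bin/sh
# Sketch: apply several zone additions or removals in a single knotc
# configuration transaction, so conf-commit runs once per batch.
# KNOTC and DRY_RUN are assumptions of this sketch, not knotc features.
KNOTC="${KNOTC:-/opt/knot/sbin/knotc}"

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$KNOTC $*"
    else
        "$KNOTC" "$@"
    fi
}

# Usage: batch_zones add|del zone [zone ...]
batch_zones() {
    action="$1"; shift
    case "$action" in
        add) op="conf-set" ;;
        del) op="conf-unset" ;;
        *) echo "unknown action: $action" >&2; return 2 ;;
    esac
    run conf-begin || return 1
    for z in "$@"; do
        # On any failure, abort so no partial batch gets committed.
        if ! run "$op" zone.domain "$z"; then
            run conf-abort
            return 1
        fi
    done
    run conf-commit
}
```

With `DRY_RUN=1`, calling `batch_zones del a.example b.example` only prints the four knotc commands it would issue, which makes it easy to check the sequence before pointing it at a live server. This does not address the cases where changes cannot be batched; it only amortizes the commit time where they can.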
