Hello,

We are considering deploying OpenDNSSEC in an ISP environment in order to provide DNSSEC services to clients. As an ISP, our typical use case is many thousands of small zones.

We've been looking at OpenDNSSEC for some time now, and have read and (hopefully) understood most of the documentation and the very useful hints and messages on the mailing list. Initial tests with prior versions on a couple of zones looked very promising, so we upgraded to OpenDNSSEC 1.2.0 and attempted to put a bit of load on the service. Wanting to avoid spending money on an HSM during testing, we are using SoftHSM, also version 1.2.0.

I realize what follows may sound like a rant, which it isn't supposed to be; it is more of a cry for help, coupled with the question of whether OpenDNSSEC is the right tool for our job. :)

We are basically using a default configuration as provided by the project. As mentioned, the first couple of zones work like a charm, and I was delighted to see that the whole round trip (a dynamic update to BIND, the zone transfer to OpenDNSSEC, the signing of the zone, and its delivery to an NSD server) could be completed in just a couple of seconds! Lovely. ;-)

For testing, we added 10,000 synthetic zones, each with 610 RRs, all configured to use a single default policy. From that point onwards, it all becomes a bit blurry; the following observations are based on "look and feel".

For example, after about 2,000 key pairs had been created, we noticed that concurrency seems to be a problem. While the enforcer is running, the KASP database is locked, so I can't even list the keys of a zone, an operation which is surely read-only?

   ods-ksmutil key list -z c1767.aa
SQLite database set to: /usr/local/stow/opendnssec-1.2.0/var/opendnssec/kasp.db
/usr/local/stow/opendnssec-1.2.0/var/opendnssec/kasp.db.our_lock already locked, sleep
   ...

Our test system has 6GB of RAM. While the enforcer and signer were running, it locked up (swap), so we had to pull the plug on it. After the restart, we noticed that starting OpenDNSSEC with `ods-control start' doesn't start the enforcer (only the signerd is started). It appears that files left over in /var/run make the enforcer think it is still running.
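
For reference, this is the kind of stale-PID-file check we had expected at startup. It is only a sketch in Python, not OpenDNSSEC's code; the function name and the PID file path in the usage line are placeholders of ours, since we have not verified which file the enforcer actually consults.

```python
import os


def pid_file_is_stale(path):
    """Return True if `path` holds a PID whose process no longer exists."""
    try:
        with open(path) as fh:
            pid = int(fh.read().strip())
    except (OSError, ValueError):
        # No PID file, or content that is not a number: nothing to call stale.
        return False
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 delivers nothing, it only checks existence
    except ProcessLookupError:
        return True      # no such process: the file is a leftover
    except PermissionError:
        return False     # process exists but belongs to another user
    return False


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(pid_file_is_stale("/var/run/opendnssec/enforcerd.pid"))
```

A leftover file detected this way could be removed before the daemon is started again.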

Just before the reboot, about 2,000 key pairs had been created. An `ods-ksmutil key list' then took an inordinate amount of time to complete:

        time ods-ksmutil  key list  > x.01
SQLite database set to: /usr/local/stow/opendnssec-1.2.0/var/opendnssec/kasp.db

        real    5m28.749s
        user    4m44.685s
        sys     0m43.702s

The first 10,000 key pairs took over 4 hours to generate. During that time the signer was blocked (kasp.db.our_lock exists). After the four hours, there was no activity: no signing, no nothing. Two signer processes apparently hung. I killed off one of them, and the enforcer continued working.

Jan 26 19:53:15 sign1 ods-signerd: zone fetcher transferred zone c1111.aa serial 1 successfully
Jan 26 19:53:15 sign1 ods-signerd: daemon/cmdhandler.c:209: cmdhandler_handle_cmd_sign: assertion cmdc->engine->tasklist failed
Jan 26 19:53:15 sign1 ods-signerd: zone fetcher transferred zone c1112.aa serial 1 successfully

Killing off the processes and restarting didn't help. An `ods-ksmutil update all' seems to have "fixed" the issue (I was able to launch ods-control), but the question remains as to what happened.

What we then did was to completely disable the auditor in the configuration and in the zone policy (all zones have the same policy), hoping to substantially reduce the load on the system. After an `update all' and a restart of the OpenDNSSEC daemons, we once again see the enforcer start while the signers appear to wait on something (a01.aa is the first zone in zonelist.xml):

1955 ? Rs 193:45 /usr/local/stow/opendnssec-1.2.0/sbin/ods-enforcerd
1959 ? Ss 0:00 /usr/local/stow/opendnssec-1.2.0/sbin/ods-signerd -vvv
1967 ? S 0:00 sh -c /usr/local/stow/opendnssec-1.2.0/sbin/ods-signer sign a01.aa > /dev/null 2>&1
1968 ? S 0:00 /usr/local/stow/opendnssec-1.2.0/sbin/ods-signer sign a01.aa

(This has been the case for a while now; again, note the times:
 -rw-r--r-- 1 opendnssec opendnssec 5223424 Jan 28 11:12 kasp.db
 -rw-r--r-- 1 opendnssec opendnssec       0 Jan 28 07:52 kasp.db.our_lock
)

I understand OpenDNSSEC is used mainly in TLD environments, which have few but large zones. Is OpenDNSSEC theoretically suited to production use in a lots-of-small-zones environment?

Is what we are attempting to do realistically feasible with OpenDNSSEC?

Thank you & regards,

        -JP
