A quick experiment shows that if an infra-cache entry with "rto 120000" (two minutes) is present for every one of a requested domain's name servers, SERVFAIL is returned immediately to further queries. However, infra-cache entries are per-domain-per-nameserver. When a different domain with an NS record pointing to the same nameserver is queried, Unbound creates and tracks separate state for the second domain and does not share what it has learned about the reachability of a particular nameserver.
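To make the per-domain-per-nameserver point concrete, here is a toy model of the behaviour I observed. It is only a sketch: the class and method names (InfraCache, mark_unresponsive, fails_fast) are made up for illustration and are not Unbound's internals, and the addresses are documentation placeholders.

```python
from dataclasses import dataclass, field

# The "rto 120000" value seen in the experiment (milliseconds).
RTO_GIVE_UP_MS = 120_000


@dataclass
class InfraCache:
    """Hypothetical model: one entry per (zone, nameserver) pair."""

    rto_ms: dict = field(default_factory=dict)

    def mark_unresponsive(self, zone: str, nameserver: str) -> None:
        # Record that this nameserver has backed off fully for this zone only.
        self.rto_ms[(zone, nameserver)] = RTO_GIVE_UP_MS

    def is_usable(self, zone: str, nameserver: str) -> bool:
        # An unknown (zone, nameserver) pair has no entry and counts as usable.
        return self.rto_ms.get((zone, nameserver), 0) < RTO_GIVE_UP_MS

    def fails_fast(self, zone: str, nameservers: list) -> bool:
        # Immediate SERVFAIL only when every nameserver of the zone is
        # already at the give-up RTO for that same zone.
        return bool(nameservers) and all(
            not self.is_usable(zone, ns) for ns in nameservers
        )


cache = InfraCache()
shared_ns = ["192.0.2.1", "192.0.2.2"]  # same dead servers for both zones

for ns in shared_ns:
    cache.mark_unresponsive("example.com.", ns)

print(cache.fails_fast("example.com.", shared_ns))
print(cache.fails_fast("example.net.", shared_ns))
```

Because the key includes the zone, the second domain starts from scratch even though both of its nameservers are already known to be dead for the first one.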
So I suppose queries against eighty-plus unresponsive name servers for hundreds to thousands of domains easily explain the 700-entry active request queue. One odd quirk is that Unbound sometimes returns SERVFAIL within ten to thirty seconds the first time a request is made for a domain whose nameservers are not responding. Subsequent requests then take however long is needed until the "rto 120000" state is reached for that domain's infra-cache entries. Maximum time until SERVFAIL seems to be ten minutes. 'Eventdns' is now tuned to give up after five seconds and to permit up to 16k active requests, each lasting under five seconds, so all of the above is academic.
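As a rough sanity check on the queue numbers, Little's law (active requests = arrival rate x time each request stays pending) can be applied. This is a back-of-envelope sketch, not a measurement: it assumes every stuck request waits the full ten minutes (the worst case seen above), and it reads "16k" as 16000. The function name is made up for illustration.

```python
def max_sustainable_rate(queue_capacity: float, seconds_pending: float) -> float:
    """Little's law rearranged: arrival rate = active requests / time pending."""
    return queue_capacity / seconds_pending


# Before tuning: 700 active requests, each pending up to 600 s (ten minutes).
# Assumption: all of them wait the full ten minutes.
rate_before = max_sustainable_rate(700, 600)

# After tuning: up to 16000 active requests (assumed reading of "16k"),
# each abandoned after at most 5 s.
rate_after = max_sustainable_rate(16_000, 5)

print(f"{rate_before:.2f} failing requests/s fills a 700-entry queue")
print(f"{rate_after:.0f} requests/s needed to fill the tuned queue")
```

Under those assumptions, a little over one request per second for dead domains is enough to hold 700 entries in the queue, while the five-second limit would need thousands of requests per second to fill 16k slots.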
