OK, we have unnecessary slowness in our DNS lookups.  Currently we have
these components doing DNS:

  - (a) the normal Received-line DNSBL lookups.  This uses an asynch
    model, starting queries after the metadata is parsed and harvesting
    before the meta tests are run.

  - (b) the new URIBL lookup plugin.  This also uses an async model
    of its own, because the above async event loop is a bit too
    self-contained and Dns.pm-oriented for it to be plugged in.

  - (c) SPF support, using Mail::SPF::Query.  This does everything
    synchronously, unfortunately, which could be slow.

ideally, we should

  - (a) get all of these using 1 single async event loop, and using
    a shared cache of results that have been received so if the URIBL
    code does an NS lookup for "domain.com", and the Received DNSBL
    code wants to do the same, it doesn't even hit the wire.

  - (b) hack at Mail::SPF::Query to support async event polling.

  - (c) hack at Mail::SPF::Query to use the shared DNS cache.  Both
    of these will probably require us making a
    Mail::SpamAssassin::Plugin::SPF::SPFQuery class -- our own
    copy of M:S:Q.

So I'm proposing a new module to do the event loop for Dns.pm, with the
following APIs:

    start_lookup() {
        my ($self, $type, $bgsendargs, $done, $state) = @_;

        type: string representing the lookup type, e.g. "URIBL", "DNSBL",
        "SPF" etc.  Important so we can identify and kill lookups
        for a particular subsystem if they're taking too long,
        without killing all other lookups.  

        It may be appropriate to use a numeric type here, although that
        would mean that future plugins would need to reserve new numbers
        (a bad thing).

        $bgsendargs: ref to array of arguments for the $res->bgsend() fn.
        Also used (in stringified form) as key for {activelookups}
        hash.

        $done: a ref to a function to call once it's complete

        $state: all state required by $done

        if there's a cached result for a query of $bgsendargs already
        present in the cache hash, then $done is called immediately
        with that result.

        Since we may need to perform the same lookup for 2 different
        reasons (e.g. an "A" query for SPF will need different processing
        than the same lookup for URIBL), we keep a multi-level
        "activelookups" hash as follows:

        if $self->{activelookups}->{$bgsendargs} does NOT exist, a new
        subhash $self->{activelookups}->{$bgsendargs} is created with a
        sub object $self->{activelookups}->{$bgsendargs}->{$type}, and
        $self->{activelookups}->{$bgsendargs}->{BGSOCK} =
        $res->bgsend($bgsendargs), then this method returns.

        Alternatively, if $self->{activelookups}->{$bgsendargs} exists,
        a new subhash $self->{activelookups}->{$bgsendargs}->{$type}
        is created with the new lookup object -- in other words
        another callback is added for when an already-running query
        completes.
    }

    run_event_loop() {
        this is called periodically from *anything* that needs
        to process DNS results.  It will check to see if any of
        the lookups have returned a result, and call their
        done() method.

        foreach (activelookups) {
            if (bgisready ($lookup->{sock})) {
                store result from bgread() into the cached-results
                hash

                run through all $type objects under $lookup, and
                call:

                        $type->done($state, $result);

                Where $result was whatever bgread() returned

                remove the lookup and all its callback objects
                from activelookups hash
            }
        }
    }

    complete_lookups_of_type($type, $timeout)

        This will attempt to complete all lookups of type $type
        within $timeout seconds.

    complete_lookups()

        Similar for all lookups

    abort_lookups_of_type ($type)
    abort_lookups()

        when we have to give up on lookups because they're taking
        too long, this does it.

    count_lookups_of_type ($type)
    count_lookups ()

        Return how many lookups are remaining in that type.
        This allows dynamic timeouts a la Dns.pm.

    finish ()

        calls abort_lookups(), cleans up possible circular refs,
        usual destructor stuff

That's basically it.  I think that should cover all the needs
of SPF, Dns.pm and URIBL lookups.  comments?

--j.

Reply via email to