Dear all,

I have been reading up on the discussions on this list as well as the
concerns about databases in the FAQ. Whilst I concur with most of the
points regarding a fully fledged SQL database, I think that CDBs are
ideally suited for the purposes of spamdyke. Sam states in the FAQ
that speed, memory, concurrency, portability and availability are not
a concern with CDBs and I agree, especially on the speed issue. After
all, speed is what the hash file format was designed for.

That leaves accessibility and safety for CDBs. It is true that the
database itself is in binary form (that is where the speed comes
from), which means that it cannot be easily viewed and checked for
errors. At the same time, CDBs are read-only and are usually generated
from a plain text file as input. There is no reason not to have that
text file sitting next to the actual database file, which means we
have all the advantages of a plain text file plus the speed benefit of
CDBs, which can be substantial when there are many entries. The only
additional step required (by the admin) would be to convert the text
file into the CDB. We could get the best of both worlds as follows.
Suppose we have this entry in the configuration file:

recipient-blacklist-file=/etc/spamdyke/recipient-blacklist


First, we look for a file with the name
/etc/spamdyke/recipient-blacklist.cdb. If it exists, we assume it is a
CDB version of /etc/spamdyke/recipient-blacklist and look up whatever
we need there. If recipient-blacklist.cdb has an earlier modification
time than recipient-blacklist (we get that for free anyway with a
stat() on both files), we could help the admin by printing a warning
that the CDB is probably out of date and read from recipient-blacklist
instead. If recipient-blacklist.cdb does not exist, we use
recipient-blacklist in ASCII format like before.
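
To make the lookup order above concrete, here is a minimal sketch in
Python (spamdyke itself is written in C, so this only illustrates the
logic). The function name pick_lookup_file and the wording of the
warning are made up for this example. The case where only the .cdb
file exists is not covered by the description above; using the CDB
there is my assumption.

```python
import os
import sys


def pick_lookup_file(text_path, warn=sys.stderr):
    """Return (path, is_cdb) for the file that lookups should use."""
    cdb_path = text_path + ".cdb"
    try:
        cdb_stat = os.stat(cdb_path)
    except FileNotFoundError:
        # No CDB next to the text file: use the ASCII file as before.
        return text_path, False
    try:
        text_stat = os.stat(text_path)
    except FileNotFoundError:
        # Assumption: with no text file to compare against, trust the CDB.
        return cdb_path, True
    if cdb_stat.st_mtime < text_stat.st_mtime:
        # The CDB is older than its source, so it is probably out of date.
        print("warning: %s is older than %s; reading the text file instead"
              % (cdb_path, text_path), file=warn)
        return text_path, False
    return cdb_path, True
```

Calling pick_lookup_file("/etc/spamdyke/recipient-blacklist") would
then tell the caller which file to open and whether to treat it as a
CDB.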


Another version of this would be to add a new configuration option
for each file option, like:

recipient-blacklist-file-cdb=/etc/spamdyke/recipient-blacklist.cdb

That makes it possible to name the database file arbitrarily. If we
want the same safety checks as in the scheme above, we could make it
mandatory to name the ASCII input file for the CDB database file:

recipient-blacklist-file=/etc/spamdyke/recipient-blacklist
recipient-blacklist-file-cdb=/etc/spamdyke/recipient-blacklist.cdb

That way all the fallbacks to ASCII plus warnings can be implemented at
the cost of more configuration entries.
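
A rough sketch of how the pairing rule could be checked, again in
Python and purely illustrative: the function name pair_cdb_options,
the "-cdb" suffix convention taken from the example above, and the
treatment of lines starting with "#" as comments are all assumptions
on my part, not existing spamdyke behaviour.

```python
def pair_cdb_options(config_text):
    """Map each non-CDB option to (value, companion CDB path or None)."""
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        key, sep, value = line.partition("=")
        if sep:
            options[key.strip()] = value.strip()

    pairs = {}
    for key, value in options.items():
        if key.endswith("-cdb"):
            base = key[:-len("-cdb")]
            # A CDB option without its ASCII input file is rejected.
            if base not in options:
                raise ValueError("%s requires %s to be set as well"
                                 % (key, base))
        else:
            pairs[key] = (value, options.get(key + "-cdb"))
    return pairs
```

With the two entries shown above, this would yield one pair for
recipient-blacklist-file holding both paths, and the staleness check
and ASCII fallback could then be applied to that pair.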


What do you think?

-- 

Joerg Lenneis

email: [email protected]

_______________________________________________
spamdyke-users mailing list
[email protected]
http://www.spamdyke.org/mailman/listinfo/spamdyke-users