Re: refactoring FilterParser

Tim Legant Thu, 05 Dec 2002 10:50:52 -0800

Jonathan Ellis <[EMAIL PROTECTED]> writes:

> Here's a fixed version...


Hi Jonathan.  I've looked at your patch and have several comments.
First, thanks for taking the time to work on it and submit it.  If
you'd be willing to resubmit your patch with the changes below, I'd be
happy to commit it.  The patch affects three basic areas in the code,
so I'll just walk through them.

1) We would prefer to leave the -optional flag behavior as it is
   because of the possibility of losing mail through simple typos.
   As Jason says, you should have to load the gun if you intend to
   shoot yourself in the foot.

2) I like the idea of not having a second field/column in the address
   databases (text, CDB, DBM) because it simplifies the code and
   because I think it's clearer than having a from-file rule that says
   'accept' and then several addresses in that file that say 'bounce'
   or 'drop'.

   However, there may be a number of people using that second field so
   I'm going to bring this up in a separate message on -workers so
   that we can discuss it and try to determine how prevalent this
   usage is.

   You can resubmit the patch without support for the second field
   and, if we eventually decide we want to retain that support, I'll
   take responsbility to add it back in.

3) The FilterStore code can go in pretty much as is, although there
   are a couple of problems in the implementation that should be
   addressed before I can commit it.

   First, in FlatFileStore, the first field (or only field) needs to
   be lowercased, probably in __init__.  The email addresses in text
   files can be in any case, so we need to lowercase so that
   Util.findmatch will work.  Util.findmatch actually uses the Python
   function fnmatch.fnmatch, which is designed to work with filenames
   and is therefore case sensitive.  Without the second field, it will
   be much simpler that what was previously in
   FilterParser.__search_file.

   Second, your implementation of DbmStore and Util.autobuild_db
   removed a lot of code from the original implementation.
   Unfortunately, that means that the new code doesn't work on several
   platforms.  Here's why.

   First, only the filename string returned from tempfile.mktemp is
   known on all platforms.  The file extension is not known and can't
   be assumed.  When a new database is created (open(name, 'n')), some
   platforms create the file with the exact filename you give it, some
   add the extension '.db' and other add '.dbm'.

   This means that the tmpname variable does not necessarily contain
   the name of the database file on disk.  The consequence of that is
   that the os.rename call in Util.build_dbm will fail.  It will fail
   every time, meaning the database is never searched (we fall back to
   text) and a collection of strangely named databases builds up in
   the directory.

   Because we can't know, unambiguously, the name of the database, we
   can't get its modification time in Util.autobuild_db.  Therefore,
   for DBM style databases we used a surrogate file (the dbname with
   '.last_built' appended) that we created immediately after creating
   the database.  We *can* unambiguously derive that filename and
   therefore we requested the modification time for the surrogate in
   the original __autobuild_db.  This mess is only necessary for DBM
   support and not for CDB support.

   Second, there was a glob.glob loop in the original Util.build_dbm
   that renamed all files with a first part of tmpname to the desired
   dbname, preserving whatever extension they had.  Thus blahtmp.db
   became, for the textfile 'whitelist', whitelist.db, etc.  The
   reason for this seemingly weird code is partly because of the above
   problem (we don't know the source name for the rename call) and
   partly because, on some platforms, a call to create a new database
   actually creates more than one file.  I believe this is true on
   Solaris, for instance.  Often the second file is an index file.

   In order for DBM support to work on those platforms, both files
   must be renamed to the same base filename with the appropriate
   extensions (which again, we don't know).  So that code is very
   important for multi-platform DBM support in TMDA.

   In order for me to check in the FilterStore code, your
   implementation needs to handle these cases correctly.

   As a general rule of thumb when submitting patches, if the
   combination of the old code and the new code looks like it could
   use some restructuring, have at it!  Often, adding a new feature
   makes the previous implementation's structure look weird, or be
   less maintainable for the next guy to come along.

   On the other hand, if you find code that seems not to make sense,
   don't just remove it.  Run a question up the -workers flagpole to
   find out if anyone can explain it first.  TMDA is young enough and
   small enough that there is very little build-up of cruft.  So
   there's probably a reason that code is there, although the
   structure, like FilterParser.firstmatch, might need to be
   addressed.

Sorry this is so long, but I felt you should understand why I was
requesting the changes in DbmStore.  If you can resubmit with the
above changes, we'll get your code into the tree.  Also, please feel
free to take part in the discussion on the second column issue.


Thanks,

Tim

_________________________________________________
tmda-workers mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-workers

Re: refactoring FilterParser

Reply via email to