Re: refactoring FilterParser

Tim Legant Sun, 08 Dec 2002 21:37:11 -0800

Jonathan Ellis <[EMAIL PROTECTED]> writes:

> 3a) calling lower() in FlatFileStore -- I'm not conviced this is
> needed.  We don't lower() every key read from the DB formats, and
> there's nothing stopping people from modifying those manually, either.


We don't lower() every key read because we don't read every key.  We
only read one.  As you note below, the auto database files that we
create are always lowercased.

We obviously can't control DBM/CDB files created by external tools nor
can we control what goes into text files, except in the cases where
TMDA auto-appends, since the user edits them.  But we can always
consistently lowercase lookup attempts so the user who creates
databases externally has a fighting chance of achieving a match. We
can also lowercase the addresses we read from text files before
comparing.  TMDA has always done this.

> We already lower() both on append to flat files and on creation of
> an autodb from flat files, if anything; there's too much lower()ing
> going on already.

We absolutely must either lowercase or uppercase on auto-creation of
the DB files.  And we must do the same thing to the look-up key(s)
before attempting the search.  Otherwise the hashed search won't work.

The domain portion of an email address is allowed to be in any case
and the local portion is MTA dependent.  The only possible way to
ensure we match any combination of case is to either lowercase or
uppercase.  Ideally, we'd have a case-insensitive string compare, but
Python has no such thing, and it still wouldn't work with the DB
support.

> Better to have it well-defined at which points we can assume lower()
> has been done already, to avoid redundancy.  I suggest we leave the
> lower() on append in, and remove the others.

But the only way to assume lower() has been done already for the
auto database cases is to do it ourselves, which you've just suggested
removing!

We currently lowercase in 4 places.  I think they are well-defined.

1) The search key(s).  This is necessary.
2) On creation of auto database files.  This is necessary.
3) On reading a text file (FlatFileStore).  This is necessary.
4) On append to the CONFIRM_APPEND file.  This is not necessary
   because of (3).  We could remove this, although it's probably
   called the least frequently of the four.

> I suppose if you disagree it's pretty trivial to put a lower() in
> FlatFileStore.__init__ like you said. :)

Yup!  I can do that.

> 3b) put dbmstore code back in, with less "magic" strings getting passed
> around.  commented so the next guy to come along won't make the same
> mistake. :)

Cool.  The firstmatch method could definitely have used more comments.
It looks as if you commented the FilterStore classes well, so that
should help the next sucker^H^H^H^H^H^Hperson who looks at it :)

> the autobuild functionality is now in an superclass of cdbstore and
> dbmstore.

This is nice.  Thank you.

I'll get this patched into my local source tree.  Jason has asked me
to wait on checking changes in until he does a release, so I'll do the
commit as soon as I can after the next release.  Since the changes
affect some important parts of the system, that will give us CVS users
a chance to exercise the new code before the following release.

Again, thanks for taking the time to do this.


Tim
_________________________________________________
tmda-workers mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-workers

Re: refactoring FilterParser

Reply via email to