Re: MySQL support [WARNING: long!]

Tim Legant Thu, 19 Jun 2003 01:31:33 -0700

Gre7g Luterman <[EMAIL PROTECTED]> writes:

> On Sat, 22 Mar 2003 17:11:44 -0700, "Jason R. Mastaler"
> <[EMAIL PROTECTED]> wrote:
> 
> > That's fine.  I'd hold off anyway, as I think MySQL will be added soon
> > by Cory or Tim.


So, here we go. 

Cory and I worked on a contract where, among other things, we added
simple database support to TMDA.  One of my goals in the database
support was not to support MySQL specifically, but rather support the
Python DB API version 2.0.  This would allow the use of any database
that has DB 2.0-compliant modules.  Among others, this includes
PostgreSQL and Oracle.

Another goal was to not restrict the user's table layout.  In other
words, if you already use a database of email users and want to JOIN
on an ID in the user table and a whitelist table, you should be able
to do that.  Of course, not everyone's user table looks the same, or
is even named the same.  So the solution had to allow different
layouts.

What we came up with is by no means complete.  But it does work and,
although the client was using MySQL, I see no reason why it won't work
with other DB 2.0 modules.  I've tested this on a PostgreSQL database
with a different table layout than our client's and it works just
fine.

A final note: I made a deliberate decision to pull the list of
addresses (black or white or whatever) back from the database into
TMDA.  This allows us to use the documented wildcard support that TMDA
provides, which is not provided by any database.  Regular expressions
were a possibility, but the problem is that the syntax for using them
is different across databases; e.g MySQL uses "RLIKE", PostgreSQL uses
"~", etc.

I'm planning on adding 'from-sql' and 'to-sql' rules to the parser.
This is the syntax (I'm using 'from-sql' as the example; 'to-sql'
works the same).

from-sql [-search_key=<string>] "SELECT ..." <action>

This causes the FilterParser.firstmatch() function to perform the
given SELECT statement, read the data back from the database, turn it
into a Python list object and run the Util.findmatch() function on it.
This gives us the ability to use TMDA-style wildcards in the database
as well as any given schema the user prefers.

The from-sql/to-sql code in FilterParser.firstmatch() expects each
returned row to have either one or two columns.  Actually, a row can
have more than two, but the third and following columns are ignored.
The first column must be the email address and the second column, if
present, is the overriding action field, just like in a text list
file.

Here is a simple, memory-wasting table definition with no overriding
action column:

CREATE TABLE whitelist (
    recipient    varchar(80),
    sender       varchar(80)
)

and here is the SELECT statement to retrieve all of the senders in
[EMAIL PROTECTED]'s whitelist:

SELECT sender FROM whitelist WHERE recipient = '[EMAIL PROTECTED]'

Let's add a sequence column, so we can sort the sender addresses.
This is important if wildcards will be used in the table, just like it
matters in text files.

CREATE TABLE whitelist (
    recipient    varchar(80),
    sender       varchar(80),
    num_addr     integer  -- sort order: 1, 2, 3, etc.
)

and here is the SELECT statement to retrieve all of the senders,
sorted properly:

SELECT sender, NULL, num_addr
  FROM whitelist
 WHERE recipient = '[EMAIL PROTECTED]'
ORDER BY num_addr

Note that, because the from-sql rule expects the second column to be
the overriding action, we put a NULL value there, since we don't have
an action column.  If we did have an action column, we could use
that.  Also note that since we ignore the third and following columns,
they can be used for useful things like ORDER BY.

Finally, here's a better designed set of tables:

CREATE TABLE user (
    uid        integer autoincrement not null -- MySQL only.
                   -- PostgreSQL and Oracle use sequences for this.
    email      varchar(80) not null,
    password   varchar(12)
)

CREATE TABLE whitelist (
    uid        integer not null,
    sender     varchar(80),
    action     varchar(80),
    num_addr   integer
)

and the SELECT statement:

SELECT wl.sender, wl.action, wl.num_addr
  FROM user AS u, whitelist AS wl
 WHERE u.email = '[EMAIL PROTECTED]'
   AND u.uid = wl.uid
ORDER BY wl.num_addr

This joins between a user table (where the recipient email address is
stored) and a whitelist table, on the uid column.

In each case, we had to provide the recipient's email address in order
to find what senders were in that recipient's whitelist.  So does
every user have to write their own SQL SELECT statement in their
filter files?  No.  The above SELECT can be written like this:

SELECT wl.sender, wl.action, wl.num_addr
  FROM user AS u, whitelist AS wl
 WHERE u.email = %(search_key)s
   AND u.uid = wl.uid
ORDER BY wl.num_addr

Note that the Python substitution %(VAR_NAME)s is used and the
variable name is 'search_key'.  The 'search_key' argument to the
from-sql and to-sql rules can specify a string to substitute into an
otherwise "generic" SELECT statement.  Thus an adminstrator can define
the SELECT statement in /etc/tmdarc, for example, like this:

WHITELIST_SELECT = "SELECT wl.sender, wl.action, wl.num_addr \
                      FROM user AS u, whitelist AS wl        \
                     WHERE u.email = %(search_key)s          \
                       AND u.uid = wl.uid                    \
                  ORDER BY wl.num_addr"

and the individual users can use this rule in their incoming filter:

from-sql [EMAIL PROTECTED] "${WHITELIST_SELECT}" ok

If no -search_key argument is specified, the from-sql and to-sql rules
will default to [EMAIL PROTECTED]  The search key doesn't have to be
the recipient's email address, but that's about the only thing we have
to distinguish one user from another in the database.  In certain
environments, the administrator may be able to use another key.

If the login username is enough, the rule could be given like this:

from-sql -search_key=${USER} "${WHITELIST_SELECT}" ok

For example, if you keep each user's whitelist in a separate table
named after the user, you could use a SELECT statement like this:

SELECT sender FROM %(search_key)s

Anyhow, that's the idea.  I'm curious to hear any feedback before I go
ahead with this -- enhancements, questions, whatever.  I'd like to
make sure this works reasonably well before the 1.0 release, which is
looming.

> I think we should probably have some sort of MySQL equivilant of
> BARE_APPEND, CONFIRM_APPEND, PENDING_BLACKLIST_APPEND, and/or
> PENDING_DELETE_APPEND, but I held off adding anything as I'm not sure
> how people would want it done.

The PENDING_* variables might be going away after version 1.0 is
released.  Jason might be able to comment more on that, but without
tmda-pending, they really don't serve a purpose.

The *_APPEND functionality would be nice to have in a database
context.  I have an idea how to enhance the database proposal above
which will also allow us to cleanly interface those variables with the
database functionality.  It's too much to do before 1.0, however, so
I'll postpone the discussion until later.


Tim
_________________________________________________
tmda-workers mailing list ([EMAIL PROTECTED])
http://tmda.net/lists/listinfo/tmda-workers

Re: MySQL support [WARNING: long!]

Reply via email to