Matthew T. O'Connor wrote:
Personally, I think we should use a very clean and generic design for
headers / header searching.
First: Leave the current table structures intact, I don't think we
require any modifications to them.
Second: Add 2 new tables: header_list and header_values
Breaking out the header names to a table of their own may be useful. But doing it just for storage's sake
seems a bit overkill given the added complexity in constructing queries and maintaining data integrity. If
it's boosts performance for the target use-cases (search,sort,thread) I all for it, if.
header_list: ( Contains an exhaustive list of all headers from all
messages in the database. )
header_id int primary key
header text not null
header_values: ( Contains the values from all the headers in all the
messages in database )
header_value_id serial primary key
message_id int (references unique message ID from the
physmessage table)
header_id int (references unique ID from the header_list table)
header_value text (the actual value from this header in this message)
hearder_order int ( optional column, used to be able to recreate
the header order from the original message)
Header_order will never happen. Recreating headers from the header tables will never happen. Complete headers
are stored in the messageblk. Also, header-order from the original message is already *not* being maintained.
Gmime does it's own reformatting and reshuffling of the headers.
This structure will make it very easy to query all the headers from a
given message or find all the messages with a given header, or a given
header value. It also leaves our current structure intact which will
make it easier to phase in.
Agreed. Starting with a single separate headers table, or with two tables like you propose will probably be
the starting point. Once we have consistent storage of headers, it will be relatively easy to move certain
headers to tables of their own, or merge them into the physmessage table. Of course, postgres users could
probably even use triggers for stuff like that.
What do you think? I don't think we need to special case any headers
not even sendername or subject.
Well, yukatan's datamodel looks like a very serious attempt at optimizing datastorage for email. My working
assumption is that there are some very valid reasons for doing it the way they're doing things. Also, as a
long term goal, a unified model for sql based email storage is something I think about.
--
________________________________________________________________
Paul Stevens mailto:[EMAIL PROTECTED]
NET FACILITIES GROUP PGP: finger [EMAIL PROTECTED]
The Netherlands________________________________http://www.nfg.nl