I believe your choice is better, in that, it requires many less
changes to
the underlying db_ methods, so I'll back mine out and proceed with
yours. I
may need one new db_ method to add a user mailbox to an existing
message.
I think there's a way to do this entirely in the the database layer, at
least for postgresql. Mysql would have to wait for 5.0 I think.
Essentially, fingerprint each message block with a crypto hash, and
store that as a unique key in a table. If there's duplicate on insert,
the rule would just swallow the row and return the key. A view could be
put together for the current messageblks table, but for most of the
access, its not really a useful construct. Or maybe it is depending on
how well pg pushes the query clauses down into the view definition.
I've got some sql to do this, tested only with dummy manual inserts,
but it seems to work. It's more geared towards the 1.0 series layout,
as I'm not using the physmessage table. It's also at least one delete
rule away from being complete. It does pass the 'it works by hand'
test, but I haven't subbed this under dbmail yet.
There are a few keys here:
there is a copy of the message flags row for every message*recipients.
there is a copy of the mapping between the message_idnr and the blocks
for every message*recipient.
there is one copy of any identical block.
So this would support different header blocks attached to cc/bcc'd
messages where the body is the same. It would support multiple
independent delivery. It would support fully identical delivery as
well.
eric
create table messageblks_new (
fprint varchar(32), -- I'd rather use binary here....
blocksize int,
messageblk text
);
create unique index msgblknew_fp on messageblks_new (fprint);
drop table message_blk_ref;
create table message_blk_ref(
fprint varchar(32),
messageblk_idnr serial,
message_idnr bigint
);
create unique index msgblk_ref_pk on message_blk_ref (message_idnr,
messageblk_idnr);
create index msgblk_ref_fp on message_blk_ref (fprint);
drop view messageblks;
create view messageblks as
select messageblk_idnr, message_idnr, messageblk, blocksize
from message_blk_ref inner join messageblks_new using (fprint);
create or replace function insertAndReturnFprint(text) returns text as '
declare
strMessageblk alias for $1;
res record;
strHash text;
begin
strHash := md5(strMessageblk);
for res in select count(*) from messageblks_new where
messageblks_new.fprint=strHash loop
if res.count > 0 then
return strHash;
end if;
end loop;
insert into messageblks_new (fprint, blocksize, messageblk) values
(strHash, char_length(strMessageblk), strMessageblk);
return strHash;
end;
' language plpgsql;
drop rule ins_msgblks on messageblks;
create rule ins_msgblks as on insert to messageblks do instead
insert into message_blk_ref (fprint, message_idnr) values
((select insertAndReturnFprint(new.messageblk)),
new.message_idnr);
insert into messageblks (messageblk, message_idnr) values ('bar', 3);
insert into messageblks (messageblk, message_idnr) values ('bong', 3);
insert into messageblks (messageblk, message_idnr) values ('baz', 3);
insert into messageblks (messageblk, message_idnr) values ('baz', 4);
This last insert will not add a row to the message block storage, only
to the reference table.