Consider the following tables:
CREATE TABLE `dbmail_messageblks` (
`messageblk_idnr` bigint(21) NOT NULL auto_increment,
`physmessage_id` bigint(21) NOT NULL default '0',
`messagepart_order` int NOT NULL,
`messagepart_idnr` bigint(21) NOT NULL,
`mime_header` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`messageblk_idnr`),
UNIQUE KEY `messageblk_idnr_message_part`
(`messageblk_idnr`,`messagepart_idnr`),
KEY `physmessage_id_index` (`physmessage_id`),
KEY `messagepart_id_index` (`messagepart_idnr`),
CONSTRAINT `dbmail_messageblks_ibfk_1` FOREIGN KEY (`physmessage_id`)
REFERENCES `dbmail_physmessage` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `dbmail_messageparts` (
`messagepart_idnr` bigint(20) NOT NULL auto_increment,
`message` longblob,
`size` bigint(20) NOT NULL,
`hash` varchar(64) NOT NULL,
PRIMARY KEY (`messagepart_idnr`),
UNIQUE KEY `hash_size` (`hash`,`size`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
and a message constructed like follows
->Message
-Non Mime Message or Mime Warning "This is a multi-part message in
MIME format."
->Plain Text
->HTML
->JPEG
->Attached Message 1
-Non Mime Message or Mime Warning "This is a multi-part message
in MIME format."
->Plain Text
->HTML
->MPEG
->Attached Message 2
-Non Mime Message or Mime Warning "This is a multi-part message
in MIME format."
->Plain Text
->HTML
->GIF
Would be come these message parts
begin
message_part=0 Root Message Headers
message_part=1 Root Non-Mime Message
message_part=2 Boundry and headers ("Plain Text")
message_part=3 Body ("The Plain Text")
message_part=4 Boundry and headers ("HTML")
message_part=5 Body ("The HTML")
message_part=6 Boundry and headers ("JPEG Attachment")
message_part=7 Body ("Base64 of the JPEG")
message_part=8 Boundry and headers ("Attached Message 1")
message_part=9 Attached message 1 Headers
message_part=10 Attached message 1 Non-Mime
message_part=11 Attached message 1 Boundry and headers ("Plain Text")
message_part=12 Attached message 1 Body ("Plain Text")
message_part=13 Attached message 1 Boundry and headers ("HTML")
message_part=14 Attached message 1 Body ("HTML")
message_part=15 Attached message 1 Boundry and headers ("MPEG")
message_part=16 Attached message 1 Body ("MPEG")
message_part=17 Attached message 1 Closing boundry
message_part=18 Boundry and headers ("Attached Message 2")
message_part=19 Attached message 2 Headers
message_part=20 Attached message 2 Non-Mime
message_part=21 Attached message 2 Boundry and headers ("Plain Text")
message_part=22 Attached message 2 Body ("Plain Text")
message_part=23 Attached message 2 Boundry and headers ("HTML")
message_part=24 Attached message 2 Body ("HTML")
message_part=25 Attached message 2 Boundry and headers ("GIF")
message_part=26 Attached message 2 Body ("GIF")
message_part=27 Attached message 2 Closing Boundry
message_part=28 Root Message Closing Boundry
end
While a simple non-mime message would be
begin
message_part=0 Message Headers
message_part=1 Message Body
end
I propose using sha256 as it will give a more unique key at 64 chars
than sha1 at 40 chars and only doing the sha256 on the first 8 or 16 MB.
I can't think of any real data files that could result in the same size
and sha256 of the first 8 MB. This could also be configurable in the
dbmail.conf to what ever size the admin choses, and could be changed
later, but would require dbmail to be shut down and the dbmail-util run
to update every hash in the database to use the new value.
Features:
All parts of the messages can be stored uniquely.
Duplicate attachments/parts - even if they were renamed - can't occur -
while the original attached file name remains intact.
I purposly left the boundry's in there so that original message could be
re-built in SQL into its original complete data stream for external use
or mail export.
The is_header field can be removed as message_part=0 will always contain
the headers of the message.
The mime_header field can be used to find an attachment when looking to
download only certain parts.
Rebuiling of headers can be done quickly with message_part=0
The messagepart_order allows the inserting of multiple message parts to
be done by threads
Message above would have a total of 29 threads spawned, though they
may not all be spawned at the same due to thread/memory limits.
1. The thread is told the messagepart_order number and passed the
data to be inserted.
2. The size is calculated and the first 8 MB of the data are sha256'ed.
Shorter data would be ok as we would then have a sha256 on
the enitre part.
3. Do a quick select on the db to check if the hash and size is
already there.
4. If the data is not there - insert the data
5. Get the messagepart_idnr for the size and hash.
6. Insert the messageparts row with the matching order and
messagepart_idnr
Issues/Slowdowns:
Deletion of messageparts would require a check to see if the part is no
longer required.
SELECT messagepart_idnr, COUNT(physmessage_id) FROM
dbmail_messageblks WHERE messagepart_idnr IN (SELECT messagepart_idnr
FROM dbmail_messageblks WHERE physmessage_id='12345' GROUP BY
messagepart_idnr) GROUP BY messagepart_idnr;
This query would return a row for each unique part in the deleted
message. The part could be deleted when the count=1.
This is assuming that every message has a phymessage_id and that
messages sent to multiple mailboxes duplicate the phymessages and not
share one.
-Jon
Paul J Stevens wrote:
Charles Marcus wrote:
DBmail isn't a email system. It's an imap/pop server.
If you want to control this then configure it in your mail server,
like postfix....
Again - saving drafts bypasses the MTA - and I think it would be a good
thing for DBMail to be able to catch these attempts like this to bypass
the limits set by the sys admin...
The size of any message (and attachment in the near future) is known
whenever it passes through dbmail. Adding a couple of runtime options
delimiting their size in the config file is trivial.
Of course, we'd also have to deal with the fallout like handling max
messagesize exceptions and how to communicate it back to the console,
lmtp, or imap clients. That would require some special attention.
_______________________________________________
DBmail mailing list
[email protected]
https://mailman.fastxs.nl/mailman/listinfo/dbmail