I think it's called "MIME chunking" or "single instance storage."
Apparently I was wrong about the cost of searching for duplicates, since
there is some sort of hashing scheme for this. I could be wrong.
http://dbmail.10918.n7.nabble.com/Newbie-Question-single-instance-store-for-attachmens-td13048.html
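To make the hashing idea concrete, here is a minimal sketch in Python. It is not DBMail's actual code or schema; the function names, the addresses, and the choice of SHA-256 are all assumptions for illustration. It hashes each leaf MIME part separately, so two copies of a message that differ only in the To: header still produce identical part digests.

```python
import hashlib
from email import message_from_bytes
from email.message import EmailMessage


def part_digests(raw: bytes) -> list:
    """Return one SHA-256 hex digest per leaf MIME part (decoded payload only)."""
    msg = message_from_bytes(raw)
    digests = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no content of their own
        payload = part.get_payload(decode=True) or b""
        digests.append(hashlib.sha256(payload).hexdigest())
    return digests


def build_message(to_addr: str) -> bytes:
    """Build a hypothetical test message with a text body and one attachment."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = to_addr
    msg["Subject"] = "Report"
    msg.set_content("See attached.\n")
    msg.add_attachment(
        b"\x00\x01binary data" * 100,
        maintype="application",
        subtype="octet-stream",
        filename="report.bin",
    )
    return msg.as_bytes()


digests_alice = part_digests(build_message("alice@example.com"))
digests_bob = part_digests(build_message("bob@example.com"))
print(digests_alice == digests_bob)
```

Under a scheme like this, the top-level header block differs per copy, but the body and attachment hash the same and could be stored once.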
On 4/13/2014 1:19 PM, KT Walrus wrote:
DBMail already does a lot of data deduplication (headers, attachments, etc.).
I’m just not clear how far this goes and whether my turning a message to a list
of recipients into multiple copies of the message with different To: and
possibly different Message-Id: affects the data de-duplication.
If I should keep the headers the same for all copies of the message to get
maximum data deduplication, I will. I just prefer that each recipient see only
their own address in the To: header and not know about everyone else.
As for my “app”, it is a PHP app that uses the RoundCube Framework to provide
an IMAP interface to the user for accessing their mailbox and some public
mailboxes. The user sends messages using SMTP and I have a milter to send the
message to a special outbox mailbox (in DBMail). Then, I have a PHP cron job
that checks the outbox, retrieves the queued messages, preprocesses the message
headers, and uses dbmail-deliver to send the message to the appropriate
recipients.
I have all this working quite nicely. But, I’m trying to figure out the best
way to send a To: customized copy of each message to each recipient.
I need to understand how DBMail does data deduplication.
Kevin
On Apr 13, 2014, at 3:35 PM, Mark Winslow <furf...@omnicode.com> wrote:
I'm confused about what you're trying to accomplish. I haven't used dbmail
yet, but I've read up on it and am about to implement a test version.
When you talk about your "app," where in the mail delivery process are you
forwarding the messages? My understanding is that dbmail is an IMAP server that
implements the LMTP protocol to receive mail from a mail transfer agent (MTA) like
sendmail. Does your app work by forwarding messages through an MTA or are you going to
dup things in the database backend? Or something else?
As for the alias/caching scheme you mention, it sounds very complex. The
simplest way of dealing with it would be if dbmail were to check globally for
exact copies of message bodies. It seems like it would be very expensive in
terms of processing time because presumably unindexed message bodies would have
to be checked against potentially millions of other message bodies.
If you knew you were duping the message bodies, you could give each large body
a unique tag and reference that. However, I doubt that dbmail does that, and
I'm not sure if a plugin or something could be easily made to do it. Duping
the messages on the database would be easy. The hard part would be hooking
into dbmail's IMAP serving mechanism.
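The "unique tag" idea above can be sketched as a content-addressed store: the tag is a digest of the body, and an index on that digest turns the duplicate check into a single lookup instead of a comparison against every stored body. This is an illustration only, assuming an in-memory dictionary in place of a database table; the BlobStore class and its methods are hypothetical and say nothing about how dbmail does it.

```python
import hashlib


class BlobStore:
    """Hypothetical single-instance store keyed by the SHA-256 of the content."""

    def __init__(self):
        self._blobs = {}  # digest -> content bytes (stored once)
        self._refs = {}   # digest -> number of messages referencing it

    def put(self, data: bytes) -> str:
        """Store data if unseen, bump its reference count, and return its tag."""
        key = hashlib.sha256(data).hexdigest()
        if key not in self._blobs:
            self._blobs[key] = data
        self._refs[key] = self._refs.get(key, 0) + 1
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def ref_count(self, key: str) -> int:
        return self._refs.get(key, 0)

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self._blobs.values())


store = BlobStore()
body = b"A large message body. " * 1000

# Deliver the same body to 500 recipients; each delivery only adds a reference.
tags = [store.put(body) for _ in range(500)]

print(len(set(tags)), store.ref_count(tags[0]), store.stored_bytes() == len(body))
```

In a real database the dictionary lookup would correspond to an indexed column on the digest, which is why the check need not scan millions of bodies.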
Just my take.
On 4/13/2014 10:49 AM, KT Walrus wrote:
I’m working on implementing a mailing list feature in my app. Each user has
their own mailing list with the mailing list recipient addresses stored in my
database. The user can send mail to the mailing list address and the message
would be delivered to each recipient address for the list.
I would like to change the To: header from the mailing list address to the
individual recipient address for each copy of the message delivered (and add a
Reply-To: header to use for replying to the message to the group). Basically,
I don’t want recipients to see the original mailing list address or other
recipient addresses in their email.
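As a rough sketch of the header rewrite described above, the following Python uses the standard email package to produce one copy per recipient with its own To: and a shared Reply-To:. The personalize function and all addresses are made up for illustration, and the actual hand-off to dbmail-deliver is deliberately left out.

```python
from email import message_from_bytes, policy


def personalize(raw: bytes, recipient: str, reply_to: str) -> bytes:
    """Return a copy of raw with To: set to one recipient and Reply-To: replaced."""
    msg = message_from_bytes(raw, policy=policy.SMTP)
    # Drop any existing addressing headers so the list address and the
    # other recipients do not leak into the copy.
    del msg["To"]
    del msg["Cc"]
    del msg["Reply-To"]
    msg["To"] = recipient
    msg["Reply-To"] = reply_to
    return msg.as_bytes()


original = (
    b"From: user@example.com\r\n"
    b"To: list@example.com\r\n"
    b"Subject: Hi\r\n"
    b"\r\n"
    b"Hello\r\n"
)

recipients = ["alice@example.com", "bob@example.com"]
copies = {
    addr: personalize(original, addr, "group-reply@example.com")
    for addr in recipients
}
```

Only the header block changes between copies here; the body bytes are the same in each, which is the part a deduplicating store would have a chance to share.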
My question is:
If I change the To: header and use dbmail-deliver to deliver each changed
message, will all copies of the messages be efficiently stored (given that each
copy has a different To: header)?
Also, should I change the Message-Id: header in each copy of the message before
using dbmail-deliver to send a copy of the message to an individual recipient?
Does changing To: or Message-Id: affect storing of attachments? I only want
the attachment stored once regardless of the number of messages it is attached
to. I would like the message bodies and unchanged headers to be stored only once
regardless of the number of copies of the message.
Or, would it be better to just change the To: header to “Undisclosed
Recipients:” and keep the message headers and body the same in all the
dbmail-deliver copies?
Kevin
_______________________________________________
DBmail mailing list
DBmail@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail