RFC: notmuch powered (personal) (end-to-end) e-mail system
Hello all! (Sorry for the long email.)

I have been struggling for some time to get rid of the current "de-facto" email solutions (e.g. GMail, Zimbra), and I have passively followed the notmuch project and community for a while.

I have forwarded all my email to a single account, I mirror my GMail account locally (using `mbsync`), I index it with notmuch, and I collect spam for later filter training. Unfortunately I am still unable to "convert", because the current notmuch-powered solutions have (some of) the following shortcomings (I don't want to offend anyone, so please take these as observations):

* the most full-featured UI is the Emacs one, so remote access is limited (I mean from an arbitrary computer with only a web browser); (and I'm not a very big fan of Emacs;)
* most are still dependent on external IMAP systems -- this is not a problem with notmuch itself, but with the integrating clients;
* spam filtering -- as above -- is not integrated;
* filtering (tag applying) is not automatic (as in integrated in notmuch itself or the client), but triggered through external scripts.

As such I'm thinking of implementing a custom end-to-end email system, and I would like to hear your feedback before embarking on such a task.

I'm targeting the following features:

* (inbound) SMTP integration, so that once an email is received it is automatically pushed through the system; (I'm primarily targeting users who can afford to run their own SMTP server, but the solution could still be adapted for those who only want the other features;)
* automatic spam filtering and tag applying;
* automatic email triggers based on tags (such as user notifications, forwarding, etc.);
* remote RPC-like access to the whole system;
* remote Web user interface.

For the overall architecture I'm thinking of adopting the following:

* in general the whole system is decomposed into independent components (long-lived OS daemons), each doing one particular job (see below);
* all the components communicate with each other through a message queue system (for example ZeroMQ or RabbitMQ);
* all the communication is JSON based.

The components would be:

* SMTP inbound gateway -- for example I could take qmail or Postfix and replace the delivery agent with a custom process that pushes the email into the system; (any other suggestions?)
* email store -- as the name suggests, a simple key-value-like store that persists raw email messages; it should be as robust as possible, and its contents should be the only thing needed to reconstruct all the other derived data; (I could use a simple process that maintains a maildir, I could go with a BerkeleyDB wrapper, or even something more sophisticated;)
* spam filter -- which either classifies an email or trains the spam filter; (for example I would use bogofilter;)
* email index -- this is where notmuch would come into play; it would be fed with emails, to which it would automatically apply tags, and it would issue trigger notifications based on tags; it also maintains the set of filters and tags to apply automatically;
* (maybe) a coordinator that delegates and monitors requests to the above components; but if I'm using RabbitMQ and carefully designing the above components, they could drive each other;
* RESTful web service that mediates access to all the above components.

For now I have the following uncertainties:

* how should I handle multiple users? I think each user should have their own store / notmuch / bogofilter instance (at least in terms of storage, if not in terms of separate daemons);
* should I keep the emails in a file system or in a key-value store? (the file system is more bug-free, but I'm confident that a BerkeleyDB instance would be more efficient);
* should I use libnotmuch, or for starters just make a wrapper around the notmuch tool?
* and the most pressing one, transactions: I would like a message never to be half processed or lost; as such I need notmuch to behave transactionally -- indexing the message and tagging it should be atomic and durable; (is there a way with libnotmuch to control the underlying BerkeleyDB database?)

Suggestions? Considerations?

Ciprian.
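To make the "email store" and "JSON over a message queue" ideas above a little more concrete, here is a minimal sketch, not a proposed implementation. It assumes the maildir variant of the store, uses Python's standard `mailbox` module, and stands in for ZeroMQ/RabbitMQ with an in-process `queue.Queue`. The names `MaildirStore` and `deliver` and the `{"event": "stored", ...}` message shape are hypothetical and exist only for illustration.

```python
import json
import mailbox
import queue


class MaildirStore:
    """Key-value-like store for raw messages, backed by a maildir."""

    def __init__(self, path):
        # create=True creates the maildir (cur/new/tmp) if it is missing.
        self._box = mailbox.Maildir(path, create=True)

    def put(self, raw):
        """Persist a raw message (bytes) and return its key."""
        return self._box.add(raw)

    def get(self, key):
        """Return the raw bytes stored under key."""
        return self._box.get_bytes(key)


def deliver(store, bus, raw):
    """Store a message, then announce it to the other components as JSON."""
    key = store.put(raw)
    # Hypothetical event format; a real system would agree on a schema.
    bus.put(json.dumps({"event": "stored", "key": key}))
    return key
```

A spam filter or indexer component would then consume the event, fetch the raw message by key, and publish its own result in the same way, so the store remains the only place from which everything else can be rebuilt.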
RFC: notmuch powered (personal) (end-to-end) e-mail system
Much of the beauty of notmuch is how few assumptions it makes about your mail system. It plays well with others. For example, one deep insight of notmuch is that it *doesn't* require a custom mail store, even though a more obvious design might; in fact, it doesn't even require Maildir.

That said, I think I can see where you're coming from, and I think you're targeting some real deficiencies of notmuch, but I also think you're overengineering the solution. As a result of notmuch's simplicity, a fully working mail setup requires a lot of moving parts besides notmuch, and it can take a while for a new user to set all that up, especially if they're migrating wholesale from some external mail setup.

On Sun, Mar 20, 2011 at 10:07 AM, Ciprian Dorin Craciun wrote:
> As such I'm thinking on implementing a custom end-to-end email
> system and I would like to hear your feedback before embarking on such
> a task.
>
> I'm targeting the following features:
> * (inbound) SMTP integration, thus once an email is received it is
> automatically pushed through the system; (I'm primarily targeting
> those users that afford to run their own SMTP server; but the solution
> could still be adapted for those that only want the other features;)

As others have mentioned, see notmuch-deliver. I and others have also suggested inotify support for notmuch before. It would make the inbound mail mechanism (be it SMTP, IMAP fetching, or whatever) completely unaware of notmuch, offer some other benefits (for example, if mail is manipulated outside notmuch via IMAP), and be highly discoverable for new users (just have notmuch setup ask if they want notmuch to monitor for new email, and then fire up an inotify daemon the first time notmuch is called).

> * automatic spam filtering, and tag applying;
> * automatic email triggers based on tags (such as user
> notifications, forwarding, etc.)

Obviously the above two can be scripted, but I agree that it's unsatisfying that every user needs to roll their own delivery script. While tagging and triggering are highly personal, they're not *so* personal that everyone needs a completely custom solution. This should be more approachable. I'm not sure what the best answer here is, but I don't think it requires integration with a monolithic system to do right.

> * remote RPC-like access to the whole system;

This is another deep insight of notmuch. It already has an awesome RPC interface: the CLI. Perhaps your actual problem is that the only supported remote transport protocol is SSH. This comes with a lot of benefits (authentication, RPC pipelining), but also a lot of baggage (a full SSH client on the client side). I've thought about this in the context of both an HTTP client and an Android client, and in both cases I concluded that a simple HTTPS transport wrapped around the notmuch CLI would be the way to go. Just put the CLI arguments in a POST and send the JSON on stdout back. This is trivial to prototype as a Python CGI script, easy to build as a standalone Python server, and not especially hard to build as a robust C server.

> * remote Web user interface;

A good web UI would be fantastic. Based on the rest of your email, I get the impression this was a requirements driver for much of the above, especially the integrated tagging/triggering and RPC access. I've already suggested a simple solution to the RPC problem. For tagging/triggering, it's probably worth developing a solution that allows for machine-editable rules (ideally retaining user-editableness), which would make it possible to integrate filter management into a web UI. This could be as simple as a standard delivery script that operates from some simple rule database.
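As a rough illustration of the "CLI arguments in a POST, JSON on stdout back" idea, the following is a minimal sketch of the standalone-Python-server variant. Several things here are assumptions rather than anything from this thread: the request body is a JSON list of argument strings, only a small read-only whitelist of subcommands is exposed, the client passes `--format=json` itself where it wants JSON, and TLS and authentication are handled by something in front of this process. The names `ALLOWED`, `build_command`, `NotmuchHandler` and `serve` are hypothetical.

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: expose only read-only subcommands; the list is illustrative.
ALLOWED = {"search", "show", "count"}


def build_command(args, binary="notmuch"):
    """Validate a JSON-decoded argument list and return the argv to run."""
    if (not isinstance(args, list) or not args
            or not all(isinstance(a, str) for a in args)):
        raise ValueError("expected a non-empty list of strings")
    if args[0] not in ALLOWED:
        raise ValueError("subcommand not allowed: " + args[0])
    return [binary] + args


class NotmuchHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            # Malformed JSON also raises a ValueError subclass.
            argv = build_command(json.loads(self.rfile.read(length)))
        except ValueError as exc:
            self.send_error(400, str(exc))
            return
        # No shell is involved; argv is passed to the program as-is.
        result = subprocess.run(argv, capture_output=True)
        self.send_response(200 if result.returncode == 0 else 500)
        # Assumes the caller asked notmuch for JSON output.
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(result.stdout)


def serve(host="127.0.0.1", port=8080):
    """Run the wrapper until interrupted (plain HTTP, local only)."""
    HTTPServer((host, port), NotmuchHandler).serve_forever()
```

A client would POST a body such as `["search", "--format=json", "tag:inbox"]` and receive whatever notmuch printed on stdout.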
RFC: notmuch powered (personal) (end-to-end) e-mail system
On Sun, 20 Mar 2011 16:07:50 +0200, Ciprian Dorin Craciun wrote:
> Hello all! (Sorry for the long email.)
>
> [snip]
>
> * the most feature full UI is the Emacs one -- thus limited remote
> access (I mean from an arbitrary computer with only a web-browser);
> (and I'm not a very big fan of Emacs;)
>
There have been a few attempts to put together an HTML front-end to notmuch[1]. None have made it very far, though. It would be nice to see this space filled.

> * most are still dependent on external IMAP systems -- this is not
> a problem with notmuch itself, but for the integrating clients;
>
Not entirely sure what you mean by this. You could easily use e.g. notmuch-deliver as the local delivery agent with an SMTP server, and you'd have no need for IMAP.

> * SPAM -- as above -- is not integrated;
>
Nor should it be. Mail indexing, viewing, composing, and filtering are all orthogonal parts of a mail system. It takes all of ten lines to invoke a spam filter in your filter script.

> * filtering (tag applying) is not automatic (as in integrated in
> notmuch itself or the client), but triggered through external scripts;
>
Again, there is no reason why this should be incorporated into your mail indexer.

> As such I'm thinking on implementing a custom end-to-end email
> system and I would like to hear your feedback before embarking on such
> a task.
>
Notmuch works so well for its audience because it adheres to the UNIX philosophy of "do one thing and do it well." The goal of an "integrated end-to-end" mail system might sound nice, but IMHO it's a recipe for a kludgy, unmaintainable nightmare which is mediocre at performing its task, on a good day. Perhaps I'm misunderstanding your proposal, but it seems to me like you are taking an easy, already solved problem and turning it into a difficult one.

> I'm targeting the following features:
> * (inbound) SMTP integration, thus once an email is received it is
> automatically pushed through the system; (I'm primarily targeting
> those users that afford to run their own SMTP server; but the solution
> could still be adapted for those that only want the other features;)
>
Is there something wrong with Postfix with notmuch-deliver as an LDA?

> * automatic spam filtering, and tag applying;
>
A traditional sorting script with bogofilter/spamassassin?

> * automatic email triggers based on tags (such as user
> notifications, forwarding, etc.)
>
Again, a sorting script?

> * remote RPC-like access to the whole system;
>
What's wrong with SSH?

> * remote Web user interface;
>
Nothing fills this need currently. Feel free to write up something, but please don't couple it to some all-inclusive behemoth of a project.

Personally, I would think more carefully about this project before proceeding. It sounds like you intend on reinventing various portions of the wheel several times. Nothing you have listed is difficult to do with a few scripts, notmuch, and an SMTP server.

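For readers who have not written one, a "traditional sorting script" of the kind mentioned above might look roughly like the following sketch. It is only an illustration under stated assumptions: bogofilter is invoked with the message on stdin and its exit status is used (0 for spam, 1 for non-spam, 2 for unsure), the rule table is a made-up example, and the message is assumed to be indexed already when the `notmuch tag` command is run. The names `RULES`, `SPAM_TAGS`, `classify`, `tags_for` and `tag_command` are hypothetical.

```python
import email
import subprocess

# Hypothetical rule table: (header, substring, tag). It could just as well
# live in a small file that both a user and a tool can edit.
RULES = [
    ("List-Id", "notmuch.notmuchmail.org", "notmuch"),
    ("From", "@example.org", "work"),
]

# bogofilter exit status: 0 = spam, 1 = non-spam, 2 = unsure.
SPAM_TAGS = {0: "spam", 2: "unsure"}


def classify(raw, runner=subprocess.run):
    """Feed a raw message (bytes) to bogofilter and return its exit status."""
    return runner(["bogofilter"], input=raw).returncode


def tags_for(raw, spam_code, rules=RULES):
    """Compute the tags for a message from the spam verdict and the rules."""
    msg = email.message_from_bytes(raw)
    tags = []
    if spam_code in SPAM_TAGS:
        tags.append(SPAM_TAGS[spam_code])
    for header, needle, tag in rules:
        # Case-insensitive substring match on the header value, if present.
        if needle.lower() in str(msg.get(header, "")).lower():
            tags.append(tag)
    return tags


def tag_command(message_id, tags):
    """Build the `notmuch tag` call for a message that is already indexed."""
    query = "id:" + message_id.strip().strip("<>")
    return ["notmuch", "tag"] + ["+" + t for t in tags] + ["--", query]
```

A delivery hook would call `classify`, pass the result to `tags_for`, and hand the output of `tag_command` to `subprocess.run` once the message has been indexed; triggers such as notifications or forwarding could hang off the same tag list.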
> About the overall architecture I'm thinking on adopting the following:
> * in general the whole system is decomposed in independent
> components (long-lived OS daemons) that each one does a particular job
> (see below);
> * all the components communicate between each-other through a
> message queue system (for example ZeroMQ or RabbitMQ);
> * all the communication is JSON based;
>
> The components would be:
> * SMTP inbound gateway -- for example I could take qmail or
> Postfix and replace the delivery agent with a custom process that
> pushes the email into the system; (any other solution suggestions?);
> * email store -- as the name suggests it is a simple
> key-value-like store that should persist raw email-messages; it should
> be as robust as possible, and its contents should be the only thing
> needed to reconstruct all the other derived data; (I could use here a
> simple process that maintains a maildir, I could go also with a
> BerkeleyDB wrapper, or even something more sophisticated;)
> * spam filter -- which either classifies the email or trains the
> spam filter; (for example I would use bogofilter;)
> * email index -- this is where notmuch would come into play; it
> would be fed with emails, which it would automatically apply tags and
> issue trigger notifications based on tags; it also maintains a set of
> filters and tags to automatically apply;
> * (maybe) a coordinator that should delegate and monitor requests
> to the above components; but if I'm using RabbitMQ and carefully
> designing the above components, they could drive each other;
> * restful web service that would intermediate access to all the
> above components;
>
> For now I have the following uncertainties:
> * how should I handle multiple users? I think each user should
> have it's own store / notmuch / bogofilter instance (at least in terms
> of storage if not even in terms of separate daemon);
> * should I keep the
RFC: notmuch powered (personal) (end-to-end) e-mail system
On Sun, Mar 20, 2011 at 10:07 AM, Ciprian Dorin Craciun wrote:
> I'm "struggling" for some time to get rid of the current
> "de-facto" email solutions (i.e. GMail, Zimbra), and I've passively
> observed for some time the notmuch project and community.

It sounds like what you want *is* GMail (I don't know Zimbra), just running on your own box instead of on Google's servers.

> Suggestions? Considerations?

Based on what you wrote, I think BerkeleyDB will be too limiting. I suggest you look into DBMail[1] for the mail store.

-Brett.

[1] http://www.dbmail.org/