FastMail uses xapian for real time search, can we do the same?

2016-03-24 Thread Edward Betts
I've tried using 'notmuch insert' with mutt-kz, the mutt fork that talks to
the notmuch database. Both keep trying to grab the write lock at the same time.

I tag a mail in mutt-kz and it writes the new tag to the database just as a
new mail is coming in. Either 'notmuch insert' has the lock and my mail client
pauses while it retries, or 'notmuch insert' isn't able to get the write lock,
so it just saves the mail to the maildir and the mail doesn't get indexed
until I later run 'notmuch new'.

I guess many people here are using the Emacs client and the problem is partly
avoided because there is a separate thread in the background trying to grab
the write lock and save tag changes. Is that correct?

I wonder if the solution is real time search? Xapian can search across
multiple databases. New mails and tag changes could be written to new
databases: one for inserting new mails and another for changes to the tags on
existing mails. Then there are no problems with write locks on the main
database. These databases will be much smaller, so the writes should be
faster. A process could run once a day to merge the databases.
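
To make the proposal concrete, here is a minimal Python sketch of the idea
using plain dictionaries rather than Xapian itself. The names Index, search
and compact are hypothetical, and the rule that tag changes override older
entries is my assumption about how such an overlay would have to behave; it is
not something Xapian's multi-database search does for you.

```python
from dataclasses import dataclass, field


@dataclass
class Index:
    """Toy stand-in for one database: message id -> set of tags."""
    docs: dict[str, set[str]] = field(default_factory=dict)


def search(tag: str, main: Index, new_mail: Index, tag_changes: Index) -> set[str]:
    """Return ids of messages carrying `tag`, consulting all three indexes."""
    # Later indexes win: new mail extends the main index, and recorded
    # tag changes override whatever the older indexes say (an assumption).
    merged = {**main.docs, **new_mail.docs, **tag_changes.docs}
    return {msg_id for msg_id, tags in merged.items() if tag in tags}


def compact(main: Index, new_mail: Index, tag_changes: Index) -> None:
    """Daily merge: fold the two small indexes into the main one."""
    main.docs.update(new_mail.docs)
    main.docs.update(tag_changes.docs)
    new_mail.docs.clear()
    tag_changes.docs.clear()
```

In this model the delivery process only ever writes to new_mail and the mail
client only ever writes to tag_changes, so the two never wait on each other.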

FastMail have implemented real time search for email with xapian.

Blog post: https://blog.fastmail.com/2014/12/01/email-search-system/

Implementation: https://github.com/brong/cyrus-imapd/tree/fastmail

Is there any enthusiasm for adding real time search to notmuch?
-- 
Edward.
___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH] Don't bother checking for mbox files

2016-03-13 Thread Edward Betts
Keith Packard wrote:
> Postfix adds mbox-style From lines when used in combination with
> maildrop or .forward files. If they have another line starting with
> 'From ' in them, notmuch complains about them not being mail files.
> 
> If we assume the user hasn't screwed up and misconfigured their mail
> system, then we can safely ignore whether the file started with an
> mbox header and just parse it as a single-message file.

I think it is fine to go ahead with this change. At the same time the
behaviour of Postfix should be corrected so it doesn't add mbox-style From
lines to mails in maildir format.

The same problem existed in the Debian exim4 config. I filed a bug and it was
fixed: https://bugs.debian.org/769396

Here is a bug in maildrop suggesting that it should strip the mbox-style From
line from the top of mails: https://bugs.debian.org/737383

-- 
Edward.


exim config for 'notmuch insert' retry

2014-12-16 Thread Edward Betts
Short version: exim has an option to retry delivery when notmuch insert fails

Long version:

I'm using maildrop and 'notmuch insert' to sort and tag my incoming mail.
Sometimes 'notmuch insert' isn't able to write to the notmuch index because
another process is holding the notmuch write lock. At first when this happened
exim would just bounce the message.

I could use the '--keep' option so 'notmuch insert' would write the message to
the maildir, but not index it. It would later be picked up by running 'notmuch
new' from cron. The problem is that the initial tagging is defined in the
maildrop config. I would need to specify my tagging rules in two different
places.

For me, a better approach would be if the undeliverable message could be added
to a queue, ready for another delivery attempt. I realised that this is exactly
what an MTA like exim is designed to do. I added the 'temp_errors = 1' option
to the maildrop pipe transport. This tells exim to treat an exit status of 1
from the command as a temporary error, so it defers the message and retries
the delivery later instead of bouncing it.
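
As a rough illustration of what the option changes, here is a small Python
model of how a pipe transport might classify the command's exit status. The
function name classify_exit is hypothetical. The default set of 75 and 73
(EX_TEMPFAIL and EX_CANTCREAT) reflects my reading of the exim documentation,
and the model ignores other pipe transport options such as ignore_status.

```python
def classify_exit(status: int,
                  temp_errors: frozenset[int] = frozenset({75, 73})) -> str:
    """Model the outcome of a pipe delivery from the command's exit status."""
    if status == 0:
        return "delivered"
    if status in temp_errors:
        # Temporary failure: the message stays on the queue for a retry.
        return "defer"
    # Any other non-zero status is a permanent failure: the message bounces.
    return "fail"
```

With the default list an exit status of 1 gives "fail", which matches the
bounces I was seeing. With temp_errors = 1 the same status gives "defer".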

I'm sharing this information in case other people find it useful.

Here is the complete maildrop pipe transport from my exim4.conf:

maildrop_pipe:
  driver = pipe
  path = "/bin:/usr/bin:/usr/local/bin"
  command = "/usr/bin/maildrop"
  temp_errors = 1
  message_prefix =
  message_suffix =
  log_defer_output = true
  return_path_add
  delivery_date_add
  envelope_to_add

-- 
Edward.


exim pipe transport, notmuch insert and mbox-style messages

2014-11-29 Thread Edward Betts
David Bremner wrote:
> Edward Betts writes:
> > I wonder if 'notmuch insert' could be modified to detect and drop the From_
> > line before writing the message to disk and index it. It could do that
> > silently or with a warning.
> 
> I don't know about the larger question(s), but I'd suggest just escaping
> it to something like X-Envelope-From: . There may even be some
> semi-standard header to use for this.

The main piece of information in the From_ line is the return path address,
and a properly configured MTA will already add that as a Return-path header.

RFC 2821 says the Return-path header is added to a message "when the
delivery SMTP server makes the final delivery".

I don't think the From_ line needs to be captured into an X-Envelope-From
header. We could make 'notmuch insert' add the 'Return-path' header to
messages when it is missing but the From_ line is present. This is probably
too much complexity.
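
For what it's worth, a minimal Python sketch of that conversion might look
like the following. The function name is hypothetical, it only handles LF line
endings, and treating a MAILER-DAEMON sender as an empty return path is my
assumption. None of this is existing notmuch behaviour.

```python
def from_line_to_return_path(message: bytes) -> bytes:
    """Replace a leading mbox-style From_ line with a Return-Path header."""
    first, _, rest = message.partition(b"\n")
    if not first.startswith(b"From "):
        return message  # no From_ line, leave the message alone

    # Look for an existing Return-Path header in the header block only.
    header_block = rest.split(b"\n\n", 1)[0]
    has_return_path = any(
        line.lower().startswith(b"return-path:")
        for line in header_block.split(b"\n")
    )
    fields = first.split()
    if has_return_path or len(fields) < 2:
        return rest  # just drop the From_ line

    # Assumption: MAILER-DAEMON stands for the empty (bounce) return path.
    address = b"" if fields[1] == b"MAILER-DAEMON" else fields[1]
    return b"Return-Path: <" + address + b">\n" + rest
```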

-- 
Edward.


exim pipe transport, notmuch insert and mbox-style messages

2014-11-22 Thread Edward Betts
My mail arrives on a Debian machine running exim. It was being filtered
by procmail then piped into 'notmuch insert'. This was generating the
single-message mbox warning:

> Warning: ...  is an mbox containing a single message,
> likely caused by misconfigured mail delivery.  Support for single-message
> mboxes is deprecated and may be removed in the future.

I thought it was a problem with procmail delivering mbox style messages to
pipes, so I switched to maildrop. Same problem. This warning is produced
because there is an mbox-style From_ line before the first header line. Here
is an example:

> From MAILER-DAEMON Fri Jul  8 12:08:34 2011

The man page for maildrop says this style of message isn't supported, but it
doesn't detect and warn about them. It will pipe them into commands without
removing the From_ line.

Reading the exim documentation for pipe transports I found the message_prefix
option, which by default is set like this:

> message_prefix = \
>   From ${if def:return_path{$return_path}{MAILER-DAEMON}}\
>   ${tod_bsdinbox}\n

This means that exim turns all messages into mbox style messages before
delivering them to a pipe. The justification given in the documentation is
"This is required by the commonly used /usr/bin/vacation program."

The fix is to unset the exim message_prefix option for delivery to maildrop.
Here is an example of the Debian exim maildrop_pipe transport with a blank
message_prefix to stop the From_ line being added before the message is piped
to maildrop.

> maildrop_pipe:
>   debug_print = "T: maildrop_pipe for $local_part@$domain"
>   driver = pipe
>   path = "/bin:/usr/bin:/usr/local/bin"
>   command = "/usr/bin/maildrop"
>   message_prefix =
>   message_suffix =
>   return_path_add
>   delivery_date_add
>   envelope_to_add

The same fix could be applied for procmail pipe or a pipe from exim to
'notmuch insert'.

I submitted a Debian bug for exim4-config with my change as a patch. The
maintainer has accepted my patch and uploaded a release of exim4-config with
the fix to Debian experimental.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=769396

I wonder if 'notmuch insert' could be modified to detect and drop the From_
line before writing the message to disk and indexing it. It could do that
silently or with a warning.
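
Here is a rough Python sketch of such a filter, written as something that
could sit in a pipe in front of 'notmuch insert'. The function name and the
warning text are hypothetical. It relies on the fact that a real header line
starts with "From:" (with a colon), while the mbox-style line starts with
"From " followed by a space.

```python
import shutil
from typing import BinaryIO, Optional, TextIO


def drop_from_line(src: BinaryIO, dst: BinaryIO,
                   warn: Optional[TextIO] = None) -> bool:
    """Copy a message from src to dst, dropping a leading From_ line.

    Returns True if a From_ line was dropped. If `warn` is given, a
    warning is written to it; otherwise the line is dropped silently.
    """
    first = src.readline()
    dropped = first.startswith(b"From ")
    if dropped:
        if warn is not None:
            print("warning: dropped mbox-style From_ line", file=warn)
    else:
        dst.write(first)
    # The rest of the message is copied through unchanged.
    shutil.copyfileobj(src, dst)
    return dropped
```

Called with sys.stdin.buffer, sys.stdout.buffer and sys.stderr this would work
as a command line filter.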

-- 
Edward.


Synchronising mail and notmuch tags between machines

2014-11-12 Thread Edward Betts
I've written some code to synchronise mail between my server and laptop. I
have notmuch running on both machines. Whenever mail is added to notmuch, it
is tagged as needing to be copied to the other machine. The tags are
sync-to-laptop and sync-to-server. Whenever I modify the tags on a mail I'm
careful to add the sync tag. My mail reader is configured to add the sync tag
whenever I make any changes to a mail.

Here is the code:

https://github.com/EdwardBetts/notmuch-pushy/

The synchronise code opens each database and checks for messages that need to
be synchronised. The tags are copied to the other database, and if the mail is
new then the content is copied as well. Then the sync tag is removed from the
source message.
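
The steps above can be sketched in Python with two dictionaries standing in
for the notmuch databases. This is a simplified model and not the code from
the repository: Message, Database and push are hypothetical names, and it
makes no attempt to handle a message that was changed on both machines.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    content: bytes
    tags: set[str] = field(default_factory=set)


# Toy stand-in for a notmuch database: message id -> message.
Database = dict[str, Message]


def push(source: Database, target: Database, sync_tag: str) -> int:
    """Copy every message carrying sync_tag from source to target.

    Returns the number of messages synchronised.
    """
    synced = 0
    for msg_id, msg in source.items():
        if sync_tag not in msg.tags:
            continue
        tags = msg.tags - {sync_tag}
        if msg_id in target:
            # Known mail: only the tags need to be copied.
            target[msg_id].tags = set(tags)
        else:
            # New mail: copy the content as well.
            target[msg_id] = Message(msg.content, set(tags))
        # Done, so clear the sync tag on the source message.
        msg.tags.discard(sync_tag)
        synced += 1
    return synced
```

A full run would call push(server, laptop, "sync-to-laptop") and then
push(laptop, server, "sync-to-server").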

I'm using a Python RPC module called pushy, which provides a simple way to
connect two Python interpreters on different machines. Pushy provides proxy
objects to access remote objects as if they are local. This means I can
access the local and remote notmuch databases from within the same piece of
code. All the communication happens over ssh, and pushy even provides methods
for copying files between the machines.

https://pythonhosted.org/pushy/
https://github.com/pushyrpc/pushy

My main todo item is reducing the amount of time that the write lock is held.
I should copy messages back and forth before grabbing the write lock to update
the tags.

It would be helpful if notmuch would always add the sync tag when a
message was modified, unless the sync tag is explicitly being removed. That
way there is no risk of me forgetting to add the sync tag when I'm modifying
tags using the command line tools.
-- 
Edward.

