Re: [PATCH 0/3] Speed up notmuch new for unchanged directories
Quoth Sascha Silbe on Jun 26 at 12:13 am:
> Austin Clements writes:
> > On Sun, 24 Jun 2012, Sascha Silbe wrote:
> > ["notmuch new" listing every directory, even if it's unchanged]
> >
> > I haven't looked over your patches yet, but this result surprises me.
> > Could you explain your setup a little more? How much mail do you have
> > and across how many directories? What file system are you using?
>
> As mentioned in passing already, I have a total of about 900k unique
> mails (sometimes several copies of them, received over different paths,
> e.g. mailing list and a direct CC). Most of that is "old" mails, in
> directories that are not getting updated. If notmuch would support mbox,
> I'd use that instead for those old mails. The total number of
> directories in the mail store is about 29k and the total number of files
> (including the git repository and mbox files that sup used) is about
> 1.25M.
>
> Since a housekeeping job last weekend, the number of mails in
> directories that are still getting updated is about 4k, i.e. about 5‰ of
> the total number of mails or 3‰ of the total number of files. The number
> of directories getting updated is 104, i.e. about 4‰ of the total number
> of directories.
>
> Ideally, we'd get the run-time of "notmuch new" down by a similar
> factor. With just plain POSIX and no additional information that won't
> be possible, but providing a way to channel information about updates
> into notmuch (rather than having it scan everything over and over again)
> should help. That information is already available as output from the
> mail fetching process (rsync in my case). Of course, it would be purely
> optional: "notmuch new" without additional information would simply
> continue to scan everything.

This would be great. I've been thinking along similar lines for a
while (in my case, I want to feed notmuch new from inotify), though I
haven't written any code for it.

> > I'm also surprised that your new approach helps. This directory listing
> > has to be read off disk one way or the other, but listing directories is
> > the bread-and-butter of file systems, whereas I would think that Xapian
> > would require more IO to accomplish the same effect.
>
> "notmuch new" needs to iterate over a list of all directories to find
> those with new mails (and potentially new subdirectories). However, it
> does not need to list the *contents* of those folders. I'm surprised as
> well, but rather in the opposite direction: Based on a naive
> calculation, we'd expect to see a speedup on the order of
> (1.25M+29k)/29k = 44. The actual results suggest that stat()ing (done
> 29k times both before and after the patch) is taking about 19 times as
> long as listing a directory entry (before the patch we listed 1M
> entries, now we list none if nothing has changed). (*)

For a cold cache, these aren't the numbers that matter. With an HDD
and how few files your directories contain on average, only seeks will
matter. I would expect your workload without your patch to have at
least 1 but closer to 2 seeks per directory: one to stat the directory
and one to get the directory contents block. Some of the stat seeks
will be eliminated by the buffer cache, even starting cold, because of
inode locality (absolute best case is 16x reduction, but if you
created the directories over time, then this locality is probably
quite poor). There are a few other potential seeks to get the
directory document from Xapian and to get its mtime value, but those
should exhibit strong locality, so they probably don't contribute
much.

NewEgg says your drive has an average seek time of 8.9ms, so with 29k
directories and assuming your directories are sequential on disk,
that's at least 258s and closer to 512s, which agrees with your
benchmark results.

I'm surprised by your results because I would expect your workload
with your patches to exhibit about the same number of seeks: one to
stat the directory (same as before) and one for
notmuch_directory_get_child_files, which has to seek in the term index
to get the child directories. My guess is that this exhibits better
locality because the child directory terms are stored contiguously in
the database's key space (though not necessarily sequentially on disk
unless this is a fresh database). Unfortunately, I'm not sure of a
good way to test this hypothesis. Any thoughts?
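
The seek estimate and the naive speedup ratio discussed in this message can be reproduced with a few lines of C. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the inputs (29k directories, 1 to 2 seeks per directory, 8.9 ms average seek time, 1.25M files) are simply the figures quoted in the thread.

```c
#include <stdio.h>

/* Estimated seconds spent seeking on a cold cache.
 *
 * directories:         number of directories visited by the scan
 * seeks_per_directory: assumed seeks per directory (1 to 2 in the thread)
 * seek_ms:             average seek time of the drive, in milliseconds
 */
double
estimate_seek_seconds (double directories, double seeks_per_directory,
                       double seek_ms)
{
    return directories * seeks_per_directory * seek_ms / 1000.0;
}

/* Naive speedup if only directories are stat()ed instead of every file
 * being listed as well: (files + directories) / directories. */
double
naive_speedup (double files, double directories)
{
    return (files + directories) / directories;
}

/* Print the estimates for the numbers quoted in the thread. */
void
print_estimates (void)
{
    printf ("1 seek/dir:  %.1f s\n", estimate_seek_seconds (29000, 1.0, 8.9));
    printf ("2 seeks/dir: %.1f s\n", estimate_seek_seconds (29000, 2.0, 8.9));
    printf ("naive speedup: %.1f\n", naive_speedup (1250000, 29000));
}
```

With these inputs, one seek per directory works out to about 258 s and two seeks to about 516 s, in the same range as the figures quoted above, and the naive speedup comes out at roughly 44.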
[RFC PATCH 14/14] new: Add scan support for mbox:// URIs
A lot of code is duplicated from maildir, I don't think I handled all
those errors correctly, and I didn't report any progress.

Signed-off-by: Ethan Glasser-Camp
---
 notmuch-new.c | 299 +++--
 1 file changed, 289 insertions(+), 10 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 1bf4e25..36fee34 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -19,6 +19,7 @@
  */
 
 #include "notmuch-client.h"
+#include
 
 #include
 
@@ -239,16 +240,6 @@ _entry_in_ignore_list (const char *entry, add_files_state_t *state)
     return FALSE;
 }
 
-/* Call out to the appropriate add_files function, based on the URI. */
-static notmuch_status_t
-add_files_uri (unused(notmuch_database_t *notmuch),
-               unused(const char *uri),
-               unused(add_files_state_t *state))
-{
-    /* Stub for now */
-    return NOTMUCH_STATUS_SUCCESS;
-}
-
 /* Progress-reporting function.
  *
  * Can be used by any mailstore-crawling function that wants to alert
@@ -674,6 +665,294 @@ add_files (notmuch_database_t *notmuch,
     return ret;
 }
 
+/* Scan an mbox file for messages.
+ *
+ * We assume that mboxes grow monotonically only.
+ *
+ * The mtime of the mbox file is stored in a "directory" document in
+ * Xapian.
+ */
+static notmuch_status_t
+add_messages_mbox_file (notmuch_database_t *notmuch,
+                        const char *path,
+                        add_files_state_t *state)
+{
+    notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS, status;
+    struct stat st;
+    time_t fs_mtime, db_mtime, stat_time;
+    FILE *mbox;
+    char *line, *path_uri = NULL, *message_uri = NULL;
+    int line_len;
+    size_t offset, end_offset, line_size = 0;
+    notmuch_directory_t *directory;
+    int content_length = -1, is_headers;
+
+    if (stat (path, &st)) {
+        fprintf (stderr, "Error reading mbox file %s: %s\n",
+                 path, strerror (errno));
+        return NOTMUCH_STATUS_FILE_ERROR;
+    }
+
+    stat_time = time (NULL);
+    if (! S_ISREG (st.st_mode)) {
+        fprintf (stderr, "Error: %s is not a file.\n", path);
+        return NOTMUCH_STATUS_FILE_ERROR;
+    }
+
+    fs_mtime = st.st_mtime;
+
+    path_uri = talloc_asprintf (notmuch, "mbox://%s", path);
+    status = notmuch_database_get_directory (notmuch, path_uri, &directory);
+    if (status) {
+        ret = status;
+        goto DONE;
+    }
+    db_mtime = directory ? notmuch_directory_get_mtime (directory) : 0;
+
+    if (directory && db_mtime == fs_mtime) {
+        goto DONE;
+    }
+
+    mbox = fopen (path, "r");
+    if (mbox == NULL) {
+        fprintf (stderr, "Error: couldn't open %s for reading.\n",
+                 path);
+        ret = NOTMUCH_STATUS_FILE_ERROR;
+        goto DONE;
+    }
+
+    line_len = getline (&line, &line_size, mbox);
+
+    if (line_len == -1) {
+        fprintf (stderr, "Error: reading from %s failed: %s\n",
+                 path, strerror (errno));
+        ret = NOTMUCH_STATUS_FILE_ERROR;
+        goto DONE;
+    }
+
+    if (strncmp (line, "From ", 5) != 0) {
+        fprintf (stderr, "Note: Ignoring non-mbox file: %s\n",
+                 path);
+        ret = NOTMUCH_STATUS_FILE_ERROR;
+        goto DONE;
+    }
+    free(line);
+    line = NULL;
+
+    /* Loop invariant: At the beginning of the loop, we have just read
+     * a From_ line, but haven't yet read any of the headers.
+     */
+    while (! feof (mbox)) {
+        is_headers = 1;
+        offset = ftell (mbox);
+        content_length = -1;
+
+        /* Read lines until we either get to the next From_ header, or
+         * we find a Content-Length header (mboxcl) and we run out of headers.
+         */
+        do {
+            /* Get the offset before we read, in case we got another From_ header. */
+            end_offset = ftell (mbox);
+
+            line_len = getline (&line, &line_size, mbox);
+
+            /* Check to see if this line is a content-length header,
+             * or the end of the headers. */
+            if (is_headers && strncasecmp (line, "Content-Length: ",
+                                           strlen ("Content-Length: ")) == 0)
+                content_length = strtol (line + strlen ("Content-Length: "),
+                                         NULL, 10);
+
+            if (is_headers && strlen (line) == 1 && *line == '\n') {
+                is_headers = 0;
+                /* If we got a content_length, skip the message body. */
+                if (content_length != -1) {
+                    fseek (mbox, content_length, SEEK_CUR);
+                    line_len = getline (&line, &line_size, mbox);
+
+                    /* We should be at the end of the message. Sanity
+                     * check: there should be a blank line, and then
+                     * another From_ header. */
+                    if (strlen (line) != 1 || *line != '\n') {
+                        fprintf (stderr, "Warning: message with Content-Length not "
+                                 "immediately
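
The core idea of the scan above, locating each message by the byte offset that follows its From_ separator line, can be illustrated on an in-memory buffer. This is a simplified sketch, not the patch's code: the name find_mbox_messages is hypothetical, any line starting with "From " is treated as a separator, and Content-Length handling and ">From" quoting are ignored.

```c
#include <stddef.h>
#include <string.h>

/* Scan an mbox held in memory and record, for each message, the offset
 * of the first byte after its "From " separator line.
 *
 * Returns the number of messages found; at most max_offsets offsets
 * are stored in 'offsets'. */
size_t
find_mbox_messages (const char *buf, size_t len,
                    size_t *offsets, size_t max_offsets)
{
    size_t count = 0;
    size_t pos = 0;

    while (pos < len) {
        /* Find the end of the current line (or the end of the buffer). */
        const char *nl = memchr (buf + pos, '\n', len - pos);
        size_t next = nl ? (size_t) (nl - buf) + 1 : len;

        /* A line starting with "From " begins a new message. */
        if (len - pos >= 5 && memcmp (buf + pos, "From ", 5) == 0) {
            if (count < max_offsets)
                offsets[count] = next;
            count++;
        }
        pos = next;
    }

    return count;
}
```

A message's length would then be the distance from its offset to the start of the next separator line (or to the end of the file), which is the offset+length pair the mbox:// URIs in this series carry in their fragment.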
[RFC PATCH 13/14] Tests for mbox support
These need to be improved, rather than hard-coding byte offsets.

Signed-off-by: Ethan Glasser-Camp
---
 test/mbox         | 59 +
 test/notmuch-test |  1 +
 2 files changed, 60 insertions(+)
 create mode 100755 test/mbox

diff --git a/test/mbox b/test/mbox
new file mode 100755
index 000..f03f887
--- /dev/null
+++ b/test/mbox
@@ -0,0 +1,59 @@
+#!/usr/bin/env bash
+#
+# Copyright (c) 2005 Junio C Hamano
+#
+
+test_description='basic mbox support'
+. ./test-lib.sh
+
+mkdir -p $MAIL_DIR/some-mboxes/subdir $MAIL_DIR/database $MAIL_DIR/corpus
+
+# The Content-Length headers here include the final newline (added later).
+generate_message '[body]="Mbox message 1."' '[header]="Content-Length: 16"' "[dir]=corpus"
+generate_message '[body]="Mbox message 2. Longer."' '[header]="Content-Length: 24"' "[dir]=corpus"
+generate_message '[body]="Mbox message 3."' "[dir]=corpus"
+generate_message '[body]="Mbox message 4."' "[dir]=corpus"
+generate_message '[body]="Mbox message 5. Last message."' '[header]="Content-Length: 30"' "[dir]=corpus"
+
+MBOX1=$MAIL_DIR/some-mboxes/first.mbox
+for x in $MAIL_DIR/corpus/*; do
+    echo "From MAILER-DAEMON Sat Jan 3 01:05:34 1996" >> $MBOX1
+    cat $x >> $MBOX1
+    # Final newline
+    echo >> $MBOX1
+done
+
+notmuch config set database.path $MAIL_DIR/database
+notmuch config set new.scan mbox://$MAIL_DIR/some-mboxes
+
+test_begin_subtest "read a small mbox (5 messages)"
+output=$(NOTMUCH_NEW)
+test_expect_equal "$output" "Added 5 new messages to the database."
+
+test_begin_subtest "search"
+output=$(notmuch search '*' | notmuch_search_sanitize)
+test_expect_equal "$output" "thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; Test message #1 (inbox unread)
+thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; Test message #2 (inbox unread)
+thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; Test message #3 (inbox unread)
+thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; Test message #4 (inbox unread)
+thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; Test message #5 (inbox unread)"
+
+test_begin_subtest "show (mboxcl)"
+output=$(notmuch show "Test message #1" | grep -o "filename:[^ ]*")
+test_expect_equal "$output" "filename:mbox://$MAIL_DIR/some-mboxes/first.mbox#44+246"
+
+test_begin_subtest "show doesn't append an extra space at the end (mboxcl)"
+output=$(notmuch show --format=raw "Test message #1" )
+original=$(cat $MAIL_DIR/corpus/msg-001)
+test_expect_equal "$output" "$original"
+
+test_begin_subtest "show (non-cl)"
+output=$(notmuch show "Test message #3" | grep -o "filename:[^ ]*")
+test_expect_equal "$output" "filename:mbox://$MAIL_DIR/some-mboxes/first.mbox#634+227"
+
+test_begin_subtest "show doesn't append an extra space at the end (non-cl)"
+output=$(notmuch show --format=raw "Test message #3" )
+original=$(cat $MAIL_DIR/corpus/msg-003)
+test_expect_equal "$output" "$original"
+
+test_done
diff --git a/test/notmuch-test b/test/notmuch-test
index bfad5d3..8cbb2cd 100755
--- a/test/notmuch-test
+++ b/test/notmuch-test
@@ -47,6 +47,7 @@ TESTS="
   emacs-large-search-buffer
   emacs-subject-to-filename
   maildir-sync
+  mbox
   crypto
   symbol-hiding
   search-folder-coherence
-- 
1.7.9.5
[RFC PATCH 12/14] mailstore: support for mbox:// URIs
Signed-off-by: Ethan Glasser-Camp
---
 lib/mailstore.c | 85 +++
 1 file changed, 85 insertions(+)

diff --git a/lib/mailstore.c b/lib/mailstore.c
index ae02c12..e8d9bc1 100644
--- a/lib/mailstore.c
+++ b/lib/mailstore.c
@@ -19,6 +19,7 @@
  */
 #include
 #include
+#include
 
 #include "notmuch-private.h"
 
@@ -28,6 +29,74 @@ notmuch_mailstore_basic_open (const char *filename)
     return fopen (filename, "r");
 }
 
+/* Since we have to return a FILE*, we use fmemopen to turn buffers
+ * into FILE* streams. But when we close these streams, we have to
+ * free() the buffers. Use a hash to associate the two.
+ */
+static GHashTable *_mbox_files_to_strings = NULL;
+
+static void
+_ensure_mbox_files_to_strings () {
+    if (_mbox_files_to_strings == NULL)
+        _mbox_files_to_strings = g_hash_table_new (NULL, NULL);
+}
+
+static FILE *
+notmuch_mailstore_mbox_open (UriUriA *uri)
+{
+    FILE *ret = NULL, *mbox = NULL;
+    char *filename, *message, *length_s;
+    const char *error;
+    long int offset, length, this_read;
+    _ensure_mbox_files_to_strings ();
+
+    offset = strtol (uri->fragment.first, &length_s, 10);
+    length = strtol (length_s+1, NULL, 10);
+
+    filename = talloc_strndup (NULL, uri->pathHead->text.first-1,
+                               uri->pathTail->text.afterLast-uri->pathHead->text.first+1);
+
+    if (filename == NULL)
+        goto DONE;
+
+    mbox = fopen (filename, "r");
+    if (mbox == NULL) {
+        fprintf (stderr, "Couldn't open message %s: %s.\n", uri->scheme.first,
+                 strerror (errno));
+        goto DONE;
+    }
+
+    message = talloc_array (NULL, char, length);
+    fseek (mbox, offset, SEEK_SET);
+
+    this_read = fread (message, sizeof(char), length, mbox);
+    if (this_read != length) {
+        if (feof (mbox))
+            error = "end of file reached";
+        if (ferror (mbox))
+            error = strerror (ferror (mbox));
+
+        fprintf (stderr, "Couldn't read message %s: %s.\n", uri->scheme.first, error);
+        goto DONE;
+    }
+
+    ret = fmemopen (message, length, "r");
+    if (ret == NULL) {
+        /* No fclose will ever be called, so let's free message now */
+        talloc_free (message);
+        goto DONE;
+    }
+
+    g_hash_table_insert (_mbox_files_to_strings, ret, message);
+DONE:
+    if (filename)
+        talloc_free (filename);
+    if (mbox)
+        fclose (mbox);
+
+    return ret;
+}
+
 FILE *
 notmuch_mailstore_open (const char *filename)
 {
@@ -57,6 +126,14 @@ notmuch_mailstore_open (const char *filename)
         goto DONE;
     }
 
+    if (0 == strncmp (parsed.scheme.first, "mbox",
+                      parsed.scheme.afterLast-parsed.scheme.first)) {
+        /* mbox URI of the form mbox:///path/to/file#offset+length.
+         * Just pass the parsed URI. */
+        ret = notmuch_mailstore_mbox_open (&parsed);
+        goto DONE;
+    }
+
 DONE:
     uriFreeUriMembersA (&parsed);
     return ret;
@@ -65,5 +142,13 @@ DONE:
 int
 notmuch_mailstore_close (FILE *file)
 {
+    char *file_buffer;
+    if (_mbox_files_to_strings != NULL) {
+        file_buffer = g_hash_table_lookup (_mbox_files_to_strings, file);
+        if (file_buffer != NULL) {
+            talloc_free (file_buffer);
+        }
+        g_hash_table_remove (_mbox_files_to_strings, file);
+    }
     return fclose (file);
 }
-- 
1.7.9.5
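
The patch reads the "offset+length" fragment of an mbox:// URI with two unchecked strtol calls. As an illustration only, a stricter parse might look like the sketch below. The function name parse_mbox_fragment is hypothetical, and the sketch assumes the fragment is available as a NUL-terminated string, which is not how the patch receives it from the URI parser.

```c
#include <stdlib.h>

/* Parse a fragment of the form "OFFSET+LENGTH" (e.g. "44+246").
 *
 * Returns 0 and fills in *offset and *length on success, or -1 if the
 * fragment is missing, malformed, or contains negative values. */
int
parse_mbox_fragment (const char *fragment, long *offset, long *length)
{
    char *end;

    if (fragment == NULL || *fragment == '\0')
        return -1;

    /* First number: must be followed by the '+' separator. */
    *offset = strtol (fragment, &end, 10);
    if (end == fragment || *end != '+')
        return -1;

    /* Second number: must run to the end of the string. */
    fragment = end + 1;
    *length = strtol (fragment, &end, 10);
    if (end == fragment || *end != '\0')
        return -1;

    if (*offset < 0 || *length < 0)
        return -1;

    return 0;
}
```

Checking the end pointer after each strtol call is what distinguishes "44+246" from inputs such as "44" or "44+", which the unchecked version would silently accept.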
[RFC PATCH 11/14] notmuch-new: pull out useful bits of add_files_recursive
This is part of notmuch-new refactor phase 1: make add_files stuff
safe for other backends.

add_files_recursive is essentially a maildir-crawling function that
periodically adds files to the database or adds filenames to
remove_files or remove_directory lists. I don't see an easy way to
adapt add_files_recursive for other backends who might not have
concepts of directories with other directories inside of them, so
instead just provide an add_files method for each backend.

This patch pulls some bits out of add_files_recursive which will be
useful for other backends: two reporting functions
_report_before_adding_file and _report_added_file, as well as
_add_message, which actually does the message adding.

Signed-off-by: Ethan Glasser-Camp
---
 notmuch-new.c | 192 +++--
 1 file changed, 119 insertions(+), 73 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 57b27bf..1bf4e25 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -249,6 +249,122 @@ add_files_uri (unused(notmuch_database_t *notmuch),
     return NOTMUCH_STATUS_SUCCESS;
 }
 
+/* Progress-reporting function.
+ *
+ * Can be used by any mailstore-crawling function that wants to alert
+ * users what message it's about to add. Subsequent errors will be due
+ * to this message ;)
+ */
+static void
+_report_before_adding_file (add_files_state_t *state, const char *filename)
+{
+    state->processed_files++;
+
+    if (state->verbose) {
+        if (state->output_is_a_tty)
+            printf("\r\033[K");
+
+        printf ("%i/%i: %s",
+                state->processed_files,
+                state->total_files,
+                filename);
+
+        putchar((state->output_is_a_tty) ? '\r' : '\n');
+        fflush (stdout);
+    }
+}
+
+/* Progress-reporting function.
+ *
+ * Call this to respond to the signal handler for SIGALRM.
+ */
+static void
+_report_added_file (add_files_state_t *state)
+{
+    if (do_print_progress) {
+        do_print_progress = 0;
+        generic_print_progress ("Processed", "files", state->tv_start,
+                                state->processed_files, state->total_files);
+    }
+}
+
+
+/* Atomically handles adding a message to the database.
+ *
+ * Should be used by any mailstore-crawling function that finds a new
+ * message to add.
+ */
+static notmuch_status_t
+_add_message (add_files_state_t *state, notmuch_database_t *notmuch,
+              const char *filename)
+{
+    notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
+    notmuch_message_t *message;
+    const char **tag;
+
+    status = notmuch_database_begin_atomic (notmuch);
+    if (status) {
+        ret = status;
+        goto DONE;
+    }
+
+    status = notmuch_database_add_message (notmuch, filename, &message);
+
+    switch (status) {
+    /* success */
+    case NOTMUCH_STATUS_SUCCESS:
+        state->added_messages++;
+        notmuch_message_freeze (message);
+        for (tag=state->new_tags; *tag != NULL; tag++)
+            notmuch_message_add_tag (message, *tag);
+        if (state->synchronize_flags == TRUE)
+            notmuch_message_maildir_flags_to_tags (message);
+        notmuch_message_thaw (message);
+        break;
+    /* Non-fatal issues (go on to next file) */
+    case NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID:
+        if (state->synchronize_flags == TRUE)
+            notmuch_message_maildir_flags_to_tags (message);
+        break;
+    case NOTMUCH_STATUS_FILE_NOT_EMAIL:
+        fprintf (stderr, "Note: Ignoring non-mail file: %s\n",
+                 filename);
+        break;
+    /* Fatal issues. Don't process anymore. */
+    case NOTMUCH_STATUS_READ_ONLY_DATABASE:
+    case NOTMUCH_STATUS_XAPIAN_EXCEPTION:
+    case NOTMUCH_STATUS_OUT_OF_MEMORY:
+        fprintf (stderr, "Error: %s. Halting processing.\n",
+                 notmuch_status_to_string (status));
+        ret = status;
+        goto DONE;
+    default:
+    case NOTMUCH_STATUS_FILE_ERROR:
+    case NOTMUCH_STATUS_NULL_POINTER:
+    case NOTMUCH_STATUS_TAG_TOO_LONG:
+    case NOTMUCH_STATUS_UNBALANCED_FREEZE_THAW:
+    case NOTMUCH_STATUS_UNBALANCED_ATOMIC:
+    case NOTMUCH_STATUS_LAST_STATUS:
+        INTERNAL_ERROR ("add_message returned unexpected value: %d", status);
+        ret = status;
+        goto DONE;
+    }
+
+    status = notmuch_database_end_atomic (notmuch);
+    if (status) {
+        ret = status;
+        goto DONE;
+    }
+
+  DONE:
+    if (message) {
+        notmuch_message_destroy (message);
+        message = NULL;
+    }
+
+    return ret;
+}
+
 /* Examine 'path' recursively as follows:
  *
  *   o Ask the filesystem for the mtime of 'path' (fs_mtime)
@@ -300,7 +416,6 @@ add_files (notmuch_database_t *notmuch,
     char *next = NULL, *path_uri = NULL;
     time_t fs_mtime, db_mtime;
     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
-    notmuch_message_t *message = NULL;
     struct dirent **fs_entries = NULL;
     int i, num_fs_entries = 0, entry_type;
     notmuch_directory_t *directory;
@@ -309,7 +
[RFC PATCH 10/14] new: add "scan" option
This is just a quick hack to get started on adding an mbox backend.

The fact that the default maildir is scanned "automagically" is a
little weird, but it doesn't do any harm unless you decide to put mail
there that you really don't want indexed.

Signed-off-by: Ethan Glasser-Camp
---
 notmuch-client.h |  9 +
 notmuch-config.c | 30 +-
 notmuch-new.c    | 18 ++
 test/config      |  1 +
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/notmuch-client.h b/notmuch-client.h
index 9b63eae..9d922fe 100644
--- a/notmuch-client.h
+++ b/notmuch-client.h
@@ -256,6 +256,15 @@ notmuch_config_set_new_ignore (notmuch_config_t *config,
                                const char *new_ignore[],
                                size_t length);
 
+const char **
+notmuch_config_get_new_scan (notmuch_config_t *config,
+                             size_t *length);
+
+void
+notmuch_config_set_new_scan (notmuch_config_t *config,
+                             const char *new_scan[],
+                             size_t length);
+
 notmuch_bool_t
 notmuch_config_get_maildir_synchronize_flags (notmuch_config_t *config);
 
diff --git a/notmuch-config.c b/notmuch-config.c
index 3e37a2d..e9d99ea 100644
--- a/notmuch-config.c
+++ b/notmuch-config.c
@@ -50,7 +50,10 @@ static const char new_config_comment[] =
     "\tthat will not be searched for messages by \"notmuch new\".\n"
     "\n"
     "\tNOTE: *Every* file/directory that goes by one of those names will\n"
-    "\tbe ignored, independent of its depth/location in the mail store.\n";
+    "\tbe ignored, independent of its depth/location in the mail store.\n"
+    "\n"
+    "\tscanA list (separated by ';') of mail URLs to scan.\n"
+    "\tThe maildir located at database.path, above, will automatically be added.\n";
 
 static const char user_config_comment[] =
     " User configuration\n"
@@ -113,6 +116,8 @@ struct _notmuch_config {
     size_t new_tags_length;
     const char **new_ignore;
     size_t new_ignore_length;
+    const char **new_scan;
+    size_t new_scan_length;
     notmuch_bool_t maildir_synchronize_flags;
     const char **search_exclude_tags;
     size_t search_exclude_tags_length;
@@ -274,6 +279,8 @@ notmuch_config_open (void *ctx,
     config->new_tags_length = 0;
     config->new_ignore = NULL;
     config->new_ignore_length = 0;
+    config->new_scan = NULL;
+    config->new_scan_length = 0;
     config->maildir_synchronize_flags = TRUE;
     config->search_exclude_tags = NULL;
     config->search_exclude_tags_length = 0;
@@ -375,6 +382,10 @@ notmuch_config_open (void *ctx,
         notmuch_config_set_new_ignore (config, NULL, 0);
     }
 
+    if (notmuch_config_get_new_scan (config, &tmp) == NULL) {
+        notmuch_config_set_new_scan (config, NULL, 0);
+    }
+
     if (notmuch_config_get_search_exclude_tags (config, &tmp) == NULL) {
         if (is_new) {
             const char *tags[] = { "deleted", "spam" };
@@ -631,6 +642,14 @@ notmuch_config_get_new_ignore (notmuch_config_t *config, size_t *length)
                              &(config->new_ignore_length), length);
 }
 
+const char **
+notmuch_config_get_new_scan (notmuch_config_t *config, size_t *length)
+{
+    return _config_get_list (config, "new", "scan",
+                             &(config->new_scan),
+                             &(config->new_scan_length), length);
+}
+
 void
 notmuch_config_set_user_other_email (notmuch_config_t *config,
                                      const char *list[],
@@ -658,6 +677,15 @@ notmuch_config_set_new_ignore (notmuch_config_t *config,
                       &(config->new_ignore));
 }
 
+void
+notmuch_config_set_new_scan (notmuch_config_t *config,
+                             const char *list[],
+                             size_t length)
+{
+    _config_set_list (config, "new", "scan", list, length,
+                      &(config->new_scan));
+}
+
 const char **
 notmuch_config_get_search_exclude_tags (notmuch_config_t *config, size_t *length)
 {
diff --git a/notmuch-new.c b/notmuch-new.c
index 1f11b2c..57b27bf 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -239,6 +239,16 @@ _entry_in_ignore_list (const char *entry, add_files_state_t *state)
     return FALSE;
 }
 
+/* Call out to the appropriate add_files function, based on the URI. */
+static notmuch_status_t
+add_files_uri (unused(notmuch_database_t *notmuch),
+               unused(const char *uri),
+               unused(add_files_state_t *state))
+{
+    /* Stub for now */
+    return NOTMUCH_STATUS_SUCCESS;
+}
+
 /* Examine 'path' recursively as follows:
  *
  *   o Ask the filesystem for the mtime of 'path' (fs_mtime)
@@ -843,6 +853,8 @@ notmuch_new_command (void *ctx, int argc, char *argv[])
     int ret = 0;
     struct stat st;
     const char *db_path;
+    const char **new_scan;
+    size_t new_scan_length, new_scan_i;
     char *dot_notmuch_path;
     struct sigaction action;
 _
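
The new "scan" setting is described as a list of mail URLs separated by ';'. The patch itself delegates the parsing to the existing _config_get_list helper; the standalone sketch below only illustrates the format. The names split_scan_list and free_scan_list are hypothetical, and empty entries (such as one produced by a trailing ';') are simply skipped.

```c
#include <stdlib.h>
#include <string.h>

/* Free a list returned by split_scan_list. */
void
free_scan_list (char **items, size_t length)
{
    size_t i;

    for (i = 0; i < length; i++)
        free (items[i]);
    free (items);
}

/* Split a ';'-separated string into a newly allocated array of
 * strings, skipping empty entries. Stores the number of entries in
 * *length. Returns NULL (with *length set to 0) if allocation fails. */
char **
split_scan_list (const char *value, size_t *length)
{
    size_t capacity = 1, count = 0;
    const char *p, *start;
    char **items;

    /* Upper bound on the number of entries: separators + 1. */
    for (p = value; *p; p++)
        if (*p == ';')
            capacity++;

    *length = 0;
    items = calloc (capacity, sizeof (char *));
    if (items == NULL)
        return NULL;

    start = value;
    for (p = value; ; p++) {
        if (*p == ';' || *p == '\0') {
            size_t n = (size_t) (p - start);

            if (n > 0) {
                char *item = malloc (n + 1);
                if (item == NULL) {
                    free_scan_list (items, count);
                    return NULL;
                }
                memcpy (item, start, n);
                item[n] = '\0';
                items[count++] = item;
            }
            if (*p == '\0')
                break;
            start = p + 1;
        }
    }

    *length = count;
    return items;
}
```

For a value such as "mbox:///a;maildir:///b;" this yields two entries, and each one could then be handed to a per-URI dispatcher like the add_files_uri stub introduced in this patch.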
[RFC PATCH 09/14] Fix atomicity test to work without relocatable mailstores
Instead of assuming that the mailstore doesn't store its absolute
filenames, we use a symlink that can change back and forth. As long as
filenames contain this symlink, they can work in either the real
database, or the current snapshot.

Signed-off-by: Ethan Glasser-Camp
---
 test/atomicity     | 10 +-
 test/atomicity.gdb | 11 ---
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/test/atomicity b/test/atomicity
index 6df0a00..7b62ec7 100755
--- a/test/atomicity
+++ b/test/atomicity
@@ -49,13 +49,13 @@ if test_require_external_prereq gdb; then
     rm $MAIL_DIR/.remove-dir/remove-directory-duplicate:2,
     rmdir $MAIL_DIR/.remove-dir
 
-    # Prepare a snapshot of the updated maildir. The gdb script will
-    # update the database in this snapshot as it goes.
+    # Copy the mail database. We will run on this database concurrently.
     cp -ra $MAIL_DIR $MAIL_DIR.snap
-    cp ${NOTMUCH_CONFIG} ${NOTMUCH_CONFIG}.snap
-    NOTMUCH_CONFIG=${NOTMUCH_CONFIG}.snap notmuch config set database.path $MAIL_DIR.snap
-
+    # Use a symlink instead of the real path. This way, we can change the symlink,
+    # without filenames having to change.
+    mv $MAIL_DIR $MAIL_DIR.real
+    ln -s $MAIL_DIR.real $MAIL_DIR
 
     # Execute notmuch new and, at every call to rename, snapshot the
     # database, run notmuch new again on the snapshot, and capture the
diff --git a/test/atomicity.gdb b/test/atomicity.gdb
index fd67525..3d4e210 100644
--- a/test/atomicity.gdb
+++ b/test/atomicity.gdb
@@ -38,12 +38,17 @@ shell mv backtrace backtrace.`cat outcount`
 # Snapshot the database
 shell rm -r $MAIL_DIR.snap/.notmuch
 shell cp -r $MAIL_DIR/.notmuch $MAIL_DIR.snap/.notmuch
+shell rm $MAIL_DIR
+shell ln -s $MAIL_DIR.snap $MAIL_DIR
 # Restore the mtime of $MAIL_DIR.snap, which we just changed
-shell touch -r $MAIL_DIR $MAIL_DIR.snap
+shell touch -r $MAIL_DIR.real $MAIL_DIR.snap
 # Run notmuch new to completion on the snapshot
-shell NOTMUCH_CONFIG=${NOTMUCH_CONFIG}.snap XAPIAN_FLUSH_THRESHOLD=1000 notmuch new > /dev/null
-shell NOTMUCH_CONFIG=${NOTMUCH_CONFIG}.snap notmuch search '*' > search.`cat outcount` 2>&1
+shell NOTMUCH_CONFIG=${NOTMUCH_CONFIG} XAPIAN_FLUSH_THRESHOLD=1000 notmuch new > /dev/null
+shell NOTMUCH_CONFIG=${NOTMUCH_CONFIG} notmuch search '*' > search.`cat outcount` 2>&1
 shell echo $(expr $(cat outcount) + 1) > outcount
+# restore symlink to correct database before resuming
+shell rm $MAIL_DIR
+shell ln -s $MAIL_DIR.real $MAIL_DIR
 cont
 end
-- 
1.7.9.5
[RFC PATCH 08/14] Don't cache corpus.mail
corpus.mail has already been processed by notmuch-new, so it seems
like a good target to cache, but since filenames are no longer being
stored relative to the database, it isn't. Recopy on each test, or
else filenames from other tests will show up.

Signed-off-by: Ethan Glasser-Camp
---
 test/test-lib.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/test/test-lib.sh b/test/test-lib.sh
index 195158c..def3760 100644
--- a/test/test-lib.sh
+++ b/test/test-lib.sh
@@ -441,7 +441,7 @@ add_email_corpus ()
     else
         cp -a $TEST_DIRECTORY/corpus ${MAIL_DIR}
         notmuch new >/dev/null
-        cp -a ${MAIL_DIR} $TEST_DIRECTORY/corpus.mail
+        #cp -a ${MAIL_DIR} $TEST_DIRECTORY/corpus.mail
     fi
 }
 
-- 
1.7.9.5
[RFC PATCH 07/14] Update tests that need to see filenames to use URIs
This fixes all tests except atomicity, which should be next.

Signed-off-by: Ethan Glasser-Camp
---
 test/emacs                   | 2 +-
 test/json                    | 4 ++--
 test/maildir-sync            | 7 ---
 test/multipart               | 4 ++--
 test/new                     | 6 +++---
 test/search-folder-coherence | 2 +-
 test/search-output           | 4 ++--
 test/test-lib.sh             | 3 +++
 8 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/test/emacs b/test/emacs
index e9f954c..c08791e 100755
--- a/test/emacs
+++ b/test/emacs
@@ -621,7 +621,7 @@ Stash my stashables
 id:"bought"
 bought
 inbox,stashtest
-${gen_msg_filename}
+${gen_msg_uri}
 http://mid.gmane.org/bought
 http://marc.info/?i=bought
 http://mail-archive.com/search?l=mid&q=bought
diff --git a/test/json b/test/json
index 6439788..be29fac 100755
--- a/test/json
+++ b/test/json
@@ -5,7 +5,7 @@ test_description="--format=json output"
 test_begin_subtest "Show message: json"
 add_message "[subject]=\"json-show-subject\"" "[date]=\"Sat, 01 Jan 2000 12:00:00 -\"" "[body]=\"json-show-message\""
 output=$(notmuch show --format=json "json-show-message")
-test_expect_equal "$output" "[[[{\"id\": \"${gen_msg_id}\", \"match\": true, \"excluded\": false, \"filename\": \"${gen_msg_filename}\", \"timestamp\": 946728000, \"date_relative\": \"2000-01-01\", \"tags\": [\"inbox\",\"unread\"], \"headers\": {\"Subject\": \"json-show-subject\", \"From\": \"Notmuch Test Suite \", \"To\": \"Notmuch Test Suite \", \"Date\": \"Sat, 01 Jan 2000 12:00:00 +\"}, \"body\": [{\"id\": 1, \"content-type\": \"text/plain\", \"content\": \"json-show-message\n\"}]}, ["
+test_expect_equal "$output" "[[[{\"id\": \"${gen_msg_id}\", \"match\": true, \"excluded\": false, \"filename\": \"${gen_msg_uri}\", \"timestamp\": 946728000, \"date_relative\": \"2000-01-01\", \"tags\": [\"inbox\",\"unread\"], \"headers\": {\"Subject\": \"json-show-subject\", \"From\": \"Notmuch Test Suite \", \"To\": \"Notmuch Test Suite \", \"Date\": \"Sat, 01 Jan 2000 12:00:00 +\"}, \"body\": [{\"id\": 1, \"content-type\": \"text/plain\", \"content\": \"json-show-message\n\"}]}, ["
 
 test_begin_subtest "Search message: json"
 add_message "[subject]=\"json-search-subject\"" "[date]=\"Sat, 01 Jan 2000 12:00:00 -\"" "[body]=\"json-search-message\""
@@ -22,7 +22,7 @@ test_expect_equal "$output" "[{\"thread\": \"XXX\",
 test_begin_subtest "Show message: json, utf-8"
 add_message "[subject]=\"json-show-utf8-body-sübjéct\"" "[date]=\"Sat, 01 Jan 2000 12:00:00 -\"" "[body]=\"jsön-show-méssage\""
 output=$(notmuch show --format=json "jsön-show-méssage")
-test_expect_equal "$output" "[[[{\"id\": \"${gen_msg_id}\", \"match\": true, \"excluded\": false, \"filename\": \"${gen_msg_filename}\", \"timestamp\": 946728000, \"date_relative\": \"2000-01-01\", \"tags\": [\"inbox\",\"unread\"], \"headers\": {\"Subject\": \"json-show-utf8-body-sübjéct\", \"From\": \"Notmuch Test Suite \", \"To\": \"Notmuch Test Suite \", \"Date\": \"Sat, 01 Jan 2000 12:00:00 +\"}, \"body\": [{\"id\": 1, \"content-type\": \"text/plain\", \"content\": \"jsön-show-méssage\n\"}]}, ["
+test_expect_equal "$output" "[[[{\"id\": \"${gen_msg_id}\", \"match\": true, \"excluded\": false, \"filename\": \"${gen_msg_uri}\", \"timestamp\": 946728000, \"date_relative\": \"2000-01-01\", \"tags\": [\"inbox\",\"unread\"], \"headers\": {\"Subject\": \"json-show-utf8-body-sübjéct\", \"From\": \"Notmuch Test Suite \", \"To\": \"Notmuch Test Suite \", \"Date\": \"Sat, 01 Jan 2000 12:00:00 +\"}, \"body\": [{\"id\": 1, \"content-type\": \"text/plain\", \"content\": \"jsön-show-méssage\n\"}]}, ["
 
 test_begin_subtest "Show message: json, inline attachment filename"
 subject='json-show-inline-attachment-filename'
diff --git a/test/maildir-sync b/test/maildir-sync
index 01348d3..a2e110e 100755
--- a/test/maildir-sync
+++ b/test/maildir-sync
@@ -8,7 +8,7 @@ test_description="maildir synchronization"
 # --format=json" output includes some newlines. Also, need to avoid
 # including the local value of MAIL_DIR in the result.
 filter_show_json() {
-    sed -e 's/, /,\n/g' | sed -e "s|${MAIL_DIR}/|MAIL_DIR/|"
+    sed -e 's/, /,\n/g' | sed -e "s|${MAIL_URI}/|MAIL_DIR/|"
     echo
 }
 
@@ -102,8 +102,9 @@ No new mail. Detected 1 file rename.
 thread:XXX 2001-01-05 [1/1] Notmuch Test Suite; Removing S flag (inbox unread)"
 
 test_begin_subtest "Removing info from filename leaves tags unchanged"
-add_message [subject]='"Message to lose maildir info"' [filename]='message-to-lose-maildir-info' [dir]=cur
-notmuch tag -unread subject:"Message to lose maildir info"
+generate_message [subject]='"Message to lose maildir info"' [filename]='message-to-lose-maildir-info' [dir]=cur
+notmuch new > hrngh.new
+notmuch tag -unread subject:"Message to lose maildir info" > hrngh.tag
 mv "$MAIL_DIR/cur/message-to-lose-maildir-info:2,S" "$MAIL_DIR/cur/message-without-maild
[RFC PATCH 06/14] maildir URIs can be used in tags_to_maildir_flags
A better fix would probably be based on scheme.

Signed-off-by: Ethan Glasser-Camp
---
 lib/message.cc | 51 ++-
 1 file changed, 46 insertions(+), 5 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index c9857f5..8ecec71 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -23,6 +23,7 @@

 #include

+#include
 #include

 struct visible _notmuch_message {
@@ -1093,7 +1094,6 @@ notmuch_message_maildir_flags_to_tags (notmuch_message_t *message)
     {
         filename = notmuch_filenames_get (filenames);
         dir = _filename_is_in_maildir (filename);
-
         if (! dir)
             continue;

@@ -1304,12 +1304,46 @@ _new_maildir_filename (void *ctx,
     return filename_new;
 }

+/* Parses a maildir URI and returns the filename corresponding to its
+ * path.
+ *
+ * Returns NULL if either the URI couldn't be parsed or if the
+ * scheme isn't maildir:.
+ */
+static char *
+_get_maildir_filename (const char *filename)
+{
+    UriParserStateA parser_state;
+    UriUriA parsed;
+    char *path;
+    parser_state.uri = &parsed;
+
+    if (uriParseUriA (&parser_state, filename) != URI_SUCCESS) {
+        uriFreeUriMembersA (&parsed);
+        return NULL;
+    }
+
+    if (parsed.scheme.first != NULL &&
+        0 != strncmp(parsed.scheme.first, "maildir",
+                     parsed.scheme.afterLast-parsed.scheme.first)) {
+        /* Full URI with non-maildir scheme. */
+        uriFreeUriMembersA (&parsed);
+        return NULL;
+    }
+
+    path = (char *)parsed.pathHead->text.first - 1;
+    uriFreeUriMembersA (&parsed);
+    return path;
+
+}
+
+
 notmuch_status_t
 notmuch_message_tags_to_maildir_flags (notmuch_message_t *message)
 {
     notmuch_filenames_t *filenames;
     const char *filename;
-    char *filename_new;
+    char *filename_new, *filename_old, *filename_new_uri;
     char *to_set, *to_clear;
     notmuch_status_t status = NOTMUCH_STATUS_SUCCESS;

@@ -1324,16 +1358,22 @@ notmuch_message_tags_to_maildir_flags (notmuch_message_t *message)
         if (! _filename_is_in_maildir (filename))
             continue;

-        filename_new = _new_maildir_filename (message, filename,
+        filename_old = _get_maildir_filename (filename);
+        if (filename_old == NULL)
+            continue;
+
+        filename_new = _new_maildir_filename (message, filename_old,
                                               to_set, to_clear);
         if (filename_new == NULL)
             continue;

+        filename_new_uri = talloc_asprintf (message, "maildir://%s", filename_new);
+
         if (strcmp (filename, filename_new)) {
             int err;
             notmuch_status_t new_status;

-            err = rename (filename, filename_new);
+            err = rename (filename_old, filename_new);
             if (err)
                 continue;

@@ -1347,7 +1387,7 @@ notmuch_message_tags_to_maildir_flags (notmuch_message_t *message)
             }

             new_status = _notmuch_message_add_filename (message,
-                                                        filename_new);
+                                                        filename_new_uri);
             /* Hold on to only the first error. */
             if (! status && new_status) {
                 status = new_status;
@@ -1358,6 +1398,7 @@ notmuch_message_tags_to_maildir_flags (notmuch_message_t *message)
         }

         talloc_free (filename_new);
+        talloc_free (filename_new_uri);
     }

     talloc_free (to_set);
-- 
1.7.9.5
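For readers without uriparser at hand, the URI-to-path step can be illustrated with plain string handling. This is only a sketch under assumptions: the name maildir_uri_path is hypothetical and not part of notmuch, it insists on an empty host component, and unlike _get_maildir_filename in the patch it returns NULL for scheme-less filenames.

```c
#include <string.h>

/* Hypothetical, parser-free sketch: if `uri` starts with "maildir://"
 * followed by '/', return a pointer to the path (for "maildir:///a/b"
 * that is "/a/b"); otherwise return NULL. The result points into
 * `uri`, so it is valid only as long as `uri` is. */
const char *
maildir_uri_path (const char *uri)
{
    static const char prefix[] = "maildir://";
    size_t n = sizeof prefix - 1;

    if (strncmp (uri, prefix, n) != 0)
        return NULL;
    /* Anything other than '/' here would be a host component. */
    return uri[n] == '/' ? uri + n : NULL;
}
```

Returning a pointer into the caller's string avoids an allocation, at the cost of tying the result's lifetime to the input.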
[RFC PATCH 05/14] new: use new URL-based filenames for messages
This commit breaks a bunch of tests; fixes follow.

Signed-off-by: Ethan Glasser-Camp
---
 notmuch-new.c | 27 +++
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index 938ae29..1f11b2c 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -287,7 +287,7 @@ add_files (notmuch_database_t *notmuch,
 {
     DIR *dir = NULL;
     struct dirent *entry = NULL;
-    char *next = NULL;
+    char *next = NULL, *path_uri = NULL;
     time_t fs_mtime, db_mtime;
     notmuch_status_t status, ret = NOTMUCH_STATUS_SUCCESS;
     notmuch_message_t *message = NULL;
@@ -315,7 +315,16 @@ add_files (notmuch_database_t *notmuch,

     fs_mtime = st.st_mtime;

-    status = notmuch_database_get_directory (notmuch, path, &directory);
+    /* maildir URIs should never have a hostname component, but
+     * uriparser doesn't parse paths correctly if they start with //,
+     * as in scheme://host//path.
+     */
+    if (path[0] == '/')
+        path_uri = talloc_asprintf (notmuch, "maildir://%s", path);
+    else
+        path_uri = talloc_asprintf (notmuch, "maildir:///%s", path);
+
+    status = notmuch_database_get_directory (notmuch, path_uri, &directory);
     if (status) {
         ret = status;
         goto DONE;
@@ -423,7 +432,7 @@ add_files (notmuch_database_t *notmuch,
                strcmp (notmuch_filenames_get (db_files), entry->d_name) < 0)
         {
             char *absolute = talloc_asprintf (state->removed_files,
-                                              "%s/%s", path,
+                                              "%s/%s", path_uri,
                                               notmuch_filenames_get (db_files));

             _filename_list_add (state->removed_files, absolute);
@@ -439,7 +448,7 @@ add_files (notmuch_database_t *notmuch,

             if (strcmp (filename, entry->d_name) < 0) {
                 char *absolute = talloc_asprintf (state->removed_directories,
-                                                  "%s/%s", path, filename);
+                                                  "%s/%s", path_uri, filename);

                 _filename_list_add (state->removed_directories, absolute);
             }
@@ -467,7 +476,7 @@ add_files (notmuch_database_t *notmuch,

         /* We're now looking at a regular file that doesn't yet exist
          * in the database, so add it. */
-        next = talloc_asprintf (notmuch, "%s/%s", path, entry->d_name);
+        next = talloc_asprintf (notmuch, "%s/%s", path_uri, entry->d_name);

         state->processed_files++;

@@ -559,7 +568,7 @@ add_files (notmuch_database_t *notmuch,
     while (notmuch_filenames_valid (db_files))
     {
         char *absolute = talloc_asprintf (state->removed_files,
-                                          "%s/%s", path,
+                                          "%s/%s", path_uri,
                                           notmuch_filenames_get (db_files));

         _filename_list_add (state->removed_files, absolute);
@@ -570,7 +579,7 @@ add_files (notmuch_database_t *notmuch,
     while (notmuch_filenames_valid (db_subdirs))
     {
         char *absolute = talloc_asprintf (state->removed_directories,
-                                          "%s/%s", path,
+                                          "%s/%s", path_uri,
                                           notmuch_filenames_get (db_subdirs));

         _filename_list_add (state->removed_directories, absolute);
@@ -584,9 +593,11 @@ add_files (notmuch_database_t *notmuch,
      * same second. This may lead to unnecessary re-scans, but it
      * avoids overlooking messages. */
     if (fs_mtime != stat_time)
-        _filename_list_add (state->directory_mtimes, path)->mtime = fs_mtime;
+        _filename_list_add (state->directory_mtimes, path_uri)->mtime = fs_mtime;

   DONE:
+    if (path_uri)
+        talloc_free (path_uri);
     if (next)
         talloc_free (next);
     if (dir)
-- 
1.7.9.5
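The prefix choice made in add_files can be looked at in isolation. Below is a minimal standalone sketch, assuming plain snprintf into a caller-supplied buffer instead of talloc_asprintf; the helper name make_maildir_uri is hypothetical and does not exist in notmuch.

```c
#include <stdio.h>

/* Hypothetical helper: write a maildir URI for `path` into `buf`.
 * Absolute paths already start with '/', so only "maildir://" is
 * prepended; relative paths get "maildir:///" so that the host
 * component stays empty either way.
 * Returns 0 on success, -1 if the result does not fit in `size`. */
int
make_maildir_uri (const char *path, char *buf, size_t size)
{
    const char *prefix = (path[0] == '/') ? "maildir://" : "maildir:///";
    int n = snprintf (buf, size, "%s%s", prefix, path);

    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

Both branches yield a URI with exactly three slashes after the scheme, which is the property the comment in the patch is after.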
[RFC PATCH 04/14] Not all filenames need to be converted to absolute paths
_notmuch_message_ensure_filename_list converts "relative" paths, such
as those stored in Xapian until now, to "absolute" paths. However,
URLs are already absolute, and prepending the database path will just
confuse matters.

Signed-off-by: Ethan Glasser-Camp
---
 lib/message.cc | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/message.cc b/lib/message.cc
index 978de06..c9857f5 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -700,9 +700,17 @@ _notmuch_message_ensure_filename_list (notmuch_message_t *message)
                                                        message->notmuch,
                                                        directory_id);

-        if (strlen (directory))
-            filename = talloc_asprintf (message, "%s/%s/%s",
-                                        db_path, directory, basename);
+        if (strlen (directory)) {
+            /* If directory is a URI, we don't need to append the db_path;
+             * it is already an absolute path. */
+            /* This is just a quick hack instead of actually parsing the URL. */
+            if (strstr (directory, "://") == NULL)
+                filename = talloc_asprintf (message, "%s/%s/%s",
+                                            db_path, directory, basename);
+            else
+                filename = talloc_asprintf (message, "%s/%s",
+                                            directory, basename);
+        }
         else
             filename = talloc_asprintf (message, "%s/%s",
                                         db_path, basename);
-- 
1.7.9.5
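The three cases handled by this hunk can be tried out on their own. The following is a sketch only, assuming snprintf into a fixed buffer rather than talloc_asprintf; build_filename and its parameter names are hypothetical, not notmuch API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical standalone version of the check in the patch: build the
 * full name for `base` in `directory`. An empty directory means the
 * file sits directly under `db_path`; a directory containing "://" is
 * treated as a URI and used as-is; anything else is taken to be
 * relative to `db_path`. Returns 0 on success, -1 on truncation. */
int
build_filename (const char *db_path, const char *directory,
                const char *base, char *buf, size_t size)
{
    int n;

    if (directory[0] == '\0')
        n = snprintf (buf, size, "%s/%s", db_path, base);
    else if (strstr (directory, "://") == NULL)
        n = snprintf (buf, size, "%s/%s/%s", db_path, directory, base);
    else
        n = snprintf (buf, size, "%s/%s", directory, base);

    return (n < 0 || (size_t) n >= size) ? -1 : 0;
}
```

As the patch comment itself says, the "://" test is a quick hack: a relative directory whose name happens to contain "://" would be misclassified.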
[RFC PATCH 03/14] mailstore can read from maildir: URLs
No code uses this yet.

Signed-off-by: Ethan Glasser-Camp
---
 lib/mailstore.c | 37 -
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/lib/mailstore.c b/lib/mailstore.c
index 48acd47..ae02c12 100644
--- a/lib/mailstore.c
+++ b/lib/mailstore.c
@@ -17,14 +17,49 @@
  *
  * Author: Carl Worth
  */
+#include
 #include

 #include "notmuch-private.h"

+static FILE *
+notmuch_mailstore_basic_open (const char *filename)
+{
+    return fopen (filename, "r");
+}
+
 FILE *
 notmuch_mailstore_open (const char *filename)
 {
-    return fopen (filename, "r");
+    FILE *ret = NULL;
+    UriUriA parsed;
+    UriParserStateA state;
+    state.uri = &parsed;
+    if (uriParseUriA (&state, filename) != URI_SUCCESS) {
+        /* Failure. Fall back to fopen and hope for the best. */
+        ret = notmuch_mailstore_basic_open (filename);
+        goto DONE;
+    }
+
+    if (parsed.scheme.first == NULL) {
+        /* No scheme. Probably not really a URL but just an ordinary filename.
+         * Fall back to fopen for backwards compatibility. */
+        ret = notmuch_mailstore_basic_open (filename);
+        goto DONE;
+    }
+
+    if (0 == strncmp (parsed.scheme.first, "maildir",
+                      parsed.scheme.afterLast-parsed.scheme.first)) {
+        /* Maildir URI of the form maildir:///path/to/file.
+         * We want to fopen("/path/to/file").
+         * pathHead starts at "path/to/file". */
+        ret = notmuch_mailstore_basic_open (parsed.pathHead->text.first - 1);
+        goto DONE;
+    }
+
+DONE:
+    uriFreeUriMembersA (&parsed);
+    return ret;
 }

 int
-- 
1.7.9.5
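One detail of the scheme test may be worth a closer look: strncmp limited to the length of the parsed scheme also returns 0 when that scheme is a proper prefix of "maildir", for example "mail". A possible stricter comparison is sketched below; scheme_equals is a hypothetical helper, not part of notmuch or uriparser, and takes the start pointer and length that the patch derives from scheme.first and scheme.afterLast.

```c
#include <string.h>

/* Hypothetical helper: return 1 if the `len` bytes starting at `first`
 * spell exactly `scheme`, else 0. Comparing the lengths first rejects
 * shorter schemes such as "mail" that a bare
 * strncmp (first, "maildir", len) would accept as a prefix match. */
int
scheme_equals (const char *first, size_t len, const char *scheme)
{
    return len == strlen (scheme) && strncmp (first, scheme, len) == 0;
}
```

In the patch this would be called with the length computed as parsed.scheme.afterLast - parsed.scheme.first.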
[RFC PATCH 02/14] Introduce uriparser
Seeing as there is no glib-standard way to parse URIs, an external
library is needed. This commit introduces another program in compat/
and a stanza in ./configure to test if uriparser is there.

Signed-off-by: Ethan Glasser-Camp
---
 Makefile.local          |  2 +-
 compat/have_uriparser.c | 17 +
 configure               | 23 ---
 3 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100644 compat/have_uriparser.c

diff --git a/Makefile.local b/Makefile.local
index a890df2..084f44e 100644
--- a/Makefile.local
+++ b/Makefile.local
@@ -41,7 +41,7 @@ PV_FILE=bindings/python/notmuch/version.py
 # Smash together user's values with our extra values
 FINAL_CFLAGS = -DNOTMUCH_VERSION=$(VERSION) $(CFLAGS) $(WARN_CFLAGS) $(CONFIGURE_CFLAGS) $(extra_cflags)
 FINAL_CXXFLAGS = $(CXXFLAGS) $(WARN_CXXFLAGS) $(CONFIGURE_CXXFLAGS) $(extra_cflags) $(extra_cxxflags)
-FINAL_NOTMUCH_LDFLAGS = $(LDFLAGS) -Lutil -lutil -Llib -lnotmuch $(AS_NEEDED_LDFLAGS) $(GMIME_LDFLAGS) $(TALLOC_LDFLAGS)
+FINAL_NOTMUCH_LDFLAGS = $(LDFLAGS) -Lutil -lutil -Llib -lnotmuch $(AS_NEEDED_LDFLAGS) $(GMIME_LDFLAGS) $(TALLOC_LDFLAGS) $(URIPARSER_LDFLAGS)
 FINAL_NOTMUCH_LINKER = CC
 ifneq ($(LINKER_RESOLVES_LIBRARY_DEPENDENCIES),1)
 FINAL_NOTMUCH_LDFLAGS += $(CONFIGURE_LDFLAGS)
diff --git a/compat/have_uriparser.c b/compat/have_uriparser.c
new file mode 100644
index 000..d79e51d
--- /dev/null
+++ b/compat/have_uriparser.c
@@ -0,0 +1,17 @@
+#include
+
+int
+main (int argc, char *argv[])
+{
+    UriParserStateA state;
+    UriUriA uri;
+    char *uriS = NULL;
+
+    state.uri = &uri;
+    if (uriParseUriA (&state, uriS) != URI_SUCCESS) {
+        /* Failure */
+        uriFreeUriMembersA (&uri);
+    }
+
+    return 0;
+}
diff --git a/configure b/configure
index 3fad424..80aa13c 100755
--- a/configure
+++ b/configure
@@ -313,6 +313,19 @@ else
     errors=$((errors + 1))
 fi

+printf "Checking for uriparser... "
+if ${CC} -o compat/have_uriparser "$srcdir"/compat/have_uriparser.c -luriparser > /dev/null 2>&1
+then
+    printf "Yes.\n"
+    uriparser_ldflags="-luriparser"
+    have_uriparser=1
+else
+    printf "No.\n"
+    have_uriparser=0
+    errors=$((errors + 1))
+fi
+rm -f compat/have_uriparser
+
 printf "Checking for valgrind development files... "
 if pkg-config --exists valgrind; then
     printf "Yes.\n"
@@ -431,11 +444,11 @@ case a simple command will install everything you need. For example:

 On Debian and similar systems:

-        sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev
+        sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev liburiparser-dev

 Or on Fedora and similar systems:

-        sudo yum install xapian-core-devel gmime-devel libtalloc-devel
+        sudo yum install xapian-core-devel gmime-devel libtalloc-devel uriparser-devel

 On other systems, similar commands can be used, but the details of the
 package names may be different.
@@ -669,6 +682,9 @@ GMIME_LDFLAGS = ${gmime_ldflags}
 TALLOC_CFLAGS = ${talloc_cflags}
 TALLOC_LDFLAGS = ${talloc_ldflags}

+# Flags needed to link against uriparser
+URIPARSER_LDFLAGS = ${uriparser_ldflags}
+
 # Flags needed to have linker set rpath attribute
 RPATH_LDFLAGS = ${rpath_ldflags}

@@ -698,5 +714,6 @@ CONFIGURE_CXXFLAGS = -DHAVE_GETLINE=\$(HAVE_GETLINE) \$(GMIME_CFLAGS)\\
                      \$(TALLOC_CFLAGS) -DHAVE_VALGRIND=\$(HAVE_VALGRIND) \\
                      \$(VALGRIND_CFLAGS) \$(XAPIAN_CXXFLAGS) \\
                      -DHAVE_STRCASESTR=\$(HAVE_STRCASESTR)
-CONFIGURE_LDFLAGS = \$(GMIME_LDFLAGS) \$(TALLOC_LDFLAGS) \$(XAPIAN_LDFLAGS)
+CONFIGURE_LDFLAGS = \$(GMIME_LDFLAGS) \$(TALLOC_LDFLAGS) \$(XAPIAN_LDFLAGS) \\
+                    \$(URIPARSER_LDFLAGS)
 EOF
-- 
1.7.9.5
[RFC PATCH 01/14] All access to mail files goes through the mailstore module
This commit introduces the mailstore module which provides two functions, notmuch_mailstore_open and notmuch_mailstore_close. These functions are currently just stub calls to fopen and fclose, but later can be made more complex in order to support mail storage systems where one message might not be one file. Signed-off-by: Ethan Glasser-Camp --- lib/Makefile.local|1 + lib/database.cc |2 +- lib/index.cc |2 +- lib/mailstore.c | 34 lib/message-file.c|6 ++--- lib/notmuch-private.h |3 +++ lib/notmuch.h | 16 +++ lib/sha1.c| 70 + mime-node.c |4 +-- notmuch-show.c| 12 - 10 files changed, 120 insertions(+), 30 deletions(-) create mode 100644 lib/mailstore.c diff --git a/lib/Makefile.local b/lib/Makefile.local index 8a9aa28..cfc77bb 100644 --- a/lib/Makefile.local +++ b/lib/Makefile.local @@ -51,6 +51,7 @@ libnotmuch_c_srcs = \ $(dir)/filenames.c \ $(dir)/string-list.c\ $(dir)/libsha1.c\ + $(dir)/mailstore.c \ $(dir)/message-file.c \ $(dir)/messages.c \ $(dir)/sha1.c \ diff --git a/lib/database.cc b/lib/database.cc index 761dc1a..c035edc 100644 --- a/lib/database.cc +++ b/lib/database.cc @@ -1773,7 +1773,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch, if (message_id == NULL ) { /* No message-id at all, let's generate one by taking a * hash over the file's contents. */ - char *sha1 = notmuch_sha1_of_file (filename); + char *sha1 = notmuch_sha1_of_message (filename); /* If that failed too, something is really wrong. Give up. */ if (sha1 == NULL) { diff --git a/lib/index.cc b/lib/index.cc index e377732..b607e82 100644 --- a/lib/index.cc +++ b/lib/index.cc @@ -441,7 +441,7 @@ _notmuch_message_index_file (notmuch_message_t *message, initialized = 1; } -file = fopen (filename, "r"); +file = notmuch_mailstore_open (filename); if (! 
file) { fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno)); ret = NOTMUCH_STATUS_FILE_ERROR; diff --git a/lib/mailstore.c b/lib/mailstore.c new file mode 100644 index 000..48acd47 --- /dev/null +++ b/lib/mailstore.c @@ -0,0 +1,34 @@ +/* mailstore.c - code to access individual messages + * + * Copyright ? 2009 Carl Worth + * + * This program is free software: you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation, either version 3 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see http://www.gnu.org/licenses/ . + * + * Author: Carl Worth + */ +#include + +#include "notmuch-private.h" + +FILE * +notmuch_mailstore_open (const char *filename) +{ +return fopen (filename, "r"); +} + +int +notmuch_mailstore_close (FILE *file) +{ +return fclose (file); +} diff --git a/lib/message-file.c b/lib/message-file.c index 915aba8..271389c 100644 --- a/lib/message-file.c +++ b/lib/message-file.c @@ -86,7 +86,7 @@ _notmuch_message_file_destructor (notmuch_message_file_t *message) g_hash_table_destroy (message->headers); if (message->file) - fclose (message->file); + notmuch_mailstore_close (message->file); return 0; } @@ -104,7 +104,7 @@ _notmuch_message_file_open_ctx (void *ctx, const char *filename) talloc_set_destructor (message, _notmuch_message_file_destructor); -message->file = fopen (filename, "r"); +message->file = notmuch_mailstore_open (filename); if (message->file == NULL) goto FAIL; @@ -361,7 +361,7 @@ notmuch_message_file_get_header (notmuch_message_file_t *message, } if 
(message->parsing_finished) { -fclose (message->file); +notmuch_mailstore_close (message->file); message->file = NULL; } diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h index bfb4111..5dbe821 100644 --- a/lib/notmuch-private.h +++ b/lib/notmuch-private.h @@ -468,6 +468,9 @@ notmuch_sha1_of_string (const char *str); char * notmuch_sha1_of_file (const char *filename); +char * +notmuch_sha1_of_message (const char *filename); + /* string-list.c */ typedef struct _notmuch_string_node { diff --git a/lib/notmuch.h b/lib/notmuch.h index 3633bed..0ca367b 100644 --- a/lib/notmuch.h +++ b/lib/notmuch.h @@ -1233,6 +1233,22 @@ notmuch_message_thaw (notmuch_message_t *message); void notmuch_message
[RFC PATCH 00/14] modular mail stores based on URIs
Hi guys,

Sorry for dropping off the mailing list after I sent my last patch
series (http://notmuchmail.org/pipermail/notmuch/2012/009470.html). I
haven't had the time or a stable enough email address to really follow
notmuch development :)

I signed onto #notmuch a week or two ago and asked what I would need
to do to get a feature like this one into mainline. j4ni told me that
he agreed with the feedback to my original patch series, and suggested
that I follow mjw1009's advice of having filenames encode all
information about mail storage transparently, and that this would
solve the problem with the original patch series of sprinkling mail
storage parameters all over the place. bremner suggested that he had
been thinking about how to support mbox or other multiple-message
archives, and also commented that he wasn't crazy about so much of the
API being in strings.

Based on this advice, I decided to revise my approach to this
patchset, one that is based around the stated desire to work with mbox
formats. This approach, in contrast to the mailstore approach that
Michal Sojka proposed and I revised, encodes all mail access
information as URIs. These URIs are stored in Xapian the way that
relative paths are right now. Examples might be:

    maildir:///home/ethan/Mail/folder/cur/filename:2,S
    mbox:///home/ethan/Mail/folder/file.mbox#byte-offset+length
    couchdb://ethan:password@localhost:8080/some-doc-id

Personally, this isn't my favorite approach, for the following
reasons:

1. Notmuch, at some point in its history, chose to store file paths
   relative to a "mail database", with the intent that if this mail
   database was moved, filenames would not change and everything would
   Just Work (tm). The above scheme completely reverses this design
   decision, and in general completely breaks this relocatability. I
   don't see any easy way to handle this problem. This isn't just a
   wishlist feature; at least two things in the test suite (caching of
   corpus.mail, and the atomicity tests) rely on this behavior.

2. Mail access information, i.e. open connections, etc. can only be
   stored in variables global to the mailstore code, and cannot be
   stored as private members of a mailstore object. This is more an
   aesthetic concern than a functional one.

Anyhow, the following (enormous) patch series implements this design.

I used uriparser as an external library to parse URIs. The API for
this library is a little idiosyncratic. uriparser supports parsing
Unicode URIs (strings of wchar_t), but I just used ASCII filenames
because I think that's what comes out of Xapian.

Patch 11 is borrowed directly from the last patch series.

The last four or five patches add mbox support, including a few
tests. That part of the series is still very first-draft: I added a
new config option to specify URIs to scan, and ">From " lines still
need to be unescaped. However, we support scanning mbox files whether
messages have content-length or not.

I will try to receive feedback on this series more gratefully than the
last one. :)

Thanks again for your time,

Ethan
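To make the mbox URI shape from the cover letter concrete, here is a small sketch of pulling the two numbers out of the fragment. It is an assumption that the fragment is two unsigned decimal numbers joined by '+', as the example URI suggests; parse_mbox_fragment is a hypothetical name and this is not code from the series. Overflow of very large numbers is not detected.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parser for the "#byte-offset+length" fragment of an
 * mbox URI. Stores the two numbers and returns 0, or returns -1 if
 * the fragment is missing or is not of the form DIGITS+DIGITS. */
int
parse_mbox_fragment (const char *uri, unsigned long *offset,
                     unsigned long *length)
{
    const char *p = strrchr (uri, '#');
    char *end;

    if (p == NULL || ! isdigit ((unsigned char) p[1]))
        return -1;
    *offset = strtoul (p + 1, &end, 10);

    if (*end != '+' || ! isdigit ((unsigned char) end[1]))
        return -1;
    *length = strtoul (end + 1, &end, 10);

    /* Reject trailing characters after the length. */
    return *end == '\0' ? 0 : -1;
}
```

Checking for a digit before each strtoul call keeps signs and leading whitespace, which strtoul would otherwise accept, out of the fragment.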
Re: [PATCH 0/3] Speed up notmuch new for unchanged directories
Austin Clements writes:

> On Sun, 24 Jun 2012, Sascha Silbe wrote:
["notmuch new" listing every directory, even if it's unchanged]
> I haven't looked over your patches yet, but this result surprises me.
> Could you explain your setup a little more? How much mail do you have
> and across how many directories? What file system are you using?

As mentioned in passing already, I have a total of about 900k unique
mails (sometimes several copies of them, received over different paths,
e.g. mailing list and a direct CC). Most of that is "old" mails, in
directories that are not getting updated. If notmuch would support mbox,
I'd use that instead for those old mails. The total number of
directories in the mail store is about 29k and the total number of files
(including the git repository and mbox files that sup used) is about
1.25M.

Since a housekeeping job last weekend, the number of mails in
directories that are still getting updated is about 4k, i.e. about 5‰ of
the total number of mails or 3‰ of the total number of files. The number
of directories getting updated is 104, i.e. about 4‰ of the total number
of directories.

Ideally, we'd get the run-time of "notmuch new" down by a similar
factor. With just plain POSIX and no additional information that won't
be possible, but providing a way to channel information about updates
into notmuch (rather than having it scan everything over and over again)
should help. That information is already available as output from the
mail fetching process (rsync in my case). Of course, it would be purely
optional: "notmuch new" without additional information would simply
continue to scan everything.

> I'm also surprised that your new approach helps. This directory listing
> has to be read off disk one way or the other, but listing directories is
> the bread-and-butter of file systems, whereas I would think that Xapian
> would require more IO to accomplish the same effect.

"notmuch new" needs to iterate over a list of all directories to find
those with new mails (and potentially new subdirectories). However, it
does not need to list the *contents* of those folders.

I'm surprised as well, but rather in the opposite direction: Based on a
naive calculation, we'd expect to see a speedup on the order of
(1.25M+29k)/29k = 44. The actual results suggest that stat()ing (done
29k times both before and after the patch) is taking about 19 times as
long as listing a directory entry (before the patch we listed 1M
entries, now we list none if nothing has changed). (*)

In practice, the speedup achieved by my patch is larger than what the
benchmark suggests because there are other processes running that use
RAM. If we need to read a lot from disk (like "notmuch new" did before
my patch), there's a good chance it's already been evicted from the
cache since the last run. The less we need to read, the more likely it
is to still be in the cache. Similarly, reading lots of data from disk
will displace other data in the cache. These effects are not covered by
the pure "hot cache" and "cold cache" timings.

> Does your patch win because you can specifically list subdirectories
> out of Xapian, making the IO proportional to the number of
> subdirectories instead of the number of subdirectories and files (even
> though the constant factors probably favor reading from the file
> system)?

It wins because the factor is the number of files in each directory, not
just some low constant based on file system overhead vs. Xapian
overhead.

> I like the idea of these patches, I just want to make sure I have a firm
> grip on what's being optimized and why it wins.

Certainly a good idea. Thanks for taking the time!

Sascha

(*) float(linsolve([29000*x + 1250000*y = 3.3 * 29000*x], [x])); in
    maxima, if you'd like to check the math.
-- 
http://sascha.silbe.org/
http://www.infra-silbe.de/
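The footnote's equation in the message above can also be checked without maxima. The sketch below restates it in C, with x as the cost of one stat() and y as the cost of listing one directory entry; the function name is hypothetical, and the inputs (29k directories, about 1.25M files, a 3.3x measured speedup) are taken from the message itself.

```c
/* Hypothetical helper restating the footnote's equation
 *   dirs*x + files*y = speedup * dirs*x
 * Solving for x/y gives files / ((speedup - 1) * dirs). */
double
stat_to_list_cost_ratio (double dirs, double files, double speedup)
{
    return files / ((speedup - 1.0) * dirs);
}
```

With dirs = 29000, files = 1250000 and speedup = 3.3 this comes out at roughly 18.7, in line with the "about 19 times" quoted in the message.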
[PATCH 0/3] Speed up notmuch new for unchanged directories
On Sun, 24 Jun 2012, Sascha Silbe wrote:
> All the time I thought what makes "notmuch new" so abysmally slow is the
> stat() for each maildir. But as it continued to be slow even after I
> moved most mails out of 'new' (into 'new-20120624'), I strace'd notmuch
> and noticed it listed even unchanged directories, thereby listing and
> iterating over each and every single of the 900k mails in my mail store.
>
> There's still quite some room for further improvements as it continues
> to take several minutes to scan < 100 new mails in changed directories
> containing < 1000 mails in total. Even the rsync run that fetches the
> new mails is faster.

I haven't looked over your patches yet, but this result surprises me.
Could you explain your setup a little more? How much mail do you have
and across how many directories? What file system are you using?

I'm also surprised that your new approach helps. This directory listing
has to be read off disk one way or the other, but listing directories is
the bread-and-butter of file systems, whereas I would think that Xapian
would require more IO to accomplish the same effect.

Does your patch win because you can specifically list subdirectories
out of Xapian, making the IO proportional to the number of
subdirectories instead of the number of subdirectories and files (even
though the constant factors probably favor reading from the file
system)?

I like the idea of these patches, I just want to make sure I have a firm
grip on what's being optimized and why it wins.
bug related to ical
I've noticed a problem related to handling of ical attachments. I'm
using Notmuch 0.13 on Emacs 23.3.1. I've done some basic
troubleshooting.

The problem arises with emails from Concur that include an ical
attachment being viewed with the notmuch message viewer. The problems
are:

1. When opening the email there is sometimes the following message and
   error in the Emacs message buffer:

     Converting icalendar...done
     notmuch-show-insert-bodypart-internal: Wrong type argument: stringp, nil

2. Some (not all) of the view commands fail, e.g. "v", "V", "w". Others
   work, like "m" and "q".

3. Examination of the /tmp directory shows notmuch-ical temp files
   being created but they are zero length.

This is related to the ical attachment. When I edited one of the emails
to remove the attachment, the problem went away. I suspect it is related
to the attachments being base64 encoded. The header of the mime
attachment shows:

  Content-Type: application/octet-stream; name="ConcurCalendarEntry.ics"
  Content-Transfer-Encoding: base64
  Content-Disposition: attachment; filename="ConcurCalendarEntry.ics"

The encoding is correct. The attachment decodes and looks right. With
some details obscured the attachment contains:

  BEGIN:VCALENDAR
  VERSION:2.0
  METHOD:PUBLISH
  BEGIN:VEVENT
  DTSTART:properly-formatted
  DTEND:properly-formatted
  DTSTAMP:properly-formatted
  LOCATION:
  SUMMARY:Concur Travel Itinerary
  DESCRIPTION:Lots of stuff with about 80 lines of description. All
   indented properly.
  UID:properly-formatted
  PRIORITY:3
  TRANSP:TRANSPARENT
  END:VEVENT
  END:VCALENDAR

I can live without the ics files, so fixing this is not a priority for
me. If there is someone interested in figuring this out, I've saved an
email and can answer questions. I got lost trying to follow the lisp
code paths for attachments, so I'm not sure whether it's the text or the
base64 that is being handed off to icalendar.

R Horn
rjhorn at alum.mit.edu
extract attachments from multiple mails
Dear All,

could someone give me some advice? I have many emails containing
attachments. These are typically the output of a copy machine, which
splits a scan into multiple attachments. I'd like to extract those
attached files in one batch into a specific directory. Is there a way
to fetch those files programmatically?

thanks
..d..
bug related to ical
I've noticed a problem related to handling of ical attachments. I'm using Notmuch 0.13 on Emacs 23.3.1. I've done some basic troubleshooting.

The problem arises when emails from Concur that include an ical attachment are viewed with the notmuch message viewer. The problems are:

1. When opening the email, the following message and error sometimes appear in the Emacs message buffer:

    Converting icalendar...done
    notmuch-show-insert-bodypart-internal: Wrong type argument: stringp, nil

2. Some (not all) of the view commands fail, e.g. "v", "V", "w". Others work, like "m" and "q".

3. Examination of the /tmp directory shows notmuch-ical temp files being created, but they are zero length.

This is related to the ical attachment. When I edited one of the emails to remove the attachment, the problem went away. I suspect it is related to the attachments being base64 encoded. The header of the MIME attachment shows:

    Content-Type: application/octet-stream; name="ConcurCalendarEntry.ics"
    Content-Transfer-Encoding: base64
    Content-Disposition: attachment; filename="ConcurCalendarEntry.ics"

The encoding is correct. The attachment decodes and looks right. With some details obscured, the attachment contains:

    BEGIN:VCALENDAR
    VERSION:2.0
    METHOD:PUBLISH
    BEGIN:VEVENT
    DTSTART:properly-formatted
    DTEND:properly-formatted
    DTSTAMP:properly-formatted
    LOCATION:
    SUMMARY:Concur Travel Itinerary
    DESCRIPTION:Lots of stuff with about 80 lines of description. All indented properly.
    UID:properly-formatted
    PRIORITY:3
    TRANSP:TRANSPARENT
    END:VEVENT
    END:VCALENDAR

I can live without the ics files, so fixing this is not a priority for me. If someone is interested in figuring this out, I've saved an email and can answer questions. I got lost trying to follow the lisp code paths for attachments, so I'm not sure whether it's the text or the base64 that is being handed off to icalendar.

R Horn
rjh...@alum.mit.edu
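For anyone trying to reproduce the report outside Emacs, a minimal Python sketch (standard library only) can confirm that such a part decodes to non-empty iCalendar data. It does not touch notmuch or the elisp code path; `calendar_parts` and `build_sample` are hypothetical helpers, and the sample message content is invented to match the headers quoted above.

```python
import email
from email.message import EmailMessage


def calendar_parts(raw_bytes):
    """Yield (filename, decoded_bytes) for each part that looks like iCalendar."""
    msg = email.message_from_bytes(raw_bytes)
    for part in msg.walk():
        if part.get_content_maintype() == "multipart":
            continue
        name = part.get_filename() or ""
        is_ics = (part.get_content_type() == "text/calendar"
                  or name.lower().endswith(".ics"))
        if is_ics:
            # decode=True undoes the Content-Transfer-Encoding (base64 here),
            # so the caller sees the calendar text rather than the base64.
            yield name, part.get_payload(decode=True)


def build_sample():
    """Build an invented message with a base64 application/octet-stream .ics part."""
    msg = EmailMessage()
    msg["Subject"] = "itinerary"
    msg.set_content("See attached.")
    ics = b"BEGIN:VCALENDAR\r\nVERSION:2.0\r\nEND:VCALENDAR\r\n"
    msg.add_attachment(ics, maintype="application", subtype="octet-stream",
                       filename="ConcurCalendarEntry.ics")
    return msg.as_bytes()


if __name__ == "__main__":
    for name, data in calendar_parts(build_sample()):
        print(name, len(data), "bytes")
```

Running `calendar_parts` on the saved email would show whether the decoded payload is intact, which helps separate a MIME decoding problem from a problem in the icalendar conversion step.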
Re: extract attachments from multiple mails
On Mon, Jun 25 2012, Jameson Graef Rollins wrote:
> I hacked up something simple below that will extract parts from messages
> matching a search term into the current directory (tested).

Improved/bug fixed version attached.

jamie.

Attachment: jnotmuch-extract-parts (application/octet-stream, 1046 bytes)
URL: <http://notmuchmail.org/pipermail/notmuch/attachments/20120625/cfab64bf/attachment.obj>
Re: extract attachments from multiple mails
On Mon, Jun 25 2012, David Belohrad wrote:
> someone can give an advice? I have many emails containing
> attachment. This is typically an output of copy-machine, which fragments
> a scan into multiple attachments.
>
> I'd like to extract those attached files in a one batch into a specific
> directory. Is there any way how to programmatically fetch those files?

notmuch show has a --part option for outputting a single part from a MIME message. Unfortunately there's currently no clean way to determine the number of parts in a message. But sort of hackily, you could do something like:

    for id in $(notmuch search --output=messages tag:files-to-extract); do
        for part in $(seq 1 10); do
            notmuch show --part=$part --format=raw $id > $id.$part
        done
    done

That will also save any multipart parts, which aren't really that useful, so you'll have to sort through them.

You can make something much cleaner with python, using the notmuch and email python bindings:

http://packages.python.org/notmuch/
http://docs.python.org/library/email-examples.html

I hacked up something simple below that will extract parts from messages matching a search term into the current directory (tested).

hth.

jamie.

    #!/usr/bin/env python

    import subprocess
    import sys
    import os
    import notmuch
    import email
    import errno
    import mimetypes

    # Ask notmuch where the mail database lives.
    dbpath = subprocess.check_output(['notmuch', 'config', 'get', 'database.path']).strip()
    db = notmuch.Database(dbpath)
    # The search terms are taken from the first command-line argument.
    query = notmuch.Query(db, sys.argv[1])
    for msg in query.search_messages():
        # Re-parse the message file with the email module to get at its parts.
        with open(msg.get_filename(), 'r') as f:
            msg = email.message_from_file(f)
        counter = 1
        for part in msg.walk():
            # Multipart containers carry no payload of their own.
            if part.get_content_maintype() == 'multipart':
                continue
            filename = part.get_filename()
            if not filename:
                # No name given: invent one from the part number and MIME type.
                ext = mimetypes.guess_extension(part.get_content_type())
                if not ext:
                    ext = '.bin'
                filename = 'part-%03d%s' % (counter, ext)
            counter += 1
            print(filename)
            with open(filename, 'wb') as f:
                f.write(part.get_payload(decode=True))
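As a complement to the script above, here is a hedged Python 3 sketch of the same per-message step using only the standard library, writing into a chosen directory as the original question asked. `extract_parts` is a hypothetical helper, not part of notmuch or its bindings. It assumes the message file paths are obtained separately, for example from `notmuch search --output=files <search terms>`.

```python
import email
import mimetypes
import os


def extract_parts(message_path, outdir):
    """Write every non-multipart part of one message file into outdir.

    Returns the list of paths written. Parts without a filename get a
    generated name based on their position and MIME type.
    """
    with open(message_path, "rb") as f:
        msg = email.message_from_binary_file(f)
    os.makedirs(outdir, exist_ok=True)
    written = []
    counter = 1
    for part in msg.walk():
        if part.get_content_maintype() == "multipart":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # basename() drops directory components from the sender-supplied
        # name so a part cannot be written outside outdir.
        filename = os.path.basename(part.get_filename() or "")
        if not filename:
            ext = mimetypes.guess_extension(part.get_content_type()) or ".bin"
            filename = "part-%03d%s" % (counter, ext)
        counter += 1
        target = os.path.join(outdir, filename)
        with open(target, "wb") as out:
            out.write(payload)
        written.append(target)
    return written
```

Note that parts with the same filename in different messages would overwrite each other in a shared directory; using one output directory per message is one way around that.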
[PATCH v2] ruby: extern linkage portability improvement
Tomi Ollila writes:
> Some C compilers are stricter when it comes to (tentative) definition
> of a variable -- in those compilers introducing variable without 'extern'
> keyword always allocates new 'storage' to the variable and linking all
> these modules fails due to duplicate symbols.

LGTM
[PATCH v2] ruby: extern linkage portability improvement
2012/6/24 Tomi Ollila:
> Some C compilers are stricter when it comes to (tentative) definition
> of a variable -- in those compilers introducing variable without 'extern'
> keyword always allocates new 'storage' to the variable and linking all
> these modules fails due to duplicate symbols.
>
> This is reimplementation of Charlie Allom's patch:
> id:"1336481467-66356-1-git-send-email-charlie at mediasp.com",
> written originally by Ali Polatel. This version has
> more accurate commit message.
> ---
>  bindings/ruby/defs.h |   46 +++---
>  bindings/ruby/init.c |   26 ++
>  2 files changed, 49 insertions(+), 23 deletions(-)
>
> diff --git a/bindings/ruby/defs.h b/bindings/ruby/defs.h
> index 3f9512b..fe81b3f 100644
> --- a/bindings/ruby/defs.h
> +++ b/bindings/ruby/defs.h
> @@ -24,31 +24,31 @@
>  #include
>  #include
>
> -VALUE notmuch_rb_cDatabase;
> -VALUE notmuch_rb_cDirectory;
> -VALUE notmuch_rb_cFileNames;
> -VALUE notmuch_rb_cQuery;
> -VALUE notmuch_rb_cThreads;
> -VALUE notmuch_rb_cThread;
> -VALUE notmuch_rb_cMessages;
> -VALUE notmuch_rb_cMessage;
> -VALUE notmuch_rb_cTags;
> -
> -VALUE notmuch_rb_eBaseError;
> -VALUE notmuch_rb_eDatabaseError;
> -VALUE notmuch_rb_eMemoryError;
> -VALUE notmuch_rb_eReadOnlyError;
> -VALUE notmuch_rb_eXapianError;
> -VALUE notmuch_rb_eFileError;
> -VALUE notmuch_rb_eFileNotEmailError;
> -VALUE notmuch_rb_eNullPointerError;
> -VALUE notmuch_rb_eTagTooLongError;
> -VALUE notmuch_rb_eUnbalancedFreezeThawError;
> -VALUE notmuch_rb_eUnbalancedAtomicError;
> -
> -ID ID_call;
> -ID ID_db_create;
> -ID ID_db_mode;
> +extern VALUE notmuch_rb_cDatabase;
> +extern VALUE notmuch_rb_cDirectory;
> +extern VALUE notmuch_rb_cFileNames;
> +extern VALUE notmuch_rb_cQuery;
> +extern VALUE notmuch_rb_cThreads;
> +extern VALUE notmuch_rb_cThread;
> +extern VALUE notmuch_rb_cMessages;
> +extern VALUE notmuch_rb_cMessage;
> +extern VALUE notmuch_rb_cTags;
> +
> +extern VALUE notmuch_rb_eBaseError;
> +extern VALUE notmuch_rb_eDatabaseError;
> +extern VALUE notmuch_rb_eMemoryError;
> +extern VALUE notmuch_rb_eReadOnlyError;
> +extern VALUE notmuch_rb_eXapianError;
> +extern VALUE notmuch_rb_eFileError;
> +extern VALUE notmuch_rb_eFileNotEmailError;
> +extern VALUE notmuch_rb_eNullPointerError;
> +extern VALUE notmuch_rb_eTagTooLongError;
> +extern VALUE notmuch_rb_eUnbalancedFreezeThawError;
> +extern VALUE notmuch_rb_eUnbalancedAtomicError;
> +
> +extern ID ID_call;
> +extern ID ID_db_create;
> +extern ID ID_db_mode;
>
>  /* RSTRING_PTR() is new in ruby-1.9 */
>  #if !defined(RSTRING_PTR)
> diff --git a/bindings/ruby/init.c b/bindings/ruby/init.c
> index 3fe60fb..f4931d3 100644
> --- a/bindings/ruby/init.c
> +++ b/bindings/ruby/init.c
> @@ -20,6 +20,32 @@
>
>  #include "defs.h"
>
> +VALUE notmuch_rb_cDatabase;
> +VALUE notmuch_rb_cDirectory;
> +VALUE notmuch_rb_cFileNames;
> +VALUE notmuch_rb_cQuery;
> +VALUE notmuch_rb_cThreads;
> +VALUE notmuch_rb_cThread;
> +VALUE notmuch_rb_cMessages;
> +VALUE notmuch_rb_cMessage;
> +VALUE notmuch_rb_cTags;
> +
> +VALUE notmuch_rb_eBaseError;
> +VALUE notmuch_rb_eDatabaseError;
> +VALUE notmuch_rb_eMemoryError;
> +VALUE notmuch_rb_eReadOnlyError;
> +VALUE notmuch_rb_eXapianError;
> +VALUE notmuch_rb_eFileError;
> +VALUE notmuch_rb_eFileNotEmailError;
> +VALUE notmuch_rb_eNullPointerError;
> +VALUE notmuch_rb_eTagTooLongError;
> +VALUE notmuch_rb_eUnbalancedFreezeThawError;
> +VALUE notmuch_rb_eUnbalancedAtomicError;
> +
> +ID ID_call;
> +ID ID_db_create;
> +ID ID_db_mode;
> +
>  /*
>   * Document-module: Notmuch
>   *
> --
> 1.7.1

Looks highly familiar yet strangely good to me.
[PATCH] manpages: consistent "format" for NAME section
The NAME section in manpages generally doesn't start with a capital letter (unless the word is a proper noun) and doesn't end with a period. Notmuch manual pages now match that "format".
---
See http://notmuchmail.org/manpages/ for reference.

 man/man1/notmuch-config.1       |    2 +-
 man/man1/notmuch-count.1        |    2 +-
 man/man1/notmuch-dump.1         |    2 +-
 man/man1/notmuch-new.1          |    2 +-
 man/man1/notmuch-reply.1        |    2 +-
 man/man1/notmuch-restore.1      |    2 +-
 man/man1/notmuch-search.1       |    2 +-
 man/man1/notmuch-show.1         |    2 +-
 man/man1/notmuch-tag.1          |    2 +-
 man/man7/notmuch-search-terms.7 |    2 +-
 10 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/man/man1/notmuch-config.1 b/man/man1/notmuch-config.1
index 4f7985c..2ee555d 100644
--- a/man/man1/notmuch-config.1
+++ b/man/man1/notmuch-config.1
@@ -1,6 +1,6 @@
 .TH NOTMUCH-CONFIG 1 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-config \- Access notmuch configuration file.
+notmuch-config \- access notmuch configuration file
 .SH SYNOPSIS
 .B notmuch config get
diff --git a/man/man1/notmuch-count.1 b/man/man1/notmuch-count.1
index 8029174..8551ab2 100644
--- a/man/man1/notmuch-count.1
+++ b/man/man1/notmuch-count.1
@@ -1,6 +1,6 @@
 .TH NOTMUCH-COUNT 1 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-count \- Count messages matching the given search terms.
+notmuch-count \- count messages matching the given search terms
 .SH SYNOPSIS
 .B notmuch count
diff --git a/man/man1/notmuch-dump.1 b/man/man1/notmuch-dump.1
index 9c7dd84..64abf01 100644
--- a/man/man1/notmuch-dump.1
+++ b/man/man1/notmuch-dump.1
@@ -1,6 +1,6 @@
 .TH NOTMUCH-DUMP 1 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-dump \- Creates a plain-text dump of the tags of each message.
+notmuch-dump \- creates a plain-text dump of the tags of each message
 .SH SYNOPSIS
diff --git a/man/man1/notmuch-new.1 b/man/man1/notmuch-new.1
index cd83a88..e01f2eb 100644
--- a/man/man1/notmuch-new.1
+++ b/man/man1/notmuch-new.1
@@ -1,6 +1,6 @@
 .TH NOTMUCH-NEW 1 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-new \- Incorporate new mail into the notmuch database.
+notmuch-new \- incorporate new mail into the notmuch database
 .SH SYNOPSIS
 .B notmuch new
diff --git a/man/man1/notmuch-reply.1 b/man/man1/notmuch-reply.1
index fb5114c..5aa86c0 100644
--- a/man/man1/notmuch-reply.1
+++ b/man/man1/notmuch-reply.1
@@ -1,6 +1,6 @@
 .TH NOTMUCH-REPLY 1 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-reply \- Constructs a reply template for a set of messages.
+notmuch-reply \- constructs a reply template for a set of messages
 .SH SYNOPSIS
diff --git a/man/man1/notmuch-restore.1 b/man/man1/notmuch-restore.1
index 3156af7..18281c7 100644
--- a/man/man1/notmuch-restore.1
+++ b/man/man1/notmuch-restore.1
@@ -1,6 +1,6 @@
 .TH NOTMUCH-RESTORE 1 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-restore \- Restores the tags from the given file (see notmuch dump).
+notmuch-restore \- restores the tags from the given file (see notmuch dump)
 .SH SYNOPSIS
diff --git a/man/man1/notmuch-search.1 b/man/man1/notmuch-search.1
index 5c72c4b..b42eb2c 100644
--- a/man/man1/notmuch-search.1
+++ b/man/man1/notmuch-search.1
@@ -1,6 +1,6 @@
 .TH NOTMUCH-SEARCH 1 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-search \- Search for messages matching the given search terms.
+notmuch-search \- search for messages matching the given search terms
 .SH SYNOPSIS
 .B notmuch search
diff --git a/man/man1/notmuch-show.1 b/man/man1/notmuch-show.1
index 4aab17c..b51a54c 100644
--- a/man/man1/notmuch-show.1
+++ b/man/man1/notmuch-show.1
@@ -1,6 +1,6 @@
 .TH NOTMUCH-SHOW 1 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-show \- Show messages matching the given search terms.
+notmuch-show \- show messages matching the given search terms
 .SH SYNOPSIS
 .B notmuch show
diff --git a/man/man1/notmuch-tag.1 b/man/man1/notmuch-tag.1
index 27e682e..d810e1b 100644
--- a/man/man1/notmuch-tag.1
+++ b/man/man1/notmuch-tag.1
@@ -1,6 +1,6 @@
 .TH NOTMUCH-TAG 1 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-tag \- Add/remove tags for all messages matching the search terms.
+notmuch-tag \- add/remove tags for all messages matching the search terms
 .SH SYNOPSIS
 .B notmuch tag
diff --git a/man/man7/notmuch-search-terms.7 b/man/man7/notmuch-search-terms.7
index c559ed6..b8ab52d 100644
--- a/man/man7/notmuch-search-terms.7
+++ b/man/man7/notmuch-search-terms.7
@@ -1,7 +1,7 @@
 .TH NOTMUCH-SEARCH-TERMS 7 2012-06-01 "Notmuch 0.13.2"
 .SH NAME
-notmuch-search-terms \- Syntax for notmuch queries
+notmuch-search-terms \- syntax for notmuch queries
 .SH SYNOPSIS
--
1.7.1
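The convention this patch applies could be checked mechanically. The following is a minimal Python sketch, not part of the patch or of notmuch's build: `name_line` and `check_name_line` are hypothetical helpers. The check cannot recognise proper nouns, so a capitalised first word is only a hint that a human should look.

```python
def name_line(manpage_text):
    """Return the first non-empty line after '.SH NAME', or None if absent."""
    lines = manpage_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == ".SH NAME":
            for following in lines[i + 1:]:
                if following.strip():
                    return following
    return None


def check_name_line(line):
    """Return a list of convention problems found in one NAME line."""
    # The NAME line separates page name and description with a roff minus.
    name, sep, desc = line.partition(r" \- ")
    if not sep:
        return ["missing name/description separator"]
    problems = []
    desc = desc.strip()
    if desc[:1].isupper():
        # May be a false positive when the first word is a proper noun.
        problems.append("description starts with a capital letter")
    if desc.endswith("."):
        problems.append("description ends with a period")
    return problems
```

Applied to the pre-patch notmuch-new.1 text, `check_name_line(name_line(text))` reports both problems; the post-patch text yields an empty list.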