Search mail from people with different addresses and incorrectly configured clients
Hi, Sometimes, people's mail clients are configured incorrectly and do not show the sender name. Often the same person uses different email addresses. Sometimes they use different names for their alternate email addresses. When I search for an email originated from a specific person, I want the search to cover all messages that match the different criteria mentioned above. Has anyone found a good way to use tags to aggregate emails from the same person? Will it have some disadvantages? It sounds straightforward, but the management of this would be quite an effort. The alternative is to create long search strings that encompass all email addresses of the person. This has less management effort, but I would need to somewhere store all email addresses associated with a person. Sebastian ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v4 6/7] On deletion, replace with ghost when other active messages in thread
There is no need to add a ghost message upon deletion if there are no other active messages in the thread. Also, if the message being deleted was a ghost already, we can just go ahead and delete it. --- lib/message.cc | 58 ++ 1 file changed, 42 insertions(+), 16 deletions(-) diff --git a/lib/message.cc b/lib/message.cc index 2399ab3..1b423b0 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -1044,11 +1044,14 @@ _notmuch_message_delete (notmuch_message_t *message) { notmuch_status_t status; Xapian::WritableDatabase *db; -const char *mid, *tid; +const char *mid, *tid, *query_string; notmuch_message_t *ghost; notmuch_private_status_t private_status; notmuch_database_t *notmuch; - +notmuch_query_t *query; +unsigned int count = 0; +notmuch_bool_t is_ghost; + mid = notmuch_message_get_message_id (message); tid = notmuch_message_get_thread_id (message); notmuch = message->notmuch; @@ -1059,22 +1062,45 @@ _notmuch_message_delete (notmuch_message_t *message) db = static_cast (notmuch->xapian_db); db->delete_document (message->doc_id); - -/* and reintroduce a ghost in its place */ -ghost = _notmuch_message_create_for_message_id (notmuch, mid, &private_status); -if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { - private_status = _notmuch_message_initialize_ghost (ghost, tid); - if (! private_status) - _notmuch_message_sync (ghost); -} else if (private_status == NOTMUCH_PRIVATE_STATUS_SUCCESS) { - /* this is deeply weird, and we should not have gotten into - this state. is there a better error message to return - here? */ - return NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; + +/* if this was a ghost to begin with, we are done */ +private_status = _notmuch_message_has_term (message, "type", "ghost", &is_ghost); +if (private_status) + return COERCE_STATUS (private_status, + "Error trying to determine whether message was a ghost"); +if (is_ghost) + return NOTMUCH_STATUS_SUCCESS; + +query_string = talloc_asprintf (message, "thread:%s", tid); +query = notmuch_query_create (notmuch, query_string); +if (query == NULL) + return NOTMUCH_STATUS_OUT_OF_MEMORY; +status = notmuch_query_count_messages_st (query, &count); +if (status) { + notmuch_query_destroy (query); + return status; } -notmuch_message_destroy (ghost); -return COERCE_STATUS (private_status, "Error converting to ghost message"); +if (count > 0) { + /* reintroduce a ghost in its place because there are still +* other active messages in this thread: */ + ghost = _notmuch_message_create_for_message_id (notmuch, mid, &private_status); + if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { + private_status = _notmuch_message_initialize_ghost (ghost, tid); + if (! private_status) + _notmuch_message_sync (ghost); + } else if (private_status == NOTMUCH_PRIVATE_STATUS_SUCCESS) { + /* this is deeply weird, and we should not have gotten + into this state. is there a better error message to + return here? */ + status = NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; + } + + notmuch_message_destroy (ghost); + status = COERCE_STATUS (private_status, "Error converting to ghost message"); +} +notmuch_query_destroy (query); +return status; } /* Transform a blank message into a ghost message. The caller must -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v4 7/7] complete ghost-on-removal-when-shared-thread-exists
To fully complete the ghost-on-removal-when-shared-thread-exists proposal, we need to clear all ghost messages when the last active message is removed from a thread. --- lib/message.cc | 20 test/T590-thread-breakage.sh | 6 +- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/lib/message.cc b/lib/message.cc index 1b423b0..39dbe53 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -1098,6 +1098,26 @@ _notmuch_message_delete (notmuch_message_t *message) notmuch_message_destroy (ghost); status = COERCE_STATUS (private_status, "Error converting to ghost message"); +} else { + /* the thread is empty; drop all ghost messages from it */ + notmuch_messages_t *messages; + status = _notmuch_query_search_documents (query, + "ghost", + &messages); + if (status == NOTMUCH_STATUS_SUCCESS) { + notmuch_status_t last_error = NOTMUCH_STATUS_SUCCESS; + while (notmuch_messages_valid (messages)) { + message = notmuch_messages_get (messages); + status = _notmuch_message_delete (message); + if (status) /* we'll report the last failure we see; +* if there is more than one failure, we +* forget about previous ones */ + last_error = status; + notmuch_message_destroy (message); + notmuch_messages_move_to_next (messages); + } + status = last_error; + } } notmuch_query_destroy (query); return status; diff --git a/test/T590-thread-breakage.sh b/test/T590-thread-breakage.sh index 81f27db..45446b9 100755 --- a/test/T590-thread-breakage.sh +++ b/test/T590-thread-breakage.sh @@ -121,10 +121,6 @@ notmuch new >/dev/null test_thread_count 0 'All messages gone: no threads' test_content_count apple 0 test_content_count banana 0 -test_begin_subtest 'No ghosts should remain after full thread deletion' -# this is known to fail; we are leaking ghost messages deliberately -test_subtest_known_broken -ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) -test_expect_equal "$ghosts" "0" +test_ghost_count 0 'No ghosts should remain after full thread deletion' test_done -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v4 1/7] test: add test-binary to print the number of ghost messages
From: David Bremner This one-liner seems preferable to the complications of depending on delve, getting the binary name right and parsing the output. --- test/Makefile.local | 4 test/ghost-report.cc | 12 2 files changed, 16 insertions(+) create mode 100644 test/ghost-report.cc diff --git a/test/Makefile.local b/test/Makefile.local index 30d420e..022f2cf 100644 --- a/test/Makefile.local +++ b/test/Makefile.local @@ -38,6 +38,9 @@ $(dir)/parse-time: $(dir)/parse-time.o parse-time-string/parse-time-string.o $(dir)/make-db-version: $(dir)/make-db-version.o $(call quiet,CXX) $^ -o $@ $(LDFLAGS) $(XAPIAN_LDFLAGS) +$(dir)/ghost-report: $(dir)/ghost-report.o + $(call quiet,CXX) $^ -o $@ $(LDFLAGS) $(XAPIAN_LDFLAGS) + .PHONY: test check test_main_srcs=$(dir)/arg-test.c \ @@ -47,6 +50,7 @@ test_main_srcs=$(dir)/arg-test.c \ $(dir)/smtp-dummy.c \ $(dir)/symbol-test.cc \ $(dir)/make-db-version.cc \ + $(dir)/ghost-report.cc test_srcs=$(test_main_srcs) $(dir)/database-test.c diff --git a/test/ghost-report.cc b/test/ghost-report.cc new file mode 100644 index 000..1739be4 --- /dev/null +++ b/test/ghost-report.cc @@ -0,0 +1,12 @@ +#include +#include + +int main(int argc, char **argv) { + +if (argc < 2) { + std::cerr << "usage: ghost-report xapian-dir" << std::endl; +} + +Xapian::Database db(argv[1]); +std::cout << db.get_termfreq("Tghost") << std::endl; +} -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v4 3/7] fix thread breakage via ghost-on-removal
implement ghost-on-removal, the solution to T590-thread-breakage.sh that just adds a ghost message after removing each message. It leaks information about whether we've ever seen a given message id, but it's a fairly simple implementation. Note that _resolve_message_id_to_thread_id already introduces new message_ids to the database, so i think just searching for a given message ID may introduce the same metadata leakage. --- lib/message.cc | 30 +++--- test/T590-thread-breakage.sh | 25 - 2 files changed, 39 insertions(+), 16 deletions(-) diff --git a/lib/message.cc b/lib/message.cc index 8d72ea2..435b78a 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -1037,20 +1037,44 @@ _notmuch_message_sync (notmuch_message_t *message) message->modified = FALSE; } -/* Delete a message document from the database. */ +/* Delete a message document from the database, leaving a ghost + * message in its place */ notmuch_status_t _notmuch_message_delete (notmuch_message_t *message) { notmuch_status_t status; Xapian::WritableDatabase *db; +const char *mid, *tid; +notmuch_message_t *ghost; +notmuch_private_status_t private_status; +notmuch_database_t *notmuch; + +mid = notmuch_message_get_message_id (message); +tid = notmuch_message_get_thread_id (message); +notmuch = message->notmuch; status = _notmuch_database_ensure_writable (message->notmuch); if (status) return status; -db = static_cast (message->notmuch->xapian_db); +db = static_cast (notmuch->xapian_db); db->delete_document (message->doc_id); -return NOTMUCH_STATUS_SUCCESS; + +/* and reintroduce a ghost in its place */ +ghost = _notmuch_message_create_for_message_id (notmuch, mid, &private_status); +if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { + private_status = _notmuch_message_initialize_ghost (ghost, tid); + if (! private_status) + _notmuch_message_sync (ghost); +} else if (private_status == NOTMUCH_PRIVATE_STATUS_SUCCESS) { + /* this is deeply weird, and we should not have gotten into + this state. is there a better error message to return + here? */ + return NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; +} + +notmuch_message_destroy (ghost); +return COERCE_STATUS (private_status, "Error converting to ghost message"); } /* Transform a blank message into a ghost message. The caller must diff --git a/test/T590-thread-breakage.sh b/test/T590-thread-breakage.sh index 2b933f6..81f27db 100755 --- a/test/T590-thread-breakage.sh +++ b/test/T590-thread-breakage.sh @@ -96,20 +96,11 @@ notmuch new >/dev/null test_thread_count 1 'First message removed: still only one thread' test_content_count apple 0 test_content_count banana 1 -test_begin_subtest 'should be one ghost after first message removed' -test_subtest_known_broken -ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) -test_expect_equal "$ghosts" "1" +test_ghost_count 1 'should be one ghost after first message removed' message_a notmuch new >/dev/null -# this is known to fail (it shows 2 threads) because no "ghost -# message" was created for message A when it was removed from the -# index, despite message B still pointing to it. -test_begin_subtest 'First message reappears: should return to the same thread' -test_subtest_known_broken -count=$(notmuch count --output=threads) -test_expect_equal "$count" "1" +test_thread_count 1 'First message reappears: should return to the same thread' test_content_count apple 1 test_content_count banana 1 test_ghost_count 0 @@ -119,13 +110,21 @@ notmuch new >/dev/null test_thread_count 1 'Removing second message: still only one thread' test_content_count apple 1 test_content_count banana 0 -test_ghost_count 0 'No ghosts should remain after deletion of second message' +test_begin_subtest 'No ghosts should remain after deletion of second message' +# this is known to fail; we are leaking ghost messages deliberately +test_subtest_known_broken +ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) +test_expect_equal "$ghosts" "0" rm -f ${MAIL_DIR}/cur/a notmuch new >/dev/null test_thread_count 0 'All messages gone: no threads' test_content_count apple 0 test_content_count banana 0 -test_ghost_count 0 +test_begin_subtest 'No ghosts should remain after full thread deletion' +# this is known to fail; we are leaking ghost messages deliberately +test_subtest_known_broken +ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) +test_expect_equal "$ghosts" "0" test_done -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v4 5/7] Introduce _notmuch_message_has_term()
It can be useful to easily tell if a given message has a given term associated with it. --- lib/message.cc| 35 +++ lib/notmuch-private.h | 6 ++ 2 files changed, 41 insertions(+) diff --git a/lib/message.cc b/lib/message.cc index 435b78a..2399ab3 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -1217,6 +1217,41 @@ _notmuch_message_remove_term (notmuch_message_t *message, return NOTMUCH_PRIVATE_STATUS_SUCCESS; } +notmuch_private_status_t +_notmuch_message_has_term (notmuch_message_t *message, + const char *prefix_name, + const char *value, + notmuch_bool_t *result) +{ +char *term; +notmuch_bool_t out = FALSE; +notmuch_private_status_t status = NOTMUCH_PRIVATE_STATUS_SUCCESS; + +if (value == NULL) + return NOTMUCH_PRIVATE_STATUS_NULL_POINTER; + +term = talloc_asprintf (message, "%s%s", + _find_prefix (prefix_name), value); + +if (strlen (term) > NOTMUCH_TERM_MAX) + return NOTMUCH_PRIVATE_STATUS_TERM_TOO_LONG; + +try { + /* Look for the exact term */ + Xapian::TermIterator i = message->doc.termlist_begin (); + i.skip_to (term); + if (i != message->doc.termlist_end () && + !strcmp ((*i).c_str (), term)) + out = TRUE; +} catch (Xapian::Error &error) { + status = NOTMUCH_PRIVATE_STATUS_XAPIAN_EXCEPTION; +} +talloc_free (term); + +*result = out; +return status; +} + notmuch_status_t notmuch_message_add_tag (notmuch_message_t *message, const char *tag) { diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h index d95bf31..9280797 100644 --- a/lib/notmuch-private.h +++ b/lib/notmuch-private.h @@ -280,6 +280,12 @@ _notmuch_message_remove_term (notmuch_message_t *message, const char *value); notmuch_private_status_t +_notmuch_message_has_term (notmuch_message_t *message, + const char *prefix_name, + const char *value, + notmuch_bool_t *result); + +notmuch_private_status_t _notmuch_message_gen_terms (notmuch_message_t *message, const char *prefix_name, const char *text); -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v4 4/7] Add internal functions to search for alternate doc types
Publicly we are only exposing the non-ghost documents (of "type" "mail"). But internally we might want to inspect the ghost messages as well. This changeset adds two new private interfaces to queries to recover information about alternate document types. --- lib/notmuch-private.h | 11 +++ lib/query.cc | 18 -- 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h index 5dd4770..d95bf31 100644 --- a/lib/notmuch-private.h +++ b/lib/notmuch-private.h @@ -477,6 +477,17 @@ void _notmuch_doc_id_set_remove (notmuch_doc_id_set_t *doc_ids, unsigned int doc_id); +/* querying xapian documents by type (e.g. "mail" or "ghost"): */ +notmuch_status_t +_notmuch_query_search_documents (notmuch_query_t *query, +const char *type, +notmuch_messages_t **out); + +notmuch_status_t +_notmuch_query_count_documents (notmuch_query_t *query, + const char *type, + unsigned *count_out); + /* message.cc */ void diff --git a/lib/query.cc b/lib/query.cc index e627bfc..77a7926 100644 --- a/lib/query.cc +++ b/lib/query.cc @@ -187,6 +187,14 @@ notmuch_status_t notmuch_query_search_messages_st (notmuch_query_t *query, notmuch_messages_t **out) { +return _notmuch_query_search_documents (query, "mail", out); +} + +notmuch_status_t +_notmuch_query_search_documents (notmuch_query_t *query, +const char *type, +notmuch_messages_t **out) +{ notmuch_database_t *notmuch = query->notmuch; const char *query_string = query->query_string; notmuch_mset_messages_t *messages; @@ -208,7 +216,7 @@ notmuch_query_search_messages_st (notmuch_query_t *query, Xapian::Enquire enquire (*notmuch->xapian_db); Xapian::Query mail_query (talloc_asprintf (query, "%s%s", _find_prefix ("type"), - "mail")); + type)); Xapian::Query string_query, final_query, exclude_query; Xapian::MSet mset; Xapian::MSetIterator iterator; @@ -554,6 +562,12 @@ notmuch_query_count_messages (notmuch_query_t *query) notmuch_status_t notmuch_query_count_messages_st (notmuch_query_t *query, unsigned *count_out) { +return _notmuch_query_count_documents (query, "mail", count_out); +} + +notmuch_status_t +_notmuch_query_count_documents (notmuch_query_t *query, const char *type, unsigned *count_out) +{ notmuch_database_t *notmuch = query->notmuch; const char *query_string = query->query_string; Xapian::doccount count = 0; @@ -562,7 +576,7 @@ notmuch_query_count_messages_st (notmuch_query_t *query, unsigned *count_out) Xapian::Enquire enquire (*notmuch->xapian_db); Xapian::Query mail_query (talloc_asprintf (query, "%s%s", _find_prefix ("type"), - "mail")); + type)); Xapian::Query string_query, final_query, exclude_query; Xapian::MSet mset; unsigned int flags = (Xapian::QueryParser::FLAG_BOOLEAN | -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v4 2/7] test thread breakage when messages are removed and re-added
This test (T590-thread-breakage.sh) has known-broken subtests. If you have a two-message thread where message "B" is in-reply-to "A", notmuch rightly sees this as a single thread. But if you: * remove "A" from the message store * run "notmuch new" * add "A" back into the message store * re-run "notmuch new" Then notmuch sees the messages as distinct threads. This happens because if you insert "B" initially (before anything is known about "A"), then a "ghost message" gets added to the database in reference to "A" that is in the same thread, which "A" takes over when it appears. But if "A" is subsequently removed, no ghost message is retained, so when "A" appears, it is treated as a new thread. I see a few options to fix this: ghost-on-removal We could unilaterally add a ghost upon message removal. This has a few disadvantages: the message index would leak information about what messages the user has ever been exposed to, and we also create a perpetually-growing dataset -- the ghosts can never be removed. ghost-on-removal-when-shared-thread-exists -- We could add a ghost upon message removal iff there are other non-ghost messages with the same thread ID. We'd also need to remove all ghost messages that share a thread when the last non-ghost message in that thread is removed. This still has a bit of information leakage, though: the message index would reveal that i've seen a newer message in a thread, even if i had deleted it from my message store track-dependencies -- rather than a simple "ghost-message" we could store all the (A,B) message-reference pairs internally, showing which messages A reference which other messages B. Then removal of message X would require deleting all message-reference pairs (X,B), and only deleting a ghost message if no (A,X) reference pair exists. This requires modifying the database by adding a new and fairly weird table that would need to be indexed by both columns. I don't know whether xapian has nice ways to do that. scan-dependencies - Without modifying the database, we could do something less efficient. Upon removal of message X, we could scan the headers of all non-ghost messages that share a thread with X. If any of those messages refers to X, we would add a ghost message. If none of them do, then we would just drop X entirely from the table. - One risk of attempted fixes to this problem is that we could fail to remove the search term indexes entirely. This test contains additional subtests to guard against that. This test also ensures that the right number of ghost messages exist in each situation; this will help us ensure we don't accumulate ghosts indefinitely or leak too much information about what messages we've seen or not seen, while still making it easy to reassemble threads when messages come in out-of-order. --- test/T590-thread-breakage.sh | 131 +++ 1 file changed, 131 insertions(+) create mode 100755 test/T590-thread-breakage.sh diff --git a/test/T590-thread-breakage.sh b/test/T590-thread-breakage.sh new file mode 100755 index 000..2b933f6 --- /dev/null +++ b/test/T590-thread-breakage.sh @@ -0,0 +1,131 @@ +#!/usr/bin/env bash +# +# Copyright (c) 2016 Daniel Kahn Gillmor +# + +test_description='thread breakage during reindexing + +notmuch uses ghost documents to track messages we have seen references +to but have never seen. Regardless of the order of delivery, message +deletion, and reindexing, the list of ghost messages for a given +stored corpus should not vary, so that threads can be reassmebled +cleanly. + +In practice, we accept a small amount of variation (and therefore +traffic pattern metadata leakage to be stored in the index) for the +sake of efficiency. + +This test also embeds some subtests to ensure that indexing actually +works properly and attempted fixes to threading issues do not break +the expected contents of the index.' + +. ./test-lib.sh || exit 1 + +message_a() { +mkdir -p ${MAIL_DIR}/cur +cat > ${MAIL_DIR}/cur/a < +From: Alice +To: Bob +Date: Thu, 31 Mar 2016 20:10:00 -0400 + +This is the first message in the thread. +Apple +EOF +} + +message_b() { +mkdir -p ${MAIL_DIR}/cur +cat > ${MAIL_DIR}/cur/b < +In-Reply-To: +References: +From: Bob +To: Alice +Date: Thu, 31 Mar 2016 20:15:00 -0400 + +This is the second message in the thread. +Banana +EOF +} + + +test_content_count() { +test_begin_subtest "${3:-looking for $2 instance of '$1'}" +count=$(notmuch count --output=threads "$1") +test_expect_equal "$count" "$2" +} + +test_thread_count() { +test_begin_subtest "${2:-Expecting $1 thread(s)}" +count=$(notmuch count --output=threads) +test_expect_equal "$count" "$1" +} + +test_ghost_count() { +test_begin_subtest "${2:-Expecting $1 ghosts(s)}" +ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) +te
Re: [PATCH v2 2/7] verify during thread-breakage that messages are removed as well
On Tue 2016-04-05 22:20:45 -0300, David Bremner wrote: > Daniel Kahn Gillmor writes: > >> >> +test_subject_count() { >> +notmuch new >/dev/null >> +test_begin_subtest "${3:-looking for $2 instance of '$1'}" >> +count=$(notmuch count --output=threads "$1") >> +test_expect_equal "$count" "$2" >> +} > > It's confusing that this doesn't have anything to do with the subject: > prefix or the corresponding header. Sigh. sorry, i just realized i missed this absolutely correct comment when preparing the v3 series. v4 is on its way, and hopefully it addresses all the excellent feedback i got here. apologies for the noise. --dkg ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH 3/4] configure: add test for default xapian backend
This is mainly for the test suite. We already expect the tests to be run in the same environment as configure was run, at least to get the name of the python interpreter. So we are not really imposing a new restriction. --- configure| 26 +- test/test-lib.sh | 11 +++ 2 files changed, 36 insertions(+), 1 deletion(-) diff --git a/configure b/configure index eb6dbac..4fc31cc 100755 --- a/configure +++ b/configure @@ -371,7 +371,25 @@ if [ ${have_xapian} = "1" ]; then esac fi - +default_xapian_backend="" +if [ ${have_xapian} = "1" ]; then +printf "Testing default Xapian backend... " +cat >_default_backend.cc < +int main(int argc, char** argv) { + Xapian::WritableDatabase db("test.db",Xapian::DB_CREATE_OR_OPEN); +} +EOF +${CXX} ${CXXLAGS} ${xapian_cxxflags} _default_backend.cc -o _default_backend ${xapian_ldflags} +./_default_backend +if [ -f test.db/iamglass ]; then + default_xapian_backend=glass +else + default_xapian_backend=chert +fi +printf "${default_xapian_backend}\n"; +rm -rf test.db _default_backend _default_backend.cc +fi # we need to have a version >= 2.6.5 to avoid a crypto bug. We need # 2.6.7 for permissive "From " header handling. GMIME_MINVER=2.6.7 @@ -1001,6 +1019,9 @@ LINKER_RESOLVES_LIBRARY_DEPENDENCIES = ${linker_resolves_library_dependencies} XAPIAN_CXXFLAGS = ${xapian_cxxflags} XAPIAN_LDFLAGS = ${xapian_ldflags} +# Which backend will Xapian use by default? +DEFAULT_XAPIAN_BACKEND = ${default_xapian_backend} + # Flags needed to compile and link against GMime GMIME_CFLAGS = ${gmime_cflags} GMIME_LDFLAGS = ${gmime_ldflags} @@ -1077,6 +1098,9 @@ cat > sh.config /dev/null && test_set_prereq SYMLINKS rm -f y +# convert variable from configure to more convenient form +case "$NOTMUCH_DEFAULT_XAPIAN_BACKEND" in +glass) + db_ending=glass +;; +chert) + db_ending=DB +;; +*) + error "Unknown Xapian backend $NOTMUCH_DEFAULT_XAPIAN_BACKEND" +esac # declare prerequisites for external binaries used in tests test_declare_external_prereq dtach test_declare_external_prereq emacs -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH 1/4] test: improve error handling in lib-error tests
There is at least one bug fixed here (missing parameter to printf), even if exiting via segfault is considered OK. --- test/T560-lib-error.sh | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/test/T560-lib-error.sh b/test/T560-lib-error.sh index 59a479c..49d3674 100755 --- a/test/T560-lib-error.sh +++ b/test/T560-lib-error.sh @@ -202,16 +202,20 @@ int main (int argc, char** argv) notmuch_database_t *db; notmuch_status_t stat; char *path; + char *msg = NULL; int fd; - stat = notmuch_database_open (argv[1], NOTMUCH_DATABASE_MODE_READ_WRITE, &db); + stat = notmuch_database_open_verbose (argv[1], NOTMUCH_DATABASE_MODE_READ_WRITE, &db, &msg); if (stat != NOTMUCH_STATUS_SUCCESS) { - fprintf (stderr, "error opening database: %d\n", stat); + fprintf (stderr, "error opening database: %d %s\n", stat, msg ? msg : ""); + exit (1); } path = talloc_asprintf (db, "%s/.notmuch/xapian/postlist.DB", argv[1]); fd = open(path,O_WRONLY|O_TRUNC); - if (fd < 0) - fprintf (stderr, "error opening %s\n"); + if (fd < 0) { + fprintf (stderr, "error opening %s\n", argv[1]); + exit (1); + } EOF cat <<'EOF' > c_tail if (stat) { -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH 2/4] test/atomicity: guard chert-only optimization
This should potentially be updated to have an equivalent optimization for the glass backend, but it in my unscientific tests, the glass backend without the optimization is faster then the chert backend with. --- test/atomicity.py | 23 +-- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/test/atomicity.py b/test/atomicity.py index 01a4205..1ca52b9 100644 --- a/test/atomicity.py +++ b/test/atomicity.py @@ -29,16 +29,19 @@ class RenameBreakpoint(gdb.Breakpoint): self.n = 0 def stop(self): -# As an optimization, only consider snapshots after a Xapian -# has really committed. Xapian overwrites record.base? as the -# last step in the commit, so keep an eye on their inumbers. -inodes = {} -for path in glob.glob('%s/.notmuch/xapian/record.base*' % maildir): -inodes[path] = os.stat(path).st_ino -if inodes == self.last_inodes: -# Continue -return False -self.last_inodes = inodes +xapiandir = '%s/.notmuch/xapian' % maildir +if os.path.isfile('%s/iamchert' % xapiandir): +# As an optimization, only consider snapshots after a +# Xapian has really committed. The chert backend +# overwrites record.base? as the last step in the commit, +# so keep an eye on their inumbers. +inodes = {} +for path in glob.glob('%s/record.base*' % xapiandir): +inodes[path] = os.stat(path).st_ino +if inodes == self.last_inodes: +# Continue +return False +self.last_inodes = inodes # Save a backtrace in case the test does fail backtrace = gdb.execute('backtrace', to_string=True) -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH 4/4] test: cope with glass backend file naming variations
In several places in the test suite we intentionally corrupt the Xapian database in order to test error handling. This corruption is specific to the on-disk organization of the database, and that changed with the glass backend. We use the previously computed default backend to make the tests adapt to changing names. --- test/T050-new.sh | 2 +- test/T060-count.sh | 4 ++-- test/T150-tagging.sh | 4 ++-- test/T360-symbol-hiding.sh | 4 ++-- test/T560-lib-error.sh | 4 ++-- 5 files changed, 9 insertions(+), 9 deletions(-) diff --git a/test/T050-new.sh b/test/T050-new.sh index 93a6fa9..df9e89a 100755 --- a/test/T050-new.sh +++ b/test/T050-new.sh @@ -284,7 +284,7 @@ notmuch config set new.tags $OLDCONFIG test_begin_subtest "Xapian exception: read only files" -chmod u-w ${MAIL_DIR}/.notmuch/xapian/*.DB +chmod u-w ${MAIL_DIR}/.notmuch/xapian/*.${db_ending} output=$(NOTMUCH_NEW --debug 2>&1 | sed 's/: .*$//' ) chmod u+w ${MAIL_DIR}/.notmuch/xapian/*.DB test_expect_equal "$output" "A Xapian exception occurred opening database" diff --git a/test/T060-count.sh b/test/T060-count.sh index 3fec94e..0ac8314 100755 --- a/test/T060-count.sh +++ b/test/T060-count.sh @@ -95,7 +95,7 @@ test_expect_equal_file EXPECTED OUTPUT backup_database test_begin_subtest "error message for database open" -dd if=/dev/zero of="${MAIL_DIR}/.notmuch/xapian/postlist.DB" count=3 +dd if=/dev/zero of="${MAIL_DIR}/.notmuch/xapian/postlist.${db_ending}" count=3 notmuch count '*' 2>OUTPUT 1>/dev/null output=$(sed 's/^\(A Xapian exception [^:]*\):.*$/\1/' OUTPUT) test_expect_equal "${output}" "A Xapian exception occurred opening database" @@ -105,7 +105,7 @@ cat < count-files.gdb set breakpoint pending on break count_files commands -shell cp /dev/null ${MAIL_DIR}/.notmuch/xapian/postlist.DB +shell cp /dev/null ${MAIL_DIR}/.notmuch/xapian/postlist.${db_ending} continue end run diff --git a/test/T150-tagging.sh b/test/T150-tagging.sh index 8adcabc..6fd6a18 100755 --- a/test/T150-tagging.sh +++ b/test/T150-tagging.sh @@ -287,9 +287,9 @@ test_expect_code 1 "Empty tag names" 'notmuch tag + One' test_expect_code 1 "Tag name beginning with -" 'notmuch tag +- One' test_begin_subtest "Xapian exception: read only files" -chmod u-w ${MAIL_DIR}/.notmuch/xapian/*.DB +chmod u-w ${MAIL_DIR}/.notmuch/xapian/*.${db_ending} output=$(notmuch tag +something '*' 2>&1 | sed 's/: .*$//' ) -chmod u+w ${MAIL_DIR}/.notmuch/xapian/*.DB +chmod u+w ${MAIL_DIR}/.notmuch/xapian/{*.DB,*.glass} test_expect_equal "$output" "A Xapian exception occurred opening database" test_done diff --git a/test/T360-symbol-hiding.sh b/test/T360-symbol-hiding.sh index 89e7f16..3f18ec1 100755 --- a/test/T360-symbol-hiding.sh +++ b/test/T360-symbol-hiding.sh @@ -15,11 +15,11 @@ test_begin_subtest 'running test' run_test mkdir -p ${PWD}/fakedb/.notmuch ( LD_LIBRARY_PATH="$TEST_DIRECTORY/../lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" \ $TEST_DIRECTORY/symbol-test ${PWD}/fakedb ${PWD}/nonexistent \ -2>&1 | notmuch_dir_sanitize | sed "s,\`,\',g") > OUTPUT +2>&1 | notmuch_dir_sanitize | sed -e "s,\`,\',g" -e "s,${NOTMUCH_DEFAULT_XAPIAN_BACKEND},backend,g") > OUTPUT cat < EXPECTED A Xapian exception occurred opening database: Couldn't stat 'CWD/fakedb/.notmuch/xapian' -caught No chert database found at path 'CWD/nonexistent' +caught No backend database found at path 'CWD/nonexistent' EOF test_expect_equal_file EXPECTED OUTPUT diff --git a/test/T560-lib-error.sh b/test/T560-lib-error.sh index 49d3674..087c6bd 100755 --- a/test/T560-lib-error.sh +++ b/test/T560-lib-error.sh @@ -189,7 +189,7 @@ Path already exists: MAIL_DIR EOF test_expect_equal_file EXPECTED OUTPUT -cat <<'EOF' > c_head +cat < c_head #include #include #include @@ -210,7 +210,7 @@ int main (int argc, char** argv) fprintf (stderr, "error opening database: %d %s\n", stat, msg ? msg : ""); exit (1); } - path = talloc_asprintf (db, "%s/.notmuch/xapian/postlist.DB", argv[1]); + path = talloc_asprintf (db, "%s/.notmuch/xapian/postlist.${db_ending}", argv[1]); fd = open(path,O_WRONLY|O_TRUNC); if (fd < 0) { fprintf (stderr, "error opening %s\n", argv[1]); -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
xapian 1.3.5 compatibility, v3
Tomi convinced me (with his review of my previous attempt) to go the route of testing the default xapian backend once in configure, and using that variable elsewhere. This obsoletes the series(s) id:1459855082-5715-2-git-send-email-da...@tethera.net id:1459938431-28670-2-git-send-email-da...@tethera.net ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v3 6/7] On deletion, replace with ghost when other active messages in thread
There is no need to add a ghost message upon deletion if there are no other active messages in the thread. Also, if the message being deleted was a ghost already, we can just go ahead and delete it. --- lib/message.cc | 58 ++ 1 file changed, 42 insertions(+), 16 deletions(-) diff --git a/lib/message.cc b/lib/message.cc index 2399ab3..1b423b0 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -1044,11 +1044,14 @@ _notmuch_message_delete (notmuch_message_t *message) { notmuch_status_t status; Xapian::WritableDatabase *db; -const char *mid, *tid; +const char *mid, *tid, *query_string; notmuch_message_t *ghost; notmuch_private_status_t private_status; notmuch_database_t *notmuch; - +notmuch_query_t *query; +unsigned int count = 0; +notmuch_bool_t is_ghost; + mid = notmuch_message_get_message_id (message); tid = notmuch_message_get_thread_id (message); notmuch = message->notmuch; @@ -1059,22 +1062,45 @@ _notmuch_message_delete (notmuch_message_t *message) db = static_cast (notmuch->xapian_db); db->delete_document (message->doc_id); - -/* and reintroduce a ghost in its place */ -ghost = _notmuch_message_create_for_message_id (notmuch, mid, &private_status); -if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { - private_status = _notmuch_message_initialize_ghost (ghost, tid); - if (! private_status) - _notmuch_message_sync (ghost); -} else if (private_status == NOTMUCH_PRIVATE_STATUS_SUCCESS) { - /* this is deeply weird, and we should not have gotten into - this state. is there a better error message to return - here? */ - return NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; + +/* if this was a ghost to begin with, we are done */ +private_status = _notmuch_message_has_term (message, "type", "ghost", &is_ghost); +if (private_status) + return COERCE_STATUS (private_status, + "Error trying to determine whether message was a ghost"); +if (is_ghost) + return NOTMUCH_STATUS_SUCCESS; + +query_string = talloc_asprintf (message, "thread:%s", tid); +query = notmuch_query_create (notmuch, query_string); +if (query == NULL) + return NOTMUCH_STATUS_OUT_OF_MEMORY; +status = notmuch_query_count_messages_st (query, &count); +if (status) { + notmuch_query_destroy (query); + return status; } -notmuch_message_destroy (ghost); -return COERCE_STATUS (private_status, "Error converting to ghost message"); +if (count > 0) { + /* reintroduce a ghost in its place because there are still +* other active messages in this thread: */ + ghost = _notmuch_message_create_for_message_id (notmuch, mid, &private_status); + if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { + private_status = _notmuch_message_initialize_ghost (ghost, tid); + if (! private_status) + _notmuch_message_sync (ghost); + } else if (private_status == NOTMUCH_PRIVATE_STATUS_SUCCESS) { + /* this is deeply weird, and we should not have gotten + into this state. is there a better error message to + return here? */ + status = NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; + } + + notmuch_message_destroy (ghost); + status = COERCE_STATUS (private_status, "Error converting to ghost message"); +} +notmuch_query_destroy (query); +return status; } /* Transform a blank message into a ghost message. The caller must -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v3 1/7] test: add test-binary to print the number of ghost messages
From: David Bremner This one-liner seems preferable to the complications of depending on delve, getting the binary name right and parsing the output. --- test/Makefile.local | 4 test/ghost-report.cc | 12 2 files changed, 16 insertions(+) create mode 100644 test/ghost-report.cc diff --git a/test/Makefile.local b/test/Makefile.local index 30d420e..022f2cf 100644 --- a/test/Makefile.local +++ b/test/Makefile.local @@ -38,6 +38,9 @@ $(dir)/parse-time: $(dir)/parse-time.o parse-time-string/parse-time-string.o $(dir)/make-db-version: $(dir)/make-db-version.o $(call quiet,CXX) $^ -o $@ $(LDFLAGS) $(XAPIAN_LDFLAGS) +$(dir)/ghost-report: $(dir)/ghost-report.o + $(call quiet,CXX) $^ -o $@ $(LDFLAGS) $(XAPIAN_LDFLAGS) + .PHONY: test check test_main_srcs=$(dir)/arg-test.c \ @@ -47,6 +50,7 @@ test_main_srcs=$(dir)/arg-test.c \ $(dir)/smtp-dummy.c \ $(dir)/symbol-test.cc \ $(dir)/make-db-version.cc \ + $(dir)/ghost-report.cc test_srcs=$(test_main_srcs) $(dir)/database-test.c diff --git a/test/ghost-report.cc b/test/ghost-report.cc new file mode 100644 index 000..1739be4 --- /dev/null +++ b/test/ghost-report.cc @@ -0,0 +1,12 @@ +#include +#include + +int main(int argc, char **argv) { + +if (argc < 2) { + std::cerr << "usage: ghost-report xapian-dir" << std::endl; +} + +Xapian::Database db(argv[1]); +std::cout << db.get_termfreq("Tghost") << std::endl; +} -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v3 3/7] fix thread breakage via ghost-on-removal
implement ghost-on-removal, the solution to T590-thread-breakage.sh that just adds a ghost message after removing each message. It leaks information about whether we've ever seen a given message id, but it's a fairly simple implementation. Note that _resolve_message_id_to_thread_id already introduces new message_ids to the database, so i think just searching for a given message ID may introduce the same metadata leakage. --- lib/message.cc | 30 +++--- test/T590-thread-breakage.sh | 25 - 2 files changed, 39 insertions(+), 16 deletions(-) diff --git a/lib/message.cc b/lib/message.cc index 8d72ea2..435b78a 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -1037,20 +1037,44 @@ _notmuch_message_sync (notmuch_message_t *message) message->modified = FALSE; } -/* Delete a message document from the database. */ +/* Delete a message document from the database, leaving a ghost + * message in its place */ notmuch_status_t _notmuch_message_delete (notmuch_message_t *message) { notmuch_status_t status; Xapian::WritableDatabase *db; +const char *mid, *tid; +notmuch_message_t *ghost; +notmuch_private_status_t private_status; +notmuch_database_t *notmuch; + +mid = notmuch_message_get_message_id (message); +tid = notmuch_message_get_thread_id (message); +notmuch = message->notmuch; status = _notmuch_database_ensure_writable (message->notmuch); if (status) return status; -db = static_cast (message->notmuch->xapian_db); +db = static_cast (notmuch->xapian_db); db->delete_document (message->doc_id); -return NOTMUCH_STATUS_SUCCESS; + +/* and reintroduce a ghost in its place */ +ghost = _notmuch_message_create_for_message_id (notmuch, mid, &private_status); +if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { + private_status = _notmuch_message_initialize_ghost (ghost, tid); + if (! private_status) + _notmuch_message_sync (ghost); +} else if (private_status == NOTMUCH_PRIVATE_STATUS_SUCCESS) { + /* this is deeply weird, and we should not have gotten into + this state. is there a better error message to return + here? */ + return NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; +} + +notmuch_message_destroy (ghost); +return COERCE_STATUS (private_status, "Error converting to ghost message"); } /* Transform a blank message into a ghost message. The caller must diff --git a/test/T590-thread-breakage.sh b/test/T590-thread-breakage.sh index 24ffb42..22e5cbc 100755 --- a/test/T590-thread-breakage.sh +++ b/test/T590-thread-breakage.sh @@ -96,20 +96,11 @@ notmuch new >/dev/null test_thread_count 1 'First message removed: still only one thread' test_subject_count apple 0 test_subject_count banana 1 -test_begin_subtest 'should be one ghost after first message removed' -test_subtest_known_broken -ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) -test_expect_equal "$ghosts" "1" +test_ghost_count 1 'should be one ghost after first message removed' message_a notmuch new >/dev/null -# this is known to fail (it shows 2 threads) because no "ghost -# message" was created for message A when it was removed from the -# index, despite message B still pointing to it. -test_begin_subtest 'First message reappears: should return to the same thread' -test_subtest_known_broken -count=$(notmuch count --output=threads) -test_expect_equal "$count" "1" +test_thread_count 1 'First message reappears: should return to the same thread' test_subject_count apple 1 test_subject_count banana 1 test_ghost_count 0 @@ -119,13 +110,21 @@ notmuch new >/dev/null test_thread_count 1 'Removing second message: still only one thread' test_subject_count apple 1 test_subject_count banana 0 -test_ghost_count 0 'No ghosts should remain after deletion of second message' +test_begin_subtest 'No ghosts should remain after deletion of second message' +# this is known to fail; we are leaking ghost messages deliberately +test_subtest_known_broken +ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) +test_expect_equal "$ghosts" "0" rm -f ${MAIL_DIR}/cur/a notmuch new >/dev/null test_thread_count 0 'All messages gone: no threads' test_subject_count apple 0 test_subject_count banana 0 -test_ghost_count 0 +test_begin_subtest 'No ghosts should remain after full thread deletion' +# this is known to fail; we are leaking ghost messages deliberately +test_subtest_known_broken +ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) +test_expect_equal "$ghosts" "0" test_done -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v3 7/7] complete ghost-on-removal-when-shared-thread-exists
To fully complete the ghost-on-removal-when-shared-thread-exists proposal, we need to clear all ghost messages when the last active message is removed from a thread. --- lib/message.cc | 20 test/T590-thread-breakage.sh | 6 +- 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/lib/message.cc b/lib/message.cc index 1b423b0..39dbe53 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -1098,6 +1098,26 @@ _notmuch_message_delete (notmuch_message_t *message) notmuch_message_destroy (ghost); status = COERCE_STATUS (private_status, "Error converting to ghost message"); +} else { + /* the thread is empty; drop all ghost messages from it */ + notmuch_messages_t *messages; + status = _notmuch_query_search_documents (query, + "ghost", + &messages); + if (status == NOTMUCH_STATUS_SUCCESS) { + notmuch_status_t last_error = NOTMUCH_STATUS_SUCCESS; + while (notmuch_messages_valid (messages)) { + message = notmuch_messages_get (messages); + status = _notmuch_message_delete (message); + if (status) /* we'll report the last failure we see; +* if there is more than one failure, we +* forget about previous ones */ + last_error = status; + notmuch_message_destroy (message); + notmuch_messages_move_to_next (messages); + } + status = last_error; + } } notmuch_query_destroy (query); return status; diff --git a/test/T590-thread-breakage.sh b/test/T590-thread-breakage.sh index 22e5cbc..969243e 100755 --- a/test/T590-thread-breakage.sh +++ b/test/T590-thread-breakage.sh @@ -121,10 +121,6 @@ notmuch new >/dev/null test_thread_count 0 'All messages gone: no threads' test_subject_count apple 0 test_subject_count banana 0 -test_begin_subtest 'No ghosts should remain after full thread deletion' -# this is known to fail; we are leaking ghost messages deliberately -test_subtest_known_broken -ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) -test_expect_equal "$ghosts" "0" +test_ghost_count 0 'No ghosts should remain after full thread deletion' test_done -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v3 4/7] Add internal functions to search for alternate doc types
Publicly we are only exposing the non-ghost documents (of "type" "mail"). But internally we might want to inspect the ghost messages as well. This changeset adds two new private interfaces to queries to recover information about alternate document types. --- lib/notmuch-private.h | 11 +++ lib/query.cc | 18 -- 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h index 5dd4770..d95bf31 100644 --- a/lib/notmuch-private.h +++ b/lib/notmuch-private.h @@ -477,6 +477,17 @@ void _notmuch_doc_id_set_remove (notmuch_doc_id_set_t *doc_ids, unsigned int doc_id); +/* querying xapian documents by type (e.g. "mail" or "ghost"): */ +notmuch_status_t +_notmuch_query_search_documents (notmuch_query_t *query, +const char *type, +notmuch_messages_t **out); + +notmuch_status_t +_notmuch_query_count_documents (notmuch_query_t *query, + const char *type, + unsigned *count_out); + /* message.cc */ void diff --git a/lib/query.cc b/lib/query.cc index e627bfc..77a7926 100644 --- a/lib/query.cc +++ b/lib/query.cc @@ -187,6 +187,14 @@ notmuch_status_t notmuch_query_search_messages_st (notmuch_query_t *query, notmuch_messages_t **out) { +return _notmuch_query_search_documents (query, "mail", out); +} + +notmuch_status_t +_notmuch_query_search_documents (notmuch_query_t *query, +const char *type, +notmuch_messages_t **out) +{ notmuch_database_t *notmuch = query->notmuch; const char *query_string = query->query_string; notmuch_mset_messages_t *messages; @@ -208,7 +216,7 @@ notmuch_query_search_messages_st (notmuch_query_t *query, Xapian::Enquire enquire (*notmuch->xapian_db); Xapian::Query mail_query (talloc_asprintf (query, "%s%s", _find_prefix ("type"), - "mail")); + type)); Xapian::Query string_query, final_query, exclude_query; Xapian::MSet mset; Xapian::MSetIterator iterator; @@ -554,6 +562,12 @@ notmuch_query_count_messages (notmuch_query_t *query) notmuch_status_t notmuch_query_count_messages_st (notmuch_query_t *query, unsigned *count_out) { +return _notmuch_query_count_documents (query, "mail", count_out); +} + +notmuch_status_t +_notmuch_query_count_documents (notmuch_query_t *query, const char *type, unsigned *count_out) +{ notmuch_database_t *notmuch = query->notmuch; const char *query_string = query->query_string; Xapian::doccount count = 0; @@ -562,7 +576,7 @@ notmuch_query_count_messages_st (notmuch_query_t *query, unsigned *count_out) Xapian::Enquire enquire (*notmuch->xapian_db); Xapian::Query mail_query (talloc_asprintf (query, "%s%s", _find_prefix ("type"), - "mail")); + type)); Xapian::Query string_query, final_query, exclude_query; Xapian::MSet mset; unsigned int flags = (Xapian::QueryParser::FLAG_BOOLEAN | -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v3 5/7] Introduce _notmuch_message_has_term()
It can be useful to easily tell if a given message has a given term associated with it. --- lib/message.cc| 35 +++ lib/notmuch-private.h | 6 ++ 2 files changed, 41 insertions(+) diff --git a/lib/message.cc b/lib/message.cc index 435b78a..2399ab3 100644 --- a/lib/message.cc +++ b/lib/message.cc @@ -1217,6 +1217,41 @@ _notmuch_message_remove_term (notmuch_message_t *message, return NOTMUCH_PRIVATE_STATUS_SUCCESS; } +notmuch_private_status_t +_notmuch_message_has_term (notmuch_message_t *message, + const char *prefix_name, + const char *value, + notmuch_bool_t *result) +{ +char *term; +notmuch_bool_t out = FALSE; +notmuch_private_status_t status = NOTMUCH_PRIVATE_STATUS_SUCCESS; + +if (value == NULL) + return NOTMUCH_PRIVATE_STATUS_NULL_POINTER; + +term = talloc_asprintf (message, "%s%s", + _find_prefix (prefix_name), value); + +if (strlen (term) > NOTMUCH_TERM_MAX) + return NOTMUCH_PRIVATE_STATUS_TERM_TOO_LONG; + +try { + /* Look for the exact term */ + Xapian::TermIterator i = message->doc.termlist_begin (); + i.skip_to (term); + if (i != message->doc.termlist_end () && + !strcmp ((*i).c_str (), term)) + out = TRUE; +} catch (Xapian::Error &error) { + status = NOTMUCH_PRIVATE_STATUS_XAPIAN_EXCEPTION; +} +talloc_free (term); + +*result = out; +return status; +} + notmuch_status_t notmuch_message_add_tag (notmuch_message_t *message, const char *tag) { diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h index d95bf31..9280797 100644 --- a/lib/notmuch-private.h +++ b/lib/notmuch-private.h @@ -280,6 +280,12 @@ _notmuch_message_remove_term (notmuch_message_t *message, const char *value); notmuch_private_status_t +_notmuch_message_has_term (notmuch_message_t *message, + const char *prefix_name, + const char *value, + notmuch_bool_t *result); + +notmuch_private_status_t _notmuch_message_gen_terms (notmuch_message_t *message, const char *prefix_name, const char *text); -- 2.8.0.rc3 ___ notmuch mailing list notmuch@notmuchmail.org https://notmuchmail.org/mailman/listinfo/notmuch
[PATCH v3 2/7] test thread breakage when messages are removed and re-added
This test (T590-thread-breakage.sh) has known-broken subtests. If you have a two-message thread where message "B" is in-reply-to "A", notmuch rightly sees this as a single thread. But if you: * remove "A" from the message store * run "notmuch new" * add "A" back into the message store * re-run "notmuch new" Then notmuch sees the messages as distinct threads. This happens because if you insert "B" initially (before anything is known about "A"), then a "ghost message" gets added to the database in reference to "A" that is in the same thread, which "A" takes over when it appears. But if "A" is subsequently removed, no ghost message is retained, so when "A" appears, it is treated as a new thread. I see a few options to fix this: ghost-on-removal We could unilaterally add a ghost upon message removal. This has a few disadvantages: the message index would leak information about what messages the user has ever been exposed to, and we also create a perpetually-growing dataset -- the ghosts can never be removed. ghost-on-removal-when-shared-thread-exists -- We could add a ghost upon message removal iff there are other non-ghost messages with the same thread ID. We'd also need to remove all ghost messages that share a thread when the last non-ghost message in that thread is removed. This still has a bit of information leakage, though: the message index would reveal that i've seen a newer message in a thread, even if i had deleted it from my message store track-dependencies -- rather than a simple "ghost-message" we could store all the (A,B) message-reference pairs internally, showing which messages A reference which other messages B. Then removal of message X would require deleting all message-reference pairs (X,B), and only deleting a ghost message if no (A,X) reference pair exists. This requires modifying the database by adding a new and fairly weird table that would need to be indexed by both columns. I don't know whether xapian has nice ways to do that. scan-dependencies - Without modifying the database, we could do something less efficient. Upon removal of message X, we could scan the headers of all non-ghost messages that share a thread with X. If any of those messages refers to X, we would add a ghost message. If none of them do, then we would just drop X entirely from the table. - One risk of attempted fixes to this problem is that we could fail to remove the search term indexes entirely. This test contains additional subtests to guard against that. This test also ensures that the right number of ghost messages exist in each situation; this will help us ensure we don't accumulate ghosts indefinitely or leak too much information about what messages we've seen or not seen, while still making it easy to reassemble threads when messages come in out-of-order. --- test/T590-thread-breakage.sh | 131 +++ 1 file changed, 131 insertions(+) create mode 100755 test/T590-thread-breakage.sh diff --git a/test/T590-thread-breakage.sh b/test/T590-thread-breakage.sh new file mode 100755 index 000..24ffb42 --- /dev/null +++ b/test/T590-thread-breakage.sh @@ -0,0 +1,131 @@ +#!/usr/bin/env bash +# +# Copyright (c) 2016 Daniel Kahn Gillmor +# + +test_description='thread breakage during reindexing + +notmuch uses ghost documents to track messages we have seen references +to but have never seen. Regardless of the order of delivery, message +deletion, and reindexing, the list of ghost messages for a given +stored corpus should not vary, so that threads can be reassmebled +cleanly. + +In practice, we accept a small amount of variation (and therefore +traffic pattern metadata leakage to be stored in the index) for the +sake of efficiency. + +This test also embeds some subtests to ensure that indexing actually +works properly and attempted fixes to threading issues do not break +the expected contents of the index.' + +. ./test-lib.sh || exit 1 + +message_a() { +mkdir -p ${MAIL_DIR}/cur +cat > ${MAIL_DIR}/cur/a < +From: Alice +To: Bob +Date: Thu, 31 Mar 2016 20:10:00 -0400 + +This is the first message in the thread. +Apple +EOF +} + +message_b() { +mkdir -p ${MAIL_DIR}/cur +cat > ${MAIL_DIR}/cur/b < +In-Reply-To: +References: +From: Bob +To: Alice +Date: Thu, 31 Mar 2016 20:15:00 -0400 + +This is the second message in the thread. +Banana +EOF +} + + +test_subject_count() { +test_begin_subtest "${3:-looking for $2 instance of '$1'}" +count=$(notmuch count --output=threads "$1") +test_expect_equal "$count" "$2" +} + +test_thread_count() { +test_begin_subtest "${2:-Expecting $1 thread(s)}" +count=$(notmuch count --output=threads) +test_expect_equal "$count" "$1" +} + +test_ghost_count() { +test_begin_subtest "${2:-Expecting $1 ghosts(s)}" +ghosts=$(../ghost-report ${MAIL_DIR}/.notmuch/xapian) +te