[PATCH 07/11] lib: Implement ghost-based thread linking

2014-10-03 Thread Austin Clements
From: Austin Clements 

This updates the thread linking code to use ghost messages instead of
user metadata to link messages into threads.

In contrast with the old approach, this is actually correct.
Previously, thread merging updated only the thread IDs of message
documents, not thread IDs stored in user metadata.  As originally
diagnosed by Mark Walters [1] and as demonstrated by the broken
T260-thread-order test, this can cause notmuch to fail to link
messages even though they're in the same thread.  In principle the old
approach could have been fixed by updating the user metadata thread
IDs as well, but these are not indexed and hence this would have
required a full scan of all stored thread IDs.  Ghost messages solve
this problem naturally by reusing the exact same thread ID and message
ID representation and indexing as regular messages.

Furthermore, thanks to this greater symmetry, ghost messages are also
algorithmically simpler.  We continue to support the old user metadata
format, so this patch can't delete any code, but when we do remove
support for the old format, several functions can simply be deleted.

[1] id:8738h7kv2q.fsf at qmul.ac.uk
---
 lib/database.cc | 86 +
 1 file changed, 75 insertions(+), 11 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index c641bcd..fdcc526 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1752,6 +1752,12 @@ _get_metadata_thread_id_key (void *ctx, const char 
*message_id)
message_id);
 }

+static notmuch_status_t
+_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
+ void *ctx,
+ const char *message_id,
+ const char **thread_id_ret);
+
 /* Find the thread ID to which the message with 'message_id' belongs.
  *
  * Note: 'thread_id_ret' must not be NULL!
@@ -1760,9 +1766,9 @@ _get_metadata_thread_id_key (void *ctx, const char 
*message_id)
  *
  * Note: If there is no message in the database with the given
  * 'message_id' then a new thread_id will be allocated for this
- * message and stored in the database metadata, (where this same
+ * message ID and stored in the database metadata so that the
  * thread ID can be looked up if the message is added to the database
- * later).
+ * later.
  */
 static notmuch_status_t
 _resolve_message_id_to_thread_id (notmuch_database_t *notmuch,
@@ -1770,6 +1776,49 @@ _resolve_message_id_to_thread_id (notmuch_database_t 
*notmuch,
  const char *message_id,
  const char **thread_id_ret)
 {
+notmuch_private_status_t status;
+notmuch_message_t *message;
+
+if (! (notmuch->features & NOTMUCH_FEATURE_GHOSTS))
+   return _resolve_message_id_to_thread_id_old (notmuch, ctx, message_id,
+thread_id_ret);
+
+/* Look for this message (regular or ghost) */
+message = _notmuch_message_create_for_message_id (
+   notmuch, message_id, );
+if (status == NOTMUCH_PRIVATE_STATUS_SUCCESS) {
+   /* Message exists */
+   *thread_id_ret = talloc_steal (
+   ctx, notmuch_message_get_thread_id (message));
+} else if (status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {
+   /* Message did not exist.  Give it a fresh thread ID and
+* populate this message as a ghost message. */
+   *thread_id_ret = talloc_strdup (
+   ctx, _notmuch_database_generate_thread_id (notmuch));
+   if (! *thread_id_ret) {
+   status = NOTMUCH_PRIVATE_STATUS_OUT_OF_MEMORY;
+   } else {
+   status = _notmuch_message_initialize_ghost (message, 
*thread_id_ret);
+   if (status == 0)
+   /* Commit the new ghost message */
+   _notmuch_message_sync (message);
+   }
+} else {
+   /* Create failed. Fall through. */
+}
+
+notmuch_message_destroy (message);
+
+return COERCE_STATUS (status, "Error creating ghost message");
+}
+
+/* Pre-ghost messages _resolve_message_id_to_thread_id */
+static notmuch_status_t
+_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
+ void *ctx,
+ const char *message_id,
+ const char **thread_id_ret)
+{
 notmuch_status_t status;
 notmuch_message_t *message;
 string thread_id_string;
@@ -2007,7 +2056,7 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
 }
 }

-/* Given a (mostly empty) 'message' and its corresponding
+/* Given a blank or ghost 'message' and its corresponding
  * 'message_file' link it to existing threads in the database.
  *
  * The first check is in the metadata of the database to see if we
@@ -2035,16 +2084,22 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t 

[PATCH 07/11] lib: Implement ghost-based thread linking

2014-10-03 Thread Austin Clements
From: Austin Clements amdra...@mit.edu

This updates the thread linking code to use ghost messages instead of
user metadata to link messages into threads.

In contrast with the old approach, this is actually correct.
Previously, thread merging updated only the thread IDs of message
documents, not thread IDs stored in user metadata.  As originally
diagnosed by Mark Walters [1] and as demonstrated by the broken
T260-thread-order test, this can cause notmuch to fail to link
messages even though they're in the same thread.  In principle the old
approach could have been fixed by updating the user metadata thread
IDs as well, but these are not indexed and hence this would have
required a full scan of all stored thread IDs.  Ghost messages solve
this problem naturally by reusing the exact same thread ID and message
ID representation and indexing as regular messages.

Furthermore, thanks to this greater symmetry, ghost messages are also
algorithmically simpler.  We continue to support the old user metadata
format, so this patch can't delete any code, but when we do remove
support for the old format, several functions can simply be deleted.

[1] id:8738h7kv2q@qmul.ac.uk
---
 lib/database.cc | 86 +
 1 file changed, 75 insertions(+), 11 deletions(-)

diff --git a/lib/database.cc b/lib/database.cc
index c641bcd..fdcc526 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1752,6 +1752,12 @@ _get_metadata_thread_id_key (void *ctx, const char 
*message_id)
message_id);
 }
 
+static notmuch_status_t
+_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
+ void *ctx,
+ const char *message_id,
+ const char **thread_id_ret);
+
 /* Find the thread ID to which the message with 'message_id' belongs.
  *
  * Note: 'thread_id_ret' must not be NULL!
@@ -1760,9 +1766,9 @@ _get_metadata_thread_id_key (void *ctx, const char 
*message_id)
  *
  * Note: If there is no message in the database with the given
  * 'message_id' then a new thread_id will be allocated for this
- * message and stored in the database metadata, (where this same
+ * message ID and stored in the database metadata so that the
  * thread ID can be looked up if the message is added to the database
- * later).
+ * later.
  */
 static notmuch_status_t
 _resolve_message_id_to_thread_id (notmuch_database_t *notmuch,
@@ -1770,6 +1776,49 @@ _resolve_message_id_to_thread_id (notmuch_database_t 
*notmuch,
  const char *message_id,
  const char **thread_id_ret)
 {
+notmuch_private_status_t status;
+notmuch_message_t *message;
+
+if (! (notmuch-features  NOTMUCH_FEATURE_GHOSTS))
+   return _resolve_message_id_to_thread_id_old (notmuch, ctx, message_id,
+thread_id_ret);
+
+/* Look for this message (regular or ghost) */
+message = _notmuch_message_create_for_message_id (
+   notmuch, message_id, status);
+if (status == NOTMUCH_PRIVATE_STATUS_SUCCESS) {
+   /* Message exists */
+   *thread_id_ret = talloc_steal (
+   ctx, notmuch_message_get_thread_id (message));
+} else if (status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {
+   /* Message did not exist.  Give it a fresh thread ID and
+* populate this message as a ghost message. */
+   *thread_id_ret = talloc_strdup (
+   ctx, _notmuch_database_generate_thread_id (notmuch));
+   if (! *thread_id_ret) {
+   status = NOTMUCH_PRIVATE_STATUS_OUT_OF_MEMORY;
+   } else {
+   status = _notmuch_message_initialize_ghost (message, 
*thread_id_ret);
+   if (status == 0)
+   /* Commit the new ghost message */
+   _notmuch_message_sync (message);
+   }
+} else {
+   /* Create failed. Fall through. */
+}
+
+notmuch_message_destroy (message);
+
+return COERCE_STATUS (status, Error creating ghost message);
+}
+
+/* Pre-ghost messages _resolve_message_id_to_thread_id */
+static notmuch_status_t
+_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,
+ void *ctx,
+ const char *message_id,
+ const char **thread_id_ret)
+{
 notmuch_status_t status;
 notmuch_message_t *message;
 string thread_id_string;
@@ -2007,7 +2056,7 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t *notmuch,
 }
 }
 
-/* Given a (mostly empty) 'message' and its corresponding
+/* Given a blank or ghost 'message' and its corresponding
  * 'message_file' link it to existing threads in the database.
  *
  * The first check is in the metadata of the database to see if we
@@ -2035,16 +2084,22 @@ _consume_metadata_thread_id (void *ctx, 
notmuch_database_t 

Re: [PATCH 07/11] lib: Implement ghost-based thread linking

2014-10-03 Thread Carl Worth
On Fri, Oct 03 2014, Austin Clements wrote:
 This updates the thread linking code to use ghost messages instead of
 user metadata to link messages into threads.

 In contrast with the old approach, this is actually correct.

Hi Austin,

I've read through your commit messages, (not the code), and I just
wanted to say a big thank you to you!

I'm really looking forward to the improved robustness in notmuch with
this support.

I really appreciate your diligence in working through a fairly large,
low-level change here.

-Carl


pgpavnlvD49KS.pgp
Description: PGP signature
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch