[PATCH v3 00/13] Implement and use database "features"

2014-08-08 Thread Austin Clements
Quoth Tomi Ollila on Aug 08 at 12:55 am:
> On Fri, Aug 01 2014, Austin Clements  wrote:
> 
> > This is v3 of id:1406652492-27803-1-git-send-email-amdragon at mit.edu.
> > This fixes one issue and tidies up another thing in
> > notmuch_database_upgrade I found while working on change tracking
> > support.  Most of the patches are logically identical to v2, but the
> > changes ripple through the patch context, so I'm sending a new series.
> >
> > First, this updates notmuch->features before starting the upgrade,
> > rather than after, so that functions called by upgrade will use the
> > new database features instead of the old (this didn't matter in this
> > series because nothing modified the database differently depending on
> > features).  Second, this combines multiple _notmuch_message_sync calls
> > into one, which cleans up the code, should further improve upgrade
> > performance, and makes way for additional per-message upgrades.
> 
> This series looks good to me. I looked through the diffs a few times and
> read notmuch_database_upgrade() in lib/database.cc in full.
> Tests pass (including T530-upgrade, now that I have downloaded its test database).
> 
> I googled answers to a few questions during the review; one thing still
> interests me -- is there potential for speed/memory problems
> when doing the upgrade transaction in large/very large databases? And
> how long will the (final) commit_transaction() take (i.e. how
> many times is handle_sigalrm() called while that is in progress)?
> 
> (my guess is that this transaction just builds a new revision and
> if it is never committed the revision is never used -- and data is
> written there in batches of suitable size -- so memory usage
> and transaction commit time are O(1))

Your guess is basically right.  Xapian periodically flushes data to
disk during a transaction, just as it does outside one; AFAIK the
only difference is when it flushes out the new revision number and
the updated free block metadata.  Hence, speed and memory use aren't
affected by the use of a transaction, and commit_transaction isn't
particularly sensitive to how big the transaction is.

> Tomi
> 
> >
> > The diff from v2 is below (excluding patch 2, which David pushed).
> >
> > diff --git a/lib/database.cc b/lib/database.cc
> > index b323691..d90a924 100644
> > --- a/lib/database.cc
> > +++ b/lib/database.cc
> > @@ -1252,6 +1252,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >  /* Perform the upgrade in a transaction. */
> >  db->begin_transaction (true);
> >  
> > +/* Set the target features so we write out changes in the desired
> > + * format. */
> > +notmuch->features = target_features;
> > +
> >  /* Perform per-message upgrades. */
> >  if (new_features &
> > (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER)) {
> > @@ -1280,7 +1284,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> > if (filename && *filename != '\0') {
> > _notmuch_message_add_filename (message, filename);
> > _notmuch_message_clear_data (message);
> > -   _notmuch_message_sync (message);
> > }
> > talloc_free (filename);
> > }
> > @@ -1289,10 +1292,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >  * probabilistic and stemmed. Change it to the current
> >  * boolean prefix. Add "path:" prefixes while at it.
> >  */
> > -   if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER) {
> > +   if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER)
> > _notmuch_message_upgrade_folder (message);
> > -   _notmuch_message_sync (message);
> > -   }
> > +
> > +   _notmuch_message_sync (message);
> >  
> > notmuch_message_destroy (message);
> >  
> > @@ -1348,7 +1351,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> > }
> >  }
> >  
> > -notmuch->features = target_features;
> >  db->set_metadata ("features", _print_features (local, notmuch->features));
> >  db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
> >


[PATCH v3 00/13] Implement and use database "features"

2014-08-08 Thread Tomi Ollila
On Fri, Aug 01 2014, Austin Clements  wrote:

> This is v3 of id:1406652492-27803-1-git-send-email-amdragon at mit.edu.
> This fixes one issue and tidies up another thing in
> notmuch_database_upgrade I found while working on change tracking
> support.  Most of the patches are logically identical to v2, but the
> changes ripple through the patch context, so I'm sending a new series.
>
> First, this updates notmuch->features before starting the upgrade,
> rather than after, so that functions called by upgrade will use the
> new database features instead of the old (this didn't matter in this
> series because nothing modified the database differently depending on
> features).  Second, this combines multiple _notmuch_message_sync calls
> into one, which cleans up the code, should further improve upgrade
> performance, and makes way for additional per-message upgrades.

This series looks good to me. I looked through the diffs a few times and
read notmuch_database_upgrade() in lib/database.cc in full.
Tests pass (including T530-upgrade, now that I have downloaded its test database).

I googled answers to a few questions during the review; one thing still
interests me -- is there potential for speed/memory problems
when doing the upgrade transaction in large/very large databases? And
how long will the (final) commit_transaction() take (i.e. how
many times is handle_sigalrm() called while that is in progress)?

(my guess is that this transaction just builds a new revision and
if it is never committed the revision is never used -- and data is
written there in batches of suitable size -- so memory usage
and transaction commit time are O(1))
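
For readers unfamiliar with the handle_sigalrm() question, the following is a
minimal sketch of the usual flag-setting pattern for timer-driven progress
reporting. It assumes the handler does nothing but set a flag that the work
loop polls; run_steps, its step callback, and the report counter are
hypothetical names for illustration and are not taken from notmuch.

```cpp
#include <cassert>
#include <signal.h>

// Flag set from the signal handler and polled by the work loop.
static volatile sig_atomic_t do_progress_notify = 0;

// Async-signal-safe handler: it only records that the alarm fired.
extern "C" void handle_sigalrm(int)
{
    do_progress_notify = 1;
}

// Runs `total` steps. After each step, makes one progress report if the
// alarm fired at least once during that step. Returns the report count.
int run_steps(int total, void (*step)(int))
{
    struct sigaction action = {};
    struct sigaction old_action;
    action.sa_handler = handle_sigalrm;
    sigemptyset(&action.sa_mask);
    int rc = sigaction(SIGALRM, &action, &old_action);
    assert(rc == 0);
    (void) rc;

    do_progress_notify = 0;
    int reports = 0;
    for (int i = 0; i < total; ++i) {
        step(i);
        if (do_progress_notify) {
            do_progress_notify = 0;
            ++reports;  // a real caller would invoke its progress callback here
        }
    }

    // Restore whatever handler was installed before.
    sigaction(SIGALRM, &old_action, nullptr);
    return reports;
}
```

Under this assumption, an alarm that fires several times during one long step
(such as a slow commit) collapses into a single pending report, so the number
of handler invocations during the commit does not matter by itself.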

Tomi

>
> The diff from v2 is below (excluding patch 2, which David pushed).
>
> diff --git a/lib/database.cc b/lib/database.cc
> index b323691..d90a924 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -1252,6 +1252,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
>  /* Perform the upgrade in a transaction. */
>  db->begin_transaction (true);
>  
> +/* Set the target features so we write out changes in the desired
> + * format. */
> +notmuch->features = target_features;
> +
>  /* Perform per-message upgrades. */
>  if (new_features &
>   (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER)) {
> @@ -1280,7 +1284,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
>   if (filename && *filename != '\0') {
>   _notmuch_message_add_filename (message, filename);
>   _notmuch_message_clear_data (message);
> - _notmuch_message_sync (message);
>   }
>   talloc_free (filename);
>   }
> @@ -1289,10 +1292,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
>* probabilistic and stemmed. Change it to the current
>* boolean prefix. Add "path:" prefixes while at it.
>*/
> - if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER) {
> + if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER)
>   _notmuch_message_upgrade_folder (message);
> - _notmuch_message_sync (message);
> - }
> +
> + _notmuch_message_sync (message);
>  
>   notmuch_message_destroy (message);
>  
> @@ -1348,7 +1351,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
>   }
>  }
>  
> -notmuch->features = target_features;
>  db->set_metadata ("features", _print_features (local, notmuch->features));
>  db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
>
> ___
> notmuch mailing list
> notmuch at notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch


[PATCH v3 00/13] Implement and use database "features"

2014-07-31 Thread Austin Clements
This is v3 of id:1406652492-27803-1-git-send-email-amdragon at mit.edu.
This fixes one issue and tidies up another thing in
notmuch_database_upgrade I found while working on change tracking
support.  Most of the patches are logically identical to v2, but the
changes ripple through the patch context, so I'm sending a new series.

First, this updates notmuch->features before starting the upgrade,
rather than after, so that functions called by upgrade will use the
new database features instead of the old (this didn't matter in this
series because nothing modified the database differently depending on
features).  Second, this combines multiple _notmuch_message_sync calls
into one, which cleans up the code, should further improve upgrade
performance, and makes way for additional per-message upgrades.
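
The two changes described above can be sketched in isolation as follows. This
is a simplified model rather than the notmuch code: the Message and Database
structs, the feature bit values, and the way new_features is computed (target
bits that the database does not yet have) are assumptions made for
illustration; only the names new_features and target_features come from the
diff.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for notmuch's feature bits.
enum : uint32_t {
    FEATURE_FILE_TERMS  = 1u << 0,
    FEATURE_BOOL_FOLDER = 1u << 1,
};

struct Message {
    bool has_file_terms = false;
    bool has_bool_folder = false;
    int sync_count = 0;      // how many times the message was written back
};

struct Database {
    uint32_t features = 0;   // format the library currently writes
    std::vector<Message> messages;
};

// Upgrade every message to target_features and return the total number
// of sync (write-back) operations performed.
int upgrade(Database &db, uint32_t target_features)
{
    // Assumed definition: features the target has and the database lacks.
    const uint32_t new_features = target_features & ~db.features;

    // Set the target first, so that any helper consulting db.features
    // during the upgrade already sees the new format.
    db.features = target_features;

    int syncs = 0;
    if (new_features & (FEATURE_FILE_TERMS | FEATURE_BOOL_FOLDER)) {
        for (Message &m : db.messages) {
            if (new_features & FEATURE_FILE_TERMS)
                m.has_file_terms = true;
            if (new_features & FEATURE_BOOL_FOLDER)
                m.has_bool_folder = true;

            // One sync per message, however many upgrades were applied.
            ++m.sync_count;
            ++syncs;
        }
    }

    assert(db.features == target_features);
    return syncs;
}
```

With both feature bits missing, each message in this model is synced once
rather than twice, and a second call with the same target does no per-message
work at all.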

The diff from v2 is below (excluding patch 2, which David pushed).

diff --git a/lib/database.cc b/lib/database.cc
index b323691..d90a924 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -1252,6 +1252,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
 /* Perform the upgrade in a transaction. */
 db->begin_transaction (true);

+/* Set the target features so we write out changes in the desired
+ * format. */
+notmuch->features = target_features;
+
 /* Perform per-message upgrades. */
 if (new_features &
(NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER)) {
@@ -1280,7 +1284,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
if (filename && *filename != '\0') {
_notmuch_message_add_filename (message, filename);
_notmuch_message_clear_data (message);
-   _notmuch_message_sync (message);
}
talloc_free (filename);
}
@@ -1289,10 +1292,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
 * probabilistic and stemmed. Change it to the current
 * boolean prefix. Add "path:" prefixes while at it.
 */
-   if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER) {
+   if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER)
_notmuch_message_upgrade_folder (message);
-   _notmuch_message_sync (message);
-   }
+
+   _notmuch_message_sync (message);

notmuch_message_destroy (message);

@@ -1348,7 +1351,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
}
 }

-notmuch->features = target_features;
 db->set_metadata ("features", _print_features (local, notmuch->features));
 db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));


