Re: [PATCH v3 00/13] Implement and use database "features"

2014-08-08 Thread Austin Clements
Quoth Tomi Ollila on Aug 08 at 12:55 am:
> On Fri, Aug 01 2014, Austin Clements  wrote:
> 
> > This is v3 of id:1406652492-27803-1-git-send-email-amdra...@mit.edu.
> > This fixes one issue and tidies up another thing in
> > notmuch_database_upgrade I found while working on change tracking
> > support.  Most of the patches are logically identical to v2, but the
> > changes ripple through the patch context, so I'm sending a new series.
> >
> > First, this updates notmuch->features before starting the upgrade,
> > rather than after, so that functions called by upgrade will use the
> > new database features instead of the old (this didn't matter in this
> > series because nothing modified the database differently depending on
> > features).  Second, this combines multiple _notmuch_message_sync calls
> > into one, which cleans up the code, should further improve upgrade
> > performance, and makes way for additional per-message upgrades.
> 
> This series looks good to me. I looked through the diffs a few times and
> notmuch_database_upgrade() in lib/database.cc to see that in full.
> Tests pass (also T530-upgrade now that I downloaded that one test database.)
> 
> I googled answers to few questions along the review; one thing still
> interests me -- is there potential to have speed/memory problems
> when doing upgrade transaction in large/very large databases. And
> how long will the (final) commit_transaction() take (i.e how
> many times handle_sigalrm() is called while that is in progress...)
> 
> (my guess is that this transaction just builds a new revision and
> if it is never committed the revision is never used -- and data is
> written there in some batches of suitable size -- so memory usage
> and transaction commit time is O(1))

Your guess is basically right.  Xapian periodically flushes stuff to
disk during a transaction basically just like it does when not in a
transaction; AFAIK the only difference is when it flushes out the new
revision number and updated free block metadata.  Hence, speed and
memory use aren't affected by the use of a transaction, and
commit_transaction isn't particularly sensitive to how big the
transaction is.

> Tomi
> 
> >
> > The diff from v2 is below (excluding patch 2, which David pushed).
> >
> > diff --git a/lib/database.cc b/lib/database.cc
> > index b323691..d90a924 100644
> > --- a/lib/database.cc
> > +++ b/lib/database.cc
> > @@ -1252,6 +1252,10 @@ notmuch_database_upgrade (notmuch_database_t 
> > *notmuch,
> >  /* Perform the upgrade in a transaction. */
> >  db->begin_transaction (true);
> >  
> > +/* Set the target features so we write out changes in the desired
> > + * format. */
> > +notmuch->features = target_features;
> > +
> >  /* Perform per-message upgrades. */
> >  if (new_features &
> > (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER)) {
> > @@ -1280,7 +1284,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> > if (filename && *filename != '\0') {
> > _notmuch_message_add_filename (message, filename);
> > _notmuch_message_clear_data (message);
> > -   _notmuch_message_sync (message);
> > }
> > talloc_free (filename);
> > }
> > @@ -1289,10 +1292,10 @@ notmuch_database_upgrade (notmuch_database_t 
> > *notmuch,
> >  * probabilistic and stemmed. Change it to the current
> >  * boolean prefix. Add "path:" prefixes while at it.
> >  */
> > -   if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER) {
> > +   if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER)
> > _notmuch_message_upgrade_folder (message);
> > -   _notmuch_message_sync (message);
> > -   }
> > +
> > +   _notmuch_message_sync (message);
> >  
> > notmuch_message_destroy (message);
> >  
> > @@ -1348,7 +1351,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> > }
> >  }
> >  
> > -notmuch->features = target_features;
> >  db->set_metadata ("features", _print_features (local, 
> > notmuch->features));
> >  db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
> >
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH v3 00/13] Implement and use database "features"

2014-08-07 Thread Tomi Ollila
On Fri, Aug 01 2014, Austin Clements  wrote:

> This is v3 of id:1406652492-27803-1-git-send-email-amdra...@mit.edu.
> This fixes one issue and tidies up another thing in
> notmuch_database_upgrade I found while working on change tracking
> support.  Most of the patches are logically identical to v2, but the
> changes ripple through the patch context, so I'm sending a new series.
>
> First, this updates notmuch->features before starting the upgrade,
> rather than after, so that functions called by upgrade will use the
> new database features instead of the old (this didn't matter in this
> series because nothing modified the database differently depending on
> features).  Second, this combines multiple _notmuch_message_sync calls
> into one, which cleans up the code, should further improve upgrade
> performance, and makes way for additional per-message upgrades.

This series looks good to me. I looked through the diffs a few times and
notmuch_database_upgrade() in lib/database.cc to see that in full.
Tests pass (also T530-upgrade now that I downloaded that one test database.)

I googled answers to few questions along the review; one thing still
interests me -- is there potential to have speed/memory problems
when doing upgrade transaction in large/very large databases. And
how long will the (final) commit_transaction() take (i.e how
many times handle_sigalrm() is called while that is in progress...)

(my guess is that this transaction just builds a new revision and
if it is never committed the revision is never used -- and data is
written there in some batches of suitable size -- so memory usage
and transaction commit time is O(1))

Tomi

>
> The diff from v2 is below (excluding patch 2, which David pushed).
>
> diff --git a/lib/database.cc b/lib/database.cc
> index b323691..d90a924 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -1252,6 +1252,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
>  /* Perform the upgrade in a transaction. */
>  db->begin_transaction (true);
>  
> +/* Set the target features so we write out changes in the desired
> + * format. */
> +notmuch->features = target_features;
> +
>  /* Perform per-message upgrades. */
>  if (new_features &
>   (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER)) {
> @@ -1280,7 +1284,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
>   if (filename && *filename != '\0') {
>   _notmuch_message_add_filename (message, filename);
>   _notmuch_message_clear_data (message);
> - _notmuch_message_sync (message);
>   }
>   talloc_free (filename);
>   }
> @@ -1289,10 +1292,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
>* probabilistic and stemmed. Change it to the current
>* boolean prefix. Add "path:" prefixes while at it.
>*/
> - if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER) {
> + if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER)
>   _notmuch_message_upgrade_folder (message);
> - _notmuch_message_sync (message);
> - }
> +
> + _notmuch_message_sync (message);
>  
>   notmuch_message_destroy (message);
>  
> @@ -1348,7 +1351,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
>   }
>  }
>  
> -notmuch->features = target_features;
>  db->set_metadata ("features", _print_features (local, 
> notmuch->features));
>  db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
>
> ___
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch