Quoth Tomi Ollila on Aug 08 at 12:55 am: > On Fri, Aug 01 2014, Austin Clements <amdra...@mit.edu> wrote: > > > This is v3 of id:1406652492-27803-1-git-send-email-amdra...@mit.edu. > > This fixes one issue and tidies up another thing in > > notmuch_database_upgrade I found while working on change tracking > > support. Most of the patches are logically identical to v2, but the > > changes ripple through the patch context, so I'm sending a new series. > > > > First, this updates notmuch->features before starting the upgrade, > > rather than after, so that functions called by upgrade will use the > > new database features instead of the old (this didn't matter in this > > series because nothing modified the database differently depending on > > features). Second, this combines multiple _notmuch_message_sync calls > > into one, which cleans up the code, should further improve upgrade > > performance, and makes way for additional per-message upgrades. > > This series looks good to me. I looked through the diffs a few times and > notmuch_database_upgrade() in lib/database.cc to see that in full. > Tests pass (also T530-upgrade now that I downloaded that one test database.) > > I googled answers to few questions along the review; one thing still > interests me -- is there potential to have speed/memory problems > when doing upgrade transaction in large/very large databases. And > how long will the (final) commit_transaction() take (i.e how > many times handle_sigalrm() is called while that is in progress...) > > (my guess is that this transaction just builds a new revision and > if it is never committed the revision is never used -- and data is > written there in some batches of suitable size -- so memory usage > and transaction commit time is O(1))
Your guess is basically right. Xapian periodically flushes stuff to disk during a transaction basically just like it does when not in a transaction; AFAIK the only difference is when it flushes out the new revision number and updated free block metadata. Hence, speed and memory use aren't affected by the use of a transaction, and commit_transaction isn't particularly sensitive to how big the transaction is. > Tomi > > > > > The diff from v2 is below (excluding patch 2, which David pushed). > > > > diff --git a/lib/database.cc b/lib/database.cc > > index b323691..d90a924 100644 > > --- a/lib/database.cc > > +++ b/lib/database.cc > > @@ -1252,6 +1252,10 @@ notmuch_database_upgrade (notmuch_database_t > > *notmuch, > > /* Perform the upgrade in a transaction. */ > > db->begin_transaction (true); > > > > + /* Set the target features so we write out changes in the desired > > + * format. */ > > + notmuch->features = target_features; > > + > > /* Perform per-message upgrades. */ > > if (new_features & > > (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER)) { > > @@ -1280,7 +1284,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, > > if (filename && *filename != '\0') { > > _notmuch_message_add_filename (message, filename); > > _notmuch_message_clear_data (message); > > - _notmuch_message_sync (message); > > } > > talloc_free (filename); > > } > > @@ -1289,10 +1292,10 @@ notmuch_database_upgrade (notmuch_database_t > > *notmuch, > > * probabilistic and stemmed. Change it to the current > > * boolean prefix. Add "path:" prefixes while at it. > > */ > > - if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER) { > > + if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER) > > _notmuch_message_upgrade_folder (message); > > - _notmuch_message_sync (message); > > - } > > + > > + _notmuch_message_sync (message); > > > > notmuch_message_destroy (message); > > > > @@ -1348,7 +1351,6 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, > > } > > } > > > > - notmuch->features = target_features; > > db->set_metadata ("features", _print_features (local, > > notmuch->features)); > > db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION)); > > _______________________________________________ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch