Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features
Austin Clements amdra...@mit.edu writes: Quoth David Bremner on Aug 23 at 8:58 pm: Austin Clements amdra...@mit.edu writes: @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t *notmuch) notmuch_bool_t notmuch_database_needs_upgrade (notmuch_database_t *notmuch) { -return notmuch-needs_upgrade; +return notmuch-mode == NOTMUCH_DATABASE_MODE_READ_WRITE +(NOTMUCH_FEATURES_CURRENT ~notmuch-features); } Maybe I'm not thinking hard enough here, but how does this deal with a feature that is needed to open a database in read only mode? Maybe it needs a comment for people not as clever as Austin ;). I'm not quite sure what you mean. notmuch_database_needs_upgrade returns false for read-only databases because you can't upgrade a read-only database. This was true before this patch, too, though it was less obvious. (Maybe that's not what you're asking?) Yes, that's what I was asking. I guess it's orthogonal to your patch series, but the logic of returning FALSE for read only databases is not very intuitive to me (in the sense that needs upgrade is not the opposite of can't be upgraded. Your new comment later in the series is better, but it would IMHO be even better if you mentioned the read only case. That's a good point. Ideally this should be notmuch_database_can_upgrade or something, which I think would capture exactly what it means after this series. However, I don't think it's worth breaking API compatibility given that this is already how callers use this function (even though that violates the current library spec). So, how's this for a more updated doc comment on needs_upgrade? LGTM ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features
On Fri, 01 Aug 2014, Austin Clements amdra...@mit.edu wrote: Previously, our database schema was versioned by a single number. Each database schema change had to occur atomically in Notmuch's development history: before some commit, Notmuch used version N, after that commit, it used version N+1. Hence, each new schema version could introduce only one change, the task of developing a schema change fell on a single person, and it all had to happen and be perfect in a single commit series. This made introducing a new schema version hard. We've seen only two schema changes in the history of Notmuch. This commit introduces database schema version 3; hopefully the last schema version we'll need for a while. With this version, we switch from a single version number to features: a set of named, independent aspects of the database schema. Features should make backwards compatibility easier. For many things, it should be easy to support databases both with and without a feature, which will allow us to make upgrades optional and will enable unstable features that can be developed and tested over time. Features also make forwards compatibility easier. The features recorded in a database include compatibility flags, which can indicate to an older version of Notmuch when it must support a given feature to open the database for read or for write. This lets us replace the old vague I don't recognize this version, so something might go wrong, but I promise to try my best warnings upon opening a database with an unknown version with precise errors. If a database is safe to open for read/write despite unknown features, an older version will know that and issue no message at all. If the database is not safe to open for read/write because of unknown features, an older version will know that, too, and can tell the user exactly which required features it lacks support for. I agree with the change overall; it might be useful to have another set of eyes on the patch though. --- lib/database-private.h | 57 ++- lib/database.cc| 190 - 2 files changed, 213 insertions(+), 34 deletions(-) diff --git a/lib/database-private.h b/lib/database-private.h index d3e65fd..2ffab33 100644 --- a/lib/database-private.h +++ b/lib/database-private.h @@ -41,11 +41,15 @@ struct _notmuch_database { char *path; -notmuch_bool_t needs_upgrade; notmuch_database_mode_t mode; int atomic_nesting; Xapian::Database *xapian_db; +/* Bit mask of features used by this database. Features are + * named, independent aspects of the database schema. This is a + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */ +unsigned int features; + unsigned int last_doc_id; uint64_t last_thread_id; @@ -55,6 +59,57 @@ struct _notmuch_database { Xapian::ValueRangeProcessor *date_range_processor; }; +/* Bit masks for _notmuch_database::features. */ +enum { +/* If set, file names are stored in file-direntry terms. If + * unset, file names are stored in document data. + * + * Introduced: version 1. Implementation support: both for read; + * required for write. */ Maybe I'm dense, but the implementation support comments could be clearer. +NOTMUCH_FEATURE_FILE_TERMS = 1 0, + +/* If set, directory timestamps are stored in documents with + * XDIRECTORY terms and relative paths. If unset, directory + * timestamps are stored in documents with XTIMESTAMP terms and + * absolute paths. + * + * Introduced: version 1. Implementation support: required. */ +NOTMUCH_FEATURE_DIRECTORY_DOCS = 1 1, + +/* If set, the from, subject, and message-id headers are stored in + * message document values. If unset, message documents *may* + * have these values, but if the value is empty, it must be + * retrieved from the message file. + * + * Introduced: optional in version 1, required as of version 3. + * Implementation support: both. + */ +NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES = 1 2, + +/* If set, folder terms are boolean and path terms exist. If + * unset, folder terms are probabilistic and stemmed and path + * terms do not exist. + * + * Introduced: version 2. Implementation support: required. */ +NOTMUCH_FEATURE_BOOL_FOLDER = 1 3, +}; + +/* Prior to database version 3, features were implied by the database + * version number, so hard-code them for earlier versions. */ +#define NOTMUCH_FEATURES_V0 (0) +#define NOTMUCH_FEATURES_V1 (NOTMUCH_FEATURES_V0 | NOTMUCH_FEATURE_FILE_TERMS | \ + NOTMUCH_FEATURE_DIRECTORY_DOCS) +#define NOTMUCH_FEATURES_V2 (NOTMUCH_FEATURES_V1 | NOTMUCH_FEATURE_BOOL_FOLDER) + +/* Current database features. If any of these are missing from a + * database, request an upgrade. + *
Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features
Austin Clements amdra...@mit.edu writes: +/* Bit mask of features used by this database. Features are + * named, independent aspects of the database schema. This is a + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */ +unsigned int features; Should we be using a fixed size integer (uint_32t or whatever) for features? iirc the metadata in the database is actually a string, so I guess arbitrary precision there. +/* Bit masks for _notmuch_database::features. */ +enum { +/* If set, file names are stored in file-direntry terms. If + * unset, file names are stored in document data. + * + * Introduced: version 1. Implementation support: both for read; + * required for write. */ +NOTMUCH_FEATURE_FILE_TERMS = 1 0, I agree with Jani that the Implementation support: part is a bit mystifying without the commit message. Maybe part of the commit message could migrate here? Or maybe just add a pointer to the comment in database.cc. + if (! *incompat_out) Should we support passing NULL for incompat_out? or at least check for it? @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t *notmuch) notmuch_bool_t notmuch_database_needs_upgrade (notmuch_database_t *notmuch) { -return notmuch-needs_upgrade; +return notmuch-mode == NOTMUCH_DATABASE_MODE_READ_WRITE + (NOTMUCH_FEATURES_CURRENT ~notmuch-features); } Maybe I'm not thinking hard enough here, but how does this deal with a feature that is needed to open a database in read only mode? Maybe it needs a comment for people not as clever as Austin ;). d ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features
Quoth Jani Nikula on Aug 23 at 7:02 pm: On Fri, 01 Aug 2014, Austin Clements amdra...@mit.edu wrote: Previously, our database schema was versioned by a single number. Each database schema change had to occur atomically in Notmuch's development history: before some commit, Notmuch used version N, after that commit, it used version N+1. Hence, each new schema version could introduce only one change, the task of developing a schema change fell on a single person, and it all had to happen and be perfect in a single commit series. This made introducing a new schema version hard. We've seen only two schema changes in the history of Notmuch. This commit introduces database schema version 3; hopefully the last schema version we'll need for a while. With this version, we switch from a single version number to features: a set of named, independent aspects of the database schema. Features should make backwards compatibility easier. For many things, it should be easy to support databases both with and without a feature, which will allow us to make upgrades optional and will enable unstable features that can be developed and tested over time. Features also make forwards compatibility easier. The features recorded in a database include compatibility flags, which can indicate to an older version of Notmuch when it must support a given feature to open the database for read or for write. This lets us replace the old vague I don't recognize this version, so something might go wrong, but I promise to try my best warnings upon opening a database with an unknown version with precise errors. If a database is safe to open for read/write despite unknown features, an older version will know that and issue no message at all. If the database is not safe to open for read/write because of unknown features, an older version will know that, too, and can tell the user exactly which required features it lacks support for. I agree with the change overall; it might be useful to have another set of eyes on the patch though. --- lib/database-private.h | 57 ++- lib/database.cc| 190 - 2 files changed, 213 insertions(+), 34 deletions(-) diff --git a/lib/database-private.h b/lib/database-private.h index d3e65fd..2ffab33 100644 --- a/lib/database-private.h +++ b/lib/database-private.h @@ -41,11 +41,15 @@ struct _notmuch_database { char *path; -notmuch_bool_t needs_upgrade; notmuch_database_mode_t mode; int atomic_nesting; Xapian::Database *xapian_db; +/* Bit mask of features used by this database. Features are + * named, independent aspects of the database schema. This is a + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */ +unsigned int features; + unsigned int last_doc_id; uint64_t last_thread_id; @@ -55,6 +59,57 @@ struct _notmuch_database { Xapian::ValueRangeProcessor *date_range_processor; }; +/* Bit masks for _notmuch_database::features. */ +enum { +/* If set, file names are stored in file-direntry terms. If + * unset, file names are stored in document data. + * + * Introduced: version 1. Implementation support: both for read; + * required for write. */ Maybe I'm dense, but the implementation support comments could be clearer. No, this was too terse. The intent was to state what the current library implementation supports: e.g., the current implementation supports reading from databases both with and without NOTMUCH_FEATURE_FILE_TERMS, but if you attempt to write a database without this feature, some operations may return NOTMUCH_STATUS_UPGRADE_REQUIRED (introduced in a later patch). I wonder, though, if the implementation support bits should just be omitted. These comments are inevitably going to get out of sync with the real implementation, and in this case an incorrect comment may be worse than no comment. I could put a comment over the whole enum giving a more general description: /* Bit masks for _notmuch_database::features. Features are named, * independent aspects of the database schema. * * A database stores the set of features that it uses (implicitly * before database version 3 and explicitly as of version 3). * * A given library version will recognize a particular set of * features; if a database uses a feature that the library does not * recognize, the library will refuse to open it. It is assumed the * set of recognized features grows monotonically over time. A * library version will implement some subset of the recognized * features: some operations may require that the database use (or not * use) some feature, while other operations may support both * databases that use and that don't use some feature. * * On disk, the database stores string names for these features (see * the
Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features
On Sat, 23 Aug 2014, David Bremner da...@tethera.net wrote: Austin Clements amdra...@mit.edu writes: +/* Bit mask of features used by this database. Features are + * named, independent aspects of the database schema. This is a + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */ +unsigned int features; Should we be using a fixed size integer (uint_32t or whatever) for features? iirc the metadata in the database is actually a string, so I guess arbitrary precision there. Right; this doesn't matter for the on-disk format because these don't appear on disk. But you're right that in principle we could overflow this, leading to subtle bugs. I moved the enum above struct _notmuch_database, gave it a name and bitwise operators for C++, and used that enum name everywhere, so precision should never be a problem. +/* Bit masks for _notmuch_database::features. */ +enum { +/* If set, file names are stored in file-direntry terms. If + * unset, file names are stored in document data. + * + * Introduced: version 1. Implementation support: both for read; + * required for write. */ +NOTMUCH_FEATURE_FILE_TERMS = 1 0, I agree with Jani that the Implementation support: part is a bit mystifying without the commit message. Maybe part of the commit message could migrate here? Or maybe just add a pointer to the comment in database.cc. I stripped these out because I don't think they're maintainable. See my reply to Jani. +if (! *incompat_out) Should we support passing NULL for incompat_out? or at least check for it? Added a guard so it's safe to pass NULL. @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t *notmuch) notmuch_bool_t notmuch_database_needs_upgrade (notmuch_database_t *notmuch) { -return notmuch-needs_upgrade; +return notmuch-mode == NOTMUCH_DATABASE_MODE_READ_WRITE +(NOTMUCH_FEATURES_CURRENT ~notmuch-features); } Maybe I'm not thinking hard enough here, but how does this deal with a feature that is needed to open a database in read only mode? Maybe it needs a comment for people not as clever as Austin ;). I'm not quite sure what you mean. notmuch_database_needs_upgrade returns false for read-only databases because you can't upgrade a read-only database. This was true before this patch, too, though it was less obvious. (Maybe that's not what you're asking?) ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features
Austin Clements amdra...@mit.edu writes: @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t *notmuch) notmuch_bool_t notmuch_database_needs_upgrade (notmuch_database_t *notmuch) { -return notmuch-needs_upgrade; +return notmuch-mode == NOTMUCH_DATABASE_MODE_READ_WRITE + (NOTMUCH_FEATURES_CURRENT ~notmuch-features); } Maybe I'm not thinking hard enough here, but how does this deal with a feature that is needed to open a database in read only mode? Maybe it needs a comment for people not as clever as Austin ;). I'm not quite sure what you mean. notmuch_database_needs_upgrade returns false for read-only databases because you can't upgrade a read-only database. This was true before this patch, too, though it was less obvious. (Maybe that's not what you're asking?) Yes, that's what I was asking. I guess it's orthogonal to your patch series, but the logic of returning FALSE for read only databases is not very intuitive to me (in the sense that needs upgrade is not the opposite of can't be upgraded. Your new comment later in the series is better, but it would IMHO be even better if you mentioned the read only case. d ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch