Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features

2014-08-24 Thread David Bremner
Austin Clements amdra...@mit.edu writes:

 Quoth David Bremner on Aug 23 at  8:58 pm:
 Austin Clements amdra...@mit.edu writes:
 
  @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t 
  *notmuch)
   notmuch_bool_t
   notmuch_database_needs_upgrade (notmuch_database_t *notmuch)
   {
  -return notmuch-needs_upgrade;
  +return notmuch-mode == NOTMUCH_DATABASE_MODE_READ_WRITE 
  +(NOTMUCH_FEATURES_CURRENT  ~notmuch-features);
   }
 
  Maybe I'm not thinking hard enough here, but how does this deal with a
  feature that is needed to open a database in read only mode? Maybe it
  needs a comment for people not as clever as Austin ;).
 
  I'm not quite sure what you mean.  notmuch_database_needs_upgrade
  returns false for read-only databases because you can't upgrade a
  read-only database.  This was true before this patch, too, though it was
  less obvious.  (Maybe that's not what you're asking?)
 
 Yes, that's what I was asking. I guess it's orthogonal to your patch
 series, but the logic of returning FALSE for read only databases is not
 very intuitive to me (in the sense that needs upgrade is not the
 opposite of can't be upgraded.  Your new comment later in the series
 is better, but it would IMHO be even better if you mentioned the read
 only case.

 That's a good point.  Ideally this should be
 notmuch_database_can_upgrade or something, which I think would
 capture exactly what it means after this series.  However, I don't
 think it's worth breaking API compatibility given that this is already
 how callers use this function (even though that violates the current
 library spec).

 So, how's this for a more updated doc comment on needs_upgrade?

LGTM
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features

2014-08-23 Thread Jani Nikula
On Fri, 01 Aug 2014, Austin Clements amdra...@mit.edu wrote:
 Previously, our database schema was versioned by a single number.
 Each database schema change had to occur atomically in Notmuch's
 development history: before some commit, Notmuch used version N, after
 that commit, it used version N+1.  Hence, each new schema version
 could introduce only one change, the task of developing a schema
 change fell on a single person, and it all had to happen and be
 perfect in a single commit series.  This made introducing a new schema
 version hard.  We've seen only two schema changes in the history of
 Notmuch.

 This commit introduces database schema version 3; hopefully the last
 schema version we'll need for a while.  With this version, we switch
 from a single version number to features: a set of named,
 independent aspects of the database schema.

 Features should make backwards compatibility easier.  For many things,
 it should be easy to support databases both with and without a
 feature, which will allow us to make upgrades optional and will enable
 unstable features that can be developed and tested over time.

 Features also make forwards compatibility easier.  The features
 recorded in a database include compatibility flags, which can
 indicate to an older version of Notmuch when it must support a given
 feature to open the database for read or for write.  This lets us
 replace the old vague I don't recognize this version, so something
 might go wrong, but I promise to try my best warnings upon opening a
 database with an unknown version with precise errors.  If a database
 is safe to open for read/write despite unknown features, an older
 version will know that and issue no message at all.  If the database
 is not safe to open for read/write because of unknown features, an
 older version will know that, too, and can tell the user exactly which
 required features it lacks support for.

I agree with the change overall; it might be useful to have another set
of eyes on the patch though.

 ---
  lib/database-private.h |  57 ++-
  lib/database.cc| 190 
 -
  2 files changed, 213 insertions(+), 34 deletions(-)

 diff --git a/lib/database-private.h b/lib/database-private.h
 index d3e65fd..2ffab33 100644
 --- a/lib/database-private.h
 +++ b/lib/database-private.h
 @@ -41,11 +41,15 @@ struct _notmuch_database {
  
  char *path;
  
 -notmuch_bool_t needs_upgrade;
  notmuch_database_mode_t mode;
  int atomic_nesting;
  Xapian::Database *xapian_db;
  
 +/* Bit mask of features used by this database.  Features are
 + * named, independent aspects of the database schema.  This is a
 + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */
 +unsigned int features;
 +
  unsigned int last_doc_id;
  uint64_t last_thread_id;
  
 @@ -55,6 +59,57 @@ struct _notmuch_database {
  Xapian::ValueRangeProcessor *date_range_processor;
  };
  
 +/* Bit masks for _notmuch_database::features. */
 +enum {
 +/* If set, file names are stored in file-direntry terms.  If
 + * unset, file names are stored in document data.
 + *
 + * Introduced: version 1.  Implementation support: both for read;
 + * required for write. */

Maybe I'm dense, but the implementation support comments could be
clearer.

 +NOTMUCH_FEATURE_FILE_TERMS = 1  0,
 +
 +/* If set, directory timestamps are stored in documents with
 + * XDIRECTORY terms and relative paths.  If unset, directory
 + * timestamps are stored in documents with XTIMESTAMP terms and
 + * absolute paths.
 + *
 + * Introduced: version 1.  Implementation support: required. */
 +NOTMUCH_FEATURE_DIRECTORY_DOCS = 1  1,
 +
 +/* If set, the from, subject, and message-id headers are stored in
 + * message document values.  If unset, message documents *may*
 + * have these values, but if the value is empty, it must be
 + * retrieved from the message file.
 + *
 + * Introduced: optional in version 1, required as of version 3.
 + * Implementation support: both.
 + */
 +NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES = 1  2,
 +
 +/* If set, folder terms are boolean and path terms exist.  If
 + * unset, folder terms are probabilistic and stemmed and path
 + * terms do not exist.
 + *
 + * Introduced: version 2.  Implementation support: required. */
 +NOTMUCH_FEATURE_BOOL_FOLDER = 1  3,
 +};
 +
 +/* Prior to database version 3, features were implied by the database
 + * version number, so hard-code them for earlier versions. */
 +#define NOTMUCH_FEATURES_V0 (0)
 +#define NOTMUCH_FEATURES_V1 (NOTMUCH_FEATURES_V0 | 
 NOTMUCH_FEATURE_FILE_TERMS | \
 +  NOTMUCH_FEATURE_DIRECTORY_DOCS)
 +#define NOTMUCH_FEATURES_V2 (NOTMUCH_FEATURES_V1 | 
 NOTMUCH_FEATURE_BOOL_FOLDER)
 +
 +/* Current database features.  If any of these are missing from a
 + * database, request an upgrade.
 + * 

Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features

2014-08-23 Thread David Bremner
Austin Clements amdra...@mit.edu writes:
  
 +/* Bit mask of features used by this database.  Features are
 + * named, independent aspects of the database schema.  This is a
 + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */
 +unsigned int features;

Should we be using a fixed size integer (uint_32t or whatever) for
features? iirc the metadata in the database is actually a string, so I
guess arbitrary precision there.

 +/* Bit masks for _notmuch_database::features. */
 +enum {
 +/* If set, file names are stored in file-direntry terms.  If
 + * unset, file names are stored in document data.
 + *
 + * Introduced: version 1.  Implementation support: both for read;
 + * required for write. */
 +NOTMUCH_FEATURE_FILE_TERMS = 1  0,

I agree with Jani that the Implementation support: part is a bit
mystifying without the commit message. Maybe part of the commit message
could migrate here? Or maybe just add a pointer to the comment in database.cc.

 + if (! *incompat_out)

Should we support passing NULL for incompat_out? or at least check for
it?

 @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t 
 *notmuch)
  notmuch_bool_t
  notmuch_database_needs_upgrade (notmuch_database_t *notmuch)
  {
 -return notmuch-needs_upgrade;
 +return notmuch-mode == NOTMUCH_DATABASE_MODE_READ_WRITE 
 + (NOTMUCH_FEATURES_CURRENT  ~notmuch-features);
  }

Maybe I'm not thinking hard enough here, but how does this deal with a
feature that is needed to open a database in read only mode? Maybe it
needs a comment for people not as clever as Austin ;).

d

___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features

2014-08-23 Thread Austin Clements
Quoth Jani Nikula on Aug 23 at  7:02 pm:
 On Fri, 01 Aug 2014, Austin Clements amdra...@mit.edu wrote:
  Previously, our database schema was versioned by a single number.
  Each database schema change had to occur atomically in Notmuch's
  development history: before some commit, Notmuch used version N, after
  that commit, it used version N+1.  Hence, each new schema version
  could introduce only one change, the task of developing a schema
  change fell on a single person, and it all had to happen and be
  perfect in a single commit series.  This made introducing a new schema
  version hard.  We've seen only two schema changes in the history of
  Notmuch.
 
  This commit introduces database schema version 3; hopefully the last
  schema version we'll need for a while.  With this version, we switch
  from a single version number to features: a set of named,
  independent aspects of the database schema.
 
  Features should make backwards compatibility easier.  For many things,
  it should be easy to support databases both with and without a
  feature, which will allow us to make upgrades optional and will enable
  unstable features that can be developed and tested over time.
 
  Features also make forwards compatibility easier.  The features
  recorded in a database include compatibility flags, which can
  indicate to an older version of Notmuch when it must support a given
  feature to open the database for read or for write.  This lets us
  replace the old vague I don't recognize this version, so something
  might go wrong, but I promise to try my best warnings upon opening a
  database with an unknown version with precise errors.  If a database
  is safe to open for read/write despite unknown features, an older
  version will know that and issue no message at all.  If the database
  is not safe to open for read/write because of unknown features, an
  older version will know that, too, and can tell the user exactly which
  required features it lacks support for.
 
 I agree with the change overall; it might be useful to have another set
 of eyes on the patch though.
 
  ---
   lib/database-private.h |  57 ++-
   lib/database.cc| 190 
  -
   2 files changed, 213 insertions(+), 34 deletions(-)
 
  diff --git a/lib/database-private.h b/lib/database-private.h
  index d3e65fd..2ffab33 100644
  --- a/lib/database-private.h
  +++ b/lib/database-private.h
  @@ -41,11 +41,15 @@ struct _notmuch_database {
   
   char *path;
   
  -notmuch_bool_t needs_upgrade;
   notmuch_database_mode_t mode;
   int atomic_nesting;
   Xapian::Database *xapian_db;
   
  +/* Bit mask of features used by this database.  Features are
  + * named, independent aspects of the database schema.  This is a
  + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */
  +unsigned int features;
  +
   unsigned int last_doc_id;
   uint64_t last_thread_id;
   
  @@ -55,6 +59,57 @@ struct _notmuch_database {
   Xapian::ValueRangeProcessor *date_range_processor;
   };
   
  +/* Bit masks for _notmuch_database::features. */
  +enum {
  +/* If set, file names are stored in file-direntry terms.  If
  + * unset, file names are stored in document data.
  + *
  + * Introduced: version 1.  Implementation support: both for read;
  + * required for write. */
 
 Maybe I'm dense, but the implementation support comments could be
 clearer.

No, this was too terse.  The intent was to state what the current
library implementation supports: e.g., the current implementation
supports reading from databases both with and without
NOTMUCH_FEATURE_FILE_TERMS, but if you attempt to write a database
without this feature, some operations may return
NOTMUCH_STATUS_UPGRADE_REQUIRED (introduced in a later patch).

I wonder, though, if the implementation support bits should just be
omitted.  These comments are inevitably going to get out of sync with
the real implementation, and in this case an incorrect comment may be
worse than no comment.  I could put a comment over the whole enum
giving a more general description:

/* Bit masks for _notmuch_database::features.  Features are named,
 * independent aspects of the database schema.
 *
 * A database stores the set of features that it uses (implicitly
 * before database version 3 and explicitly as of version 3).
 *
 * A given library version will recognize a particular set of
 * features; if a database uses a feature that the library does not
 * recognize, the library will refuse to open it.  It is assumed the
 * set of recognized features grows monotonically over time.  A
 * library version will implement some subset of the recognized
 * features: some operations may require that the database use (or not
 * use) some feature, while other operations may support both
 * databases that use and that don't use some feature.
 *
 * On disk, the database stores string names for these features (see
 * the 

Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features

2014-08-23 Thread Austin Clements
On Sat, 23 Aug 2014, David Bremner da...@tethera.net wrote:
 Austin Clements amdra...@mit.edu writes:
  
 +/* Bit mask of features used by this database.  Features are
 + * named, independent aspects of the database schema.  This is a
 + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */
 +unsigned int features;

 Should we be using a fixed size integer (uint_32t or whatever) for
 features? iirc the metadata in the database is actually a string, so I
 guess arbitrary precision there.

Right; this doesn't matter for the on-disk format because these don't
appear on disk.  But you're right that in principle we could overflow
this, leading to subtle bugs.  I moved the enum above struct
_notmuch_database, gave it a name and bitwise operators for C++, and
used that enum name everywhere, so precision should never be a problem.

 +/* Bit masks for _notmuch_database::features. */
 +enum {
 +/* If set, file names are stored in file-direntry terms.  If
 + * unset, file names are stored in document data.
 + *
 + * Introduced: version 1.  Implementation support: both for read;
 + * required for write. */
 +NOTMUCH_FEATURE_FILE_TERMS = 1  0,

 I agree with Jani that the Implementation support: part is a bit
 mystifying without the commit message. Maybe part of the commit message
 could migrate here? Or maybe just add a pointer to the comment in database.cc.

I stripped these out because I don't think they're maintainable.  See my
reply to Jani.

 +if (! *incompat_out)

 Should we support passing NULL for incompat_out? or at least check for
 it?

Added a guard so it's safe to pass NULL.

 @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t 
 *notmuch)
  notmuch_bool_t
  notmuch_database_needs_upgrade (notmuch_database_t *notmuch)
  {
 -return notmuch-needs_upgrade;
 +return notmuch-mode == NOTMUCH_DATABASE_MODE_READ_WRITE 
 +(NOTMUCH_FEATURES_CURRENT  ~notmuch-features);
  }

 Maybe I'm not thinking hard enough here, but how does this deal with a
 feature that is needed to open a database in read only mode? Maybe it
 needs a comment for people not as clever as Austin ;).

I'm not quite sure what you mean.  notmuch_database_needs_upgrade
returns false for read-only databases because you can't upgrade a
read-only database.  This was true before this patch, too, though it was
less obvious.  (Maybe that's not what you're asking?)
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch


Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained features

2014-08-23 Thread David Bremner
Austin Clements amdra...@mit.edu writes:

 @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t 
 *notmuch)
  notmuch_bool_t
  notmuch_database_needs_upgrade (notmuch_database_t *notmuch)
  {
 -return notmuch-needs_upgrade;
 +return notmuch-mode == NOTMUCH_DATABASE_MODE_READ_WRITE 
 +   (NOTMUCH_FEATURES_CURRENT  ~notmuch-features);
  }

 Maybe I'm not thinking hard enough here, but how does this deal with a
 feature that is needed to open a database in read only mode? Maybe it
 needs a comment for people not as clever as Austin ;).

 I'm not quite sure what you mean.  notmuch_database_needs_upgrade
 returns false for read-only databases because you can't upgrade a
 read-only database.  This was true before this patch, too, though it was
 less obvious.  (Maybe that's not what you're asking?)

Yes, that's what I was asking. I guess it's orthogonal to your patch
series, but the logic of returning FALSE for read only databases is not
very intuitive to me (in the sense that needs upgrade is not the
opposite of can't be upgraded.  Your new comment later in the series
is better, but it would IMHO be even better if you mentioned the read
only case.

d
___
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch