Re: [Monotone-devel] 3 proposed changes to manifest/changeset format

2005-05-11 Thread Nathaniel Smith
On Wed, May 11, 2005 at 10:26:33AM +0800, Matt Johnston wrote:
 On Tue, May 10, 2005 at 03:26:40AM -0700, Nathaniel Smith wrote:
  Some smaller changes are in the offing, though:
  
  - First-class directory support 
 
 If it'll fix dir dropping etc, sounds sane. Would this have
 consequences for making the root directory renamable? I'd
 like to see renamable root directories eventually getting
 supported, as it seems to provide a convenient way to bring
 third party branches into a project (ie / of the
 hypothetical org.sqlite.sqlite branch gets renamed to
 sqlite/ when propagated to the net.venge.monotone branch). 

I'm interested in this feature too.  This particular change doesn't
have any direct effect, except that we might want to come up with some
name for the root directory.  (Ugh.)  Or I guess we could later add a
rename_root changeset op.  (Ugh.)

The sort of general goal is to make the revision/changeset/manifest
format be as close a match as possible to the actual underlying
domain; they should exactly record what really happened, and not
record anything else.  (For instance, unique persistent file
identifiers do not actually exist in the model domain; they're
something additional that a VCS might add to make its job easier.  So,
they shouldn't be in this format.)  The meta-goal is to make this
data neutral between different sorts of internal representations and
algorithms that might want to run over it, so we can adjust those
parts as needed.

  - Patches on delete_file: Currently, there's an asymmetry in the
  change_set format -- add_file foo is accompanied by patch foo from
  [] to [initial hash].  delete_file foo, on the other hand, is not
  accompanied by a patch foo from [final hash] to [].  The reason
  for this is that such a hash would be redundant, but I think it's a
  decision worth revisiting.
 
 My gut feeling is that having the explicit patch to [] is
 likely to increase the chances of triggering invariants in
 edge/untested cases for change_set.cc, which is a good thing.

Good point.

  - Attributes in manifests: this is a speculative idea, raised for
  discussion -- if we use basic_io for manifests, it becomes pretty
  straightforward to include file attributes (like .mt-attrs currently
  stores) directly into the manifest, if we want to.  Presumably this
  would also involve adding set_attr and delete_attr type operations
  to changesets.
 
 One issue here is that without changes to the revision
 format, it'd be possible for a revision to have no changeset
 information yet differing old_ and new_ manifests, when only
 the attributes are altered?

Yeah, right now manifests are technically redundant, since they are
fully reconstructible from the changeset history.  This seems like an
important property to preserve.  (Why have manifests at all, then?
Because they provide end-to-end tree state integrity checking -- there
might be bugs in the change set logic, but so long as the hash of the
manifest of my checkout matches the hash of the manifest of your
checkin, we can be certain we have the same tree.  They're probably
handy for things like partial pulls, too, where one wouldn't have
access to full history...)
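
A minimal sketch of the redundancy-plus-integrity point, in Python
rather than monotone's C++.  The tuple layout, the "hash  path" line
format and the function names are assumptions made for illustration,
not monotone's actual code: the manifest is rebuilt purely by replaying
changesets, and its hash is what two parties would compare.

```python
import hashlib

def manifest_hash(manifest):
    """Hash a manifest (dict of path -> file hash) in a canonical order.

    The "hash  path" line layout is an assumption for this sketch.
    """
    lines = ["%s  %s\n" % (h, path) for path, h in sorted(manifest.items())]
    return hashlib.sha1("".join(lines).encode("utf-8")).hexdigest()

def apply_changeset(manifest, changeset):
    """Apply a list of (op, path, new_hash) tuples to a copy of manifest."""
    result = dict(manifest)
    for op, path, new_hash in changeset:
        if op == "delete_file":
            del result[path]
        elif op in ("add_file", "patch"):
            result[path] = new_hash
        else:
            raise ValueError("unknown op %r" % op)
    return result

def rebuild(history):
    """Reconstruct the final manifest from an ordered changeset history."""
    manifest = {}
    for changeset in history:
        manifest = apply_changeset(manifest, changeset)
    return manifest
```

If the changeset logic had a bug, the rebuilt manifest's hash would
stop matching the hash recorded at checkin, which is the end-to-end
check described above.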

-- Nathaniel

-- 
Details are all that matters; God dwells there, and you never get to
see Him if you don't struggle to get them right. -- Stephen Jay Gould


___
Monotone-devel mailing list
Monotone-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/monotone-devel


Re: [Monotone-devel] 3 proposed changes to manifest/changeset format

2005-05-11 Thread Florian Weimer
* Matt Johnston:

 If it'll fix dir dropping etc, sounds sane. Would this have
 consequences for making the root directory renamable? I'd
 like to see renamable root directories eventually getting
 supported, as it seems to provide a convenient way to bring
 third party branches into a project

I'm not sure if this is going to work.  You could end up with multiple
instances of the same file, which will be rather painful to deal with
(if I'm not mistaken).




Re: [Monotone-devel] 3 proposed changes to manifest/changeset format

2005-05-11 Thread Nathaniel Smith
On Thu, May 12, 2005 at 12:13:11AM +0200, Florian Weimer wrote:
 * Matt Johnston:
 
  If it'll fix dir dropping etc, sounds sane. Would this have
  consequences for making the root directory renamable? I'd
  like to see renamable root directories eventually getting
  supported, as it seems to provide a convenient way to bring
  third party branches into a project
 
 I'm not sure if this is going to work.  You could end up with multiple
 instances of the same file, which will be rather painful to deal with
 (if I'm not mistaken).

How do you mean?  I don't see how it's different from any other sort
of directory renaming.

It does require directory suturing to be useful, though.  Hmm.
(Because for the project-combination case to work, you need to be able
to unify two root directories; and if root directories are directories
like any other, this means directory suturing needs to work.)
(Suturing means taking two logically distinct files or directories and
melding them into a single one.)

-- Nathaniel

-- 
When the flush of a new-born sun fell first on Eden's green and gold,
Our father Adam sat under the Tree and scratched with a stick in the mould;
And the first rude sketch that the world had seen was joy to his mighty heart,
Till the Devil whispered behind the leaves, It's pretty, but is it Art?
  -- The Conundrum of the Workshops, Rudyard Kipling




[Monotone-devel] 3 proposed changes to manifest/changeset format

2005-05-10 Thread Nathaniel Smith
Just a heads up, and to get some peer-review, on some changes to the
basic manifest/revision format that are probably coming up.  Though
there was some discussion of radical changes related to switching to
a new merge algorithm, at the moment it looks like that will _not_
happen, and the external formats will continue mostly as they are.
Some smaller changes are in the offing, though:

- First-class directory support: basically meaning, put directories
into manifests, and add an add_dir operation to changesets
(currently there is {add,rename,delete}_file, and
{rename,delete}_dir; this completes the set).  The big user-visible
consequence of this is that it will make checking in empty directories
possible, which will probably make a lot of people happy (though I'm
still not sure why ;-)).  The really big consequence of this is that
some subtle-but-important cases become much more sane to support;
there is discussion in the bug tracker:
  https://savannah.nongnu.org/bugs/?func=detailitem&item_id=12070
For most people, though -- empty directories.  Good stuff.  This is
almost certainly happening.

How exactly to modify the manifest format is not entirely settled yet.
One possibility is to denote directories by filenames ending in /, and
some sort of placeholder hashes.  Another possibility is to take the
opportunity to switch the manifest format over to using basic_io, like
most other parts of monotone (including, notably, revisions and
changesets).
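
A minimal sketch of the first possibility, in Python.  Everything here
is hypothetical (the placeholder value, the dict layout, the function
names); it only shows how trailing-slash entries would let an empty
directory survive in a manifest.

```python
# Hypothetical placeholder hash for directory entries; the real value,
# if this option were chosen, is not settled anywhere in this thread.
DIR_PLACEHOLDER = "0" * 40

def build_manifest(files, dirs):
    """Build a manifest dict of path -> hash.

    files: dict of file path -> content hash
    dirs:  iterable of directory paths, with or without a trailing "/"
    """
    manifest = {}
    for d in dirs:
        manifest[d.rstrip("/") + "/"] = DIR_PLACEHOLDER
    manifest.update(files)
    return manifest

def empty_dirs(manifest):
    """Return the directory entries that contain no other entry."""
    dirs = [p for p in manifest if p.endswith("/")]
    return sorted(
        d for d in dirs
        if not any(p != d and p.startswith(d) for p in manifest)
    )
```

With files-only manifests, a directory exists only as a prefix of some
file path, so an empty directory has nothing to represent it.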

- Patches on delete_file: Currently, there's an asymmetry in the
change_set format -- add_file foo is accompanied by patch foo from
[] to [initial hash].  delete_file foo, on the other hand, is not
accompanied by a patch foo from [final hash] to [].  The reason
for this is that such a hash would be redundant, but I think it's a
decision worth revisiting.

The lack of symmetry alone doesn't bother me that much, really.  What
bothers me more is that because of this, changesets are not inherently
invertible -- invert_change_set has to have access to the pre-state
manifest, exactly so that it can invert delete_file's.  This is a bit
annoying, and certainly slows down change_set inversion (which is
needed by things like diff, annotate, etc.).  (Though I guess it's
possible diff should switch to something more codevilley?  I'm not
sure what that would be, but Richard's rename bug seems to mean that
our current diff algorithm is non-deterministic when it comes to
determining tree rearrangements and file identities.)
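
A minimal sketch of the invertibility point, in Python.  The
(op, path, old_hash, new_hash) tuples are an assumed simplification,
not the real change_set structures, and invert() is a stand-in for
invert_change_set, not its implementation.  A delete_file that carries
its pre-state hash can be inverted from the changeset alone; one that
does not (old_hash is None here) cannot.

```python
EMPTY = ""  # stands for the empty hash written [] above

def invert(changeset):
    """Invert a changeset using only the information it carries.

    changeset: list of (op, path, old_hash, new_hash) tuples.
    """
    inverse = []
    for op, path, old, new in reversed(changeset):
        if op == "add_file":
            # add_file always comes with a patch from [] to the new hash
            inverse.append(("delete_file", path, new, EMPTY))
        elif op == "delete_file":
            if old is None:
                # current format: the final hash is not recorded, so the
                # pre-state manifest would be needed to recover it
                raise ValueError(
                    "cannot invert delete_file %r without the manifest" % path)
            inverse.append(("add_file", path, EMPTY, old))
        elif op == "patch":
            inverse.append(("patch", path, new, old))
        else:
            raise ValueError("unknown op %r" % op)
    return inverse
```

With the patch present on delete_file, inverting twice gives back the
original changeset and no manifest lookup is involved.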

It also is in some sense the cause of the one bug we've ever had that
came close to losing history; there's a whole section in UPGRADE on
this bug, that was fixed in 0.17.  When generating the root revision,
monotone left stuff out of the changeset; netsync, then, which looks
almost entirely at changesets, could only see file versions whose
hashes were later mentioned in a changeset, and would silently fail to
fetch some files.  The only files that this applied to were ones that
had been imported, and then immediately deleted without being
modified; if a file was later added, or later edited, or existed in
the head revision, then all of its states were automatically visible
to netsync.  If delete_file had included a patch, then this bug
could never have occurred at all.
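
A minimal sketch of why, in Python.  This is a deliberately simplified
model and not netsync's code; the tuple layout and the patch_on_delete
switch are assumptions.  It collects the file hashes a changeset-only
reader can see, with and without a patch on delete_file.

```python
def referenced_hashes(changesets, patch_on_delete):
    """Return the set of file hashes visible from changesets alone.

    changesets: list of changesets, each a list of
                (op, path, old_hash, new_hash) tuples.
    patch_on_delete: if False, model the current format, in which
                delete_file names no hash at all.
    """
    seen = set()
    for changeset in changesets:
        for op, path, old, new in changeset:
            if op == "delete_file" and not patch_on_delete:
                continue  # the deleted file's final hash is invisible
            seen.update(h for h in (old, new) if h)
    return seen

# The buggy root changeset omits "tmp"; the next revision deletes it
# without ever modifying it.
history = [
    [("add_file", "keep", "", "k1")],
    [("delete_file", "tmp", "t1", "")],
]
```

In this model, referenced_hashes(history, False) never sees "t1",
which corresponds to the file version that was silently not fetched;
with patch_on_delete=True the delete itself names it.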

I'm not really sure how great the second argument is, since we don't
expect to re-introduce the bug :-).

Anyway, seemed worth raising for consideration.

(graydon, I'd be particularly interested to hear your opinion on this one,
since you haven't seen it before...)

- Attributes in manifests: this is a speculative idea, raised for
discussion -- if we use basic_io for manifests, it becomes pretty
straightforward to include file attributes (like .mt-attrs currently
stores) directly into the manifest, if we want to.  Presumably this
would also involve adding set_attr and delete_attr type operations
to changesets.
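
A minimal sketch of what such operations might look like, in Python.
The entry layout and the function name are hypothetical, and the
"execute" attribute in the usage below is only an illustrative value;
nothing here reflects a decided format.

```python
def apply_attr_ops(manifest, ops):
    """Apply set_attr / delete_attr operations to a copy of a manifest.

    manifest: dict of path -> {"hash": str, "attrs": dict}
    ops: list of ("set_attr", path, key, value) or
         ("delete_attr", path, key) tuples.
    """
    result = {
        path: {"hash": entry["hash"], "attrs": dict(entry["attrs"])}
        for path, entry in manifest.items()
    }
    for op in ops:
        kind, path, key = op[0], op[1], op[2]
        if path not in result:
            raise KeyError("no such path in manifest: %r" % path)
        if kind == "set_attr":
            result[path]["attrs"][key] = op[3]
        elif kind == "delete_attr":
            del result[path]["attrs"][key]
        else:
            raise ValueError("unknown op %r" % kind)
    return result
```

For example, applying [("set_attr", "run.sh", "execute", "true")]
changes the manifest while leaving every file hash untouched, so an
attribute-only revision would still carry a non-empty changeset.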

There are a number of possible trade-offs to this.  With cdv-merge, it
should be reasonably straightforward to actually deal with merging
such things.  (They'd be handled exactly the same way as filenames.)
It would still make the merging more complex, though.  It would also
increase conceptual complexity some, I think, to have more sorts of
versioned data, and it would certainly increase UI complexity in some
ways (e.g., we need ways to annotate attributes, signal conflicts in
attribute changes and let the user provide a resolution, etc.).  It
would close the door on making attributes apply to wildcards (possible
use case: set an attribute on *.pdf that gave a hint to the merger
how to handle pdfs, e.g., with an external program).  I'm not sure
that this would be a good idea anyway, though.

The motivation behind the current .mt-attrs idea is basically that
we version files; if you want some data to be versioned
(meaning, old versions stored and retrievable, merging to happen,
etc.), then store that data in a file.  It is still a bit of an
expedient hack, though, and