Re: [darcs-users] Fwd: Darcs Problems

John Lato Mon, 04 Mar 2013 00:21:00 -0800

On Mon, Mar 4, 2013 at 12:50 PM, Stephen J. Turnbull <step...@xemacs.org> wrote:

> John Lato writes:
>
>  > Besides, that's rather beside the point.  git-rebase is cheap because
>  > it's very common, and it needs to be cheap to be used widely in order to
>  > keep histories clean.  Which was my main point.  Unless you're arguing
>  > that because git-rebase is usually fast, git users just use it
>  > arbitrarily without regard to a clean history?
>
> Sorry about introducing the performance red herring here.  What I want
> to argue is that rebase allowing you to mandate a linear history (a
> better characterization of what you really get from rebase) is
> insufficient to keep history clean in the sense of allowing easy
> cherry-picking: you also need a well-factored design.
>

Agreed.

>  > >  > 1.  Every patch applied after a base patch depends on the base
>  > >  >     patch.
>  > >  > 2.  A base patch doesn't depend on anything, but it conflicts
>  > >  >     anywhere its dependents would conflict.
>  > >
>  > > I take it a base patch must be empty?  Surely it would depend on the
>  > > patch that adds the only file the base patch patches, for example? ;-)
>  >
>  > I highly doubt that situation would come up much in practice,
>
> I don't understand what you mean.  Every file gets added at some
> point.  This means that you can go all the way back to the beginning
> of the world unless the patch is empty.
>

I think I've been insufficiently clear on one point here: a base patch
would *record* all dependencies/contexts/etc, but not *require* them.  To
hopefully better explain myself, I'd like to offer a few worked-out
examples.  Apologies in advance, since this could get long.

(edit: it does get long, and also long-winded.  But I think some clarity is
achieved at the end)

We start with a single patch O establishing a shopping list:

O: add file s_list containing
  1 Apples
  2 Bananas
  3 Cookies
  4 Rice

Now user Arjan clones the repo and adds patch A
O+A (deps O): file s_list contains
  1 Apples
  2 Bananas
  3 Beer
  4 Cookies
  5 Rice

Now user John clones the repo and adds a base patch bp0, then adds another
patch B.  His repo has

O + bp0 (deps B->O) + B (deps bp0):
file s_list contains
  1 Apples
  2 Bananas
  3 Nuts
  4 Cookies
  5 Rice

Since bp0 is a base patch, darcs calculates a transitive dependency list
for it. This dependency list is everything that bp0's dependents would
depend on if the base patch weren't present.
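
To make that calculation concrete, here is a minimal sketch in Python. It
is only a toy model under stated assumptions, not how darcs stores
anything: `explicit_deps` (the recorded dependencies) and `context_deps`
(what each patch would need if the base patch were absent) are hypothetical
inputs, and both function names are made up for illustration.

```python
def dependents_of(base, explicit_deps):
    """All patches that directly or transitively depend on `base`."""
    found = set()
    changed = True
    while changed:
        changed = False
        for patch, deps in explicit_deps.items():
            if patch not in found and (base in deps or deps & found):
                found.add(patch)
                changed = True
    return found


def base_patch_dependency_list(base, explicit_deps, context_deps):
    """Pairs (dependent, needed) that the base patch would record.

    A pair is recorded when a dependent of `base` needs a patch that lies
    outside the set of patches sitting on top of the base patch.
    """
    inside = dependents_of(base, explicit_deps)
    pairs = set()
    for patch in inside:
        for needed in context_deps.get(patch, set()):
            if needed != base and needed not in inside:
                pairs.add((patch, needed))
    return sorted(pairs)


# John's repository: O + bp0 + B (deps bp0); B textually needs O.
explicit = {"O": set(), "bp0": set(), "B": {"bp0"}}
context = {"B": {"O"}}
john_list = base_patch_dependency_list("bp0", explicit, context)
```

With these inputs `john_list` holds the single pair ("B", "O"), which is
what the "deps B->O" notation above is meant to express.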

Scenario 1: Ganesh clones the repo and wants to pull from John, so he'd be
getting B+bp0.

B (deps bp0) + bp0 (deps B->O)

Next darcs checks for conflicts.  The idea is to essentially pretend the
base patch doesn't exist and see if everything commutes as it's supposed
to.  In this case, bp0 records a transitive dependency on O, which is
present.  There are no other patches, so everything's fine and we're done.
 If there were other patches, so long as they didn't change this specific
context everything would still commute, so still no problems.

Scenario 2: Arjan wants to pull from John, again he'd be getting B+bp0.

The transitive dependency O is present, but this time there's a problem
because the patches A and B don't commute.  So darcs does a forced
commutation of those patches, marks conflicts, and asks Arjan for a
resolution.  Arjan commits C as the resolution, giving him a repository
with:

O+A (deps O)
    +bp0 (deps B1->O; C->A)
    + B1 (deps bp0) + C (deps B1)

C only directly depends upon B1 (the forced commutation of B with A);
however, the base patch has an updated transitive dependency list that
includes A.
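
The difference between scenarios 1 and 2 can be sketched with a toy
conflict check in Python. This is a deliberate simplification and an
assumption on my part: real darcs commutation is much richer than "two
insertions at the same line conflict", and `Insert`, `conflicts` and `pull`
are hypothetical names.

```python
from typing import List, NamedTuple, Tuple


class Insert(NamedTuple):
    """Toy patch: insert one line of text at a 1-based line of the base file."""
    name: str
    line: int
    text: str


def conflicts(p: Insert, q: Insert) -> bool:
    # Toy rule (an assumption): two insertions made against the same base
    # file conflict when they target the same line.
    return p.line == q.line


def pull(local: List[Insert], incoming: List[Insert]) -> List[Tuple[str, str]]:
    """Return the (local, incoming) pairs that would need manual resolution."""
    return [(l.name, i.name) for l in local for i in incoming if conflicts(l, i)]


A = Insert("A", 3, "Beer")   # Arjan's patch
B = Insert("B", 3, "Nuts")   # John's patch, ignoring the base patch bp0

scenario_1 = pull([], [B])   # Ganesh has no patches of his own besides O
scenario_2 = pull([A], [B])  # Arjan already has A
```

Here `scenario_1` is empty, so B applies cleanly, while `scenario_2`
reports that A and B need a resolution, which is where C comes from.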

Scenario 3: Stephen initializes an empty repository and pulls from John.
 Again, he'd be getting B+bp0

In this case there's an unfulfilled transitive dependency.  That's ok for
now.  But when darcs tries to apply B to an empty repository, it can't.
 Darcs could offer to grab the missing dependency or use a null file.
 Getting the missing dependency from the source repository is simple (it
has to be there, or it wouldn't be recorded in the list).

If you try to apply a patch to a null file without the proper context,
darcs could simply let the developer record a new patch, with no relation
to the original.  Another option would be to let the developer edit the
file with conflict markers in place; however, instead of recording a new
patch, darcs records a patch to create the necessary context and then
applies the patch we want.  So Stephen would edit his file, perhaps to:

  1 Apples
  2 Bananas
  3 Nuts
  4 Rice

and when he records this, darcs commits the following patches:

O1 add file s_list
  1 Apples
  2 Bananas
  4 Cookies
  5 Rice
bp0 (deps B->O1)
B (deps bp0) add line 3:   3 Nuts
D (deps B) change s_list to
  1 Apples
  2 Bananas
  3 Nuts
  4 Rice

The way this works is that Stephen committed a patch that has the desired
final version of the file.  darcs can work backwards by undoing B from that
to determine the initial file with the proper context to apply B.

In this case the context O1 could be dependent on bp0, but I think that's
less likely to be useful.  In particular, if another developer wants to
pull from here, they probably don't want O1, and if they do get a copy the
result will be a duplicate file conflict.

I don't know that this particular case would be useful, but I also don't
know that it isn't.  I do think that this capability is useful in general
though.

I think that's all the interesting scenarios introduced by a base patch.  I
tried to write a few more, but they ended up being either one of these or
just a normal merge.  But if there are cases I haven't covered, please do
let me know since they may result in an implementation hole.

>  > and even if it does I think the correct approach would be to give
>  > the user a choice between aborting, creating a null file, or
>  > attempting to patch a different file.
>
> AFAICS only aborting would make sense in general (you can't apply a
> change-line or delete-line diff to a null file, you can't leave the
> file null and claim to have applied the diff, and darcs tracks renames
> so it should know which different file if there is one).

>  > >  > When trying to apply the base patch to the stable repo, the
>  > >  > base patch will conflict everywhere hotfix's dependencies are
>  > >  > unmet.  To resolve the conflicts, we first undo hotfix,
>
> So by "undo" you mean obliterate the patch but leave the changes
> (including conflict markers) in place, for convenience of editing?
>

I mean create a new patch that undoes the changes of the patch.  An inverse
of the original patch, as described at
 http://en.wikibooks.org/wiki/Understanding_Darcs/Patch_theory_and_conflicts
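
As a rough illustration of what I mean by an inverse, here is a toy
line-based hunk in Python. The `Hunk` class is hypothetical and far simpler
than darcs's real patch types, and the shopping-list contents standing in
for hotfix are only there for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Hunk:
    """Toy patch: at 1-based `line`, remove the `old` lines and add `new`."""
    line: int
    old: Tuple[str, ...]
    new: Tuple[str, ...]

    def apply(self, lines: List[str]) -> List[str]:
        start = self.line - 1
        end = start + len(self.old)
        # The hunk only applies if the lines it expects are really there.
        if tuple(lines[start:end]) != self.old:
            raise ValueError("context mismatch: cannot apply hunk")
        return lines[:start] + list(self.new) + lines[end:]

    def inverse(self) -> "Hunk":
        # Swapping what is removed and what is added undoes the hunk.
        return Hunk(self.line, old=self.new, new=self.old)


base = ["Apples", "Bananas", "Cookies", "Rice"]
hotfix = Hunk(line=3, old=(), new=("Nuts",))

patched = hotfix.apply(base)                 # the file with hotfix applied
restored = hotfix.inverse().apply(patched)   # hotfix followed by its inverse
```

Applying the inverse after the original gives back the starting file, so
`restored` equals `base`, while both patches stay in the history.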

>  > >  > then apply hotfix' (the new patch).  We don't actually touch
>
> What does "actually touch" mean?
>

The base patch has the same identity before and after the merge.  However,
its dependencies are updated depending on everything that depends on it.
 So after hotfix' is applied, the dependencies would be re-calculated.
 Notionally this would happen on-demand, since the dependencies explicitly
don't contribute to the patch's identity.  An actual implementation would
probably re-calculate dependencies every time a patch is applied though.

>  > >  > the base patch when resolving conflicts, but after the
>  > >  > resolution, the base patch's dependents have changed, so now
>  > >  > it will only conflict where hotfix' would (since the original
>  > >  > hotfix is undone).
>  > >
>  > > I don't understand why this changes anything.  The dependencies of
>  > > hotfix are determined by the patches that establish the context it
>  > > needs to be applied to.  I don't see why anything would change for
>  > > hotfix' if it contains the same changes as hotfix.
>  >
>  > hotfix' is a darcs merge patch, so it's hotfix+manual edits.  The manual
>  > edits make the difference :)  Its dependencies would be hotfix,
>  > hotfix(-1) (the inverse patch), and whatever else is between it and the
>  > base patch that's necessary for context.
>
> IIUC, for these effects your "base patch" could be implemented as a
> Darcs tag.
>

Tags are what led me to this idea, but I don't see how they're sufficient
for the problem I'd like to solve here.  What I'm trying to enable is the
following:

repo-stable and repo-dev are two repositories with some common heritage.
I want to be able to make a new branch based on repo-dev, commit some
patches, then pick up just those patches and apply them to repo-stable.

One model that would enable this is exactly that suggested by a modified
darcs pull: pull the patches, fix them up, and commit new patches that are
textually based upon the originals but have no connection to the originals
that darcs is aware of.

The base patch presents an alternative model, i.e. creating a special patch
to take the place of dependencies, then resolving conflicts after the fact.
 This means the original patches can retain their identity.  One downside
is that this special patch complicates the theory, if indeed it's workable
at all.  However I do think it could be hidden from the user for most
operations.

Maybe it's easier if you think of it as dual to tag?  Patches depend on a
tag to get context, whereas base patches (cotags?) depend on patches to
provide context?  Or maybe I've devolved to nonsense...

> I see two problems with the theory you present though.  One is that
> hotfix' is just going to be a new patch.  The other is that AFAIK (see
> caveat below) there is a theorem of patch theory that says hotfix'
> can't depend on both hotfix and -hotfix, since hotfix + -hotfix = 0.
>

As I understand it, this is why forced commutation flips the identities of
the two patch inverses.

> And I still don't understand how you handle context that was
> established outside of the base patch and subsequent patches, unless
> the base patch is implemented as a tag (which guarantees that no such
> context exists).
>

I hope that my examples above have covered this.

>
>  > This workflow would be enabled by the base patch implementation.  What a
>  > base patch provides to the user are two features:
>
> I don't want to be harsh, but I suspect this is based on an incomplete
> understanding of how patch theory works.

This is almost certainly true.

>  I don't claim to understand
> patch theory myself, except that it seems clear that the only way to
> create a given patch that no dependency can "leapfrog" is for the
> given patch to be a tag, ie, a patch that depends on all "preceding"
> patches in the repo and thus defines a version.  Your idea of a base
> patch that doesn't depend on anything doesn't seem consistent with the
> basic ideas of patch theory.
>

A base patch is sort of like a backwards tag.  If you have a patchset that
all depends on a base patch, then you can determine the necessary context
from those patches.  Then when trying to apply the patches to a repository,
you can do whatever's necessary to establish the proper context before
committing the new patches.

Just rambling now, but there are two parts to a patch, the context and the
function.  When you have the proper context, you can apply the function to
get a new context.  Currently, darcs requires that when you pull a patch,
you also pull everything necessary to establish the proper context.  But
what we frequently want to do is pull a patch, see what's necessary to
establish that context manually, then do it.  So if we start with

oAa  (patch A is a function from context o->a)

and we want to pull

bCc  (patch C is a function from b->c)

darcs requires that we pull whatever's necessary to get to 'b' along with
C.  But what I want is to pull C, and let darcs allow me to record

aB'b (patch B' is a function from a->b)

The problem is that after I record such a patch, most likely nobody else
wants it, and it also means that now my 'C' has different dependencies than
the original because we got to the same context via a different path.  The
base patch is an implementation idea to represent exactly that, a specific
context independent of how it was actually arrived at.
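
The oAa / bCc / aB'b notation can be sketched as follows in Python.
Contexts are reduced to plain labels, and `Patch` and `context_definition`
are hypothetical names for illustration; nothing here reflects darcs's
actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """Toy patch: a function from one named context to another."""
    name: str
    pre: str    # context the patch expects
    post: str   # context the patch produces

    def apply(self, current: str) -> str:
        if current != self.pre:
            raise ValueError(f"{self.name} needs context {self.pre!r}, got {current!r}")
        return self.post


def context_definition(name: str, current: str, target: Patch) -> Patch:
    """Record a patch that takes us from `current` to the context `target` needs."""
    return Patch(name, pre=current, post=target.pre)


A = Patch("A", pre="o", post="a")   # oAa
C = Patch("C", pre="b", post="c")   # bCc

here = A.apply("o")                 # our repository is now in context a
B_prime = context_definition("B'", here, C)   # aB'b
final = C.apply(B_prime.apply(here))
```

Trying `C.apply("a")` directly raises an error, since C expects context b;
with B' recorded first, `final` ends up as "c".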

This also means there's an alternative resolution to scenario 2 above:
rather than essentially performing a conflict-resolution merge, Arjan could
have recorded a context-definition merge.  Then he'd have a new
patch in his repo before the base patch to establish the correct context
for 'B'.  This would work the same way as my scenario 3 exposition.

Actually, simply allowing for users to create a context-definition patch
like this would be pretty useful in itself.  A base patch just allows for
that to be more explicit by letting users have a handy way to refer to
re-rootable patchsets.

> It occurs to me that Ben's ideas may imply a sort of "tag algebra" to
> go with "patch algebra".
>

I'm unfamiliar with this; do you have a link?  Seems very relevant.

_______________________________________________
darcs-users mailing list
darcs-users@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-users
