David Roundy wrote:
Another interesting problem
would be that of creating a sort of lexing/parsing language that would
allow customized patch types that are specific to a particular programming
language.  This is a particularly hard problem, as you'd need to have the
parsing always succeed and always give meaningful (and reasonable) results.
And the resulting patches would have to merge and commute in a meaningful
and useful manner.

I understand why you think it is important to create meaningful patches for non-parsing code for those instances when you want to save unstable works in progress.

But, I think there might be a good case for restricting such a patch type to parsing documents (and using hunks if it won't parse). Proper parsing documents make much better semantic sense (and thus better cherry-pickable patches).

I think the big trick here is in commuting parse patches with hunk patches. Let me see if I can explain how I'm thinking this might work...

Assuming, for simplification, that all patches only affect a single file.

A tree structure is the result of parsing a well-formed file state [A]:

B<T> = Parse<T>[A]

A file can be "canonized" (pretty printed) from a tree structure:

[A]' = Canon<T>(B<T>)

[A] is not necessarily equal to [A]'. However, for a given repo the pretty printed [A]' can be considered definitive (it is what the repo owner really "wants"), and we can ignore/forget whatever [A] was. What this means is that for the chosen repo, whenever B<T> can be formed, [A]' should be able to replace [A] without changing the meaning or irritating the user. Who would complain if every time you recorded a parseable file Darcs pretty printed it back? You will need to canonize tree patches this way anyway to create files from the empty tree (assuming you start entirely from tree patches).

You can create a hunk patch in reference to [A]'. The problem is then in the commutation. A hunk patch commuted with a tree patch would result in two hunk patches.

[D] = B<T> C <-> C' B'

At first glance, this seems useless, because if you commute a hunk patch all the way down, you now have hunk representations of all of the tree patches with respect to the canon. [D] may not be parseable, but [E] might be.

[E]' = C' B' F (F would have to be a hunk patch)

This means, that C' B' F would be equivalent to, and thus could be merged into Parse<T>[E]'. In this case, B<T> could then be trivially commuted back out of the merge:

{C F} = Parse<T>[E]' - B<T> (assuming - is the tree diff operation)

What this means is that two or more hunk patches could be composed/merged into a tree patch. Most likely you are going to more interesting in {C F} than in one or the other alone, anyway. Any stable repository with only tree patches (and composed tree patches) could return to sender any single hunk patches for being unstable without even having to build and test the repository, as it would be inherently unparsable, malformed, and thus uncompilable. These composed tree patches thus better model a developer's workflow, as there might be many intermediate patches from each sitting that result in one decent, parseable state.

The next and final diffulty is then in determining if some tree type T and some other U commute. Although I talk about them normal hunk patches, you would have to encode T in the hunk patches so that you can commute them with U. Where this would be used is for things like indent preference that would affect the canonization, but would still be commutative if you translate between the canonical forms. This would only be necessary for unstable repositories where multiple alone tree-hunks would be stored of different canonization types. Once composed with other tree-hunks into a tree patch, the canonization "details" don't matter again.

By the way: Pappy the packrat parser[1] (Haskell code) and its PEGs might be one of the more interesting choices, as opposed to BNF, for building these parse patches, particularly because it might offer a way to also create canonizers from the same PEG (which would be very tough, if not impossible, from a BNF). I don't think anyone's ever programmed such a thing, but I'm thinking it might be an interesting experiment.

[1] http://pdos.csail.mit.edu/~baford/packrat/

I hope that that helps to get some of the idea across. I'm not sure I've explained things that well, or even if I am thinking in the right direction. Hopefully, though, it might spark ideas or experiments for someone else to try.

--
--Max Battcher--
http://www.worldmaker.net/
The WorldMaker.Network: Support Open/Free Mythoi. Read the manifesto @ mythoi.com

_______________________________________________
darcs-users mailing list
[email protected]
http://www.abridgegame.org/mailman/listinfo/darcs-users

Reply via email to