Trent W. Buck wrote:
On Sun, Aug 16, 2009 at 05:46:16AM -0400, Max Battcher wrote:
(Plus such benefits as syntax highlighters are already designed to
be fast, to be non-lossy, and to handle error states and partial
documents well...)

IME syntax highlighting implementations are "best effort", not
lossless[0] -- they *do* sacrifice accuracy for speed.  I don't
consider that acceptable for a VCS, particularly in the presence of
commutation.  I guess that's the crux of our disagreement.

I meant lossless with respect to the character stream... Any syntax-highlighter tokenizer is broken if it doesn't ensure that every input character is represented in the token stream (including, and particularly, whitespace), generally with full "document space" correlation information such as line and column numbers. Which is to say that if you diff based on the token stream, you don't need any "text reconstruction" technique more powerful than simple concatenation.

(People would balk if their syntax highlighters produced documents that looked substantially different from the input, particularly because most commonly their syntax highlighters are their editors. The same is not necessarily true of the language's "real" lexer; many of those intentionally throw out "garbage" that is useless in parsing, such as whitespace.)
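
To make the "lossless with respect to the character stream" property concrete, here is a minimal sketch. It assumes a toy regex-based tokenizer; the names `Token`, `TOKEN_RE`, and `tokenize` are hypothetical and do not come from any real highlighter library. The point is only that every input character lands in some token, each token carries line and column information, and plain concatenation gives back the input, even for a partial document.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str   # name of the regex group that matched
    text: str   # exact characters covered by the token
    line: int   # 1-based line where the token starts
    col: int    # 0-based column where the token starts


# Hypothetical token classes. The final "other" alternative matches any
# single character, so no input character can be skipped.
TOKEN_RE = re.compile(
    r"""
    (?P<space>\s+)
  | (?P<word>[A-Za-z_]\w*)
  | (?P<number>\d+)
  | (?P<other>.)
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize(source: str) -> List[Token]:
    """Split source into tokens without dropping any character."""
    tokens = []
    line, col = 1, 0
    for match in TOKEN_RE.finditer(source):
        text = match.group()
        tokens.append(Token(match.lastgroup, text, line, col))
        # Advance the position past this token.
        newlines = text.count("\n")
        if newlines:
            line += newlines
            col = len(text) - text.rfind("\n") - 1
        else:
            col += len(text)
    return tokens


# A partial document (unbalanced parenthesis) still tokenizes losslessly.
source = "x = 1 +\n  foo("
tokens = tokenize(source)
assert "".join(t.text for t in tokens) == source
```

A language's "real" lexer that discards whitespace would fail the final assertion, which is exactly the distinction drawn above.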

As for lossless in the sense of how "accurate" a syntax highlighter's tokenizer is compared with the token stream of the format's own official tokenizer/lexer, I think that may ultimately be a red herring to worry about. I think that a modern syntax highlighter's "best effort" is "good enough" for smart VCS diffs (and certainly much more interesting than traditional line-based diffs), and there are already syntax highlighter libraries generic enough that they could be applied "today", whereas the "real" lexers used in the wild vary greatly and can rarely be used in aggregate.
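
As a rough illustration of what a token-based diff buys over a line-based one, the sketch below diffs two token streams with Python's standard `difflib.SequenceMatcher`. The tokenizer is a deliberately crude stand-in (whitespace runs, word runs, or single characters), and `lossless_tokens` and `token_diff` are hypothetical helper names; nothing here reflects how darcs represents patches.

```python
import difflib
import re


def lossless_tokens(source):
    # Whitespace runs, word runs, or any other single character;
    # concatenating the result reproduces the input exactly.
    return re.findall(r"\s+|\w+|.", source, flags=re.DOTALL)


def token_diff(old, new):
    """Return (tag, old_text, new_text) for each changed token run."""
    a, b = lossless_tokens(old), lossless_tokens(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, "".join(a[i1:i2]), "".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


old = "total = count + 1\nprint(total)\n"
new = "total = counter + 1\nprint(total)\n"
changes = token_diff(old, new)
print(changes)
```

Here a line-based diff would report the whole first line as changed, while the token-level diff reports only the renamed identifier.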

Regarding partial documents, I tend to take the hard line that "if it
don't compile, don't commit it".  If darcs replace helps enforce that,
all the better :-)

I felt a bit that way when I first thought about this subject and was thinking just about XML documents. David Roundy did help to convince me that people would yell and scream in pain if their VCS didn't support partial documents. Since then I've paid a lot more attention to the number of times I've recorded patches with work-in-progress source documents that I don't want to lose or that I want others to comment on or expand. (Plus there are legitimate partial documents that a project might want to store that don't compile/run, such as "template" files.)

--
--Max Battcher--
http://worldmaker.net
_______________________________________________
darcs-users mailing list
darcs-users@darcs.net
http://lists.osuosl.org/mailman/listinfo/darcs-users
