Trent W. Buck wrote:
> On Sun, Aug 16, 2009 at 05:46:16AM -0400, Max Battcher wrote:
>> (Plus such benefits as syntax highlighters are already designed to
>> be fast, to be non-lossy, and to handle error states and partial
>> documents well...)
>
> IME syntax highlighting implementations are "best effort", not
> lossless[0] -- they *do* sacrifice accuracy for speed. I don't
> consider that acceptable for a VCS, particularly in the presence of
> commutation. I guess that's the crux of our disagreement.
I meant lossless with respect to the character stream... A syntax
highlighter's tokenizer is broken if any input character is missing
from its token stream (whitespace included, and whitespace in
particular). Tokens generally also carry full "document space"
correlation information such as line and column numbers. Which is to say
that if you diff based upon the token stream you don't need any more
powerful "text reconstruction" technique than simple concatenation.
(People would balk if their syntax highlighters produced documents that
looked substantially different from the input, particularly because most
commonly their syntax highlighters are their editors. The same is not
necessarily true of the language's "real" lexer; many of those
intentionally throw out "garbage" that is useless in parsing, such as
whitespace.)
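
To make the "lossless with respect to the character stream" property concrete, here is a minimal sketch in Python. It is not taken from any real highlighter: the Token type, the token classes and the tokenize() function are all hypothetical names chosen for illustration. The only point is that every input character lands in exactly one token, each token records its line and column, and plain concatenation gives back the input.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str   # hypothetical token class name
    text: str   # exact characters covered by this token
    line: int   # 1-based line where the token starts
    col: int    # 1-based column where the token starts


# Hypothetical token classes. The last alternative matches any single
# character, so no input character can fall outside the token stream.
TOKEN_RE = re.compile(
    r"(?P<space>\s+)|(?P<word>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<other>.)",
    re.DOTALL,
)


def tokenize(source: str) -> List[Token]:
    """Split source into tokens without dropping any character."""
    tokens = []
    line, col = 1, 1
    for match in TOKEN_RE.finditer(source):
        text = match.group()
        tokens.append(Token(match.lastgroup, text, line, col))
        # Advance the line/column position past this token.
        newlines = text.count("\n")
        if newlines:
            line += newlines
            col = len(text) - text.rfind("\n")
        else:
            col += len(text)
    return tokens


def reconstruct(tokens: List[Token]) -> str:
    """Text reconstruction is nothing more than concatenation."""
    return "".join(token.text for token in tokens)
```

Note that the round trip holds even for input that would never compile (an unclosed parenthesis, say), because unrecognised characters simply become "other" tokens instead of errors.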
As for lossless in terms of the battle between the "accuracy" of a
syntax highlighting tokenizer versus that format's own official
tokenizer/lexer's token stream, I think that may ultimately be a red
herring to worry about. I think that a modern syntax highlighter's "best
effort" is "good enough" for smart VCS diffs (and certainly much more
interesting than traditional line-based diffs), and there are already
generic enough syntax highlighter libraries that could be applied
"today", whereas the "real" lexers used in the wild vary greatly and
rarely can be used in aggregate.
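
As a rough illustration of why a token-based diff can be more interesting than a line-based one, here is a small Python sketch. It uses difflib.SequenceMatcher from the standard library over a token list; the tokens() and token_diff() helpers and the regex standing in for a highlighter's tokenizer are assumptions made for this example, not anything darcs or a real highlighter provides.

```python
import difflib
import re


def tokens(source):
    # Hypothetical stand-in for a highlighter's tokenizer: runs of
    # whitespace, runs of word characters, then any single character.
    return re.findall(r"\s+|\w+|.", source, flags=re.DOTALL)


def token_diff(old, new):
    """Return (tag, old_text, new_text) for each changed token span."""
    a, b = tokens(old), tokens(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, "".join(a[i1:i2]), "".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


old = "total = count + 1\n"
new = "total = counter + 1\n"
print(token_diff(old, new))  # [('replace', 'count', 'counter')]
```

A line-based diff would report the whole line as removed and re-added; here the change is narrowed to the one identifier that was renamed.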
> Regarding partial documents, I tend to take the hard line that "if it
> don't compile, don't commit it". If darcs replace helps enforce that,
> all the better :-)
I felt a bit that way when I first thought about this subject and was
thinking just about XML documents. David Roundy did help to convince me
that people would yell and scream in pain if their VCS didn't support
partial documents. Since then I've paid a lot more attention to how
often I record patches containing work-in-progress source documents
that I don't want to lose, or that I want others to comment on or
expand. (Plus there are legitimate partial documents that a project
might want to store that don't compile/run, such as "template" files.)
--
--Max Battcher--
http://worldmaker.net