[ On Thursday, June 15, 2000 at 17:21:44 (-0400), Larry Jones wrote: ]
> Subject: Re: Format of RCS file -- On windows should newline be \n or \r\n
>
> I'd hardly call what's in rcsfile(5) "well defined" -- it defines (most
> of) the syntax but says almost nothing about semantics. CVS's
> doc/RCSFILES plasters over some of the more gaping holes, but still
> leaves a lot of holes.
Oh, yeah, well of course.
All I can say is that in my vocabulary "format" is much closer to
"syntax" than "semantics".
Your other answer though seemed to indicate that there was some inherent
difference between "text" and "binary" RCS file contents. This is not
so. While it's true that there can be a flag in an RCS file that tells
tools to treat the data from that entire file in some special way, the
data in an RCS file is, in reality, always "binary", i.e. opaque:
    Strings are enclosed by @. If a string contains a @, it
    must be doubled; otherwise, strings can contain arbitrary
    binary data.
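
To make that quoting rule concrete, here is a minimal sketch in Python. The names rcs_quote and rcs_unquote are made up for illustration and are not part of RCS or CVS; the only rule assumed is the one quoted above (wrap in @, double any embedded @).

```python
def rcs_quote(data: bytes) -> bytes:
    """Wrap arbitrary bytes as an RCS string: enclose in @, double each @."""
    return b"@" + data.replace(b"@", b"@@") + b"@"


def rcs_unquote(quoted: bytes) -> bytes:
    """Reverse rcs_quote (hypothetical helper, illustration only)."""
    if len(quoted) < 2 or not (quoted.startswith(b"@") and quoted.endswith(b"@")):
        raise ValueError("not an @-delimited string")
    body = quoted[1:-1]
    # After removing every doubled @, no single @ may remain.
    if b"@" in body.replace(b"@@", b""):
        raise ValueError("unpaired @ inside string")
    return body.replace(b"@@", b"@")


# Every byte value survives the round trip, CR and LF included.
blob = bytes(range(256)) + b"user@host\r\n"
assert rcs_unquote(rcs_quote(blob)) == blob
```

Nothing in the quoting step looks at line endings at all, which is the point.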
It is the tool(s) that we use to generate the deltas between two files
which treat the data in those files as lines of text.
Even if we delve deeper into what goes in a "text" section of an RCS
file, as described by CVS' doc/RCSFILES notes, there's still nothing
that differentiates binary and lines-of-text content. So long as your
version of "diff" can handle lines of arbitrary length then there's no
problem -- the diff just doesn't make sense to any normal human. You
can even diff completely unrelated binary files and still end up with
something that "works" and can be stored as a legal delta.
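
A rough sketch of that claim, assuming Python's difflib as a stand-in for "diff" and a simplified in-memory form of the "a" (add) and "d" (delete) commands that "diff -n" produces. The function names are invented for this example and this is not RCS's own code; it only shows that a line-oriented delta between two unrelated binary blobs can be built and applied.

```python
import difflib


def split_lines(data: bytes) -> list[bytes]:
    """Split on '\n' only, keeping the terminator, as a line-based diff would."""
    parts = data.split(b"\n")
    lines = [p + b"\n" for p in parts[:-1]]
    if parts[-1]:
        lines.append(parts[-1])  # final line with no trailing newline
    return lines


def make_delta(old: bytes, new: bytes) -> list[tuple]:
    """Return ('d', line, count, []) and ('a', line, count, text) commands.

    Line numbers are 1-based and refer to the old file.
    """
    a, b = split_lines(old), split_lines(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    cmds = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            cmds.append(("d", i1 + 1, i2 - i1, []))
        if tag in ("insert", "replace"):
            cmds.append(("a", i2, j2 - j1, b[j1:j2]))
    return cmds


def apply_delta(old: bytes, cmds: list[tuple]) -> bytes:
    """Rebuild the new file from the old file plus the commands."""
    a = split_lines(old)
    out, pos = [], 0  # pos = number of old lines already consumed
    for op, line, count, text in cmds:
        if op == "d":
            out.extend(a[pos:line - 1])
            pos = line - 1 + count
        else:
            out.extend(a[pos:line])
            pos = line
            out.extend(text)
    out.extend(a[pos:])
    return b"".join(out)


# Two unrelated "binary" blobs still yield a delta that applies cleanly.
old_blob = bytes(range(256)) * 3
new_blob = bytes(reversed(range(256))) + b"\x00@@\r\n\xff"
assert apply_delta(old_blob, make_delta(old_blob, new_blob)) == new_blob
```

The "lines" here are just whatever happens to sit between 0x0A bytes, so the delta is legal but says nothing meaningful to a human reader.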
Someone truly intent on making their pain and suffering worse could in
theory add a flag to "diff" that told it to treat '\n' and '\r\n' as
identical just as there's now a flag that'll treat any amount of
whitespace as identical. Similar hooks could be put in RCS and/or CVS
to use, say, "-kt", to enable their new "diff" flag. That's a very
slippery and dangerous slope though.
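
For illustration only, the comparison such a flag would imply might look like the sketch below. The helper names are hypothetical and this is not an existing "diff", RCS, or CVS option; it just treats a CR immediately before a LF as part of the line terminator.

```python
def normalize_eol(data: bytes) -> list[bytes]:
    """Split on '\n' and drop a CR that directly preceded each '\n'."""
    parts = data.split(b"\n")
    # Every part except the last was followed by '\n' in the input.
    head = [p[:-1] if p.endswith(b"\r") else p for p in parts[:-1]]
    return head + [parts[-1]]


def equal_ignoring_eol(old: bytes, new: bytes) -> bool:
    """True if the two files differ at most in '\n' versus '\r\n'."""
    return normalize_eol(old) == normalize_eol(new)


unix_text = b"int x = 1;\nreturn x;\n"
dos_text = b"int x = 1;\r\nreturn x;\r\n"
assert equal_ignoring_eol(unix_text, dos_text)
assert unix_text != dos_text  # the stored bytes still differ
```

The last assertion is where the slope starts: the comparison reports "no change" while the bytes on disk are different, so something else has to decide which form gets stored and checked out.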
People who want to end their pain and suffering will forget about trying
to edit files using idiotic antiquated end-of-line conventions and stick
to just feeding normal text files to their cross-compilers. :-)
Conversely, people who want to embrace their antiquated EOL conventions
whole-heartedly could create a version of diff that only used '\r\n' as
an end-of-line and integrate that into RCS and CVS.
I.e. the real solution is not to pretend that you can share text files
between modern systems and those with antiquated incompatible silly EOL
conventions with complete transparency. Of course most of the silly
systems in question have development tools that are in fact quite happy
to handle normal '\n'-delimited text and it's simply a matter of giving
up the really broken tools like text editors that insist on this ancient
brain damage. If people find it necessary to share text files with
broken systems then they must force the users on the broken systems to
never use tools which will damage normal text files. There are lots of
very viable alternatives (and there *always* have been -- it's just a
matter of taking the initiative and doing something about it!).
In an ideal world we'd use a much more intelligent delta generation tool
that would sense the syntax of the files being compared and if they
matched sufficiently then it would compare them token by token, not line
by line. Storing RCS deltas in some way that would identify tokens
would of course require an "extension" to the currently implied RCS file
semantics.
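
As a very crude sketch of the token-by-token idea, assuming a regular expression as the "tokenizer" (a real tool would need to understand the syntax of the file) and Python's difflib for the comparison; the names are invented for this example:

```python
import difflib
import re

# Words and single punctuation characters are tokens; whitespace,
# including any kind of line ending, only separates them.
TOKEN = re.compile(r"\w+|[^\w\s]")


def tokens(text: str) -> list[str]:
    return TOKEN.findall(text)


def token_changes(old: str, new: str) -> list[tuple]:
    """Return (tag, old_tokens, new_tokens) for each differing region."""
    a, b = tokens(old), tokens(new)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# A change of EOL convention or layout produces no token-level difference.
assert token_changes("int x = 1;\r\nreturn x;\r\n", "int x = 1;\nreturn  x;\n") == []
# A real edit shows up as exactly the token that changed.
assert token_changes("int x = 1;", "int x = 2;") == [("replace", ["1"], ["2"])]
```

Note that this only compares; it says nothing about how such token deltas would be stored.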
--
Greg A. Woods
+1 416 218-0098 VE3TCP <[EMAIL PROTECTED]> <robohack!woods>
Planix, Inc. <[EMAIL PROTECTED]>; Secrets of the Weird <[EMAIL PROTECTED]>