John Cowan wrote:

> XML 1.1 will treat CR, LF, NEL, <CR, LF>, <CR, NEL>, and LS as line
> terminators and report them all as LF. PS is left alone, because of
> the bare possibility that it is being used as quasi-markup.
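As an illustration of the normalization rule quoted above, here is a minimal Python sketch. The function name `normalize_line_ends` is hypothetical, and the character list is taken from the quoted text alone, so treat it as a sketch under those assumptions rather than as what any particular XML parser does.

```python
import re

# Line ends as listed in the quoted text. The two-character sequences
# <CR, LF> and <CR, NEL> come first so that each is matched as one
# line end rather than two.
_LINE_END = re.compile("\r\n|\r\x85|\r|\n|\x85|\u2028")


def normalize_line_ends(text: str) -> str:
    """Report every recognized line end as a single LF (hypothetical helper)."""
    return _LINE_END.sub("\n", text)


if __name__ == "__main__":
    sample = "a\r\nb\r\x85c\x85d\u2028e\rf\ng\u2029h"
    print(repr(normalize_line_ends(sample)))
```

PS (U+2029) is deliberately absent from the pattern, so it passes through unchanged; treating it as a line end too would only mean adding it to the alternation.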
I'm not sure why <CR, NEL> should be seen as a single line end. And I think PS should be seen as a line end for XML too. Like LS, it can be used to format the XML source, but should not be interpreted as anything other than a line end when parsing the XML source. E.g., PS is not begin-end markup, which all other XML markup is; nor do I know of a way of attaching "style" to a PS, as can be done for <p></p> etc.

Following (ex-) UAX 14 fully, FF and VT should be seen as line separators too, though they are unlikely in XML source files. FF shouldn't be interpreted as generating a page break in the "styled output" of an XML file, should it?

> I can't imagine why EOF should be called a line terminator, except
> in the sense that a "read a line" operation should obviously
> not attempt to read past EOF.

There have been Unix programs that (mistakenly, I'd say) *discarded* the last (possibly partial) line of input, just because it had no LF at its end... And LS is a separator, not a terminator, so EOF has to be a line terminator.

> Calling it a line terminator means that every
> document is forced into the mold of being an integral number of lines
> long, regardless of the facts.

?? If you mean that concatenating files should not generate a line break between the files, I agree.

/kent k