On May 4, 2005, at 12:05, Brian E Carpenter wrote:
Henri Sivonen wrote:
On Apr 29, 2005, at 12:17, Martin Duerst wrote:
You seem to be assuming that the length of a "line" is restricted in XML source. Why? As far as I can tell, it should be permissible to produce Atom documents that contain no LF or CR characters.
Making this more precise is definitely desirable. But there is also an i18n issue: This works fine for languages that use spaces between words. It doesn't work for languages that don't have spaces between words (Chinese, Japanese, Thai,...). If Text elements are only used for short things such as names or titles, that's not a big issue, the text in question can just be put on a single line. However, when the texts in question are long, it's a serious issue, and should be fixed.
Can't languages without spaces use long source "lines" and apply soft wrapping in a source view if necessary? Why is this a wire format problem?
Are you suggesting that a canonical format without CRLF should be mandatory?
No, not mandatory, although I expect to produce such feeds myself (even in English or Finnish). I am suggesting that pretty-printing line breaks should not be introduced in places where normalizing them to a space would be inappropriate. So if you have a long string of Chinese, Japanese or Thai and you don't want spaces in that string, just don't put white space there.
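To illustrate, here is a minimal Python sketch. It assumes a consumer that collapses each run of white space, line breaks included, to a single space. That rule and the helper name `normalize_space` are assumptions for illustration only, not something the draft specifies.

```python
import re


def normalize_space(text: str) -> str:
    """Collapse each run of white space (including line breaks) to one space.

    Hypothetical consumer behaviour, assumed for illustration.
    """
    return re.sub(r"\s+", " ", text).strip()


# Pretty-printed source: a line break plus indentation inside the text.
english = "pretty-printed\n    source text"
japanese = "日本語の\n    文章"

print(normalize_space(english))         # the break becomes an ordinary word space
print(normalize_space(japanese))        # a space appears inside the Japanese string
print(normalize_space("日本語の文章"))  # no white space in the source, so nothing changes
```

In the English case the space is harmless. In the Japanese case it is an artifact of pretty-printing, which is why the white space should not be put there to begin with.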
-- Henri Sivonen [EMAIL PROTECTED] http://hsivonen.iki.fi/
