On May 4, 2005, at 12:05, Brian E Carpenter wrote:

> Henri Sivonen wrote:
>> On Apr 29, 2005, at 12:17, Martin Duerst wrote:
>>> Making this more precise is definitely desirable. But there is also
>>> an i18n issue: This works fine for languages that use spaces between
>>> words. It doesn't work for languages that don't have spaces between
>>> words (Chinese, Japanese, Thai,...). If Text elements are only used
>>> for short things such as names or titles, that's not a big issue,
>>> the text in question can just be put on a single line. However,
>>> when the texts in question are long, it's a serious issue, and
>>> should be fixed.
>> You seem to be assuming that the length of a "line" is restricted in XML source. Why? As far as I can tell, it should be permissible to produce Atom documents that contain no LF or CR characters.
>> Can't languages without spaces use long source "lines" and apply soft wrapping in a source view if necessary? Why is this a wire format problem?
>
> Are you suggesting that a canonical format without CRLF should be mandatory?

No, not mandatory, although I expect to produce such feeds myself (even in English or Finnish). I am suggesting that pretty-printing line breaks should not be introduced in places where normalizing them to a space would be inappropriate. So if you have a long string of Chinese, Japanese or Thai and you don't want spaces in that string, just don't put white space there.
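
To illustrate, here is a minimal Python sketch. It assumes a consumer that collapses every run of white space (line breaks included) to a single space. That rule and the name normalize_whitespace are hypothetical, chosen only for illustration, and are not taken from the Atom draft. The sample strings are made up as well.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse each run of white space (including CR/LF) to one space and trim the ends.

    Hypothetical consumer-side rule, assumed for illustration only.
    """
    return re.sub(r"\s+", " ", text).strip()


# English wrapped by a pretty-printer: the line break sits where a space belongs.
english_wrapped = "Making this more precise\n    is definitely desirable."

# Chinese wrapped by the same pretty-printer: no space belongs at the break.
chinese_wrapped = "这是一个很长的\n    中文句子"

# The same Chinese text emitted on one source "line", with no LF or CR.
chinese_single = "这是一个很长的中文句子"

print(normalize_whitespace(english_wrapped))
print(normalize_whitespace(chinese_wrapped))
print(normalize_whitespace(chinese_single) == chinese_single)
```

In this sketch the wrapped English text comes out as intended, the wrapped Chinese text gains a space in the middle of the sentence, and the single-line Chinese text passes through unchanged.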


--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/


