XSPExpressionParser.java

Stefano Mazzocchi Mon, 05 Sep 2005 09:38:05 -0700

Niclas Hedhman wrote:

On Monday 05 September 2005 14:43, Antonio Gallardo wrote:
Of course I am aware that both character sets (Shift-JIS and ISO-8859-1) are
different Unicode subsets. This is the same as you stated.
No. Pier doesn't confuse Unicode (a sequence of characters) with the mapping of those characters to fixed- or variable-length encoded byte streams. The fact that character 65 in Unicode is in many encodings mapped to the byte value 65 is for convenience only, and that fact should be ignored.
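
To make the distinction concrete, here is a minimal Java sketch (the class and method names are made up for illustration, they are not from the Cocoon code). It encodes the same two characters with different charsets: U+0041 comes out as the single byte 65 everywhere, while U+00E9 becomes one byte in ISO-8859-1 and two bytes in UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
class CodePointVsBytes {

    // Encode a string with the given charset and return the bytes as unsigned values.
    static int[] encode(String s, Charset cs) {
        byte[] raw = s.getBytes(cs);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF; // Java bytes are signed, so mask to 0..255
        }
        return out;
    }

    public static void main(String[] args) {
        String a = "A";            // U+0041
        String eAcute = "\u00e9";  // U+00E9, written as an escape so this file stays pure ASCII

        // Same character, same byte: the "convenience" case.
        System.out.println(Arrays.toString(encode(a, StandardCharsets.US_ASCII)));
        System.out.println(Arrays.toString(encode(a, StandardCharsets.ISO_8859_1)));
        System.out.println(Arrays.toString(encode(a, StandardCharsets.UTF_8)));

        // Same character, different bytes: the case that matters.
        System.out.println(Arrays.toString(encode(eAcute, StandardCharsets.ISO_8859_1)));
        System.out.println(Arrays.toString(encode(eAcute, StandardCharsets.UTF_8)));
    }
}
```

The character is identical in both calls; only the byte representation chosen by the charset differs.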
Does our SVN use UTF-8 as the default charset (or encoding) or not?
Subversion uses binary data, and is agnostic to any encodings in the data (or so they say). AFAIU, marking files as text only deals with the line endings and how the diff mails are generated.
The --encoding argument applies to commit messages.
Paths and URLs/URIs have additional encoding requirements.


Correct.

It is also worth noting that SVN before 1.2 and CVS2SVN create a pretty broken combination when the commit message in CVS used an encoding that was not UTF-8.

As an example, try to get the svn log of the apache repository and the svn client will fail. We have three commit messages in Latin-1 that cvs2svn placed into svn as binary (prior to 1.2 there was no encoding validation in svn). Those messages get copied into the XML file that is passed between the svn server and client, which uses UTF-8 as its encoding.
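
The failure mode can be reproduced outside svn. The following is only a sketch of the general problem, not of what the svn client does internally, and the class and method names are invented for the example: bytes written as Latin-1 are handed to a strict UTF-8 decoder, which rejects them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class StrictUtf8Check {

    // Returns true only if the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of substituting
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String message = "caf\u00e9"; // stand-in for a commit message with one non-ASCII character

        byte[] latin1 = message.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = message.getBytes(StandardCharsets.UTF_8);

        System.out.println("Latin-1 bytes valid as UTF-8? " + isValidUtf8(latin1));
        System.out.println("UTF-8 bytes valid as UTF-8?   " + isValidUtf8(utf8));
    }
}
```

A lone byte 0xE9 looks like the start of a multi-byte UTF-8 sequence with nothing following it, so a strict parser has to give up on the whole document.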

I've asked infra@ to fix this, but since it is not really high priority (only data archeologists like myself care about those things) it is unlikely to get fixed.

Anyhow, I agree with Pier, we should *only* use ASCII and escape Unicode characters explicitly the \uxxxx way.
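
For anyone who wants to convert existing literals, a rough sketch of the escaping could look like this (the class name is invented; the JDK's native2ascii tool does the same job on whole files). It walks the UTF-16 code units of a string and rewrites everything above 0x7F as a backslash-u escape with four hex digits.

```java
// Hypothetical helper, for illustration only.
class AsciiEscaper {

    // Replace every non-ASCII UTF-16 code unit with its four-digit hex escape.
    // Characters outside the BMP come out as two escapes (a surrogate pair),
    // which is also how Java source expects them.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7F) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input literal itself is written with an escape, so this file is pure ASCII.
        System.out.println(escape("caf\u00e9"));
    }
}
```

The resulting source file contains only ASCII bytes, so it reads the same whatever encoding the compiler, the editor or the diff mailer assumes.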


--
Stefano.

Re: svn commit: r278641 - /cocoon/branches/BRANCH_2_1_X/src/blocks/xsp/java/org/apache/cocoon/components/language/markup/xsp/XSPExpressionParser.java
