Re: jorphan.io.TextFile always uses default encoding, ResultCollector always uses UTF-*

Jordi Salvat i Alabart Tue, 23 Dec 2003 12:27:55 -0800

You're indeed pointing to a whole collection of bugs... but none of them seems to be the one affecting you :-)

Talking about current CVS code, TextFile is used in very few places:

- A unit test in AnchorModifier -- we could live with any side effects on this.

- Last line trim in ResultCollector -- but only last line trim! It should not make any difference whether you use UTF-8 or ISO-Latin-1 here, as long as you use the same one for reading and writing. Still, it could fail if the platform encoding is one in which the UTF-8 byte sequence for some character used in the file is not a valid character representation. (Sorry for the very clear statement -- that's about as good as my English can be.) ISO-Latin-1 is pretty safe, but other platforms will of course use other encodings...

- Retrieving XML data files in WebServiceSampler. I think it's incorrect to use TextFile here, since XML file encoding is either assumed (in which case a TextFile using the platform default encoding is a correct solution, although probably not optimal) or found inside the XML file itself -- in which case the solution is plain wrong.
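
To make the round-trip point from the ResultCollector item concrete, here is a minimal sketch. It is not JMeter code: the class and method names are made up for illustration, and US-ASCII merely stands in for "a platform encoding in which some UTF-8 byte sequences are not valid". It decodes UTF-8 bytes with a given charset and writes them back with the same charset, then checks whether the bytes survived.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of JMeter or jorphan.
public class RoundTripDemo {

    // Decode the bytes with the "platform" charset and encode them again
    // with the same charset, as a read-then-rewrite of a file would do.
    static boolean survivesRoundTrip(byte[] original, Charset platform) {
        String decoded = new String(original, platform);
        byte[] rewritten = decoded.getBytes(platform);
        return Arrays.equals(original, rewritten);
    }

    public static void main(String[] args) {
        // "café €" stored as UTF-8: both non-ASCII characters take several bytes.
        byte[] utf8 = "caf\u00e9 \u20ac".getBytes(StandardCharsets.UTF_8);

        // ISO-8859-1 maps every byte value to a character, so nothing is lost.
        System.out.println("ISO-8859-1: "
                + survivesRoundTrip(utf8, StandardCharsets.ISO_8859_1));

        // US-ASCII cannot represent bytes >= 0x80; they are replaced on decode.
        System.out.println("US-ASCII:   "
                + survivesRoundTrip(utf8, StandardCharsets.US_ASCII));
    }
}
```

With ISO-8859-1 the bytes come back unchanged; with US-ASCII the multi-byte sequences are replaced during decoding and the rewritten file differs from the original.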

Whether TextFile should use a given encoding or just the platform default can be discussed, but it certainly should be documented.

Also, it's quite obvious that the ResultCollector should not handle response data as character data, since in many cases it's binary stuff, and any character encoding (UTF-8 or whatever) will be wrong. Actually, XML is a bad format for binary data: we should either store that in separate files or encode it as base-64 or the like.
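
As a rough illustration of the base-64 option, the sketch below wraps raw bytes in an XML element without ever treating them as characters. It assumes a JDK that ships java.util.Base64 (Java 8 or later); the class name, the element name "binaryData" and its "encoding" attribute are invented for the example and are not the ResultCollector format.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical demo class; element and attribute names are made up.
public class BinaryInXmlDemo {

    // Base-64 output uses only ASCII letters, digits, '+', '/' and '=',
    // so it is safe inside XML text whatever the file encoding is.
    static String toXmlElement(byte[] responseData) {
        String encoded = Base64.getEncoder().encodeToString(responseData);
        return "<binaryData encoding=\"base64\">" + encoded + "</binaryData>";
    }

    // Recover the original bytes from the element's text content.
    static byte[] fromBase64(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    public static void main(String[] args) {
        // Bytes that are not valid text in most encodings.
        byte[] response = {0x00, (byte) 0xFF, (byte) 0x80, 0x1B};

        System.out.println(toXmlElement(response));

        String encoded = Base64.getEncoder().encodeToString(response);
        System.out.println("round trip ok: "
                + Arrays.equals(response, fromBase64(encoded)));
    }
}
```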

And, you're right, there are just too many places where we treat response data as character data. If all this causes is some gibberish on the screen, that's a minor problem, but sometimes it can be worse...

In any case, as I said, I can't see how you can end up with a result XML file with ISO-8859-1 content. Are you sure about that?

--
Salut,

Jordi.

Vincent Partington wrote:

Hi,

The class jorphan.io.TextFile always uses the default encoding to read and
write files:
http://cvs.apache.org/viewcvs.cgi/*checkout*/jakarta-jmeter/src/jorphan/org/apache/jorphan/io/TextFile.java?content-type=text%2Fplain&rev=1.4

In my case the default encoding is ISO-8859-1 (Windows XP US). However,
other parts of the JMeter code explicitly use UTF-8:
http://cvs.apache.org/viewcvs.cgi/*checkout*/jakarta-jmeter/src/core/org/apache/jmeter/reporters/ResultCollector.java?content-type=text%2Fplain&rev=1.29

This causes the result XML file to say UTF-8 in its XML header, but the
content is actually ISO-8859-1. If funny characters are written to the
result XML, the file will not be accepted by the XSLT processor.
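
The failure mode described above can be reproduced in isolation with a small sketch. This is not JMeter code; the class name and the sample XML are hypothetical. It writes a document that declares UTF-8 using ISO-8859-1 bytes and then checks it with a strict UTF-8 decoder, which is roughly what an XML parser does before handing the document to an XSLT processor.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of JMeter.
public class MislabelledEncodingDemo {

    // A freshly created decoder reports malformed input instead of
    // silently replacing it, so invalid UTF-8 raises an exception.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<sample>caf\u00e9</sample>";

        // Header says UTF-8, bytes are ISO-8859-1: 0xE9 followed by '<'
        // is not a valid UTF-8 sequence.
        byte[] mislabelled = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 bytes valid as UTF-8: "
                + isValidUtf8(mislabelled));

        // Same text actually written as UTF-8 decodes cleanly.
        byte[] consistent = xml.getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 bytes valid as UTF-8:      "
                + isValidUtf8(consistent));
    }
}
```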

I fixed the problem by hard-coding UTF-8 in jorphan.io.TextFile, but I
don't know whether that will impact other code. Any thoughts?

Regards, Vincent.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

