DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://issues.apache.org/bugzilla/show_bug.cgi?id=43736>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://issues.apache.org/bugzilla/show_bug.cgi?id=43736

           Summary: Chainsaw does not honor encoding when loading XML files
           Product: Log4j
           Version: 1.2
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P2
         Component: chainsaw
        AssignedTo: [email protected]
        ReportedBy: [EMAIL PROTECTED]

On Oct 30, 2007, at 2:16 PM on log4j-user, Jessica Lin wrote:

I want to use Chainsaw to view a log file that contains Chinese characters. The log file was recorded using a FileAppender for which I set the encoding to UTF-8. Here is part of my log4j.properties file.

    # xml format file appender
    log4j.appender.xml=org.apache.log4j.FileAppender
    log4j.appender.xml.file=xml.log
    log4j.appender.xml.encoding=UTF-8
    log4j.appender.xml.append=false
    log4j.appender.xml.layout=org.apache.log4j.xml.XMLLayout

Then I use Chainsaw to load the xml.log file. The Chinese characters are shown as å è¿ä¸ªåè½. The original characters are ?????. I double-checked that xml.log was saved with UTF-8 encoding. The XMLDecoder file, which Chainsaw uses to load XML files, also uses UTF-8 encoding.

Can you help me?

Thanks,
Jessica

---------

The problem appears to be in o.a.l.xml.XMLDecoder in the receivers companion, where at lines 186 and 188 InputStreamReaders are allocated without explicitly specifying an encoding. That causes the InputStreamReader to use the default platform encoding, which appears not to be UTF-8 in this instance.

The approach is broken and needs to be rewritten to handle any arbitrary encoding. The XML parser should be presented with a minimal document like:

    <!DOCTYPE log4j:eventSet [
      <!ENTITY content SYSTEM "...">
    ]>
    <log4j:eventSet version="1.2" xmlns:log4j="...">
      &content;
    </log4j:eventSet>

and an entity resolver should then load the URL as a byte stream in response to the resolveEntity call.

For a workaround, anything that sets the default charset for the JVM to UTF-8 should avoid the problem until it can be fixed.
There is no clearly documented way to do that, and it is platform dependent. On a *nix machine, you could try:

    export LC_CTYPE=UTF-8

On Windows, you could try:

    java -Dfile.encoding=UTF-8 org.apache.log4j.chainsaw...
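
As a rough illustration of the wrapper-document idea described above, the following is a minimal sketch and not the actual XMLDecoder fix. The class name EventSetLoader, the method loadMessages, and both urn:example identifiers are hypothetical placeholders chosen for this example; they do not stand for the values elided as "..." in the report. The point is that the entity resolver hands the parser a byte stream, so the parser works out the encoding on its own rather than going through an InputStreamReader bound to the platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class EventSetLoader {
    // Placeholder identifiers for this sketch only (assumptions, not from the report).
    static final String NS = "urn:example:log4j";
    static final String CONTENT_ID = "urn:example:log-content";

    /** Parses raw log bytes through a wrapper document and returns the event messages. */
    static List<String> loadMessages(byte[] logBytes) throws Exception {
        // Minimal wrapper: the log file itself is pulled in as an external entity.
        String wrapper =
            "<!DOCTYPE log4j:eventSet [<!ENTITY content SYSTEM \"" + CONTENT_ID + "\">]>"
            + "<log4j:eventSet version=\"1.2\" xmlns:log4j=\"" + NS + "\">"
            + "&content;"
            + "</log4j:eventSet>";

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Supply bytes, not a Reader, so the parser detects the encoding itself
        // (BOM or text declaration, UTF-8 otherwise) instead of the platform default.
        builder.setEntityResolver((publicId, systemId) -> {
            if (systemId != null && systemId.endsWith(CONTENT_ID)) {
                return new InputSource(new ByteArrayInputStream(logBytes));
            }
            return null; // default resolution for anything else
        });

        Document doc = builder.parse(new InputSource(new StringReader(wrapper)));
        NodeList nodes = doc.getElementsByTagNameNS(NS, "message");
        List<String> messages = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            messages.add(nodes.item(i).getTextContent());
        }
        return messages;
    }

    public static void main(String[] args) throws Exception {
        // Sample text (U+4E2D U+6587), not the reporter's original characters.
        String event =
            "<log4j:event logger=\"demo\" timestamp=\"0\" level=\"INFO\" thread=\"main\">"
            + "<log4j:message><![CDATA[\u4e2d\u6587]]></log4j:message></log4j:event>";
        List<String> messages = loadMessages(event.getBytes(StandardCharsets.UTF_8));
        System.out.println(messages.get(0).equals("\u4e2d\u6587"));
    }
}
```

A file without a byte order mark or text declaration is read as UTF-8 under the XML rules, which matches what the FileAppender configuration in the report writes. For a file in some other encoding that carries no declaration, the resolver could call setEncoding on the InputSource it returns, assuming the charset is known from elsewhere.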
