[ 
https://issues.apache.org/jira/browse/LOG4J2-255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659259#comment-13659259
 ] 

Ralph Goers commented on LOG4J2-255:
------------------------------------

Nick, while UTF-8 is capable of representing characters in many languages, most 
computers don't display characters on the screen in Unicode.  They use what IBM 
calls code pages. For example, Gary mentioned cp 1252 - cp stands for code 
page. http://en.wikipedia.org/wiki/Code_page gives a simple explanation of what 
they are.  So the problem is that although you may have data in Unicode, to 
display it on the screen so that it is viewable it must be converted to the 
proper code page.  Strings in Java are always Unicode (held internally as 
UTF-16), so when you call getBytes() on a String, passing in a charset lets 
Java encode the characters into the target code page, provided that the Java 
runtime supports that charset.  This is why Layouts accept a charset 
parameter. Charset is Java's name for a code page.
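
To make the getBytes() point concrete, here is a minimal sketch of encoding 
the same String with different charsets. The class and method names are made 
up for illustration and are not part of Log4j; the Shift_JIS lookup is guarded 
because that charset is not guaranteed to be present in every Java runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not Log4j code.
class CharsetDemo {
    // Encodes the text in the given charset and returns how many bytes result.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // The six characters of the Japanese test string from the report,
        // written as Unicode escapes so the source file encoding cannot matter.
        String text = "\u65E5\u672C\u8A9E\u30C6\u30B9\u30C8";

        System.out.println(encodedLength(text, StandardCharsets.UTF_8));    // 18
        System.out.println(encodedLength(text, StandardCharsets.UTF_16BE)); // 12

        // Shift_JIS is an optional charset, so check before using it.
        if (Charset.isSupported("Shift_JIS")) {
            System.out.println(encodedLength(text, Charset.forName("Shift_JIS")));
        }
    }
}
```

The String is the same in every call; only the bytes produced differ, which 
is why the charset given to a Layout has to match what the reader of those 
bytes expects.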

What I don't understand here is why, if Remko is generating logs in UTF-8 that 
contain Japanese characters and is specifying the proper Japanese code page for 
the host computer, the output is unreadable. If no charset is specified then 
it is perfectly understandable why this would be happening.
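
As an illustration only, and not a diagnosis of this issue, the following 
sketch shows how text becomes unreadable when bytes are written in one charset 
and read back in another. The class and method names are hypothetical, and 
ISO-8859-1 stands in for "the wrong code page" because it is available in 
every Java runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not Log4j code.
class MojibakeDemo {
    // Encodes text with one charset and decodes the bytes with another,
    // which is what happens when writer and reader disagree on the encoding.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // The Japanese test string from the report, as Unicode escapes.
        String text = "\u65E5\u672C\u8A9E\u30C6\u30B9\u30C8";

        // Same charset on both sides: the text survives the round trip.
        String same = reinterpret(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(text.equals(same)); // true

        // Different charsets: each UTF-8 byte is read as a separate character.
        String mixed = reinterpret(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(text.equals(mixed)); // false
        System.out.println(mixed.length());     // 18
    }
}
```

The booleans and lengths are printed rather than the strings themselves, 
since printing to a console goes through yet another charset conversion.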

Note that this is actually the proper way to perform 
internationalization/localization - the Strings should be manipulated in UTF-8 
and passed from client to server that way and only converted to the target code 
page when they are actually displayed.
                
> Multi-byte character strings are scrambled in log output
> --------------------------------------------------------
>
>                 Key: LOG4J2-255
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-255
>             Project: Log4j 2
>          Issue Type: Bug
>          Components: Appenders, Core
>    Affects Versions: 2.0-beta6
>            Reporter: Remko Popma
>            Assignee: Remko Popma
>            Priority: Blocker
>             Fix For: 2.0-beta7
>
>
> When I tried to log a Japanese string the output was scrambled in both the 
> Console and a log file.
> For example,
> logger.warn("日本語テスト"); // (Japanese test)
> came out as
> 15:07:00.184 [main] WARN  test.JapaneseTest - 譌・譛ャ隱槭ユ繧ケ繝?
