On Aug 21, 2009, at 4:32 AM, shadow king wrote:

Thanks for your reply.

BTW, is there a convenient way to switch off all the "decoding & encoding" completely? I don't want the performance penalty imposed by the related functions.

I cannot see the benefit of using Unicode as the internal charset in log4cxx, and I just want log4cxx to log the messages without any charset conversion.

On Fri, Aug 21, 2009 at 12:00 PM, Curt Arnold <carn...@apache.org> wrote:

I'm thinking the constant should be 0x20, not 0x30. The code was an attempt to handle non-ASCII platforms like EBCDIC, but it looks like it was mangled and was written without access to a non-ASCII platform. It was just trying to do enough decoding to get the encoding name needed to load a full charset.


You can hardwire the assumed encoding with one of the following:

./configure --with-charset=utf-8
./configure --with-charset=usascii
./configure --with-charset=iso-8859-1

All three will replace conversion with glorified copy operations.

Specifying usascii will replace all non-ASCII characters with a loss character ('?'), but if you specify a particular encoding for a file, the resulting file will be valid.
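
As a rough illustration of the usascii behaviour described above (a sketch in Python, not log4cxx's actual code; the name decode_as_usascii is made up for this example), each byte below 0x80 is copied through and every other byte becomes the loss character:

```python
LOSS_CHAR = "?"  # the loss character mentioned above


def decode_as_usascii(raw: bytes) -> str:
    """Copy 7-bit bytes through unchanged; replace anything else with '?'.

    Hypothetical helper for illustration only, not a log4cxx function.
    """
    return "".join(chr(b) if b < 0x80 else LOSS_CHAR for b in raw)


# "cafe" with an e-acute, as ISO-8859-1 bytes: the accented letter is lost.
print(decode_as_usascii(b"caf\xe9"))      # caf?

# The same word as UTF-8 bytes: the two-byte sequence becomes two '?'.
print(decode_as_usascii(b"caf\xc3\xa9"))  # caf??
```

The result only ever contains ASCII, which is why it can be re-encoded into any explicitly specified file encoding without producing invalid output.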

utf-8 will blast characters directly into the internal representation. If you do not specify an encoding on any file appenders, the output file will have the same charset as the platform. Filters, XML files, SocketAppenders, and anything with a specified encoding may be invalid. However, if you are just going straight through to a file, you won't end up with loss characters.
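
To see why anything that expects real UTF-8 may be invalid in this mode, here is a small Python sketch (an illustration of the byte-level effect, not log4cxx code; is_valid_utf8 is a made-up helper). It assumes the platform bytes are ISO-8859-1 and are copied unchanged into a buffer that is later treated as UTF-8:

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "cafe" with an e-acute as ISO-8859-1 bytes (assumed platform charset).
platform_bytes = b"caf\xe9"

# With the charset hardwired to utf-8 the bytes are copied, not converted.
internal = bytes(platform_bytes)

# Written straight to a file, the bytes are exactly what came in.
print(internal == platform_bytes)   # True

# But a consumer that expects real UTF-8 sees an illegal sequence.
print(is_valid_utf8(internal))      # False
```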

iso-8859-1 will convert characters to utf-8. If you do not specify any encoding, the output file will have the same encoding as the platform. This won't result in illegal byte sequences the way specifying utf-8 can, but any explicit encoding may result in character substitution.
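
The iso-8859-1 case can be sketched the same way (again Python for illustration, not log4cxx's implementation; latin1_to_utf8 is a hypothetical name). Every ISO-8859-1 byte maps to the Unicode code point with the same value, so the conversion is a near-copy that emits at most two bytes per input byte:

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Convert ISO-8859-1 bytes to UTF-8, one input byte at a time.

    Hypothetical helper for illustration only.
    """
    out = bytearray()
    for b in raw:
        if b < 0x80:
            out.append(b)                  # ASCII is copied unchanged
        else:
            out.append(0xC0 | (b >> 6))    # lead byte: 0xC2 or 0xC3
            out.append(0x80 | (b & 0x3F))  # continuation byte
    return bytes(out)


# 0xE9 (e-acute in ISO-8859-1) becomes the two-byte sequence C3 A9.
print(latin1_to_utf8(b"caf\xe9"))  # b'caf\xc3\xa9'

# Any input byte yields well-formed UTF-8, so this decode never raises.
latin1_to_utf8(bytes(range(256))).decode("utf-8")
```

If the platform bytes were not really ISO-8859-1, the result is still well-formed UTF-8 but may name the wrong characters, which is where the substitution on explicit encodings comes from.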

