On Aug 21, 2009, at 4:32 AM, shadow king wrote:
Thanks for your reply.
BTW, is there a convenient way to switch off all the "decoding &
encoding" completely? I don't want the performance penalty imposed
by the related functions.
I cannot see the benefit of using Unicode as the internal charset
in log4cxx; I just want log4cxx to log the messages without any
charset conversion.
On Fri, Aug 21, 2009 at 12:00 PM, Curt Arnold <carn...@apache.org>
wrote:
I'm thinking the constant should be 0x20, not 0x30. The code was an
attempt to handle non-ASCII platforms like EBCDIC, but it looks like
it was mangled and was done without access to a non-ASCII platform.
It was just trying to do enough decoding to get the encoding name to
load a full charset.
You can hardwire the assumed encoding with
./configure --with-charset=utf-8
./configure --with-charset=usascii
./configure --with-charset=iso-8859-1
All three will replace conversion with glorified copy operations.
Specifying usascii will replace all non-ASCII characters with a loss
character ('?'), but if you specify a particular encoding for a file,
the resulting file will be valid.
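
To make the usascii behaviour concrete, here is a minimal sketch of a
"glorified copy" with a loss character. The function name toUsAscii is
hypothetical and this is not log4cxx's actual code; it only illustrates
that each byte outside 7-bit ASCII ends up as '?'.

```cpp
#include <string>

// Hypothetical helper (not part of log4cxx): copy bytes through unchanged
// and substitute the loss character '?' for anything outside 7-bit ASCII.
std::string toUsAscii(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        out.push_back(c < 0x80 ? static_cast<char>(c) : '?');
    }
    return out;
}
```

In this sketch every non-ASCII byte is replaced individually, so a
two-byte UTF-8 sequence in the input would come out as two '?'
characters.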
utf-8 will blast characters directly into the internal representation.
If you do not specify an encoding on any file appenders, the output
file will have the same charset as the platform. Filters, XML files,
SocketAppenders, and anything with a specified encoding may be
invalid. However, if you are just going straight through to a file,
you won't end up with loss characters.
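
As a rough illustration of why pass-through can produce invalid output
when the platform bytes are not really UTF-8, the sketch below does a
structural check of a byte string. The name looksLikeUtf8 is
hypothetical and nothing here comes from log4cxx; it only checks lead
and continuation bytes, so treat it as a simplified test rather than a
full validator.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of log4cxx). Structural check only: it
// verifies lead bytes and continuation bytes, but does not reject overlong
// forms, surrogates, or code points above U+10FFFF.
bool looksLikeUtf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        if (c < 0x80) {
            extra = 0;
        } else if ((c & 0xE0) == 0xC0) {
            extra = 1;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2;
        } else if ((c & 0xF8) == 0xF0) {
            extra = 3;
        } else {
            return false;  // stray continuation byte or invalid lead byte
        }
        if (extra > s.size() - i - 1) {
            return false;  // sequence is truncated
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) {
                return false;  // expected a 10xxxxxx continuation byte
            }
        }
        i += extra + 1;
    }
    return true;
}
```

A lone ISO-8859-1 byte such as 0xE9 fails this check, which is the kind
of illegal sequence that can reach an XML file or SocketAppender when
the bytes are passed straight through.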
iso-8859-1 will convert characters to UTF-8. If you do not specify
any encoding, the output file will have the same encoding as the
platform. This won't result in illegal byte sequences the way
specifying utf-8 can, but any explicit encoding may result in
character substitution.
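
For reference, the ISO-8859-1 to UTF-8 step is cheap because every
ISO-8859-1 byte maps to the Unicode code point with the same value. The
sketch below shows that mapping; latin1ToUtf8 is a hypothetical name and
this is not taken from the log4cxx sources.

```cpp
#include <string>

// Hypothetical helper (not part of log4cxx): expand ISO-8859-1 bytes to UTF-8.
// Bytes below 0x80 are copied as-is; bytes 0x80..0xFF become a two-byte
// sequence 110000xx 10xxxxxx carrying the same code point.
std::string latin1ToUtf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);
    for (unsigned char c : in) {
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

Because every input byte has a defined mapping, the result is always
well-formed UTF-8, even if the input was not really ISO-8859-1 and the
characters come out wrong.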