Re: this is a bug in the encoding procedure

Curt Arnold Fri, 21 Aug 2009 06:39:59 -0700


On Aug 21, 2009, at 4:32 AM, shadow king wrote:

thanks for your reply.
BTW, is there a convenient way to switch off all the "decoding & encoding" thing completely? Because I don't want the performance penalty imposed by the related function.
For me, I could not see the benefit of using Unicode as the internal charset in log4cxx, and I just want log4cxx to log the messages without any charset conversion.

On Fri, Aug 21, 2009 at 12:00 PM, Curt Arnold <carn...@apache.org> wrote:

I'm thinking the constant should be 0x20, not 0x30. The code was an attempt to be able to handle non-ASCII platforms like EBCDIC but looks like it was mangled and was done without access to a non-ASCII platform. Was just trying to do enough decoding to get the encoding name to load a full charset.


You can hardwire the assumed encoding at build time with one of the following:

./configure --with-charset=utf-8
./configure --with-charset=usascii
./configure --with-charset=iso-8859-1

All three will replace conversion with glorified copy operations.

Specifying usascii will replace all non-ASCII characters with a loss character ('?'), but if you specify a particular encoding for a file, the resulting file will be valid.
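As a rough illustration of the loss-character substitution described above, here is a minimal sketch. It is not the log4cxx implementation; the function name to_usascii_lossy is made up for this example, and it works byte by byte, so a multi-byte UTF-8 character turns into several '?' where a real transcoder would likely substitute once per character.

```cpp
#include <string>

// Hypothetical helper, not part of log4cxx: copies a byte string and
// replaces every byte outside the 7-bit ASCII range with '?'.
std::string to_usascii_lossy(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    for (unsigned char c : input) {
        // Bytes 0x00..0x7F are ASCII and pass through unchanged.
        out.push_back(c < 0x80 ? static_cast<char>(c) : '?');
    }
    return out;
}
```

The output is always valid ASCII, which is why it stays valid under any ASCII-compatible file encoding, at the cost of losing the original non-ASCII characters.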

utf-8 will blast characters directly into the internal representation. If you do not specify an encoding on any file appenders, the output file will have the same charset as the platform. Filters, XML files, SocketAppenders, and anything with a specified encoding may be invalid. However, if you are just going straight through to a file, you won't end up with loss characters.

iso-8859-1 will convert characters to utf-8. If you do not specify any encoding, the output file will have the same encoding as the platform. This won't result in illegal byte sequences the way specifying utf-8 can, but any explicit encoding may result in character substitution.
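To show why the iso-8859-1 option never produces illegal byte sequences, here is a small sketch of the standard ISO-8859-1 to UTF-8 mapping. It is only an illustration under the assumption that the input really is ISO-8859-1; latin1_to_utf8 is a hypothetical name, not a log4cxx function.

```cpp
#include <string>

// Hypothetical helper, not part of log4cxx: treats each input byte as an
// ISO-8859-1 code point (U+0000..U+00FF) and encodes it as UTF-8.
std::string latin1_to_utf8(const std::string& input) {
    std::string out;
    out.reserve(input.size() * 2);
    for (unsigned char c : input) {
        if (c < 0x80) {
            // ASCII range: one byte, copied as-is.
            out.push_back(static_cast<char>(c));
        } else {
            // U+0080..U+00FF: two-byte UTF-8 sequence 110xxxxx 10xxxxxx.
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
        }
    }
    return out;
}
```

Every possible byte value maps to a well-formed UTF-8 sequence, so the result is always valid even when the input was not actually ISO-8859-1. In that case the characters are simply the wrong ones, which is where the substitution shows up once an explicit encoding is applied.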
