This is an automated email from the ASF dual-hosted git repository.

swebb2066 pushed a commit to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/logging-log4cxx-site.git


The following commit(s) were added to refs/heads/asf-staging by this push:
     new 0ce82d0b Improve the Unicode support FAQ documentation
0ce82d0b is described below

commit 0ce82d0ba8222c8b773793f212dd5731e2713cb6
Author: Stephen Webb <[email protected]>
AuthorDate: Mon Aug 14 17:17:22 2023 +1000

    Improve the Unicode support FAQ documentation
---
 1.2.0/faq.html | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/1.2.0/faq.html b/1.2.0/faq.html
index 08c8ca5f..7310368b 100644
--- a/1.2.0/faq.html
+++ b/1.2.0/faq.html
@@ -110,14 +110,15 @@ DLL" with release builds of Log4cxx and "Multithread DLL 
Debug" with debug build
 <h1><a class="anchor" id="unicode_supported"></a>
 Does Apache Log4cxx support Unicode?</h1>
 <p>Yes. Apache Log4cxx exposes API methods in multiple string flavors 
supporting differently encoded textual content, like <code>char*</code>, 
<code>std::string</code>, <code>wchar_t*</code>, <code>std::wstring</code>, 
<code>CFStringRef</code> et al. All provided texts will be converted to the 
<code>LogString</code> type before further processing, which is one of several 
supported internal representations and is selected by the 
<code>LOG4CXX_CHAR</code> cmake option. If methods are used  [...]
-<p>The default external representation is controlled by the 
<code>LOG4CXX_CHARSET</code> cmake option. FileAppenders support an 
<code>Encoding</code> property allowing character set encoding control per 
appender. For example, you can use <code>UTF-8</code> or <code>UTF-16</code> 
when writing XML or JSON layouts. Log4cxx also implements character set 
encodings for <code>US-ASCII</code> (<code>ISO646-US</code> or 
<code>ANSI_X3.4-1968</code>) and <code>ISO-8859-1</code> (<code>ISO-LATIN-1</ 
[...]
+<p>The default external representation is controlled by the 
<code>LOG4CXX_CHARSET</code> cmake option. FileAppenders support an 
<code>Encoding</code> property allowing character set encoding control per 
appender. For example, you can use <code>UTF-8</code> or <code>UTF-16</code> 
when writing XML or JSON layouts. Log4cxx also implements character set 
encodings for <code>US-ASCII</code> (<code>ISO646-US</code> or 
<code>ANSI_X3.4-1968</code>) and <code>ISO-8859-1</code> (<code>ISO-LATIN-1</ 
[...]
 <p>The <code>locale</code> character set encoding provides support beyond the 
above internally implemented options. It allows you to use any multi-byte 
encoding provided by the standard library. See also <a 
href="https://stackoverflow.com/questions/571359/how-do-i-set-the-proper-initial-locale-for-a-c-program-on-windows">some
 SO post</a> on setting the default locale in C++.</p>
 <div class="fragment"><div class="line">std::setlocale( LC_ALL, &quot;&quot; 
); /* Set locale for C functions */</div>
 <div class="line">std::locale::global(std::locale(&quot;&quot;)); /* set 
locale for C++ functions */</div>
 </div><!-- fragment --><p>According to the <a 
href="https://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html">libc
 documentation</a>, all programs start in the <code>C</code> locale by default, 
which is the <a 
href="https://stackoverflow.com/questions/48743106/whats-ansi-x3-4-1968-encoding">same
 as ANSI_X3.4-1968</a> and what's commonly known as the encoding 
<code>US-ASCII</code>. That encoding supports a very limited set of characters 
only, so inputting Unicode with th [...]
 </pre><p> If you are to log this information, output on some console might be 
like the following, simply because the app uses <code>US-ASCII</code> by 
default and that can't map those characters:</p>
 <div class="fragment"><div class="line">loggername - ?????????? ???? 
??????????????</div>
-</div><!-- fragment --><p>The important thing to understand is that this is 
some always applied, backwards compatible default behaviour and even the case 
when the current environment sets a locale like <code>en_US.UTF-8</code>. One 
might need to explicitly tell the app at startup to use the locale of the 
environment and make things compatible with Unicode this way. </p>
+</div><!-- fragment --><p>The important thing to understand is that this is 
some always applied, backwards compatible default behaviour and even the case 
when the current environment sets a locale like <code>en_US.UTF-8</code>.</p>
+<p>So when using the <code>locale</code> character set encoding you will, at 
startup, need to explicitly set the <code>std::locale</code> to a value able to 
encode your characters and which is supported on your operating environment. 
</p>
 </div></div><!-- contents -->
 </div><!-- PageDoc -->
 </div><!-- doc-content -->
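
The FAQ text changed in this commit says that, when the locale character set
encoding is used, the program must set the locale explicitly at startup. A
minimal sketch of that step follows. The helper name use_environment_locale is
hypothetical and not part of Log4cxx; the fallback to the classic "C" locale is
an assumption about what an application might want when the environment names
a locale the platform does not support.

```cpp
#include <clocale>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Log4cxx API): switch the process from the
// default "C" locale to the locale named by the environment.
// Returns the name of the global C++ locale in effect afterwards.
std::string use_environment_locale()
{
    // Locale for C functions. On failure std::setlocale returns a null
    // pointer and leaves the current ("C") locale unchanged.
    std::setlocale(LC_ALL, "");

    try
    {
        // Locale for C++ functions. std::locale("") throws
        // std::runtime_error if the environment's locale is not supported.
        std::locale::global(std::locale(""));
    }
    catch (const std::runtime_error&)
    {
        // Assumed fallback: keep the classic "C" locale, which corresponds
        // to the US-ASCII behaviour described in the FAQ.
        std::locale::global(std::locale::classic());
    }
    return std::locale().name();
}
```

An application would call this once at the top of main, before configuring
Log4cxx, and could log or inspect the returned name to confirm that the
selected locale (for example en_US.UTF-8) can encode the characters it intends
to write.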
