This is an automated email from the ASF dual-hosted git repository.
swebb2066 pushed a commit to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/logging-log4cxx-site.git
The following commit(s) were added to refs/heads/asf-staging by this push:
new 0ce82d0b Improve the Unicode support FAQ documentation
0ce82d0b is described below
commit 0ce82d0ba8222c8b773793f212dd5731e2713cb6
Author: Stephen Webb <[email protected]>
AuthorDate: Mon Aug 14 17:17:22 2023 +1000
Improve the Unicode support FAQ documentation
---
1.2.0/faq.html | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/1.2.0/faq.html b/1.2.0/faq.html
index 08c8ca5f..7310368b 100644
--- a/1.2.0/faq.html
+++ b/1.2.0/faq.html
@@ -110,14 +110,15 @@ DLL" with release builds of Log4cxx and "Multithread DLL Debug" with debug build
<h1><a class="anchor" id="unicode_supported"></a>
Does Apache Log4cxx support Unicode?</h1>
<p>Yes. Apache Log4cxx exposes API methods in multiple string flavors
supporting differently encoded textual content, like <code>char*</code>,
<code>std::string</code>, <code>wchar_t*</code>, <code>std::wstring</code>,
<code>CFStringRef</code> et al. All provided texts will be converted to the
<code>LogString</code> type before further processing, which is one of several
supported internal representations and is selected by the
<code>LOG4CXX_CHAR</code> cmake option. If methods are used [...]
-<p>The default external representation is controlled by the
<code>LOG4CXX_CHARSET</code> cmake option. FileAppenders support an
<code>Encoding</code> property allowing character set encoding control per
appender. For example, you can use <code>UTF-8</code> or <code>UTF-16</code>
when writing XML or JSON layouts. Log4cxx also implements character set
encodings for <code>US-ASCII</code> (<code>ISO646-US</code> or
<code>ANSI_X3.4-1968</code>) and <code>ISO-8859-1</code> (<code>ISO-LATIN-1</
[...]
+<p>The default external representation is controlled by the
<code>LOG4CXX_CHARSET</code> cmake option. FileAppenders support an
<code>Encoding</code> property allowing character set encoding control per
appender. For example, you can use <code>UTF-8</code> or <code>UTF-16</code>
when writing XML or JSON layouts. Log4cxx also implements character set
encodings for <code>US-ASCII</code> (<code>ISO646-US</code> or
<code>ANSI_X3.4-1968</code>) and <code>ISO-8859-1</code> (<code>ISO-LATIN-1</
[...]
<p>The <code>locale</code> character set encoding provides support beyond the
above internally implemented options. It allows you to use any multi-byte
encoding provided by the standard library. See also <a
href="https://stackoverflow.com/questions/571359/how-do-i-set-the-proper-initial-locale-for-a-c-program-on-windows">some
SO post</a> on setting the default locale in C++.</p>
<div class="fragment"><div class="line">std::setlocale( LC_ALL, "" ); /* Set locale for C functions */</div>
<div class="line">std::locale::global(std::locale("")); /* set locale for C++ functions */</div>
</div><!-- fragment --><p>According to the <a
href="https://www.gnu.org/software/libc/manual/html_node/Setting-the-Locale.html">libc
documentation</a>, all programs start in the <code>C</code> locale by default,
which is the <a
href="https://stackoverflow.com/questions/48743106/whats-ansi-x3-4-1968-encoding">same
as ANSI_X3.4-1968</a> and what's commonly known as the encoding
<code>US-ASCII</code>. That encoding supports a very limited set of characters
only, so inputting Unicode with th [...]
</pre><p> If you are to log this information, output on some console might be
like the following, simply because the app uses <code>US-ASCII</code> by
default and that can't map those characters:</p>
<div class="fragment"><div class="line">loggername - ?????????? ???? ??????????????</div>
-</div><!-- fragment --><p>The important thing to understand is that this is
some always applied, backwards compatible default behaviour and even the case
when the current environment sets a locale like <code>en_US.UTF-8</code>. One
might need to explicitly tell the app at startup to use the locale of the
environment and make things compatible with Unicode this way. </p>
+</div><!-- fragment --><p>The important thing to understand is that this is
some always applied, backwards compatible default behaviour and even the case
when the current environment sets a locale like <code>en_US.UTF-8</code>.</p>
+<p>So when using the <code>locale</code> character set encoding you will, at
startup, need to explicitly set the <code>std::locale</code> to a value able to
encode your characters and which is supported on your operating environment.
</p>
</div></div><!-- contents -->
</div><!-- PageDoc -->
</div><!-- doc-content -->