On 23/07/12 10:28PM, Hiltjo Posthuma wrote:
> Unless I'm missing something. It seems like an application or environment
> issue.
> 
> For dwm it is assumed the environment is utf-8 and application should use it.

Sorry, I forgot to list my locale-related variables. They are set as follows:

LANG=sr_RS.UTF-8
LC_ALL=
LC_CTYPE=
LC_NUMERIC=
LC_TIME=
LC_COLLATE=
LC_MONETARY=
LC_MESSAGES=

and I still see the behavior described in this issue. I use Alpine Linux, so
it is also not specific to a particular libc or distro.


> I think it makes sense if the application uses utf-8 or the same encoding as
> the environment. It shouldn't pick some encoding and expect the window manager
> to autodetect and handle all of them.

If I understand the process correctly, there are currently two cases,
differentiated by the encoding field. If it is set to XA_STRING, no conversion
is made; otherwise the property is passed to XmbTextPropertyToTextList, which
converts it to the encoding of the current locale (the "Mb", multibyte, in its
name), which is UTF-8 in my case. From what I've observed, the encoding field
cannot be relied on to make this distinction. This is a table of cases I've
encountered ("Passed to Xmb...?" says whether the value is passed to
XmbTextPropertyToTextList under current, unaltered upstream dwm):

   Actual Encoding      encoding field  Source          Passed to Xmb...?
-------------------------------------------------------------------------------
1. ISO 8859-1           31 (=XA_STRING) LibreOffice     No
2. (COMPOUND_TEXT?)     385             LibreOffice     Yes
3. UTF-8                31 (=XA_STRING) slstatus        No

Now that I look at it again, I am not sure what the actual encoding is in case
#2. When I use od(1) to look at the bytes in the value field, I get this for
"thisátestњ.odt - LibreOffice Writer":

$ od -t c value.log
0000000   t   h   i   s 341   t   e   s   t 033   -   L 372   .   o   d
0000020   t       -       L   i   b   r   e   O   f   f   i   c   e
0000040   W   r   i   t   e   r  \n
0000047

Octal 341 (=225=0xE1) for "á" is ISO 8859-1, but the sequence "033   -   L 372" 
for "њ" is... COMPOUND_TEXT[1]? Ah--now I see[2]:


> For supported locales, existence of a converter from COMPOUND_TEXT, STRING, 
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> UTF8_STRING or the encoding of the current locale is guaranteed if 
  ^^^^^^^^^^^
> XSupportsLocale returns True for the current locale (but the actual text may
> contain unconvertible characters).
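
The escape sequence in the od dump can also be checked mechanically. Below is a
minimal sketch, not code from dwm or Xlib; the function name is made up for
illustration. It assumes my reading of the CTEXT spec[1] is right that
"ESC - L" (0x1B 0x2D 0x4C) designates the right half of ISO 8859-5 (Cyrillic)
as G1, which would make the following 0xFA (octal 372) the "њ".

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the offset of the first "ESC - L"
 * (0x1B 0x2D 0x4C) sequence in buf, or -1 if there is none.
 * Assumption: per the CTEXT spec this designates ISO 8859-5 as G1.
 */
long
find_cyrillic_designator(const unsigned char *buf, size_t len)
{
	size_t i;

	for (i = 0; i + 2 < len; i++)
		if (buf[i] == 0x1B && buf[i + 1] == 0x2D && buf[i + 2] == 0x4C)
			return (long)i;
	return -1;
}
```

For the bytes dumped above, the sequence starts at offset 9, right after
"thisátest".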


So I guess that explains case #2. Still, I think case #1 should be handled
somehow. I'm not sure whether LibreOffice or X.Org is to blame for setting
WM_NAME to unconverted ISO 8859-1 bytes. As stated, the value of my LANG ends
in .UTF-8, so the "current locale" should not be ISO 8859-1, unless that is
hardcoded somewhere.
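
One way case #1 could be told apart from case #3 is sketched below. This is
only a heuristic under my own assumptions, not what dwm does today: treat an
XA_STRING value as UTF-8 if it is well-formed UTF-8, and otherwise fall back
to ISO 8859-1 (which, as far as I know, is what ICCCM defines STRING to be).
Both function names are hypothetical, the validator is simplified (it does not
reject every overlong or surrogate form), and Latin-1 text that happens to be
valid UTF-8 would be misclassified.

```c
#include <stddef.h>

/* Return 1 if buf[0..len) looks like well-formed UTF-8, else 0.
 * Simplified check: lead byte ranges and continuation bytes only. */
int
is_utf8(const unsigned char *buf, size_t len)
{
	size_t i = 0, n;

	while (i < len) {
		if (buf[i] < 0x80)
			n = 0;                  /* ASCII */
		else if (buf[i] >= 0xC2 && buf[i] <= 0xDF)
			n = 1;                  /* 2-byte sequence */
		else if ((buf[i] & 0xF0) == 0xE0)
			n = 2;                  /* 3-byte sequence */
		else if (buf[i] >= 0xF0 && buf[i] <= 0xF4)
			n = 3;                  /* 4-byte sequence */
		else
			return 0;               /* stray continuation or bad lead */
		if (n > len - i - 1)
			return 0;               /* truncated sequence */
		for (i++; n > 0; n--, i++)
			if ((buf[i] & 0xC0) != 0x80)
				return 0;
	}
	return 1;
}

/* Convert ISO 8859-1 bytes to NUL-terminated UTF-8 in dst (capacity
 * dstsize). Returns the bytes written, excluding the NUL; stops at a
 * character boundary if dst is too small. */
size_t
latin1_to_utf8(const unsigned char *src, size_t len, char *dst, size_t dstsize)
{
	size_t i, o = 0;

	if (dstsize == 0)
		return 0;
	for (i = 0; i < len; i++) {
		if (src[i] < 0x80) {
			if (o + 1 >= dstsize)
				break;
			dst[o++] = (char)src[i];
		} else {
			if (o + 2 >= dstsize)
				break;
			dst[o++] = (char)(0xC0 | (src[i] >> 6));
			dst[o++] = (char)(0x80 | (src[i] & 0x3F));
		}
	}
	dst[o] = '\0';
	return o;
}
```

With this, the 0xE1 byte from case #1 fails the UTF-8 check and is converted to
the two bytes 0xC3 0xA1, while the slstatus value from case #3 passes the check
and would be copied unchanged.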


[1]: https://www.x.org/releases/X11R7.6/doc/xorg-docs/specs/CTEXT/ctext.html
[2]: https://linux.die.net/man/3/xmbtextpropertytotextlist
