So, we can conclude that for STRING names, the result is wrong (the
non-ASCII characters get dropped), while for COMPOUND_TEXT, we get
something sensible most of the time.

Sounds like the error would be in util.c:GetWMGetPropertyString().
Not sure what should be done though.
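
One possible direction, as a rough sketch only: ICCCM defines the STRING
type as ISO 8859-1, so if the drawing code ends up expecting UTF-8 in a
UTF-8 locale, the one-byte Latin-1 characters would have to be expanded
to two bytes first. The helper below is hypothetical (the name
latin1_to_utf8 and its signature are mine, nothing like it exists in
ctwm or Xlib as far as I know), and I haven't verified that this is
where the characters actually get lost.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of ctwm or Xlib: convert a
 * NUL-terminated ISO 8859-1 string to UTF-8.
 * Returns the number of bytes written (not counting the NUL),
 * or (size_t)-1 if dst is too small.
 */
size_t latin1_to_utf8(const char *src, char *dst, size_t dstlen)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    for (; *s != '\0'; s++) {
        if (*s < 0x80) {
            /* ASCII maps to itself; keep one byte free for the NUL. */
            if (n + 1 >= dstlen)
                return (size_t)-1;
            dst[n++] = (char)*s;
        } else {
            /* 0x80..0xFF become a two-byte UTF-8 sequence. */
            if (n + 2 >= dstlen)
                return (size_t)-1;
            dst[n++] = (char)(0xC0 | (*s >> 6));
            dst[n++] = (char)(0x80 | (*s & 0x3F));
        }
    }
    if (n >= dstlen)
        return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

With this, the Latin-1 byte 0xE9 comes out as 0xC3 0xA9, which matches
the bytes xprop shows in _NET_WM_NAME for the same title.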

For the space-after-apostrophe problem, I'm guessing that
XmbTextPropertyToTextList() has some bug, because that's the function
used to convert the WM_NAME property string to a string according to
locale.
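
To rule out the property itself, the _NET_WM_NAME bytes quoted below can
be decoded by hand. Here is a minimal sketch of a UTF-8 decoder for that
purpose; utf8_decode is a made-up name, and the decoder is simplified
(it does not reject overlong forms or surrogates), so treat it as a
debugging aid and not as anything ctwm does.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical debugging helper: decode len bytes of UTF-8 into code
 * points. Returns the number of code points stored in out, or -1 on
 * malformed or truncated input, or if out is too small.
 */
long utf8_decode(const unsigned char *buf, size_t len,
                 unsigned long *out, size_t outmax)
{
    size_t i = 0, n = 0;

    assert(buf != NULL && out != NULL);

    while (i < len) {
        unsigned long cp;
        size_t extra;               /* continuation bytes expected */
        unsigned char b = buf[i++];

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else
            return -1;              /* stray continuation or bad lead byte */

        if (extra > len - i || n >= outmax)
            return -1;
        while (extra-- > 0) {
            unsigned char c = buf[i++];
            if ((c & 0xC0) != 0x80)
                return -1;
            cp = (cp << 6) | (c & 0x3F);
        }
        out[n++] = cp;
    }
    return (long)n;
}
```

Fed the bytes 0xe2, 0x80, 0x99, this yields the single code point
U+2019, with no U+0020 following it in the property, so the extra space
would have to be introduced somewhere during conversion or drawing.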

In message <[EMAIL PROTECTED]> on Thu, 04 Jan 2007 15:02:16 +0200, "Zvi Har'El" 
<[EMAIL PROTECTED]> said:

rl> Hi all,
rl> 
rl> Since I am quite a heavy user of UTF-8 as the web master of a
rl> multilingual web site, http://JV.Gilead.org.il, let me summarize what I
rl> see as the page title when I use Firefox to browse to different pages of
rl> my site. Of course, if I use tabbed browsing, the tab title is always
rl> correct.
rl> 1) If the title has non-Latin-1 letters, no problem. E.g.,
rl> <http://JV.Gilead.org.il/hebrew/>. In this case xprop gives
rl> 
rl> WM_NAME(COMPOUND_TEXT) = "Les Voyages extraordinaires en Hébreu — המסעות המופלאים - Mozilla Firefox"
rl> _NET_WM_NAME(UTF8_STRING) = 0x4c, 0x65, 0x73, 0x20, 0x56, 0x6f, 0x79,
rl> 0x61, 0x67, 0x65, 0x73, 0x20, 0x65, 0x78, 0x74, 0x72, 0x61, 0x6f, 0x72,
rl> 0x64, 0x69, 0x6e, 0x61, 0x69, 0x72, 0x65, 0x73, 0x20, 0x65, 0x6e, 0x20,
rl> 0x48, 0xc3, 0xa9, 0x62, 0x72, 0x65, 0x75, 0x20, 0xe2, 0x80, 0x94, 0x20,
rl> 0xd7, 0x94, 0xd7, 0x9e, 0xd7, 0xa1, 0xd7, 0xa2, 0xd7, 0x95, 0xd7, 0xaa,
rl> 0x20, 0xd7, 0x94, 0xd7, 0x9e, 0xd7, 0x95, 0xd7, 0xa4, 0xd7, 0x9c, 0xd7,
rl> 0x90, 0xd7, 0x99, 0xd7, 0x9d, 0x20, 0x2d, 0x20, 0x4d, 0x6f, 0x7a, 0x69,
rl> 0x6c, 0x6c, 0x61, 0x20, 0x46, 0x69, 0x72, 0x65, 0x66, 0x6f, 0x78
rl> 
rl> And this is exactly what the title is (also if you use f.identify in
rl> ctwm, this is what you get). Even the non-ascii Latin-1 characters are
rl> translated to two bytes by Xlib.
rl> 
rl> You can also look at the Cyrillic page at
rl> <http://jv.gilead.org.il/FAQ/index.ru.html>. Good title.
rl> 
rl> 2) If the title has only Latin-1 letters, but some of them are non-ASCII
rl> (represented in UTF-8 by TWO bytes, but by only ONE byte in
rl> ISO-8859-1), characters get lost. E.g.,
rl> http://jv.gilead.org.il/sjv.html. In this case xprop gives
rl> 
rl> WM_NAME(STRING) = "Société Jules Verne - Mozilla Firefox"
rl> _NET_WM_NAME(UTF8_STRING) = 0x53, 0x6f, 0x63, 0x69, 0xc3, 0xa9, 0x74,
rl> 0xc3, 0xa9, 0x20, 0x4a, 0x75, 0x6c, 0x65, 0x73, 0x20, 0x56, 0x65, 0x72,
rl> 0x6e, 0x65, 0x20, 0x2d, 0x20, 0x4d, 0x6f, 0x7a, 0x69, 0x6c, 0x6c, 0x61,
rl> 0x20, 0x46, 0x69, 0x72, 0x65, 0x66, 0x6f, 0x78
rl> 
rl> However the actual title (and f.identify) show
rl> 
rl> Socit Jules Verne - Mozilla Firefox
rl> 
rl> That is, the non-ASCII characters disappear. Note that the property
rl> WM_NAME is now a STRING rather than COMPOUND_TEXT.
rl> You can also look at the Turkish page at
rl> <http://jv.gilead.org.il/FAQ/index.tu.html>. Some non-ASCII characters
rl> disappear.
rl> 
rl> 3) If you have punctuation like the single right quote (U+2019), etc.,
rl> which are represented in Unicode in the U+2000 range, and represented in
rl> UTF-8 by THREE bytes, you get extra spaces. E.g., 
rl> http://jv.gilead.org.il/. In this case, xprop gives
rl> 
rl> WM_NAME(COMPOUND_TEXT) = "Zvi Har’El’s Jules Verne Collection - Mozilla Firefox"
rl> _NET_WM_NAME(UTF8_STRING) = 0x5a, 0x76, 0x69, 0x20, 0x48, 0x61, 0x72,
rl> 0xe2, 0x80, 0x99, 0x45, 0x6c, 0xe2, 0x80, 0x99, 0x73, 0x20, 0x4a, 0x75,
rl> 0x6c, 0x65, 0x73, 0x20, 0x56, 0x65, 0x72, 0x6e, 0x65, 0x20, 0x43, 0x6f,
rl> 0x6c, 0x6c, 0x65, 0x63, 0x74, 0x69, 0x6f, 0x6e, 0x20, 0x2d, 0x20, 0x4d,
rl> 0x6f, 0x7a, 0x69, 0x6c, 0x6c, 0x61, 0x20, 0x46, 0x69, 0x72, 0x65, 0x66,
rl> 0x6f, 0x78
rl> 
rl> The actual title (and f.identify) show
rl> 
rl> Zvi Har’ El’ s Jules Verne Collection - Mozilla Firefox
rl> 
rl> with spaces after the apostrophe (single right quote).
rl> 
rl> As you can see from the examples, the generated X windows always have
rl> the right properties (in my locale, en_US.UTF8), but CTWM's
rl> interpretation needs to be improved.
rl> 
rl> Zvi.
rl> 
rl> -- 
rl> Dr. Zvi Har'El      mailto:[EMAIL PROTECTED]    Department of Mathematics
rl> tel:+972-54-4227607 icq:179294841    Technion - Israel Institute of Technology
rl> fax:+972-4-8293388  http://www.math.technion.ac.il/~rl/    Haifa 32000, ISRAEL
rl> "If you can't say somethin' nice, don't say nothin' at all." -- Thumper (1942)
rl> 
