Klingon on Unicode site?

2012-04-03 Thread Shawn Steele
I was amused to see Klingon on the 
http://www.unicode.org/versions/Unicode6.1.0/ page ;-)

Yes, I realize it’s primarily me and maybe a few other geeks, but I still 
smiled.


- Shawn

 
http://blogs.msdn.com/shawnste


Re: Klingon on Unicode site?

2012-04-03 Thread Jukka K. Korpela

2012-04-03 19:03, Shawn Steele wrote:


I was amused to see Klingon on the
http://www.unicode.org/versions/Unicode6.1.0/ page ;-)

Yes, I realize it’s primarily me and maybe a few other geeks, but I
still smiled.


On the other hand, it sets a bad example: an illogical mix of language, 
with all the rest in English, including the label “Last updated:”. It 
would not be that serious if it were not so common: unlocalized content 
(or content localized server-side) and content localized client-side.


If the visible display of the value of the Last-Modified header (which 
may or may not reflect the actual last modification time) is regarded as 
useful, it should of course be in the same language as the rest of the 
page. And if it contains a time of the day part, it should indicate the 
time zone.
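
The point about showing the time zone can be sketched in JavaScript as follows. This is only an illustration under stated assumptions: the helper name `formatWithZone`, the sample instant, and the choice of `en-US` and `UTC` are hypothetical and are not taken from the Unicode page.

```javascript
// Hypothetical helper: format a timestamp so that the time zone is always visible.
// Locale and time zone are explicit parameters, not taken from the user's system.
function formatWithZone(date, locale, timeZone) {
  const formatter = new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "long",
    day: "numeric",
    hour: "2-digit",
    minute: "2-digit",
    timeZone,
    timeZoneName: "short", // adds a zone label to the formatted text
  });
  return formatter.format(date);
}

// Assumed sample value: 3 April 2012, 16:03 UTC.
const sampleInstant = new Date(Date.UTC(2012, 3, 3, 16, 3, 0));
console.log(formatWithZone(sampleInstant, "en-US", "UTC"));
```

Passing the locale and zone explicitly keeps the output independent of whatever the visitor's system happens to be set to.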


Yucca




RE: Klingon on Unicode site?

2012-04-03 Thread Shawn Steele
April 3rd, missed by a few days ☺

My assumption is the page uses JS to get the dates?  Since my user locale 
happened to be set to Klingon, that’s what it displayed.  But it was not the 
first place I expected to see Klingon

-Shawn

From: ver...@gmail.com [mailto:ver...@gmail.com] On Behalf Of Philippe Verdy
Sent: Tuesday, April 3, 2012 9:40 AM
To: Shawn Steele
Cc: unicode@unicode.org
Subject: Re: Klingon on Unicode site?

When was that published on the Unicode website ? On April 1st ?

Re: Klingon on Unicode site?

2012-04-03 Thread Philippe Verdy
When was that published on the Unicode website? On April 1st?


Re: Klingon on Unicode site?

2012-04-03 Thread Ken Whistler

On 4/3/2012 9:51 AM, Shawn Steele wrote:
My assumption is the page uses JS to get the dates?  Since my user 
locale happened to be set to Klingon, that’s what it displayed. 


Exactly. There is a call to: 
Date(document.lastModified).toLocaleString() in the Javascript.
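
A minimal sketch of why such a call follows the visitor's locale, runnable outside a browser. The ISO timestamp is an assumed stand-in for `document.lastModified`, and `describeFooterDate` is a hypothetical name. The sketch uses `new Date(...)`; called without `new`, `Date()` ignores its argument and returns a string for the current time.

```javascript
// Assumed stand-in for document.lastModified, which exists only in a browser.
const lastModifiedText = "2012-04-03T16:03:00Z";
const lastModified = new Date(lastModifiedText);

function describeFooterDate(date) {
  return {
    // Locale the runtime falls back to when the caller requests none.
    runtimeLocale: new Intl.DateTimeFormat().resolvedOptions().locale,
    // No argument: the runtime's default locale decides the language and format.
    inRuntimeLocale: date.toLocaleString(),
    // Explicit argument: the caller decides the language and format.
    inEnglish: date.toLocaleString("en-US", { timeZone: "UTC" }),
  };
}

console.log(describeFooterDate(lastModified));
```

The first formatted value changes with the environment the script runs in; the second does not.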


So for those who assumed that this was an April Fool's joke on the 
Unicode website, I guess the joke's on you. ;-)

As to Yucca's complaint:

   On the other hand, it sets a bad example: an illogical mix of
   language, with all the rest in English,


I think Yucca may be confusing Klingon with Vulcan. ;-)

--Ken



Re: Klingon on Unicode site?

2012-04-03 Thread Asmus Freytag

On 4/3/2012 11:14 AM, Ken Whistler wrote:

On 4/3/2012 9:51 AM, Shawn Steele wrote:
My assumption is the page uses JS to get the dates?  Since my user 
locale happened to be set to Klingon, that’s what it displayed. 


Exactly. There is a call to: 
Date(document.lastModified).toLocaleString() in the Javascript.




As to Yucca's complaint:

 it sets a bad example: an illogical mix of language, with all the
rest in English,





I think Yucca has a point.

When the document is in English, it doesn't make sense to display the 
footer date in the system locale.


The locale used for this function should either be that of the site, or 
that of the page.
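
One way to follow the page's language instead of the system locale might look like this. It is a sketch only: `formatForPage` is a hypothetical helper, and the fallback to `en` is an assumption. In a browser, the language could be read from `document.documentElement.lang`.

```javascript
// Hypothetical helper: format the "last updated" date in the page's language.
// pageLang is a BCP 47 tag such as "en" or "fr".
function formatForPage(date, pageLang) {
  const locale = pageLang || "en"; // assumed fallback when the page declares no language
  return new Intl.DateTimeFormat(locale, {
    dateStyle: "long",
    timeZone: "UTC",
  }).format(date);
}

// Assumed sample value: 3 April 2012, 16:03 UTC.
const pageDate = new Date(Date.UTC(2012, 3, 3, 16, 3, 0));
console.log(formatForPage(pageDate, "en"));
console.log(formatForPage(pageDate, "fr"));
```

With this approach the footer date stays in the same language as the rest of the page, whatever the visitor's system locale is.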


A./


RE: Klingon on Unicode site?

2012-04-03 Thread Shawn Steele
 When the document is in English, it doesn't make sense to display the footer 
 date in the system locale.
 The locale used for this function should either be that of the site, or that of 
 the page.

After all we wouldn’t want a Unicode page to appear like it got contaminated 
with Klingon ;-)

-Shawn


RE: Klingon on Unicode site?

2012-04-03 Thread Phillips, Addison
Asmus opined:

I think Yucca has a point.

When the document is in English, it doesn't make sense to display the footer 
date in the system locale.

The locale used for this function should either be that of the site, or that of the 
page.


AP And hence the work to internationalize JavaScript and provide for this (as 
well as other use cases).

Cf. 
http://norbertlindenberg.com/2012/02/ecmascript-internationalization-api/index.html

Addison

Addison Phillips
Globalization Architect (Lab126)
Chair (W3C I18N WG)

Internationalization is not a feature.
It is an architecture.




Re: Klingon on Unicode site?

2012-04-03 Thread Philippe Verdy
Le 3 avril 2012 21:28, Phillips, Addison addi...@lab126.com a écrit :
 Asmus opined:


 I think Yucca has a point.

 When the document is in English, it doesn't make sense to display the footer
 date in the system locale.

 The locale used for this function should either be that of the site, or that of
 the page.

This depends on whether the site has a concept of user accounts, where
the user may want the general UI of the site translated into his
own personal locale independently of the content displayed within that
UI.

For example in MediaWiki, a user connects with a browser that sets a
preferred language, which can be used by default for the general
site-wide UI, independently of the content. If the user creates an
account and defines another preferred language, that language will be
used for the UI (including generic page footers showing the date of the
last update of the content).

Of course the hosted article content will not have its language
changed. So there's a separate language for that content.

Everything is good as long as there's a clear separation between the
content, which uses its own language tag, and the rest of the
translatable site-wide UI...




SV: Klingon on Unicode site?

2012-04-03 Thread Elsebeth Flarup
If the visible display of the value of the Last-Modified header (which 
may or may not reflect the actual last modification time) is regarded as 
useful, it should of course be in the same language as the rest of the 
page.

I completely disagree. There may be many applications and web pages that are 
not translated into my preferred language, but I still want to be able to use 
my regional preferences. Especially common for US English apps and sites - I am 
entirely happy to use the English language UI, but I am very unhappy if I 
cannot see sensible dates with a dd/mm/ order, a calendar where Monday is 
the first day of the week, and numbers where the decimal separator is a comma, 
just to mention the most common issues.
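
The split between UI language and regional conventions can be illustrated with locale tags that keep English words but vary the region. This is a sketch under assumptions: `fieldOrder`, `formatNumber`, and the sample locales `en-US`, `en-GB`, and `da-DK` are chosen only for illustration.

```javascript
// Assumed sample value: 3 April 2012 (UTC).
const sampleDay = new Date(Date.UTC(2012, 3, 3));

// Report the order of the date fields a locale uses for a numeric date.
function fieldOrder(locale) {
  return new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
    timeZone: "UTC",
  })
    .formatToParts(sampleDay)
    .filter((part) => part.type !== "literal")
    .map((part) => part.type);
}

// Format a number with the separators of the given locale.
function formatNumber(value, locale) {
  return new Intl.NumberFormat(locale).format(value);
}

console.log(fieldOrder("en-US"), fieldOrder("en-GB"));
console.log(formatNumber(1234.5, "en-US"), formatNumber(1234.5, "da-DK"));
```

Both `en-US` and `en-GB` produce English text, yet the field order differs, which is the kind of regional preference described above.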

Elsebeth












Re: Klingon on Unicode site?

2012-04-03 Thread Philippe Verdy
The 'Last-Modified' header of HTTP is not supposed to be translated;
it has a documented format independent of the locale used by the
server, by the browser, or preferred by the user and configured in the
browser settings or in the user's personal account on the website.

It is a hidden ***protocol value*** used to know whether the page in the
browser cache is still valid, so that the browser will only
redownload its content if it does not match.

HTTP headers are NOT part of the UI. The browser may opt to
display them in a translated form, as part of the browser UI itself,
but not as part of the page content. Such a translation will not originate
from the server, but directly from the browser's local settings
for its UI.
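
As an illustration of the locale-independent header format and its use for cache validation, here is a minimal JavaScript sketch. The function names `toHttpDate` and `statusForConditionalGet` are hypothetical, and the status logic is a simplification of what a real server does.

```javascript
// Produce the fixed, locale-independent date form used in HTTP headers.
function toHttpDate(date) {
  return date.toUTCString();
}

// Simplified server-side decision for a conditional request:
// 304 if the cached copy is still current, 200 if the content must be resent.
function statusForConditionalGet(ifModifiedSince, lastModified) {
  const cached = Date.parse(ifModifiedSince);
  const current = Date.parse(lastModified);
  if (Number.isNaN(cached) || Number.isNaN(current)) {
    return 200; // unparsable header: resend the full content
  }
  return current <= cached ? 304 : 200;
}

// Assumed sample values.
const headerValue = toHttpDate(new Date(Date.UTC(2012, 3, 3, 16, 3, 0)));
const laterValue = toHttpDate(new Date(Date.UTC(2012, 3, 4, 9, 0, 0)));
console.log(headerValue);
console.log(statusForConditionalGet(headerValue, headerValue));
console.log(statusForConditionalGet(headerValue, laterValue));
```

The header text is the same for every user; any localized display of it is a separate formatting step on the client.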





Re: SV: Klingon on Unicode site?

2012-04-03 Thread Asmus Freytag

On 4/3/2012 1:49 PM, Elsebeth Flarup wrote:
If the visible display of the value of the Last-Modified header 
(which may or may not reflect the actual last modification time) is 
regarded as useful, it should of course be in the same language as the 
rest of the page.


I completely disagree. There may be many applications and web pages 
that are not translated into my preferred language, but I still want 
to be able to use my regional preferences. Especially common for US 
English apps and sites - I am entirely happy to use the English 
language UI, but I am very unhappy if I cannot see sensible dates with 
a dd/mm/ order, a calendar where Monday is the first day of the 
week, and numbers where the decimal separator is a comma, just to 
mention the most common issues.


For dates, there are two things affected by the locale. One is the 
language of the names of day and month. The other is the regional 
preference for their arrangement (date order).
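
These two effects can be separated in a small JavaScript sketch. The helper `describeLongDate` and the three sample locales are assumptions made for illustration.

```javascript
// Split a formatted long date into the two locale-dependent aspects:
// the language of the month name and the arrangement of the fields.
function describeLongDate(date, locale) {
  const parts = new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC",
  }).formatToParts(date);
  return {
    text: parts.map((part) => part.value).join(""),
    monthName: parts.find((part) => part.type === "month").value,
    order: parts
      .filter((part) => part.type !== "literal")
      .map((part) => part.type)
      .join("-"),
  };
}

// Assumed sample value: 3 April 2012 (UTC).
const longDateSample = new Date(Date.UTC(2012, 3, 3));
for (const locale of ["en-US", "en-GB", "fr-FR"]) {
  console.log(locale, describeLongDate(longDateSample, locale));
}
```

Here `en-US` and `en-GB` share the month name but differ in order, while `en-GB` and `fr-FR` share the order but differ in language.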


I can certainly sympathize with the view that the effect of these on the 
user (foreign or non-native user, to be more precise) is different. 
Switching between two different arrangements (especially short date 
formats) can indeed be more confusing than switching languages. However, 
for long date formats, getting the day and month names in other 
scripts/languages than the rest of the page is irritating in another way - 
and users do complain about it.


Given all this, I still don't think it makes sense to alternate the 
format randomly based purely on whether the date is generated on the fly 
or during page edit.


The latter is the situation for the Unicode site. Many pages have a date 
(see all the reports) in the header, which is static and formatted 
according to the document locale (to use that term). The "last updated" 
date happens to be generated at runtime, because that is a convenient 
way to do it, but there's nothing intrinsic that requires it to be 
dynamic. (One might imagine some hypothetical alternative editorial 
process that would create a static time stamp at file upload).


Therefore, overall consistency would suggest that exposing the dynamic 
nature of this particular string to the user by localizing it 
differently is a bug (or poor design) and not a feature. Worse, on pages 
that have both a static and a dynamic date, using different formatting 
rules could create confusion. (The worst case scenario would happen if 
they were both in short date format).


On the other hand, I can see situations where it might make sense to 
localize date formats even for an untranslated site. Those situations 
revolve typically around user input, but they don't apply here.


A./



Re: SV: Klingon on Unicode site?

2012-04-03 Thread Philippe Verdy
Yes, but HTTP headers are still not part of the page content itself. They
are unrelated to it and only needed for the HTTP protocol and the management
of caches, including in proxies. Those headers are by definition not
translatable by the server. Only the browser may opt to display these
headers outside of the page content, using the user's preferences in
the browser.

So even if there's a visible 'Last modified' date included in the
content, it does not have to match the date indicated in the HTTP
header, which may be updated only for technical reasons unrelated to
the content itself, such as fixing the HTML syntax, a CSS style, or
the broken URL of a script or image, or because the querying user has
changed the preferences of his personal account on that site in such
a way that the surrounding UI (sent as part of the personalized page
content body) needs to be refreshed, without affecting the date of the
content itself.

Both dates (in HTTP headers or in the page content body) are
unrelated. Standard HTTP headers are not translatable.





Re: [unicode] Re: vertical writing mode of modern Yi?

2012-04-03 Thread fantasai

On 04/02/2012 04:05 AM, mpsuz...@hiroshima-u.ac.jp wrote:


I appreciate your careful attitude considering the possibility
that the found short vertical strings are formed under the
influence of Chinese typography.

So, for further discussion, we need an UCS Yi materials with
vertical text that has no influence from Chinese typography?
How to evaluate the influence? If the book has a colophone
in Chinese, it should be excluded?


It is not the influence of Chinese typography that I'm concerned
with so much (since that probably goes back many hundreds of
years!), but the limitations of the software that was used to
typeset the book and the biases and assumptions of the people
running the software.

If the software is capable of both options, and the people managing
the typesetting process are comfortably literate in Yi and familiar
with its vertical habits in handwritten texts, then we can consider
the results of their work to be correct.

But if the software is only capable of typesetting characters upright
and not sideways, this will obviously be the result regardless of
typographic preference.

Also if the people typesetting the book are familiar with Chinese
but not with Yi, they might assume that these characters, which like
ideographs also fit in fixed-size boxes, should behave the same as
Chinese. This is a reasonable assumption. But it may not be correct.

~fantasai



Re: [unicode] Re: vertical writing mode of modern Yi?

2012-04-03 Thread mpsuzuki
On Tue, 03 Apr 2012 16:53:51 -0700
fantasai fantasai.li...@inkedblade.net wrote:

On 04/02/2012 04:05 AM, mpsuz...@hiroshima-u.ac.jp wrote:

 I appreciate your careful attitude considering the possibility
 that the found short vertical strings are formed under the
 influence of Chinese typography.

 So, for further discussion, we need an UCS Yi materials with
 vertical text that has no influence from Chinese typography?
 How to evaluate the influence? If the book has a colophone
 in Chinese, it should be excluded?

It is not the influence of Chinese typography that I'm concerned
with so much (since that probably goes back many hundreds of
years!), but the limitations of the software that was used to
typeset the book and the biases and assumptions of the people
running the software.

Thank you for the correction. Then, what kind of materials
should be collected to discuss the appropriate orientation
for UCS Yi? If the central issue is the software,
handwritten materials or metal-typeset materials
may serve as evidence. But if the central issue is
the biases and assumptions, contact with native users
of UCS Yi (who have not been exposed to the influence of
non-Yi texts) would be essential.

Also if the people typesetting the book are familiar with Chinese
but not with Yi, they might assume that these characters, which like
ideographs also fit in fixed-size boxes, should behave the same as
Chinese. This is a reasonable assumption. But it may not be correct.

Excuse me, please let me confirm my understanding of
"it may not be correct"; my English is quite poor.

Do you mean
"The assumption (behaviour like Chinese text) has no foundation,
 so it is not reliable",

or,

"It is inconsistent with the vertical text of Old Yi,
 so it is wrong"?

Which reading matches your view?

Regards,
mpsuzuki



Three character canonical decompositions in version 2 releases

2012-04-03 Thread Karl Williamson
http://unicode.org/policies/stability_policy.html says that, effective 
starting in Version 2.0, "Canonical mappings (Decomposition_Mapping 
property values) are always limited either to a single value or to a 
pair. The second character in the pair cannot itself have a canonical 
mapping."


I noticed that the UnicodeData.txt files shipped with all the Version 2 
releases contain three-character canonical decompositions. For example, 
in 2.1.9, there are these:

 01E0;0041 0307 0304
 01E1;0061 0307 0304
 1E1C;0045 0327 0306
 1E1D;0065 0327 0306

There are many more in 2.0.
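
A check of this kind can be sketched in JavaScript. It assumes the abbreviated two-field form shown above (code point, then mapping), not the full UnicodeData.txt record layout; `findLongMappings` is a hypothetical name, and the `00C5` line is added only as a two-character contrast case.

```javascript
// Sample input in the abbreviated "code point;mapping" form used in this message.
// The last line (00C5) is an added contrast case with a two-character mapping.
const listedMappings = `
01E0;0041 0307 0304
01E1;0061 0307 0304
1E1C;0045 0327 0306
1E1D;0065 0327 0306
00C5;0041 030A
`;

// Return the code points whose mapping has more than maxLength elements.
function findLongMappings(text, maxLength = 2) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const [codePoint, mapping] = line.split(";");
      return { codePoint, mapping: mapping.trim().split(/\s+/) };
    })
    .filter((entry) => entry.mapping.length > maxLength)
    .map((entry) => entry.codePoint);
}

console.log(findLongMappings(listedMappings));
```

Running the same function over each release's data would show where such mappings stop appearing.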

Is it an error on the web site that this policy was in effect in 2.0, 
and it really should be 3.0 (as there are no such decompositions in the 
data files starting in 3.0)?


Or were these data files defective?