RE: Mixing languages on a Web site

2000-07-03 Thread Chris Pratley

Only in the sense that Arial is more attractive than Times New Roman. For
on-screen display of small amounts of text, a Gothic font is better due to
the low resolution of displays, but for larger amounts of printed text, a
Mincho font is preferred. Newspapers (and Word documents) are all set in
Mincho for that reason.

If you install both fonts, you should make sure you get SP5 or later for NT4
to fix a problem that NT4 has handling several very large fonts on the
system at the same time.

One thing to note is that there are different versions of MS Gothic and MS
Mincho that have different coverage of CJK. Notably, version 2.3 of these
fonts, which ships with Win98J and all language versions of Win2000, has
JIS X 0212 CJK coverage. Older versions (NT4) covered only JIS X 0208. I am
not sure which version ships with the IE language packs, but it is probably
a smaller (older) one for size reasons.

Regarding Mike Ayers's question about usage, the Global IMEs appear in the
list of installed keyboards (represented by a two-letter icon in the task
bar tray). They appear only if you are using an application that supports
the Global IME (IE4/5, Word2000, Outlook 98/2000 mail, Outlook Express 4/5,
etc.). As far as I know, there is almost no documentation in English on how
to use IMEs. The Office2000 Proofing Tools manual has one page for each
language, but nothing comprehensive exists in English (I would love to be
proven wrong).

Chris Pratley
Group Program Manager
Microsoft Word

Sent with office10ship build 1829 wordmail on






Re: Mixing languages on a Web site

2000-07-01 Thread Andrew Cunningham

Hi Mike

To use Microsoft's Global IME for Japanese on NT4, there is one very
important step you need to do first: install NT4's Japanese support. There
are a few articles about it in the Microsoft Knowledge Base. I have the URLs
at work, but don't have them with me at the moment.

On the Win NT4 CD-ROM there is a folder somewhere called langpacks. Use
Windows Explorer to look in it for a file called japanese.inf. Right-click
it and a pop-up menu will appear; one of the menu items is 'Install'. Select
this and it will install NT4's Japanese language support. This should be
installed before the Global IME for Japanese, otherwise the IME will not
work. At least that's the story.

ciao

Andrew

Andrew Cunningham
[EMAIL PROTECTED]









Re: Mixing languages on a Web site

2000-07-01 Thread Michael (michka) Kaplan

If you mean the Active IMM, you can instead install the Japanese language
support provided by IE5, as it does the same thing (installs a font and code
page support). I think its cp files even have more recent dates.

In fact, the font it installs (MS Gothic) is generally considered to be more
attractive than the LangPack font (MS Mincho).

michka








Re: Mixing languages on a Web site

2000-06-30 Thread Michael (michka) Kaplan

To prove #4 will work, see

http://www.trigeminal.com/samples/provincial.html

Along with 102 other languages, this page includes both Japanese and
Turkish. UTF-8 is what makes that possible.

michka


- Original Message -
From: [EMAIL PROTECTED]
To: "Unicode List" [EMAIL PROTECTED]
Sent: Thursday, June 29, 2000 10:19 PM
Subject: Mixing languages on a Web site


 I am mixing Japanese and Turkish letters on my site.

 1) How do I convert Latin-* text to UTF-8 text?
 2) How do I convert Shift-JIS text to UTF-8 text?
 3) How do I mark text as UTF-8?
 4) Will people actually be able to SEE BOTH the Japanese AND the Turkish?
 5) Is there a little "formatted in Unicode" logo I can put on my site?
 6) Is there a "Unicode Help" site so people like me don't have to post these
 questions on lists like these?

 I bet 1 and 2 could be done with CGI scripts, and 3 is trivial.

 





Re: Mixing languages on a Web site

2000-06-30 Thread Antoine Leca

[EMAIL PROTECTED] wrote:
 
 [EMAIL PROTECTED] wrote:
 
  3) How do I mark text as UTF-8?
 
 In your head section:
 
 <meta http-equiv="content-type" content="text/html; charset=utf-8">
 
 Theoretically, you don't need this: Unicode (UTF-16 or UTF-8) are the
 default for the web. In practice, however, each different browser behaves in
 a slightly different way, so it can be a good idea to use the explicit
 declaration.

Hmmm. Writing off the top of my head (which is *not* a good way to go on
such a list), I understood that Unicode was the default character set,
meaning that &#65; is supposed to be a Latin 'A' and &#x431; is supposed
to be a Cyrillic 'б'.
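
The point about the document character set can be checked with a small
sketch. It assumes Python 3 and its standard html module, neither of which
comes from this thread: a numeric character reference resolves to the same
Unicode code point regardless of how the page's bytes are encoded.

```python
import html

# Numeric character references name Unicode code points directly,
# independent of the byte encoding the page itself is stored in.
decimal_ref = html.unescape("&#65;")    # decimal reference to U+0041
hex_ref = html.unescape("&#x431;")      # hexadecimal reference to U+0431

print(decimal_ref, hex(ord(decimal_ref)))
print(hex_ref, hex(ord(hex_ref)))
```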

OTOH, I believe that for upward compatibility, the default encoding (i.e.
how the actual bytes are supposed to be understood) is supposed to
be iso-8859-1, not utf-8. (And if the file begins with ÿþ or þÿ, i.e. the
bytes FF FE or FE FF of a byte order mark, the browser is advised to test
whether reading the file as utf-16 is a better idea...)
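
A rough sketch of that fallback rule, assuming Python 3. The function name
guess_encoding is hypothetical and this is not how any particular browser
is implemented: look for a UTF-16 byte order mark first, and otherwise fall
back to the assumed default.

```python
import codecs

def guess_encoding(data: bytes, default: str = "iso-8859-1") -> str:
    """Return 'utf-16' if data starts with a UTF-16 BOM, else the default."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return default

# Python's "utf-16" codec writes a BOM when encoding and consumes it
# again when decoding.
utf16_page = "Türkçe 日本語".encode("utf-16")
latin1_page = "café".encode("iso-8859-1")

print(guess_encoding(utf16_page), guess_encoding(latin1_page))

# Read as iso-8859-1, the two possible BOMs show up as the letter pairs
# mentioned in the message.
print(codecs.BOM_UTF16_LE.decode("iso-8859-1"),
      codecs.BOM_UTF16_BE.decode("iso-8859-1"))
```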


 
  4) Will people actually be able to SEE BOTH the Japanese AND
  the Turkish?
 
 Yes, provided they have a UTF-8 enabled browser and a font with all
 necessary glyphs.

Well, with current generation browsers (IE5 or Netscape 6), it can even
work with a font for Japanese and a different font for Turkish.

 
  6) Is there a "Unicode Help" site so people like me don't
  have to post these questions on lists like these?
 
 I think this mailing list is the proper place [...]

Yes, but wouldn't it be a very good idea to summarize these answers in some
FAQ at the Unicode (or W3C) site? That would allow Web sites everywhere to
link to the relevant information in a convenient way, particularly for those
who cannot afford to test all the cases with all browsers (also, it could
then easily be translated into a bunch of languages).

Perhaps this pertains more to W3C than Unicode, though.


Antoine



Re: Mixing languages on a Web site

2000-06-30 Thread Mark Davis

This is very much like how we did the multilingual content in
http://www.unicode.org/unicode/standard/WhatIsUnicode.html, which currently has
English, French, German, Italian, Russian, and Arabic, with more to follow.

Mark

Herman Ranes wrote:

 [EMAIL PROTECTED] wrote:
 
  I am mixing Japanese and Turkish letters on my site.
 
  1) How do I convert Latin-* text to UTF-8 text?
  2) How do I convert Shift-JIS text to UTF-8 text?

 I suppose you do not mean dynamic / server side conversion, but text
 preparation only.

 You can use MS Internet Explorer 5.0:

 - Load the text.

 - Select the code page, so that the text displays properly:
   View - Encoding - More

 - Save the text file as Unicode:
   Save as - Encoding: UTF-8
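
The same text preparation can also be scripted. This is a minimal sketch
assuming Python 3 and its built-in codecs; the helper name to_utf8 is
hypothetical. The source encoding has to be known in advance, since it
cannot be detected reliably from the bytes alone.

```python
def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known legacy encoding and re-encode as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

# Sample input: Turkish text in ISO 8859-9 (Latin-5) and Japanese text in
# Shift-JIS, standing in for the contents of existing files.
turkish_latin5 = "ışık".encode("iso8859_9")
japanese_sjis = "日本語".encode("shift_jis")

turkish_utf8 = to_utf8(turkish_latin5, "iso8859_9")
japanese_utf8 = to_utf8(japanese_sjis, "shift_jis")

print(turkish_utf8.decode("utf-8"), japanese_utf8.decode("utf-8"))
```

Once both texts are UTF-8 they can be concatenated into one page, which is
not possible while one is Latin-5 and the other Shift-JIS.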

  3) How do I mark text as UTF-8?
 If you cannot configure the server to include the appropriate HTTP
 header info, you can instead / in addition use the following META tag in
 the HTML code:

 <HEAD>
 <META http-equiv="Content-Type" content="text/html; charset=UTF-8">

 </HEAD>
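
The META declaration only helps if the bytes of the file really are UTF-8.
A small sketch, assuming Python 3, of producing a page whose declaration
and actual encoding agree; the template and the name build_page are
illustrative only.

```python
import html

PAGE_TEMPLATE = """<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>{title}</title>
</head>
<body>
<p>{body}</p>
</body>
</html>
"""

def build_page(title: str, body: str) -> bytes:
    """Fill the template and encode it with the charset it declares."""
    text = PAGE_TEMPLATE.format(title=html.escape(title), body=html.escape(body))
    return text.encode("utf-8")

page = build_page("Mixed scripts", "日本語 and Türkçe")
print(len(page), "bytes")
```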

  4) Will people actually be able to SEE BOTH the Japanese AND the Turkish?

 If fonts with the required repertoire are installed, they will! (Mozilla
 / Netscape 6.0 requires no more preparations -- it will pick
 substitutions from the available fonts.)

 If the HTML text is language-tagged and the browser is correctly
 configured, the text may be displayed in the correct font styles (Japanese,
 not Chinese).

 A page which contains both Japanese and Esperanto, in UTF-8 and with
 language tags:
 http://www.hist.no/~herman/eo/eo-jp.html
 http://www.hist.no/~herman/eo/eo.html

  5) Is there a little "formatted in Unicode" logo I can put on my site?
  6) Is there a "Unicode Help" site so people like me don't have to post these
  questions on lists like these?
 
  I bet 1 and 2 could be done with CGI scripts, and 3 is trivial.
 
  

 --
 Herman Ranes                  Høgskolen i Sør-Trøndelag
                               Avdeling for teknologi
 Telefon   +47 73559606        Institutt for elektroteknikk
 Telefaks  +47 73559581
 [EMAIL PROTECTED]             N-7004 Trondheim
 http://www.hist.no/~herman/   NOREG




RE: Mixing languages on a Web site

2000-06-30 Thread Ayers, Mike


 From: Michael (michka) Kaplan [mailto:[EMAIL PROTECTED]]
 Sent: Friday, June 30, 2000 4:28 AM
 
 To prove #4 will work, see
 
 http://www.trigeminal.com/samples/provincial.html
 
 Along with 102 other languages, this page includes both Japanese and
 Turkish. UTF-8 is what makes that possible
 
 michka

I checked it out, and with IE5 I can now view almost all of it.
There are 5 lines that I cannot view and for which there are no fonts
available, but otherwise great.  Netscape does not show nearly as many
(hints?).

On a possibly entirely unrelated subject, I downloaded Microsoft's
IMEs for Chinese and Japanese, hoping to learn to use them.  However, I
cannot figure out how to enable them, and can't locate any helpful info on
Microsoft's site.  I am running NT4.  Any tips greatly appreciated.


Thanks,

/|/|ike



RE: Mixing languages on a Web site

2000-06-30 Thread Marco . Cimarosti

Antoine Leca wrote:
 Hmmm. Writing from top of my head (which is *not* the good 
 way to go in such a list), I understood that Unicode was
 the default character set, [...]

You are right (see http://www.w3.org/International/O-HTML-charset.html).

 OTOH, I believe that for upward compatibility, the default 
 encoding [...] is supposed to be iso-8859-1, [...]

I was wrong, and you are right for HTML as served over HTTP 1.1. The current
trend is that HTML has no default encoding (see
http://www.w3.org/International/O-HTTP-charset.html) so, yes, the meta tag
should always be there in a decent page.
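
For anyone who wants to check their own pages for the declaration, here is
a hedged sketch assuming Python 3. The function name declared_charset and
the 1024-byte window are my own choices, not taken from any specification
cited in this thread.

```python
import re

# Look for charset=... inside a META tag; matching is done on raw bytes
# because the encoding is not yet known at this point.
_CHARSET_RE = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9_.:-]+)""",
    re.IGNORECASE,
)

def declared_charset(html_bytes: bytes):
    """Return the lower-cased charset named in a META tag, or None."""
    match = _CHARSET_RE.search(html_bytes[:1024])
    return match.group(1).decode("ascii").lower() if match else None

tagged = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"></head>'
untagged = b"<html><head><title>No declaration</title></head>"

print(declared_charset(tagged), declared_charset(untagged))
```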

 Well, with current generation browsers (IE5 or Netscape 6), 
 it can even work with a font for Japanese and a different 
 font for Turkish.

Right, using language tagging within the document
(http://www.w3.org/International/O-HTML-tags.html).

 Yes, but wouldn't it be a very good idea to summarize these
 answers in some FAQ at Unicode (or W3C) site, [...]

But it is much easier to bang out inaccurate answers from one's memory :-(
Sorry for having been careless one more time.

W3C has that nice section I've been mentioning so far
(http://www.w3.org/International/).

Unicode has a FAQ (http://www.unicode.org/unicode/faq/), a technical
introduction (http://www.unicode.org/unicode/standard/principles.html), and
a glossary (http://www.unicode.org/glossary/index.html). Probably the
Unicode FAQ should be updated periodically with questions asked on *this*
list, such as problems authoring web pages, selecting fonts, etc.

Among independent documentation, I would cite at least Roman Czyborra's site
(http://www.czyborra.com/), which is remarkably informative.

_ Marco



Re: Mixing languages on a Web site

2000-06-30 Thread Peter_Constable




On 06/30/2000 08:25:47 AM [EMAIL PROTECTED] wrote:

... a few are missing (Ethiopic, for example).
But it's got most of them (and I would love to fill in the blanks if there
is anyone who has sources for the missing languages!).

Just a few? Most of them? Not by a long shot! (Cf.
http://www.sil.org/ethnologue/) I regret, though, that I can't easily offer
you text for any of the other 6,700 languages (probably only 1/3 of which
are written).

"Ethiopic" is not the name of a language, by the way. Or were you counting
scripts rather than languages?

I'm inclined to think that counting country-specific varieties as separate
languages is artificially stretching things. I really doubt that someone
from Guatemala could complain of someone from Argentina, "¿Por qué no puede
simplemente hablar Español de Guatemala?" Do Australians, Canadians, Brits,
etc.; or Germans, Austrians, etc. make similar complaints of one another?

A fun page, nonetheless!


- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]




Re: Mixing languages on a Web site

2000-06-30 Thread Peter_Constable




On 06/30/2000 12:09:53 PM [EMAIL PROTECTED] wrote:

Languages and scripts are often very "politically"
involved. I simply chose not to judge people for their contribution, that's
all.

And given those considerations, I don't blame you in the least.



- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]





Re: Mixing languages on a Web site

2000-06-30 Thread Peter_Constable




On 06/30/2000 01:27:18 PM [EMAIL PROTECTED] wrote:

Peter,

Just read your post to the Unicode list. I'm wondering if your site has any
Unicode sample texts available (I'm looking for just about every major
script/language). The texts don't have to be long... but I'd like stuff
longer than one or two sentences (maybe a couple paragraphs would be great).
Any pointers would be much appreciated.


Not yet, really, at least probably not of the type and in a form that
you're looking for. There is some data but it's either in PDF, or it's
probably limited in its character repertoire to what is found in European
languages. Of the latter, I don't know whether any is encoded in UTF-8. At
any rate, try the following links:

http://www.sil.org/silewp/
http://www.sil.org/mexico/pub/publicaciones.htm

Some of our field offices have been working on getting content ready to
publish on the web, but I'm not sure how much and of what sort, and it may
be that a lot of it will be in PDF for now.

I expect we will have a lot more linguistic data from a wide variety of
languages in the future, but this will take some time. With regard to
character sets/encodings, most of our researchers have, in the past, worked
with custom character sets/encodings where commonly available standards
like cp1252 weren't adequate (linguists everywhere have had to do that), so
most existing data isn't yet in Unicode. As an organisation, though, we're
committed to Unicode, and those of us in our International offices working
on technology solutions for the researchers we support are promoting the use
of Unicode as linguistic software that supports Unicode becomes available.
(We anticipate our first Unicode-enabled language software products will be
released late this year or early next year.) So Unicode-encoded data from
vernacular languages will start to become more common over the next several
years. I also expect that SIL will be getting involved in cooperative
efforts with other major linguistic agencies to start building online
archives of linguistic data, and that will likely build heavily on XML and
Unicode.

One key issue in putting data from hundreds of languages on the web is
fonts and rendering support for complex scripts (which includes IPA and
Roman with diacritics). There is also the issue that some minority
languages use characters that are not yet part of Unicode, or they may use
characters in Unicode but with script behaviour that's slightly different
from what occurs in the more commonly known languages (e.g. different glyph
shapes or different ligatures and ligation rules). It will just take some
time to cross all these bridges.

We do have access to electronic corpora of texts in literally hundreds of
minority languages, we know that there would be a lot of interest in those
being made available, and we want to start making them available. With
personnel resources already stretched and some technical issues still to be
worked out, this will take longer than we wish it would.


- Peter


---
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: [EMAIL PROTECTED]




Re: Mixing languages on a Web site

2000-06-30 Thread Christopher John Fynn

 [EMAIL PROTECTED] wrote:

 Probably the Unicode FAQ should be updated periodically 
 with questions asked on *this* list, such as problems
 authoring web pages, selecting fonts, etc.

I second that. As Unicode is increasingly available to 
users in operating systems, applications, and on the 
web, more and more people are going to be looking for 
answers to practical questions relating to how to use and 
display Unicode text in their favourite applications or on 
their web pages. The most obvious place for them to look
for these answers is on the Unicode.org site, and the
most obvious place for people to ask those questions
which they cannot easily find answers to is on this list.


 - Chris 




Mixing languages on a Web site

2000-06-29 Thread rampshot

I am mixing Japanese and Turkish letters on my site.

1) How do I convert Latin-* text to UTF-8 text?
2) How do I convert Shift-JIS text to UTF-8 text?
3) How do I mark text as UTF-8?
4) Will people actually be able to SEE BOTH the Japanese AND the Turkish?
5) Is there a little "formatted in Unicode" logo I can put on my site?
6) Is there a "Unicode Help" site so people like me don't have to post these
questions on lists like these?

I bet 1 and 2 could be done with CGI scripts, and 3 is trivial.


Get free email and a permanent address at http://www.netaddress.com/?N=1