Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

2012-03-06 Thread Philippe Verdy
We've got the example of the ISO 9 standard itself.

On 5 March 2012 at 22:46, Michael Everson ever...@evertype.com wrote:
 On 5 Mar 2012, at 20:13, Benjamin M Scarborough wrote:

 There is a clear precedent here that the unifications of N2463 are not 
 necessarily the final fate of any of these characters. If the О Е letter for 
 Selkup should be disunified from U+0152/U+0153, then a proposal needs to be 
 submitted calling for the addition of the two letters to the UCS.

 Have you got examples, Ben?

 Michael Everson * http://www.evertype.com/




Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

2012-03-05 Thread Denis Jacquerye
On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy verd...@wanadoo.fr wrote:
 I am looking for the codes or assignment status of the Cyrillic
 letter OE/oe (ligatured) as used in Selkup (exactly similar to the
 Latin pair).

 This character pair has been part of the registration nr. 223 (in
 1998) by ISO of the (8-bit) extended Cyrillic character set for
 non-Slavic languages for bibliographic information interchange :

 http://www.itscj.ipsj.or.jp/sc2/open/02n3136.pdf

 According to this document, this character set had also been
 standardized as ISO 10756:1996. Note that it contains many other
 characters for which it did not document any mapping to the UCS in the
 then emerging ISO 10646 standard.

 It has even been part of proposals at the UTC and ISO the same year
 for inclusion in the UCS, along with other characters (at that time,
 Michael Everson wrote a proposal, placing them in U+04EC, U+04ED, but
 since then, the slots have been used for other characters; that block
 is now full).

 It is also referenced in the ISO 9 Cyrillic/Latin transliteration standard.

 Still, there's no Cyrillic character I can find in the encoded UCS in
 other Cyrillic extended blocks that are not full (for example,  the
 CYRILLIC SUPPLEMENT block at U+0500-052F).

 Where are those characters ? And what about the remaining characters
 found in the Registration nr. 223 and ISO 10756:1996 ? And their
 status in the ISO 9 standard itself ?

 Thanks.

 -- Philippe.


According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the
Cyrillic Selkup OE is mapped to Latin OE:
CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE
CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE OE
Several other of those missing Cyrillic characters are simply mapped
to Latin ones or sort of decomposed.
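To see what this unification means in practice, here is a minimal Python sketch that looks up the two target code points named in N2463. It only uses the standard unicodedata module; the dictionary name is illustrative and not taken from any of the documents cited.

```python
import unicodedata

# Targets given in N2463 for the two Selkup letters.
selkup_oe_targets = {
    "CYRILLIC CAPITAL LETTER SELKUP O E": "\u0152",
    "CYRILLIC SMALL LETTER SELKUP O E": "\u0153",
}

for source_name, char in selkup_oe_targets.items():
    print(f"{source_name} -> U+{ord(char):04X} {unicodedata.name(char)}")

# The two targets form an ordinary Latin case pair.
is_case_pair = "\u0152".lower() == "\u0153"
print("case pair:", is_case_pair)
```

Any script test based on the character name or block will therefore report these letters as Latin, even inside Cyrillic text.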

-
Denis Moyogo Jacquerye




Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

2012-03-05 Thread Benjamin M Scarborough
On Mon, Mar 5, 2012 at 19:35, Denis Jacquerye wrote:
 According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the
 Cyrillic Selkup OE is mapped to Latin OE:
 CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE
 CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE OE
 Several other of those missing Cyrillic characters are simply mapped
 to Latin ones or sort of decomposed. 

N2463 also maps twelve characters from ISO 10574 that have been disunified 
since 2002, namely:
04/06 CYRILLIC SMALL LETTER KURDISH QA is now U+051B CYRILLIC SMALL LETTER QA
04/09 CYRILLIC SMALL LETTER EL WITH MIDDLE HOOK is now U+0521 CYRILLIC SMALL 
LETTER EL WITH MIDDLE HOOK
04/10 CYRILLIC SMALL LETTER MORDVIN EL KA is now U+0515 CYRILLIC SMALL LETTER 
LHA
04/14 CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK is now U+0523 CYRILLIC SMALL 
LETTER EN WITH MIDDLE HOOK
05/06 CYRILLIC CAPITAL LETTER KURDISH QA is now U+051A CYRILLIC CAPITAL LETTER 
QA
05/09 CYRILLIC CAPITAL LETTER EL WITH MIDDLE HOOK is now U+0520 CYRILLIC 
CAPITAL LETTER EL WITH MIDDLE HOOK
05/10 CYRILLIC CAPITAL LETTER MORDVIN EL KA is now U+0514 CYRILLIC CAPITAL 
LETTER LHA
05/14 CYRILLIC CAPITAL LETTER EN WITH MIDDLE HOOK is now U+0522 CYRILLIC 
CAPITAL LETTER EN WITH MIDDLE HOOK
06/03 CYRILLIC SMALL LETTER ER KA is now U+0517 CYRILLIC SMALL LETTER RHA
06/08 CYRILLIC SMALL LETTER KURDISH WE is now U+051D CYRILLIC SMALL LETTER WE
07/03 CYRILLIC CAPITAL LETTER ER KA is now U+0516 CYRILLIC CAPITAL LETTER RHA
07/08 CYRILLIC CAPITAL LETTER KURDISH WE is now U+051C CYRILLIC CAPITAL LETTER 
WE
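The twelve mappings above can be written as a small lookup table. The following Python sketch is only an illustration: the dictionary name is made up, the keys are the positions as written in this message, and the names are printed from the standard unicodedata module.

```python
import unicodedata

# Position in the 8-bit set (as written above) -> code point now encoded.
disunified = {
    "04/06": 0x051B, "04/09": 0x0521, "04/10": 0x0515, "04/14": 0x0523,
    "05/06": 0x051A, "05/09": 0x0520, "05/10": 0x0514, "05/14": 0x0522,
    "06/03": 0x0517, "06/08": 0x051D, "07/03": 0x0516, "07/08": 0x051C,
}

for position, cp in sorted(disunified.items()):
    print(f"{position} -> U+{cp:04X} {unicodedata.name(chr(cp))}")

# Every target lies in the Cyrillic Supplement block (U+0500-052F).
all_in_supplement = all(0x0500 <= cp <= 0x052F for cp in disunified.values())
print("all in Cyrillic Supplement:", all_in_supplement)
```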

There is a clear precedent here that the unifications of N2463 are not 
necessarily the final fate of any of these characters. If the О Е letter for 
Selkup should be disunified from U+0152/U+0153, then a proposal needs to be 
submitted calling for the addition of the two letters to the UCS.

It is worth noting that N2463 also decomposes four characters using U+0335, a 
practice which hasn't been used for decompositions since Unicode 1.1.

I also don't understand the mapping of 04/05 CYRILLIC SMALL LETTER CHECHEN KA 
and 05/05 CYRILLIC CAPITAL LETTER CHECHEN KA into U+043A CYRILLIC SMALL LETTER 
KA, U+030A COMBINING RING ABOVE and U+041A CYRILLIC CAPITAL LETTER KA, U+030A 
COMBINING RING ABOVE, respectively. Is the character shown in ISO 10574 just a 
glyph variant of this combining sequence?
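For reference, the two sequences can be inspected with a short Python sketch. It uses only the standard unicodedata module, the variable names are illustrative, and it says nothing about what the glyph in the ISO chart actually looks like.

```python
import unicodedata

small_seq = "\u043A\u030A"    # CYRILLIC SMALL LETTER KA + COMBINING RING ABOVE
capital_seq = "\u041A\u030A"  # CYRILLIC CAPITAL LETTER KA + COMBINING RING ABOVE

for seq in (small_seq, capital_seq):
    names = [unicodedata.name(c) for c in seq]
    composed = unicodedata.normalize("NFC", seq)
    # NFC leaves the sequence as two code points: there is no
    # precomposed character for it to compose into.
    print(names, "NFC length:", len(composed))
```

Because the sequence stays decomposed under normalization, any distinct glyph shape would have to come from the font.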

—Ben Scarborough




Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

2012-03-05 Thread Philippe Verdy
On 5 March 2012 at 19:35, Denis Jacquerye moy...@gmail.com wrote:
 On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy verd...@wanadoo.fr wrote:
 I am looking for the codes or assignment status of the Cyrillic
 letter OE/oe (ligatured) as used in Selkup (exactly similar to the
 Latin pair).

 This character pair has been part of the registration nr. 223 (in
 1998) by ISO of the (8-bit) extended Cyrillic character set for
 non-Slavic languages for bibliographic information interchange :

 http://www.itscj.ipsj.or.jp/sc2/open/02n3136.pdf

 According to this document, this character set had also been
 standardized as ISO 10756:1996. Note that it contains many other
 characters for which it did not document any mapping to the UCS in the
 then emerging ISO 10646 standard.

 It has even been part of proposals at the UTC and ISO the same year
 for inclusion in the UCS, along with other characters (at that time,
 Michael Everson wrote a proposal, placing them in U+04EC, U+04ED, but
 since then, the slots have been used for other characters; that block
 is now full).

 It is also referenced in the ISO 9 Cyrillic/Latin transliteration standard.

 Still, there's no Cyrillic character I can find in the encoded UCS in
 other Cyrillic extended blocks that are not full (for example,  the
 CYRILLIC SUPPLEMENT block at U+0500-052F).

 Where are those characters ? And what about the remaining characters
 found in the Registration nr. 223 and ISO 10756:1996 ? And their
 status in the ISO 9 standard itself ?

 Thanks.

 -- Philippe.


 According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the
 Cyrillic Selkup OE is mapped to Latin OE:
 CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE
 CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE OE
 Several other of those missing Cyrillic characters are simply mapped
 to Latin ones or sort of decomposed.

Apparently this document is obsolete. Some of the proposed mappings to
Latin have been encoded as plain Cyrillic letters such as:

CYRILLIC SMALL LETTER KURDISH QA

(not the initially proposed mapping to LATIN SMALL LETTER Q)

This document was still a draft, and not a decision.

The document specifically says: "The issue with these letters is
whether they should be deunified from Latin, and encoded in the
Cyrillic block."




Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)

2012-03-05 Thread Michael Everson
On 5 Mar 2012, at 20:13, Benjamin M Scarborough wrote:

 There is a clear precedent here that the unifications of N2463 are not 
 necessarily the final fate of any of these characters. If the О Е letter for 
 Selkup should be disunified from U+0152/U+0153, then a proposal needs to be 
 submitted calling for the addition of the two letters to the UCS.

Have you got examples, Ben? 

Michael Everson * http://www.evertype.com/





Re: Cyrillic character mapping tables, HP MSL to Unicode

2003-09-03 Thread Philippe Verdy
Did you read the last PDF, notably as it says the following about Table
D-3:

[quote]
D - MSL/Unicode Symbol Indexes

Introduction

Table D-1, the Master Symbol List, lists all of the characters available
for the printers and their MSL index numbers. Table D-2, shows the
characters contained in the MSL symbol collections. Table D-3, the
Unicode Symbol List, lists all of the characters available for the
printers and identifies their unicode index number. Table D-4 shows
the characters contained in the unicode symbol collections.
[/quote]

Well, I misread the description myself, confused about the title of the
section, and it's true that only *some* MSL indices are identical to
the Unicode code points. It's a shame that one has to compute
the conversion by looking at glyph and names given by HP, which
do not correspond to Unicode names.
It would have been simpler if HP had referenced the Unicode code points
in Table D-1 in the 1999 release of its book and used the official
Unicode names (additionally, Table D-3 should have listed the MSL index
in a reverse index and used hexadecimal U+ notation rather than decimal
code points).
But joining D-1 and D-3 is possible, and allows creating the conversion
table from MSL to Unicode.
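A minimal sketch of that join in Python, under loud assumptions: the rows below are invented placeholders (the MSL indices and the description spellings are not copied from the HP manual), and the join key is simply the HP description with case and spacing normalized.

```python
# Hypothetical sample rows, NOT values from the HP manual.
table_d1 = [  # (MSL index, HP description)
    (9001, "Cyrillic Capital A"),
    (9002, "Cyrillic Capital Be "),
    (9003, "Symbol Only In MSL"),
]
table_d3 = [  # (decimal Unicode code point, HP description)
    (1040, "cyrillic capital a"),
    (1041, "Cyrillic  Capital Be"),
]

def key(description: str) -> str:
    """Normalize a description so the two tables can be joined on it."""
    return " ".join(description.lower().split())

unicode_by_name = {key(desc): cp for cp, desc in table_d3}

# Inner join: keep only the MSL entries whose description also occurs in D-3.
msl_to_unicode = {
    msl: unicode_by_name[key(desc)]
    for msl, desc in table_d1
    if key(desc) in unicode_by_name
}

for msl, cp in msl_to_unicode.items():
    # Print the decimal code point from D-3 in U+ hexadecimal notation.
    print(f"MSL {msl} -> U+{cp:04X}")
```

In a real run the descriptions would still need manual checking, since HP's names do not correspond to the Unicode names.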

Philippe.
Unsolicited messages (spam) are not tolerated.
Any abuse will be automatically reported to your service providers.
- Original Message - 
From: Neil J Geddes [EMAIL PROTECTED]
To: Philippe Verdy [EMAIL PROTECTED]
Sent: Wednesday, September 03, 2003 9:40 AM
Subject: RE: Cyrillic character mapping tables, HP MSL to Unicode


Hello Philippe,

Thank you very much for your messages and for taking the time to
respond. I appreciate this.

I had already checked most of these resources (like you I have the older
paper manuals) however none provide symbol charts for the Cyrillic
character sets. I think I really need to locate TFM files if available.
MSL isn't the same as Unicode however I have found a MSL - CG table
which should help me.

Thanks again,
Neil

-Original Message-
From: Philippe Verdy [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 03, 2003 1:42 AM
To: Neil J Geddes; [EMAIL PROTECTED]
Subject: Re: Cyrillic character mapping tables, HP MSL to Unicode


More precisely, try this file:
http://h27.www2.hp.com/bc/docs/support/SupportManual/bpl13206/bpl13206.pdf
which contains all the symbol sets charts and cross-references with the
MSL/Unicode code and their assignment in other subsets. It is referred to
within the downloadable reference CDROM for the PCL language.

The MSL index seems to be the Unicode code point, so the MSL is merely a
subset of Unicode, as used in the HP implementation of the HP PCL - GL/2
symbol sets and fonts.

Philippe.




RE: Cyrillic character mapping tables, HP MSL to Unicode

2003-09-03 Thread Neil J Geddes
Thanks Philippe. What I really need now is access to additional Euro-Asian HP TFM 
files.

Regards, Neil


-Original Message-
From: Philippe Verdy [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, September 03, 2003 9:57 AM
To: Neil J Geddes
Cc: [EMAIL PROTECTED]
Subject: Re: Cyrillic character mapping tables, HP MSL to Unicode


Did you read the last PDF, notably as it says the following about Table
D-3:

[quote]
D - MSL/Unicode Symbol Indexes

Introduction

Table D-1, the Master Symbol List, lists all of the characters available for the 
printers and their MSL index numbers. Table D-2, shows the characters contained in the 
MSL symbol collections. Table D-3, the Unicode Symbol List, lists all of the 
characters available for the printers and identifies their unicode index number. Table 
D-4 shows the characters contained in the unicode symbol collections. [/quote]

Well, I misread the description myself, confused about the title of the section, and 
it's true that only *some* MSL indices are identical to the Unicode code points. It's 
a shame that one has to compute the conversion by looking at glyph and names given by 
HP, which do not correspond to Unicode names. It would have been simpler if HP had 
referenced in its 1999 release of its book, the Unicode code points in Table D-1, and 
used the official Unicode names (additionally the table D-3 should have listed the MSL 
index in a reverse index, and not used the decimal code points but hexadecimal 
notation U+). But joining D-1 and D-3 is possible, and allows creating the 
conversion table from MSL to Unicode.

Philippe.
- Original Message - 
From: Neil J Geddes [EMAIL PROTECTED]
To: Philippe Verdy [EMAIL PROTECTED]
Sent: Wednesday, September 03, 2003 9:40 AM
Subject: RE: Cyrillic character mapping tables, HP MSL to Unicode


Hello Philippe,

Thank you very much for your messages and for taking the time to respond. I appreciate 
this.

I had already checked most of these resources (like you I have the older paper 
manuals) however none provide symbol charts for the Cyrillic character sets. I think I 
really need to locate TFM files if available. MSL isn't the same as Unicode however I 
have found a MSL - CG table which should help me.

Thanks again,
Neil

-Original Message-
From: Philippe Verdy [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 03, 2003 1:42 AM
To: Neil J Geddes; [EMAIL PROTECTED]
Subject: Re: Cyrillic character mapping tables, HP MSL to Unicode


More precisely, try this file: 
http://h27.www2.hp.com/bc/docs/support/SupportManual/bpl13206/bpl13206.pdf
which contains all the symbol sets charts and cross-references with the MSL/Unicode 
code and their assignment in other subsets. It is referred to within the downloadable 
reference CDROM for the PCL language.

The MSL index seems to be the Unicode code point, so the MSL is merely a subset of 
Unicode, as used in the HP implementation of the HP PCL - GL/2 symbol sets and fonts.

Philippe.




Re: Cyrillic character mapping tables, HP MSL to Unicode

2003-09-02 Thread Philippe Verdy
First start with this page:
http://www.hp.com/cposupport/printers/support_doc/bpl04568.html
You may want to buy this:
Refer to the HP PCL5 Technical Reference Bundle. To order, call HP's
driver/software distribution at 661-257-5565. The part number is
5961-0976.

You may also look at:
http://www.hp.com/cposupport/printers/support_doc/bpl02705.html
and refer to this:
For further information about PCL commands, HP-GL/2, macros, or PJL
commands, use the Technical Reference Manual set, part number 5021-0377.
Order the manual set from HP's Support Materials Organization.

Or you may download this:
http://h2.www2.hp.com/bc/docs/support/SupportManual/bpl13210/bpl13210.pdf
PCL 5 Printer Language Technical Reference Manual - ENWW - HP Part No.
5961-0509. Printed in USA. First Edition - October 1992 PCL 5 Printer
Language Technical Reference Manual.
I have the same book, but dated September 1990 (this was really the
first edition), HP part number 33459-90903.

Also:
http://h2.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?locBasepartNum=5961-0976lang=English%20%28US%29
HP PCL Tech Reference Manual CD-ROM - The HP PCL Tech Reference Bundle
CD-ROM includes the Technical Quick Reference Guide, Printer Job
Language Technical Reference Manual, PCL 5 Color Technical Reference
Manual, and PCL 5 Printer Language Technical Reference Manual. In English,
in PDF format.

Philippe.
- Original Message - 
From: Neil J Geddes [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, August 28, 2003 2:23 PM
Subject: Cyrillic character mapping tables, HP MSL to Unicode


 Hello,

 I'm looking for symbol set and character metric information for the two
 Hewlett-Packard symbol sets 3R (PC Cyrillic) and 9R (Windows 3.1
 Latin/Cyrillic). Specifically I'm after:-

 1) .TFM files for Univers, CG Times, Courier and other common typefaces
 that use Cyrillic.

 2) A cross mapping table for HP MSL (Master Symbol List) to Unicode.

 Thanks for any help you can offer. It's appreciated!

 Best regards,

 Neil Geddes
 [EMAIL PROTECTED]





Re: Cyrillic character mapping tables, HP MSL to Unicode

2003-09-02 Thread Philippe Verdy
More precisely, try this file:
http://h27.www2.hp.com/bc/docs/support/SupportManual/bpl13206/bpl13206.pdf
which contains all the symbol sets charts and cross-references with the
MSL/Unicode code and their assignment in other subsets.
It is referred to within the downloadable reference CDROM for the PCL
language.

The MSL index seems to be the Unicode code point, so the MSL is merely a
subset of Unicode, as used in the HP implementation of the HP PCL - GL/2
symbol sets and fonts.

Philippe.
 - Original Message - 
 From: Neil J Geddes [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Sent: Thursday, August 28, 2003 2:23 PM
 Subject: Cyrillic character mapping tables, HP MSL to Unicode


  Hello,
 
  I'm looking for symbol set and character metric information for the two
  Hewlett-Packard symbol sets 3R (PC Cyrillic) and 9R (Windows 3.1
  Latin/Cyrillic). Specifically I'm after:-
 
  1) .TFM files for Univers, CG Times, Courier and other common typefaces
  that use Cyrillic.
 
  2) A cross mapping table for HP MSL (Master Symbol List) to Unicode.
 
  Thanks for any help you can offer. It's appreciated!





Re: Cyrillic Q

2001-09-27 Thread Roozbeh Pournader

On Thu, 27 Sep 2001, Marco Cimarosti wrote:

 A lot of time ago, someone on this list mentioned a language, written in the
 Cyrillic alphabet, which employed letter Q, taken from the Latin alphabet.

 Which language is it?

IIRC, it was Kurdish.

roozbeh





Re: Cyrillic Q

2001-09-27 Thread John Hudson

At 02:48 9/27/2001, Marco Cimarosti wrote:

A lot of time ago, someone on this list mentioned a language, written in the
Cyrillic alphabet, which employed letter Q, taken from the Latin alphabet.

Which language is it?

Kurdish. The common Cyrillic orthography includes four Latin letterforms 
that are, as far as I know, unique to Kurdish:

 U+0051, U+0071  Capital, Small Q
 U+0057, U+0077  Capital, Small W
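A small Python sketch of what this means for processing such text: the four letters are plain Latin code points, so a Cyrillic string that contains them is mixed-script as far as the encoding is concerned. The helper name and the sample string are hypothetical; the sample is a made-up letter sequence, not a Kurdish word.

```python
import unicodedata

kurdish_latin = "QqWw"
for ch in kurdish_latin:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

def scripts_used(text: str) -> set:
    """Return the first word of each letter's Unicode name (a rough script tag)."""
    return {unicodedata.name(ch).split()[0] for ch in text if ch.isalpha()}

# Made-up letter sequence: Cyrillic a, Latin q, Cyrillic be, Latin w.
sample = "\u0430q\u0431w"
print(scripts_used(sample))
```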

John Hudson

Tiro Typeworks  www.tiro.com
Vancouver, BC   [EMAIL PROTECTED]

Type is something that you can pick up and hold in your hand.
   - Harry Carter





Re: Cyrillic Q

2001-09-27 Thread James E. Agenbroad

On Thu, 27 Sep 2001, John Hudson wrote:

 At 02:48 9/27/2001, Marco Cimarosti wrote:
 
 A lot of time ago, someone on this list mentioned a language, written in the
 Cyrillic alphabet, which employed letter Q, taken from the Latin alphabet.
 
 Which language is it?
 
 Kurdish. The common Cyrillic orthography includes four Latin letterforms 
 that are, as far as I know, unique to Kurdish:
 
  U+0051, U+0071  Capital, Small Q
  U+0057, U+0077  Capital, Small W
 
 John Hudson
 
 Tiro Typeworkswww.tiro.com
 Vancouver, BC [EMAIL PROTECTED]
 
 Type is something that you can pick up and hold in your hand.
- Harry Carter
 
 
Thursday, September 27, 2001
Besides Kurdish, the section on transliteration of non-Slavic languages
using Cyrillic in the ALA-LC romanization tables (1997) shows Q used with
four other languages: Aisor, Chechen (the 1862 and 1908 orthographies but
not the 1938 one), Dargwa (Uslar) and Lak (1864 but not 1938). For Kurdish,
Q also seems to have an alternative glyph that appears as O followed by
a vertical bar, which is also used with Lezghian (Uslar).

 Regards,
  Jim Agenbroad ( [EMAIL PROTECTED] )
 The above are purely personal opinions, not necessarily the official
views of any government or any agency of any.
Phone: 202 707-9612; Fax: 202 707-0955; US mail: I.T.S. Dev.Gp.4, Library
of Congress, 101 Independence Ave. SE, Washington, D.C. 20540-9334 U.S.A.  





RE: Cyrillic -

2000-10-03 Thread Alan Wood

 Aleksandar Poposki [mailto:[EMAIL PROTECTED]] asked:
   where could I obtain true-type fonts for Unicode. 
 
You can find a list of fonts that include the Unicode Cyrillic range of
characters at:
http://www.hclrss.demon.co.uk/unicode/cyrillic.html

You can find information about obtaining those fonts at:
http://www.hclrss.demon.co.uk/unicode/fonts.html

However, you probably don't need to worry about obtaining special fonts.
Unicode Cyrillic characters are included in recent versions of Arial,
Courier New and Times New Roman, and so many Windows users can already
display them.  Macintosh users with Mac OS 9 can install the Cyrillic
language kit from their OS CD-ROM, and this enables recent Web browsers to
display Unicode Cyrillic as well as other encodings.

Fingertip Software Inc. produces Character Set Converter, which runs under
Windows and can convert Unicode Cyrillic to and from various Windows,
Macintosh and DOS Cyrillic character sets:
http://www.fingertipsoft.com/csconv/brochure.html

Alan Wood
Documentation Writer / Web Master
Context Limited (http://www.context.co.uk/)
mailto:[EMAIL PROTECTED]
http://www.alanwood.net/ (Unicode, special characters, pesticide names)



Re: Cyrillic -

2000-10-01 Thread Vladimir Weinstein

Hi,

I have looked at your web site. If I am not mistaken, you are using a
codepage that is commonly referred to as Cyrillic YUSCII. This makes the
page almost unusable except for the people that have the 'Pulshelvetika7'
font installed.

As you have correctly assumed, the best thing would be to convert the
page to Unicode (although you could also convert it to Microsoft cp1251 or
ISO 8859-5). You will not lose your pages - just work on copies of
them.
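To illustrate how the three target encodings differ, here is a short Python sketch that encodes one Cyrillic letter in each of them. It relies only on codecs that ship with Python ("cp1251", "iso8859_5", "utf-8"); the variable names are illustrative.

```python
text = "\u0416"  # CYRILLIC CAPITAL LETTER ZHE

# The same character has a different byte value in each encoding.
encoded = {codec: text.encode(codec) for codec in ("cp1251", "iso8859_5", "utf-8")}
for codec, data in encoded.items():
    print(codec, data.hex())

# Converting a legacy page is a decode with the old encoding
# followed by an encode with the new one.
legacy_bytes = encoded["cp1251"]
utf8_bytes = legacy_bytes.decode("cp1251").encode("utf-8")
print(utf8_bytes == encoded["utf-8"])
```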

One possible way would be, as Markus already mentioned, to use ICU
converter framework - but you would have to make a converter table.
There is also a set of macros for Word that handles ex-YU codepage
conversions, which can be found at http://solair.eunet.yu/~minya/

Once you have converted your text to Unicode - you should add encoding
information to your page about the used encoding.

Most modern browsers should be able to swallow and correctly display
such a page.

Should you have more questions, please contact me directly.

Hope this helps,

--
Vladimir Weinstein
Software Engineer, Unicode Technology Group
IBM JTC Cupertino
408-777-5844 (t/l 240-5844)



Re: Cyrillic -

2000-09-29 Thread Markus Scherer

hello,

for fonts etc. have a look at http://www.unicode.org/unicode/onlinedat/resources.html

for converting your pages to unicode, you would need some library or operating system 
api to do so. there are plenty around, but you would have to find out exactly what is 
the encoding of your pages. if you cannot find built-in support, then you might need 
to add a mapping table to one of the libraries' conversion services.

for such libraries see http://www.unicode.org/unicode/onlinedat/products.html#3
i am working with the icu library that you find linked there. with icu for example, 
you can add a mapping table to the library.

best regards,

markus



RE: Cyrillic -

2000-09-29 Thread Carl W. Brown



Aleks,

The reason to use Unicode is more fundamental than fonts. I assume that your
church members and others interested in your sites will have different
systems. Those with Cyrillic fonts will prefer Cyrillic text. Using
Unicode you can encode your entire website in one encoding, mixing both Latin
and Cyrillic text. What you need is an editor that can save the Cyrillic
text as Unicode in UTF-8 form. This is the same form that you will send to
the browser. This way both Latin and Cyrillic text will be the same to
the Web server. Make sure that the HTTP header and the charset meta tag
both specify utf-8. The browser will handle both Latin and Cyrillic the same as
well. All it will need to display Cyrillic is a Cyrillic font.
Windows and Mac users can install IE 5.0 and select the pan-European
support. The Windows and Mac fonts can be downloaded from: http://www.microsoft.com/truetype/fontpack/win.htm

TrueType is a Unicode encoded font that can be used in
non-Unicode applications as well.

Good luck

Carl




  -Original Message-
  From: Magda Danish (Unicode) [mailto:[EMAIL PROTECTED]]
  Sent: Friday, September 29, 2000 12:05 PM
  To: Unicode List
  Subject: Cyrillic -

  -Original Message-
  From: Aleksandar Poposki [mailto:[EMAIL PROTECTED]]
  Sent: Thursday, September 28, 2000 4:04 PM
  To: [EMAIL PROTECTED]
  Subject: Your opinion

  Hello.

  I'm the Webmaster of the Macedonian Orthodox Church website located at
  www.m-p-c.org. When I started this project I was not very familiar with
  Unicode and used 'home-made' fonts for Cyrillic characters, but learning
  about Unicode, I see it is the best way to go, as it is the International
  standard. Keeping this in mind, and other difficulties I've had, I wish to
  ask:

  Is there a way to convert my work to Unicode w/o risk? I was wondering
  about writing a program to search my document for a character and, once
  found, replace it with the Unicode character number.

  Is there a script available for me to add to my web page so that, if the
  user doesn't have Multi-Lingual Cyrillic support, it is installed
  automatically?

  And where could I obtain true-type fonts for Unicode? Also, is there a
  script as in my previous question for true-type fonts?

  Aleks


Re: Cyrillic -

2000-09-29 Thread Valeriy E. Ushakov

   -Original Message-
   From: Aleksandar Poposki [mailto:[EMAIL PROTECTED]]
   Sent: Thursday, September 28, 2000 4:04 PM
   To: [EMAIL PROTECTED]
   Subject: Your opinion
 
   I'm the Webmaster of the Macedonian Orthodox Church website
 located at www.m-p-c.org.  When I started this project I was not
 very familiar with Unicode and used 'home-made' fonts for Cyrillic
 characters, but learning about Unicode, I see it is the best way to
 go, as it is the International standard.  Keeping this in mind, and
 other difficulties I've had, I wish to ask:

Do you plan to have Old Church Slavonic (OCS) in your pages?

Unicode lacks support for "letter titlo" (i.e. titlo with a letter)
used quite productively in OCS (in Russia at least), so you can't use
Unicode to write "The Lord" (with "slovo-titlo") or "The Gospel" (with
"glagol-titlo").

SY, Uwe
-- 
[EMAIL PROTECTED] |   Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/|   Ist zu Grunde gehen



Re: Cyrillic -

2000-09-29 Thread Michael Everson

At 13:44 -0800 2000-09-29, Valeriy E. Ushakov wrote:

Unicode lacks support for "letter titlo" (i.e. titlo with a letter)
used quite productively in OCS (in Russia at least), so you can't use
Unicode to write "The Lord" (with "slovo-titlo") or "The Gospel" (with
"glagol-titlo").

Not true. See U+0483 COMBINING CYRILLIC TITLO.
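For readers who want to check the properties of that character, here is a minimal Python sketch using the standard unicodedata module. The sample string is only an illustration of a titlo placed after a base letter; it is not offered as a verified Church Slavonic spelling.

```python
import unicodedata

titlo = "\u0483"
titlo_name = unicodedata.name(titlo)
titlo_category = unicodedata.category(titlo)   # general category
titlo_class = unicodedata.combining(titlo)     # canonical combining class
print(titlo_name, titlo_category, titlo_class)

# Illustrative only: the mark follows the base letter it attaches to.
abbreviated = "\u0433\u0434" + titlo + "\u044C"
print(abbreviated, len(abbreviated))
```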

Cyrillic fans will be delighted to learn that 16 Komi Cyrillic characters
used from 1919-1940 have been accepted for processing into the standard
(document describing these is available on my web site). Also the three
columns between the Cyrillic block and the Armenian block have been
dedicated to extensions, called "Cyrillic Supplementary".

Michael Everson  **  Everson Gunn Teoranta  **   http://www.egt.ie
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Vox +353 1 478 2597 ** Fax +353 1 478 2597 ** Mob +353 86 807 9169
27 Páirc an Fhéithlinn;  Baile an Bhóthair;  Co. Átha Cliath; Éire





Re: Cyrillic -

2000-09-29 Thread Valeriy E. Ushakov

On Fri, Sep 29, 2000 at 15:55:41 -0800, John Cowan wrote:

   What is genuinely missing is IOTIFIED A.  Because LITTLE YUS and
   IOTIFIED A fell together in Russian as /ja/, Peter eliminated the
   latter and adopted a modified form of LITTLE YUS, now CYRILLIC
   LETTER YA.
  
  But aren't IOTIFIED A and YA just glyph variants (with LITTLE YUS
  lacking a parallel glyph in Peter's civil alphabet, merging with YA
  instead).
 
 Historically YA is a glyph variant of LITTLE YUS, not of IOTIFIED A,
 I am told.  So given that we have already encoded YA and LITTLE YUS
 (unavoidable, really, considering how different they look), IOTIFIED
 A has no representation.

My, rather limited, understanding is that at that time the two
letters, LITTLE YUS and IOTIFIED A, were no longer denoting distinct
sounds and were used more or less interchangeably (i.e. they were more
or less glyph variants by that time) and so Peter merged them into one
letter YA with a glyph for it being based on a glyph for LITTLE YUS.

In other words iotified a (ya) survived in Peter's secular Russian
alphabet as a character but lost its Slavonic glyph, while little yus
disappeared as a character but its glyph survived in the new alphabet.
Thus Peter's YA is *character* YA (== iotified a) with a glyph based
on a glyph for little yus.

But the important point here is that "old" alphabet and "new" alphabet
were "disjoint".  With regard to Russian they are disjoint in time.
With regard to Slavonic - the new alphabet was "secular Russian",
while old one was "Church Slavonic" and the two never really mixed.
The "typeface" aspect is important too: writing one of the languages
in the other's typeface is clearly perceived as either a visual pun or
transliteration.  So, in theory, you'll never find *glyph* YA
(reversed R) and *glyph* IOTIFIED A (i-a) in one homogeneous text as
this is made impossible by either synchronic or diachronic
constraints. 

So it seems that for Slavonic one should use LITTLE YUS to encode
little yus and YA to encode iotified a (which my grammar book of
Slavonic calls just "ya").  For Russian there's no LITTLE YUS and
character YA is used to encode ya.

Of course it's still possible to develop a typeface with all three
glyphs (little yus, iotified a, ya) in it and use OpenType to choose
correct one.   This is not dissimilar to, say, mixed Serbian and
Russian cursive text with different glyphs for certain characters.
(And the latter have been already discussed to death on this list).
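The three characters under discussion can be looked up by name with a short Python sketch. It uses the standard unicodedata module and an illustrative dictionary name. One caveat, stated as later background rather than as part of this discussion: IOTIFIED A was not encoded when this message was written, and the lookup below succeeds only because later versions of Unicode added it.

```python
import unicodedata

names = (
    "CYRILLIC SMALL LETTER LITTLE YUS",
    "CYRILLIC SMALL LETTER YA",
    "CYRILLIC SMALL LETTER IOTIFIED A",  # added in a later Unicode version
)

# Map each character name to its code point.
code_points = {name: ord(unicodedata.lookup(name)) for name in names}

for name, cp in code_points.items():
    print(f"U+{cp:04X} {name}")
```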


All this, of course, is Russian-centric.  I don't know how things
developed in other Slavic languages, especially in southern slavic
languages that are closer to (also southern by its origin) Church
Slavonic than the eastern slavic Russian.

PS: Sorry if this sounds a little confusing - 6am is not the best time
for writing from memory short essays on history of Cyrillic alphabet
in Russia.

SY, Uwe
-- 
[EMAIL PROTECTED] |   Zu Grunde kommen
http://www.ptc.spbu.ru/~uwe/|   Ist zu Grunde gehen