Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)
We've got the example of the ISO 9 standard itself. On 5 March 2012 at 22:46, Michael Everson ever...@evertype.com wrote: On 5 Mar 2012, at 20:13, Benjamin M Scarborough wrote: There is a clear precedent here that the unifications of N2463 are not necessarily the final fate of any of these characters. If the О Е letter for Selkup should be disunified from U+0152/U+0153, then a proposal needs to be submitted calling for the addition of the two letters to the UCS. Have you got examples, Ben? Michael Everson * http://www.evertype.com/
Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)
On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy verd...@wanadoo.fr wrote: I am looking for the codes or assignment status of the Cyrillic letter OE/oe (ligatured) as used in Selkup (exactly similar to the Latin pair). This character pair has been part of the registration nr. 223 (in 1998) by ISO of the (8-bit) extended Cyrillic character set for non-Slavic languages for bibliographic information interchange: http://www.itscj.ipsj.or.jp/sc2/open/02n3136.pdf According to this document, this character set had also been standardized as ISO 10756:1996. Note that it contains many other characters for which it did not document any mapping to the UCS in the then emerging ISO 10646 standard. It has even been part of proposals at the UTC and ISO the same year for inclusion in the UCS, along with other characters (at that time, Michael Everson wrote a proposal placing them at U+04EC and U+04ED, but since then the slots have been used for other characters, and that block is now full). It is also referenced in the ISO 9 Cyrillic/Latin transliteration standard. Still, there's no such Cyrillic character I can find in the encoded UCS in the other Cyrillic extended blocks that are not full (for example, the CYRILLIC SUPPLEMENT block at U+0500-052F). Where are those characters? And what about the remaining characters found in Registration nr. 223 and ISO 10756:1996? And their status in the ISO 9 standard itself? Thanks. -- Philippe. According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the Cyrillic Selkup OE is mapped to Latin OE: CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE; CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE OE. Several other of those missing Cyrillic characters are simply mapped to Latin ones or sort of decomposed. - Denis Moyogo Jacquerye
Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)
On Mon, Mar 5, 2012 at 19:35, Denis Jacquerye wrote: According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the Cyrillic Selkup OE is mapped to Latin OE: CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE; CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE OE. Several other of those missing Cyrillic characters are simply mapped to Latin ones or sort of decomposed.

N2463 also maps twelve characters from ISO 10574 that have been disunified since 2002, namely:

04/06 CYRILLIC SMALL LETTER KURDISH QA is now U+051B CYRILLIC SMALL LETTER QA
04/09 CYRILLIC SMALL LETTER EL WITH MIDDLE HOOK is now U+0521 CYRILLIC SMALL LETTER EL WITH MIDDLE HOOK
04/10 CYRILLIC SMALL LETTER MORDVIN EL KA is now U+0515 CYRILLIC SMALL LETTER LHA
04/14 CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK is now U+0523 CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK
05/06 CYRILLIC CAPITAL LETTER KURDISH QA is now U+051A CYRILLIC CAPITAL LETTER QA
05/09 CYRILLIC CAPITAL LETTER EL WITH MIDDLE HOOK is now U+0520 CYRILLIC CAPITAL LETTER EL WITH MIDDLE HOOK
05/10 CYRILLIC CAPITAL LETTER MORDVIN EL KA is now U+0514 CYRILLIC CAPITAL LETTER LHA
05/14 CYRILLIC CAPITAL LETTER EN WITH MIDDLE HOOK is now U+0522 CYRILLIC CAPITAL LETTER EN WITH MIDDLE HOOK
06/03 CYRILLIC SMALL LETTER ER KA is now U+0517 CYRILLIC SMALL LETTER RHA
06/08 CYRILLIC SMALL LETTER KURDISH WE is now U+051D CYRILLIC SMALL LETTER WE
07/03 CYRILLIC CAPITAL LETTER ER KA is now U+0516 CYRILLIC CAPITAL LETTER RHA
07/08 CYRILLIC CAPITAL LETTER KURDISH WE is now U+051C CYRILLIC CAPITAL LETTER WE

There is a clear precedent here that the unifications of N2463 are not necessarily the final fate of any of these characters. If the О Е letter for Selkup should be disunified from U+0152/U+0153, then a proposal needs to be submitted calling for the addition of the two letters to the UCS.

It is worth noting that N2463 also decomposes four characters using U+0335, a practice which hasn't been used for decompositions since Unicode 1.1.

I also don't understand the mapping of 04/05 CYRILLIC SMALL LETTER CHECHEN KA and 05/05 CYRILLIC CAPITAL LETTER CHECHEN KA into U+043A CYRILLIC SMALL LETTER KA, U+030A COMBINING RING ABOVE and U+041A CYRILLIC CAPITAL LETTER KA, U+030A COMBINING RING ABOVE, respectively. Is the character shown in ISO 10574 just a glyph variant of this combining sequence?

- Ben Scarborough
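The disunifications listed in this message can be checked against whatever version of the Unicode Character Database ships with a given Python build. The following is only a minimal sketch: the table name `DISUNIFIED` and the helper `check` are illustrative labels, and the code points and names are simply copied from the list in the message.

```python
import unicodedata

# ISO position -> (code point, character name), copied from the list in the message.
DISUNIFIED = {
    "04/06": (0x051B, "CYRILLIC SMALL LETTER QA"),
    "04/09": (0x0521, "CYRILLIC SMALL LETTER EL WITH MIDDLE HOOK"),
    "04/10": (0x0515, "CYRILLIC SMALL LETTER LHA"),
    "04/14": (0x0523, "CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK"),
    "05/06": (0x051A, "CYRILLIC CAPITAL LETTER QA"),
    "05/09": (0x0520, "CYRILLIC CAPITAL LETTER EL WITH MIDDLE HOOK"),
    "05/10": (0x0514, "CYRILLIC CAPITAL LETTER LHA"),
    "05/14": (0x0522, "CYRILLIC CAPITAL LETTER EN WITH MIDDLE HOOK"),
    "06/03": (0x0517, "CYRILLIC SMALL LETTER RHA"),
    "06/08": (0x051D, "CYRILLIC SMALL LETTER WE"),
    "07/03": (0x0516, "CYRILLIC CAPITAL LETTER RHA"),
    "07/08": (0x051C, "CYRILLIC CAPITAL LETTER WE"),
}


def check(table):
    """Return the positions whose code point does not carry the listed name."""
    mismatches = []
    for position, (code_point, name) in table.items():
        # The second argument is returned for unassigned code points.
        if unicodedata.name(chr(code_point), "") != name:
            mismatches.append(position)
    return mismatches


# An empty list means every listed name matches the local character database.
print(check(DISUNIFIED))
```

The Selkup OE pair has no entry of its own here because, per N2463, it is still unified with U+0152/U+0153.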
Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)
On 5 March 2012 at 19:35, Denis Jacquerye moy...@gmail.com wrote: On Tue, Feb 28, 2012 at 4:00 AM, Philippe Verdy verd...@wanadoo.fr wrote: I am looking for the codes or assignment status of the Cyrillic letter OE/oe (ligatured) as used in Selkup (exactly similar to the Latin pair). This character pair has been part of the registration nr. 223 (in 1998) by ISO of the (8-bit) extended Cyrillic character set for non-Slavic languages for bibliographic information interchange: http://www.itscj.ipsj.or.jp/sc2/open/02n3136.pdf According to this document, this character set had also been standardized as ISO 10756:1996. Note that it contains many other characters for which it did not document any mapping to the UCS in the then emerging ISO 10646 standard. It has even been part of proposals at the UTC and ISO the same year for inclusion in the UCS, along with other characters (at that time, Michael Everson wrote a proposal placing them at U+04EC and U+04ED, but since then the slots have been used for other characters, and that block is now full). It is also referenced in the ISO 9 Cyrillic/Latin transliteration standard. Still, there's no such Cyrillic character I can find in the encoded UCS in the other Cyrillic extended blocks that are not full (for example, the CYRILLIC SUPPLEMENT block at U+0500-052F). Where are those characters? And what about the remaining characters found in Registration nr. 223 and ISO 10756:1996? And their status in the ISO 9 standard itself? Thanks. -- Philippe. According to ftp://std.dkuug.dk/jtc1/sc2/WG2/docs/n2463.doc the Cyrillic Selkup OE is mapped to Latin OE: CYRILLIC SMALL LETTER SELKUP O E to U+0153 LATIN SMALL LIGATURE OE; CYRILLIC CAPITAL LETTER SELKUP O E to U+0152 LATIN CAPITAL LIGATURE OE. Several other of those missing Cyrillic characters are simply mapped to Latin ones or sort of decomposed. Apparently this document is obsolete.
Some of the characters it proposed mapping to Latin have since been encoded as plain Cyrillic letters, such as CYRILLIC SMALL LETTER KURDISH QA (not the initially proposed mapping to LATIN SMALL LETTER Q). This document was still a draft, and not a decision. The document specifically says: The issue with these letters is whether they should be deunified from Latin, and encoded in the Cyrillic block.
Re: CYRILLIC SMALL/CAPITAL LETTER SELKUP OE (ISO 10756:1996)
On 5 Mar 2012, at 20:13, Benjamin M Scarborough wrote: There is a clear precedent here that the unifications of N2463 are not necessarily the final fate of any of these characters. If the О Е letter for Selkup should be disunified from U+0152/U+0153, then a proposal needs to be submitted calling for the addition of the two letters to the UCS. Have you got examples, Ben? Michael Everson * http://www.evertype.com/
Re: Cyrillic character mapping tables, HP MSL to Unicode
Did you read the last PDF, notably as it says the following about Table D-3: [quote] D - MSL/Unicode Symbol Indexes Introduction Table D-1, the Master Symbol List, lists all of the characters available for the printers and their MSL index numbers. Table D-2, shows the characters contained in the MSL symbol collections. Table D-3, the Unicode Symbol List, lists all of the characters available for the printers and identifies their unicode index number. Table D-4 shows the characters contained in the unicode symbol collections. [/quote] Well, I misread the description myself, confused by the title of the section, and it's true that only *some* MSL indices are identical to the Unicode code points. It's a shame that one has to compute the conversion by looking at the glyphs and names given by HP, which do not correspond to Unicode names. It would have been simpler if HP, in the 1999 release of its book, had referenced the Unicode code points in Table D-1 and used the official Unicode names (additionally, Table D-3 should have listed the MSL index as a reverse index, and used hexadecimal U+ notation rather than decimal code points). But joining D-1 and D-3 is possible, and allows creating the conversion table from MSL to Unicode. Philippe. Unsolicited messages (spam) are not tolerated. Any abuse will be reported automatically to your service providers. - Original Message - From: Neil J Geddes [EMAIL PROTECTED] To: Philippe Verdy [EMAIL PROTECTED] Sent: Wednesday, September 03, 2003 9:40 AM Subject: RE: Cyrillic character mapping tables, HP MSL to Unicode Hello Philippe, Thank you very much for your messages and for taking the time to respond. I appreciate this. I had already checked most of these resources (like you I have the older paper manuals); however, none provide symbol charts for the Cyrillic character sets. I think I really need to locate TFM files if available.
MSL isn't the same as Unicode; however, I have found an MSL - CG table which should help me. Thanks again, Neil -Original Message- From: Philippe Verdy [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 03, 2003 1:42 AM To: Neil J Geddes; [EMAIL PROTECTED] Subject: Re: Cyrillic character mapping tables, HP MSL to Unicode More precisely, try this file: http://h27.www2.hp.com/bc/docs/support/SupportManual/bpl13206/bpl13206.pdf which contains all the symbol set charts and cross-references with the MSL/Unicode code and their assignment in other subsets. It is referred to within the downloadable reference CDROM for the PCL language. The MSL index seems to be the Unicode code point, so the MSL is merely a subset of Unicode, as used in the HP implementation of the HP PCL - GL/2 symbol sets and fonts. Philippe. Unsolicited messages (spam) are not tolerated. Any abuse will be reported automatically to your service providers.
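The join of Tables D-1 and D-3 that Philippe describes could be sketched roughly as follows. Everything in it is an assumption made for illustration: the function name, the idea that both tables have been transcribed by hand into (index, symbol name) pairs whose name strings agree, and above all the sample rows, which are placeholders and not real MSL or Unicode values from the HP manual.

```python
def join_msl_to_unicode(table_d1, table_d3):
    """Join two (index, symbol_name) tables on the symbol name.

    table_d1: (msl_index, name) pairs, as transcribed from Table D-1.
    table_d3: (unicode_decimal, name) pairs, as transcribed from Table D-3.
    Returns (mapping, unmatched): a dict from MSL index to code point, and
    the D-1 rows for which no D-3 row with the same name was found.
    """
    # Normalise names so that case and stray spaces do not block a match.
    by_name = {name.strip().lower(): code_point for code_point, name in table_d3}
    mapping, unmatched = {}, []
    for msl_index, name in table_d1:
        key = name.strip().lower()
        if key in by_name:
            mapping[msl_index] = by_name[key]
        else:
            unmatched.append((msl_index, name))
    return mapping, unmatched


# Placeholder rows only; these are NOT values taken from the HP tables.
d1 = [(9001, "Example Symbol One"), (9002, "Example Symbol Two"), (9003, "Only In D-1")]
d3 = [(1040, "example symbol one"), (1041, "Example Symbol Two ")]

mapping, unmatched = join_msl_to_unicode(d1, d3)
for msl_index, code_point in sorted(mapping.items()):
    # Table D-3 gives decimal values; print them in the usual U+ hexadecimal form.
    print(f"MSL {msl_index} -> U+{code_point:04X}")
print("needs manual review:", unmatched)
```

Rows that end up in `unmatched` are the ones that would still have to be resolved by comparing glyphs by eye, as the message notes.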
RE: Cyrillic character mapping tables, HP MSL to Unicode
Thanks Philippe. What I really need now is access to additional Euro-Asian HP TFM files. Regards, Neil -Original Message- From: Philippe Verdy [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 03, 2003 9:57 AM To: Neil J Geddes Cc: [EMAIL PROTECTED] Subject: Re: Cyrillic character mapping tables, HP MSL to Unicode Did you read the last PDF, notably as it says the following about Table D-3: [quote] D - MSL/Unicode Symbol Indexes Introduction Table D-1, the Master Symbol List, lists all of the characters available for the printers and their MSL index numbers. Table D-2, shows the characters contained in the MSL symbol collections. Table D-3, the Unicode Symbol List, lists all of the characters available for the printers and identifies their unicode index number. Table D-4 shows the characters contained in the unicode symbol collections. [/quote] Well, I misread the description myself, confused by the title of the section, and it's true that only *some* MSL indices are identical to the Unicode code points. It's a shame that one has to compute the conversion by looking at the glyphs and names given by HP, which do not correspond to Unicode names. It would have been simpler if HP, in the 1999 release of its book, had referenced the Unicode code points in Table D-1 and used the official Unicode names (additionally, Table D-3 should have listed the MSL index as a reverse index, and used hexadecimal U+ notation rather than decimal code points). But joining D-1 and D-3 is possible, and allows creating the conversion table from MSL to Unicode. Philippe. Unsolicited messages (spam) are not tolerated. Any abuse will be reported automatically to your service providers.
- Original Message - From: Neil J Geddes [EMAIL PROTECTED] To: Philippe Verdy [EMAIL PROTECTED] Sent: Wednesday, September 03, 2003 9:40 AM Subject: RE: Cyrillic character mapping tables, HP MSL to Unicode Hello Philippe, Thank you very much for your messages and for taking the time to respond. I appreciate this. I had already checked most of these resources (like you I have the older paper manuals); however, none provide symbol charts for the Cyrillic character sets. I think I really need to locate TFM files if available. MSL isn't the same as Unicode; however, I have found an MSL - CG table which should help me. Thanks again, Neil -Original Message- From: Philippe Verdy [mailto:[EMAIL PROTECTED] Sent: Wednesday, September 03, 2003 1:42 AM To: Neil J Geddes; [EMAIL PROTECTED] Subject: Re: Cyrillic character mapping tables, HP MSL to Unicode More precisely, try this file: http://h27.www2.hp.com/bc/docs/support/SupportManual/bpl13206/bpl13206.pdf which contains all the symbol set charts and cross-references with the MSL/Unicode code and their assignment in other subsets. It is referred to within the downloadable reference CDROM for the PCL language. The MSL index seems to be the Unicode code point, so the MSL is merely a subset of Unicode, as used in the HP implementation of the HP PCL - GL/2 symbol sets and fonts. Philippe. Unsolicited messages (spam) are not tolerated. Any abuse will be reported automatically to your service providers.
Re: Cyrillic character mapping tables, HP MSL to Unicode
First start with this page: http://www.hp.com/cposupport/printers/support_doc/bpl04568.html You may want to buy this: Refer to the HP PCL5 Technical Reference Bundle. To order, call HP's driver/software distribution at 661-257-5565. The part number is 5961-0976. You may also look at: http://www.hp.com/cposupport/printers/support_doc/bpl02705.html and refer to this: For further information about PCL commands, HP-GL/2, macros, or PJL commands, use the Technical Reference Manual set, part number 5021-0377. Order the manual set from HP's Support Materials Organization. Or you may download this: http://h2.www2.hp.com/bc/docs/support/SupportManual/bpl13210/bpl13210.pdf PCL 5 Printer Language Technical Reference Manual - ENWW - HP Part No. 5961-0509. Printed in USA. First Edition - October 1992. I have the same book, but dated September 1990 (this was really the first edition), HP part number 33459-90903. Also: http://h2.www2.hp.com/bizsupport/TechSupport/SoftwareDescription.jsp?locBasepartNum=5961-0976lang=English%20%28US%29 HP PCL Tech Reference Manual CD-ROM - The HP PCL Tech Reference Bundle CD-ROM includes the Technical Quick Reference Guide, Printer Job Language Technical Reference Manual, PCL 5 Color Technical Reference Manual, and PCL 5 Printer Language Technical Reference Manual. In English, in PDF format. Philippe. Unsolicited messages (spam) are not tolerated. Any abuse will be reported automatically to your service providers. - Original Message - From: Neil J Geddes [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, August 28, 2003 2:23 PM Subject: Cyrillic character mapping tables, HP MSL to Unicode Hello, I'm looking for symbol set and character metric information for the two Hewlett-Packard symbol sets 3R (PC Cyrillic) and 9R (Windows 3.1 Latin/Cyrillic). Specifically I'm after:
1) .TFM files for Univers, CG Times, Courier and other common typefaces that use Cyrillic.
2) A cross mapping table for HP MSL (Master Symbol List) to Unicode.
Thanks for any help you can offer. It's appreciated! Best regards, Neil Geddes [EMAIL PROTECTED]
Re: Cyrillic character mapping tables, HP MSL to Unicode
More precisely, try this file: http://h27.www2.hp.com/bc/docs/support/SupportManual/bpl13206/bpl13206.pdf which contains all the symbol set charts and cross-references with the MSL/Unicode code and their assignment in other subsets. It is referred to within the downloadable reference CDROM for the PCL language. The MSL index seems to be the Unicode code point, so the MSL is merely a subset of Unicode, as used in the HP implementation of the HP PCL - GL/2 symbol sets and fonts. Philippe. Unsolicited messages (spam) are not tolerated. Any abuse will be reported automatically to your service providers. - Original Message - From: Neil J Geddes [EMAIL PROTECTED] To: [EMAIL PROTECTED] Sent: Thursday, August 28, 2003 2:23 PM Subject: Cyrillic character mapping tables, HP MSL to Unicode Hello, I'm looking for symbol set and character metric information for the two Hewlett-Packard symbol sets 3R (PC Cyrillic) and 9R (Windows 3.1 Latin/Cyrillic). Specifically I'm after: 1) .TFM files for Univers, CG Times, Courier and other common typefaces that use Cyrillic. 2) A cross mapping table for HP MSL (Master Symbol List) to Unicode. Thanks for any help you can offer. It's appreciated!
Re: Cyrillic Q
On Thu, 27 Sep 2001, Marco Cimarosti wrote: A lot of time ago, someone on this list mentioned a language, written in the Cyrillic alphabet, which employed letter Q, taken from the Latin alphabet. Which language is it? IIRC, it was Kurdish. roozbeh
Re: Cyrillic Q
At 02:48 9/27/2001, Marco Cimarosti wrote: A lot of time ago, someone on this list mentioned a language, written in the Cyrillic alphabet, which employed letter Q, taken from the Latin alphabet. Which language is it? Kurdish. The common Cyrillic orthography includes four Latin letterforms that are, as far as I know, unique to Kurdish: U+0051, U+0071 Capital, Small Q; U+0057, U+0077 Capital, Small W. John Hudson Tiro Typeworks www.tiro.com Vancouver, BC [EMAIL PROTECTED] Type is something that you can pick up and hold in your hand. - Harry Carter
Re: Cyrillic Q
On Thu, 27 Sep 2001, John Hudson wrote: At 02:48 9/27/2001, Marco Cimarosti wrote: A lot of time ago, someone on this list mentioned a language, written in the Cyrillic alphabet, which employed letter Q, taken from the Latin alphabet. Which language is it? Kurdish. The common Cyrillic orthography includes four Latin letterforms that are, as far as I know, unique to Kurdish: U+0051, U+0071 Capital, Small Q; U+0057, U+0077 Capital, Small W. John Hudson Tiro Typeworks www.tiro.com Vancouver, BC [EMAIL PROTECTED] Type is something that you can pick up and hold in your hand. - Harry Carter Thursday, September 27, 2001 Besides Kurdish, the section on transliteration of non-Slavic languages using Cyrillic in the ALA-LC romanization tables (1997) shows Q used with four other languages: Aisor, Chechen (the 1862 and 1908 orthographies but not the 1938 one), Dargwa (Uslar) and Lak (1864 but not 1938). For Kurdish, Q also seems to have an alternative glyph that appears as O followed by a vertical bar, which is also used with Lezghian (Uslar). Regards, Jim Agenbroad ( [EMAIL PROTECTED] ) The above are purely personal opinions, not necessarily the official views of any government or any agency of any. Phone: 202 707-9612; Fax: 202 707-0955; US mail: I.T.S. Dev.Gp.4, Library of Congress, 101 Independence Ave. SE, Washington, D.C. 20540-9334 U.S.A.
RE: Cyrillic -
Aleksandar Poposki [mailto:[EMAIL PROTECTED]] asked: where could I obtain true-type fonts for Unicode. You can find a list of fonts that include the Unicode Cyrillic range of characters at: http://www.hclrss.demon.co.uk/unicode/cyrillic.html You can find information about obtaining those fonts at: http://www.hclrss.demon.co.uk/unicode/fonts.html However, you probably don't need to worry about obtaining special fonts. Unicode Cyrillic characters are included in recent versions of Arial, Courier New and Times New Roman, and so many Windows users can already display them. Macintosh users with Mac OS 9 can install the Cyrillic language kit from their OS CD-ROM, and this enables recent Web browsers to display Unicode Cyrillic as well as other encodings. Fingertip Software Inc. produces Character Set Converter, which runs under Windows and can convert Unicode Cyrillic to and from various Windows, Macintosh and DOS Cyrillic character sets: http://www.fingertipsoft.com/csconv/brochure.html Alan Wood Documentation Writer / Web Master Context Limited (http://www.context.co.uk/) mailto:[EMAIL PROTECTED] http://www.alanwood.net/ (Unicode, special characters, pesticide names)
Re: Cyrillic -
Hi, I have looked at your web site. If I am not mistaken, you are using a codepage that is commonly referred to as Cyrillic YUSCII. This makes the page almost unusable except for people who have the 'Pulshelvetika7' font installed. As you have correctly assumed, the best thing would be to convert the page to Unicode (although you could also convert it to Microsoft cp1251 or ISO 8859-5). You will not lose your pages - just work on copies of them. One possible way would be, as Markus already mentioned, to use the ICU converter framework - but you would have to make a converter table. There is also a set of macros for Word that handles ex-YU codepage conversions, which can be found at http://solair.eunet.yu/~minya/ Once you have converted your text to Unicode, you should add information to your page about the encoding used. Most modern browsers should be able to swallow and correctly display such a page. Should you have more questions, please contact me directly. Hope this helps, -- Vladimir Weinstein Software Engineer, Unicode Technology Group IBM JTC Cupertino 408-777-5844 (t/l 240-5844)
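The "converter table" approach mentioned in this message amounts to a character-by-character lookup. Here is a minimal sketch in plain Python rather than ICU. The table is a made-up fragment for illustration only; it is not the real YUSCII layout or the layout of the font used on the site, and a real table would have to be built from that font's actual glyph assignments.

```python
# Hypothetical fragment of a legacy-font-to-Unicode table (illustration only).
LEGACY_TO_UNICODE = {
    "A": "\u0410",  # CYRILLIC CAPITAL LETTER A
    "B": "\u0411",  # CYRILLIC CAPITAL LETTER BE
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "b": "\u0431",  # CYRILLIC SMALL LETTER BE
}


def convert(text, table=LEGACY_TO_UNICODE):
    """Replace every character found in the table; leave all others untouched."""
    return text.translate(str.maketrans(table))


legacy_text = "Baba"
unicode_text = convert(legacy_text)
utf8_bytes = unicode_text.encode("utf-8")  # the form that would be saved and served
print(unicode_text, utf8_bytes)
```

Applied blindly to a whole HTML file, a lookup like this would also rewrite tag and attribute names, so it should be run only on the text that was set in the legacy font, and on copies of the pages as the message advises.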
Re: Cyrillic -
hello, for fonts etc. have a look at http://www.unicode.org/unicode/onlinedat/resources.html for converting your pages to unicode, you would need some library or operating system api to do so. there are plenty around, but you would have to find out exactly what is the encoding of your pages. if you cannot find built-in support, then you might need to add a mapping table to one of the libraries' conversion services. for such libraries see http://www.unicode.org/unicode/onlinedat/products.html#3 i am working with the icu library that you find linked there. with icu for example, you can add a mapping table to the library. best regards, markus
RE: Cyrillic -
Aleks, The reason to use Unicode is more fundamental than fonts. I assume that your church members and others interested in your sites will have different systems. Those with Cyrillic fonts will prefer Cyrillic text. Using Unicode you can encode your entire website in one encoding, mixing both Latin and Cyrillic text. What you need is an editor that can save the Cyrillic text as Unicode in UTF-8 form. This is the same form that you will send to the browser. This way both Latin and Cyrillic text will be the same to the Web server. Make sure that the HTTP header and the charset meta tag both specify utf-8. The browser will handle both Latin and Cyrillic the same as well. All it will need to display Cyrillic is a Cyrillic font. Windows and Mac users can install IE 5.0 and select the pan-European support. The Windows and Mac fonts can be downloaded from: http://www.microsoft.com/truetype/fontpack/win.htm TrueType is a Unicode encoded font that can be used in non-Unicode applications as well. Good luck Carl -Original Message- From: Magda Danish (Unicode) [mailto:[EMAIL PROTECTED]] Sent: Friday, September 29, 2000 12:05 PM To: Unicode List Subject: Cyrillic - -Original Message- From: Aleksandar Poposki [mailto:[EMAIL PROTECTED]] Sent: Thursday, September 28, 2000 4:04 PM To: [EMAIL PROTECTED] Subject: Your opinion Hello. I'm the Webmaster of the Macedonian Orthodox Church website located at www.m-p-c.org. When I started this project I was not very familiar with Unicode and used home-made fonts for Cyrillic characters, but learning about Unicode, I see it is the best way to go, as it is the International standard. Keeping this in mind, and other difficulties I've had, I wish to ask: Is there a way to convert my work to Unicode w/o risk? I was wondering about writing a program to search my document for a character and, once found, replace it with the Unicode character number.
Is there a script available for me to add to my web page so that, if the user doesn't have Multi-Lingual Cyrillic support, it is installed automatically? And where could I obtain TrueType fonts for Unicode? Also, is there a script as in my previous question for TrueType fonts? Aleks
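Carl's advice in this message (save the page as UTF-8 and declare utf-8 in both the HTTP header and the charset meta tag) can be illustrated with a short sketch. The function name `build_page`, the page skeleton and the sample strings are assumptions for illustration; how the HTTP header is actually set depends on the web server and is not shown.

```python
from html import escape

# Header value the web server would need to send alongside the page.
HTTP_HEADER = "Content-Type: text/html; charset=utf-8"


def build_page(title, body_text):
    """Return a small HTML page, declared as UTF-8 and encoded as UTF-8 bytes."""
    html = (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        "</head>\n<body>\n"
        f"<p>{escape(body_text)}</p>\n"
        "</body>\n</html>\n"
    )
    return html.encode("utf-8")


# Latin and Cyrillic text mixed in one page, one encoding.
sample = "Latin text and \u043a\u0438\u0440\u0438\u043b\u0438\u0446\u0430"
page = build_page("Example", sample)
print(HTTP_HEADER)
print(len(page), "bytes")
```

Because the Cyrillic letters take two bytes each in UTF-8, the byte length of the page is larger than its character count, while the Latin text and the markup are byte-for-byte the same as in ASCII.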
Re: Cyrillic -
-Original Message- From: Aleksandar Poposki [mailto:[EMAIL PROTECTED]] Sent: Thursday, September 28, 2000 4:04 PM To: [EMAIL PROTECTED] Subject: Your opinion I'm the Webmaster of the Macedonian Orthodox Church website located at www.m-p-c.org. When I started this project I was not very familiar with Unicode and used 'home-made' fonts for Cyrillic characters, but learning about Unicode, I see it is the best way to go, as it is the International standard. Keeping this in mind, and other difficulties I've had, I wish to ask: Do you plan to have Old Church Slavonic (OCS) in your pages? Unicode lacks support for "letter titlo" (i.e. titlo with a letter) used quite productively in OCS (in Russia at least), so you can't use Unicode to write "The Lord" (with "slovo-titlo") or "The Gospel" (with "glagol-titlo"). SY, Uwe -- [EMAIL PROTECTED] | Zu Grunde kommen http://www.ptc.spbu.ru/~uwe/| Ist zu Grunde gehen
Re: Cyrillic -
At 13:44 -0800 2000-09-29, Valeriy E. Ushakov wrote: Unicode lacks support for "letter titlo" (i.e. titlo with a letter) used quite productively in OCS (in Russia at least), so you can't use Unicode to write "The Lord" (with "slovo-titlo") or "The Gospel" (with "glagol-titlo"). Not true. See U+0483 COMBINING CYRILLIC TITLO. Cyrillic fans will be delighted to learn that 16 Komi Cyrillic characters used from 1919-1940 have been accepted for processing into the standard (a document describing these is available on my web site). Also the three columns between the Cyrillic block and the Armenian block have been dedicated to extensions, called "Cyrillic Supplementary". Michael Everson ** Everson Gunn Teoranta ** http://www.egt.ie 15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland Vox +353 1 478 2597 ** Fax +353 1 478 2597 ** Mob +353 86 807 9169 27 Páirc an Fhéithlinn; Baile an Bhóthair; Co. Átha Cliath; Éire
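As a small illustration of the character Michael points to, the sketch below builds a base letter followed by U+0483 and prints what the local character database says about each code point. The helper name `with_titlo` and the choice of base letter are illustrative assumptions. Note that this shows only the plain titlo mark; it does not by itself address the "titlo with a letter" that Valeriy describes.

```python
import unicodedata

TITLO = "\u0483"  # COMBINING CYRILLIC TITLO


def with_titlo(letter):
    """Return the letter followed by U+0483, which renders above the letter."""
    return letter + TITLO


seq = with_titlo("\u0433")  # CYRILLIC SMALL LETTER GHE as an arbitrary base
for ch in seq:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} "
          f"(combining class {unicodedata.combining(ch)})")
```

The sequence stays two code points long under normalization, since there is no precomposed letter-plus-titlo character for it to compose into.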
Re: Cyrillic -
On Fri, Sep 29, 2000 at 15:55:41 -0800, John Cowan wrote: What is genuinely missing is IOTIFIED A. Because LITTLE YUS and IOTIFIED A fell together in Russian as /ja/, Peter eliminated the latter and adopted a modified form of LITTLE YUS, now CYRILLIC LETTER YA. But aren't IOTIFIED A and YA just glyph variants (with LITTLE YUS lacking a parallel glyph in Peter's civil alphabet, merging with YA instead)? Historically YA is a glyph variant of LITTLE YUS, not of IOTIFIED A, I am told. So given that we have already encoded YA and LITTLE YUS (unavoidable, really, considering how different they look), IOTIFIED A has no representation. My rather limited understanding is that at that time the two letters, LITTLE YUS and IOTIFIED A, were no longer denoting distinct sounds and were used more or less interchangeably (i.e. they were more or less glyph variants by that time), and so Peter merged them into one letter YA, with a glyph for it being based on a glyph for LITTLE YUS. In other words, iotified a (ya) survived in Peter's secular Russian alphabet as a character but lost its Slavonic glyph, while little yus disappeared as a character but its glyph survived in the new alphabet. Thus Peter's YA is *character* YA (== iotified a) with a glyph based on a glyph for little yus. But the important point here is that the "old" alphabet and the "new" alphabet were "disjoint". With regard to Russian they are disjoint in time. With regard to Slavonic, the new alphabet was "secular Russian", while the old one was "Church Slavonic", and the two never really mixed. The "typeface" aspect is important too: writing one of the languages in the other's typeface is clearly perceived as either a visual pun or transliteration. So, in theory, you'll never find *glyph* YA (reversed R) and *glyph* IOTIFIED A (i-a) in one homogeneous text, as this is made impossible by either synchronic or diachronic constraints.
So it seems that for Slavonic one should use LITTLE YUS to encode little yus and YA to encode iotified a (which my grammar book of Slavonic calls just "ya"). For Russian there's no LITTLE YUS, and character YA is used to encode ya. Of course it's still possible to develop a typeface with all three glyphs (little yus, iotified a, ya) in it and use OpenType to choose the correct one. This is not dissimilar to, say, mixed Serbian and Russian cursive text with different glyphs for certain characters. (And the latter has already been discussed to death on this list.) All this, of course, is Russian-centric. I don't know how things developed in other Slavic languages, especially in the South Slavic languages, which are closer to Church Slavonic (also southern in origin) than the East Slavic Russian is. PS: Sorry if this sounds a little confusing - 6am is not the best time for writing, from memory, short essays on the history of the Cyrillic alphabet in Russia. SY, Uwe -- [EMAIL PROTECTED] | Zu Grunde kommen http://www.ptc.spbu.ru/~uwe/| Ist zu Grunde gehen