Re: interaction of Arabic ligatures with vowel marks

2013-06-12 Thread Andreas Prilop
On Tue, 11 Jun 2013, Stephan Stiller wrote: How is the placement of vowel marks around ligatures handled in Arabic text? I'm also wondering how font designers normally handle this. Older fonts in older operating systems (like Windows XP) often failed. See

Re: interaction of Arabic ligatures with vowel marks

2013-06-12 Thread Andreas Prilop
On Wed, 12 Jun 2013, Richard Wordingham wrote: While the same principle applies to Indic scripts (and indeed, to the Roman alphabet), there is only one Indic mark I can think of for which the issue of component association arises, and that is the nukta. Sanskrit requires candrabindu U+0901

Re: Word reversal from Adobe to Word

2013-02-11 Thread Andreas Prilop
On Thu, 7 Feb 2013, Raymond Mercier wrote: I am using the full commercial Adobe Acrobat version 6, running on XP. If there is more than one word, the order of words IS correct, but the order of characters in each word is reversed. (I don't know about your program.) You can find out how

Connecting overline and Connecting underline

2012-11-16 Thread Andreas Prilop
U+0305 Combining overline U+0332 Combining low line should both connect on left and right. Which software (program and font) actually does this when you overline/underline gh? Test at http://www.user.uni-hannover.de/nhtcapri/combining-marks.html
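
For anyone who wants to reproduce the test string, here is a minimal sketch that builds an overlined and an underlined "gh" by placing the combining mark after each base letter. The helper name `decorate` is made up for illustration; whether the marks actually connect depends on the font and renderer, which this code cannot show.

```python
import unicodedata

COMBINING_OVERLINE = "\u0305"
COMBINING_LOW_LINE = "\u0332"

def decorate(text: str, mark: str) -> str:
    """Append the combining mark after every character of text (hypothetical helper)."""
    return "".join(ch + mark for ch in text)

overlined = decorate("gh", COMBINING_OVERLINE)
underlined = decorate("gh", COMBINING_LOW_LINE)

for label, s in (("overline", overlined), ("underline", underlined)):
    # List the code points so the sequence can be checked independently of rendering.
    print(label, [f"U+{ord(c):04X}" for c in s])

# Character names and canonical combining classes of the two marks.
for mark in (COMBINING_OVERLINE, COMBINING_LOW_LINE):
    print(unicodedata.name(mark), unicodedata.combining(mark))
```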

Re: texteditors that can process and save in different encodings

2012-10-19 Thread Andreas Prilop
On Tue, 16 Oct 2012, Jukka K. Korpela wrote: ... BabelPad ... But not even ISO-8859-1. The schoolboys working for Google are also too dumb to process charset=ISO-8859-1 correctly. For example, see http://groups.google.com/group/sfnet.huuhaa/msg/467be25522963c61dmode=source

Re: texteditors that can process and save in different encodings

2012-10-19 Thread Andreas Prilop
On Fri, 19 Oct 2012, I wrote: http://groups.google.com/group/sfnet.huuhaa/msg/467be25522963c61dmode=source Correction: http://groups.google.com/group/sfnet.huuhaa/msg/467be25522963c61?dmode=source

Re: problem with combining diacritics in HTML5

2012-10-09 Thread Andreas Prilop
On Sat, 6 Oct 2012, Bill Poser wrote: Characters with a combining low line encoded as a single Unicode codepoint are rendered correctly. Thus 's' followed by U+0332 is rendered as 's' followed by a low line, but U+1E95 LATIN SMALL LETTER Z WITH LINE BELOW is correctly rendered with the
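
A small sketch of the difference between the two cases mentioned above, using Python's standard `unicodedata` module. It only inspects the character data; it says nothing about how a given browser or font draws the result. Note that U+1E95 decomposes to z plus U+0331 (macron below), not U+0332 (low line).

```python
import unicodedata

s_low_line = "s\u0332"      # 's' followed by COMBINING LOW LINE
z_line_below = "\u1E95"     # LATIN SMALL LETTER Z WITH LINE BELOW

# There is no precomposed "s with low line", so NFC keeps two code points.
nfc_s = unicodedata.normalize("NFC", s_low_line)

# U+1E95 has a canonical decomposition, so NFD splits it into base + mark.
nfd_z = unicodedata.normalize("NFD", z_line_below)

print(len(nfc_s), [f"U+{ord(c):04X}" for c in nfc_s])
print(unicodedata.decomposition(z_line_below))
print([unicodedata.name(c) for c in nfd_z])
```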

Re: problem with combining diacritics in HTML5

2012-10-08 Thread Andreas Prilop
On Sun, 7 Oct 2012, Leonardo Boiko wrote: Inspecting the Courier New font, version 5.11, I noticed that the advance width of the glyph for U+0332 (glyph uni0331) is 1129 units. I think this explains it all. The advance width should be 0. And other fonts have the same problem, at least the

Re: problem with combining diacritics in HTML5

2012-10-08 Thread Andreas Prilop
On Mon, 8 Oct 2012, Jukka K. Korpela wrote: http://www.user.uni-hannover.de/nhtcapri/combining-marks.html Your test page is interesting, but it postulates the use of style sheet switching, You are always free to define your preferred font family in your browser’s preferences, no? You may

Re: Compiling a list of Semitic transliteration characters

2012-09-06 Thread Andreas Prilop
On Wed, 5 Sep 2012, Petr Tomasek wrote: Well, isn't Romanization a special case of transliteration? Romanization of Chinese is certainly not a transliteration. This holds for other scripts listed under http://www.loc.gov/catdir/cpso/roman.html as well.

Re: Why no combining-character form for U+00F8?

2012-08-17 Thread Andreas Prilop
On Fri, 17 Aug 2012, Jukka K. Korpela wrote: There is an essential difference between using combining mark and using a precomposed character: ... In searches, for example, they do not match. At least in Google, they match:
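
Whether a search matches the two spellings depends on whether the software normalizes before comparing. The sketch below is only an illustration of that idea (the function `same_text` is hypothetical, and nothing here describes what Google actually does). It also shows why U+00F8 is a special case: it has no canonical decomposition, so normalization never equates it with o plus a combining mark.

```python
import unicodedata

def same_text(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization (illustrative helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

precomposed = "\u00E9"      # e with acute, one code point
combining = "e\u0301"       # e followed by COMBINING ACUTE ACCENT

raw_equal = precomposed == combining                # code point comparison
normalized_equal = same_text(precomposed, combining)

o_slash = "\u00F8"          # LATIN SMALL LETTER O WITH STROKE
o_plus_overlay = "o\u0338"  # o followed by COMBINING LONG SOLIDUS OVERLAY

has_decomposition = unicodedata.decomposition(o_slash) != ""
o_equal = same_text(o_slash, o_plus_overlay)

print(raw_equal, normalized_equal, has_decomposition, o_equal)
```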

Re: Why no combining-character form for U+00F8?

2012-08-17 Thread Andreas Prilop
On Fri, 17 Aug 2012, Michael Everson wrote: http://www.user.uni-hannover.de/nhtcapri/combining-marks.html To change fonts quickly, choose among different style sheets in your browser: How? I'm using Safari. If Safari doesn't let you select alternate stylesheets then you can't change fonts

Re: Mayan numerals

2012-08-16 Thread Andreas Prilop
On Wed, 15 Aug 2012, Jameson Quinn wrote: ... I'd like to see at least 20 glyphs for the (horizontal-barred) numerals. ... Do others agree that it's needed? Certainly not. Mayan numerals will disappear after 21 December 2012.

Re: Apostrophe, and DIN keyboard

2012-08-14 Thread Andreas Prilop
On Mon, 13 Aug 2012, Otto Stolz wrote: http://www.machsmit.de/media/mainteaser/header-ichwillserleben.png http://www.machsmit.de/kampagne/printmedien.php show what the braindead German DIN keyboard layout has done to the apostrophe (’): Killed by the acute accent (´). Andreas’ example does

Re: U+25CA LOZENGE - why is it in the Mac OS Roman character set (and therefore widespread in current fonts)?

2012-08-13 Thread Andreas Prilop
On Mon, 13 Aug 2012, Karl Pentzlin wrote: The problem I am confronted with is that this character shares its German name Raute with the # I learnt in 7th grade what “Raute” means. “#” is not a Raute. The center field of “#” is called Raute or Rhombus. BTW, Herr Pentzlin:

Small i with/out dot and with arrow

2012-08-01 Thread Andreas Prilop
Is it correct that   U+0069 U+20D7   U+006A U+20D7 should have a dot and that   U+0131 U+20D7   U+0237 U+20D7   U+006B U+20D7 should have no dot?

Re: Small i with/out dot and with arrow

2012-08-01 Thread Andreas Prilop
On Wed, 1 Aug 2012, Kent Karlsson wrote: Not sure why you include k here (which has no dot any which way)... Just a little hint because my question might look too strange. i, j, k with arrow are used in mathematics and physics to denote the vectors (1,0,0) , (0,1,0) and (0,0,1) . Sometimes I

Bengali conjuncts with U+09A4 (Ta)

2012-07-05 Thread Andreas Prilop
To obtain the Bengali conjunct (ligature) tka, I write ta   virama   ka   U+09A4 U+09CD U+0995 This worked fine in Windows XP but it no longer works with the fonts Shonar Bangla and Vrinda in Windows 7. Is there an explanation?
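
For reference, the sequence in question can be inspected with Python's `unicodedata` module. This is only a sketch that confirms what is being sent to the renderer; it cannot explain why a particular font forms or fails to form the conjunct.

```python
import unicodedata

# ta + virama + ka, the sequence given in the message above
TKA = "\u09A4\u09CD\u0995"

rows = [
    (f"U+{ord(ch):04X}", unicodedata.name(ch),
     unicodedata.category(ch), unicodedata.combining(ch))
    for ch in TKA
]
for row in rows:
    print(*row)
```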

Re: Unicode 6.2 to Support the Turkish Lira Sign

2012-05-22 Thread Andreas Prilop
On Sun, 20 May 2012, Michael Everson wrote: - kh with *continuous* underline (romanization of U+0959) ? No. Whose romanization is that? http://www.loc.gov/catdir/cpso/romanization/hindi.pdf http://homepage.ntlworld.com/stone-catend/trimain3.htm

Re: Unicode 6.2 to Support the Turkish Lira Sign

2012-05-20 Thread Andreas Prilop
On Sat, 19 May 2012, Michael Everson wrote: The free Rupakara font, which was introduced to support the INDIAN RUPEE SIGN when it was accepted for encoding, has been updated to include the TURKISH LIRA SIGN. See http://evertype.com/fonts/rupakara/ I don't see here: - n with tilde, U+091E -

Re: Unicode 6.2 to Support the Turkish Lira Sign

2012-05-20 Thread Andreas Prilop
On Sun, 20 May 2012, Michael Everson wrote: I do not understand what it is you are after. I meant: Does your font include - n with tilde (romanization of U+091E) - kh with *continuous* underline (romanization of U+0959) ?

Re: Unicode 6.2 to Support the Turkish Lira Sign

2012-05-19 Thread Andreas Prilop
On Tue, 15 May 2012, announceme...@unicode.org wrote: Recognizing the urgent need to support the new currency symbol in information systems, the Unicode Consortium has scheduled its next release, Unicode 6.2, for the third quarter of 2012. That release will include the new character, U+20BA

Re: Origins of w

2012-05-18 Thread Andreas Prilop
On Wed, 16 May 2012, Denis Jacquerye wrote: How about U+1E1C, U+1E1D Hebrew U+05B1 U+1E4E, U+1E4F I don't know. U+1E64, U+1E65, U+1E66, U+1E67 ? Hebrew U+FB2D and U+FB2C (in this order) Which transliteration systems are they from? ISO 259 (1984)

Re: Origins of w

2012-05-18 Thread Andreas Prilop
On Wed, 16 May 2012, Denis Jacquerye wrote: U+1E00 and U+1E01 are also a mystery. You can find letter a with ring below in the title Grammar of the Pasto or language of the Afghans by Ernest Trumpp, published 1873. http://www.google.co.uk/search?q=%22P%E1%B8%81%E1%B9%A3%CC%8Ct%C5%8D%22 I don't

Re: U+2018 is not RIGHT HIGH 6

2012-05-04 Thread Andreas Prilop
On Fri, 4 May 2012, Michael Probst wrote: This is *not* about Verdana etc. but rather http://www.hairetikos.info/afinalquestion.pdf It seems to me that you have a problem with TeX, not with Unicode. You should complain in a forum/mailing list dealing with TeX.

Re: U+2018 is not RIGHT HIGH 6

2012-05-03 Thread Andreas Prilop
On Wed, 2 May 2012, Asmus Freytag wrote: a document that not only describes the issues but provides a suggested solution. Suggested solution: Correct the typefaces Comic Sans MS, Tahoma, Verdana in the same way as the typeface Trebuchet MS has been corrected: Make U+2018 a rotational image of

Re: U+2018 is not RIGHT HIGH 6

2012-04-30 Thread Andreas Prilop
On Sun, 29 Apr 2012, Asmus Freytag wrote: So, one of the most useful things that could come of the current discussion, would be a thorough documentation of the glyph variations needed to support both English and German for the same quotation mark characters. Actually, the case is quite

Re: U+2018 is not RIGHT HIGH 6

2012-04-27 Thread Andreas Prilop
On Fri, 27 Apr 2012, Michael Probst wrote: http://www.hairetikos.info/Ux2018_is_not_RIGHT_HIGH_6.pdf This is well known. Please read this old thread: http://unicode.org/mail-arch/unicode-ml/y2006-m06/thread.html#30 A few fonts from Microsoft/Monotype are broken:

Re: Origins of w

2012-04-18 Thread Andreas Prilop
On Mon, 16 Apr 2012, arno.s wrote: U+1E96 has the note Semitic transliteration. Indeed U+1E96 to U+1E9A are used for transliterating Arabic according to ISO 233. w with ring is waw with sukun. but *any* consonant occurs with sukun, so why did they not encode b with ring, d with ring, d with
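
The five characters U+1E96 to U+1E9A can be listed with their names and decompositions as below. This is a sketch using Python's `unicodedata`; the last two lines illustrate the point raised in the question, namely that a consonant such as b has no precomposed form with a ring above and stays a two-code-point sequence.

```python
import unicodedata

decomps = {}
for cp in range(0x1E96, 0x1E9B):
    ch = chr(cp)
    decomps[cp] = unicodedata.decomposition(ch)
    print(f"U+{cp:04X}", unicodedata.name(ch), "->", decomps[cp])

# b + COMBINING RING ABOVE: no precomposed character exists, so NFC leaves it alone.
b_ring = unicodedata.normalize("NFC", "b\u030A")
print(len(b_ring))
```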

Re: Origins of w

2012-04-16 Thread Andreas Prilop
On Sun, 15 Apr 2012, David Starner wrote: At Wiktionary, we're looking at (U+1E98) and we can't figure out where it came from. It's from Unicode 1.1, which makes it hard to look up discussion on adding it, and the characters around it don't seem to give clues to its origin. U+1E96 has the

Joining Arabic Letters

2012-03-30 Thread Andreas Prilop
I come back to http://www.unicode.org/mail-arch/unicode-ml/y2012-m03/thread.html#11 A similar problem of showing non-joining, isolated Arabic glyphs can be seen in the attached file. Both Internet Explorer 8 and MS Word 2010 display isolated glyphs in some cases. I think a better idea is to

Re: Characters LAM and ALIF together (ligature) in Arabic

2012-03-26 Thread Andreas Prilop
On Mon, 26 Mar 2012, Escape Landsome wrote: In Arabic, when writing a LAM followed by an ALIF, you have a special ligature of the two letters Some (broken) fonts do not form the lam-alif ligature when you insert some non-spacing mark between lam and alif:
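
A sketch of the kind of test string this refers to: lam, then a non-spacing mark, then alif. Fatha (U+064E) is chosen here purely as an example mark; the original message may have used a different one. The compatibility character U+FEFB is included only to show that Unicode treats the ligature as lam followed by alif.

```python
import unicodedata

LAM, ALIF = "\u0644", "\u0627"
FATHA = "\u064E"            # example non-spacing mark (assumption, see above)
LAM_ALIF_ISOLATED = "\uFEFB"

# The presentation-form ligature decomposes (compatibility) to lam + alif.
decomp = unicodedata.normalize("NFKD", LAM_ALIF_ISOLATED)

# Test string: a font is expected to still form the ligature around the mark.
marked = LAM + FATHA + ALIF

print([f"U+{ord(c):04X}" for c in decomp])
print([f"U+{ord(c):04X}" for c in marked], unicodedata.category(FATHA))
```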

Re: Zero-width joiner won't join

2012-03-06 Thread Andreas Prilop
Quoting upside-down, Philippe Verdy wrote: It would help if you created such documents using numeric character references in your source for all invisible characters and format controls instead of inserting them literally. Monsieur Perdu: Everybody except you understands that I have done

Zero-width joiner won't join

2012-03-05 Thread Andreas Prilop
I think the zero-width joiner (ZWJ, U+200D) should join regardless of typeface. But Internet Explorer 8 won't join if the ZWJ is taken from another font than surrounding text. In MS Windows, the font Mangal contains the zero-width joiner but not Arabic letters. When I specify font-family: Mangal

Archaic Pashto letter

2011-12-09 Thread Andreas Prilop
Arabic letter U+0682 shows two dots above. It has the cryptic remark not used in modern Pashto. But was it ever used? The new 2011 edition of German standard DIN 31635 Romanization of the Arabic Alphabet http://www.beuth.de/en/standard/din-31635/140593750 shows the real archaic Pashto letter on

Arabic alif-lam ligature

2011-11-08 Thread Andreas Prilop
There is a non-standard alif-lam ligature in the Arabic script. The logo of Al Arabiya shows an example. Which fonts have such an alif-lam ligature? Should I write U+0627 ZWJ U+0644 to obtain the ligature? Or should I write U+0627 ZWNJ U+0644 to prevent the ligature? Or is alif-lam outside the

Re: Yiddish digraphs

2011-10-20 Thread Andreas Prilop
On Wed, 19 Oct 2011, Mark E. Shoulson wrote: interesting that the Latin examples have *compatibility* decompositions, and the Hebrew/Yiddish digraphs don't even have that Nevertheless, digraphs and separate letters are the same for Google:

Yiddish digraphs

2011-10-19 Thread Andreas Prilop
There are three so-called Yiddish digraphs in Unicode: U+05F0 wawayim U+05F1 waw yod U+05F2 yodayim What is specifically Yiddish about these digraphs? They can be used in the same way in Hebrew. But this isn't done. Why not?
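
The character data for the three digraphs can be checked as follows. This sketch uses the formal Unicode names, which differ from the informal ones above, and compares with one Latin digraph (U+01C6, chosen here only as an example) that does carry a compatibility decomposition.

```python
import unicodedata

digraphs = ["\u05F0", "\u05F1", "\u05F2"]
names = [unicodedata.name(ch) for ch in digraphs]
decompositions = [unicodedata.decomposition(ch) for ch in digraphs]

for ch, name, dec in zip(digraphs, names, decompositions):
    print(f"U+{ord(ch):04X}", name, repr(dec))

# A Latin digraph for comparison: it has a <compat> decomposition.
latin_dz = "\u01C6"
print(unicodedata.name(latin_dz), unicodedata.decomposition(latin_dz))

# Since the Hebrew digraphs have no decomposition, NFKC leaves them unchanged.
unchanged = all(unicodedata.normalize("NFKC", ch) == ch for ch in digraphs)
print(unchanged)
```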

Re: Yiddish digraphs

2011-10-19 Thread Andreas Prilop
On Wed, 19 Oct 2011, Michael Everson wrote: What is specifically Yiddish about these digraphs? They are used in Yiddish orthography. With digraphs: http://yi.wikipedia.org/wiki/%EE%F9%E4_%EC%D6%E1_%F8%E0%E1%E9%F0%E0%D4%E9%E8%F9 Without digraphs:

Re: Arabic date format and Microsoft programs

2011-10-17 Thread Andreas Prilop
On Mon, 17 Oct 2011, Eli Zaretskii wrote: However, it could be that the confusion is mine, and it stems from the fact that the logical order of these characters was not stated by the OP. You can read the source text, no?

Re: Arabic date format and Microsoft programs

2011-10-17 Thread Andreas Prilop
On Mon, 17 Oct 2011, Eli Zaretskii wrote: Btw, according to my testing, the current Firefox displays this (this is http://www.unicode.org/mail-arch/unicode-ml/y2011-m10/att-0059/1999-12-31.html) as 31/12/1999. Firefox 7 displays 1999/12/31.

Arabic date format and Microsoft programs

2011-10-15 Thread Andreas Prilop
I return to http://www.unicode.org/mail-arch/unicode-ml/y2011-m10/att-0059/1999-12-31.html Microsoft programs (Internet Explorer, MS Word), display this as 31/12/1999 Other programs (Firefox, Opera, OpenOffice) display this as 1999/12/31 NB: I do not ask how to write unambiguously. (This

Re: Solidus variations

2011-10-11 Thread Andreas Prilop
On Fri, 7 Oct 2011, Murray Sargent wrote: The ASCII solidus is used in various nonmathematical contexts (dates, alternatives) It bothers me that different programs display <HTML> <H1 dir=rtl align=center> &#1633;&#1641;&#1641;&#1641;/&#1633;&#1634;/&#1635;&#1633; </H1> </HTML> differently.
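
The logical order of the characters in that heading can be made explicit with a few lines of Python. This sketch only decodes the numeric character references and lists the bidirectional class of each character, which is the input a bidi implementation works from; it does not run the bidirectional algorithm and does not settle which display is right.

```python
import unicodedata

# Decimal code points from the HTML snippet above, in logical (source) order;
# 47 is the ASCII solidus.
code_points = [1633, 1641, 1641, 1641, 47, 1633, 1634, 47, 1635, 1633]
text = "".join(chr(cp) for cp in code_points)

# Bidirectional class of each character (AN = Arabic number, CS = common separator).
classes = [unicodedata.bidirectional(ch) for ch in text]

# The same string with Arabic-Indic digits mapped to ASCII digits, logical order kept.
ascii_digits = "".join(
    ch if ch == "/" else str(unicodedata.digit(ch)) for ch in text
)

print(classes)
print(ascii_digits)
```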

Re: Solidus variations

2011-10-11 Thread Andreas Prilop
On Tue, 11 Oct 2011, I wrote: It bothers me that different programs display [...] differently. Including HTML in messages as described in http://www.hypermail-project.org/hypermail.html#6 didn't quite work. Therefore I attach a tiny HTML file so that you can test with different

Re: Japanese font on Non-Japanese Android phones

2011-10-11 Thread Andreas Prilop
On Tue, 11 Oct 2011, Peter Constable wrote: It works flawlessly in Firefox (which is the only browser to support it - Internet Explorer, Chrome and Safari don’t support it. I don’t know for Opera). I've scanned this thread and can't figure out what it is. <span lang=ru>..</span> is

Re: Japanese font on Non-Japanese Android phones

2011-10-08 Thread Andreas Prilop
On Fri, 7 Oct 2011, Gerrit wrote: So if somebody from Google reads this, [...] Additionally, if the standard Android web browser could then use the html “lang” tag to select the appropriate font, it would be even nicer. Mark Davis from Google has confessed on this list

Re: Anything from the Symbol font to add along with W*dings?

2011-08-16 Thread Andreas Prilop
On Tue, 16 Aug 2011, Philippe Verdy wrote: Even Netscape 4 was able to display all symbols from http://www.user.uni-hannover.de/nhtcapri/mathematics.html correctly. Yes, but probably not the last part of the table (displayed on the page from the link labelled more...), That is a

Re: Greek Characters Duplicated as Latin

2011-08-15 Thread Andreas Prilop
On Sun, 14 Aug 2011, Asmus Freytag wrote: The Ohm sign should have been encoded as another example of squared letters and abbreviations. It comes from Asian character sets, I’d say the ohm sign comes from the MacRoman character set (0xBD).

Re: Anything from the Symbol font to add along with W*dings?

2011-08-13 Thread Andreas Prilop
On Fri, 12 Aug 2011, Leo Broukhis wrote: http://www.numericana.com/about.htm The author Gerard P. Michon is clueless. Even Netscape 4 was able to display all symbols from http://www.user.uni-hannover.de/nhtcapri/mathematics.html correctly.

Re: ZWNBSP vs. WJ

2011-08-05 Thread Andreas Prilop
On Fri, 5 Aug 2011, Doug Ewell wrote: UTF-8 has the property of being easily detected and verified as such, which solves part of the Google Groups problem (inability to detect which SBCS is being used). No, it doesn't solve. The schoolboys working for Google are so dumb that they even assume
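
Both halves of this exchange can be sketched in a few lines: UTF-8 validity is easy to verify, but a declared charset should be honoured before any guessing. The function names and the fallback policy below are illustrative assumptions, not a description of how Google Groups or any other service works.

```python
from typing import Optional

def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True

def decode_message(data: bytes, declared_charset: Optional[str]) -> str:
    """Decode using the declared charset; guess only when none is declared."""
    if declared_charset:
        return data.decode(declared_charset)
    if looks_like_utf8(data):
        return data.decode("utf-8")
    # Last resort (an assumption of this sketch): every byte maps to a character.
    return data.decode("iso-8859-1")

latin1_bytes = "\u00E4".encode("iso-8859-1")   # a-umlaut as the single byte 0xE4
utf8_bytes = "\u00E4".encode("utf-8")          # a-umlaut as two bytes

print(looks_like_utf8(latin1_bytes), looks_like_utf8(utf8_bytes))
print(decode_message(latin1_bytes, "ISO-8859-1") == "\u00E4")
```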

Re: ZWNBSP vs. WJ

2011-08-05 Thread Andreas Prilop
On Fri, 5 Aug 2011, I wrote: Example: http://groups.google.com/group/sfnet.huuhaa/msg/4a7b0cae182e8c50 http://groups.google.com/group/sfnet.huuhaa/msg/4a7b0cae182e8c50dmode=source Make that: http://groups.google.com/group/sfnet.huuhaa/msg/4a7b0cae182e8c50?dmode=source

Re: SHY, CGJ, etc.

2011-07-06 Thread Andreas Prilop
On Tue, 5 Jul 2011, Philippe Verdy wrote: Even MS Word 2010 continues to use U+001F as soft hyphen but does not recognize U+00AD as soft hyphen. I've not spoken at all about U+001F and not even tested it alt+0031 alt+0173 I have entered TRUE soft hyphens as U+00AD, in a plain-text

SHY, CGJ, etc. (was: unicode Digest V12 #108)

2011-07-04 Thread Andreas Prilop
On Sun, 3 Jul 2011, Jukka K. Korpela wrote: You're wrong, it DOES. I just tested it (in Microsoft Word 2010 for Windows 7) within a random long word (aa) and the SHY is recognized to generate the intended hyphenation break. That’s good news, if your analysis is correct, but the

Re: Typo in bidi reference implementation

2011-07-01 Thread Andreas Prilop
On Fri, 1 Jul 2011, Peter Krefting wrote: Not that it matters much, just something we noticed. Peter Krefting - Core Technology Developer, Opera Software ASA I noticed something that matters -- namely that Opera isn't really fit to display bidirectional text and documents. For example:

Re: Apostrophe in transliteration

2010-08-10 Thread Andreas Prilop
On Mon, 9 Aug 2010, Jukka K. Korpela wrote: It is of course transliteration standards that should say something normative about the matter. As far as I can remember, the authoritative versions of the relevant standards are the paper publications, which do not identify characters by Unicode

Re: Pashto yeh characters

2010-07-28 Thread Andreas Prilop
On Tue, 27 Jul 2010, Arno Schmitt wrote: Since U+0649 is called alif maqsura it should be used for alif maqsura. By that argument, you must use U+0027 for an apostrophe instead of U+2019. The Unicode names for characters are often historical and you should not infer anything from such names.

Re: Pashto yeh characters

2010-07-28 Thread Andreas Prilop
On Tue, 27 Jul 2010, David Starner wrote: MacArabic, Windows-1256 and ISO-8859-6 are all standards for the encoding of Arabic. Thus U+0649 must be an Arabic character; existing use in both those sets and in Unicode say that is. By that circular logic, S with cedilla and T with cedilla must be

Re: Pashto yeh characters

2010-07-28 Thread Andreas Prilop
On Tue, 27 Jul 2010, Khaled Hosny wrote: According to Grammatik des klassischen Arabisch by Wolfdietrich Fischer, page 9, the ya is written with two dots in such cases, too. Except that this is not a Yaa and not pronounced like a Yaa, it is an Alef (note the small dagger Alef above it). That is

Re: Pashto yeh characters

2010-07-28 Thread Andreas Prilop
On Wed, 28 Jul 2010, lingu...@artstein.org wrote: Here's an arbitrary page from today's Al-Ahram newspaper, [...] On my computer this looks particularly jarring, You can find enough pages from Continental Europe and Latin America that have an acute accent instead of an apostrophe due to

Re: Pashto yeh characters

2010-07-27 Thread Andreas Prilop
On Thu, 22 Jul 2010, lingu...@artstein.org wrote: [...] To wrap up, are my observations about the Pashto writing conventions correct? And is there a standard for assigning the Pashto characters representing /j/ and /i:/ to Unicode code points? Practical answer: U+0649 and U+064A are

Re: charset parameter in Google Groups

2010-07-07 Thread Andreas Prilop
On Tue, 6 Jul 2010, Shawn Steele wrote: Often the author seems to use the same code page they were expecting as a system default, so it can appear to work for them even when it's wrong. I am the author of this news message:

Re: charset parameter in Google Groups

2010-07-07 Thread Andreas Prilop
On Wed, 7 Jul 2010, Shawn Steele wrote: however, in general, perhaps not your specific case, the charset tag on the web cannot be 100% reliably trusted, regardless of what the RFCs say. You do not understand what I mean! You have missed my point completely! You DO NOT understand me!

Re: charset parameter in Google Groups

2010-07-02 Thread Andreas Prilop
On Thu, 1 Jul 2010, John Burger wrote: If you have never encountered a web page in which the charset parameter encoded in the page (or in the HTTP headers) did not accurately reflect the real charset, as indicated by the actual data in the page How is it possible that you noticed that? It's

Re: charset parameter in Google Groups

2010-07-01 Thread Andreas Prilop
On Mon, 28 Jun 2010, Mark Davis wrote: I'll overlook the lack of civility, since I can understand that kind of frustration when something doesn't work. Well, I am aware of this problem/bug for many years now: http://groups.google.co.uk/group/sci.lang/msg/eb55255e1925350f Over the years I

Re: Indian Rupee Sign to be chosen today

2010-06-25 Thread Andreas Prilop
On Thu, 24 Jun 2010, Leo Broukhis wrote: a privilege (unique identity) available only to major currencies like dollar, euro, pound, sterling and yen. Even in the year 2010, the euro sign (€) doesn't work reliably. -- From the New World:

Re: Indian Rupee Sign to be chosen today

2010-06-25 Thread Andreas Prilop
On Fri, 25 Jun 2010, I wrote Even in the year 2010, the euro sign (€) doesn't work reliably. in both the Unicode list and in the newsgroup de.test. unicode.org shows a euro sign: http://www.unicode.org/mail-arch/unicode-ml/y2010-m06/0372.html groups.google.com shows a currency sign:
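
One way a euro sign can turn into a currency sign is sketched below: the byte that ISO-8859-15 uses for the euro is the currency sign in ISO-8859-1. Whether this is what happened in the Google Groups case is an assumption; the code only demonstrates the byte-level mechanism.

```python
EURO = "\u20AC"

# The euro sign as bytes in three encodings that contain it.
encoded = {enc: EURO.encode(enc) for enc in ("utf-8", "cp1252", "iso-8859-15")}
for enc, raw in encoded.items():
    print(enc, raw.hex())

# Reading the ISO-8859-15 byte as ISO-8859-1 yields U+00A4 CURRENCY SIGN.
misread = encoded["iso-8859-15"].decode("iso-8859-1")
print(f"U+{ord(misread):04X}")

# ISO-8859-1 itself has no euro sign at all.
try:
    EURO.encode("iso-8859-1")
    latin1_has_euro = True
except UnicodeEncodeError:
    latin1_has_euro = False
print(latin1_has_euro)
```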