Re: Counting Devanagari Aksharas

2017-04-26 Thread Eli Zaretskii via Unicode
> Date: Wed, 26 Apr 2017 07:45:07 +0100 > From: Richard Wordingham via Unicode > > On Wed, 26 Apr 2017 08:48:13 +0300 > Eli Zaretskii via Unicode wrote: > > > > Date: Sun, 23 Apr 2017 22:59:49 +0100 > > > From: Richard Wordingham

Re: Counting Devanagari Aksharas

2017-04-26 Thread Richard Wordingham via Unicode
On Wed, 26 Apr 2017 08:48:13 +0300 Eli Zaretskii via Unicode wrote: > > Date: Sun, 23 Apr 2017 22:59:49 +0100 > > From: Richard Wordingham > > Cc: Eli Zaretskii > > > > If I search for CGJ, highlighting it is frequently

Re: Counting Devanagari Aksharas

2017-04-25 Thread Eli Zaretskii via Unicode
> Date: Sun, 23 Apr 2017 22:59:49 +0100 > From: Richard Wordingham > Cc: Eli Zaretskii > > If I search for CGJ, highlighting it is frequently supremely useless. > I want to know where it is; highlighting is merely a tool to find it on > the screen.

Re: Go romanize! Re: Counting Devanagari Aksharas

2017-04-25 Thread Naena Guru via Unicode
For example, 'po' sound in Indic has the p consonant with a sign ahead plus a sign after. In many Indic scripts, yes. In Devanagari, the vowel sign is normally a single element classified as following the consonant. In Thai, the vowel sign precedes the consonant. Tai Tham uses both a two-part sign

Re: Go romanize! Re: Counting Devanagari Aksharas

2017-04-24 Thread Richard Wordingham via Unicode
sound in Indic has the p consonant with a sign ahead plus a sign > after. In many Indic scripts, yes. In Devanagari, the vowel sign is normally a single element classified as following the consonant. In Thai, the vowel sign precedes the consonant. Tai Tham uses both a two-part sign and a precedi

Go romanize! Re: Counting Devanagari Aksharas

2017-04-24 Thread Naena Guru via Unicode
Quote by Richard: Unless this implies a spelling reform for many languages, I'd like to see how this works for the Tai Tham script. I'm not happy with the Romanisation I use to work round hostile rendering engines. (My scheme is only documented in variable hack_ss02 in the last script blocks of

Re: Counting Devanagari Aksharas

2017-04-24 Thread Richard Wordingham via Unicode
On Mon, 24 Apr 2017 00:36:26 +0530 Naena Guru via Unicode wrote: > The Unicode approach to Sanskrit and all Indic is flawed. Indic > should not be letter-assembly systems. > > Sanskrit vyaakaraNa (grammar) explains the phonemes as the atoms of > the speech. Each writing

Re: Counting Devanagari Aksharas

2017-04-23 Thread Richard Wordingham via Unicode
On Sun, 23 Apr 2017 05:40:29 +0300 Eli Zaretskii via Unicode wrote: > > The cursor moves to the cluster boundary, so there is much less of a > > problem with Emacs. > > But you wanted to highlight only part of the cluster, AFAIU. If I search for CGJ, highlighting it is

Re: Counting Devanagari Aksharas

2017-04-23 Thread Naena Guru via Unicode
, Richard Wordingham via Unicode wrote: Is there consensus on how to count aksharas in the Devanagari script? The doubts I have relate to a visible halant in orthographic syllables other than the first. For example, according to 'Devanagari VIP Team Issues Report' http://www.unicode.org/L2/L2011/11370

Re: Counting Devanagari Aksharas

2017-04-23 Thread Asmus Freytag via Unicode
On 4/22/2017 9:25 PM, Manish Goregaokar via Unicode wrote: Backspace in browsers (chrome and firefox) deletes within EGCs too. They delete matras in devanagari, and jamos in hangul. They don't *exactly* work off of code points (e.g. flag emoji gets deleted
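
The contrast drawn here between deleting a single code point and deleting a whole cluster can be sketched in Python. This is only an illustration under stated assumptions: a "cluster" is simplified to one base character plus its trailing combining marks, which is not the full UAX #29 extended grapheme cluster rule and is not claimed to be what any browser does. Both function names are hypothetical.

```python
import unicodedata


def backspace_code_point(text):
    """Delete only the last code point (e.g. just the matra)."""
    return text[:-1]


def backspace_cluster(text):
    """Delete trailing combining marks together with their base character.

    Assumption: a cluster is one base character followed by marks
    (general category Mn, Mc or Me). Real segmentation is more involved.
    """
    i = len(text)
    while i > 0 and unicodedata.category(text[i - 1]).startswith("M"):
        i -= 1
    return text[:max(i - 1, 0)]


# KA + VOWEL SIGN I, written with escapes so the code points are explicit.
ki = "\u0915\u093F"
print(len(backspace_code_point(ki)))  # one code point (KA) remains
print(len(backspace_cluster(ki)))     # nothing remains
```

With code-point deletion the consonant survives and only the vowel sign is removed; with the simplified cluster deletion the consonant and its vowel sign go together.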

Re: Counting Devanagari Aksharas

2017-04-22 Thread Manish Goregaokar via Unicode
> You cannot even > meaningfully move by single characters in most clusters, because > composing characters generally completely changes how the original > characters looked, so there's nowhere you can display the cursor. Yes, and this is one of the reasons it feels broken in devanag

Re: Counting Devanagari Aksharas

2017-04-22 Thread Eli Zaretskii via Unicode
> Date: Sun, 23 Apr 2017 00:51:59 +0100 > Cc: Julian Bradfield > From: Richard Wordingham via Unicode > > On Sat, 22 Apr 2017 21:39:42 +0100 (BST) > Julian Bradfield via Unicode wrote: > > > On 2017-04-22, Eli Zaretskii via

Re: Counting Devanagari Aksharas

2017-04-22 Thread Richard Wordingham via Unicode
On Sat, 22 Apr 2017 21:39:42 +0100 (BST) Julian Bradfield via Unicode wrote: > On 2017-04-22, Eli Zaretskii via Unicode wrote: > > I could imagine Emacs decomposing characters temporarily when only > > part of a cluster matches the search string.

Re: Counting Devanagari Aksharas

2017-04-22 Thread Julian Bradfield via Unicode
On 2017-04-22, Eli Zaretskii via Unicode wrote: >> From: Richard Wordingham via Unicode [...] >> I've encountered the problem that, while at least I can search for >> text smaller than a cluster, there's no indication in the window of >> where in the

Re: Counting Devanagari Aksharas

2017-04-22 Thread Eli Zaretskii via Unicode
> Date: Sat, 22 Apr 2017 17:13:36 +0100 > From: Richard Wordingham via Unicode > > > Movement by grapheme > > cluster is AFAIK the most natural way of moving in complex scripts. > > Evidence? Personal experience? > It's easiest for displaying the cursor. It's the _only_

Re: Counting Devanagari Aksharas

2017-04-22 Thread Richard Wordingham via Unicode
On Sat, 22 Apr 2017 13:34:32 +0300 Eli Zaretskii via Unicode wrote: > AFAIR, Emacs allows one to _delete_ individual characters, > i.e. Backspace and C-d delete character-by-character, so the problem > shouldn't be so grave for imperfect typists. Deleting forwards by one

Re: Counting Devanagari Aksharas

2017-04-22 Thread Eli Zaretskii via Unicode
> Date: Sat, 22 Apr 2017 11:13:16 +0100 > From: Richard Wordingham via Unicode > > At present these are split into two and three grapheme clusters > respectively, and LibreOffice cursor movement responds accordingly. > (SIGN AA starts a grapheme cluster in several scripts of

Re: Counting Devanagari Aksharas

2017-04-22 Thread Richard Wordingham via Unicode
as (or not breaking at viramas followed by a consonant if we want > to be more precise), but the proposed system would be wrong much less > often. > I am only talking about Devanagari, though scripts like > Bangla/Gujarati/Gurmukhi may have similar needs. Breaking on ZWNJ seems >

Re: Counting Devanagari Aksharas

2017-04-21 Thread Manish Goregaokar via Unicode
tem of not breaking at viramas (or not breaking at viramas followed by a consonant if we want to be more precise), but the proposed system would be wrong much less often. I am only talking about Devanagari, though scripts like Bangla/Gujarati/Gurmukhi may have similar needs. Breaking on ZWNJ seems sensi

Re: Counting Devanagari Aksharas

2017-04-21 Thread Richard Wordingham via Unicode
On Thu, 20 Apr 2017 11:17:05 -0700 Manish Goregaokar via Unicode <unicode@unicode.org> wrote: > On Wed, Apr 19, 2017 at 4:35 PM, Richard Wordingham via Unicode > <unicode@unicode.org> wrote: > > Is there consensus on how to count aksharas in the Devanagari > > sc

Re: Counting Devanagari Aksharas

2017-04-21 Thread Manish Goregaokar via Unicode
That seems like a relatively niche use case (especially with Vedic Sanskrit) compared to having weird selection for everything else. I'm not convinced. When I use a romanized Devanagari input method (I typically do on my laptop), deleting the whole cluster is necessary anyway for things to work

Re: Counting Devanagari Aksharas

2017-04-21 Thread Richard Wordingham via Unicode
o श्री·मा·न्को, which does not concatenate back to the original. Secondly, you have a problem with ANUDATTA. You are not accepting <U+0924, U+0902, U+0952> as a syllable. Perhaps you believed https://www.microsoft.com/typography/OpenTypeDev/devanagari/intro.htm as to the structure of a Devanag

Re: Counting Devanagari Aksharas

2017-04-20 Thread Anshuman Pandey via Unicode
icode >> <unicode@unicode.org> wrote: > >>> On Thu, 20 Apr 2017 11:17:05 -0700 >>> Manish Goregaokar via Unicode <unicode@unicode.org> wrote: > >>>> I'm of the opinion that Unicode should start considering devanagari >>>> (and pos

Re: Counting Devanagari Aksharas

2017-04-20 Thread Richard Wordingham via Unicode
de <unicode@unicode.org> wrote: > >> I'm of the opinion that Unicode should start considering devanagari > >> (and possibly other indic) consonant clusters as single extended > >> grapheme clusters. > > You won't like it if cursor movement granularity is reduced

Re: Counting Devanagari Aksharas

2017-04-20 Thread Manish Goregaokar via Unicode
I mean, we do the same for Hangul. The main time you need intra-conjunct segmentation in Devanagari is when deleting something you just typed. And backspace usually operates on code points anyway (except for some weird cases like flag emoji, though this isn't uniform across platforms). I don't

Re: Counting Devanagari Aksharas

2017-04-20 Thread Richard Wordingham via Unicode
conjoined. Now, that's what I expected. > I'm of the opinion that Unicode should start considering devanagari > (and possibly other indic) consonant clusters as single extended > grapheme clusters. Yes, sometimes it's not rendered as a single glyph, > but sometimes family emoji will n

Re: Counting Devanagari Aksharas

2017-04-20 Thread Richard Wordingham via Unicode
On Thu, 20 Apr 2017 15:33:37 +0530 Shriramana Sharma via Unicode <unicode@unicode.org> wrote: > All I can say is that Tamil script has eschewed most consonant cluster > ligatures/conjoining forms. As for Devanagari, writing श्रीमान्‌को (I > used ZWNJ) i.o. श्रीमान्को is

Re: Counting Devanagari Aksharas

2017-04-20 Thread Manish Goregaokar via Unicode
I don't think there's consensus. When given a rendered representation people seem to uniformly count conjuncts as multiple aksharas if rendered with visible halant, and as a single akshara if they are rendered conjoined. Most fonts for devanagari these days are pretty good at conjoining
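
One of the counting rules mentioned in this thread (a conjoined conjunct counts as one akshara, a visible halant forced by ZWNJ starts a new one) can be sketched roughly as follows. This is a heuristic written for illustration, not an agreed definition: the thread itself says there is no consensus. The function names and the consonant ranges are assumptions, and the code looks only at code points, not at what a font actually renders.

```python
import unicodedata

VIRAMA = "\u094D"
ZWNJ = "\u200C"
ZWJ = "\u200D"


def is_consonant(ch):
    # Assumption: core consonants KA..HA plus the nukta forms QA..YYA.
    # Later additions to the Devanagari block are ignored in this sketch.
    return "\u0915" <= ch <= "\u0939" or "\u0958" <= ch <= "\u095F"


def count_aksharas(text):
    """Count letters that start a new akshara.

    A consonant directly after a virama continues the current akshara.
    ZWNJ after the virama breaks that link; ZWJ is skipped and keeps it.
    Marks (vowel signs, anusvara, accents) never start an akshara.
    """
    count = 0
    prev = ""
    for ch in text:
        if ch == ZWJ:
            continue
        is_letter = unicodedata.category(ch).startswith("L")
        if is_letter and not (is_consonant(ch) and prev == VIRAMA):
            count += 1
        prev = ch
    return count


conjoined = "\u0936\u094D\u0930\u0940\u092E\u093E\u0928\u094D\u0915\u094B"
with_zwnj = "\u0936\u094D\u0930\u0940\u092E\u093E\u0928\u094D\u200C\u0915\u094B"
print(count_aksharas(conjoined), count_aksharas(with_zwnj))
```

Under this rule the conjoined spelling gives three units and the ZWNJ spelling gives four; a rule that never breaks at a virama would give three for both.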

Re: Counting Devanagari Aksharas

2017-04-20 Thread Shriramana Sharma via Unicode
Hello Richard. Yes my earlier reply wasn't intended to be offlist. I have near-zero knowledge about non-Indic languages. All I can say is that Tamil script has eschewed most consonant cluster ligatures/conjoining forms. As for Devanagari, writing श्रीमान्‌को (I used ZWNJ) i.o. श्रीमान्को is quite
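
Because ZWNJ is invisible, the two spellings compared here are easiest to tell apart by listing their code points. The snippet below is a small sketch using Python's standard unicodedata module; the strings are written with escapes on the assumption that the ZWNJ sits directly after the second virama, and the helper name describe is hypothetical.

```python
import unicodedata

# The word spelled without and with ZWNJ (U+200C) after the second virama.
CONJOINED = "\u0936\u094D\u0930\u0940\u092E\u093E\u0928\u094D\u0915\u094B"
WITH_ZWNJ = "\u0936\u094D\u0930\u0940\u092E\u093E\u0928\u094D\u200C\u0915\u094B"


def describe(text):
    """Return one 'U+XXXX NAME' string per code point."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


for line in describe(WITH_ZWNJ):
    print(line)
```

The two strings differ by exactly one code point, which is why a plain-text search or a syllable counter has to decide explicitly how to treat it.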

Re: Counting Devanagari Aksharas

2017-04-20 Thread Richard Wordingham via Unicode
> markers. The complication comes word internally. My understanding is that phonetically syllable-final consonants in non-Indic words in non-Indic languages have a tendency not to be included in an akshara along with the start of the next syllable. However, that tendency is more evident i

Counting Devanagari Aksharas

2017-04-19 Thread Richard Wordingham via Unicode
Is there consensus on how to count aksharas in the Devanagari script? The doubts I have relate to a visible halant in orthographic syllables other than the first. For example, according to 'Devanagari VIP Team Issues Report' http://www.unicode.org/L2/L2011/11370-devanagari-vip-issues.pdf

Re: Sanskrit -e/o a- Sandhi in Devanagari

2017-02-24 Thread Shriramana Sharma
ed by SPACE, am I correct in believing that the change in > codepoints is: > > <U+0020 SPACE, U+0905 LETTER A> becomes <U+200B ZERO WIDTH SPACE, U+093D > DEVANAGARI SIGN AVAGRAHA> > > I ask because I have seen lines starting with avagraha, though within a > line ther
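
The substitution being asked about can be written down as a simple string rewrite. The sketch below only encodes the questioner's hypothesis (SPACE plus LETTER A becomes ZERO WIDTH SPACE plus AVAGRAHA after a word ending in the vowel sign E or O); it is not a statement of what the standard recommends, and the function name is hypothetical.

```python
import re

# After VOWEL SIGN E (U+0947) or VOWEL SIGN O (U+094B),
# match <U+0020 SPACE, U+0905 DEVANAGARI LETTER A>.
_SANDHI = re.compile("(?<=[\u0947\u094B])\u0020\u0905")


def apply_avagraha_sandhi(text):
    """Replace SPACE + LETTER A with ZERO WIDTH SPACE + AVAGRAHA."""
    return _SANDHI.sub("\u200B\u093D", text)


before = "\u0915\u094B\u0020\u0905\u092A"  # KA, SIGN O, SPACE, A, PA
after = apply_avagraha_sandhi(before)
print([f"U+{ord(ch):04X}" for ch in after])
```

Keeping a ZERO WIDTH SPACE in place of the visible space preserves a line-break opportunity, which would be consistent with the observation that lines can start with avagraha.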

Sanskrit -e/o a- Sandhi in Devanagari

2017-02-24 Thread Richard Wordingham
E, U+093D DEVANAGARI SIGN AVAGRAHA> I ask because I have seen lines starting with avagraha, though within a line there seems not to be a space before avagraha. (I am ignoring didactic writing which shows sandhi effects but leaves a space between the original words.) Richard.

Re: Devanagari and Subscript and Superscript

2015-12-16 Thread Doug Ewell
I missed this yesterday. Plug Gulp wrote: > General support for all characters, words and sentences could be > achieved by just three new formatting characters, e.g. SCR, SUP and > SUB, similar to the way other formatting characters such as ZWS, ZWJ, > ZWNJ etc are defined. The new formatting

Re: Devanagari and Subscript and Superscript

2015-12-16 Thread Philippe Verdy
2015-12-16 19:16 GMT+01:00 Doug Ewell : > The ones you suggest are stateful; they affect the rendering of > arbitrary amounts of subsequent data, in a way reminiscent of ECMA-48 > ("ANSI") attribute switching, or ISO 2022 character-set switching. > Unicode tries hard to avoid

Re: Devanagari and Subscript and Superscript

2015-12-15 Thread Doug Ewell
Plug Gulp wrote: > It will help if Unicode standard itself intrinsically supports > generalised subscript/superscript text. This falls outside the scope of "plain text" as defined by Unicode, in much the same way as bold and italic styles and colors and font faces and sizes. There are several

Re: Devanagari and Subscript and Superscript

2015-12-15 Thread srivas sinnathurai
Does the standard support the use of diacritics in plain text format, when used with all and any complex scripts? Regards Sinnathurai > > On 15 December 2015 at 17:46 Doug Ewell wrote: > > > Plug Gulp wrote: > > > It will help if Unicode standard itself

RE: Devanagari and Subscript and Superscript

2015-12-15 Thread Doug Ewell
srivas sinnathurai wrote: > Does the standard support the use of diacritics in plain text format, > when used with all and any complex scripts? It probably depends on what you mean by "support" and "diacritics." I can type a Tamil letter followed by a combining acute accent or diaeresis, and in

Re: Devanagari and Subscript and Superscript

2015-12-15 Thread Plug Gulp
haracters and words) will tremendously help languages and scripts that are not English/Latin. Thanks and kind regards, ~Plug >> >> Hi, >> >> I am trying to understand if there is a way to use Devanagari >> characters (and grapheme clusters) as subscript an

Re: Devanagari and Subscript and Superscript

2015-12-15 Thread Richard Wordingham
On Tue, 15 Dec 2015 18:00:16 + (GMT) srivas sinnathurai wrote: > Does the standard support the use of diacritics in plain text format, > when used with all and any complex scripts? Relatively few scalar value sequences are prohibited - just possibly sequences

Re: Devanagari and Subscript and Superscript

2015-12-15 Thread Khaled Hosny
On Tue, Dec 15, 2015 at 11:55:02AM +, Plug Gulp wrote: > Please note that the teacher had to use a Circumflex Accent (Caret) to > indicate superscript, which is an unwritten convention, in the absence > of proper superscript support within Unicode. If the teacher is explaining actual math to

Re: Devanagari and Subscript and Superscript

2015-12-11 Thread Richard Wordingham
On Wed, 9 Dec 2015 03:24:39 + Plug Gulp <plug.g...@gmail.com> wrote: > I am trying to understand if there is a way to use Devanagari > characters (and grapheme clusters) as subscript and/or superscript in > unicode text. Why do you want to do this? Are you asking about wri

Re: Devanagari and Subscript and Superscript

2015-12-08 Thread Richard Wordingham
On Wed, 9 Dec 2015 03:24:39 + Plug Gulp <plug.g...@gmail.com> wrote: > Hi, > > I am trying to understand if there is a way to use Devanagari > characters (and grapheme clusters) as subscript and/or superscript in > unicode text. The view is that such would not be 'plain

Re: Devanagari and Subscript and Superscript

2015-12-08 Thread Martin J. Dürst
Hello Plug, I suggest using HTML: बक ्ष Regards, Martin. On 2015/12/09 12:24, Plug Gulp wrote: Hi, I am trying to understand if there is a way to use Devanagari characters (and grapheme clusters) as subscript and/or superscript in unicode text. It will help if someone could please direct

Devanagari and Subscript and Superscript

2015-12-08 Thread Plug Gulp
Hi, I am trying to understand if there is a way to use Devanagari characters (and grapheme clusters) as subscript and/or superscript in unicode text. It will help if someone could please direct me to any document that explains how to achieve that. Is there a unicode marker that will treat

RE: Devanagari Letter Short A

2004-02-19 Thread Aparna A. Kulkarni
The character U+0904 (DEVANAGARI LETTER SHORT A) is not a part of ISCII 91. Neither was it encoded in any of the earlier versions of ISCII. Hence according to the ISCII standard this character simply cannot be formed. Aparna A. Kulkarni -Original Message- From: [EMAIL PROTECTED] [mailto

Re: Devanagari Letter Short A

2004-02-19 Thread Philippe Verdy
From: Aparna A. Kulkarni [EMAIL PROTECTED] To: [EMAIL PROTECTED]; 'Unicode List' [EMAIL PROTECTED] Sent: Thursday, February 19, 2004 8:23 AM Subject: RE: Devanagari Letter Short A The character U+0904 (DEVANAGARI LETTER SHORT A) is not a part of ISCII 91. Neither was it encoded in any

Re: Devanagari Letter Short A

2004-02-18 Thread Antoine Leca
Philippe Verdy wrote: U+0904 DEVANAGARI LETTER SHORT A is used only for the case of an independent vowel. It can be viewed as a conjunct of the independent vowel U+0905 DEVANAGARI LETTER A and the dependent vowel sign U+0946 DEVANAGARI VOWEL SIGN SHORT E (noted for transcribing
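
Whether U+0904 is merely "viewable as" the sequence U+0905 U+0946 or is formally equivalent to it can be checked against the Unicode character data shipped with Python. The sketch below assumes nothing beyond the standard unicodedata module; the helper name canonically_equivalent is hypothetical.

```python
import unicodedata

SHORT_A = "\u0904"         # DEVANAGARI LETTER SHORT A
SEQUENCE = "\u0905\u0946"  # LETTER A + VOWEL SIGN SHORT E


def canonically_equivalent(a, b):
    """True if both strings normalize to the same NFC form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(unicodedata.name(SHORT_A))
print([unicodedata.name(ch) for ch in SEQUENCE])
print(repr(unicodedata.decomposition(SHORT_A)))
print(canonically_equivalent(SHORT_A, SEQUENCE))
```

U+0904 carries no decomposition mapping, so normalization does not unify the two spellings; software that wants to treat them alike has to do so itself.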

Re: Devanagari Letter Short A

2004-02-18 Thread Antoine Leca
Ernest Cline wrote: I've been trying to make sense of the Indian scripts, but am having one small difficulty. I can't seem to find the ISCII 1991 equivalent for U+0904 (DEVANAGARI LETTER SHORT A). I do not believe you'll find it there. U+0904 had been added to Unicode for version 4.0

Re: Devanagari Letter Short A

2004-02-16 Thread Philippe Verdy
as other consonant letters /*a/, i.e. coding another isolated vowel requires coding /a/ before the vowel sign (matra). This encodes approximately the same thing as isolated vowels, except that the intended rendering is different. U+0904 DEVANAGARI LETTER SHORT A is used only for the case

Devanagari Letter Short A

2004-02-15 Thread Ernest Cline
I've been trying to make sense of the Indian scripts, but am having one small difficulty. I can't seem to find the ISCII 1991 equivalent for U+0904 (DEVANAGARI LETTER SHORT A). Is this a character that is part of the set accessed by the extended code (xF0) or was this part of the ISCII 1988

Re: Devanagari Glottal Stop

2003-04-06 Thread Michael Everson
I wrote: I would have to disagree with these Indian experts in this instance. The Devanagari glottal stop does not have a dot, and indeed, in the languages which use it, this character will certainly coexist with the question mark. They have different shapes, and different functions. At 15

Re: Devanagari Glottal Stop

2003-04-05 Thread Michael Everson
I would have to disagree with these Indian experts in this instance. The Devanagari glottal stop does not have a dot, and indeed, in the languages which use it, this character will certainly coexist with the question mark. They have different shapes, and different functions. -- Michael Everson

Re: Devanagari Glottal Stop

2003-04-05 Thread Mark Davis
: Saturday, April 05, 2003 01:45 Subject: Re: Devanagari Glottal Stop I would have to disagree with these Indian experts in this instance. The Devanagari glottal stop does not have a dot, and indeed, in the languages which use it, this character will certainly coexist with the question mark

Re: Plane 14 Tag Deprecation Issue (was Re: VS vs. P14 (was Re: Indic Devanagari Query))

2003-02-07 Thread Asmus Freytag
At 11:54 AM 2/6/03 -0800, Kenneth Whistler wrote: My personal opinion? The whole debate about deprecation of language tag characters is a frivolous distraction from other technical matters of greater import, and things would be just fine with the current state of the documentation. But, if formal

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-07 Thread Andrew C. West
John H. Jenkins wrote: Ah, but decorative motifs are not plain text. Ah, but it could be.

Re: Plane 14 Tag Deprecation Issue (was Re: VS vs. P14 (was Re: Indic Devanagari Query))

2003-02-07 Thread William Overington
I feel that as the matter was put forward for Public Review then it is reasonable for someone reading of that review to respond to the review on the basis of what is stated as the issue in the Public Review item itself. Kenneth Whistler now states an opinion as to what the review is about and

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-07 Thread Asmus Freytag
At 01:52 AM 2/7/03 -0800, Andrew C. West wrote: Ah, but decorative motifs are not plain text. Ah, but it could be. Ah, but it wouldn't be Unicode. A(h)./

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-06 Thread Doug Ewell
Asmus Freytag asmusf at ix dot netcom dot com wrote: Unicode 4.0 will be quite specific: P14 tags are reserved for use with particular protocols requiring their use is what the text will say more or less. I didn't know the question of what to do about Plane 14 language tags had already been

VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-06 Thread Andrew C. West
James Kass wrote, (What happens if someone discovers a 257th variant? Do they get a prize? Or, would they be forever banished from polite society?) I was thinking about that. 256 variants of a single character may seem a tad excessive, but there is a common Chinese decorative motif

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-06 Thread John H. Jenkins
On Thursday, February 6, 2003, at 08:47 AM, Andrew C. West wrote: There are also a number of other auspicious characters, such as fu2 (U+798F) good fortune that may be found written in a hundred variant forms as a decorative motif. Ah, but decorative motifs are not plain text. ==

Re: Indic Devanagari Query

2003-02-05 Thread Andrew C. West
On Wed, 05 Feb 2003 02:00:30 -0800 (PST), [EMAIL PROTECTED] wrote: If these alternate forms were needed to be displayed in a single multi-lingual plain-text file, wouldn't we need some method of tagging the runs of Latin text for their specific languages? Is this not what the variation

Re: Indic Devanagari Query

2003-02-05 Thread Peter_Constable
On 02/04/2003 02:52:25 PM jameskass wrote: If these alternate forms were needed to be displayed in a single multi-lingual plain-text file, wouldn't we need some method of tagging the runs of Latin text for their specific languages? The plain-text file would be legible without that -- I don't

Re: Indic Devanagari Query

2003-02-05 Thread Peter_Constable
On 02/05/2003 04:05:44 AM Andrew C. West wrote: If these alternate forms were needed to be displayed in a single multi-lingual plain-text file, wouldn't we need some method of tagging the runs of Latin text for their specific languages? Is this not what the variation selectors are available

VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-05 Thread jameskass
. Andrew C. West wrote, Is this not what the variation selectors are available for? And now that we are soon to have 256 of them, perhaps Unicode ought not to be shy about using them for characters other than mathematical symbols. Yes, there seem to be additional variation selectors coming in

VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-05 Thread jameskass
. Peter Constable wrote, The plain-text file would be legible without that -- I don't think this is an argument in favour of plane 14 tag characters. Preserving culturally-preferred appearance would certainly require markup of some form, whether lang IDs or for font-face and perhaps

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-05 Thread Asmus Freytag
At 06:24 PM 2/5/03 +, [EMAIL PROTECTED] wrote: The advantages of using P14 tags (...equals lang IDs mark-up) is that runs of text could be tagged *in a standard fashion* and preserved in plain-text. The minute you have scoped tagging, you are no longer using plain text. The P14 tags are no

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-05 Thread Peter_Constable
On 02/05/2003 12:24:39 PM jameskass wrote: The advantages of using P14 tags (...equals lang IDs mark-up) is that runs of text could be tagged *in a standard fashion* and preserved in plain-text. Sure, but why do we want to place so much demand on plain text when the vast majority of content we

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-05 Thread Michael Everson
At 16:47 -0500 2003-02-05, Jim Allan wrote: There are often conflicting orthographic usages within a language. Language tagging alone does not indicate whether German text is to be rendered in Roman or Fraktur, whether Gaelic text is to be rendered in Roman or Uncial, and if Uncial, a modern

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-05 Thread jameskass
. Asmus Freytag wrote, Variation selectors also can be ignored based on their code point values, but unlike p14 tags, they don't become invalid when text is cut and pasted from the middle of a string. Excellent point. Unicode 4.0 will be quite specific: P14 tags are reserved for use with

Re: VS vs. P14 (was Re: Indic Devanagari Query)

2003-02-05 Thread jameskass
. Peter Constable wrote, Sure, but why do we want to place so much demand on plain text when the vast majority of content we interchange is in some form of marked-up or rich text? Let's let plain text be that -- plain -- and look to the markup conventions that we've invested so much in and

Re: Indic Devanagari Query

2003-02-04 Thread Peter_Constable
On 01/30/2003 03:03:24 PM Anto'nio Martins-Tuva'lkin wrote: Not very different from the serbian vs. russian rendition of cyrillic lower case i in italics. There are more examples, though (almost?) none in the latin script. There are indeed some examples in Latin script. For instance, there are

Re: Indic Devanagari Query

2003-02-04 Thread jameskass
. Peter Constable wrote, There are indeed some examples in Latin script. For instance, there are three different typeforms for 014A used by different language communities. It's also been reported that there's a strong local preference for a variant of U+0257 in certain African language

Re: Indic Devanagari Query

2003-02-04 Thread Jim Allan
Peter Constable wrote, There are indeed some examples in Latin script. For instance, there are three different typeforms for 014A used by different language communities. It's also been reported that there's a strong local preference for a variant of U+0257 in certain African language

Re: Indic Devanagari Query

2003-01-30 Thread Anto'nio Martins-Tuva'lkin
On 2003.01.29, 05:52, Aditya Gokhale [EMAIL PROTECTED] wrote: 1. In Marathi and Sanskrit language two characters glyphs of 'la' and 'sha' are represented differently as shown in the image below - (First glyph is 'la' and second one is 'sha') as compared to Hindi where these character glyphs

Re: Indic Devanagari Query

2003-01-29 Thread Keyur Shroff
Hi Aditya, --- Aditya Gokhale [EMAIL PROTECTED] wrote: I had a few queries regarding the representation of Devanagari script in Unicode (Code page - 0x0900 - 0x097F). Devanagari is a writing script used in Hindi, Marathi and Sanskrit languages. I have the following questions - In the same

Re: Indic Devanagari Query

2003-01-29 Thread Keyur Shroff
give different code pages for Marathi, Hindi and Sanskrit. Maybe the current code page of Devanagari can be treated as Hindi and two new code pages for Marathi and Sanskrit be added. This could solve these issues. If there is any better way of solving this, please suggest it. Instead of changing

Re: Indic Devanagari Query

2003-01-29 Thread Aditya Gokhale
Hello, Thanks for the reply. I will check the points as you said, as far as the font issues are considered. We all know how jna,shra and ksh are formed in UNICODE and ISCII, but the point I wanted to make was, if we have to sort / search / process the data in Devanagari script, then we have

RE: Indic Devanagari Query

2003-01-29 Thread Marco Cimarosti
Aditya Gokhale wrote: Hello Everybody, I had a few queries regarding the representation of Devanagari script in Unicode All your questions are FAQs, so I'll just reference the entries which answer them. (Code page - 0x0900 - 0x097F). Devanagari is a writing script used in Hindi, Marathi

Re: Indic Devanagari Query

2003-01-29 Thread Keyur Shroff
merit. If there is demand from native speakers then a proposal can be submitted to Unicode. There is a predefined procedure for proposal submission. Once this is discussed with concerned people and agreed upon then these ligatures can be added in Devanagari script itself because Devanagari

Re: Indic Devanagari Query

2003-01-29 Thread John Cowan
Keyur Shroff scripsit: Sentiments are attached with cultures which may vary from one geographical area to another. So when one of the many languages falling under the same script dominate the entire encoding for the script, then other group of people may feel that their language has not been

Re: Indic Devanagari Query

2003-01-29 Thread Michael Everson
in Unicode by a string of three characters. In Unicode many characters have been given codepoints regardless of the fact that the same character could have been rendered through some compose mechanism. This includes Indic scripts as well as other scripts. For example, in Devanagari script some code

RE: Indic Devanagari Query

2003-01-29 Thread Kent Karlsson
I wouldn't go so far. The fact that clusters belong together is something that can be handled by the software. Collation and other data processing needs to deal with such issues already for many other languages. See http://www.unicode.org/reports/tr10 on the collation algorithm. I

Re: Indic Devanagari Query

2003-01-29 Thread Christopher John Fynn
writing Sanskrit words containing KSSA in Tibetan script. I had thought that the argument for including KSSA as a separate character in the Tibetan block (rather than only having U+0F40 and U+0FB5) was originally for compatibility / cross mapping with Devanagari and other Indic scripts. - Chris

Re: Indic Devanagari Query

2003-01-29 Thread Rick McGowan
Aditya Gokhale wrote: 1. In Marathi and Sanskrit language two characters glyphs of 'la' and 'sha' are represented differently as shown in the image below - Actually, for everyone's information: these allographs for Marathi were recently brought to our attention, and Unicode 4.0 will have a

RE: Indic Devanagari Query

2003-01-29 Thread Marco Cimarosti
Christopher John Fynn wrote: I had thought that the argument for including KSSA as a separate character in the Tibetan block (rather than only having U+0F40 and U+0FB5) was originally for compatibility / cross mapping with Devanagari and other Indic scripts. Which is not a valid reason

Indic Devanagari Query

2003-01-28 Thread Aditya Gokhale
Hello Everybody, I had a few queries regarding the representation of Devanagari script in Unicode (Code page - 0x0900 - 0x097F). Devanagari is a writing script used in Hindi, Marathi and Sanskrit languages. I have the following questions - 1. In Marathi and Sanskrit language two characters glyphs

RE: converting devanagari to mangal unicode

2002-12-17 Thread Marco Cimarosti
John Hudson wrote: At 03:09 PM 12/16/2002, Eric Muller wrote: In order to convert any Devanagari font to be rendered in the same way, Maybe Sunil is just asking for a conversion of data, presumably from ISCII to Unicode. Ah, yes, this is possible. I'm so used to people asking

Re: converting devanagari to mangal unicode

2002-12-17 Thread Bob_Hallissy
. There are many non-Unicode encodings of Devanagari, so I'm unable to guess how your data is currently encoded. TECkit is table-driven, i.e., you find or prepare a description of the mapping between your encoding and Unicode, and then TECkit uses that description to convert data. You may even be able to find

RE: converting devanagari to mangal unicode

2002-12-17 Thread Marco Cimarosti
Bob Hallissy wrote: NB: One of the complexities you may run into, and which will limit your options, is that your encoding may store text in a different order than Unicode requires. If this is the case, TECkit can do the rearrangement for you but I'm not sure ICU will easily do that. Certainly

Re: converting devanagari to mangal unicode

2002-12-17 Thread Peter_Constable
On 12/16/2002 05:09:04 PM Eric Muller wrote: May be Sunil is just asking for a conversion of data, presumably from ISCII to Unicode. Or perhaps from one of a variety of non-standard Devanagari encodings. - Peter

converting devanagari to mangal unicode

2002-12-16 Thread Magda Danish (Unicode)
. I want to know whether any converter is available for converting devanagari to mangal unicode. Please reply ASAP Sunil -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- (End of Report)

Re: converting devanagari to mangal unicode

2002-12-16 Thread John Hudson
I am Gis/Website developer my query is I have a data in devanagri true type font i want to convert this data into mangal unicode. I want to know whether any converter is available for converting devanagari to mangal unicode. This is, excuse the pun, a bit of a mangled question. Mangal

Re: converting devanagari to mangal unicode

2002-12-16 Thread Eric Muller
In order to convert any Devanagari font to be rendered in the same way, Maybe Sunil is just asking for a conversion of data, presumably from ISCII to Unicode. Eric.

Re: converting devanagari to mangal unicode

2002-12-16 Thread John Hudson
At 03:09 PM 12/16/2002, Eric Muller wrote: In order to convert any Devanagari font to be rendered in the same way, Maybe Sunil is just asking for a conversion of data, presumably from ISCII to Unicode. Ah, yes, this is possible. I'm so used to people asking the other question that I

Devanagari

2002-12-03 Thread Vipul Garg
I have downloaded your font chart for Devanagari, which is in the range from 0900 to 097F. I have also installed the Arial Unicode font supplied by Microsoft office XP suite. I found that not all characters are available for Devanagari. For example letters such as Aadha KA, Aadha KHA

RE: Devanagari

2002-12-03 Thread Alan Wood
Vipul Garg wrote: I have downloaded your font chart for Devanagari, which is in the range from 0900 to 097F. I have also installed the Arial Unicode font supplied by Microsoft office XP suite. I found that not all characters are available for Devanagari. For example letters such as Aadha KA

RE: Devanagari

2002-12-03 Thread Andy White
Vipul Garg was asking why half characters were not included in Unicode code charts and in his copy of Arial Unicode font. More recent versions of Arial Unicode do contain half characters etc. for Devanagari. As to the code charts, to answer this, you needed to explore the Unicode web site a bit

RE: Devanagari

2002-12-03 Thread Marco Cimarosti
Vipul Garg wrote: I have downloaded your font chart for Devanagari, which is in the range from 0900 to 097F. I have also installed the Arial Unicode font supplied by Microsoft office XP suite. I found that not all characters are available for Devanagari. For example letters such as Aadha

Re: Devanagari

2002-12-03 Thread John Cowan
[EMAIL PROTECTED] scripsit: Au contraire! You might find the attached gif of interest. (This is version 1.0 of the font. Some people might have earlier versions.) Ah, excellent. It has not always been so. If you're not getting Indic shaping with Arial Unicode MS, it's very likely the fault
