RE: Suggestions in Unicode Indic FAQ

Keyur Shroff Sun, 02 Feb 2003 22:06:06 -0800

--- Kent Karlsson <[EMAIL PROTECTED]> wrote:

> > 
> > Without that dotted circle appearing, the e-matra would appear to
> > have been properly encoded, 
> 
> No, with proper reordering (and "normal" display mode), the e-matra at
> the beginning of the second word would appear to be last glyph of the
> first "word".  Similarly, for the second case, the e-matra glyph would
> have come to the left of the pa.  The fluent reader (ok, not me...)
> would then see those errors anyway, just like I can find spelling
> errors in Swedish, most often without any kind of special marking. (I'm
> assuming through-out that reordrant combining characters are reordered.)


Illegal sequences are not reordered as you indicated. Also, as far as I
know there is no mention of reordering of illegal input sequence (or
invalid combining mark) in Unicode standard.

Consider the last set of glyphs (left-to-right, top-to-bottom) in the
attached image. It is the rendering effect of illegal input sequence
"Devanagari Vowel Sign I" [U+093F] + "Devanagari Letter Ka" [U+0915] and
without any dotted circle. As you might be knowing the correct input
sequence should be U+0915 followed by U+093F. In that case the result would
have been similar to what appears right now. (Though some more
sophisticated font/application may want to replace the appearing glyph for
U+093F to be substituted by some other glyph with proper attachment point).
Now there is no way that user can identify this illegal input sequence
without dotted circle. In the worst case even this rendered glyph is
attached to the character from a class (for example, consonant cluster of
"Ka" "Virama" "Ma") for which the glyph has been designed to render with.
In such case even a fluent reader can not identify the error.

> 
> There are spelling errors, yes.  But there are other ways of indicating
> spelling errors, that are (by now) fairly conventional for any language
> (as long as there is an appropriate dictionary installed), and that also
> are more general (in catching more spelling errors) and less obtrusive
> (the author really wants to write it that way, for some reason).
> 
> > Apparently, Michka used a non-OpenType Bengali Unicode font when
> > he embedded the fonts into the page.  As long as you are looking
> > at the page on-line, with the embedded fonts, these errors are
> > invisible.  
> > 
> > It may be typographically horrible.  It *should* be typographically
> > horrible in order to illustrate bad sequences clearly.
> 
> I'd prefer little red wiggly lines under the word, or yellow background
> or some such (just for screen display, not for printing; screen grabs
> not counted).  And that for any spelling "error".

Spelling mistakes can be categorized into two different classes. One
arising from illegal input sequence (e.g., Vowel Sign E as the first
character in a word) and the other one is legal input sequence with no
contextual meaning in the dictionary. While indication of the second type
of mistake is generally used only in sophisticated applications like word
processor, everyone wants to know the first kind of mistake. With your
explanation it seems that even plain text editor is not useful at all to
identify such common typing mistakes!

- Keyur


__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

<<inline: img1.jpg>>

RE: Suggestions in Unicode Indic FAQ

Reply via email to