On Sat, Aug 05, 2006 at 11:11:02AM +0200, Werner LEMBERG wrote:
> > BTW another issue of the substitution rules is that, as far as I can
> > tell, they can delete or insert extra glyphs arbitrarily.
> 
> Of course.  How would you handle a ligature?  `f' + `l' = `fl' -- this
> means that a character has been deleted.

That's exactly my point: this type of substitution is NOT possible in a
character cell environment. You can still make a ligature out of the
two characters, but it will necessarily be two cells wide. That is the
small price you pay in 'prettiness' for the huge reward in simplicity
you get from a character cell device. The spacing of text does not vary
with the font (which the application knows nothing about); it depends
only on assumed-constant properties of the _characters_ involved.
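
To make the "properties of the characters" point concrete, here is a
rough sketch in Python of how an application could compute column
widths with no font information at all. The function name cell_width
is made up for illustration, and the rules are a simplification I am
assuming here (combining marks take zero cells, East Asian wide and
fullwidth characters take two, everything else takes one); a real
terminal would also need to handle control characters and other
special cases.

```python
import unicodedata

def cell_width(text: str) -> int:
    """Approximate number of character cells needed to display text.

    Simplified, assumed rules: combining marks occupy no cell of their
    own, East Asian wide/fullwidth characters occupy two cells, and
    every other character occupies one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # shares the cell of the preceding base character
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width

# 'f' followed by 'l' is always two cells, ligature or not.
print(cell_width("fl"))
```

The result depends only on the character sequence, so both ends of a
remote connection can agree on the layout without exchanging any font
metrics.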

Disadvantages:
- sometimes a bit ugly; good fonts designed to look nice in a
  character cell environment make the situation a lot better.

Advantages:
- extremely low bandwidth for remote access since no font metric
  information or glyphs need be exchanged.
- simple application implementation.
- high performance screen updates.
- compatible with a huge amount of existing software.
- makes adding Unicode support to existing software much easier.

IMO users got spoiled in legacy 8-bit environments by fancy text
rendering, which was fast and easy with just 256 glyphs but is much
harder to make efficient with all of Unicode. The nice property of
character cell environments is that they don't lose performance or
massively grow in complexity when you add 60-100k characters.

> > > And there is still the question who is going to implement this.
> >
> > Anyone can since the spec is trivial to implement. With all but the
> > context-matcher implemented so far, my implementation compiles to 528
> > bytes of i386 code.
> 
> I'm *really* interested to see this :-)

OK, but it's nothing fancy, just a tiny interpreter for the sort of
language I described in my other post (actually for an earlier version
of the same idea). I'll post the code to the new one once I get a
little more of it done. Most of the fancy code will be in the font
compiler that builds the interpreted code from the list of glyphs and
the characters/contexts they correspond to.

> > > Mongolian can be and is written horizontally as well.
> >
> > I didn't find any better references searching google than you would.
> > It seems to be a new invention, and the glyphs are rotated 90
> > degrees from their vertical presentation in order to combine nicely.
> 
> I rather think that this is an invention to overcome the complications
> with computers.  I'll ask a friend who is an expert for Mongolian.

From what I could gather there are maybe two styles. One is a form
that was essentially a hack to write your document on the computer
horizontally, then rotate the paper after you print it... :P Nasty
hack, eh? I think this form was written right-to-left to make the
printing come out right.

There also seems to be a form that's meant to be read as-is and mixed
with left-to-right horizontal text in other scripts and languages.
Unless this is highly offensive to many Mongolian speakers (which I
kinda doubt since they're probably used to using Cyrillic quite a
bit..) I think it's reasonable to believe that this is the preferred
form for use in multilingual (as opposed to localized Mongolian)
computers except when preparing traditional-style typeset output.

> > In terms of what I "want" (for my own use): Latin, Tibetan, Japanese,
> > and mathematical notations.
> 
> Hmm.  Mathematical notation is two-dimensional by its very nature.
> Please elaborate.

Obviously if you want anything fancy you should be using LaTeX, but if
you're just trying to express yourself coherently in an email, having
a huge collection of mathematical characters not available in ASCII
can be very helpful. Also, LaTeX source files could be a lot more
compact and legible to non-experts if they contained, for example, the
Greek character alpha instead of \alpha for each occurrence. I'm not
sure of the status of support for things like this, but I've seen some
LaTeX packages for using UTF-8 in source files.
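
As a rough sketch of what such a source file might look like, assuming
the standard inputenc package with its utf8 option: the explicit
\DeclareUnicodeCharacter mapping below is my own illustration of one
way to make a literal alpha usable, not the interface of any particular
package I have tested for this.

```latex
\documentclass{article}
\usepackage[utf8]{inputenc}
% Assumed mapping for illustration: U+03B1 (Greek small alpha) -> \alpha
\DeclareUnicodeCharacter{03B1}{\ensuremath{\alpha}}
\begin{document}
The angle α satisfies $0 < α < \pi$.
\end{document}
```

Each additional character would need its own mapping (or a package
that supplies them), so this only shows the idea for a single symbol.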

Rich



--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
