On 1 Jul 2001, Juliusz Chroboczek wrote:

> RdB> Well, Juliusz it's probably "just" a "little" change to luit so
> RdB> it can be anti-luit too. :-)

> You might argue that this should be fast enough for your needs, but as
> I've gone to quite a bit of trouble to make luit efficient, I'm not
> very keen on making it benchmark-unfriendly.  (Luit in the direct
> direction is I/O bound on my machine; in other words, I couldn't
> measure a difference in speed between ``luit -c'' and ``cat''.)

I'm never one to argue _that_; _I'm_ the person that pushed PuTTY from
about 40Kchar/s to 2.5Mchar/s. I don't _want_ to do proper benchmarks
(Windows faster than Linux ... N.N.Noooooooo :-) ) but I think it's
a lot faster than xterm on the same hardware and even seems to be
about 60% faster than the linux console in VGA _character_mode_, though
that's probably due to having two CPUs available (linux & doze).

> An efficient implementation would consist in keeping a lazily
> maintained table of mappings from codepoints to ISO 2022 charsets,
> probably represented as a lazily unrolled tree.  How many pints is
> such functionality worth to you?

To me as a user? Nothing. My favorite terminal can do UTF-8 itself.
To me as a sysadmin? Ooo, a couple at least.

But I wouldn't use a tree myself; this is almost the same situation as
the linux kernel has. That started as a moderately complex hash table
but switched to a dead simple page table when somebody did some timing
and profiling.
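
A minimal sketch of what I mean by a "dead simple page table", in Python
for brevity. The names (PageTable, PAGE_BITS) and the 256-entry page size
are assumptions made for illustration; this is not the kernel's code nor
anything that exists in luit. The top bits of the codepoint pick a page,
the low bits pick a slot, and a page is only allocated the first time
something is stored in it.

```python
# Two-level "page table" from Unicode codepoint to an arbitrary value,
# e.g. a (charset, byte) pair. Pages are allocated lazily on first store.
PAGE_BITS = 8                       # assumed page size: 256 codepoints
PAGE_SIZE = 1 << PAGE_BITS
PAGE_MASK = PAGE_SIZE - 1
DIR_SIZE = 0x110000 >> PAGE_BITS    # enough pages for all of Unicode


class PageTable:
    def __init__(self):
        # One slot per page; None means "nothing in this page is mapped".
        self.directory = [None] * DIR_SIZE

    def set(self, codepoint, value):
        index = codepoint >> PAGE_BITS
        page = self.directory[index]
        if page is None:
            # Allocate the page only when it is first needed.
            page = [None] * PAGE_SIZE
            self.directory[index] = page
        page[codepoint & PAGE_MASK] = value

    def get(self, codepoint):
        page = self.directory[codepoint >> PAGE_BITS]
        if page is None:
            return None
        return page[codepoint & PAGE_MASK]


table = PageTable()
# U+0416 (CYRILLIC CAPITAL LETTER ZHE) is byte 0xB6 in ISO 8859-5.
table.set(0x0416, ("iso8859-5", 0xB6))
print(table.get(0x0416))
print(table.get(0x4E00))
```

A lookup is two index operations and one test, with no hashing and no
tree walk, which is the whole attraction.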

Also in the case of luit you'd have to give it a strong hint as to which
character sets are best to use; you might even have to limit it to
specific sets like l-k does. So the first thing to do would be to
populate the page table with those character sets. That should get you
most of what the user wants to use. If you're locked into those sets you
need to add the substitute tables. Only if the host terminal can do full
ISO-2022 do you want to add charsets on the fly. (And you want to add
them a full c-set at a time to reduce the chance of continual switching.)
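
Roughly, the populate step could look like the sketch below. It is only
an illustration under assumptions: it uses Python's own iso8859_1 and
iso8859_5 codecs as stand-ins for real charset tables, a plain dict in
place of the page table, and hypothetical helper names (load_charset,
build_table). Substitute tables and the on-the-fly case are not covered.
Each charset goes in whole, and a codepoint that is already mapped keeps
its earlier (preferred) charset.

```python
def load_charset(table, codec_name):
    """Add every 0xA0-0xFF character of one 8-bit charset to the table.

    Codepoints that are already mapped are left alone, so charsets
    loaded earlier take priority. Returns the number of entries added.
    """
    added = 0
    for byte in range(0xA0, 0x100):
        try:
            char = bytes([byte]).decode(codec_name)
        except UnicodeDecodeError:
            continue                # byte is undefined in this charset
        codepoint = ord(char)
        if codepoint not in table:
            table[codepoint] = (codec_name, byte)
            added += 1
    return added


def build_table(preferred):
    """Populate a table from the hinted charsets, in preference order."""
    table = {}
    for codec_name in preferred:
        load_charset(table, codec_name)
    return table


table = build_table(["iso8859_1", "iso8859_5"])
print(table[0x00E9])    # e-acute, only in ISO 8859-1
print(table[0x0416])    # Cyrillic ZHE, only in ISO 8859-5
print(table[0x00A7])    # section sign is in both; the first charset wins
```

Putting the user's preferred charset first means shared characters never
force a switch away from it.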

Hmmm, I only know of one thing that does _full_ ISO-2022 switching ...
and while running two luits stacked might have good hack value, it
doesn't seem very practical. :-)

NB: I realise it might mean a slow startup for CJK character sets but
    that's minor compared to the width issue.  I don't suppose any of
    the DBCS terminals can switch to ISO 8859-5 to get single-width
    Cyrillic. :-/

-- 
Rob.                          (Robert de Bath <robert$ @ debath.co.uk>)
                                       <http://www.cix.co.uk/~mayday>


-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
