On 7/5/11, Henrique Peron <hpe...@terra.com.br> wrote:
> Before I forget, I noticed that you do use ISO codepages.
> I'll work on distinct packs of codepages and keyboard layouts for ISO
> 8859-1 ~ 16.

Honestly, I very rarely use only Latin-3 (913), so please don't waste
500 hours on my account!   ;-)   It's very low priority. Minimum
"good" set would be Latin 1-4 (IMHO) and perhaps Latin-15 (or whatever
is Latin-1 with Euro, I never can remember, Latin-9 or ISO 8859-15 or

>>> While Unicode is huge, DOS keyboard layouts tend to be limited to
>>> Latin and Cyrillic and some other symbols which is a tiny subset.
> Nowadays, FreeDOS is able to work with the latin, cyrillic, greek,
> armenian and georgian alphabets, the cherokee syllabary and japanese.

You are a one-man marching band!! You've done such good work here for us!   ;-)

>> Right-to-left might be hard to do (I guess?), but technically as long
>> as they can see and enter what they want, I'm sure they can get used
>> to left-to-right.
> Excuse me? How can anyone type the arabic, syriac or hebrew abjads from
> left to right? *That* would be really exotic, if ever possible! :-)

How can anybody play guitar upside down or wrong-handed? But people do
it!!!  ;-)

kool m'i gnipyt siht sdrawkcab thgir won (ylwols)

BTW, last I heard, Eli Z. was working on bidi editing in GNU Emacs.

> Visually speaking, if an eventual reader doesn't know hebrew (or
> yiddish, or ladino, etc.), he might not know if a text is correctly
> (right-to-left) or incorrectly (left-to-right) typed because the letters
> don't connect to each other. On the other hand, abjads like arabic and
> syriac have most of their letters shaped in a way that they connect to
> each other - always from right to left.

I'm just saying, supporting the actual chars themselves being entered
and displayed is better than nothing, even if it's forced left to
right for simplicity (or technical limitations). Not ideal, but I'm
sure they can get used to it. But I don't honestly know what KEYB does
(or could) support in that area. I'm just trying to be pragmatic /

> UTF-8 is best suited for languages written with the latin alphabet

I just don't know if such a bias really is universally accepted or
not. As we've seen, it's not exactly "universal" which Unicode method
is preferred. I guess it matters less these days with Java being
ubiquitous and RAM being humongous.
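
For what it's worth, the size trade-off is easy to measure. Here's a rough sketch in Python (not something from this thread; the helper name and the sample strings are made up for illustration) comparing how many bytes the same text takes in UTF-8 and UTF-16:

```python
# Hypothetical helper: byte count of one string under two Unicode encodings.
# "utf-16-le" is used so the 2-byte BOM doesn't inflate the UTF-16 figure.
def encoded_sizes(text):
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le")}

# Sample strings chosen for illustration only.
samples = {
    "english": "hello",
    "russian": "привет",
    "japanese": "こんにちは",
}

for name, text in samples.items():
    print(name, encoded_sizes(text))
```

With these samples UTF-8 is smaller for the Latin text, the two tie for the Cyrillic text, and UTF-16 is smaller for the Japanese text, which is about all the "bias" amounts to.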

>>>> 4). Arabic (easy??)
>>> Unicode lists maybe 300 chars for that, at most.
> If we restrict ourselves to the arabic language, I can tell you that it
> is much less.

We don't need to support "everything", just enough for reasonable functionality.

> If we mean the arabic abjad - and then it comes around 100 languages
> ,... I can tell you that we're talking about much more than 300 chars.

Hmmm, annoying but no huge surprise.

> If  we multiply the number of conjuncts by the number of abugidas in the
> indian subcontinent, we easily have thousands of distinct glyphs.

Ugh! Heheheh, nobody said i18n was easy.   ;-)

> My conclusion: either there was a wholly tailored MS/IBM-DOS for India
> on those days or there were particular COM/EXE programs that would put
> any regular DOS on graphics mode so to handle ISCII.

See Hindawi@FreeDOS. (Haven't checked, but it sounds like it uses
Allegro for gfx.)

> Important to mention is that english is generally regarded as
> "pure-ASCII" but we must consider the fair amount of foreign words (like
> "café") and the need of accented/special chars used in middle and old
> english, therefore the english language (as much as german, french or
> any other latin-alphabet-based language) also falls in the same
> situation as portuguese.

Well, except that almost nobody puts accents on English words, even
loan words. At least I never do. "naive" and "cafe" have to suffice
for me.  ;-)

BTW, surely I'm not telling you anything you didn't already know, but
... Old English is, erm, kinda like dead and old and 100%
incomprehensible and not used and stuff. (Beowulf?)   :-))    Middle
English is just weird spelling and archaic words (Shakespeare?
Chaucer?), hence we're not exactly using it a lot either. ("Anon!
Forewith she shewn the waye!")

I guess it matters more in languages where (lacking) accents change
the meaning of words (E-o:  si, sxi ... horo, hxoro ... salto, sxalto
... ktp. ktp. ktp.). But English is already weird with homonyms (wind,
bow), ambiguous stuff, or whatever.

> In what comes to storage (and UTF-8), russian needs the regular latin
> digits (1 byte each) and the cyrillic letters (2 bytes each char); if we
> think on cyrillic needs in general, then we also have the ukrainian
> hryvnia currency sign, a 3-byte char (again, "Currency Symbols",
> 20A0h-20CFh).

I don't know why it isn't acceptable to just spell it out as "30
hryvnia" instead of always having specific symbols for everything.
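
The byte counts quoted above are easy to check, by the way. A small sketch in Python (the function name is invented for this example; the hryvnia sign is U+20B4):

```python
# Hypothetical helper: pair each character with its UTF-8 encoded length.
def utf8_widths(text):
    return [(ch, len(ch.encode("utf-8"))) for ch in text]

# A Latin digit, a Cyrillic letter, and the hryvnia sign (U+20B4).
for ch, width in utf8_widths("7ж\u20b4"):
    print(f"U+{ord(ch):04X} takes {width} byte(s)")

# Spelling the currency out in ASCII versus using the symbol.
print(len("30 hryvnia".encode("utf-8")), len("30 \u20b4".encode("utf-8")))
```

So the symbol costs 3 bytes on its own, but "30 hryvnia" spelled out is still the longer string.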

>>>> own scripts are a problem, not to mention those like CJK that have
>>>> thousands of special characters. (e.g. Vietnamese won't fit into a
>>>> single code page, even.)
> Actually, it does. There was a standard called VISCII on the old days.
> It has been available for FreeDOS for a while already. The catch is: due
> to the huge amount of necessary precomposed chars (134), there are no
> linedraw, shade, block or any other strictly non-vietnamese precomposed
> char on the upper half of VISCII and 6 less-used control chars on the
> lower half had their glyphs traded for the remaining 6 precomposed
> vietnamese accented latin letters.

I looked it up on Wikipedia a while back, and it had like three
different workarounds, all different but all logical enough (to me, at
the time). So maybe I'm worried over nothing. Maybe they can get along
fine without explicit support. Maybe we should let them come to us and
tell us how the hell to "fix" it!   :-))

>>> When you have Unicode, you do not need codepages.
>> Right. And when you have a 286 or 386, you don't need to limit to 1 MB
>> of RAM.   ;-))
> Furthermore, due to the number of glyphs (and the shape complexity of
> many of them), I can only imagine Unicode working on graphics mode and
> that will certainly complicate matters for very old computers...

For good or bad, it's long been assumed by most developers that
everybody has VGA or SVGA or newer. (With "modern" OSes, it's worse:
gfx acceleration, OpenGL, DirectX 9, etc.)

> Unless it be considered some sort of "sub-Unicode" support for them, focusing
> only on latin, cyrillic, greek, armenian and georgian alphabets because
> their letters can easily fit on regular codepages and they cover the
> needs of the majority of world's languages. That could be the best
> possible workaround.

I'm not sure 8086 is really a feasible target anymore (though I'm not
actively suggesting dropping it). But do such retrocomputing people
even want Unicode support? I doubt it. Like you said, they're probably
happy enough (or even English only!).

> I'm also working on the arabic and hebrew abjads - to work particularly
> under Mined. Codepages 856 and 862 (hebrew) and 864 (arabic) have been
> ready for a long time already but I had never seen a way to use them
> until I found out about Mined.

Sounds good!

> So far, under request, I have only
> prepared a phonetic spanish/arabic keyboard layout. Since it was a
> particular need (instead of a regular standard), it will not be released
> in the keyboard layout pack for FreeDOS - unless, naturally, I'm told
> that many users would need it.

Not me. For me, E-o is all I need for i18n (semi-joking).   ;-)

P.S. <snip> Dang, that was a long quote (my bad). I ramble too much
(sorry sorry).
