Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-10 Thread Bret Johnson
 Apart from turning DISPLAY into a DOS device driver and overriding the
 kernel's CON, not only for IOCTL but also for write.

FWIW, you don't actually need to turn DISPLAY into a device driver in order to 
replace/enhance CON.  You can do that with a TSR also.  See my USBPRINT if you 
want an example of how -- it replaces the default LPT1-LPT3, adds LPT4-LPT9, 
and even passes calls through to the old LPTx in situations where that is 
needed.

IMHO, TSRs have a lot of advantages over device drivers, and can still be
installed in CONFIG.SYS if you actually want/need to do that (with the INSTALL= 
option).


--
All of the data generated in your IT infrastructure is seriously valuable.
Why? It contains a definitive record of application performance, security 
threats, fraudulent activity, and more. Splunk takes this data and makes 
sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-d2d-c2
___
Freedos-user mailing list
Freedos-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-user


Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-10 Thread Aitor Santamaría
Ok, sorry, that's what I meant. That you find the chain at the List of
Lists, right?

Aitor

2011/7/10 Bret Johnson bretj...@juno.com:
 I'm curious, you check the LoL to get the pointers and override it?

 No, you just insert a new one with the same name in the Device Driver chain.  
 DOS always searches the chain in order, and uses the first one with the 
 correct name that it finds.  It doesn't actually know, or even care, where 
 the real one is.




Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-10 Thread Bret Johnson
 Ok, sorry, that's what I meant. That you find the chain at the List
 of Lists, right?

Yes.  The first Device Driver header (NUL) is in the LoL.  From there, you can 
follow the chain (a linked list of pointers) as far as you want, and can 
insert/remove new headers wherever you want.
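
The chain behaviour described here can be modeled in a few lines. This is a toy Python model, not real DOS code: the names `DeviceHeader`, `find_device` and `insert_after` are made up for illustration, and a real header also carries attribute bits and the strategy/interrupt entry points, which are left out.

```python
class DeviceHeader:
    """Toy stand-in for a DOS device driver header (hypothetical model)."""

    def __init__(self, name, owner):
        self.name = name      # device name, e.g. "CON"
        self.owner = owner    # who installed it (for illustration only)
        self.next = None      # link to the next header in the chain


def find_device(head, name):
    """Walk the chain from the head and return the first header with this name."""
    node = head
    while node is not None:
        if node.name == name:
            return node
        node = node.next
    return None


def insert_after(node, new_header):
    """Link new_header into the chain directly behind node."""
    new_header.next = node.next
    node.next = new_header


# Chain as set up at boot: NUL (the head, found via the LoL) -> CON -> LPT1
nul = DeviceHeader("NUL", "kernel")
con = DeviceHeader("CON", "kernel")
lpt1 = DeviceHeader("LPT1", "kernel")
nul.next = con
con.next = lpt1

# A TSR adds its own CON right behind NUL; the kernel's CON stays in place.
insert_after(nul, DeviceHeader("CON", "tsr"))

print(find_device(nul, "CON").owner)   # the first match wins
```

The old CON header is still reachable further down the chain, which is what would let a replacement pass calls through to it.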




Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-08 Thread Aitor Santamaría
Hello,

2011/7/7 Eric Auer e.a...@jpberlin.de:
 Still I think UTF-8 aware KEYB and DISPLAY together with old apps
 are still a lot more useful than any "you always have to use 16 bit
 wide characters" method which would only work with new apps at all.

KEYB would need no changes: a character two chars wide would simply be
a string. True, the corresponding KL layouts would not be comfortable
to write, but it is still feasible.

As for DISPLAY: MS-DISPLAY is a true enhancement of CON, but
FD-DISPLAY is not (yet).
That would require turning DISPLAY into a DOS device driver that
overrides the kernel's CON, not only for IOCTL but also for write.
However, it would still leave out:
- int 10h users
- direct video BIOS writers
but it would be a beginning. Am I leaving anything out?

Aitor



Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-07 Thread Eric Auer

Hi Jeffrey,

 Would chaining interrupt 0x10 be reasonable? If I am not mistaken the
 FreeDOS kernel uses interrupt 0x10 function 0x0E to print characters to
 the screen. A TSR could be written to handle function 0x0E and pass the
 other functions to the BIOS.

Of course. In the old days of bad BIOS implementations, there were
TSRs which hooked int 10 functions 0, 2, 9, e and maybe a few more
to replace them with faster implementations, but nobody stops you
from replacing them with a graphical Unicode font implementation.
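
The "handle one function, pass the rest on" idea can be sketched as a toy dispatcher. This is Python standing in for what would be assembly in a real TSR; `bios_int10`, `make_hook` and the dictionary of registers are hypothetical names for illustration only.

```python
def bios_int10(regs, screen):
    """Stand-in for the original BIOS handler (hypothetical; it only logs the call)."""
    screen.append(("bios", regs["ah"]))


def make_hook(old_handler):
    """Return a new int 10h handler that takes over AH=0Eh and chains the rest."""
    def hooked_int10(regs, screen):
        if regs["ah"] == 0x0E:
            # Teletype output: draw the character in AL ourselves.
            screen.append(("tsr", chr(regs["al"])))
        else:
            # Every other function goes to the previous handler unchanged.
            old_handler(regs, screen)
    return hooked_int10


int10 = make_hook(bios_int10)   # what a TSR does when it installs itself

screen = []
int10({"ah": 0x0E, "al": ord("A")}, screen)   # handled by the hook
int10({"ah": 0x02, "al": 0}, screen)          # set cursor: passed through
print(screen)
```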

You can also use a DISPLAY driver which hooks the CON device, but
fewer apps might use that. The default CON of FreeDOS should use
int 10 anyway, so hooking those would cover both... On the other
hand, apps which like to control their screen layout in a quick
and dirty way (or just want fast screen updates) will write the
text mode memory (array of chars and colors) directly anyway so
to cover even those, you would have to poll that memory or use a
protected mode trap to be able to react when that array changes.

HOWEVER, the array, int 10 and CON are all officially assuming a
one byte is one character scenario so your layout is likely to
get messed up when you use UTF-8 and you cannot use UTF-16 anyway.

You could create a variant of int 10 where the upper half of some
32 bit register contains UTF-16, but most apps would not use it.

This is why I think it is acceptable to have a messy layout with
old apps (feeding them with UTF-8 while they believe it is not)
and making some newer apps UTF-8 aware, which means aware of the
fact that in UTF-8, one char in layout can be 2 or more bytes.

Eric




Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-07 Thread Jeffrey
Hi Eric,

 HOWEVER, the array, int 10 and CON are all officially assuming a
 one byte is one character scenario so your layout is likely to
 get messed up when you use UTF-8 and you cannot use UTF-16 anyway.

In color text modes, alternating bytes are used for character and attribute.
So if you were willing to sacrifice color, could you then use both bytes for
UTF-16?

But function 0x0E does not have a parameter for attribute. So the second byte
would have to be specified a different way. Or use function 0x09 or 0x0A.

Also chaining interrupt 0x10 to display text right to left is harder than I
expected. I'll keep working on it though.





Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-07 Thread Eric Auer

Hi Jeffrey,

 HOWEVER, the array, int 10 and CON are all officially assuming a
 one byte is one character scenario so your layout is likely to
 get messed up when you use UTF-8 and you cannot use UTF-16 anyway.

 In color text modes, alternating bytes are used for character and attribute.
 So if you were willing to sacrifice color could you use then both bytes...

This is a hardware thing. The widest characters that you can get come
from using one bit of the color to select a font page, allowing a
total of 512 different characters to be displayed... But then again
software which writes to the b800: array ASSUMES that exactly 1
byte is color and 1 byte is the character... Only software which is
explicitly written for that uses the extra bit of 9 bit characters.
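
A rough sketch of what such a 9 bit character cell looks like, in Python. It assumes that bit 3 of the attribute byte is the one borrowed as the font page selector (as on VGA when two character maps are enabled); `pack_cell` and `unpack_cell` are hypothetical helper names.

```python
FONT_SELECT_BIT = 0x08  # assumption: attribute bit 3 picks the font page


def pack_cell(glyph, color):
    """Pack a 9-bit glyph index (0..511) and a color attribute into two bytes."""
    if not 0 <= glyph <= 511:
        raise ValueError("only 512 glyphs fit into 9 bits")
    char_byte = glyph & 0xFF
    attr_byte = color & ~FONT_SELECT_BIT & 0xFF   # clear the borrowed bit
    if glyph > 0xFF:
        attr_byte |= FONT_SELECT_BIT              # upper 256 glyphs: second page
    return bytes([char_byte, attr_byte])


def unpack_cell(cell):
    """Recover the glyph index the way 512-character-aware software would."""
    char_byte, attr_byte = cell
    page = 1 if attr_byte & FONT_SELECT_BIT else 0
    return page * 256 + char_byte


cell = pack_cell(0x141, color=0x07)
print(cell.hex(), unpack_cell(cell))
# Old software looks only at the first byte and sees glyph 0x41.
print(hex(cell[0]))
```

The price is visible in `pack_cell`: one color bit is no longer available as color, and software unaware of the scheme still treats the first byte as the whole character.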

 But function 0x0E does not have a parameter for attribute. So the second byte
 would have to be specified a different way. Or use function 0x09 or 0x0A.

As said - IF you only want to support software which is re-compiled
to explicitly support some wide char extensions, you could say that
e.g. the upper 16 bits of EAX are the character, instead of only AL.

If you want to stay at least partially compatible with old software
then UTF-8 is a good idea because that only transports one byte at
a time, combining wide characters from multiple bytes when needed.
The only problem with that is that as said, it breaks the layout of
old software.

For example old software can easily process some UTF-8 string with
10 bytes which is displayed by UTF-8 aware displays as 7 characters
of which a few are accented, Japanese or whatever, but that older
software will BELIEVE that the string will look 10 characters long
on screen instead of the actual 7. Also, if you edit the string in
the old software, you may have to remove several characters which
are in fact just bytes before you have removed 1 Unicode character.

In the other direction, if old software asks you to type at most 7
characters and your UTF-8 aware keyboard driver sends 10 chars to
represent the 7 Unicode characters that you are trying to type, of
course the old software will only accept the first 7 bytes there,
which could be for example 5 and a half Unicode chars in UTF-8...
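
Both effects are easy to reproduce. The sketch below uses Python only as a calculator; the sample string is an arbitrary choice that happens to have 7 characters and 10 bytes.

```python
text = "café €5"                 # 7 characters on a UTF-8 aware display
data = text.encode("utf-8")      # what byte-oriented old software sees

print(len(text), len(data))      # characters vs. bytes

# An input field limited to "7 characters" really keeps 7 bytes.
kept = data[:7]
# The cut lands inside the 3-byte euro sign; drop the broken tail to inspect it.
visible = kept.decode("utf-8", errors="ignore")
print(repr(visible), len(visible))
```

Here the 7 kept bytes hold 5 complete characters plus the first byte of the euro sign, which is the "and a half" case.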

Still I think UTF-8 aware KEYB and DISPLAY together with old apps
are still a lot more useful than any "you always have to use 16 bit
wide characters" method which would only work with new apps at all.

Eric






Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-06 Thread Rugxulo
Saluton,

On 7/6/11, Henrique Peron hpe...@terra.com.br wrote:

 Em 05/07/2011 18:25, Rugxulo escreveu:

 Honestly, I very rarely use only Latin-3 (913), so please don't waste
 500 hours on my account!   ;-)   It's very low priority.

 My friend, it is always a pleasure. I do hope that end-users have as
 much fun using my codepages and keyboard layouts as I have while
 making the necessary researches and working on them. :)

It's cool to me to see when other languages work. It just seems almost
magical. And of course I consider compilers a similar breed of magic.
(But it can be complicated!)

 ISO 8859: good part of the job is already done (the codepages) - for a
 long time already, by the way. All I need now is to work on distinct
 versions of all the keyboard layouts which could work with ISO
 codepages; if it takes 500 hours to get the job done, don't worry. I
 won't bill you. ;-)

No pressure!! Little by little does the trick.

 Latin-1 with Euro, on ISO, is Latin-9, a.k.a. ISO 8859-15.

Which is completely logical (facepalm)!

 You are a one-man marching band!! You've done such good work here for us!
  ;-)
 Thank you for your words (on the good work) but we know that it is not
 quite a one-man marching band - without Aitor's KEYB/KC/KLIB/DISPLAY
 and Eric's MODE, I couldn't have done anything. hehehe!! :)

Okay, yes, forgot about Eric and Aitor, heheh. Yes, of course they've
done tons, Jim too (and Bart and Japheth and Blair and Steve and
Jeremy and Tom and Stefan and Rene and Bernd and ...).

 Besides, there is this one case which I didn't participate in: support
 for japanese.  This one is not my child. It was teamwork directly
 between Aitor and a japanese end-user. Not only I don't even remotely
 have knowledge on japanese kanji (so to work on japanese codepages) but
 I also don't have the necessary hardware to test it.

Nor do most of us, which is an annoying problem (no suitable
hardware). It's almost insurmountable when you can't find any way to
test.

 It turns out that, when/if there's a korean or chinese FreeDOS user, I
 won't be able to help him at all. I'm seriously curious about how
 Johnson Lam deals with that, by the way.

No idea. I almost would suggest to just let CJK users handle it
themselves since it's so complex. I mean, they understand their needs
better than us! Or perhaps they'll chime in here eventually.

 BTW, last I heard, Eli Z. was working on bidi editing in GNU Emacs.

 H... I don't know Eli Z. nor GNU Emacs. Just a moment. Let me google
 it. (Sandwatch rolling)

 Oh, ok! Great! Interesting! However, I didn't find any mention to
 BIDI, arabic, hebrew, right, left, etc. on his webpage.
 Perhaps BIDI is a work in progress, as you said.

Here's what he said on news://comp.os.msdos.djgpp (May 27):

[M]ost of my scarce resources are taken anyway (adding bidirectional
editing support to Emacs).

He does (apparently?) live in Israel (.il), so presumably he speaks
Hebrew (right to left). Sadly, I don't, so I can't help him. Note that
I have no idea if the DJGPP port of Emacs will ever support it (or
ever be updated again), honestly, but he did just finish / package
23.3 for us recently. Though GNU Emacs in DOS doesn't really display
anything special, it just fakes it via c for c with circumflex.

 Mined has support for poor man's BIDI (Thomas Wolff's, the developer,
 own words).

Yes, Mined has lots of good stuff. It's a true gem for FreeDOS.

 UTF-8 is best suited for languages written with the latin alphabet

 I just don't know if such a bias really is universally accepted or
 not.

 All I said is that UTF-8 is best suited for languages written with the
 latin (also cyrillic, greek, georgian, armenian) alphabet;

Yes, UTF-8 vs. UCS-2 both have advantages and disadvantages.

I wasn't trying to be polemic, sorry, just saying, we don't need 1000
different Unicode variants or we're no better off, right?? I mean, if
*nix had its way, everybody would use UTF-8, but that's not the case
(Windows uses UTF-16).

 Too late. I prepared the vietnamese VISCII and keyboard layout for
 FreeDOS a long time ago, as a matter of fact. :)

Good stuff! Too bad I can't read it.  ;-)   There actually used to be
a fairly sizable community around here (not including me, I'm not a
member, heh), but I don't know where they went. In other words, I
can't grab 'em to test for ya (doh).

BTW, not to get too off-topic, but whenever I want to see what a
language looks like, I check here (1697 languages, wow). Easily one of
the coolest sites on the Internet, even if you're not religious.

http://www.christusrex.org/www1/pater/JPN-vietnam.html

off-topic
The computer programmer's alternative is here:

http://www.99-bottles-of-beer.net/
/off-topic

 For good or bad, it's long been assumed by most developers that
 everybody has VGA or SVGA or newer. (With modern OSes, it's worse:
 gfx acceleration, OpenGL, DirectX 9, etc.)

 Well then, let's go graphics! :)

Well, that's what Blocek and Foxtype have done!   

Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-06 Thread Jeffrey
Hi


I don't know much (anything) about unicode but,

 Right-to-left might be hard to do (I guess?), but technically as long
 as they can see and enter what they want, I'm sure they can get used
 to left-to-right.
 Excuse me? How can anyone type the arabic, syriac or hebrew abjads from
 left to right? *That* would be really exotic, if ever possible! :-)
 How can anybody play guitar upside down or wrong-handed? But people do
 it!!!  ;-)

 kool m'i gnipyt siht sdrawkcab thgir won (ylwols)
 hehehe!!! However, your example exactly matches the hebrew case -
 letters which don't visually connect to the next one. Therefore, it's
 just a matter of reading it in the proper way. When it comes to the arabic
 abjad, the visual result of trying to type it left-to-right is not even
 worth discussing. (I can't resist it: playing the guitar upside down is
 just a matter of training and wrong-handed is just wrong if you don't
 shift the position of the strings and, of course, training - more on
 that, please check with Paul McCartney! :-)))

Would chaining interrupt 0x10 be reasonable? If I am not mistaken the
FreeDOS kernel uses interrupt 0x10 function 0x0E to print characters to
the screen. A TSR could be written to handle function 0x0E and pass the
other functions to the BIOS.





[Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-05 Thread Henrique Peron
Hi all!
Saluton amiko!

Before I forget, I noticed that you do use ISO codepages.
I'll work on distinct packs of codepages and keyboard layouts for ISO
8859-1 ~ 16.

 While Unicode is huge, DOS keyboard layouts tend to be limited to
 Latin and Cyrillic and some other symbols which is a tiny subset.

Nowadays, FreeDOS is able to work with the latin, cyrillic, greek,
armenian and georgian alphabets, the cherokee syllabary and japanese.

 If you do not count CJK and right-to-left languages and REALLY
 exotic languages and symbols (maths, dingbats), Braille etc etc
 then the number of Unicode characters that people are likely to
 type on their keyboard in DOS is quite manageable. Of course it
 is still fine to have a somewhat more complete font in DISPLAY.

 Right-to-left might be hard to do (I guess?), but technically as long
 as they can see and enter what they want, I'm sure they can get used
 to left-to-right. BTW, there was an old Forth for DOS with Korean font (...)

Excuse me? How can anyone type the arabic, syriac or hebrew abjads from
left to right? *That* would be really exotic, if ever possible! :-)
Visually speaking, if a reader doesn't know hebrew (or
yiddish, or ladino, etc.), he might not know if a text is correctly
(right-to-left) or incorrectly (left-to-right) typed because the letters
don't connect to each other. On the other hand, abjads like arabic and
syriac have most of their letters shaped in a way that they connect to
each other - always from right to left.

 that). And then I (erroneously?) thought BMP (basic multilingual
 plane) was the easy, two-byte Western portion, but apparently that's
 not true.

Well - that might be true. Under Unicode, if you use UCS-2 encoding, all
characters in the BMP are represented by 2 bytes. Period. UCS-2 is
proven to be very good for CJK text because even when they need regular
(non-accented) latin letters and digits, they are encoded as fullwidth
(double-byte) on a distinct block in the BMP. All CJK glyphs, if stored
under UTF-8, use 3 bytes.
UCS-2 is also good for all abugidas (devanagari, bengali, etc.) because
those scripts would also need 3 bytes per glyph under UTF-8.
UTF-8 is best suited for languages written with the latin alphabet
because a text encoded like that would oscillate between 1-3 bytes per
char. Yes, 3 bytes, because many punctuation marks, currency signs,
etc., are located above the 07FFh codepoint - where UTF-8 starts needing
3 bytes per glyph. Medieval texts also rely heavily on the Latin
Extended-D block, which is way above the 07FFh boundary.
The downside of UCS-2 is being limited to the BMP while on the other
hand there are (in practice) no limitations for UTF-8.
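
The byte counts above can be checked with a short Python sketch. The sample characters are arbitrary picks, and `utf-16-le` is used as a stand-in for UCS-2, which is valid only because all of them lie inside the BMP.

```python
samples = {
    "A": "plain latin letter",
    "é": "accented latin letter",
    "я": "cyrillic letter",
    "€": "currency sign above 07FFh",
    "中": "CJK ideograph",
}

for char, label in samples.items():
    utf8_len = len(char.encode("utf-8"))
    ucs2_len = len(char.encode("utf-16-le"))  # same bytes as UCS-2 inside the BMP
    print(f"U+{ord(char):04X} {label}: UTF-8 {utf8_len}, UCS-2 {ucs2_len}")
```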
 1). Chinese (hard)
 See above.
 We'd have to ask someone in the know, e.g. Johnson Lam. I think he
 had some primitive workaround for PG.

 4). Arabic (easy??)
 Unicode lists maybe 300 chars for that, at most.

If we restrict ourselves to the arabic language, I can tell you that it
is much less.
If we mean the arabic abjad as a whole - which brings in around 100
languages that use it, like persian, urdu, sindhi and uyghur, or that
used it in the past, either in the middle ages by force of the moor
invasion (like portuguese, spanish, croatian, belarusian) or in Africa
(like hausa) and Asia (like turkish, azeri, etc.) - I can tell you that
we're talking about much more than 300 chars.

 Really? Wikipedia lists 28 char alphabet (single case), IIRC.

Yes - but there's a catch here. Let's think about the glyphs. Letters in
the latin alphabet have two distinct shapes (upper- and lowercase) and,
considering that, the regular latin alphabet comprises 52 chars.
The arabic abjad, by its nature, provides up to 4 distinct shapes per
letter. If we consider the uyghur language, which uses the arabic abjad
as a regular alphabet (i.e. full representation of vowels), there are up
to eight shapes per letter, because uyghur is unique among languages
which use the arabic abjad in that it has digraphs as part of its
alphabet, like hungarian has ZS (ZS, Zs, zs) or czech has CH (CH,
Ch, ch).
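
How one typed letter ends up needing several shapes can be sketched as follows. This is a deliberately simplified Python model: it assumes every letter joins both of its neighbours, which is not true of real Arabic (some letters never join to the following one), and `positional_forms` is a hypothetical name. The three letter names are only sample input.

```python
def positional_forms(letters):
    """Name the shape each letter takes, given the letters typed so far.

    Simplified model (an assumption of this sketch): every letter joins
    its neighbours on both sides.
    """
    if not letters:
        return []
    if len(letters) == 1:
        return ["isolated"]
    forms = ["medial"] * len(letters)
    forms[0] = "initial"
    forms[-1] = "final"
    return forms


typed = []
for key in ["qaf", "meem", "ra"]:
    typed.append(key)
    # After every key press the whole word is reshaped, not just the new letter.
    print(list(zip(typed, positional_forms(typed))))
```

Each key press can change the shape of the previous letter, so a display driver has to redraw what is already on screen, not only append a glyph.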
 5). Hindi
 The writing system is Devanagari, case insensitive,
 has ligatures, not many characters, like Bengali?
 Apparently the Sanskrit alphabet, aka Deva-nagari or just Nagari. Has
 some interesting workarounds (e.g. ISCII, I think).
 Similar to what happens with Cyrillic, there is ISCII
 which puts ASCII and Devanagari together in 256 chars,
 even with Bengali and some other scripts (approx?).
 There you go, you saw Wikipedia too!   ;-)
 6). Bengali
 Apparently has ligatures and is case-insensitive?
 Aka, Bangla (from Bangladesh), uses Eastern Nagari (similar but not
 same). Looks like it could fit in a code page! Interesting workarounds
 include IAST and ITRANS.

ISCII apparently relies on subfonts and probably only worked in graphics
mode. I imagine that because of the complex shapes of letters from 
abugidas like tamil, malayalam or telugu. There's absolutely no way 

Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-05 Thread Rugxulo
Hi,

On 7/5/11, Henrique Peron hpe...@terra.com.br wrote:

 Before I forget, I noticed that you do use ISO codepages.
 I'll work on distinct packs of codepages and keyboard layouts for ISO
 8859-1 ~ 16.

Honestly, I very rarely use only Latin-3 (913), so please don't waste
500 hours on my account!   ;-)   It's very low priority. Minimum
good set would be Latin 1-4 (IMHO) and perhaps Latin-15 (or whatever
is Latin-1 with Euro, I never can remember, Latin-9 or ISO 8859-15 or
???).

 While Unicode is huge, DOS keyboard layouts tend to be limited to
 Latin and Cyrillic and some other symbols which is a tiny subset.

 Nowadays, FreeDOS is able to work with the latin, cyrillic, greek,
 armenian and georgian alphabets, the cherokee syllabary and japanese.

You are a one-man marching band!! You've done such good work here for us!   ;-)

 Right-to-left might be hard to do (I guess?), but technically as long
 as they can see and enter what they want, I'm sure they can get used
 to left-to-right.

 Excuse me? How can anyone type the arabic, syriac or hebrew abjads from
 left to right? *That* would be really exotic, if ever possible! :-)

How can anybody play guitar upside down or wrong-handed? But people do
it!!!  ;-)

kool m'i gnipyt siht sdrawkcab thgir won (ylwols)

BTW, last I heard, Eli Z. was working on bidi editing in GNU Emacs.

 Visually speaking, if an eventual reader doesn't know hebrew (or
 yiddish, or ladino, etc.), he might not know if a text is correctly
 (right-to-left) or incorrectly (left-to-right) typed because the letters
 don't connect to each other. On the other hand, abjads like arabic and
 syriac have most of their letters shaped in a way that they connect to
 each other - always from right to left.

I'm just saying, supporting the actual chars themselves being entered
and displayed is better than nothing, even if it's forced left to
right for simplicity (or technical limitations). Not ideal, but I'm
sure they can get used to it. But I don't honestly know what KEYB does
(or could) support in that area. I'm just trying to be pragmatic /
realistic.

 UTF-8 is best suited for languages written with the latin alphabet

I just don't know if such a bias really is universally accepted or
not. As we've seen, it's not exactly universal which Unicode method
is preferred. I guess it matters less these days with Java being
ubiquitous and RAM being humongous.

 4). Arabic (easy??)
 Unicode lists maybe 300 chars for that, at most.

 If we restrict ourselves to the arabic language, I can tell you that it
 is much less.

We don't need to support everything, just enough for reasonable functionality.

 If we mean the arabic abjad - and then it comes around 100 languages
 ,... I can tell you that we're talking about much more than 300 chars.

Hmmm, annoying but no huge surprise.

 If  we multiply the number of conjuncts by the number of abugidas in the
 indian subcontinent, we easily have thousands of distinct glyphs.

Ugh! Heheheh, nobody said i18n was easy.   ;-)

 My conclusion: either there was a wholly tailored MS/IBM-DOS for India
 on those days or there were particular COM/EXE programs that would put
 any regular DOS on graphics mode so to handle ISCII.

See Hindawi@FreeDOS. (Haven't checked, but it sounds like it uses
Allegro for gfx.)

 Important to mention is that english is generally regarded as
 pure-ASCII but we must consider the fair amount of foreign words (like
 café) and the need of accented/special chars used in middle and old
 english, therefore the english language (as much as german, french or
 any other latin-alphabet-based language) also falls in the same
 situation as portuguese.

Well, except that almost nobody puts accents on English words, even
loan words. At least I never do. naive and cafe have to suffice
for me.  ;-)

BTW, surely I'm not telling you anything you didn't already know, but
... Old English is, erm, kinda like dead and old and 100%
incomprehensible and not used and stuff. (Beowulf?)   :-))   Middle
English is just weird spelling and archaic words (Shakespeare?
Chaucer?), hence we're not exactly using it a lot either. (Anon!
Forewith she shewn the waye!)

I guess it matters more in languages where (lacking) accents changes
the meaning of words (E-o:  si, sxi ... horo, hxoro ... salto, sxalto
... ktp. ktp. ktp.). But English is already weird with homonyms (wind,
bow), ambiguous stuff, or whatever.

 In what comes to storage (and UTF-8), russian needs the regular latin
 digits (1 byte each) and the cyrillic letters (2 bytes each char); if we
 think on cyrillic needs in general, then we also have the ukrainian
 hryvnia currency sign, a 3-byte char (again, Currency Symbols,
 20A0h-20CFh).

I don't know why it isn't acceptable to just spell it out as 30
hryvnia instead of always having specific symbols for everything.

 own scripts are a problem, not to mention those like CJK that have
 thousands of special characters. (e.g. Vietnamese won't fit into a
 single code page, 

Re: [Freedos-user] Unicode (It was 'Problem with USB keyboard in some computers')

2011-07-05 Thread Henrique Peron
Saluton!

Em 05/07/2011 18:25, Rugxulo escreveu:
 Before I forget, I noticed that you do use ISO codepages.
 I'll work on distinct packs of codepages and keyboard layouts for ISO
 8859-1 ~ 16.
 Honestly, I very rarely use only Latin-3 (913), so please don't waste
 500 hours on my account!   ;-)   It's very low priority. Minimum
 good set would be Latin 1-4 (IMHO) and perhaps Latin-15 (or whatever
 is Latin-1 with Euro, I never can remember, Latin-9 or ISO 8859-15 or
 ???).

My friend, it is always a pleasure. I do hope that end-users have as
much fun using my codepages and keyboard layouts as I have while
doing the necessary research and working on them. :)

ISO 8859: good part of the job is already done (the codepages) - for a 
long time already, by the way. All I need now is to work on distinct 
versions of all the keyboard layouts which could work with ISO 
codepages; if it takes 500 hours to get the job done, don't worry. I 
won't bill you. ;-)

Latin-1 with Euro, on ISO, is Latin-9, a.k.a. ISO 8859-15.

 While Unicode is huge, DOS keyboard layouts tend to be limited to
 Latin and Cyrillic and some other symbols which is a tiny subset.
 Nowadays, FreeDOS is able to work with the latin, cyrillic, greek,
 armenian and georgian alphabets, the cherokee syllabary and japanese.

 You are a one-man marching band!! You've done such good work here for us!   ;-)

Thank you for your words (on the good work) but we know that it is not
quite a one-man marching band - without Aitor's KEYB/KC/KLIB/DISPLAY 
and Eric's MODE, I couldn't have done anything. hehehe!! :)

Besides, there is this one case which I didn't participate in: support
for japanese.  This one is not my child. It was teamwork directly
between Aitor and a japanese end-user. Not only do I not have even remote
knowledge of japanese kanji (so as to work on japanese codepages) but
I also don't have the necessary hardware to test it. You can see for
yourself: http://homepage3.nifty.com/sandy55/Video/PS55_DA.html

It turns out that, when/if there's a korean or chinese FreeDOS user, I 
won't be able to help him at all. I'm seriously curious about how 
Johnson Lam deals with that, by the way.

 Right-to-left might be hard to do (I guess?), but technically as long
 as they can see and enter what they want, I'm sure they can get used
 to left-to-right.
 Excuse me? How can anyone type the arabic, syriac or hebrew abjads from
 left to right? *That* would be really exotic, if ever possible! :-)

 How can anybody play guitar upside down or wrong-handed? But people do
 it!!!  ;-)

 kool m'i gnipyt siht sdrawkcab thgir won (ylwols)

hehehe!!! However, your example exactly matches the hebrew case -
letters which don't visually connect to the next one. Therefore, it's
just a matter of reading it in the proper way. When it comes to the arabic
abjad, the visual result of trying to type it left-to-right is not even
worth discussing. (I can't resist it: playing the guitar upside down is
just a matter of training and wrong-handed is just wrong if you don't
shift the position of the strings and, of course, training - more on
that, please check with Paul McCartney! :-)))

 BTW, last I heard, Eli Z. was working on bidi editing in GNU Emacs.

H... I don't know Eli Z. nor GNU Emacs. Just a moment. Let me google
it. (Sandwatch rolling)

Oh, ok! Great! Interesting! However, I didn't find any mention to 
BIDI, arabic, hebrew, right, left, etc. on his webpage. 
Perhaps BIDI is a work in progress, as you said.

Mined has support for poor man's BIDI (Thomas Wolff's, the developer, 
own words).

Arabic letters (for the arabic language) can have up to 4 different
shapes, according to the position in a word (initial, medial, final) or
if it is isolated (as in an acronym). On graphical environments, you
only find the isolated shapes of the letters on the keyboard. However,
as you type them, the operating system dynamically and continuously
replaces the shapes of the letters with the proper ones. Let me take the
arabic word "qamar" (moon), for instance. For reasons not relevant to
the scope of this conversation (and particularly concerning this word),
"a" is not written, therefore we type "qmr".
a) You type "qaf" (the arabic letter equivalent to our "q"). The screen
displays the isolated shape of it.
b) You don't press space; now you type "meem" (the arabic equivalent
to our "m"). Since you hadn't pressed space, the operating system
understands that "qaf" was the first letter of a word. It replaces its
isolated shape with its initial shape. Well, there's another letter to
come: "meem". There's already a letter in an initial position, therefore
letter "meem" can only come in its final shape. End of word.
c) You still don't press space; now you type "ra". Yes, their "r".
Since once again you hadn't pressed space, the operating system
understands that "meem" wasn't the last letter of the word, after all.
It changes its shape from final to medial. Then, it displays "ra" in its
final shape. End of