[Freedos-user] DOS and Right-to-left support

2011-07-13 Thread Henrique Peron
Hi all,

would anyone out there happen to know how Arabic DOS, in the old 
days, dealt with:

1) The control characters needed to handle the script - ZWJ (zero-width 
joiner), ZWNJ (zero-width non-joiner), RLM (right-to-left mark), LRM 
(left-to-right mark) - and the control characters needed to handle 
bilingual text (LTR and RTL) in the same sentence: RLE/LRE 
(right-to-left and left-to-right embedding), RLO/LRO (right-to-left and 
left-to-right override) and PDF (pop directional formatting).

2) Codepage 720 and many others, which only present the isolated shapes 
of the characters. DOS seemingly had to rely on subfonts or some other 
feature that swapped the characters' isolated shapes for their initial, 
medial or final shapes on the fly as the text was typed.

3) Combining chars. All Arabic codepages, including cp864, include at 
least two codepoints for them.

Hebrew DOS is a simpler case, yet topic #3 also applies to that script 
and, with the exception of the control characters ZWJ and ZWNJ, so does 
topic #1.
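
As a point of reference, the control characters from topic #1 can be 
listed with their Unicode code points and bidirectional classes. This is 
only a sketch using Python's standard unicodedata module; it says nothing 
about how any DOS codepage encoded these characters, and the names 
BIDI_CONTROLS and describe are illustrative choices, not part of any 
existing tool.

```python
import unicodedata

# Abbreviations from topic #1 mapped to their Unicode code points.
BIDI_CONTROLS = {
    "ZWNJ": "\u200c",
    "ZWJ": "\u200d",
    "LRM": "\u200e",
    "RLM": "\u200f",
    "LRE": "\u202a",
    "RLE": "\u202b",
    "PDF": "\u202c",
    "LRO": "\u202d",
    "RLO": "\u202e",
}


def describe(abbrev):
    """Return (code point, Unicode name, bidi class) for one abbreviation."""
    ch = BIDI_CONTROLS[abbrev]
    return (f"U+{ord(ch):04X}", unicodedata.name(ch),
            unicodedata.bidirectional(ch))


# Print one line per control character.
for abbrev in BIDI_CONTROLS:
    print(abbrev, *describe(abbrev))
```

All nine are format characters (general category Cf) with no glyph of 
their own, so a renderer has to interpret them rather than draw them.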

Thanks in advance,
Henrique


--
AppSumo Presents a FREE Video for the SourceForge Community by Eric 
Ries, the creator of the Lean Startup Methodology on Lean Startup 
Secrets Revealed. This video shows you how to validate your ideas, 
optimize your ideas and identify your business strategy.
http://p.sf.net/sfu/appsumosfdev2dev
___
Freedos-user mailing list
Freedos-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-user


Re: [Freedos-user] DOS and Right-to-left support

2011-07-13 Thread Eric Auer

Hi Henrique,

> would anyone out there happen to know how Arabic DOS, in the old
> days, dealt with:
>
> 1) The control characters needed to handle the script - ZWJ (zero-width
> joiner), ZWNJ (zero-width non-joiner), RLM (right-to-left mark), LRM
> (left-to-right mark) - and the control characters needed to handle
> bilingual text (LTR and RTL) in the same sentence: RLE/LRE
> (right-to-left and left-to-right embedding), RLO/LRO (right-to-left and
> left-to-right override) and PDF (pop directional formatting).

All of those sound like control characters which would have
to be understood by DISPLAY or similar and which will need
space in the codepage, possibly in lesser used control char
areas (ASCII 0 to 31 somewhere). They would not need any
specific shape, so in a character table ASCII 0 to 31 would
still look as usual, but you would no longer be able to print
those glyphs, at least not without using escape sequences...?

> 2) Codepage 720 and many others, which only present the isolated shapes
> of the characters. DOS seemingly had to rely on subfonts or some other
> feature that swapped the characters' isolated shapes for their initial,
> medial or final shapes on the fly as the text was typed.

Maybe it just looked ugly and used non-contextual shapes? ;-)

> 3) Combining chars. All Arabic codepages, including cp864, include at
> least two codepoints for them.

You mean Unicode would represent them either as pre-combined
or as some character plus a separate accent character? That is
not something DOS is likely to have cared about. Probably
it only used pre-composed characters and had the characters
without accents as separate entities, just as Latin vowels
and Latin accented vowels (umlauts etc.) have separate full
shape font items in CP850 and similar. Note that CP850 does
not even have a double dot above for composition; it only
has that as part of the pre-composed umlaut character shapes...

> Hebrew DOS is a simpler case, yet topic #3 also applies to that script
> and, with the exception of the control characters ZWJ and ZWNJ, so does
> topic #1.

So it is interesting to hear how Hebrew codepages tick :-)

Eric




Re: [Freedos-user] DOS and Right-to-left support

2011-07-13 Thread Henrique Peron
Hi Eric,

>> would anyone out there happen to know how Arabic DOS, in the old
>> days, dealt with:
>>
>> 1) The control characters needed to handle the script - ZWJ (zero-width
>> joiner), ZWNJ (zero-width non-joiner), RLM (right-to-left mark), LRM
>> (left-to-right mark) - and the control characters needed to handle
>> bilingual text (LTR and RTL) in the same sentence: RLE/LRE
>> (right-to-left and left-to-right embedding), RLO/LRO (right-to-left and
>> left-to-right override) and PDF (pop directional formatting).
>
> All of those sound like control characters which would have
> to be understood by DISPLAY or similar and which will need
> space in the codepage, possibly in lesser used control char
> areas (ASCII 0 to 31 somewhere). (...)

They /are/ part of codepages, as a matter of fact. I've found ZWJ and 
ZWNJ in ISO-8859-6 and all the other control characters mentioned in (1) 
in the range A0h-A6h of both arabic codepage 862 and hebrew codepage 856. 
They have no visual representation, unlike the control characters in the 
range 00h-1Fh and at 7Fh. Therefore, there's nothing to be done by 
DISPLAY or MODE. There must have been proper Arabic/Hebrew text editors 
(and other applications) out there which knew how to take advantage of 
those control characters.
>> 2) Codepage 720 and many others, which only present the isolated shapes
>> of the characters. DOS seemingly had to rely on subfonts or some other
>> feature that swapped the characters' isolated shapes for their initial,
>> medial or final shapes on the fly as the text was typed.
>
> Maybe it just looked ugly and used non-contextual shapes? ;-)

Hmm... Very unlikely to have happened this way. If you ever saw a text 
written in the Arabic script, even in the correct direction 
(right-to-left) but with the letters only in their isolated shapes, you 
would agree that it is chaotic to the point of being unusable. 
There must have been some trick somewhere.
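
To illustrate what such a trick has to do, here is a minimal sketch of 
contextual shaping as Unicode-era software can do it. It is not a claim 
about how any Arabic DOS worked. It assumes the text is stored in logical 
order and uses the Arabic Presentation Forms-B block as the glyph table; 
build_forms and shape are hypothetical names. It ignores ligatures such 
as lam-alef, and any character without positional forms (including a 
vowel mark) simply interrupts joining.

```python
import unicodedata

POSITIONS = ("<isolated>", "<final>", "<initial>", "<medial>")


def build_forms():
    """Map each base letter to its positional glyphs, read from the
    Arabic Presentation Forms-B block (U+FE70..U+FEFF)."""
    forms = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep only single-letter entries such as "<initial> 0628";
        # ligatures and vowel-mark forms decompose to two code points.
        if len(parts) == 2 and parts[0] in POSITIONS:
            base = chr(int(parts[1], 16))
            forms.setdefault(base, {})[parts[0].strip("<>")] = chr(cp)
    return forms


FORMS = build_forms()


def shape(text):
    """Replace each letter (in logical order) with its contextual form."""
    out = []
    for i, ch in enumerate(text):
        cur = FORMS.get(ch, {})
        prev = FORMS.get(text[i - 1], {}) if i > 0 else {}
        nxt = FORMS.get(text[i + 1], {}) if i + 1 < len(text) else {}
        # A letter joins forward only if it has an initial form, and it
        # accepts a join from the previous letter only if it has a final form.
        joins_prev = "initial" in prev and "final" in cur
        joins_next = "initial" in cur and "final" in nxt
        if joins_prev and joins_next:
            form = "medial"
        elif joins_prev:
            form = "final"
        elif joins_next:
            form = "initial"
        else:
            form = "isolated"
        out.append(cur.get(form, ch))
    return "".join(out)


# KAF, TEH, ALEF, BEH: alef never joins forward, so the last beh stays isolated.
word = "\u0643\u062a\u0627\u0628"
print([f"U+{ord(c):04X}" for c in shape(word)])
```

The point of the sketch is that the stored text only needs the base 
letters; the choice among the four shapes can be made at display time 
from each letter's neighbours.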
>> 3) Combining chars. All Arabic codepages, including cp864, include at
>> least two codepoints for them.
>
> You mean Unicode would represent them either as pre-combined
> or as some character plus a separate accent character? Not
> something that DOS is likely to have cared about, probably
> it only used pre-composed characters and had the characters
> without accent as separate entities, just like Latin vowels
> and Latin accented vowels (umlauts etc) having separate full
> shape font items in CP850 and similar. Note that CP850 does
> not even have double dot above for composition, it only
> has that as part of pre-composed umlaut character shapes...

I think it is very unlikely that the Unicode Consortium will ever 
provide precomposed accented Arabic (or Hebrew, or Syriac, or Divehi) 
letters... Arabic (and Hebrew, and Syriac) letters are not accented. The 
combining chars used in these scripts play a whole different role, and 
they are even omitted in most scenarios (but mandatory in others). There 
is also the case of the Divehi script, which is also written 
right-to-left, even looks like the Arabic script to the untrained eye 
and makes much heavier use of combining chars because they are mandatory 
for every single letter in every word.

The Vietnamese case is an interesting parallel. Before Unicode was 
available, Vietnamese computers used codepages which provided all 
their accented letters in precomposed form, since they seemingly 
didn't handle combining chars on DOS either. Now we find all those 
precomposed accented Latin Vietnamese letters in Unicode - though for 
compatibility with legacy applications only, because nowadays the 
Vietnamese only type their text by making (heavy) use of the 5 combining 
chars that they need: acute, grave, tilde, dot below and hook above. 
Perhaps if it had ever been possible to encode all precomposed Arabic 
accented letters in 8-bit codepages, we would have them in Unicode today, 
but for the same single reason - backward compatibility.
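
The relation between a precomposed Vietnamese letter and its 
combining-char spelling can be seen with Unicode normalization. This is 
a small sketch using Python's standard unicodedata module; the letter 
chosen is just an example.

```python
import unicodedata

# LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW, a precomposed letter.
precomposed = "\u1ec7"

# NFD splits it into a base letter followed by combining marks.
decomposed = unicodedata.normalize("NFD", precomposed)
for ch in decomposed:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# NFC puts the pieces back together into the single precomposed code point.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)
```

Both spellings are canonically equivalent, which is why the precomposed 
letters can remain in Unicode for compatibility without being required.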

By the way, as for cp850, it does have a stand-alone cedilla, acute 
accent, diaeresis and macron, probably to be used only as combining 
printing chars, since this is how we used them in the old days when we 
wanted to print Portuguese text on printers which did not provide 
hardcoded codepages.
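
The stand-alone marks can be checked against Python's cp850 codec, which 
raises UnicodeEncodeError for any character the codepage lacks. The 
overstrike helper below is only an assumption about what "combining 
printing chars" meant in practice (letter, backspace, accent sent to the 
printer); cp850_byte and overstrike are hypothetical names.

```python
# Spacing (stand-alone) marks mentioned above, as Unicode characters.
SPACING_MARKS = {
    "cedilla": "\u00b8",
    "acute accent": "\u00b4",
    "diaeresis": "\u00a8",
    "macron": "\u00af",
}


def cp850_byte(ch):
    """Return the single cp850 byte for ch (raises if cp850 lacks it)."""
    encoded = ch.encode("cp850")
    return encoded[0]


def overstrike(letter, mark):
    """Bytes for: print the letter, step back one column, print the mark.
    Assumes a printer that honours BACKSPACE (08h) by overprinting."""
    return (letter + "\b" + mark).encode("cp850")


for name, ch in SPACING_MARKS.items():
    print(f"{name}: {cp850_byte(ch):02X}h")

print(overstrike("a", SPACING_MARKS["acute accent"]).hex(" "))
```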
>> Hebrew DOS is a simpler case, yet topic #3 also applies to that script
>> and, with the exception of the control characters ZWJ and ZWNJ, so does
>> topic #1.
>
> So it is interesting to hear how Hebrew codepages tick :-)

Well... Almost. They tick as much as the Arabic codepages do, provided 
that users don't need combining chars. :-)

Henrique


[Freedos-user] Kernel 2040 16-bit

2011-07-13 Thread Marcos Favero Florence de Barros
Hi,

Here's some additional information about the file system
errors I've been getting.

Today I tried to use the 32-bit kernel (2040), and both
ChkDsk and Defrag ran normally. ChkDsk did not detect any
errors, and Defrag did its job as usual.

Immediately after that, I reverted to the 16-bit kernel
(2040) and tried to run ChkDsk again, just to test it. ChkDsk
simply did not run, just like yesterday, and displayed the same
kind of error messages, for instance:

\FDOS\INI is a directory without '..'
\RECURSOS doesn't contain an '.' as first entry
Error accessing the volume
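
For context on the first two messages: on a FAT volume, every 
subdirectory normally starts with two 32-byte entries named '.' and 
'..'. The sketch below checks for them in the raw bytes of a directory's 
first cluster. It is only an illustration of the on-disk layout, not how 
ChkDsk itself is implemented; check_dot_entries and make_entry are 
hypothetical names, and the sample data is synthetic.

```python
ENTRY_SIZE = 32        # size of one FAT directory entry in bytes
ATTR_DIRECTORY = 0x10  # attribute bit marking an entry as a directory


def make_entry(name11, attr=ATTR_DIRECTORY):
    """Build a minimal 32-byte entry: 11-byte padded name, attribute, zeros."""
    assert len(name11) == 11
    return name11 + bytes([attr]) + bytes(ENTRY_SIZE - 12)


def check_dot_entries(dir_data):
    """Return a list of problems found in the first two directory entries."""
    problems = []
    expected = [(b"." + b" " * 10, "'.'"), (b".." + b" " * 9, "'..'")]
    for index, (name, label) in enumerate(expected):
        entry = dir_data[index * ENTRY_SIZE:(index + 1) * ENTRY_SIZE]
        if len(entry) < ENTRY_SIZE:
            problems.append(f"entry {index} is truncated")
        elif entry[:11] != name:
            problems.append(f"entry {index} is not {label}")
        elif not entry[11] & ATTR_DIRECTORY:
            problems.append(f"{label} is not marked as a directory")
    return problems


# A healthy subdirectory, and one whose first two entries were overwritten.
good = make_entry(b"." + b" " * 10) + make_entry(b".." + b" " * 9)
bad = make_entry(b"FILE    TXT", attr=0x20) * 2

print(check_dot_entries(good))
print(check_dot_entries(bad))
```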

I tried copying one of these directories from disk C to D,
and the errors are also copied. That is bad news, because I do
not know how to fix the disk. Fortunately, the computer is still
working normally, but clearly there is something amiss.

There is also another symptom that I find strange; I will
report it for whatever it is worth.

For several years, I have repeatedly observed that for some
periods of time, typically a few weeks, all the Euphoria
programs that I use daily would become much slower to load.
Then, for no apparent reason, they would go back to their
usual fast loading, and remain that way for some more weeks.

This happened with more than one hard disk, and more than
one motherboard.

Earlier this year, in one of those periods when Euphoria was
slow, I noticed it became fast again after I ran ChkDsk and
Defrag. I thought there might be a relation, and started using
ChkDsk and Defrag on a regular basis.

I never had another period of slowness with Euphoria until
yesterday, when it became slow again, precisely when the file
system errors appeared. It would seem that the two things are
related, although I cannot even imagine how.

The loading delay is very noticeable. What usually took half
a second now takes 3-4 sec. I run these programs many times
every day.

I'm now using the 32-bit 2040 kernel, because none of these
errors has ever appeared under the previous 32-bit kernels, and
I'm hoping they will not appear under the new one either
[fingers crossed].

Any advice is welcome.

Marcos


--
Marcos Favero Florence de Barros
Campinas, Brazil

