[Freedos-user] DOS and Right-to-left support
Hi all,

Would anyone out there happen to know how Arabic DOS, in the old days, dealt with:

1) The control characters needed to handle the script - ZWJ (zero-width joiner), ZWNJ (zero-width non-joiner), RLM (right-to-left mark), LRM (left-to-right mark) - and the control characters needed to handle bilingual text (LTR and RTL) in the same sentence: RLE/LRE (right-to-left and left-to-right embedding), RLO/LRO (right-to-left and left-to-right override) and PDF (pop directional formatting).

2) Codepage 720 and many others, which only provide the isolated shapes of the characters. DOS, seemingly, had to rely on subfonts or some other feature that would make it swap the characters' isolated shapes for their initial, medial or final shapes on the fly as the text was typed.

3) Combining chars. All Arabic codepages, including cp864, include at least two codepoints that represent them.

Hebrew DOS is a simpler case, yet topic #3 also applies to that script and, with the exception of the control characters ZWJ and ZWNJ, so does topic #1.

Thanks in advance,
Henrique

--
AppSumo Presents a FREE Video for the SourceForge Community by Eric Ries, the creator of the Lean Startup Methodology on Lean Startup Secrets Revealed. This video shows you how to validate your ideas, optimize your ideas and identify your business strategy.
http://p.sf.net/sfu/appsumosfdev2dev
___
Freedos-user mailing list
Freedos-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-user
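For reference, the controls listed in point 1 can be looked up in the Unicode character database. The short Python sketch below only reports their Unicode code points and names; it assumes nothing about where (or whether) any DOS codepage placed them, and the `CONTROLS` table and `describe` helper are names made up for this illustration.

```python
import unicodedata

# Unicode code points of the controls named in the question. These come from
# the Unicode standard, not from any DOS codepage layout.
CONTROLS = {
    "ZWNJ": "\u200C",
    "ZWJ": "\u200D",
    "LRM": "\u200E",
    "RLM": "\u200F",
    "LRE": "\u202A",
    "RLE": "\u202B",
    "PDF": "\u202C",
    "LRO": "\u202D",
    "RLO": "\u202E",
}


def describe(abbrev):
    """Return (code point, Unicode name, general category) for one control."""
    ch = CONTROLS[abbrev]
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


if __name__ == "__main__":
    for abbrev in CONTROLS:
        print(abbrev, *describe(abbrev))
```

All nine fall in general category `Cf` (format), which matches the point made in the question: they carry no glyph of their own and only influence how neighbouring characters are laid out.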
Re: [Freedos-user] DOS and Right-to-left support
Hi Henrique,

> would anyone out there happen to know how Arabic DOS, in the old days, dealt with:
> 1) The control characters needed to handle the script - ZWJ (zero-width joiner), ZWNJ (zero-width non-joiner), RLM (right-to-left mark), LRM (left-to-right mark) - and the control characters needed to handle bilingual text (LTR and RTL) in the same sentence: RLE/LRE (right-to-left and left-to-right embedding), RLO/LRO (right-to-left and left-to-right override) and PDF (pop directional formatting).

All of those sound like control characters which would have to be understood by DISPLAY or similar and which would need space in the codepage, possibly in the lesser used control char area (somewhere in ASCII 0 to 31). They would not need any specific shape, so in a character table ASCII 0 to 31 would still look like ASCII, but you would no longer be able to print those glyphs, at least not without using escape sequences...?

> 2) Codepage 720 and many others, which only provide the isolated shapes of the characters. DOS, seemingly, had to rely on subfonts or some other feature that would make it swap the characters' isolated shapes for their initial, medial or final shapes on the fly as the text was typed.

Maybe it just looked ugly and used non-contextual shapes? ;-)

> 3) Combining chars. All Arabic codepages, including cp864, include at least two codepoints that represent them.

You mean Unicode would represent them either as pre-combined or as some character plus a separate accent character? That is not something DOS is likely to have cared about. It probably only used pre-composed characters and had the characters without accent as separate entities, just like Latin vowels and Latin accented vowels (umlauts etc.) have separate full-shape font items in CP850 and similar. Note that CP850 does not even have a double dot above for composition, it only has that as part of the pre-composed umlaut character shapes...

> Hebrew DOS is a simpler case, yet topic #3 also applies to that script and, with the exception of the control characters ZWJ and ZWNJ, so does topic #1.

So it is interesting to hear how Hebrew codepages tick :-)

Eric
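The precomposed-versus-combining distinction discussed above can be seen directly in Unicode normalization. This is a minimal Python sketch using the standard `unicodedata` module; it says nothing about how any DOS codepage stored these characters, and the helper names `decompose` and `compose` are chosen only for this example.

```python
import unicodedata


def decompose(text):
    """List the code points of text after canonical decomposition (NFD)."""
    return [f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", text)]


def compose(text):
    """Recombine base letters and combining marks where Unicode allows (NFC)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    # Latin u with diaeresis exists precomposed and splits into u + U+0308.
    print(decompose("\u00FC"))
    # The reverse direction gives back the single precomposed code point.
    print(compose("u\u0308") == "\u00FC")
    # Arabic beh + fatha has no precomposed form, so NFC leaves it as two
    # code points: a base letter followed by a combining mark.
    beh_fatha = "\u0628\u064E"
    print(compose(beh_fatha) == beh_fatha, len(compose(beh_fatha)))
```

An 8-bit codepage with only precomposed letters corresponds to the single-code-point form; a codepage that carries the marks as separate codepoints leaves it to the application to place them over the preceding letter.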
Re: [Freedos-user] DOS and Right-to-left support
Hi Eric,

>> would anyone out there happen to know how Arabic DOS, in the old days, dealt with:
>> 1) The control characters needed to handle the script - ZWJ (zero-width joiner), ZWNJ (zero-width non-joiner), RLM (right-to-left mark), LRM (left-to-right mark) - and the control characters needed to handle bilingual text (LTR and RTL) in the same sentence: RLE/LRE (right-to-left and left-to-right embedding), RLO/LRO (right-to-left and left-to-right override) and PDF (pop directional formatting).
>
> All of those sound like control characters which would have to be understood by DISPLAY or similar and which would need space in the codepage, possibly in the lesser used control char area (somewhere in ASCII 0 to 31). (...)

They /are/ part of codepages, as a matter of fact. I've found ZWJ and ZWNJ in ISO-8859-6, and all the other control characters mentioned in (1) in the range A0h-A6h of both Arabic codepage 862 and Hebrew codepage 856. They have no visual representation, unlike the control characters found in the range 00h-1Fh and at 7Fh. Therefore, there is nothing to be done by DISPLAY or MODE. There must have been proper Arabic/Hebrew text editors (and other applications) out there which knew how to take advantage of those control characters.

>> 2) Codepage 720 and many others, which only provide the isolated shapes of the characters. DOS, seemingly, had to rely on subfonts or some other feature that would make it swap the characters' isolated shapes for their initial, medial or final shapes on the fly as the text was typed.
>
> Maybe it just looked ugly and used non-contextual shapes? ;-)

H... Very unlikely to have happened this way. If you ever saw text written in the Arabic script in the correct direction (right-to-left) but with the letters only in their isolated shapes, you would agree that it is chaotic to the point of being unusable. There must have been some trick somewhere.

>> 3) Combining chars. All Arabic codepages, including cp864, include at least two codepoints that represent them.
>
> You mean Unicode would represent them either as pre-combined or as some character plus a separate accent character? That is not something DOS is likely to have cared about. It probably only used pre-composed characters and had the characters without accent as separate entities, just like Latin vowels and Latin accented vowels (umlauts etc.) have separate full-shape font items in CP850 and similar. Note that CP850 does not even have a double dot above for composition, it only has that as part of the pre-composed umlaut character shapes...

I think it is very unlikely that the Unicode Consortium will ever provide precomposed accented Arabic (or Hebrew, or Syriac, or Divehi) letters... Arabic (and Hebrew, and Syriac) letters are not accented. The combining chars used in these scripts play a whole different role, and they are even omitted in most scenarios (but mandatory in others). There is also the case of the Divehi script, which is also written right-to-left, even looks like the Arabic script to the untrained eye, and makes much heavier use of combining chars because they are mandatory for every single letter in every word.

The Vietnamese case is an interesting parallel. Before Unicode was available, Vietnamese computers dealt with codepages which provided all their accented letters in precomposed form, since they seemingly didn't handle combining chars on DOS either. Now we find all those precomposed accented Latin Vietnamese letters in Unicode - though for compatibility with legacy applications only, because nowadays the Vietnamese only type their text by making (heavy) use of the 5 combining chars that they need: acute, grave, tilde, dot below and horn. Perhaps, had it ever been possible to encode all precomposed accented Arabic letters in 8-bit codepages, we would have them in Unicode today, but only for that same reason: backward compatibility.

By the way, when it comes to cp850, it does have a stand-alone cedilla, acute accent, diaeresis and macron, probably meant to be used only as combining chars for printing, since this is how we used them in the old days when we wanted to print Portuguese text on printers which did not provide hardcoded codepages.

>> Hebrew DOS is a simpler case, yet topic #3 also applies to that script and, with the exception of the control characters ZWJ and ZWNJ, so does topic #1.
>
> So it is interesting to hear how Hebrew codepages tick :-)

Well... Almost. It ticks as much as Arabic codepages do, provided that users don't need combining chars. :-)

Henrique
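The "trick" for point 2 is not identified in this thread, but the kind of substitution it would have to perform can be sketched. The Python below picks the isolated, initial, medial or final glyph of each letter from its neighbours, using Unicode's Arabic Presentation Forms-B as the glyph table. It is an illustration under that assumption, not a reconstruction of what any Arabic DOS did; `build_forms` and `shape` are hypothetical helper names, and the sketch ignores ligatures and combining marks between letters.

```python
import unicodedata


def build_forms():
    """Map each base letter to its positional glyphs, read from the
    compatibility decompositions of Arabic Presentation Forms-B."""
    forms = {}
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Keep entries such as '<initial> 0628'; skip unassigned code points
        # and multi-letter entries (ligatures, spacing vowel marks).
        if len(parts) != 2:
            continue
        tag = parts[0].strip("<>")
        base = chr(int(parts[1], 16))
        forms.setdefault(base, {})[tag] = chr(cp)
    return forms


FORMS = build_forms()


def shape(text):
    """Replace each letter (given in logical order) by its contextual glyph."""
    out = []
    for i, ch in enumerate(text):
        glyphs = FORMS.get(ch, {})
        prev = FORMS.get(text[i - 1], {}) if i > 0 else {}
        nxt = FORMS.get(text[i + 1], {}) if i + 1 < len(text) else {}
        # A letter with an initial form can connect to the letter after it;
        # a letter with a final form can connect to the letter before it.
        joins_prev = "initial" in prev and "final" in glyphs
        joins_next = "initial" in glyphs and "final" in nxt
        if joins_prev and joins_next:
            out.append(glyphs.get("medial", ch))
        elif joins_prev:
            out.append(glyphs["final"])
        elif joins_next:
            out.append(glyphs["initial"])
        else:
            out.append(glyphs.get("isolated", ch))
    return "".join(out)


if __name__ == "__main__":
    word = "\u0628\u0627\u0628"  # beh, alef, beh
    print([f"U+{ord(c):04X}" for c in shape(word)])
```

The point of the sketch is that the stored text can stay in isolated-shape codepoints while a display layer or editor chooses the glyph, which is one way a codepage with only isolated shapes could still have produced joined text.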
[Freedos-user] Kernel 2040 16-bit
Hi,

Here's some additional information about the file system errors I've been getting.

Today I tried to use the 32-bit kernel (2040), and both ChkDsk and Defrag ran normally. ChkDsk did not detect any errors, and Defrag did its job as usual.

Immediately after that, I reverted to the 16-bit kernel (2040) and tried to run ChkDsk again, just to test it. ChkDsk simply did not run, just like yesterday, and displayed the same kind of error messages, for instance:

\FDOS\INI is a directory without '..'
\RECURSOS doesn't contain an '.' as first entry
Error accessing the volume

I tried copying one of these directories from disk C to D, and the errors were copied as well. That is bad news, because I do not know how to fix the disk. Fortunately, the computer is still working normally, but clearly there is something amiss.

There is also another symptom that I find strange, and I will report it for whatever it is worth. For several years, I have repeatedly observed that for some periods of time, typically a few weeks, all the Euphoria programs that I use daily would become much slower to load. Then, for no apparent reason, they would go back to their usual fast loading and remain that way for some more weeks. This happened with more than one hard disk and more than one motherboard.

Earlier this year, in one of those periods when Euphoria was slow, I noticed it became fast again after I ran ChkDsk and Defrag. I thought there might be a relation, and started using ChkDsk and Defrag on a regular basis. I never had another period of slowness with Euphoria until yesterday: it is slow again, precisely when the file system errors appeared. It would seem that the two things are related, although I cannot even imagine how.

The loading delay is very noticeable. What usually took half a second now takes 3-4 seconds. I run these programs many times every day.

I'm now using the 32-bit 2040 kernel, because none of these errors has ever appeared under the previous 32-bit kernels, and I'm hoping they will not appear under the new one either [fingers crossed].

Any advice is welcome.

Marcos
--
Marcos Favero Florence de Barros
Campinas, Brazil