This character set is also attested in Windows 3.1 Arabic. The following DOS C
code outputs all 256 character tiles into the native text mode character grid:

#include <dos.h>

int main(void){
    union REGS regs;
    int i;
    /* set 80x25 text mode (int 10h, AH=00h, AL=03h) */
    regs.h.ah=0x00; regs.h.al=0x03;
    int86(0x10, &regs, &regs);
    for(i=0; i<256; i++){
        /* move the cursor to row i>>4, column i&0xF (int 10h, AH=02h) */
        regs.h.ah=0x02; regs.h.bh=0x00;
        regs.h.dh=(i>>4)&0xF; regs.h.dl=i&0xF;
        int86(0x10, &regs, &regs);
        /* write tile i once with attribute 0xF0 (int 10h, AH=09h) */
        regs.h.ah=0x09; regs.h.al=i; regs.h.bh=0x00; regs.h.bl=0xF0;
        regs.x.cx=1;
        int86(0x10, &regs, &regs);
    }
    /* wait for a keypress (int 16h, AH=00h) */
    regs.h.ah=0x00;
    int86(0x16, &regs, &regs);
    return 0;
}

In Windows 3.1 Arabic, when
setting Font Page to MS-DOS Arabic Support FP 164, the result is the same
character set as in Windows 95/98/ME Arabic:
https://i.imgur.com/8kYDazX.png (though tile 0x00 seems to prevent the
rest of the line from being rendered, so I had to move the first column off
screen and redraw 0x01 to 0x0F before moving it back on screen). 8×8, 8×12, and 10×20
bitmap fonts have been attested in Windows 3.1 Arabic (all defined in
ARAAPP.FON). The DOS version of the script also works in Windows 95/98/ME
Arabic, having the same results as the Win32 version. So those character tiles
can be output with both BIOS functions (int 10h) and Win32 functions
(WriteConsoleOutputA).

Also, "forms with included tail: 0x92, 0x95, 0x98, 0x8A" in the message below
is a typo; it should read "forms with included tail: 0x92, 0x95, 0x99, 0x9B".

On 9 January 2026 at 17:37, [email protected] via Unicode
<[email protected]> wrote:

The following Win32 C code outputs the 256 characters of the system console
codepage into the character grid, captures those character tiles in UCS-2 if
possible, and then outputs the current console codepage number.

#include <windows.h>
#include <stdio.h>

int main(){
    HANDLE hConsole=GetStdHandle(STD_OUTPUT_HANDLE);
    CHAR_INFO screen[256];
    COORD size={16,16};
    COORD pos={0,0};
    SMALL_RECT rect={0,0,15,15};
    /* one cell per byte value, black on bright white (attribute 0xF0) */
    for(int i=0;i<256;i++){
        screen[i].Attributes=0xF0;
        screen[i].Char.AsciiChar=(CHAR)i;
    }
    /* draw the 16x16 grid at the top left of the console buffer */
    WriteConsoleOutputA(hConsole,screen,size,pos,&rect);
    /* read the same cells back as UCS-2, if the system supports it */
    CHAR_INFO screenu[256];
    if(ReadConsoleOutputW(hConsole,screenu,size,pos,&rect)){
        for(int i=0;i<256;i++) printf("%04X ",screenu[i].Char.UnicodeChar);
    }
    else{
        printf("error %08lX\n",(unsigned long)GetLastError());
    }
    printf("codepage %u",GetConsoleOutputCP());
    return 0;
}
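Both programs lay the 256 byte values out in a 16×16 grid, with the high nibble selecting the row and the low nibble selecting the column. When reading the screenshots or the UCS-2 dump, it can help to convert between a cell position and a byte value. Here is a minimal sketch in portable C; the helper names are hypothetical and are not part of either program.

```c
/* Hypothetical helpers for the 16x16 tile grid drawn by the programs:
   row = high nibble of the byte value, column = low nibble. */

/* Byte value shown at a given grid cell (row and col in 0..15). */
static unsigned char tile_at(int row, int col)
{
    return (unsigned char)(((row & 0xF) << 4) | (col & 0xF));
}

/* Grid row of a byte value. */
static int tile_row(unsigned char b)
{
    return b >> 4;
}

/* Grid column of a byte value. */
static int tile_col(unsigned char b)
{
    return b & 0xF;
}
```

For example, tile 0x95 sits in row 9, column 5, and tile 0xFF is the last cell of the last row.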
In most cases, whenever a legacy Win32 codepage is used, the application can
run on Windows NT to capture the UCS-2 mapping of those character cells to the
BMP (although for CJK codepages a more complex setup would be necessary due to
thousands of fullwidth characters with 2-byte sequences). However, in Arabic
versions of Windows 9x (95/98/ME) the resulting character set has many
presentation forms that are not in Unicode. This is the result when running on
Windows ME: https://i.imgur.com/QFm3SkI.png in the 10×20 font,
https://i.imgur.com/KUbLQ0A.png in the 10×18 font (the same result also
appears in Windows 95/98). 5×12, 7×12, 8×12, 10×18, 10×20, and 12×16 bitmap
fonts have been attested with that character set (VGAOEM.FON, 8514OEM.FON,
DOSAPP.FON). The 10×20 font has a slightly different mapping from the other
sizes: 0x93 is ö instead of ô, and 0x97 is missing (causing the following
characters on the same line to be drawn at the wrong position). The console also
claims to be using codepage 720, but many characters differ from their CP720 mappings,
including the bundled CP_720.NLS mappings (for example, ـ (U+0640 ARABIC
TATWEEL) is 0x95 in CP720, but in the console 0x95 is ش instead, and the
tatweel is at 0xFF). On Windows 9x, ReadConsoleOutputW is not supported, so the
UCS-2 mappings of the console character tiles cannot be captured (the call fails
with error 0x00000078 ERROR_CALL_NOT_IMPLEMENTED).

When that program runs on Arabic
versions of Windows NT, the visual output is of the CP437 character set if one
of the bundled bitmap fonts is used
(https://i.imgur.com/RxjtxMH.png), or the CP720 set if Lucida Console is used,
with the Arabic letters either having glitchy font substitution (NT 4.0, NT
5.0/2000) or the .notdef glyph (NT 5.1/XP and up). In fact, it seems that the
only Arabic bitmap fonts that occur in Windows NT are CP1256 fonts, which are
not used in terminals. So this appears to be one of those permanent Windows
compatibility regressions that occurred when Windows 9x ended, where the
terminals can no longer render legacy Arabic text. Even if the user managed to
use registry hacks to set the font to Courier New or Simplified Arabic Fixed,
it would still use the CP720 mapping which is not compatible with the Windows
9x set. It appears that in the Windows 9x Arabic terminal character set, 244
characters ( ﺀﺁﺂﺃﺄﺅﺇﺈﺊﺋﺍﺎﺏﺑﺓ►◄↕ﺕ¶§ﺗﺙ↑↓→←ﺛﹰ▲▼
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~ﺝﺟﺡéâﺣàﺥçêëèïîﺧﺩﺫﺭﺯôﺳûùﺷﺻ£ﺿﻁﻅﻉﻊﻋﻌ
are already in Unicode, but 12 characters are not in Unicode:
• 6 of them are pieces of lam-alef ligatures (0xDD, 0xDE, 0xF9, 0xFB, 0xFC, 0xFD)
• 2 of them are shadda with fathatan ligatures, without or with tatweel (0xD0, 0xD1)
  - in some legacy Microsoft fonts, shadda with fathatan is mapped to private
    use U+E818
• 4 of them are disunifications of seen/sheen/sad/dad occurring either with or
  without tail
  - ﹳ (U+FE73 ARABIC TAIL FRAGMENT) was originally encoded in Unicode 3.2 for
    CP864 compatibility; in that codepage, the forms of seen/sheen/sad/dad
    attach to the tail fragment
  - forms with included tail: 0x92, 0x95, 0x98, 0x8A
  - forms without tail (attaching to tail fragment like in CP864): 0xF3, 0xF4,
    0xF5, 0xF6

If someone tried to make a Win32 console implementation and tried to implement
both Windows 9x Arabic terminal character set compatibility and wide string API
(ReadConsoleOutputW) compatibility simultaneously, then they would run into the
issue that there is currently no standardized mapping to handle that scenario.
What should Windows 9x Arabic console compatible implementations do in that case?
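
To make the gap concrete, here is a minimal sketch in portable C that classifies each byte of the Windows 9x Arabic console set according to the lists above. The enum and function names are hypothetical. The byte values come from the message, using the corrected tail values (0x92, 0x95, 0x99, 0x9B) from the follow-up. It also assumes that the four unencoded seen/sheen/sad/dad tiles are the forms with included tail, which is how the character list reads but is not stated outright. The sketch deliberately assigns no code points, since no standardized mapping exists.

```c
/* Hypothetical classification of the Windows 9x Arabic console tiles.
   This is not a standardized mapping; it only records which byte values
   the message reports as having no Unicode code point. */
enum tile_class {
    TILE_ENCODED,          /* already has a Unicode code point */
    TILE_LAM_ALEF_PIECE,   /* piece of a lam-alef ligature */
    TILE_SHADDA_FATHATAN,  /* shadda with fathatan, without or with tatweel */
    TILE_TAILED_FORM       /* seen/sheen/sad/dad with included tail
                              (assumption: these are the four unencoded ones) */
};

static enum tile_class classify_tile(unsigned char b)
{
    switch (b) {
    case 0xDD: case 0xDE: case 0xF9: case 0xFB: case 0xFC: case 0xFD:
        return TILE_LAM_ALEF_PIECE;
    case 0xD0: case 0xD1:
        return TILE_SHADDA_FATHATAN;
    case 0x92: case 0x95: case 0x99: case 0x9B:
        return TILE_TAILED_FORM;
    default:
        return TILE_ENCODED;
    }
}

/* Number of byte values for which a wide-character read would have
   nothing standardized to return. */
static int count_unencoded(void)
{
    int i, n = 0;
    for (i = 0; i < 256; i++) {
        if (classify_tile((unsigned char)i) != TILE_ENCODED)
            n++;
    }
    return n;
}
```

An implementation of ReadConsoleOutputW on top of this set would have to decide what to return for every byte where classify_tile does not yield TILE_ENCODED; count_unencoded gives the 12 cases the message enumerates.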