On 26 April 2025 at 09:37, Eli Zaretskii <e...@gnu.org> wrote:
Date: Sat, 26 Apr 2025 09:15:02 +0200
Cc: unicode@corp.unicode.org <unicode@corp.unicode.org>, b...@bapha.be <b...@bapha.be>
From: "piotrunio-2...@wp.pl via Unicode" <unicode@corp.unicode.org>

On 26 April 2025 at 09:03, Eli Zaretskii <e...@gnu.org> wrote:
Date: Fri, 25 Apr 2025 23:21:05 +0200
From: "piotrunio-2...@wp.pl via Unicode" <unicode@corp.unicode.org>

In non-Unix-like terminals, the width is always directly proportional to the number of bytes that the text takes in memory, because that is how a random-access array works. Each character cell takes a constant number of bytes: for example, in VGA-compatible text mode there are 16 bits per character cell (8 bits for attributes and 8 bits for character code), and in the Win32 console there are 32 bits per character cell (16 bits for attributes and 16 bits for character code). Whether a character is fullwidth may be determined by the text encoding (some legacy encodings such as Shift JIS will store fullwidth characters in the bytes of two consecutive character cells) or by attributes.
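
For concreteness, the fixed-size cell records look roughly like this in C; the VGA struct is an illustrative sketch, and char_info mirrors the CHAR_INFO declaration from <wincon.h> with the Windows typedefs spelled out:

    #include <stdint.h>

    /* One VGA text-mode cell: a fixed 2 bytes per screen position. */
    struct vga_cell {
        uint8_t character;    /* code point in the active 8-bit code page */
        uint8_t attributes;   /* foreground/background colour, blink bit */
    };

    /* One Win32 console cell: a fixed 4 bytes per screen position.
       This mirrors CHAR_INFO from <wincon.h>, with WCHAR and WORD
       written out as uint16_t. */
    struct char_info {
        union {
            uint16_t UnicodeChar;   /* one UTF-16 code unit */
            char     AsciiChar;
        } Char;
        uint16_t Attributes;
    };

Moving one cell to the right is then just a fixed-size stride through the array.
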
I think you have a very outdated mental model of how the Windows console works and how it represents and encodes characters. In particular, the width of a character is NOT determined by the length of its byte sequence, but by the font glyphs used to display those characters.
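
(As an illustrative aside, not something the console host itself does: the GDI sketch below measures two strings that are each a single UTF-16 code unit, and the reported pixel widths differ because they come from the font's glyph metrics, not from the byte count. Link with gdi32.)

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        HDC  hdc = GetDC(NULL);     /* screen DC with a default font selected */
        SIZE narrow, wide;

        /* One UTF-16 code unit each, but different rendered widths. */
        GetTextExtentPoint32W(hdc, L"A", 1, &narrow);
        GetTextExtentPoint32W(hdc, L"\u3042", 1, &wide);   /* HIRAGANA LETTER A */

        wprintf(L"'A':     %ld px\n", narrow.cx);
        wprintf(L"U+3042:  %ld px\n", wide.cx);

        ReleaseDC(NULL, hdc);
        return 0;
    }
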
The CHAR_INFO structure is defined as a 32-bit structure with 16 bits for attributes and 16 bits for character code. The Win32 API allows arrays of that structure to be read and written directly with the ReadConsoleOutput and WriteConsoleOutput functions. This means that there is absolutely no way that a native Win32 console could possibly store its characters in a variable number of bytes; in particular, the structure cannot store emoji VS15/VS16 (U+FE0E/U+FE0F) sequences because it was never intended for that purpose.
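
As a minimal sketch of that fixed layout (the specific cells written here are just an example): each CHAR_INFO record holds exactly one 16-bit code unit plus 16-bit attributes, so a base character followed by VS16 ends up as two separate cells rather than one combined sequence.

    #include <windows.h>

    int main(void)
    {
        HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);

        /* Two fixed-size records, one per cell: U+2764 (HEAVY BLACK HEART)
           and U+FE0F (VS16) each occupy their own cell; a record has no
           room for a multi-code-point sequence. */
        CHAR_INFO cells[2];
        cells[0].Char.UnicodeChar = 0x2764;
        cells[0].Attributes       = FOREGROUND_RED;
        cells[1].Char.UnicodeChar = 0xFE0F;
        cells[1].Attributes       = FOREGROUND_RED;

        COORD      size   = { 2, 1 };          /* source buffer: 2 cells wide, 1 high */
        COORD      origin = { 0, 0 };
        SMALL_RECT region = { 0, 0, 1, 0 };    /* top-left 2x1 area of the screen buffer */

        WriteConsoleOutputW(out, cells, size, origin, &region);
        return 0;
    }
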
You seem to assume that the layout of characters in memory is the same as their layout on display. This was true for MS-DOS terminals, but it is no longer true on modern Windows versions, where WriteConsoleOutput and similar APIs do not write directly to the video memory. Instead, they write to some intermediate memory structure, which is thereafter used to draw the corresponding font glyphs on display. I'm quite sure that the actual drawing on the glass is performed using shaping engines such as DirectWrite, which consult the font glyph metrics to determine the width of glyphs on display. The actual width of characters as shown on display is therefore not directly determined by the number of bytes the characters take in their UTF-16 representation.
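
A small sketch of that separation (assuming stdout is an attached console and the cursor is not at the end of a line): the cells read back from the screen buffer are the intermediate per-cell code units and attributes; how wide the corresponding glyphs end up on screen is decided separately by the rendering layer.

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        HANDLE out = GetStdHandle(STD_OUTPUT_HANDLE);
        CONSOLE_SCREEN_BUFFER_INFO info;
        DWORD written;

        GetConsoleScreenBufferInfo(out, &info);         /* remember the cursor position */
        COORD start = info.dwCursorPosition;

        WriteConsoleW(out, L"Hi", 2, &written, NULL);   /* write through the normal text path */

        /* Read the same two cells back: what comes out is the per-cell
           code unit + attributes, not pixels. */
        CHAR_INFO cells[2];
        COORD size   = { 2, 1 };
        COORD origin = { 0, 0 };
        SMALL_RECT region = { start.X, start.Y, (SHORT)(start.X + 1), start.Y };
        ReadConsoleOutputW(out, cells, size, origin, &region);

        wprintf(L"\ncell 0: U+%04X, cell 1: U+%04X\n",
                cells[0].Char.UnicodeChar, cells[1].Char.UnicodeChar);
        return 0;
    }
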
The legacy Win32 console does not use DirectWrite, so what you're describing seems to be Windows Terminal, which, as I said, is a Unix-like terminal; it is not a native Win32 console at all. It can emulate a Win32 console, but it does not run one natively. This only further shows that emoji VS15/VS16 sequences in terminals exist only in a Unix-like context.
