On 10/24/2025 10:54 PM, [email protected] wrote:
On 25 October 2025 at 00:38, Asmus Freytag via Unicode <[email protected]> wrote:

    On 10/24/2025 2:58 PM, [email protected]
    <mailto:[email protected]> via Unicode wrote:
    and not subject to font variation.

    That's overstating things.

    A./

How is that overstating things?

Because exact glyph details are not normative. Especially for compatibility characters. Here, the intent is usually to facilitate a unique and unambiguous mapping between some kind of legacy character and a Unicode character.

I think the analysis for the curved connectors, which unified two distinct elements because their renderings were close, was a mistake: the distinction occurred within the same set, and unifying the characters killed fidelity in round-trip conversion for many members of the "large set" while "saving" only one character code. In my personal view, that's precisely the wrong way to do unification.
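To make the round-trip point concrete, here is a minimal Python sketch. The legacy byte values 0x01 to 0x03 are hypothetical placeholders, not actual HP 264x codes; only the unified target 1CE2B is taken from this thread, and U+2500 merely stands in for an unrelated, uniquely mapped character.

```python
# Hypothetical legacy-to-Unicode table. The legacy byte values are
# made up for illustration; only U+1CE2B comes from the discussion.
LEGACY_TO_UNICODE = {
    0x01: 0x2500,   # some unrelated character with a unique mapping
    0x02: 0x1CE2B,  # first connector in the source set
    0x03: 0x1CE2B,  # second connector, unified with the first
}


def build_reverse(forward):
    """Build the Unicode-to-legacy table; the first legacy code wins."""
    reverse = {}
    for legacy, code_point in forward.items():
        reverse.setdefault(code_point, legacy)
    return reverse


def round_trip(data, forward):
    """Convert legacy codes to Unicode and back again."""
    reverse = build_reverse(forward)
    return [reverse[forward[code]] for code in data]


def lossy_codes(forward):
    """List the legacy codes that cannot survive a round trip."""
    groups = {}
    for legacy, code_point in forward.items():
        groups.setdefault(code_point, []).append(legacy)
    return sorted(
        legacy
        for group in groups.values()
        if len(group) > 1
        for legacy in group[1:]
    )
```

With this table, `round_trip([0x01, 0x02, 0x03], LEGACY_TO_UNICODE)` yields `[0x01, 0x02, 0x02]`: the second connector comes back as the first, which is the fidelity loss described above.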

When a legacy computing platform defines blocks in terms of fractions, it does so to ensure specific alignment with those fractions, making it part of the fundamental character identity. On the other hand, when a legacy computing platform defines strokes in terms of stem weight and there is known variation across platforms, it is inappropriate to define those characters using exact fractions when those fractions mismatch some of the platforms.

So far, you have only argued that a font (or bitmap) used to emulate a specific legacy platform should faithfully adhere to any specifications that apply to that platform.

There is nothing wrong with the same *Unicode* character being rendered slightly differently when used to emulate *different* platforms, unless it is the very same platform that exhibits different shapes (and in the same display "mode" or "shift"). In that case, the principle of source set separation becomes applicable, which is the principle that should have been applied to the curved connector case. (If it makes you happy, you can cite my opinion on that.)

However, I didn't spot where that would have been the case for the line segments. From my quick perusal of the proposals and the critique here, it seems that this is a matter of the different displays having different weights, and therefore the preferred font / bitmap cannot be the same in each context. However, there's no implied need to be able to emulate a screen where different parts of the emulator support different legacy systems. Usually, a single window (or nested window) would display a single emulator.

Again, the identity of the Unicode character is given by encoding the intended mappings. If Unicode decides to map the same character to similar characters on different platforms, that is not a problem, as long as implementers know that the intent is to use a platform-specific rendering (and not assume that there is only one possible rendering per character).
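As a rough illustration of what "platform-specific rendering" could mean for an implementer, the following Python sketch picks one font per emulated platform and applies it to the whole display. The font names are hypothetical placeholders, and the platform labels are simply the ones used elsewhere in this thread; this is not a description of any real emulator.

```python
# Hypothetical font names; a real emulator would ship its own
# platform-accurate fonts or bitmaps.
PLATFORM_FONTS = {
    "C64": "hypothetical-c64-font",
    "PET/VIC-20": "hypothetical-pet-font",
}


def font_for(platform):
    """Return the single font used for everything on this platform."""
    try:
        return PLATFORM_FONTS[platform]
    except KeyError:
        raise ValueError(f"no font registered for platform {platform!r}")


def render_plan(text, platform):
    """Pair every character with the platform's font.

    The same Unicode character gets a different font, and therefore
    possibly a different shape, depending on the emulated platform.
    """
    font = font_for(platform)
    return [(ch, font) for ch in text]
```

The design choice here is that the font is selected once per display, never per character, which matches the assumption that a single window shows a single emulated system.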

If you feel that the guidance available to implementers in the text of the standard or in an annotation of the nameslist is not sufficient, then the remedy would be to ask for the explanation to be updated. We are unfortunately locked in as far as character names are concerned, but we can add a note (best in the text of the standard) that explains that emulators for some systems will need an adjusted design so a sequence or other arrangement of these characters looks correct.

A./

PS: I see that you confirm below that the two cases are of a different nature.

On 25 October 2025 at 00:44, Asmus Freytag via Unicode <[email protected]> wrote:

    On 10/24/2025 2:54 PM, Nitai Sasson via Unicode wrote:
    If you use a font that makes those Unicode characters look like
    they did on their original platform, there is no issue. But a
    given font can only emulate one platform at a time. You're not
    going to get a C64 and PET/VIC-20 frankenstein of a document.
    Take your pick: do you want it to look like C64, or do you want
    it to look like PET/VIC-20? Choose your font accordingly.

    Round tripping plain text to a mix of devices is not a goal, just
    as round tripping plain text Han characters to a mix of regional
    variants is not a goal.

    You (Piotr) need to demonstrate that for a single display, on a
    single device or emulator for a single device, you cannot get the
    correct appearance by systematically using a device appropriate font.

    If a device supports "shifted" modes, then a device appropriate
    font may change based on the shift status.

    Only when that accommodation fails to produce the correct
    appearance is there a case for further disunification.

    The diagonal connector issue satisfies this requirement, but as
    far as I have been able to understand, the block characters do not.

    A./


In the case of the PETSCII and Apple II characters, the source characters have a character identity that is incompatible with that of the Unicode characters they are mapped to. The conflict is therefore between the legacy platform and Unicode. In the case of the HP 264x characters, by contrast, two source characters whose identities are incompatible with each other are mapped to the same Unicode character. The conflict there is between the two source characters themselves.

    The required evidence to support a request for disunification
    therefore always consists of a document (screenshot) (usually
    other than a character set table) that shows that the two
    characters are distinct in their source environment and that that
    distinction matters (for example, that it can't be determined
    mechanically by context).

    From the original document (section 1, page 1), it looks like
    there are two characters that are distinct in the source, but have
    been mapped to the same Unicode character 1CE2B. I can certainly
    sympathize with the view that unifying these based on their close
    visual similarity was what we used to call a case of "arms-length"
    unification.


As I have explained in "Odp: Re: Unicode fundamental character identity" <https://corp.unicode.org/pipermail/unicode/2025-January/011312.html>, this is what it looks like in a screenshot: https://i.imgur.com/obGQ4Ie.png . The screenshot shows the two different characters and their different types of connections. Furthermore, since all character tiles are visually independent, and both characters may be used as isolated character cells, no contextual mechanism can possibly apply.


