On 12/4/2025 4:35 AM, [email protected] via Unicode wrote:
I have investigated the situation further, and it seems that the defect in the Unicode 13.0–17.0 mapping is even more fundamental than I previously thought. In particular, the proposal L2/25-037 does not acknowledge the proposal L2/00-159, which had already been incorporated into Unicode 3.2.

In that proposal, the description of characters U+23B8 (LEFT VERTICAL BOX LINE) and U+23B9 (RIGHT VERTICAL BOX LINE) exactly matches the proposed characters L2/25-037:1FBFC (BOX DRAWINGS LIGHT LEFT EDGE) and L2/25-037:1FBFD (BOX DRAWINGS LIGHT RIGHT EDGE). In both proposals, those two characters are specified to be aligned to the left or right edge, span the entire edge (extending to the top and bottom), and match the thickness of Box Drawings Light lines. The description of the characters U+23BA (HORIZONTAL SCAN LINE-1) and U+23BD (HORIZONTAL SCAN LINE-9) also exactly matches the proposed characters L2/25-037:1FBFA (BOX DRAWINGS LIGHT TOP EDGE) and L2/25-037:1FBFB (BOX DRAWINGS LIGHT BOTTOM EDGE). In both proposals, those two characters are specified to be aligned to the top and bottom edges, span the entire edge (extending to the left and right), and match the thickness of Box Drawings Light lines.

However, the proposal L2/00-159 had already set a precedent for using [U+23BA, U+23BD, U+23B8, U+23B9] (and not the 1÷8 blocks or 1÷4 blocks) in mappings to certain platforms, such as the Heath/Zenith 19 Graphics Character Set and the DEC Special Graphics Character Set. This contrasts with the use of the 1÷8 blocks [U+2594, U+2581, U+258F, U+2595] and other related 1÷8 or 7÷8 block characters in the mappings to PETSCII and Apple II. Therefore there is a discrepancy between the legacy platforms added in Unicode 3.2 (which use the box drawing lines 23B8, 23B9, 23BA, 23BD) and the legacy platforms added in Unicode 13.0–17.0 (which use the 1÷8 blocks 2594, 2581, 258F, 2595).
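For anyone who wants to see the two competing sets side by side, here is a minimal sketch in Python. It only looks up the standard character names with the stdlib `unicodedata` module; the grouping by edge follows the description above, and the names `EDGE_CANDIDATES`, `describe` and `edge_table` are made up for this illustration. The proposed 1FBFA to 1FBFD code points are left out because they are not encoded.

```python
import unicodedata

# Code points quoted in the message above, grouped by the edge they occupy.
# Each pair is (box-line character from Unicode 3.2, one-eighth block).
# The grouping is an assumption taken from the description in this thread.
EDGE_CANDIDATES = {
    "top": (0x23BA, 0x2594),
    "bottom": (0x23BD, 0x2581),
    "left": (0x23B8, 0x258F),
    "right": (0x23B9, 0x2595),
}


def describe(cp: int) -> str:
    """Return 'U+XXXX NAME', or a placeholder if the code point has no name."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp), '<unnamed>')}"


def edge_table() -> dict:
    """Map each edge to the described (line character, block character) pair."""
    return {
        edge: (describe(line), describe(block))
        for edge, (line, block) in EDGE_CANDIDATES.items()
    }


if __name__ == "__main__":
    for edge, (line, block) in edge_table().items():
        print(f"{edge:<7} {line:<32} {block}")
```

The script says nothing about glyph thickness or alignment, which is the actual point of contention; it only makes it easier to check which characters are being discussed.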

On 25 October 2025 10:27, [email protected] via Unicode <[email protected]> wrote:


    On 25 October 2025 08:29, Asmus Freytag via Unicode
    <[email protected]> wrote:

        Again, the identity of the Unicode character is given by
        encoding the intended mappings. If Unicode decides to map the
        same character to similar characters on different platforms,
        that is not a problem, as long as implementers know that the
        intent is to use a platform-specific rendering (and not assume
        that there is only one possible rendering per character).

        If you feel that the guidance available to implementers in the
        text of the standard or in an annotation of the nameslist is
        not sufficient, then the remedy would be to ask for the
        explanation to be updated. We are unfortunately locked in as
        far as character names are concerned, but we can add a note
        (best in the text of the standard) that explains that
        emulators for some systems will need an adjusted design so a
        sequence or other arrangement of these characters looks correct.

    Indeed the character names cannot be changed due to stability
    policies. An explanation note has been provided for U+1FB81 that
    claims "The lines corresponding to 3 and 5 are not actually block
    elements, but can show any horizontally repeating pattern", but
    still implicitly enforces 1÷8 blocks for top and bottom. However,
    this doesn't address other cases such as the PETSCII C64
    variation. And if 1FB70–1FB81 1FBB5–1FBB8 1FBBC were all noted to
    no longer require exact 1÷8 blocks, that would also not remedy the
    issue because it would introduce an inconsistency with the
    existing 1÷8 or 7÷8 block characters 2581 2589 258F 2594–2595,
    which already have established compatibility precedents that
    require the exact fraction, but are also used in the Unicode 13.0
    mapping to PETSCII and Apple II character sets despite those
    platforms using varying thickness (consistent with light box
    drawings, except for the 1÷8 top and bottom blocks in C64, where
    the 1÷4 top and bottom blocks are made consistent instead).




What is missing is an actual proposal. That is, not just analysis or exposition, but actual proposed wording or proposed encoding that would fix the issue.

That would need to be provided as a UTC document (aka L2 document) submission, with the analysis appended in a background section.

A./

PS: I am not convinced that platform-specific mappings (glyphs) are an issue, because the scenario where these data are reliably transferred *between* legacy implementations can't have existed then, so it's questionable why it needs to be perfect today. My assumption would be that the use case is lossless round trip from (each) legacy emulator to Unicode and back. Having PETSCII / Apple II specific characters does not improve things, because any data stream containing those could not be displayed on any other emulator. This is different from legacy characters mapped to letters and common text symbols because we have an expectation that we can share text across devices (or emulators).
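To make the round-trip assumption concrete, here is a rough sketch in Python. The byte values and the tables `PLATFORM_A` and `PLATFORM_B` are invented for illustration and do not come from any real mapping file; the helper names are hypothetical too. The only point is that each table round-trips on its own as long as it is one-to-one, even when two platforms share the same Unicode characters.

```python
def is_round_trippable(to_unicode: dict) -> bool:
    """True if no two legacy codes share a Unicode target (table is invertible)."""
    return len(set(to_unicode.values())) == len(to_unicode)


def round_trip(data: bytes, to_unicode: dict) -> bytes:
    """Convert legacy bytes to Unicode text and back through the same table."""
    from_unicode = {ch: code for code, ch in to_unicode.items()}
    text = "".join(to_unicode[b] for b in data)
    return bytes(from_unicode[ch] for ch in text)


# Hypothetical table fragments: the byte values are made up, not real code charts.
PLATFORM_A = {0x01: "\u2594", 0x02: "\u2581"}  # upper / lower one eighth block
PLATFORM_B = {0x10: "\u2594", 0x11: "\u2581"}  # same characters, other bytes

if __name__ == "__main__":
    for name, table in (("A", PLATFORM_A), ("B", PLATFORM_B)):
        sample = bytes(table)  # every legacy code in the table, once
        print(name, is_round_trippable(table), round_trip(sample, table) == sample)
```

Sharing U+2594 and U+2581 between the two hypothetical platforms does not break either round trip; what differs is only the glyph each emulator draws for them.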
