Hi Richard,

On Fri, Feb 1, 2019 at 12:19 AM Richard Wordingham via Unicode <[email protected]> wrote:
> Cropped why? If the problem is the truncation of lines, one can simple
> store the next character.

Yup, truncation of lines, for example.

I agree that one could "store the next character". We could extend the
terminal emulation protocol so that, by some means, you can specify that
column 80 contains a letter X and, even though there's no column 81, an
app can still tell the terminal emulator to imagine that column 81
contains the letter Y, and to perform shaping accordingly. This would
need to work not just at the edge of the terminal, but at any position,
and in both directions. Think of e.g. a vertically split tmux. You
should be able to say that column 40 contains X, which should be shaped
as if column 41 contained Y, and that column 41 contains Z, which should
be shaped as if column 40 contained A.

What I cannot see at all is how this could be done "simply". Could you
please elaborate on that? I don't find this simple at all!

> > It's not able to
> > separate different UI elements that happen to be adjacent in the
> > terminal, separated by different background color or such.
>
> ZWJ and ZWNJ can handle that.

Wouldn't it be a semantic misuse of these characters, though? They are
supposed to be present in the logical order, and in logical order (that
is: the terminal's implicit mode) they can work as desired. Are they
okay to be present in visual order (the terminal's explicit mode, which
is what we're discussing now) too?

Anyway, ZWJ/ZWNJ aren't sufficient to handle the cases I outlined above.

> If a general text manipulating application, e.g. cat, grep or awk, is
> writing to a file, it should not convert normal Arabic characters to
> presentation forms. You are now asking a general application to
> determine whether it is writing to a terminal or not, and alter its
> output if it is writing to a terminal.

No, this is absolutely not what I'm talking about!

There are two vastly different modes of the terminal. For "cat", "grep"
etc. the terminal will be in implicit mode.
Absolutely no BiDi handling is expected from these apps; the terminal
will do BiDi and shaping (perhaps using HarfBuzz, perhaps using
presentation form characters as temporary low-hanging fruit until
something better is implemented; the choice is obviously up to the
implementation and not to the specification).

For "emacs" and friends, an explicit mode is required where visual order
is passed to the terminal. What we're discussing is how to handle
shaping in this mode.

> But it as an issue that needs to be addressed. As a terminal can be
> addressed by cell, an application may need to keep track of what text
> went into each cell. Misery results when the application gets it wrong.

My recommendation doesn't change this principle at all. In the lower
(emulation) layer every character still goes into the cell it used to go
to, and is addressable using cursor motion escapes and so on, exactly as
without BiDi.

> How many cells do CJK ideographs occupy? We've had a strong hint
> that a medial BEH should occupy one cell, while an isolated BEH should
> occupy two.

CJK ideographs occupy two, but they do so regardless of what's around
them. That is, they already occupy two cells in the logical buffers, in
the emulation layer.

There is absolutely no sane way in terminal emulation to make a
character's logical width (as in the number of cells it occupies) depend
on its neighboring characters. (And even if we could by some terrible
hacks, it would break the principle you just stated as "misery
results...", and the principle Eli stated that things should remain
reasonably simple, otherwise hardly anyone will bother implementing
them.) This is a compromise Arabic folks will have to accept.

When displaying, it's up to terminal emulators to perhaps widen or
shrink cells as they want to (they might even give up on monospace fonts
entirely), but then they risk vertical lines not lining up perfectly,
content overflowing on the right, etc. Konsole does such things.

cheers,
egmont
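
To make the width point above concrete, here is a minimal sketch in
Python of a context-free width function. It is only an illustration
under stated assumptions, not what any particular terminal emulator
implements: the names cell_width and row_width are hypothetical, and the
rule (East Asian Wide/Fullwidth takes two cells, combining and format
characters take none, everything else takes one) is a simplification of
the usual wcwidth-style logic.

```python
import unicodedata


def cell_width(ch: str) -> int:
    """Cells occupied by one code point, judged from its own properties only.

    Assumption: a simplified wcwidth-like rule, for illustration.
    """
    # Combining marks and format characters (e.g. ZWJ/ZWNJ) take no cell.
    if unicodedata.combining(ch) or unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    # East Asian Wide and Fullwidth characters take two cells.
    if unicodedata.east_asian_width(ch) in ("W", "F"):
        return 2
    return 1


def row_width(text: str) -> int:
    """Total cells for a string: a plain sum, with no look at neighbors."""
    return sum(cell_width(ch) for ch in text)
```

Because cell_width never looks at the neighbors, BEH (U+0628) comes out
as one cell whether it would be shaped as isolated, initial, medial or
final, which is the compromise described above.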

