E. Keown wrote:

       Elaine Keown
       Tucson

Dear Peter,



*plain text* standard is the bidirectional
algorithm, which sorts out how a (horizontal)
*line* of text is laid out when text of opposite
directions



In the 'old' Unicode 3.0 there was a one-line note on
doing boustrophedon near the bidi material.
Boustrophedon is needed not 'just' in Archaic Greek,
but also in some periods of Egyptian and in some early
Semitic stuff.


For a small percentage of early Semitic stuff, it
would be convenient to be able to automatically
reverse the direction in a database, so the retrieval
algorithm could look at 'both directions.'


That shouldn't be a problem, or even an issue. Remember, no matter which direction the text runs on the page, Unicode text is stored in logical order, not visual order. So even a huge text that happens to be rendered boustrophedon is still stored as a sequence of characters in reading order, and you don't need to "reverse" anything when you're searching. If you're looking for "herman", the letters will be in exactly that order no matter which line of the text the word wound up on.
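To make that concrete, here is a rough sketch in Python. The sample sentence, the `visual_lines` helper and the line width are all made up for illustration; mirroring a line by reversing code points is a simplification that ignores combining marks and glyph mirroring. The point is only that the search runs on the stored string, so the layout never enters into it.

```python
# Stored text is in logical (reading) order; this is what a database holds.
stored = "the scribe herman wrote this line"


def visual_lines(text, width):
    """Hypothetical display step: wrap at `width`, mirror every second line.

    Reversing code points is a simplification; real text would need to keep
    combining sequences together and mirror the glyphs as well.
    """
    rows = [text[i:i + width] for i in range(0, len(text), width)]
    return [row if n % 2 == 0 else row[::-1] for n, row in enumerate(rows)]


# Searching happens on the stored text, so the direction on the page
# is irrelevant.
position = stored.find("herman")

# Only the displayed form has the word running "backwards".
displayed = visual_lines(stored, 11)
```

With this sample, the word lands on the second (mirrored) display line, yet the search on `stored` finds it without any reversal.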

Is there a larger 'boustrophedon' note in Unicode 4.0?
Is there any interest in expanding the bidi algorithm
to definitely cover all possible RTL - LTR
boustropheda (plural?) ?

Boustrophedon is probably outside the scope of plain, unmarked Unicode. Which is not as bad as it sounds. So far as a computer is concerned, text is a stream of characters, in logical reading order. There is none of this "lines" business or reversing of directions, even if some of the characters are newline characters; none of that means anything in terms of how the data is stored. It's only when the data is *rendered* on a screen or on paper that the bidi algorithm takes over and dictates where to put the various glyphs.

The bidi algorithm is enough of a headache as it stands, just trying to deal with RTL and LTR scripts and their possible coexistence on a single line. Boustrophedon is far too complex for it. Probably what you'd do is have some higher-level markup tag saying "Begin boustrophedon here..." which your renderer would know to interpret properly: as it breaks the text into lines, it reverses every other one, and so on. You'd have stuff like "<boust></boust>" tags or something equivalent. The same goes for all the possible variants of boustrophedon, and whatever other exotic directions turn up.
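Something like the following Python sketch is what I have in mind for the renderer side. Everything here is an assumption for illustration: the "<boust>" tag is the made-up one from above, `wrap` and `render` are hypothetical names, lines are broken at a fixed width, and "reversing" a line just reverses code points (no glyph mirroring, no care for combining marks). It is not the bidi algorithm and not any existing standard.

```python
import re

# Hypothetical markup: text between <boust> and </boust> is laid out
# boustrophedon; everything else is laid out normally.
BOUST = re.compile(r"<boust>(.*?)</boust>", re.DOTALL)


def wrap(text, width):
    """Break text into fixed-width lines (an empty string gives no lines)."""
    return [text[i:i + width] for i in range(0, len(text), width)]


def render(marked_up, width):
    """Return display lines; inside <boust>, every other line is reversed."""
    out = []
    pos = 0
    for m in BOUST.finditer(marked_up):
        # Ordinary text before the tagged run is wrapped as-is.
        out.extend(wrap(marked_up[pos:m.start()], width))
        # Tagged text: keep even-numbered lines, reverse odd-numbered ones.
        for n, row in enumerate(wrap(m.group(1), width)):
            out.append(row if n % 2 == 0 else row[::-1])
        pos = m.end()
    # Ordinary text after the last tagged run.
    out.extend(wrap(marked_up[pos:], width))
    return out


lines = render("abcdef<boust>ABCDEFGHI</boust>xyz", 3)
```

The stored string is never altered; only the list of display lines has the alternating direction, which is why this belongs in the renderer and the markup rather than in the character data.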

The discussion so far on the list doesn't appear to me
to cover every possibility....my impression is that
there are probably sub-varieties of boustrophedon and
of the vertical material....sometimes individual
characters get re-aligned, turned a certain number of
degrees, and maybe sometimes they don't.


That's okay. Things like that are outside the scope of plain Unicode. Higher-level standards (XML-based markup, or whatever) need to be developed to handle them.

~mark




