E. Keown wrote:
Elaine Keown
Tucson
Dear Peter,
*plain text* standard is the bidirectional
algorithm, which sorts out how a (horizontal)
*line* of text is laid out when text of opposite
directions
In the 'old' Unicode 3.0 there was a one-line note on
doing boustrophedon near the bidi material.
Boustrophedon is needed not 'just' in Archaic Greek,
but also in some periods of Egyptian and in some early
Semitic stuff.
For a small percentage of early Semitic stuff, it
would be convenient to be able to automatically
reverse the direction in a database, so the retrieval
algorithm could look at 'both directions.'
That shouldn't be a problem, not even an issue. Remember, no matter
which direction the text runs on the page, Unicode text is stored in
logical order, not visual order. So a huge text that happens to be
rendered boustrophedon is still stored as a sequence of characters in
reading order. So you don't need to "reverse" the direction of anything
when you're searching. If you're looking for "herman", the letters will
be in exactly that order no matter which line of the text it wound up on.
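To make that concrete, here is a minimal sketch in Python. The sample string and the helper names (find_all, wrap) are made up for illustration; the only point is that the search runs over the stored string and never sees where the line breaks fell.

```python
# Hypothetical sample: one logical-order string, as it would sit in a database.
stored = "here lies herman the scribe son of herman"

def find_all(text, needle):
    """Return every index at which needle occurs in the stored text."""
    hits = []
    start = text.find(needle)
    while start != -1:
        hits.append(start)
        start = text.find(needle, start + 1)
    return hits

def wrap(text, width):
    """Cut the text into display lines of at most `width` characters."""
    return [text[i:i + width] for i in range(0, len(text), width)]

# Two different layouts of the same text: the lines differ, the data does not.
narrow = wrap(stored, 7)
wide = wrap(stored, 20)
assert "".join(narrow) == "".join(wide) == stored

# The search is done on the stored string, so layout never enters into it.
hits = find_all(stored, "herman")
```

Whether a renderer later draws some of those lines right-to-left makes no difference to find_all, because it is never handed the lines.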
Is there a larger 'boustrophedon' note in Unicode 4.0?
Is there any interest in expanding the bidi algorithm
to definitely cover all possible RTL - LTR
boustropheda (plural?) ?
Boustrophedon is probably outside the scope of unmarked Unicode. Which
is not as bad as it sounds. So far as a computer is concerned, text is
a stream of characters, in logical reading order. None of this silly
"lines" business, and reversing directions, even if some of the
characters are newline characters. That doesn't mean anything in terms
of how the data is stored. It's only when the data is *rendered* on a
screen or on paper that the bidi algorithm takes over and dictates where
to put the various marks. The bidi algorithm is enough of a headache as
it stands, just trying to deal with RTL and LTR scripts and their
possible coexistence on a single line. Boustrophedon is far too complex
for it. Probably what you'd do is have some higher-level markup tag
saying "Begin boustrophedon here..." which your renderer would know to
interpret properly: as it breaks the text into lines, reverse every
other one, etc etc... You'd have stuff like "<boust></boust>" tags or
something equivalent. The same goes for all the various possible variants
of boustrophedon, and whatever other exotic directions turn up.
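A rough sketch of what such a renderer might do with the text inside a (hypothetical) boustrophedon tag, in Python. The function name and the fixed-width line breaking are assumptions for illustration. Reversing code points one by one is a simplification: it ignores combining marks and the mirrored glyph forms some inscriptions use, which a real renderer would have to handle.

```python
def render_boustrophedon(text, width):
    """Break logical-order text into lines and reverse every other one.

    Assumes fixed-width lines and one code point per displayed character.
    """
    lines = [text[i:i + width] for i in range(0, len(text), width)]
    out = []
    for n, line in enumerate(lines):
        if n % 2 == 1:
            # Every second line runs the other way. Pad first so a short
            # final line still starts at the edge where the text turned.
            line = line.ljust(width)[::-1]
        out.append(line)
    return out

# The stored string stays in reading order; only the display lines change.
for row in render_boustrophedon("abcdefghijklmnop", 6):
    print(row)
```

The stored text is never modified, so searching and sorting carry on exactly as for any other text.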
The discussion so far on the list doesn't appear to me
to cover every possibility....my impression is that
there are probably sub-varieties of boustrophedon and
of the vertical material....sometimes individual
characters get re-aligned, turned a certain number of
degrees, and maybe sometimes they don't.
That's okay. Things like that are outside of plain Unicode's
capabilities. Other standards (XML stuff, whatever) need to be
developed to handle them.
~mark