At 03:29 PM 6/25/2003, Kenneth Whistler wrote:

> This is not simply
> 'non-traditional' but results in incorrect rendering and a different
> vocalisation of the text.

I don't think this is true.

First, the intent of the (admittedly problematical) fixed position
combining classes was that the position of the relevant marks,
including the relevant Hebrew points, was fixed with respect to
the consonant base letter, so that application of one would not
impact the rendering of application of another.

This idea of Hebrew vowels as 'fixed' marks is problematical, because in Biblical Hebrew they are not fixed: they move relative to additional marks (other vowels or cantillation marks).


It may be more *difficult* for applications to do correct rendering,
but there was never any intention in the standard that I know
of that a sequence <hiriq, patah> would render differently
than a sequence <patah, hiriq>.

Yes, this is what I am saying is wrong: <hiriq, patah> *should* render differently from <patah, hiriq>. This example is particularly important, because it occurs in the spelling of yerushalaim, the Masoretic approximation of yerushalayim. Correct rendering requires that the hiriq follows the patah, and not vice versa.


And never any intent that it
would represent a "different vocalisation of the text".

Fair enough for modern Hebrew. Fair enough for phonetically accurate Hebrew. Not good enough for Biblical Hebrew in which vocalisation reflects Masoretic pronunciation applied to ancient consonant structures.


> The point is that hiriq before patah is *not*
> canonically equivalent to patah before hiriq,

This is true.

> except in the erroneous
> assumption of the Unicode Standard: the order of vowels makes words sound
> different and mean different things.

This is not. The Unicode Standard makes no assumptions or claims
about what the phonological or meaning equivalence of <hiriq, patah>
or <patah, hiriq> is for Biblical Hebrew.

But it does make assumptions about the canonical equivalence of the mark orders <U+05B4, U+05B7> and <U+05B7, U+05B4>, unless my understanding of the purpose of combining classes is completely mistaken. My understanding is that any ordering of two marks with different combining classes is canonically equivalent; further, I understand that some normalisation forms will re-order marks to move marks with lower combining class values closer to the base character. If the sequence <lamed, patah, hiriq, final mem> is what the text says, normalisation that re-orders the sequence as <lamed, hiriq, patah, final mem> is erroneous.
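
For anyone who wants to see the re-ordering concretely, here is a minimal sketch using Python's standard unicodedata module. The variable names are mine, chosen for illustration only; the combining class values are those reported by the Unicode Character Database (hiriq 14, patah 17).

```python
import unicodedata

LAMED, PATAH, HIRIQ, FINAL_MEM = "\u05DC", "\u05B7", "\u05B4", "\u05DD"

# Canonical combining classes from the Unicode Character Database.
print(unicodedata.combining(HIRIQ))  # 14
print(unicodedata.combining(PATAH))  # 17

# The order the text actually has: <lamed, patah, hiriq, final mem>.
as_written = LAMED + PATAH + HIRIQ + FINAL_MEM

# Normalisation applies canonical ordering, which sorts adjacent marks
# by combining class, so the hiriq (14) moves ahead of the patah (17).
normalised = unicodedata.normalize("NFC", as_written)

for label, text in (("as written", as_written), ("normalised", normalised)):
    print(label, [unicodedata.name(ch) for ch in text])
```

The second list printed shows HEBREW POINT HIRIQ before HEBREW POINT PATAH, which is the re-ordering objected to above.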


The fact that traditional Biblical Hebrew spelling prefers one
order of representation and canonically ordered Unicode text
specifies the opposite order may be a problem for implementations,
but that problem does not extend to the claims that John is
making here.

This isn't a problem for implementations. This is a problem of Unicode canonical ordering re-ordering marks whose order is lexically significant. The fact that, in some cases, the canonical ordering also cannot be rendered with existing implementations simply makes the problem visually obvious.


> In order to correctly encode and render the Biblical Hebrew text, it is
> necessary to either a) never use normalisation routines that re-order marks
> (which is beyond the control of document authors), or b) re-classify the
> existing Hebrew marks so that all vowels are in a single class and will not
> be re-ordered during normalisation, or c) encode new marks for Biblical
> Hebrew with all vowels in a single class.


I don't think these conclusions follow from the current
situation.

Such changes are certainly not necessary in order to *render*
Biblical Hebrew text correctly, nor to accurately represent
the content of Biblical Hebrew text.

They are necessary to render Biblical Hebrew text correctly using current font and layout engine technologies. These technologies work perfectly for Biblical Hebrew so long as Unicode canonical ordering is ignored. I think there is very little impetus to change or develop new implementations to take into account what strikes most of those involved with Biblical Hebrew text processing as an error in Unicode.


The current situation is not optimal for implementations, nor
does canonically ordered text follow traditional preferences
for spelling order -- that we can agree on. But I think the
claims of inadequacy for the representation or rendering
of Biblical Hebrew text are overblown.

I've spent nine months working on Biblical Hebrew rendering for the major user community (the Society of Biblical Literature and their Font Foundation partners), and their take on this is that a) they want a solution that works with today's technology, and b) they will avoid Unicode canonical ordering like the plague and use custom normalisations instead. When we conducted normalisation tests, switching from Unicode normalisation to a custom normalisation that does not re-order vowels or meteg*, we increased the number of unique consonant + mark(s) sequences encoded in the Old Testament text by more than 340. This means that Unicode normalisation was creating more than 340 textual ambiguities by treating lexically distinct sequences as canonically equivalent. I don't think that kind of textual ambiguity is 'overblown'.
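
The kind of comparison involved can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the test we actually ran: the helper names (clusters, distinct_marked) and the two-word sample are hypothetical, and splitting on non-zero combining class is a simplification of real consonant + mark segmentation.

```python
import unicodedata

def clusters(text):
    """Split text into base + mark(s) sequences (simplified: any character
    with a non-zero combining class attaches to the preceding base)."""
    out = []
    for ch in text:
        if unicodedata.combining(ch) and out:
            out[-1] += ch
        else:
            out.append(ch)
    return out

def distinct_marked(text):
    """Return the set of distinct sequences that carry at least one mark."""
    return {c for c in clusters(text) if len(c) > 1}

# Hypothetical sample: the same consonants with the two vowel orders.
sample = "\u05DC\u05B7\u05B4\u05DD \u05DC\u05B4\u05B7\u05DD"

raw = distinct_marked(sample)
nfc = distinct_marked(unicodedata.normalize("NFC", sample))
print(len(raw), len(nfc))  # 2 1
```

Two sequences that are distinct in the source collapse into one after Unicode normalisation; counted over a whole corpus, that difference is the figure quoted above.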


* Meteg re-ordering is in some respects even more problematic than multi-vowel re-ordering; certainly it is a more common problem. The meteg can occur to the left or right of a vowel. Sometimes the distinction is the result of editorial intervention (see Kittel's original Biblia Hebraica edition), but left, right and hataf-intermediary meteg positioning are all found in the ben Asher manuscripts. Unicode canonical ordering treats meteg as a fixed position mark with a combining class higher than vowels, which suggests that it always appears in the same position relative to vowels. This is incorrect.
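
A short Python sketch of the meteg case, again using only the standard unicodedata module. The variable names are illustrative, and the choice of lamed with patah is arbitrary; the combining class of meteg (22) is the value reported by the Unicode Character Database.

```python
import unicodedata

LAMED, PATAH, METEG = "\u05DC", "\u05B7", "\u05BD"

print(unicodedata.combining(METEG))  # 22, higher than any Hebrew vowel point

# Two encodings that an editor might use to record meteg placement
# before or after the vowel.
meteg_first = LAMED + METEG + PATAH
vowel_first = LAMED + PATAH + METEG

print(meteg_first == vowel_first)  # False: distinct as encoded

# After normalisation both become <lamed, patah, meteg>, so the
# distinction is lost.
print(unicodedata.normalize("NFC", meteg_first)
      == unicodedata.normalize("NFC", vowel_first))  # True
```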

John Hudson

Tiro Typeworks          www.tiro.com
Vancouver, BC           [EMAIL PROTECTED]

If you browse in the shelves that, in American bookstores,
are labeled New Age, you can find there even Saint Augustine,
who, as far as I know, was not a fascist. But combining Saint
Augustine and Stonehenge -- that is a symptom of Ur-Fascism.
                                                            - Umberto Eco



