My understanding is that most of the problems with Unicode Hebrew are in fact with the points which are sometimes used with modern Hebrew, rather than with the accents or cantillation marks. The combining classes for the latter, apart from meteg, are mostly correctly assigned, and although there are some small issues which can be resolved, and a couple of badly misleading character names, there are no major problems.
While the current combining classes may cause some difficulties for Biblical scholars (and this isn't cut and dried yet: it isn't certain whether these are Unicode problems, implementation problems, missing characters or mis-identified characters), I have yet to see a claimed problem with pointed Hebrew. I mean just the points, without cantillation marks, as used for non-Biblical texts. And I don't count Microsoft's strange implementation mentioned yesterday as a Unicode problem.
Jony
The major combining class related problems with Unicode Hebrew are concerned with:
1) Meteg - basically part of the accent system but it was encoded in the "Points and Punctuation" sub-block, based on Israeli standards, because it is sometimes used in modern Hebrew. In this marginal modern Hebrew use of meteg it is always positioned to the left of any vowel. The problem arises in biblical Hebrew because meteg is sometimes positioned to the right of a vowel, and also because its order relative to accents is sometimes significant.
2) Cases of two vowels with one base character, mostly but not only in the defective form Yerushala(y)im. These are a problem in the biblical text and also, as Mark just pointed out, in biblical extracts quoted in modern Hebrew.
3) This is the issue which causes significant problems in pointed modern Hebrew as well as in the biblical text: Hebrew consonants are commonly combined with dagesh (a dot in the middle of the letter) and a vowel point; the consonant shin is additionally combined, in pointed text, with either sin dot or shin dot; and meteg may be added, though only occasionally in modern Hebrew. Logically, and commonly for typing purposes, the sin or shin dot combines most closely with the consonant (cf. cedilla and the inseparable dots on many Arabic letters); then the dagesh, which modifies the pronunciation of (almost) any consonant (cf. Arabic shadda and the IPA length mark U+02D0 - all are commonly transliterated by doubling the consonant); then the vowel, which is pronounced separately after the consonant; then the meteg which effectively modifies or disambiguates the vowel. So the logical order is <shin, sin/shin dot, dagesh, vowel, meteg>. But the canonical order is <shin, vowel, dagesh, meteg, sin/shin dot>; up to three (and in theory more, at least in biblical Hebrew) other characters may appear between the base letter and the dot which fundamentally modifies it.
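The reordering described in point 3 can be checked with a minimal sketch, assuming Python's standard unicodedata module. The variable names are only illustrative. The combining classes involved are 17 for patah, 21 for dagesh, 22 for meteg and 24 for shin dot, which is why the shin dot ends up last.

```python
import unicodedata

# Code points for the characters discussed above.
SHIN = "\u05E9"
SHIN_DOT = "\u05C1"
DAGESH = "\u05BC"
PATAH = "\u05B7"
METEG = "\u05BD"

# Logical (typing) order: <shin, shin dot, dagesh, vowel, meteg>.
logical = SHIN + SHIN_DOT + DAGESH + PATAH + METEG

# Normalisation sorts the marks by canonical combining class.
canonical = unicodedata.normalize("NFD", logical)

# Combining class of each character in the normalised string.
classes = [unicodedata.combining(ch) for ch in canonical]

for ch in canonical:
    print(f"U+{ord(ch):04X}  ccc={unicodedata.combining(ch):3d}  {unicodedata.name(ch)}")
```

Both strings are canonically equivalent, so a conformant process is free to turn the first into the second at any time.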
Jony, this is the problem which, as I claim and have claimed before, affects pointed modern Hebrew just as much as the biblical text. But the question is, is it really a problem? As Ken Whistler has written in http://www.unicode.org/faq/normalization.html, "The Unicode Standard does not guarantee that the canonical ordering of a combining character sequence for any particular script is the 'correct' order from a linguistic point of view".
For rendering, there is no problem as long as a rendering engine does what the Unicode standard (4.0 p.127) tells it to do:
Canonical equivalence must be taken into account in rendering multiple accents, so that any two canonically equivalent sequences display as the same.
- or at least the problem is reduced to one of efficiency. But it seems that certain software companies have decided that modern Hebrew users prefer to see normalised text rendered quickly but incorrectly rather than slightly more slowly but correctly. I wonder if they have consulted with people like you, Jony, before making that decision. Perhaps they think that this is an issue for biblical Hebrew only, but it is not.
I have just tested whether the sequences (in canonical order) <shin, patah, dagesh, shin dot> and <shin, patah, dagesh, meteg, shin dot> are rendered correctly in Windows 2000 by Uniscribe (version 1.468.4015.0) and a variety of fonts. SBL Hebrew (draft) and Guttmann David render the former correctly, because there is no positioning adjustment required in this case (and so even Times New Roman and Arial Unicode MS render correctly), but Ezra SIL misplaces the dagesh, Code2000 misplaces the shin dot, and Vusillus (draft) misplaces both. When the meteg is added, none of these fonts is able to make the proper positioning adjustments, although Ezra SIL, SBL Hebrew and Vusillus give correct results for the logical order <shin, shin dot, dagesh, patah, meteg>. The problem is that the reordering which the rendering engine should be doing is being passed to the fonts, although it is a task which the OpenType fonts cannot do, at least not without very complex and inefficient code.
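As an illustration of the kind of reordering a rendering engine could do before handing text to a font, here is a rough sketch in Python. It is not how Uniscribe or any real engine works; the function names and the ranking table are hypothetical. It assumes that putting the marks on each base character into the order <sin/shin dot, dagesh, everything else, meteg> is sufficient, and it ignores the biblical cases where meteg placement relative to a vowel is significant.

```python
import unicodedata

# Hypothetical ranking of marks into the "logical" order described above.
SIN_SHIN_DOTS = {"\u05C1", "\u05C2"}
DAGESH = "\u05BC"
METEG = "\u05BD"


def logical_rank(mark):
    """Return a sort key: dots first, then dagesh, then vowels/accents, then meteg."""
    if mark in SIN_SHIN_DOTS:
        return 0
    if mark == DAGESH:
        return 1
    if mark == METEG:
        return 3
    return 2  # vowels, accents and any other combining mark


def to_logical_order(text):
    """Reorder the combining marks after each base character (assumed scheme)."""
    out = []
    marks = []
    for ch in text:
        if unicodedata.combining(ch) != 0:
            marks.append(ch)
        else:
            # sorted() is stable, so marks with the same rank keep their order.
            out.extend(sorted(marks, key=logical_rank))
            marks = []
            out.append(ch)
    out.extend(sorted(marks, key=logical_rank))
    return "".join(out)


canonical = "\u05E9\u05B7\u05BC\u05BD\u05C1"  # <shin, patah, dagesh, meteg, shin dot>
reordered = to_logical_order(canonical)
```

Because the dots, dagesh and meteg each have a combining class that no other mark shares, moving them does not swap two marks of the same class, so the reordered string should remain canonically equivalent to the input.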
And then the issue is not just one of rendering. There are also issues of searching and sorting. If I want to search for the letter sin, i.e. shin with sin dot, then with the current canonical order that search needs to be able to find a discontinuous string with three or more intervening characters. That is certainly grossly inefficient, and I'm not even sure that it will work at all. As for collation, as we have discussed before, there need to be some seriously complex combinations in the collation data, for default or tailored collation, so that shin/sin dot and dagesh are collated either at a higher level than, or simply before, the logically following vowel point.
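To make the search problem concrete, here is one possible way to find the letter sin in normalised text, sketched in Python. The helper names are hypothetical, and it assumes that a "letter" is a base character followed by all of its combining marks. A plain substring search for <shin, sin dot> would miss every case where a vowel, dagesh or meteg sits between the two.

```python
import unicodedata

SHIN = "\u05E9"
SIN_DOT = "\u05C2"


def clusters(text):
    """Yield (start_index, cluster) for each base character plus its marks."""
    start = 0
    for i in range(1, len(text) + 1):
        if i == len(text) or unicodedata.combining(text[i]) == 0:
            yield start, text[start:i]
            start = i


def find_sin(text):
    """Return the index of every shin whose marks include a sin dot."""
    return [
        start
        for start, cluster in clusters(text)
        if cluster[0] == SHIN and SIN_DOT in cluster[1:]
    ]


# <alef, patah> followed by <shin, sin dot, dagesh, patah> in logical order,
# normalised so that the sin dot moves to the end of its cluster.
sample = unicodedata.normalize("NFD", "\u05D0\u05B7" + SHIN + SIN_DOT + "\u05BC\u05B7")
hits = find_sin(sample)
```

This works, but only by inspecting every cluster, which is the inefficiency referred to above.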
The issue might have been simplified if U+FB2A to U+FB4A had not been defined as composition exceptions. I mention this only because there is a precedent for changing the composition exception table.
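The effect of the composition exceptions can be seen in a short sketch, assuming Python's unicodedata module follows the composition exclusion table. U+FB2A (shin with shin dot) decomposes under normalisation and is never recomposed, so once other points are present the dot is separated from its base letter even in NFC.

```python
import unicodedata

precomposed = "\uFB2A"  # HEBREW LETTER SHIN WITH SHIN DOT

# The precomposed letter decomposes to <shin, shin dot> ...
nfd = unicodedata.normalize("NFD", precomposed)
# ... and, being a composition exception, is not recomposed by NFC.
nfc = unicodedata.normalize("NFC", precomposed)

# With dagesh and patah added, NFC leaves the shin dot at the end.
pointed = unicodedata.normalize("NFC", precomposed + "\u05BC\u05B7")

print([f"U+{ord(ch):04X}" for ch in pointed])
```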
Jony, I hope you now realise that the problems do in principle affect modern Hebrew. If they have not been noticed so far it is only because people have not yet been normalising text very often. But as XML becomes widespread and its normalisation recommendations are incorporated into software, text will start being normalised unexpectedly, and Israeli readers of pointed Hebrew on the Web etc will quickly start to complain that documents cannot be viewed or searched properly. The problem is coming, and won't go away simply by being ignored.
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/

