Ted, I agree 100% with your description of the characters that have not been encoded in Unicode. There are certainly marks and consonants that mean two completely different things, as you have so accurately described. But there are two approaches to encoding: "Code what you see" and "Code what is meant". Your analysis, like the way SIL encoded the original SIL Ezra font, goes with "Code what is meant". This means that we have two shevas (one pronounced and one silent), a holem-waw character and a shureq character. Unicode, on the other hand, is totally "Code what you see". It attempts to make no analysis of the marks on the page. If there is a mark, code it. If it is identical to another mark, then it gets the same codepoint. (Of course, there are exceptions, but this is the general rule.)
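To make "Code what you see" concrete, here is a minimal Python sketch (not anything from SIL or the Unicode Consortium, just an illustration). It assumes the usual Unicode Hebrew code points: U+05D5 for waw, U+05BC for the dot used in a shureq, U+05B9 for holem, and U+05B0 for sheva. The helper name `code_points` is made up for this example.

```python
import unicodedata

WAW = "\u05D5"     # HEBREW LETTER VAV
DAGESH = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ (the dot seen in a shureq)
HOLEM = "\u05B9"   # HEBREW POINT HOLAM
SHEVA = "\u05B0"   # HEBREW POINT SHEVA (one code point, pronounced or silent)


def code_points(text):
    """Return the code points of text in U+XXXX notation (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


plain_waw = WAW
shureq = WAW + DAGESH      # looks like a waw with a dot, so it is coded that way
waw_holem = WAW + HOLEM    # a waw carrying a holem

for label, text in [("plain waw", plain_waw),
                    ("shureq", shureq),
                    ("waw + holem", waw_holem)]:
    names = [unicodedata.name(ch) for ch in text]
    print(label, code_points(text), names)
```

In all three cases the first code point is the same U+05D5, and there is only the one sheva to choose from, so nothing in the encoded text records which reading was meant.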
So with Unicode, there is no way to separate even vowels and consonants, since a waw in a shureq, a holem-waw, and just a plain waw will always be encoded the same. Some of us are trying to make this approach usable by allowing at least a holem-waw to be distinguished from waw holem, by placing the holem first. For the encoders, it is fairly straightforward. For the people trying to actually use the encoding, it's going to take a lot of context to determine what you've got.
Joan Wardell
NRSI-SIL
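As a rough illustration of the "holem first" ordering and of how much context a user of the text has to inspect, here is a minimal Python sketch. It assumes one reading of the convention: a holem stored immediately before the waw marks a holem-waw, and a holem stored immediately after the waw marks a consonantal waw with holem. The function name `classify_waws` and its labels are hypothetical, and a waw followed by the dot is left ambiguous because the code points alone cannot say whether it is a shureq or a waw with dagesh.

```python
WAW = "\u05D5"     # HEBREW LETTER VAV
HOLEM = "\u05B9"   # HEBREW POINT HOLAM
DAGESH = "\u05BC"  # HEBREW POINT DAGESH OR MAPIQ


def classify_waws(text):
    """Label each waw in text by looking only at its neighboring code points.

    Assumed convention (an interpretation, not a standard):
    holem stored before the waw -> holem-waw; holem stored after -> waw holem.
    """
    labels = []
    for i, ch in enumerate(text):
        if ch != WAW:
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if prev == HOLEM:
            labels.append("holem-waw")
        elif nxt == HOLEM:
            labels.append("waw holem")
        elif nxt == DAGESH:
            # Same code points either way; only wider context can decide.
            labels.append("shureq or waw with dagesh")
        else:
            labels.append("plain waw")
    return labels


# qof, holem, waw, lamed: holem stored before the waw
print(classify_waws("\u05E7\u05B9\u05D5\u05DC"))
# ayin, qamats, waw, holem, nun: holem stored after the waw
print(classify_waws("\u05E2\u05B8\u05D5\u05B9\u05DF"))
# samekh, waw, dot, samekh: ambiguous from code points alone
print(classify_waws("\u05E1\u05D5\u05BC\u05E1"))
```

Even this toy version has to look both left and right of every waw, and it still cannot settle the shureq case without knowing more about the surrounding word.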

