--- Uri David Akavia <[EMAIL PROTECTED]> wrote:
> Shalom.

Shalom.

> I sent your message to the ivrix list. I'm sorry it
> took so long.

No problem. I'm CC'ing the AbiWord dev list with this answer since a couple of people there may have something to add.

> Since the problem came up with spell checking, why
> do you need these characters for it at all?

I'm sorry I don't have the original thread of the conversation around since I'm out of inbox space ):

> Hebrew has two form of writing (I think you probably
> know this)
> KTIV MALEH (full writing) in which letters replace
> the marks
> KTIV HASER (lacking writing) in which there are no
> special letters

Yes, I'm aware of this. In fact I think there are more than two ways. I've read about this in a history of Hebrew at my library. Originally no vowels were written at all. Later, yod, vav, aleph, and he began to be used to represent vowels, with special rules. They're usually called matres lectionis in English. Later still, the vowel points were invented, and I believe these are used in combination with the matres lectionis.

> It is possible just to choose one (I prefer KTIV
> MALEH), which I believe is correct if you don't
> write the marks, which most people don't.
> Whatever is decided should just be written in the
> documentation somewhere. Besides, these marks have
> absolutely no value in spellchecking - no one
> actually checks them for correctness (it is much
> harder than checking spelling, since it has rules).
> So it is not a problem when you don't check them in
> the spellchecker.

Well, I wish it was that simple. The problem is that people do use them and will continue to use them. Religious texts always seem to use them. This is an important case for Hebrew, and we already have users doing Biblical work in Hebrew with AbiWord. Now if some text is marked as being Hebrew and it does have vowel marks, the words simply won't match the entries in the dictionary at all. So they'll all be marked as errors!
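
To make the mismatch concrete, here is a minimal Python sketch. It is only an illustration, not AbiWord's code: the one-word dictionary `UNPOINTED_WORDS` and the helper `is_known` are made up for the example. In Unicode each vowel point is a separate combining character, so a pointed word is a different string from its unpointed spelling and a plain lookup fails.

```python
import unicodedata

# Hypothetical unpointed dictionary with one entry: "shalom"
# (shin, lamed, vav, final mem).
UNPOINTED_WORDS = {"\u05e9\u05dc\u05d5\u05dd"}

# The same word with points: shin + shin dot + qamats, lamed,
# vav + holam, final mem.
POINTED_SHALOM = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"


def is_known(word):
    """Naive lookup: exact string match against the unpointed dictionary."""
    return word in UNPOINTED_WORDS


def count_points(word):
    """Count combining marks (vowel points, dots, accents) in a word."""
    return sum(1 for ch in word if unicodedata.combining(ch) != 0)
```

With these definitions `is_known("\u05e9\u05dc\u05d5\u05dd")` succeeds while `is_known(POINTED_SHALOM)` fails, even though both spell the same word; the three extra code points reported by `count_points(POINTED_SHALOM)` are the whole difference.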

The next step would be for us to filter out all the vowel points before passing words to the spell-checker. But now imagine we are editing a section of Genesis which has full vowel points and also some spelling errors. The spellchecker will tell you the word has errors and offer some suggestions. But none of the suggestions will have vowel points! Perhaps the user will be able to fix it, perhaps not. But the computer is a machine and ought to be able to do exactly this type of work for us.

Also, the very reason the points are used in religious works is to remove any ambiguity raised when two words have the same consonants but different vowels. A user would expect that if we support Hebrew we support this. But when she has words with the correct consonants in the correct order but with incorrect vowel points, no error will be shown and the user will be led to believe she has made no errors.

So the next step is to think: well, I guess the Bible is pretty important, so maybe we should just have two dictionaries. One without vowels, which we can do now and start using right away for most things. Then Biblical Hebrew can be treated as a separate language with its own dictionary, made by whoever needs to use such a thing.

The problem is that we're trying to stick to standards, so we use ISO 639 language codes to mark which language each section of our documents belongs to. ISO 639 can be a bit vague. It does now have separate codes for Modern Greek (ell or gre) and Ancient Greek (grc), but it still has only one code for Hebrew (heb).

Also, I collect foreign novels, and the only one I have in Hebrew, Memoirs of a Geisha by Arthur Golden, seems to have at least one word every 5 pages or so which uses vowel points. If AbiWord is to be a professional quality word processor, it needs to be good enough for the translators to have used it to create this book.
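
The filter-then-check idea above can be sketched roughly like this in Python. Everything here is an assumption made for illustration: the function names, the result strings, and the two data structures (`unpointed_words` as a set of vowelless spellings, `pointed_forms` as a map from a vowelless spelling to its accepted pointed spellings). It is not how AbiWord or any existing spell-checker works.

```python
import unicodedata


def strip_points(word):
    """Drop all nonspacing marks (points, dots, accents), keep base letters."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def check(word, unpointed_words, pointed_forms):
    """Classify a word as 'ok', 'bad-consonants', or 'bad-points'.

    pointed_forms maps a vowelless spelling to a set of accepted pointed
    spellings, each stored in NFD-normalized form (an assumed convention).
    """
    normalized = unicodedata.normalize("NFD", word)
    skeleton = strip_points(normalized)
    if skeleton not in unpointed_words:
        return "bad-consonants"   # wrong even with the points ignored
    if normalized == skeleton:
        return "ok"               # the word carries no points to verify
    accepted = pointed_forms.get(skeleton)
    if accepted is None:
        return "ok"               # no pointed data yet, so points go unchecked
    return "ok" if normalized in accepted else "bad-points"
```

Checking the consonants first means a vowelless list is useful on its own, and pointed entries can be added later without changing the lookup. One known gap in this sketch: a word that is only partly pointed, as happens in novels, would be reported as 'bad-points' unless that partial spelling is listed too.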

I created a bug report for AbiWord some time ago suggesting that we need more flexibility in our use of language codes, so you might care to look into that:

http://bugzilla.abisource.com/show_bug.cgi?id=3227

It might seem like I'm fighting against Hebrew spell-checking but I'm really not. I just want to do it right - and it is doable. I'd love to implement it myself.

What we can do is start building up a high quality Hebrew wordlist, probably as a plain text UTF-8 encoded file. We can start with just the vowelless words and add the vowelled versions later. But we really shouldn't lock in place a system which isn't going to be flexible enough in the long term.

If you think building a word-list is a good idea, we might be able to give it a place in AbiWord's CVS somewhere. Perhaps creating a special project just for this on SourceForge is a better idea.

Hope this helps.

Andrew Dunbar.

> Yours,
>
> Uri David
>

=====
http://linguaphile.sourceforge.net/cgi-bin/translator.pl
http://www.abisource.com
