I'd be happy to help anyone test this out; my Arabic is pretty good.

Nader

Andrzej Bialecki wrote:

Dawid Weiss wrote:


nothing to do with each other. Furthermore, Arabic uses phonetic indicators on each letter, called diacritics, that change the way you pronounce the word, which in turn changes the word's meaning. So two words spelled exactly the same way but with different diacritics will mean two separate things,



Just to point out the fact: most Slavic languages also use diacritic marks (above, like 'acute' or 'dot' marks, or below, like the Polish 'ogonek' mark). Some people argue that they can be stripped from the text upon indexing and that the queries usually disambiguate the context of the word.
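For anyone who wants to try the stripping approach, here is a minimal sketch using only the JDK's java.text.Normalizer. The class name DiacriticStripper and the method strip are hypothetical, made up for this illustration; this is not code from Lucene or Egothor. It assumes that "stripping" means decomposing the text to NFD and dropping every Unicode combining mark, which removes the Polish acute, dot and ogonek marks as well as the Arabic short-vowel marks.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration only; not part of Lucene or Egothor.
final class DiacriticStripper {
    private DiacriticStripper() {}

    /**
     * Decomposes the text to NFD so that accented letters become
     * base letter + combining mark, then removes all combining marks
     * (Unicode category M).
     */
    static String strip(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

Calling DiacriticStripper.strip on a token before it is indexed (and on the query terms) would fold the marked and unmarked spellings together. Two caveats: letters with no canonical decomposition, such as the Polish 'ł', pass through unchanged, and for Arabic this deliberately throws away exactly the distinction described above, so whether it helps or hurts retrieval would need testing by someone who reads the language.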


Hmm. This brings up a question: the algorithmic stemmer package from Egothor works quite well for Polish (http://www.getopt.org/stempel). Wouldn't it work well for Arabic, too?

I lack the necessary expertise to evaluate the results (knowing only two or three Arabic words ;-) ), but I can certainly help someone get started with testing...

