On Mon, Jul 18, 2011 at 6:24 PM, Patrick Estarian <patrick.estar...@gmail.com> wrote:
> Hi,
>
> I am trying to get the Persian part of Lucene to work, but apparently the
> current implementation is just a simple version of a stop word tokenizer with
> no stemmer, etc. I was trying to find the contact of the person who had done
> this but couldn't find it anywhere in the code.
>

There is no stemmer intentionally, as my findings (and others') seem to agree with this statement:

    Our various experiments clearly show that a stemming procedure decreases retrieval effectiveness when applied to the Persian language.

http://portal.acm.org/citation.cfm?id=1674748

But YMMV,

--
lucidimagination.com