[
https://issues.apache.org/jira/browse/LUCENE-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226750#comment-16226750
]
Robert Muir commented on LUCENE-8028:
-------------------------------------
Hi, we should add it as an option! It is OK to have multiple stemmers (choices).
I think we should be conservative about changing the default: at least for the
second paper (which isn't paywalled, so I could quickly look), this appears to
incorporate a dictionary-based approach (domain-dependent; such approaches
typically perform less well on average than rule-based ones because of
out-of-vocabulary (OOV) words), and I don't yet see any standard IR
experiments confirming the improvement.
> Arabic Stemmer improvement for Better Search Accuracy
> -----------------------------------------------------
>
> Key: LUCENE-8028
> URL: https://issues.apache.org/jira/browse/LUCENE-8028
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Ayah Shamandi
> Labels: Arabic, Stemmer, improvement
>
> Hi, this is Ayah, a bidi developer on the Globalization Team at IBM Egypt.
> We are responsible for Arabic support in IBM products and services. We use
> Lucene in many of our services and found that its Arabic stemmer needs major
> improvement. We implemented the following two papers to improve the Lucene
> Arabic stemmer:
> https://dl.acm.org/citation.cfm?id=1921657 and
> http://waset.org/publications/10005688/arabic-light-stemmer-for-better-search-accuracy
> We would like to open a pull request so that it can be integrated into
> Lucene.