[ 
https://issues.apache.org/jira/browse/LUCENE-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226750#comment-16226750
 ] 

Robert Muir commented on LUCENE-8028:
-------------------------------------

Hi, we should add it as an option! It is ok to have multiple stemmers (choices).

I think we should be conservative about changing the default: at least for the 
second paper (which isn't paywalled, so I could quickly look), this appears to 
incorporate a dictionary-based approach (domain-dependent, and typically 
performing less well on average than rule-based approaches because of 
out-of-vocabulary (OOV) words), and I don't yet see any standard IR experiments 
confirming the improvement.

> Arabic Stemmer improvement for Better Search Accuracy
> -----------------------------------------------------
>
>                 Key: LUCENE-8028
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8028
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ayah Shamandi
>              Labels: Arabic, Stemmer, improvement
>
> Hi, this is Ayah, a bidi developer on the Globalization Team at IBM Egypt. We 
> are responsible for supporting Arabic in IBM products and services, and since 
> we use Lucene in many of our services, we found that its Arabic stemmer needs 
> major improvement. We implemented the following two papers 
> https://dl.acm.org/citation.cfm?id=1921657 and 
> http://waset.org/publications/10005688/arabic-light-stemmer-for-better-search-accuracy
> to improve the Lucene Arabic stemmer, and we would like to open a pull 
> request so that you can integrate it into Lucene.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
