[jira] Commented: (LUCENE-1966) Arabic Analyzer: Stopwords list needs enhancement

2009-10-11 Thread Basem Narmok (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764515#action_12764515 ] Basem Narmok commented on LUCENE-1966: -- Seems good. BTW with FAST ESP we never

[jira] Commented: (LUCENE-1966) Arabic Analyzer: Stopwords list needs enhancement

2009-10-11 Thread Basem Narmok (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764493#action_12764493 ] Basem Narmok commented on LUCENE-1966: -- Oh, my mistake, sorry, yes please remove

[jira] Commented: (LUCENE-1966) Arabic Analyzer: Stopwords list needs enhancement

2009-10-11 Thread Basem Narmok (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764456#action_12764456 ] Basem Narmok commented on LUCENE-1966: -- Hi Robert, Regarding ايضا / أيضا ...

[jira] Updated: (LUCENE-1966) Arabic Analyzer: Stopwords list needs enhancement

2009-10-11 Thread Basem Narmok (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Basem Narmok updated LUCENE-1966: - Attachment: LUCENE-1966.patch Robert, you are correct, to solve the problem we have two options

[jira] Updated: (LUCENE-1966) Arabic Analyzer: Stopwords list needs enhancement

2009-10-08 Thread Basem Narmok (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Basem Narmok updated LUCENE-1966: - Attachment: LUCENE-1966.patch arabic-stopwords-comments.txt Please see the

[jira] Created: (LUCENE-1966) Arabic Analyzer: Stopwords list needs enhancement

2009-10-08 Thread Basem Narmok (JIRA)
: contrib/analyzers Affects Versions: 2.9.1 Reporter: Basem Narmok Priority: Trivial Fix For: 2.9 The provided Arabic stopwords list needs some enhancements (e.g. it contains a lot of words that are not stopwords, and it needs some cleanup). A patch will be provided

Re: Arabic Analyzer: possible bug

2009-10-08 Thread Basem Narmok
Robert, Yes, this issue will not work, as some numbers are used to represent (transliterate if I may say) some English letters (e.g. 3 for Arabic Aeen, and 7 for Arabic H'a). Some online services provide instant translation for such transliteration (e.g. http://www.yamli.com/ try this word "7elo"
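A minimal sketch of the point above, assuming the concern is a tokenizer that keeps only letters (the thread snippet does not confirm which tokenizer is meant). The names `ARABIZI_DIGITS`, `letters_only_tokens`, `alnum_tokens`, and `leading_arabic_letter` are hypothetical, and the mapping covers only the two digits mentioned in the message:

```python
import re

# Hypothetical mapping: only the two digits named in the message above.
ARABIZI_DIGITS = {
    "3": "ع",  # Ain
    "7": "ح",  # Haa
}

def letters_only_tokens(text):
    """Tokenize keeping letters only; digits act as separators."""
    return re.findall(r"[^\W\d_]+", text)

def alnum_tokens(text):
    """Tokenize keeping letters and digits together."""
    return re.findall(r"[^\W_]+", text)

def leading_arabic_letter(token):
    """Return the Arabic letter a leading digit stands for, or None."""
    return ARABIZI_DIGITS.get(token[:1])

word = "7elo"
print(letters_only_tokens(word))  # the leading digit is lost
print(alnum_tokens(word))         # the digit stays part of the token
print(leading_arabic_letter(word))
```

With a letters-only rule the leading "7" is dropped and the remaining token no longer represents the intended word, whereas keeping digits inside tokens preserves the transliterated form.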

Re: Arabic Analyzer: possible bug

2009-10-08 Thread Basem Narmok
//wiki.apache.org/lucene-java/HowToContribute and create a JIRA Issue > with a patch file to improve our stopwords list. > > Otherwise, in my opinion a good list is also acceptable and I will volunteer > to turn it into a patch :) > > On Thu, Oct 8, 2009 at 9:32 AM, Basem Narmok wrot

Re: Arabic Analyzer: possible bug

2009-10-08 Thread Basem Narmok
Uwe, 100% correct On Thu, Oct 8, 2009 at 4:56 PM, Uwe Schindler wrote: > I think the idea of lowercase filter in the arabic analyzers is not to > really index mixed language texts. It is more for the case, if you have some > word between the Arabic content (like product names,.), which happens of

Re: Arabic Analyzer: possible bug

2009-10-08 Thread Basem Narmok
to help improve it for us? > > On Thu, Oct 8, 2009 at 9:20 AM, Basem Narmok wrote: >> >> DM, there are no upper/lower cases in Arabic, so don't worry, but the >> stop word list needs some corrections and may miss some common/stop >> Arabic words. >> >> Be

Re: Arabic Analyzer: possible bug

2009-10-08 Thread Basem Narmok
DM, there are no upper/lower cases in Arabic, so don't worry, but the stop word list needs some corrections and may miss some common/stop Arabic words. Best, On Thu, Oct 8, 2009 at 4:14 PM, DM Smith wrote: > Robert, > Thanks for the info. > As I said, I am illiterate in Arabic. So I have another,