[
https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764515#action_12764515
]
Basem Narmok commented on LUCENE-1966:
--
Seems good.
BTW with FAST ESP we never
[
https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764493#action_12764493
]
Basem Narmok commented on LUCENE-1966:
--
Oh, my mistake, sorry, yes please remove
[
https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764456#action_12764456
]
Basem Narmok commented on LUCENE-1966:
--
Hi Robert,
Regarding ايضا / أيضا ...
[
https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Basem Narmok updated LUCENE-1966:
-
Attachment: LUCENE-1966.patch
Robert, you are correct. To solve the problem we have two options
[
https://issues.apache.org/jira/browse/LUCENE-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Basem Narmok updated LUCENE-1966:
-
Attachment: LUCENE-1966.patch
arabic-stopwords-comments.txt
Please see the
: contrib/analyzers
Affects Versions: 2.9.1
Reporter: Basem Narmok
Priority: Trivial
Fix For: 2.9
The provided Arabic stopwords list needs some enhancements (e.g. it contains a
lot of words that are not stopwords, and it needs some cleanup). A patch will be provided
Robert,
Yes, this will not work, as some digits are used to represent
(transliterate, if I may say) Arabic letters in Latin-script text (e.g. 3 for
Arabic Aeen, and 7 for Arabic H'a).
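
To make the digit convention concrete, here is a small Python sketch. It is only an illustration: the mapping holds just the two pairs mentioned above, the function names are made up for this example, and none of it comes from Lucene or from the patch on this issue.

```python
# Illustrative only: the two digit-to-letter pairs named in the message.
# Real transliterated input follows more conventions than these two.
ARABIZI_DIGITS = {
    "3": "\u0639",  # ع (Aeen)
    "7": "\u062D",  # ح (H'a)
}


def expand_arabizi_digits(token: str) -> str:
    """Replace digits that stand for Arabic letters; leave other characters alone."""
    return "".join(ARABIZI_DIGITS.get(ch, ch) for ch in token)


def looks_like_arabizi(token: str) -> bool:
    """Rough heuristic: a mapped digit inside a token that also has Latin letters."""
    has_latin = any(ch.isascii() and ch.isalpha() for ch in token)
    has_mapped_digit = any(ch in ARABIZI_DIGITS for ch in token)
    return has_latin and has_mapped_digit


if __name__ == "__main__":
    print(expand_arabizi_digits("7elo"))
    print(looks_like_arabizi("7elo"), looks_like_arabizi("2009"))
```

The heuristic separates a token such as "7elo", where the digit carries a letter, from a plain number such as "2009". That is the reason digits in this kind of text cannot simply be treated as numbers.
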
Some online services provide instant translation for such
transliteration (e.g. http://www.yamli.com/ try this word "7elo"
//wiki.apache.org/lucene-java/HowToContribute and create a JIRA Issue
> with a patch file to improve our stopwords list.
>
> Otherwise, in my opinion a good list is also acceptable and I will volunteer
> to turn it into a patch :)
>
> On Thu, Oct 8, 2009 at 9:32 AM, Basem Narmok wrot
Uwe,
100% correct
On Thu, Oct 8, 2009 at 4:56 PM, Uwe Schindler wrote:
> I think the idea of the lowercase filter in the Arabic analyzers is not
> really to index mixed-language texts. It is more for the case where you have some
> words within the Arabic content (like product names, ...), which happens of
to help improve it for us?
>
> On Thu, Oct 8, 2009 at 9:20 AM, Basem Narmok wrote:
>>
>> DM, there are no upper/lower cases in Arabic, so don't worry, but the
>> stop word list needs some corrections and may be missing some common
>> Arabic stop words.
>>
>> Be
DM, there are no upper/lower cases in Arabic, so don't worry, but the
stop word list needs some corrections and may be missing some common
Arabic stop words.
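
A quick way to see this for yourself is the Python sketch below. It assumes Python's built-in `str.lower()` as a stand-in for a lowercase filter; `lowercase_token` is a hypothetical helper for this example, not a Lucene API.

```python
def lowercase_token(token: str) -> str:
    """Stand-in for a lowercase filter step (hypothetical helper, not Lucene)."""
    return token.lower()


# أيضا, one of the words discussed on the issue, written with escapes.
arabic_word = "\u0623\u064A\u0636\u0627"

# Arabic letters have no case mapping, so lowercasing returns the same string.
assert lowercase_token(arabic_word) == arabic_word

# Latin-script words embedded in Arabic text are the only tokens that change.
tokens = [arabic_word, "Lucene", "FAST"]
print([lowercase_token(t) for t in tokens])
```

Lowercasing therefore leaves the Arabic tokens untouched and only affects embedded Latin-script words.
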
Best,
On Thu, Oct 8, 2009 at 4:14 PM, DM Smith wrote:
> Robert,
> Thanks for the info.
> As I said, I am illiterate in Arabic. So I have another,