[ 
https://issues.apache.org/jira/browse/LUCENE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ibrahim updated LUCENE-4293:
----------------------------

    Attachment: rootsTableIndex.zip
                ArabicTokens.txt
                ArabicTokenizer.java
                ArabicRootsAnalyzer.java
                ArabicRootFilter.java
    
> ArabicRootsAnalyzer
> -------------------
>
>                 Key: LUCENE-4293
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4293
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Ibrahim
>            Priority: Minor
>         Attachments: ArabicRootFilter.java, ArabicRootsAnalyzer.java, 
> ArabicTokenizer.java, ArabicTokens.txt, rootsTableIndex.zip
>
>
> ArabicRootsAnalyzer uses an index of Arabic terms, each associated with its 
> root. Every Arabic word has a root, but there is no automatic way of deciding 
> the root.
> This Analyzer matches each term with its root, so searching and indexing are 
> based on roots. It gives me great results in my application.
> I have attached all the required files along with the db. The problem is the 
> size of the db (16MB); it holds around 300,000 terms. I have another db with 
> 600,000 terms, but I believe the attached one is summarized and better.
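
For readers without the attachments, the table-lookup idea described above can be sketched roughly as follows. This is not the code from ArabicRootFilter.java: the class name RootLookupSketch, the helper sampleTable(), and the three sample entries are hypothetical, and a plain in-memory HashMap stands in for the attached roots index. Passing unknown terms through unchanged is also an assumption, since the description does not say how missing terms are handled. Arabic strings are written as Unicode escapes so the file compiles under any source encoding.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RootLookupSketch {

    // Hypothetical stand-in for the attached term-to-root index.
    public static Map<String, String> sampleTable() {
        Map<String, String> table = new HashMap<>();
        String ktb = "\u0643\u062A\u0628";                 // root k-t-b
        table.put("\u0643\u062A\u0627\u0628", ktb);        // kitab (book)
        table.put("\u0645\u0643\u062A\u0648\u0628", ktb);  // maktub (written)
        table.put("\u0643\u0627\u062A\u0628", ktb);        // katib (writer)
        return table;
    }

    // Replace each token with its root when the table knows it.
    // Assumption: tokens missing from the table are kept as-is.
    public static List<String> toRoots(List<String> tokens, Map<String, String> table) {
        List<String> roots = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            roots.add(table.getOrDefault(token, token));
        }
        return roots;
    }

    public static void main(String[] args) {
        Map<String, String> table = sampleTable();
        List<String> indexed = toRoots(List.of("\u0643\u062A\u0627\u0628"), table);
        List<String> queried = toRoots(List.of("\u0645\u0643\u062A\u0648\u0628"), table);
        // Two different surface forms reduce to the same root, so a query
        // for one would match a document that contains the other.
        System.out.println(indexed.equals(queried)); // prints true
    }
}
```

Because the same mapping is applied at index time and at query time, matching happens on roots rather than on surface forms. In a real Lucene setup this lookup would presumably sit inside a token filter, with the table loaded from the attached index instead of being hard-coded.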

