[ https://issues.apache.org/jira/browse/LUCENE-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ibrahim updated LUCENE-4293:
----------------------------
    Attachment: rootsTableIndex.zip
                ArabicTokens.txt
                ArabicTokenizer.java
                ArabicRootsAnalyzer.java
                ArabicRootFilter.java
> ArabicRootsAnalyzer
> -------------------
>
> Key: LUCENE-4293
> URL: https://issues.apache.org/jira/browse/LUCENE-4293
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Ibrahim
> Priority: Minor
> Attachments: ArabicRootFilter.java, ArabicRootsAnalyzer.java,
> ArabicTokenizer.java, ArabicTokens.txt, rootsTableIndex.zip
>
>
> ArabicRootsAnalyzer uses an index of Arabic terms, each associated with its
> root. Every Arabic word has a root, but there is no automatic way of deciding
> the root.
> This Analyzer maps each term to its root, so searching and indexing are
> based on roots. It gives me great results in my application.
> I have attached all the required files along with the db. The problem with it
> is the size of the db (16MB); the number of terms is around 300,000. I have
> another db with 600,000 terms, but the attached one is summarized and, I
> believe, better.
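
The term-to-root matching described above can be illustrated with a minimal
sketch. This is not the attached ArabicRootFilter.java; the class name
RootLookup, its methods, and the in-memory map standing in for the roots table
are all hypothetical. Falling back to the unchanged term when no root is
stored is likewise an assumption, not something the issue states.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the attached roots table. */
final class RootLookup {
    // Maps a surface form (an Arabic term) to its root.
    private final Map<String, String> termToRoot = new HashMap<>();

    /** Registers one (term, root) pair, as it might be loaded from the table. */
    void add(String term, String root) {
        termToRoot.put(term, root);
    }

    /**
     * Returns the stored root for the term.
     * Assumption: a term missing from the table is returned unchanged.
     */
    String rootOf(String term) {
        return termToRoot.getOrDefault(term, term);
    }
}
```

For example, if both كتاب and مكتبة are registered with the root كتب, then
rootOf returns the same value for either term, so a query for one would match
documents containing the other once indexing and searching both go through
the lookup.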