[ https://issues.apache.org/jira/browse/LUCENE-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pavithra kariyawasam updated LUCENE-9043:
-----------------------------------------
    Status: Patch Available  (was: Open)

> Currently Lucene doesn't have an analyzer for Sinhala. We have built an
> analyzer which consists of a language-dependent tokenizer, a stemming
> algorithm, and a list of stop words.
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-9043
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9043
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 8.3
>            Reporter: pavithra kariyawasam
>            Priority: Major
>             Fix For: 5.5.6
>
>         Attachments: SinhalaAnalyzer.java, SinhalaStemmer.java,
> SinhalaTokenizer.java, stopwords.txt
>
>
> This component was developed based on three main pieces of research.
> Sinhala Analyzer, as its name implies, is a software library for analyzing
> documents written in the Sinhala language. It is implemented by performing
> Sinhala morphological analysis. Its three main functions are tokenizing the
> document content, removing stop words, and converting terms to their
> base/root form.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)