[ https://issues.apache.org/jira/browse/LUCENE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660528#comment-13660528 ]
Christian Moen commented on LUCENE-4956:
----------------------------------------

I've run {{KoreanAnalyzer}} on Korean Wikipedia and also had a look at memory/heap usage. Things look okay overall.

I believe {{KoreanFilter}} uses wrong offsets for synonym tokens, which was discovered by random-blasting. Looking into the issue...

> the korean analyzer that has a korean morphological analyzer and dictionaries
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-4956
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4956
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>    Affects Versions: 4.2
>            Reporter: SooMyung Lee
>            Assignee: Christian Moen
>              Labels: newbie
>         Attachments: kr.analyzer.4x.tar
>
>
> The Korean language has specific characteristics. When developing a search
> service with Lucene and Solr in Korean, there are some problems in searching
> and indexing. The Korean analyzer solves these problems with a Korean
> morphological analyzer. It consists of a Korean morphological analyzer,
> dictionaries, a Korean tokenizer and a Korean filter. The Korean analyzer is
> made for Lucene and Solr. If you develop a search service with Lucene in
> Korean, it is the best idea to choose the Korean analyzer.