[
https://issues.apache.org/jira/browse/LUCENENET-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shad Storhaug updated LUCENENET-567:
------------------------------------
Attachment: mecab-ipadic-2.7.0-20070801.tar.gz
I posted a comment here:
https://issues.apache.org/jira/browse/LUCENE-3305?focusedCommentId=16097465&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16097465
and also contacted the Kuromoji project owners to see if they could help out.
However, I have received no response so far.
Fortunately, I was able to find [this blog
post](http://mentaldetritus.blogspot.com/2013/03/compiling-custom-dictionary-for.html)
that links to some files (attached) that can be used to check that the I/O
code doesn't just blow up.
I used this data to create a smoke test. Hopefully, someday the Kuromoji team
will add some real tests to Lucene so we can verify automatically instead of
manually that the binary format works.
I also modified the way the files are loaded so they can be overridden by
dropping them into a subdirectory of the application named {{kuromoji-data}}.
If that directory exists, the files will be loaded from it instead of the
embedded resources. This is better than the option that Lucene provided, which
requires you to recompile the assembly in order to change the dictionary.
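
As a rough illustration of the lookup order described above (not the actual Lucene.NET code), the fallback could be sketched like this. Only the {{kuromoji-data}} directory name comes from this comment; the function name, its parameters, and the {{load_embedded}} callback are hypothetical stand-ins, and Python is used purely as neutral pseudocode.

```python
from pathlib import Path
from typing import Callable

# Directory name taken from the comment above; everything else is hypothetical.
OVERRIDE_DIR_NAME = "kuromoji-data"


def read_dictionary_file(
    app_dir: Path,
    file_name: str,
    load_embedded: Callable[[str], bytes],
) -> bytes:
    """Return the bytes of a dictionary file.

    If a `kuromoji-data` subdirectory exists under the application
    directory, the file is read from there; otherwise the embedded
    copy is used (represented here by the `load_embedded` callback).
    """
    override_dir = app_dir / OVERRIDE_DIR_NAME
    if override_dir.is_dir():
        # The decision is made per directory, as described above:
        # once the directory exists, files are read from it.
        return (override_dir / file_name).read_bytes()
    return load_embedded(file_name)
```

With this shape, swapping in a custom dictionary only requires placing the compiled files in that subdirectory; nothing has to be recompiled.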
> Port Lucene.Net.Analysis.Kuromoji
> ---------------------------------
>
> Key: LUCENENET-567
> URL: https://issues.apache.org/jira/browse/LUCENENET-567
> Project: Lucene.Net
> Issue Type: Task
> Components: Lucene.Net.Analysis.Kuromoji
> Affects Versions: Lucene.Net 4.8.0
> Reporter: Shad Storhaug
> Assignee: Shad Storhaug
> Priority: Minor
> Labels: features
> Fix For: Lucene.Net 4.8.0
>
> Attachments: mecab-ipadic-2.7.0-20070801.tar.gz
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> Support for Analysis.Kuromoji has already been added to the ByteBuffer in the
> Support namespace