[ 
https://issues.apache.org/jira/browse/LUCENENET-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shad Storhaug updated LUCENENET-567:
------------------------------------
    Attachment: mecab-ipadic-2.7.0-20070801.tar.gz

I posted a comment here: 
https://issues.apache.org/jira/browse/LUCENE-3305?focusedCommentId=16097465&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16097465
 and also contacted the Kuromoji project owners to see if they could help out. 
However, I have received no response so far.

Fortunately, I was able to find [this blog post](http://mentaldetritus.blogspot.com/2013/03/compiling-custom-dictionary-for.html)
 that links to some files (attached) that can be used to check that the I/O 
code doesn't just blow up.

I used this data to create a smoke test. Hopefully, someday the Kuromoji team 
will add some real tests to Lucene so we can verify automatically instead of 
manually that the binary format works.

I also modified the way the files are loaded so they can be overridden by 
dropping them into a subdirectory of the application named {{kuromoji-data}}. 
If that directory exists, the files will be loaded from it instead of the 
embedded resources. This is better than the option that Lucene provided, which 
requires you to recompile the assembly in order to change the dictionary.
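
To illustrate the lookup order described above, here is a minimal sketch in Python rather than the actual C# code. The function name, the `embedded` mapping that stands in for the assembly's embedded resources, and the file name in the usage note are all hypothetical; only the {{kuromoji-data}} directory name comes from this comment. It also assumes that a file missing from an existing override directory is an error rather than a reason to fall back, which may not match the real behavior.

```python
from pathlib import Path
from typing import Mapping

# Directory name taken from the comment above; everything else is illustrative.
OVERRIDE_DIR_NAME = "kuromoji-data"


def load_dictionary_resource(
    name: str, app_dir: Path, embedded: Mapping[str, bytes]
) -> bytes:
    """Return the bytes of one dictionary file.

    If <app_dir>/kuromoji-data exists, read the file from that directory.
    Otherwise fall back to the built-in copy, modeled here as a mapping
    from file name to bytes.
    """
    override_dir = app_dir / OVERRIDE_DIR_NAME
    if override_dir.is_dir():
        # Assumption: no per-file fallback; a missing file raises FileNotFoundError.
        return (override_dir / name).read_bytes()
    return embedded[name]
```

For example, `load_dictionary_resource("example.dat", Path("."), {"example.dat": b"..."})` returns the built-in bytes unless a `kuromoji-data` directory exists in the current directory, in which case `kuromoji-data/example.dat` is read instead.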

> Port Lucene.Net.Analysis.Kuromoji
> ---------------------------------
>
>                 Key: LUCENENET-567
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-567
>             Project: Lucene.Net
>          Issue Type: Task
>          Components: Lucene.Net.Analysis.Kuromoji
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Shad Storhaug
>            Assignee: Shad Storhaug
>            Priority: Minor
>              Labels: features
>             Fix For: Lucene.Net 4.8.0
>
>         Attachments: mecab-ipadic-2.7.0-20070801.tar.gz
>
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Support for Analysis.Kuromoji has been added already to the ByteBuffer in the 
> Support namespace



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
