[sword-devel] Chinese lucene problem

2012-10-07 Thread Karl Kleinpaste
We've got a bug report in Xiphos saying that Chinese modules can't be searched well with CLucene indices: https://sourceforge.net/p/gnomesword/bugs/488/

I know nothing at all about Chinese, and can't address this. Can anyone supply some info?

Re: [sword-devel] Chinese lucene problem

2012-10-07 Thread DM Smith
SWORD uses an English analyzer (StandardAnalyzer) that works well for Latin-1 languages and for languages that bear some passing similarity to English (e.g. spaces between words, phonetic spelling, ...), but it does poorly with languages that lack these traits, Chinese among them. The Lucene project has a few Chinese analyzers.
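
To make the difference concrete, here is a minimal sketch in Python, not the SWORD or CLucene code. It contrasts splitting on spaces with the overlapping character bigrams that CJK-oriented analyzers commonly emit. The function names and the sample text are illustrative assumptions, and the exact tokens StandardAnalyzer produces for Chinese are not claimed here.

```python
def whitespace_tokens(text):
    """Split on whitespace only, the way a space-delimited language is tokenized."""
    return text.split()


def cjk_bigrams(text):
    """Emit overlapping two-character tokens for each run of letters/digits.

    Illustrative helper (an assumption, not a library function): punctuation
    and whitespace end a run, so no bigram spans a punctuation mark.
    """
    tokens = []
    run = []
    for ch in text + " ":  # trailing space acts as a sentinel to flush the last run
        if ch.isalnum():
            run.append(ch)
            continue
        if len(run) == 1:
            tokens.append(run[0])  # a lone character is kept as its own token
        else:
            tokens.extend(a + b for a, b in zip(run, run[1:]))
        run = []
    return tokens


sample = "太初有道，道与神同在"  # sample verse text, chosen only for illustration
print(whitespace_tokens(sample))  # one token: the text contains no spaces
print(cjk_bigrams(sample))
```

With space-based splitting the whole verse becomes a single token, so a query for a two-character word inside it cannot match. With bigrams, the same query is tokenized into pairs that do appear in the index.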

Re: [sword-devel] Chinese lucene problem

2012-10-07 Thread Karl Kleinpaste
DM Smith dmsm...@crosswire.org writes:
> For JSword, we use the language code as supplied in the conf to vector into the selection of the best analyzer.

OK, well, considering that the regular Sword interface to this is particularly generic, i.e. module.createSearchFramework(...), providing no way

Re: [sword-devel] Chinese lucene problem

2012-10-07 Thread DM Smith
Because it is module.createSearchFramework, it has access to the conf and could vector to the right analyzer. It would be a very small change to the code, but with big impact.

-- DM

On Oct 7, 2012, at 7:40 PM, Karl Kleinpaste k...@kleinpaste.org wrote:
> DM Smith dmsm...@crosswire.org writes:
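
The dispatch described here, reading the language code from the module's conf and picking an analyzer from it, might look roughly like the Python sketch below. It is only an illustration of the idea, not the JSword or SWORD implementation. It assumes the conf carries a `Lang=` entry. The helper names, the mapping table, and the sample conf are hypothetical. The analyzer names are plain labels, and whether CLucene ships a matching analyzer is not assumed.

```python
DEFAULT_ANALYZER = "StandardAnalyzer"

# Hypothetical mapping from primary language subtag to an analyzer label.
ANALYZER_BY_LANG = {
    "zh": "CJKAnalyzer",
    "ja": "CJKAnalyzer",
    "ko": "CJKAnalyzer",
}


def read_lang(conf_text):
    """Return the value of the Lang entry in a module conf, or None if absent."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "lang":
            return value.strip()
    return None


def pick_analyzer(lang):
    """Choose an analyzer label from a language code such as 'zh' or 'zh-Hans'."""
    if not lang:
        return DEFAULT_ANALYZER
    primary = lang.replace("_", "-").split("-")[0].lower()
    return ANALYZER_BY_LANG.get(primary, DEFAULT_ANALYZER)


# Hypothetical conf contents, for illustration only.
sample_conf = "[Example]\nDescription=Sample Chinese module\nLang=zh-Hans\n"
print(pick_analyzer(read_lang(sample_conf)))
print(pick_analyzer("en"))
```

Falling back to the default for unknown or missing codes keeps existing modules on their current behaviour, so only the languages listed in the table would be affected.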