Re: 1.3.9+ and CJK support in sphinx.conf
> Hello,
>
> On 2020-12-27 08:31, d tbsky wrote:
> > I want to upgrade our piler from "1.3.4 + sphinx 2.2" to "1.3.9/10 + sphinx 3.3.1".
> > I notice there is a new config entry at sphinx.conf: "SPHINX_CHARSET_TABLE". I am still confused after reading the document.
> > if I already setup "ngram_len" and "ngram_chars" as FAQ said, do I need to change the setting of "SPHINX_CHARSET_TABLE" of piler default to something else?
> > thanks a lot for your help!!
>
> the SPHINX_CHARSET_TABLE settings cover the characters that most Western countries use. If you want support for CJK languages, then you need to uncomment ngram_len and ngram_chars in the line of NGRAM_CONFIG.
>
> However, frankly I don't have much experience with CJK stuff, so I count on your and other's experience and feedback whether the mentioned setup works or if it needs some improvement.

I didn't change the "SPHINX_CHARSET_TABLE" setting and it seems to work fine. I don't really know how sphinx works, but I guess that with "ngram_len = 1" sphinx just treats each CJK character as a word? So if I search for a CJK "word" that consists of two "characters", piler returns results with any of the characters. However, if I double-quote the "word", piler returns only results with the exact "word". I think the behavior is ok.
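To make the guess about "ngram_len = 1" concrete, here is a minimal Python sketch of unigram indexing. It is only an illustration and not Sphinx or piler code: the function names `tokenize` and `contains_phrase` are hypothetical, and the range U+3000..U+2FA1F is assumed to be the value set in "ngram_chars". It shows each CJK character becoming its own token, and a quoted phrase being treated as those tokens appearing next to each other in order.

```python
# Hypothetical illustration of unigram (ngram_len = 1) indexing.
# Not Sphinx or piler code; the range below is an assumed ngram_chars value.
CJK_START, CJK_END = 0x3000, 0x2FA1F


def is_ngram_char(ch):
    """Return True if ch falls inside the assumed ngram_chars range."""
    return CJK_START <= ord(ch) <= CJK_END


def tokenize(text):
    """Split text into tokens: one token per CJK character, whole words otherwise."""
    tokens = []
    word = ""
    for ch in text:
        if is_ngram_char(ch):
            # Flush any pending non-CJK word, then emit the character by itself.
            if word:
                tokens.append(word)
                word = ""
            tokens.append(ch)
        elif ch.isalnum():
            word += ch.lower()
        else:
            # Whitespace and punctuation act as separators.
            if word:
                tokens.append(word)
                word = ""
    if word:
        tokens.append(word)
    return tokens


def contains_phrase(doc_tokens, query_tokens):
    """Quoted-phrase style match: query tokens must be adjacent and in order."""
    n = len(query_tokens)
    if n == 0:
        return False
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


if __name__ == "__main__":
    print(tokenize("piler 郵件"))
    print(contains_phrase(tokenize("電子郵件"), tokenize("郵件")))
    print(contains_phrase(tokenize("件郵"), tokenize("郵件")))
```

In this sketch a document containing the same two characters in the opposite order does not match the phrase, which is consistent with the quoted search returning only the exact "word".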
Re: 1.3.9+ and CJK support in sphinx.conf
Hello,

On 2020-12-27 08:31, d tbsky wrote:
> I want to upgrade our piler from "1.3.4 + sphinx 2.2" to "1.3.9/10 + sphinx 3.3.1".
> I notice there is a new config entry at sphinx.conf: "SPHINX_CHARSET_TABLE". I am still confused after reading the document.
> if I already setup "ngram_len" and "ngram_chars" as FAQ said, do I need to change the setting of "SPHINX_CHARSET_TABLE" of piler default to something else?
> thanks a lot for your help!!

the SPHINX_CHARSET_TABLE settings cover the characters that most Western countries use. If you want support for CJK languages, then you need to uncomment ngram_len and ngram_chars in the line of NGRAM_CONFIG.

However, frankly I don't have much experience with CJK stuff, so I count on your and other's experience and feedback whether the mentioned setup works or if it needs some improvement.

Janos
1.3.9+ and CJK support in sphinx.conf
Hi:

I want to upgrade our piler from "1.3.4 + sphinx 2.2" to "1.3.9/10 + sphinx 3.3.1". I notice there is a new config entry in sphinx.conf: "SPHINX_CHARSET_TABLE". I am still confused after reading the documentation. If I have already set up "ngram_len" and "ngram_chars" as the FAQ says, do I need to change piler's default "SPHINX_CHARSET_TABLE" setting to something else?

Thanks a lot for your help!!