<s...@acts.hu>
>
> Hello,
>
> On 2020-12-27 08:31, d tbsky wrote:
> > I want to upgrade our piler from "1.3.4 + sphinx 2.2" to "1.3.9/10
> > + sphinx 3.3.1".
> > I notice there is a new config entry at sphinx.conf:
> > "SPHINX_CHARSET_TABLE". I am still confused after reading the document.
> > If I have already set up "ngram_len" and "ngram_chars" as the FAQ said, do I
> > need to change the setting of "SPHINX_CHARSET_TABLE" from the piler default
> > to something else?
> > Thanks a lot for your help!!
>
> The SPHINX_CHARSET_TABLE settings cover the characters that most Western
> countries use. If you want support for CJK languages, then you need to
> uncomment ngram_len and ngram_chars in the line of NGRAM_CONFIG.
>
> However, frankly I don't have much experience with CJK stuff, so I count
> on your and others' experience and feedback on whether the mentioned setup
> works or if it needs some improvement.

I didn't change the "SPHINX_CHARSET_TABLE" setting and it seems to work fine.

I don't really know how sphinx works, but I guess that with "ngram_len = 1",
sphinx just treats every CJK character as a word? So if I search for a CJK
"word" which consists of two "characters", piler will return results with
any character. However, if I double-quote the "word", piler will return
only results with the exact "word". I think the behavior is OK.
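
The guess about "ngram_len = 1" can be illustrated with a small toy model in Python. This is only a sketch and not how sphinx is implemented: the helper names (is_cjk, tokenize, match_any, match_phrase) are made up for illustration, the CJK check covers only the U+4E00..U+9FFF block, and the "match any character" behavior for unquoted queries simply mirrors what is described above rather than verified sphinx or piler semantics.

```python
from typing import List

# Toy model only: NOT sphinx code. It mimics the guessed effect of
# "ngram_len = 1", where every CJK character becomes its own token.


def is_cjk(ch: str) -> bool:
    """Rough check (assumption): CJK Unified Ideographs block only."""
    return "\u4e00" <= ch <= "\u9fff"


def tokenize(text: str) -> List[str]:
    """Emit each CJK character as a separate token; keep runs of other
    alphanumeric characters together as one lowercased token."""
    tokens: List[str] = []
    word = ""
    for ch in text:
        if is_cjk(ch):
            if word:
                tokens.append(word)
                word = ""
            tokens.append(ch)
        elif ch.isalnum():
            word += ch.lower()
        else:
            if word:
                tokens.append(word)
                word = ""
    if word:
        tokens.append(word)
    return tokens


def match_any(query: str, doc: str) -> bool:
    """Unquoted search as described in the post: a hit if the document
    shares at least one token with the query."""
    return bool(set(tokenize(query)) & set(tokenize(doc)))


def match_phrase(query: str, doc: str) -> bool:
    """Double-quoted search: the query tokens must appear adjacent and
    in the same order inside the document."""
    q, d = tokenize(query), tokenize(doc)
    n = len(q)
    return n > 0 and any(d[i:i + n] == q for i in range(len(d) - n + 1))


if __name__ == "__main__":
    docs = ["我喜欢邮件", "邮局在哪里", "文件已发送", "hello world"]
    query = "邮件"  # a two-character "word"
    for doc in docs:
        print(doc, match_any(query, doc), match_phrase(query, doc))
```

In this model the unquoted query hits the first three sample documents, because each of them contains at least one of the two characters, while the quoted (phrase) form hits only the first one, where the two characters sit next to each other.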