<s...@acts.hu>
>
>
>
> Hello,
>
> On 2020-12-27 08:31, d tbsky wrote:
> > I want to upgrade our piler from "1.3.4 + sphinx 2.2" to "1.3.9/10
> > + sphinx 3.3.1".
> > I notice there is a new config entry in sphinx.conf:
> > "SPHINX_CHARSET_TABLE". I am still confused after reading the documentation.
> > If I have already set up "ngram_len" and "ngram_chars" as the FAQ says, do I
> > need to change piler's default "SPHINX_CHARSET_TABLE" setting
> > to something else?
> > Thanks a lot for your help!
>
> The SPHINX_CHARSET_TABLE setting covers the characters that most Western
> countries use. If you want support for CJK languages, then you need to
> uncomment ngram_len and ngram_chars in the NGRAM_CONFIG line.
>
> However, frankly, I don't have much experience with CJK stuff, so I count
> on your and others' experience and feedback on whether the mentioned setup
> works or needs some improvement.
I didn't change the "SPHINX_CHARSET_TABLE" setting and it seems to
work fine. I don't really know how sphinx works, but I guess that with
"ngram_len = 1", sphinx just treats each CJK character as a word? So if
I search for a CJK "word" that consists of two "characters", piler
returns results containing any of those characters. However, if I
double-quote the "word", piler returns only results containing the exact
"word". I think this behavior is OK.
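
To make the guess above concrete, here is a toy sketch in Python. It is
not how sphinx is actually implemented; it only models the behavior
described above. It assumes "ngram_len = 1" and an n-gram range of
U+3000..U+2FA1F (the range is an assumption for illustration), and the
names tokenize, match_any and match_phrase are hypothetical. Whether an
unquoted query really matches any or all of its characters depends on the
search mode, which is not verified here.

```python
# Toy model only: assumed n-gram range, hypothetical helper names.
NGRAM_LO, NGRAM_HI = 0x3000, 0x2FA1F  # assumed ngram_chars range


def tokenize(text):
    """Split text into tokens. With ngram_len = 1, every character in the
    n-gram range becomes its own token; other text is split on whitespace."""
    tokens, word = [], ""
    for ch in text:
        if NGRAM_LO <= ord(ch) <= NGRAM_HI:
            if word:
                tokens.append(word)
                word = ""
            tokens.append(ch)
        elif ch.isspace():
            if word:
                tokens.append(word)
                word = ""
        else:
            word += ch
    if word:
        tokens.append(word)
    return tokens


def match_any(query, doc):
    """Unquoted search as described above: a hit if any query token occurs."""
    return bool(set(tokenize(query)) & set(tokenize(doc)))


def match_phrase(query, doc):
    """Quoted search: query tokens must occur adjacent and in order."""
    q, d = tokenize(query), tokenize(doc)
    n = len(q)
    return n > 0 and any(d[i:i + n] == q for i in range(len(d) - n + 1))
```

For example, an unquoted search for 京都 would hit a message containing
only 東京 (they share 京), while the quoted search would hit only a
message where 京 and 都 are adjacent, such as 東京都.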