Turns out I just needed to use a good charset_table: https://gist.github.com/787128 (which is the full set from http://sphinxsearch.com/wiki/doku.php?id=charset_tables). On top of that, ngrams help with Chinese and Japanese, which don't use spaces for word boundaries.
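For anyone who finds this thread later, here is a rough Python sketch of why ngrams matter for CJK text. It is only an illustration of the idea, not how Sphinx implements it internally: the `tokenize` helper and the `CJK_LOW`/`CJK_HIGH` range are made up for this example, and the range is an assumption you would adjust to match whatever you put in your own ngram_chars setting.

```python
# Assumed code point range covering CJK characters, for illustration only.
CJK_LOW, CJK_HIGH = 0x3000, 0x2FA1F


def is_cjk(ch):
    """Return True if the character falls inside the assumed CJK range."""
    return CJK_LOW <= ord(ch) <= CJK_HIGH


def tokenize(text):
    """Split on whitespace, then emit each CJK character as its own 1-gram."""
    tokens = []
    for word in text.split():
        buf = ""  # collects consecutive non-CJK characters
        for ch in word:
            if is_cjk(ch):
                if buf:
                    tokens.append(buf)
                    buf = ""
                tokens.append(ch)  # each CJK character becomes a token
            else:
                buf += ch
        if buf:
            tokens.append(buf)
    return tokens


if __name__ == "__main__":
    print("東京タワー".split())          # whitespace alone leaves one big token
    print(tokenize("hello 東京タワー"))  # 1-grams make each character searchable
```

With whitespace splitting alone, a run of Japanese or Chinese text stays a single token, so a search for part of it finds nothing. Splitting it into 1-grams is what lets partial matches work.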
Sorry for the trouble!
