I want to index almost all UTF-8 characters without any folding.

My Sphinx is not matching Japanese characters, and it does not appear to
be counting commas as characters. Any idea why this wouldn't work in
my sphinx.yml file?

  charset_table: "U+0021..U+003f, U+0041..U+005f, U+0061+U+007e, U+0080..U+FFFF"
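
For comparison, this is what I would expect the line to look like if the `+` between U+0061 and U+007e were meant to be `..` (just a guess on my part, since every other range in the value uses `..`):

```yaml
# Guess: ".." between U+0061 and U+007e instead of "+", and the whole
# value kept on one line so the last range isn't split mid-token.
charset_table: "U+0021..U+003f, U+0041..U+005f, U+0061..U+007e, U+0080..U+FFFF"
```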



Here is an example of a Japanese string that I want to search for:
"お気に入りの旅行先はどこですか?". It is not matching in Sphinx, but it is
present in my database in an indexed field.


The comma issue presents itself when I do a search for "Hello,":
I get results for both "Hello" and "Hello," when all I want is "Hello,"
with the comma.

Thank you,
Matt Margolis

-- 
You received this message because you are subscribed to the Google Groups 
"Thinking Sphinx" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/thinking-sphinx?hl=en.