Hi Neil

The charset_table setting is not supplementary, so you need to include all values. Also, it's worth noting that # is Sphinx's configuration comment character, so you'll need to put its Unicode code point in the list instead (U+0023). Not sure if Sphinx prefers Unicode for the @ as well; it's U+0040.

http://en.wikipedia.org/wiki/List_of_Unicode_characters
So, a full set could be this:

0..9, A..Z->a..z, _, a..z, U+410..U+42F->U+430..U+44F, U+430..U+44F, U+023, U+040

Give that a spin, let us know how you go.

Cheers

--
Pat

On 10/01/2013, at 5:24 AM, Neil wrote:

> Thanks Pat
>
> I tried tinkering with the charset before posting but without much luck.
>
> Is the TS charset_table setting supplementary to Sphinx's default charset
> rules? i.e. if the sphinx.yml looks as follows, will Sphinx merge the # and
> the @ symbol into its charset rules?
>
> charset_table: "#, @"
>
> The Sphinx config docs list an ignore_chars method - it would be great if it
> had an include_chars method.
>
>
> On Tuesday, 8 January 2013 10:42:55 UTC, Pat Allan wrote:
> Hi Neil
> It may be that you need to add the @ symbol to your charset_table, to ensure
> it gets indexed as a word character. I'm guessing that by default it's
> ignored by Sphinx's indexer?
>
> See here:
> http://sphinxsearch.com/docs/manual-2.0.6.html#conf-charset-table
>
> And two-thirds down this page:
> http://pat.github.com/ts/en/advanced_config.html
>
> Thinking Sphinx defaults to using the utf-8 charset_type (and thus, the
> default utf-8 charset_table values).
>
> Cheers
>
> --
> Pat
>
> On 08/01/2013, at 8:09 PM, Neil wrote:
>
> > The plan is to use Thinking Sphinx to search for @Replies and @Mentions
> > within a messages.content column, but at present Sphinx is also returning
> > "UserName" matches alongside "@UserName":
> >
> > @Replies (Only return Messages where messages.content begins with
> > "@UserName"):
> > Message.search("^\\@#{user_name}")
> >
> > @Mentions (Only return Messages where messages.content contains "@UserName"
> > but does not begin with "@UserName"):
> > Message.search("\\@#{user_name}", conditions: { content: "!^\\@#{user_name}" })
> >
> > Does anyone know how to filter out the "UserName" matches and to only
> > return "@UserName" in both cases?
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Thinking Sphinx" group.
> > To view this discussion on the web visit
> > https://groups.google.com/d/msg/thinking-sphinx/-/crUUOWFC_soJ.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/thinking-sphinx?hl=en.
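
For anyone following along, the advice in this thread (list every range yourself, and write # and @ as code points rather than literals) could be sketched roughly as below. This is only an illustration: `sphinx_codepoint` and `charset_table` are hypothetical helpers, not part of Thinking Sphinx, and the `development` key assumes sphinx.yml is keyed by Rails environment. The base ranges are copied from the list suggested in the thread.

```ruby
require "yaml"

# Hypothetical helper (not a Thinking Sphinx API): turn a literal character
# into the U+XX notation used in the charset_table list in this thread.
def sphinx_codepoint(char)
  format("U+%X", char.ord)
end

# Base ranges copied from the list suggested in the thread.
BASE_CHARSET = [
  "0..9", "A..Z->a..z", "_", "a..z",
  "U+410..U+42F->U+430..U+44F", "U+430..U+44F"
].freeze

# Hypothetical helper: build the complete charset_table value, since the
# setting replaces the defaults rather than adding to them.
def charset_table(extra_chars)
  (BASE_CHARSET + extra_chars.map { |c| sphinx_codepoint(c) }).join(", ")
end

table = charset_table(["#", "@"])

# Assumption: settings in sphinx.yml sit under an environment key.
config = { "development" => { "charset_table" => table } }
puts config.to_yaml
```

Here # and @ come out as U+23 and U+40, the same code points as the U+0023 and U+0040 mentioned above. After changing charset_table, the indexes would need to be rebuilt before searches for "@UserName" behave differently.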
