Hi Andrea,

This is certainly a tricky situation.

I’d consider creating a new field with a modified version of the string, replacing the periods with something else… perhaps even letters. For example: 2and7and1and2 - which remains a single word from Sphinx’s perspective without needing to modify the charset_table, and is distinct. Of course, you’d need to do that translation for both the new field and any search queries.

I can’t think of a neater way to handle this at the moment, but perhaps others on the Sphinx forum have ideas as well?
http://sphinxsearch.com/forum/

Also, it’s worth noting that TS v3 uses the SphinxQL protocol, and that *only* has the one match mode: extended. Everything else can be done within that match mode, so Sphinx does not support any of the others with this newer protocol.

Hope this helps!

— Pat

> On 3 Nov 2015, at 7:02 pm, Andrea S. <[email protected]> wrote:
>
> I need to enable boolean queries that optionally contain keywords with
> periods / full stops as part of their name. For instance such a keyword may
> look like this: "2.7.1.2". I noticed that a search for "2.7.1.2" also returns
> results for "1.2.7.1", which is something I want to avoid.
>
> What setting would I need to tweak to distinguish between "2.7.1.2" and
> "1.2.7.1"? I know that the period is treated as a word separator unless
> explicitly added to the charset_table, but I'm also aware that adding it to
> the charset table would have other side effects, e.g. the word "foo." will be
> different from just "foo". So this doesn't seem to be a viable solution.
>
> This is the search I'm attempting:
>
> Feature.search("\"2.7.1.2\"", :match_mode=>"boolean")
>
> I'm explicitly quoting the search query, which I hoped would treat it as an
> exact phrase. I've also tried other match_modes, such as "extended" (no match
> mode / default) and "phrase", but every time results for "1.2.7.1" cropped
> up.
>
> Essentially, I'm confused about the source of the problem - if the period is
> treated as a word separator, does that mean that sphinx searches for the
> phrase "2 7 1 2"? If yes, then why does the phrase "1 2 7 1" also match? Does
> that have to do with the min_prefix/infix length settings? By the way, my
> thinking_sphinx.yml file looks like this:
>
> mysql41: 5532
> charset_table: 0..9, A..Z->a..z, a..z
> max_matches: 10000
> sql_query_range: "SELECT MIN(id),MAX(id) FROM features"
> sql_range_step: 1000
> mem_limit: 128M
> bin_path: '/usr/bin'
>
> Any tips would be greatly appreciated.
>
> Thanks,
>
> Andrea
>
> --
> You received this message because you are subscribed to the Google Groups
> "Thinking Sphinx" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/thinking-sphinx.
> For more options, visit https://groups.google.com/d/optout.
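
For reference, the translation suggested in the reply (replacing the periods in dotted keywords with letters) could be sketched roughly as below. This is only an illustration under some assumptions: `DottedTerm`, its `SEPARATOR` and its `PATTERN` are hypothetical names, not part of Thinking Sphinx or Sphinx, and the pattern only targets runs of digits joined by periods, so an ordinary word followed by a full stop ("foo.") is left alone. Note that it would also rewrite plain decimals such as 3.14.

```ruby
# Hypothetical helper, not part of Thinking Sphinx or Sphinx.
# Rewrites dotted numeric keywords such as "2.7.1.2" into a single
# letters-and-digits token such as "2and7and1and2".
module DottedTerm
  # Assumption: any separator made only of characters already present
  # in the charset_table (here lowercase letters) would do.
  SEPARATOR = "and"

  # Two or more digit groups joined by periods, e.g. 2.7.1.2
  PATTERN = /\b\d+(?:\.\d+)+\b/

  # Replace the periods inside every dotted numeric keyword in +text+.
  def self.encode(text)
    text.gsub(PATTERN) { |match| match.gsub(".", SEPARATOR) }
  end
end

puts DottedTerm.encode("2.7.1.2")
puts DottedTerm.encode("foo. then 1.2.7.1.")
```

The same helper would have to be applied on both sides: when populating the new indexed field, and when building the query, for example `Feature.search(DottedTerm.encode("\"2.7.1.2\""))`. How the new field gets populated (a stored column, or an SQL expression in the index definition) is left open here.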
