Synonyms work on *tokens*, that is, on the output of the tokenizer, after the text has already been split. The '-' is normally a separator, so J5-55 gets split into J5 and 55. Your synonym filter therefore sees J5 and 55, and there is no rule for those. Could that be your problem?
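To make this concrete, here is a rough Python simulation of the effect. It is not Elasticsearch code: `tokenize`, `expand`, and the `SYNONYMS` table are hypothetical stand-ins, assuming a tokenizer that splits on every non-alphanumeric character and a synonym filter that only looks up single tokens.

```python
import re

def tokenize(text):
    # Hypothetical tokenizer: any run of non-alphanumeric characters
    # (including '-') is treated as a separator.
    return [t for t in re.split(r"[^A-Za-z0-9]+", text) if t]

# Hypothetical synonym table modelled on the rule from the question.
# Every left-hand side is expected to arrive as ONE token.
TARGETS = ["J555", "J5-55"]
SYNONYMS = {lhs: TARGETS for lhs in ["J555", "J5-55", "J-5-55", "J-555"]}

def expand(tokens):
    # Replace a token by its synonyms when a rule matches it exactly;
    # otherwise pass the token through unchanged.
    out = []
    for tok in tokens:
        out.extend(SYNONYMS.get(tok, [tok]))
    return out

for query in ["J555", "J5-55", "J-5-55", "J-555"]:
    print(query, "->", tokenize(query), "->", expand(tokenize(query)))
```

Only the unhyphenated J555 reaches the synonym step as a single token, so it is the only form that any rule matches. The hyphenated forms arrive as pieces such as J5 and 55, which no rule mentions.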
If so, you can use a different tokenizer that doesn't split on the '-', or use a char filter that maps it to an '_'. Another approach would be to use shingles. If you want the dash to be a separator as well, take a look at the word delimiter filter.

/Peter

On Wednesday, 25 February 2015 03:10:28 UTC+1, Tyler H wrote:
>
> Greetings community,
>
> I'm hoping to get some feedback on synonym rule formatting. I'll do my best to explain using a pseudo example; please bear with me.
>
> 1. I have a specific use case where five documents contain the word *J555*, and one document contains *J5-55*.
>    - Both mean the same thing, but are indexed from two different sources over which I have no control.
> 2. Users search for these documents using *J555*, *J5-55*, *J-5-55*, and *J-555*.
>
> How do I create a mapping that will allow each of the cases listed in #2 to result in, at the very least, the six documents referred to in #1? I thought this would have been the following:
>
> *J555*, *J5-55*, *J-5-55*, *J-555 => J555, J5-55*
>
> But that doesn't work as expected. We have expand=true in our synonym configuration. Do you have any thoughts? My main goal is simply to understand better how the mappings work.
>
> Sincerely,
> Tyler
