On Thu, Sep 4, 2014 at 1:41 PM, mooky wrote:
> I am indexing some entities that have up to 140 fields in the resultant
> document - ie lots.
> I am providing a simple/powerful google-style search of such entities
> using the _all field - however, to make the user's life easier, we do
> prefix searches.
> (e.g. rather than the user having to type "johannesburg" or "aluminium" -
> they can just type "joh" or "alu").
>
> We display the results in a grid (with number of columns much less than
> 140!)
>
> The users are new to this kind of search, and while they appreciate the
> many benefits, they are sometimes confused by hits they don't expect.
> E.g. they may search for johannesburg, expecting to get a hit on the
> location - but get some odd hits because someone has put "johannesburg" in
> a comment for something whose location is not johannesburg - and this is
> compounded by the fact that they can't necessarily see why they got a
> particular hit (because we show fewer than 140 columns - and some things
> like comments are unsuitable to show in a grid).
>
> In my experience it's a bit of a common problem - you tend to want to show
> the user the fields they can search on - but in reality, there are always
> more fields that you want to search on than you want to display (esp as
> columns).
>
> The question is how to assist the user to see why something matched.
>
> The problem is we are searching on _all so traditional highlighting
> doesn't (and probably will never) help.
>
> My question is are there some other tricks that anyone can suggest that
> will help the user understand why they got unexpected hits?
>
> E.g. One of my initial thoughts is that the nature of prefix search means
> they might get more false-positives than expected simply because they
> haven't typed enough characters. e.g. "joh" will get all items located in
> "Johannesburg", but also get all items created by "John".
> My thought was that maybe just showing (in a tooltip) the matching term
> might be of some help - ie if the user sees "John", they know that simply
> typing one more character - ie "joha" - will eliminate a raft of
> false-positives.
>
> Thoughts?
>
> Cheers...

I think the problem is pretty hard. We have about 10 fields and use the
experimental highlighter <https://github.com/wikimedia/search-highlighter>
to highlight in "chains" using skip_if_last_matched. You could try that.
It might not be fast enough, but it'd help, I think.

Nik
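
For the tooltip idea in the quoted message, here is a rough client-side sketch in Python of how the matching terms could be recovered for one hit. It makes several assumptions that are not from the thread: the hit's _source is a flat dict, a plain lowercase word split is close enough to whatever analyzer feeds _all, and the names matching_terms, hit and explanation are made up for illustration. It does not call Elasticsearch at all; it just rescans the returned document.

```python
import re
from typing import Any, Dict, List

# Assumption: a simple word split roughly approximates the index analyzer.
TOKEN_RE = re.compile(r"\w+")


def matching_terms(source: Dict[str, Any], prefix: str) -> Dict[str, List[str]]:
    """Return {field: [terms starting with prefix]} for one hit's flat _source."""
    prefix = prefix.lower()
    matches: Dict[str, List[str]] = {}
    for field, value in source.items():
        # Treat single values and lists of values the same way.
        values = value if isinstance(value, list) else [value]
        for v in values:
            for token in TOKEN_RE.findall(str(v).lower()):
                if token.startswith(prefix):
                    terms = matches.setdefault(field, [])
                    if token not in terms:  # keep each term once per field
                        terms.append(token)
    return matches


# Hypothetical hit: the location is not Johannesburg, yet "joh" matches it.
hit = {
    "location": "Durban",
    "created_by": "John Smith",
    "comment": "Shipped via Johannesburg depot",
}
explanation = matching_terms(hit, "joh")
print(explanation)
```

The grid could show the returned field names and terms in a tooltip, so a user who sees "john" under created_by knows that typing "joha" would drop that hit. Since this rescans every field of every displayed hit, it is probably only reasonable for the page of results actually shown.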
