Lucene's Query class defines extractTerms [1], and highlighter implementations use this API to get the expanded set of terms for wildcard, fuzzy and similar queries. It only works on a query in its rewritten form, so a fuzzy or wildcard query has to be rewritten against the index first. This set of terms isn't exposed directly in Elasticsearch today, but you may be able to hack something together using scripts or a custom Java plugin - look at SearchContext.current().query().extractTerms().
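
As a rough illustration of the highlighter route discussed earlier in this thread, here is a minimal client-side sketch in Python that pulls the highlighted tokens out of a search response and de-dupes them. It assumes the default highlighter tags (<em> and </em>) and the usual hits.hits[].highlight layout of the response. The function name and the sample response are made up for the example. Note that it sees the highlighted surface text rather than the indexed terms, and only for the hits actually returned, so it is not an exhaustive list of expansions.

```python
import re
from collections import Counter

# Assumption: default highlighter tags; change the pattern if pre_tags/post_tags are customised.
HIGHLIGHT_RE = re.compile(r"<em>(.*?)</em>", re.DOTALL)


def expanded_terms(response):
    """Count the distinct highlighted tokens across all hits of a search response dict."""
    counts = Counter()
    for hit in response.get("hits", {}).get("hits", []):
        # "highlight" maps each field name to a list of fragments; hits without it are skipped.
        for fragments in hit.get("highlight", {}).values():
            for fragment in fragments:
                for token in HIGHLIGHT_RE.findall(fragment):
                    # Lowercasing only approximates a lowercasing analyzer (assumption).
                    counts[token.lower()] += 1
    return counts


# Hand-written sample shaped like a search response, not real output.
sample_response = {
    "hits": {
        "hits": [
            {"_id": "1", "highlight": {"name": ["<em>Graeme</em> Smith"]}},
            {"_id": "2", "highlight": {"name": ["<em>grahem</em> and <em>graeme</em>"]}},
            {"_id": "3"},
        ]
    }
}

terms = expanded_terms(sample_response)
print(sorted(terms))
```

The counts give a crude idea of which expansions matter most in the returned page of hits, which might help when deciding what to offer the user for exclusion.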

Cheers
Mark

[1] http://lucene.apache.org/core/5_1_0/core/org/apache/lucene/search/Query.html#extractTerms(java.util.Set)

On Tuesday, April 28, 2015 at 12:00:49 PM UTC+1, Graham Turner wrote:
>
> Thanks Mark.
>
> I did wonder about the highlighter, but using it would mean potentially
> retrieving every hit and parsing it, which feels pretty impractical for
> large searches.
>
> Presumably the fuzzy query has to identify a full list of matching terms
> internally - is there any way we could somehow hook into this, or retrieve
> the list separately to the query results? A mechanism similar to the
> suggester, just accepting a single fuzzy term or a wildcard term would be
> perfect. I appreciate this probably isn't a common request, but I'm sure
> it would have other use cases. Something to consider for a future release
> perhaps? :-)
>
> Cheers
>
> Graham
>
>
> On Monday, 27 April 2015 17:41:17 UTC+1, ma...@elastic.co wrote:
>>
>> Hi Graham,
>> If you were to use the highlighter functionality you would essentially
>> "see what the search engine saw".
>> With some client-side coding you could parse out the expanded search
>> terms because they would be surrounded by tags in matching docs.
>> Of course this wouldn't provide a de-duped list of terms and would be
>> inefficient to return an exhaustive list of all expansions used but may be
>> an approach to investigate.
>>
>> Cheers
>> Mark
>>
>> On Monday, April 27, 2015 at 5:08:55 PM UTC+1, Graham Turner wrote:
>>>
>>> Hi,
>>>
>>> I'm working on a proof-of-concept for a client, replacing an existing
>>> legacy search system with an elastic based alternative. One of the
>>> requirements that comes from the existing system is that, when performing a
>>> fuzzy or wildcard search, the user can view all the matching terms, and
>>> include/exclude them manually from the subsequent search.
>>>
>>> Thus, if a fuzzy search for 'graham' is submitted (or a wildcard like
>>> 'gr*m*'), it might match grayam, graeme, grahum, grahem, etc. The users
>>> want to be able to see this list of matched terms, then, for instance,
>>> exclude 'grayam' from the expanded terms list, so that all the other
>>> expansions are used, but not the specifically excluded one.
>>>
>>> I’m struggling to retrieve this list of terms in the first place.
>>> Ideally I’d like to submit a simple query for a fuzzy or wildcard term, and
>>> have it return just the possible matching terms (up to a given limit).
>>>
>>> I’ve had reasonable success using the term suggester for fuzzy-type
>>> responses, but can’t use this for wildcard expansions.
>>>
>>> Is there a good way to do this using 'out-of-the-box' elastic
>>> functionality?
>>>
>>> Any advice / hints gratefully accepted!
>>>
>>> Thanks
>>>
>>> Graham
>>>
>>