Inquiries for such a reverse query feature have come up before - the standard response is:
1. No, Solr does not have such a feature at this time.
2. Check out Luwak.
3. Sounds like you want what Elasticsearch calls Percolator. See:
   https://www.elastic.co/guide/en/elasticsearch/reference/current/search-percolate.html
4. It would be great if someone would submit a patch to add this feature
   to Solr.

So...

1. Is Percolator indeed what you want?
2. Why doesn't Luwak satisfy your needs? Be specific.
3. Somebody should file a Solr Jira for "Add reverse query (a la
   Percolator) to Solr", including specific examples.

-- Jack Krupansky

On Fri, Oct 2, 2015 at 9:33 AM, remi tassing <tassingr...@gmail.com> wrote:

> Hi,
> I have medium-low experience on Solr and I have a question I couldn't quite
> solve yet.
>
> Typically we have quite short query strings (a couple of words) and the
> search is done through a set of bigger documents. What if the logic is
> turned a little bit around? I have a document and I need to find out what
> strings appear in the document. A string here could be a person name
> (including space for example) or a location...which are indexed in Solr.
>
> A concrete example, we take this text from wikipedia (Mad Max):
> "*Mad Max is a 1979 Australian dystopian action film directed by George
> Miller <https://en.wikipedia.org/wiki/George_Miller_%28director%29>.
> Written by Miller and James McCausland from a story by Miller and producer
> Byron Kennedy <https://en.wikipedia.org/wiki/Byron_Kennedy>, it tells a
> story of societal breakdown
> <https://en.wikipedia.org/wiki/Societal_collapse>, murder, and vengeance
> <https://en.wikipedia.org/wiki/Revenge>. The film, starring the
> then-little-known Mel Gibson <https://en.wikipedia.org/wiki/Mel_Gibson>,
> was released internationally in 1980. It became a top-grossing Australian
> film, while holding the record in the Guinness Book of Records
> <https://en.wikipedia.org/wiki/Guinness_Book_of_Records> for decades as the
> most profitable film ever created,[1]
> <https://en.wikipedia.org/wiki/Mad_Max_%28franchise%29#cite_note-1> and has
> been credited for further opening the global market to Australian New Wave
> <https://en.wikipedia.org/wiki/Australian_New_Wave> films.*
> <https://en.wikipedia.org/wiki/Mad_Max_%28franchise%29#cite_note-2>
> <https://en.wikipedia.org/wiki/Mad_Max_%28franchise%29#cite_note-3>"
>
> I would like it to match "Mad Max" but not "Mad" or "Max" separately, and
> "George Miller", "global market" ...
>
> I've tried the keywordTokenizer but it didn't work. I suppose it's ok for
> the index time but not query time (in this specific case)
>
> I had a look at Luwak but it's not what I'm looking for (
> http://www.flax.co.uk/blog/2013/12/06/introducing-luwak-a-library-for-high-performance-stored-queries/
> )
>
> The typical name search doesn't seem to work either,
> https://dzone.com/articles/tips-name-search-solr
>
> I was thinking this problem must have already been solved...or?
>
> Remi
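
To make the request in the quoted message concrete, here is a minimal sketch of the matching behaviour it describes, written in plain Python outside of Solr. It is only an illustration under stated assumptions: the function names `tokenize` and `find_phrases` are hypothetical, the list of known strings is held in memory rather than in a Solr index, and matching is case-insensitive on word tokens. It is not how Solr, Luwak, or Percolator work internally.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def find_phrases(document, phrases):
    """Return the known phrases that occur in the document as whole token runs.

    Hypothetical helper: `phrases` stands in for the strings that would
    be indexed in Solr. Matches are reported in document order, and a
    phrase that occurs twice is reported twice.
    """
    # Map each phrase's token tuple back to its original spelling.
    by_tokens = {tuple(tokenize(p)): p for p in phrases}
    max_len = max((len(t) for t in by_tokens), default=0)

    tokens = tokenize(document)
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so "Mad Max" wins over "Mad".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in by_tokens:
                found.append(by_tokens[candidate])
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no phrase starts at this token
    return found


if __name__ == "__main__":
    text = ("Mad Max is a 1979 Australian dystopian action film "
            "directed by George Miller.")
    print(find_phrases(text, ["Mad Max", "George Miller", "global market"]))
    # prints ['Mad Max', 'George Miller']
```

Because the comparison is done on whole token sequences, a document that contains only "mad" or only "Max" does not match the entry "Mad Max", which is the behaviour asked for above.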