Hi,
I have low-to-medium experience with Solr, and I have a question I haven't
been able to solve yet.

Typically we have quite short query strings (a couple of words) and the
search runs against a set of larger documents. What if the logic is turned
around? I have a document and I need to find out which strings appear in
it. A string here could be a person's name (which may include spaces) or a
location; these strings are what is indexed in Solr.

As a concrete example, take this text from Wikipedia (Mad Max):
"Mad Max is a 1979 Australian dystopian action film directed by George
Miller. Written by Miller and James McCausland from a story by Miller and
producer Byron Kennedy, it tells a story of societal breakdown, murder, and
vengeance. The film, starring the then-little-known Mel Gibson, was released
internationally in 1980. It became a top-grossing Australian film, while
holding the record in the Guinness Book of Records for decades as the most
profitable film ever created, and has been credited for further opening the
global market to Australian New Wave films."

I would like it to match "Mad Max" but not "Mad" or "Max" separately, and
likewise "George Miller", "global market" ...

I've tried the KeywordTokenizer but it didn't work. I suppose it's fine at
index time but not at query time (in this specific case), presumably because
the whole document then becomes a single token.
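
To make the behaviour I'm after concrete, here is a minimal sketch in plain
Python, outside Solr. It is only an illustration under assumptions: the
function name find_known_phrases and the phrase list are made up for this
example, and the in-memory dictionary stands in for whatever is indexed in
Solr. It tokenizes the document, then looks for known phrases as whole,
consecutive token sequences, preferring the longest match at each position.

```python
import re


def find_known_phrases(text, phrases):
    """Return the known phrases that occur in text as whole token sequences."""
    # Lowercase and split on word characters for case-insensitive matching.
    tokens = re.findall(r"\w+", text.lower())
    # Map each phrase's token tuple back to its original spelling.
    index = {tuple(re.findall(r"\w+", p.lower())): p for p in phrases}
    max_len = max((len(key) for key in index), default=0)

    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = index.get(tuple(tokens[i:i + n]))
            if phrase is not None:
                if phrase not in found:
                    found.append(phrase)
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no known phrase starts here
    return found


# Hypothetical stand-in for the strings indexed in Solr.
known = ["Mad Max", "George Miller", "global market", "Mad Men", "Max Payne"]
sample = (
    "Mad Max is a 1979 Australian dystopian action film directed by George "
    "Miller. It has been credited for further opening the global market to "
    "Australian New Wave films."
)
print(find_known_phrases(sample, known))
```

With this sample, "Mad Men" and "Max Payne" are not returned even though
"Mad" and "Max" each occur in the text, which is the behaviour I would like
to get from Solr.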

I had a look at Luwak, but it's not what I'm looking for:
http://www.flax.co.uk/blog/2013/12/06/introducing-luwak-a-library-for-high-performance-stored-queries/

The typical name-search approach doesn't seem to work either:
https://dzone.com/articles/tips-name-search-solr

I was thinking this problem must already have been solved... or hasn't it?

Remi
