Re: How to exactly match fields which are multi-valued?

Jonathan Rochkind Thu, 08 Mar 2012 13:24:10 -0800

Well, if you really want EXACT exact, just use a KeywordTokenizer (i.e., do not tokenize at all). But then matches will really have to be EXACT, including punctuation, whitespace, diacritics, etc. And a query will only match if it 'exactly' matches one value in your multi-valued field.


You could try a KeywordTokenizer with some normalization too.
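
To make the difference concrete, here is a minimal Python sketch of the two behaviors. It is only a model, not Solr code: each value of a multi-valued field is kept as one untokenized term, and the hypothetical `normalize` helper stands in for whatever filters you might put after the KeywordTokenizer (lowercasing, diacritic folding, and whitespace collapsing are my assumptions about what "some normalization" could mean).

```python
import unicodedata


def normalize(value: str) -> str:
    """Lowercase, strip diacritics, and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", value)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.lower().split())


def index_values(values, normalizer=None):
    """Model a keyword-tokenized multi-valued field: one term per value."""
    return {normalizer(v) if normalizer else v for v in values}


def exact_match(indexed_terms, query, normalizer=None):
    """A query matches only if it equals one whole indexed value."""
    term = normalizer(query) if normalizer else query
    return term in indexed_terms


doc = ["Large Cat", "Café  au lait"]
strict = index_values(doc)            # no normalization: byte-for-byte match
loose = index_values(doc, normalize)  # same normalization at index and query time
```

With `strict`, the query "large cat" misses because of case; with `loose`, "cafe au lait" matches the second value, but a partial value such as "cat" still does not.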

Either way, though, if you're issuing a query to a field tokenized with KeywordTokenizer that can include whitespace in its values, you really need to issue it as a _phrase query_, to avoid being messed up by the lucene or dismax query parser's "pre tokenization". Which is potentially fine, that's what you want to do anyway for 'exact match'. Except if you wanted to use dismax multiple qf's with just a BOOST on the 'exact match', but _not_ a phrase query for other fields... well, I can't figure out any way to do it with this technique.
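
As a rough illustration of issuing the value as a phrase query, the following Python sketch builds the query string on the client side. The helper name `phrase_query` and the field name `title_exact` are hypothetical, and the escaping assumes that only the backslash and the double quote need protecting inside a quoted phrase in the lucene query syntax.

```python
def phrase_query(field: str, value: str) -> str:
    """Wrap the whole value in double quotes so the query parser does not
    split it on whitespace before the field's analyzer sees it."""
    # Escape backslashes first, then double quotes (assumed sufficient
    # inside a quoted phrase).
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# Hypothetical keyword-tokenized field name.
q = phrase_query("title_exact", "large cat")
```

Here `q` is `title_exact:"large cat"`, which would be sent as the `q` parameter; without the quotes the parser would treat `large` and `cat` as two separate clauses.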


It gets tricky; I haven't found a great solution.

On 3/8/2012 7:44 AM, Erick Erickson wrote:

You haven't really given us much to go on here. Matches
are just like a single valued field with the exception of
the increment gap. Say one entry were
large cat big dog
in a multi-valued field. Say the next document
indexed two values,
large cat
big dog

And, say the increment gap were 100. The token positions
for doc 1 would be
0, 1, 2, 3
and for doc 2 would be
0, 1, 101, 102

The only effective difference is that phrase queries with "slop"
less than 100 would NEVER match across multi-values. I.e.
"cat big"~10 would match doc 1 but not doc 2.
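
The example above can be sketched in Python as follows. This is a simplified model, not Lucene's implementation: it tokenizes on whitespace, follows the numbering used in this message (the first token of a later value is placed `gap` after the last token of the previous value), and only checks an in-order, two-term sloppy phrase. The function names are hypothetical.

```python
def token_positions(values, gap):
    """Assign a position to every token of a multi-valued field."""
    positions = []
    next_pos = 0
    for i, value in enumerate(values):
        if i > 0 and positions:
            # Jump ahead by the increment gap between values.
            next_pos = positions[-1][1] + gap
        for token in value.split():
            positions.append((token, next_pos))
            next_pos += 1
    return positions


def in_order_sloppy_match(positions, first, second, slop):
    """True if `second` follows `first` with at most `slop` positions between."""
    firsts = [p for t, p in positions if t == first]
    seconds = [p for t, p in positions if t == second]
    return any(0 <= b - a - 1 <= slop for a in firsts for b in seconds)


doc1 = token_positions(["large cat big dog"], gap=100)
doc2 = token_positions(["large cat", "big dog"], gap=100)

match_doc1 = in_order_sloppy_match(doc1, "cat", "big", slop=10)
match_doc2 = in_order_sloppy_match(doc2, "cat", "big", slop=10)
```

In this model `match_doc1` is true and `match_doc2` is false, because the gap puts "big" far beyond the slop of 10 from "cat" in the second document.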

Best
Erick

2012/3/7 SuoNayi<suonayi2...@163.com>:

Hi all, how can I offer exact-match capabilities on multi-valued fields?

Any help is appreciated!

SuoNayi
