Well, if you really want EXACT exact, just use a KeywordTokenizer (i.e., don't tokenize at all). But then matches really will have to be EXACT, including punctuation, whitespace, diacritics, etc., and a query will only match if it 'exactly' matches one whole value in your multi-valued field.

You could try a KeywordTokenizer with some normalization too.
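
To make the difference concrete, here is a rough Python sketch of what "whole value, optionally normalized" matching amounts to. It does not call Solr or Lucene; `normalize` and `exact_match` are made-up helper names, and the particular normalization steps (strip diacritics, lowercase, collapse whitespace) are only an assumption about what you might configure after the KeywordTokenizer. Punctuation is left alone here.

```python
import unicodedata


def normalize(value):
    """Hypothetical normalization: strip diacritics, lowercase, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", value)
    # Drop combining marks (accents) left over after decomposition.
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # split()/join collapses runs of whitespace and trims the ends.
    return " ".join(no_marks.lower().split())


def exact_match(query, field_values, normalizer=None):
    """True if the query equals one whole value of the multi-valued field."""
    norm = normalizer or (lambda s: s)
    return any(norm(query) == norm(value) for value in field_values)


values = ["Large Cat", "Café  au lait"]
strict_hit = exact_match("large cat", values)                # no normalization
relaxed_hit = exact_match("large cat", values, normalize)    # with normalization
partial_hit = exact_match("large", values, normalize)        # only part of a value
```

Without normalization the case difference alone prevents the match; with it, "large cat" matches the first value, while "large" on its own still matches nothing because the whole value has to line up.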

Either way, though, if you're issuing a query to a field tokenized with KeywordTokenizer that can include whitespace in its values, you really need to issue it as a _phrase query_, to avoid being messed up by the Lucene or dismax query parser's "pre-tokenization" on whitespace. Which is potentially fine; that's what you want to do anyway for 'exact match'. Except if you wanted to use dismax with multiple qf fields and just a BOOST on the 'exact match' field, but _not_ a phrase query for the other fields... well, I can't figure out any way to do it with this technique.
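
As a rough illustration of the phrase-query point, this is how you might build the query string on the client side in Python. The helper name `exact_phrase_query` and the field name `title_exact` are made up for the example; the only thing it relies on is the standard query syntax, where double quotes mark a phrase and a backslash escapes a quote or backslash inside it.

```python
def exact_phrase_query(field, value):
    """Wrap a whole field value in quotes so the parser keeps it as one phrase."""
    # Escape backslashes first, then double quotes, so the value cannot
    # terminate the phrase early.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


# 'title_exact' is a hypothetical field analyzed with KeywordTokenizer.
query = exact_phrase_query("title_exact", "large cat")
```

Sent unquoted as `title_exact:large cat`, the query parser would split on the space before the field's analyzer ever sees the text, so the single indexed token "large cat" could never match.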

It gets tricky, and I haven't found a great solution.

On 3/8/2012 7:44 AM, Erick Erickson wrote:
You haven't really given us much to go on here. Matches
are just like a single valued field with the exception of
the increment gap. Say one entry were
large cat big dog
in a multi-valued field. Say the next document
indexed two values,
large cat
big dog

And, say the increment gap were 100. The token positions
for doc 1 would be
0, 1, 2, 3
and for doc 2 would be
0, 1, 101, 102

The only effective difference is that phrase queries with "slop"
less than 100 would NEVER match across multi-values. I.e.
"cat big"~10 would match doc 1 but not doc 2.
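
A small Python simulation of the example above may help. It is not Lucene code: `index_positions` and `sloppy_phrase_match` are made-up names, the position assigned to the first token of each later value simply follows the numbers in this message (real Lucene may differ by one), and the slop check is a simplified model that ignores edge cases such as repeated terms.

```python
from itertools import product


def index_positions(values, gap):
    """Assign token positions across the values of one multi-valued field."""
    positions = {}
    last = -1
    for i, value in enumerate(values):
        # Assumption: a later value starts `gap` positions after the last token.
        pos = last + 1 if i == 0 else last + gap
        for token in value.lower().split():
            positions.setdefault(token, []).append(pos)
            last = pos
            pos += 1
    return positions


def sloppy_phrase_match(positions, phrase, slop):
    """Simplified sloppy phrase check: spread of (position - phrase index) <= slop."""
    if not phrase:
        return False
    per_term = [positions.get(term, []) for term in phrase]
    for combo in product(*per_term):
        shifted = [p - i for i, p in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


doc1 = index_positions(["large cat big dog"], gap=100)
doc2 = index_positions(["large cat", "big dog"], gap=100)

doc1_hit = sloppy_phrase_match(doc1, ["cat", "big"], slop=10)
doc2_hit = sloppy_phrase_match(doc2, ["cat", "big"], slop=10)
```

Under these assumptions the phrase matches doc 1, where "cat" and "big" are adjacent, and not doc 2, where the gap pushes them about a hundred positions apart.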

Best
Erick

2012/3/7 SuoNayi<suonayi2...@163.com>:
Hi all, how can I offer exact-match capabilities on multi-valued fields?

Any help is appreciated!

SuoNayi
