Hi,

there is already a Filter available that optimizes this special case:

http://lucene.apache.org/core/4_10_1/core/org/apache/lucene/search/FieldValueFilter.html

To turn it into a query, wrap it in a ConstantScoreQuery. However, this filter
is better used as a real filter, because it is backed by a bitset.
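
To illustrate why a bitset-backed filter is cheap, here is a rough toy model in
plain Java. It is not Lucene code: documents are modelled as maps, and the names
FieldExistsSketch, docsWithField and applyFilter are made up for this sketch.
The idea is only that "field exists" becomes one bit per document, so counting
matches or restricting another result set is a simple bit operation. In Lucene
4.10 itself the query form would be
new ConstantScoreQuery(new FieldValueFilter("somefield")).

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical names); this does not use or mirror Lucene classes.
final class FieldExistsSketch {

    // One bit per document: bit i is set when document i has a value for `field`.
    static BitSet docsWithField(List<Map<String, String>> docs, String field) {
        BitSet bits = new BitSet(docs.size());
        for (int docId = 0; docId < docs.size(); docId++) {
            if (docs.get(docId).containsKey(field)) {
                bits.set(docId);
            }
        }
        return bits;
    }

    // Restricting a set of candidate hits to the filter is a bitwise AND.
    static BitSet applyFilter(BitSet hits, BitSet filter) {
        BitSet result = (BitSet) hits.clone();
        result.and(filter);
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = List.of(
            Map.of("title", "a", "somefield", "x"),
            Map.of("title", "b"),
            Map.of("somefield", "y"));
        BitSet exists = docsWithField(docs, "somefield");
        // Documents 0 and 2 carry the field, so this prints 2.
        System.out.println(exists.cardinality());
    }
}
```

Once built, the bitset can be reused for every search against the same set of
documents, which is where the benefit over re-running a range or wildcard query
would come from.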

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: [email protected]

 

From: Tommaso Teofili [mailto:[email protected]] 
Sent: Thursday, October 30, 2014 5:34 PM
To: [email protected]
Subject: "field exists" queries and benchmarks

 

Hi all,

 

I'm running some (rough) tests / benchmarks to understand the best way to
run a "field exists" query.

 

As far as I could find, we can use a TermRangeQuery (somefield:[* TO *]), a
WildcardQuery (somefield:*), or a plain TermQuery on another field in which
each document's field names have been indexed (fields:somefield).

 

Besides any other suggestions on how to accomplish this (very welcome), I'd
like to understand the expected performance of each of the above approaches,
because in my case the TermRangeQuery seems to be the least performant, while
the other two are on average at about the same level.

 

One strange thing is that with TermRangeQuery and WildcardQuery the hit count
is not fully correct, meaning that with 100k docs I get the correct hit count
only with the TermQuery approach.

Code and sample outputs can be found at [1].

Any hint would be appreciated.

 

Regards,

Tommaso

 

[1] : https://gist.github.com/tteofili/52856d938fcd465eab58
