Hi,

> I was pointed to Lucene from the Solr list. I am wondering if the
> performance of the below two queries is expected to be quite different and
> would they return the same set of results?
> 
> field:*
> field:[* TO *]

From the Lucene side they are identical, but it depends on the implementation
in Solr's query parser. They both iterate all terms in the field (if it's a
string field).

> The use case I am trying to optimize is returning all documents that
> contain any value for a given field, and I've noticed the queries to be
> quite slow especially for fields that have a large number of distinct
> values.

Unfortunately Solr has no optimized support for that. There are 2 issues open:

https://issues.apache.org/jira/browse/SOLR-11437
https://issues.apache.org/jira/browse/SOLR-12488

The approach in those issues is the same one Elasticsearch uses today. I can
look into implementing it (it's on my TODO list of issues).

In the meantime there is another efficient way to do this, but it requires you
to index an additional field. The nice thing about it is that it does not
depend on the field properties being set up correctly (e.g., it does not need
to differentiate between field types, or whether there are docvalues or norms).
The idea also comes from Elasticsearch, which had this from the first day.
Until it switched to the above approach using DocValues/NormsExistsQuery,
Elasticsearch indexed a hidden internal field (invisible to the user)
that powered the exists query. This field was basically (in Solr speak) a
"multivalued, non-tokenized, string" field. It just contains the names of
all fields that have a value. E.g., if you have a document:

{ "foo": "hello", "bar": 20, "text": "all fine" }

Your indexing code would extend this document to add an additional field (Solr 
won't do this automatically like Elasticsearch):

{ "foo": "hello", "bar": 20, "text": "all fine", "fields": ["foo", "bar",
"text"] }

Then you can query with &fq=fields:bar to filter down to all documents that
have a value in "bar".

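The indexing-side step described above could be sketched roughly like this in
Python, assuming documents are plain dicts before they are sent to Solr. The
helper name add_field_names and the rule that None, empty strings and empty
lists count as "no value" are assumptions for illustration, not something Solr
or Elasticsearch prescribes:

```python
def add_field_names(doc, marker_field="fields"):
    """Return a copy of doc with an extra multivalued field that lists
    the names of all fields carrying a value (hypothetical helper)."""

    def has_value(value):
        # Assumption: None, "" and [] are treated as "field has no value".
        return value is not None and value != "" and value != []

    names = [name for name, value in doc.items()
             if name != marker_field and has_value(value)]

    extended = dict(doc)            # leave the caller's document untouched
    extended[marker_field] = names  # e.g. ["foo", "bar", "text"]
    return extended


doc = {"foo": "hello", "bar": 20, "text": "all fine"}
extended = add_field_names(doc)
```

The "fields" field would have to be declared in the Solr schema as a
multivalued, non-tokenized string field for the fq=fields:bar filter to match
on the exact field name.
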
Uwe

