Re: field:(-null) returns records where field was not specified
Thanks Chris, this is useful. We can use the query format you suggest.

Karen

On Tuesday 15 January 2008 01:13:14 Chris Hostetter wrote:
: ... a simple search for -field_name:[* TO *] will find all docs that
: have no indexed values for that field at all.
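For anyone wanting to try the query format discussed in this thread, here is a minimal Python sketch that builds the two "field is missing" query strings and a request URL for them. The base URL (a local Solr on port 8983) and the helper names missing_field_query and select_url are assumptions made for illustration, not anything from the original posts. The q and debugQuery request parameters are standard Solr ones; the code only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance with the default select handler.
# Adjust the host, port, and path for your own deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def missing_field_query(field_name: str, explicit_match_all: bool = False) -> str:
    """Return a query string for docs with no indexed value in field_name.

    Hypothetical helper for illustration only.
    """
    negative_clause = f"-{field_name}:[* TO *]"
    if explicit_match_all:
        # Longer form quoted in the thread: start from all docs, then exclude.
        return f"*:* AND {negative_clause}"
    # Short form: a purely negative query, which Solr rewrites internally.
    return negative_clause


def select_url(query: str, debug: bool = False) -> str:
    """Build a select URL with the query properly URL-encoded."""
    params = {"q": query}
    if debug:
        # debugQuery asks Solr to include the parsed query in the response.
        params["debugQuery"] = "on"
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(select_url(missing_field_query("field_name"), debug=True))
    print(select_url(missing_field_query("field_name", explicit_match_all=True)))
```

Requesting the first URL with debugQuery enabled is a simple way to check how the parser actually interpreted the negative clause.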
Re: field:(-null) returns records where field was not specified
Have you seen this page?
http://lucene.apache.org/java/docs/queryparsersyntax.html

From that page:

Note: The NOT operator cannot be used with just one term. For example,
the following search will return no results:

NOT jakarta apache

Erick

On Jan 14, 2008 9:30 AM, Karen Loughran [EMAIL PROTECTED] wrote:
: Hi all,
:
: We are indexing different types of documents, some with certain fields
: set and some without, some fields sometimes in both.
:
: If a particular field is missing in a newly added record, I would have
: expected the query:
:
: field_name:(-null)
:
: not to return this particular record in the response, ie, I'm assuming
: the field is set to null.
:
: But the response we see includes empty docs:
:
: ..
: ..
: <doc> </doc>
: <doc> </doc>
: <doc> </doc>
: etc, etc
: ..
:
: Can someone explain why field_name:(-null) returns the records where
: field_name is missing ?
:
: We note that if we do the range operation we can get a response without
: the records with no field_name:
:
: field_name:[* TO *]
:
: Many thanks
: Karen
Re: field:(-null) returns records where field was not specified
Hi Erick,

thanks for your reply. I had read this page. But I'm not using the NOT
operator, I'm using the - operator. I'm assuming there is a subtle
difference between them, in that NOT qualifies something else and hence
needs 2 terms. Isn't the - operator supposed to be a complement to the +
operator, i.e. it excludes something rather than requiring it?

thanks
Karen

On Monday 14 January 2008 15:14:05 Erick Erickson wrote:
: Have you seen this page?
: http://lucene.apache.org/java/docs/queryparsersyntax.html
:
: From that page:
: Note: The NOT operator cannot be used with just one term. For example,
: the following search will return no results:
: NOT jakarta apache
:
: Erick
RE: field:(-null) returns records where field was not specified
Several things in this thread should be clarified (note: order of
quotations munged for clarity)...

: I had read this page. But I'm not using the NOT operator, I'm using the
: - operator. I'm assuming there is a subtle difference between them in
: that NOT qualifies something else, hence needs 2 terms. Isn't the -
: operator supposed to be a complement to the + operator, ie. excludes
: something rather than requiring it ?

The NOT operator and the - operator are in fact the same thing ... the
duplicate syntax comes from Lucene trying to appease people that want
boolean style operator syntax (AND/OR/NOT) even though the query parser
is not a boolean syntax.

: Have you seen this page?
: http://lucene.apache.org/java/docs/queryparsersyntax.html
:
: From that page:
: Note: The NOT operator cannot be used with just one term. For example,
: the following search will return no results:
: NOT jakarta apache

In Solr, the query parser can in fact support purely negative queries, by
internally transforming the query. This is noted on the Solr query syntax
wiki...

http://wiki.apache.org/solr/SolrQuerySyntax

: field_name:(-null)

null is not a special keyword. If you look at the debugging output when
doing that query you'll see that it is the same as:

-field_name:null

... which is a search for all docs that do not contain the string null in
the field field_name. Docs with no value at all for field_name do not
contain it either, so they match too.

: The *:* (star colon star) means all records. The trick is to use (*:* AND
: -field:[* TO *]). It's silly, but there it is.

As I mentioned, you can do pure negative queries now, so a simple search
for

-field_name:[* TO *]

will find all docs that have no indexed values for that field at all.

: A performance note: we switched from empty fields to fields with a standard
: 'empty' value. This way we don't have to do a range check to find records
: with empty fields.

Your mileage may vary depending on how many docs you have with no value
... this also isn't practical when dealing with numeric, boolean, or date
based fields. (And depending on how much churn there is in your index, the
filterCache can probably make the difference negligible on average
anyway.)

-Hoss
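To make the difference between the queries in this thread concrete, here is a toy in-memory model in Python. It is only a sketch under loud assumptions: documents are plain dicts, each field holds a single exact string, and the helper names (docs_with_term, docs_with_any_value, negate) are invented for illustration. It does not reflect how Lucene or Solr index or score documents; it only mimics the set logic of "start from all docs, then remove the matches".

```python
# Toy documents (hypothetical data): doc 3 has no "field_name" at all.
DOCS = {
    1: {"field_name": "apple"},
    2: {"field_name": "null"},  # the literal string "null" stored as a value
    3: {"title": "no field_name here"},
}


def docs_with_term(docs, field, term):
    """IDs of docs whose field holds exactly this term (like field:term)."""
    return {doc_id for doc_id, fields in docs.items() if fields.get(field) == term}


def docs_with_any_value(docs, field):
    """IDs of docs that have any value for the field (like field:[* TO *])."""
    return {doc_id for doc_id, fields in docs.items() if field in fields}


def negate(docs, matching_ids):
    """Pure negative query: take all docs, then remove the matches."""
    return set(docs) - matching_ids


# Like -field_name:null : everything except docs holding the term "null".
not_null_term = negate(DOCS, docs_with_term(DOCS, "field_name", "null"))

# Like field_name:[* TO *] : docs that have some value for the field.
has_value = docs_with_any_value(DOCS, "field_name")

# Like -field_name:[* TO *] : docs with no value for the field.
missing = negate(DOCS, has_value)

if __name__ == "__main__":
    print(sorted(not_null_term), sorted(has_value), sorted(missing))
```

In this model the doc without the field (3) shows up in the first result, which mirrors what Karen observed, while only the negated range query isolates it.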