Re: field:(-null) returns records where field was not specified

2008-01-15 Thread Karen Loughran

Thanks Chris, this is useful; we can use the query format you suggest.

Karen

On Tuesday 15 January 2008 01:13:14 Chris Hostetter wrote:
 Several things in this thread should be clarified (note: order of
 quotations munged for clarity)...

 : I had read this page.  But I'm not using the NOT operator,  I'm using
 : the - operator.  I'm assuming there is a subtle difference between them
 : in that NOT qualifies something else, hence needs 2 terms.  Isn't the -
 : operator supposed to be a complement to the + operator, ie. excludes
 : something rather than requiring it ?

 The NOT operator and the - operator are in fact the same thing ... the
 duplicate syntax comes from Lucene trying to appease people that
 want boolean style operator syntax (AND/OR/NOT) even though the query
 parser syntax is not really boolean.

 :  Have you seen this page?
 :  http://lucene.apache.org/java/docs/queryparsersyntax.html
 : 
 :  From that page:
 :  Note: The NOT operator cannot be used with just one term. For example,
 :  the following search will return no results:
 :  NOT jakarta apache

 In Solr, the query parser can in fact support purely negative queries, by
 internally transforming the query; this is noted on the Solr query syntax
 wiki...

 http://wiki.apache.org/solr/SolrQuerySyntax

 :   field_name:(-null)

 null is not a special keyword. If you look at the debugging output when
 doing that query you'll see that it is the same as:   -field_name:null
 ... which is a search for all docs that do not contain the string null
 in the field field_name, and docs with no value for the field qualify.

 : The *:* (star colon star) means all records. The trick is to use (*:*
 : AND -field:[* TO *]). It's silly, but there it is.

 As I mentioned, you can do pure negative queries now, so a simple search
 for -field_name:[* TO *] will find all docs that have no indexed values
 for that field at all.

 : A performance note: we switched from empty fields to fields with a
 : standard 'empty' value. This way we don't have to do a range check to
 : find records with empty fields.

 Your mileage may vary depending on how many docs you have with no value
 ... this also isn't practical when dealing with numeric, boolean, or date
 based fields.  (And depending on how much churn there is in your index,
 the filterCache can probably make the difference negligible on average
 anyway.)




 -Hoss




Re: field:(-null) returns records where field was not specified

2008-01-14 Thread Erick Erickson
Have you seen this page?
http://lucene.apache.org/java/docs/queryparsersyntax.html

From that page:
Note: The NOT operator cannot be used with just one term. For example, the
following search will return no results:
NOT jakarta apache


Erick


On Jan 14, 2008 9:30 AM, Karen Loughran wrote:



 Hi all,

 We are indexing different types of documents, some with certain fields set
 and some without, some fields sometimes in both.

 If a particular field is missing in a newly added record, I would have
 expected the query:

 field_name:(-null)

 not to return this particular record in the response, ie, I'm assuming the
 field is set to null.

 But the response we see includes empty docs:

 ..
 
 ..
 <doc>
  </doc>
 <doc>
  </doc>
 <doc>
  </doc>
 etc, etc
 ..
 

 Can someone explain why field_name:(-null) returns the records where
 field_name is missing ?

 We note that if we do the range operation we can get a response without
 the records with no field_name:

 field_name:[* TO *]

 Many thanks
 Karen



Re: field:(-null) returns records where field was not specified

2008-01-14 Thread Karen Loughran

Hi Erick, thanks for your reply,

I had read this page.  But I'm not using the NOT operator,  I'm using 
the - operator.  I'm assuming there is a subtle difference between them in 
that NOT qualifies something else, hence needs 2 terms.  Isn't the - 
operator supposed to be a complement to the + operator, ie. excludes 
something rather than requiring it ?

thanks
Karen



On Monday 14 January 2008 15:14:05 Erick Erickson wrote:
 Have you seen this page?
 http://lucene.apache.org/java/docs/queryparsersyntax.html

 From that page:
 Note: The NOT operator cannot be used with just one term. For example, the
 following search will return no results:
 NOT jakarta apache


 Erick

 On Jan 14, 2008 9:30 AM, Karen Loughran wrote:
  Hi all,
 
  We are indexing different types of documents, some with certain fields
  set and some without, some fields sometimes in both.
 
  If a particular field is missing in a newly added record, I would have
  expected the query:
 
  field_name:(-null)
 
  not to return this particular record in the response, ie, I'm assuming
  the field is set to null.
 
  But the response we see includes empty docs:
 
  ..
  
  ..
  <doc>
   </doc>
  <doc>
   </doc>
  <doc>
   </doc>
  etc, etc
  ..
  
 
  Can someone explain why field_name:(-null) returns the records where
  field_name is missing ?
 
  We note that if we do the range operation we can get a response without
  the records with no field_name:
 
  field_name:[* TO *]
 
  Many thanks
  Karen




RE: field:(-null) returns records where field was not specified

2008-01-14 Thread Chris Hostetter

Several things in this thread should be clarified (note: order of 
quotations munged for clarity)...

: I had read this page.  But I'm not using the NOT operator,  I'm using the
: - operator.  I'm assuming there is a subtle difference between them in
: that NOT qualifies something else, hence needs 2 terms.  Isn't the - 
: operator supposed to be a complement to the + operator, ie. excludes
: something rather than requiring it ?

The NOT operator and the - operator are in fact the same thing ... the 
duplicate syntax comes from Lucene trying to appease people that 
want boolean style operator syntax (AND/OR/NOT) even though the query 
parser syntax is not really boolean.

:  Have you seen this page?
:  http://lucene.apache.org/java/docs/queryparsersyntax.html
: 
:  From that page:
:  Note: The NOT operator cannot be used with just one term. For example, 
:  the following search will return no results:
:  NOT jakarta apache

In Solr, the query parser can in fact support purely negative queries, by 
internally transforming the query; this is noted on the Solr query syntax 
wiki...

http://wiki.apache.org/solr/SolrQuerySyntax
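
The transformation can be pictured roughly as follows. This is a simplified
sketch in Python, not Solr's actual code: the Clause type and the
make_queryable and to_string helpers are hypothetical names, real queries
are trees rather than flat clause lists, and adding the match-all clause as
a required (+) clause is an assumption about the internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    occur: str  # "+" required, "-" prohibited, "" optional
    query: str  # e.g. "field_name:null" or "*:*"


def make_queryable(clauses):
    """Return the clauses, adding a match-all clause if all are prohibited.

    A query made only of prohibited clauses has nothing to subtract from,
    so a "*:*" clause (assumed required here) is put in front of it.
    """
    if clauses and all(c.occur == "-" for c in clauses):
        return [Clause("+", "*:*")] + list(clauses)
    return list(clauses)


def to_string(clauses):
    """Render a flat clause list in query-parser-like syntax."""
    return " ".join(c.occur + c.query for c in clauses)


print(to_string(make_queryable([Clause("-", "field_name:null")])))
# prints: +*:* -field_name:null
```

A query that already has a positive clause is returned unchanged.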

:   field_name:(-null)

null is not a special keyword. If you look at the debugging output when 
doing that query you'll see that it is the same as:   -field_name:null  
... which is a search for all docs that do not contain the string null 
in the field field_name, and docs with no value for the field qualify.
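
A toy model of what -field_name:null selects may help. It is only a sketch
over an in-memory dict, not how Lucene evaluates queries, and the docs data
and helper names are made up for illustration.

```python
# Three hypothetical docs: one holding the literal string "null",
# one holding another value, and one with no field_name at all.
docs = {
    1: {"field_name": ["null"]},
    2: {"field_name": ["foo"]},
    3: {},
}


def docs_with_term(docs, field, term):
    """Ids of docs whose field contains the given term."""
    return {doc_id for doc_id, fields in docs.items()
            if term in fields.get(field, [])}


def prohibit_term(docs, field, term):
    """Model of -field:term, i.e. every doc minus those containing the term."""
    return set(docs) - docs_with_term(docs, field, term)


matches = prohibit_term(docs, "field_name", "null")
print(sorted(matches))  # [2, 3]
```

Doc 3 has no field_name value, so it cannot contain the term null and is
therefore returned, which matches the behaviour reported at the start of
the thread.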

: The *:* (star colon star) means all records. The trick is to use (*:* AND
: -field:[* TO *]). It's silly, but there it is.

As I mentioned, you can do pure negative queries now, so a simple search 
for -field_name:[* TO *] will find all docs that have no indexed values 
for that field at all.
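
For anyone building the request by hand, a minimal sketch of encoding that
query into a select URL. The base URL and the missing_field_url helper are
assumptions for illustration; q and debugQuery are the standard request
parameters.

```python
from urllib.parse import urlencode


def missing_field_url(base_url, field, debug=False):
    """Build a select URL for docs with no indexed value in `field`."""
    params = {"q": f"-{field}:[* TO *]"}
    if debug:
        # Ask Solr to include the parsed query in the response.
        params["debugQuery"] = "on"
    return base_url + "?" + urlencode(params)


# Hypothetical host and path; adjust to your own installation.
url = missing_field_url("http://localhost:8983/solr/select",
                        "field_name", debug=True)
print(url)
```

urlencode takes care of escaping the brackets, colon, asterisks and spaces,
which are easy to get wrong when the URL is typed by hand.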

: A performance note: we switched from empty fields to fields with a standard
: 'empty' value. This way we don't have to do a range check to find records
: with empty fields.

Your mileage may vary depending on how many docs you have with no value 
... this also isn't practical when dealing with numeric, boolean, or date 
based fields.  (And depending on how much churn there is in your index, 
the filterCache can probably make the difference negligible on average 
anyway.)
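
One way to give the filterCache a chance to help is to send the
missing-field restriction as a filter query (fq) next to the main query.
This is a sketch only: the select_params_with_filter helper is a
hypothetical name, and it assumes your Solr version accepts a purely
negative fq.

```python
from urllib.parse import urlencode


def select_params_with_filter(main_query, field):
    """Encode a main query plus a 'field has no value' filter query."""
    return urlencode([
        ("q", main_query),
        # Sent as fq so the matching doc set can be cached and reused.
        ("fq", f"-{field}:[* TO *]"),
    ])


query_string = select_params_with_filter("*:*", "field_name")
print(query_string)
```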




-Hoss