Re: searching for non-empty fields

2007-09-27 Thread Yonik Seeley
On 9/27/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 9/27/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
> > While in theory -URL:"" should be valid syntax, the Lucene query parser
> > doesn't accept it and throws a ParseException.
>
> I don't have time to work on that now, but I did just open a bug:
> https://issues.apache.org/jira/browse/LUCENE-1006

OK, I lied :-)  It was simple (and a nice diversion).

-Yonik


Re: searching for non-empty fields

2007-09-27 Thread Yonik Seeley
On 9/27/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
> While in theory -URL:"" should be valid syntax, the Lucene query parser
> doesn't accept it and throws a ParseException.

I don't have time to work on that now, but I did just open a bug:
https://issues.apache.org/jira/browse/LUCENE-1006

-Yonik


Re: searching for non-empty fields

2007-09-27 Thread Brian Whitman

thanks Peter, Hoss and Ryan..


> q=(URL:[* TO *] -URL:"")


This gives me 400 Query parsing error: Cannot parse
'(URL:[* TO *] -URL:"")': Lexical error at line 1, column 29.
Encountered: "\"" (34), after : "\""




> adding something like:


I'll do this but the problem here is I have to wait around for all  
these docs to re-index..


> Your query will work if you make sure the URL field is omitted from the
> document at index time when the field is blank.


The thing is, I thought I was omitting the field if it's blank. It's  
in a solrj instance that takes a lucenedocument, so maybe it's a  
solrj issue?


if (URL != null && URL.length() > 5)
  doc.add(new Field("URL", URL, Field.Store.YES, Field.Index.UN_TOKENIZED));


And then during indexing:

SimpleSolrDoc solrDoc = new SimpleSolrDoc();
solrDoc.setBoost(null, new Float(doc.getBoost()));
// Copy every Lucene field that is not on the ignore list into the Solr doc.
for (Enumeration e = doc.fields(); e.hasMoreElements();) {
  Field field = (Field) e.nextElement();
  if (!ignoreFields.contains(field.name())) {
    solrDoc.addField(field.name(), field.stringValue());
  }
}
try {
  solr.add(solrDoc);
...







Re: searching for non-empty fields

2007-09-27 Thread Pieter Berkel
While in theory -URL:"" should be valid syntax, the Lucene query parser
doesn't accept it and throws a ParseException.  I've considered raising this
issue on lucene-dev but it didn't seem to affect many users so I decided not
to pursue the matter.



On 27/09/2007, Chris Hostetter <[EMAIL PROTECTED]> wrote:

> ...and to work around the problem until you reindex...
>
> q=(URL:[* TO *] -URL:"")
>
> ...at least: i'm 97% certain that will work.  it won't help if your "empty"
> values are really " " or "  " or ...
>
>


Re: searching for non-empty fields

2007-09-26 Thread Chris Hostetter


: Date: Thu, 27 Sep 2007 00:12:48 -0400
: From: Ryan McKinley <[EMAIL PROTECTED]>
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Re: searching for non-empty fields
: 
: > 
: > Your query will work if you make sure the URL field is omitted from the
: > document at index time when the field is blank.
: > 
: 
: adding something like:
:   
: 
: to the schema field should do it without needing to ensure it is not null or
: "" on the client side.

...and to work around the problem until you reindex...

q=(URL:[* TO *] -URL:"")

...at least: i'm 97% certain that will work.  it won't help if your "empty"
values are really " " or "  " or ...
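
To make that caveat concrete, here is a toy model in plain Java. It is not
Lucene or Solr code: the class OpenRangeModel and its methods are made-up
names for illustration, and it assumes an untokenized field so that each
document contributes at most one term. A map from document id to indexed
term stands in for the index, and a document that omitted the field simply
has no entry.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, no Lucene or Solr classes involved.
class OpenRangeModel {

    // Models URL:[* TO *] -URL:"" over a map of docId -> indexed term.
    // Any entry counts as a range match; only the exact empty term is excluded.
    static List<String> matchNonEmpty(Map<String, String> termByDocId) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> entry : termByDocId.entrySet()) {
            if (!entry.getValue().isEmpty()) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    // Sample data: a document that omitted the field has no entry at all.
    static Map<String, String> sampleIndex() {
        Map<String, String> index = new LinkedHashMap<>();
        index.put("doc1", "http://thing.com");
        index.put("doc2", "");   // empty string: excluded by -URL:""
        index.put("doc3", " ");  // single space: not equal to "", still matches
        return index;
    }
}
```

In this model, matchNonEmpty(sampleIndex()) keeps doc1 and doc3. The
whitespace-only value slips through, so values like that would have to be
trimmed or skipped before indexing.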



-Hoss



Re: searching for non-empty fields

2007-09-26 Thread Ryan McKinley


Your query will work if you make sure the URL field is omitted from the
document at index time when the field is blank.



adding something like:
  

to the schema field should do it without needing to ensure it is not 
null or "" on the client side.


ryan


Re: searching for non-empty fields

2007-09-26 Thread Pieter Berkel
I've experienced a similar problem before. Assuming the field type is
"string" (i.e. not tokenized), there is a subtle yet important difference
between a field that is null (i.e. not contained in the document) and one
that is an empty string (in the document but with no value). See
http://www.nabble.com/indexing-null-values--tf4238702.html#a12067741 for a
previous discussion of the issue.

Your query will work if you make sure the URL field is omitted from the
document at index time when the field is blank.
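
A rough sketch of that index-time check, in plain Java. The names here
(BlankFieldFilter, isBlank, addIfPresent) are hypothetical, and a Map stands
in for whatever document object the indexing client really uses. Treating
whitespace-only values as blank is an assumption on my part.

```java
import java.util.Map;

// Hypothetical helper, not part of Lucene or Solr: a plain Map stands in
// for the document that would be sent to the indexer.
class BlankFieldFilter {

    // True when the value is null, empty, or contains only whitespace.
    static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    // Adds the field only when it carries a real value, so a blank value
    // never reaches the index as an empty-string term.
    static void addIfPresent(Map<String, String> doc, String name, String value) {
        if (!isBlank(value)) {
            doc.put(name, value);
        }
    }
}
```

With a guard like this, a call such as addIfPresent(doc, "URL", "") leaves
the document without a URL field at all, which is the case the range query
relies on.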

cheers,
Piete



On 27/09/2007, Brian Whitman <[EMAIL PROTECTED]> wrote:
>
> I have a large index with a field for a URL. For some reason or
> another, sometimes a doc will get indexed with that field blank. This
> is fine, but I want a query to return only docs where the URL is set...
>
> If I do a query like:
>
> q=URL:[* TO *]
>
> I get a lot of empty fields back, like:
>
> 
> 
> http://thing.com
>
> What can I query for to remove the empty fields?
>
>
>
>


searching for non-empty fields

2007-09-26 Thread Brian Whitman
I have a large index with a field for a URL. For some reason or  
another, sometimes a doc will get indexed with that field blank. This  
is fine, but I want a query to return only docs where the URL is set...


If I do a query like:

q=URL:[* TO *]

I get a lot of empty fields back, like:



http://thing.com

What can I query for to remove the empty fields?