Re: searching for non-empty fields
On 9/27/07, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On 9/27/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
> > While in theory -URL:"" should be valid syntax, the Lucene query parser
> > doesn't accept it and throws a ParseException.
>
> I don't have time to work on that now,

OK, I lied :-) It was simple (and a nice diversion).

-Yonik

> but I did just open a bug:
> https://issues.apache.org/jira/browse/LUCENE-1006
Re: searching for non-empty fields
On 9/27/07, Pieter Berkel <[EMAIL PROTECTED]> wrote:
> While in theory -URL:"" should be valid syntax, the Lucene query parser
> doesn't accept it and throws a ParseException.

I don't have time to work on that now, but I did just open a bug:
https://issues.apache.org/jira/browse/LUCENE-1006

-Yonik
Re: searching for non-empty fields
Thanks Pieter, Hoss and Ryan.

> q=(URL:[* TO *] -URL:"")

This gives me:

400 Query parsing error: Cannot parse '(URL:[* TO *] - URL:"")': Lexical error at line 1, column 29. Encountered: "\"" (34), after : "\""

> adding something like:

I'll do this, but the problem is that I have to wait around for all these docs to re-index.

> Your query will work if you make sure the URL field is omitted from the
> document at index time when the field is blank.

The thing is, I thought I was omitting the field if it's blank. It's in a solrj instance that takes a Lucene Document, so maybe it's a solrj issue?

    if( URL != null && URL.length() > 5 )
        doc.add(new Field("URL", URL, Field.Store.YES, Field.Index.UN_TOKENIZED));

And then during indexing:

    SimpleSolrDoc solrDoc = new SimpleSolrDoc();
    solrDoc.setBoost( null, new Float ( doc.getBoost()));
    for (Enumeration e = doc.fields(); e.hasMoreElements();) {
        Field field = (Field) e.nextElement();
        if (!ignoreFields.contains(field.name())) {
            solrDoc.addField(field.name(), field.stringValue());
        }
    }
    try {
        solr.add(solrDoc);
        ...
Re: searching for non-empty fields
While in theory -URL:"" should be valid syntax, the Lucene query parser doesn't accept it and throws a ParseException. I've considered raising this issue on lucene-dev, but it didn't seem to affect many users, so I decided not to pursue the matter.

On 27/09/2007, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> ...and to work around the problem until you reindex...
>
> q=(URL:[* TO *] -URL:"")
>
> ...at least: i'm 97% certain that will work. it won't help if your "empty"
> values are really " " or " " or ...
Re: searching for non-empty fields
: Date: Thu, 27 Sep 2007 00:12:48 -0400
: From: Ryan McKinley <[EMAIL PROTECTED]>
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Re: searching for non-empty fields
:
: > Your query will work if you make sure the URL field is omitted from the
: > document at index time when the field is blank.
:
: adding something like:
:
: to the schema field should do it without needing to ensure it is not null or
: "" on the client side.

...and to work around the problem until you reindex...

q=(URL:[* TO *] -URL:"")

...at least: i'm 97% certain that will work. it won't help if your "empty" values are really " " or " " or ...

-Hoss
Re: searching for non-empty fields
> Your query will work if you make sure the URL field is omitted from the
> document at index time when the field is blank.

adding something like:

to the schema field should do it without needing to ensure it is not null or "" on the client side.

ryan
Re: searching for non-empty fields
I've experienced a similar problem before. Assuming the field type is "string" (i.e. not tokenized), there is a subtle yet important difference between a field that is null (i.e. not contained in the document) and one that is an empty string (in the document but with no value). See http://www.nabble.com/indexing-null-values--tf4238702.html#a12067741 for a previous discussion of the issue.

Your query will work if you make sure the URL field is omitted from the document at index time when the field is blank.

cheers,
Piete

On 27/09/2007, Brian Whitman <[EMAIL PROTECTED]> wrote:
> I have a large index with a field for a URL. For some reason or
> another, sometimes a doc will get indexed with that field blank. This
> is fine but I want a query to return only the set URL fields...
>
> If I do a query like:
>
> q=URL:[* TO *]
>
> I get a lot of empty fields back, like:
>
> http://thing.com
>
> What can I query for to remove the empty fields?
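
On the "omit the field when it's blank" advice, here is a minimal sketch in plain Java of what that check could look like on the client side. It is only an illustration under assumptions: BlankFieldFilter, isBlank and dropBlankFields are made-up names, not part of Solr, solrj or Lucene, and the document is modelled as a simple field-name-to-value map. It also treats whitespace-only values as blank, since values like " " would otherwise still be indexed as non-empty.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Solr, solrj or Lucene.
final class BlankFieldFilter {
    private BlankFieldFilter() {}

    // True when the value is null, empty, or consists only of whitespace.
    static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    // Returns a copy of the fields with blank values left out, keeping insertion order.
    static Map<String, String> dropBlankFields(Map<String, String> fields) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            if (!isBlank(entry.getValue())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }
}
```

Only the entries that survive dropBlankFields would then be added to the document that gets sent to Solr, so a blank URL never reaches the index and q=URL:[* TO *] should match only docs that really have one.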
searching for non-empty fields
I have a large index with a field for a URL. For some reason or another, sometimes a doc will get indexed with that field blank. This is fine, but I want a query to return only the docs where the URL field is set...

If I do a query like:

q=URL:[* TO *]

I get a lot of empty fields back, like:

http://thing.com

What can I query for to remove the empty fields?