Are you using NumberTools at both index and query time? I ask because
the test program below works exactly as I expect:
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumberTools;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.ConstantScoreRangeQuery;
import java.io.IOException;
/**
* Created by: eoericks
* Date: May 12, 2008
* History: $Log$
*/
public class Test {
    // One constant so indexing and searching always use the same directory.
    private static final String INDEX_PATH = "C:/lucidx";

    public static void main(String args[]) {
        try {
            Test test = new Test();
            test.doIndex();
            test.doSearch();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Build a fresh index holding four documents: 1, 11, 5 and 9.
    private void doIndex() throws IOException {
        IndexWriter w = new IndexWriter(FSDirectory.getDirectory(INDEX_PATH),
                new StandardAnalyzer(), true);
        addDoc(w, 1);
        addDoc(w, 11);
        addDoc(w, 5);
        addDoc(w, 9);
        w.close();
    }

    // "num" is encoded with NumberTools and indexed as a single untokenized term.
    private void addDoc(IndexWriter w, long num) throws IOException {
        Document doc = new Document();
        doc.add(new Field("num", NumberTools.longToString(num),
                Field.Store.NO, Field.Index.UN_TOKENIZED));
        doc.add(new Field("name", "doc " + num,
                Field.Store.YES, Field.Index.UN_TOKENIZED));
        w.addDocument(doc);
    }

    private void doSearch() throws IOException {
        IndexSearcher r = new IndexSearcher(FSDirectory.getDirectory(INDEX_PATH));
        try {
            oneSearch(r, 1L);
            oneSearch(r, 2L);
            oneSearch(r, 5L);
            oneSearch(r, 9L);
            oneSearch(r, 0L);
        } finally {
            r.close();
        }
    }

    private void oneSearch(IndexSearcher r, long lower) throws IOException {
        System.out.println("\n\nSearching for greater than " + lower);
        // Exclusive lower bound, open-ended (null) upper bound. The bound is
        // encoded with NumberTools, exactly as the field was at index time.
        Hits hits = r.search(new ConstantScoreRangeQuery("num",
                NumberTools.longToString(lower), null, false, true));
        for (int idx = 0; idx < hits.length(); ++idx) {
            System.out.println(hits.doc(idx).get("name"));
        }
    }
}
***output***

Searching for greater than 1
doc 11
doc 5
doc 9

Searching for greater than 2
doc 11
doc 5
doc 9

Searching for greater than 5
doc 11
doc 9

Searching for greater than 9
doc 11

Searching for greater than 0
doc 1
doc 11
doc 5
doc 9
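
In case it helps to see why the encoding has to match on both sides, here
is a rough, Lucene-free sketch. Everything in it is an assumption made for
illustration: encode() is a hypothetical stand-in that zero-pads decimal
digits (NumberTools uses its own fixed-width encoding, not this one), and a
TreeMap stands in for the sorted terms of the "num" field. The point is only
that range queries compare terms as strings, so "11" sorts before "5" unless
every term has the same width.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeMap;

class PaddedRangeDemo {

    // Hypothetical stand-in for NumberTools.longToString: fixed-width,
    // zero-padded decimal. This sketch only handles non-negative values.
    static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("non-negative values only");
        }
        return String.format(Locale.ROOT, "%019d", value);
    }

    // Names of all entries whose term is strictly greater than 'lower',
    // i.e. an exclusive lower bound with an open upper bound.
    static List<String> greaterThan(TreeMap<String, String> terms, String lower) {
        return new ArrayList<>(terms.tailMap(lower, false).values());
    }

    public static void main(String[] args) {
        long[] values = {1, 11, 5, 9};

        TreeMap<String, String> raw = new TreeMap<>();     // terms like "11"
        TreeMap<String, String> padded = new TreeMap<>();  // terms like "000...011"
        for (long v : values) {
            raw.put(Long.toString(v), "doc " + v);
            padded.put(encode(v), "doc " + v);
        }

        // Raw strings on both sides: "11" < "5" as strings, so doc 11 is lost.
        System.out.println("raw/raw       > 5: " + greaterThan(raw, "5"));
        // prints: raw/raw       > 5: [doc 9]

        // Padded on both sides: string order now agrees with numeric order.
        System.out.println("padded/padded > 5: " + greaterThan(padded, encode(5)));
        // prints: padded/padded > 5: [doc 9, doc 11]

        // Padded query against raw terms: the padded bound starts with '0',
        // which sorts before every raw term, so everything matches.
        System.out.println("raw/padded    > 5: " + greaterThan(raw, encode(5)));
        // prints: raw/padded    > 5: [doc 1, doc 11, doc 5, doc 9]
    }
}
```

The last case is the one I would check first: if the index holds unpadded
values and only the query term is padded (or the other way around), the
results look plausible but are wrong.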
On Mon, May 12, 2008 at 3:21 PM, Dan Hardiker <[EMAIL PROTECTED]>
wrote:
> Erick Erickson wrote:
>
> > Although I'm a bit puzzled by what you're actually getting back.
> > You might try using Luke to look at your index to see what's
> > there.
> >
>
> I've looked through with Luke and it doesn't look like much has changed
> between using NumberTools and not. NumberTools definitely does some padding
> which makes sense, however even though I'm using that, Lucene or Luke seems
> to be boiling it down to just the number. I'm not sure which.
>
> > See the NumberTools class for some help here.......
> >
> > BTW, at least in Lucene 2.1, the preferred way to go about this
> > would be ConstantScoreRangeQuery...
> >
>
> Taking your advice I'm now indexing using:
>
> document.add( new Field(RateUtils.SF_FILTERED_CNT,
> NumberTools.longToString( filteredCount ), Field.Store.YES,
> Field.Index.UN_TOKENIZED) );
>
> and searching using:
>
> int minRates = Long.valueOf( minRatesString ).intValue();
> luceneQuery.add( new ConstantScoreRangeQuery( RateUtils.SF_FILTERED_CNT,
> NumberTools.longToString(minRates), "", true, false ),
> BooleanClause.Occur.MUST );
>
> I get very odd results back now, but they seem to work similarly. The
> documentation for ConstantScoreRangeQuery is rather thin however I did find
> this example which suggests I'm doing the right thing:
>
>
> http://github.com/we4tech/semantic-repository/tree/master/development/idea-repository-core/src/main/java/com/ideabase/repository/core/index/ExtendedQueryParser.java
>
> The code _looks_ like it should work, it makes sense logically but it
> still doesn't do what I'm expecting.
>
> I've tried changing the indexing over to Field.Index.NO_NORMS and it makes
> the field value "0000000000000b" instead of "11", and "00000000000002"
> instead of "2" ... but that meant that the searching didn't pick up on that
> field _at all_.
>
> Surely "find me results where numeric field x is higher than y" can't be
> an uncommon request? I can think of many areas where you want to do that
> (age filtering for example).
>
> Any other suggestions of what I should be looking for, or where I can look
> to find out the next step to take?
>
>
> --
> Dan Hardiker