Hi,
I am having problems accessing float values in a Lucene 5.0 index via a
FunctionQuery.
My setup is as follows:
Indexing time
------------------
Document doc = new Document();
FieldType f = new FieldType();
f.setStored(false);
f.setNumericType(NumericType.FLOAT);
f.setDocValuesType(DocValuesType.NUMERIC);
f.setNumericPrecisionStep(4);
f.setIndexOptions(IndexOptions.DOCS);
for (Entry<Integer, Float> component : vector.entrySet()) {
    String w = component.getKey().toString();
    Float score = component.getValue();
    doc.add(new FloatField(w, score, f));
}
writer.addDocument(doc);
At the end of indexing I do:
writer.forceMerge(1);
writer.close();
Search Time
------------------
for (Entry<Integer, Float> vector : vectors.entrySet()) {
    String w = vector.getKey().toString();
    Float score = vector.getValue();
    // Match every doc whose field w holds a value in [0, 100]
    Query tq = NumericRangeQuery.newFloatRange(w, 0.0f, 100.0f, true, true);
    // Expose the value of field w to the custom scorer as valSrcScore
    FunctionQuery fq = new FunctionQuery(new FloatFieldSource(w));
    CustomScoreQuery customQ = new My_CustomScorerQuery(tq, fq, score);
    TopDocs topDocs = indexSearcher.search(customQ, 10000);
}
where My_CustomScorerQuery is defined as follows:
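import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.queries.CustomScoreProvider;
import org.apache.lucene.queries.CustomScoreQuery;
import org.apache.lucene.queries.function.FunctionQuery;
import org.apache.lucene.search.Query;

public class My_CustomScorerQuery extends CustomScoreQuery {

    // Weight of the current field in the query vector
    private final Float mainQueryScore;

    public My_CustomScorerQuery(Query mainQuery, FunctionQuery valSrcQuery,
                                Float mainQueryScore) {
        super(mainQuery, valSrcQuery);
        this.mainQueryScore = mainQueryScore;
    }

    @Override
    public CustomScoreProvider getCustomScoreProvider(LeafReaderContext r) {
        return new My_CustomScorer(r);
    }

    private class My_CustomScorer extends CustomScoreProvider {

        public My_CustomScorer(LeafReaderContext context) {
            super(context);
        }

        // Called once per hit; valSrcScore is the value produced by the FunctionQuery
        @Override
        public float customScore(int doc, float subQueryScore, float valSrcScore) {
            System.out.println("\thit lucene docID: " + doc +
                    "\n\tquery score: " + mainQueryScore +
                    "\n\tsubQueryScore: " + subQueryScore +
                    "\n\tvalSrcScore: " + valSrcScore);
            return mainQueryScore * valSrcScore;
        }
    }
}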
The problem I am seeing is that `valSrcScore` is always 0, and it sometimes
disappears altogether if I raise setNumericPrecisionStep above 4. I am
indexing the following two docs:
Map<Integer, Float> doc1 = new LinkedHashMap<Integer, Float>();
doc1.put(12,0.5f);
doc1.put(18,0.4f);
doc1.put(10,0.1f);
indexer.indexVector("doc1", doc1);
Map<Integer, Float> doc2 = new LinkedHashMap<Integer, Float>();
doc2.put(10,0.9f);
doc2.put(1,0.8f);
doc2.put(9,0.2f);
doc2.put(2,0.1f);
indexer.indexVector("doc2", doc2);
and testing with the following query:
Map<Integer, Float> query = new LinkedHashMap<Integer, Float>();
query.put(10,0.8f);
query.put(9,0.6f);
query.put(2,0.01f);
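To make the intent concrete, here is a minimal plain-Java sketch (no Lucene involved) of the per-field product I expect each hit to receive, using the doc and query maps above. The class and method names (ExpectedScores, expectedFieldScore) are made up for illustration and are not part of my indexing code; treating a field that is missing from a doc as contributing 0 is also an assumption on my part.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ExpectedScores {

    // Product of the query weight and the document weight for one field,
    // or 0 when either side does not contain that field (assumption).
    public static float expectedFieldScore(Map<Integer, Float> query,
                                           Map<Integer, Float> doc,
                                           int field) {
        Float queryWeight = query.get(field);
        Float docWeight = doc.get(field);
        if (queryWeight == null || docWeight == null) {
            return 0.0f;
        }
        return queryWeight * docWeight;
    }

    public static void main(String[] args) {
        Map<Integer, Float> doc1 = new LinkedHashMap<Integer, Float>();
        doc1.put(12, 0.5f);
        doc1.put(18, 0.4f);
        doc1.put(10, 0.1f);

        Map<Integer, Float> doc2 = new LinkedHashMap<Integer, Float>();
        doc2.put(10, 0.9f);
        doc2.put(1, 0.8f);
        doc2.put(9, 0.2f);
        doc2.put(2, 0.1f);

        Map<Integer, Float> query = new LinkedHashMap<Integer, Float>();
        query.put(10, 0.8f);
        query.put(9, 0.6f);
        query.put(2, 0.01f);

        // Print the expected per-field score for both docs
        for (int field : query.keySet()) {
            System.out.println("field " + field
                    + ": doc1=" + expectedFieldScore(query, doc1, field)
                    + ", doc2=" + expectedFieldScore(query, doc2, field));
        }
    }
}
```

This is only the arithmetic I want customScore() to reproduce for each field; it says nothing about how Lucene stores or retrieves the values.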
So field `10` in the query should produce the following scores for the two
documents in the index (doc0 and doc1 are the Lucene docIDs of doc1 and doc2
above):
score(query,doc0) = 0.8*0.1
score(query,doc1) = 0.8*0.9
but I only see
score(query,doc0) = 0.8*0.0
score(query,doc1) = 0.8*0.0
i.e. FloatFieldSource is always returning 0. If I subclass
"FloatFieldSource" then accessing
NumericDocValues arr = DocValues.getNumeric(readerContext.reader(), field);
tells me "NumericDocValues of doc0: 0", which _seems_ to suggest that the
index does not contain the doc values? I can see the docs fine in Luke. One
subtle nuance, related to the way I am indexing the fields: not every field
is present in every doc.
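One thing I have not been able to rule out is how the float is encoded on its way into the doc values. The sketch below is plain Java, not Lucene code, and it rests on two assumptions that I have not confirmed against the 5.0 source: that a numeric doc value for a FloatField is stored as numericValue().longValue(), and that FloatFieldSource decodes the stored long with Float.intBitsToFloat. The class and method names are made up for illustration.

```java
public class FloatDocValueEncoding {

    // Assumption: the field's Number is narrowed with longValue() before it
    // is written, which drops the fractional part of the float.
    public static long storeViaLongValue(float value) {
        return Float.valueOf(value).longValue();
    }

    // Alternative: keep the raw IEEE 754 bit pattern of the float.
    public static long storeViaRawBits(float value) {
        return Float.floatToRawIntBits(value);
    }

    // Assumption: the reader reinterprets the low 32 bits of the stored long
    // as a float.
    public static float decode(long stored) {
        return Float.intBitsToFloat((int) stored);
    }

    public static void main(String[] args) {
        float[] weights = {0.1f, 0.9f};
        for (float w : weights) {
            System.out.println(w + " via longValue -> " + decode(storeViaLongValue(w)));
            System.out.println(w + " via raw bits  -> " + decode(storeViaRawBits(w)));
        }
    }
}
```

If both assumptions hold, every weight below 1.0 would be truncated to 0 before it is written, which would match what I am seeing. Storing the raw bits instead (which I believe is what FloatDocValuesField does) would round-trip the value.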
Any pointers would be much appreciated.
Peyman