Thanks Doron, that's definitely completely backwards!!
Good thing the API is gone.
Mike McCandless
http://blog.mikemccandless.com
On Thu, Jul 18, 2013 at 7:50 AM, Doron Cohen cdor...@gmail.com wrote:
Hi, just an FYI - may be helpful for anyone obliged to use 4.0.0 or 4.1.0 -
it seems that this method is actually doing the opposite of its intention.
I did not find mentions of this in the lists or elsewhere.
This is the code for o.a.l.search.FieldCacheImpl.DocTermsImpl.exists(int):
public boolean exists(int docID) {
return docToOffset.get(docID) == 0;
}
Its description says: Returns true if this doc has this field and is not
deleted.
But it returns true for docs not containing the field and false for those
that do contain it.
A simple workaround is to not to call this method before calling getTerm()
but rather just rely on getTerm() logic: ... returns the same BytesRef, or
an empty (length=0) BytesRef if the doc did not have this field or was
deleted.
So usage code can be like this:
DocTerms values = FieldCache.DEFAULT.getTerms(reader, FIELD_NAME);
BytesRef term = new BytesRef();
for (int docid=0; docidvalues.size(); docid++) {
term = values.getTerm(docid, term);
if (term.length0) {
doSomethingWith(term.utf8ToString());
}
}
FieldCache.DEFAULT.purge(reader);
I am not sure about the overhead of this comparing to first checking
exists(), but it at least work correctly.
The code for exists() was as above until R1442497
(http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/search/FieldCacheImpl.java?revision=1442497view=markup)
and then in R1443717
(http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/core/src/java/org/apache/lucene/search/FieldCacheImpl.java?r1=1442497r2=1443717diff_format=h)
the API was change as part of LUCENE-4547 (DocValues improvements) which was
included in 4.2.
Simple code to demonstrate this (here with 4.1 but same results with 4.0):
RAMDirectory d = new RAMDirectory();
IndexWriter w = new IndexWriter(d, new IndexWriterConfig(Version.LUCENE_41,
new SimpleAnalyzer(Version.LUCENE_41)));
w.addDocument(new Document()); // Empty doc (0, 0)
Document doc = new Document(); // Real doc (1, 1)
doc.add(new StringField(f1, v1, Store.NO));
w.addDocument(doc);
w.addDocument(new Document()); // Empty doc (2, 2)
w.addDocument(new Document()); // Empty doc (3, 3)
w.commit(); // Commit - so we'll have two atomic readers
doc = new Document(); // RealDoc (0, 4)
doc.add(new StringField(f1, v2, Store.NO));
w.addDocument(doc);
w.addDocument(new Document()); // Empty doc (1, 5)
w.close();
IndexReader r = DirectoryReader.open(d);
BytesRef br = new BytesRef();
for (AtomicReaderContext leaf : r.leaves()) {
System.out.println(--- new atomic reader);
AtomicReader reader = leaf.reader();
DocTerms a = FieldCache.DEFAULT.getTerms(reader, f1);
for (int i = 0; i reader.maxDoc(); ++i) {
int n = leaf.docBase + i;
System.out.println(n+ exists: +a.exists(i));
br = a.getTerm(i, br);
if (br.length 0) {
System.out.println(n + +br.utf8ToString());
}
}
}
The result printing:
--- new atomic reader
0 exists: true
1 exists: false
1 v1
2 exists: true
3 exists: true
--- new atomic reader
4 exists: false
4 v2
5 exists: true
Indeed, exists() results are wrong.
So again, just an FYI, as this API no longer exists...
Regards,
Doron
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org