Hi All,

I finally have some time to upgrade our Lucene from the 2.4 series to the 2.9 series. Everything compiles fine, but at runtime I now get an UnsupportedOperationException that did not occur before.


java.lang.UnsupportedOperationException
        at org.apache.lucene.index.AbstractAllTermDocs.seek(AbstractAllTermDocs.java:42)
        at org.apache.lucene.index.DirectoryReader$MultiTermDocs.termDocs(DirectoryReader.java:1186)
        at org.apache.lucene.index.DirectoryReader$MultiTermDocs.next(DirectoryReader.java:1118)
        at org.expasy.core.index.SubQueryFilter.fastForLargeResultSets(SubQueryFilter.java:129)

I have copied the calling code below. An explanation of what it tries to achieve follows underneath it.

private void fastForLargeResultSets(String foreignField, BitSet bits, TermDocs docs, TermDocs foreignDocs, IndexReader foreignReader, BitSet queryResults)
        throws IOException
{
        int start = queryResults.nextSetBit(0);
        TermEnum foreignEnum = foreignReader.terms(new Term(foreignField, ""));
        while (foreignEnum.next())
        {
                Term term = foreignEnum.term();
                if (term == null || !term.field().equals(foreignField))
                        break;
                if (!term.text().equals("not_null"))
                {
                        foreignDocs.skipTo(start);
                        foreignDocs.seek(term);
                        // The next() call below is the source of the exception in my code (SubQueryFilter.java:129)
                        while (foreignDocs.next())
                        {
                                int doc = foreignDocs.doc();
                                if (queryResults.get(doc))
                                {
                                        foreignDocs.skipTo(doc);
                                        if (term != null && term.text() != null)
                                                buffer.add(term.text());
                                }
                                // Use a buffer to avoid jumping around on disk too much.
                                if (buffer.size() >= BUFFERSIZE)
                                {
                                        emptyBuffer(buffer, bits, docs);
                                }
                        }
                }
        }

        if (!buffer.isEmpty())
        {
                emptyBuffer(buffer, bits, docs);
        }
}

The purpose of this code is to fill a bitset that is used as a filter. The filter finds the documents in index a that are referenced by a linking key value in index b.

While resource intensive, this code path was quite fast when there are millions of documents in index b pointing to millions of documents in index a.

In other words, it creates a "join" between two queries on different indexes.

For a live example, see
http://www.uniprot.org/uniprot/?query=citation%3A%28author%3Afink%29
This is a search for "fink" in the field "author" of the "citation" index.
For each document in the "citation" index that matches the term "fink" in the field "author", the code retrieves the terms that hold a uniquely identifying key value for documents in the "uniprot" index. From these it generates a bitset that is used to filter the documents in the "uniprot" index (this is done in the emptyBuffer method).
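
To make the intent concrete, here is a rough, Lucene-free sketch of the join described above. It is not our actual code: the class and method names (JoinFilterSketch, buildFilter, flush) and the keys P1..P3 are made up for illustration, and plain in-memory maps stand in for the two indexes. It also assumes one "uniprot" document per key and records each key only once, which is simpler than the real method.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for the join filter; no Lucene classes are used.
class JoinFilterSketch {

    // foreignPostings: key term -> ids of "citation" documents containing that key
    //                  (sorted by term, similar to walking a TermEnum).
    // queryResults:    "citation" documents that matched the query.
    // localDocByKey:   key -> id of the "uniprot" document with that key (assumed unique).
    static BitSet buildFilter(SortedMap<String, int[]> foreignPostings,
                              BitSet queryResults,
                              Map<String, Integer> localDocByKey,
                              int bufferSize) {
        BitSet bits = new BitSet();
        List<String> buffer = new ArrayList<String>();
        for (Map.Entry<String, int[]> posting : foreignPostings.entrySet()) {
            for (int doc : posting.getValue()) {
                if (queryResults.get(doc)) {
                    buffer.add(posting.getKey());
                    break; // simplification: one matching document is enough for this key
                }
            }
            // Resolve keys in batches rather than one lookup per key.
            if (buffer.size() >= bufferSize) {
                flush(buffer, bits, localDocByKey);
            }
        }
        if (!buffer.isEmpty()) {
            flush(buffer, bits, localDocByKey);
        }
        return bits;
    }

    // Plays the role of emptyBuffer: turn buffered keys into set bits.
    static void flush(List<String> buffer, BitSet bits, Map<String, Integer> localDocByKey) {
        for (String key : buffer) {
            Integer doc = localDocByKey.get(key);
            if (doc != null) {
                bits.set(doc);
            }
        }
        buffer.clear();
    }

    public static void main(String[] args) {
        SortedMap<String, int[]> foreignPostings = new TreeMap<String, int[]>();
        foreignPostings.put("P1", new int[] {0});
        foreignPostings.put("P2", new int[] {0, 2});
        foreignPostings.put("P3", new int[] {1});

        BitSet queryResults = new BitSet(); // citations 0 and 2 match author:fink
        queryResults.set(0);
        queryResults.set(2);

        Map<String, Integer> localDocByKey = new HashMap<String, Integer>();
        localDocByKey.put("P1", 0);
        localDocByKey.put("P2", 1);
        localDocByKey.put("P3", 2);

        BitSet filter = buildFilter(foreignPostings, queryResults, localDocByKey, 2);
        System.out.println(filter); // prints {0, 1}
    }
}
```

In the real method the inner loop is a TermDocs walk over the foreign index and the flush step seeks in the local index, which is why the keys are buffered first.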

Is this a bug? And does anyone have ideas for an effective (maybe superior) workaround?

Regards and thanks for a great project!

Jerven
