Re: BitSet Filter ArrayIndexOutOfBoundsException?

Ryan McKinley Wed, 15 Apr 2009 17:35:44 -0700

uggg.  So there is no longer a consistent docId I can use in a filter?

I have a quite expensive operation that I am hoping to run only once each time the index changes. Is the

How would I get all the doc ids with a given (stored) field from a Reader? I am trying:


  TermDocs td = reader.termDocs();
  while( td.next() ) {
    int id = td.doc();
    Document doc = searcher.doc( id, selector );
    ...

but the TermDocs returned by termDocs() is always empty (the index is not empty).

Thanks
ryan




On Apr 15, 2009, at 7:41 PM, Uwe Schindler wrote:

Use the index reader given to getDocIdSet. The ids are only valid for
that index reader. This is new in Lucene 2.9: filters are executed
against each segment of an index separately, so the docids of the
MultiReader/DirectoryIndexReader are different to the local ones.
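
To make the two docid spaces concrete, here is a minimal plain-Java sketch of the offset arithmetic. It assumes the top-level reader numbers documents by concatenating its segments in order, so each segment starts at an offset equal to the combined size of the segments before it. SegmentDocIdMapper and its methods are hypothetical names for illustration, not Lucene classes, and the segment sizes are passed in by hand.

```java
import java.util.Arrays;

/** Hypothetical helper (not a Lucene class): converts between top-level and per-segment doc IDs. */
final class SegmentDocIdMapper {
    // starts[i] is the first top-level doc ID of segment i; the last entry is the total doc count.
    private final int[] starts;

    SegmentDocIdMapper(int[] segmentMaxDocs) {
        starts = new int[segmentMaxDocs.length + 1];
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            if (segmentMaxDocs[i] <= 0) {
                // Assumption of this sketch: every segment holds at least one document.
                throw new IllegalArgumentException("segment " + i + " must not be empty");
            }
            starts[i + 1] = starts[i] + segmentMaxDocs[i];
        }
    }

    /** Index of the segment that contains the given top-level doc ID. */
    int segmentOf(int globalDocId) {
        if (globalDocId < 0 || globalDocId >= starts[starts.length - 1]) {
            throw new IndexOutOfBoundsException("doc ID out of range: " + globalDocId);
        }
        // Search only the segment start offsets, not the trailing total.
        int pos = Arrays.binarySearch(starts, 0, starts.length - 1, globalDocId);
        // Exact hit: the ID is the first doc of that segment. Otherwise binarySearch
        // returns -(insertionPoint) - 1, and the owning segment is insertionPoint - 1.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Doc ID relative to the start of the owning segment. */
    int toLocal(int globalDocId) {
        return globalDocId - starts[segmentOf(globalDocId)];
    }

    /** Top-level doc ID for a doc ID that is local to the given segment. */
    int toGlobal(int segment, int localDocId) {
        if (segment < 0 || segment >= starts.length - 1) {
            throw new IndexOutOfBoundsException("no such segment: " + segment);
        }
        if (localDocId < 0 || localDocId >= starts[segment + 1] - starts[segment]) {
            throw new IndexOutOfBoundsException("local doc ID out of range: " + localDocId);
        }
        return starts[segment] + localDocId;
    }
}
```

With segment sizes 60, 10 and 30, top-level id 67 falls in segment 1 at local id 7. Using 67 directly against a structure sized for that 10-document segment would be out of range, which is one plausible reading of the exception reported in this thread.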

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: Ryan McKinley [mailto:ryan...@gmail.com]
Sent: Thursday, April 16, 2009 1:34 AM
To: java-user@lucene.apache.org
Subject: Re: BitSet Filter ArrayIndexOutOfBoundsException?

Are you saying the same Lucene document could have different ids in the
MultiReader and the IndexReader?

I have assumed that the ids have not changed as long as the
lastmodified time has not changed:
  long lastmodified = IndexReader.lastModified( reader.directory() );
Is this assumption correct?

I get the original ids using:

    SolrIndexSearcher searcher = ...
    DocList docs = searcher.getDocList( new MatchAllDocsQuery(),
        (DocSet)null, null, 0, Integer.MAX_VALUE );

and assume that nothing has changed as long as:
   IndexReader.lastModified( searcher.getReader().directory() );
has not changed.

Am I missing something?

If so, how would I get access to the docId expected by
Filter#getDocIdSet()?

thanks!
ryan

On Apr 15, 2009, at 5:41 PM, Michael McCandless wrote:

Maybe it's because you're using the MultiReader docID space but
getDocIdSet(IndexReader) expects you to use the docID space for that
IndexReader (ie, a single segment)?
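
As an illustration of that mismatch, the sketch below turns matches recorded in the top-level docID space into a bit set in one segment's local space. It is plain Java under stated assumptions: the segment's starting offset (docBase) and size (segmentMaxDoc) are hypothetical inputs that the caller already knows, and how a Filter would obtain them is outside this sketch. SegmentBits and localBits are illustrative names, not part of Lucene or Solr.

```java
import java.util.BitSet;

/** Hypothetical helper (not a Lucene class): builds per-segment bits from top-level doc IDs. */
final class SegmentBits {
    private SegmentBits() {}

    /**
     * Returns a bit set in the segment's local doc ID space.
     *
     * @param globalMatches matching doc IDs in the top-level (MultiReader) space
     * @param docBase       assumed top-level ID of the segment's first document
     * @param segmentMaxDoc assumed number of doc IDs in the segment
     */
    static BitSet localBits(int[] globalMatches, int docBase, int segmentMaxDoc) {
        BitSet bits = new BitSet(segmentMaxDoc);
        for (int globalId : globalMatches) {
            int localId = globalId - docBase;
            // Skip matches that belong to other segments; never set a bit past the segment's end.
            if (localId >= 0 && localId < segmentMaxDoc) {
                bits.set(localId);
            }
        }
        return bits;
    }
}
```

The alternative suggested elsewhere in this thread is to build the structure from the reader handed to getDocIdSet in the first place, so that no translation step is needed.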

Mike

On Wed, Apr 15, 2009 at 1:37 PM, Ryan McKinley <ryan...@gmail.com>
wrote:

I am working on a Filter that uses an RTree to test for inclusion. This
Filter works great *most* of the time -- if the index is optimized, it
works all of the time.  I feel like I am missing something basic, but
not sure what it could be.

Each time the reader opens (and the index has changed), I build an RTree
from stored fields.  The RTree holds the lucene document ID and is later
used in a Filter/Query.  This is how I build the RTree:

FieldSelector selector = new MapFieldSelector( new String[] { "extent" } );
DocIterator iter = docs.iterator();
while( iter.hasNext() ) {
  int id = iter.nextDoc();
  Document doc = searcher.doc( id, selector );
  Fieldable ff = doc.getFieldable( "extent" );
  if( ff != null && !reader.isDeleted( id ) ) {
    ... add the id to the RTree ...
  }
}

In the Filter, I query my RTree and add the results to a BitSet:

public DocIdSet getDocIdSet(IndexReader reader) throws IOException
{
  final BitSet bits = new BitSet();

  // ... query the RTree adding matching ids to the BitSet...
    bits.set( id );

  return new DocIdBitSet( bits );
}

When things go wrong, I get an error like this:

java.lang.ArrayIndexOutOfBoundsException: 67
   at org.apache.lucene.util.OpenBitSet.fastSet(OpenBitSet.java:242)
   at org.apache.solr.search.DocSetHitCollector.collect(DocSetHitCollector.java:63)
   at org.apache.lucene.search.IndexSearcher$MultiReaderCollectorWrapper.collect(IndexSearcher.java:313)
   at org.apache.lucene.search.Scorer.score(Scorer.java:58)
   at org.apache.lucene.search.IndexSearcher.doSearch(IndexSearcher.java:262)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:250)
   at org.apache.lucene.search.Searcher.search(Searcher.java:126)
   at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:691)
   at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:597)
   at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:633)
   at org.apache.solr.search.SolrIndexSearcher.getDocListAndSetNC(SolrIndexSearcher.java:1154)
   at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:924)
   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:345)
   at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:171)

I'm guessing it is referencing a deleted document or something like
that, but I figured the && !reader.isDeleted( id ) clause would take
care of that.

Any pointers would be great!

Thanks
Ryan


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org



