OK, finally, with some pointers from Ryan, I figured out the last problem.
So, as a note to anyone else who might encounter the same problems with
multireader:
A) A directory can contain multiple segments, with a reader for each segment
B) Searches are replayed within each reader in a serial fashion **
C) If you use FieldCache / BitSet or anything else tied to document position
within a reader, and you need the docId:
-- document id = (sum of previous readers' maxDoc values) + bitset position
e.g.
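int offset;      // running sum of maxDoc() over the readers seen so far
int nextOffset;  // doc id base for the reader currently being filtered
public DocIdSet getDocIdSet(IndexReader reader) {
    OpenBitSet bitset = new OpenBitSet(reader.maxDoc());
    offset += reader.maxDoc();
    for (int i = 0; i < reader.maxDoc(); i++) {
        .....
        .... filter stuff ....
        ....
        if ( good ) {
            bitset.set( i );
            // bitset position is local to this reader; add the base to get the doc id
            int docId = i + nextOffset;
            ...........
        }
    }
    // offset is already cumulative, so assign it rather than adding it again
    nextOffset = offset;
    .......
}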
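
For anyone who wants to try the offset arithmetic without a Lucene index, here is a minimal standalone sketch. It is not Lucene code: SegmentOffsetTracker, processSegment and the double[] of per-document distances are hypothetical stand-ins for the filter, getDocIdSet and a segment reader, and java.util.BitSet stands in for OpenBitSet. It assumes segments are visited serially and in order, as in (B) above, and it keeps one distance map for the whole search instead of creating a new one per segment.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-segment filter; not a Lucene class.
class SegmentOffsetTracker {
    // Sum of the sizes (maxDoc) of all segments processed so far.
    private int docBase = 0;

    // Created once, so entries from earlier segments are not lost.
    private final Map<Integer, Double> distances = new HashMap<>();

    // segmentDistances[i] is the distance for local document i of one segment.
    // Documents within maxDistance pass the filter.
    BitSet processSegment(double[] segmentDistances, double maxDistance) {
        int maxDoc = segmentDistances.length;
        BitSet bits = new BitSet(maxDoc);
        for (int i = 0; i < maxDoc; i++) {
            if (segmentDistances[i] <= maxDistance) {
                bits.set(i); // position local to this segment
                // doc id = sum of previous segment sizes + local position
                distances.put(docBase + i, segmentDistances[i]);
            }
        }
        docBase += maxDoc; // base for the next segment
        return bits;
    }

    Map<Integer, Double> distances() {
        return distances;
    }

    public static void main(String[] args) {
        SegmentOffsetTracker tracker = new SegmentOffsetTracker();
        tracker.processSegment(new double[] {1.0, 9.0, 2.0}, 5.0);
        tracker.processSegment(new double[] {8.0, 3.0}, 5.0);
        System.out.println(tracker.distances().keySet());
    }
}
```

With a first segment of three documents and a second of two, local position 1 in the second segment maps to doc id 3 + 1 = 4. If the per-segment calls ever run in parallel, the shared counter and map would both need synchronizing, and the base could no longer be derived from call order.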
K, works, time for sleep
P
On Tue, Apr 28, 2009 at 5:44 PM, patrick o'leary <[email protected]> wrote:
> Think I may have found it, it was multiple runs of the filter, one for each
> segment reader, I was generating a new map to hold distances each time. So
> only the distances from the
> last segment reader were stored.
>
> Currently it looks like those segmented searches are done serially, well in
> solr they are-
> I presume the end goal is to make them multi-threaded ?
> I'll need to make my map synchronized
>
>
> On Tue, Apr 28, 2009 at 4:42 PM, Uwe Schindler <[email protected]> wrote:
>
>> What is the problem exactly? Maybe you use the new Collector API, where
>> the search is done for each segment, so caching does not work correctly?
>>
>>
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: [email protected]
>> ------------------------------
>>
>> *From:* patrick o'leary [mailto:[email protected]]
>> *Sent:* Tuesday, April 28, 2009 10:31 PM
>> *To:* [email protected]
>> *Subject:* ReadOnlyMultiSegmentReader bitset id vs doc id
>>
>>
>>
>> hey
>>
>> I've got a filter that's storing document ids with a geo distance for
>> spatial lucene, using a bitset position for doc id.
>> However, with a MultiSegmentReader that's no longer going to work.
>>
>> What's the most appropriate way to go from bitset position to doc id now?
>>
>> Thanks
>> Patrick
>>
>
>