Re: Algorithm of retrieving docs

2014-02-13 Thread Harshvardhan Ojha

Thanks, Michael, for the help; this solved my problem.

Regards,
Harshvardhan Ojha

On Thu, Feb 13, 2014 at 8:51 PM, Michael McCandless <[email protected]> wrote:
> The bloom filter is only used by the postings format wrapper, and
> we've had mixed results on whether it helps performanc

Re: Algorithm of retrieving docs

2014-02-13 Thread Michael McCandless

The bloom filter is only used by the postings format wrapper, and we've had mixed results on whether it helps performance or not (seems to depend heavily on the exact usage). We have bit set / iterator abstractions (oal.util.Bits, oal.search.DocIdSet/Iterator) to manage "sets" of documents, but mo

Re: Algorithm of retrieving docs

2014-02-13 Thread Harshvardhan Ojha

Hi Mike/Mikhail, Don't you think the contains(BytesRef value) method in org.apache.lucene.codecs.bloom.FuzzySet returns the probability of having a field, and that this is a place where we are using hashing? Is there any other place in the source which, when given a document id, could determine by calcu
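To make the question above concrete, here is a minimal sketch of how a Bloom-filter style membership test behaves. It is not Lucene's FuzzySet: the class name TinyBloomFilter, the hash function, and the sizing parameters are all assumptions made for illustration. The point it shows is that such a filter answers "definitely not present" or "maybe present" for a term, rather than a numeric probability, and that it says nothing about document ids.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical illustration class; not part of Lucene.
class TinyBloomFilter {
    // Two-valued answer: a Bloom filter can rule a term out, never rule it in.
    enum Result { MAYBE, NO }

    private final BitSet bits;
    private final int size;
    private final int numHashes;

    TinyBloomFilter(int size, int numHashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.numHashes = numHashes;
    }

    // FNV-1a style hash, perturbed by a seed so each "hash function" differs.
    // This choice is an assumption for the sketch, not what Lucene uses.
    private int hash(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ (seed * 0x9E3779B9);
        for (byte b : data) {
            h ^= (b & 0xFF);
            h *= 16777619;
        }
        return Math.floorMod(h, size);
    }

    // Set one bit per hash function for this term.
    void add(String term) {
        byte[] data = term.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            bits.set(hash(data, i));
        }
    }

    // If any expected bit is clear, the term was never added.
    // If all are set, the term may have been added (false positives are possible).
    Result contains(String term) {
        byte[] data = term.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(hash(data, i))) {
                return Result.NO;
            }
        }
        return Result.MAYBE;
    }
}
```

A term that was added always yields MAYBE. A term that was not added usually yields NO, but can yield MAYBE when its bits collide with those of other terms. That is why a filter like this can only be used to skip work, for example to avoid a term lookup in a segment that cannot contain the term.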

Re: Algorithm of retrieving docs

2014-02-13 Thread Michael McCandless

Lucene only assigns its int docID during indexing. Retrieving a previously stored document is O(1), but it involves a disk seek, which can be very costly when the page is not in the OS's IO cache. Lucene does not do any caching itself (it relies on the OS instead). Have a look at the current def
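The O(1) claim above can be pictured with a small sketch: docIDs are handed out sequentially at indexing time, and a per-document offset table turns a docID directly into a position in the stored data. This is a deliberately simplified, in-memory illustration, not Lucene's actual stored fields format; the class TinyDocStore and its layout are assumptions. In a real index the bytes live in a file, so the final read is where the disk seek (or an OS cache hit) happens.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not Lucene's stored fields implementation.
class TinyDocStore {
    private byte[] data = new byte[64];   // stands in for the on-disk data file
    private int used = 0;                 // number of bytes written so far
    private final List<Integer> startOffsets = new ArrayList<>(); // docID -> start offset

    // Append a document and return its docID, assigned sequentially at "indexing" time.
    int add(String doc) {
        byte[] bytes = doc.getBytes(StandardCharsets.UTF_8);
        if (used + bytes.length > data.length) {
            data = Arrays.copyOf(data, Math.max(data.length * 2, used + bytes.length));
        }
        System.arraycopy(bytes, 0, data, used, bytes.length);
        startOffsets.add(used);
        used += bytes.length;
        return startOffsets.size() - 1;
    }

    // Locate the document by table lookup: no search and no hashing of the docID.
    String get(int docId) {
        int start = startOffsets.get(docId);
        int end = (docId + 1 < startOffsets.size()) ? startOffsets.get(docId + 1) : used;
        // With a file-backed store, this read is the step that may require a disk seek.
        return new String(data, start, end - start, StandardCharsets.UTF_8);
    }
}
```

For example, after `int id = store.add("hello")`, calling `store.get(id)` returns the same text by jumping straight to its offset. The cost of finding the offset does not grow with the number of documents; only the cost of reading the bytes depends on where they are cached.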
