Hi Michael: After removing the isDeleted() check, the index loads in 430 ms. Thanks
-john

On 10/21/07, Michael Busch <[EMAIL PROTECTED]> wrote:
> John Wang wrote:
> >
> > Since all three methods load docids into an int[], the lookup time is
> > the same for all three methods; what's different are the load times:
> >
> > 1) 16.5 seconds, 43 MB
> > 2) 590 milliseconds, 32.5 MB
> > 3) 186 milliseconds, 26 MB
>
> Good analysis! Thanks for sharing the results...
>
> > I think the payload method is good enough so we don't need to diverge
> > from the lucene code base.
>
> Actually, I noticed that in my program in getCachedIDs() you can remove
> the check
>     if (!reader.isDeleted(tp.doc())) {
>
> This should improve the performance further (not sure how much though),
> because the synchronized isDeleted() call is quite expensive and not
> necessary.
>
> If you want to reduce the index size, you might want to try to encode
> the Integers more efficiently, e.g. as VInts (depending on the values
> of your UIDs).
>
> > However, I feel that being able to customize the indexing process and
> > store our own file is still more efficient both in load time and index
> > size.
>
> Yes, the current payload implementation is not optimized for this use
> case; it can be improved with a per-doc approach like the one I suggested.
>
> -Michael
>
> > Thanks
> >
> > -John
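
On the VInt suggestion quoted above: a rough sketch of what variable-length encoding of the UIDs could look like, in plain Java with no Lucene dependency. The class and method names (VIntSketch, encode, decode) are made up for illustration and are not Lucene API; the byte layout follows the usual VInt scheme of 7 data bits per byte, low-order bits first, with the high bit marking that another byte follows. It assumes non-negative UIDs; negative values would still round-trip but take 5 bytes each.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper, not part of Lucene: packs ints as VInts.
final class VIntSketch {

    // Write one int using 7 data bits per byte, low-order bits first.
    // The high bit of a byte is set when another byte follows.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encode an array of UIDs into one contiguous byte[].
    static byte[] encode(int[] uids) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int uid : uids) {
            writeVInt(out, uid);
        }
        return out.toByteArray();
    }

    // Decode `count` VInts from the start of `data`.
    static int[] decode(byte[] data, int count) {
        int[] uids = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            byte b = data[pos++];
            int value = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = data[pos++];
                value |= (b & 0x7F) << shift;
            }
            uids[i] = value;
        }
        return uids;
    }

    public static void main(String[] args) {
        int[] uids = {5, 300, 70000, 123456789};
        byte[] packed = encode(uids);
        System.out.println(packed.length + " bytes packed vs "
                + 4 * uids.length + " bytes as fixed-width ints");
    }
}
```

Whether this actually shrinks the index depends on the UID values, as Michael notes: small values take 1 or 2 bytes, but anything at or above 2^28 takes 5 bytes, which is worse than a fixed 4-byte int.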