Re: Payload Loading and Reloading

2007-11-29 Thread Grant Ingersoll
Good points. Like I said, I will look more into caching in the Near Spans. I need to profile them some anyway, as I am hoping there is some speedup to be had there. -Grant On Nov 29, 2007, at 6:23 PM, Michael Busch wrote: Grant Ingersoll wrote: As for the cost of the seeks, why can't we ...

Re: Payload Loading and Reloading

2007-11-29 Thread Michael Busch
Michael Busch wrote: > ... once. For convenience, a user could also create a very simple > TermPositions decorator that caches the most recently loaded payload and > allows calling getPayload() more than once. Something like this should do the trick (I stole resizeBuffer() from Token). It's untested code ...
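
(The code from Michael's message is cut off in this archive. A rough sketch of the decorator he describes, written against the Lucene 2.x TermPositions API, might look like the following; CachingTermPositions and its resizeBuffer() helper are illustrative names, not the code he actually posted.)

  import java.io.IOException;

  import org.apache.lucene.index.Term;
  import org.apache.lucene.index.TermEnum;
  import org.apache.lucene.index.TermPositions;

  // Illustrative decorator: caches the payload of the current position so that
  // getPayload() can be called more than once without re-reading the stream.
  public class CachingTermPositions implements TermPositions {
    private final TermPositions in;       // the decorated TermPositions
    private byte[] buffer = new byte[1];  // reused payload buffer
    private int length = -1;              // cached payload length; -1 means "not loaded yet"

    public CachingTermPositions(TermPositions in) {
      this.in = in;
    }

    public int nextPosition() throws IOException {
      length = -1;                        // invalidate the cache for the new position
      return in.nextPosition();
    }

    public byte[] getPayload(byte[] data, int offset) throws IOException {
      if (length == -1) {                 // first call: read from the underlying stream
        length = in.getPayloadLength();
        buffer = resizeBuffer(buffer, length);
        buffer = in.getPayload(buffer, 0);
      }
      if (data == null || data.length - offset < length) {
        return buffer;                    // caller's buffer too small: hand out the cache
      }
      System.arraycopy(buffer, 0, data, offset, length);
      return data;
    }

    public int getPayloadLength() {
      return length == -1 ? in.getPayloadLength() : length;
    }

    public boolean isPayloadAvailable() {
      return in.isPayloadAvailable();
    }

    // TermDocs methods delegate; anything that moves the enum invalidates the cache
    public boolean next() throws IOException { length = -1; return in.next(); }
    public boolean skipTo(int target) throws IOException { length = -1; return in.skipTo(target); }
    public void seek(Term term) throws IOException { length = -1; in.seek(term); }
    public void seek(TermEnum termEnum) throws IOException { length = -1; in.seek(termEnum); }
    public int doc() { return in.doc(); }
    public int freq() { return in.freq(); }
    public int read(int[] docs, int[] freqs) throws IOException { length = -1; return in.read(docs, freqs); }
    public void close() throws IOException { in.close(); }

    // grow-on-demand helper, in the spirit of the resizeBuffer() mentioned above
    private static byte[] resizeBuffer(byte[] buffer, int newSize) {
      if (buffer.length >= newSize) {
        return buffer;
      }
      byte[] newBuffer = new byte[newSize];
      System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
      return newBuffer;
    }
  }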

Re: Payload Loading and Reloading

2007-11-29 Thread Michael Busch
Grant Ingersoll wrote: > > As for the cost of the seeks, why can't we just document that this is > what is going on and discourage people from doing it? I'm just trying to keep SegmentTermPositions#getPayload() as efficient as possible because it's often used in the innermost loops of scorers ...
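
(For context, the kind of inner loop being referred to reads each position's payload at most once while scoring. A hypothetical sketch follows; the index path, field name, and term are made up for illustration.)

  import org.apache.lucene.index.IndexReader;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.index.TermPositions;

  // Hypothetical scoring-style loop over one term's postings. The payload is
  // read exactly once per position, which is why the per-call cost of
  // getPayload() matters here.
  public class PayloadLoopSketch {
    public static void main(String[] args) throws Exception {
      IndexReader reader = IndexReader.open(args[0]);       // path to an existing index
      TermPositions tp = reader.termPositions(new Term("body", "lucene")); // assumed field/term
      byte[] payload = new byte[8];                         // reused buffer
      try {
        while (tp.next()) {                                 // each matching document
          int freq = tp.freq();
          for (int i = 0; i < freq; i++) {                  // each position within the document
            tp.nextPosition();
            if (tp.isPayloadAvailable()) {
              int len = tp.getPayloadLength();
              if (payload.length < len) {
                payload = new byte[len];
              }
              payload = tp.getPayload(payload, 0);          // exactly one call per position
              // ... fold payload[0..len) into the score here ...
            }
          }
        }
      } finally {
        tp.close();
        reader.close();
      }
    }
  }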

Re: Payload Loading and Reloading

2007-11-29 Thread Grant Ingersoll
The use case I have is for LUCENE-1001, so the caching is going to happen somewhere in Lucene, not necessarily in the application. I think caching it in SegmentTermPositions is the simplest, but I will have to look at the alternatives. It is particularly problematic in the Near Spans case (ordered and ...

Re: Payload Loading and Reloading

2007-11-29 Thread Michael Busch
I designed the API with this limitation intentionally, to prevent users from thinking that they can call TermPositions.getPayload() more than once at no cost. If we allow calling it more than once, then we have to seek back in the posting stream. Even if this is just a seek in the underlying ...
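
(In other words, the contract is "at most one getPayload() call per nextPosition()". A minimal sketch, assuming tp is a TermPositions already positioned on a document as in the loop sketch above:)

  tp.nextPosition();
  int len = tp.getPayloadLength();
  byte[] payload = tp.getPayload(new byte[len], 0);  // OK: first and only call for this position
  // tp.getPayload(new byte[len], 0);                // not supported: the bytes were already
  //                                                 // consumed from the posting stream, so a
  //                                                 // second call would force a seek back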