Good points. Like I said, I will look more into caching in the Near
Spans. I need to profile them anyway, as I am hoping there is
some speedup to be had there.
-Grant
On Nov 29, 2007, at 6:23 PM, Michael Busch wrote:
Grant Ingersoll wrote:
As for the cost of the seeks, why can't w
Michael Busch wrote:
> once. For convenience, users could also create a very simple
> TermPositions decorator that caches the most recently loaded payload and
> allows calling getPayload() more than once.
Something like this should do the trick (I stole resizeBuffer() from
Token). It's untested cod
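
A minimal sketch of such a caching decorator follows. It is an
illustration only, not the code from this message. It assumes a
hypothetical PayloadSource interface standing in for the payload-related
methods of TermPositions (nextPosition(), getPayloadLength(),
getPayload(byte[], int)); the class name CachingPayloadSource is also
made up for this example.

```java
import java.io.IOException;

// Hypothetical stand-in for the payload-related part of TermPositions.
interface PayloadSource {
    int nextPosition() throws IOException;
    int getPayloadLength();
    // Copies the payload into data at offset; allocates and returns a new
    // array if data is null or too small. Assumed to be callable only once
    // per position on the undecorated source.
    byte[] getPayload(byte[] data, int offset) throws IOException;
}

// Decorator that loads the payload from the wrapped source at most once
// per position and serves repeated getPayload() calls from a buffer.
final class CachingPayloadSource implements PayloadSource {
    private final PayloadSource in;
    private byte[] cache = new byte[16];
    private int cachedLength;
    private boolean loaded;

    CachingPayloadSource(PayloadSource in) {
        this.in = in;
    }

    public int nextPosition() throws IOException {
        loaded = false; // the cached payload belongs to the previous position
        return in.nextPosition();
    }

    public int getPayloadLength() {
        return in.getPayloadLength();
    }

    public byte[] getPayload(byte[] data, int offset) throws IOException {
        if (!loaded) {
            cachedLength = in.getPayloadLength();
            if (cache.length < cachedLength) {
                cache = new byte[cachedLength]; // grow the buffer on demand
            }
            // Single read from the underlying posting stream.
            cache = in.getPayload(cache, 0);
            loaded = true;
        }
        byte[] target = data;
        if (data == null || data.length - offset < cachedLength) {
            target = new byte[cachedLength];
            offset = 0;
        }
        System.arraycopy(cache, 0, target, offset, cachedLength);
        return target;
    }
}
```

The copy on every call costs a little, but it keeps the underlying
getPayload() free of any seek-back logic, so callers that read the
payload only once pay nothing extra.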
Grant Ingersoll wrote:
>
> As for the cost of the seeks, why can't we just document that this is
> what is going on and discourage people from doing it?
I'm just trying to keep SegmentTermPositions#getPayload() as efficient
as possible because it's often used in the innermost loops of scorers
The use case I have is for LUCENE-1001, so the caching is going to
happen somewhere in Lucene, not necessarily the application. I think
caching it in SegmentTermPositions is the simplest, but I will have to look at
the alternatives. It is particularly problematic in the Near Spans
case (ordered an
I designed the API with this limitation intentionally to prevent users
from thinking that they can call TermPositions.getPayload() more than
once at no cost.
If we allow it to be called more than once, then we have to seek back in
the posting stream. Even if this is just a seek in the underlyin