seek() seems somewhat doable, since the underlying TermPositions
supports seek, although it would be inefficient: I think that would
really only allow us to go back to the beginning. Besides, Spans is an
interface, so adding a method to it would break back compat, ugh!
The Collector route seems more promising, and since that API isn't
fixed yet, it might be more doable. It could either be done right on
Collector, or we could introduce something like SpanCollector, but that
would imply re-implementing many of the existing collectors. Not sure
what's involved just yet.
-Grant
On Aug 6, 2009, at 1:50 PM, Grant Ingersoll wrote:
I think it is fairly common use case (relative to the rather
uncommon use case of using SpanQuery that is) to want to do
something like:
...
SpanQuery sq = ...
topDocs = searcher.search(sq, 10);
Spans spans = sq.getSpans(searcher.getIndexReader());
for (int i = 0; i < topDocs.scoreDocs.length; i++) {
  // NOTE: seek() does not exist as a method, only skipTo(), and
  // skipTo() can only go forward, so this CODE DOESN'T WORK!
  spans.seek(topDocs.scoreDocs[i]);
  // Do something with the info at that span
}
Yet this really isn't possible, because Spans.skipTo() only moves
forward. So you are left trying to marry running the search with
moving around in the Spans, or using some other rather clunky
mechanism, and this code is almost always really ugly. Alternatively,
people forgo the search() part and just go straight to the spans, but
then you miss out on scores.
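One workaround that stays within forward-only iteration is to re-sort the top hits by doc id and walk the Spans once. The following is only a minimal sketch of that idea, not Lucene code: ForwardSpans, ArraySpans and Hit are hypothetical stand-ins for Spans and ScoreDoc so that the example is self-contained, and skipTo() here is assumed to always advance past the current match before checking the target.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ForwardOnlySpanWalk {

    /** Hypothetical stand-in for the forward-only part of Spans. */
    interface ForwardSpans {
        boolean next();
        boolean skipTo(int target);
        int doc();
        int start();
        int end();
    }

    /** Hypothetical stand-in for ScoreDoc: a doc id plus its score. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    /** Toy in-memory spans over {doc, start, end} rows sorted by doc. */
    static final class ArraySpans implements ForwardSpans {
        private final int[][] rows;
        private int pos = -1;
        ArraySpans(int[][] rows) { this.rows = rows; }
        public boolean next() { return ++pos < rows.length; }
        public boolean skipTo(int target) {
            // Assumed semantics: move at least one match forward,
            // then keep going until doc() >= target.
            do {
                if (!next()) return false;
            } while (doc() < target);
            return true;
        }
        public int doc() { return rows[pos][0]; }
        public int start() { return rows[pos][1]; }
        public int end() { return rows[pos][2]; }
    }

    /** Visits the spans of each hit, in doc id order, in one forward pass. */
    static List<String> collect(ForwardSpans spans, Hit[] hits) {
        Hit[] byDoc = hits.clone(); // leave the score-ordered array untouched
        Arrays.sort(byDoc, Comparator.comparingInt((Hit h) -> h.doc));
        List<String> out = new ArrayList<>();
        boolean more = spans.next();
        for (Hit hit : byDoc) {
            // Only skip when we are behind the hit, since skipTo() always advances.
            if (more && spans.doc() < hit.doc) {
                more = spans.skipTo(hit.doc);
            }
            while (more && spans.doc() == hit.doc) {
                out.add("doc=" + hit.doc + " score=" + hit.score
                        + " span=" + spans.start() + "-" + spans.end());
                more = spans.next();
            }
            if (!more) break; // spans exhausted, nothing left to match
        }
        return out;
    }
}
```

The results come back in doc id order rather than score order, so anything that needs rank order would have to map them back afterwards, which is part of why this still feels clunky.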
It just has never felt right to me, but I am not seeing a better way
of doing it at the moment, so I thought I would throw it out to the
list to see what people think. That is, how can we generate a Spans
object that is backed by the order in a ScoreDoc array? The thing
is, in order to run the SpanQuery, we iterated over the Spans
anyway. I think what I would really like, for the case where I am
running a SpanQuery, is to be able to tell it to preserve the Span
info by hanging it off of something (maybe the Collector could have a
callback that allows me to collect Span info; not sure if that
makes sense). I realize this would be extra memory, but that is
probably a cost I'm willing to pay. Alternatively, we need to add a
seek() method to Spans and pay the cost of thrashing.
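To make the callback idea a little more concrete, here is a rough sketch of what "collect Span info during the pass we already make" could look like. Everything in it is hypothetical: SpanCollector, collectSpan(), SimpleSpans and scan() are invented names for illustration and are not part of Lucene's Collector API, and the per-doc span count merely stands in for whatever the real scorer computes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SpanCallbackSketch {

    /** Hypothetical stand-in for forward iteration over Spans. */
    interface SimpleSpans {
        boolean next();
        int doc();
        int start();
        int end();
    }

    /** Hypothetical callback invoked once per span during the scoring pass. */
    interface SpanCollector {
        void collectSpan(int doc, int start, int end);
    }

    /** Keeps every span in memory, keyed by doc id (the extra memory cost). */
    static final class RecordingSpanCollector implements SpanCollector {
        final Map<Integer, List<int[]>> spansByDoc = new LinkedHashMap<>();
        public void collectSpan(int doc, int start, int end) {
            spansByDoc.computeIfAbsent(doc, d -> new ArrayList<>())
                      .add(new int[] {start, end});
        }
    }

    /** Toy in-memory spans over {doc, start, end} rows sorted by doc. */
    static final class ArraySpans implements SimpleSpans {
        private final int[][] rows;
        private int pos = -1;
        ArraySpans(int[][] rows) { this.rows = rows; }
        public boolean next() { return ++pos < rows.length; }
        public int doc() { return rows[pos][0]; }
        public int start() { return rows[pos][1]; }
        public int end() { return rows[pos][2]; }
    }

    /**
     * Stand-in for the single pass a scorer makes over the spans: it counts
     * spans per doc (a toy "freq") and reports each span to the callback.
     */
    static Map<Integer, Integer> scan(SimpleSpans spans, SpanCollector collector) {
        Map<Integer, Integer> freq = new LinkedHashMap<>();
        while (spans.next()) {
            freq.merge(spans.doc(), 1, Integer::sum);
            collector.collectSpan(spans.doc(), spans.start(), spans.end());
        }
        return freq;
    }
}
```

After the pass, the span positions for any top hit can be looked up by doc id in any order, so no seek() is needed. The price is holding spans for every matching doc, not just the top N, unless the collector prunes as it goes.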
Thoughts? Am I off base here or missing something?
-Grant
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org