seek() seems somewhat doable, although inefficient: the underlying TermPositions supports seek, but I think that would really only let us go back to the beginning (besides the fact that Spans is an interface, so adding a method would break back compat, ugh!). The Collector route seems more promising, and since that API isn't fixed yet, it might be more doable. It could either be done right on Collector, or we could introduce something like SpanCollector, but that would imply re-implementing many of the existing collectors. Not sure what's involved just yet.

-Grant


On Aug 6, 2009, at 1:50 PM, Grant Ingersoll wrote:

I think it is a fairly common use case (relative to the rather uncommon use case of using SpanQuery, that is) to want to do something like:

...
SpanQuery sq = ...
TopDocs topDocs = searcher.search(sq, 10);
Spans spans = sq.getSpans(searcher.getIndexReader());

for (int i = 0; i < topDocs.scoreDocs.length; i++) {
    // NOTE: seek() does not exist on Spans; only skipTo() does, and skipTo()
    // can only move forward, so THIS CODE DOESN'T WORK!
    spans.seek(topDocs.scoreDocs[i].doc);
    // Do something with the info at that span
}

Yet this really isn't possible, because Spans.skipTo() only moves forward. So you are left trying to marry running the search with moving around in the Spans, or using some other rather clunky mechanism, and that code is almost always really ugly. Alternatively, people forgo the search() part and just go straight to the Spans, but then you miss out on scores.
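
One such clunky mechanism can be sketched as follows: sort the hit doc ids ascending, walk the forward-only enumerator exactly once, stash the spans per doc, and only then go back through the hits in score order. This is a minimal sketch, not Lucene code. ForwardSpans, ArraySpans, and SpanHitWalker are hypothetical stand-ins invented for illustration, and the skipTo() semantics here (stay put if already at or past the target) are an assumption of the stand-in, not a statement about Lucene's Spans.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a forward-only span enumerator. NOT Lucene's Spans.
interface ForwardSpans {
    boolean next();             // move to the next span; false when exhausted
    boolean skipTo(int target); // move forward to the first span with doc >= target
    int doc();
    int start();
    int end();
}

// Toy in-memory implementation over {doc, start, end} triples sorted by doc.
final class ArraySpans implements ForwardSpans {
    private final int[][] spans;
    private int pos = -1;

    ArraySpans(int[][] spans) { this.spans = spans; }

    public boolean next() { pos++; return pos < spans.length; }

    public boolean skipTo(int target) {
        if (pos < 0) pos = 0;
        while (pos < spans.length && spans[pos][0] < target) pos++;
        return pos < spans.length;
    }

    public int doc() { return spans[pos][0]; }
    public int start() { return spans[pos][1]; }
    public int end() { return spans[pos][2]; }
}

final class SpanHitWalker {
    // Returns doc id -> list of {start, end} for every hit doc that has spans.
    static Map<Integer, List<int[]>> spansForHits(int[] hitDocsInScoreOrder, ForwardSpans spans) {
        int[] byDocId = hitDocsInScoreOrder.clone();
        Arrays.sort(byDocId); // ascending doc id, so we only ever move forward
        Map<Integer, List<int[]>> result = new HashMap<>();
        for (int doc : byDocId) {
            if (!spans.skipTo(doc)) break; // no spans at or after this doc
            while (spans.doc() == doc) {
                result.computeIfAbsent(doc, d -> new ArrayList<>())
                      .add(new int[] {spans.start(), spans.end()});
                if (!spans.next()) return result; // enumerator exhausted
            }
        }
        return result;
    }
}
```

The caller then loops over the hits in score order and looks each doc up in the returned map. It works, but it costs an extra sort plus a map of spans for every hit, which is the kind of ugliness described above.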

It just has never felt right to me, but I am not seeing a better way of doing it at the moment, so I thought I would throw it out to the list to see what people think. That is, how can we generate a Spans object that is backed by the order in a ScoreDocs array? The thing is, in order to run the SpanQuery, we already iterate over the Spans anyway. I think what I would really like, for the case where I am running SpanQueries, is to be able to tell it to preserve the Span info by hanging it off of something (maybe the Collector could have a callback that allows me to collect Span info). (Not sure if that makes sense.) I realize this would use extra memory, but that is probably a cost I'm willing to pay. Alternatively, we need to add a seek() method to Spans and pay the cost of thrashing.
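
To make the callback idea a bit more concrete, here is a rough sketch of what a span-aware collector might look like. Everything in it is hypothetical: SpanAwareCollector, SpanHit, and TopSpanHits do not exist in Lucene, and the assumption that the scorer would call collectSpan() for each span of a doc before calling collect() for that doc is mine, purely for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical callback interface; NOT part of Lucene's Collector API.
interface SpanAwareCollector {
    void collectSpan(int doc, int start, int end); // assumed: called per span while scoring doc
    void collect(int doc, float score);            // assumed: called once per doc, after its spans
}

// One retained hit: doc id, score, and the spans seen while scoring it.
final class SpanHit {
    final int doc;
    final float score;
    final List<int[]> spans; // each entry is {start, end}

    SpanHit(int doc, float score, List<int[]> spans) {
        this.doc = doc;
        this.score = score;
        this.spans = spans;
    }
}

// Keeps the top-n hits by score, with their spans hanging off each hit.
final class TopSpanHits implements SpanAwareCollector {
    private final int n;
    // Min-heap on score, so the weakest retained hit is cheap to evict.
    private final PriorityQueue<SpanHit> queue =
        new PriorityQueue<>(Comparator.comparingDouble((SpanHit h) -> h.score));
    private List<int[]> pending = new ArrayList<>();

    TopSpanHits(int n) { this.n = n; }

    public void collectSpan(int doc, int start, int end) {
        pending.add(new int[] {start, end});
    }

    public void collect(int doc, float score) {
        queue.add(new SpanHit(doc, score, pending));
        pending = new ArrayList<>();
        if (queue.size() > n) queue.poll(); // drop the lowest-scoring hit and its spans
    }

    // Retained hits, best score first.
    List<SpanHit> topHits() {
        List<SpanHit> hits = new ArrayList<>(queue);
        hits.sort(Comparator.comparingDouble((SpanHit h) -> h.score).reversed());
        return hits;
    }
}
```

The extra memory is bounded by the spans of the n retained hits plus the doc currently being scored, since spans for evicted hits are dropped along with the hit.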

Thoughts?  Am I off base here or missing something?

-Grant

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



