It makes sense if you understand the context. We make each verse of a Bible a document. There are about 36,000 docs in a Bible. We want a user to find all the verses that match their search, so that we can give the total hit count. We then show slices of the hits, from first hit to last in document order, typically about 100 at a time. Scoring is unimportant.

The user can also choose to prioritize and limit the results. This uses scoring and the top docs. This is not the user's preferred search.

So I don't mind being nasty. But having looked at it, I think it would be better to have a non-scoring collector that runs as a co-process and, through an iterator interface, gets the next doc on demand, from the first doc in the index to the last.
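
As a rough sketch of the non-scoring, document-order part of that idea (not the on-demand co-process itself): the hypothetical class below records matching doc ids in a java.util.BitSet and hands back slices in index order. It is not Lucene API. The names DocOrderHits, setDocBase, collect, totalHits and slice are assumptions that only mirror the shape of Collector.setNextReader(IndexReader, int) and Collector.collect(int) in 2.9.

```java
import java.util.BitSet;

/** Hypothetical sketch, not part of Lucene: keeps hits as set bits, ignores scores. */
final class DocOrderHits {
    private final BitSet bits = new BitSet();
    private int docBase = 0;

    /** Assumed analogue of Collector.setNextReader: offset of the current segment. */
    void setDocBase(int docBase) {
        this.docBase = docBase;
    }

    /** Assumed analogue of Collector.collect: doc is relative to the current segment. */
    void collect(int doc) {
        bits.set(docBase + doc);
    }

    /** Total number of matching documents. */
    int totalHits() {
        return bits.cardinality();
    }

    /** Up to pageSize doc ids in document order, starting at the offset-th hit. */
    int[] slice(int offset, int pageSize) {
        int size = Math.max(0, Math.min(pageSize, totalHits() - offset));
        int[] out = new int[size];
        int doc = bits.nextSetBit(0);
        // Skip the hits that belong to earlier slices.
        for (int skipped = 0; skipped < offset && doc >= 0; skipped++) {
            doc = bits.nextSetBit(doc + 1);
        }
        // Copy this slice; set bits are always visited in ascending order.
        for (int i = 0; i < size; i++) {
            out[i] = doc;
            doc = bits.nextSetBit(doc + 1);
        }
        return out;
    }
}
```

Because a bit set has no notion of arrival order, the slices come back in document order even if hits are collected out of order, and for about 36,000 verses it needs only a few kilobytes per search.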

-- DM


On Oct 3, 2009, at 6:12 PM, Mark Miller <markrmil...@gmail.com> wrote:

You used Hits to get all the hits? Nasty man - that's why we deprecated that
class - even though the JavaDoc warns you that's a major speed trap,
everyone still did it ... use a Collector.

You're right though - it shouldn't point to IndexSearcher.search(Query)
after that - it should point to IndexSearcher.search(Query, int)

Going to fix that.

DM Smith wrote:
I'm working on migrating my code to 2.9. And I'm trying to figure out
what to do. Along the way I found a circular argument in the JavaDoc
for Searcher. BTW, this is not a user question.

My current code calls:
               Hits hits = searcher.search(query);

The JavaDoc for it says:
 /** Returns the documents matching <code>query</code>.
  * @throws BooleanQuery.TooManyClauses
  * @deprecated Hits will be removed in Lucene 3.0. Use
  * {@link #search(Query, Filter, int)} instead.
  */
 public final Hits search(Query query) throws IOException {
   return search(query, (Filter)null);
 }

However, search(Query, Filter, int) is not quite appropriate as I need
all hits. I guess I could pass null for the filter and Integer.MAX_VALUE.

So, I found search(Query, Collector), which seems most appropriate.
(Not sure though, but I'll figure it out.) However, the JavaDoc for it
says:
 /** Lower-level search API.
  *
  * <p>{@link Collector#collect(int)} is called for every matching document.
  *
  * <p>Applications should only use this if they need <i>all</i> of the
  * matching documents.  The high-level search API ({@link
  * Searcher#search(Query)}) is usually more efficient, as it skips
  * non-high-scoring hits.
  * <p>Note: The <code>score</code> passed to this method is a raw score.
  * In other words, the score will not necessarily be a float whose value is
  * between 0 and 1.
  * @throws BooleanQuery.TooManyClauses
  */
public void search(Query query, Collector results)
  throws IOException {
  search(createWeight(query), null, results);
}

But Searcher.search(Query) is deprecated.

So what is the appropriate documentation for getting all "hits"? Seems
to say, "Don't do that"

-- DM




--
- Mark

http://www.lucidimagination.com




---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
