Re: Searcher javadoc problem

DM Smith Sat, 03 Oct 2009 17:48:22 -0700

It makes sense if you understand the context. We make each verse of a Bible a document. There are about 36000 docs in a Bible. We want a user to find all the verses that match their search, to give the count of total hits. We then show slices of the hits, from first hit to last in document order, typically about 100 at a time. Scoring is unimportant.
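
The count-then-slice pattern described above can be sketched without any Lucene dependency. This is only an illustration under stated assumptions: `HitPager`, `record`, `totalHits`, and `slice` are hypothetical names, and a `java.util.BitSet` stands in for whatever structure a collector would fill with matching document numbers.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical helper, not part of Lucene: remembers matching document
// numbers and hands them back in index order, one slice at a time.
final class HitPager {
    private final BitSet hits = new BitSet();

    // Assumed to be called once per matching document, in any order.
    void record(int docId) {
        hits.set(docId);
    }

    // Total number of matching documents recorded so far.
    int totalHits() {
        return hits.cardinality();
    }

    // Returns up to 'size' document numbers in ascending order,
    // skipping the first 'offset' hits.
    List<Integer> slice(int offset, int size) {
        List<Integer> page = new ArrayList<Integer>();
        int seen = 0;
        for (int doc = hits.nextSetBit(0);
             doc >= 0 && page.size() < size;
             doc = hits.nextSetBit(doc + 1)) {
            if (seen++ >= offset) {
                page.add(doc);
            }
        }
        return page;
    }
}
```

With roughly 36000 verses the bit set stays small (a few kilobytes), and `slice(0, 100)`, `slice(100, 100)`, and so on would give the pages in document order with no scoring involved.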

The user can also choose to prioritize and limit the results. This uses scoring and the top docs. This is not the user's preferred search.

So I don't mind being nasty. But having looked at it, I think it would be better to have a non-scoring collector that is a co-process that, w/ an iterator interface, gets the next doc on demand, from first doc in index to last.
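
As a rough sketch of the shape of such an interface (not an existing Lucene API; a Collector is pushed documents by the searcher rather than pulled from), the class below yields matching document numbers one at a time in index order and does no work until asked. `OnDemandDocs` is a hypothetical name, and the `IntPredicate` is an assumed stand-in for "does this document match the query?".

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.IntPredicate;

// Hypothetical sketch, not a Lucene class: a pull-style, non-scoring
// view of the matching documents, from first doc in the index to last.
final class OnDemandDocs implements Iterator<Integer> {
    private final IntPredicate matches; // stand-in for the query match test
    private final int maxDoc;           // one past the last document number
    private int cursor = 0;             // next candidate document to test

    OnDemandDocs(IntPredicate matches, int maxDoc) {
        this.matches = matches;
        this.maxDoc = maxDoc;
    }

    // Advances to the next matching document, if any, without consuming it.
    @Override
    public boolean hasNext() {
        while (cursor < maxDoc && !matches.test(cursor)) {
            cursor++;
        }
        return cursor < maxDoc;
    }

    // Returns the next matching document number and moves past it.
    @Override
    public Integer next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return cursor++;
    }
}
```

A caller showing 100 verses at a time would simply call `next()` up to 100 times per page; a total hit count would still need a full pass.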

-- DM



On Oct 3, 2009, at 6:12 PM, Mark Miller <[email protected]> wrote:

You used Hits to get all the hits? Nasty man - that's why we deprecated that
class - even though the JavaDoc warns you that's a major speed trap,
everyone still did it ... use a Collector.

You're right though - it shouldn't point to IndexSearcher.search(Query)
after that - it should point to IndexSearcher.search(Query, int)

Gotta fix that.

DM Smith wrote:

I'm working on migrating my code to 2.9. And I'm trying to figure out
what to do. Along the way I found a circular argument in the JavaDoc
for Searcher. BTW, this is not a user question.

My current code calls:
 Hits hits = searcher.search(query);

The JavaDoc for it says:
 /** Returns the documents matching <code>query</code>.
  * @throws BooleanQuery.TooManyClauses
  * @deprecated Hits will be removed in Lucene 3.0. Use
  * {@link #search(Query, Filter, int)} instead.
  */
 public final Hits search(Query query) throws IOException {
   return search(query, (Filter)null);
 }

However, search(Query, Filter, int) is not quite appropriate as I need
all hits. I guess I could pass null for filter and MAX_INT.

So, I found search(Query, Collector), which seems most appropriate.
(Not sure though, but I'll figure it out.) However, the JavaDoc for it
says:
 /** Lower-level search API.
  *
  * <p>{@link Collector#collect(int)} is called for every matching document.
  *
  * <p>Applications should only use this if they need <i>all</i> of the
  * matching documents.  The high-level search API ({@link
  * Searcher#search(Query)}) is usually more efficient, as it skips
  * non-high-scoring hits.
  * <p>Note: The <code>score</code> passed to this method is a raw score.
  * In other words, the score will not necessarily be a float whose value is
  * between 0 and 1.
  * @throws BooleanQuery.TooManyClauses
  */
 public void search(Query query, Collector results)
   throws IOException {
   search(createWeight(query), null, results);
 }

But Searcher.search(Query) is deprecated.

So what is the appropriate documentation for getting all "hits"? Seems
to say, "Don't do that."

-- DM



--
- Mark

http://www.lucidimagination.com




---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]


