>just to clarify, I meant take the call to getMoreDocs(50) which is
>currently in the Hits constructor, and refactor it out and into the
>"Searcher.search" methods. That way the behavior is the same as before
>for all existing clients, but new subclasses can change the behavior so
>that the "search
: > * move the call to getMoreDocs(int) from Hits to Searcher.search
:
: Hmm... Hits is passed to the caller and works as a standalone cache.
: While it maintains a reference to the Searcher, it only uses that to
: resolve Documents upon misses. Perhaps the current separation of
: concerns is ac
At 8:01 PM -0700 9/8/05, Chris Hostetter wrote:
>: Which makes me wonder whether the caching logic of Hits, optimized for
>: random- rather than linear-access, and not tuneable or controllable in
>: 1.4.3, should be reviewed for a subsequent release, at least the
>: API-breaking 2.0. I'll wager th
: Which makes me wonder whether the caching logic of Hits, optimized for
: random- rather than linear-access, and not tuneable or controllable in
: 1.4.3, should be reviewed for a subsequent release, at least the
: API-breaking 2.0. I'll wager that a majority of applications do nothing
: other th
t methods (a HitCollector would probably be
>best)
>
>: Date: Thu, 8 Sep 2005 17:05:18 -0600
>: From: Richard Krenek <[EMAIL PROTECTED]>
>: Reply-To: java-user@lucene.apache.org
>: To: java-user@lucene.apache.org
>: Subject: Re: Weird time
A HitCollector returns docs in the order they are found (in the index, not
by relevance). Use a search method that returns TopDocs if you want the
first n documents without executing the query more than once (Hits uses this
internally).
-Yonik
Now hiring -- http://tinyurl.com/7m67g
On 9/8/05,
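
The difference between the two access styles described above can be illustrated without Lucene itself. The following is a minimal sketch in plain Java, not Lucene code: `CollectVsTopN`, its nested `Collector` interface and the `scores` array are hypothetical stand-ins (the interface only mirrors the shape of HitCollector.collect(int doc, float score)). It assumes a document matches when its score is greater than zero.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class CollectVsTopN {
    // Hypothetical callback, shaped like HitCollector.collect(int doc, float score).
    interface Collector {
        void collect(int doc, float score);
    }

    // Stand-in for a search: visits matching docs in index (doc id) order.
    static void search(float[] scores, Collector collector) {
        for (int doc = 0; doc < scores.length; doc++) {
            if (scores[doc] > 0f) {
                collector.collect(doc, scores[doc]);
            }
        }
    }

    // What a collector sees: doc ids in index order, not by relevance.
    static List<Integer> collectionOrder(float[] scores) {
        List<Integer> seen = new ArrayList<>();
        search(scores, (doc, score) -> seen.add(doc));
        return seen;
    }

    // A TopDocs-style result: the n best-scoring docs from a single pass.
    static List<Integer> topN(float[] scores, int n) {
        // Min-heap on score, so the weakest kept doc is evicted first.
        PriorityQueue<Integer> heap = new PriorityQueue<>(
                Comparator.comparingDouble((Integer d) -> scores[d]));
        search(scores, (doc, score) -> {
            heap.add(doc);
            if (heap.size() > n) {
                heap.poll();
            }
        });
        List<Integer> best = new ArrayList<>(heap);
        best.sort((a, b) -> Float.compare(scores[b], scores[a]));
        return best;
    }

    public static void main(String[] args) {
        float[] scores = {0.2f, 0f, 0.9f, 0.5f};
        System.out.println("collector order: " + collectionOrder(scores));
        System.out.println("top 2 by score:  " + topN(scores, 2));
    }
}
```

In this toy model both methods make exactly one pass over the matches; only the top-n variant orders its result by score.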
This answers a lot of questions and observations. We looked in the source
code of the Hits object and found the getMoreDocs(int min) method which does
what you stated below.
We are assuming you meant for us to use a HitCollector instead. This brings
up a new question: does the Searcher call the
om: Richard Krenek <[EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: Re: Weird time results doing wildcard queries
:
: I did the change and here are the results:
:
: Query (default field is COMP_PART_NUMBER): 2444*
: Query: COMP_PART_NUMBER:2
On Friday 09 September 2005 00:40, Chris Hostetter wrote:
> 1) How similar, and how many? ... If I remember correctly, the Hits
> constructor does some work to pre-fetch the first 100 results.
What's really expensive in fetching documents is the disk access (often one
disk seek per matching docu
I did the change and here are the results:
Query (default field is COMP_PART_NUMBER): 2444*
Query: COMP_PART_NUMBER:2444*
Query Time: 328 ms - time for query to run.
383 total matching documents.
Cycle Time: 141 ms - time to run through hits.
Query (default field is COMP_PART_NUMBER): *91822*
Qu
The Hits class collects the document ids from the query in batches. If you
iterate beyond what was collected, the query is re-executed to collect more
ids.
You can use the expert level search methods on IndexSearcher if this isn't
what you want.
-Yonik
On 9/8/05, Richard Krenek <[EMAIL PROTEC
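
The batching described in the message above can be modeled in a few lines of plain Java. This sketch is not Lucene code: `BatchedHitsSketch`, its `topN` callback and `executionsForFullScan` are hypothetical names, and the doubling rule (request twice as many ids each time the cache runs out, starting from getMoreDocs(50)) is an assumption based on the getMoreDocs(int min) behavior discussed in this thread. It only counts how often the query would be re-executed while iterating over every hit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

class BatchedHitsSketch {
    // Hypothetical stand-in for the searcher: returns the ids of the top n matches.
    private final IntFunction<List<Integer>> topN;
    private final List<Integer> cached = new ArrayList<>();
    private final int totalMatches;
    int queryExecutions = 0;

    BatchedHitsSketch(IntFunction<List<Integer>> topN, int totalMatches) {
        this.topN = topN;
        this.totalMatches = totalMatches;
        getMoreDocs(50); // assumed initial call, as discussed in the thread
    }

    // Assumed policy: fetch twice as many ids as are needed or already cached.
    private void getMoreDocs(int min) {
        if (cached.size() > min) {
            min = cached.size();
        }
        int n = min * 2;
        List<Integer> top = topN.apply(n); // the whole query runs again
        queryExecutions++;
        for (int i = cached.size(); i < top.size(); i++) {
            cached.add(top.get(i));
        }
    }

    int id(int i) {
        if (i < 0 || i >= totalMatches) {
            throw new IndexOutOfBoundsException("no hit at index " + i);
        }
        if (i >= cached.size()) {
            getMoreDocs(i); // cache miss: re-execute with a larger n
        }
        return cached.get(i);
    }

    // Iterates over every hit and reports how many times the query ran.
    static int executionsForFullScan(int totalMatches) {
        BatchedHitsSketch hits = new BatchedHitsSketch(n -> {
            List<Integer> ids = new ArrayList<>();
            for (int i = 0; i < Math.min(n, totalMatches); i++) {
                ids.add(i);
            }
            return ids;
        }, totalMatches);
        for (int i = 0; i < totalMatches; i++) {
            hits.id(i);
        }
        return hits.queryExecutions;
    }

    public static void main(String[] args) {
        System.out.println(executionsForFullScan(383));
    }
}
```

Under these assumptions, iterating over a result set of 383 matches runs the query three times (for the top 100, 200 and 400 ids), while a result set of 100 or fewer matches runs it only once.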
: is if the query starts with a wildcard. In the case where it starts with a
: wildcard, Lucene has no option but to linearly go over every term in the
: index to see if it matches your pattern. It must visit every single term in
That would explain why the search itself takes a while, but not why
a
I understand that for the query, but why does it matter once you have the
Hits object? That is the part I'm baffled on. The query with the wildcard in
the front takes a lot longer, but we expected that.
On 9/8/05, Jeremy Meyer <[EMAIL PROTECTED]> wrote:
>
> The issue isn't with multiple wildcar
The issue isn't with multiple wildcards exactly. Specifically, the problem
is if the query starts with a wildcard. In the case where it starts with a
wildcard, Lucene has no option but to linearly go over every term in the
index to see if it matches your pattern. It must visit every single term i
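
The cost difference between a trailing and a leading wildcard can be seen on any sorted term dictionary. The following is a minimal sketch in plain Java, not Lucene code: `WildcardScanSketch` and its methods are hypothetical, and a `TreeSet` stands in for the index's sorted term list. It assumes a pattern like 2444* can seek straight to its literal prefix and stop at the first term that no longer shares it, while a pattern like *91822* has no prefix to seek to.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

class WildcardScanSketch {
    // Pattern "prefix*": seek to the prefix, stop once terms stop sharing it.
    static int termsVisitedForPrefix(NavigableSet<String> terms, String prefix) {
        int visited = 0;
        for (String term : terms.tailSet(prefix, true)) {
            visited++;
            if (!term.startsWith(prefix)) {
                break; // sorted order guarantees no later term can match
            }
        }
        return visited;
    }

    // Pattern "*infix*": nothing to seek to, so every term must be tested.
    static int termsVisitedForLeadingWildcard(NavigableSet<String> terms, String infix) {
        int visited = 0;
        int matches = 0;
        for (String term : terms) {
            visited++;
            if (term.contains(infix)) {
                matches++;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>(Arrays.asList(
                "1000", "2443", "2444A", "2444B", "2445", "9182200", "X91822Y"));
        System.out.println(termsVisitedForPrefix(terms, "2444"));
        System.out.println(termsVisitedForLeadingWildcard(terms, "91822"));
    }
}
```

In this toy dictionary of seven terms the prefix pattern touches three of them, while the leading-wildcard pattern touches all seven; on a real index the second number grows with the total number of terms.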
14 matches