Hi Erik
What exactly do u mean by this
>We've emphasized numerous times that calling hits.doc(i) is a resource
>hit. Don't do it for documents you aren't going to show. To filter by
>score, use hits.score(i) first.
I am bit Confused u mean to say Replace
hits.doc(i)
by
hits.score(i)
Also
> Ah, so you are accessing every document to get this field information.
> It is incorrect that you cannot filter prior to getting hits. You have
> a couple of options in filtering by a field value - use a QueryFilter
. or simply AND a RangeQuery to the original query.
Since the portal we ar building for is a eCommerce one, We have to return
SearchWord across
( >7 ) x 1000 x 15000 documents , Get most of the Relevant His (Where
ever Score is between 0.5 to 1.0 )
and then Sort the adjecent Fields 'Vendors' and 'Price' in ASC Order
In such a case We cannot use RangeQuery.... without priorly knowing what
exactly the Consumer want's
Is it not possible to have a Generalized Filter in further versions of API
, to Inject some minor factors prior to
getting the Hits returned.
Thx in advance
Karthik
-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 14, 2004 3:44 PM
To: Lucene Users List
Subject: Re: HITCOLLECTOR+SCORE+DELIMMA
On Dec 13, 2004, at 11:16 PM, Karthik N S wrote:
> time [ A simple search of 'handbags' returned 1,60,000 hits and time
> taken
> was 440 secs ,in production Env / May be our
> Coding is poor,But we are constantly improving the process ].
If your searches are taking 440 seconds, you have something more
fundamentally wrong. You are either doing some large
wildcard/range/fuzzy expansions or you're accessing every document from
all your hits. Is the searcher.search() method taking that long? I
bet not. Or rather is it the iteration over the Hits that is killing
the "search" time, which is what I suspect?
We've emphasized numerous times that calling hits.doc(i) is a resource
hit. Don't do it for documents you aren't going to show. To filter by
score, use hits.score(i) first.
> { O/s Linux Gentoo , RAM 1GB, Lucene1.4.1,Appserver = Tomcat5, and
> BlackDawn Java 1.4.2 with Args -XX:+UseParallelGC for
>
> Garbage Collection }
Please narrow your code down to a clean, succinct example that you can
post. It is difficult to help you without details of your code (but
let me emphasize again - it needs to be clean and succinct so it is
quick for us to get a handle on).
> To be One step in advance ,We also have an adjecent Fields 'Vendor
> ','Price' which we have to accordingly Compare
> Best/Poor/Least results . So We have to have to limit the hits
> accordingly,since Lucene API does not provide any way to
> inject this limiting facility *prior* to getting the hits .
Ah, so you are accessing every document to get this field information.
It is incorrect that you cannot filter prior to getting hits. You have
a couple of options in filtering by a field value - use a QueryFilter
or simply AND a RangeQuery to the original query.
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]