To avoid caching 10,025 docs when you only want to see 10,000 to 10,025
(and assuming the user was paging through results) you might have to
remember the lowest score used in the previous page of results to avoid
adding those 10,000 docs with score > lastLowScore
to the HitQueue again.
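
A minimal sketch of the paging idea above, in plain Java with no Lucene dependency. `ScoredHit` and `PageCollector` are hypothetical names, not Lucene classes. The sketch assumes the caller keeps the last hit of the previous page, and it breaks score ties by doc id (an assumption added here) so that equal-scoring docs are neither repeated nor dropped.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical (doc id, score) pair; a stand-in, not a Lucene class. */
final class ScoredHit {
    final int doc;
    final float score;
    ScoredHit(int doc, float score) { this.doc = doc; this.score = score; }
}

/** Collects one page of hits, skipping everything already shown on earlier pages. */
final class PageCollector {
    // Best hits first: higher score wins, ties go to the lower doc id.
    static final Comparator<ScoredHit> BEST_FIRST = (a, b) -> {
        int byScore = Float.compare(b.score, a.score);
        return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
    };

    private final int pageSize;
    private final ScoredHit lastOfPreviousPage; // null when collecting the first page
    // Ordered worst-first, so the head is the weakest hit currently kept.
    private final PriorityQueue<ScoredHit> queue;

    PageCollector(int pageSize, ScoredHit lastOfPreviousPage) {
        this.pageSize = pageSize;
        this.lastOfPreviousPage = lastOfPreviousPage;
        this.queue = new PriorityQueue<>(pageSize, BEST_FIRST.reversed());
    }

    void collect(int doc, float score) {
        ScoredHit hit = new ScoredHit(doc, score);
        // Skip hits that rank at or above the previous page's last hit.
        if (lastOfPreviousPage != null && BEST_FIRST.compare(hit, lastOfPreviousPage) <= 0) {
            return;
        }
        if (queue.size() < pageSize) {
            queue.add(hit);
        } else if (BEST_FIRST.compare(hit, queue.peek()) < 0) {
            queue.poll(); // evict the weakest hit to make room for a better one
            queue.add(hit);
        }
    }

    /** Returns the collected page, best hit first. */
    List<ScoredHit> page() {
        List<ScoredHit> out = new ArrayList<>(queue);
        out.sort(BEST_FIRST);
        return out;
    }
}
```

With this shape the queue never holds more than one page of hits, at the cost of rerunning the search for each page and carrying the previous page's last hit between requests.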
Cool! Only one question: if we have
class RelevanceAndDistanceCollector extends HitCollector
{
    public ScoreDoc[] getMatches(int start, int size)
    {
        ...
    }
}
and a call of getMatches(1, 25); would not cache
as many as 1+ docs, would it? Remember this is the
whole point o
Here's an example I put together to illustrate the point.
package distance;
import java.io.IOException;
import java.util.ArrayList;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.WhitespaceAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lu
This is interesting, one I had not considered.
Mark - are there any code samples that implement this approach? Or maybe
something similar in approach?
thanks,
jeff
On 9/19/05, mark harwood <[EMAIL PROTECTED]> wrote:
>
> I think the HitCollector approach was fine but needed
> a couple of changes
I think this is probably the closest to what I would like to do, and am able
to do, for now. If I ever get to do this, I'll share the idea/code and seek
review and suggestions.
Thank you very much, Mark, and all others that have helped!
-James
mark harwood <[EMAIL PROTECTED]> wrote:
I think the HitCollector appro
I think the HitCollector approach was fine but needed
a couple of changes:
1) use a PriorityQueue subclass in place of the
SortedSet to keep only the top n scoring docs
2) multiply lucene score by a distance measurement
based on the current doc's location (doc location
being read from a cached arra
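
A rough sketch of the two changes listed above, in plain Java rather than against the Lucene API. `DistanceBoostCollector` is a hypothetical name. The per-doc coordinate arrays, the planar distance, and the `1 / (1 + distance)` factor are all assumptions made for illustration, since the message does not say which distance measurement was intended.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

/** Hypothetical collector: keeps the top n docs by (score x distance factor). */
final class DistanceBoostCollector {
    /** A doc id paired with its distance-adjusted score. */
    private static final class Entry {
        final int doc;
        final double adjusted;
        Entry(int doc, double adjusted) { this.doc = doc; this.adjusted = adjusted; }
    }

    private final int topN;
    private final double[] xs; // cached x coordinate per doc id (assumed layout)
    private final double[] ys; // cached y coordinate per doc id (assumed layout)
    private final double myX;
    private final double myY;
    // Min-heap on adjusted score: the head is the weakest doc currently kept.
    private final PriorityQueue<Entry> queue;

    DistanceBoostCollector(int topN, double[] xs, double[] ys, double myX, double myY) {
        this.topN = topN;
        this.xs = xs;
        this.ys = ys;
        this.myX = myX;
        this.myY = myY;
        this.queue = new PriorityQueue<>(topN, Comparator.comparingDouble((Entry e) -> e.adjusted));
    }

    /** Assumed weighting: 1.0 at distance zero, shrinking as distance grows. */
    static double distanceFactor(double distance) {
        return 1.0 / (1.0 + distance);
    }

    void collect(int doc, float score) {
        double distance = Math.hypot(xs[doc] - myX, ys[doc] - myY);
        double adjusted = score * distanceFactor(distance);
        if (queue.size() < topN) {
            queue.add(new Entry(doc, adjusted));
        } else if (adjusted > queue.peek().adjusted) {
            queue.poll(); // drop the weakest doc, keep the new one
            queue.add(new Entry(doc, adjusted));
        }
    }

    /** Returns the kept doc ids, highest adjusted score first. */
    int[] topDocs() {
        Entry[] entries = queue.toArray(new Entry[0]);
        Arrays.sort(entries, (a, b) -> Double.compare(b.adjusted, a.adjusted));
        int[] docs = new int[entries.length];
        for (int i = 0; i < entries.length; i++) {
            docs[i] = entries[i].doc;
        }
        return docs;
    }
}
```

Because the heap is capped at n entries, memory stays bounded no matter how many docs match, which is the concern raised about the SortedSet version.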
On Sep 18, 2005, at 3:39 PM, James Huang wrote:
> So the question is, is there a way to override score
> calculation at runtime? In the lucene/search package,
> I see interfaces like Scorer, Weight and methods like
> Query.createWeight(). This looks promising.
You indeed need to override the fol
I like Erik's suggestion here as a starting point. I would guess you might
find some direction in the Scorer class, but I haven't gone through this in
detail.
Conceptually a sliding weight based on proximity sounds correct...
-- jeff
On Sep 18, 2005, at 3:39 PM, James Huang wrote:
> > So the
On Sep 18, 2005, at 3:39 PM, James Huang wrote:
So the question is, is there a way to override score
calculation at runtime? In the lucene/search package,
I see interfaces like Scorer, Weight and methods like
Query.createWeight(). This looks promising.
There are several ways to adjust scorin
--- Jeff Rodenburg <[EMAIL PROTECTED]> wrote:
> trimming the post further:
>
> On 9/18/05, James Huang <[EMAIL PROTECTED]> wrote:
> >
> > >The problem is quite generic, I believe. What I'd
> like to do is similar to
> > LIA-ch6, i.e. to find a "good Chinese Hunan-style
> restaurant near me." I
trimming the post further:
On 9/18/05, James Huang <[EMAIL PROTECTED]> wrote:
>
> >The problem is quite generic, I believe. What I'd like to do is similar to
> LIA-ch6, i.e. to find a "good Chinese Hunan-style restaurant near me." I
> prefer Hunan-style; however, if a good Hunan-style one is 12 m
See comments below.
--- Erik Hatcher <[EMAIL PROTECTED]> wrote:
> [trimming the post a bit]
>
> On Sep 18, 2005, at 11:51 AM, James Huang wrote:
> > The problem is quite generic, I believe. What I'd
> like
> > to do is similar to LIA-ch6, i.e. to find a "good
> > Chinese Hunan-style restaurant nea
[trimming the post a bit]
On Sep 18, 2005, at 11:51 AM, James Huang wrote:
The problem is quite generic, I believe. What I'd like
to do is similar to LIA-ch6, i.e. to find a "good
Chinese Hunan-style restaurant near me." I prefer
Hunan-style; however, if a good Hunan-style one is 12
miles, where t
On Sep 18, 2005, at 11:10 AM, James Huang wrote:
--- Erik Hatcher <[EMAIL PROTECTED]> wrote:
On Sep 18, 2005, at 10:24 AM, James Huang wrote:
--- Erik Hatcher <[EMAIL PROTECTED]>
wrote:
Get back to using your DistanceComparatorSource,
and
couple that with
a SortField.FIELD_
--- Erik Hatcher <[EMAIL PROTECTED]> wrote:
>
> On Sep 18, 2005, at 10:24 AM, James Huang wrote:
>
> > --- Erik Hatcher <[EMAIL PROTECTED]>
> wrote:
> >
> >
> >> Get back to using your DistanceComparatorSource,
> and
> >> couple that with
> >> a SortField.FIELD_SCORE, like this:
> >>
> >> Sort
On Sep 18, 2005, at 10:24 AM, James Huang wrote:
--- Erik Hatcher <[EMAIL PROTECTED]> wrote:
Get back to using your DistanceComparatorSource, and couple that with
a SortField.FIELD_SCORE, like this:
Sort sort = new Sort(new SortField[] {new SortField("location",
new DistanceCompara
--- Erik Hatcher <[EMAIL PROTECTED]> wrote:
> Get back to using your DistanceComparatorSource, and
> couple that with
> a SortField.FIELD_SCORE, like this:
>
> Sort sort = new Sort(new SortField[] {new
> SortField("location",
> new DistanceComparatorSource( you need>)),
> SortField.F
On Sep 17, 2005, at 7:00 PM, James Huang wrote:
I use a custom collector:
[...]
Then, use IndexSearcher.search(qry, collector);
So what happens if you get 10M results from a search?
This seems to work. What I wish for is that sorting is
done by the search engine itself, hoping for a bet
I use a custom collector:
class ResultCollector extends HitCollector
{
    SortedSet set = new TreeSet();
    IndexSearcher searcher;
    Location me;
    ResultCollector(IndexSearcher searcher, Location me)
    {
        this.me = me;
        this.searcher = searcher;
    }
    public void collect(int id, float scor
On Sep 17, 2005, at 4:10 PM, James Huang wrote:
Hi,
I can sort the search results by distance now. But,
the relevance is lost.
I'd like to have the results sorted by relevance +
distance, i.e., relevance first; for results of
similar relevance, order by distance. How can I do that?
How are you c
I guess I can use HitCollector and implement my own
sorting, right?
Is there a better approach?
--- James Huang <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I can sort the search results by distance now. But,
> the relevance is lost.
>
> I'd like to have the results sorted by relevance +
> distance, i.e
Hi,
I can sort the search results by distance now. But,
the relevance is lost.
I'd like to have the results sorted by relevance +
distance, i.e., relevance first; for results of
similar relevance, order by distance. How can I do that?
Thanks a lot in advance!
-James
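
One way to read "relevance first, then distance for similar relevance" is to round relevance into buckets and sort by distance within each bucket. The sketch below is plain Java, not Lucene code. `Place`, `RelevanceThenDistance` and the bucket width are hypothetical, and treating scores in the same bucket as "similar" is an assumption about what the question means.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical search result: a name, a relevance score and a distance in miles. */
final class Place {
    final String name;
    final float relevance;
    final double miles;
    Place(String name, float relevance, double miles) {
        this.name = name;
        this.relevance = relevance;
        this.miles = miles;
    }
}

final class RelevanceThenDistance {
    /** Maps a relevance score to a coarse bucket; scores in one bucket count as "similar". */
    static int bucket(float relevance, float bucketWidth) {
        return (int) Math.floor(relevance / bucketWidth);
    }

    /** Higher relevance bucket first; within a bucket, nearer places first. */
    static Comparator<Place> comparator(float bucketWidth) {
        Comparator<Place> byBucketDescending =
            Comparator.comparingInt((Place p) -> bucket(p.relevance, bucketWidth)).reversed();
        return byBucketDescending.thenComparingDouble(p -> p.miles);
    }

    /** Returns a sorted copy, leaving the caller's array untouched. */
    static Place[] sorted(float bucketWidth, Place... places) {
        Place[] copy = Arrays.copyOf(places, places.length);
        Arrays.sort(copy, comparator(bucketWidth));
        return copy;
    }
}
```

A wider bucket lets distance decide more of the ordering; a very narrow bucket degenerates to a plain relevance sort.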
--- James Huang <[EMAIL PROTECTE
22 matches