On Apr 9, 2004, at 8:16 PM, Michael A. Schoen wrote:
I have an index of urls, and need to display the top 10 results for a
given query, but want to display only 1 result per domain. It seems
that using either Hits or a HitCollector, I'll need to access the doc,
grab the domain field (I'll have
On Friday 09 April 2004 23:59, Ype Kingma wrote:
When you need 3000 hits and their stored fields, you might
consider using the lower level search API with your own HitCollector.
I apologize for the stupid question but ... where's the actual result in
HitCollector? :-)
collect(int doc,
Can I customize the way it highlights terms? Right now it does so by
surrounding them with <b> tags.
That's the job of a formatter class. You can pass one in the constructor, e.g.:
Formatter myFormatter = new SimpleHTMLFormatter("<i>", "</i>");
Highlighter h = new Highlighter(myFormatter, new QueryScorer(query));
If
Erik,
Thanks for the pointer.
I am not sure how sort can filter out results.
Sort will just sort the results, right?
Let's say I had the results below:
http://www.b.com/1.html
http://www.a.com/1.html
http://www.b.com/2.html
http://www.a.com/2.html
if you sort by domain name, results might be
On Apr 10, 2004, at 5:08 AM, [EMAIL PROTECTED] wrote:
On Friday 09 April 2004 23:59, Ype Kingma wrote:
When you need 3000 hits and their stored fields, you might
consider using the lower level search API with your own HitCollector.
I apologize for the stupid question but ... where's the actual
On Apr 10, 2004, at 9:47 AM, Venu Durgam wrote:
I am not sure how sort can filter out results.
Sort will just sort the results, right?
Right, no filtering using Sort.
Let's say I had the results below:
http://www.b.com/1.html
http://www.a.com/1.html
http://www.b.com/2.html
http://www.a.com/2.html
So as Venu pointed out, sorting doesn't seem to help the problem. If we have
to walk the result set, access docs and dedupe using brute force, we're
better off w/ the standard order by relevance.
If you've got an example of this type of clustering done in a more efficient
way, that'd be great.
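
For reference, the brute-force walk described above can be sketched in plain Java. This is only an illustration under assumptions, not anything taken from Lucene: the class and method names (DomainDedupe, domainOf, topPerDomain) are made up, and the input is assumed to be the hit URLs already ordered by relevance (in a real index they would come from each hit's stored field). It keeps the first hit seen for each domain and stops as soon as the requested number of results is filled, so it does not have to walk the whole result set.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene: dedupes relevance-ordered URLs by host.
class DomainDedupe {

    // Returns the lower-cased host of a URL, or the raw string if no host can be parsed.
    static String domainOf(String url) {
        try {
            String host = URI.create(url).getHost();
            return host != null ? host.toLowerCase() : url;
        } catch (IllegalArgumentException e) {
            return url;
        }
    }

    // Walks the hits in relevance order and keeps the first one per domain,
    // stopping once 'limit' results have been collected.
    static List<String> topPerDomain(List<String> rankedUrls, int limit) {
        Set<String> seenDomains = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String url : rankedUrls) {
            if (kept.size() >= limit) {
                break; // enough results, no need to look at further hits
            }
            if (seenDomains.add(domainOf(url))) {
                kept.add(url); // first (most relevant) hit for this domain
            }
        }
        return kept;
    }
}
```

With the four example URLs from this thread and a limit of 10, topPerDomain keeps http://www.b.com/1.html and http://www.a.com/1.html. The cost is still one field lookup per hit visited, so this does not answer the request for something more efficient; it only shows that relevance order plus an early exit bounds the walk.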