You could implement your own HitCollector and remove lower-scoring duplicates as you come across them, using a Map or something similar to keep track of the best hit per key as you go.
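
A rough sketch of that idea, written as a standalone class so it runs without Lucene on the classpath. The names DedupCollector, Hit, keyOf and results() are made up for illustration; only the collect(int doc, float score) shape is meant to mirror HitCollector's callback. It assumes the collector may not see hits in rank order, so it keeps the highest-scoring hit per key and sorts at the end. How you look up the key for a document (the 001/002 value) is left as a function you supply.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntFunction;

// Illustrative stand-in for a custom collector; NOT a Lucene class.
final class DedupCollector {

    /** One retained hit: document number, score, and its duplicate key. */
    static final class Hit {
        final int doc;
        final float score;
        final String key;

        Hit(int doc, float score, String key) {
            this.doc = doc;
            this.score = score;
            this.key = key;
        }
    }

    // Hypothetical lookup: maps a document number to the value that
    // identifies duplicates (e.g. the "001" / "002" column in the question).
    private final IntFunction<String> keyOf;

    // Best hit seen so far for each key.
    private final Map<String, Hit> bestByKey = new HashMap<>();

    DedupCollector(IntFunction<String> keyOf) {
        this.keyOf = keyOf;
    }

    /** Same shape as HitCollector.collect: called once per matching document. */
    void collect(int doc, float score) {
        String key = keyOf.apply(doc);
        Hit current = bestByKey.get(key);
        // Keep the new hit only if this key is new or the new score is higher.
        if (current == null || score > current.score) {
            bestByKey.put(key, new Hit(doc, score, key));
        }
    }

    /** Surviving hits, highest score first; ties broken by document number. */
    List<Hit> results() {
        List<Hit> hits = new ArrayList<>(bestByKey.values());
        hits.sort((a, b) -> {
            int byScore = Float.compare(b.score, a.score);
            return byScore != 0 ? byScore : Integer.compare(a.doc, b.doc);
        });
        return hits;
    }
}
```

Fed the five hits from the question below (with made-up, strictly decreasing scores), results() should hold documents 1, 2 and 5 in that order. In a real search you would extend HitCollector, put this logic in its collect method, and pass the collector to the searcher; note that every matching document goes through collect, so the key lookup needs to be cheap.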

Ken Kinder wrote:
I've poked around on Google and the archives quite a bit, but I can't
find exactly what I need. Say I have a query that would normally
return a set of documents:

1 002 (text...)
2 001 (text...)
3 001 (text...)
4 002 (text...)
5 004 (text...)

I'd like that modified to be:

1 002 (text...)
2 001 (text...)
5 004 (text...)

So the ordering is the same, but I only want the first hit for each ID
(the first 001, the first 002) in the result set -- skip all the rest.

Does this make sense? Is there a way to do it?

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



--

Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244

http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886
