FWIW...  We run a hash of the content and other bits of our docs, and
then remove duplicates according to specific algorithms.  (Exactly the
same page content can clearly be hosted on many different URLs and
domains.)  Then the chosen ones are indexed.  We toss the synonyms
into the index too, though, so we know all its other "names."
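
A rough sketch of the idea in Python, in case it helps.  This is not
our actual code: the function names, the SHA-1 choice, the
case/whitespace normalization, and the "shortest URL wins" rule are all
assumptions made up for illustration.

```python
import hashlib
from collections import defaultdict


def content_signature(text):
    """Hash the page content after a simple normalization (assumed:
    lowercase and collapse runs of whitespace)."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def dedupe(docs):
    """Group (url, content) pairs by content signature.

    Returns one record per distinct signature: the chosen URL plus the
    other URLs ("synonyms") that served the same content.
    """
    groups = defaultdict(list)
    for url, content in docs:
        groups[content_signature(content)].append(url)

    records = []
    for signature, urls in groups.items():
        # Hypothetical selection rule: shortest URL, ties broken alphabetically.
        chosen = min(urls, key=lambda u: (len(u), u))
        aliases = [u for u in urls if u != chosen]
        records.append({"signature": signature, "url": chosen, "aliases": aliases})
    return records
```

Each record would then be indexed as a single document, with the
aliases stored in a multi-valued field so the other "names" stay
searchable.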

cheers
gene

Gene Campbell
http://www.picante.co.nz
gene at picante point co point nz

http://www.travelbeen.com - "the social search engine for travel"

On Fri, Feb 27, 2009 at 5:53 AM, Cheng Zhang <zhangyongji...@yahoo.com> wrote:
> It's exactly what I'm looking for. Thank you Grant.
>
>
> ----- Original Message ----
> From: Grant Ingersoll <gsing...@apache.org>
> To: solr-user@lucene.apache.org
> Sent: Thursday, February 26, 2009 6:56:22 AM
> Subject: Re: unique result
>
> I presume these all have different unique ids?
>
> If you can address it at indexing time, then have a look at 
> https://issues.apache.org/jira/browse/SOLR-799
>
> Otherwise, you might look at https://issues.apache.org/jira/browse/SOLR-236
>
>
> On Feb 25, 2009, at 6:54 PM, Cheng Zhang wrote:
>
>> Is it possible to have Solr to remove duplicated query results?
>>
>> For example, instead of return
>>
>> <result name="response" numFound="572" start="0">
>> <doc>  <str name="productGroup_t_i_s_nm">Wireless</str> </doc>
>> <doc>  <str name="productGroup_t_i_s_nm">Wireless</str> </doc>
>> <doc>  <str name="productGroup_t_i_s_nm">Wireless</str> </doc>
>> <doc>  <str name="productGroup_t_i_s_nm">Video Games</str> </doc>
>> <doc>  <str name="productGroup_t_i_s_nm">Video Games</str> </doc>
>> </result>
>>
>> return:
>>  <result name="response" numFound="572" start="0">
>>   <doc>  <str name="productGroup_t_i_s_nm">Wireless</str> </doc>
>>   <doc>  <str name="productGroup_t_i_s_nm">Video Games</str> </doc>
>>  </result>
>>
>> Thanks a lot,
>> Kevin
>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
> using Solr/Lucene:
> http://www.lucidimagination.com/search
>
