Re: Conditional deduplication

2009-09-30 Thread Mauricio Scheffer
See http://wiki.apache.org/solr/FieldCollapsing
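
For a rough idea of what collapsing on a field means, here is a minimal Python sketch of the semantics only. It is not Solr's implementation, and the actual request parameters are on the wiki page above. The helper `collapse_by_field`, the sample documents, and the `to_field` key are hypothetical, chosen to match the SQL in the question: keep the first document seen for each distinct field value, and leave the full set untouched for normal searches.

```python
from collections import OrderedDict


def collapse_by_field(docs, field):
    """Return the first document seen for each distinct value of `field`.

    Illustration only: a hypothetical helper, not part of Solr.
    """
    first_per_value = OrderedDict()
    for doc in docs:
        key = doc.get(field)
        # Keep only the first document for each distinct key.
        if key not in first_per_value:
            first_per_value[key] = doc
    return list(first_per_value.values())


# Hypothetical sample corpus: three emails, two distinct To: addresses.
emails = [
    {"id": 1, "to_field": "alice@example.com", "subject": "Hello"},
    {"id": 2, "to_field": "bob@example.com", "subject": "Lunch"},
    {"id": 3, "to_field": "alice@example.com", "subject": "Re: Hello"},
]

# The "occasional" distinct search: one email per To: address.
collapsed = collapse_by_field(emails, "to_field")

# The "normal" search still sees every document.
all_docs = emails
```

Because the collapse happens at query time, nothing is dropped from the index, which is the difference from index-time deduplication.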

On Wed, Sep 30, 2009 at 4:41 PM, Michael wrote:

> If I index a bunch of email documents, is there a way to say "show me all
> email documents, but only one per To: email address"
> so that if there are a total of 10 distinct To: fields in the corpus, I get
> back 10 email documents?
>
> I'm aware of http://wiki.apache.org/solr/Deduplication but I want to retain
> the ability to search across all of my email documents most of the time, and
> only occasionally search for the distinct ones.
>
> Essentially I want to do a
> SELECT DISTINCT to_field FROM documents
> where a normal search is a
> SELECT * FROM documents
>
> Thanks for any pointers.
>


Conditional deduplication

2009-09-30 Thread Michael
If I index a bunch of email documents, is there a way to say "show me all
email documents, but only one per To: email address"
so that if there are a total of 10 distinct To: fields in the corpus, I get
back 10 email documents?

I'm aware of http://wiki.apache.org/solr/Deduplication but I want to retain
the ability to search across all of my email documents most of the time, and
only occasionally search for the distinct ones.

Essentially I want to do a
SELECT DISTINCT to_field FROM documents
where a normal search is a
SELECT * FROM documents

Thanks for any pointers.