Would grouping solve this?  I'd rather not move to a pre-release Solr ...

To clarify the problem:

The data are fine and not duplicated.  However, I want to analyze the data and 
summarize one field (somewhat like faceting) to find out what its largest 
value is.

For example:

Document 1:   label=1A1A1; body="adfasdfadsfasf"
Document 2:   label=5A1B1; body="adfaasdfasdfsdfadsfasf"
Document 3:   label=1A1A1; body="adasdfasdfasdffaasdfasdfsdfadsfasf"
Document 4:   label=7A1A1; body="azxzxcvdfaasdfasdfsdfadsfasf"
Document 5:   label=7A1A1; body="azxzxcvdfaasdfasdfsdasdaaaaafadsfasf"
Document 6:   label=5A1B1; body="adfaasdfasdfsdfadsfasfzzz"
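
To make the desired result concrete, here is a minimal client-side sketch in
Python.  It is only an illustration of the expected output, not a Solr
feature: the function name largest_unique is made up, and it assumes that
"largest" means plain string (lexicographic) ordering of the label values.

```python
# Labels copied from the six example documents above.
labels = ["1A1A1", "5A1B1", "1A1A1", "7A1A1", "7A1A1", "5A1B1"]


def largest_unique(values, n=1):
    """Return the n largest distinct values, largest first.

    Assumption: values compare as plain strings (lexicographic order).
    """
    return sorted(set(values), reverse=True)[:n]


print(largest_unique(labels))       # ['7A1A1']
print(largest_unique(labels, n=5))  # ['7A1A1', '5A1B1', '1A1A1']
```

With n=5 only three values come back, because the example data contain just
three distinct labels.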

How do I get back just ONE copy of the largest "label" value?

In other words, what query will return the 7A1A1 label just once?  If I search 
for q=* and sort the results, it works, except that I get back multiple hits 
for each label.  If I facet, I can only sort in increasing order, when what I 
want is decreasing order.
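
One possible workaround, sketched below in Python, is to let Solr return the
facet values in ascending index order and reverse them on the client.  This
is a sketch under assumptions rather than a confirmed solution: the base URL
and the helper names are hypothetical, the field is assumed to be an
untokenized string field named "label", and the sample response is
hand-written from the example documents, not real Solr output.  It relies on
the standard facet parameters (facet.field, facet.sort=index, facet.limit,
facet.mincount) and on the default JSON layout, in which facet values and
counts alternate in one flat list.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, field):
    """Build a facet-only request: no documents, all terms in index order."""
    params = {
        "q": "*:*",            # match every document
        "rows": 0,             # return no documents, only facet data
        "wt": "json",
        "facet": "true",
        "facet.field": field,
        "facet.sort": "index",  # ascending term order
        "facet.limit": -1,      # no cap on the number of terms returned
        "facet.mincount": 1,    # skip terms that match no document
    }
    return base_url + "?" + urlencode(params)


def top_facet_values(response, field, n=5):
    """Return the n largest facet values, largest first.

    Assumes the flat [value, count, value, count, ...] facet layout.
    """
    flat = response["facet_counts"]["facet_fields"][field]
    values = flat[0::2]        # keep the values, drop the counts
    return values[::-1][:n]    # reverse ascending order, keep the first n


# Hypothetical base URL; adjust to the actual Solr instance.
url = facet_query_url("http://localhost:8983/solr/select", "label")

# Hand-written sample shaped like a facet response for the example documents.
sample = {
    "facet_counts": {
        "facet_fields": {"label": ["1A1A1", 2, "5A1B1", 2, "7A1A1", 2]}
    }
}
print(top_facet_values(sample, "label", n=1))  # ['7A1A1']
```

Note that facet.limit=-1 sends every distinct term over the wire, so this
only makes sense while the number of distinct labels stays manageable.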


-Peter

On Apr 7, 2011, at 10:02 AM, Erick Erickson wrote:

> What version of Solr are you using? And, assuming the version that
> has it in, have you seen grouping?
> 
> Which is another way of asking why you want to do this, perhaps it's an
> XY problem....
> 
> Best
> Erick
> 
> On Thu, Apr 7, 2011 at 1:13 AM, Peter Spam <ps...@mac.com> wrote:
> 
>> Hi,
>> 
>> I have documents with a field that has "1A2B3C" alphanumeric characters.  I
>> can query for * and sort results based on this field, however I'd like to
>> "uniq" these results (remove duplicates) so that I can get the 5 largest
>> unique values.  I can't use the StatsComponent because my values have
>> letters in them too.
>> 
>> Faceting (and ignoring the counts) gets me half of the way there, but I can
>> only sort ascending.  If I could also sort facet results descending, I'd be
>> done.  I'd rather not return all documents and just parse the last few
>> results to work around this.
>> 
>> Any ideas?
>> 
>> 
>> -Pete
>> 
