Re: Tips for getting unique results?

Erick Erickson Thu, 07 Apr 2011 15:49:46 -0700

I think you can specify the in-group sort, and ask for a very small number
of documents (perhaps even one) in each group. But you'd have to store the
length of each body in its own field and sort by that.


I'm pretty sure grouping is trunk-only.
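
A rough sketch of what such a grouped request might look like, written as
Python that only builds the query string. The parameter names (group,
group.field, group.limit, group.sort) are the ones used by Solr's result
grouping feature; the "body_length" field is hypothetical and would have to
be populated at index time, and "label" is assumed to be a sortable,
single-valued field.

```python
from urllib.parse import urlencode


def build_grouping_params(group_field="label",
                          in_group_sort="body_length desc",
                          rows=5):
    """Build query parameters for a grouped Solr request (a sketch only).

    "body_length" is a hypothetical field holding the length of each body.
    """
    return {
        "q": "*:*",
        "sort": f"{group_field} desc",  # order the groups: largest label first
        "rows": rows,                   # how many groups to return
        "group": "true",
        "group.field": group_field,     # one group per distinct label
        "group.limit": 1,               # keep a single document per group
        "group.sort": in_group_sort,    # ordering inside each group
    }


query_string = urlencode(build_grouping_params())
print(query_string)
```

The resulting string would be appended to the select URL of whatever Solr
instance is in use; no request is sent here.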

The problem here is getting something that applies just within the group
and not across groups... I'm not sure how to tackle that other than perhaps
the grouping idea...
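
To make the "within the group" versus "across groups" distinction concrete,
here is a plain-Python emulation over the six sample documents quoted below.
It is not Solr code, just an illustration of the intended result: the
longest body wins inside each label, and the labels themselves are ordered
descending by string comparison. The function name is made up for this
example.

```python
docs = [
    {"id": 1, "label": "1A1A1", "body": "adfasdfadsfasf"},
    {"id": 2, "label": "5A1B1", "body": "adfaasdfasdfsdfadsfasf"},
    {"id": 3, "label": "1A1A1", "body": "adasdfasdfasdffaasdfasdfsdfadsfasf"},
    {"id": 4, "label": "7A1A1", "body": "azxzxcvdfaasdfasdfsdfadsfasf"},
    {"id": 5, "label": "7A1A1", "body": "azxzxcvdfaasdfasdfsdasdaaaaafadsfasf"},
    {"id": 6, "label": "5A1B1", "body": "adfaasdfasdfsdfadsfasfzzz"},
]


def top_unique_labels(documents, n=5):
    """Return one document per label, for the n largest labels."""
    best = {}
    for doc in documents:
        current = best.get(doc["label"])
        # In-group ordering: keep the document with the longest body.
        if current is None or len(doc["body"]) > len(current["body"]):
            best[doc["label"]] = doc
    # Across-group ordering: labels descending, compared as plain strings.
    return [best[label] for label in sorted(best, reverse=True)[:n]]


for doc in top_unique_labels(docs):
    print(doc["label"], doc["id"])
```

With this data the labels come back as 7A1A1, 5A1B1, 1A1A1, each exactly
once.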

Best
Erick

On Thu, Apr 7, 2011 at 6:36 PM, Peter Spam <ps...@mac.com> wrote:

> Would grouping solve this?  I'd rather not move to a pre-release solr ...
>
> To clarify the problem:
>
> The data are fine and not duplicated - however, I want to analyze the data,
> and summarize one field (kind of like faceting), to understand what the
> largest value is.
>
> For example:
>
> Document 1:   label=1A1A1; body="adfasdfadsfasf"
> Document 2:   label=5A1B1; body="adfaasdfasdfsdfadsfasf"
> Document 3:   label=1A1A1; body="adasdfasdfasdffaasdfasdfsdfadsfasf"
> Document 4:   label=7A1A1; body="azxzxcvdfaasdfasdfsdfadsfasf"
> Document 5:   label=7A1A1; body="azxzxcvdfaasdfasdfsdasdaaaaafadsfasf"
> Document 6:   label=5A1B1; body="adfaasdfasdfsdfadsfasfzzz"
>
> How do I get back just ONE item for the largest "label"?
>
> In other words, what query will return the 7A1A1 label just once?  If I
> search for q=* and sort the results, it works, except I get back multiple
> hits for each label.  If I do a facet, I can only sort by increasing order,
> when what I want is decreasing order.
>
>
> -Peter
>
> On Apr 7, 2011, at 10:02 AM, Erick Erickson wrote:
>
> > What version of Solr are you using? And, assuming your version
> > includes it, have you looked at grouping?
> >
> > Which is another way of asking why you want to do this, perhaps it's an
> > XY problem....
> >
> > Best
> > Erick
> >
> > On Thu, Apr 7, 2011 at 1:13 AM, Peter Spam <ps...@mac.com> wrote:
> >
> >> Hi,
> >>
> >> I have documents with a field that has "1A2B3C" alphanumeric characters.  I
> >> can query for * and sort results based on this field, however I'd like to
> >> "uniq" these results (remove duplicates) so that I can get the 5 largest
> >> unique values.  I can't use the StatsComponent because my values have
> >> letters in them too.
> >>
> >> Faceting (and ignoring the counts) gets me half of the way there, but I can
> >> only sort ascending.  If I could also sort facet results descending, I'd be
> >> done.  I'd rather not return all documents and just parse the last few
> >> results to work around this.
> >>
> >> Any ideas?
> >>
> >>
> >> -Pete
> >>
>
>
