[jira] [Commented] (SOLR-2366) Facet Range Gaps

Hoss Man (JIRA) Sat, 02 Apr 2011 16:45:45 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015088#comment-13015088
 ]


Hoss Man commented on SOLR-2366:
--------------------------------

In no particular order...

* I like Jan's {{facet.range.spec}} naming suggestion better than my 
{{facet.range.buckets}} suggestion ... but I think {{facet.range.series}}, 
{{facet.range.seq}}, or {{facet.range.sequence}} might be better still.

* I think Jan's point about {{N}} vs {{+N}} in the sequence list as a way to 
mix absolute values vs increments definitely makes sense, and would be 
consistent with the existing date math expressions.

* The complexity with supporting *both* absolute values and increments is 
the question of what Solr should do with input like 
{{facet.range.seq=10,20,+50,+100,120,150}}. What ranges would we return 
(10-20, 20-70, 70-???....)? Would it be an error? Would we give back ranges 
that overlap? What about 
{{facet.range.seq=10,50,+50,100,150&facet.range.include=all}}: would that 
result in one of the ranges being [100 TO 100], or would we throw that one out? 
(I think it would be wise to start out implementing only the absolute value 
approach, since that seems (to me) the more useful option of the two, and then 
consider adding the incremental values as a separate issue later, after hashing 
out the semantics of these types of situations.)

* A few of Jan's sample input suggestions used {{*}} at either the start or end 
of the sequence to denote "everything before" the second value or "everything 
after" the second to last value. I don't think we need to support this 
syntax; I think the existing {{facet.range.other}} would still be the right way 
to support this with {{facet.range.sequence}}. If you want "everything before" 
and/or "everything after", use {{facet.range.other=before}} and/or 
{{facet.range.other=after}}. Otherwise it would be confusing to decide what 
things like {{facet.range.other=before&facet.range.seq=*,10,20}} and 
{{facet.range.other=none&facet.range.seq=*,10,20}} mean.

* I *REALLY* don't think we should try to implement something like Jan's 
{{facet.range.labels}} suggestion. I can't imagine any way of supporting it 
that wouldn't prevent or radically complicate the "..." type continuation of 
series I suggested before, and that seems like a much more powerful feature 
than labels. If a user is going to provide a label for every range, then they 
must enumerate every range, and they might as well enumerate them (and label 
them) with {{facet.query}}, where the label and the query can be side by side.

This...

{code}
facet.query={!label="One or more"}bedrooms:[1 TO *]
facet.query={!label="Two or more"}bedrooms:[2 TO *]
facet.query={!label="Three or more"}bedrooms:[3 TO *]
facet.query={!label="Four or more"}bedrooms:[4 TO *]
{code}

...seems way more readable, and less prone to user error in tweaking, than 
this...

{code}
f.bedrooms.facet.range.spec=1..*,2..*,3..*,4..*
f.bedrooms.facet.range.labels="One or more","Two or more","Three or more","Four or more"
{code}
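
Returning to the question above about mixing absolute values and increments: the ambiguity is easy to see if you try to resolve such a sequence mechanically. The sketch below is only an illustration, not anything from the attached patches. It assumes one possible interpretation, namely that {{+N}} means "the previous boundary plus N", and the function names {{resolve_sequence}} and {{find_problems}} are hypothetical.

```python
def resolve_sequence(spec):
    """Resolve a comma separated sequence into absolute boundaries.

    Assumption (not settled in this issue): "+N" means
    "previous boundary plus N"; a bare "N" is an absolute value.
    """
    boundaries = []
    for token in spec.split(","):
        token = token.strip()
        if token.startswith("+"):
            if not boundaries:
                raise ValueError("increment with no preceding boundary")
            boundaries.append(boundaries[-1] + int(token[1:]))
        else:
            boundaries.append(int(token))
    return boundaries


def find_problems(boundaries):
    """Return (index, kind) pairs for boundaries that do not strictly increase."""
    problems = []
    for i in range(1, len(boundaries)):
        if boundaries[i] < boundaries[i - 1]:
            problems.append((i, "goes backwards"))
        elif boundaries[i] == boundaries[i - 1]:
            problems.append((i, "zero width"))
    return problems


for spec in ("10,20,+50,+100,120,150", "10,50,+50,100,150"):
    resolved = resolve_sequence(spec)
    print(spec, "->", resolved, find_problems(resolved))
```

Under that interpretation the first sequence resolves to boundaries that go backwards at 120 (after 170), and the second produces a duplicate boundary at 100, which is exactly the [100 TO 100] case. Whether either should be an error, an overlap, or silently dropped is the semantic question that would need to be settled first.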

* Herman commented...

bq. While using facet.query allows us to construct arbitrary ranges, we must 
then pick them out of the results separately. This becomes more difficult if we 
arbitrarily facet on two or more fields/expressions. 

I don't see that as being a particularly hard problem that we need to worry about 
helping users avoid, especially since users can annotate those queries using 
localparams and set any arbitrary key=val pairs on them that they want, to help 
organize them and identify them later when parsing the response...

{code}
facet.query={!group=bed label="One or more"}bedrooms:[1 TO *]
facet.query={!group=bed label="Two or more"}bedrooms:[2 TO *]
facet.query={!group=bed label="Three or more"}bedrooms:[3 TO *]
facet.query={!group=bed label="Four or more"}bedrooms:[4 TO *]
facet.query={!group=size label="Small"}sqft:[* TO 1000]
facet.query={!group=size label="Medium"}sqft:[1000 TO 2500]
facet.query={!group=size label="Large"}sqft:[2500 TO *]
{code}
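
On the client side, picking these apart again could look roughly like the sketch below. It assumes that each entry in the {{facet_queries}} section of the response is keyed by the original {{facet.query}} string, localparams included (which is the case when no {{key}} localparam is given), and that the response has already been parsed into a dict. The helper name {{group_facet_queries}} and the counts in the sample data are made up for illustration.

```python
import re
from collections import defaultdict

# Splits "{!k=v k2="v 2"}query" into its localparams part and the query part.
LOCALPARAMS = re.compile(r'^\{!(?P<params>[^}]*)\}(?P<query>.*)$')
# Matches key=value pairs where the value is either double quoted or a bare word.
PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def group_facet_queries(facet_queries):
    """Group facet.query counts by their "group" localparam, keyed by "label"."""
    grouped = defaultdict(dict)
    for key, count in facet_queries.items():
        match = LOCALPARAMS.match(key)
        if not match:
            continue  # no localparams, nothing to group on
        params = {name: quoted or bare
                  for name, quoted, bare in PAIR.findall(match.group("params"))}
        if "group" not in params:
            continue
        # Fall back to the raw query when no label was supplied.
        label = params.get("label", match.group("query"))
        grouped[params["group"]][label] = count
    return dict(grouped)


# Illustrative data only; the counts are invented.
sample = {
    '{!group=bed label="One or more"}bedrooms:[1 TO *]': 120,
    '{!group=bed label="Two or more"}bedrooms:[2 TO *]': 80,
    '{!group=size label="Small"}sqft:[* TO 1000]': 35,
}
print(group_facet_queries(sample))
```

The regular expressions are deliberately simple and would not handle escaped quotes or a {{}}} inside a localparam value; a real client would want something more careful.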




> Facet Range Gaps
> ----------------
>
>                 Key: SOLR-2366
>                 URL: https://issues.apache.org/jira/browse/SOLR-2366
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: Grant Ingersoll
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
>         Attachments: SOLR-2366.patch, SOLR-2366.patch
>
>
> There really is no reason why the range gap for date and numeric faceting 
> needs to be evenly spaced.  For instance, if and when SOLR-1581 is completed 
> and one were doing spatial distance calculations, one could facet by function 
> into 3 different sized buckets: walking distance (0-5KM), driving distance 
> (5KM-150KM) and everything else (150KM+), for instance.  We should be able to 
> quantize the results into arbitrarily sized buckets.  I'd propose the syntax 
> to be a comma separated list of sizes for each bucket.  If only one value is 
> specified, then it behaves as it currently does.  Otherwise, it creates the 
> different size buckets.  If the number of buckets doesn't evenly divide up 
> the space, then the size of the last bucket specified is used to fill out the 
> remaining space (not sure on this)
> For instance,
> facet.range.start=0
> facet.range.end=400
> facet.range.gap=5,25,50,100
> would yield buckets of:
> 0-5,5-30,30-80,80-180,180-280,280-380,380-400
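
The bucket arithmetic in the quoted description can be sketched as follows. This is only an illustration of the rule as described there (use each gap in turn, keep reusing the last gap until the end is reached, and clip the final bucket at {{facet.range.end}}), not the code in the attached patches; the function name {{gap_buckets}} is hypothetical.

```python
def gap_buckets(start, end, gaps):
    """Return (lower, upper) pairs covering start..end using the given gaps.

    Once the gap list is exhausted, the last gap is reused; the final
    bucket is clipped so that it never extends past `end`.
    """
    if not gaps or any(gap <= 0 for gap in gaps):
        raise ValueError("gaps must be a non-empty list of positive numbers")
    buckets = []
    lower = start
    index = 0
    while lower < end:
        gap = gaps[min(index, len(gaps) - 1)]
        upper = min(lower + gap, end)
        buckets.append((lower, upper))
        lower = upper
        index += 1
    return buckets


print(",".join(f"{lo}-{hi}" for lo, hi in gap_buckets(0, 400, [5, 25, 50, 100])))
# prints 0-5,5-30,30-80,80-180,180-280,280-380,380-400
```

With a single gap value this reduces to evenly spaced buckets, matching the "behaves as it currently does" case in the description.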

