Chris, thanks for all this info! I'll think about these things again
and then get back to you...

Cheers,
Martin


On Tue, 2007-06-26 at 23:22 -0700, Chris Hostetter wrote:
> : my documents (products) have a price field, and I want to have
> : a "dynamically" calculated range facet for that in the response.
> 
> FYI: there have been some previous discussions on this topic...
> 
> http://www.nabble.com/blahblah-t2387813.html#a6799060
> http://www.nabble.com/faceted-browsing-t1363854.html#a3753053
> 
> : AFAICS I do not have the possibility to specify range queries in my
> : application, as I do not have a clue what's the lowest and highest
> : price in the search result and what are "good" ranges according
> : to the (statistical) distribution of prices in the search result.
> 
> as mentioned in one of those threads, it's *really* hard to get the
> statistical sampling to the point where it's both balanced and user
> friendly.  writing code specifically for price ranges in dollars lets you
> make some assumptions about things that give you "nice" ranges (rounding
> to one significant digit less than the max, doing log based ranges, etc..)
> that wouldn't really apply if you were trying to implement a truly
> generic dynamic range generator.
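
A minimal sketch of the "nice" range idea above, in Python. The function
names and the 1-2-5 step sequence are assumptions chosen for illustration;
nothing here comes from Solr itself, and it assumes positive prices in
whole currency units.

```python
def nice_ceiling(value):
    """Round a price (>= 1) up to the nearest 1, 2, or 5 times a power of ten."""
    if value < 1:
        raise ValueError("value must be at least 1")
    base = 1
    while base * 10 <= value:
        base *= 10
    for step in (1, 2, 5, 10):
        if step * base >= value:
            return step * base


def log_price_ranges(lowest, highest):
    """Return (low, high] buckets on a 1-2-5 log scale covering lowest..highest."""
    if lowest <= 0 or highest < lowest:
        raise ValueError("need 0 < lowest <= highest")
    edges = [0]
    base = 1
    # Grow the edge list 1, 2, 5, 10, 20, 50, ... until it reaches the max.
    while edges[-1] < highest:
        for step in (1, 2, 5):
            edges.append(step * base)
            if edges[-1] >= highest:
                break
        base *= 10
    buckets = zip(edges, edges[1:])
    # Drop buckets that lie entirely below the lowest price.
    return [(low, high) for low, high in buckets if high >= lowest]


print(nice_ceiling(180))
print(log_price_ranges(3, 180))
```

Because the edges depend only on the scale of the prices and not on their
distribution, the same bucket boundaries tend to come back for related
queries.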
> 
> one thing to keep in mind: it's typically not a good idea to have the
> constraint set of a facet change just because some other constraint was
> added to the query -- individual constraints might disappear because
> they no longer apply, but it can be very disconcerting to a user
> when options change on them....  if i search on "ipod" a statistical
> analysis of prices might yield facet ranges of $1-20, $20-60, $60-120,
> $120-$200 ... if i then click on "accessories" the statistics might skew
> cheaper, so the new ranges are $1-20, $20-30, $30-40, $40-70 ...  and now
> i'm a frustrated user, because i really wanted to use the range $20-60
> (that just happens to be my budget) and you offered it to me and then you
> took it away ... i have to undo my selection of "accessories" then click
> $20-60, and then click accessories to get what i want ... not very nice.
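
One way to picture the stable alternative, as a small Python sketch: the
range set is fixed up front, and a range only drops out when nothing
matches it. The range values are taken from the "ipod" example above; the
function name and the sample prices are made up for illustration.

```python
# Fixed constraint set, independent of the current query.
FIXED_RANGES = [(1, 20), (20, 60), (60, 120), (120, 200)]


def count_fixed_ranges(prices, ranges=FIXED_RANGES):
    """Count prices per fixed range (low inclusive, high exclusive).

    Ranges with no matches are omitted, but the boundaries never move.
    """
    counts = []
    for low, high in ranges:
        matches = sum(1 for price in prices if low <= price < high)
        if matches > 0:
            counts.append(((low, high), matches))
    return counts


ipod_prices = [5, 25, 30, 59, 150]   # hypothetical result set for "ipod"
accessory_prices = [5, 12, 25]       # hypothetical subset after "accessories"

print(count_fixed_ranges(ipod_prices))
print(count_fixed_ranges(accessory_prices))
```

The $20-60 range is still offered after narrowing to accessories, only with
a smaller count.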
> 
> : So if it would be possible to go over each item in the search result
> : I could check the price field and define my ranges for the specific
> : query on solr side and return the price ranges as a facet.
> 
> : Otherwise, what would be a good starting point to plug in such
> : functionality into solr?
> 
> if you really want to do statistical distributions, one way to avoid doing
> all of this work on the client side (and needing to pull back all of the
> prices from all of the matches) would be to write a custom request handler
> that subclasses whichever one you currently use and does this computation
> on the server side -- where it has lower level access to the data and
> doesn't need to stream it over the wire.  FieldCache in particular would
> come in handy.
> 
> it occurs to me that even though there may not be a way to dynamically
> create facet ranges that can apply usefully on any numeric field, we could
> add generic support to the request handlers for optionally fetching some
> basic statistics about a DocSet for clients that want them (either for
> building ranges, or for any other purpose)
> 
> min, max, mean, median, mode, midrange ... those should all be easy to
> compute using the ValueSource from the field type (it would be nice if
> FieldTypes had some way of indicating which DocValues function can best
> manage the field type, but we can always assume float or have an option
> for dictating it ... people might want a float mean for an int field
> anyway)
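
For reference, the statistics listed above computed over a plain list of
field values, as a Python sketch. This only illustrates the arithmetic; it
does not use ValueSource or any other Solr API, and the function name and
returned dictionary keys are assumptions.

```python
import statistics
from collections import Counter


def basic_stats(values):
    """Return min, max, mean, median, mode and midrange for a non-empty list."""
    if not values:
        raise ValueError("values must not be empty")
    lowest, highest = min(values), max(values)
    return {
        "min": lowest,
        "max": highest,
        # Float mean even for int input, as suggested for int fields.
        "mean": sum(values) / len(values),
        "median": statistics.median(values),
        # Most frequent value; ties go to the value seen first.
        "mode": Counter(values).most_common(1)[0][0],
        "midrange": (lowest + highest) / 2,
    }


print(basic_stats([10, 20, 20, 50]))
```

Note that median and mode need all values (or a frequency table) in hand,
while min, max, mean and midrange can be accumulated in one pass.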
> 
> i suppose even stddev could be computed fairly easily ... there's a
> formula for that which works well in a single pass over a bunch of
> values, right?
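
One well-known single-pass formula is Welford's algorithm, which updates a
running mean and a running sum of squared deviations per value. A minimal
Python sketch follows; the function name is an assumption, and it returns
the population standard deviation (divide by count - 1 instead for the
sample version).

```python
import math


def running_stddev(values):
    """Population standard deviation in one pass (Welford's algorithm)."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    if count == 0:
        raise ValueError("values must not be empty")
    return math.sqrt(m2 / count)


print(running_stddev([2, 4, 4, 4, 5, 5, 7, 9]))
```

Unlike the naive sum-of-squares formula, this avoids subtracting two large,
nearly equal numbers, so it stays accurate when prices are large and close
together.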
> 
> 
> 
> 
> -Hoss
> 
-- 
Martin Grotzke
http://www.javakaffee.de/blog/
