[ https://issues.apache.org/jira/browse/SOLR-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512372 ]
Pieter Berkel commented on SOLR-258:
------------------------------------
I've just tried this patch and the results are impressive!
I agree with Ryan regarding the naming of 'pre', 'post' and 'inner': using
simple, concrete words will make it easier for developers to understand the
basic concepts. At first I was a little confused about how the 'gap' parameter
was used; perhaps a name like 'interval' would be more indicative of its
purpose?
While on the topic of gaps / intervals, I can imagine a case where one might
want facet counts over non-linear intervals, for instance obtaining results
from: "Last 7 days", "Last 30 days", "Last 90 days", "Last 6 months".
Obviously you can achieve this by setting facet.date.gap=+1DAY and then
post-processing the results, but a much more elegant solution would be to
allow "facet.date.gap" (or another suitably named param) to accept a
(comma-delimited) set of explicit partition dates:
facet.date.start=NOW-6MONTHS/DAY
facet.date.end=NOW/DAY
facet.date.gap=NOW-90DAYS/DAY,NOW-30DAYS/DAY,NOW-7DAYS/DAY
It would then be trivial to calculate facet counts for the ranges specified
above.
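To make the idea concrete, here is a minimal Python sketch of counting over explicit partition dates. It is not how the patch works; the function names, the sample documents, and the use of 180 days as a stand-in for 6 months are all assumptions made for illustration.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def count_by_boundaries(dates, start, end, boundaries):
    """Count dates per [lower, upper) range; each range is keyed by its lower edge."""
    edges = [start] + sorted(boundaries) + [end]
    counts = [0] * (len(edges) - 1)
    for d in dates:
        if start <= d < end:
            # bisect_right finds the first edge greater than d, so the
            # bucket holding d is the one just before it.
            counts[bisect_right(edges, d) - 1] += 1
    return list(zip(edges[:-1], counts))


def cumulative_from_end(bucket_counts):
    """Running totals from the newest bucket backwards ("last 7 days", "last 30 days", ...)."""
    totals, running = [], 0
    for lower, count in reversed(bucket_counts):
        running += count
        totals.append((lower, running))
    return totals


now = datetime(2007, 7, 13)  # stand-in for NOW/DAY (assumed value)


def days_ago(n):
    return now - timedelta(days=n)


# Hypothetical document dates; the last one falls before 'start' and is ignored.
docs = [days_ago(1), days_ago(10), days_ago(45), days_ago(100), days_ago(200)]
buckets = count_by_boundaries(
    docs, days_ago(180), now, [days_ago(90), days_ago(30), days_ago(7)]
)
for lower, total in cumulative_from_end(buckets):
    print(lower.date(), total)
```

The per-bucket counts are disjoint, so the "Last N days" figures fall out as running sums starting from the most recent bucket.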
It would be useful to make the 'start' and 'end' parameters optional. If not
specified, 'start' should default to the earliest stored date value and 'end'
should default to the latest stored date value (assuming that's possible).
The request should probably return a 400 if 'gap' is not set.
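A rough Python sketch of the suggested defaulting behaviour follows. The `BadRequest` class and `resolve_range` function are hypothetical names, not part of Solr or the patch, and the parameter values are assumed to be already parsed.

```python
from datetime import datetime


class BadRequest(Exception):
    """Hypothetical error that a request handler would map to HTTP 400."""
    status = 400


def resolve_range(params, stored_dates):
    """Return (start, end, gap), defaulting start/end to the stored extremes."""
    gap = params.get("facet.date.gap")
    if gap is None:
        raise BadRequest("facet.date.gap is required")
    start = params.get("facet.date.start")
    end = params.get("facet.date.end")
    if (start is None or end is None) and not stored_dates:
        raise ValueError("no stored dates to derive a default range from")
    if start is None:
        start = min(stored_dates)  # earliest stored date value
    if end is None:
        end = max(stored_dates)    # latest stored date value
    return start, end, gap
```

Whether the earliest and latest stored values can be looked up cheaply is an open question in the comment above; the sketch simply assumes they are available as a list.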
My personal opinion is that 'end' should be a hard limit: the last gap should
never go past 'end'. Given that the facet label is always generated from the
lower value in the range, I don't think truncating the last 'gap' will cause
problems. However, it may be helpful to return the actual date value for "end"
if it was specified as an offset of NOW.
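As an illustration of the hard-limit behaviour, here is a small Python sketch that clips the last range at 'end'. It uses a fixed `timedelta` as a stand-in for Solr date math (so it cannot express something like +1MONTH), and `gap_ranges` is a hypothetical name.

```python
from datetime import datetime, timedelta


def gap_ranges(start, end, gap):
    """Split [start, end) into consecutive ranges of size gap, truncating the last at end."""
    if gap <= timedelta(0):
        raise ValueError("gap must be positive")
    ranges = []
    lower = start
    while lower < end:
        upper = min(lower + gap, end)  # never let a range run past 'end'
        ranges.append((lower, upper))
        lower = upper
    return ranges


ranges = gap_ranges(datetime(2007, 7, 1), datetime(2007, 7, 10), timedelta(days=4))
for lower, upper in ranges:
    print(lower.date(), "->", upper.date())
```

Each range is still labelled by its lower value, so the shortened final range needs no special label.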
What might be a problem is that when both start and end dates are specified as
offsets of NOW, the value of NOW may not be the same for both. In one of my
tests, I set:
facet.date.start=NOW-12MONTHS
facet.date.end=NOW
facet.date.gap=+1MONTH
With some extra debugging output I can see that mostly the value of NOW is the
same:
<str name="start">2006-07-13T06:06:07.397</str>
<str name="end">2007-07-13T06:06:07.397</str>
However occasionally there is a difference:
<str name="start">2006-07-13T05:48:23.014</str>
<str name="end">2007-07-13T05:48:23.015</str>
This difference alters the number of gaps calculated (one extra gap when the
NOW values differ between start and end). I'm not sure how this could be
fixed, but as you mentioned above, it will probably involve changing
"ft.toExternal(ft.toInternal(...))".
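The off-by-one can be reproduced outside Solr with a short Python sketch using the timestamps from the debug output above. The month arithmetic here is a simplified stand-in for Solr's date math, and resolving NOW once per request is only one possible fix, not necessarily what the patch should do.

```python
import calendar
from datetime import datetime


def add_months(dt, months):
    """Shift dt by a number of calendar months, clamping the day to the month's length."""
    year, month0 = divmod(dt.year * 12 + (dt.month - 1) + months, 12)
    day = min(dt.day, calendar.monthrange(year, month0 + 1)[1])
    return dt.replace(year=year, month=month0 + 1, day=day)


def count_month_gaps(start, end):
    """Count how many +1MONTH gaps are needed to cover [start, end)."""
    count = 0
    while add_months(start, count) < end:
        count += 1
    return count


# NOW identical for both parameters: exactly 12 gaps.
same = count_month_gaps(datetime(2006, 7, 13, 6, 6, 7, 397000),
                        datetime(2007, 7, 13, 6, 6, 7, 397000))

# NOW drifted by one millisecond between start and end: a 13th gap appears.
drifted = count_month_gaps(datetime(2006, 7, 13, 5, 48, 23, 14000),
                           datetime(2007, 7, 13, 5, 48, 23, 15000))

# Possible fix (assumption): resolve NOW once and derive both values from it.
now = datetime(2007, 7, 13, 5, 48, 23, 15000)
fixed = count_month_gaps(add_months(now, -12), now)

print(same, drifted, fixed)
```

Rounding both values (for example with /DAY, as in the earlier NOW-6MONTHS/DAY example) would also hide the drift in most cases, though not for requests that straddle a rounding boundary.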
Thanks again for creating this useful addition, I'll try to test it a bit more
and see if I can find anything else.
> Date based Facets
> -----------------
>
> Key: SOLR-258
> URL: https://issues.apache.org/jira/browse/SOLR-258
> Project: Solr
> Issue Type: New Feature
> Reporter: Hoss Man
> Assignee: Hoss Man
> Attachments: date_facets.patch, date_facets.patch, date_facets.patch,
> date_facets.patch, date_facets.patch
>
>
> 1) Allow clients to express concepts like...
> * "give me facet counts per day for every day this month."
> * "give me facet counts per hour for every hour of today."
> * "give me facet counts per hour for every hour of a specific day."
> * "give me facet counts per hour for every hour of a specific day and
> give me facet counts for the
> number of matches before that day, or after that day."
> 2) Return all data in a way that makes it easy to use to build filter queries
> on those date ranges.