: It's more than string processing, anyway. I would want to convert the
: Solr Time 2007-03-15T00:41:5:2Z to "March 15th, 2007" in a web app.
: I'd also like to say "Posted 3 days ago." In my vision of things,
: that work is done on Solr's side. (The former case with a strftime
: type formatter in solrconfig, the latter by having strftime return
: the day number this year.)

One of the early architecture/design principles of the Solr "search"
APIs was "compute secondary info about a result if it's more
efficient or easier to compute in Solr than it would be for a client to do
it" -- DocSet caches, facet counts, and sorting/pagination being
great examples of things where Solr can do less "work" to get the same
info out of raw data than a client app would because of its low level
access to the data, and because of how much data would need to go over the
wire for the client to do the same computation. ... That's largely just a
little bit of historic trivia, however. Solr has a lot of features now which
might not hold up to that yardstick, but I mention it only to clarify one
of the reasons Solr didn't have more "configurable" date formatting to
start with.

It has been on the TaskList since the start of incubation, however...

  * a DateTime field (or Query Parser extension) that allows flexible
    input for easier human entered queries
  * allow alternate format for date output to ease client creation of
    date objects?

One of the reasons I don't think anyone has tackled them yet is that
it's hard to get a holistic view of a solution: there are really
several loosely related problems with date formatting:

The first is a discussion of the "internal format" and what resolution the
dates are stored at in the index itself.  If you *know* that you never
plan on querying with anything more fine grained than day resolution,
storing your dates with only day resolution can make your index a lot
smaller (and make date searches a lot faster).  With the current DateField
the same performance benefits can be achieved by "rounding" your dates
before indexing them, but if we were to make it a config option on
DateField itself to automatically round, we would need to take this info
into account when parsing updates -- should the client be expected to know
what precision each date field uses?  Do they send dates expressed using
the "internal" format, or as fully qualified times?  Is it an
error/warning to attempt to index more datetime precision than a field
supports?
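
As a rough illustration of the client-side "rounding" mentioned above, here
is a minimal Python sketch that truncates a timestamp to day resolution
before it is sent for indexing.  The helper names (round_to_day,
to_solr_date) and the sample timestamp are made up for this example, and it
assumes the field takes UTC ISO 8601 strings with a trailing "Z".

```python
from datetime import datetime, timezone


def round_to_day(dt: datetime) -> datetime:
    """Truncate a timezone-aware datetime to midnight UTC of its UTC day."""
    utc = dt.astimezone(timezone.utc)
    return utc.replace(hour=0, minute=0, second=0, microsecond=0)


def to_solr_date(dt: datetime) -> str:
    """Format a timezone-aware datetime as a UTC ISO 8601 string (assumed field format)."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Arbitrary sample value, not taken from any real index.
original = datetime(2007, 3, 15, 13, 27, 45, tzinfo=timezone.utc)
rounded = to_solr_date(round_to_day(original))
print(rounded)
```

Every document from the same day then indexes the same term for that field,
which is where the smaller index comes from.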

The second is a discussion of "external format" (which seems to be what
you are mostly discussing).  The most trivial way to address this would be
options on the ResponseWriters that allow them to be configured with
DateFormatter Strings they would use to process any date they return ... but
that raises questions about the QueryParsing aspect as well ... should
date formatting be a property of the response, or a property of the
request, such that both input and output formats are identical?
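
For comparison, the two conversions asked about in the quoted message are
small when done by the client today.  This is only a sketch: the function
names are hypothetical, the sample timestamps are invented, and it assumes
the response carries dates as UTC strings with second precision.

```python
from datetime import datetime, timezone

SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed shape of dates in the response
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def parse_solr_date(value: str) -> datetime:
    """Parse a UTC date string into a timezone-aware datetime."""
    return datetime.strptime(value, SOLR_FORMAT).replace(tzinfo=timezone.utc)


def ordinal(day: int) -> str:
    """Return the day with its English ordinal suffix, e.g. 15 -> '15th'."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def long_date(dt: datetime) -> str:
    """Render a date as e.g. 'March 15th, 2007'."""
    return f"{MONTHS[dt.month - 1]} {ordinal(dt.day)}, {dt.year}"


def posted_ago(dt: datetime, now: datetime) -> str:
    """Render the number of whole days elapsed between dt and now."""
    days = (now - dt).days
    if days <= 0:
        return "Posted today"
    if days == 1:
        return "Posted 1 day ago"
    return f"Posted {days} days ago"


posted = parse_solr_date("2007-03-15T00:41:52Z")
now = parse_solr_date("2007-03-18T09:00:00Z")
print(long_date(posted))
print(posted_ago(posted, now))
```

Note that the "days ago" text depends on the time of the request, which is
one reason it is awkward to treat as a fixed output format.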

Third, the discussions of the internal format and the external
format shouldn't be treated as completely independent.  It's tempting to say
that there will be a clean abstraction between the two, that all client
interaction will be done using configured "external" formatter(s) to create
internal java Date objects, which will then be translated back to Strings
by an "internal" formatter for the purpose of indexing (and querying), but
what happens when a query expresses a date range too precise for the
granularity expressed by the internal format? Do we match
nothing/everything? ... What if the indexed granularity is *more* precise
than the query granularity ... how do we know if a range query between March
6, 2007 and May 10, 2007 on a field that stores millisecond granularity
is supposed to go from the first millisecond of each day or the last?
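
To make the ambiguity concrete, this small Python sketch expands the same
day-resolution range in two ways.  The helper names are hypothetical, and it
assumes UTC day boundaries and a millisecond-precision string format; it
does not show how Solr itself resolves the question.

```python
from datetime import date, datetime, time, timedelta, timezone


def day_start(d: date) -> datetime:
    """First millisecond of the given day, in UTC."""
    return datetime.combine(d, time.min, tzinfo=timezone.utc)


def day_end(d: date) -> datetime:
    """Last millisecond of the given day, in UTC."""
    return day_start(d) + timedelta(days=1) - timedelta(milliseconds=1)


def fmt(dt: datetime) -> str:
    """Format with millisecond precision (assumed external format)."""
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"


lower, upper = date(2007, 3, 6), date(2007, 5, 10)

# Reading 1: both named days are fully included.
widest = (fmt(day_start(lower)), fmt(day_end(upper)))
# Reading 2: both named days are effectively excluded.
narrowest = (fmt(day_end(lower)), fmt(day_start(upper)))

print(widest)
print(narrowest)
```

The two readings differ by nearly two full days of documents, and nothing in
the query as written says which one was meant.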



Questions like these are why I'm glad Solr currently keeps it simple and
makes people deal in absolutes ... less room for confusion  :)


-Hoss
