[
https://issues.apache.org/jira/browse/SOLR-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Smiley updated SOLR-9080:
-------------------------------
Attachment: SOLR_9080_DateMath_should_not_use_Calendar_API.patch
The test reported (erroneously) that all was well when I changed the year to
1234 in most test methods, for either of two reasons (or both): the test itself
uses SimpleDateFormat (which uses Calendar) as a source of truth, and it
called directly into the Calendar-based utility methods of DateMathParser
instead of constructing a String date math expression.
This patch addresses both of those issues in the test, and changes most years
in the test to 1234. I left some where they were because specific dates were
used to craft time zone offset functionality.
And I fixed DateMathParser itself, which was kinda fun. I removed or made
non-public some things that weren't being used outside of itself or the test.
* Note that a {{Locale}} is no longer needed/used in this API, and it's dubious
whether it ever had an effect before, at least judging by a comment about it
affecting which day the week starts on (who cares?). Only the DIH
DateFormatEvaluator passes something other than Locale.ROOT: it uses
Locale.ENGLISH with the ability to pick something else, and it's not evident
that this is tested.
This patch is probably not the final one, as I want to change DateMathParser's
API in a way that will affect some callers.
I'm inclined to think DateMathParser should not be something constructed -- it
just needs static methods. It should also switch away from java.util.TimeZone
to java.time.ZoneId in the API. Maybe a separate issue for such things.
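To make the static-methods-plus-ZoneId idea concrete, here is a minimal sketch. The class and method names (DateMathSketch, roundToYear, addYears) are hypothetical and are not taken from the patch or from Solr; the sketch only illustrates the shape such an API could take on top of java.time.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper for illustration only; not the actual DateMathParser API. */
final class DateMathSketch {
  private DateMathSketch() {} // static methods only, nothing to construct

  /** Rounds down to the start of the year as observed in the given zone. */
  static Instant roundToYear(Instant instant, ZoneId zone) {
    return instant.atZone(zone)
        .truncatedTo(ChronoUnit.DAYS) // drop the time-of-day part
        .withDayOfYear(1)             // move to January 1st of the same year
        .toInstant();
  }

  /** Adds (or subtracts, if negative) whole years in the given zone. */
  static Instant addYears(Instant instant, long years, ZoneId zone) {
    return instant.atZone(zone).plusYears(years).toInstant();
  }

  public static void main(String[] args) {
    ZoneId utc = ZoneId.of("UTC");
    Instant start = roundToYear(Instant.parse("1300-06-15T12:30:00Z"), utc);
    System.out.println(start);                    // 1300-01-01T00:00:00Z
    System.out.println(addYears(start, 10, utc)); // 1310-01-01T00:00:00Z
  }
}
```

Because java.time uses the proleptic ISO calendar throughout, the same code path applies before and after 1582, and no Locale is involved.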
> DateMath is broken before the year 1582
> ---------------------------------------
>
> Key: SOLR-9080
> URL: https://issues.apache.org/jira/browse/SOLR-9080
> Project: Solr
> Issue Type: Bug
> Reporter: David Smiley
> Assignee: David Smiley
> Fix For: 6.0
>
> Attachments: SOLR_9080_DateMath_should_not_use_Calendar_API.patch
>
>
> In Solr 6.0, dates are parsed using the Java 8 java.time API. They were
> formerly parsed using java.util.SimpleDateFormat, which uses
> java.util.GregorianCalendar. I've learned that the java.time API does _not_
> switch to a different algorithm at the Gregorian Change Date (year 1582)
> whereas GregorianCalendar does. A ramification of this is that the
> milliseconds before epoch value is different between the APIs for dates prior
> to this year. They both round-trip between themselves but not between each
> other prior to this date. Thus, anyone indexing historical dates must
> re-index when moving to Solr 6.
> What was _not_ changed in the parsing code was Solr's date-math logic -- it
> still uses the Calendar API. This works for dates after 1582, but for earlier
> dates it introduces discrepancies. Here's an example showing weird behavior:
> http://localhost:8983/solr/techproducts/select?facet.range.end=1400-01-01T00:00:00Z&facet.range.gap=%2B10YEARS&facet.range.start=1300-01-01T00:00:00Z/YEAR&facet.range=manufacturedate_dt&facet=on&indent=on&q=*:*&rows=0&wt=json
> Note that the year 1300, rounded down to the year, becomes January 8th, 1299
> (weird in and of itself) and that subsequent gaps start on the 9th.
> {noformat}
> "counts":[
> "1299-01-08T00:00:00Z",0,
> "1309-01-09T00:00:00Z",0,
> "1319-01-09T00:00:00Z",0, ...
> {noformat}
> This weirdness will show itself for units at the year or month level, but not
> below that (from what I'm seeing). In other words, if facet.range.gap uses a
> year or month unit, or the date math syntax is otherwise used to round or add
> a year or month, there will be issues like this. Otherwise there doesn't seem
> to be an issue.
> I think the solution is clearly to switch the date math code to use the
> java.time API.
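The discrepancy described above can be reproduced outside Solr with a short sketch. The class and method names (CalendarVsJavaTime, viaCalendar, viaJavaTime) are made up for illustration and this is not Solr's date-math code; it only assumes that the instant is parsed with java.time and then rounded and incremented with a default GregorianCalendar, which switches to Julian rules before the 1582 cutover.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

/** Illustrative only: contrasts Calendar-based and java.time-based year math. */
final class CalendarVsJavaTime {
  private CalendarVsJavaTime() {}

  /** Rounds down to the year, then adds years, using the legacy Calendar API. */
  static Instant viaCalendar(Instant instant, int yearsToAdd) {
    Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.ROOT);
    cal.setTimeInMillis(instant.toEpochMilli()); // interpreted with Julian rules before 1582
    cal.set(Calendar.MONTH, Calendar.JANUARY);
    cal.set(Calendar.DAY_OF_MONTH, 1);
    cal.set(Calendar.HOUR_OF_DAY, 0);
    cal.set(Calendar.MINUTE, 0);
    cal.set(Calendar.SECOND, 0);
    cal.set(Calendar.MILLISECOND, 0);
    cal.add(Calendar.YEAR, yearsToAdd);
    return Instant.ofEpochMilli(cal.getTimeInMillis());
  }

  /** The same operation using java.time (proleptic ISO calendar). */
  static Instant viaJavaTime(Instant instant, int yearsToAdd) {
    return instant.atOffset(ZoneOffset.UTC)
        .truncatedTo(ChronoUnit.DAYS)
        .withDayOfYear(1)
        .plusYears(yearsToAdd)
        .toInstant();
  }

  public static void main(String[] args) {
    Instant start = Instant.parse("1300-01-01T00:00:00Z");
    System.out.println(viaCalendar(start, 0));  // 1299-01-08T00:00:00Z
    System.out.println(viaCalendar(start, 10)); // 1309-01-09T00:00:00Z
    System.out.println(viaJavaTime(start, 0));  // 1300-01-01T00:00:00Z
    System.out.println(viaJavaTime(start, 10)); // 1310-01-01T00:00:00Z
  }
}
```

The Calendar results line up with the first two facet counts shown above: the calendar reads the instant as a late-December 1299 Julian date, rounds to Julian January 1st, 1299, and java.time then prints that instant as January 8th.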