[ https://issues.apache.org/jira/browse/SOLR-12561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16548094#comment-16548094 ]
David Smiley commented on SOLR-12561:
-------------------------------------
Patch:
* Use java.time thoroughly; no java.text remnants nor use of Date. Always
Instant.
* Enhanced TestExtractionDateUtil a lot so that it tests more of the supported
patterns and their idiosyncrasies. I want to ensure we don't break back-compat
here! These better tests helped uncover some issues during development of this
switch.
* Two of the default patterns had a lowercase "hh" for hour of AM/PM instead of
"HH" for hour of day. SimpleDateFormat seemed to deal with this but I think
they are fundamentally invalid without an AM/PM qualifier. I switched them to
HH. If someone custom configures the patterns in their solr config, they'll
need to use the correct designator.
* Use parsed DateTimeFormatter instances instead of Strings in
SolrContentHandler and its factory. Since order might be significant or might
be used for performance reasons, I also switched to LinkedHashSet from HashSet
for the impl in ExtractingRequestHandler's config parser.
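To illustrate the last bullet, here is a minimal sketch of pre-parsed
formatters kept in insertion order and tried one after another. It is not the
code from the patch: the class name PreParsedPatterns, the two patterns, and
the choice of UTC as the default zone are all assumptions made for the example.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical class, for illustration only.
class PreParsedPatterns {
    // LinkedHashSet preserves the configured order, unlike HashSet.
    static final Set<DateTimeFormatter> FORMATTERS = new LinkedHashSet<>();

    static {
        // Assumed example patterns; the real defaults live in ExtractionDateUtil.
        String[] patterns = {"yyyy-MM-dd'T'HH:mm:ss'Z'", "yyyy-MM-dd HH:mm:ss"};
        for (String pattern : patterns) {
            // Each pattern string is parsed once, up front, not on every request.
            // withZone supplies a zone so text without an offset resolves to an Instant.
            FORMATTERS.add(
                DateTimeFormatter.ofPattern(pattern, Locale.ROOT).withZone(ZoneOffset.UTC));
        }
    }

    // Tries each formatter in order and returns the first successful result.
    static Instant parse(String text) {
        for (DateTimeFormatter formatter : FORMATTERS) {
            try {
                return formatter.parse(text, Instant::from);
            } catch (DateTimeParseException e) {
                // Not this pattern; fall through to the next one.
            }
        }
        throw new IllegalArgumentException("Unparseable date: " + text);
    }
}
```

Because the first formatter that matches wins, iteration order affects both
the result (when two patterns accept the same text) and how many attempts the
common case costs, which is why a predictable order is useful here.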
This seems safe for 7.x; any break would seem to be very obscure IMO. On the
other hand, 8.0 will be out this fall or so.
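The one behavior change called out above, "hh" versus "HH", can be reproduced
with a small sketch. The class HourPatternDemo and its helper parseOrNull are
hypothetical names, and the patterns are examples, not the ones from the
default configuration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical class, for illustration only.
class HourPatternDemo {
    // Returns the parsed value, or null if java.time rejects the input.
    static LocalDateTime parseOrNull(String pattern, String text) {
        try {
            return LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern, Locale.ROOT));
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // "HH" is hour of day (0-23), so 15 parses.
        System.out.println(parseOrNull("yyyy-MM-dd HH:mm:ss", "2018-07-18 15:30:00"));
        // "hh" is clock hour of AM/PM (1-12), so 15 is out of range.
        System.out.println(parseOrNull("yyyy-MM-dd hh:mm:ss", "2018-07-18 15:30:00"));
        // Even an in-range hour cannot resolve to a time without an AM/PM marker.
        System.out.println(parseOrNull("yyyy-MM-dd hh:mm:ss", "2018-07-18 03:30:00"));
    }
}
```

In this sketch only the first call yields a value; the two "hh" calls return
null. A custom pattern that pairs "hh" with an AM/PM designator ("a") should
be unaffected.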
> Port ExtractionDateUtil to java.time API
> ----------------------------------------
>
> Key: SOLR-12561
> URL: https://issues.apache.org/jira/browse/SOLR-12561
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public)
> Components: contrib - Solr Cell (Tika extraction)
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Minor
> Fix For: master (8.0)
>
> Attachments: SOLR-12561.patch
>
>
> The ExtractionDateUtil class in the extraction contrib uses
> SimpleDateFormat. The Java 8 java.time API is superior; you can find
> articles out there explaining why. One thing that comes to mind is less
> timezone bugginess (see SOLR-10243), although the API may be a bit baroque
> IMO (over-engineered). Here I'd like to switch over to that API and
> furthermore have the patterns be pre-parsed so that at runtime we don't
> need to re-parse the patterns.