GitHub user MaxGekk opened a pull request:
https://github.com/apache/spark/pull/23201
[SPARK-26246][SQL] Infer date and timestamp types from JSON
## What changes were proposed in this pull request?
The `JsonInferSchema` class is extended to infer `DateType` and
`TimestampType` from string fields in JSON input. It first tries to infer
`TimestampType`, as the tightest type. If timestamp parsing fails, it tries
to infer `DateType` using the date pattern. If both attempts fail, it falls
back to `DateTimeUtils.stringToTime`.
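The inference order described above can be sketched roughly as follows. This is an illustration in Python, not the patch itself: the two patterns, the helper names `_parses` and `infer_string_field`, and the use of `datetime.fromisoformat` as a stand-in for the lenient `DateTimeUtils.stringToTime` fallback are all assumptions. It also assumes that a successful fallback yields `TimestampType` and that a string matching none of the three steps stays `StringType`.

```python
from datetime import datetime

# Hypothetical patterns for illustration; in Spark they come from the JSON options.
TIMESTAMP_PATTERN = "%Y-%m-%dT%H:%M:%S"
DATE_PATTERN = "%Y-%m-%d"


def _parses(value, pattern):
    """Return True if the whole string matches the given strptime pattern."""
    try:
        datetime.strptime(value, pattern)
        return True
    except ValueError:
        return False


def infer_string_field(value):
    """Infer a type name for a JSON string value, trying the tightest type first."""
    # 1. Timestamp pattern first, as the tightest type.
    if _parses(value, TIMESTAMP_PATTERN):
        return "TimestampType"
    # 2. Date pattern if timestamp parsing failed.
    if _parses(value, DATE_PATTERN):
        return "DateType"
    # 3. Lenient fallback (stand-in for DateTimeUtils.stringToTime).
    #    Assumption: a successful fallback is treated as a timestamp.
    try:
        datetime.fromisoformat(value)
        return "TimestampType"
    except ValueError:
        return "StringType"
```

Under these assumptions, `infer_string_field("2018-12-02")` falls through the timestamp step and is matched by the date pattern, while a value that matches neither pattern only reaches the lenient fallback.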
## How was this patch tested?
Added a new test suite, `JsonInferSchemaSuite`, to check inference of date
and timestamp types from JSON. The changes were also tested by `JsonSuite`,
`JsonExpressionsSuite` and `JsonFunctionsSuite`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/MaxGekk/spark-1 json-infer-time
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/23201.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #23201
----
commit 2a26e2c680b517e9e89a0f4bc4cc31884020188d
Author: Maxim Gekk <max.gekk@...>
Date: 2018-12-02T20:06:05Z
Added a test for timestamp inferring
commit bd472072a39dbec2e1eec1396196c6c5e6a659dd
Author: Maxim Gekk <max.gekk@...>
Date: 2018-12-02T20:43:48Z
Infer date and timestamp types
commit 9dbdf0a764c998875932e50faf460f36216ef58d
Author: Maxim Gekk <max.gekk@...>
Date: 2018-12-02T20:44:08Z
Test for date type
----