GitHub user gatorsmile opened a pull request:
https://github.com/apache/spark/pull/20764
[SPARK-23436][SQL][BACKPORT-2.3] Infer partition as Date only if it can be
casted to Date
This PR backports https://github.com/apache/spark/pull/20621 to branch
2.3.
---
## What changes were proposed in this pull request?
Before this patch, Spark could infer a partition value as Date even when
the value cannot be cast to Date. This can happen when extra characters
follow a valid date, as in `2018-02-15AAA`.
When this happens and the input format has metadata that defines the schema
of the table, `null` is returned as the value of the partition column,
because the `cast` operator used in
`PartitioningAwareFileIndex.inferPartitioning` is unable to convert the value.
This PR makes partition inference check that a value can actually be cast to
Date or Timestamp before inferring that data type for it.
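The mismatch described above can be illustrated with a minimal Python sketch. This is not Spark's Scala implementation: the function names are hypothetical, and the prefix-tolerant check is an assumption standing in for whatever lenient parsing accepted values such as `2018-02-15AAA`. It only shows the idea of inferring Date when a strict cast also succeeds.

```python
import re
from datetime import date, datetime
from typing import Optional


def lenient_is_date(raw: str) -> bool:
    """Hypothetical lenient check: accepts any string that *starts* with a valid date."""
    match = re.match(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", raw)
    if match is None:
        return False
    try:
        datetime.strptime(match.group(0), "%Y-%m-%d")
        return True
    except ValueError:
        return False


def strict_cast_to_date(raw: str) -> Optional[date]:
    """Hypothetical strict cast: returns None (like a SQL null) if anything is left over."""
    try:
        return datetime.strptime(raw, "%Y-%m-%d").date()
    except ValueError:
        return None


def infer_partition_type(raw: str) -> str:
    """Infer 'date' only if the lenient check passes AND the strict cast succeeds."""
    if lenient_is_date(raw) and strict_cast_to_date(raw) is not None:
        return "date"
    return "string"


# The lenient check alone would accept the malformed value ...
print(lenient_is_date("2018-02-15AAA"))       # True
# ... but the cast cannot convert it, which is where the null came from.
print(strict_cast_to_date("2018-02-15AAA"))   # None
# With the extra check, the value falls back to a string partition.
print(infer_partition_type("2018-02-15AAA"))  # string
print(infer_partition_type("2018-02-15"))     # date
```

In this sketch, falling back to a string type keeps the raw partition value readable instead of silently turning it into `null`.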
## How was this patch tested?
Added unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gatorsmile/spark backport23436
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20764.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20764
----
commit a69d8d19b01438bc228deb6706c2dc59817a1cfd
Author: Marco Gaido <marcogaido91@...>
Date: 2018-02-20T05:56:38Z
[SPARK-23436][SQL] Infer partition as Date only if it can be casted to Date
Author: Marco Gaido <marcogaido91@...>
Closes #20621 from mgaido91/SPARK-23436.
----