Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20621#discussion_r168924005
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala ---
@@ -407,6 +407,34 @@ object PartitioningUtils {
Literal(bigDecimal)
}
+    val dateTry = Try {
+      // try and parse the date, if no exception occurs this is a candidate to be resolved as
+      // DateType
+      DateTimeUtils.getThreadLocalDateFormat.parse(raw)
+      // SPARK-23436: Casting the string to date may still return null if a bad Date is provided.
+      // This can happen since DateFormat.parse may not use the entire text of the given string:
+      // so if there are extra characters after the date, it returns correctly.
+      // We need to check that we can cast the raw string since we later can use Cast to get
+      // the partition values with the right DataType (see
+      // org.apache.spark.sql.execution.datasources.PartitioningAwareFileIndex.inferPartitioning)
+      val dateValue = Cast(Literal(raw), DateType).eval()
+      // Disallow DateType if the cast returned null
+      require(dateValue != null)
+      Literal.create(dateValue, DateType)
+    }
+
+    val timestampTry = Try {
+      val unescapedRaw = unescapePathName(raw)
+      // try and parse the timestamp, if no exception occurs this is a candidate to be
+      // resolved as TimestampType
+      DateTimeUtils.getThreadLocalTimestampFormat(timeZone).parse(unescapedRaw)
--- End diff --
I don't think so, because the cast tolerates various timestamp formats
which `parse` doesn't support (please check the comment on
`DateTimeUtils.stringToTimestamp`). So I don't consider it safe to remove this;
in any case, it may introduce unintended behavior changes.
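
To illustrate the two behaviors at play, here is a minimal JVM-only sketch (plain `java.text`, no Spark dependencies; the pattern strings and input values are hypothetical examples, not taken from the PR). It shows that `DateFormat.parse` may succeed without consuming the whole string, which is why the extra `Cast` null-check is needed, and that a strict pattern rejects forms (such as the ISO `T` separator) that the string-to-timestamp cast would tolerate:

```scala
import java.text.{ParseException, SimpleDateFormat}

object ParseVsCastSketch {
  // Returns true if the pattern parses the input without throwing, even
  // when trailing characters are left unconsumed by DateFormat.parse.
  def lenientlyParses(pattern: String, input: String): Boolean =
    try { new SimpleDateFormat(pattern).parse(input) != null }
    catch { case _: ParseException => false }

  def main(args: Array[String]): Unit = {
    // Garbage after the date: parse still succeeds (stops at the first
    // unparseable character), so only a later null-check on the cast
    // can reject the value as a DateType partition.
    println(lenientlyParses("yyyy-MM-dd", "2018-01-01-not-a-date"))

    // ISO 'T' separator: rejected by this pattern, even though a
    // string-to-timestamp cast may accept such a form.
    println(lenientlyParses("yyyy-MM-dd HH:mm:ss", "2018-01-01T10:00:00"))
  }
}
```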
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]