Github user mgaido91 commented on a diff in the pull request:
https://github.com/apache/spark/pull/20621#discussion_r168924005
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala ---
@@ -407,6 +407,34 @@ object PartitioningUtils {
Literal(bigDecimal)
}
+    val dateTry = Try {
+      // try and parse the date, if no exception occurs this is a candidate to be resolved as
+      // DateType
+      DateTimeUtils.getThreadLocalDateFormat.parse(raw)
+      // SPARK-23436: Casting the string to date may still return null if a bad Date is provided.
+      // This can happen since DateFormat.parse may not use the entire text of the given string:
+      // so if there are extra characters after the date, it returns correctly.
+      // We need to check that we can cast the raw string since we later can use Cast to get
+      // the partition values with the right DataType (see
+      // org.apache.spark.sql.execution.datasources.PartitioningAwareFileIndex.inferPartitioning)
+      val dateValue = Cast(Literal(raw), DateType).eval()
+      // Disallow DateType if the cast returned null
+      require(dateValue != null)
+      Literal.create(dateValue, DateType)
+    }
+
+    val timestampTry = Try {
+      val unescapedRaw = unescapePathName(raw)
+      // try and parse the timestamp, if no exception occurs this is a candidate to be
+      // resolved as TimestampType
+      DateTimeUtils.getThreadLocalTimestampFormat(timeZone).parse(unescapedRaw)
--- End diff --
I don't think so, because the cast tolerates various timestamp formats
which `parse` doesn't support (please check the comment on
`DateTimeUtils.stringToTimestamp`). So I don't consider it safe to remove this;
in any case, it may introduce unintended behavior changes.
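
To illustrate the two behaviors at play, here is a minimal JVM-only sketch (plain `java.text`, no Spark dependencies; the pattern strings and input values are hypothetical examples, not taken from the PR). It shows that `DateFormat.parse` may succeed without consuming the whole string, which is why the extra `Cast` null-check is needed, and that a strict pattern rejects forms (such as the ISO `T` separator) that the string-to-timestamp cast would tolerate:

```scala
import java.text.{ParseException, SimpleDateFormat}

object ParseVsCastSketch {
  // Returns true if the pattern parses the input without throwing, even
  // when trailing characters are left unconsumed by DateFormat.parse.
  def lenientlyParses(pattern: String, input: String): Boolean =
    try { new SimpleDateFormat(pattern).parse(input) != null }
    catch { case _: ParseException => false }

  def main(args: Array[String]): Unit = {
    // Garbage after the date: parse still succeeds (stops at the first
    // unparseable character), so only a later null-check on the cast
    // can reject the value as a DateType partition.
    println(lenientlyParses("yyyy-MM-dd", "2018-01-01-not-a-date"))

    // ISO 'T' separator: rejected by this pattern, even though a
    // string-to-timestamp cast may accept such a form.
    println(lenientlyParses("yyyy-MM-dd HH:mm:ss", "2018-01-01T10:00:00"))
  }
}
```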
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]