[GitHub] spark pull request #20621: [SPARK-23436][SQL] Infer partition as Date only i...

mgaido91 Sun, 18 Feb 2018 01:45:09 -0800

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20621#discussion_r168945339
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala ---
    @@ -407,6 +407,34 @@ object PartitioningUtils {
           Literal(bigDecimal)
         }
     
    +    val dateTry = Try {
    +      // try and parse the date, if no exception occurs this is a candidate to be resolved as
    +      // DateType
    +      DateTimeUtils.getThreadLocalDateFormat.parse(raw)
    +      // SPARK-23436: Casting the string to date may still return null if a bad Date is provided.
    +      // This can happen since DateFormat.parse  may not use the entire text of the given string:
    +      // so if there are extra-characters after the date, it returns correctly.
    +      // We need to check that we can cast the raw string since we later can use Cast to get
    +      // the partition values with the right DataType (see
    +      // org.apache.spark.sql.execution.datasources.PartitioningAwareFileIndex.inferPartitioning)
    +      val dateValue = Cast(Literal(raw), DateType).eval()
    +      // Disallow DateType if the cast returned null
    +      require(dateValue != null)
    +      Literal.create(dateValue, DateType)
    +    }
    +
    +    val timestampTry = Try {
    +      val unescapedRaw = unescapePathName(raw)
    +      // try and parse the date, if no exception occurs this is a candidate to be resolved as
    +      // TimestampType
    +      DateTimeUtils.getThreadLocalTimestampFormat(timeZone).parse(unescapedRaw)
    --- End diff --
    
    Yes, you are right. The only change introduced is that some values that were
    previously wrongly inferred as dates will now be inferred as strings.
    Everything else works as before.
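
    For reference, a minimal standalone sketch of why such values used to pass
    the date check. It uses plain java.text.SimpleDateFormat rather than Spark's
    DateTimeUtils, and the "yyyy-MM-dd" pattern, the object name and the helper
    names are assumptions made for illustration only. DateFormat.parse(String)
    may stop before the end of the input, so trailing characters go unnoticed
    unless the consumed length is checked explicitly.

```scala
import java.text.{ParsePosition, SimpleDateFormat}
import java.util.Locale
import scala.util.Try

// Hypothetical demo object; not part of Spark.
object PrefixParseDemo {
  // Assumed pattern for illustration; a new formatter per call avoids sharing
  // the non-thread-safe SimpleDateFormat.
  private def newFormat = new SimpleDateFormat("yyyy-MM-dd", Locale.US)

  // True when parse(String) does not throw, even if trailing text is ignored.
  def parsesLeniently(raw: String): Boolean = Try(newFormat.parse(raw)).isSuccess

  // Number of leading characters the formatter actually consumed.
  def consumedChars(raw: String): Int = {
    val pos = new ParsePosition(0)
    newFormat.parse(raw, pos)
    pos.getIndex
  }

  // Stricter check: a date was parsed and the whole string was used.
  def consumesWholeString(raw: String): Boolean = {
    val pos = new ParsePosition(0)
    val parsed = newFormat.parse(raw, pos)
    parsed != null && pos.getIndex == raw.length
  }

  def main(args: Array[String]): Unit = {
    val raw = "2018-02-18_extra" // hypothetical partition value
    println(s"parse(String) succeeds: ${parsesLeniently(raw)}")
    println(s"consumed ${consumedChars(raw)} of ${raw.length} characters")
    println(s"whole string is a date: ${consumesWholeString(raw)}")
  }
}
```

    The PR itself takes a different route: it keeps the parse call and then
    additionally requires that Cast(Literal(raw), DateType) evaluates to a
    non-null value, so inference agrees with the cast applied later.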



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
