Jonathancui123 commented on PR #36871:
URL: https://github.com/apache/spark/pull/36871#issuecomment-1192923903

   > Because "1765-03-28" does not match timestamp pattern and the column is
   > inferred as TimestampType, it should be returned as null. However, in the test
   > it is returned as 1765-03-28 00:00:00.0. This is at very least confusing -
   > inferDate should only affect dates, not timestamp columns.
   
   @sadikovi I've verified on an older version of Spark, prior to this PR, that
"1765-03-28" in a timestamp column without a user-specified format is parsed as
"1765-03-28 00:00:00.0". So parsing default-format dates in timestamp columns
is existing behavior and was not introduced by this PR.
   
   Prior to these changes, in a timestamp column:
   - custom format date: null
   - default format date: Parsed by fallback
   
   After inferDate PR, in a timestamp column:
   - custom format date: Parsed if inferDate is true, otherwise null
   - default format date: Parsed by fallback
   
   After enableParsingFallbackForDateType PR (#37147), in a timestamp column: 
   - custom format date: null
   - default format date: null
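
To make the three lists above easier to compare, here is a minimal sketch that encodes them as a lookup function. It is a toy model of the behavior described in this comment, not Spark code: the function name, the stage labels (`"before"`, `"after_inferDate"`, `"after_37147"`), and the `"custom"`/`"default"` labels are all hypothetical.

```python
NULL = "null"
PARSED = "parsed"


def timestamp_column_outcome(stage: str, date_kind: str, infer_date: bool) -> str:
    """Return "parsed" or "null" for a date string found in a timestamp column.

    stage:      hypothetical label for one of the three points in time above
    date_kind:  "custom" (user-specified date format) or "default"
    infer_date: value of the inferDate option
    """
    if date_kind not in ("custom", "default"):
        raise ValueError(f"unknown date_kind: {date_kind!r}")

    if stage == "before":
        # Only the default format is picked up, via the fallback parser.
        return PARSED if date_kind == "default" else NULL

    if stage == "after_inferDate":
        # Default format still goes through the fallback; custom format
        # is parsed only when inferDate is enabled.
        if date_kind == "default":
            return PARSED
        return PARSED if infer_date else NULL

    if stage == "after_37147":
        # Neither kind of date is parsed in a timestamp column.
        return NULL

    raise ValueError(f"unknown stage: {stage!r}")
```

For example, `timestamp_column_outcome("after_inferDate", "custom", infer_date=False)` gives `"null"`, while the same call with `infer_date=True` gives `"parsed"`.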
   
   Since default-format dates were previously parsed by the fallback, we thought
it was desirable for dates to be parsed in a timestamp column. So we added
support for custom-format dates in a timestamp column when `inferDate`=`true`.
   
   We have two options for target behavior:
   
   OPTION A - Target behavior, in a timestamp column: 
   - custom format date: Parsed if inferDate is true, otherwise null
   - default format date: Parsed if inferDate is true, otherwise null
   --> move the fallback error in #37147 to allow date parsing if 
`inferDate`=`true`
   
   OPTION B - Target behavior, in a timestamp column: 
   - custom format date: null
   - default format date: null
   --> remove redundant dateFormatter.parse behavior for `inferDate` = `true` 
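
The difference between the two options can be sketched with the "1765-03-28" value from the quoted review comment. This is an illustration in plain Python using `datetime.strptime`, not Spark's parser: the function name, the `option` argument, and both pattern constants are assumptions made for the example, and a single date pattern stands in for both the custom and default date formats since the two options treat them the same way.

```python
from datetime import datetime
from typing import Optional

# Hypothetical patterns standing in for the timestamp and date formats.
TIMESTAMP_PATTERN = "%Y-%m-%d %H:%M:%S"
DATE_PATTERN = "%Y-%m-%d"


def parse_timestamp_cell(value: str, option: str, infer_date: bool) -> Optional[datetime]:
    """Parse one cell of a timestamp column under Option A or Option B.

    Returns None where the comment above says "null".
    """
    if option not in ("A", "B"):
        raise ValueError(f"unknown option: {option!r}")

    # A value matching the timestamp pattern is parsed under both options.
    try:
        return datetime.strptime(value, TIMESTAMP_PATTERN)
    except ValueError:
        pass

    # Option A: fall back to date parsing, but only when inferDate is true.
    if option == "A" and infer_date:
        try:
            return datetime.strptime(value, DATE_PATTERN)
        except ValueError:
            return None

    # Option B (and Option A with inferDate off): dates are never parsed here.
    return None
```

Under this model, `parse_timestamp_cell("1765-03-28", "A", True)` yields midnight on that date, while Option B returns `None` for the same input regardless of `infer_date`.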
   
   @cloud-fan  @HyukjinKwon should we go with Option A or Option B?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

