SaurabhChawla100 commented on a change in pull request #32558:
URL: https://github.com/apache/spark/pull/32558#discussion_r643760281
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala
##########
@@ -160,6 +168,16 @@ class CSVInferSchema(val options: CSVOptions) extends Serializable {
   private def tryParseDouble(field: String): DataType = {
     if ((allCatch opt field.toDouble).isDefined || isInfOrNan(field)) {
       DoubleType
+    } else {
+      tryParseDateFormat(field)
+    }
+  }
+
+  private def tryParseDateFormat(field: String): DataType = {
+    if (options.inferDateType
+      && !dateFormatter.isInstanceOf[LegacySimpleDateFormatter]
Review comment:
This should be LegacyFastDateFormatter; I missed changing it. I was
previously using the SimpleDateFormatter, which is why I added the
LegacySimpleDateFormatter check. Now that we are using the FastDateFormatter,
the check has to be against LegacyFastDateFormatter. I am making that change.

When legacy mode is on, DateType pattern matching is ambiguous, because the
patterns can be set arbitrarily by users. The legacy formatter does not do an
exact match, so it cannot distinguish yyyy-MM from yyyy-MM-dd for an input
such as 2010-10-10.
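
To illustrate the ambiguity, here is a minimal sketch that is not Spark code. It assumes the JDK's `java.text.SimpleDateFormat` as a stand-in for the legacy formatters and `java.time.format.DateTimeFormatter` as a stand-in for the non-legacy one. The object and method names (`LenientDateMatch`, `legacyMatches`, `strictMatches`) are hypothetical.

```scala
import java.text.SimpleDateFormat
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical helper, for illustration only; not part of Spark.
object LenientDateMatch {
  // Legacy-style parsing: DateFormat.parse(String) parses from the start of
  // the input and may stop before the end, so trailing text is ignored.
  def legacyMatches(pattern: String, field: String): Boolean =
    Try(new SimpleDateFormat(pattern).parse(field)).isSuccess

  // java.time parsing: DateTimeFormatter.parse(CharSequence) fails unless the
  // whole input is consumed by the pattern.
  def strictMatches(pattern: String, field: String): Boolean =
    Try(DateTimeFormatter.ofPattern(pattern).parse(field)).isSuccess

  def main(args: Array[String]): Unit = {
    val field = "2010-10-10"
    for (pattern <- Seq("yyyy-MM", "yyyy-MM-dd")) {
      println(s"$pattern legacy=${legacyMatches(pattern, field)} " +
        s"strict=${strictMatches(pattern, field)}")
    }
  }
}
```

With the legacy-style parser both patterns accept 2010-10-10, so a successful parse says nothing about which pattern the data actually follows. The strict parser accepts only yyyy-MM-dd.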