Re: [PR] [SPARK-57572][SQL] Infer TimeType during CSV and JSON schema inference [spark]

via GitHub Sat, 20 Jun 2026 15:41:28 -0700


uros-b commented on code in PR #56634:
URL: https://github.com/apache/spark/pull/56634#discussion_r3447606817



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVInferSchema.scala:
##########
@@ -194,12 +201,20 @@ class CSVInferSchema(val options: CSVOptions) extends Serializable {
     if ((allCatch opt field.toDouble).isDefined || isInfOrNan(field)) {
       DoubleType
     } else if (options.preferDate) {
-      tryParseDate(field)
+      tryParseTime(field)
     } else {
       tryParseTimestampNTZ(field)
     }
   }
 
+  private def tryParseTime(field: String): DataType = {
+    if (isTimeTypeEnabled && (allCatch opt timeFormatter.parse(field)).isDefined) {
+      TimeType(TimeType.MICROS_PRECISION)

Review Comment:
   Let's be careful here about possible silent precision loss for
   nanosecond-precision time values (@MaxGekk is currently working on this).
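
   To illustrate the concern, here is a minimal sketch that uses only
   java.time and none of Spark's internals. It assumes that a value with
   sub-microsecond digits would be cut to microseconds once a column is
   inferred as TimeType(TimeType.MICROS_PRECISION). The object
   TimePrecisionCheck and the helper losesPrecisionAtMicros are hypothetical
   names for illustration, not part of this PR or of Spark.

```scala
import java.time.LocalTime
import java.time.temporal.ChronoUnit

object TimePrecisionCheck {
  // Hypothetical helper (not a Spark API): returns true when the value
  // carries digits below microsecond resolution, i.e. digits that a
  // microsecond-precision type could not represent.
  def losesPrecisionAtMicros(t: LocalTime): Boolean =
    t.truncatedTo(ChronoUnit.MICROS) != t

  def main(args: Array[String]): Unit = {
    val nano = LocalTime.parse("12:34:56.123456789")  // 9 fractional digits
    val micro = LocalTime.parse("12:34:56.123456")    // 6 fractional digits

    println(losesPrecisionAtMicros(nano))   // true: ".789" nanos would be dropped
    println(losesPrecisionAtMicros(micro))  // false: fits in microseconds
  }
}
```

   A check along these lines could let inference fall back to another type
   instead of silently truncating, but whether that is the right behavior
   depends on how nanosecond precision ends up being supported.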



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


