andygrove commented on PR #43245:
URL: https://github.com/apache/spark/pull/43245#issuecomment-1750858019
Thanks @Hisoka-X. I tested this out, but the behavior is still different
from 3.4.0. I don't think we can rely on the length of the format alone to
verify matches, because some formats have optional components like
`[.SSS][XXX]`, so inputs of different lengths can match the same format.
Using the same test file from the issue:
```csv
2884-06-24T02:45:51.138
2884-06-24T02:45:51.138
2884-06-24T02:45:51.138
```
## 3.4.0
Infers column as `timestamp`.
```
scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss[.SSS][XXX]").option("inferSchema", true).csv("/tmp/timestamps.csv")
df: org.apache.spark.sql.DataFrame = [_c0: timestamp]
```
## This PR
Infers column as `string`.
```
scala> val df = spark.read.option("timestampFormat", "yyyy-MM-dd'T'HH:mm:ss[.SSS][XXX]").option("inferSchema", true).csv("/tmp/timestamps.csv")
val df: org.apache.spark.sql.DataFrame = [_c0: string]
```
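To illustrate the point about optional components, here is a minimal sketch using plain `java.time.format.DateTimeFormatter` rather than Spark's own inference code. The object name `OptionalSectionsDemo` and the helper `parses` are hypothetical, and Spark's internal formatter may be configured differently, so treat this only as a demonstration that one pattern can accept inputs of several lengths.

```scala
import java.time.format.DateTimeFormatter
import scala.util.Try

// Hypothetical demo object; not part of Spark or of this PR.
object OptionalSectionsDemo {
  // Same pattern as in the examples above; [.SSS] and [XXX] are optional sections.
  val pattern = "yyyy-MM-dd'T'HH:mm:ss[.SSS][XXX]"
  val formatter: DateTimeFormatter = DateTimeFormatter.ofPattern(pattern)

  // True when the whole input string is accepted by the formatter.
  def parses(s: String): Boolean = Try(formatter.parse(s)).isSuccess

  // Inputs of three different lengths that should all match the pattern.
  val samples: Seq[String] = Seq(
    "2884-06-24T02:45:51",           // no fraction, no offset
    "2884-06-24T02:45:51.138",       // fraction only (the value from the test file)
    "2884-06-24T02:45:51.138+01:00"  // fraction and offset
  )

  def main(args: Array[String]): Unit = {
    samples.foreach { s =>
      println(s"$s (length ${s.length}) parses: ${parses(s)}")
    }
    // A value that does not follow the pattern (space instead of 'T').
    println(s"non-matching input parses: ${parses("2884-06-24 02:45:51")}")
  }
}
```

Because the matching inputs differ in length from each other and from the pattern string itself, a check based purely on length would likely reject some of them or accept the wrong ones.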