wayne-kyungwonpark commented on code in PR #33709:
URL: https://github.com/apache/spark/pull/33709#discussion_r1919614543
##########
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala:
##########
@@ -1067,7 +1067,7 @@ abstract class ParquetPartitionDiscoverySuite
test("SPARK-23436: invalid Dates should be inferred as String in partition
inference") {
withTempPath { path =>
- val data = Seq(("1", "2018-01", "2018-01-01-04", "test"))
+ val data = Seq(("1", "2018-41", "2018-01-01-04", "test"))
Review Comment:
Additionally, I tested this code with sbt after replacing the value of the `date_month` column as shown below (`2018-41` -> `2018-01`).
```scala
test("SPARK-23436, SPARK-36861: invalid Dates should be inferred as String in partition inference") {
  withTempPath { path =>
    val data = Seq(("1", "2018-01", "2018-01-01-04", "2021-01-01T00", "test"))
      .toDF("id", "date_month", "date_hour", "date_t_hour", "data")
    data.write
      .partitionBy("date_month", "date_hour", "date_t_hour")
      .parquet(path.getAbsolutePath)
    val input = spark.read.parquet(path.getAbsolutePath)
      .select("id", "date_month", "date_hour", "date_t_hour", "data")
    assert(data.schema.sameType(input.schema))
    checkAnswer(input, data)
  }
}
```
```sbt
sbt:spark-sql> testOnly *ParquetV1PartitionDiscoverySuite -- -z "SPARK-23436"
sbt:spark-sql> testOnly *ParquetV2PartitionDiscoverySuite -- -z "SPARK-23436"
```
=> Both tests pass!
However, after changing the value to `2018-01-01`, which matches the `DateType` format `yyyy-MM-dd`, both sbt tests failed as expected.
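For reference, here is a minimal standalone sketch (my own illustration, not code from the PR or the test suite) of why the `2018-01-01` case fails: with default settings, partition discovery parses a partition value matching `yyyy-MM-dd` as `DateType`, so the inferred schema no longer matches the original `StringType` column. The object name `PartitionInferenceSketch`, the `local[1]` master, and the temp-directory handling are assumptions made only to keep the example runnable.
```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

object PartitionInferenceSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("partition-inference-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative temp location for the partitioned table.
    val path = Files.createTempDirectory("parquet-partition-").resolve("table").toString

    // "2018-01-01" is a fully valid date, unlike "2018-01" or "2018-41".
    val data = Seq(("1", "2018-01-01", "test")).toDF("id", "date_month", "data")
    data.write.partitionBy("date_month").parquet(path)

    // With default settings, partition discovery infers date_month as DateType,
    // not StringType, which is why the schema assertion in the test above fails.
    spark.read.parquet(path).printSchema()

    spark.stop()
  }
}
```
Setting `spark.sql.sources.partitionColumnTypeInference.enabled` to `false` makes every partition column come back as a string, which is another way to confirm where the inferred type is coming from.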