uros-b commented on code in PR #56459:
URL: https://github.com/apache/spark/pull/56459#discussion_r3467657009
##########
sql/core/src/test/scala/org/apache/spark/sql/streaming/test/DataStreamTableAPISuite.scala:
##########
@@ -84,6 +84,23 @@ class DataStreamTableAPISuite extends StreamTest with BeforeAndAfter {
checkErrorTableNotFound(e, "`non_exist_table`")
}
+ test("read: user-specified schema is ignored by the table API") {
+ val tblName = "my_table"
+ withTable(tblName) {
+ spark.range(3).write.format("parquet").saveAsTable(tblName)
+ // The user-specified `a: Int` is ignored (with a warning); the catalog table's
+ // `id: Long` is used.
+ val df = spark.readStream
+ .schema(new StructType().add("a", IntegerType))
+ .table(tblName)
+ assert(df.schema === new StructType().add("id", LongType, nullable = false))
+ testStream(df)(
+ ProcessAllAvailable(),
+ CheckAnswer(Row(0), Row(1), Row(2))
+ )
+ }
+ }
+
Review Comment:
Main testing gap: the tests never assert that the warning is emitted.
The warning is the core behavior of this PR, yet none of the three added tests
verifies that one is produced; they only assert that the user schema is ignored
and the catalog schema wins. Suggestions: for Classic, wrap the read in
withLogAppender(...) and assert that the message is present; for Connect
Python, use assertWarns/assertWarnsRegex (or assertWarnsRegex on the Connect
path only) to assert that the warning fires.
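
A minimal sketch of the Python-side assertion, using only the standard library. The function `read_table_ignoring_schema` and its warning text are hypothetical stand-ins for the reader path under test, not the actual PySpark API or message; the real test would call `spark.readStream.schema(...).table(...)` and match whatever message the PR emits.

```python
import unittest
import warnings


def read_table_ignoring_schema(table_schema, user_schema=None):
    """Hypothetical stand-in for the table read path under test.

    Emits a warning when a user schema is supplied, then returns the
    catalog schema unchanged.
    """
    if user_schema is not None:
        # Assumed message text, for illustration only.
        warnings.warn(
            "user-specified schema is ignored when reading a table",
            UserWarning,
        )
    return table_schema


class SchemaIgnoredWarningTest(unittest.TestCase):
    def test_warning_is_emitted(self):
        # Fails if no UserWarning matching the regex is raised in the block.
        with self.assertWarnsRegex(UserWarning, r"schema is ignored"):
            schema = read_table_ignoring_schema(
                ["id: long"], user_schema=["a: int"]
            )
        # The existing assertion still applies: the catalog schema wins.
        self.assertEqual(schema, ["id: long"])

    def test_no_warning_without_user_schema(self):
        # Escalate warnings to errors so an unexpected one fails the test.
        with warnings.catch_warnings():
            warnings.simplefilter("error")
            self.assertEqual(
                read_table_ignoring_schema(["id: long"]), ["id: long"]
            )
```

Pairing the positive case with a negative one guards against a warning that fires unconditionally, which the first test alone would not catch.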
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]