HeartSaVioR commented on code in PR #46569:
URL: https://github.com/apache/spark/pull/46569#discussion_r1599352970
##########
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryOptimizationCorrectnessSuite.scala:
##########
@@ -416,4 +416,39 @@ class StreamingQueryOptimizationCorrectnessSuite extends StreamTest {
)
}
}
+
+ test("SPARK-48267: regression test, stream-stream union followed by stream-batch join") {
+ withTempDir { dir =>
+ val input1 = MemoryStream[Int]
+ val input2 = MemoryStream[Int]
+
+ val df1 = input1.toDF().withColumn("code", lit(1))
Review Comment:
Given the sequence of optimizations in the reproducer, the bug wouldn't happen
if the union weren't eliminated, because the value of 'code' wouldn't be known
during the optimization phase. That said, if we call observe(), withWatermark(),
or a similar operator on the streaming DF to prevent the optimizer from
collapsing everything onto one side, the bug wouldn't be triggered.
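
To make the last point concrete, here is a rough sketch of what adding such a
call to one streaming side could look like. It is an illustration only, not the
code in this PR: the `rate` source stands in for the test's `MemoryStream[Int]`,
the names `UnionGuardSketch`, `withObserve`, `withWatermarkGuard`,
`input1_metrics` and `rows` are made up, and placing the call right after
`withColumn` is an assumption, since where exactly it has to sit is not spelled
out above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{count, lit}

object UnionGuardSketch {
  // Assumption: the built-in "rate" source (columns: timestamp, value) stands in
  // for the MemoryStream[Int] inputs used by the regression test.
  private def rateStream(spark: SparkSession): DataFrame =
    spark.readStream.format("rate").load()

  // Variant 1: observe() adds a metrics-collection operator on top of the
  // projection that introduces the constant 'code' column.
  def withObserve(spark: SparkSession): DataFrame =
    rateStream(spark)
      .withColumn("code", lit(1))
      .observe("input1_metrics", count(lit(1)).as("rows"))

  // Variant 2: withWatermark() adds an event-time watermark operator instead.
  def withWatermarkGuard(spark: SparkSession): DataFrame =
    rateStream(spark)
      .withColumn("code", lit(1))
      .withWatermark("timestamp", "0 seconds")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("union-guard-sketch")
      .getOrCreate()

    // Only builds the logical plans and prints their schemas; no query is started.
    withObserve(spark).printSchema()
    withWatermarkGuard(spark).printSchema()

    spark.stop()
  }
}
```

Either variant would replace `df1` (or its counterpart on the other input) in
the reproducer. Whether a given placement really keeps the optimizer from
collapsing the union would have to be confirmed against the optimized plan.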