Liwen Sun created SPARK-32776:
---------------------------------
Summary: Limit in streaming should not be optimized away by PropagateEmptyRelation
Key: SPARK-32776
URL: https://issues.apache.org/jira/browse/SPARK-32776
Project: Spark
Issue Type: Bug
Components: Structured Streaming
Affects Versions: 3.1.0
Reporter: Liwen Sun
Currently, the Limit operator in a streaming query may be optimized away when
its input relation is empty. This is a problem for stateful streaming: the
empty batch does not write any state store files, so the next batch fails with
a file-not-found error when it tries to read them.
PropagateEmptyRelation should not optimize away the Limit operator for
streaming queries.
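To make the proposed guard concrete, here is a minimal toy model of the idea. It is not Spark's Catalyst code: the Plan class, its fields, and the propagate_empty function are hypothetical names invented for illustration. The sketch assumes only that a plan node knows whether it is streaming and that the rule normally replaces any operator over an empty child with an empty relation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    """Toy logical plan node (hypothetical; not Spark's LogicalPlan)."""
    name: str                       # e.g. "Limit", "Project", "EmptyRelation"
    child: Optional["Plan"] = None
    is_streaming: bool = False


def empty_relation(is_streaming: bool) -> Plan:
    """Build an empty leaf that keeps the streaming flag of the node it replaces."""
    return Plan("EmptyRelation", None, is_streaming)


def propagate_empty(plan: Plan) -> Plan:
    """Collapse operators over an empty child, except a streaming Limit."""
    if plan.child is None:
        return plan  # leaf node: nothing to rewrite
    child = propagate_empty(plan.child)
    node = Plan(plan.name, child, plan.is_streaming)
    if child.name != "EmptyRelation":
        return node  # child still produces rows (or was preserved), keep the operator
    if plan.name == "Limit" and plan.is_streaming:
        # The guard: a streaming Limit is stateful, so it stays in the plan
        # even though its input is empty for this batch.
        return node
    return empty_relation(plan.is_streaming)


batch_plan = Plan("Limit", Plan("EmptyRelation"))
stream_plan = Plan("Limit", Plan("EmptyRelation", None, True), is_streaming=True)
```

In this model, propagate_empty(batch_plan) collapses to an empty relation, while propagate_empty(stream_plan) keeps the Limit node. Operators above a preserved streaming Limit also survive, because their child is no longer an empty relation.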
This ticket is intended to apply a small and safe fix to
PropagateEmptyRelation. A more fundamental fix that prevents the same problem
from recurring in other optimizer rules would be preferable, but that is a
much larger task.