Liwen Sun created SPARK-32776:
---------------------------------

             Summary: Limit in streaming should not be optimized away by 
PropagateEmptyRelation
                 Key: SPARK-32776
                 URL: https://issues.apache.org/jira/browse/SPARK-32776
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 3.1.0
            Reporter: Liwen Sun


Currently, PropagateEmptyRelation may optimize away the Limit operator in a 
streaming query when its input relation is empty. This breaks stateful 
streaming: the empty batch writes no state store files, so the next batch fails 
with a file not found error when it tries to read them.
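
The failure sequence can be illustrated with a toy model. This is only a 
sketch: the function names, the one-file-per-batch layout, and the ".state" 
file name are hypothetical and do not reflect Spark's actual state store 
implementation.

```python
import os
import tempfile


def run_batch(state_dir, batch_id, stateful_op_ran):
    """Toy micro-batch: load the previous batch's state, then write our own.

    The file layout (one "<batch_id>.state" file per batch) is an assumption
    made for illustration only.
    """
    if batch_id > 0:
        prev = os.path.join(state_dir, f"{batch_id - 1}.state")
        # Raises FileNotFoundError if the previous batch wrote no state.
        with open(prev) as f:
            f.read()
    if stateful_op_ran:
        with open(os.path.join(state_dir, f"{batch_id}.state"), "w") as f:
            f.write("state")


def demo():
    with tempfile.TemporaryDirectory() as state_dir:
        run_batch(state_dir, 0, stateful_op_ran=True)
        # Batch 1 is empty and its stateful operator was optimized away,
        # so no state file is written for it.
        run_batch(state_dir, 1, stateful_op_ran=False)
        try:
            run_batch(state_dir, 2, stateful_op_ran=True)
        except FileNotFoundError:
            return "file not found"
        return "ok"
```

In this model, batch 2 fails because batch 1 never wrote the file that batch 2 
expects to load.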

We should not let PropagateEmptyRelation optimize away the Limit operator for 
streaming queries.
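
The shape of such a guard might look like the following toy optimizer rule. 
It is a minimal sketch, not Spark code: Relation, Limit, and propagate_empty 
are hypothetical stand-ins for Catalyst plan nodes and the real rule, and the 
is_streaming flag is modeled as a plain field.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """Toy leaf node; not a Spark class."""
    rows: tuple
    is_streaming: bool = False


@dataclass(frozen=True)
class Limit:
    """Toy Limit node; it is streaming if its child is."""
    n: int
    child: object

    @property
    def is_streaming(self):
        return self.child.is_streaming


def is_empty(plan):
    return isinstance(plan, Relation) and len(plan.rows) == 0


def propagate_empty(plan):
    """Collapse a Limit over an empty relation, except in streaming plans."""
    if isinstance(plan, Limit):
        child = propagate_empty(plan.child)
        # The guard: keep Limit in streaming plans so it still runs
        # (and writes its state) even when the batch is empty.
        if is_empty(child) and not plan.is_streaming:
            return Relation(rows=())
        return Limit(plan.n, child)
    return plan


batch_plan = Limit(5, Relation(rows=()))
stream_plan = Limit(5, Relation(rows=(), is_streaming=True))
```

With this guard, propagate_empty(batch_plan) collapses to an empty Relation, 
while propagate_empty(stream_plan) keeps the Limit node in place.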

This ticket applies a small, safe fix to PropagateEmptyRelation only. A more 
fundamental fix that prevents the same problem from recurring in other 
optimizer rules would be preferable, but that is a much larger task.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
