Jungtaek Lim created SPARK-38227:
------------------------------------

             Summary: Apply strict nullability of nested column in time window / session window
                 Key: SPARK-38227
                 URL: https://issues.apache.org/jira/browse/SPARK-38227
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 3.2.1, 3.3.0
            Reporter: Jungtaek Lim


In TimeWindow and SessionWindow, we define the dataType of these function 
expressions as a StructType with two nested columns, "start" and "end", both of 
which are nullable.

We replace these expressions in the analyzer via the corresponding rules: 
TimeWindowing for TimeWindow, and SessionWindowing for SessionWindow.

The rules replace the function expressions with an Alias that refers to a 
CreateNamedStruct. For the value side of CreateNamedStruct, we don't specify 
anything about nullability, so there is a risk that the value side is 
interpreted (or optimized) as non-nullable, which would differ from what Spark 
expects.
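
The mismatch described above can be illustrated with a small toy model. This is 
only a sketch in plain Python, not Spark code: Field, Expr, 
declared_window_type and derived_struct_type are hypothetical names, and the 
assumption that the struct takes each field's nullability from its value 
expression is a simplification made for the illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Field:
    """Toy stand-in for a struct field: a name plus a nullable flag."""
    name: str
    nullable: bool = True  # assumed default, matching the nullable columns above


@dataclass(frozen=True)
class Expr:
    """Toy stand-in for a value expression that reports its own nullability."""
    label: str
    nullable: bool


def declared_window_type() -> List[Field]:
    # What the window function expression declares: start and end, both nullable.
    return [Field("start"), Field("end")]


def derived_struct_type(pairs: List[Tuple[str, Expr]]) -> List[Field]:
    # Assumption: with nothing else specified, each field's nullability is
    # taken from its value expression.
    return [Field(name, expr.nullable) for name, expr in pairs]


# Hypothetical case: the value expressions end up being judged non-nullable.
values = [
    ("start", Expr("window_start", nullable=False)),
    ("end", Expr("window_end", nullable=False)),
]

declared = declared_window_type()
derived = derived_struct_type(values)

# Names of the fields whose nullability no longer matches the declaration.
mismatch = [d.name for d, v in zip(declared, derived) if d.nullable != v.nullable]
print(mismatch)
```

In this toy case both fields diverge from the declared type, which is the 
situation the issue wants to rule out.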

We should make sure the nullability of the columns in CreateNamedStruct remains 
the same as in the dataType definition of these function expressions.
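
One possible way to keep the two in sync is to wrap each value so that it 
always reports itself as nullable. The sketch below is again a plain-Python toy 
model under that assumption; ForceNullable, DECLARED and struct_nullability are 
hypothetical names and do not describe how Spark implements this.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Expr:
    """Toy stand-in for a value expression that reports its own nullability."""
    label: str
    nullable: bool


@dataclass(frozen=True)
class ForceNullable:
    """Hypothetical wrapper: reports nullable=True regardless of its child."""
    child: Expr

    @property
    def nullable(self) -> bool:
        return True


# Field name -> nullable flag, as declared by the function expression's dataType.
DECLARED: Dict[str, bool] = {"start": True, "end": True}


def struct_nullability(values: dict) -> Dict[str, bool]:
    # Nullability of each struct field, taken from its value expression.
    return {name: expr.nullable for name, expr in values.items()}


# Hypothetical non-nullable values, then the same values wrapped.
raw = {"start": Expr("window_start", False), "end": Expr("window_end", False)}
wrapped = {name: ForceNullable(expr) for name, expr in raw.items()}

assert struct_nullability(raw) != DECLARED      # unwrapped values diverge
assert struct_nullability(wrapped) == DECLARED  # wrapped values match the declaration
```

The wrapper changes only the reported nullability and leaves the value itself 
untouched, so the struct's type stays aligned with the declaration no matter 
how the child is later interpreted.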


