Jungtaek Lim created SPARK-38227:
------------------------------------

             Summary: Apply strict nullability of nested column in time window / session window
                 Key: SPARK-38227
                 URL: https://issues.apache.org/jira/browse/SPARK-38227
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 3.2.1, 3.3.0
            Reporter: Jungtaek Lim

In TimeWindow and SessionWindow, we define the dataType of these function expressions as a StructType with two nested columns, "start" and "end", both of which are nullable. The analyzer replaces these expressions via the corresponding rules: TimeWindowing for TimeWindow, and SessionWindowing for SessionWindow.

The rules replace the function expressions with an Alias that refers to a CreateNamedStruct. For the value side of CreateNamedStruct, we don't specify anything about nullability, so there is a risk that the values are interpreted (or optimized) as non-nullable, which differs from what Spark expects based on the declared dataType.

We should make sure the nullability of the columns in CreateNamedStruct stays the same as the dataType definition of these function expressions.
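
To make the mismatch concrete, here is a minimal toy model in plain Python. It is not Spark code: Field, Expr, named_struct_fields and known_nullable are hypothetical names invented for this sketch. The only assumption it makes is the one described above, namely that the struct built by the rule takes each field's nullability from its value expression, while the function's declared dataType marks both fields as nullable.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Field:
    """A struct field, reduced to its name and nullability."""
    name: str
    nullable: bool


@dataclass(frozen=True)
class Expr:
    """A toy expression that only tracks whether it may produce null."""
    name: str
    nullable: bool


def known_nullable(expr: Expr) -> Expr:
    """Hypothetical wrapper: same value, but always reported as nullable."""
    return Expr(name=expr.name, nullable=True)


def named_struct_fields(pairs: List[Tuple[str, Expr]]) -> List[Field]:
    """Derive each field's nullability from its value expression."""
    return [Field(name, expr.nullable) for name, expr in pairs]


# Declared data type of the window function: both fields are nullable.
DECLARED = [Field("start", True), Field("end", True)]

# Assume the computed window bounds were inferred as non-nullable.
start = Expr("window_start", nullable=False)
end = Expr("window_end", nullable=False)

# Without any nullability hint, the struct disagrees with the declaration.
loose = named_struct_fields([("start", start), ("end", end)])

# Forcing the value side to be nullable keeps the two definitions aligned.
strict = named_struct_fields(
    [("start", known_nullable(start)), ("end", known_nullable(end))]
)

print(loose == DECLARED)   # False
print(strict == DECLARED)  # True
```

The wrapper in this sketch is only one possible way to pin nullability on the value side; how the rules should actually do this is left to the fix for this issue.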