Dmytro Fedoriaka created SPARK-55492:
----------------------------------------
Summary: SparkException from org.apache.spark.sql.catalyst.trees.TreeNode
Key: SPARK-55492
URL: https://issues.apache.org/jira/browse/SPARK-55492
Project: Spark
Issue Type: Bug
Components: Structured Streaming
Affects Versions: 4.2.0
Reporter: Dmytro Fedoriaka
Fix For: 4.2.0
The query:
{code:java}
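// Already in scope in spark-shell; needed elsewhere (assumes a SparkSession named `spark`).
import org.apache.spark.sql.functions.{struct, to_timestamp}
import spark.implicits._

// Build a DataFrame whose timestamp lives inside a struct column named "kolona".
val df = Seq((1, ("2024-01-01 10:00:00", "val1")))
  .toDF("id", "data")
  .select(
    $"id",
    struct(
      to_timestamp($"data._1").as("timestamp"),
      $"data._2".as("value")
    ).as("nested_struct")
  )
  .select($"id", $"nested_struct".as("kolona"))

// Using the nested field as the event-time column triggers the error below.
df.withWatermark("kolona.timestamp", "0 seconds")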
{code}
Error:
{code:java}
Failed to copy node.
Is otherCopyArgs specified correctly for EventTimeWatermark.
Exception message: argument type mismatch
...
at org.apache.spark.sql.catalyst.trees.TreeNode.makeCopy(TreeNode.scala:941)
...{code}
The problem is that Spark tries to resolve a reference to a nested field, but
a nested field is not expected as the eventTime column of withWatermark. The
proposed solution is to add validation that rejects nested fields in
withWatermark.
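
As a rough illustration of the kind of check such a validation could perform,
here is a minimal sketch in plain Scala. It is not Spark's implementation and
not the actual fix: the object WatermarkColumnCheck and its validate method are
hypothetical names, and the sketch assumes exact, case-sensitive name matching
with no handling of backtick-quoted names.

```scala
// Hypothetical helper, not part of Spark: a sketch of a top-level-column check.
object WatermarkColumnCheck {

  /**
   * Returns None when `eventTime` names a top-level column, otherwise an
   * error message describing why it was rejected.
   *
   * Assumption: names are compared exactly (case-sensitive, no backticks).
   */
  def validate(eventTime: String, topLevelColumns: Seq[String]): Option[String] = {
    if (topLevelColumns.contains(eventTime)) {
      // A top-level column, including one whose name literally contains a dot.
      None
    } else if (eventTime.contains(".")) {
      // Not a top-level name and contains a dot: treat it as a nested field path.
      Some(s"Nested field '$eventTime' cannot be used as the eventTime column")
    } else {
      Some(s"Column '$eventTime' not found among: ${topLevelColumns.mkString(", ")}")
    }
  }
}
```

With the columns from the query above, validate("kolona.timestamp", Seq("id", "kolona"))
returns an error message, while validate("id", Seq("id", "kolona")) returns None.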