Mayur Madnani created SPARK-48390:
-------------------------------------
Summary: SparkListenerBus not sending tableName details in logical
plan for spark versions 3.4.2 and above
Key: SPARK-48390
URL: https://issues.apache.org/jira/browse/SPARK-48390
Project: Spark
Issue Type: Bug
Components: Spark Core, SQL
Affects Versions: 3.4.3, 3.5.1, 3.5.0, 3.4.2, 3.5.2
Reporter: Mayur Madnani
In OpenLineage, a logical plan event is received via SparkEventListener, and by
parsing it the framework deduces input/output table names to create lineage.
The issue is that in Spark versions 3.4.2 and above (tested and reproducible in
3.4.2 and 3.5.0), the logical plan event sent by Spark core is partial and is
missing the tableName property, which was sent in earlier versions (working
in Spark 3.3.4).
+_Note: This issue is only encountered in drop table events._+
For a drop table event, the logical plans produced by different Spark versions
are shown below:
*Spark 3.3.4*
{code:java}
[
  {
    "class": "org.apache.spark.sql.execution.command.DropTableCommand",
    "num-children": 0,
    "tableName": {
      "product-class": "org.apache.spark.sql.catalyst.TableIdentifier",
      "table": "drop_table_test",
      "database": "default"
    },
    "ifExists": false,
    "isView": false,
    "purge": false
  }
]
{code}
*Spark 3.4.2*
{code:java}
[
  {
    "class": "org.apache.spark.sql.catalyst.plans.logical.DropTable",
    "num-children": 1,
    "child": 0,
    "ifExists": false,
    "purge": false
  },
  {
    "class": "org.apache.spark.sql.catalyst.analysis.ResolvedIdentifier",
    "num-children": 0,
    "catalog": null,
    "identifier": null
  }
]
{code}
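To make the difference concrete, here is a minimal sketch of how a consumer might pull the table name out of the two serialized plans above. It is an illustration only, not OpenLineage's actual parsing code: the function name extract_table_name and the constants PLAN_3_3_4 and PLAN_3_4_2 are hypothetical, and the JSON literals are copied from the two plans shown in this report.

```python
import json

# Serialized drop-table plans, copied from the two examples in this report.
PLAN_3_3_4 = """
[
  {
    "class": "org.apache.spark.sql.execution.command.DropTableCommand",
    "num-children": 0,
    "tableName": {
      "product-class": "org.apache.spark.sql.catalyst.TableIdentifier",
      "table": "drop_table_test",
      "database": "default"
    },
    "ifExists": false,
    "isView": false,
    "purge": false
  }
]
"""

PLAN_3_4_2 = """
[
  {
    "class": "org.apache.spark.sql.catalyst.plans.logical.DropTable",
    "num-children": 1,
    "child": 0,
    "ifExists": false,
    "purge": false
  },
  {
    "class": "org.apache.spark.sql.catalyst.analysis.ResolvedIdentifier",
    "num-children": 0,
    "catalog": null,
    "identifier": null
  }
]
"""


def extract_table_name(plan_json):
    """Hypothetical helper: return 'database.table' if any plan node
    carries a populated tableName property, otherwise None."""
    for node in json.loads(plan_json):
        table_name = node.get("tableName")
        if isinstance(table_name, dict) and table_name.get("table"):
            database = table_name.get("database")
            table = table_name["table"]
            return f"{database}.{table}" if database else table
    return None


print(extract_table_name(PLAN_3_3_4))  # default.drop_table_test
print(extract_table_name(PLAN_3_4_2))  # None
```

With the 3.3.4 plan the name is recoverable from tableName. With the 3.4.2 plan no node carries it, and the ResolvedIdentifier node's catalog and identifier fields are both null, so nothing in the payload identifies the dropped table.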
More details are in the referenced issue:
[https://github.com/OpenLineage/OpenLineage/issues/2716]