Mayur Madnani created SPARK-48390:
-------------------------------------

             Summary: SparkListenerBus not sending tableName details in logical 
plan for spark versions 3.4.2 and above
                 Key: SPARK-48390
                 URL: https://issues.apache.org/jira/browse/SPARK-48390
             Project: Spark
          Issue Type: Bug
          Components: Spark Core, SQL
    Affects Versions: 3.4.3, 3.5.1, 3.5.0, 3.4.2, 3.5.2
            Reporter: Mayur Madnani


In OpenLineage, a logical plan event is received via SparkEventListener, and by 
parsing it the framework deduces input/output table names to create lineage.
The issue is that in Spark versions 3.4.2 and above (tested and reproducible in 
3.4.2 and 3.5.0), the logical plan event sent by Spark core is partial and is 
missing the tableName property, which was sent in earlier versions (working 
in Spark 3.3.4).


+_Note: This issue is only encountered in drop table events._+

For a drop table event, the logical plans in the different Spark versions are:

*Spark 3.3.4*
{code:java}
[
  {
    "class": "org.apache.spark.sql.execution.command.DropTableCommand",
    "num-children": 0,
    "tableName": {
      "product-class": "org.apache.spark.sql.catalyst.TableIdentifier",
      "table": "drop_table_test",
      "database": "default"
    },
    "ifExists": false,
    "isView": false,
    "purge": false
  }
]
{code}
*Spark 3.4.2*
{code:java}
[
  {
    "class": "org.apache.spark.sql.catalyst.plans.logical.DropTable",
    "num-children": 1,
    "child": 0,
    "ifExists": false,
    "purge": false
  },
  {
    "class": "org.apache.spark.sql.catalyst.analysis.ResolvedIdentifier",
    "num-children": 0,
    "catalog": null,
    "identifier": null
  }
]
{code}
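The difference between the two plans can be illustrated with a small Python sketch that reads the serialized JSON shown above. This is not how OpenLineage is implemented; the function name `extract_table_name` and the lookup rules are assumptions made only for this illustration. It simply shows that a consumer of the serialized plan can recover the table from the 3.3.4 output but finds nothing usable in the 3.4.2 output, because `catalog` and `identifier` are serialized as null.

```python
import json
from typing import Optional

# Serialized plans copied from the two examples above.
PLAN_3_3_4 = """
[
  {
    "class": "org.apache.spark.sql.execution.command.DropTableCommand",
    "num-children": 0,
    "tableName": {
      "product-class": "org.apache.spark.sql.catalyst.TableIdentifier",
      "table": "drop_table_test",
      "database": "default"
    },
    "ifExists": false,
    "isView": false,
    "purge": false
  }
]
"""

PLAN_3_4_2 = """
[
  {
    "class": "org.apache.spark.sql.catalyst.plans.logical.DropTable",
    "num-children": 1,
    "child": 0,
    "ifExists": false,
    "purge": false
  },
  {
    "class": "org.apache.spark.sql.catalyst.analysis.ResolvedIdentifier",
    "num-children": 0,
    "catalog": null,
    "identifier": null
  }
]
"""


def extract_table_name(plan_json: str) -> Optional[str]:
    """Hypothetical helper: return the dropped table's name, or None if absent."""
    for node in json.loads(plan_json):
        # Shape seen in 3.3.4: the command node carries a tableName object.
        table_id = node.get("tableName")
        if isinstance(table_id, dict) and table_id.get("table"):
            database = table_id.get("database")
            table = table_id["table"]
            return f"{database}.{table}" if database else table
        # Shape seen in 3.4.2: a ResolvedIdentifier child node, whose
        # identifier field is null in the serialized plan.
        if node.get("class", "").endswith(".ResolvedIdentifier"):
            identifier = node.get("identifier")
            if identifier is not None:
                return str(identifier)
    return None


if __name__ == "__main__":
    print("3.3.4:", extract_table_name(PLAN_3_3_4))
    print("3.4.2:", extract_table_name(PLAN_3_4_2))
```

With the 3.3.4 plan the helper returns the qualified table name, while with the 3.4.2 plan it returns None, which matches the missing lineage described in this report.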
More details are in the referenced issue: 
[https://github.com/OpenLineage/OpenLineage/issues/2716]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

