Eric Marnadi created SPARK-55402:
------------------------------------

             Summary: Move streamingSourceIdentifyingName from CatalogTable to DataSource
                 Key: SPARK-55402
                 URL: https://issues.apache.org/jira/browse/SPARK-55402
             Project: Spark
          Issue Type: Task
          Components: Structured Streaming
    Affects Versions: 4.2.0
            Reporter: Eric Marnadi


streamingSourceIdentifyingName represents query-specific metadata (which source 
name was assigned in a particular streaming query plan), not an intrinsic 
property of the table itself. Storing it in CatalogTable breaks table equality 
semantics:

- Two references to the same table in a single query can have different 
streamingSourceIdentifyingName values.

- This causes them to compare as unequal via CatalogTable.equals().

- This can impact multi-statement transactions and any caching/deduplication 
logic that relies on CatalogTable equality.

By moving this field to DataSource (which is already query-specific), we 
restore proper catalog table equality while maintaining the ability to track 
streaming source identifying names for stable checkpoints.
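
The equality problem described above can be illustrated with a minimal, 
self-contained sketch. It does not use Spark's actual classes: 
CatalogTableBefore, CatalogTableAfter, and DataSourceAfter are hypothetical 
simplified stand-ins, and the table name "db.events" and the source names 
"source_0" / "source_1" are made up for illustration. Frozen dataclasses are 
used only because they give field-by-field equality, which is assumed here to 
mirror how CatalogTable.equals() behaves.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CatalogTableBefore:
    """Hypothetical model: the query-specific name lives on the table."""
    identifier: str
    streaming_source_identifying_name: Optional[str] = None


@dataclass(frozen=True)
class CatalogTableAfter:
    """Hypothetical model: the table carries only intrinsic properties."""
    identifier: str


@dataclass(frozen=True)
class DataSourceAfter:
    """Hypothetical model: the query-specific name lives on the data source."""
    catalog_table: CatalogTableAfter
    streaming_source_identifying_name: Optional[str] = None


# Before: two references to the same table in one query get different names,
# so the table objects themselves stop comparing as equal.
base = CatalogTableBefore("db.events")
left_before = replace(base, streaming_source_identifying_name="source_0")
right_before = replace(base, streaming_source_identifying_name="source_1")
before_equal = left_before == right_before
# Deduplication keyed on the table now sees two distinct entries.
before_dedup_size = len({left_before, right_before})

# After: the name is tracked per data source, and the table is shared.
table = CatalogTableAfter("db.events")
left_after = DataSourceAfter(table, "source_0")
right_after = DataSourceAfter(table, "source_1")
after_tables_equal = left_after.catalog_table == right_after.catalog_table
after_dedup_size = len({left_after.catalog_table, right_after.catalog_table})
# The two sources remain distinguishable by their identifying names.
after_sources_equal = left_after == right_after
```

In this sketch the table-level comparison only succeeds in the "after" 
layout, while the per-source names are still available on the data source 
objects.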



