[
https://issues.apache.org/jira/browse/SPARK-56171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yang Jie updated SPARK-56171:
-----------------------------
Description: Enable `FileWrite` to support partition columns, dynamic
partition overwrite, and truncate (full overwrite) behind the feature flag
`spark.sql.sources.v2.file.write.enabled` (default false). Add a
`partitionSchema` field to `FileWrite`, partition column separation in
`WriteJobDescription`, `RequiresDistributionAndOrdering` for partition sorting,
path creation for new paths, truncate logic for overwrite mode, and dynamic
partition overwrite via `FileCommitProtocol`. Change `lazy val description` to
a plain `val` so `prepareWrite` runs before `setupJob`. Add
`checkNoCollationsInMapKeys` validation, and skip the `supportsDataType` check
for partition columns in `FileWrite.validateInputs`. Use a consistent `jobId`
for both `FileCommitProtocol` and `WriteJobDescription.uuid`. In
`DataFrameWriter`, add dynamic partition overwrite routing and a V1 fallback
for ErrorIfExists/Ignore modes. Update all format Write/Table classes
(Parquet, ORC, CSV, JSON, Text, Avro). (was: Enable
`FileWrite` to support partition columns, dynamic partition overwrite, and
truncate (full overwrite) behind feature flag
`spark.sql.sources.v2.file.write.enabled` (default false). Add
`partitionSchema` field to `FileWrite`, partition column separation in
`WriteJobDescription`, `requiredOrdering` for partition sorting, path creation
for new paths, truncate logic for overwrite mode, and dynamic partition
overwrite via `FileCommitProtocol`. Update all format Write/Table classes
(Parquet, ORC, CSV, JSON, Text, Avro). )
> V2 file write with partition and dynamic overwrite support
> ----------------------------------------------------------
>
> Key: SPARK-56171
> URL: https://issues.apache.org/jira/browse/SPARK-56171
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.2.0
> Reporter: Yang Jie
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]