adrian-wang opened a new pull request, #44725: URL: https://github.com/apache/spark/pull/44725
### What changes were proposed in this pull request?

Sometimes we use more than one filesystem for a data warehouse, for example one for hot/warm data and another for cold data, with different storage tiers to reduce total cost. However, after Spark converts a table write into a data source write, this no longer works as expected.

Before this patch, when overwriting a partition that has a custom location:
1. if the partition location is on the same filesystem as its table, the partition location remains the same;
2. otherwise, Spark throws `java.lang.IllegalArgumentException: Wrong FS: ...`.

After this patch, the behavior aligns with Hive: the overwritten partition is recreated under the table location.

### Why are the changes needed?

1. To align the behavior with Hive.
2. To support existing partitions that live on a separate filesystem from the table location.

### Does this PR introduce _any_ user-facing change?

Yes. Before this patch, when overwriting a partition that has a custom location:
1. if the partition location is on the same filesystem as its table, the partition location remains the same;
2. otherwise, Spark throws `java.lang.IllegalArgumentException: Wrong FS: ...`.

After this patch, the behavior aligns with Hive: the overwritten partition is recreated under the table location.

### How was this patch tested?

Added a unit test case.

### Was this patch authored or co-authored using generative AI tooling?

No.
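To make the scenario concrete, here is a minimal sketch of the sequence the PR describes. The table name `t`, the partition values, and the cold-storage URI `hdfs://cold-cluster/...` are all hypothetical placeholders, not taken from the PR; the sketch assumes a Hive-enabled SparkSession where Hive-format Parquet writes are converted into data source writes (the conversion path the PR description refers to):

```scala
import org.apache.spark.sql.SparkSession

object PartitionCustomLocationRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partition-custom-location-repro")
      .enableHiveSupport()
      .getOrCreate()

    // Hypothetical Hive-format Parquet table; writes to it get converted
    // into data source writes, which is the code path this PR fixes.
    spark.sql(
      "CREATE TABLE t (value STRING) PARTITIONED BY (dt STRING) STORED AS PARQUET")
    spark.sql("INSERT INTO t PARTITION (dt = '2024-01-01') VALUES ('a')")

    // Move the partition to a custom location on a *different* filesystem
    // (cold storage). The URI below is a placeholder.
    spark.sql(
      "ALTER TABLE t PARTITION (dt = '2024-01-01') " +
        "SET LOCATION 'hdfs://cold-cluster/warehouse/t/dt=2024-01-01'")

    // Per the PR description: before the patch this fails with a `Wrong FS`
    // error because partition and table are on different filesystems; after
    // the patch it succeeds and the partition is recreated under the table
    // location, matching Hive's behavior.
    spark.sql(
      "INSERT OVERWRITE TABLE t PARTITION (dt = '2024-01-01') VALUES ('b')")

    spark.stop()
  }
}
```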
