[I] Support writing partitioned files in `COPY` command [arrow-datafusion]

via GitHub Mon, 11 Dec 2023 06:26:48 -0800


alamb opened a new issue, #8493:
URL: https://github.com/apache/arrow-datafusion/issues/8493


   ### Is your feature request related to a problem or challenge?
   
   A user asked on ASF Slack: 
https://the-asf.slack.com/archives/C04RJ0C85UZ/p1702248979379239
   
   >  Does the COPY command support *creating* parquet files that are 
partitioned using hive style partitioning?
   
   The use case is creating Hive-style partitioned datasets (e.g., as described 
[here](https://docs.rs/datafusion/latest/datafusion/datasource/listing/struct.ListingTable.html#features)).
   
   DataFusion does not support this today, but as a workaround you can write 
through an external table, like this: 
https://github.com/apache/arrow-datafusion/blob/93b21bdcd3d465ed78b610b54edf1418a47fc497/datafusion/sqllogictest/test_files/insert.slt#L45-L57
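   
   As a rough sketch of that workaround (not copied from the linked test file; 
   the table names, columns, and path are hypothetical, and the required options 
   and column ordering may differ between DataFusion versions), one could declare 
   a partitioned external table and write to it with `INSERT INTO`:
   
   ```sql
   -- Hypothetical target table; `year` is the partition column.
   -- Assumes the target directory already exists.
   CREATE EXTERNAL TABLE sales_by_year (amount DOUBLE, year INT)
   STORED AS PARQUET
   PARTITIONED BY (year)
   LOCATION 'folder/location/';
   
   -- Assumes a source table `sales` with matching columns.
   INSERT INTO sales_by_year
   SELECT amount, year FROM sales;
   ```
   
   With Hive-style partitioning, the written files would be expected under 
   directories such as `folder/location/year=2023/`.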
   
   
   
   ### Describe the solution you'd like
   
   @devinjdangelo notes that:
   
   The COPY statement does not currently have a built-in PARTITION BY clause in 
its syntax, but we could support syntax like:
   
   ```sql
   COPY table to 'folder/location' (format parquet, partition_by year)
   ```
   
   which is the same syntax that [DuckDB 
supports](https://duckdb.org/docs/data/partitioning/partitioned_writes) for 
this.
   
   
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
