[ https://issues.apache.org/jira/browse/ARROW-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Weston Pace closed ARROW-13333.
-------------------------------
Resolution: Duplicate
> [C++] [Dataset] Support max file size option in write dataset
> -------------------------------------------------------------
>
> Key: ARROW-13333
> URL: https://issues.apache.org/jira/browse/ARROW-13333
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Weston Pace
> Priority: Major
>
> The existence of FileSystemDatasetWriteOptions::basename_template would seem
> to imply that the dataset writer may write multiple files for a given
> partition. However, the current implementation will always create one file
> per directory.
>
> I'm not sure what the desired behavior is here but the two obvious choices
> are:
> * Get rid of FileSystemDatasetWriteOptions::basename_template (or at least
> the {i} parameter)
> * Add an option to limit how many rows/bytes are put in a single file
>
> ARROW-12358 is probably worth mentioning, since whatever strategy is chosen
> here should probably be compatible with supporting append mode in the future.
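
As a rough illustration of the second option in the description above (limiting how many rows go into a single file), the following standalone sketch shows how the {i} placeholder could be used to roll over to a new file once a row limit is reached. It does not use the Arrow API: ExpandBasename, PlanFiles, PlannedFile and the max_rows_per_file parameter are hypothetical names chosen for this example, and the template string in the usage note is only illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: substitute the first "{i}" in a basename template
// with a running file index. A template without "{i}" is returned unchanged.
std::string ExpandBasename(const std::string& basename_template,
                           std::size_t index) {
  const std::string token = "{i}";
  std::string name = basename_template;
  const std::size_t pos = name.find(token);
  if (pos != std::string::npos) {
    name.replace(pos, token.size(), std::to_string(index));
  }
  return name;
}

// One planned output file within a single partition directory.
struct PlannedFile {
  std::string basename;
  std::size_t num_rows;
};

// Hypothetical planner: given the row counts of the batches arriving for one
// partition, decide how many files are needed and how many rows each holds.
// A new file is started whenever the current one reaches max_rows_per_file.
std::vector<PlannedFile> PlanFiles(const std::vector<std::size_t>& batch_rows,
                                   const std::string& basename_template,
                                   std::size_t max_rows_per_file) {
  assert(max_rows_per_file > 0);
  std::vector<PlannedFile> files;
  for (std::size_t rows : batch_rows) {
    while (rows > 0) {
      if (files.empty() || files.back().num_rows == max_rows_per_file) {
        // The index passed here is the number of files planned so far.
        files.push_back({ExpandBasename(basename_template, files.size()), 0});
      }
      const std::size_t room = max_rows_per_file - files.back().num_rows;
      const std::size_t take = rows < room ? rows : room;
      files.back().num_rows += take;
      rows -= take;
    }
  }
  return files;
}
```

For example, PlanFiles({5, 5, 3}, "part-{i}.parquet", 4) plans four files, part-0.parquet through part-3.parquet, holding 4, 4, 4 and 1 rows. A byte-based limit would follow the same shape with sizes in place of row counts. Because the index is derived only from the files planned in the current call, an append mode along the lines of ARROW-12358 would presumably need to start the counter from whatever already exists in the directory.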