[
https://issues.apache.org/jira/browse/ARROW-9749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated ARROW-9749:
----------------------------------
Labels: dataset pull-request-available (was: dataset)
> [C++][Dataset] Extract format-specific scan options from FileFormat
> -------------------------------------------------------------------
>
> Key: ARROW-9749
> URL: https://issues.apache.org/jira/browse/ARROW-9749
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Affects Versions: 1.0.0
> Reporter: Ben Kietzman
> Assignee: David Li
> Priority: Major
> Labels: dataset, pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, format-specific scan options are embedded as members of the
> corresponding subclass of FileFormat. Extracting these to an options struct
> would provide better separation of concerns; currently the only way to scan a
> Parquet-formatted dataset with different options is to reconstruct it from
> its component files using a format instance configured with those options.
> CsvFileFormat could retain ParseOptions as a member, since (for example)
> tab-separated vs comma-separated values can justifiably be considered
> different formats.
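
The separation proposed above can be sketched roughly as follows. This is a
minimal, self-contained illustration and not the Arrow API: every name in it
(FragmentScanOptions, ParquetScanOptions, ParquetFormat, CsvFormat, Dataset,
ScanPlan, PlanScan, and the use_buffered_stream / buffer_size fields) is
hypothetical and chosen only for the example. The point it tries to show is
that one dataset can be scanned twice with different Parquet settings without
being rebuilt, while the CSV delimiter stays part of the format's identity.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// All names below are hypothetical stand-ins, not Arrow classes.

// Base type for per-scan, format-specific options.
struct FragmentScanOptions {
  virtual ~FragmentScanOptions() = default;
};

// Scan-time settings for Parquet, held outside the format object.
struct ParquetScanOptions : FragmentScanOptions {
  bool use_buffered_stream = false;
  std::int64_t buffer_size = 8192;
};

// The format itself no longer carries any scan-time settings.
struct ParquetFormat {};

// The CSV delimiter stays a member: tab-separated and comma-separated
// files are treated as different formats.
struct CsvFormat {
  explicit CsvFormat(char d) : delimiter(d) {}
  char delimiter;
};
inline bool operator==(const CsvFormat& a, const CsvFormat& b) {
  return a.delimiter == b.delimiter;
}
inline bool operator!=(const CsvFormat& a, const CsvFormat& b) {
  return !(a == b);
}

// A dataset is a format plus its component files; it is built once.
struct Dataset {
  ParquetFormat format;
  std::vector<std::string> files;
};

// What a scan would do; stands in for an actual reader.
struct ScanPlan {
  std::size_t num_files;
  bool use_buffered_stream;
  std::int64_t buffer_size;
};

// Options are supplied per scan. A null pointer, or options meant for
// another format, fall back to the Parquet defaults.
ScanPlan PlanScan(const Dataset& dataset,
                  const std::shared_ptr<FragmentScanOptions>& options) {
  const ParquetScanOptions defaults;
  const ParquetScanOptions* parquet =
      dynamic_cast<const ParquetScanOptions*>(options.get());
  if (parquet == nullptr) parquet = &defaults;
  return {dataset.files.size(), parquet->use_buffered_stream,
          parquet->buffer_size};
}
```

With this shape, calling PlanScan(dataset, nullptr) and then PlanScan with a
ParquetScanOptions whose use_buffered_stream is set reuses the same Dataset
object for both scans; only the options argument changes.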
--
This message was sent by Atlassian Jira
(v8.3.4#803005)