nealrichardson commented on pull request #9725: URL: https://github.com/apache/arrow/pull/9725#issuecomment-803040199
The R user experience here is not good; I'm happy to improve it in a followup, but I'm not sure how feasible that will be. I'm not sure I understand why `FragmentScanOptions` is separate from `CsvFileFormat`--it seems that all of the options I provide there are CSV-specific. The issue is that I want to declare those (null_values, etc.) up front, along with all of the other parsing instructions for the files (column_names, etc.). If we decide that those are two objects, ok, but that means that `open_dataset()` needs to assemble both the `Dataset` (via `DatasetFactory`, creating a `CsvFileFormat` along the way) and the `FragmentScanOptions`, and then somehow attach the `FragmentScanOptions` to the R `Dataset` object and carry it around until a scan is initialized on the dataset. (R users never make a `ScannerBuilder` themselves; it's all wrapped in higher-level functions.) Maybe that's not a problem, but it's not something we've had to do elsewhere.
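
To make the concern concrete, here is a rough sketch of the "carry it around" pattern described above. It is plain Python with hypothetical stand-in classes (`CsvFormat`, `CsvScanOptions`, `Dataset`, `new_scan`), not the Arrow C++ or R API; the only point is to show which object ends up owning which option when the user declares everything in one `open_dataset()` call.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-ins, NOT the Arrow API. They only model option ownership.


@dataclass(frozen=True)
class CsvFormat:
    """Format-level parsing instructions, fixed when the dataset is created."""
    column_names: Tuple[str, ...] = ()


@dataclass(frozen=True)
class CsvScanOptions:
    """Scan-time options, only consumed when a scan is initialized."""
    null_values: Tuple[str, ...] = ("", "NA")


@dataclass
class Dataset:
    path: str
    file_format: CsvFormat
    # Stashed on the dataset and carried along until a scan starts.
    scan_options: Optional[CsvScanOptions] = None


def open_dataset(path, column_names=(), null_values=("", "NA")):
    """The user declares every option up front; the wrapper splits them in two."""
    fmt = CsvFormat(column_names=tuple(column_names))
    opts = CsvScanOptions(null_values=tuple(null_values))
    return Dataset(path=path, file_format=fmt, scan_options=opts)


def new_scan(dataset):
    """Only at this point are the stashed options handed to the scan."""
    opts = dataset.scan_options or CsvScanOptions()
    return {
        "path": dataset.path,
        "column_names": dataset.file_format.column_names,
        "null_values": opts.null_values,
    }


ds = open_dataset("data/", column_names=["a", "b"], null_values=["", "NULL"])
scan = new_scan(ds)
```

The awkward part is the `scan_options` field: the dataset wrapper has to hold state that it never uses itself, purely so that a later scan can pick it up.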
