nealrichardson commented on pull request #9725: URL: https://github.com/apache/arrow/pull/9725#issuecomment-803052664
> Maybe the respective scan options could be inlined or embedded into the file format to provide defaults?

Yeah, I think that would be nice. I don't really understand the use case for scanning the same files with different parsing options, unless I'm trying to figure out what the "right" options are. To me, things like `null_values` are not scan-time preferences; they're properties that describe what's in the files, so I want to declare them up front and don't need to adjust them later.

Is there a reason one would need to scan the same dataset with different parsing options, rather than create a new dataset with the options specified up front? I wonder whether the extra complexity of also accepting them at scan time is worth it if there's a simple solution like that.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
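To make the trade-off in the comment above concrete, here is a minimal, standard-library-only sketch of the two designs: parsing options declared up front on the file format, with an optional scan-time override. `CsvOptions`, `CsvFormat`, and `scan` are hypothetical names invented for this illustration; they are not the Arrow dataset API and do not reflect what the PR implements.

```python
import csv
import io
from dataclasses import dataclass, field, replace
from typing import Dict, Iterator, Optional


@dataclass(frozen=True)
class CsvOptions:
    """Hypothetical parsing options that describe what is in the files."""
    delimiter: str = ","
    null_values: frozenset = frozenset({""})


@dataclass(frozen=True)
class CsvFormat:
    """Hypothetical file format that carries default parsing options."""
    defaults: CsvOptions = field(default_factory=CsvOptions)


def scan(
    text: str,
    fmt: CsvFormat,
    overrides: Optional[CsvOptions] = None,
) -> Iterator[Dict[str, Optional[str]]]:
    """Yield rows, using scan-time overrides if given, else the format defaults."""
    opts = overrides if overrides is not None else fmt.defaults
    reader = csv.DictReader(io.StringIO(text), delimiter=opts.delimiter)
    for row in reader:
        # Map any value listed in null_values to None.
        yield {k: (None if v in opts.null_values else v) for k, v in row.items()}


DATA = "id,score\n1,NA\n2,7\n"

# Declared up front: "NA" means null for every scan of this format.
fmt = CsvFormat(defaults=CsvOptions(null_values=frozenset({"NA"})))
rows_default = list(scan(DATA, fmt))

# Scan-time override: the same data read with no null markers at all.
rows_override = list(
    scan(DATA, fmt, overrides=replace(fmt.defaults, null_values=frozenset()))
)
```

In this sketch the override path is the only extra complexity; dropping the `overrides` parameter and constructing a second `CsvFormat` instead would correspond to the "create a new dataset with the options specified up front" alternative raised in the comment.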
