alamb commented on pull request #1010: URL: https://github.com/apache/arrow-datafusion/pull/1010#issuecomment-926082267
> I have never met a case of mixed file formats, so I wouldn't really know what is important to take into consideration

For what it is worth, I think the design of this PR makes it easier to support a mixed file format `TableProvider` in the future, even though `ListingProvider` may not do so at the moment. Introducing the `FileFormat` abstraction will allow someone to create something like `MixedProvider` that then calls the appropriate `FileFormat`s if they so desire.

> I do know that git history is very important, I am not arguing that 😃. Do we all agree that we should directly replace the old implementations in this PR, even if it means that we will need to modify a large part of the code base at once?

I can see both sides here (one massive PR vs multiple smaller PRs) and they both involve tradeoffs.

Some thoughts on making reviews easier:
1. Make separate PRs onto this branch (as you seem to be planning for the `ObjectStore` use)
2. Keep the commits separated into logical chunks

Regardless of the route, I wonder if this PR is one we may want to draw some extra attention to.
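
To make the `MixedProvider` idea above a bit more concrete, here is a rough sketch of how such a provider might dispatch to a format based on the file extension. Everything in it is hypothetical: the `FileFormat` trait below is a simplified stand-in and not the trait from this PR, `CsvFormat` and `ParquetFormat` are placeholder types, and `scan` just returns a description string rather than building an execution plan.

```rust
use std::collections::HashMap;
use std::path::Path;

// Hypothetical, heavily simplified stand-in for a `FileFormat` abstraction.
// A real trait would build an execution plan; this one only describes the scan.
trait FileFormat {
    fn scan(&self, path: &str) -> String;
}

// Placeholder format types, for illustration only.
struct CsvFormat;
struct ParquetFormat;

impl FileFormat for CsvFormat {
    fn scan(&self, path: &str) -> String {
        format!("csv scan of {}", path)
    }
}

impl FileFormat for ParquetFormat {
    fn scan(&self, path: &str) -> String {
        format!("parquet scan of {}", path)
    }
}

// Hypothetical provider that picks a format per file from its extension.
struct MixedProvider {
    formats: HashMap<String, Box<dyn FileFormat>>,
}

impl MixedProvider {
    fn new() -> Self {
        MixedProvider {
            formats: HashMap::new(),
        }
    }

    // Associate a file extension (without the dot) with a format.
    fn register(&mut self, extension: &str, format: Box<dyn FileFormat>) {
        self.formats.insert(extension.to_string(), format);
    }

    // Returns `None` when the path has no extension or no format is registered for it.
    fn scan(&self, path: &str) -> Option<String> {
        let extension = Path::new(path).extension()?.to_str()?;
        self.formats.get(extension).map(|format| format.scan(path))
    }
}
```

The point is only that once each format sits behind a common trait, mixing formats in one table becomes a lookup problem in the provider rather than a change to every format's code.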
