alamb commented on pull request #1010:
URL: https://github.com/apache/arrow-datafusion/pull/1010#issuecomment-926082267


   > I have never met a case of mixed file formats, so I wouldn't really know 
what is important to take into consideration
   
   For what it is worth, I think the design of this PR makes it easier to 
support a mixed file format `TableProvider` in the future, even though 
`ListingProvider` may not do so at the moment. Introducing the `FileFormat` 
abstraction will allow someone to create something like `MixedProvider` that 
then calls the appropriate FileFormats if they so desire. 
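
   To make the idea concrete, here is a minimal sketch of such a provider. Everything in it is hypothetical: the `FileFormat` trait below is a simplified stand-in (the trait in this PR is async and schema-aware), and `MixedProvider`, `register`, and `scan` are illustrative names, not DataFusion APIs. It only shows the dispatch pattern: pick a format per file, then delegate.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Simplified, hypothetical stand-in for the PR's `FileFormat` abstraction.
/// The real trait has a different signature; this one only shows the shape.
trait FileFormat {
    /// Short name of the format, used in the scan description.
    fn name(&self) -> &'static str;

    /// Pretend to scan a file and return a description of what was done.
    fn scan(&self, path: &str) -> String {
        format!("{} scan of {}", self.name(), path)
    }
}

struct CsvFormat;
struct ParquetFormat;

impl FileFormat for CsvFormat {
    fn name(&self) -> &'static str {
        "csv"
    }
}

impl FileFormat for ParquetFormat {
    fn name(&self) -> &'static str {
        "parquet"
    }
}

/// Hypothetical provider that chooses a format per file by its extension.
struct MixedProvider {
    formats: HashMap<String, Box<dyn FileFormat>>,
}

impl MixedProvider {
    fn new() -> Self {
        MixedProvider {
            formats: HashMap::new(),
        }
    }

    /// Associate a file extension (without the dot) with a format.
    fn register(&mut self, extension: &str, format: Box<dyn FileFormat>) {
        self.formats.insert(extension.to_string(), format);
    }

    /// Delegate to the format registered for the file's extension.
    /// Returns `None` if the path has no extension or no format matches.
    fn scan(&self, path: &str) -> Option<String> {
        let ext = Path::new(path).extension()?.to_str()?;
        let format = self.formats.get(ext)?;
        Some(format.scan(path))
    }
}
```

   With `CsvFormat` registered under `"csv"` and `ParquetFormat` under `"parquet"`, calling `scan("data/a.csv")` delegates to the CSV format, while a file with an unregistered extension yields `None`. Dispatching on the extension is just one assumption; a real provider could equally inspect file metadata or a catalog entry.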
   
   > I do know that git history is very important, I am not arguing that 😃. Do 
we all agree that we should directly replace the old implementations in this 
PR, even if it means that we will need to modify a large part of the code base 
at once? 
   
   I can see both sides here (one massive PR vs multiple smaller PRs) and they 
both involve tradeoffs. Some thoughts on making reviews easier:
   1. Make separate PRs onto this branch (as you seem to be planning for the 
`ObjectStore` use)
   2. Keep the commits separated into logical chunks
   
   Regardless of the route, I wonder if this PR is one we may want to draw some 
extra attention to.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


