kszucs edited a comment on pull request #7073: URL: https://github.com/apache/arrow/pull/7073#issuecomment-622333391
> * Simplified FileSystemDataset to hold a FragmentVector. Each Fragment must be a FileFragment and is checked at `FileSystemDataset::Make`. Fragments are not required to use the same backing filesystem nor the same format.

This makes me wonder: why do we need FileSystemDataset and/or UnionDataset at all? Is it because all of the filesystem- and format-specific logic is implemented by the discovery objects? That implies we could define a dataset as a vector of arbitrary fragments. (BTW, this was my conclusion after my Python dataset prototype, but I convinced myself that a generic dataset would prevent us from implementing further optimizations.)
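
To make the "dataset as a vector of arbitrary fragments" idea concrete, here is a rough sketch in plain Python. It does not use the Arrow API: `Fragment`, `InMemoryFragment`, `FileLikeFragment`, and `GenericDataset` are hypothetical names invented for illustration, and the file-backed fragment only pretends to read from a path.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Protocol


class Fragment(Protocol):
    """Hypothetical minimal interface: anything that can yield rows."""

    def scan(self) -> Iterator[Dict]: ...


@dataclass
class InMemoryFragment:
    """Fragment backed by rows already held in memory."""

    rows: List[Dict]

    def scan(self) -> Iterator[Dict]:
        return iter(self.rows)


@dataclass
class FileLikeFragment:
    """Stand-in for a file-backed fragment; path and fmt are only labels here."""

    path: str
    fmt: str
    rows: List[Dict]  # pretend these rows were decoded from `path`

    def scan(self) -> Iterator[Dict]:
        return iter(self.rows)


@dataclass
class GenericDataset:
    """A dataset that is nothing more than a list of arbitrary fragments."""

    fragments: List[Fragment]

    def scan(self) -> Iterator[Dict]:
        # The dataset knows nothing about filesystems or formats;
        # it only concatenates whatever each fragment yields.
        for fragment in self.fragments:
            yield from fragment.scan()


dataset = GenericDataset(
    fragments=[
        FileLikeFragment("a.parquet", "parquet", [{"x": 1}]),
        FileLikeFragment("b.csv", "csv", [{"x": 2}]),
        InMemoryFragment([{"x": 3}]),
    ]
)
rows = list(dataset.scan())
```

The trade-off raised in the comment shows up here: because `GenericDataset` sees only the `scan` interface, it cannot exploit anything specific to a filesystem or format, which is where further optimizations might otherwise live.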
