alamb commented on issue #2205: URL: https://github.com/apache/arrow-datafusion/issues/2205#issuecomment-1097971612
> Why would you implement this in the ObjectStore API, and not some FileScan component generic over object stores. The caching, spilling, logic, etc... is not going to vary based on object store provider? An ObjectStore API that supports fetch requests with an optional byte range should have us covered?

I was thinking that keeping things behind an ObjectStore API makes sense because:

1. The economics and performance of S3, Glacier, HDFS, and local MinIO could be quite different, so the amount of consolidation, the number of requests, and the aggressiveness of caching might vary by object store implementation (not sure).
2. Some caching strategies / implementations (Redis, for example) might not be appropriate to include in core DataFusion.

So in other words, binding the details of caching / resource usage to DataFusion seemed to be unnecessary.
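To make the idea concrete, here is a rough sketch of what "caching behind the store API" could look like. Everything in it is an assumption made for illustration: `RangeStore`, `MemoryStore`, and `CachingStore` are hypothetical names, not DataFusion's actual `ObjectStore` trait or any existing type, and the cache is a plain unbounded in-memory map. The point is only that a caller using a fetch-with-optional-byte-range interface does not need to know whether, or how, a given store caches.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical, simplified store interface (NOT DataFusion's real trait):
/// fetch an object, optionally restricted to a byte range.
pub trait RangeStore {
    fn fetch(&mut self, path: &str, range: Option<Range<usize>>) -> Option<Vec<u8>>;
}

/// In-memory backend standing in for a remote store; it counts requests
/// so the effect of caching is observable.
pub struct MemoryStore {
    objects: HashMap<String, Vec<u8>>,
    pub requests: usize,
}

impl MemoryStore {
    pub fn new() -> Self {
        MemoryStore { objects: HashMap::new(), requests: 0 }
    }

    pub fn put(&mut self, path: &str, data: &[u8]) {
        self.objects.insert(path.to_string(), data.to_vec());
    }
}

impl RangeStore for MemoryStore {
    fn fetch(&mut self, path: &str, range: Option<Range<usize>>) -> Option<Vec<u8>> {
        self.requests += 1;
        let data = self.objects.get(path)?;
        match range {
            // Out-of-bounds ranges yield None instead of panicking.
            Some(r) => data.get(r).map(|s| s.to_vec()),
            None => Some(data.clone()),
        }
    }
}

/// Wrapper that adds caching without changing the interface. A store for a
/// different provider could pick another policy (or none) behind the same trait.
pub struct CachingStore<S: RangeStore> {
    inner: S,
    cache: HashMap<(String, Option<Range<usize>>), Vec<u8>>,
}

impl<S: RangeStore> CachingStore<S> {
    pub fn new(inner: S) -> Self {
        CachingStore { inner, cache: HashMap::new() }
    }

    pub fn inner(&self) -> &S {
        &self.inner
    }
}

impl<S: RangeStore> RangeStore for CachingStore<S> {
    fn fetch(&mut self, path: &str, range: Option<Range<usize>>) -> Option<Vec<u8>> {
        let key = (path.to_string(), range.clone());
        // Serve repeated requests for the same (path, range) from the cache.
        if let Some(hit) = self.cache.get(&key) {
            return Some(hit.clone());
        }
        // Cache miss: go to the wrapped store and remember the result.
        let bytes = self.inner.fetch(path, range)?;
        self.cache.insert(key, bytes.clone());
        Some(bytes)
    }
}
```

Under these assumptions, wrapping a `MemoryStore` in a `CachingStore` and fetching the same range twice reaches the backend only once. Swapping in a Redis-backed or spill-to-disk wrapper would be a change on the store side only, which is the separation argued for above.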
