Rafferty97 commented on PR #20626: URL: https://github.com/apache/datafusion/pull/20626#issuecomment-4794901652
Hi @alamb, I've had a go at implementing your suggestion of a `PreReadDecoder` API, but I'm struggling to figure out the best place to put it. This feels like a feature that should be agnostic both to where the data comes from (filesystem, S3, whatever) and to which file format consumes it. However, implementors of `FileOpener` seem to interact with the object store abstraction directly, which leaves nowhere to insert this new step in the pipeline. I guess the most expedient path would be to implement it in each file format separately, similar to how compression currently works, but I don't think that's the best long-term solution. Keen to hear your thoughts if you have any ideas :)
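
For illustration, the kind of source- and format-agnostic step described above could look roughly like the sketch below. All names other than `PreReadDecoder` are hypothetical, and the trait shape is an assumption rather than the API from the PR. It also uses synchronous `std::io::Read` as a stand-in, not DataFusion's actual `FileOpener` or object store interfaces.

```rust
use std::io::{self, Read};

/// Hypothetical trait: a transformation applied to the raw byte stream
/// before any file-format reader sees it. The shape is an assumption.
pub trait PreReadDecoder {
    fn wrap<'a>(&self, raw: Box<dyn Read + 'a>) -> Box<dyn Read + 'a>;
}

/// Toy decoder used only for this sketch: XORs every byte with a fixed key.
pub struct XorDecoder {
    pub key: u8,
}

/// Reader adapter that applies the XOR to whatever bytes the inner reader yields.
struct XorReader<R> {
    inner: R,
    key: u8,
}

impl<R: Read> Read for XorReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        for b in &mut buf[..n] {
            *b ^= self.key;
        }
        Ok(n)
    }
}

impl PreReadDecoder for XorDecoder {
    fn wrap<'a>(&self, raw: Box<dyn Read + 'a>) -> Box<dyn Read + 'a> {
        Box::new(XorReader { inner: raw, key: self.key })
    }
}

/// Hypothetical pipeline step: sits between the byte source and the format
/// reader, and knows nothing about either of them.
pub fn open_with_decoder<'a>(
    source: Box<dyn Read + 'a>,
    decoder: Option<&dyn PreReadDecoder>,
) -> Box<dyn Read + 'a> {
    match decoder {
        Some(d) => d.wrap(source),
        None => source,
    }
}

/// Stand-in for a format reader (e.g. CSV): it only sees decoded bytes.
pub fn count_lines(mut input: impl Read) -> io::Result<usize> {
    let mut text = String::new();
    input.read_to_string(&mut text)?;
    Ok(text.lines().count())
}
```

In this arrangement the source (file, S3, in-memory buffer) and the format reader are both unaware of the decoder. That is the property that seems hard to get while each `FileOpener` implementation talks to the object store directly.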
