Cheappie opened a new issue, #4533:
URL: https://github.com/apache/arrow-datafusion/issues/4533

   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   I am using `ParquetExec` together with a `ParquetFileReaderFactory` to 
bypass `ObjectStore`. To create a `FileScanConfig`, I have to supply a fake 
value for `object_store_url`. Creating the `FileStream` later fails because it 
tries to fetch an `ObjectStore` that doesn't exist. For now I work around the 
problem by creating a fake `ObjectStore` that is never used.
   
   **Describe the solution you'd like**
   One solution that comes to mind is to make `FileOpener` self-contained by 
bundling each opener with its `ObjectStore` and `FileMeta`. That approach has 
some challenges. For example, `ParquetFileReaderFactory` wants to own the 
`FileMeta` when creating a reader, so we would have to either clone the 
`FileMeta` or have `FileOpener` take `self` by value so that the `FileMeta` can 
be moved out. Alternatively, we could keep `FileMeta` behind a shared pointer, 
but then we could not move `ObjectMeta` out of it. 
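
To make the ownership trade-off concrete, here is a minimal sketch of the three options (take `self` by value, clone, or share behind an `Arc`). Everything in it is a hypothetical stand-in written for illustration: `SelfContainedOpener`, `ParquetLikeOpener`, `create_reader`, and the simplified `FileMeta`/`ObjectMeta` structs are assumptions, not the actual DataFusion types or signatures.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-ins; NOT the real DataFusion types.
#[derive(Debug, Clone, PartialEq)]
struct ObjectMeta {
    location: String,
    size: usize,
}

#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    object_meta: ObjectMeta,
}

/// Hypothetical opener that already carries everything it needs,
/// so the stream never has to look up an `ObjectStore`.
trait SelfContainedOpener {
    /// Consumes the opener so owned fields can be moved out without cloning.
    fn open(self: Box<Self>) -> String;
}

/// Stand-in for a reader factory that wants to own the `FileMeta`.
fn create_reader(meta: FileMeta) -> String {
    format!(
        "reader({}, {} bytes)",
        meta.object_meta.location, meta.object_meta.size
    )
}

struct ParquetLikeOpener {
    file_meta: FileMeta,
}

impl SelfContainedOpener for ParquetLikeOpener {
    fn open(self: Box<Self>) -> String {
        // Option 1: `self` is consumed, so `file_meta` moves into the factory.
        let this = *self;
        create_reader(this.file_meta)
    }
}

/// Options 2 and 3: with the metadata behind an `Arc`, it can only be moved
/// out when this is the last reference; otherwise we fall back to a clone.
fn open_shared(meta: Arc<FileMeta>) -> String {
    let owned = Arc::try_unwrap(meta).unwrap_or_else(|shared| (*shared).clone());
    create_reader(owned)
}

/// Builds one opener and opens it through a trait object.
fn demo_open() -> String {
    let opener: Box<dyn SelfContainedOpener> = Box::new(ParquetLikeOpener {
        file_meta: FileMeta {
            object_meta: ObjectMeta {
                location: "data/a.parquet".to_string(),
                size: 1024,
            },
        },
    });
    opener.open()
}
```

Calling `demo_open()` from a `main` function exercises the by-value path. The `Arc` variant shows why a shared pointer is awkward here: the move only succeeds when no other reference is alive, so in general a clone is still needed.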
   
   One thing that **might** be a positive outcome is that, in the future, 
files with different openers could be processed within a single `FileStream`. 
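
As a rough illustration of that possible outcome, the sketch below drains a queue of heterogeneous, self-contained openers from one stream-like type. All names here (`SelfContainedOpener`, `ParquetLikeOpener`, `CsvLikeOpener`, `MixedFileStream`) are hypothetical and only model the idea; the real `FileStream` is asynchronous and yields record batches rather than strings.

```rust
use std::collections::VecDeque;

/// Hypothetical opener trait: each opener owns what it needs to open one file.
trait SelfContainedOpener {
    fn open(self: Box<Self>) -> String;
}

struct ParquetLikeOpener {
    path: String,
}

struct CsvLikeOpener {
    path: String,
    delimiter: char,
}

impl SelfContainedOpener for ParquetLikeOpener {
    fn open(self: Box<Self>) -> String {
        format!("parquet:{}", self.path)
    }
}

impl SelfContainedOpener for CsvLikeOpener {
    fn open(self: Box<Self>) -> String {
        format!("csv:{}:{}", self.path, self.delimiter)
    }
}

/// Hypothetical stand-in for a file stream: it does not know which format
/// each file has, it only asks the next opener to open itself.
struct MixedFileStream {
    pending: VecDeque<Box<dyn SelfContainedOpener>>,
}

impl Iterator for MixedFileStream {
    type Item = String;

    fn next(&mut self) -> Option<String> {
        // Pop the next opener, if any, and consume it.
        self.pending.pop_front().map(|opener| opener.open())
    }
}

/// Builds a stream over two files that use different openers.
fn demo_mixed() -> Vec<String> {
    let mut pending: VecDeque<Box<dyn SelfContainedOpener>> = VecDeque::new();
    pending.push_back(Box::new(ParquetLikeOpener {
        path: "a.parquet".to_string(),
    }));
    pending.push_back(Box::new(CsvLikeOpener {
        path: "b.csv".to_string(),
        delimiter: ',',
    }));
    MixedFileStream { pending }.collect()
}
```

Because each opener is a boxed trait object carrying its own state, the stream needs no per-format configuration. The cost is dynamic dispatch and one allocation per file.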
   
   Please share your thoughts. I would be glad to hear of a simpler way to 
improve the situation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.