tustvold commented on issue #2230:
URL: https://github.com/apache/arrow-rs/issues/2230#issuecomment-1200462036

   > You just want to keep it simple, because either object stores or file 
system just require location to get object. Am I right ?
   
   Yeah, you most definitely shouldn't **need** the object meta to request a 
file, although the request preconditions (#2241) would optionally allow 
restrictions based on a limited subset of the metadata, constrained by what is 
generally supported.
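
   As a rough illustration of what such preconditions could look like, here is a minimal sketch modelled on HTTP conditional requests (`If-Match`, `If-Unmodified-Since`). The `Meta` and `Preconditions` types and their fields are hypothetical names made up for this example, not the API proposed in #2241:

```rust
/// Hypothetical subset of object metadata that a store could check.
#[derive(Debug, Clone, PartialEq)]
struct Meta {
    etag: String,
    /// Last modification time as seconds since the Unix epoch.
    last_modified: u64,
}

/// Hypothetical, optional restrictions attached to a get request.
/// An empty value (the default) imposes no restriction, so the
/// location alone is still enough to fetch an object.
#[derive(Debug, Default)]
struct Preconditions {
    if_match: Option<String>,
    if_unmodified_since: Option<u64>,
}

impl Preconditions {
    /// Returns `Ok(())` if the object satisfies every precondition that is set.
    fn check(&self, meta: &Meta) -> Result<(), String> {
        if let Some(etag) = &self.if_match {
            if etag != &meta.etag {
                return Err(format!("etag mismatch: wanted {}, found {}", etag, meta.etag));
            }
        }
        if let Some(ts) = self.if_unmodified_since {
            if meta.last_modified > ts {
                return Err(format!("object modified after {}", ts));
            }
        }
        Ok(())
    }
}
```

   The point of the sketch is only that every field is optional, so callers that have no metadata to hand lose nothing.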
   
   > I think that there should be yet another ObjectStore trait in DataFusion, 
that should use output of list method as an input to get with some additional 
flexibility for dynamic attributes.
   
   I agree that this should be handled at the DataFusion layer, although I'm 
not entirely sure what the interface should look like. Personally I would be 
reluctant to overload the ObjectStore name, as I think that would get very 
confusing, but the broad idea of introducing more flexibility into the IO 
operators sounds like a good thing imo :+1:
   
   One simple option might be to adapt the existing `FormatReader` trait to 
this purpose or something? :thinking: Tbh what you really want is to be able to 
override the `AsyncFileReader` that is passed to 
`ParquetRecordBatchStreamBuilder` by `ParquetOpener`. This feels like it should 
be eminently achievable without major rework. _It also occurs to me that 
having separate ParquetExec, JsonExec, etc. is basically redundant now that we have 
the `FormatReader` abstraction_.
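
   To make the "override the reader" idea concrete, here is a minimal, std-only sketch of the shape such an extension point might take. Everything in it is a stand-in invented for illustration: `FileReader` is a simplified, synchronous substitute for `AsyncFileReader`, `Opener` plays the role of `ParquetOpener`, and `ReaderFactory` is a hypothetical trait, not an existing DataFusion or parquet API:

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Simplified stand-in for `AsyncFileReader`: a synchronous byte-range read,
/// so the sketch needs no async runtime.
trait FileReader {
    fn get_bytes(&mut self, start: usize, end: usize) -> Result<Vec<u8>, String>;
}

/// Hypothetical extension point: builds a reader for a given location.
/// Users could supply their own implementation to add caching, custom
/// attributes, metadata prefetching, and so on.
trait ReaderFactory: Send + Sync {
    fn create_reader(&self, location: &str) -> Result<Box<dyn FileReader>, String>;
}

/// Example reader that serves bytes from memory.
struct InMemoryReader {
    data: Arc<Vec<u8>>,
}

impl FileReader for InMemoryReader {
    fn get_bytes(&mut self, start: usize, end: usize) -> Result<Vec<u8>, String> {
        self.data
            .get(start..end)
            .map(|bytes| bytes.to_vec())
            .ok_or_else(|| format!("range {}..{} out of bounds", start, end))
    }
}

/// Example factory backed by an in-memory map of location -> contents.
struct InMemoryFactory {
    files: HashMap<String, Arc<Vec<u8>>>,
}

impl ReaderFactory for InMemoryFactory {
    fn create_reader(&self, location: &str) -> Result<Box<dyn FileReader>, String> {
        let data = self
            .files
            .get(location)
            .cloned()
            .ok_or_else(|| format!("not found: {}", location))?;
        Ok(Box::new(InMemoryReader { data }))
    }
}

/// Stand-in for `ParquetOpener`: it is handed a factory instead of
/// constructing the reader itself, which is the whole point of the sketch.
struct Opener {
    factory: Arc<dyn ReaderFactory>,
}

impl Opener {
    fn read_range(&self, location: &str, start: usize, end: usize) -> Result<Vec<u8>, String> {
        let mut reader = self.factory.create_reader(location)?;
        reader.get_bytes(start, end)
    }
}
```

   With something of this shape the opener only ever needs a location, and any extra per-file state lives inside whatever factory the user plugs in.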
   
   If you like I'd be happy to raise an issue on DataFusion proposing something 
to this effect, and depending on feedback potentially bash it out for you? Also 
happy to leave this with you, just let me know. I'd very much like to help you 
get this over the line :smile: 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
