houqp commented on pull request #1072:
URL: https://github.com/apache/arrow-datafusion/pull/1072#issuecomment-939230833


   Thanks @rdettai for the detailed write up! I think going static should be 
good enough to unblock our development in the short run, but I agree with you 
that this is just a short-term workaround. To make object stores truly 
pluggable, we need to serialize them into unique identifiers, e.g. the URI 
scheme, stored as generic strings instead of an enum in protobuf. This way, if 
a user compiles in a new object store from a custom crate, they can still get 
it to work without having to change the ballista protobuf file.
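   To make the idea concrete, here is a minimal sketch of what a scheme-keyed registry could look like. The `ObjectStore` trait, `LocalFileStore`, and `ObjectStoreRegistry` names here are hypothetical stand-ins, not the actual DataFusion API; the point is only that the serialized plan carries a plain scheme string and the store is resolved by lookup at deserialization time:

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical minimal stand-in for an object store trait.
trait ObjectStore {
    fn scheme(&self) -> &str;
}

struct LocalFileStore;
impl ObjectStore for LocalFileStore {
    fn scheme(&self) -> &str {
        "file"
    }
}

// Registry keyed by URI scheme. A serialized plan only needs to carry the
// scheme string, so a custom store compiled in by the user resolves here
// without any change to the protobuf definitions.
struct ObjectStoreRegistry {
    stores: HashMap<String, Arc<dyn ObjectStore>>,
}

impl ObjectStoreRegistry {
    fn new() -> Self {
        Self { stores: HashMap::new() }
    }

    fn register(&mut self, store: Arc<dyn ObjectStore>) {
        self.stores.insert(store.scheme().to_string(), store);
    }

    // Resolve a store from a full URI by its scheme prefix.
    fn get(&self, uri: &str) -> Option<Arc<dyn ObjectStore>> {
        let scheme = uri.split("://").next()?;
        self.stores.get(scheme).cloned()
    }
}

fn main() {
    let mut registry = ObjectStoreRegistry::new();
    registry.register(Arc::new(LocalFileStore));
    // A registered scheme resolves; an unregistered one simply does not.
    assert!(registry.get("file:///tmp/data.parquet").is_some());
    assert!(registry.get("s3://bucket/key").is_none());
}
```

   An unregistered scheme can then fail with a clear "no object store registered for scheme" error instead of a protobuf decode failure.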
   
   In fact, I think we need to do the same thing for table providers as well; 
hardcoding table providers in protobuf leads to the same restriction. For 
example, it's not possible to use delta-rs's custom table provider with 
ballista at the moment.
   
   Given that the current logical plan deserialization code only takes a 
serialized protobuf plan as input, I think we would have to go with the lazy 
two-pass deserialization approach proposed by @alamb . Alternatively, we can 
change the deserialization call in the scheduler to pass in the execution 
context. This would make it a lot easier to implement more dynamic 
deserialization logic for both object stores and table providers. I don't see 
a strong reason to avoid referencing the execution context during logical plan 
deserialization.
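   The context-passing alternative can be sketched like this. `ExecutionContext`, `deserialize_table_scan`, and the factory map are hypothetical simplifications (the real plan node and provider types are much richer); the sketch only shows how deserialization resolves a provider by name through the context instead of through a protobuf enum:

```rust
use std::collections::HashMap;

// Hypothetical execution context holding user-registered table provider
// factories, keyed by a provider name string carried in the serialized plan.
struct ExecutionContext {
    table_factories: HashMap<String, fn(&str) -> String>,
}

// Because deserialization receives the context, a custom provider (e.g. one
// from delta-rs) registered at startup can be resolved without touching the
// ballista protobuf file.
fn deserialize_table_scan(
    ctx: &ExecutionContext,
    provider: &str,
    path: &str,
) -> Option<String> {
    ctx.table_factories.get(provider).map(|factory| factory(path))
}

fn main() {
    let mut ctx = ExecutionContext {
        table_factories: HashMap::new(),
    };
    // The user registers a custom provider factory by name.
    ctx.table_factories
        .insert("delta".to_string(), |path| format!("DeltaTable({})", path));

    // Deserialization looks the provider up through the context.
    assert_eq!(
        deserialize_table_scan(&ctx, "delta", "/tmp/t"),
        Some("DeltaTable(/tmp/t)".to_string())
    );
    // An unknown provider name is a lookup miss, not a protobuf change.
    assert!(deserialize_table_scan(&ctx, "csv", "/tmp/t").is_none());
}
```

   The lazy two-pass approach would instead decode the plan skeleton first and resolve providers in a second pass once the context is available; either way the protobuf stays generic.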


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
