thinkharderdev opened a new issue, #7303:
URL: https://github.com/apache/arrow-datafusion/issues/7303
### Is your feature request related to a problem or challenge?
Currently, plans that include an `InsertExec` cannot be serialized to
protobuf, and hence cannot be used in Ballista.
### Describe the solution you'd like
The easiest way to support this would be to modify `PhysicalExtensionCodec`
to support serializing/deserializing a `dyn DataSink`. So something like:
```
pub trait PhysicalExtensionCodec: Debug + Send + Sync {
    fn try_decode(
        &self,
        buf: &[u8],
        inputs: &[Arc<dyn ExecutionPlan>],
        registry: &dyn FunctionRegistry,
    ) -> Result<Arc<dyn ExecutionPlan>>;

    fn try_decode_data_sink(&self, buf: &[u8]) -> Result<Arc<dyn DataSink>> {
        // Default impl for backcompat
        Err(DataFusionError::NotImplemented(
            "PhysicalExtensionCodec::try_decode_data_sink not implemented".into(),
        ))
    }

    fn try_encode(&self, node: Arc<dyn ExecutionPlan>, buf: &mut Vec<u8>) -> Result<()>;

    fn try_encode_data_sink(&self, sink: Arc<dyn DataSink>, buf: &mut Vec<u8>) -> Result<()> {
        // Default impl for backcompat
        Err(DataFusionError::NotImplemented(
            "PhysicalExtensionCodec::try_encode_data_sink not implemented".into(),
        ))
    }
}
```
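To make the codec-based option concrete, here is a minimal, self-contained sketch of how a user-defined codec might round-trip a sink. It does not use DataFusion: `SinkCodec`, `FileSink`, `FileSinkCodec`, `LegacyCodec`, the `String` error type, and the `as_any` accessor on `DataSink` are all hypothetical stand-ins chosen for illustration, and the byte format (the raw UTF-8 path) is an assumption rather than a proposed wire format.

```rust
use std::any::Any;
use std::fmt::Debug;
use std::sync::Arc;

// Stand-in for DataFusion's `Result` (assumption: a plain String error).
pub type Result<T> = std::result::Result<T, String>;

// Stand-in for `DataSink`. `as_any` is a hypothetical accessor that lets a
// codec downcast to the concrete sink type it knows how to encode.
pub trait DataSink: Debug + Send + Sync {
    fn as_any(&self) -> &dyn Any;
}

// Stand-in for the two new `PhysicalExtensionCodec` methods. The defaults
// return an error so that existing codecs keep compiling unchanged.
pub trait SinkCodec: Debug + Send + Sync {
    fn try_decode_data_sink(&self, _buf: &[u8]) -> Result<Arc<dyn DataSink>> {
        Err("try_decode_data_sink not implemented".to_string())
    }

    fn try_encode_data_sink(&self, _sink: Arc<dyn DataSink>, _buf: &mut Vec<u8>) -> Result<()> {
        Err("try_encode_data_sink not implemented".to_string())
    }
}

// Hypothetical sink whose only state is an output path.
#[derive(Debug, PartialEq)]
pub struct FileSink {
    pub path: String,
}

impl DataSink for FileSink {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// A codec that knows how to encode and decode `FileSink`.
#[derive(Debug)]
pub struct FileSinkCodec;

impl SinkCodec for FileSinkCodec {
    fn try_decode_data_sink(&self, buf: &[u8]) -> Result<Arc<dyn DataSink>> {
        let path = String::from_utf8(buf.to_vec()).map_err(|e| e.to_string())?;
        Ok(Arc::new(FileSink { path }))
    }

    fn try_encode_data_sink(&self, sink: Arc<dyn DataSink>, buf: &mut Vec<u8>) -> Result<()> {
        // Reject sink types this codec does not recognize.
        let file_sink = sink
            .as_any()
            .downcast_ref::<FileSink>()
            .ok_or_else(|| "unsupported sink type".to_string())?;
        buf.extend_from_slice(file_sink.path.as_bytes());
        Ok(())
    }
}

// A pre-existing codec that overrides nothing and relies on the defaults.
#[derive(Debug)]
pub struct LegacyCodec;

impl SinkCodec for LegacyCodec {}
```

In this sketch, `LegacyCodec` compiles without changes and only fails at runtime if a plan containing a sink reaches it, which is the backward-compatibility property the default implementations are meant to provide.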
Alternatively we might push serialization to the `DataSink` trait itself:
```
#[async_trait]
pub trait DataSink: DisplayAs + Debug + Send + Sync {
    // TODO add desired input ordering
    // How does this sink want its input ordered?

    /// Writes the data to the sink, returns the number of values written
    ///
    /// This method will be called exactly once during each DML
    /// statement. Thus prior to return, the sink should do any commit
    /// or rollback required.
    async fn write_all(
        &self,
        data: Vec<SendableRecordBatchStream>,
        context: &Arc<TaskContext>,
    ) -> Result<u64>;

    /// Encode `self` into the provided buffer
    fn encode(&self, buf: &mut Vec<u8>) -> Result<()>;

    /// Decode an instance of `Self` from a buffer
    // `Self: Sized` keeps the trait usable as `dyn DataSink`
    fn try_decode(buf: &[u8]) -> Result<Self>
    where
        Self: Sized;
}
```
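One consequence of the trait-based option is worth spelling out: `try_decode` has no receiver, so it cannot be called through a `dyn DataSink`, and the deserializer still needs some way to pick the concrete type. The sketch below illustrates this with simplified stand-ins, not DataFusion types: `CountingSink`, `COUNTING_SINK_TAG`, `decode_tagged`, and the one-byte type tag are hypothetical, and `write_all` is omitted for brevity.

```rust
use std::convert::TryFrom;
use std::fmt::Debug;
use std::sync::Arc;

// Stand-in for DataFusion's `Result` (assumption: a plain String error).
pub type Result<T> = std::result::Result<T, String>;

// Reduced version of the proposed trait (no `write_all`).
pub trait DataSink: Debug + Send + Sync {
    /// Encode `self` into the provided buffer
    fn encode(&self, buf: &mut Vec<u8>) -> Result<()>;

    /// Decode an instance of `Self` from a buffer.
    /// The `Self: Sized` bound keeps the trait usable as `dyn DataSink`.
    fn try_decode(buf: &[u8]) -> Result<Self>
    where
        Self: Sized;
}

// Hypothetical sink with a single numeric setting.
#[derive(Debug)]
pub struct CountingSink {
    pub limit: u64,
}

impl DataSink for CountingSink {
    fn encode(&self, buf: &mut Vec<u8>) -> Result<()> {
        buf.extend_from_slice(&self.limit.to_le_bytes());
        Ok(())
    }

    fn try_decode(buf: &[u8]) -> Result<Self> {
        let bytes = <[u8; 8]>::try_from(buf).map_err(|_| "expected 8 bytes".to_string())?;
        Ok(CountingSink { limit: u64::from_le_bytes(bytes) })
    }
}

// Hypothetical type tag stored alongside the encoded bytes.
pub const COUNTING_SINK_TAG: u8 = 1;

// The caller still has to map a tag to a concrete type before it can call
// the static `try_decode`; the trait alone cannot do this dispatch.
pub fn decode_tagged(tag: u8, buf: &[u8]) -> Result<Arc<dyn DataSink>> {
    match tag {
        COUNTING_SINK_TAG => Ok(Arc::new(CountingSink::try_decode(buf)?)),
        other => Err(format!("unknown sink tag {}", other)),
    }
}
```

Under these assumptions, encoding moves onto the sink itself, but decoding still requires a registry of known sink types somewhere, which is roughly the role `PhysicalExtensionCodec` plays in the first option.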
### Describe alternatives you've considered
Do nothing.
### Additional context
_No response_