tustvold commented on issue #6339:
URL:
https://github.com/apache/arrow-datafusion/issues/6339#issuecomment-1545867063
> And should the methods be async
The return type is `BoxFuture`, which allows for asynchronous completion.
> For that it would be necessary to have some kind of flush method in the
> DataSink trait that allows to update the state of the table
As the return type is `BoxFuture`, the implementation could update the
state once the stream has been exhausted? I.e. something like:
```
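use futures::{future::BoxFuture, FutureExt, StreamExt};

impl DataSink for MyTable {
    fn write_stream(
        &self,
        partition: usize,
        mut input: SendableRecordBatchStream,
    ) -> BoxFuture<'_, Result<u64>> {
        async move {
            let mut rows = 0u64;
            while let Some(batch) = input.next().await.transpose()? {
                // Process batch
                rows += batch.num_rows() as u64;
            }
            // Stream exhausted: commit transaction
            Ok(rows)
        }
        .boxed()
    }
}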
```
> return the number of rows written
I personally would return `()` and leave metrics accounting as a
`TableProvider` detail.
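
To make the `()` variant concrete, here is a minimal sketch using only the standard library. Everything in it is a stand-in chosen for illustration, not DataFusion's API: `Batch`, the simplified `DataSink` trait, the `committed` and `rows_written` fields, and the `poll_ready` helper are hypothetical, a `Vec` replaces the record batch stream, and `BoxFuture` is redefined locally instead of being imported from `futures`. It shows the table committing only after its input is exhausted and keeping the row count as its own internal detail.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Local stand-in for `futures::future::BoxFuture`, so the sketch needs only std.
type BoxFuture<'a, T> = Pin<Box<dyn Future<Output = T> + Send + 'a>>;

/// Hypothetical stand-in for a record batch: only its row count matters here.
struct Batch {
    rows: u64,
}

/// Hypothetical sink trait whose write method returns `()` rather than a row count.
trait DataSink {
    fn write_stream(
        &self,
        input: Vec<Result<Batch, String>>,
    ) -> BoxFuture<'_, Result<(), String>>;
}

/// Hypothetical table that owns both its committed state and its metrics.
#[derive(Default)]
struct MyTable {
    committed: Mutex<Vec<u64>>,
    rows_written: AtomicU64,
}

impl DataSink for MyTable {
    fn write_stream(
        &self,
        input: Vec<Result<Batch, String>>,
    ) -> BoxFuture<'_, Result<(), String>> {
        Box::pin(async move {
            // Stage batches locally; an error aborts before anything is committed.
            let mut staged = Vec::new();
            for next in input {
                staged.push(next?.rows);
            }
            // Input exhausted: commit and update metrics in one place.
            let total: u64 = staged.iter().sum();
            self.committed.lock().unwrap().extend(staged);
            self.rows_written.fetch_add(total, Ordering::Relaxed);
            Ok::<(), String>(())
        })
    }
}

/// Waker that does nothing; enough to poll a future that never suspends.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future once and returns its output, panicking if it is not ready.
fn poll_ready<T>(mut fut: BoxFuture<'_, T>) -> T {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => value,
        Poll::Pending => panic!("future was expected to complete immediately"),
    }
}
```

Calling `poll_ready(table.write_stream(batches))` drives one write to completion. In real code an async runtime would do this, and the input would be a stream that is awaited batch by batch rather than a `Vec`.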