pantShrey commented on issue #23247:
URL: https://github.com/apache/datafusion/issues/23247#issuecomment-4842630354

   One constraint worth flagging: Arrow's `StreamWriter` requires a 
`std::io::Write` implementor, so the synchronous boundary isn't just in 
DataFusion's `SpillWriter` -- it's baked into the Arrow IPC writer itself. Any 
async adapter built on top of the current Arrow API would need to either buffer 
encoded batches before handing them to an async upload, or chunk synchronous 
writes into an async stream. Both options reintroduce a copy, which defeats 
part of the purpose.
   A cleaner fix would likely require Arrow itself to expose an async IPC 
writer (writing against `AsyncWrite` or similar), rather than DataFusion 
working around the sync constraint from the outside.
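
   To make the copy concrete, here is a minimal std-only sketch of the 
"chunk synchronous writes into a stream" option. Everything in it is 
hypothetical: `ChunkSink`, `write_batches`, and `demo` are illustrative names, 
`write_batches` is only a stand-in for a serializer that is generic over 
`std::io::Write` (it is not Arrow's IPC format or API), and an 
`std::sync::mpsc` channel stands in for the async upload stream.

```rust
use std::io::{self, Write};
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical adapter: a `Write` implementor that forwards every
/// synchronous write as an owned chunk. The channel is a stand-in for an
/// async upload stream.
pub struct ChunkSink {
    tx: Sender<Vec<u8>>,
}

impl Write for ChunkSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // The writer only lends us `buf`, so the bytes must be copied into
        // an owned chunk before they can be handed to the consumer.
        // This `to_vec` is the extra copy discussed above.
        self.tx
            .send(buf.to_vec())
            .map_err(|_| io::Error::new(io::ErrorKind::BrokenPipe, "receiver dropped"))?;
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Builds the sink together with the receiving end of the chunk stream.
pub fn chunk_sink() -> (ChunkSink, Receiver<Vec<u8>>) {
    let (tx, rx) = channel();
    (ChunkSink { tx }, rx)
}

/// Stand-in for a synchronous serializer: it only requires `W: Write` and
/// knows nothing about the consumer. The length-prefixed framing is purely
/// illustrative.
pub fn write_batches<W: Write>(mut out: W, batches: &[&[u8]]) -> io::Result<()> {
    for batch in batches {
        out.write_all(&(batch.len() as u32).to_le_bytes())?;
        out.write_all(batch)?;
    }
    out.flush()
}

/// Serializes the batches through the sink, then drains the chunk stream
/// the way an uploader would.
pub fn demo(batches: &[&[u8]]) -> io::Result<Vec<u8>> {
    let (sink, rx) = chunk_sink();
    // `sink` is moved into `write_batches` and dropped when it returns,
    // which closes the channel so the drain below terminates.
    write_batches(sink, batches)?;
    Ok(rx.iter().flatten().collect())
}
```

   The serializer side stays fully synchronous, and the only way to bridge 
it is to copy each borrowed buffer into an owned chunk. A writer built 
directly against an async sink could avoid that hand-off.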


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

