wjones127 commented on issue #5458:
URL: https://github.com/apache/arrow-rs/issues/5458#issuecomment-1979152825

   The design seems reasonable overall.
   
   In Lance, our write pattern at the moment looks like:
   
   ```
   write col 1
   ...
   write col N
   flush
   (maybe return control to caller)
   write col 1
   ...
   write col N
   flush
   ```
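
   As a rough Rust sketch of the pattern above: the `ColumnSink` trait, 
`RecordingSink`, and `write_batch` are hypothetical names made up for 
illustration, not the Lance or arrow-rs API. The sink only records calls so the 
write/flush ordering is visible.

```rust
/// Hypothetical writer interface, for illustration only (an assumption,
/// not the API proposed in this issue).
trait ColumnSink {
    fn write(&mut self, data: &[u8]);
    fn flush(&mut self);
}

/// Test double that records each call instead of doing IO.
#[derive(Default)]
struct RecordingSink {
    log: Vec<String>,
}

impl ColumnSink for RecordingSink {
    fn write(&mut self, data: &[u8]) {
        self.log.push(format!("write {} bytes", data.len()));
    }

    fn flush(&mut self) {
        self.log.push("flush".to_string());
    }
}

/// Write every column of one batch, then flush once.
fn write_batch<S: ColumnSink>(sink: &mut S, columns: &[Vec<u8>]) {
    for col in columns {
        sink.write(col);
    }
    sink.flush();
}
```

   Calling `write_batch` once per batch gives the col 1..N, flush, col 1..N, 
flush sequence, with a natural point to return control to the caller between 
batches.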
   
   Because I am calling `flush` often, I don't think I'd miss the backpressure 
from `write`. However, what I think I might miss is being able to initiate the 
requests during `write` calls. I wonder if it would make sense to have some 
sort of `poll_flush()` method? Obviously it has some of the stability concerns 
from #5366, but I think if given a warning it could be safe enough.
   
   Also, is there a maximum buffer size enforced? Does reaching that make 
`write()` fail, or is it up to the user to limit how much they are buffering? 
(Which I think they could do easily by tracking bytes.)
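
   A minimal sketch of what tracking bytes on the caller side could look like. 
`BufferedWriter` is a hypothetical in-memory stand-in for the proposed writer, 
and `LimitedWriter` and `max_buffered` are likewise assumptions for 
illustration, not part of any existing API.

```rust
/// Hypothetical stand-in for the proposed buffered writer (an assumption).
struct BufferedWriter {
    buffer: Vec<u8>,
}

impl BufferedWriter {
    /// Buffer the data without doing any IO.
    fn write(&mut self, data: &[u8]) {
        self.buffer.extend_from_slice(data);
    }

    /// Pretend to upload the buffer; returns the number of bytes drained.
    fn flush(&mut self) -> usize {
        let n = self.buffer.len();
        self.buffer.clear();
        n
    }
}

/// Caller-side wrapper that counts buffered bytes and flushes at a limit.
struct LimitedWriter {
    inner: BufferedWriter,
    buffered: usize,
    max_buffered: usize,
    flushes: usize,
}

impl LimitedWriter {
    fn new(max_buffered: usize) -> Self {
        LimitedWriter {
            inner: BufferedWriter { buffer: Vec::new() },
            buffered: 0,
            max_buffered,
            flushes: 0,
        }
    }

    /// Buffer the data, then flush if the running total reached the limit.
    fn write(&mut self, data: &[u8]) {
        self.inner.write(data);
        self.buffered += data.len();
        if self.buffered >= self.max_buffered {
            self.inner.flush();
            self.buffered = 0;
            self.flushes += 1;
        }
    }
}
```

   Here the limit is enforced by flushing rather than by failing `write()`; 
returning an error instead would be an equally plausible policy.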
   
   I'm thinking, if implemented, how I would use this API is to put the writer on 
a background tokio task. Then I could run the IO calls in the background and 
implement backpressure over some channel. This brings up the question: is it 
safe to call `write` while the future returned by `flush()` has begun but is 
incomplete? Ideally I would like to be able to enqueue more data before I've 
completely drained the currently queued buffer.
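
   A rough, std-only sketch of the background-writer idea. It uses a thread and 
`std::sync::mpsc::sync_channel` as a stand-in for a tokio task and a bounded 
async channel, and the `Msg` enum and `spawn_writer` function are hypothetical 
names, not an existing API. The "writer" is just an in-memory buffer.

```rust
use std::sync::mpsc::{sync_channel, SyncSender};
use std::thread::{self, JoinHandle};

/// Hypothetical messages sent to the background writer.
enum Msg {
    Write(Vec<u8>),
    Flush,
}

/// Spawn a worker that owns the writer state. The bounded channel provides
/// backpressure: `send` blocks once `capacity` messages are queued.
/// The join handle yields the total number of bytes flushed.
fn spawn_writer(capacity: usize) -> (SyncSender<Msg>, JoinHandle<usize>) {
    let (tx, rx) = sync_channel::<Msg>(capacity);
    let handle = thread::spawn(move || {
        let mut pending: Vec<u8> = Vec::new();
        let mut flushed = 0usize;
        // The loop ends when every sender has been dropped.
        for msg in rx {
            match msg {
                Msg::Write(data) => pending.extend_from_slice(&data),
                Msg::Flush => {
                    // Stand-in for the real IO performed by `flush()`.
                    flushed += pending.len();
                    pending.clear();
                }
            }
        }
        // Final flush of whatever is still buffered at shutdown.
        flushed += pending.len();
        flushed
    });
    (tx, handle)
}
```

   Note that this serializes `write` and `flush` on the worker: new data queues 
up in the channel while a flush is running, so it sidesteps rather than answers 
the question of whether `write` may overlap an in-flight `flush()`.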


