westonpace commented on PR #43632: URL: https://github.com/apache/arrow/pull/43632#issuecomment-2297608446
@lidavidm

> Doesn't this still mean the consumer can block the producer's thread on accident (by doing processing inside wake)?

Yes, implementors would be cautioned to implement `wake` as cheaply as possible. I think Rust's implementation is a bunch of lock-free stuff, but a short-held mutex to move things from one queue to another is probably fine too.

> Is the reason for a separate task to help optimize cache usage?

Yes.

> Would it be sufficient for that use case if we had a callback approach that produced a task instead of directly producing an array?

Off the top of my head, yes, that should be fine. I think you can separate thread transfer from push/pull in that way.

> I don't have much experience on the performance side, but in the development time/lines-of-code side, trying to make a producer that expects to push its output interact with a consumer that wants to pull is expensive (the reverse is also true). This gets more and more complicated the more times this mismatch is encountered in a pipeline.

I'm not entirely sure I agree. Converting from synchronous pull to asynchronous push is pretty straightforward (you need to introduce a polling thread). I also don't think maintenance of a new stable ABI is entirely justified by a usability concern, but I won't die on that hill.

That being said, I think there are two main factors in desiring an asynchronous interface over a synchronous interface:

1. Manage the number of threads
2. Control context switching

I would guess that 1 is more common than 2, and I think an ABI that provides just 1 is probably OK.
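To make the "introduce a polling thread" point concrete, here is a minimal Python sketch of adapting a synchronous pull source to a push-style consumer. The names `pull_to_push`, `on_next`, and `on_done` are hypothetical and are not part of the ABI proposed in this PR. The sketch assumes the consumer's callbacks are cheap (here, just a queue insert), in the same spirit as keeping `wake` cheap.

```python
import queue
import threading
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def pull_to_push(
    source: Iterable[T],
    on_next: Callable[[T], None],
    on_done: Callable[[Optional[Exception]], None],
) -> threading.Thread:
    """Drain a blocking, pull-style source on a dedicated polling thread
    and push each item to the consumer's callbacks (hypothetical helper)."""

    def poll() -> None:
        try:
            for item in source:  # the blocking pull happens only on this thread
                on_next(item)    # the consumer is expected to keep this cheap
        except Exception as exc:
            on_done(exc)         # report failure exactly once
            return
        on_done(None)            # report normal end of stream

    thread = threading.Thread(target=poll, name="pull-to-push", daemon=True)
    thread.start()
    return thread


# Usage: the consumer's callback only moves items onto its own queue,
# so the polling thread is never blocked by downstream processing.
inbox: "queue.Queue[int]" = queue.Queue()
finished = threading.Event()
results = []


def finish(exc: Optional[Exception]) -> None:
    results.append(exc)
    finished.set()


pull_to_push(iter(range(3)), inbox.put, finish)
finished.wait(timeout=5)
received = [inbox.get_nowait() for _ in range(inbox.qsize())]
```

The cost of this adapter is one extra thread per pull source, which is exactly the thread-count concern listed as factor 1, so it shows the conversion is simple rather than that it is free.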
