CurtHagenlocher commented on PR #3657: URL: https://github.com/apache/arrow-adbc/pull/3657#issuecomment-3481480340
The main concern with a purely streaming approach is being able to handle retries. That is, let's say that we read half of a response and then the connection is reset for some reason. Will the end-to-end system reissue the command to re-fetch the data and stream it again or are we forced to return an error to the user? Buffering a response in memory ensures that we know the entire response was read. If a single cloud fetch is always just a single Arrow record batch, then addressing this concern is relatively straightforward. But it gets more complicated if a single fetched stream has multiple record batches and one or more have already been returned to the caller when the connection goes awry.
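
To make the trade-off concrete, here is a minimal sketch of the two strategies. It is not the driver's code: `buffered_fetch`, `streaming_fetch`, and the `open_stream` callable are hypothetical names, and plain Python objects stand in for Arrow record batches. The streaming variant also assumes that re-issuing the fetch returns the same batches in the same order, which is exactly the property that would need to be confirmed for cloud fetch.

```python
from typing import Callable, Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")  # stands in for an Arrow record batch


def buffered_fetch(open_stream: Callable[[], Iterable[T]],
                   max_attempts: int = 3) -> List[T]:
    """Read the whole response into memory before returning anything.

    Nothing reaches the caller until the read completes, so a dropped
    connection can be retried from scratch without the caller noticing.
    """
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return list(open_stream())
        except ConnectionError as exc:
            last_error = exc  # partial data is discarded; safe to retry
    raise RuntimeError("fetch failed after retries") from last_error


def streaming_fetch(open_stream: Callable[[], Iterable[T]],
                    max_attempts: int = 3) -> Iterator[T]:
    """Yield batches as they arrive, resuming after a dropped connection.

    ASSUMPTION: a re-issued fetch yields identical batches in identical
    order, so batches already handed to the caller can be skipped by count.
    """
    delivered = 0  # number of batches the caller has already received
    failures = 0
    while True:
        try:
            for index, batch in enumerate(open_stream()):
                if index < delivered:
                    continue  # already returned before the connection reset
                delivered += 1
                yield batch
            return
        except ConnectionError as exc:
            failures += 1
            if failures >= max_attempts:
                raise RuntimeError("fetch failed after retries") from exc
```

With a single batch per fetch, `delivered` is only ever 0 or 1 and the skip logic is trivial. With multiple batches, the streaming version is only correct if the re-fetched stream is byte-for-byte repeatable; otherwise the only safe options are buffering or surfacing the error to the user.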
