kmjung commented on pull request #15185: URL: https://github.com/apache/beam/pull/15185#issuecomment-893699939
> Out of curiosity, what's the average throughput of fetching data with the BQ Storage API for the us-central region? What would you consider reasonable benchmark numbers?

Single-stream throughput for the Storage API depends heavily on your schema width and the data format you're using, as well as some other factors. With a ~50-column schema, I would expect the API to be capable of sending 40-50 MiB/s (~30k rows/second) per stream. For Java-based pipelines, the limiting factor is usually gRPC flow control (pipelines usually can't process data as fast as the API streams it), and I would expect the same to be the case here.
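To make those figures easier to compare against a benchmark, here is a rough back-of-envelope sketch in Python. It only does arithmetic on the numbers quoted above; the helper names, the 100 GiB table size, and the 10-stream count are hypothetical values chosen for illustration, not anything from the PR. Since the pipeline is usually the bottleneck rather than the API, the read time it gives is a best case, not a prediction.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def implied_row_size_bytes(throughput_mib_per_s: float, rows_per_s: float) -> float:
    """Average row size implied by a byte throughput and a row throughput."""
    return throughput_mib_per_s * MIB / rows_per_s


def best_case_read_seconds(total_bytes: float, streams: int,
                           per_stream_mib_per_s: float) -> float:
    """Lower bound on read time if every stream sustains the given rate.

    Assumes perfectly parallel streams and a consumer that keeps up,
    which the comment above says is usually not the case.
    """
    return total_bytes / (streams * per_stream_mib_per_s * MIB)


# Figures quoted in the comment: 40-50 MiB/s at ~30k rows/s per stream.
low = implied_row_size_bytes(40, 30_000)
high = implied_row_size_bytes(50, 30_000)
print(f"implied row size: {low:.0f}-{high:.0f} bytes")

# Hypothetical workload: 100 GiB table read over 10 streams at 40 MiB/s each.
seconds = best_case_read_seconds(100 * GIB, streams=10, per_stream_mib_per_s=40)
print(f"best-case read time: {seconds:.0f} s")
```

The quoted numbers imply rows of roughly 1.4 to 1.7 KiB each, so a benchmark on a much narrower or wider schema should not be expected to land on the same rows/second figure.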
