[GitHub] [beam] kmjung commented on pull request #15185: [BEAM-10917] Add support for BigQuery Read API in Python BEAM

GitBox Thu, 05 Aug 2021 11:52:16 -0700


kmjung commented on pull request #15185:
URL: https://github.com/apache/beam/pull/15185#issuecomment-893699939



   > Out of curiosity, what's the average throughput of fetching data with the BQ 
Storage API in the us-central region? What benchmark numbers would you 
consider reasonable to expect?
   
   The single-stream throughput for the storage API depends heavily on your 
schema width and the data format you're using, as well as some other factors, 
but with a ~50 column schema I would expect that the API should be capable of 
sending 40-50 MiB/s (~30k rows/second) per stream. For Java-based pipelines, 
the limiting factor is usually gRPC flow control -- pipelines typically can't 
process data as fast as the API streams it -- and I would expect the same to be 
the case here.
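
   As a rough sanity check on those figures, here is a minimal sketch of the 
arithmetic. The function names are hypothetical, and the read-time estimate 
assumes that throughput scales linearly with the number of streams, which is my 
simplification and not something the API guarantees. Since the consumer is 
usually the bottleneck, treat the result as a lower bound on read time rather 
than a prediction.

```python
MIB = 1024 * 1024  # bytes per MiB


def implied_row_size_bytes(throughput_mib_s: float, rows_per_s: float) -> float:
    """Average serialized row size implied by a byte rate and a row rate."""
    return throughput_mib_s * MIB / rows_per_s


def min_read_seconds(table_bytes: float, streams: int,
                     per_stream_mib_s: float) -> float:
    """Best-case time to read a table.

    Assumption (not from the API docs): every stream sustains the same
    rate and the consumer keeps up, so throughput scales linearly.
    """
    return table_bytes / (streams * per_stream_mib_s * MIB)


if __name__ == "__main__":
    # 40-50 MiB/s at ~30k rows/s implies roughly 1.4-1.7 KiB per row.
    for rate in (40, 50):
        print(rate, "MiB/s ->", round(implied_row_size_bytes(rate, 30_000)), "bytes/row")

    # Hypothetical 100 GiB table read over 10 streams at 50 MiB/s each.
    print(min_read_seconds(100 * 1024 * MIB, streams=10, per_stream_mib_s=50), "s")
```

   If a measured pipeline is far slower than this bound, that is consistent 
with the consumer side, and not the API, being the limiting factor.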


