boyuanzz commented on pull request #14538:
URL: https://github.com/apache/beam/pull/14538#issuecomment-820650099


   > > What's the purpose to have BigTableRead run against a larger data set as 
well?
   > 
   > It is more suitable as a benchmark for performance tracking purposes. The 
current 1K-row read finishes in a few seconds.
   
   I think it would be nice to separate E2E integration tests from 
benchmark/load tests. We usually want to run an E2E integration test against a 
small data set so it finishes as quickly as possible. Exposing the read size as 
a pipeline option would let us invoke the same test for different purposes. For 
example, most of the performance jobs 
(https://github.com/apache/beam/tree/master/.test-infra/jenkins#performance-jobs) 
are configurable for input size.
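   To illustrate the idea, here is a minimal sketch of the configurable-size 
pattern using plain argparse (Beam's PipelineOptions are parsed in an 
argparse-like style). The option name `--read-size` and the default of 1000 
rows are hypothetical, not taken from this PR:

```python
# Sketch only: a hypothetical --read-size option so the same test can run
# as a fast E2E check (small default) or a benchmark (large override).
import argparse


def parse_read_options(argv=None):
    """Parse the read size, defaulting to a small, fast E2E-sized input."""
    parser = argparse.ArgumentParser()
    # Small default keeps the integration test quick (~1K rows); a benchmark
    # job would override it, e.g. --read-size=10000000.
    parser.add_argument(
        "--read-size",
        type=int,
        default=1000,
        help="Number of rows the test pipeline reads.",
    )
    # parse_known_args ignores unrelated pipeline flags, as Beam options do.
    args, _ = parser.parse_known_args(argv)
    return args


opts = parse_read_options(["--read-size", "1000000"])
print(opts.read_size)  # → 1000000
```

   The Jenkins performance jobs linked above pass size flags in the same 
spirit, so one test definition serves both the quick E2E run and the load run.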
   

