halaharr opened a new issue, #28765:
URL: https://github.com/apache/beam/issues/28765

     We are seeing Dataflow pipelines take 2x to 3x longer to run with
Apache Beam SDK 2.50 than with Apache Beam SDK 2.44. While troubleshooting,
we compared the DAGs for 2.44 and 2.50. The BQ read-from-table step (full
table scan using DIRECT_TABLE_ACCESS) takes 3 sec to read 19 records / 13 KB
in 2.44, while the exact same pipeline, reading the same 19 records / 13 KB,
takes 1 min 5 sec in 2.50. Has this API regressed in 2.50? The throughput for
this DAG step is also much higher in 2.44 than in 2.50. The throughput graphs
(elements/sec) for both versions are below.
   
   Throughput in ver 2.44 --> 0.15 elements/sec (high)
   
   Throughput in ver 2.50 --> 0.083 elements/sec (low)
   
   <img width="598" alt="apache_beam_250" 
src="https://github.com/apache/beam/assets/16997826/27292c20-c2c1-4bd6-b3be-d3a74e82c638">
   <img width="611" alt="apache_beam_244" 
src="https://github.com/apache/beam/assets/16997826/3a3481a8-3de1-4916-82e6-5b9cd4ff981f">
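
For reference, a rough sketch of the arithmetic behind the step timings quoted above, using only the figures in this report (19 records, 3 sec in 2.44, 1 min 5 sec in 2.50). The variable names are illustrative. These are simple averages over the step's wall time, so they need not match the values shown on the Dataflow throughput graphs, which may be sampled differently.

```python
# Figures quoted in this report for the BigQuery read step.
RECORDS = 19
SECONDS_2_44 = 3    # Beam SDK 2.44
SECONDS_2_50 = 65   # Beam SDK 2.50 (1 min 5 sec)


def elements_per_sec(records, seconds):
    """Average rate over the step's wall time."""
    return records / seconds


rate_2_44 = elements_per_sec(RECORDS, SECONDS_2_44)
rate_2_50 = elements_per_sec(RECORDS, SECONDS_2_50)
slowdown = SECONDS_2_50 / SECONDS_2_44

print(f"2.44: {rate_2_44:.2f} elements/sec")   # 6.33
print(f"2.50: {rate_2_50:.2f} elements/sec")   # 0.29
print(f"step slowdown: {slowdown:.1f}x")       # 21.7x
```

On these numbers the read step alone is roughly 20x slower in 2.50, which is a much larger gap than the 2x to 3x seen for the whole pipeline.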
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
