ajamato commented on a change in pull request #12070:
URL: https://github.com/apache/beam/pull/12070#discussion_r450433163



##########
File path: sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryStorageStreamSource.java
##########
@@ -219,7 +232,15 @@ private synchronized boolean readNextRecord() throws IOException {
         }
 
         fractionConsumedFromPreviousResponse = fractionConsumedFromCurrentResponse;
-        ReadRowsResponse currentResponse = responseIterator.next();
+        ReadRowsResponse currentResponse;
+        Stopwatch stopwatch = Stopwatch.createStarted();

Review comment:
       +1 to adding an option to allow disabling this.
   
   In the past we have avoided tools that read the system clock frequently, 
because that system call can be quite slow and hurt performance, especially 
when it is made often. In this case, however, it looks like the system clock 
is only read infrequently, i.e. whenever an RPC is made (doing it for every 
element would be an issue), in which case it is fine to do so.
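
To illustrate the distinction, here is a minimal sketch, not the code in this PR: it reads the clock twice per batched response and never per row. The class `PerResponseTiming`, its fields, and the use of `System.nanoTime()` in place of a `Stopwatch` are assumptions for illustration only.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch: time each batched response fetch, never each row. */
public class PerResponseTiming {
  public long clockReads = 0;      // how many times the clock was consulted
  public long totalFetchNanos = 0; // accumulated time spent inside fetches
  public long rowsSeen = 0;        // rows processed without touching the clock

  private long now() {
    clockReads++;
    return System.nanoTime();
  }

  /** Each list stands in for one RPC response that carries many rows. */
  public void consume(Iterator<List<String>> responses) {
    while (responses.hasNext()) {
      long start = now();                   // one clock read before the fetch
      List<String> rows = responses.next(); // stand-in for the blocking RPC read
      totalFetchNanos += now() - start;     // one clock read after the fetch
      for (String row : rows) {
        rowsSeen++;                         // per-row work: no clock access here
      }
    }
  }
}
```

With this shape the number of clock reads grows with the number of responses, not with the number of rows.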
   
   Instead, we have used the state sampler classes for this. They use a 
separate thread to do the counting, which periodically wakes up and checks 
what "state" the other threads are executing under.
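
The sampling idea described above can be sketched roughly as follows. This is a simplified illustration, not Beam's actual state sampler classes: `SimpleStateSampler` and all of its methods are hypothetical names, and it assumes a single worker thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical, simplified state sampler; assumes one worker thread. */
public class SimpleStateSampler {
  private final long periodMillis;
  private final Map<String, AtomicLong> millisByState = new ConcurrentHashMap<>();
  // Written by the worker thread, read by the sampler thread.
  private volatile String currentState = "idle";

  public SimpleStateSampler(long periodMillis) {
    this.periodMillis = periodMillis;
  }

  /** Worker enters a block: a volatile write, no clock read. Returns the previous state. */
  public String enterState(String state) {
    String previous = currentState;
    currentState = state;
    return previous;
  }

  /** One sampler tick: charge a full period to whichever state is active right now. */
  public void sampleOnce() {
    millisByState.computeIfAbsent(currentState, s -> new AtomicLong()).addAndGet(periodMillis);
  }

  /** Starts a daemon thread that wakes up every period and takes a sample. */
  public ScheduledExecutorService start() {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(
            r -> {
              Thread t = new Thread(r, "state-sampler");
              t.setDaemon(true);
              return t;
            });
    executor.scheduleAtFixedRate(
        this::sampleOnce, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    return executor;
  }

  /** Approximate time attributed to a state so far, in milliseconds. */
  public long millisIn(String state) {
    AtomicLong total = millisByState.get(state);
    return total == null ? 0 : total.get();
  }
}
```

The worker only pays for a volatile write when it changes state. The attributed times are approximate, with a resolution of one sampling period.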
   
   However, there is no user metric or SDK-level API to time blocks of code 
using the state sampler.
   
   We only used it for the execution time metrics, to calculate the time spent 
in the process, startBundle and finishBundle methods of each ParDo. Here it is 
for reference:
   https://github.com/apache/beam/pull/7676





