[jira] [Work logged] (BEAM-10791) Identify and log additional information needed to debug streaming insert requests for Python SDK

ASF GitHub Bot (Jira) Thu, 03 Sep 2020 15:35:24 -0700


     [ 
https://issues.apache.org/jira/browse/BEAM-10791?focusedWorklogId=478835&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-478835
 ]


ASF GitHub Bot logged work on BEAM-10791:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Sep/20 22:34
            Start Date: 03/Sep/20 22:34
    Worklog Time Spent: 10m 
      Work Description: ihji commented on a change in pull request #12754:
URL: https://github.com/apache/beam/pull/12754#discussion_r483288994



##########
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##########
@@ -1198,8 +1209,34 @@ def process(self, element, *schema_side_inputs):
       return self._flush_all_batches()
 
   def finish_bundle(self):
+    current_millis = int(time.time() * 1000)

Review comment:
       Each of the three calls has its own role. The first records the request 
time. The second measures the response time, so that we can subtract the start 
time from the end time. The last checks the log reporting interval.
   
   We might slightly reduce the overhead by moving the last `time.time()` call 
inside the synchronized block. In that case, a thread that skips the 
synchronized section won't execute `time.time()` (though the saving is probably 
small).
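
   A minimal sketch of the idea above, not the code from the pull request. 
It assumes the synchronized section is entered through a non-blocking lock 
acquire, so a thread that fails to get the lock skips the section. The class 
`LatencyLogger`, its methods, and the injectable `clock` parameter are 
hypothetical names used only for illustration.

```python
import threading
import time


class LatencyLogger:
    """Hypothetical helper; illustrates the three clock reads only."""

    def __init__(self, log_interval_secs=60.0, clock=time.time):
        self._clock = clock  # injectable so the sketch can be tested
        self._data_lock = threading.Lock()    # guards the latency list
        self._report_lock = threading.Lock()  # the "synchronized block"
        self._interval_millis = int(log_interval_secs * 1000)
        self._last_log_millis = int(clock() * 1000)
        self._latencies_millis = []
        self.reports = []  # stands in for real log output

    def timed_call(self, request_fn):
        # Call 1: remember when the request was sent.
        start_millis = int(self._clock() * 1000)
        result = request_fn()
        # Call 2: read the clock again so latency = end - start.
        end_millis = int(self._clock() * 1000)
        with self._data_lock:
            self._latencies_millis.append(end_millis - start_millis)
        return result

    def maybe_log(self):
        # Assumption: a thread that cannot take the lock skips the section,
        # and therefore never reaches the third clock read below.
        if not self._report_lock.acquire(blocking=False):
            return False
        try:
            # Call 3: only executed inside the synchronized section.
            now_millis = int(self._clock() * 1000)
            if now_millis - self._last_log_millis < self._interval_millis:
                return False
            with self._data_lock:
                snapshot = list(self._latencies_millis)
                self._latencies_millis.clear()
            self.reports.append(snapshot)
            self._last_log_millis = now_millis
            return True
        finally:
            self._report_lock.release()
```

   With this layout, only the thread that enters the reporting section pays 
for the third clock read. As noted above, the saving is probably small.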




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 478835)
    Time Spent: 1.5h  (was: 1h 20m)

> Identify and log additional information needed to debug streaming insert 
> requests for Python SDK
> ------------------------------------------------------------------------------------------------
>
>                 Key: BEAM-10791
>                 URL: https://issues.apache.org/jira/browse/BEAM-10791
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-py-gcp
>            Reporter: Heejong Lee
>            Assignee: Heejong Lee
>            Priority: P2
>          Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> implement logging for per worker statistics:
> - Request count - for that window.
> - Error codes + number of occurrences for that window (Or perhaps just log 
> each error with as much detail as possible.)
> - Latency percentiles of requests (50th, 90th, and 99th)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
