ajamato commented on a change in pull request #12754:
URL: https://github.com/apache/beam/pull/12754#discussion_r483126531
##########
File path: sdks/python/apache_beam/io/gcp/bigquery.py
##########
@@ -1198,8 +1209,34 @@ def process(self, element, *schema_side_inputs):
     return self._flush_all_batches()
 
   def finish_bundle(self):
+    current_millis = int(time.time() * 1000)
+    if BigQueryWriteFn.LATENCY_LOGGING_LOCK.acquire(False):
+      try:
+        if (BigQueryWriteFn.LATENCY_LOGGING_HISTOGRAM.total_count() > 0 and
+            (current_millis -
+             BigQueryWriteFn.LATENCY_LOGGING_LAST_REPORTED_MILLIS) >
+            self._latency_logging_frequency * 1000):
+          self._log_percentiles()
+          BigQueryWriteFn.LATENCY_LOGGING_HISTOGRAM.clear()
Review comment:
Is there a reason why you clear? I think we need a good sample of
elements over a period of time to get a decent latency breakdown, and
this seems like it will be cleared quite frequently.
In the metric case we will just pass it all to Stackdriver, which can
essentially compute these latencies over a sliding window, e.g. at each
minute, show the 50th, 95th, and 99th percentile latencies for the last
10-minute window.
In the logging case, we would like something similar. I'm not saying you
need to implement such a complex calculation to move things in and out
of the window.
Though maybe this is perfectly fine; over 180 seconds we can probably
record a decent number of requests.
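For illustration, here is a rough sketch of the rotating-bucket idea I
mean (hypothetical code, not the Histogram API in this PR): keep one
small bucket of latencies per minute, evict the oldest buckets, and
report percentiles over whatever the window currently holds instead of
clearing everything.
```python
import collections
import time


class RotatingWindowLatencies(object):
  """Hypothetical sketch: one bucket of latencies per minute; percentiles
  are reported over the last `window_minutes` buckets, so old data ages
  out gradually instead of being cleared all at once."""

  def __init__(self, window_minutes=10):
    self._window_minutes = window_minutes
    # Maps a minute index to the latencies recorded in that minute. A real
    # implementation would keep fixed histogram buckets, not raw values.
    self._buckets = collections.OrderedDict()

  def record(self, latency_millis):
    minute = int(time.time() // 60)
    self._buckets.setdefault(minute, []).append(latency_millis)
    # Evict buckets that have fallen out of the window.
    while len(self._buckets) > self._window_minutes:
      self._buckets.popitem(last=False)

  def percentiles(self, ps=(50, 95, 99)):
    values = sorted(v for bucket in self._buckets.values() for v in bucket)
    if not values:
      return {}
    # Approximate nearest-rank percentile over everything in the window.
    return {p: values[min(len(values) - 1, len(values) * p // 100)]
            for p in ps}
```
Again, not asking for this level of machinery here; just noting that
clearing discards the sample that a windowed report would keep.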
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]