[ 
https://issues.apache.org/jira/browse/BEAM-6627?focusedWorklogId=198048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-198048
 ]

ASF GitHub Bot logged work on BEAM-6627:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Feb/19 11:59
            Start Date: 13/Feb/19 11:59
    Worklog Time Spent: 10m 
      Work Description: lgajowy commented on pull request #7772: [BEAM-6627] 
Added Metrics API processing time reporting to TextIOIT
URL: https://github.com/apache/beam/pull/7772#discussion_r256355745
 
 

 ##########
 File path: 
sdks/java/io/file-based-io-tests/src/test/java/org/apache/beam/sdk/io/text/TextIOIT.java
 ##########
 @@ -127,28 +140,49 @@ public void writeThenReadAll() {
 
     PipelineResult result = pipeline.run();
     result.waitUntilFinish();
-    publishGcsResults(result);
+    gatherAndPublishMetrics(result);
   }
 
-  private void publishGcsResults(PipelineResult result) {
+  private void gatherAndPublishMetrics(PipelineResult result) {
+    String uuid = UUID.randomUUID().toString();
+    Timestamp timestamp = Timestamp.now();
+    List<NamedTestResult> namedTestResults = readMetrics(result, uuid, 
timestamp);
+    publishToBigQuery(namedTestResults, bigQueryDataset, bigQueryTable);
+    ConsoleResultPublisher.publish(namedTestResults, uuid, 
timestamp.toString());
+  }
+
+  private List<NamedTestResult> readMetrics(
+      PipelineResult result, String uuid, Timestamp timestamp) {
+    List<NamedTestResult> results = new ArrayList<>();
+
+    MetricsReader reader = new MetricsReader(result, FILEIOIT_NAMESPACE);
+    long writeStartTime = reader.getStartTimeMetric("startTime");
+    long writeEndTime = reader.getEndTimeMetric("middleTime");
+    long readStartTime = reader.getStartTimeMetric("middleTime");
+    long readEndTime = reader.getEndTimeMetric("endTime");
+    double writeTime = (writeEndTime - writeStartTime) / 1000.0;
+    double readTime = (readEndTime - readStartTime) / 1000.0;
+    double copiesPerSec = calculateGcsMetric(result);
+
+    if (copiesPerSec > 0) {
+      results.add(
+          NamedTestResult.create(uuid, timestamp.toString(), "copies_per_sec", 
copiesPerSec));
+    }
+
+    results.add(NamedTestResult.create(uuid, timestamp.toString(), 
"read_time", readTime));
+    results.add(NamedTestResult.create(uuid, timestamp.toString(), 
"write_time", writeTime));
+
+    return results;
+  }
+
+  private double calculateGcsMetric(PipelineResult result) {
 
 Review comment:
   More importantly, this should be collected only when 
`options.getGcsPerformanceMetrics()` is `true`. I think this was missing 
before. Could you add that condition?
   
   @udim am I right here or am I missing something?
   
   Tl;dr version for Udi: we want to reuse the mechanism for publishing the 
bigQuery metrics for collecting time in IOITs too. Previously we checked if 
BigQuery related options are present to publish the Gcs metric. Now that there 
are more metrics collected this way, such "if" condition would be too broad. 
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 198048)
    Time Spent: 2h 20m  (was: 2h 10m)

> Use Metrics API in IO performance tests
> ---------------------------------------
>
>                 Key: BEAM-6627
>                 URL: https://issues.apache.org/jira/browse/BEAM-6627
>             Project: Beam
>          Issue Type: Improvement
>          Components: testing
>            Reporter: Michal Walenia
>            Assignee: Michal Walenia
>            Priority: Minor
>          Time Spent: 2h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to