[ 
https://issues.apache.org/jira/browse/BEAM-5355?focusedWorklogId=167007&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-167007
 ]

ASF GitHub Bot logged work on BEAM-5355:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Nov/18 18:51
            Start Date: 16/Nov/18 18:51
    Worklog Time Spent: 10m 
      Work Description: ajamato commented on a change in pull request #6987: 
[BEAM-5355] Prevent creating metrics of the same name multiple times
URL: https://github.com/apache/beam/pull/6987#discussion_r234311498
 
 

 ##########
 File path: 
sdks/java/testing/load-tests/src/main/java/org/apache/beam/sdk/loadtests/GroupByKeyLoadTest.java
 ##########
 @@ -83,15 +83,14 @@ private GroupByKeyLoadTest(String[] args) throws IOException {
   void loadTest() throws IOException {
     Optional<SyntheticStep> syntheticStep = createStep(options.getStepOptions());
 
-    PCollection<KV<byte[], byte[]>> input =
-        pipeline.apply(SyntheticBoundedIO.readFrom(sourceOptions));
+    PCollection<KV<byte[], byte[]>> input = pipeline
+        .apply(SyntheticBoundedIO.readFrom(sourceOptions))
+        .apply(ParDo.of(new MetricsMonitor(METRICS_NAMESPACE)));
 
 Review comment:
   It seems this metric is being created as a UserMetric. Since that is the 
case, I see no issue with this change, or with using it in whatever load tests 
we want.
   
   We don't have any state sampling in the Beam Java SDK yet, but the Python 
one does. I am planning to add state sampling support soon to properly time 
step execution. It provides an estimate of the time spent running in the 
various steps of the pipeline without incurring the cost of expensive system 
calls (as in your use of System.currentTimeMillis).
   
   This would provide estimates of the time (not perfectly accurate) without 
adding additional slowdown to the pipeline (which your System.currentTimeMillis 
calls are adding).
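
   To make the idea concrete, here is a minimal sketch of state sampling in 
plain Java. It is not Beam's implementation or API: the class 
StateSamplerSketch, its methods, and the integer state ids are all 
hypothetical, and it assumes a single pipeline thread calls enter(). The 
pipeline thread only records which step it is in; a background thread reads 
the clock once per period and attributes the elapsed interval to the current 
step.

```java
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical sketch of state sampling; names are illustrative, not Beam APIs. */
final class StateSamplerSketch {
    private final AtomicLongArray nanosPerState;
    private final long periodMillis;
    private volatile int currentState = 0; // state 0 means "idle" (assumption)
    private volatile boolean running = false;
    private Thread samplerThread;

    StateSamplerSketch(int numStates, long periodMillis) {
        this.nanosPerState = new AtomicLongArray(numStates);
        this.periodMillis = periodMillis;
    }

    /** Hot path: one volatile read and write, no clock call. Returns the previous state. */
    int enter(int state) {
        int previous = currentState;
        currentState = state;
        return previous;
    }

    /** Attributes one elapsed interval to whichever state is current right now. */
    void sample(long elapsedNanos) {
        nanosPerState.addAndGet(currentState, elapsedNanos);
    }

    long sampledNanos(int state) {
        return nanosPerState.get(state);
    }

    /** Starts a daemon thread that reads the clock once per sampling period. */
    void start() {
        running = true;
        samplerThread = new Thread(() -> {
            long last = System.nanoTime();
            while (running) {
                try {
                    Thread.sleep(periodMillis);
                } catch (InterruptedException e) {
                    return; // stop() interrupts the sleep to end sampling
                }
                long now = System.nanoTime();
                sample(now - last);
                last = now;
            }
        });
        samplerThread.setDaemon(true);
        samplerThread.start();
    }

    void stop() {
        running = false;
        if (samplerThread != null) {
            samplerThread.interrupt();
        }
    }
}
```

   In this sketch a step would call enter(stepId) when it begins and restore 
the returned previous state when it ends. Time is only attributed at sampling 
ticks, so each reading is an estimate, and a step shorter than the sampling 
period may not be observed at all.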



Issue Time Tracking
-------------------

    Worklog Id:     (was: 167007)
    Time Spent: 3h 50m  (was: 3h 40m)

> Create GroupByKey load test for Java SDK
> ----------------------------------------
>
>                 Key: BEAM-5355
>                 URL: https://issues.apache.org/jira/browse/BEAM-5355
>             Project: Beam
>          Issue Type: Sub-task
>          Components: testing
>            Reporter: Lukasz Gajowy
>            Assignee: Lukasz Gajowy
>            Priority: Minor
>             Fix For: Not applicable
>
>          Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> This is more thoroughly described in this proposal: 
> [https://docs.google.com/document/d/1PuIQv4v06eosKKwT76u7S6IP88AnXhTf870Rcj1AHt4/edit?usp=sharing]
>  
> In short: this ticket is about implementing the GroupByKeyLoadIT that uses 
> SyntheticStep and Synthetic source to create load on the pipeline. 



