[ https://issues.apache.org/jira/browse/BEAM-10670?focusedWorklogId=496426&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-496426 ]
ASF GitHub Bot logged work on BEAM-10670:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 07/Oct/20 09:52
Start Date: 07/Oct/20 09:52
Worklog Time Spent: 10m
Work Description: iemejia commented on pull request #13021:
URL: https://github.com/apache/beam/pull/13021#issuecomment-704825742
I am comparing the results of current master vs. this PR in batch mode, and
the improvements are so big that I am confused about how they can be so
different. Is it partitioning less, or skipping some operations because it does
not need to estimate watermarks, or something? The difference is really that
good. Amazing!
**Current master:**
```
Performance:
Conf  Runtime(sec)  (Baseline)  Events(/sec)  (Baseline)  Results  (Baseline)
0000           2.1                   47303.7               100000
0001           0.6                  169779.3                92000
0002           0.3                  293255.1                  351
0003           3.1                   32299.7                  580
0004           1.0                   10427.5                   40
0005           1.5                   67340.1                   12
0006           1.1                    9487.7                  103
0007           1.7                   59101.7                    1
0008           1.3                   77279.8                 6000
0009           0.5                   19084.0                  298
0010           0.7                  153139.4                    1
0011           1.8                   54112.6                 1919
0012           0.9                  112359.6                 1919
0013           0.3                  304878.0                92000
0014           0.9                  113507.4                92000
==========================================================================================
```
**This PR:**
```
Performance:
Conf  Runtime(sec)  (Baseline)  Events(/sec)  (Baseline)  Results  (Baseline)
0000           1.1                   90090.1               100000
0001           0.3                  337837.8                92000
0002           0.1                  694444.4                  351
0003           1.4                   71582.0                  580
0004           1.0                   10111.2                   40
0005           0.6                  177935.9                   12
0006           0.3                   40000.0                  103
0007           0.4                  227272.7                    1
0008           0.3                  314465.4                 6000
0009           0.2                   49019.6                  298
0010           0.6                  165016.5                    1
0011           0.5                  187969.9                 1919
0012           0.2                  492610.8                 1919
0013           0.3                  392156.9                92000
0014           0.9                  113765.6                92000
==========================================================================================
```
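To put a number on the difference, the per-configuration throughput ratio can be computed from the two tables. The following is only a rough sketch: the Events(/sec) values are copied from the tables above, and the names `master`, `pr`, and `speedups` are made up for illustration and are not part of Beam or Nexmark.

```python
# Events(/sec) per Nexmark configuration, copied from the two tables above.
master = {
    "0000": 47303.7, "0001": 169779.3, "0002": 293255.1, "0003": 32299.7,
    "0004": 10427.5, "0005": 67340.1, "0006": 9487.7, "0007": 59101.7,
    "0008": 77279.8, "0009": 19084.0, "0010": 153139.4, "0011": 54112.6,
    "0012": 112359.6, "0013": 304878.0, "0014": 113507.4,
}
pr = {
    "0000": 90090.1, "0001": 337837.8, "0002": 694444.4, "0003": 71582.0,
    "0004": 10111.2, "0005": 177935.9, "0006": 40000.0, "0007": 227272.7,
    "0008": 314465.4, "0009": 49019.6, "0010": 165016.5, "0011": 187969.9,
    "0012": 492610.8, "0013": 392156.9, "0014": 113765.6,
}


def speedups(before, after):
    """Return the after/before throughput ratio for each configuration."""
    return {conf: after[conf] / before[conf] for conf in before}


ratios = speedups(master, pr)
for conf in sorted(ratios):
    # A ratio above 1.0 means the PR processed more events per second.
    print(f"{conf}: {ratios[conf]:.2f}x")
```

A ratio below 1.0 (as for configuration 0004, where the PR's throughput is slightly lower) is probably within run-to-run noise for a single sub-second run, so these figures are best read as indicative rather than precise.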
Issue Time Tracking
-------------------
Worklog Id: (was: 496426)
Time Spent: 32.5h (was: 32h 20m)
> Make non-portable Splittable DoFn the only option when executing Java "Read"
> transforms
> ---------------------------------------------------------------------------------------
>
> Key: BEAM-10670
> URL: https://issues.apache.org/jira/browse/BEAM-10670
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-core
> Reporter: Luke Cwik
> Assignee: Luke Cwik
> Priority: P2
> Time Spent: 32.5h
> Remaining Estimate: 0h
>
> All runners seem to be capable of migrating to splittable DoFn for
> non-portable execution except for Dataflow runner v1 which will internalize
> the current primitive read implementation that is shared across runner
> implementations.