[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=366570=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-366570 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 06/Jan/20 11:32 Start Date: 06/Jan/20 11:32 Worklog Time Spent: 10m Work Description: mxm commented on issue #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#issuecomment-571106600 Thanks! LGTM. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 366570) Time Spent: 3.5h (was: 3h 20m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 3.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=365510=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365510 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 02/Jan/20 22:10 Start Date: 02/Jan/20 22:10 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365510) Time Spent: 3h 20m (was: 3h 10m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 3h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=365509=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365509 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 02/Jan/20 22:07 Start Date: 02/Jan/20 22:07 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#discussion_r362648195 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/SdkHarnessClient.java ## @@ -476,6 +511,38 @@ public void abort() {} } } + /** + * A {@link CloseableFnDataReceiver} which counts the number of elements that have been accepted. + */ + private static class CountingFnDataReceiver implements CloseableFnDataReceiver { +private final CloseableFnDataReceiver delegate; +private long count; + +private CountingFnDataReceiver(CloseableFnDataReceiver delegate) { + this.delegate = delegate; +} + +public long getCount() { + return count; +} + +@Override +public void accept(T input) throws Exception { + count += 1; Review comment: Thanks! Just wanna make sure we don't do any manual buffer in between. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365509) Time Spent: 3h 10m (was: 3h) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 3h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=365504=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365504 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 02/Jan/20 21:54 Start Date: 02/Jan/20 21:54 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#discussion_r362644406 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/SdkHarnessClient.java ## @@ -476,6 +511,38 @@ public void abort() {} } } + /** + * A {@link CloseableFnDataReceiver} which counts the number of elements that have been accepted. + */ + private static class CountingFnDataReceiver implements CloseableFnDataReceiver { +private final CloseableFnDataReceiver delegate; +private long count; + +private CountingFnDataReceiver(CloseableFnDataReceiver delegate) { + this.delegate = delegate; +} + +public long getCount() { + return count; +} + +@Override +public void accept(T input) throws Exception { + count += 1; Review comment: In this case the runner has committed to sending the value to the SDK, we can't realistically account for all "buffers" that may live between this call and when the SDK receives it (JVM buffers, OS buffers, network buffers, ...). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365504) Time Spent: 3h (was: 2h 50m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=365488=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365488 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 02/Jan/20 20:53 Start Date: 02/Jan/20 20:53 Worklog Time Spent: 10m Work Description: boyuanzz commented on pull request #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#discussion_r362624699 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/SdkHarnessClient.java ## @@ -476,6 +511,38 @@ public void abort() {} } } + /** + * A {@link CloseableFnDataReceiver} which counts the number of elements that have been accepted. + */ + private static class CountingFnDataReceiver implements CloseableFnDataReceiver { +private final CloseableFnDataReceiver delegate; +private long count; + +private CountingFnDataReceiver(CloseableFnDataReceiver delegate) { + this.delegate = delegate; +} + +public long getCount() { + return count; +} + +@Override +public void accept(T input) throws Exception { + count += 1; Review comment: I'm not sure whether the data channel does something like buffering. I'm curious whether the count should be counted `when `flush` is called? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365488) Time Spent: 2h 50m (was: 2h 40m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=365436=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365436 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 02/Jan/20 18:10 Start Date: 02/Jan/20 18:10 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#issuecomment-570291489 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365436) Time Spent: 2h 40m (was: 2.5h) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 2h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=365412=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-365412 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 02/Jan/20 16:46 Start Date: 02/Jan/20 16:46 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#issuecomment-570265673 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 365412) Time Spent: 2.5h (was: 2h 20m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 2.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=364893=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364893 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 31/Dec/19 06:54 Start Date: 31/Dec/19 06:54 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#issuecomment-569876432 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 364893) Time Spent: 2h 20m (was: 2h 10m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=364826=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364826 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 31/Dec/19 02:07 Start Date: 31/Dec/19 02:07 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#issuecomment-569847995 Run Python2_PVR_Flink PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 364826) Time Spent: 2h 10m (was: 2h) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 2h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=364825=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364825 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 31/Dec/19 02:07 Start Date: 31/Dec/19 02:07 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#issuecomment-569847979 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 364825) Time Spent: 2h (was: 1h 50m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 2h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=364824=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364824 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 31/Dec/19 02:07 Start Date: 31/Dec/19 02:07 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#issuecomment-569847967 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 364824) Time Spent: 1h 50m (was: 1h 40m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=364765=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364765 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 30/Dec/19 21:27 Start Date: 30/Dec/19 21:27 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482 This creates an unsupported API to RemoteBundle and a default split handler that throws. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=364767=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364767 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 30/Dec/19 21:27 Start Date: 30/Dec/19 21:27 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#issuecomment-569801244 R: @boyuanzz @mxm This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 364767) Time Spent: 1h 40m (was: 1.5h) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 1h 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=364766=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-364766 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 30/Dec/19 21:27 Start Date: 30/Dec/19 21:27 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10482: [BEAM-5600] Add unimplemented split API to Runner side SDF libraries. URL: https://github.com/apache/beam/pull/10482#discussion_r362100325 ## File path: runners/java-fn-execution/src/main/java/org/apache/beam/runners/fnexecution/control/SdkHarnessClient.java ## @@ -231,122 +234,114 @@ public ActiveBundle newBundle( return fnApiDataService.receive( LogicalEndpoint.of(bundleId, ptransformId), receiver.getCoder(), receiver.getReceiver()); } - } - /** An active bundle for a particular {@link BeamFnApi.ProcessBundleDescriptor}. */ - public static class ActiveBundle implements RemoteBundle { -private final String bundleId; -private final CompletionStage response; -private final Map inputReceivers; -private final Map outputClients; -private final StateDelegator.Registration stateRegistration; -private final BundleProgressHandler progressHandler; -private final BundleCheckpointHandler checkpointHandler; -private final BundleFinalizationHandler finalizationHandler; - -private ActiveBundle( -String bundleId, -CompletionStage response, -Map inputReceivers, -Map outputClients, -StateDelegator.Registration stateRegistration, -BundleProgressHandler progressHandler, -BundleCheckpointHandler checkpointHandler, -BundleFinalizationHandler finalizationHandler) { - this.bundleId = bundleId; - this.response = response; - this.inputReceivers = inputReceivers; - this.outputClients = outputClients; - this.stateRegistration = stateRegistration; - this.progressHandler = progressHandler; - this.checkpointHandler = checkpointHandler; - this.finalizationHandler = finalizationHandler; -} +/** An active bundle for a particular {@link BeamFnApi.ProcessBundleDescriptor}. */ +public class ActiveBundle implements RemoteBundle { + private final String bundleId; + private final CompletionStage response; + private final Map inputReceivers; + private final Map outputClients; + private final StateDelegator.Registration stateRegistration; + private final BundleProgressHandler progressHandler; + private final BundleSplitHandler splitHandler; + private final BundleCheckpointHandler checkpointHandler; + private final BundleFinalizationHandler finalizationHandler; + + private ActiveBundle( + String bundleId, + CompletionStage response, + Map inputReceivers, + Map outputClients, + StateDelegator.Registration stateRegistration, + BundleProgressHandler progressHandler, + BundleSplitHandler splitHandler, + BundleCheckpointHandler checkpointHandler, + BundleFinalizationHandler finalizationHandler) { +this.bundleId = bundleId; +this.response = response; +this.inputReceivers = inputReceivers; +this.outputClients = outputClients; +this.stateRegistration = stateRegistration; +this.progressHandler = progressHandler; +this.splitHandler = splitHandler; +this.checkpointHandler = checkpointHandler; +this.finalizationHandler = finalizationHandler; + } -/** Returns an id used to represent this bundle. */ -@Override -public String getId() { - return bundleId; -} + /** Returns an id used to represent this bundle. */ + @Override + public String getId() { +return bundleId; + } -/** - * Get a map of PCollection ids to {@link FnDataReceiver receiver}s which consume input - * elements, forwarding them to the remote environment. - */ -@Override -public Map getInputReceivers() { - return (Map) inputReceivers; -} + /** + * Get a map of PCollection ids to {@link FnDataReceiver receiver}s which consume input + * elements, forwarding them to the remote environment. + */ + @Override + public Map getInputReceivers() { +return (Map) inputReceivers; + } -/** - * Blocks until bundle processing is finished. This is comprised of: - * - * - * closing each {@link #getInputReceivers() input receiver}. - * waiting for the SDK to say that processing the bundle is finished. - * waiting for all inbound data clients to complete - * - * - * This method will throw an exception if bundle processing has failed. {@link - *
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=341554=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-341554 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 12/Nov/19 00:12 Start Date: 12/Nov/19 00:12 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10045: [BEAM-5600, BEAM-2939] Add SplittableParDo expansion logic to runner's core. URL: https://github.com/apache/beam/pull/10045 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 341554) Time Spent: 1h 10m (was: 1h) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 1h 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=341508=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-341508 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 11/Nov/19 22:04 Start Date: 11/Nov/19 22:04 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10045: [BEAM-5600, BEAM-2939] Add SplittableParDo expansion logic to runner's core. URL: https://github.com/apache/beam/pull/10045#discussion_r344930720 ## File path: runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaPipelineRunner.java ## @@ -36,8 +39,16 @@ @Override public PortablePipelineResult run(final Pipeline pipeline, JobInfo jobInfo) { +// Expand any splittable DoFns within the graph to enable sizing and splitting of bundles. Review comment: I was thinking about that but the preparation will likely diverge once more forms of splitting are added since the various graph expansions will change. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 341508) Time Spent: 1h (was: 50m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=341476=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-341476 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 11/Nov/19 21:10 Start Date: 11/Nov/19 21:10 Worklog Time Spent: 10m Work Description: tweise commented on pull request #10045: [BEAM-5600, BEAM-2939] Add SplittableParDo expansion logic to runner's core. URL: https://github.com/apache/beam/pull/10045#discussion_r344910948 ## File path: runners/samza/src/main/java/org/apache/beam/runners/samza/SamzaPipelineRunner.java ## @@ -36,8 +39,16 @@ @Override public PortablePipelineResult run(final Pipeline pipeline, JobInfo jobInfo) { +// Expand any splittable DoFns within the graph to enable sizing and splitting of bundles. Review comment: Considering the repetition between Flink, Spark and Samza, maybe worth moving the recurring sequence into "preparation" utility function? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 341476) Time Spent: 50m (was: 40m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=341465=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-341465 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 11/Nov/19 20:33 Start Date: 11/Nov/19 20:33 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10045: [BEAM-5600, BEAM-2939] Add SplittableParDo expansion logic to runner's core. URL: https://github.com/apache/beam/pull/10045#issuecomment-552603677 Run Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 341465) Time Spent: 40m (was: 0.5h) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=341359=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-341359 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 11/Nov/19 16:26 Start Date: 11/Nov/19 16:26 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10045: [BEAM-5600, BEAM-2939] Add SplittableParDo expansion logic to runner's core. URL: https://github.com/apache/beam/pull/10045#issuecomment-552513917 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 341359) Time Spent: 20m (was: 10m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=341360=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-341360 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 11/Nov/19 16:26 Start Date: 11/Nov/19 16:26 Worklog Time Spent: 10m Work Description: lukecwik commented on issue #10045: [BEAM-5600, BEAM-2939] Add SplittableParDo expansion logic to runner's core. URL: https://github.com/apache/beam/pull/10045#issuecomment-552513983 Run Portable_Python PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 341360) Time Spent: 0.5h (was: 20m) > Splitting for SplittableDoFn should be exposed within runner shared libraries > - > > Key: BEAM-5600 > URL: https://issues.apache.org/jira/browse/BEAM-5600 > Project: Beam > Issue Type: New Feature > Components: sdk-java-core >Reporter: Scott Wegner >Priority: Major > Labels: portability > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-5600) Splitting for SplittableDoFn should be exposed within runner shared libraries
[ https://issues.apache.org/jira/browse/BEAM-5600?focusedWorklogId=340749=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-340749 ] ASF GitHub Bot logged work on BEAM-5600: Author: ASF GitHub Bot Created on: 08/Nov/19 21:06 Start Date: 08/Nov/19 21:06 Worklog Time Spent: 10m Work Description: lukecwik commented on pull request #10045: [BEAM-5600, BEAM-2939] Add SplittableParDo expansion logic to runner's core. URL: https://github.com/apache/beam/pull/10045 Update Flink, Spark, and Samza to use it removing the unnecessary expansion within the Python portable runner. Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/) Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)[![Build