[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=96785&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-96785 ]

ASF GitHub Bot logged work on BEAM-4071:

    Author: ASF GitHub Bot
    Created on: 30/Apr/18 17:30
    Start Date: 30/Apr/18 17:30
    Worklog Time Spent: 10m
    Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim
    URL: https://github.com/apache/beam/pull/5150#discussion_r185053553

    ## File path: runners/reference/java/build.gradle ##
    @@ -27,8 +27,11 @@ framework to execute user-definied functions."""
     dependencies {
       shadow project(path: ":beam-model-pipeline", configuration: "shadow")
       shadow project(path: ":beam-runners-core-construction-java", configuration: "shadow")
    +  shadow project(path: ":beam-sdks-java-fn-execution", configuration: "shadow")
       shadow library.java.slf4j_api
    -  testCompile library.java.junit
    +  shadowTest project(path: ":beam-runners-core-construction-java", configuration: "shadowTest")

    Review comment:
    No. This is pulling in `InMemoryArtifactStagerService`, which is a test _utility_ rather than a test itself. That class should really live under the main source set in a `testing` subpackage, but I didn't want to do that refactor here. Is there a tracking bug for removing cross-module test dependencies?

This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

    Worklog Id: (was: 96785)
    Time Spent: 19h (was: 18h 50m)

> Portable Runner Job API shim
>
> Key: BEAM-4071
> URL: https://issues.apache.org/jira/browse/BEAM-4071
> Project: Beam
> Issue Type: New Feature
> Components: runner-core
> Reporter: Ben Sidhom
> Assignee: Ben Sidhom
> Priority: Minor
> Fix For: 2.5.0
>
> Time Spent: 19h
> Remaining Estimate: 0h
>
> There needs to be a way to execute Java-SDK pipelines against a portable job
> server. The job server itself is expected to be started up out-of-band. The
> "PortableRunner" should take an option indicating the Job API endpoint and
> defer other runner configurations to the backend itself.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
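As a rough illustration of the issue description above (not the code from pull request #5150), the split between the single client-side option and everything deferred to the backend might look like the following Java sketch. The class `PortableRunnerOptionsSketch`, the `--jobEndpoint` flag name, and the `--name=value` parsing are assumptions made for this example only; the PR itself works with Beam's `PortablePipelineOptions`, whose details are not shown in these messages.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: only the job endpoint is interpreted on the client side. */
public class PortableRunnerOptionsSketch {

  /** The endpoint the client connects to, plus options passed through untouched. */
  public static final class Split {
    public final String jobEndpoint;
    public final Map<String, String> forwarded;

    Split(String jobEndpoint, Map<String, String> forwarded) {
      this.jobEndpoint = jobEndpoint;
      this.forwarded = forwarded;
    }
  }

  /** Separates the assumed --jobEndpoint flag from options the backend should interpret. */
  public static Split split(String... args) {
    String endpoint = null;
    Map<String, String> forwarded = new LinkedHashMap<>();
    for (String arg : args) {
      int eq = arg.indexOf('=');
      if (!arg.startsWith("--") || eq < 0) {
        throw new IllegalArgumentException("Expected --name=value, got: " + arg);
      }
      String name = arg.substring(2, eq);
      String value = arg.substring(eq + 1);
      if (name.equals("jobEndpoint")) {
        endpoint = value; // the one option the client-side shim needs
      } else {
        forwarded.put(name, value); // left for the backend to validate
      }
    }
    if (endpoint == null) {
      throw new IllegalArgumentException("--jobEndpoint is required");
    }
    return new Split(endpoint, forwarded);
  }

  public static void main(String[] args) {
    Split s = split("--jobEndpoint=localhost:8099", "--parallelism=4");
    System.out.println("endpoint=" + s.jobEndpoint + " forwarded=" + s.forwarded);
  }
}
```

The point of the sketch is that the client validates nothing except the presence of the endpoint; every other setting is carried along opaquely so that the job server decides what it means.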
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=95588&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-95588 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 26/Apr/18 16:25 Start Date: 26/Apr/18 16:25 Worklog Time Spent: 10m Work Description: iemejia closed pull request #5232: [BEAM-4071] fix Maven build URL: https://github.com/apache/beam/pull/5232 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/runners/reference/java/pom.xml b/runners/reference/java/pom.xml index cbecdd8e413..93a519aa832 100644 --- a/runners/reference/java/pom.xml +++ b/runners/reference/java/pom.xml @@ -36,6 +36,10 @@ jar + +true + + org.apache.beam @@ -47,6 +51,21 @@ beam-runners-core-construction-java + + org.apache.beam + beam-sdks-java-fn-execution + + + + org.apache.beam + beam-model-job-management + + + + org.apache.beam + beam-runners-core-construction-java + + com.google.protobuf protobuf-java @@ -83,5 +102,14 @@ slf4j-jdk14 test + + + org.apache.beam + beam-runners-core-construction-java + ${project.version} + tests + test + + This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking
---

    Worklog Id: (was: 95588)
    Time Spent: 18h 50m (was: 18h 40m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=95565&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-95565 ]

ASF GitHub Bot logged work on BEAM-4071:

    Author: ASF GitHub Bot
    Created on: 26/Apr/18 16:04
    Start Date: 26/Apr/18 16:04
    Worklog Time Spent: 10m
    Work Description: kennknowles commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim
    URL: https://github.com/apache/beam/pull/5150#discussion_r18769

    ## File path: runners/reference/java/build.gradle ##
    @@ -27,8 +27,11 @@ framework to execute user-definied functions."""
     dependencies {
       shadow project(path: ":beam-model-pipeline", configuration: "shadow")
       shadow project(path: ":beam-runners-core-construction-java", configuration: "shadow")
    +  shadow project(path: ":beam-sdks-java-fn-execution", configuration: "shadow")
       shadow library.java.slf4j_api
    -  testCompile library.java.junit
    +  shadowTest project(path: ":beam-runners-core-construction-java", configuration: "shadowTest")

    Review comment:
    (is it pulling in NeedsRunner tests? if so, comment that, since these dependencies are by default an incorrect thing to do)

Issue Time Tracking
---

    Worklog Id: (was: 95565)
    Time Spent: 18.5h (was: 18h 20m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=95567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-95567 ]

ASF GitHub Bot logged work on BEAM-4071:

    Author: ASF GitHub Bot
    Created on: 26/Apr/18 16:04
    Start Date: 26/Apr/18 16:04
    Worklog Time Spent: 10m
    Work Description: kennknowles commented on a change in pull request #5232: [BEAM-4071] fix Maven build
    URL: https://github.com/apache/beam/pull/5232#discussion_r18885

    ## File path: runners/reference/java/pom.xml ##
    @@ -83,5 +102,14 @@ slf4j-jdk14 test + +

    Review comment:
    This kind of dependency is a red flag and we have a JIRA to eliminate all such dependencies unless they are used only for executing. I went back and commented on the original PR to ask. Since this is for parity it LGTM but both should be explained in comment.

Issue Time Tracking
---

    Worklog Id: (was: 95567)
    Time Spent: 18h 40m (was: 18.5h)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=95564&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-95564 ]

ASF GitHub Bot logged work on BEAM-4071:

    Author: ASF GitHub Bot
    Created on: 26/Apr/18 16:03
    Start Date: 26/Apr/18 16:03
    Worklog Time Spent: 10m
    Work Description: kennknowles commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim
    URL: https://github.com/apache/beam/pull/5150#discussion_r18445

    ## File path: runners/reference/java/build.gradle ##
    @@ -27,8 +27,11 @@ framework to execute user-definied functions."""
     dependencies {
       shadow project(path: ":beam-model-pipeline", configuration: "shadow")
       shadow project(path: ":beam-runners-core-construction-java", configuration: "shadow")
    +  shadow project(path: ":beam-sdks-java-fn-execution", configuration: "shadow")
       shadow library.java.slf4j_api
    -  testCompile library.java.junit
    +  shadowTest project(path: ":beam-runners-core-construction-java", configuration: "shadowTest")

    Review comment:
    Why is this here? We should not have these kinds of dependencies except for the purpose of running test suites.

Issue Time Tracking
---

    Worklog Id: (was: 95564)
    Time Spent: 18h 20m (was: 18h 10m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=95558&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-95558 ]

ASF GitHub Bot logged work on BEAM-4071:

    Author: ASF GitHub Bot
    Created on: 26/Apr/18 15:48
    Start Date: 26/Apr/18 15:48
    Worklog Time Spent: 10m
    Work Description: tweise opened a new pull request #5232: [BEAM-4071] fix Maven build
    URL: https://github.com/apache/beam/pull/5232

Issue Time Tracking
---

    Worklog Id: (was: 95558)
    Time Spent: 18h 10m (was: 18h)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=95277&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-95277 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 25/Apr/18 22:09 Start Date: 25/Apr/18 22:09 Worklog Time Spent: 10m Work Description: jkff closed pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java index 7319b8f30d1..63cc50810c3 100644 --- a/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java +++ b/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java @@ -47,6 +47,7 @@ import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactChunk; import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; import org.apache.beam.model.jobmanagement.v1.ArtifactApi.Manifest; import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; @@ -87,26 +88,33 @@ private ArtifactServiceStager(Channel channel, int bufferSize) { this.bufferSize = bufferSize; } - public void stage(Iterable files) throws IOException, InterruptedException { -final Map> 
futures = new HashMap<>(); -for (File file : files) { + /** + * Stages the given artifact files to the staging service. + * + * @return The artifact staging token returned by the service + */ + public String stage(Iterable files) throws IOException, InterruptedException { +final Map> futures = new HashMap<>(); +for (StagedFile file : files) { futures.put(file, MoreFutures.supplyAsync(new StagingCallable(file), executorService)); } CompletionStage stagingResult = MoreFutures.allAsList(futures.values()) .thenApply(ignored -> new ExtractStagingResultsCallable(futures).call()); -stageManifest(stagingResult); +return stageManifest(stagingResult); } - private void stageManifest(CompletionStage stagingFuture) + private String stageManifest(CompletionStage stagingFuture) throws InterruptedException { try { StagingResult stagingResult = MoreFutures.get(stagingFuture); if (stagingResult.isSuccess()) { Manifest manifest = Manifest.newBuilder().addAllArtifact(stagingResult.getMetadata()).build(); -blockingStub.commitManifest( -CommitManifestRequest.newBuilder().setManifest(manifest).build()); +CommitManifestResponse response = +blockingStub.commitManifest( + CommitManifestRequest.newBuilder().setManifest(manifest).build()); +return response.getStagingToken(); } else { RuntimeException failure = new RuntimeException( @@ -124,9 +132,9 @@ private void stageManifest(CompletionStage stagingFuture) } private class StagingCallable implements ThrowingSupplier { -private final File file; +private final StagedFile file; -private StagingCallable(File file) { +private StagingCallable(StagedFile file) { this.file = file; } @@ -135,11 +143,12 @@ public ArtifactMetadata get() throws Exception { // TODO: Add Retries PutArtifactResponseObserver responseObserver = new PutArtifactResponseObserver(); StreamObserver requestObserver = stub.putArtifact(responseObserver); - ArtifactMetadata metadata = ArtifactMetadata.newBuilder().setName(file.getName()).build(); + ArtifactMetadata metadata = + 
ArtifactMetadata.newBuilder().setName(file.getStagingName()).build(); requestObserver.onNext(PutArtifactRequest.newBuilder().setMetadata(metadata).build()); MessageDigest md5Digest = MessageDigest.getInstance("MD5"); - FileChannel channel = new FileInputStream(file).getChannel(); + FileChannel channel = new FileInputStream(file.getFile()).getChannel(); ByteBuffer readBuffer = ByteBuffer.allocate(bufferSize); while (!responseObserver.isTerminal() && channel.position() < channel.size()) { readBuffer.clea
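
The change to `stage` in the diff above (upload every file concurrently, then commit a single manifest and return the staging token) can be sketched without gRPC. This is a simplified illustration, not the Beam implementation: `StagingService`, `FakeStagingService`, and the string-based metadata are hypothetical stand-ins, and `CompletableFuture` is swapped in for the `MoreFutures` helpers that the diff uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of "stage all artifacts, then commit a manifest for a token". */
public class StagingSketch {

  /** Stand-in for the artifact staging service; not a Beam interface. */
  public interface StagingService {
    /** Uploads one artifact and returns its metadata. */
    String putArtifact(String stagingName);

    /** Commits the manifest of all uploaded artifacts and returns a staging token. */
    String commitManifest(List<String> metadata);
  }

  /** In-memory fake used only to make the sketch runnable. */
  public static final class FakeStagingService implements StagingService {
    @Override
    public String putArtifact(String stagingName) {
      return "meta:" + stagingName;
    }

    @Override
    public String commitManifest(List<String> metadata) {
      return "token-for-" + metadata.size() + "-artifacts";
    }
  }

  /** Uploads concurrently, waits for every upload, then commits once. */
  public static String stage(
      List<String> stagingNames, StagingService service, ExecutorService executor) {
    List<CompletableFuture<String>> uploads = new ArrayList<>();
    for (String name : stagingNames) {
      uploads.add(CompletableFuture.supplyAsync(() -> service.putArtifact(name), executor));
    }
    List<String> metadata = new ArrayList<>();
    for (CompletableFuture<String> upload : uploads) {
      // join() rethrows an upload failure, so the manifest is never committed partially.
      metadata.add(upload.join());
    }
    return service.commitManifest(metadata);
  }

  public static void main(String[] args) {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    try {
      String token = stage(Arrays.asList("a.jar", "b.jar"), new FakeStagingService(), executor);
      System.out.println("staging token: " + token);
    } finally {
      executor.shutdown();
    }
  }
}
```

Returning the token from `stage` rather than discarding the commit response lets the caller hand it to a later request, which appears to be the motivation for the signature change in the diff.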
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=94765&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-94765 ]

ASF GitHub Bot logged work on BEAM-4071:

    Author: ASF GitHub Bot
    Created on: 24/Apr/18 19:09
    Start Date: 24/Apr/18 19:09
    Worklog Time Spent: 10m
    Work Description: bsidhom commented on issue #5150: [BEAM-4071] Add Portable Runner Job API shim
    URL: https://github.com/apache/beam/pull/5150#issuecomment-384046168

    Done.

Issue Time Tracking
---

    Worklog Id: (was: 94765)
    Time Spent: 17h 50m (was: 17h 40m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=94637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-94637 ]

ASF GitHub Bot logged work on BEAM-4071:

    Author: ASF GitHub Bot
    Created on: 24/Apr/18 15:42
    Start Date: 24/Apr/18 15:42
    Worklog Time Spent: 10m
    Work Description: tgroh commented on issue #5150: [BEAM-4071] Add Portable Runner Job API shim
    URL: https://github.com/apache/beam/pull/5150#issuecomment-383980472

    Gotta rebase one more time, I fear

Issue Time Tracking
---

    Worklog Id: (was: 94637)
    Time Spent: 17h 40m (was: 17.5h)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=94401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-94401 ]

ASF GitHub Bot logged work on BEAM-4071:

    Author: ASF GitHub Bot
    Created on: 24/Apr/18 00:52
    Start Date: 24/Apr/18 00:52
    Worklog Time Spent: 10m
    Work Description: bsidhom commented on issue #5150: [BEAM-4071] Add Portable Runner Job API shim
    URL: https://github.com/apache/beam/pull/5150#issuecomment-383767860

    Commits should be clean now.

Issue Time Tracking
---

    Worklog Id: (was: 94401)
    Time Spent: 17.5h (was: 17h 20m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=94261&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-94261 ]

ASF GitHub Bot logged work on BEAM-4071:

    Author: ASF GitHub Bot
    Created on: 23/Apr/18 19:30
    Start Date: 23/Apr/18 19:30
    Worklog Time Spent: 10m
    Work Description: bsidhom commented on issue #5150: [BEAM-4071] Add Portable Runner Job API shim
    URL: https://github.com/apache/beam/pull/5150#issuecomment-383694569

    retest this please

Issue Time Tracking
---

    Worklog Id: (was: 94261)
    Time Spent: 17h 20m (was: 17h 10m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93536&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93536 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 21/Apr/18 01:44 Start Date: 21/Apr/18 01:44 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183196402 ## File path: runners/reference/java/src/test/java/org/apache/beam/runners/reference/CloseableResourceTest.java ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static org.hamcrest.MatcherAssert.assertThat; +import static org.hamcrest.Matchers.instanceOf; +import static org.hamcrest.Matchers.is; +import static org.hamcrest.Matchers.notNullValue; +import static org.junit.Assert.assertTrue; + +import java.util.concurrent.atomic.AtomicBoolean; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +/** Tests for {@link CloseableResource}. 
*/ +@RunWith(JUnit4.class) +public class CloseableResourceTest { + + @Test + public void alwaysReturnsSameResource() { +Foo foo = new Foo(); +CloseableResource resource = CloseableResource.of(foo, (ignored) -> {}); +assertThat(resource.get(), is(foo)); +assertThat(resource.get(), is(foo)); + } + + @Test + public void callsCloser() throws Exception { +AtomicBoolean closed = new AtomicBoolean(false); +try (CloseableResource ignored = +CloseableResource.of( +new Foo(), +(foo) -> { + closed.set(true); +})) { + // Do nothing. +} +assertTrue(closed.get()); + } + + @Test + public void wrapsExceptionsInCloseException() { +Exception wrapped = new Exception(); +CloseException closeException = null; +try (CloseableResource ignored = +CloseableResource.of( +new Foo(), +(foo) -> { + throw wrapped; +})) { + // Do nothing. +} catch (CloseException e) { + closeException = e; Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 93536) Time Spent: 17h 10m (was: 17h) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 17h 10m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
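
The behavior that `CloseableResourceTest` exercises above (a stable `get()`, a closer invoked on `close()`, and closer failures wrapped in `CloseException`) could be implemented along the following lines. This is a minimal sketch inferred from the test alone; the real `CloseableResource` from the PR is not shown in this message and may differ, and the names `CloseableResourceSketch` and `Closer` are assumptions.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of a resource holder usable in try-with-resources. */
public class CloseableResourceSketch<T> implements AutoCloseable {

  /** A closer that is allowed to throw any checked exception. */
  @FunctionalInterface
  public interface Closer<T> {
    void close(T resource) throws Exception;
  }

  /** Wraps whatever the closer throws, so callers catch a single exception type. */
  public static class CloseException extends Exception {
    CloseException(Exception cause) {
      super("Failed to close resource", cause);
    }
  }

  private final T resource;
  private final Closer<T> closer;

  private CloseableResourceSketch(T resource, Closer<T> closer) {
    this.resource = resource;
    this.closer = closer;
  }

  public static <T> CloseableResourceSketch<T> of(T resource, Closer<T> closer) {
    return new CloseableResourceSketch<>(
        Objects.requireNonNull(resource), Objects.requireNonNull(closer));
  }

  /** Always returns the same wrapped instance. */
  public T get() {
    return resource;
  }

  @Override
  public void close() throws CloseException {
    try {
      closer.close(resource);
    } catch (Exception e) {
      throw new CloseException(e);
    }
  }

  public static void main(String[] args) throws Exception {
    AtomicBoolean closed = new AtomicBoolean(false);
    try (CloseableResourceSketch<String> r =
        CloseableResourceSketch.of("resource", s -> closed.set(true))) {
      System.out.println("using " + r.get());
    }
    System.out.println("closed: " + closed.get());
  }
}
```

Narrowing `close()` to a dedicated exception type means a `try`-with-resources block only has to handle `CloseException`, and the original failure remains available through `getCause()`.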
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93534&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93534 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 21/Apr/18 01:44 Start Date: 21/Apr/18 01:44 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183196129 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/TestJobService.java ## @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import io.grpc.stub.StreamObserver; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.JobMessagesRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.JobMessagesResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.JobState; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceImplBase; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; + +/** + * A JobService for tests. + * + * A {@link TestJobService} always returns a fixed staging endpoint, job preparation id, job id, + * and job state. As soon as a job is run, it is put into the given job state. 
+ */ +public class TestJobService extends JobServiceImplBase { + + private final ApiServiceDescriptor stagingEndpoint; + private final String preparationId; + private final String jobId; + private final JobState.Enum jobState; + + public TestJobService( + ApiServiceDescriptor stagingEndpoint, + String preparationId, + String jobId, + JobState.Enum jobState) { +this.stagingEndpoint = stagingEndpoint; +this.preparationId = preparationId; +this.jobId = jobId; +this.jobState = jobState; + } + + @Override + public void prepare( + PrepareJobRequest request, StreamObserver responseObserver) { +responseObserver.onNext( +PrepareJobResponse.newBuilder() +.setPreparationId(preparationId) +.setArtifactStagingEndpoint(stagingEndpoint) +.build()); +responseObserver.onCompleted(); + } + + @Override + public void run(RunJobRequest request, StreamObserver responseObserver) { + responseObserver.onNext(RunJobResponse.newBuilder().setJobId(jobId).build()); +responseObserver.onCompleted(); + } + + @Override + public void getState( + GetJobStateRequest request, StreamObserver responseObserver) { + responseObserver.onNext(GetJobStateResponse.newBuilder().setState(jobState).build()); +responseObserver.onCompleted(); + } + + @Override + public void cancel(CancelJobRequest request, StreamObserver responseObserver) { +super.cancel(request, responseObserver); Review comment: Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 93534) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components:
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93528&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93528 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 21/Apr/18 01:44 Start Date: 21/Apr/18 01:44 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183194847 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner<PipelineResult> { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection<StagedFile> filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); Review comment: Done. Issue Time Tracking --- Worklog Id: (was: 93528) Time Spent: 16h 1
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93535 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 21/Apr/18 01:44 Start Date: 21/Apr/18 01:44 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183196027 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InProcessManagedChannelFactory.java ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference.testing; + +import io.grpc.ManagedChannel; +import io.grpc.inprocess.InProcessChannelBuilder; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; + +/** + * A {@link org.apache.beam.sdk.fn.channel.ManagedChannelFactory} that uses in-process channels. + * + * The channel builder uses {@link ApiServiceDescriptor#getUrl()} as the unique in-process name. 
+ */ +public class InProcessManagedChannelFactory extends ManagedChannelFactory { + + @Override + public ManagedChannel forDescriptor(ApiServiceDescriptor apiServiceDescriptor) { Review comment: Done. Issue Time Tracking --- Worklog Id: (was: 93535) Time Spent: 17h (was: 16h 50m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 17h > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself.
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93530&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93530 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 21/Apr/18 01:44 Start Date: 21/Apr/18 01:44 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183194864 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner<PipelineResult> { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection<StagedFile> filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String endpoint = portableOptions.getJobEndpoint(); + +// Deduplicate artifacts. +Set<String> pathsToStage = Sets.newHashSet(); +if (portableOptions.getFile
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93531&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93531 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 21/Apr/18 01:44 Start Date: 21/Apr/18 01:44 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183194737 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource<JobServiceBlockingStub> jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource<JobServiceBlockingStub> jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture<State> result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( +result, duration.getMillis(), TimeUnit.MILLISECONDS); + } catch (TimeoutException e) { +// Null result indicates a timeout. +return null; + } catch (ExecutionException e) { +throw new RuntimeException(e); + } +} + } + + @Override + public State waitUntilFinish() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateRequest request = GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build(); +GetJobStateResponse response = stub.getState(request); +State lastState = getJavaState(response.getState()); +while (!lastState.isTerminal()) { + Uninterruptibles.sleepUninterruptibly(POLL_INTERVAL_SEC, TimeUnit.SECONDS); Review comment: Done. Issue Time Tracking --- Worklog Id: (wa
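Note on the loop under review above: it polls the job state at a fixed interval until the state is terminal. A minimal, self-contained sketch of that pattern follows. The `State` enum, `pollUntilTerminal`, and the scripted state supplier are hypothetical stand-ins rather than Beam or Job API types, and plain `Thread.sleep` is used here instead of Guava's `Uninterruptibles`, so interruption handling differs from the diff.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

public class PollUntilTerminalSketch {

  /** Hypothetical stand-in for a job state; not Beam's PipelineResult.State. */
  public enum State {
    RUNNING,
    DONE,
    FAILED;

    public boolean isTerminal() {
      return this != RUNNING;
    }
  }

  /**
   * Fetches the state once, then keeps sleeping and re-fetching until a
   * terminal state is observed. fetchState stands in for the remote
   * getState call (an assumption for illustration).
   */
  public static State pollUntilTerminal(Supplier<State> fetchState, long pollIntervalMillis) {
    State last = fetchState.get();
    while (!last.isTerminal()) {
      try {
        Thread.sleep(pollIntervalMillis);
      } catch (InterruptedException e) {
        // Unlike sleepUninterruptibly, this sketch gives up on interruption.
        Thread.currentThread().interrupt();
        throw new IllegalStateException("Interrupted while waiting for job", e);
      }
      last = fetchState.get();
    }
    return last;
  }

  public static void main(String[] args) {
    // Scripted sequence of states in place of a real job service.
    Iterator<State> scripted =
        Arrays.asList(State.RUNNING, State.RUNNING, State.DONE).iterator();
    System.out.println(pollUntilTerminal(scripted::next, 10));
  }
}
```

Passing the state source as a `Supplier` keeps the loop testable without a running job server, which is the same motivation as the fake `TestJobService` elsewhere in this pull request.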
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93533&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93533 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 21/Apr/18 01:44 Start Date: 21/Apr/18 01:44 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183195286 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InMemoryArtifactService.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import com.google.common.collect.Sets; +import com.google.common.io.BaseEncoding; +import io.grpc.Status; +import io.grpc.stub.StreamObserver; +import java.security.MessageDigest; +import java.util.Set; +import java.util.function.Consumer; +import javax.annotation.concurrent.GuardedBy; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A StagingService for tests. Only stores artifact metadata. */ Review comment: Wow. I guess IntelliJ failed me. 😢 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 93533) Time Spent: 16h 50m (was: 16h 40m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 16h 50m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. 
The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93532&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93532 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 21/Apr/18 01:44 Start Date: 21/Apr/18 01:44 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183196507 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java ## @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import java.util.List; +import org.apache.beam.sdk.options.Validation.Required; + +/** Pipeline options common to all portable runners. */ +public interface PortablePipelineOptions extends PipelineOptions { + + // TODO: https://issues.apache.org/jira/browse/BEAM-4106: Consider pulling this out into a new + // options interface, e.g., FileStagingOptions. Review comment: Done. This is an automated message from the Apache Git Service. 
To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 93532) Time Spent: 16h 40m (was: 16.5h) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 16h 40m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93529&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93529 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 21/Apr/18 01:44 Start Date: 21/Apr/18 01:44 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183194398 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource<JobServiceBlockingStub> jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource<JobServiceBlockingStub> jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture<State> result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( Review comment: Done. Issue Time Tracking --- Worklog Id: (was: 93529) Time Spent: 16h 20m (was: 16h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 16h 20m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > def
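Note on the bounded wait under review above: the blocking wait is run asynchronously and the caller waits on the resulting future for at most the requested duration, returning null on timeout. A rough, self-contained sketch of that idea follows. `waitWithTimeout` and `sleepQuietly` are hypothetical names, and `Future.get` with a timeout is used here in place of Guava's `Uninterruptibles.getUninterruptibly`, so interruption is handled differently than in the diff.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class TimedWaitSketch {

  /**
   * Runs blockingWait on another thread and waits at most timeoutMillis
   * for its result. A null return value signals a timeout.
   */
  public static <T> T waitWithTimeout(Supplier<T> blockingWait, long timeoutMillis) {
    CompletableFuture<T> result = CompletableFuture.supplyAsync(blockingWait);
    try {
      return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // The background task is not cancelled here; it keeps running.
      return null;
    } catch (ExecutionException e) {
      throw new RuntimeException(e);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
  }

  /** Helper for the demo only: sleeps and restores the interrupt flag if interrupted. */
  private static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    // Completes well within the timeout.
    String fast = waitWithTimeout(() -> "DONE", 1_000);
    // Takes longer than the timeout allows.
    String slow =
        waitWithTimeout(
            () -> {
              sleepQuietly(2_000);
              return "DONE";
            },
            50);
    System.out.println(fast);
    System.out.println(slow);
  }
}
```

The first call is expected to return its value and the second to return null, because its supplier outlives the 50 ms budget. One consequence worth keeping in mind with this shape is that a timed-out wait leaves the background task running until it finishes on its own.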
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93480 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183186654 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String endpoint = portableOptions.getJobEndpoint(); + +// Deduplicate artifacts. +Set pathsToStage = Sets.newHashSet(); +if (portableOptions.getFilesT
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93473 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183186297 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( +result, duration.getMillis(), TimeUnit.MILLISECONDS); + } catch (TimeoutException e) { +// Null result indicates a timeout. +return null; + } catch (ExecutionException e) { +throw new RuntimeException(e); + } +} + } + + @Override + public State waitUntilFinish() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateRequest request = GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build(); +GetJobStateResponse response = stub.getState(request); +State lastState = getJavaState(response.getState()); +while (!lastState.isTerminal()) { + Uninterruptibles.sleepUninterruptibly(POLL_INTERVAL_SEC, TimeUnit.SECONDS); Review comment: Just normal sleeps, breaking out of the method if we're interrupted. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Iss
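The suggestion above ("normal sleeps, breaking out of the method if we're interrupted") can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code that landed in the PR: the class `InterruptiblePollSketch`, its nested `State` enum, and the `Supplier`-based state source are hypothetical stand-ins for `JobServicePipelineResult`, `PipelineResult.State`, and the blocking `getState` RPC, and returning `null` on interruption is an assumption about what "breaking out" should mean.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch: poll for a terminal state with an interruptible sleep. */
final class InterruptiblePollSketch {

  /** Stand-in for PipelineResult.State; only the terminal check matters here. */
  enum State {
    RUNNING,
    DONE;

    boolean isTerminal() {
      return this == DONE;
    }
  }

  /**
   * Polls stateSource until it reports a terminal state. If the calling thread is
   * interrupted while sleeping, the interrupt flag is restored and null is returned
   * (assumption: the caller treats null as "no terminal state observed").
   */
  static State waitUntilFinish(Supplier<State> stateSource, long pollIntervalMillis) {
    State lastState = stateSource.get();
    while (!lastState.isTerminal()) {
      try {
        // A plain sleep, in place of Uninterruptibles.sleepUninterruptibly.
        TimeUnit.MILLISECONDS.sleep(pollIntervalMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt for callers further up the stack, then stop polling.
        Thread.currentThread().interrupt();
        return null;
      }
      lastState = stateSource.get();
    }
    return lastState;
  }
}
```

Restoring the interrupt flag before returning keeps the shutdown request visible to whatever called the method, which is the usual convention when a method cannot rethrow `InterruptedException` itself.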
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93481&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93481 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183187341 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java ## @@ -0,0 +1,46 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import java.util.List; +import org.apache.beam.sdk.options.Validation.Required; + +/** Pipeline options common to all portable runners. */ +public interface PortablePipelineOptions extends PipelineOptions { + + // TODO: https://issues.apache.org/jira/browse/BEAM-4106: Consider pulling this out into a new + // options interface, e.g., FileStagingOptions. Review comment: Highly likely that we also desire autopopulation of the classpath here, so worth a comment there also This is an automated message from the Apache Git Service. 
Issue Time Tracking --- Worklog Id: (was: 93481) Time Spent: 15h 50m (was: 15h 40m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93474&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93474 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183186254 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( Review comment: I strongly favor the latter - `InterruptedException` generally means you're being asked to shut down, which we should respect unless we have *extremely strong reasons* to not; this plays nicely with things like `ExecutorService.shutdown()`, and related. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 93474) Time Spent: 14h 50m (was: 14h 40m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 14h 50m > Remaining Estimate
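To illustrate the preference stated above for respecting `InterruptedException`, here is a minimal sketch of a timed wait that lets the interrupt propagate instead of swallowing it with `Uninterruptibles.getUninterruptibly`. The class and method names (`TimedWaitSketch`, `getOrNull`) are hypothetical. The null-on-timeout convention and the wrapping of `ExecutionException` in a `RuntimeException` mirror the quoted diff, but the rest is an assumption about how the change might look, not the PR's final code.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: a bounded wait that does not hide interrupts. */
final class TimedWaitSketch {

  /**
   * Waits up to timeoutMillis for the future to complete.
   *
   * @return the result, or null if the timeout elapsed first
   * @throws InterruptedException if the waiting thread is asked to shut down
   */
  static <T> T getOrNull(CompletableFuture<T> future, long timeoutMillis)
      throws InterruptedException {
    try {
      // Future.get is interruptible: an interrupt surfaces as InterruptedException.
      return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // Null result indicates a timeout, as in the diff under review.
      return null;
    } catch (ExecutionException e) {
      throw new RuntimeException(e);
    }
  }
}
```

Declaring `throws InterruptedException` pushes the decision to the caller. Where an interface does not allow the checked exception, a common alternative is to catch it, call `Thread.currentThread().interrupt()`, and return early.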
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93482&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93482 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183186907 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InProcessManagedChannelFactory.java ## @@ -0,0 +1,36 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference.testing; + +import io.grpc.ManagedChannel; +import io.grpc.inprocess.InProcessChannelBuilder; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; + +/** + * A {@link org.apache.beam.sdk.fn.channel.ManagedChannelFactory} that uses in-process channels. + * + * The channel builder uses {@link ApiServiceDescriptor#getUrl()} as the unique in-process name. 
+ */ +public class InProcessManagedChannelFactory extends ManagedChannelFactory { + + @Override + public ManagedChannel forDescriptor(ApiServiceDescriptor apiServiceDescriptor) { Review comment: I've been using a lambda when this is called for (e.g. in https://github.com/apache/beam/blob/master/runners/java-fn-execution/src/test/java/org/apache/beam/runners/fnexecution/control/RemoteExecutionTest.java#L124); mind updating those also? Issue Time Tracking --- Worklog Id: (was: 93482) Time Spent: 16h (was: 15h 50m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93477&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93477 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183187200 ## File path: runners/reference/java/src/test/java/org/apache/beam/runners/reference/CloseableResourceTest.java ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static org.hamcrest.MatcherAssert.assertThat; +import static org.hamcrest.Matchers.instanceOf; +import static org.hamcrest.Matchers.is; +import static org.hamcrest.Matchers.notNullValue; +import static org.junit.Assert.assertTrue; + +import java.util.concurrent.atomic.AtomicBoolean; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +/** Tests for {@link CloseableResource}. 
*/ +@RunWith(JUnit4.class) +public class CloseableResourceTest { + + @Test + public void alwaysReturnsSameResource() { +Foo foo = new Foo(); +CloseableResource resource = CloseableResource.of(foo, (ignored) -> {}); +assertThat(resource.get(), is(foo)); +assertThat(resource.get(), is(foo)); + } + + @Test + public void callsCloser() throws Exception { +AtomicBoolean closed = new AtomicBoolean(false); +try (CloseableResource ignored = +CloseableResource.of( +new Foo(), +(foo) -> { + closed.set(true); +})) { + // Do nothing. +} +assertTrue(closed.get()); + } + + @Test + public void wrapsExceptionsInCloseException() { +Exception wrapped = new Exception(); +CloseException closeException = null; +try (CloseableResource ignored = +CloseableResource.of( +new Foo(), +(foo) -> { + throw wrapped; +})) { + // Do nothing. +} catch (CloseException e) { + closeException = e; Review comment: All of your tests here match exactly that pattern, though - so I'm not super convinced that we want to have the one test which isn't like the other ones? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 93477) Time Spent: 15h 20m (was: 15h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 15h 20m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. 
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93476&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93476 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183186447 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); Review comment: It's implied to be internal because it's package private and marked `@VisibleForTesting` - anyone using it outside of developers in this package is already jumping into a volcano in terms of behaving properly This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the spe
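The point in the comment above, that a package-private factory already signals "internal", can be sketched as a public entry point delegating to a package-private seam that tests use to inject a dependency. Everything here is hypothetical and simplified: `SeamRunnerSketch` stands in for `PortableRunner`, a `Function<String, String>` stands in for `ManagedChannelFactory`, and the annotation is mentioned only in a comment so the example stays free of the Guava dependency.

```java
import java.util.function.Function;

/** Hypothetical sketch of a public factory delegating to a package-private test seam. */
final class SeamRunnerSketch {

  private final String endpoint;
  private final Function<String, String> channelFactory;

  private SeamRunnerSketch(String endpoint, Function<String, String> channelFactory) {
    this.endpoint = endpoint;
    this.channelFactory = channelFactory;
  }

  /** Public entry point: always wires in the default channel factory. */
  public static SeamRunnerSketch fromOptions(String endpoint) {
    return createInternal(endpoint, url -> "default-channel:" + url);
  }

  /**
   * Package-private seam. In the diff under review the equivalent method also carries
   * Guava's @VisibleForTesting; the restricted visibility is what keeps it out of the
   * public API.
   */
  static SeamRunnerSketch createInternal(
      String endpoint, Function<String, String> channelFactory) {
    return new SeamRunnerSketch(endpoint, channelFactory);
  }

  /** Opens a (fake) channel to the configured endpoint. */
  String openChannel() {
    return channelFactory.apply(endpoint);
  }
}
```

A test in the same package can then call `createInternal` with an in-process substitute, while outside callers only ever see `fromOptions`.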
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93478&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93478 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183186841 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InMemoryArtifactService.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import com.google.common.collect.Sets; +import com.google.common.io.BaseEncoding; +import io.grpc.Status; +import io.grpc.stub.StreamObserver; +import java.security.MessageDigest; +import java.util.Set; +import java.util.function.Consumer; +import javax.annotation.concurrent.GuardedBy; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A StagingService for tests. Only stores artifact metadata. */ Review comment: https://github.com/apache/beam/blob/master/runners/core-construction-java/src/test/java/org/apache/beam/runners/core/construction/InMemoryArtifactStagerService.java This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 93478) Time Spent: 15.5h (was: 15h 20m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 15.5h > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. 
The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93475&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93475 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183186527 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String endpoint = portableOptions.getJobEndpoint(); + +// Deduplicate artifacts. +Set pathsToStage = Sets.newHashSet(); +if (portableOptions.getFilesT
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=93479&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-93479 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 20/Apr/18 23:14 Start Date: 20/Apr/18 23:14 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r183187061 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/TestJobService.java ## @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import io.grpc.stub.StreamObserver; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.JobMessagesRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.JobMessagesResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.JobState; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceImplBase; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; + +/** + * A JobService for tests. + * + * A {@link TestJobService} always returns a fixed staging endpoint, job preparation id, job id, + * and job state. As soon as a job is run, it is put into the given job state. 
+ */ +public class TestJobService extends JobServiceImplBase { + + private final ApiServiceDescriptor stagingEndpoint; + private final String preparationId; + private final String jobId; + private final JobState.Enum jobState; + + public TestJobService( + ApiServiceDescriptor stagingEndpoint, + String preparationId, + String jobId, + JobState.Enum jobState) { +this.stagingEndpoint = stagingEndpoint; +this.preparationId = preparationId; +this.jobId = jobId; +this.jobState = jobState; + } + + @Override + public void prepare( + PrepareJobRequest request, StreamObserver<PrepareJobResponse> responseObserver) { +responseObserver.onNext( +PrepareJobResponse.newBuilder() +.setPreparationId(preparationId) +.setArtifactStagingEndpoint(stagingEndpoint) +.build()); +responseObserver.onCompleted(); + } + + @Override + public void run(RunJobRequest request, StreamObserver<RunJobResponse> responseObserver) { + responseObserver.onNext(RunJobResponse.newBuilder().setJobId(jobId).build()); +responseObserver.onCompleted(); + } + + @Override + public void getState( + GetJobStateRequest request, StreamObserver<GetJobStateResponse> responseObserver) { + responseObserver.onNext(GetJobStateResponse.newBuilder().setState(jobState).build()); +responseObserver.onCompleted(); + } + + @Override + public void cancel(CancelJobRequest request, StreamObserver<CancelJobResponse> responseObserver) { +super.cancel(request, responseObserver); Review comment: Remove overrides that are exactly equal to the `super` method This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 93479) Time Spent: 15h 40m (was: 15.5h) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEA
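The review comment on `TestJobService` asks that overrides which only delegate to `super` be dropped. A minimal sketch of why such an override is redundant follows; `OverrideSketch`, `ServiceBase`, and `TestService` are hypothetical stand-ins (not Beam or gRPC types), and the "UNIMPLEMENTED" string is only an illustrative default.

```java
import java.util.function.Consumer;

/** Illustrative stand-ins only; none of these types come from the PR, Beam, or gRPC. */
final class OverrideSketch {
  private OverrideSketch() {}

  /** Hypothetical base class whose methods all report a default answer. */
  static class ServiceBase {
    void run(Consumer<String> observer) {
      observer.accept("UNIMPLEMENTED");
    }

    void cancel(Consumer<String> observer) {
      observer.accept("UNIMPLEMENTED");
    }
  }

  /** Overrides only the method whose behavior differs; cancel() is simply inherited. */
  static class TestService extends ServiceBase {
    @Override
    void run(Consumer<String> observer) {
      observer.accept("job-1");
    }
    // No cancel() override: one that only called super.cancel(observer) would behave identically.
  }
}
```

Calling `cancel` on a `TestService` in this sketch reaches the base implementation directly, so removing a delegating override changes nothing observable.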
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92397 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:18 Start Date: 19/Apr/18 04:18 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182630023 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -87,26 +87,32 @@ private ArtifactServiceStager(Channel channel, int bufferSize) { this.bufferSize = bufferSize; } - public void stage(Iterable files) throws IOException, InterruptedException { -final Map> futures = new HashMap<>(); -for (File file : files) { + /** + * Stages the given artifact files to the staging service. + * + * @return The artifact staging token returned by the service + */ + public String stage(Iterable files) throws IOException, InterruptedException { +final Map> futures = new HashMap<>(); +for (StagedFile file : files) { futures.put(file, MoreFutures.supplyAsync(new StagingCallable(file), executorService)); } CompletionStage stagingResult = MoreFutures.allAsList(futures.values()) .thenApply(ignored -> new ExtractStagingResultsCallable(futures).call()); -stageManifest(stagingResult); +return stageManifest(stagingResult); } - private void stageManifest(CompletionStage stagingFuture) + private String stageManifest(CompletionStage stagingFuture) throws InterruptedException { try { StagingResult stagingResult = MoreFutures.get(stagingFuture); if (stagingResult.isSuccess()) { Manifest manifest = Manifest.newBuilder().addAllArtifact(stagingResult.getMetadata()).build(); -blockingStub.commitManifest( -CommitManifestRequest.newBuilder().setManifest(manifest).build()); +return blockingStub + 
.commitManifest(CommitManifestRequest.newBuilder().setManifest(manifest).build()) +.getStagingToken(); Review comment: > **tgroh** wrote: > Separate these two lines? Just a preference thing, but I think that the fluent style hints at something more immediate than the actual RPC we're making here. Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92397) Time Spent: 14.5h (was: 14h 20m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 14.5h > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
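The style point in this thread (giving the blocking RPC its own statement instead of chaining the accessor onto it) can be sketched as below. `StagingSketch`, `CommitResponse`, and `BlockingStub` are hypothetical stand-ins introduced for illustration; they are not the generated Beam classes, and the manifest is reduced to a plain string.

```java
/** Illustrative stand-ins only; none of these types come from the PR or from Beam. */
final class StagingSketch {
  private StagingSketch() {}

  /** Hypothetical response carrying a staging token. */
  static final class CommitResponse {
    private final String stagingToken;

    CommitResponse(String stagingToken) {
      this.stagingToken = stagingToken;
    }

    String getStagingToken() {
      return stagingToken;
    }
  }

  /** Hypothetical blocking stub: each call stands for a network round trip. */
  interface BlockingStub {
    CommitResponse commitManifest(String manifest);
  }

  static String stageManifest(BlockingStub stub, String manifest) {
    // The blocking call gets its own statement so the round trip is visible to readers.
    CommitResponse response = stub.commitManifest(manifest);
    // Reading the token is a plain accessor on the already-received response.
    return response.getStagingToken();
  }
}
```

The behavior is the same as the chained form; the split only makes the point at which the remote call happens easier to spot.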
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92396 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:18 Start Date: 19/Apr/18 04:18 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182630024 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture<State> result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( Review comment: > **tgroh** wrote: > Why is this uninterruptible? To ensure that we sleep fully for the amount of time requested. Given that we don't use interrupts anywhere in Beam code, we don't expect this to happen anyway. Alternatively we could just crash on interrupts (i.e., throw a RuntimeException wrapping the interruption). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92396) Time Spent: 14h 20m (was: 14h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben
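The alternative mentioned in the reply above (crashing on interrupts instead of waiting uninterruptibly) might look roughly like the sketch below. `TimedWait` and `awaitOrCrash` are hypothetical names, not code from the PR; only `java.util.concurrent` is used, and wrapping failures in a `RuntimeException` is an assumption about how the caller would want errors surfaced.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; the name and shape are illustrative assumptions, not code from the PR. */
final class TimedWait {
  private TimedWait() {}

  /**
   * Waits up to the given timeout for the future. If the waiting thread is interrupted, the
   * interrupt flag is restored and a RuntimeException wrapping the interruption is thrown.
   */
  static <T> T awaitOrCrash(Future<T> future, long timeout, TimeUnit unit)
      throws TimeoutException {
    try {
      return future.get(timeout, unit);
    } catch (InterruptedException e) {
      // Restore the flag so callers further up the stack can still observe the interrupt.
      Thread.currentThread().interrupt();
      throw new RuntimeException("Interrupted while waiting for result", e);
    } catch (ExecutionException e) {
      // Surface the failure of the underlying computation.
      throw new RuntimeException(e.getCause());
    }
  }

  public static void main(String[] args) throws TimeoutException {
    // An already-completed future returns without waiting.
    System.out.println(awaitOrCrash(CompletableFuture.completedFuture("DONE"), 1, TimeUnit.SECONDS));
  }
}
```

Compared with an uninterruptible wait, this version gives up as soon as the thread is interrupted, which matches the "crash" option described in the reply rather than the behavior in the diff.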
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92393 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:18 Start Date: 19/Apr/18 04:18 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182630025 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); Review comment: > **tgroh** wrote: > why not just `create`? Mostly to draw attention to the fact that this is an internal call and to distinguish it from `fromOptions`. If you have a strong preference, I'll change it. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92392&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92392 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:18 Start Date: 19/Apr/18 04:18 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182630021 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -222,13 +230,26 @@ public StagingResult call() { } } + /** A file along with a staging name. */ + @AutoValue + public abstract static class StagedFile { Review comment: > **tgroh** wrote: > We create this before staging, right? The name implies after See https://github.com/apache/beam/pull/5150#discussion_r182507804. I'm open to suggestions, but otherwise I'll leave it as is. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92392) Time Spent: 14h (was: 13h 50m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 14h > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92391 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:18 Start Date: 19/Apr/18 04:18 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182630022 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String endpoint = portableOptions.getJobEndpoint(); + +// Deduplicate artifacts. +Set pathsToStage = Sets.newHashSet(); +if (portableOptions.getFile
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92395&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92395 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:18 Start Date: 19/Apr/18 04:18 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182630026 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * {@code try (CloseableResource resource = CloseableResource.of(...)) { + * // Do something with resource. + * ... + * // Then transfer ownership to some consumer. + * resourceConsumer(resource.transfer()); + * } + * } + * + * Not thread-safe. + */ +public class CloseableResource<T> implements AutoCloseable { + + private final T resource; + + /** + * {@link Closer} for the underlying resource. Closers are nullable to allow transfer of + * ownership. However, newly-constructed {@link CloseableResource CloseableResources} must always + * have non-null closers. + */ + @Nullable private Closer<T> closer; + + private boolean isClosed = false; + + private CloseableResource(T resource, Closer<T> closer) { +this.resource = resource; +this.closer = closer; + } + + /** Creates a {@link CloseableResource} with the given resource and closer. */ + public static <T> CloseableResource<T> of(T resource, Closer<T> closer) { +checkArgument(resource != null, "Resource must be non-null"); +checkArgument(closer != null, "%s must be non-null", Closer.class.getName()); +return new CloseableResource<>(resource, closer); + } + + /** Gets the underlying resource. */ + public T get() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +return resource; + } + + /** + * Returns a new {@link CloseableResource} that owns the underlying resource and relinquishes + * ownership from this {@link CloseableResource}. {@link #close()} on the original instance + * becomes a no-op. 
+ */ + public CloseableResource<T> transfer() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +CloseableResource<T> other = CloseableResource.of(resource, closer); +this.closer = null; +return other; + } + + /** + * Closes the underlying resource. The closer will only be executed on the first call. + * + * @throws CloseException wrapping any exceptions thrown while closing + */ + @Override + public void close() throws CloseException { +if (closer != null && !isClosed) { + try { +closer.close(resource); +isClosed = true; + } catch (Exception e) { +// Mark resource as closed even if we catch an exception. +isClosed = true; Review comment: > **tgroh** wrote: > `isClosed` can be in the finally block Done. This is an automated message from the Apache Git Servi
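The suggestion in this thread (setting `isClosed` once in a `finally` block rather than in both the `try` and the `catch`) can be sketched with a simplified stand-in for the class under review. `OwnedResource` and its nested `Closer` are hypothetical names for this illustration; the sketch omits the PR's `CloseException` wrapping and Guava precondition checks and is not the merged code.

```java
/** Simplified stand-in for the class under review; names and details are assumptions. */
final class OwnedResource<T> implements AutoCloseable {

  /** Hypothetical cleanup callback for resources that are not AutoCloseable themselves. */
  interface Closer<T> {
    void close(T resource) throws Exception;
  }

  private final T resource;
  private Closer<T> closer; // null once ownership has been transferred
  private boolean isClosed = false;

  OwnedResource(T resource, Closer<T> closer) {
    this.resource = resource;
    this.closer = closer;
  }

  /** Hands ownership to a new wrapper; close() on this instance then becomes a no-op. */
  OwnedResource<T> transfer() {
    if (closer == null || isClosed) {
      throw new IllegalStateException("Ownership already transferred or resource closed");
    }
    OwnedResource<T> other = new OwnedResource<>(resource, closer);
    this.closer = null;
    return other;
  }

  boolean isClosed() {
    return isClosed;
  }

  @Override
  public void close() throws Exception {
    if (closer != null && !isClosed) {
      try {
        closer.close(resource);
      } finally {
        // Runs on both the normal and the exceptional path, so one assignment suffices.
        isClosed = true;
      }
    }
  }
}
```

With the assignment in `finally`, a closer that throws still leaves the wrapper marked as closed, so a second `close()` call does not run the closer again.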
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92394&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92394 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:18 Start Date: 19/Apr/18 04:18 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182630027 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InMemoryArtifactService.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import com.google.common.collect.Sets; +import com.google.common.io.BaseEncoding; +import io.grpc.Status; +import io.grpc.stub.StreamObserver; +import java.security.MessageDigest; +import java.util.Set; +import java.util.function.Consumer; +import javax.annotation.concurrent.GuardedBy; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A StagingService for tests. Only stores artifact metadata. */ Review comment: > **tgroh** wrote: > We already have one of these, don't we? I was unable to find one. Please link it if it does exist. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92394) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 14h 10m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. 
The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92377&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92377 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629957 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -222,13 +230,26 @@ public StagingResult call() { } } + /** A file along with a staging name. */ + @AutoValue + public abstract static class StagedFile { Review comment: > **tgroh** wrote: > We create this before staging, right? The name implies after See https://github.com/apache/beam/pull/5150#discussion_r182507804. I'm open to suggestions, but otherwise I'll leave it as is. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92377) Time Spent: 12h (was: 11h 50m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 12h > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92384&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92384 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629974 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -222,13 +230,26 @@ public StagingResult call() { } } + /** A file along with a staging name. */ + @AutoValue + public abstract static class StagedFile { Review comment: > **tgroh** wrote: > We create this before staging, right? The name implies after See https://github.com/apache/beam/pull/5150#discussion_r182507804. I'm open to suggestions, but otherwise I'll leave it as is. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92384) Time Spent: 12h 50m (was: 12h 40m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 12h 50m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92382 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629965 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); Review comment: > **tgroh** wrote: > why not just `create`? Mostly to draw attention to the fact that this is an internal call and to distinguish it from `fromOptions`. If you have a strong preference, I'll change it. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92389&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92389 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629980 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InMemoryArtifactService.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import com.google.common.collect.Sets; +import com.google.common.io.BaseEncoding; +import io.grpc.Status; +import io.grpc.stub.StreamObserver; +import java.security.MessageDigest; +import java.util.Set; +import java.util.function.Consumer; +import javax.annotation.concurrent.GuardedBy; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A StagingService for tests. Only stores artifact metadata. */ Review comment: > **tgroh** wrote: > We already have one of these, don't we? I was unable to find one. Please link it if it does exist. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92389) Time Spent: 13.5h (was: 13h 20m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 13.5h > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. 
The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92381&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92381 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629961 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InMemoryArtifactService.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import com.google.common.collect.Sets; +import com.google.common.io.BaseEncoding; +import io.grpc.Status; +import io.grpc.stub.StreamObserver; +import java.security.MessageDigest; +import java.util.Set; +import java.util.function.Consumer; +import javax.annotation.concurrent.GuardedBy; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A StagingService for tests. Only stores artifact metadata. */ Review comment: > **tgroh** wrote: > We already have one of these, don't we? I was unable to find one. Please link it if it does exist. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92381) Time Spent: 12h 20m (was: 12h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 12h 20m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. 
The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92378&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92378 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629958 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String endpoint = portableOptions.getJobEndpoint(); + +// Deduplicate artifacts. +Set pathsToStage = Sets.newHashSet(); +if (portableOptions.getFile
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92386&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92386 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629979 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * {@code try (CloseableResource resource = CloseableResource.of(...)) { + * // Do something with resource. + * ... + * // Then transfer ownership to some consumer. + * resourceConsumer(resource.transfer()); + * } + * } + * + * Not thread-safe. + */ +public class CloseableResource<T> implements AutoCloseable { + + private final T resource; + + /** + * {@link Closer} for the underlying resource. Closers are nullable to allow transfer of + * ownership. However, newly-constructed {@link CloseableResource CloseableResources} must always + * have non-null closers. + */ + @Nullable private Closer<T> closer; + + private boolean isClosed = false; + + private CloseableResource(T resource, Closer<T> closer) { +this.resource = resource; +this.closer = closer; + } + + /** Creates a {@link CloseableResource} with the given resource and closer. */ + public static <T> CloseableResource<T> of(T resource, Closer<T> closer) { +checkArgument(resource != null, "Resource must be non-null"); +checkArgument(closer != null, "%s must be non-null", Closer.class.getName()); +return new CloseableResource<>(resource, closer); + } + + /** Gets the underlying resource. */ + public T get() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +return resource; + } + + /** + * Returns a new {@link CloseableResource} that owns the underlying resource and relinquishes + * ownership from this {@link CloseableResource}. {@link #close()} on the original instance + * becomes a no-op.
+ */ + public CloseableResource<T> transfer() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +CloseableResource<T> other = CloseableResource.of(resource, closer); +this.closer = null; +return other; + } + + /** + * Closes the underlying resource. The closer will only be executed on the first call. + * + * @throws CloseException wrapping any exceptions thrown while closing + */ + @Override + public void close() throws CloseException { +if (closer != null && !isClosed) { + try { +closer.close(resource); +isClosed = true; + } catch (Exception e) { +// Mark resource as closed even if we catch an exception. +isClosed = true; Review comment: > **tgroh** wrote: > `isClosed` can be in the finally block Done. This is an automated message from the Apache Git Servi
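The ownership-transfer pattern described in the Javadoc above, together with the reviewer's suggestion to set `isClosed` in a `finally` block, can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Beam source: the names `OwnedResourceSketch`, `Owned`, and `demo` are hypothetical, plain exceptions stand in for Guava's `checkState`/`checkArgument`, and `close()` simply propagates the closer's exception instead of wrapping it in a `CloseException`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the class under review; not the Beam source. */
class OwnedResourceSketch {

  /** Cleanup callback for a resource that is not itself AutoCloseable. */
  interface Closer<T> {
    void close(T resource) throws Exception;
  }

  static final class Owned<T> implements AutoCloseable {
    private final T resource;
    private Closer<T> closer; // null once ownership has been transferred
    private boolean isClosed = false;

    private Owned(T resource, Closer<T> closer) {
      this.resource = resource;
      this.closer = closer;
    }

    static <T> Owned<T> of(T resource, Closer<T> closer) {
      if (resource == null || closer == null) {
        throw new IllegalArgumentException("resource and closer must be non-null");
      }
      return new Owned<>(resource, closer);
    }

    T get() {
      checkOwnedAndOpen();
      return resource;
    }

    /** Hands the closer to a new wrapper; close() on this instance becomes a no-op. */
    Owned<T> transfer() {
      checkOwnedAndOpen();
      Owned<T> other = new Owned<>(resource, closer);
      this.closer = null;
      return other;
    }

    @Override
    public void close() throws Exception {
      if (closer != null && !isClosed) {
        try {
          closer.close(resource);
        } finally {
          // Marked closed whether or not the closer threw.
          isClosed = true;
        }
      }
    }

    private void checkOwnedAndOpen() {
      if (closer == null) {
        throw new IllegalStateException("ownership has been transferred");
      }
      if (isClosed) {
        throw new IllegalStateException("resource is closed");
      }
    }
  }

  /** Runs the transfer scenario and returns the cleanup events in order. */
  static List<String> demo() throws Exception {
    List<String> events = new ArrayList<>();
    Owned<String> kept;
    try (Owned<String> original = Owned.of("channel", r -> events.add("closed " + r))) {
      kept = original.transfer();
    } // original.close() runs here but does nothing: ownership has moved to kept
    events.add("after try");
    kept.close();
    kept.close(); // second call is a no-op
    return events;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(demo());
  }
}
```

In this sketch the closer runs exactly once, and only after the try-with-resources block has exited, because the block's implicit `close()` acts on an instance that no longer owns the resource.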
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92379 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629959 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -87,26 +87,32 @@ private ArtifactServiceStager(Channel channel, int bufferSize) { this.bufferSize = bufferSize; } - public void stage(Iterable files) throws IOException, InterruptedException { -final Map> futures = new HashMap<>(); -for (File file : files) { + /** + * Stages the given artifact files to the staging service. + * + * @return The artifact staging token returned by the service + */ + public String stage(Iterable files) throws IOException, InterruptedException { +final Map> futures = new HashMap<>(); +for (StagedFile file : files) { futures.put(file, MoreFutures.supplyAsync(new StagingCallable(file), executorService)); } CompletionStage stagingResult = MoreFutures.allAsList(futures.values()) .thenApply(ignored -> new ExtractStagingResultsCallable(futures).call()); -stageManifest(stagingResult); +return stageManifest(stagingResult); } - private void stageManifest(CompletionStage stagingFuture) + private String stageManifest(CompletionStage stagingFuture) throws InterruptedException { try { StagingResult stagingResult = MoreFutures.get(stagingFuture); if (stagingResult.isSuccess()) { Manifest manifest = Manifest.newBuilder().addAllArtifact(stagingResult.getMetadata()).build(); -blockingStub.commitManifest( -CommitManifestRequest.newBuilder().setManifest(manifest).build()); +return blockingStub + 
.commitManifest(CommitManifestRequest.newBuilder().setManifest(manifest).build()) +.getStagingToken(); Review comment: > **tgroh** wrote: > Separate these two lines? Just a preference thing, but I think that the fluent style hints at something more immediate than the actual RPC we're making here. Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92379) Time Spent: 12h 10m (was: 12h) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 12h 10m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
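The style point raised in the review comment above (keeping the blocking RPC on its own line rather than chaining the accessor onto it) might look roughly like the following. This is only a sketch under stated assumptions: `CommitManifestSketch`, `StagingStub`, and the `String` manifest are hypothetical stand-ins for the generated gRPC stub and protobuf types, and the code is not the revision that was actually pushed to the pull request.

```java
/** Hypothetical stand-ins for the generated gRPC types; not the Beam API. */
class CommitManifestSketch {

  static final class CommitManifestResponse {
    private final String stagingToken;

    CommitManifestResponse(String stagingToken) {
      this.stagingToken = stagingToken;
    }

    String getStagingToken() {
      return stagingToken;
    }
  }

  /** In the real service this call would block on a remote round trip. */
  interface StagingStub {
    CommitManifestResponse commitManifest(String manifest);
  }

  static String stageManifest(StagingStub stub, String manifest) {
    // Step 1: the remote call, visibly separate from what follows.
    CommitManifestResponse response = stub.commitManifest(manifest);
    // Step 2: a cheap local accessor on the response.
    return response.getStagingToken();
  }

  public static void main(String[] args) {
    StagingStub fake = manifest -> new CommitManifestResponse("token-for-" + manifest);
    System.out.println(stageManifest(fake, "m1"));
  }
}
```

Splitting the statement changes no behavior; it only makes the potentially slow or failing step easier to spot when reading or debugging.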
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92387&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92387 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629977 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -87,26 +87,32 @@ private ArtifactServiceStager(Channel channel, int bufferSize) { this.bufferSize = bufferSize; } - public void stage(Iterable files) throws IOException, InterruptedException { -final Map> futures = new HashMap<>(); -for (File file : files) { + /** + * Stages the given artifact files to the staging service. + * + * @return The artifact staging token returned by the service + */ + public String stage(Iterable files) throws IOException, InterruptedException { +final Map> futures = new HashMap<>(); +for (StagedFile file : files) { futures.put(file, MoreFutures.supplyAsync(new StagingCallable(file), executorService)); } CompletionStage stagingResult = MoreFutures.allAsList(futures.values()) .thenApply(ignored -> new ExtractStagingResultsCallable(futures).call()); -stageManifest(stagingResult); +return stageManifest(stagingResult); } - private void stageManifest(CompletionStage stagingFuture) + private String stageManifest(CompletionStage stagingFuture) throws InterruptedException { try { StagingResult stagingResult = MoreFutures.get(stagingFuture); if (stagingResult.isSuccess()) { Manifest manifest = Manifest.newBuilder().addAllArtifact(stagingResult.getMetadata()).build(); -blockingStub.commitManifest( -CommitManifestRequest.newBuilder().setManifest(manifest).build()); +return blockingStub + 
.commitManifest(CommitManifestRequest.newBuilder().setManifest(manifest).build()) +.getStagingToken(); Review comment: > **tgroh** wrote: > Separate these two lines? Just a preference thing, but I think that the fluent style hints at something more immediate than the actual RPC we're making here. Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92387) Time Spent: 13h 10m (was: 13h) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 13h 10m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92383&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92383 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629963 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * {@code try (CloseableResource resource = CloseableResource.of(...)) { + * // Do something with resource. + * ... + * // Then transfer ownership to some consumer. + * resourceConsumer(resource.transfer()); + * } + * } + * + * Not thread-safe. + */ +public class CloseableResource<T> implements AutoCloseable { + + private final T resource; + + /** + * {@link Closer} for the underlying resource. Closers are nullable to allow transfer of + * ownership. However, newly-constructed {@link CloseableResource CloseableResources} must always + * have non-null closers. + */ + @Nullable private Closer<T> closer; + + private boolean isClosed = false; + + private CloseableResource(T resource, Closer<T> closer) { +this.resource = resource; +this.closer = closer; + } + + /** Creates a {@link CloseableResource} with the given resource and closer. */ + public static <T> CloseableResource<T> of(T resource, Closer<T> closer) { +checkArgument(resource != null, "Resource must be non-null"); +checkArgument(closer != null, "%s must be non-null", Closer.class.getName()); +return new CloseableResource<>(resource, closer); + } + + /** Gets the underlying resource. */ + public T get() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +return resource; + } + + /** + * Returns a new {@link CloseableResource} that owns the underlying resource and relinquishes + * ownership from this {@link CloseableResource}. {@link #close()} on the original instance + * becomes a no-op.
+ */ + public CloseableResource<T> transfer() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +CloseableResource<T> other = CloseableResource.of(resource, closer); +this.closer = null; +return other; + } + + /** + * Closes the underlying resource. The closer will only be executed on the first call. + * + * @throws CloseException wrapping any exceptions thrown while closing + */ + @Override + public void close() throws CloseException { +if (closer != null && !isClosed) { + try { +closer.close(resource); +isClosed = true; + } catch (Exception e) { +// Mark resource as closed even if we catch an exception. +isClosed = true; Review comment: > **tgroh** wrote: > `isClosed` can be in the finally block Done. This is an automated message from the Apache Git Servi
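The "`isClosed` can be in the finally block" suggestion can be illustrated with a small, self-contained sketch. `OnceCloseable` and its nested `Closer` are hypothetical names invented for this example and are deliberately simpler than the `CloseableResource` class under review (no ownership transfer, no `CloseException` wrapping).

```java
/** Hypothetical minimal holder; not the CloseableResource class from the pull request. */
final class OnceCloseable<T> implements AutoCloseable {

  /** Hypothetical cleanup callback for a resource that is not itself AutoCloseable. */
  interface Closer<T> {
    void close(T resource) throws Exception;
  }

  private final T resource;
  private final Closer<T> closer;
  private boolean isClosed = false;

  OnceCloseable(T resource, Closer<T> closer) {
    this.resource = resource;
    this.closer = closer;
  }

  boolean isClosed() {
    return isClosed;
  }

  @Override
  public void close() throws Exception {
    if (isClosed) {
      return;
    }
    try {
      closer.close(resource);
    } finally {
      // Runs on both normal and exceptional exit, so the flag is set exactly once
      // and the closer is never retried after a failure.
      isClosed = true;
    }
  }
}
```

Setting the flag in `finally` removes the duplicated assignment in the `try` and `catch` branches while keeping the "mark closed even on failure" behavior.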
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92380&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92380 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629960 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( Review comment: > **tgroh** wrote: > Why is this uninterruptible? To ensure that we sleep fully for the amount of time requested. Given that we don't use interrupts anywhere in Beam code, we don't expect this to happen anyway. Alternatively we could just crash on interrupts (i.e., throw a RuntimeException wrapping the interruption). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92380) Time Spent: 12h 20m (was: 12h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben
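For readers unfamiliar with what "uninterruptible" means in this exchange, the following standard-library sketch shows the general idea of a timed wait that keeps waiting across interrupts and restores the interrupt flag afterwards. It is an approximation written for this note: `UninterruptibleGet` is a hypothetical helper, not Guava's `Uninterruptibles` and not code from the pull request.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper illustrating an interrupt-tolerant timed wait. */
final class UninterruptibleGet {
  private UninterruptibleGet() {}

  static <V> V get(Future<V> future, long timeout, TimeUnit unit)
      throws ExecutionException, TimeoutException {
    boolean interrupted = false;
    long remainingNanos = unit.toNanos(timeout);
    final long deadline = System.nanoTime() + remainingNanos;
    try {
      while (true) {
        try {
          return future.get(remainingNanos, TimeUnit.NANOSECONDS);
        } catch (InterruptedException e) {
          // Remember the interrupt and keep waiting for whatever time is left.
          interrupted = true;
          remainingNanos = deadline - System.nanoTime();
        }
      }
    } finally {
      if (interrupted) {
        // Restore the flag so callers further up the stack can still observe it.
        Thread.currentThread().interrupt();
      }
    }
  }
}
```

The alternative mentioned in the reply (failing fast on interruption) would instead catch `InterruptedException` once and rethrow it wrapped in a `RuntimeException`.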
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92390&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92390 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629976 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String endpoint = portableOptions.getJobEndpoint(); + +// Deduplicate artifacts. +Set pathsToStage = Sets.newHashSet(); +if (portableOptions.getFile
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92388 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629978 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( Review comment: > **tgroh** wrote: > Why is this uninterruptible? To ensure that we sleep fully for the amount of time requested. Given that we don't use interrupts anywhere in Beam code, we don't expect this to happen anyway. Alternatively we could just crash on interrupts (i.e., throw a RuntimeException wrapping the interruption). This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92388) Time Spent: 13h 20m (was: 13h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92385&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92385 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 04:17 Start Date: 19/Apr/18 04:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182629975 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); Review comment: > **tgroh** wrote: > why not just `create`? Mostly to draw attention to the fact that this is an internal call and to distinguish it from `fromOptions`. If you have a strong preference, I'll change it. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above
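The split discussed here (a public `fromOptions` entry point delegating to an internal factory that accepts an injectable dependency) can be sketched in isolation as follows. Everything in this example is hypothetical: `ChannelFactory` and `DemoRunner` are invented for illustration and do not mirror Beam's `ManagedChannelFactory` or `PortableRunner` signatures.

```java
/** Hypothetical stand-in for a channel factory; not Beam's ManagedChannelFactory. */
interface ChannelFactory {
  String open(String endpoint);
}

/** Hypothetical runner showing a public factory that delegates to an internal one. */
final class DemoRunner {
  private final String endpoint;
  private final ChannelFactory channelFactory;

  private DemoRunner(String endpoint, ChannelFactory channelFactory) {
    this.endpoint = endpoint;
    this.channelFactory = channelFactory;
  }

  /** Public entry point: callers supply configuration only and get the default factory. */
  static DemoRunner fromOptions(String endpoint) {
    return createInternal(endpoint, target -> "default-channel:" + target);
  }

  /** Internal factory: the distinct name signals that it exists so tests can inject a fake. */
  static DemoRunner createInternal(String endpoint, ChannelFactory channelFactory) {
    if (endpoint == null || endpoint.isEmpty()) {
      throw new IllegalArgumentException("endpoint must be set");
    }
    return new DemoRunner(endpoint, channelFactory);
  }

  String connect() {
    return channelFactory.open(endpoint);
  }
}
```

A test would call `createInternal` with a fake factory, while production code goes through `fromOptions`; whether the internal method is named `create` or `createInternal` is purely the naming preference debated in the comment.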
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92331&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92331 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 01:34 Start Date: 19/Apr/18 01:34 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182611526 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -87,26 +87,32 @@ private ArtifactServiceStager(Channel channel, int bufferSize) { this.bufferSize = bufferSize; } - public void stage(Iterable files) throws IOException, InterruptedException { -final Map> futures = new HashMap<>(); -for (File file : files) { + /** + * Stages the given artifact files to the staging service. + * + * @return The artifact staging token returned by the service + */ + public String stage(Iterable files) throws IOException, InterruptedException { +final Map> futures = new HashMap<>(); +for (StagedFile file : files) { futures.put(file, MoreFutures.supplyAsync(new StagingCallable(file), executorService)); } CompletionStage stagingResult = MoreFutures.allAsList(futures.values()) .thenApply(ignored -> new ExtractStagingResultsCallable(futures).call()); -stageManifest(stagingResult); +return stageManifest(stagingResult); } - private void stageManifest(CompletionStage stagingFuture) + private String stageManifest(CompletionStage stagingFuture) throws InterruptedException { try { StagingResult stagingResult = MoreFutures.get(stagingFuture); if (stagingResult.isSuccess()) { Manifest manifest = Manifest.newBuilder().addAllArtifact(stagingResult.getMetadata()).build(); -blockingStub.commitManifest( -CommitManifestRequest.newBuilder().setManifest(manifest).build()); +return blockingStub + 
.commitManifest(CommitManifestRequest.newBuilder().setManifest(manifest).build()) +.getStagingToken(); Review comment: Separate these two lines? Just a preference thing, but I think that the fluent style hints at something more immediate than the actual RPC we're making here. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92331) Time Spent: 11h (was: 10h 50m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 11h > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92338 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 01:34 Start Date: 19/Apr/18 01:34 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182611652 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -222,13 +225,26 @@ public StagingResult call() { } } + /** A file along with a staging name. */ + @AutoValue + public abstract static class FileToStage { +public static FileToStage of(File file, String stageName) { + return new AutoValue_ArtifactServiceStager_FileToStage(file, stageName); +} + +/** The file to stage. */ +public abstract File getFile(); +/** Staging handle to this file. */ +public abstract String getStageName(); Review comment: "StagingName"? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92338) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 11h 40m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. 
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92334&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92334 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 01:34 Start Date: 19/Apr/18 01:34 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182613993 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String endpoint = portableOptions.getJobEndpoint(); + +// Deduplicate artifacts. +Set pathsToStage = Sets.newHashSet(); +if (portableOptions.getFilesT
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92333&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92333 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 01:34 Start Date: 19/Apr/18 01:34 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182613920 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ Review comment: They do, but always with the same options. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92333) Time Spent: 11h 20m (was: 11h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 11h 20m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indic
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92336&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92336 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 01:34 Start Date: 19/Apr/18 01:34 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182613945 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,245 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.util.Collection; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.StagedFile; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.util.ZipFiles; +import org.slf4j.Logger; +import 
org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +return createInternal(options, ManagedChannelFactory.createDefault()); Review comment: why not just `create`? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92336) Ti
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92332 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 01:34 Start Date: 19/Apr/18 01:34 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182613729 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,120 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * {@code try (CloseableResource resource = CloseableResource.of(...)) { + * // Do something with resource. + * ... + * // Then transfer ownership to some consumer. + * resourceConsumer(resource.transfer()); + * } + * } + * + * Not thread-safe. + */ +public class CloseableResource implements AutoCloseable { + + private final T resource; + + /** + * {@link Closer } for the underlying resource. Closers are nullable to allow transfer of + * ownership. However, newly-constructed {@link CloseableResource CloseableResources} must always + * have non-null closers. + */ + @Nullable private Closer closer; + + private boolean isClosed = false; + + private CloseableResource(T resource, Closer closer) { +this.resource = resource; +this.closer = closer; + } + + /** Creates a {@link CloseableResource} with the given resource and closer. */ + public static CloseableResource of(T resource, Closer closer) { +checkArgument(resource != null, "Resource must be non-null"); +checkArgument(closer != null, "%s must be non-null", Closer.class.getName()); +return new CloseableResource<>(resource, closer); + } + + /** Gets the underlying resource. */ + public T get() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +return resource; + } + + /** + * Returns a new {@link CloseableResource} that owns the underlying resource and relinquishes + * ownership from this {@link CloseableResource}. {@link #close()} on the original instance + * becomes a no-op. 
+ */ + public CloseableResource transfer() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +CloseableResource other = CloseableResource.of(resource, closer); +this.closer = null; +return other; + } + + /** + * Closes the underlying resource. The closer will only be executed on the first call. + * + * @throws CloseException wrapping any exceptions thrown while closing + */ + @Override + public void close() throws CloseException { +if (closer != null && !isClosed) { + try { +closer.close(resource); +isClosed = true; + } catch (Exception e) { +// Mark resource as closed even if we catch an exception. +isClosed = true; Review comment: `isClosed` can be in the finally block This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub
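The suggestion in the review comment above (set the closed flag once, in a `finally` block, instead of in both the `try` and the `catch` branch) could look roughly like the sketch below. `OnceCloser` is a hypothetical stand-in written for this illustration, not the `CloseableResource` class from the pull request, and it uses a plain `Runnable` as the cleanup callback as a simplifying assumption.

```java
/** Hypothetical stand-in for the wrapper under review; not the Beam class. */
final class OnceCloser implements AutoCloseable {
  // Assumption for this sketch: cleanup is a Runnable that may throw unchecked exceptions.
  private final Runnable cleanup;
  private boolean isClosed = false;

  OnceCloser(Runnable cleanup) {
    this.cleanup = cleanup;
  }

  boolean isClosed() {
    return isClosed;
  }

  @Override
  public void close() {
    if (isClosed) {
      // Later calls are no-ops, so the cleanup runs at most once.
      return;
    }
    try {
      cleanup.run();
    } finally {
      // Executed on the normal path and when cleanup throws,
      // so the flag does not need to be set in two places.
      isClosed = true;
    }
  }
}
```

With this shape an exception thrown by the cleanup still propagates to the caller, while the resource is marked closed either way.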
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92337&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92337 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 01:34 Start Date: 19/Apr/18 01:34 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182611630 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -222,13 +230,26 @@ public StagingResult call() { } } + /** A file along with a staging name. */ + @AutoValue + public abstract static class StagedFile { Review comment: We create this before staging, right? The name implies after This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92337) Time Spent: 11h 40m (was: 11.5h) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 11h 40m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92339 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 01:34 Start Date: 19/Apr/18 01:34 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182613813 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,141 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( Review comment: Why is this uninterruptible? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92339) Time Spent: 11h 50m (was: 11h 40m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 11h 50m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job AP
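For context on the question above about the uninterruptible wait: an interruptible alternative would be to call `CompletableFuture.get(timeout, unit)` directly and restore the interrupt flag if the waiting thread is interrupted. The sketch below uses only the JDK. `TimedWait`, `waitInterruptibly`, and the `fallback` value returned on timeout are hypothetical names chosen for this illustration; the quoted diff is cut off before it shows what the pull request actually returns on timeout.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of the pull request. */
final class TimedWait {
  private TimedWait() {}

  /**
   * Runs blockingCall on the common pool and waits up to timeoutMillis for it.
   * Returns fallback on timeout or if the waiting thread is interrupted.
   */
  static <T> T waitInterruptibly(Supplier<T> blockingCall, long timeoutMillis, T fallback) {
    CompletableFuture<T> result = CompletableFuture.supplyAsync(blockingCall);
    try {
      return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // The background call keeps running; only the wait is abandoned.
      return fallback;
    } catch (InterruptedException e) {
      // Preserve the interrupt for callers further up the stack.
      Thread.currentThread().interrupt();
      return fallback;
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    }
  }
}
```

Whether swallowing or surfacing the interrupt is the right behavior for `waitUntilFinish(Duration)` is the design question the reviewer raises; the sketch only shows what the interruptible variant looks like.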
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92335&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92335 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 01:34 Start Date: 19/Apr/18 01:34 Worklog Time Spent: 10m Work Description: tgroh commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182614126 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InMemoryArtifactService.java ## @@ -0,0 +1,159 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import com.google.common.collect.Sets; +import com.google.common.io.BaseEncoding; +import io.grpc.Status; +import io.grpc.stub.StreamObserver; +import java.security.MessageDigest; +import java.util.Set; +import java.util.function.Consumer; +import javax.annotation.concurrent.GuardedBy; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A StagingService for tests. Only stores artifact metadata. */ Review comment: We already have one of these, don't we? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92335) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 11.5h > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. 
The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92320&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92320 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 19/Apr/18 00:44 Start Date: 19/Apr/18 00:44 Worklog Time Spent: 10m Work Description: bsidhom commented on issue #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#issuecomment-382573160 Cleaned up commit history. @tgroh PTAL This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92320) Time Spent: 10h 50m (was: 10h 40m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 10h 50m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92303&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92303 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 23:53 Start Date: 18/Apr/18 23:53 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182601780 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * try (CloseableResource resource = CloseableResource.of(...)) { Review comment: > **jkff** wrote: > {@code > ... > } > could help you avoid the ampersands. I didn't realize `@code` worked with multiline text. Done. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92303) Time Spent: 10h 20m (was: 10h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 10h 20m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
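The ownership-transfer pattern described in the Javadoc quoted above can be sketched in a self-contained form as follows. `Owned` is a simplified, hypothetical stand-in written for this illustration (it uses `java.util.function.Consumer` as the closer and omits the closed-state checks); it is not the `CloseableResource` class from the pull request.

```java
import java.util.function.Consumer;

/** Simplified, hypothetical stand-in for the wrapper described above; not the Beam class. */
final class Owned<T> implements AutoCloseable {
  private final T resource;
  // Null once ownership has been transferred away from this instance.
  private Consumer<T> closer;

  private Owned(T resource, Consumer<T> closer) {
    this.resource = resource;
    this.closer = closer;
  }

  static <T> Owned<T> of(T resource, Consumer<T> closer) {
    return new Owned<>(resource, closer);
  }

  T get() {
    if (closer == null) {
      throw new IllegalStateException("ownership has been transferred or resource is closed");
    }
    return resource;
  }

  /** Hands the closer to a new wrapper; close() on this instance becomes a no-op. */
  Owned<T> transfer() {
    if (closer == null) {
      throw new IllegalStateException("ownership has been transferred or resource is closed");
    }
    Owned<T> other = new Owned<>(resource, closer);
    closer = null;
    return other;
  }

  @Override
  public void close() {
    if (closer != null) {
      Consumer<T> c = closer;
      closer = null;
      c.accept(resource);
    }
  }
}
```

Inside a try-with-resources block, calling `transfer()` before the block exits means the automatic `close()` on the original wrapper does nothing, and the recipient of the returned wrapper becomes responsible for cleanup. If an exception is thrown before `transfer()` is reached, the original wrapper still closes the resource.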
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92305&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92305 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 23:53 Start Date: 18/Apr/18 23:53 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182601783 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InMemoryArtifactService.java ## @@ -0,0 +1,182 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import com.google.common.collect.Maps; +import com.google.common.collect.Sets; +import com.google.common.io.BaseEncoding; +import com.google.protobuf.ByteString; +import io.grpc.Status; +import io.grpc.stub.StreamObserver; +import java.security.MessageDigest; +import java.util.Map; +import java.util.Set; +import java.util.function.BiConsumer; +import javax.annotation.concurrent.GuardedBy; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +// TODO: Implement artifact retrieval. +/** A StagingService for tests. */ +public class InMemoryArtifactService extends ArtifactStagingServiceImplBase { + + private static final Logger LOG = LoggerFactory.getLogger(InMemoryArtifactService.class); + + private final Object artifactLock = new Object(); + + @GuardedBy("artifactLock") + private final Map artifacts = Maps.newHashMap(); + + private final boolean keepArtifacts; + + @GuardedBy("artifactLock") + private boolean committed = false; + + /** + * Constructs an {@link InMemoryArtifactService}. If {@code keepArtifacts} is true, all artifacts + * are kept in memory as {@link ByteString ByteStrings}. Doing so is currently a waste of space Review comment: > **jkff** wrote: > Hm then can you just remove this parameter? 
If setting it to true currently accomplishes no useful purpose (it's fine to keep if you plan to very soon follow up this PR with something that makes it have a useful purpose) I've thought about it a bit and realized that we do not want to actually serve artifacts here. That functionality is better handled by a ValidatesRunner integration test. Removed. Issue Time Tracking --- Worklog Id: (was: 92305) Time Spent: 10.5h (was: 10h 20m)
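As an illustration of the decision in this comment (a test staging service that does not retain artifact bytes), here is a rough sketch in plain Java. It is not the PR's `InMemoryArtifactService`: the class `InMemoryStagingRecorder`, its method names, and the choice to keep only an MD5 hex digest per artifact are all assumptions made for the example, and the gRPC plumbing is left out entirely.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical test double: records artifact digests, never the artifact bytes. */
final class InMemoryStagingRecorder {
  private final Object lock = new Object();
  // Both fields below are guarded by `lock`.
  private final Map<String, String> digests = new HashMap<>();
  private boolean committed = false;

  /** Records the digest of one artifact; the contents are discarded afterwards. */
  void put(String name, byte[] contents) {
    String md5 = md5Hex(contents);
    synchronized (lock) {
      if (committed) {
        throw new IllegalStateException("manifest already committed");
      }
      if (digests.putIfAbsent(name, md5) != null) {
        throw new IllegalArgumentException("duplicate artifact: " + name);
      }
    }
  }

  /** Marks staging as finished and returns a snapshot of name -> digest. */
  Map<String, String> commit() {
    synchronized (lock) {
      if (committed) {
        throw new IllegalStateException("manifest already committed");
      }
      committed = true;
      return new HashMap<>(digests);
    }
  }

  private static String md5Hex(byte[] data) {
    try {
      byte[] digest = MessageDigest.getInstance("MD5").digest(data);
      StringBuilder hex = new StringBuilder();
      for (byte b : digest) {
        hex.append(String.format("%02x", b & 0xff));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // MD5 is required to be present on every Java platform.
      throw new IllegalStateException(e);
    }
  }
}
```

With no flag for retaining contents there is no unused code path left to explain, which appears to be the simplification the reviewer was asking for.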
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92304 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 23:53 Start Date: 18/Apr/18 23:53 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182601778 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * try (CloseableResourceresource = CloseableResource.of(...)) { + * // Do something with resource. + * ... + * // Then transfer ownership to some consumer. + * resourceConsumer(resource.release()); Review comment: > **jkff** wrote: > Did you mean .transfer()? Done. Issue Time Tracking --- Worklog Id: (was: 92304) Time Spent: 10h 20m (was: 10h 10m)
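The ownership-transfer pattern described in the quoted Javadoc can be sketched as follows. This is a simplified stand-in, not the PR's `CloseableResource`: the names `OwnedResource`, `Closer` and `TransferDemo` are hypothetical, and the cleanup callback is assumed not to throw checked exceptions so that the example stays short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical cleanup callback; the interface used in the PR may differ. */
interface Closer<T> {
  void close(T resource);
}

/** Sketch of an AutoCloseable wrapper whose ownership can be handed off. */
final class OwnedResource<T> implements AutoCloseable {
  private T resource; // null once transferred or closed
  private Closer<T> closer;

  private OwnedResource(T resource, Closer<T> closer) {
    this.resource = resource;
    this.closer = closer;
  }

  static <T> OwnedResource<T> of(T resource, Closer<T> closer) {
    return new OwnedResource<>(
        Objects.requireNonNull(resource), Objects.requireNonNull(closer));
  }

  T get() {
    if (resource == null) {
      throw new IllegalStateException("resource was transferred or closed");
    }
    return resource;
  }

  /** Moves the resource into a new wrapper; close() on this wrapper becomes a no-op. */
  OwnedResource<T> transfer() {
    OwnedResource<T> moved = new OwnedResource<>(get(), closer);
    resource = null;
    closer = null;
    return moved;
  }

  @Override
  public void close() {
    if (resource == null) {
      return; // ownership was given away, or close() already ran
    }
    T toClose = resource;
    Closer<T> toRun = closer;
    resource = null;
    closer = null;
    toRun.close(toClose);
  }
}

final class TransferDemo {
  /** Opens a resource, uses it, and hands ownership to the caller. */
  static OwnedResource<String> open(List<String> log) {
    try (OwnedResource<String> r =
        OwnedResource.of("channel", name -> log.add("closed " + name))) {
      log.add("using " + r.get());
      // If anything above had thrown, the try block would have closed the resource.
      return r.transfer();
    }
  }

  static List<String> run() {
    List<String> log = new ArrayList<>();
    OwnedResource<String> kept = open(log);
    log.add("after try");
    kept.close();
    return log;
  }
}
```

In this sketch `TransferDemo.run()` records the cleanup only when the final owner closes the wrapper, after the try block has already exited, which is the behaviour the Javadoc describes for `transfer()`.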
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92307&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92307 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 23:53 Start Date: 18/Apr/18 23:53 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182601789 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ Review comment: > **jkff** wrote: > Hm okay. I looked at existing PipelineRunner implementations and was unable to figure out whether they do or don't intend to support running multiple pipelines with the same PipelineRunner. We can deal with this later if necessary. I verified with @tgroh that they do not. Unfortunately, `PipelineOptions` are effectively global flags, so users can't meaningfully instantiate different options anyway. Issue Time Tracking --- Worklog Id: (was: 92307) Time Spent: 10h 40m (was: 10.5h)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92306&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92306 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 23:53 Start Date: 18/Apr/18 23:53 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182601788 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +if (true) { + throw new UnsupportedOperationException(); +} +return createInternal(options, new DirectoryZipper(), getChannelFactory(options)); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, DirectoryZipper zipper, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String end
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92272&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92272 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 22:51 Start Date: 18/Apr/18 22:51 Worklog Time Spent: 10m Work Description: jkff commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182592002 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ Review comment: > **bsidhom** wrote: > I would have gladly done that, but that doesn't fit with the `PipelineRunner` construction. As mentioned above, the fit between a Java runner and a portable runner service isn't great. Hm okay. I looked at existing PipelineRunner implementations and was unable to figure out whether they do or don't intend to support running multiple pipelines with the same PipelineRunner. We can deal with this later if necessary. Issue Time Tracking --- Worklog Id: (was: 92272) Time Spent: 10h (was: 9h 50m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92273&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92273 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 22:51 Start Date: 18/Apr/18 22:51 Worklog Time Spent: 10m Work Description: jkff commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182592003 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +if (true) { + throw new UnsupportedOperationException(); +} +return createInternal(options, new DirectoryZipper(), getChannelFactory(options)); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, DirectoryZipper zipper, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String endpoi
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92269&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92269 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 22:51 Start Date: 18/Apr/18 22:51 Worklog Time Spent: 10m Work Description: jkff commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182591997 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * try (CloseableResourceresource = CloseableResource.of(...)) { Review comment: {@code ... } could help you avoid the ampersands. Issue Time Tracking --- Worklog Id: (was: 92269) Time Spent: 9.5h (was: 9h 20m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92270&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92270 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 22:51 Start Date: 18/Apr/18 22:51 Worklog Time Spent: 10m Work Description: jkff commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182591998 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * try (CloseableResourceresource = CloseableResource.of(...)) { + * // Do something with resource. + * ... + * // Then transfer ownership to some consumer. + * resourceConsumer(resource.release()); Review comment: Did you mean .transfer()? Issue Time Tracking --- Worklog Id: (was: 92270) Time Spent: 9h 40m (was: 9.5h)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92271&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92271 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 22:51 Start Date: 18/Apr/18 22:51 Worklog Time Spent: 10m Work Description: jkff commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182592000 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InMemoryArtifactService.java ## @@ -0,0 +1,182 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import com.google.common.collect.Maps; +import com.google.common.collect.Sets; +import com.google.common.io.BaseEncoding; +import com.google.protobuf.ByteString; +import io.grpc.Status; +import io.grpc.stub.StreamObserver; +import java.security.MessageDigest; +import java.util.Map; +import java.util.Set; +import java.util.function.BiConsumer; +import javax.annotation.concurrent.GuardedBy; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +// TODO: Implement artifact retrieval. +/** A StagingService for tests. */ +public class InMemoryArtifactService extends ArtifactStagingServiceImplBase { + + private static final Logger LOG = LoggerFactory.getLogger(InMemoryArtifactService.class); + + private final Object artifactLock = new Object(); + + @GuardedBy("artifactLock") + private final Map artifacts = Maps.newHashMap(); + + private final boolean keepArtifacts; + + @GuardedBy("artifactLock") + private boolean committed = false; + + /** + * Constructs an {@link InMemoryArtifactService}. If {@code keepArtifacts} is true, all artifacts + * are kept in memory as {@link ByteString ByteStrings}. Doing so is currently a waste of space Review comment: Hm then can you just remove this parameter? 
That is, if setting it to true currently serves no useful purpose. (It's fine to keep it if you plan to follow up this PR very soon with something that gives it one.) This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92271) Time Spent: 9h 50m (was: 9h 40m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 9h 50m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92268&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92268 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 22:51 Start Date: 18/Apr/18 22:51 Worklog Time Spent: 10m Work Description: jkff commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182591993 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * try (CloseableResourceresource = CloseableResource.of(...)) { + * // Do something with resource. + * ... + * // Then transfer ownership to some consumer. + * resourceConsumer(resource.release()); + * } + * + * Not thread-safe. + */ +public class CloseableResource implements AutoCloseable { + + private final T resource; + + /** + * {@link Closer } for the underlying resource. Closers are nullable to allow transfer of + * ownership. However, newly-constructed {@link CloseableResource CloseableResources} must always + * have non-null closers. + */ + @Nullable private Closer closer; + + private boolean isClosed = false; + + private CloseableResource(T resource, Closer closer) { +this.resource = resource; +this.closer = closer; + } + + /** Creates a {@link CloseableResource} with the given resource and closer. */ + public static CloseableResource of(T resource, Closer closer) { +checkArgument(resource != null, "Resource must be non-null"); +checkArgument(closer != null, "%s must be non-null", Closer.class.getName()); +return new CloseableResource<>(resource, closer); + } + + /** Gets the underlying resource. */ + public T get() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +return resource; + } + + /** + * Returns a new {@link CloseableResource} that owns the underlying resource and relinquishes + * ownership from this {@link CloseableResource}. {@link #close()} on the original instance + * becomes a no-op.
+ */ + public CloseableResource transfer() { Review comment: > **bsidhom** wrote: > It would be really nice to have the compiler enforce that existing `CloseableResource` references are never passed as arguments but that they are instead always `transfer()`'d in. I'm envisioning something like safe RAII move semantics in C++. Is there a FindBugs check that could enforce this? There's no FindBugs check that can do that, AFAIK. It should be possible to write such a check as a compiler plugin, but practically speaking I don't think we will :-/ This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92268) Time Spent: 9h 20m (was: 9h 10m) > Portable Runner Job API shim > > >
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92195&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92195 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 18:08 Start Date: 18/Apr/18 18:08 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182507804 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -222,13 +225,26 @@ public StagingResult call() { } } + /** A file along with a staging name. */ + @AutoValue + public abstract static class FileToStage { Review comment: I think the name `StagedFile` is a bit misleading because it hasn't yet been staged. However, `FileToStage` is an unusual pattern and a bit unwieldy. I've switched to `StagedFile`, but let me know if you have other suggestions. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92195) Time Spent: 8h 40m (was: 8.5h) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 8h 40m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92194&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92194 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 18:08 Start Date: 18/Apr/18 18:08 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182505797 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -87,26 +87,27 @@ private ArtifactServiceStager(Channel channel, int bufferSize) { this.bufferSize = bufferSize; } - public void stage(Iterable files) throws IOException, InterruptedException { -final Map> futures = new HashMap<>(); -for (File file : files) { + public String stage(Iterable files) throws IOException, InterruptedException { +final Map> futures = new HashMap<>(); +for (FileToStage file : files) { futures.put(file, MoreFutures.supplyAsync(new StagingCallable(file), executorService)); } CompletionStage stagingResult = MoreFutures.allAsList(futures.values()) .thenApply(ignored -> new ExtractStagingResultsCallable(futures).call()); -stageManifest(stagingResult); +return stageManifest(stagingResult); } - private void stageManifest(CompletionStage stagingFuture) + private String stageManifest(CompletionStage stagingFuture) throws InterruptedException { try { StagingResult stagingResult = MoreFutures.get(stagingFuture); Review comment: Good idea. Created https://issues.apache.org/jira/browse/BEAM-4116. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92194) Time Spent: 8.5h (was: 8h 20m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 8.5h > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
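The shape of the change in the diff above (stage every file asynchronously, wait for all of them, then return a single string from the manifest step) can be sketched with the JDK's `CompletableFuture`. This is only an illustration under stated assumptions: it does not use Beam's `MoreFutures` or the real staging service, `stageAll` and `demo` are hypothetical names, the per-file "staging" is faked by measuring the name length, and the `token-...` format is invented for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class StageAllSketch {
  /** Stages each name asynchronously, then derives one (made-up) token from all results. */
  static String stageAll(List<String> names, ExecutorService executor) {
    Map<String, CompletableFuture<Integer>> futures = new LinkedHashMap<>();
    for (String name : names) {
      // Stand-in for the real upload: the "result" is just the length of the name.
      futures.put(name, CompletableFuture.supplyAsync(() -> name.length(), executor));
    }
    CompletableFuture<Void> all =
        CompletableFuture.allOf(futures.values().toArray(new CompletableFuture[0]));
    return all.thenApply(
            ignored -> {
              int total = 0;
              for (CompletableFuture<Integer> f : futures.values()) {
                total += f.join(); // already complete once allOf has completed
              }
              // Hypothetical token format; the real value comes from the staging service.
              return "token-" + futures.size() + "-" + total;
            })
        .join();
  }

  static String demo() {
    ExecutorService executor = Executors.newFixedThreadPool(2);
    try {
      return stageAll(Arrays.asList("a.jar", "bb.jar"), executor);
    } finally {
      executor.shutdown();
    }
  }
}
```

Returning the value from the combined future, rather than discarding it as the earlier `void` signature did, is what lets a caller pass the result on to a later request.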
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92196&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92196 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 18:08 Start Date: 18/Apr/18 18:08 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182510801 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +if (true) { + throw new UnsupportedOperationException(); +} +return createInternal(options, new DirectoryZipper(), getChannelFactory(options)); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, DirectoryZipper zipper, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String end
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92197&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92197 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 18:08 Start Date: 18/Apr/18 18:08 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182519308 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,119 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import javax.annotation.Nullable; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. + * + * Recipients of a {@link CloseableResource} are in general responsible for cleanup. Ownership + * can be transferred from one context to another via {@link #transfer()}. 
Transferring relinquishes + * ownership from the original resource. This allows resources to be safely constructed and + * transferred within a try-with-resources block. For example: + * + * try (CloseableResourceresource = CloseableResource.of(...)) { + * // Do something with resource. + * ... + * // Then transfer ownership to some consumer. + * resourceConsumer(resource.release()); + * } + * + * Not thread-safe. + */ +public class CloseableResource implements AutoCloseable { + + private final T resource; + + /** + * {@link Closer } for the underlying resource. Closers are nullable to allow transfer of + * ownership. However, newly-constructed {@link CloseableResource CloseableResources} must always + * have non-null closers. + */ + @Nullable private Closer closer; + + private boolean isClosed = false; + + private CloseableResource(T resource, Closer closer) { +this.resource = resource; +this.closer = closer; + } + + /** Creates a {@link CloseableResource} with the given resource and closer. */ + public static CloseableResource of(T resource, Closer closer) { +checkArgument(resource != null, "Resource must be non-null"); +checkArgument(closer != null, "%s must be non-null", Closer.class.getName()); +return new CloseableResource<>(resource, closer); + } + + /** Gets the underlying resource. */ + public T get() { +checkState(closer != null, "%s has transferred ownership", CloseableResource.class.getName()); +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +return resource; + } + + /** + * Returns a new {@link CloseableResource} that owns the underlying resource and relinquishes + * ownership from this {@link CloseableResource}. {@link #close()} on the original instance + * becomes a no-op. + */ + public CloseableResource transfer() { Review comment: It would be really nice to have the compiler enforce that existing `CloseableResource` references are never passed as arguments but that they are instead always `transfer()`'d in.
I'm envisioning something like safe RAII move semantics in C++. Is there a FindBugs check that could enforce this? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92197) Time Spent: 9h (was: 8h 50m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92198&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92198 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 18:08 Start Date: 18/Apr/18 18:08 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182506978 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. Recipients of a {@link CloseableResource} are in general + * responsible for cleanup. Not thread-safe. 
+ */ +public class CloseableResource implements AutoCloseable { Review comment: While protecting against double closes is generally a good idea, I don't think we should attempt to do that here. First, we do not have insight into exactly what operations a given `Closer` is performing. It _may_ be reasonable for two separate `Closer`s to be created for the same underlying resource and to have separate `CloseableResources` representing the different aspects that require cleanup (e.g., if those have different lifecycles). Second, as Eugene pointed out, `AutoCloseables` should _usually_ be idempotent, so guarding from double-closes here is overstepping our bounds. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92198) Time Spent: 9h 10m (was: 9h) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 9h 10m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92176&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92176 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 17:17 Start Date: 18/Apr/18 17:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182504387 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -222,13 +225,26 @@ public StagingResult call() { } } + /** A file along with a staging name. */ + @AutoValue + public abstract static class FileToStage { +public static FileToStage of(File file, String stageName) { + return new AutoValue_ArtifactServiceStager_FileToStage(file, stageName); +} + +/** The file to stage. */ +public abstract File getFile(); +/** Staging handle to this file. */ +public abstract String getStageName(); Review comment: No, this is just an opaque handle that the client provides to the staging service. The artifact can be referenced by this name in the manifest. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92176) Time Spent: 8h 10m (was: 8h) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 8h 10m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. 
The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=92177&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-92177 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 18/Apr/18 17:17 Start Date: 18/Apr/18 17:17 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182504387 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -222,13 +225,26 @@ public StagingResult call() { } } + /** A file along with a staging name. */ + @AutoValue + public abstract static class FileToStage { +public static FileToStage of(File file, String stageName) { + return new AutoValue_ArtifactServiceStager_FileToStage(file, stageName); +} + +/** The file to stage. */ +public abstract File getFile(); +/** Staging handle to this file. */ +public abstract String getStageName(); Review comment: No, this is just an opaque handle that the client provides to the staging service. The artifact will be referenced by this name in the manifest. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 92177) Time Spent: 8h 20m (was: 8h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 8h 20m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. 
The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
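As the review comment explains, the staging name is an opaque handle chosen by the client, and the manifest refers to the artifact by that name. A hand-written sketch of such a value type follows. It assumes nothing about the code AutoValue generates; `StagedFileSketch` and `manifestEntry` are hypothetical names, and the `name=path` manifest line is invented for illustration.

```java
import java.io.File;
import java.util.Objects;

/** Hypothetical hand-written equivalent of the file-plus-staging-name value class. */
public final class StagedFileSketch {
  private final File file;
  private final String stageName;

  private StagedFileSketch(File file, String stageName) {
    this.file = Objects.requireNonNull(file, "file");
    this.stageName = Objects.requireNonNull(stageName, "stageName");
  }

  public static StagedFileSketch of(File file, String stageName) {
    return new StagedFileSketch(file, stageName);
  }

  /** The file to stage. */
  public File getFile() {
    return file;
  }

  /** Opaque handle supplied by the client; it need not match the file name. */
  public String getStageName() {
    return stageName;
  }

  /** Invented manifest line: the artifact is looked up by handle, not by path. */
  public String manifestEntry() {
    return stageName + "=" + file.getPath();
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof StagedFileSketch)) {
      return false;
    }
    StagedFileSketch other = (StagedFileSketch) o;
    return file.equals(other.file) && stageName.equals(other.stageName);
  }

  @Override
  public int hashCode() {
    return Objects.hash(file, stageName);
  }
}
```

Value equality matters here because the diff uses these objects as map keys when tracking the per-file staging futures.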
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91985&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91985 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182267064 ## File path: runners/reference/java/src/test/java/org/apache/beam/runners/reference/CloseableResourceTest.java ## @@ -0,0 +1,78 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static org.hamcrest.MatcherAssert.assertThat; +import static org.hamcrest.Matchers.instanceOf; +import static org.hamcrest.Matchers.is; +import static org.hamcrest.Matchers.notNullValue; +import static org.junit.Assert.assertTrue; + +import java.util.concurrent.atomic.AtomicBoolean; +import org.apache.beam.runners.reference.CloseableResource.CloseException; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.junit.runners.JUnit4; + +/** Tests for {@link CloseableResource}. 
*/ +@RunWith(JUnit4.class) +public class CloseableResourceTest { + + @Test + public void alwaysReturnsSameResource() { +Foo foo = new Foo(); +CloseableResource resource = CloseableResource.of(foo, (ignored) -> {}); +assertThat(resource.get(), is(foo)); +assertThat(resource.get(), is(foo)); + } + + @Test + public void callsCloser() throws Exception { +AtomicBoolean closed = new AtomicBoolean(false); +try (CloseableResource ignored = +CloseableResource.of( +new Foo(), +(foo) -> { + closed.set(true); +})) { + // Do nothing. +} +assertTrue(closed.get()); + } + + @Test + public void wrapsExceptionsInCloseException() { +Exception wrapped = new Exception(); +CloseException closeException = null; +try (CloseableResource ignored = +CloseableResource.of( +new Foo(), +(foo) -> { + throw wrapped; +})) { + // Do nothing. +} catch (CloseException e) { + closeException = e; Review comment: Yes, I would much prefer assertThrows. I dislike `ExpectedException` because it limits the assertions that can be made and requires the exception to be thrown as the very last statement of the method. Otherwise, you do not know from which line the exception is actually being thrown. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 91985) Time Spent: 7h 20m (was: 7h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 7h 20m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against a portable job > server. The job server itself is expected to be started up out-of-band. 
The > "PortableRunner" should take an option indicating the Job API endpoint and > defer other runner configurations to the backend itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
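Editor's note on the review comment above: the preference for an assertThrows-style check can be sketched without any test framework. Everything in this sketch is a hypothetical stand-in, not code from the pull request: `assertThrows` is a hand-rolled helper (JUnit 4.13 and later ship a method of the same shape as `org.junit.Assert.assertThrows`), `CloseException` imitates the nested exception the test refers to, and `closeFailing` plays the role of a resource whose closer throws. The sketch shows the two properties the comment asks for: the throwing statement is pinned to one lambda, and the caught exception is returned so that further assertions can be made on it.

```java
/** Hypothetical stand-ins for illustration; not the classes from the pull request. */
class AssertThrowsSketch {

  /** Checked exception standing in for CloseableResource.CloseException. */
  static class CloseException extends Exception {
    CloseException(Throwable cause) {
      super(cause);
    }
  }

  /** A body that may throw anything; java.lang.Runnable cannot declare checked exceptions. */
  interface ThrowingRunnable {
    void run() throws Throwable;
  }

  /** Runs the body and returns the thrown exception if it has the expected type; fails otherwise. */
  static <T extends Throwable> T assertThrows(Class<T> expected, ThrowingRunnable body) {
    try {
      body.run();
    } catch (Throwable t) {
      if (expected.isInstance(t)) {
        return expected.cast(t);
      }
      throw new AssertionError("unexpected exception type: " + t.getClass().getName(), t);
    }
    throw new AssertionError("expected " + expected.getName() + " but nothing was thrown");
  }

  /** Closes a resource whose closer always fails, wrapping the failure in CloseException. */
  static void closeFailing(Exception failure) throws CloseException {
    AutoCloseable resource = () -> {
      throw failure;
    };
    try {
      resource.close();
    } catch (Exception e) {
      throw new CloseException(e);
    }
  }

  public static void main(String[] args) {
    Exception wrapped = new Exception("closer failed");
    // Only this lambda is allowed to throw, so the origin of the exception is unambiguous.
    CloseException thrown = assertThrows(CloseException.class, () -> closeFailing(wrapped));
    // The exception is a return value, so any assertion can follow.
    if (thrown.getCause() != wrapped) {
      throw new AssertionError("cause was not preserved");
    }
    System.out.println("cause preserved");
  }
}
```

With a rule such as `ExpectedException`, the expectation is declared before the body runs and any statement in the method can satisfy it, which is the limitation the comment describes.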
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91992&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91992 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182264068 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +if (true) { + throw new UnsupportedOperationException(); +} +return createInternal(options, new DirectoryZipper(), getChannelFactory(options)); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, DirectoryZipper zipper, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String end
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91976&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91976 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182260262 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ Review comment: I would have gladly done that, but that doesn't fit with the `PipelineRunner` construction. As mentioned above, the fit between a Java runner and a portable runner service isn't great. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 91976) Time Spent: 6h 20m (was: 6h 10m) > Portable Runner Job API shim > > > Key: BEAM-4071 > URL: https://issues.apache.org/jira/browse/BEAM-4071 > Project: Beam > Issue Type: New Feature > Components: runner-core >Reporter: Ben Sidhom >Assignee: Ben Sidhom >Priority: Minor > Time Spent: 6h 20m > Remaining Estimate: 0h > > There needs to be a way to execute Java-SDK pipelines against
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91973&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91973 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182257618 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/DirectoryZipper.java ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; + +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import org.apache.beam.sdk.util.ZipFiles; + +/** A utility that replaces directories with a zip file containing its contents. */ Review comment: Actually, I've gone ahead and removed this. We're doing a lot of I/O in the end-to-end test anyway, so this doesn't buy us much. This code has been inlined and is no longer pluggable. 
Issue Time Tracking --- Worklog Id: (was: 91973) Time Spent: 6h 10m (was: 6h)
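Editor's note on the javadoc quoted above ("replaces directories with a zip file containing its contents"): a minimal sketch of that idea using only `java.util.zip` from the JDK. This is not the removed `DirectoryZipper` and it does not use Beam's `ZipFiles` utility that the diff imports; the class name, method name, and temp-file prefix are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical illustration; not the DirectoryZipper class from the pull request. */
class DirectoryZipSketch {

  /** Writes every regular file under sourceDir into a new temporary zip file and returns its path. */
  static Path zipDirectory(Path sourceDir) throws IOException {
    if (!Files.isDirectory(sourceDir)) {
      throw new IllegalArgumentException(sourceDir + " is not a directory");
    }
    // Collect the files first so the directory stream is closed before writing starts.
    List<Path> files;
    try (Stream<Path> walk = Files.walk(sourceDir)) {
      files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
    }
    Path zip = Files.createTempFile("staged-", ".zip");
    try (OutputStream out = Files.newOutputStream(zip);
        ZipOutputStream zos = new ZipOutputStream(out)) {
      for (Path file : files) {
        // Zip entry names use forward slashes regardless of platform.
        String name = sourceDir.relativize(file).toString().replace('\\', '/');
        zos.putNextEntry(new ZipEntry(name));
        Files.copy(file, zos);
        zos.closeEntry();
      }
    }
    return zip;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("classes");
    Files.write(dir.resolve("a.txt"), "hello".getBytes(StandardCharsets.UTF_8));
    Path zip = zipDirectory(dir);
    System.out.println("wrote " + zip + " (" + Files.size(zip) + " bytes)");
  }
}
```

A classpath entry that is a directory could be passed through such a helper before staging, while plain jar files would be staged unchanged.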
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91965&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91965 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182232673 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -87,26 +87,27 @@ private ArtifactServiceStager(Channel channel, int bufferSize) { this.bufferSize = bufferSize; } - public void stage(Iterable files) throws IOException, InterruptedException { -final Map> futures = new HashMap<>(); -for (File file : files) { + public String stage(Iterable files) throws IOException, InterruptedException { Review comment: Done. Issue Time Tracking --- Worklog Id: (was: 91965) Time Spent: 5h 10m (was: 5h)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91982&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91982 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182261921 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( +result, duration.getMillis(), TimeUnit.MILLISECONDS); + } catch (TimeoutException e) { +// Null result indicates a timeout. +return null; + } catch (ExecutionException e) { +throw new RuntimeException(e); + } +} + } + + @Override + public State waitUntilFinish() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateRequest request = GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build(); +State lastState; +do { + Uninterruptibles.sleepUninterruptibly(POLL_INTERVAL_SEC, TimeUnit.SECONDS); Review comment: Conciseness. Fixed. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 91982) > Portable Runner Job API shim > > > Key: BEAM-4071 >
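Editor's note on the quoted `waitUntilFinish(Duration)`: the diff wraps the blocking poll loop in a `CompletableFuture` and returns null when the timeout expires. The sketch below is a single-threaded variation on the same contract, not the pull request's implementation: it checks a deadline inside the poll loop, so no background task keeps polling after a timeout. The `State` enum, the `Supplier` that stands in for the job service call, and all parameter names are assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical illustration; not JobServicePipelineResult from the pull request. */
class PollWithDeadlineSketch {

  /** Stand-in for the job states reported by a job service. */
  enum State {
    RUNNING,
    DONE,
    FAILED
  }

  static boolean isTerminal(State state) {
    return state == State.DONE || state == State.FAILED;
  }

  /**
   * Polls until a terminal state is seen or the timeout elapses.
   * Returns the terminal state, or null to signal a timeout.
   */
  static State waitUntilFinish(Supplier<State> poll, long pollIntervalMillis, long timeoutMillis) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (true) {
      State state = poll.get();
      if (isTerminal(state)) {
        return state;
      }
      long remainingNanos = deadline - System.nanoTime();
      if (remainingNanos <= 0) {
        return null;
      }
      // Never sleep past the deadline.
      long remainingMillis = TimeUnit.NANOSECONDS.toMillis(remainingNanos) + 1;
      try {
        Thread.sleep(Math.min(pollIntervalMillis, remainingMillis));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for job", e);
      }
    }
  }

  public static void main(String[] args) {
    AtomicInteger calls = new AtomicInteger();
    Supplier<State> finishesOnThirdPoll =
        () -> calls.incrementAndGet() >= 3 ? State.DONE : State.RUNNING;
    System.out.println(waitUntilFinish(finishesOnThirdPoll, 10, 1000));
    System.out.println(waitUntilFinish(() -> State.RUNNING, 10, 50));
  }
}
```

The trade-off is that this version polls before the first sleep and propagates interruption as an unchecked exception, whereas the quoted code sleeps uninterruptibly before each poll.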
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91980&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91980 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182260838 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +if (true) { + throw new UnsupportedOperationException(); +} +return createInternal(options, new DirectoryZipper(), getChannelFactory(options)); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, DirectoryZipper zipper, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String end
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91981 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182264134 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +if (true) { + throw new UnsupportedOperationException(); +} +return createInternal(options, new DirectoryZipper(), getChannelFactory(options)); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, DirectoryZipper zipper, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String end
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91979&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91979 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182260505 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ## @@ -0,0 +1,254 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkState; +import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage; + +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Sets; +import com.google.protobuf.ByteString; +import io.grpc.ManagedChannel; +import java.io.File; +import java.io.IOException; +import java.util.Collection; +import java.util.List; +import java.util.Set; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.model.pipeline.v1.Endpoints; +import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor; +import org.apache.beam.runners.core.construction.ArtifactServiceStager; +import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage; +import org.apache.beam.runners.core.construction.JavaReadViaImpulse; +import org.apache.beam.runners.core.construction.PipelineOptionsTranslation; +import org.apache.beam.runners.core.construction.PipelineTranslation; +import org.apache.beam.sdk.Pipeline; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.PipelineRunner; +import org.apache.beam.sdk.fn.channel.ManagedChannelFactory; +import org.apache.beam.sdk.options.ExperimentalOptions; +import org.apache.beam.sdk.options.PipelineOptions; +import org.apache.beam.sdk.options.PipelineOptionsValidator; +import org.apache.beam.sdk.options.PortablePipelineOptions; +import org.apache.beam.sdk.runners.PTransformOverride; +import 
org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A {@link PipelineRunner} that runs a {@link Pipeline} against a {@code JobService}. */ +public class PortableRunner extends PipelineRunner<PipelineResult> { + + private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class); + + /** Provided pipeline options. */ + private final PipelineOptions options; + /** Job API endpoint. */ + private final String endpoint; + /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */ + private final Collection<FileToStage> filesToStage; + /** Channel factory used to create communication channel with job and staging services. */ + private final ManagedChannelFactory channelFactory; + + /** + * Constructs a runner from the provided options. + * + * @param options Properties which configure the runner. + * @return The newly created runner. + */ + public static PortableRunner fromOptions(PipelineOptions options) { +if (true) { + throw new UnsupportedOperationException(); +} +return createInternal(options, new DirectoryZipper(), getChannelFactory(options)); + } + + @VisibleForTesting + static PortableRunner createInternal( + PipelineOptions options, DirectoryZipper zipper, ManagedChannelFactory channelFactory) { +PortablePipelineOptions portableOptions = +PipelineOptionsValidator.validate(PortablePipelineOptions.class, options); + +String end
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91963&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91963 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182247718 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/DirectoryZipper.java ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; + +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import org.apache.beam.sdk.util.ZipFiles; + +/** A utility that replaces directories with a zip file containing its contents. */ +class DirectoryZipper { + + /** + * If the given path is a regular file, returns the file itself. If the given file is a directory, + * zips the file and returns the path to this new zip file. 
+ */ + public String replaceDirectoryWithZipFile(String path) throws IOException { +return zipDirectoryInternal(path); Review comment: Done. This was an artifact of how I refactored the original code. ;) Issue Time Tracking --- Worklog Id: (was: 91963) Time Spent: 4h 50m (was: 4h 40m)
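The behavior described in the quoted Javadoc (return a regular file as-is, replace a directory with a zip of its contents) can be sketched as follows. This is only an illustration under stated assumptions: `DirectoryZipperSketch` is a hypothetical name, and it uses `java.util.zip` from the JDK rather than the `ZipFiles` helper that the PR imports, whose behavior is not shown in the quoted diff.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical stand-in for DirectoryZipper; uses java.util.zip, not Beam's ZipFiles. */
final class DirectoryZipperSketch {

  /**
   * Returns the path unchanged for a regular file. For a directory, writes its
   * regular files into a new temporary zip and returns the path of that zip.
   */
  static String replaceDirectoryWithZipFile(String path) throws IOException {
    File file = new File(path);
    if (!file.exists()) {
      throw new IllegalArgumentException("No such file: " + path);
    }
    if (!file.isDirectory()) {
      return path;
    }
    Path root = file.toPath();
    // Collect the files first so the walk is finished before the zip is created.
    List<Path> entries;
    try (Stream<Path> walk = Files.walk(root)) {
      entries = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
    }
    Path zip = Files.createTempFile("staged-dir", ".zip");
    try (OutputStream out = Files.newOutputStream(zip);
        ZipOutputStream zos = new ZipOutputStream(out)) {
      for (Path entry : entries) {
        // Zip entry names use forward slashes regardless of platform.
        String name = root.relativize(entry).toString().replace(File.separatorChar, '/');
        zos.putNextEntry(new ZipEntry(name));
        Files.copy(entry, zos);
        zos.closeEntry();
      }
    }
    return zip.toString();
  }
}
```

Keeping the zipping behind a small class like this is presumably what lets tests substitute a different zipper, as the `createInternal(options, zipper, channelFactory)` signature in the PR suggests.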
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91983&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91983 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182264845 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/InMemoryArtifactService.java ## @@ -0,0 +1,176 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference.testing; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +import com.google.common.collect.Maps; +import com.google.common.collect.Sets; +import com.google.common.io.BaseEncoding; +import com.google.protobuf.ByteString; +import io.grpc.Status; +import io.grpc.stub.StreamObserver; +import java.security.MessageDigest; +import java.util.Map; +import java.util.Set; +import java.util.function.BiConsumer; +import javax.annotation.concurrent.GuardedBy; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.ArtifactMetadata; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.CommitManifestResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactRequest; +import org.apache.beam.model.jobmanagement.v1.ArtifactApi.PutArtifactResponse; +import org.apache.beam.model.jobmanagement.v1.ArtifactStagingServiceGrpc.ArtifactStagingServiceImplBase; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** A StagingService for tests. */ +public class InMemoryArtifactService extends ArtifactStagingServiceImplBase { + + private static final Logger LOG = LoggerFactory.getLogger(InMemoryArtifactService.class); + + private final Object artifactLock = new Object(); + + @GuardedBy("artifactLock") + private final Map artifacts = Maps.newHashMap(); + + private final boolean keepArtifacts; + + @GuardedBy("artifactLock") + private boolean committed = false; + + public InMemoryArtifactService(boolean keepArtifacts) { Review comment: Fixed.
Issue Time Tracking --- Worklog Id: (was: 91983) Time Spent: 7h (was: 6h 50m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91968&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91968 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182260015 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/JobServicePipelineResult.java ## @@ -0,0 +1,140 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.runners.reference; + +import com.google.common.util.concurrent.Uninterruptibles; +import com.google.protobuf.ByteString; +import java.util.concurrent.CompletableFuture; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.concurrent.TimeoutException; +import org.apache.beam.model.jobmanagement.v1.JobApi; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest; +import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse; +import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub; +import org.apache.beam.sdk.PipelineResult; +import org.apache.beam.sdk.metrics.MetricResults; +import org.joda.time.Duration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +class JobServicePipelineResult implements PipelineResult { + + private static final long POLL_INTERVAL_SEC = 10; + + private static final Logger LOG = LoggerFactory.getLogger(JobServicePipelineResult.class); + + private final ByteString jobId; + private final CloseableResource<JobServiceBlockingStub> jobService; + + JobServicePipelineResult(ByteString jobId, CloseableResource<JobServiceBlockingStub> jobService) { +this.jobId = jobId; +this.jobService = jobService; + } + + @Override + public State getState() { +JobServiceBlockingStub stub = jobService.get(); +GetJobStateResponse response = + stub.getState(GetJobStateRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State cancel() { +JobServiceBlockingStub stub = jobService.get(); +CancelJobResponse response = + stub.cancel(CancelJobRequest.newBuilder().setJobIdBytes(jobId).build()); +return getJavaState(response.getState()); + } + + @Override + public State waitUntilFinish(Duration duration) { +if (duration.compareTo(Duration.millis(1)) < 1) { + // 
Equivalent to infinite timeout. + return waitUntilFinish(); +} else { + CompletableFuture<State> result = CompletableFuture.supplyAsync(this::waitUntilFinish); + try { +return Uninterruptibles.getUninterruptibly( Review comment: The whole thing is funky. The fact that we have to wrap a long-running server connection in a pipeline result that is completely unaware of this makes it difficult to use correctly. This was really just a stop-gap to get something that we could eventually run in a test runner. I think that doing this the correct way will require more discussion. Issue Time Tracking --- Worklog Id: (was: 91968)
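The stop-gap pattern discussed in this comment (run the blocking wait on another thread and bound it with a timeout) can be sketched with JDK classes only. This is a minimal illustration, not the PR's code: `TimedWaitSketch` and `waitWithTimeout` are hypothetical names, `java.time.Duration` stands in for the Joda `Duration` used in the diff, and returning `null` on timeout is an assumption, since the quoted hunk is cut off before its timeout handling.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

/** Hypothetical helper that bounds a blocking wait with a timeout. */
final class TimedWaitSketch {

  /**
   * Returns the result of blockingWait, or null if the timeout elapses first
   * (an assumption of this sketch). A timeout of 1 ms or less is treated as
   * infinite, mirroring the comparison in the quoted diff.
   */
  static <T> T waitWithTimeout(Supplier<T> blockingWait, Duration timeout) {
    if (timeout.compareTo(Duration.ofMillis(1)) < 1) {
      return blockingWait.get();
    }
    CompletableFuture<T> result = CompletableFuture.supplyAsync(blockingWait);
    try {
      return result.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      // The background task is NOT cancelled here; it keeps running on the
      // common pool and may keep holding whatever connection it uses.
      return null;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    }
  }
}
```

The comment in the timeout branch points at one reason this approach is awkward: the caller gets control back, but the underlying wait and its server connection outlive the call.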
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91990&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91990 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182268058 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import java.util.List; +import org.apache.beam.sdk.options.Validation.Required; + +/** Pipeline options common to all portable runners. */ +public interface PortablePipelineOptions extends PipelineOptions { + + /** + * List of local files to make available to workers. + * + * Files are placed on the worker's classpath. + * + * The default value is the list of jars from the main program's classpath. + */ + @Description( + "Files to stage to the artifact service and make available to workers. 
Files are placed on " + + "the worker's classpath. The default value is all files from the classpath.") + List<String> getFilesToStage(); + void setFilesToStage(List<String> value); + + @Description("Job service endpoint to use.") Review comment: Done. Issue Time Tracking --- Worklog Id: (was: 91990) Time Spent: 7h 40m
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91987&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91987 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182267916 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java ## @@ -0,0 +1,43 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.sdk.options; + +import java.util.List; +import org.apache.beam.sdk.options.Validation.Required; + +/** Pipeline options common to all portable runners. */ +public interface PortablePipelineOptions extends PipelineOptions { + + /** + * List of local files to make available to workers. + * + * Files are placed on the worker's classpath. Review comment: The worker will access all of the files through the manifest. The files are expected to be jars or directories that should be placed on the classpath and nothing else. 
This matches the current Dataflow behavior and is the only thing supported. To really drive this home, we could remove the option to override stage files and just always scan for URLClassLoaders and add elements to the stage file list. However, this prevents users from using custom ClassLoaders. Perhaps the way to solve this for now is to rename this to something like `getClasspathArtifacts` or similar. Issue Time Tracking --- Worklog Id: (was: 91987) Time Spent: 7.5h (was: 7h 20m)
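The "scan for URLClassLoaders" alternative mentioned in this comment could look roughly like the sketch below. It is an illustration only: `ClasspathScanSketch` and `detectFilesToStage` are hypothetical names, and this is not the `detectClassPathResourcesToStage` helper the PR imports, whose implementation is not shown here.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical classpath scanner; not the Beam helper referenced in the diff. */
final class ClasspathScanSketch {

  /** Lists the jars and directories a URLClassLoader would load classes from. */
  static List<String> detectFilesToStage(ClassLoader loader) {
    if (!(loader instanceof URLClassLoader)) {
      // This is the limitation raised in the comment: a custom ClassLoader
      // gives the scanner nothing to enumerate.
      throw new IllegalArgumentException("Unable to detect classpath from " + loader);
    }
    List<String> files = new ArrayList<>();
    for (URL url : ((URLClassLoader) loader).getURLs()) {
      try {
        // new File(URI) rejects non-file URIs with IllegalArgumentException.
        files.add(new File(url.toURI()).getAbsolutePath());
      } catch (URISyntaxException | IllegalArgumentException e) {
        throw new IllegalArgumentException("Unable to convert url (" + url + ") to file.", e);
      }
    }
    return files;
  }
}
```

Note that on Java 9 and later the system class loader is no longer a `URLClassLoader`, so a scan like this would reject it; that is another argument for keeping an explicit, user-overridable file list.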
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91977&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91977 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182255696 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/DirectoryZipper.java ## @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; + +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import org.apache.beam.sdk.util.ZipFiles; + +/** A utility that replaces directories with a zip file containing its contents. */ Review comment: This was an artifact of how the functionality was originally written. I've updated the class to more closely reflect what we needed it to do in the first place. 
By doing it this way, we can also avoid writing out empty files in tests. Issue Time Tracking --- Worklog Id: (was: 91977) Time Spent: 6.5h (was: 6h 20m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91967&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91967 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182243728 ## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/CloseableResource.java ## @@ -0,0 +1,81 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.beam.runners.reference; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +/** + * An {@link AutoCloseable} that wraps a resource that needs to be cleaned up but does not implement + * {@link AutoCloseable} itself. Recipients of a {@link CloseableResource} are in general + * responsible for cleanup. Not thread-safe. 
+ */ +public class CloseableResource<T> implements AutoCloseable { + + private final T resource; + private final Closer<T> closer; + + private boolean isClosed = false; + + private CloseableResource(T resource, Closer<T> closer) { +this.resource = resource; +this.closer = closer; + } + + /** Creates a {@link CloseableResource} with the given resource and closer. */ + public static <T> CloseableResource<T> of(T resource, Closer<T> closer) { +checkArgument(resource != null, "Resource must be non-null"); +checkArgument(closer != null, "%s must be non-null", Closer.class.getName()); +return new CloseableResource<>(resource, closer); + } + + /** Gets the underlying resource. */ + public T get() { +checkState(!isClosed, "%s is closed", CloseableResource.class.getName()); +return resource; + } + + /** + * Close the underlying resource. Must only be called once. Review comment: Done. Issue Time Tracking --- Worklog Id: (was: 91967) Time Spent: 5.5h (was: 5h 20m)
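For readers unfamiliar with the wrapper pattern quoted above, here is a self-contained sketch of how such a class can be used with try-with-resources. The names `CloseableResourceSketch` and its nested `Closer` are hypothetical: the real `Closer` interface is not shown in the quoted diff, and making a second `close()` a no-op is an assumption of this sketch (the quoted Javadoc says close must only be called once).

```java
/** Hypothetical, simplified mirror of the CloseableResource class in the quoted diff. */
final class CloseableResourceSketch<T> implements AutoCloseable {

  /** Assumed cleanup callback; the PR's actual Closer interface is not shown. */
  interface Closer<T> {
    void close(T resource) throws Exception;
  }

  private final T resource;
  private final Closer<T> closer;
  private boolean isClosed = false;

  private CloseableResourceSketch(T resource, Closer<T> closer) {
    this.resource = resource;
    this.closer = closer;
  }

  /** Wraps a resource together with the callback that knows how to release it. */
  static <T> CloseableResourceSketch<T> of(T resource, Closer<T> closer) {
    if (resource == null || closer == null) {
      throw new IllegalArgumentException("resource and closer must be non-null");
    }
    return new CloseableResourceSketch<>(resource, closer);
  }

  /** Returns the wrapped resource, or fails if it has already been released. */
  T get() {
    if (isClosed) {
      throw new IllegalStateException("CloseableResourceSketch is closed");
    }
    return resource;
  }

  /** Releases the resource once; later calls do nothing (assumption of this sketch). */
  @Override
  public void close() throws Exception {
    if (isClosed) {
      return;
    }
    closer.close(resource);
    isClosed = true;
  }
}
```

A caller that receives such a wrapper would typically write `try (CloseableResourceSketch<Stub> r = ...) { r.get()... }`, which matches the Javadoc's statement that recipients are responsible for cleanup.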
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91964 ] ASF GitHub Bot logged work on BEAM-4071: Author: ASF GitHub Bot Created on: 17/Apr/18 23:09 Start Date: 17/Apr/18 23:09 Worklog Time Spent: 10m Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim URL: https://github.com/apache/beam/pull/5150#discussion_r182234035 ## File path: runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/ArtifactServiceStager.java ## @@ -222,13 +225,26 @@ public StagingResult call() { } } + /** A file along with a staging name. */ + @AutoValue + public abstract static class FileToStage { +public static FileToStage of(File file, String stageName) { + return new AutoValue_ArtifactServiceStager_FileToStage(file, stageName); +} + +/** The file to stage. */ +public abstract File getFile(); +/** Staging handle to this file. */ Review comment: We were using filenames and strings everywhere else, so I continued using strings here to minimize the change size (which is already large). If I update this to a `Path`, I think it makes sense to change all `File` references to match this. Let me know if that's a change worth making here.
Issue Time Tracking --- Worklog Id: (was: 91964) Time Spent: 5h (was: 4h 50m)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91969&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91969 ]

ASF GitHub Bot logged work on BEAM-4071:

                Author: ASF GitHub Bot
            Created on: 17/Apr/18 23:09
            Start Date: 17/Apr/18 23:09
    Worklog Time Spent: 10m
      Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim
URL: https://github.com/apache/beam/pull/5150#discussion_r182235637

## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/PortableRunner.java ##
@@ -0,0 +1,254 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.reference;
+
+import static com.google.common.base.Preconditions.checkState;
+import static org.apache.beam.runners.core.construction.PipelineResources.detectClassPathResourcesToStage;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.Sets;
+import com.google.protobuf.ByteString;
+import io.grpc.ManagedChannel;
+import java.io.File;
+import java.io.IOException;
+import java.util.Collection;
+import java.util.List;
+import java.util.Set;
+import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest;
+import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse;
+import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest;
+import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse;
+import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc;
+import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceBlockingStub;
+import org.apache.beam.model.pipeline.v1.Endpoints;
+import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor;
+import org.apache.beam.runners.core.construction.ArtifactServiceStager;
+import org.apache.beam.runners.core.construction.ArtifactServiceStager.FileToStage;
+import org.apache.beam.runners.core.construction.JavaReadViaImpulse;
+import org.apache.beam.runners.core.construction.PipelineOptionsTranslation;
+import org.apache.beam.runners.core.construction.PipelineTranslation;
+import org.apache.beam.sdk.Pipeline;
+import org.apache.beam.sdk.PipelineResult;
+import org.apache.beam.sdk.PipelineRunner;
+import org.apache.beam.sdk.fn.channel.ManagedChannelFactory;
+import org.apache.beam.sdk.options.ExperimentalOptions;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.PipelineOptionsValidator;
+import org.apache.beam.sdk.options.PortablePipelineOptions;
+import org.apache.beam.sdk.runners.PTransformOverride;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/** A {@link PipelineRunner} a {@link Pipeline} against a {@code JobService}. */
+public class PortableRunner extends PipelineRunner {
+
+  private static final Logger LOG = LoggerFactory.getLogger(PortableRunner.class);
+
+  /** Provided pipeline options. */
+  private final PipelineOptions options;
+  /** Job API endpoint. */
+  private final String endpoint;
+  /** Files to stage to artifact staging service. They will ultimately be added to the classpath. */
+  private final Collection filesToStage;
+  /** Channel factory used to create communication channel with job and staging services. */
+  private final ManagedChannelFactory channelFactory;
+
+  /**
+   * Constructs a runner from the provided options.
+   *
+   * @param options Properties which configure the runner.
+   * @return The newly created runner.
+   */
+  public static PortableRunner fromOptions(PipelineOptions options) {
+    if (true) {
+      throw new UnsupportedOperationException();
+    }
+    return createInternal(options, new DirectoryZipper(), getChannelFactory(options));
+  }
+
+  @VisibleForTesting
+  static PortableRunner createInternal(
+      PipelineOptions options, DirectoryZipper zipper, ManagedChannelFactory channelFactory) {
+    PortablePipelineOptions portableOptions =
+        PipelineOptionsValidator.validate(PortablePipelineOptions.class, options);
+
+    String end
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91975&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91975 ]

ASF GitHub Bot logged work on BEAM-4071:

                Author: ASF GitHub Bot
            Created on: 17/Apr/18 23:09
            Start Date: 17/Apr/18 23:09
    Worklog Time Spent: 10m
      Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim
URL: https://github.com/apache/beam/pull/5150#discussion_r182266551

## File path: runners/reference/java/src/main/java/org/apache/beam/runners/reference/testing/TestJobService.java ##
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.reference.testing;
+
+import io.grpc.stub.StreamObserver;
+import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobRequest;
+import org.apache.beam.model.jobmanagement.v1.JobApi.CancelJobResponse;
+import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateRequest;
+import org.apache.beam.model.jobmanagement.v1.JobApi.GetJobStateResponse;
+import org.apache.beam.model.jobmanagement.v1.JobApi.JobMessagesRequest;
+import org.apache.beam.model.jobmanagement.v1.JobApi.JobMessagesResponse;
+import org.apache.beam.model.jobmanagement.v1.JobApi.JobState;
+import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobRequest;
+import org.apache.beam.model.jobmanagement.v1.JobApi.PrepareJobResponse;
+import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobRequest;
+import org.apache.beam.model.jobmanagement.v1.JobApi.RunJobResponse;
+import org.apache.beam.model.jobmanagement.v1.JobServiceGrpc.JobServiceImplBase;
+import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor;
+
+/** A JobService for tests. */

Review comment:
Added javadoc.

This _could_ be replaced by a mock if we decided to have the PortableRunner be handed a blocking stub rather than constructing one itself. In that case, we would mock the blocking stub directly. Of course, that becomes difficult once we want to start testing the streaming RPCs (job status, etc).

I originally decided it was not reasonable to have the stub and underlying channel constructed externally due to ownership confusion. However, now that we have `CloseableResource`, this may be reasonable as long as we aren't using streaming RPCs. What do you think?

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

    Worklog Id: (was: 91975)
    Time Spent: 6h 10m (was: 6h)

> Portable Runner Job API shim
>
>
>                 Key: BEAM-4071
>                 URL: https://issues.apache.org/jira/browse/BEAM-4071
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-core
>            Reporter: Ben Sidhom
>            Assignee: Ben Sidhom
>            Priority: Minor
>          Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> There needs to be a way to execute Java-SDK pipelines against a portable job
> server. The job server itself is expected to be started up out-of-band. The
> "PortableRunner" should take an option indicating the Job API endpoint and
> defer other runner configurations to the backend itself.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
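The trade-off raised in the review comment above (handing the runner a blocking stub so that tests can substitute a fake, with ownership of the underlying resource kept by the caller) can be sketched roughly as follows. This is only an illustration in plain Java, not the code from the pull request: `JobStub`, `Owned`, and `InjectedStubRunner` are hypothetical stand-ins invented for this sketch, and they are not the gRPC-generated `JobServiceBlockingStub`, Beam's `CloseableResource`, or the actual `PortableRunner`.

```java
import java.util.ArrayList;
import java.util.List;

public class StubInjectionSketch {

  /** Hypothetical stand-in for a blocking Job API stub; not the real gRPC type. */
  interface JobStub {
    String prepare(String jobName);

    String run(String preparationId);
  }

  /** Hypothetical ownership wrapper: whoever holds it is responsible for closing it. */
  static final class Owned<T> implements AutoCloseable {
    private final T resource;
    private final Runnable closer;
    private boolean closed;

    Owned(T resource, Runnable closer) {
      this.resource = resource;
      this.closer = closer;
    }

    T get() {
      if (closed) {
        throw new IllegalStateException("resource already closed");
      }
      return resource;
    }

    @Override
    public void close() {
      // Closing twice is a no-op so that ownership hand-offs stay safe.
      if (!closed) {
        closed = true;
        closer.run();
      }
    }
  }

  /** A runner that is handed a stub instead of building one from an endpoint. */
  static final class InjectedStubRunner {
    private final JobStub stub;

    InjectedStubRunner(JobStub stub) {
      this.stub = stub;
    }

    String submit(String jobName) {
      // Unary calls only: prepare the job, then run the prepared job.
      return stub.run(stub.prepare(jobName));
    }
  }

  /** Drives the runner with a hand-written fake and returns the recorded calls. */
  static List<String> demo() {
    final List<String> calls = new ArrayList<>();
    JobStub fake =
        new JobStub() {
          @Override
          public String prepare(String jobName) {
            calls.add("prepare:" + jobName);
            return "prep-1";
          }

          @Override
          public String run(String preparationId) {
            calls.add("run:" + preparationId);
            return "job-1";
          }
        };
    // The test owns the resource; the runner only borrows the stub.
    try (Owned<JobStub> owned = new Owned<>(fake, () -> calls.add("closed"))) {
      String jobId = new InjectedStubRunner(owned.get()).submit("wordcount");
      calls.add("result:" + jobId);
    }
    return calls;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The sketch covers only unary calls. As the comment notes, a faked stub gets harder to maintain once streaming RPCs such as job status are involved, which is where an in-process test service tends to be the simpler option.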
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91986&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91986 ]

ASF GitHub Bot logged work on BEAM-4071:

                Author: ASF GitHub Bot
            Created on: 17/Apr/18 23:09
            Start Date: 17/Apr/18 23:09
    Worklog Time Spent: 10m
      Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim
URL: https://github.com/apache/beam/pull/5150#discussion_r182267118

## File path: runners/reference/java/src/test/java/org/apache/beam/runners/reference/PortableRunnerTest.java ##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.reference;
+
+import static org.hamcrest.MatcherAssert.assertThat;
+import static org.hamcrest.Matchers.is;
+
+import io.grpc.Server;
+import io.grpc.inprocess.InProcessServerBuilder;
+import java.io.File;
+import java.io.IOException;
+import java.io.Serializable;
+import org.apache.beam.model.jobmanagement.v1.JobApi.JobState;
+import org.apache.beam.model.pipeline.v1.Endpoints.ApiServiceDescriptor;
+import org.apache.beam.runners.reference.testing.InMemoryArtifactService;
+import org.apache.beam.runners.reference.testing.InProcessManagedChannelFactory;
+import org.apache.beam.runners.reference.testing.TestJobService;
+import org.apache.beam.sdk.PipelineResult.State;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.options.PortablePipelineOptions;
+import org.apache.beam.sdk.testing.TestPipeline;
+import org.junit.Rule;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.junit.runners.JUnit4;
+
+/** Tests for {@link PortableRunner}. */
+@RunWith(JUnit4.class)
+public class PortableRunnerTest implements Serializable {
+
+  private static final String ENDPOINT_URL = "foo:3000";
+  private static final ApiServiceDescriptor ENDPOINT_DESCRIPTOR =
+      ApiServiceDescriptor.newBuilder().setUrl(ENDPOINT_URL).build();
+
+  private PipelineOptions options = createPipelineOptions();
+
+  @Rule public transient TestPipeline p = TestPipeline.fromOptions(options);
+
+  @Test
+  public void stagesAndRunsJob() throws Exception {
+    try (CloseableResource server = createJobServer(JobState.Enum.DONE)) {

Review comment:
Done.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

    Worklog Id: (was: 91986)
    Time Spent: 7h 20m (was: 7h 10m)

> Portable Runner Job API shim
>
>
>                 Key: BEAM-4071
>                 URL: https://issues.apache.org/jira/browse/BEAM-4071
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-core
>            Reporter: Ben Sidhom
>            Assignee: Ben Sidhom
>            Priority: Minor
>          Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> There needs to be a way to execute Java-SDK pipelines against a portable job
> server. The job server itself is expected to be started up out-of-band. The
> "PortableRunner" should take an option indicating the Job API endpoint and
> defer other runner configurations to the backend itself.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Work logged] (BEAM-4071) Portable Runner Job API shim
[ https://issues.apache.org/jira/browse/BEAM-4071?focusedWorklogId=91974&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-91974 ]

ASF GitHub Bot logged work on BEAM-4071:

                Author: ASF GitHub Bot
            Created on: 17/Apr/18 23:09
            Start Date: 17/Apr/18 23:09
    Worklog Time Spent: 10m
      Work Description: bsidhom commented on a change in pull request #5150: [BEAM-4071] Add Portable Runner Job API shim
URL: https://github.com/apache/beam/pull/5150#discussion_r182242374

## File path: runners/local-artifact-service-java/src/test/java/org/apache/beam/artifact/local/LocalFileSystemArtifactRetrievalServiceTest.java ##
@@ -186,11 +187,11 @@ public void onCompleted() {
   }
   private void stageAndCreateRetrievalService(Map<String, byte[]> artifacts) throws Exception {
-    List<File> artifactFiles = new ArrayList<>();
+    List<FileToStage> artifactFiles = new ArrayList<>();
     for (Map.Entry<String, byte[]> artifact : artifacts.entrySet()) {
       File artifactFile = tmp.newFile(artifact.getKey());
       new FileOutputStream(artifactFile).getChannel().write(ByteBuffer.wrap(artifact.getValue()));
-      artifactFiles.add(artifactFile);
+      artifactFiles.add(FileToStage.of(artifactFile, artifactFile.getName()));

Review comment:
Done. Not sure why it was that way originally.

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

    Worklog Id: (was: 91974)

> Portable Runner Job API shim
>
>
>                 Key: BEAM-4071
>                 URL: https://issues.apache.org/jira/browse/BEAM-4071
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-core
>            Reporter: Ben Sidhom
>            Assignee: Ben Sidhom
>            Priority: Minor
>          Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> There needs to be a way to execute Java-SDK pipelines against a portable job
> server. The job server itself is expected to be started up out-of-band. The
> "PortableRunner" should take an option indicating the Job API endpoint and
> defer other runner configurations to the backend itself.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
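The change in the diff above pairs each local artifact file with the name it should be staged under, instead of passing bare `File` objects. A minimal sketch of that pattern is given below, assuming nothing beyond what the diff shows. `StagedFile` is a hypothetical stand-in written for this illustration and is not Beam's `FileToStage`; only the `of(file, name)` call shape is borrowed from the diff. Unlike the test in the diff, the sketch writes bytes with `Files.write`.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StagedFileSketch {

  /** Hypothetical stand-in for FileToStage: a local file plus its staged name. */
  static final class StagedFile {
    private final File file;
    private final String stagedName;

    private StagedFile(File file, String stagedName) {
      this.file = file;
      this.stagedName = stagedName;
    }

    static StagedFile of(File file, String stagedName) {
      return new StagedFile(file, stagedName);
    }

    File getFile() {
      return file;
    }

    String getStagedName() {
      return stagedName;
    }
  }

  /** Writes each artifact into dir and pairs the resulting file with its staged name. */
  static List<StagedFile> writeArtifacts(File dir, Map<String, byte[]> artifacts)
      throws IOException {
    List<StagedFile> staged = new ArrayList<>();
    for (Map.Entry<String, byte[]> artifact : artifacts.entrySet()) {
      File artifactFile = new File(dir, artifact.getKey());
      // Files.write opens, writes, and closes the file in one call.
      Files.write(artifactFile.toPath(), artifact.getValue());
      staged.add(StagedFile.of(artifactFile, artifactFile.getName()));
    }
    return staged;
  }

  /** Stages two small artifacts in a temp directory and reports "name:size" for each. */
  static List<String> demo() {
    try {
      File dir = Files.createTempDirectory("staging-sketch").toFile();
      Map<String, byte[]> artifacts = new LinkedHashMap<>();
      artifacts.put("foo-artifact", "foo".getBytes(StandardCharsets.UTF_8));
      artifacts.put("bar-artifact", "bar".getBytes(StandardCharsets.UTF_8));
      List<String> names = new ArrayList<>();
      for (StagedFile staged : writeArtifacts(dir, artifacts)) {
        names.add(staged.getStagedName() + ":" + staged.getFile().length());
      }
      return names;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Carrying the staged name explicitly lets a caller stage a file under a name that differs from its local file name, which a bare `File` cannot express.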