[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x
[ https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5267: - Description: For the next release, the Flink version should be bumped. As changes for 2.7.0 are already frozen, it's going to be 2.8.0. (was: For the next release, the Flink version should be bumped. As changes for 2.7.0 are already frozen, it going to be 2.8.0. ) > Update Flink Runner to Flink 1.6.x > -- > > Key: BEAM-5267 > URL: https://issues.apache.org/jira/browse/BEAM-5267 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.8.0 > > > For the next release, the Flink version should be bumped. As changes for > 2.7.0 are already frozen, it's going to be 2.8.0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5284) Enable Java Portable Flink PostCommit Tests to Jenkins
[ https://issues.apache.org/jira/browse/BEAM-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5284: - Component/s: runner-flink > Enable Java Portable Flink PostCommit Tests to Jenkins > -- > > Key: BEAM-5284 > URL: https://issues.apache.org/jira/browse/BEAM-5284 > Project: Beam > Issue Type: Test > Components: runner-flink, testing >Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: CI > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (BEAM-5284) Enable Java Portable Flink PostCommit Tests to Jenkins
[ https://issues.apache.org/jira/browse/BEAM-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602179#comment-16602179 ] Maximilian Michels edited comment on BEAM-5284 at 9/3/18 1:44 PM: -- +1 (I was about to open an issue for this) was (Author: mxm): +1 > Enable Java Portable Flink PostCommit Tests to Jenkins > -- > > Key: BEAM-5284 > URL: https://issues.apache.org/jira/browse/BEAM-5284 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: CI > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5284) Enable Java Portable Flink PostCommit Tests to Jenkins
[ https://issues.apache.org/jira/browse/BEAM-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602179#comment-16602179 ] Maximilian Michels commented on BEAM-5284: -- +1 > Enable Java Portable Flink PostCommit Tests to Jenkins > -- > > Key: BEAM-5284 > URL: https://issues.apache.org/jira/browse/BEAM-5284 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Ankur Goenka >Assignee: Ankur Goenka >Priority: Major > Labels: CI > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments
[ https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5288: - Description: As of mailing discussions and BEAM-5187, it has become clear that we need to extend the Environment information. In addition to the Docker environment, the extended environment holds deployment options for 1) a process-based environment, 2) an externally managed environment. The proto definition, as of now, looks as follows: {noformat} message Environment { // (Required) The URN of the payload string urn = 1; // (Optional) The data specifying any parameters to the URN. If // the URN does not require any arguments, this may be omitted. bytes payload = 2; } message SupportedEnvironments { enum Primitives { DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"]; PROCESS = 1 [(beam_urn) = "beam:env:process:v1"]; EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"]; } } // The payload of a Docker image message DockerPayload { string container_image = 1; // implicitly linux_amd64. } message ProcessPayload { string os = 1; // "linux", "darwin", .. string arch = 2; // "amd64", .. string command = 3; // process to execute repeated string params = 4; // parameters } {noformat} was: As of mailing discussions and BEAM-5187, it has become clear that we need to extend the Environment information. In addition to the Docker environment, the extended environment holds deployment options for 1) a process-based environment, 2) an externally managed environment. The proto definition, as of now, looks as follows: {noformat} message Environment { // (Required) The URN of the payload string urn = 1; // (Optional) The data specifying any parameters to the URN. If // the URN does not require any arguments, this may be omitted. bytes payload = 2; } message SupportedEnvironments { enum Primitives { DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"]; PROCESS = 1 [(beam_urn) = "beam:env:process:v1"]; EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"]; } } // The payload of a Docker image message DockerPayload { string container_image = 1; // implicitly linux_amd64. } message ProcessPayload { string os = 1; // "linux", "darwin", .. string arch = 2; // "amd64", .. string command = 3; // process to execute } {noformat} > Modify Environment to support non-dockerized SDK harness deployments > - > > Key: BEAM-5288 > URL: https://issues.apache.org/jira/browse/BEAM-5288 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Maximilian Michels >Priority: Major > > As of mailing discussions and BEAM-5187, it has become clear that we need to > extend the Environment information. In addition to the Docker environment, > the extended environment holds deployment options for 1) a process-based > environment, 2) an externally managed environment. > The proto definition, as of now, looks as follows: > {noformat} > message Environment { >// (Required) The URN of the payload >string urn = 1; >// (Optional) The data specifying any parameters to the URN. If >// the URN does not require any arguments, this may be omitted. >bytes payload = 2; > } > message SupportedEnvironments { >enum Primitives { > DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"]; > PROCESS = 1 [(beam_urn) = "beam:env:process:v1"]; > EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"]; >} > } > // The payload of a Docker image > message DockerPayload { >string container_image = 1; // implicitly linux_amd64. > } > message ProcessPayload { >string os = 1; // "linux", "darwin", .. >string arch = 2; // "amd64", .. >string command = 3; // process to execute >repeated string params = 4; // parameters > } > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments
Maximilian Michels created BEAM-5288: Summary: Modify Environment to support non-dockerized SDK harness deployments Key: BEAM-5288 URL: https://issues.apache.org/jira/browse/BEAM-5288 Project: Beam Issue Type: New Feature Components: beam-model Reporter: Maximilian Michels Assignee: Kenneth Knowles As of mailing discussions and BEAM-5187, it has become clear that we need to extend the Environment information. In addition to the Docker environment, the extended environment holds deployment options for 1) a process-based environment, 2) an externally managed environment. The proto definition, as of now, looks as follows: {noformat} message Environment { // (Required) The URN of the payload string urn = 1; // (Optional) The data specifying any parameters to the URN. If // the URN does not require any arguments, this may be omitted. bytes payload = 2; } message SupportedEnvironments { enum Primitives { DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"]; PROCESS = 1 [(beam_urn) = "beam:env:process:v1"]; EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"]; } } // The payload of a Docker image message DockerPayload { string container_image = 1; // implicitly linux_amd64. } message ProcessPayload { string os = 1; // "linux", "darwin", .. string arch = 2; // "amd64", .. string command = 3; // process to execute } {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments
[ https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels reassigned BEAM-5288: Assignee: (was: Kenneth Knowles) > Modify Environment to support non-dockerized SDK harness deployments > - > > Key: BEAM-5288 > URL: https://issues.apache.org/jira/browse/BEAM-5288 > Project: Beam > Issue Type: New Feature > Components: beam-model >Reporter: Maximilian Michels >Priority: Major > > As of mailing discussions and BEAM-5187, it has become clear that we need to > extend the Environment information. In addition to the Docker environment, > the extended environment holds deployment options for 1) a process-based > environment, 2) an externally managed environment. > The proto definition, as of now, looks as follows: > {noformat} > message Environment { >// (Required) The URN of the payload >string urn = 1; >// (Optional) The data specifying any parameters to the URN. If >// the URN does not require any arguments, this may be omitted. >bytes payload = 2; > } > message SupportedEnvironments { >enum Primitives { > DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"]; > PROCESS = 1 [(beam_urn) = "beam:env:process:v1"]; > EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"]; >} > } > // The payload of a Docker image > message DockerPayload { >string container_image = 1; // implicitly linux_amd64. > } > message ProcessPayload { >string os = 1; // "linux", "darwin", .. >string arch = 2; // "amd64", .. >string command = 3; // process to execute > } > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5277) Python SDK wordcount fails due to side inputs in streaming mode
[ https://issues.apache.org/jira/browse/BEAM-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5277: - Description: After BEAM-5250 is fixed, the wordcount fails with: {noformat} RuntimeError: java.lang.NullPointerException: Element processed by SDK before side input is ready at org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:787) at org.apache.beam.runners.flink.translation.functions.FlinkStreamingSideInputHandlerFactory$1.get(FlinkStreamingSideInputHandlerFactory.java:126) at org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handleGetRequest(StateRequestHandlers.java:297) at org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handle(StateRequestHandlers.java:267) at org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:121) at org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:109) at org.apache.beam.vendor.grpc.v1.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248) at org.apache.beam.vendor.grpc.v1.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33) at org.apache.beam.vendor.grpc.v1.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) [while running 'write/Write/WriteImpl/FinalizeWrite'] at org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:157) at org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:140) at org.apache.beam.vendor.grpc.v1.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248) at org.apache.beam.vendor.grpc.v1.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33) at org.apache.beam.vendor.grpc.v1.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ... 1 more {noformat} was: After https://issues.apache.org/jira/browse/BEAM-5250 is fixed, the wordcount fails with: {noformat} RuntimeError: java.lang.NullPointerException: Element processed by SDK before side input is ready at org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:787) at org.apache.beam.runners.flink.translation.functions.FlinkStreamingSideInputHandlerFactory$1.get(FlinkStreamingSideInputHandlerFactory.java:126) at org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handleGetRequest(StateRequestHandlers.java:297) at org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handle(StateRequestHandlers.java:267) at org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:121) at org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:109) at
[jira] [Created] (BEAM-5277) Python SDK wordcount fails due to side inputs in streaming mode
Maximilian Michels created BEAM-5277: Summary: Python SDK wordcount fails due to side inputs in streaming mode Key: BEAM-5277 URL: https://issues.apache.org/jira/browse/BEAM-5277 Project: Beam Issue Type: Bug Components: runner-flink Reporter: Maximilian Michels Assignee: Thomas Weise Fix For: 2.8.0 After https://issues.apache.org/jira/browse/BEAM-5250 is fixed, the wordcount fails with: {noformat} RuntimeError: java.lang.NullPointerException: Element processed by SDK before side input is ready at org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:787) at org.apache.beam.runners.flink.translation.functions.FlinkStreamingSideInputHandlerFactory$1.get(FlinkStreamingSideInputHandlerFactory.java:126) at org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handleGetRequest(StateRequestHandlers.java:297) at org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handle(StateRequestHandlers.java:267) at org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:121) at org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:109) at org.apache.beam.vendor.grpc.v1.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248) at org.apache.beam.vendor.grpc.v1.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33) at org.apache.beam.vendor.grpc.v1.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) [while running 'write/Write/WriteImpl/FinalizeWrite'] at org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:157) at org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:140) at org.apache.beam.vendor.grpc.v1.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248) at org.apache.beam.vendor.grpc.v1.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33) at org.apache.beam.vendor.grpc.v1.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) at org.apache.beam.vendor.grpc.v1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ... 1 more {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-5250) Python Wordcount fails with Flink portable streaming
[ https://issues.apache.org/jira/browse/BEAM-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels reassigned BEAM-5250: Assignee: Maximilian Michels > Python Wordcount fails with Flink portable streaming > > > Key: BEAM-5250 > URL: https://issues.apache.org/jira/browse/BEAM-5250 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Thomas Weise >Assignee: Maximilian Michels >Priority: Major > Labels: portability > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x
[ https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5267: - Priority: Major (was: Minor) > Update Flink Runner to Flink 1.6.x > -- > > Key: BEAM-5267 > URL: https://issues.apache.org/jira/browse/BEAM-5267 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Major > Fix For: 2.8.0 > > > For the next release, the Flink version should be bumped. As changes for > 2.7.0 are already frozen, it going to be 2.8.0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x
[ https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5267: - Fix Version/s: 2.8.0 > Update Flink Runner to Flink 1.6.x > -- > > Key: BEAM-5267 > URL: https://issues.apache.org/jira/browse/BEAM-5267 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > Fix For: 2.8.0 > > > For the next release, the Flink version should be bumped. As changes for > 2.7.0 are already frozen, it going to be 2.8.0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x
[ https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels updated BEAM-5267: - Affects Version/s: (was: 2.8.0) > Update Flink Runner to Flink 1.6.x > -- > > Key: BEAM-5267 > URL: https://issues.apache.org/jira/browse/BEAM-5267 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Maximilian Michels >Assignee: Maximilian Michels >Priority: Minor > > For the next release, the Flink version should be bumped. As changes for > 2.7.0 are already frozen, it going to be 2.8.0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5267) Update Flink Runner to Flink 1.6.x
Maximilian Michels created BEAM-5267: Summary: Update Flink Runner to Flink 1.6.x Key: BEAM-5267 URL: https://issues.apache.org/jira/browse/BEAM-5267 Project: Beam Issue Type: Improvement Components: runner-flink Affects Versions: 2.8.0 Reporter: Maximilian Michels Assignee: Maximilian Michels For the next release, the Flink version should be bumped. As changes for 2.7.0 are already frozen, it going to be 2.8.0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-5211) Flink Streaming ExecutableStage operator chain blocks grpc receiver threads
[ https://issues.apache.org/jira/browse/BEAM-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels resolved BEAM-5211. -- Resolution: Fixed Should be fixed now via https://github.com/apache/beam/pull/6271 > Flink Streaming ExecutableStage operator chain blocks grpc receiver threads > > > Key: BEAM-5211 > URL: https://issues.apache.org/jira/browse/BEAM-5211 > Project: Beam > Issue Type: Bug > Components: runner-flink >Affects Versions: 2.6.0 >Reporter: Thomas Weise >Assignee: Thomas Weise >Priority: Major > Labels: portability > Fix For: 2.7.0 > > Attachments: jstack.log > > Time Spent: 1h 50m > Remaining Estimate: 0h > > The operator attempts to emit results as they are received, within the grpc > thread, while the subtask thread is waiting for bundle completion. This leads > to blocking of grpc threads and eventually deadlock when multiple stages are > within an operator chain. Observed with wordcount, see attached stacktrace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container
[ https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583602#comment-16583602 ] Maximilian Michels edited comment on BEAM-4130 at 8/21/18 1:02 PM: --- For local runs, where Flink is bootstrapped inside the Docker container, we need to default to run the SDK harness in the same container {{InProcessEnvironmentFactory}}. This is necessary because we can't start Docker containers inside the Docker container. For remote Flink clusters, we can use the regular {{DockerEnvironmentFactory}}. edit: The InProcess SDK harness is a separate project: https://issues.apache.org/jira/browse/BEAM-5187 As discussed on the mailing list, we can create sibling containers for embedded runs by mounting the Docker binaries and the Docker socket inside the Job Server container. was (Author: mxm): For local runs, where Flink is bootstrapped inside the Docker container, we need to default to run the SDK harness in the same container {{InProcessEnvironmentFactory}}. This is necessary because we can't start Docker containers inside the Docker container. For remote Flink clusters, we can use the regular {{DockerEnvironmentFactory}}. > Portable Flink runner JobService entry point in a Docker container > -- > > Key: BEAM-4130 > URL: https://issues.apache.org/jira/browse/BEAM-4130 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Ben Sidhom >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 4h 50m > Remaining Estimate: 0h > > The portable Flink runner exists as a Job Service that runs somewhere. We > need a main entry point that itself spins up the job service (and artifact > staging service). The main program itself should be packaged into an uberjar > such that it can be run locally or submitted to a Flink deployment via `flink > run`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5187) Create an InProcessJobBundleFactory for non-dockerized SDK harness
Maximilian Michels created BEAM-5187: Summary: Create an InProcessJobBundleFactory for non-dockerized SDK harness Key: BEAM-5187 URL: https://issues.apache.org/jira/browse/BEAM-5187 Project: Beam Issue Type: New Feature Components: runner-core Reporter: Maximilian Michels Assignee: Kenneth Knowles As discussed on the mailing list [1], we want to giver users an option to execute portable pipelines without Docker. Analog to the {{DockerJobBundleFactory}}, a {{InProcessJobBundleFactory}} could be added to directly fork SDK harness processes. Artifacts will be provided by an artifact directory or could be setup similar to the existing bootstrapping code ("boot.go") which we use for containers. The process-based execution can optionally be configured via the pipeline options. [1] https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5174) Website feed is broken due to license header
Maximilian Michels created BEAM-5174: Summary: Website feed is broken due to license header Key: BEAM-5174 URL: https://issues.apache.org/jira/browse/BEAM-5174 Project: Beam Issue Type: Bug Components: website Reporter: Maximilian Michels Assignee: Melissa Pashniak The feed at https://beam.apache.org/feed.xml starts out with a license header which breaks the XML support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-5162) Document Metrics API for users
Maximilian Michels created BEAM-5162: Summary: Document Metrics API for users Key: BEAM-5162 URL: https://issues.apache.org/jira/browse/BEAM-5162 Project: Beam Issue Type: Task Components: sdk-java-core, website Reporter: Maximilian Michels Assignee: Kenneth Knowles The Metrics API is currently only documented in Beam's JavaDocs. A complementary user documentation with examples would be desirable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container
[ https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583602#comment-16583602 ] Maximilian Michels commented on BEAM-4130: -- For local runs, where Flink is bootstrapped inside the Docker container, we need to default to run the SDK harness in the same container {{InProcessEnvironmentFactory}}. This is necessary because we can't start Docker containers inside the Docker container. For remote Flink clusters, we can use the regular {{DockerEnvironmentFactory}}. > Portable Flink runner JobService entry point in a Docker container > -- > > Key: BEAM-4130 > URL: https://issues.apache.org/jira/browse/BEAM-4130 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Ben Sidhom >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 4h 40m > Remaining Estimate: 0h > > The portable Flink runner exists as a Job Service that runs somewhere. We > need a main entry point that itself spins up the job service (and artifact > staging service). The main program itself should be packaged into an uberjar > such that it can be run locally or submitted to a Flink deployment via `flink > run`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container
[ https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581180#comment-16581180 ] Maximilian Michels edited comment on BEAM-4130 at 8/15/18 3:13 PM: --- To add to the above, shouldn't the behavior for the JobService be the same as for the SDK Harness? Currently, the Python SDK Harness is either started from a Docker container or supplied via a remote URL. Similarly, the JobService could either be brought up using Docker or assumed to be running at some address. was (Author: mxm): To add to the above, shouldn't the behavior for the JobService be the same as for the SDK Harness? Currently, the Python SDK Harness is either started from a Docker container or supplied via a remote URL. Similarly, the JobService is either brought up using Docker or assumed to be running at some address. > Portable Flink runner JobService entry point in a Docker container > -- > > Key: BEAM-4130 > URL: https://issues.apache.org/jira/browse/BEAM-4130 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Ben Sidhom >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 4h 20m > Remaining Estimate: 0h > > The portable Flink runner exists as a Job Service that runs somewhere. We > need a main entry point that itself spins up the job service (and artifact > staging service). The main program itself should be packaged into an uberjar > such that it can be run locally or submitted to a Flink deployment via `flink > run`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container
[ https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581180#comment-16581180 ] Maximilian Michels commented on BEAM-4130: -- To add to the above, shouldn't the behavior for the JobService be the same as for the SDK Harness? Currently, the Python SDK Harness is either started from a Docker container or supplied via a remote URL. Similarly, the JobService is either brought up using Docker or assumed to be running at some address. > Portable Flink runner JobService entry point in a Docker container > -- > > Key: BEAM-4130 > URL: https://issues.apache.org/jira/browse/BEAM-4130 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Ben Sidhom >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 4h 20m > Remaining Estimate: 0h > > The portable Flink runner exists as a Job Service that runs somewhere. We > need a main entry point that itself spins up the job service (and artifact > staging service). The main program itself should be packaged into an uberjar > such that it can be run locally or submitted to a Flink deployment via `flink > run`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container
[ https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581161#comment-16581161 ] Maximilian Michels commented on BEAM-4130: -- >From the design document ( https://s.apache.org/beam-job-api ) it looks like >Docker has been chosen as the preferred approach. There is already a runner >for testing purposes which brings up a JobService/ArtifactService, see >{{TestPortableRunner}}. The goal is to have a similar functionality when >executing in non-testing mode. I agree with [~thw] that the Docker container shouldn't be the only way. Users should be able to manually run the JobService and the service might be designed to run long-term in the future. > Portable Flink runner JobService entry point in a Docker container > -- > > Key: BEAM-4130 > URL: https://issues.apache.org/jira/browse/BEAM-4130 > Project: Beam > Issue Type: New Feature > Components: runner-flink >Reporter: Ben Sidhom >Assignee: Maximilian Michels >Priority: Minor > Time Spent: 4h 20m > Remaining Estimate: 0h > > The portable Flink runner exists as a Job Service that runs somewhere. We > need a main entry point that itself spins up the job service (and artifact > staging service). The main program itself should be packaged into an uberjar > such that it can be run locally or submitted to a Flink deployment via `flink > run`. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-956) Execute ReduceFnRunner Directly in Flink Runner
[ https://issues.apache.org/jira/browse/BEAM-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160289#comment-16160289 ] Maximilian Michels commented on BEAM-956: - Hi! :) Thanks for clarifying. Well at least the {{GroupAlsoByWindowViaWindowSetDoFn}} is not an {{OldDoFn}} anymore. > Execute ReduceFnRunner Directly in Flink Runner > --- > > Key: BEAM-956 > URL: https://issues.apache.org/jira/browse/BEAM-956 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Aljoscha Krettek > > Right now, a {{ReduceFnRunner}} is executed via > {{GroupAlsoByWindowViaWindowSetDoFn}} which in turn is executed via a > {{DoFnRunner}}. We should change that to get rid of the dependence on > {{GroupAlsoByWindowViaWindowSetDoFn}} which is an {{OldDoFn}} and also to get > rid of some unneeded layering. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-1628) Flink runner: logic around --flinkMaster is error-prone
[ https://issues.apache.org/jira/browse/BEAM-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138333#comment-16138333 ] Maximilian Michels commented on BEAM-1628: -- [~vectorijk] Are you still working on this issue? > Flink runner: logic around --flinkMaster is error-prone > --- > > Key: BEAM-1628 > URL: https://issues.apache.org/jira/browse/BEAM-1628 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Davor Bonaci >Assignee: Kai Jiang >Priority: Minor > Labels: newbie, starter > > The logic for handling {{--flinkMaster}} seems not particularly user-friendly. > https://github.com/apache/beam/blob/fbcde4cdc7d68de8734bf540c079b2747631a854/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java#L132 > {code} > if (masterUrl.equals("[local]")) { > } else if (masterUrl.equals("[collection]")) { > } else if (masterUrl.equals("[auto]")) { > } else if (masterUrl.matches(".*:\\d*")) { > } else { > // use auto. > } > {code} > The options are constructed with "auto" set as default. > I think we should do the following: > * I assume there's a default port for the Flink master. We should default to > it. > * We should treat a string without a colon as a host name. (Not default to > local execution.) > This is super easy fix, hopefully someone can pick it up quickly ;-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-956) Execute ReduceFnRunner Directly in Flink Runner
[ https://issues.apache.org/jira/browse/BEAM-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138331#comment-16138331 ] Maximilian Michels commented on BEAM-956: - It appears this issue has been fixed. [~aljoscha] can this be closed? > Execute ReduceFnRunner Directly in Flink Runner > --- > > Key: BEAM-956 > URL: https://issues.apache.org/jira/browse/BEAM-956 > Project: Beam > Issue Type: Improvement > Components: runner-flink >Reporter: Aljoscha Krettek > > Right now, a {{ReduceFnRunner}} is executed via > {{GroupAlsoByWindowViaWindowSetDoFn}} which in turn is executed via a > {{DoFnRunner}}. We should change that to get rid of the dependence on > {{GroupAlsoByWindowViaWindowSetDoFn}} which is an {{OldDoFn}} and also to get > rid of some unneeded layering. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (BEAM-1255) java.io.NotSerializableException in flink on UnboundedSource
[ https://issues.apache.org/jira/browse/BEAM-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maximilian Michels resolved BEAM-1255. -- Resolution: Fixed > java.io.NotSerializableException in flink on UnboundedSource > > > Key: BEAM-1255 > URL: https://issues.apache.org/jira/browse/BEAM-1255 > Project: Beam > Issue Type: Bug > Components: runner-flink >Affects Versions: 0.5.0 >Reporter: Alexey Diomin >Assignee: Alexey Diomin > Fix For: 0.5.0 > > > After introduce new Coders with TypeDescriptor on flink runner we have issue: > {code} > Caused by: java.io.NotSerializableException: > sun.reflect.generics.reflectiveObjects.TypeVariableImpl > - element of array (index: 0) > - array (class "[Ljava.lang.Object;", size: 2) > - field (class > "com.google.common.collect.ImmutableList$SerializedForm", name: "elements", > type: "class [Ljava.lang.Object;") > - object (class > "com.google.common.collect.ImmutableList$SerializedForm", > com.google.common.collect.ImmutableList$SerializedForm@30af5b6b) > - field (class "com.google.common.reflect.Types$ParameterizedTypeImpl", > name: "argumentsList", type: "class com.google.common.collect.ImmutableList") > - object (class > "com.google.common.reflect.Types$ParameterizedTypeImpl", > org.apache.beam.sdk.io.UnboundedSource) > - field (class "com.google.common.reflect.TypeToken", name: > "runtimeType", type: "interface java.lang.reflect.Type") > - object (class "com.google.common.reflect.TypeToken$SimpleTypeToken", > org.apache.beam.sdk.io.UnboundedSource ) > - field (class "org.apache.beam.sdk.values.TypeDescriptor", name: > "token", type: "class com.google.common.reflect.TypeToken") > - object (class > "org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper$1", > org.apache.beam.sdk.io.UnboundedSource ) > - field (class "org.apache.beam.sdk.coders.SerializableCoder", name: > "typeDescriptor", type: "class org.apache.beam.sdk.values.TypeDescriptor") > - object (class "org.apache.beam.sdk.coders.SerializableCoder", > SerializableCoder) > - field (class "org.apache.beam.sdk.coders.KvCoder", name: "keyCoder", > type: "interface org.apache.beam.sdk.coders.Coder") > - object (class "org.apache.beam.sdk.coders.KvCoder", > KvCoder(SerializableCoder,AvroCoder)) > - field (class "org.apache.beam.sdk.coders.IterableLikeCoder", name: > "elementCoder", type: "interface org.apache.beam.sdk.coders.Coder") > - object (class "org.apache.beam.sdk.coders.ListCoder", > ListCoder(KvCoder(SerializableCoder,AvroCoder))) > - field (class > "org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper", > name: "checkpointCoder", type: "class org.apache.beam.sdk.coders.ListCoder") > - root object (class > "org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper", > > org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper@b2c5e07) > at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1182) > {code} > bug introduced after commit: > 7b98fa08d14e8121e8885f00a9a9a878b73f81a6 > pull request: > https://github.com/apache/beam/pull/1537 > Code for reproduce error > {code} > import com.google.common.collect.ImmutableList; > import org.apache.beam.runners.flink.FlinkPipelineOptions; > import org.apache.beam.runners.flink.FlinkRunner; > import org.apache.beam.sdk.Pipeline; > import org.apache.beam.sdk.io.kafka.KafkaIO; > import org.apache.beam.sdk.options.PipelineOptionsFactory; > public class FlinkSerialisationError { > public static void main(String[] args) { > FlinkPipelineOptions options = > PipelineOptionsFactory.as(FlinkPipelineOptions.class); > options.setRunner(FlinkRunner.class); > options.setStreaming(true); > Pipeline pipeline = Pipeline.create(options); > pipeline.apply( > KafkaIO.read() > .withBootstrapServers("localhost:9092") > .withTopics(ImmutableList.of("test")) > // set ConsumerGroup > .withoutMetadata()); > pipeline.run(); > } > } > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)