[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x

2018-09-03 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-5267:
-
Description: For the next release, the Flink version should be bumped. As 
changes for 2.7.0 are already frozen, it's going to be 2.8.0.   (was: For the 
next release, the Flink version should be bumped. As changes for 2.7.0 are 
already frozen, it going to be 2.8.0. )

> Update Flink Runner to Flink 1.6.x
> --
>
> Key: BEAM-5267
> URL: https://issues.apache.org/jira/browse/BEAM-5267
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.8.0
>
>
> For the next release, the Flink version should be bumped. As changes for 
> 2.7.0 are already frozen, it's going to be 2.8.0. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5284) Enable Java Portable Flink PostCommit Tests to Jenkins

2018-09-03 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-5284:
-
Component/s: runner-flink

> Enable Java Portable Flink PostCommit Tests to Jenkins
> --
>
> Key: BEAM-5284
> URL: https://issues.apache.org/jira/browse/BEAM-5284
> Project: Beam
>  Issue Type: Test
>  Components: runner-flink, testing
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: CI
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-5284) Enable Java Portable Flink PostCommit Tests to Jenkins

2018-09-03 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602179#comment-16602179
 ] 

Maximilian Michels edited comment on BEAM-5284 at 9/3/18 1:44 PM:
--

+1 (I was about to open an issue for this)


was (Author: mxm):
+1

> Enable Java Portable Flink PostCommit Tests to Jenkins
> --
>
> Key: BEAM-5284
> URL: https://issues.apache.org/jira/browse/BEAM-5284
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: CI
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-5284) Enable Java Portable Flink PostCommit Tests to Jenkins

2018-09-03 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-5284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602179#comment-16602179
 ] 

Maximilian Michels commented on BEAM-5284:
--

+1

> Enable Java Portable Flink PostCommit Tests to Jenkins
> --
>
> Key: BEAM-5284
> URL: https://issues.apache.org/jira/browse/BEAM-5284
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Ankur Goenka
>Assignee: Ankur Goenka
>Priority: Major
>  Labels: CI
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-09-03 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-5288:
-
Description: 
As of mailing discussions and BEAM-5187, it has become clear that we need to 
extend the Environment information. In addition to the Docker environment, the 
extended environment holds deployment options for 1) a process-based 
environment, 2) an externally managed environment. 

The proto definition, as of now, looks as follows:

{noformat}
 message Environment {

   // (Required) The URN of the payload
   string urn = 1;

   // (Optional) The data specifying any parameters to the URN. If
   // the URN does not require any arguments, this may be omitted.
   bytes payload = 2;
 }

 message SupportedEnvironments {
   enum Primitives {
 DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];

 PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];

 EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
   }
 }

 // The payload of a Docker image
 message DockerPayload {
   string container_image = 1;  // implicitly linux_amd64.
 }

 message ProcessPayload {
   string os = 1;  // "linux", "darwin", ..
   string arch = 2;  // "amd64", ..
   string command = 3; // process to execute
   repeated string params = 4; // parameters
 }
{noformat}

  was:
As of mailing discussions and BEAM-5187, it has become clear that we need to 
extend the Environment information. In addition to the Docker environment, the 
extended environment holds deployment options for 1) a process-based 
environment, 2) an externally managed environment. 

The proto definition, as of now, looks as follows:

{noformat}
 message Environment {

   // (Required) The URN of the payload
   string urn = 1;

   // (Optional) The data specifying any parameters to the URN. If
   // the URN does not require any arguments, this may be omitted.
   bytes payload = 2;
 }

 message SupportedEnvironments {
   enum Primitives {
 DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];

 PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];

 EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
   }
 }

 // The payload of a Docker image
 message DockerPayload {
   string container_image = 1;  // implicitly linux_amd64.
 }

 message ProcessPayload {
   string os = 1;  // "linux", "darwin", ..
   string arch = 2;  // "amd64", ..
   string command = 3; // process to execute
 }
{noformat}


> Modify Environment to support non-dockerized SDK harness deployments 
> -
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Maximilian Michels
>Priority: Major
>
> As of mailing discussions and BEAM-5187, it has become clear that we need to 
> extend the Environment information. In addition to the Docker environment, 
> the extended environment holds deployment options for 1) a process-based 
> environment, 2) an externally managed environment. 
> The proto definition, as of now, looks as follows:
> {noformat}
>  message Environment {
>// (Required) The URN of the payload
>string urn = 1;
>// (Optional) The data specifying any parameters to the URN. If
>// the URN does not require any arguments, this may be omitted.
>bytes payload = 2;
>  }
>  message SupportedEnvironments {
>enum Primitives {
>  DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>  PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>  EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>}
>  }
>  // The payload of a Docker image
>  message DockerPayload {
>string container_image = 1;  // implicitly linux_amd64.
>  }
>  message ProcessPayload {
>string os = 1;  // "linux", "darwin", ..
>string arch = 2;  // "amd64", ..
>string command = 3; // process to execute
>repeated string params = 4; // parameters
>  }
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-09-03 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-5288:


 Summary: Modify Environment to support non-dockerized SDK harness 
deployments 
 Key: BEAM-5288
 URL: https://issues.apache.org/jira/browse/BEAM-5288
 Project: Beam
  Issue Type: New Feature
  Components: beam-model
Reporter: Maximilian Michels
Assignee: Kenneth Knowles


As of mailing discussions and BEAM-5187, it has become clear that we need to 
extend the Environment information. In addition to the Docker environment, the 
extended environment holds deployment options for 1) a process-based 
environment, 2) an externally managed environment. 

The proto definition, as of now, looks as follows:

{noformat}
 message Environment {

   // (Required) The URN of the payload
   string urn = 1;

   // (Optional) The data specifying any parameters to the URN. If
   // the URN does not require any arguments, this may be omitted.
   bytes payload = 2;
 }

 message SupportedEnvironments {
   enum Primitives {
 DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];

 PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];

 EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
   }
 }

 // The payload of a Docker image
 message DockerPayload {
   string container_image = 1;  // implicitly linux_amd64.
 }

 message ProcessPayload {
   string os = 1;  // "linux", "darwin", ..
   string arch = 2;  // "amd64", ..
   string command = 3; // process to execute
 }
{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5288) Modify Environment to support non-dockerized SDK harness deployments

2018-09-03 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-5288:


Assignee: (was: Kenneth Knowles)

> Modify Environment to support non-dockerized SDK harness deployments 
> -
>
> Key: BEAM-5288
> URL: https://issues.apache.org/jira/browse/BEAM-5288
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Maximilian Michels
>Priority: Major
>
> As of mailing discussions and BEAM-5187, it has become clear that we need to 
> extend the Environment information. In addition to the Docker environment, 
> the extended environment holds deployment options for 1) a process-based 
> environment, 2) an externally managed environment. 
> The proto definition, as of now, looks as follows:
> {noformat}
>  message Environment {
>// (Required) The URN of the payload
>string urn = 1;
>// (Optional) The data specifying any parameters to the URN. If
>// the URN does not require any arguments, this may be omitted.
>bytes payload = 2;
>  }
>  message SupportedEnvironments {
>enum Primitives {
>  DOCKER = 0 [(beam_urn) = "beam:env:docker:v1"];
>  PROCESS = 1 [(beam_urn) = "beam:env:process:v1"];
>  EXTERNAL = 2 [(beam_urn) = "beam:env:external:v1"];
>}
>  }
>  // The payload of a Docker image
>  message DockerPayload {
>string container_image = 1;  // implicitly linux_amd64.
>  }
>  message ProcessPayload {
>string os = 1;  // "linux", "darwin", ..
>string arch = 2;  // "amd64", ..
>string command = 3; // process to execute
>  }
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5277) Python SDK wordcount fails due to side inputs in streaming mode

2018-08-31 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-5277:
-
Description: 
After BEAM-5250 is fixed, the wordcount fails with:

{noformat}
RuntimeError: java.lang.NullPointerException: Element processed by SDK before 
side input is ready
at 
org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:787)
at 
org.apache.beam.runners.flink.translation.functions.FlinkStreamingSideInputHandlerFactory$1.get(FlinkStreamingSideInputHandlerFactory.java:126)
at 
org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handleGetRequest(StateRequestHandlers.java:297)
at 
org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handle(StateRequestHandlers.java:267)
at 
org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:121)
at 
org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:109)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
 [while running 'write/Write/WriteImpl/FinalizeWrite']

at 
org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:157)
at 
org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:140)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more
{noformat}


  was:
After https://issues.apache.org/jira/browse/BEAM-5250 is fixed, the wordcount 
fails with:

{noformat}
RuntimeError: java.lang.NullPointerException: Element processed by SDK before 
side input is ready
at 
org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:787)
at 
org.apache.beam.runners.flink.translation.functions.FlinkStreamingSideInputHandlerFactory$1.get(FlinkStreamingSideInputHandlerFactory.java:126)
at 
org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handleGetRequest(StateRequestHandlers.java:297)
at 
org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handle(StateRequestHandlers.java:267)
at 
org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:121)
at 
org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:109)
at 

[jira] [Created] (BEAM-5277) Python SDK wordcount fails due to side inputs in streaming mode

2018-08-31 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-5277:


 Summary: Python SDK wordcount fails due to side inputs in 
streaming mode
 Key: BEAM-5277
 URL: https://issues.apache.org/jira/browse/BEAM-5277
 Project: Beam
  Issue Type: Bug
  Components: runner-flink
Reporter: Maximilian Michels
Assignee: Thomas Weise
 Fix For: 2.8.0


After https://issues.apache.org/jira/browse/BEAM-5250 is fixed, the wordcount 
fails with:

{noformat}
RuntimeError: java.lang.NullPointerException: Element processed by SDK before 
side input is ready
at 
org.apache.beam.repackaged.beam_runners_flink_2.11.com.google.common.base.Preconditions.checkNotNull(Preconditions.java:787)
at 
org.apache.beam.runners.flink.translation.functions.FlinkStreamingSideInputHandlerFactory$1.get(FlinkStreamingSideInputHandlerFactory.java:126)
at 
org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handleGetRequest(StateRequestHandlers.java:297)
at 
org.apache.beam.runners.fnexecution.state.StateRequestHandlers$StateRequestHandlerToSideInputHandlerFactoryAdapter.handle(StateRequestHandlers.java:267)
at 
org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:121)
at 
org.apache.beam.runners.fnexecution.state.GrpcStateService$Inbound.onNext(GrpcStateService.java:109)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
 [while running 'write/Write/WriteImpl/FinalizeWrite']

at 
org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:157)
at 
org.apache.beam.runners.fnexecution.control.FnApiControlClient$ResponseStreamObserver.onNext(FnApiControlClient.java:140)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.ForwardingServerCallListener.onMessage(ForwardingServerCallListener.java:33)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.Contexts$ContextualizedServerCallListener.onMessage(Contexts.java:76)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:263)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1MessagesAvailable.runInContext(ServerImpl.java:683)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at 
org.apache.beam.vendor.grpc.v1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more
{noformat}




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (BEAM-5250) Python Wordcount fails with Flink portable streaming

2018-08-30 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels reassigned BEAM-5250:


Assignee: Maximilian Michels

> Python Wordcount fails with Flink portable streaming
> 
>
> Key: BEAM-5250
> URL: https://issues.apache.org/jira/browse/BEAM-5250
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Thomas Weise
>Assignee: Maximilian Michels
>Priority: Major
>  Labels: portability
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x

2018-08-30 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-5267:
-
Priority: Major  (was: Minor)

> Update Flink Runner to Flink 1.6.x
> --
>
> Key: BEAM-5267
> URL: https://issues.apache.org/jira/browse/BEAM-5267
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Major
> Fix For: 2.8.0
>
>
> For the next release, the Flink version should be bumped. As changes for 
> 2.7.0 are already frozen, it going to be 2.8.0. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x

2018-08-30 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-5267:
-
Fix Version/s: 2.8.0

> Update Flink Runner to Flink 1.6.x
> --
>
> Key: BEAM-5267
> URL: https://issues.apache.org/jira/browse/BEAM-5267
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
> Fix For: 2.8.0
>
>
> For the next release, the Flink version should be bumped. As changes for 
> 2.7.0 are already frozen, it going to be 2.8.0. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (BEAM-5267) Update Flink Runner to Flink 1.6.x

2018-08-30 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels updated BEAM-5267:
-
Affects Version/s: (was: 2.8.0)

> Update Flink Runner to Flink 1.6.x
> --
>
> Key: BEAM-5267
> URL: https://issues.apache.org/jira/browse/BEAM-5267
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Maximilian Michels
>Assignee: Maximilian Michels
>Priority: Minor
>
> For the next release, the Flink version should be bumped. As changes for 
> 2.7.0 are already frozen, it going to be 2.8.0. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5267) Update Flink Runner to Flink 1.6.x

2018-08-30 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-5267:


 Summary: Update Flink Runner to Flink 1.6.x
 Key: BEAM-5267
 URL: https://issues.apache.org/jira/browse/BEAM-5267
 Project: Beam
  Issue Type: Improvement
  Components: runner-flink
Affects Versions: 2.8.0
Reporter: Maximilian Michels
Assignee: Maximilian Michels


For the next release, the Flink version should be bumped. As changes for 2.7.0 
are already frozen, it going to be 2.8.0. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (BEAM-5211) Flink Streaming ExecutableStage operator chain blocks grpc receiver threads

2018-08-24 Thread Maximilian Michels (JIRA)


 [ 
https://issues.apache.org/jira/browse/BEAM-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-5211.
--
Resolution: Fixed

Should be fixed now via https://github.com/apache/beam/pull/6271

> Flink Streaming ExecutableStage operator chain blocks grpc receiver threads 
> 
>
> Key: BEAM-5211
> URL: https://issues.apache.org/jira/browse/BEAM-5211
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.6.0
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Major
>  Labels: portability
> Fix For: 2.7.0
>
> Attachments: jstack.log
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The operator attempts to emit results as they are received, within the grpc 
> thread, while the subtask thread is waiting for bundle completion. This leads 
> to blocking of grpc threads and eventually deadlock when multiple stages are 
> within an operator chain. Observed with wordcount, see attached stacktrace.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-08-21 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583602#comment-16583602
 ] 

Maximilian Michels edited comment on BEAM-4130 at 8/21/18 1:02 PM:
---

For local runs, where Flink is bootstrapped inside the Docker container, we 
need to default to run the SDK harness in the same container 
{{InProcessEnvironmentFactory}}. This is necessary because we can't start 
Docker containers inside the Docker container.

For remote Flink clusters, we can use the regular {{DockerEnvironmentFactory}}.

edit: The InProcess SDK harness is a separate project: 
https://issues.apache.org/jira/browse/BEAM-5187

As discussed on the mailing list, we can create sibling containers for embedded 
runs by mounting the Docker binaries and the Docker socket inside the Job 
Server container.


was (Author: mxm):
For local runs, where Flink is bootstrapped inside the Docker container, we 
need to default to run the SDK harness in the same container 
{{InProcessEnvironmentFactory}}. This is necessary because we can't start 
Docker containers inside the Docker container.

For remote Flink clusters, we can use the regular {{DockerEnvironmentFactory}}.

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5187) Create an InProcessJobBundleFactory for non-dockerized SDK harness

2018-08-21 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-5187:


 Summary: Create an InProcessJobBundleFactory for non-dockerized 
SDK harness
 Key: BEAM-5187
 URL: https://issues.apache.org/jira/browse/BEAM-5187
 Project: Beam
  Issue Type: New Feature
  Components: runner-core
Reporter: Maximilian Michels
Assignee: Kenneth Knowles


As discussed on the mailing list [1], we want to giver users an option to 
execute portable pipelines without Docker. Analog to the 
{{DockerJobBundleFactory}}, a {{InProcessJobBundleFactory}} could be added to 
directly fork SDK harness processes.

Artifacts will be provided by an artifact directory or could be setup similar 
to the existing bootstrapping code ("boot.go") which we use for containers.

The process-based execution can optionally be configured via the pipeline 
options.

[1] 
https://lists.apache.org/thread.html/d8b81e9f74f77d74c8b883cda80fa48efdcaf6ac2ad313c4fe68795a@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5174) Website feed is broken due to license header

2018-08-20 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-5174:


 Summary: Website feed is broken due to license header
 Key: BEAM-5174
 URL: https://issues.apache.org/jira/browse/BEAM-5174
 Project: Beam
  Issue Type: Bug
  Components: website
Reporter: Maximilian Michels
Assignee: Melissa Pashniak


The feed at https://beam.apache.org/feed.xml starts out with a license header 
which breaks the XML support.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (BEAM-5162) Document Metrics API for users

2018-08-17 Thread Maximilian Michels (JIRA)
Maximilian Michels created BEAM-5162:


 Summary: Document Metrics API for users
 Key: BEAM-5162
 URL: https://issues.apache.org/jira/browse/BEAM-5162
 Project: Beam
  Issue Type: Task
  Components: sdk-java-core, website
Reporter: Maximilian Michels
Assignee: Kenneth Knowles


The Metrics API is currently only documented in Beam's JavaDocs. A 
complementary user documentation with examples would be desirable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-08-17 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16583602#comment-16583602
 ] 

Maximilian Michels commented on BEAM-4130:
--

For local runs, where Flink is bootstrapped inside the Docker container, we 
need to default to run the SDK harness in the same container 
{{InProcessEnvironmentFactory}}. This is necessary because we can't start 
Docker containers inside the Docker container.

For remote Flink clusters, we can use the regular {{DockerEnvironmentFactory}}.

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-08-15 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581180#comment-16581180
 ] 

Maximilian Michels edited comment on BEAM-4130 at 8/15/18 3:13 PM:
---

To add to the above, shouldn't the behavior for the JobService be the same as 
for the SDK Harness? Currently, the Python SDK Harness is either started from a 
Docker container or supplied via a remote URL. Similarly, the JobService could 
either be brought up using Docker or assumed to be running at some address.


was (Author: mxm):
To add to the above, shouldn't the behavior for the JobService be the same as 
for the SDK Harness? Currently, the Python SDK Harness is either started from a 
Docker container or supplied via a remote URL. Similarly, the JobService is 
either brought up using Docker or assumed to be running at some address.

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-08-15 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581180#comment-16581180
 ] 

Maximilian Michels commented on BEAM-4130:
--

To add to the above, shouldn't the behavior for the JobService be the same as 
for the SDK Harness? Currently, the Python SDK Harness is either started from a 
Docker container or supplied via a remote URL. Similarly, the JobService is 
either brought up using Docker or assumed to be running at some address.

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-4130) Portable Flink runner JobService entry point in a Docker container

2018-08-15 Thread Maximilian Michels (JIRA)


[ 
https://issues.apache.org/jira/browse/BEAM-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581161#comment-16581161
 ] 

Maximilian Michels commented on BEAM-4130:
--

>From the design document ( https://s.apache.org/beam-job-api ) it looks like 
>Docker has been chosen as the preferred approach. There is already a runner 
>for testing purposes which brings up a JobService/ArtifactService, see 
>{{TestPortableRunner}}. The goal is to have a similar functionality when 
>executing in non-testing mode.

I agree with [~thw] that the Docker container shouldn't be the only way. Users 
should be able to manually run the JobService and the service might be designed 
to run long-term in the future.

> Portable Flink runner JobService entry point in a Docker container
> --
>
> Key: BEAM-4130
> URL: https://issues.apache.org/jira/browse/BEAM-4130
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-flink
>Reporter: Ben Sidhom
>Assignee: Maximilian Michels
>Priority: Minor
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> The portable Flink runner exists as a Job Service that runs somewhere. We 
> need a main entry point that itself spins up the job service (and artifact 
> staging service). The main program itself should be packaged into an uberjar 
> such that it can be run locally or submitted to a Flink deployment via `flink 
> run`.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-956) Execute ReduceFnRunner Directly in Flink Runner

2017-09-10 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160289#comment-16160289
 ] 

Maximilian Michels commented on BEAM-956:
-

Hi! :) Thanks for clarifying. Well at least the 
{{GroupAlsoByWindowViaWindowSetDoFn}} is not an {{OldDoFn}} anymore. 

> Execute ReduceFnRunner Directly in Flink Runner
> ---
>
> Key: BEAM-956
> URL: https://issues.apache.org/jira/browse/BEAM-956
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>
> Right now, a {{ReduceFnRunner}} is executed via 
> {{GroupAlsoByWindowViaWindowSetDoFn}} which in turn is executed via a 
> {{DoFnRunner}}. We should change that to get rid of the dependence on 
> {{GroupAlsoByWindowViaWindowSetDoFn}} which is an {{OldDoFn}} and also to get 
> rid of some unneeded layering.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-1628) Flink runner: logic around --flinkMaster is error-prone

2017-08-23 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138333#comment-16138333
 ] 

Maximilian Michels commented on BEAM-1628:
--

[~vectorijk] Are you still working on this issue?

> Flink runner: logic around --flinkMaster is error-prone
> ---
>
> Key: BEAM-1628
> URL: https://issues.apache.org/jira/browse/BEAM-1628
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Reporter: Davor Bonaci
>Assignee: Kai Jiang
>Priority: Minor
>  Labels: newbie, starter
>
> The logic for handling {{--flinkMaster}} seems not particularly user-friendly.
> https://github.com/apache/beam/blob/fbcde4cdc7d68de8734bf540c079b2747631a854/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java#L132
> {code}
> if (masterUrl.equals("[local]")) {
> } else if (masterUrl.equals("[collection]")) {
> } else if (masterUrl.equals("[auto]")) {
> } else if (masterUrl.matches(".*:\\d*")) {
> } else {
>   // use auto.
> }
> {code}
> The options are constructed with "auto" set as default.
> I think we should do the following:
> * I assume there's a default port for the Flink master. We should default to 
> it.
> * We should treat a string without a colon as a host name. (Not default to 
> local execution.)
> This is super easy fix, hopefully someone can pick it up quickly ;-)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-956) Execute ReduceFnRunner Directly in Flink Runner

2017-08-23 Thread Maximilian Michels (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138331#comment-16138331
 ] 

Maximilian Michels commented on BEAM-956:
-

It appears this issue has been fixed. [~aljoscha] can this be closed?

> Execute ReduceFnRunner Directly in Flink Runner
> ---
>
> Key: BEAM-956
> URL: https://issues.apache.org/jira/browse/BEAM-956
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-flink
>Reporter: Aljoscha Krettek
>
> Right now, a {{ReduceFnRunner}} is executed via 
> {{GroupAlsoByWindowViaWindowSetDoFn}} which in turn is executed via a 
> {{DoFnRunner}}. We should change that to get rid of the dependence on 
> {{GroupAlsoByWindowViaWindowSetDoFn}} which is an {{OldDoFn}} and also to get 
> rid of some unneeded layering.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (BEAM-1255) java.io.NotSerializableException in flink on UnboundedSource

2017-01-22 Thread Maximilian Michels (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maximilian Michels resolved BEAM-1255.
--
Resolution: Fixed

> java.io.NotSerializableException in flink on UnboundedSource
> 
>
> Key: BEAM-1255
> URL: https://issues.apache.org/jira/browse/BEAM-1255
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 0.5.0
>Reporter: Alexey Diomin
>Assignee: Alexey Diomin
> Fix For: 0.5.0
>
>
> After introduce new Coders with TypeDescriptor on flink runner we have issue:
> {code}
> Caused by: java.io.NotSerializableException: 
> sun.reflect.generics.reflectiveObjects.TypeVariableImpl
>   - element of array (index: 0)
>   - array (class "[Ljava.lang.Object;", size: 2)
>   - field (class 
> "com.google.common.collect.ImmutableList$SerializedForm", name: "elements", 
> type: "class [Ljava.lang.Object;")
>   - object (class 
> "com.google.common.collect.ImmutableList$SerializedForm", 
> com.google.common.collect.ImmutableList$SerializedForm@30af5b6b)
>   - field (class "com.google.common.reflect.Types$ParameterizedTypeImpl", 
> name: "argumentsList", type: "class com.google.common.collect.ImmutableList")
>   - object (class 
> "com.google.common.reflect.Types$ParameterizedTypeImpl", 
> org.apache.beam.sdk.io.UnboundedSource)
>   - field (class "com.google.common.reflect.TypeToken", name: 
> "runtimeType", type: "interface java.lang.reflect.Type")
>   - object (class "com.google.common.reflect.TypeToken$SimpleTypeToken", 
> org.apache.beam.sdk.io.UnboundedSource)
>   - field (class "org.apache.beam.sdk.values.TypeDescriptor", name: 
> "token", type: "class com.google.common.reflect.TypeToken")
>   - object (class 
> "org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper$1",
>  org.apache.beam.sdk.io.UnboundedSource)
>   - field (class "org.apache.beam.sdk.coders.SerializableCoder", name: 
> "typeDescriptor", type: "class org.apache.beam.sdk.values.TypeDescriptor")
>   - object (class "org.apache.beam.sdk.coders.SerializableCoder", 
> SerializableCoder)
>   - field (class "org.apache.beam.sdk.coders.KvCoder", name: "keyCoder", 
> type: "interface org.apache.beam.sdk.coders.Coder")
>   - object (class "org.apache.beam.sdk.coders.KvCoder", 
> KvCoder(SerializableCoder,AvroCoder))
>   - field (class "org.apache.beam.sdk.coders.IterableLikeCoder", name: 
> "elementCoder", type: "interface org.apache.beam.sdk.coders.Coder")
>   - object (class "org.apache.beam.sdk.coders.ListCoder", 
> ListCoder(KvCoder(SerializableCoder,AvroCoder)))
>   - field (class 
> "org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper",
>  name: "checkpointCoder", type: "class org.apache.beam.sdk.coders.ListCoder")
>   - root object (class 
> "org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper",
>  
> org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper@b2c5e07)
>   at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1182)
> {code}
> bug introduced after commit:
> 7b98fa08d14e8121e8885f00a9a9a878b73f81a6
> pull request:
> https://github.com/apache/beam/pull/1537
> Code for reproduce error
> {code}
> import com.google.common.collect.ImmutableList;
> import org.apache.beam.runners.flink.FlinkPipelineOptions;
> import org.apache.beam.runners.flink.FlinkRunner;
> import org.apache.beam.sdk.Pipeline;
> import org.apache.beam.sdk.io.kafka.KafkaIO;
> import org.apache.beam.sdk.options.PipelineOptionsFactory;
> public class FlinkSerialisationError {
> public static void main(String[] args) {
> FlinkPipelineOptions options = 
> PipelineOptionsFactory.as(FlinkPipelineOptions.class);
> options.setRunner(FlinkRunner.class);
> options.setStreaming(true);
> Pipeline pipeline = Pipeline.create(options);
> pipeline.apply(
> KafkaIO.read()
> .withBootstrapServers("localhost:9092")
> .withTopics(ImmutableList.of("test"))
> // set ConsumerGroup
> .withoutMetadata());
> pipeline.run();
> }
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


<    1   2