[GitHub] flink issue #5652: [hotfix][tests] Do not use singleActorSystem in LocalFlin...

2018-03-07 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/5652
  
@zentol To unblock this, I'd propose to add another constructor to 
`MiniClusterResource` that takes an `enableClusterClient` parameter. Only if 
that is true do we start in multi-actor-system mode. WDYT?
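
As a rough illustration of this proposal (a sketch only, not code from the PR): the class below is a hypothetical stand-in for `MiniClusterResource`. The names `SketchMiniClusterResource`, `ActorSystemMode`, and `numberTaskManagers` are assumptions made for the example; only the `enableClusterClient` flag comes from the comment above.

```java
/** Hypothetical stand-in for a test resource; not Flink's MiniClusterResource. */
final class SketchMiniClusterResource {

    /** The two start-up modes discussed in this thread. */
    enum ActorSystemMode { SINGLE, MULTI }

    private final int numberTaskManagers;      // assumed setting, for illustration
    private final boolean enableClusterClient; // flag proposed in the comment

    /** Existing-style constructor: keeps the cheaper single-actor-system default. */
    SketchMiniClusterResource(int numberTaskManagers) {
        this(numberTaskManagers, false);
    }

    /** Additional constructor: a test opts in to cluster client support. */
    SketchMiniClusterResource(int numberTaskManagers, boolean enableClusterClient) {
        this.numberTaskManagers = numberTaskManagers;
        this.enableClusterClient = enableClusterClient;
    }

    int numberTaskManagers() {
        return numberTaskManagers;
    }

    /** Multi-actor-system mode is used only when the client was requested. */
    ActorSystemMode actorSystemMode() {
        return enableClusterClient ? ActorSystemMode.MULTI : ActorSystemMode.SINGLE;
    }
}

final class SketchMiniClusterResourceDemo {
    public static void main(String[] args) {
        System.out.println(new SketchMiniClusterResource(1).actorSystemMode());
        System.out.println(new SketchMiniClusterResource(1, true).actorSystemMode());
    }
}
```

With this shape, tests that do not ask for the client would keep their current start-up behavior, and only the opt-in tests would start multiple actor systems.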


---


[GitHub] flink issue #5652: [hotfix][tests] Do not use singleActorSystem in LocalFlin...

2018-03-07 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/5652
  
@StephanEwen I think that's what the interface already does. For example, 
`MiniClusterClient.submitJob()` does a `miniCluster.runDetached(jobGraph)`, and 
`MiniClusterClient.cancel()` does `miniCluster.cancelJob(jobId)`. The problem 
is that the legacy cluster does not have methods for those things, namely 
"cancel", "get job status", "get accumulators", and "savepoint". All existing 
ITCases use custom Akka communication with the testing cluster. We can either 
add methods for all that to the legacy mini cluster (that would probably also 
use Akka) or use the `StandaloneClusterClient`, which also uses Akka. But for 
those Akka messages to work we can't run it in single-actor-system mode. WDYT?


---


[GitHub] flink issue #5652: [hotfix][tests] Do not use singleActorSystem in LocalFlin...

2018-03-07 Thread StephanEwen
Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/5652
  
Would it work to have a "Flink Service" resource interface to which you can 
submit jobs?

It may be backed by a cluster client or directly by the mini cluster, which 
executes jobs directly. Having the shared interface (across flip6 and legacy) 
based on the cluster client seems like the wrong common abstraction.
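
One way to picture such an interface is sketched below, purely as an illustration. Every name here (`FlinkServiceSketch`, `DirectServiceSketch`, `ClientSketch`, `ClientBackedServiceSketch`) is hypothetical, jobs are represented by plain strings, and none of this is Flink API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical common abstraction: something a test can submit jobs to. */
interface FlinkServiceSketch {
    /** Submits a job (represented here by its name) and returns an id. */
    String submitJob(String jobName);
}

/** Backing 1: executes jobs directly, standing in for an embedded mini cluster. */
final class DirectServiceSketch implements FlinkServiceSketch {
    private final List<String> executed = new ArrayList<>();

    @Override
    public String submitJob(String jobName) {
        executed.add(jobName);
        return "direct-" + executed.size();
    }

    List<String> executedJobs() {
        return executed;
    }
}

/** Stand-in for a cluster client; not Flink's ClusterClient. */
interface ClientSketch {
    String send(String jobName);
}

/** Backing 2: forwards every submission to a cluster client. */
final class ClientBackedServiceSketch implements FlinkServiceSketch {
    private final ClientSketch client;

    ClientBackedServiceSketch(ClientSketch client) {
        this.client = client;
    }

    @Override
    public String submitJob(String jobName) {
        return client.send(jobName);
    }
}

final class FlinkServiceSketchDemo {
    /** A test written against the interface works with either backing. */
    static String runTestJob(FlinkServiceSketch service) {
        return service.submitJob("word-count");
    }

    public static void main(String[] args) {
        System.out.println(runTestJob(new DirectServiceSketch()));
        System.out.println(runTestJob(new ClientBackedServiceSketch(name -> "client-" + name)));
    }
}
```

The point of the sketch is that test code depends only on the small interface, so the cluster client becomes one possible backing rather than the shared abstraction itself.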


---


[GitHub] flink issue #5652: [hotfix][tests] Do not use singleActorSystem in LocalFlin...

2018-03-07 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/5652
  
We could change `MiniClusterResource` to expect a `needsClusterClient()` 
parameter or whatnot and start in single-actor-system mode by default. That's 
probably what you had in mind ... 😅 


---


[GitHub] flink issue #5652: [hotfix][tests] Do not use singleActorSystem in LocalFlin...

2018-03-07 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5652
  
Yup. But one profile is already scratching the 50-minute limit as is :/


---


[GitHub] flink issue #5652: [hotfix][tests] Do not use singleActorSystem in LocalFlin...

2018-03-07 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/5652
  
We will see when we get the results from Travis for this one, right?



---


[GitHub] flink issue #5652: [hotfix][tests] Do not use singleActorSystem in LocalFlin...

2018-03-07 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5652
  
The alternative would be to make the `ClusterClient` functionality optional 
and force tests to explicitly enable it.


---


[GitHub] flink issue #5652: [hotfix][tests] Do not use singleActorSystem in LocalFlin...

2018-03-07 Thread zentol
Github user zentol commented on the issue:

https://github.com/apache/flink/pull/5652
  
All legacy tests going through the `MiniClusterResource` will take longer. 
I don't know by how much, but we now have to start multiple actor systems and 
the JM<->TM communication is no longer local.


---


[GitHub] flink issue #5652: [hotfix][tests] Do not use singleActorSystem in LocalFlin...

2018-03-07 Thread aljoscha
Github user aljoscha commented on the issue:

https://github.com/apache/flink/pull/5652
  
Will all tests take longer, or only those that use the `ClusterClient`? 
What increase in time are we talking about?



---