[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976545#comment-15976545
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user zhangminglei closed the pull request at:

https://github.com/apache/flink/pull/3726


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976533#comment-15976533
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3726
  
Yes we should create another JIRA issue. You can create it. You can also 
close FLINK-6130 then.


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976517#comment-15976517
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/3726
  
@tillrohrmann That is right. We can close this PR surely. BTW, Should we 
make another jira refactoring this class and 
```AbstractYarnFlinkApplicationMasterRunner ``` ? Because I watch the 
```AbstractYarnFlinkApplicationMasterRunner ``` and find only one subclass. If 
we do create another jira, You are the reporter and assign it to me. How do you 
think of this ?


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976447#comment-15976447
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3726
  
@zhangminglei please refrain from pulling more and more people into the 
loop by directly referencing them given that this is a trivial change and that 
5 people are already involved.

Looking at the `YarnFlinkApplicationMasterRunner` I think it could benefit 
from some more refactoring. For example, I don't think that we need to have an 
`AbstractYarnFlinkApplicationMasterRunner` base class. Both can be combined in 
one class. Then we can instantiate the services and runtime components in the 
constructor of `YarnFlinkApplicationMasterRunner`. That way we can get rid of 
the lock in the `run` method.

Given that, I'm not sure whether this PR makes then sense anymore. I think 
we can close it.


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976399#comment-15976399
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/3726
  
@zentol Please helps review. I am very appreciate it.


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15976396#comment-15976396
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/3726
  
@StephanEwen Hi, How do you think of this refine ?


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15973937#comment-15973937
 ] 

mingleizhang commented on FLINK-6130:
-

[~Zentol] [~till.rohrmann] That makes sense to me now. So, I just have decided 
that the previous practice {code}Object result = future.value().get();{code} is 
meaningless as I can not any useful message from it. Thanks and appreciate it.

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972736#comment-15972736
 ] 

mingleizhang commented on FLINK-6130:
-

[~till.rohrmann] Void ? I see. I would think {code}@GuardedBy("lock){code} 
might a wrong as well. Could we do a refine like the following ?

Change the code 
{code}
@GuardedBy("lock")
private ResourceManager resourceManager;
{code}

to 
{code}
private volatile ResourceManager resourceManager;
{code}

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972731#comment-15972731
 ] 

Chesnay Schepler commented on FLINK-6130:
-

[~mingleizhang] No, it is of type Future, meaning it always returns null. 
If get() returns (without exception) it means the ResourceManager has shut 
down, as that is the definition of the termination future, see 
RpcEndpoint#getTerminationFuture.

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972709#comment-15972709
 ] 

Till Rohrmann commented on FLINK-6130:
--

[~Zentol] maybe the {{@GuardedBy("lock)}} annotation is simply wrong.

[~mingleizhang] the termination future is of type {{Future}}. Thus, it 
does not contain any return value information.

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972690#comment-15972690
 ] 

mingleizhang commented on FLINK-6130:
-

[~Zentol] Doesn't it return the result of the status of YARN Application Master 
? Like failure or success ?

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972672#comment-15972672
 ] 

Chesnay Schepler commented on FLINK-6130:
-

[~mingleizhang] The termination future doesn't return anything useful; the only 
interesting bit is that get() returns at all.

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972656#comment-15972656
 ] 

mingleizhang commented on FLINK-6130:
-

[~till.rohrmann] Thanks for review, and except for [~Zentol] said, another 
stuff for getting more information about return value from {code}LOG.info("YARN 
Application Master finished and the result is {}", result);{code}

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread Chesnay Schepler (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972647#comment-15972647
 ] 

Chesnay Schepler commented on FLINK-6130:
-

[~till.rohrmann] Because the resourceManager is annotated by @GuardedBy("lock").

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972620#comment-15972620
 ] 

Till Rohrmann commented on FLINK-6130:
--

What is the problem with calling {{getTerminationFuture}} outside of the lock 
[~tedyu]?

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972615#comment-15972615
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3726
  
What is the problem we're trying here to solve?


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970988#comment-15970988
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/3726#discussion_r111732371
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkApplicationMasterRunner.java
 ---
@@ -158,14 +160,16 @@ protected int runApplicationMaster(Configuration 
config) {
jobManagerRunner.start();
LOG.debug("Job Manager Runner started");
 
+   // wait for resource manager to finish and 
return a value for later to get
+   future = (Future) 
resourceManager.getTerminationFuture();
+
//  (5) start the web monitor
// TODO: add web monitor
}
+   Object object = future.value().get();
 
-   // wait for resource manager to finish
-   resourceManager.getTerminationFuture().get();
// everything started, we can wait until all is done or 
the process is killed
-   LOG.info("YARN Application Master finished");
+   LOG.info("YARN Application Master finished and the 
result is " + object.toString());
--- End diff --

Very nice.


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970983#comment-15970983
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/3726
  
Travis sucks some time.


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970977#comment-15970977
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user tzulitai commented on a diff in the pull request:

https://github.com/apache/flink/pull/3726#discussion_r111731219
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkApplicationMasterRunner.java
 ---
@@ -158,14 +160,16 @@ protected int runApplicationMaster(Configuration 
config) {
jobManagerRunner.start();
LOG.debug("Job Manager Runner started");
 
+   // wait for resource manager to finish and 
return a value for later to get
+   future = (Future) 
resourceManager.getTerminationFuture();
+
//  (5) start the web monitor
// TODO: add web monitor
}
+   Object object = future.value().get();
 
-   // wait for resource manager to finish
-   resourceManager.getTerminationFuture().get();
// everything started, we can wait until all is done or 
the process is killed
-   LOG.info("YARN Application Master finished");
+   LOG.info("YARN Application Master finished and the 
result is " + object.toString());
--- End diff --

Just another nitpick: the `object` variable could probably use a nicer / 
more meaningful name :)


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970975#comment-15970975
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user zhangminglei commented on a diff in the pull request:

https://github.com/apache/flink/pull/3726#discussion_r111731104
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkApplicationMasterRunner.java
 ---
@@ -158,14 +160,16 @@ protected int runApplicationMaster(Configuration 
config) {
jobManagerRunner.start();
LOG.debug("Job Manager Runner started");
 
+   // wait for resource manager to finish and 
return a value for later to get
+   future = (Future) 
resourceManager.getTerminationFuture();
+
//  (5) start the web monitor
// TODO: add web monitor
}
+   Object object = future.value().get();
 
-   // wait for resource manager to finish
-   resourceManager.getTerminationFuture().get();
// everything started, we can wait until all is done or 
the process is killed
-   LOG.info("YARN Application Master finished");
+   LOG.info("YARN Application Master finished and the 
result is " + object.toString());
--- End diff --

Thanks for review. I will update the code. Very appreciate it. 


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970973#comment-15970973
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user tzulitai commented on the issue:

https://github.com/apache/flink/pull/3726
  
I don't think the timeout was related. We'll see from this Travis run :)


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970972#comment-15970972
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user tzulitai commented on a diff in the pull request:

https://github.com/apache/flink/pull/3726#discussion_r111730519
  
--- Diff: 
flink-yarn/src/main/java/org/apache/flink/yarn/YarnFlinkApplicationMasterRunner.java
 ---
@@ -158,14 +160,16 @@ protected int runApplicationMaster(Configuration 
config) {
jobManagerRunner.start();
LOG.debug("Job Manager Runner started");
 
+   // wait for resource manager to finish and 
return a value for later to get
+   future = (Future) 
resourceManager.getTerminationFuture();
+
//  (5) start the web monitor
// TODO: add web monitor
}
+   Object object = future.value().get();
 
-   // wait for resource manager to finish
-   resourceManager.getTerminationFuture().get();
// everything started, we can wait until all is done or 
the process is killed
-   LOG.info("YARN Application Master finished");
+   LOG.info("YARN Application Master finished and the 
result is " + object.toString());
--- End diff --

I would use a log message formatter here:
`LOG.info("YARN Application Master finished and the result is {}", object);`


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970959#comment-15970959
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/3726
  
@tzulitai Hi, Could you please review my code ? I am appreciate it.


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970649#comment-15970649
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

Github user zhangminglei commented on the issue:

https://github.com/apache/flink/pull/3726
  
@tedyu @zentol It seems there is a problem with travis-ci test if no output 
within 5 mins then it will print the stacktrace and then kill the watchdog , 
please help to review and checkout how to solve, I am appreciate it. 
```
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.213 sec - 
in org.apache.flink.runtime.minicluster.MiniClusterITCase

==
Maven produced no output for 300 seconds.

==

==
The following Java processes are running (JPS)

==
1552 Launcher
12642 Jps
8045 surefirebooter5889190340308801783.jar
...
Trying to KILL watchdog (1547).
MVN exited with EXIT CODE: 143.
java.io.FileNotFoundException: build-target/lib/flink-dist*.jar (No such 
file or directory)
at java.util.zip.ZipFile.open(Native Method)
at java.util.zip.ZipFile.(ZipFile.java:220)
at java.util.zip.ZipFile.(ZipFile.java:150)
at java.util.zip.ZipFile.(ZipFile.java:121)
at sun.tools.jar.Main.list(Main.java:1060)
at sun.tools.jar.Main.run(Main.java:291)
at sun.tools.jar.Main.main(Main.java:1233)
find: `./flink-yarn-tests/target/flink-yarn-tests*': No such file or 
directory
PRODUCED build artifacts.
```


> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970608#comment-15970608
 ] 

ASF GitHub Bot commented on FLINK-6130:
---

GitHub user zhangminglei opened a pull request:

https://github.com/apache/flink/pull/3726

[FLINK-6130] [yarn] Fix Consider calling resourceManager#getTerminatiā€¦

ā€¦onFuture() with lock held.

Thanks for contributing to Apache Flink. Before you open your pull request, 
please take the following check list into consideration.
If your changes take all of the items into account, feel free to open your 
pull request. For more information and/or questions please refer to the [How To 
Contribute guide](http://flink.apache.org/how-to-contribute.html).
In addition to going through the list, please provide a meaningful 
description of your changes.

- [ ] General
  - The pull request references the related JIRA issue ("[FLINK-XXX] Jira 
title text")
  - The pull request addresses only one issue
  - Each commit in the PR has a meaningful commit message (including the 
JIRA id)

- [ ] Documentation
  - Documentation has been added for new functionality
  - Old documentation affected by the pull request has been updated
  - JavaDoc for public methods has been added

- [ ] Tests & Build
  - Functionality added by the pull request is covered by tests
  - `mvn clean verify` has been executed successfully locally or a Travis 
build has passed


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zhangminglei/flink flink-6130

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3726.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3726


commit d7c707046b1a081ceb5999cdb1f08bf26a2aeac2
Author: zhangminglei 
Date:   2017-04-17T01:58:14Z

[FLINK-6130] [yarn] Fix Consider calling 
resourceManager#getTerminationFuture() with lock held.




> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970602#comment-15970602
 ] 

mingleizhang commented on FLINK-6130:
-

[~tedyu] Thanks and appreciate it again. If there is no more question, I will 
give a PR soon.

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970599#comment-15970599
 ] 

Ted Yu commented on FLINK-6130:
---

lgtm

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970598#comment-15970598
 ] 

mingleizhang commented on FLINK-6130:
-

[~tedyu] Yep. And the code like below, how do you think of this ? Thanks.

{code}
@Override
protected int runApplicationMaster(Configuration config) {
 Future future; 

synchronized (lock) {
LOG.info("Starting High Availability Services");

   // wait for resource manager to finish
future = (Future) resourceManager.getTerminationFuture();

//  (5) start the web monitor
// TODO: add web monitor
}
Object object = future.value().get();

// everything started, we can wait until all is done or the process is 
killed
LOG.info("YARN Application Master finished" + object.toString());
{code}

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970590#comment-15970590
 ] 

Ted Yu commented on FLINK-6130:
---

You don't need to create another synchronized (lock) block.
You can use the existing one to obtain the future.

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970588#comment-15970588
 ] 

mingleizhang commented on FLINK-6130:
-

[~tedyu] Thanks and appreciate it. I will give a PR to this jira soon enough.

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970587#comment-15970587
 ] 

Ted Yu commented on FLINK-6130:
---

lgtm

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970586#comment-15970586
 ] 

mingleizhang commented on FLINK-6130:
-

[~tedyu] Could you please help review the code ? How do you think of this ? 
Thanks.
{code}
Future future;
synchronized (lock) {
// wait for resource manager to finish
 future = (Future) 
resourceManager.getTerminationFuture();
}
Object object = future.get();
// everything started, we can wait until all is done or 
the process is killed
LOG.info("YARN Application Master finished" + 
object.toString());
{code}

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970408#comment-15970408
 ] 

Ted Yu commented on FLINK-6130:
---

You can store the return value from getTerminationFuture in a variable inside 
the synchronized block.
Outside the synchronized block, call get() on the variable.

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Assignee: mingleizhang
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-6130) Consider calling resourceManager#getTerminationFuture() with lock held

2017-04-16 Thread mingleizhang (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-6130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15970332#comment-15970332
 ] 

mingleizhang commented on FLINK-6130:
-

[~tedyu] Thanks. I would ask how to use those return value that is stored from 
resouceManager#getTerminationFuture() or do nothing ? It seems there is a 
intersection with jira Flink-6275.

> Consider calling resourceManager#getTerminationFuture() with lock held
> --
>
> Key: FLINK-6130
> URL: https://issues.apache.org/jira/browse/FLINK-6130
> Project: Flink
>  Issue Type: Bug
>Reporter: Ted Yu
>Priority: Minor
>
> In YarnFlinkApplicationMasterRunner#runApplicationMaster() :
> {code}
>   synchronized (lock) {
> LOG.info("Starting High Availability Services");
> ...
>   }
>   // wait for resource manager to finish
>   resourceManager.getTerminationFuture().get();
> {code}
> resourceManager#getTerminationFuture() is called without holding lock.
> We should store the value returned from 
> resourceManager#getTerminationFuture() inside the synchronized block.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)