[ 
https://issues.apache.org/jira/browse/FLINK-13242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16890176#comment-16890176
 ] 

Andrey Zagrebin commented on FLINK-13242:
-----------------------------------------

Hi [~xintongsong], I think this is a very good finding and the test failure 
makes sense now.

I do not think we can use _getRpcService().getScheduledExecutor()_ to set 
_failUnfulfillableRequest_, because the flag is then read from the main thread 
in _SlotManager#internalRequestSlot_, and ideally all state should be modified 
in the main thread.

Why do we schedule _setFailUnfulfillableRequest_ only once in 
_StandaloneResourceManager.initialize_, before the RM gets leadership? If I 
understand correctly, this flag gives the task executors some time to register 
their available slots before slot requests are checked for whether they can be 
fulfilled. The task executors register themselves every time this RM gets 
leadership, so it looks like _setFailUnfulfillableRequest_ should be scheduled 
after each leader election. Then the change of the executor fencing token 
should not be a problem, just as for any other operation on a leader. 
cc [~StephanEwen] [~till.rohrmann]
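
To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It is not Flink code: _StartupPeriodSketch_, _leadershipEpoch_ and every method name below are hypothetical stand-ins. A single-threaded executor plays the role of the RPC main thread, and the epoch counter is only a crude analogue of the fencing token. It assumes the flag should be reset at each election and set again once the startup period has elapsed.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a resource manager; not Flink's actual classes. */
class StartupPeriodSketch {
    // One thread plays the role of the RPC main thread:
    // the fields below are read and written only on this thread.
    private final ScheduledExecutorService mainThread =
            Executors.newSingleThreadScheduledExecutor();
    private final long startupPeriodMillis;

    private boolean failUnfulfillableRequest = false;
    private long leadershipEpoch = 0; // crude analogue of a fencing token

    StartupPeriodSketch(long startupPeriodMillis) {
        this.startupPeriodMillis = startupPeriodMillis;
    }

    /** Called on every leader election, not just once at initialization. */
    void grantLeadership() {
        mainThread.execute(() -> {
            final long epoch = ++leadershipEpoch;
            // Task executors re-register after an election, so start a new grace period.
            failUnfulfillableRequest = false;
            mainThread.schedule(() -> {
                // Ignore timers that belong to an earlier leadership session.
                if (epoch == leadershipEpoch) {
                    failUnfulfillableRequest = true;
                }
            }, startupPeriodMillis, TimeUnit.MILLISECONDS);
        });
    }

    /** Reads the flag on the main thread and hands the value back to the caller. */
    boolean isFailUnfulfillableRequest() {
        try {
            return mainThread.submit(() -> failUnfulfillableRequest).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Polls until the flag is set or the timeout expires. */
    boolean awaitFailUnfulfillableRequest(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            if (isFailUnfulfillableRequest()) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return isFailUnfulfillableRequest();
    }

    void shutdown() {
        mainThread.shutdownNow();
    }

    public static void main(String[] args) {
        StartupPeriodSketch rm = new StartupPeriodSketch(200);
        rm.grantLeadership();
        System.out.println("during startup period: " + rm.isFailUnfulfillableRequest());
        System.out.println("set before deadline: " + rm.awaitFailUnfulfillableRequest(5000));
        rm.shutdown();
    }
}
```

Because the timer is created inside _grantLeadership_ and runs on the same single thread that reads the flag, every leadership session gets its own startup period and no state is touched from a second thread.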

> StandaloneResourceManagerTest fails on travis
> ---------------------------------------------
>
>                 Key: FLINK-13242
>                 URL: https://issues.apache.org/jira/browse/FLINK-13242
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.9.0
>            Reporter: Chesnay Schepler
>            Assignee: Andrey Zagrebin
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.9.0, 1.10.0
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> https://travis-ci.org/apache/flink/jobs/557696989
> {code}
> 08:28:06.475 [ERROR] 
> testStartupPeriod(org.apache.flink.runtime.resourcemanager.StandaloneResourceManagerTest)
>   Time elapsed: 10.276 s  <<< FAILURE!
> java.lang.AssertionError: condition was not fulfilled before the deadline
>       at 
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManagerTest.assertHappensUntil(StandaloneResourceManagerTest.java:114)
>       at 
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManagerTest.testStartupPeriod(StandaloneResourceManagerTest.java:60)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
