[
https://issues.apache.org/jira/browse/SLIDER-270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086371#comment-14086371
]
Steve Loughran commented on SLIDER-270:
---------------------------------------
I can't replicate this in the mock YARN engine, but it surfaces in a new test in
the classic HBase provider:
{code}
2014-08-05 16:22:58,345 [Thread-1] INFO client.SliderClient
(SliderClient.java:flex(1837)) - Flexing running cluster
2014-08-05 16:22:58,345 [Thread-1] DEBUG rpc.RpcBinder
(RpcBinder.java:getProxy(227)) - Connecting to stevel-9.local/240.0.0.1:53015
2014-08-05 16:22:58,346 [Thread-1] DEBUG rpc.RpcBinder
(RpcBinder.java:connectToServer(122)) - Connecting to Slider AM at
stevel-9.local/240.0.0.1:53015
2014-08-05 16:22:58,379 [Thread-1] INFO client.SliderClient
(SliderClient.java:flex(1841)) - Cluster size updated
2014-08-05 16:22:58,381 [Thread-1] DEBUG rpc.RpcBinder
(RpcBinder.java:getProxy(227)) - Connecting to stevel-9.local/240.0.0.1:53015
2014-08-05 16:22:58,381 [Thread-1] DEBUG rpc.RpcBinder
(RpcBinder.java:connectToServer(122)) - Connecting to Slider AM at
stevel-9.local/240.0.0.1:53015
2014-08-05 16:22:58,406 [Thread-1] DEBUG test.SliderTestUtils
(SliderTestUtils.groovy:waitForRoleCount(294)) - Waiting: [worker]: desired: 2;
actual: 4
2014-08-05 16:22:58,473 [IPC Server handler 29 on 53004] WARN
resourcemanager.RMAuditLogger (RMAuditLogger.java:logFailure(209)) -
USER=stevel IP=240.0.0.1 OPERATION=AM Released Container
TARGET=FifoScheduler RESULT=FAILURE DESCRIPTION=Trying to release container
not owned by app or with invalid id PERMISSIONS=Unauthorized access or
invalid container APPID=application_1407252134537_0001
CONTAINERID=container_1407252134537_0001_01_000003
2014-08-05 16:22:58,473 [IPC Server handler 29 on 53004] INFO
fifo.FifoScheduler (FifoScheduler.java:containerCompleted(829)) - Null
container completed...
2014-08-05 16:22:58,473 [IPC Server handler 29 on 53004] WARN
resourcemanager.RMAuditLogger (RMAuditLogger.java:logFailure(209)) -
USER=stevel IP=240.0.0.1 OPERATION=AM Released Container
TARGET=FifoScheduler RESULT=FAILURE DESCRIPTION=Trying to release container
not owned by app or with invalid id PERMISSIONS=Unauthorized access or
invalid container APPID=application_1407252134537_0001
CONTAINERID=container_1407252134537_0001_01_000007
2014-08-05 16:22:58,473 [IPC Server handler 29 on 53004] INFO
fifo.FifoScheduler (FifoScheduler.java:containerCompleted(829)) - Null
container completed...
2014-08-05 16:22:59,094 [ResourceManager Event Processor] INFO
fifo.FifoScheduler (FifoScheduler.java:containerCompleted(829)) - Null
container completed...
2014-08-05 16:22:59,094 [ResourceManager Event Processor] INFO
fifo.FifoScheduler (FifoScheduler.java:containerCompleted(829)) - Null
container completed...
2014-08-05 16:22:59,408 [Thread-1] DEBUG rpc.RpcBinder
(RpcBinder.java:getProxy(227)) - Co
{code}
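For reference, the "desired: 2; actual: 4" line in the log above means two workers should be released. Here is a minimal sketch of that arithmetic, using the field names from the RoleStatus log lines in the issue description. The class name, method name, and the way "releasing" is netted out are all assumptions made for illustration; this is not taken from Slider's actual code.

```java
// Hypothetical sketch: NOT Slider's actual RoleStatus implementation.
final class FlexDeltaSketch {

    /**
     * Returns how many containers to request (positive) or release (negative).
     * Parameter names mirror the RoleStatus log fields; the arithmetic is an
     * assumption made for illustration.
     */
    static int delta(int desired, int actual, int requested, int releasing) {
        int inUse = actual + requested;
        int delta = desired - inUse;
        if (delta < 0) {
            // Containers already queued for release must not be released twice,
            // and netting them out must never turn a release into a request.
            delta = Math.min(delta + releasing, 0);
        }
        return delta;
    }

    public static void main(String[] args) {
        // Values from the test log above: desired 2, actual 4.
        System.out.println(delta(2, 4, 0, 0));
        // Values from the second flex in the issue description: desired 1, actual 2.
        System.out.println(delta(1, 2, 0, 0));
    }
}
```

Under these assumptions a negative result is the number of containers still to be released, so a second flex down from 2 to 1 should yield one release request.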
> Calling flex (down) the second time does not work
> -------------------------------------------------
>
> Key: SLIDER-270
> URL: https://issues.apache.org/jira/browse/SLIDER-270
> Project: Slider
> Issue Type: Bug
> Components: appmaster
> Affects Versions: Slider 0.50
> Reporter: Sumit Mohanty
> Assignee: Steve Loughran
> Fix For: Slider 0.50
>
>
> From the AppMaster log (see below), it looks like the second command to flex
> from 2 to 1 did not result in a container release.
> {noformat}
> 14/08/04 01:55:18 INFO state.AppState: Role MEMCACHED flexed from 3 to 2
> 14/08/04 01:55:18 INFO state.AppState: RoleStatus{name='MEMCACHED', key=1,
> desired=2, actual=3, requested=0, releasing=0, failed=0, started=3,
> startFailed=0, completed=0, failureMessage=''}
> 14/08/04 01:55:18 INFO state.AppState: MEMCACHED: Asking for 1 fewer node(s)
> for a total of 2
> 14/08/04 01:55:19 INFO appmaster.SliderAppMaster: onContainersCompleted([1]
> 14/08/04 01:55:19 INFO appmaster.SliderAppMaster: Container Completion for
> containerID=container_1405048900371_0054_01_000004, state=COMPLETE,
> exitStatus=-100, diagnostics=Container released by application
> 14/08/04 01:55:19 INFO state.AppState: Container was queued for release
> 14/08/04 01:55:19 INFO state.AppState: decrementing role count for role
> MEMCACHED
> 14/08/04 01:55:19 INFO agent.AgentProviderService: Removing container
> specific data for container_1405048900371_0054_01_000004
> 14/08/04 01:55:19 INFO agent.AgentProviderService: publishing
> PublishedConfiguration{description='ComponentInstanceData' entries = 2}
> 14/08/04 01:55:19 INFO state.AppState: RoleStatus{name='MEMCACHED', key=1,
> desired=2, actual=2, requested=0, releasing=0, failed=0, started=3,
> startFailed=0, completed=1, failureMessage=''}
> 14/08/04 01:55:45 INFO state.AppState: Role MEMCACHED flexed from 2 to 1
> 14/08/04 01:55:45 INFO state.AppState: RoleStatus{name='MEMCACHED', key=1,
> desired=1, actual=2, requested=0, releasing=0, failed=0, started=3,
> startFailed=0, completed=1, failureMessage=''}
> 14/08/04 01:55:45 INFO state.AppState: MEMCACHED: Asking for 1 fewer node(s)
> for a total of 1
> {noformat}