I think I found what causes the NPE reported in my earlier message (quoted
below) in 0.92, and why the same setup works in 0.80.
The component name (a.k.a. role name) is "solo___super", i.e. it contains
three consecutive "_" characters.
0.92 seems to introduce a new concept, "Role Group", that was not present
in 0.80.
In 0.92 - AgentProviderService.java
    private static final String LABEL_MAKER = "___";
    ...
    private String getRoleName(String label) {
        int index1 = label.indexOf(LABEL_MAKER);
        int index2 = label.lastIndexOf(LABEL_MAKER);
        if (index1 == index2) {
            return label.substring(index1 + LABEL_MAKER.length());
        } else {
            return label.substring(index1 + LABEL_MAKER.length(), index2);
        }
    }
    private String getRoleGroup(String label) {
        return label.substring(label.lastIndexOf(LABEL_MAKER) +
            LABEL_MAKER.length());
    }
So when the real role name itself contains "___", as "solo___super" does,
getRoleName() on the container label returns just "solo" instead of
"solo___super" (and getRoleGroup() returns "super"). That wrong role name
can then cause the NPE.
The same role name works in 0.80 because 0.80 has no concept of
roleGroup.
In 0.80 - AgentProviderService.java
    private String getRoleName(String label) {
        return label.substring(label.indexOf(LABEL_MAKER) +
            LABEL_MAKER.length());
    }
So in 0.80, getRoleName() returns the correct role name, "solo___super",
from the container label.
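To make the difference concrete, here is a small standalone sketch that
runs both parsing variants against the label from the AM log. The parsing
logic is copied from the snippets above; the class name RoleLabelDemo and
the method names roleName080, roleName092 and roleGroup092 are made up for
this illustration and are not part of Slider.

```java
// Standalone illustration only. RoleLabelDemo and its method names are
// hypothetical; the bodies mirror the AgentProviderService snippets above.
class RoleLabelDemo {
    private static final String LABEL_MAKER = "___";

    // 0.80 logic: everything after the FIRST separator is the role name.
    static String roleName080(String label) {
        return label.substring(label.indexOf(LABEL_MAKER) + LABEL_MAKER.length());
    }

    // 0.92 logic: if the separator occurs more than once, the role name is
    // only the text between the first and the last occurrence.
    static String roleName092(String label) {
        int index1 = label.indexOf(LABEL_MAKER);
        int index2 = label.lastIndexOf(LABEL_MAKER);
        if (index1 == index2) {
            return label.substring(index1 + LABEL_MAKER.length());
        } else {
            return label.substring(index1 + LABEL_MAKER.length(), index2);
        }
    }

    // 0.92 logic: everything after the LAST separator is the role group.
    static String roleGroup092(String label) {
        return label.substring(label.lastIndexOf(LABEL_MAKER) + LABEL_MAKER.length());
    }

    public static void main(String[] args) {
        // Label taken from the AM log in the quoted message.
        String label = "container_e95_1476898378926_91401_01_03___solo___super";

        System.out.println("0.80 role name : " + roleName080(label));  // solo___super
        System.out.println("0.92 role name : " + roleName092(label));  // solo
        System.out.println("0.92 role group: " + roleGroup092(label)); // super
    }
}
```

With a role name that contains no "___", both variants return the same
role name, which would explain why only this kind of component name is
affected.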
1) I tried to understand what a roleGroup is and how it is used, but could
not find any documentation. Can someone give a few lines of explanation?
2) Should this be considered a bug in 0.92? If not, and the intent is that
LABEL_MAKER must not appear in any role name, then at least clear
documentation AND an explicit check when config files are accepted would
help. I.e. slider 0.92 should fail with an "invalid role name" error when
creating a cluster, or when accepting configs during any other operation.
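As a rough idea of the kind of check I mean in 2), something along these
lines would already catch my case. This is only a sketch: RoleNameCheck
and validateRoleName are hypothetical names, not existing Slider code, and
I have not looked at where in the config handling such a check would
belong.

```java
// Hypothetical sketch of an up-front role name check; not Slider code.
class RoleNameCheck {
    // Same separator value as LABEL_MAKER in AgentProviderService.
    private static final String LABEL_MAKER = "___";

    // Rejects role names that would confuse the label parsing.
    // Throws IllegalArgumentException with a readable message.
    static void validateRoleName(String roleName) {
        if (roleName == null || roleName.isEmpty()) {
            throw new IllegalArgumentException("Role name must not be empty");
        }
        if (roleName.contains(LABEL_MAKER)) {
            throw new IllegalArgumentException(
                "Invalid role name '" + roleName + "': must not contain \""
                + LABEL_MAKER + "\"");
        }
    }

    public static void main(String[] args) {
        validateRoleName("solo_super"); // accepted: single underscore only
        try {
            validateRoleName("solo___super"); // rejected: contains the separator
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at cluster creation with a message like this would be much easier
to diagnose than an NPE in the heartbeat handler.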
Thanks in advance,
On Tue, Apr 11, 2017 at 6:09 PM, Manoj Samel
wrote:
> Hi
>
> Running slider 0.92 on CDH 5.5.1 (which is Hadoop 2.6), with Kerberos
>
> I am deploying a application with multiple components. The components
> start but fail to heart beat to slider AM. The slider AM log shows NPE at
> container heartbeat URLs as below.
>
> I have attached the complete slider AM log
>
> 2017-04-12 00:44:05,741 [2011871076@qtp-814377348-5] INFO
> agent.AgentProviderService - Handling registration: responseId=-1
> timestamp=1491957845550
> label=container_e95_1476898378926_91401_01_03___solo___super
> hostname=node1078
> expectedState=INIT
> actualState=INIT
> appVersion=null
>
> 2017-04-12 00:44:05,741 [2011871076@qtp-814377348-5] INFO
> agent.AgentProviderService - label:
> container_e95_1476898378926_91401_01_03___solo___super
> pkg: null
> 2017-04-12 00:44:05,741 [2011871076@qtp-814377348-5] INFO
> agent.AgentProviderService - Registration response:
> RegistrationResponse{response=OK, responseId=0, statusCommands=null}
> 2017-04-12 00:44:05,871 [Socket Reader #1 for port 32120] INFO ipc.Server
> - Auth successful for slideradmin@BIGDATA (auth:SIMPLE)
> 2017-04-12 00:44:05,873 [Socket Reader #1 for port 32120] INFO
> authorize.ServiceAuthorizationManager
> - Authorization successful for slideradmin@BIGDATA (auth:TOKEN) for
> protocol=interface org.apache.slider.server.appmaster.rpc.
> SliderClusterProtocolPB
> 2017-04-12 00:44:15,749 [100585@qtp-814377348-7] ERROR mortbay.log -
> /ws/v1/slider/agents/container_e95_1476898378926_
> 91401_01_02___pdx__svt___ten85/heartbeat
> java.lang.NullPointerException
> at org.apache.slider.providers.agent.AgentProviderService.
> handleHeartBeat(AgentProviderService.java:1090)
> at org.apache.slider.server.appmaster.web.rest.agent.
> AgentResource.heartbeat(AgentResource.java:98)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(
> JavaMethodInvokerFactory.java:60)
> at com.sun.jersey.server.impl.model.method.dispatch.
> AbstractResourceMethodDispatchProvider$TypeOutInvoker._dispatch(
> AbstractResourceMethodDispatchProvider.java:185)
> at com.sun.jersey.server.impl.model.method.dispatch.
> ResourceJavaMethodDispatcher.dispatch(ResourceJavaMethodDispatcher.
> java:75)
> at com.sun.jersey.server.impl.uri.rules.HttpMethodRule.
> accept(HttpMethodRule.java:288)
> at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.
> accept(RightHandPathRule.java:147)
> at com.sun.jersey.server.impl.uri.rules.SubLocatorRule.
> accept(SubLocatorRule.java:134)
> at com.sun.jersey.server.impl.uri.rules.RightHandPathRule.
> accept(RightHandPathRule.java:147)
> at