Hi Jon,

I tried the OOB jmemcached with the component name my-name. You may be able
to reproduce the problem by running jmemcached with that component name.

Attached are part of the slider-am log and the component agent log. The
component agent log shows the same "Unable to connect" error as before. I
have masked FQDNs etc., but the rest of the log is intact.

As far as I can see, the AM does not seem to get a container heartbeat
(since the container start fails), marks the container as lost (exit status
-100), and keeps attempting to get a new container.

2016-01-12 20:53:46,334 [Thread-33] WARN  agent.HeartbeatMonitor -
Component
ComponentInstanceState{containerIdAsString='container_1452195922769_0005_01_000002',
state=INIT, failuresSeen=0, lastHeartbeat=1452631909186,
containerState=UNHEALTHY, componentName='my-name'} marked UNHEALTHY. Last
heartbeat received at 1452631909186 approx. 117148 ms. back.
2016-01-12 20:54:46,335 [Thread-33] WARN  agent.HeartbeatMonitor -
Component
ComponentInstanceState{containerIdAsString='container_1452195922769_0005_01_000002',
state=INIT, failuresSeen=0, lastHeartbeat=1452631909186,
containerState=HEARTBEAT_LOST, componentName='my-name'} marked
HEARTBEAT_LOST. Last heartbeat received at 1452631909186 approx. 177149 ms.
back.
2016-01-12 20:54:46,335 [AmExecutor-006] INFO  appmaster.SliderAppMaster -
containerLostContactWithProvider: container
container_1452195922769_0005_01_000002 lost
2016-01-12 20:54:46,336 [AmExecutor-006] INFO  appmaster.SliderAppMaster -
Container released; triggering review
2016-01-12 20:54:46,336 [AmExecutor-006] INFO  state.AppState - Reviewing
RoleStatus{name='my-name', key=1, desired=1, actual=1, requested=0,
releasing=1, failed=0, failed recently=0, node failed=0, pre-empted=0,
started=1, startFailed=0, completed=0, failureMessage=''} : expected 1
2016-01-12 20:54:47,340 [AMRM Callback Handler Thread] INFO
 appmaster.SliderAppMaster - onContainersCompleted([1]
2016-01-12 20:54:47,340 [AMRM Callback Handler Thread] INFO
 appmaster.SliderAppMaster - Container Completion for
containerID=container_1452195922769_0005_01_000002, state=COMPLETE,
exitStatus=-100, diagnostics=Container released by application
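
As a sanity check on the timestamps in the log above, here is a minimal
sketch that converts lastHeartbeat to a readable time and recomputes the
"ms. back" figures. It assumes the AM log timestamps are in UTC (the
arithmetic is consistent with that); the helper names are only illustrative
and are not part of Slider.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_utc(ms):
    """Convert epoch milliseconds (as in lastHeartbeat) to an aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


def elapsed_ms(log_timestamp, last_heartbeat_ms):
    """Milliseconds between a log line's timestamp and the last heartbeat.

    Assumption: the log timestamp ("YYYY-MM-DD HH:MM:SS,mmm") is in UTC.
    """
    logged = datetime.strptime(log_timestamp, "%Y-%m-%d %H:%M:%S,%f")
    logged = logged.replace(tzinfo=timezone.utc)
    return (logged - epoch_ms_to_utc(last_heartbeat_ms)) // timedelta(milliseconds=1)


# Value copied from the HeartbeatMonitor lines above.
LAST_HEARTBEAT_MS = 1452631909186

print(epoch_ms_to_utc(LAST_HEARTBEAT_MS).isoformat())
print(elapsed_ms("2016-01-12 20:53:46,334", LAST_HEARTBEAT_MS))  # UNHEALTHY line
print(elapsed_ms("2016-01-12 20:54:46,335", LAST_HEARTBEAT_MS))  # HEARTBEAT_LOST line
```

The two elapsed values come out the same as the "approx. ... ms. back"
figures in the log, so the only heartbeat timestamp the AM recorded for this
container is from roughly two minutes before it was marked UNHEALTHY.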


Thanks for your time,

Manoj

On Tue, Jan 12, 2016 at 12:14 PM, Jon Maron <jma...@hortonworks.com> wrote:

> OK.  So was there a ‘-‘ or some other character in your component name?  A
> ‘-‘ should work.  The component names are currently expected to follow a
> naming convention that allows for DNS compatible names, and dashes are
> included in that character set.  The fact that the endpoint did not appear
> may be related to some other issue.  The AM logs may help here as well.
>
> > On Jan 12, 2016, at 2:57 PM, Manoj Samel <manojsamelt...@gmail.com>
> wrote:
> >
> > Jon,
> >
> > I replaced <COMP-NAME> with actual component name.
> >
> > Thanks,
> >
> >
> >
> > On Tue, Jan 12, 2016 at 11:43 AM, Jon Maron <jma...@hortonworks.com>
> wrote:
> >
> >> Did you replace the actual comp name with <COMP-NAME>, or do you
> actually
> >> have the ‘<‘ and ‘>’ characters in the name?
> >>
> >>> On Jan 12, 2016, at 2:40 PM, Manoj Samel <manojsamelt...@gmail.com>
> >> wrote:
> >>>
> >>> Slider version 0.80 with secured cluster
> >>>
> >>> Use case is to create a component reflecting user name. It seems the
> only
> >>> valid character in component name besides [A-Z][a-z][0-9] is underscore
> >> '_'.
> >>>
> >>> Attempt to create a component with characters like dash '-' or many
> other
> >>> characters fail to bring up the component with error like below where
> >>> <COMP-NAME>
> >>> is the component name containing offending character
> >>>
> >>> INFO 2015-12-24 18:55:40,605 Controller.py:140 - Registering with the
> >>> server at
> >>>
> >>
> https://host1:41613/ws/v1/slider/agents/container_1450746204314_0043_01_000002___
> >> <COMP-NAME>/register
> >>> with data '{"tags": "", "timestamp": 1450983340604, "expectedState": 0,
> >>> "responseId": -1, "actualState": 0, "logFolders": {}, "agentVersion":
> >> "1",
> >>> "allocatedPorts": {}, "appVersion": null, "publicHostname": "host2",
> >>> "label": "container_1450746204314_0043_01_000002___<COMP-NAME>"}'
> >>> INFO 2015-12-24 18:55:40,605 security.py:89 - SSL Connect being
> called..
> >>> connecting to the server
> >>> INFO 2015-12-24 18:55:40,695 security.py:51 - SSL connection
> established.
> >>> Two-way SSL authentication is turned off on the server.
> >>> INFO 2015-12-24 18:55:40,745 Controller.py:183 - Unable to connect to:
> >>>
> >>
> https://host1:41613/ws/v1/slider/agents/container_1450746204314_0043_01_000002___
> >> <COMP-NAME>/register
> >>>
> >>> Traceback (most recent call last):
> >>> File
> >>>
> >>
> "/data/yarn/local/usercache/foo/appcache/application_1450746204314_0043/filecache/10/slider-agent.tar.gz/slider-agent/agent/Controller.py",
> >>> line 142, in registerWithServer
> >>>   regResp = json.loads(response)
> >>> File "/usr/lib64/python2.6/json/__init__.py", line 307, in loads
> >>>   return _default_decoder.decode(s)
> >>> File "/usr/lib64/python2.6/json/decoder.py", line 319, in decode
> >>>   obj, end = self.raw_decode(s, idx=_w(s, 0).end())
> >>> File "/usr/lib64/python2.6/json/decoder.py", line 338, in raw_decode
> >>>   raise ValueError("No JSON object could be decoded")
> >>> ValueError: No JSON object could be decoded
> >>>
> >>>
> >>> Any thoughts ?
> >>>
> >>> Thanks,
> >>
> >>
>
>
