Can you check whether the stop function in your package script does "import
params", like this:
https://github.com/apache/incubator-slider/blob/develop/app-packages/hbase/package/scripts/hbase_master.py#L48


If yes, then you might have to share your app-package scripts (without the
app binary/tar) so that we can debug further. To do that, file a bug and
attach the scripts to it. Attaching them to an email to this DL will not
work.
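
Separately, the Traceback in the pasted log stops at the
"from resource_management import *" line, so the actual error is not visible.
Below is a minimal sketch of a helper that could be called at the very top of
the component script, before that import, to write PYTHONPATH, sys.path and
the full import error to stderr. The function name
describe_import_environment is hypothetical (it is not part of Slider or
resource_management), and the sketch assumes the agent captures the script's
stderr in its log, as the "stop command output: err:" line suggests.

```python
import os
import sys
import traceback


def describe_import_environment(module_name):
    """Hypothetical helper: report PYTHONPATH, sys.path and whether
    module_name can be imported in the current interpreter."""
    lines = ["PYTHONPATH=%r" % os.environ.get("PYTHONPATH")]
    lines.extend("sys.path entry: %s" % entry for entry in sys.path)
    try:
        __import__(module_name)
        lines.append("import %s: OK" % module_name)
    except ImportError:
        # Keep the complete traceback so it is not lost if the log is cut off.
        lines.append("import %s: FAILED" % module_name)
        lines.append(traceback.format_exc())
    return "\n".join(lines)


if __name__ == "__main__":
    # resource_management is the module named in the truncated Traceback.
    sys.stderr.write(describe_import_environment("resource_management") + "\n")
```

Comparing this output for START and STOP should show whether the two commands
really run with different PYTHONPATH values.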

-Gour

On 12/2/16, 9:30 AM, "Ophir Etzion" <op...@foursquare.com> wrote:

>1. You can't see the PYTHONPATH issue directly. What you can see is that the
>STOP log has no PYTHONPATH setting like the one in the START command.
>2. Thanks for letting me know about release_timeout_secs, but for my app I
>don't care if the containers die; the stop command sends a UDP packet
>elsewhere.
>
>here is the output for START where you can see the PYTHONPATH being set:
>INFO 2016-11-30 17:50:32,361 AgentToggleLogger.py:40 - Running command
>['/usr/bin/python',
> '-S',
> u'/export/hdk3/yarn/nm/usercache/hive/appcache/application_1479830316320_64974/filecache/11/enable_presto_worker.zip/package/scripts/enable_presto_worker_component.py',
> u'START',
> '/export/hda3/data/log/hadoop-yarn/container/application_1479830316320_64974/container_e468_1479830316320_64974_01_000091/command-4.json',
> '/export/hdk3/yarn/nm/usercache/hive/appcache/application_1479830316320_64974/filecache/11/enable_presto_worker.zip/package',
> '/export/hda3/data/log/hadoop-yarn/container/application_1479830316320_64974/container_e468_1479830316320_64974_01_000091/structured-out-4.json',
> 'INFO',
> '/export/hdj3/yarn/nm/usercache/hive/appcache/application_1479830316320_64974/container_e468_1479830316320_64974_01_000091']
>INFO 2016-11-30 17:50:32,361 AgentToggleLogger.py:40 - Setting env: PYTHONPATH to /export/hdj3/yarn/nm/usercache/hive/appcache/application_1479830316320_64974/filecache/10/slider-agent.tar.gz/slider-agent/jinja2:/export/hdj3/yarn/nm/usercache/hive/appcache/application_1479830316320_64974/filecache/10/slider-agent.tar.gz/slider-agent
>INFO 2016-11-30 17:50:32,463 AgentToggleLogger.py:40 - Queue result:
>{'componentStatus': [],
> 'reports': [{'actionId': u'4-1',
>              'clusterName': u'enable-presto-worker_cluster_a',
>              'exitcode': 777,
>              'reportResult': True,
>              'role': u'NODE',
>              'roleCommand': u'START',
>              'serviceName': u'enable-presto-worker_cluster_a',
>              'status': 'IN_PROGRESS',
>              'stderr': '',
>              'stdout': "2016-11-30 17:50:32,455 -
>Directory['/data/appdata/enable_presto_worker/data/var/run'] {'recursive':
>True}",
>              'structuredOut': '{}',
>              'taskId': 4}]}
>
>On Fri, Dec 2, 2016 at 11:51 AM, Gour Saha <gs...@hortonworks.com> wrote:
>
>> Also keep in mind - if your application needs to run something useful
>>when
>> the stop cmd is initiated then you need to set an appropriate value to
>> site.global.app_container.release_timeout_secs. Otherwise kill signals
>>are
>> sent to the agent containers via YARN (almost immediately) and the
>> containers don't get time for graceful shutdown.
>>
>> -Gour
>>
>>
>>
>> On 12/2/16, 8:29 AM, "Billie Rinaldi" <billie.rina...@gmail.com> wrote:
>>
>> >It looks like the Traceback stack for the stop command output is
>>truncated
>> >in the logs you pasted. I only see the first line of the Traceback:
>> >INFO 2016-11-30 18:07:03,919 PythonExecutor.py:97 - stop command
>>output:
>> > err: Traceback (most recent call last):
>> >  File
>> 
>>>"/export/hdk3/yarn/nm/usercache/hive/appcache/application_1479830316320_
>> >64974/filecache/11/enable_presto_worker.zip/package/
>> >scripts/enable_presto_worker_component.py",
>> >line 23, in <module>
>> >    from resource_management import *
>> >
>> >So I cannot see the PYTHONPATH error you're talking about. If you paste
>> >the
>> >entire Traceback that might tell us more.
>> >
>> >Billie
>> >
>> >On Fri, Dec 2, 2016 at 7:19 AM, Ophir Etzion <op...@foursquare.com>
>> wrote:
>> >
>> >> it does implement a STOP command that does something useful.
>> >> it fails because the PYTHONPATH isn't set like it is in different
>> >>commands.
>> >>
>> >> On Thu, Dec 1, 2016 at 10:38 PM, Gour Saha <gs...@hortonworks.com>
>> >>wrote:
>> >>
>> >> > Does enable_presto_worker_component.py support/implement a STOP
>> >>command?
>> >> >
>> >> > Does your application need to run something useful when the stop
>>cmd
>> >>is
>> >> > initiated?
>> >> >
>> >> > -Gour
>> >> >
>> >> > On 11/30/16, 10:58 AM, "Ophir Etzion" <op...@foursquare.com> wrote:
>> >> >
>> >> > >Hi,
>> >> > >
>> >> > >I hope I'm writing to the correct mailing list. please direct me
>> >> elsewhere
>> >> > >if this is not the correct place to write to.
>> >> > >
>> >> > >I've written a simple custom slider application and the STOP
>>script
>> >> fails
>> >> > >due to what seems like a slider issue of not setting the
>>PYTHONPATH
>> >>when
>> >> > >running the stop command.
>> >> > >
>> >> > >I will probably debug to see what goes on in
>> >>CustomServiceOrchestrator
>> >> and
>> >> > >why it doesn't set the env variables there but I'll only do it in
>>a
>> >> couple
>> >> > >of weeks.
>> >> > >I wanted to ask if anyone noticed something like this before I
>>look
>> >>into
>> >> > >it
>> >> > >further.
>> >> > >
>> >> > >in the agent log it looks like this:
>> >> > >
>> >> > >INFO 2016-11-30 18:07:03,894 ActionQueue.py:173 - Running command:
>> >> > >{u'roleCommand': u'STOP', u'clusterName':
>> >> > >u'enable-presto-worker_cluster_a', u'componentName': u'NODE',
>> >> > u'hostname':
>> >> > >u'fsak20.prod.foursquare.com', u'hostLevelParams': {u'java_home':
>> >> > >u'/data/loko/infrastructure-jdk8/current/bin/', u'container_id':
>> >> > >u'container_e468_1479830316320_64974_01_000091'}, u'commandType':
>> >> > >u'EXECUTION_COMMAND', u'roleParams': {u'auto_restart': u'false'},
>> >> > >u'serviceName': u'enable-presto-worker_cluster_a', u'role':
>>u'NODE',
>> >> > >u'commandParams': {u'record_config': u'true',
>> >>u'service_package_folder':
>> >> > >u'${AGENT_WORK_ROOT}/work/app/definition/package', u'script':
>> >> > >u'scripts/enable_presto_worker_component.py', u'schema_version':
>> >> u'2.0',
>> >> > >u'command_timeout': u'600', u'script_type': u'PYTHON'},
>>u'taskId': 5,
>> >> > >u'yarnDockerMode': False, u'commandId': '5-1', u'containers': [],
>> >> > >u'configurations': {u'global': {u'security_enabled': u'false',
>> >> > >u'app_container_id':
>>u'container_e468_1479830316320_64974_01_000091'
>> ,
>> >> > >u'data_dir': u'/data/appdata/enable_presto_worker/data',
>> u'app_name':
>> >> > >u'enable_presto_worker.py', u'app_root':
>> >> > >u'${AGENT_WORK_ROOT}/app/install',
>> >> > >u'app_log_dir': u'${AGENT_LOG_ROOT}', u'app_pid_dir':
>> >> > >u'${AGENT_WORK_ROOT}/app/run', u'app_container_tag': u'2',
>> >>u'pid_file':
>> >> > >u'${AGENT_WORK_ROOT}/app/run/component.pid', u'app_install_dir':
>> >> > >u'${AGENT_WORK_ROOT}/app/install', u'app_input_conf_dir':
>> >> > >u'${AGENT_WORK_ROOT}/propagatedconf', u'state_monitor_port':
>> >>u'9990'}}}
>> >> > >INFO 2016-11-30 18:07:03,896 CustomServiceOrchestrator.py:329 -
>> >>Storing
>> >> > >applied config: {u'global': {u'app_container_id':
>> >> > >u'container_e468_1479830316320_64974_01_000091',
>> >> > >             u'app_container_tag': u'2',
>> >> > >             u'app_input_conf_dir':
>> >> > >u'/export/hdj3/yarn/nm/usercache/hive/appcache/
>> >> > application_1479830316320_6
>> >> > >4974/container_e468_1479830316320_64974_01_000091/propagatedconf',
>> >> > >             u'app_install_dir':
>> >> > >u'/export/hdj3/yarn/nm/usercache/hive/appcache/
>> >> > application_1479830316320_6
>> >> > >4974/container_e468_1479830316320_64974_01_000091/app/install',
>> >> > >             u'app_log_dir':
>> >> > >u'/data/log/hadoop-yarn/container/application_
>> >> > 1479830316320_64974/containe
>> >> > >r_e468_1479830316320_64974_01_000091',
>> >> > >             u'app_name': u'enable_presto_worker.py',
>> >> > >             u'app_pid_dir':
>> >> > >u'/export/hdj3/yarn/nm/usercache/hive/appcache/
>> >> > application_1479830316320_6
>> >> > >4974/container_e468_1479830316320_64974_01_000091/app/run',
>> >> > >             u'app_root':
>> >> > >u'/export/hdj3/yarn/nm/usercache/hive/appcache/
>> >> > application_1479830316320_6
>> >> > >4974/container_e468_1479830316320_64974_01_000091/app/install',
>> >> > >             u'data_dir': u'/data/appdata/enable_presto_
>> worker/data',
>> >> > >             u'pid_file':
>> >> > >u'/export/hdj3/yarn/nm/usercache/hive/appcache/
>> >> > application_1479830316320_6
>> >> > >4974/container_e468_1479830316320_64974_01_000091/
>> >> app/run/component.pid',
>> >> > >             u'security_enabled': u'false',
>> >> > >             u'state_monitor_port': u'9990'}}
>> >> > >INFO 2016-11-30 18:07:03,898 PythonExecutor.py:152 - command str:
>> >> > > /usr/bin/python -S
>> >> > >/export/hdk3/yarn/nm/usercache/hive/appcache/
>> >> > application_1479830316320_649
>> >> > >74/filecache/11/enable_presto_worker.zip/package/
>> >> > scripts/enable_presto_wor
>> >> > >ker_component.py
>> >> > >STOP
>> >> > >/export/hda3/data/log/hadoop-yarn/container/application_
>> >> > 1479830316320_6497
>> >> > >4/container_e468_1479830316320_64974_01_000091/command-5.json
>> >> > >/export/hdk3/yarn/nm/usercache/hive/appcache/
>> >> > application_1479830316320_649
>> >> > >74/filecache/11/enable_presto_worker.zip/package
>> >> > >/export/hda3/data/log/hadoop-yarn/container/application_
>> >> > 1479830316320_6497
>> >> > 
>>>4/container_e468_1479830316320_64974_01_000091/structured-out-5.json
>> >> > >INFO
>> >> > >/export/hdj3/yarn/nm/usercache/hive/appcache/
>> >> > application_1479830316320_649
>> >> > >74/container_e468_1479830316320_64974_01_000091
>> >> > >INFO 2016-11-30 18:07:03,919 PythonExecutor.py:97 - stop command
>> >>output:
>> >> > > err: Traceback (most recent call last):
>> >> > >  File
>> >> > >"/export/hdk3/yarn/nm/usercache/hive/appcache/
>> >> > application_1479830316320_64
>> >> > >974/filecache/11/enable_presto_worker.zip/package/
>> >> > scripts/enable_presto_wo
>> >> > >rker_component.py",
>> >> > >line 23, in <module>
>> >> > >    from resource_management import *
>> >> >
>> >> >
>> >>
>>
>>
