*I am using the develop branch of the Slider repo. My config files are set up as follows:*

*slider-client.xml:*

<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>155.69.148.21:9032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>155.69.148.21:9030</value>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://155.69.148.21:18000</value>
  </property>
  <property>
    <name>slider.zookeeper.quorum</name>
    <value>155.69.148.21:2181</value>
  </property>
  <property>
    <name>slider.client.resource.origin</name>
    <value>conf/slider-client.xml</value>
    <description>This is just for diagnostics</description>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>/hadoop/hadoop-2.5.0/etc/hadoop,/hadoop/hadoop-2.5.0/share/hadoop/common/*,/hadoop/hadoop-2.5.0/share/hadoop/common/lib/*,/hadoop/hadoop-2.5.0/share/hadoop/hdfs/*,/hadoop/hadoop-2.5.0/share/hadoop/hdfs/lib/*,/hadoop/hadoop-2.5.0/share/hadoop/mapreduce/*,/hadoop/hadoop-2.5.0/share/hadoop/mapreduce/lib/*,/hadoop/hadoop-2.5.0/share/hadoop/yarn/*,/hadoop/hadoop-2.5.0/share/hadoop/yarn/lib/*</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>slider.yarn.queue</name>
    <value>default</value>
    <description>YARN queue for the Application Master</description>
  </property>

*I uploaded agent.ini into HDFS (/user/*username*/agent/conf) and its content is:*

[server]
hostname=localhost
port=8440
secured_port=8441
check_path=/ws/v1/slider/agents/
register_path=/ws/v1/slider/agents/{name}/register
heartbeat_path=/ws/v1/slider/agents/{name}/heartbeat

[agent]
app_pkg_dir=app/definition
app_install_dir=app/install
app_run_dir=app/run
app_dbg_cmd=
debug_mode_enabled=true
app_task_dir=.
app_log_dir=.
app_tmp_dir=app/tmp
log_dir=.
run_dir=infra/run
version_file=infra/version
log_level=INFO

[python]

[command]
max_retries=2
sleep_between_retries=1

[security]

[heartbeat]
state_interval=6
log_lines_count=300

*appConfig.json:*

{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {
  },
  "global": {
    "application.def": "slider-hbase-app-package-0.98.5-hadoop2.zip",
    "create.default.zookeeper.node": "true",
    "java_home": "/users/staff/hustnn/hadoop-0.23.6/java/jdk1.6.0_39",
    "system_configs": "core-site",
    "agent.conf": "/user/hustnn/agent/conf/agent.ini",

    "site.global.app_user": "yarn",
    "site.global.app_root": "${AGENT_WORK_ROOT}/app/install/hbase-0.98.5-hadoop2",
    "site.global.ganglia_server_host": "${NN_HOST}",
    "site.global.ganglia_server_port": "8667",
    "site.global.ganglia_server_id": "Application1",
    "site.global.ganglia_enabled": "true",
    "site.global.hbase_instance_name": "instancename",
    "site.global.hbase_root_password": "secret",
    "site.global.user_group": "hadoop",
    "site.global.security_enabled": "false",
    "site.global.monitor_protocol": "http",
    "site.global.hbase_thrift_port": "${HBASE_THRIFT.ALLOCATED_PORT}",
    "site.global.hbase_thrift2_port": "${HBASE_THRIFT2.ALLOCATED_PORT}",
    "site.global.hbase_rest_port": "${HBASE_REST.ALLOCATED_PORT}",

    "site.hbase-env.hbase_master_heapsize": "1024m",
    "site.hbase-env.hbase_regionserver_heapsize": "1024m",

    "site.hbase-site.hbase.rootdir": "${DEFAULT_DATA_DIR}",
    "site.hbase-site.hbase.superuser": "yarn",
    "site.hbase-site.hbase.tmp.dir": "${AGENT_WORK_ROOT}/work/app/tmp",
    "site.hbase-site.hbase.local.dir": "${hbase.tmp.dir}/local",
    "site.hbase-site.hbase.zookeeper.quorum": "155.69.148.21:2181",
    "site.hbase-site.zookeeper.znode.parent": "${DEF_ZK_PATH}",
    "site.hbase-site.hbase.regionserver.info.port": "0",
    "site.hbase-site.hbase.master.info.port": "${HBASE_MASTER.ALLOCATED_PORT}",
    "site.hbase-site.hbase.regionserver.port": "0",
    "site.hbase-site.hbase.master.port": "0"
  },
  "components": {
    "slider-appmaster": {
      "jvm.heapsize": "256M"
    }
  }
}

*resource.json:*

{
  "schema": "http://example.org/specification/v2.0.0",
  "metadata": {
  },
  "global": {
  },
  "components": {
    "HBASE_MASTER": {
      "yarn.role.priority": "1",
      "yarn.component.instances": "1",
      "yarn.memory": "256"
    },
    "slider-appmaster": {
    },
    "HBASE_REGIONSERVER": {
      "yarn.role.priority": "2",
      "yarn.component.instances": "1",
      "yarn.memory": "256"
    },
    "HBASE_REST": {
      "yarn.role.priority": "3",
      "yarn.component.instances": "1",
      "yarn.memory": "256"
    },
    "HBASE_THRIFT": {
      "yarn.role.priority": "4",
      "yarn.component.instances": "1",
      "yarn.memory": "256"
    },
    "HBASE_THRIFT2": {
      "yarn.role.priority": "5",
      "yarn.component.instances": "1",
      "yarn.memory": "256"
    }
  }
}

2014-09-11 20:22 GMT+08:00 Tim Israel <t...@timisrael.com>:

> What's your slider-client.xml look like? Also, did you put agent.ini into
> the appropriate folder in HDFS?
>
> Tim
>
> On Thu, Sep 11, 2014 at 7:25 AM, 牛兆捷 <nzjem...@gmail.com> wrote:
>
> > When I run the agent like this:
> >
> > ./slider create cl1 --image
> > hdfs://yourNameNodeHost:8020/user/yarn/agent/slider-agent.tar.gz
> > --template appConfig.json --resources resources.json
> >
> > The application is accepted by YARN but failed. Anyone know it??
> >
> > The error info in the YARN RM UI is :
> >
> > Application application_1410335027910_0001 failed 2 times due to AM
> > Container for appattempt_1410335027910_0001_000002 exited with exitCode: 32
> > due to: Exception from container-launch: ExitCodeException exitCode=32:
> > ExitCodeException exitCode=32:
> >     at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
> >     at org.apache.hadoop.util.Shell.run(Shell.java:455)
> >     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
> >     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
> >     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
> >     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
> >     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >     at java.lang.Thread.run(Thread.java:662)
> > Container exited with a non-zero exit code 32
> > .Failing this attempt.. Failing the application.
> >
> >
> > Then I go inside the container and get the error info as below:
> >
> > 14/09/11 19:15:45 INFO service.AbstractService: Service python failed in state STARTED; cause: org.apache.slider.core.main.ServiceLaunchException: python failed with code 2
> > org.apache.slider.core.main.ServiceLaunchException: python failed with code 2
> >     at org.apache.slider.server.services.workflow.ForkedProcessService.reportFailure(ForkedProcessService.java:202)
> >     at org.apache.slider.server.services.workflow.ForkedProcessService.onProcessExited(ForkedProcessService.java:192)
> >     at org.apache.slider.server.services.workflow.LongLivedProcess.run(LongLivedProcess.java:345)
> >     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> >     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >     at java.lang.Thread.run(Thread.java:662)
> > 14/09/11 19:15:45 WARN tools.SliderUtils: Expected exit code={0}, actual exit code={2}
> > 14/09/11 19:15:45 INFO tools.SliderUtils: [ERR]
> > 14/09/11 19:15:45 INFO tools.SliderUtils: [ERR] Unknown option: --
> > 14/09/11 19:15:45 INFO tools.SliderUtils: [ERR] usage: python [option] ... [-c cmd | -m mod | file | -] [arg] ...
> > 14/09/11 19:15:45 INFO tools.SliderUtils: [ERR] Try `python -h' for more information.
> > 14/09/11 19:15:45 INFO tools.SliderUtils: [OUT]
> > 14/09/11 19:15:45 INFO service.AbstractService: Service SliderAppMaster failed in state INITED; cause: org.apache.slider.core.exceptions.SliderException: Process python failed: Expected exit code={0}, actual exit code={2}
> > org.apache.slider.core.exceptions.SliderException: Process python failed: Expected exit code={0}, actual exit code={2}
> >     at org.apache.slider.common.tools.SliderUtils.execCommand(SliderUtils.java:1744)
> >     at org.apache.slider.common.tools.SliderUtils.validateSliderServerEnvironment(SliderUtils.java:1777)
> >     at org.apache.slider.server.appmaster.SliderAppMaster.serviceInit(SliderAppMaster.java:405)
> >     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> >     at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:180)
> >     at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:471)
> >     at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:401)
> >     at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:626)
> >     at org.apache.slider.server.appmaster.SliderAppMaster.main(SliderAppMaster.java:1897)
> > Exception: org.apache.slider.core.exceptions.SliderException: Process python failed: Expected exit code={0}, actual exit code={2}
> > 14/09/11 19:15:45 ERROR main.ServiceLauncher: Exception: org.apache.slider.core.exceptions.SliderException: Process python failed: Expected exit code={0}, actual exit code={2}
> > org.apache.hadoop.service.ServiceStateException: org.apache.slider.core.exceptions.SliderException: Process python failed: Expected exit code={0}, actual exit code={2}
> >     at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
> >     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
> >     at org.apache.slider.core.main.ServiceLauncher.launchService(ServiceLauncher.java:180)
> >     at org.apache.slider.core.main.ServiceLauncher.launchServiceRobustly(ServiceLauncher.java:471)
> >     at org.apache.slider.core.main.ServiceLauncher.launchServiceAndExit(ServiceLauncher.java:401)
> >     at org.apache.slider.core.main.ServiceLauncher.serviceMain(ServiceLauncher.java:626)
> >     at org.apache.slider.server.appmaster.SliderAppMaster.main(SliderAppMaster.java:1897)
> > Caused by: org.apache.slider.core.exceptions.SliderException: Process python failed: Expected exit code={0}, actual exit code={2}
> >     at org.apache.slider.common.tools.SliderUtils.execCommand(SliderUtils.java:1744)
> >     at org.apache.slider.common.tools.SliderUtils.validateSliderServerEnvironment(SliderUtils.java:1777)
> >     at org.apache.slider.server.appmaster.SliderAppMaster.serviceInit(SliderAppMaster.java:405)
> >     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
> >     ... 5 more
> > 14/09/11 19:15:45 INFO util.ExitUtil: Exiting with status 32
> >
> >
> > --
> > *Regards,*
> > *Zhaojie*
> >
>

--
*Regards,*
*Zhaojie*