[ https://issues.apache.org/jira/browse/SLIDER-343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Billie Rinaldi updated SLIDER-343:
----------------------------------
Description: I'm trying to fix how the Accumulo app package starts its
processes (currently they don't know their own hostname). I tried pulling the
hostname from config["hostname"], but the command JSON files don't contain
accurate hostnames: a command file on node-3 shows
"hostname": "node-1.example.com".
(was: I'm trying to fix an issue with
how the Accumulo app package starts its processes (currently they don't know
their hostname). After making one change that isn't quite working yet, the app
instance has ended up in a state where the app has finished and is destroyed,
but there are still app processes running on the cluster.
Output of slider list:
version: 1.6.1-snapshot
name: accumulo
state: FINISHED
URL: http://node-1.example.com:8088/proxy/application_1408460425050_0028/A
Started 21 Aug 2014 14:59:45 GMT
Finished 21 Aug 2014 15:02:55 GMT
RPC :node-1.example.com:38102
Diagnostics :org.apache.slider.core.exceptions.TriggerClusterTeardownException:
Unstable Application Instance : - failed with role ACCUMULO_TSERVER failing 6
times (2 in startup); threshold is 5 - last failure: Failure
container_1408460425050_0028_01_000029 on host node-3.example.com, see
http://node-1.example.com:19888/jobhistory/logs/node-3.example.com:45454/container_1408460425050_0028_01_000029/ctx/billie
Output of ps -ef | grep accumulo | grep master (only one master was requested):
yarn 11888 1 0 08:00 ? 00:00:11 /usr/lib/jvm/java/bin/java
-Dapp=master -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
-Djava.net.preferIPv4Stack=true -Xmx128m -Xms128m -classpath
/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000002/app/install/accumulo-1.6.1-SNAPSHOT/conf:/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000002/app/install/accumulo-1.6.1-SNAPSHOT/lib/accumulo-start.jar:/usr/lib/hadoop/lib/log4j-1.2.17.jar
-XX:OnOutOfMemoryError=kill -9 %p -XX:-OmitStackTraceInFastThrow
-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
-Dorg.apache.accumulo.core.home.dir=/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000002/app/install/accumulo-1.6.1-SNAPSHOT
-Dhadoop.home.dir=/usr/lib/hadoop -Dzookeeper.home.dir=/usr/lib/zookeeper
org.apache.accumulo.start.Main master --address node-1.example.com
yarn 12278 1 0 08:00 ? 00:00:07 /usr/lib/jvm/java/bin/java
-Dapp=master -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
-Djava.net.preferIPv4Stack=true -Xmx128m -Xms128m -classpath
/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000008/app/install/accumulo-1.6.1-SNAPSHOT/conf:/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000008/app/install/accumulo-1.6.1-SNAPSHOT/lib/accumulo-start.jar:/usr/lib/hadoop/lib/log4j-1.2.17.jar
-XX:OnOutOfMemoryError=kill -9 %p -XX:-OmitStackTraceInFastThrow
-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
-Dorg.apache.accumulo.core.home.dir=/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000008/app/install/accumulo-1.6.1-SNAPSHOT
-Dhadoop.home.dir=/usr/lib/hadoop -Dzookeeper.home.dir=/usr/lib/zookeeper
org.apache.accumulo.start.Main master --address node-1.example.com
yarn 12594 1 0 08:01 ? 00:00:07 /usr/lib/jvm/java/bin/java
-Dapp=master -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
-Djava.net.preferIPv4Stack=true -Xmx128m -Xms128m -classpath
/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000013/app/install/accumulo-1.6.1-SNAPSHOT/conf:/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000013/app/install/accumulo-1.6.1-SNAPSHOT/lib/accumulo-start.jar:/usr/lib/hadoop/lib/log4j-1.2.17.jar
-XX:OnOutOfMemoryError=kill -9 %p -XX:-OmitStackTraceInFastThrow
-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
-Dorg.apache.accumulo.core.home.dir=/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000013/app/install/accumulo-1.6.1-SNAPSHOT
-Dhadoop.home.dir=/usr/lib/hadoop -Dzookeeper.home.dir=/usr/lib/zookeeper
org.apache.accumulo.start.Main master --address node-1.example.com
yarn 13006 1 0 08:02 ? 00:00:07 /usr/lib/jvm/java/bin/java
-Dapp=master -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
-Djava.net.preferIPv4Stack=true -Xmx128m -Xms128m -classpath
/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000018/app/install/accumulo-1.6.1-SNAPSHOT/conf:/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000018/app/install/accumulo-1.6.1-SNAPSHOT/lib/accumulo-start.jar:/usr/lib/hadoop/lib/log4j-1.2.17.jar
-XX:OnOutOfMemoryError=kill -9 %p -XX:-OmitStackTraceInFastThrow
-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
-Dorg.apache.accumulo.core.home.dir=/yarn/local/usercache/billie/appcache/application_1408460425050_0028/container_1408460425050_0028_01_000018/app/install/accumulo-1.6.1-SNAPSHOT
-Dhadoop.home.dir=/usr/lib/hadoop -Dzookeeper.home.dir=/usr/lib/zookeeper
org.apache.accumulo.start.Main master --address node-1.example.com)
> Command files have incorrect hostname
> -------------------------------------
>
> Key: SLIDER-343
> URL: https://issues.apache.org/jira/browse/SLIDER-343
> Project: Slider
> Issue Type: Bug
> Reporter: Billie Rinaldi
>
> I'm trying to fix how the Accumulo app package starts its processes
> (currently they don't know their own hostname). I tried pulling the
> hostname from config["hostname"], but the command JSON files don't contain
> accurate hostnames: a command file on node-3 shows
> "hostname": "node-1.example.com".
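
The mismatch described above could be checked on a node with something like
the following. This is only a sketch: it assumes the command file is a JSON
object with a top-level "hostname" key, as quoted in the description, and the
function names read_command_hostname and hostname_matches are made up for
illustration (they are not part of Slider or the Accumulo app package).

```python
import json
import socket


def read_command_hostname(command_json_text):
    """Return the top-level "hostname" value of a command JSON document.

    Assumption: the key is top-level, as quoted in the issue description.
    Returns None when the key is absent.
    """
    config = json.loads(command_json_text)
    return config.get("hostname")


def _normalize(name):
    """Lower-case a hostname and drop surrounding whitespace and a trailing dot."""
    return name.strip().rstrip(".").lower()


def hostname_matches(command_hostname, local_hostname):
    """Return True when the command file's hostname names the local node."""
    if command_hostname is None:
        return False
    return _normalize(command_hostname) == _normalize(local_hostname)


if __name__ == "__main__":
    # Sample content mirroring the value seen in the command file on node-3.
    sample = '{"hostname": "node-1.example.com"}'
    reported = read_command_hostname(sample)
    local = socket.getfqdn()
    print("command file says:", reported)
    print("this node is:", local)
    print("match:", hostname_matches(reported, local))
```

Run on node-3 against the command file described here, the reported value
would be node-1.example.com, so the comparison would come back False as long
as socket.getfqdn() resolves to node-3's own name.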