Hi Hitesh,

Bingo!

Log Type: syslog_dag_1476593404620_0001_1
Log Upload Time: Sat Oct 15 22:03:47 -0700 2016
Log Length: 75813
Showing 4096 bytes of 75813 total. Click here <http://dwrdevnn1.sv2.trulia.com:19888/jobhistory/logs/dwrdevdn17.sv2.trulia.com:46114/container_1476593404620_0001_01_000001/container_1476593404620_0001_01_000001/dwr/syslog_dag_1476593404620_0001_1/?start=0> for the full log.

6-10-15 21:51:35,970 [WARN] [IPC Server handler 25 on 40353] |app.TezTaskCommunicatorImpl|: Received task heartbeat from unknown container with id: container_1476593404620_0001_01_000050, asking it to die
2016-10-15 21:51:35,972 [WARN] [IPC Server handler 27 on 40353] |app.TezTaskCommunicatorImpl|: Received task heartbeat from unknown container with id: container_1476593404620_0001_01_000008, asking it to die
2016-10-15 21:51:35,973 [WARN] [IPC Server handler 3 on 40353] |app.TezTaskCommunicatorImpl|: Received task heartbeat from unknown container with id: container_1476593404620_0001_01_000007, asking it to die
2016-10-15 21:51:35,974 [WARN] [IPC Server handler 29 on 40353] |app.TezTaskCommunicatorImpl|: Received task heartbeat from unknown container with id: container_1476593404620_0001_01_000011, asking it to die
2016-10-15 21:51:35,987 [ERROR] [HistoryEventHandlingThread] |impl.TimelineClientImpl|: Failed to get the response from the timeline server.
2016-10-15 21:51:35,987 [WARN] [HistoryEventHandlingThread] |ats.ATSHistoryLoggingService|: Could not handle history events
org.apache.hadoop.yarn.exceptions.YarnException: Failed to get the response from the timeline server.
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPosting(TimelineClientImpl.java:339)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:301)
    at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.handleEvents(ATSHistoryLoggingService.java:357)
    at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.access$700(ATSHistoryLoggingService.java:53)
    at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService$1.run(ATSHistoryLoggingService.java:190)
    at java.lang.Thread.run(Thread.java:745)
2016-10-15 21:51:35,987 [WARN] [IPC Server handler 6 on 40353] |app.TezTaskCommunicatorImpl|: Received task heartbeat from unknown container with id: container_1476593404620_0001_01_000058, asking it to die
2016-10-15 21:51:35,989 [WARN] [IPC Server handler 24 on 40353] |app.TezTaskCommunicatorImpl|: Received task heartbeat from unknown container with id: container_1476593404620_0001_01_000051, asking it to die
2016-10-15 21:51:36,021 [ERROR] [HistoryEventHandlingThread] |impl.TimelineClientImpl|: Failed to get the response from the timeline server.
2016-10-15 21:51:36,021 [WARN] [HistoryEventHandlingThread] |ats.ATSHistoryLoggingService|: Could not handle history events
org.apache.hadoop.yarn.exceptions.YarnException: Failed to get the response from the timeline server.
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.doPosting(TimelineClientImpl.java:339)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.putEntities(TimelineClientImpl.java:301)
    at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.handleEvents(ATSHistoryLoggingService.java:357)
    at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService.access$700(ATSHistoryLoggingService.java:53)
    at org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService$1.run(ATSHistoryLoggingService.java:190)
    at java.lang.Thread.run(Thread.java:745)

i'm running the hive cli on host=dwrdevnn1.
i updated yarn-site.xml on dwrdevnn1.
i restarted the ATS service on dwrdevnn1:

sudo -u yarn -- yarn-daemon.sh --config /etc/hadoop/conf start timelineserver

netstat shows port 8188 is alive, and i can telnet to dwrdevnn1 8188. port 10200 is also LISTENing.

$ sudo netstat -lanp | grep 31168
tcp   0   0 172.19.103.136:10200   0.0.0.0:*   LISTEN   31168/java
tcp   0   0 172.19.103.136:8188    0.0.0.0:*   LISTEN   31168/java

is there a debug log level i can set on impl.TimelineClientImpl to see what is happening on the connection event?

thank you again!

Cheers,
Stephen.

On Sun, Oct 16, 2016 at 9:54 AM, Hitesh Shah <hit...@apache.org> wrote:

> Hello Stephen,
>
> yarn-site.xml needs to be updated wherever the Tez client is used, i.e. if you are using Hive, then wherever you launch the Hive CLI and also where the HiveServer2 is installed (HS2 will need a restart).
>
> To see if the connection to timeline is/was an issue, please check the yarn app logs for any Tez application (the application master logs to be more specific: syslog_dag* files) to see if there are any warnings/exceptions being logged related to history event handling.
>
> thanks
> -- Hitesh
>
> > On Oct 15, 2016, at 9:58 PM, Stephen Sprague <sprag...@gmail.com> wrote:
> >
> > hmm... made that change to yarn-site.xml and restarted the timelineserver and RM.
> >
> > $ sudo netstat -lanp | grep 31168 #timelineserver
> >
> > tcp   0   0 172.19.103.136:10200   0.0.0.0:*              LISTEN        31168/java
> > tcp   0   0 172.19.103.136:8188    0.0.0.0:*              LISTEN        31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45299   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45298   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45322   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45297   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45316   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45318   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45317   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45321   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45326   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45314   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45315   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45313   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45320   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45324   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45325   ESTABLISHED   31168/java
> > tcp   0   0 172.19.103.136:8188    172.19.103.136:45319   ESTABLISHED   31168/java
> > unix  2   [ ]   STREAM   CONNECTED   1455259739   31168/java
> > unix  2   [ ]   STREAM   CONNECTED   1455253313   31168/java
> >
> > still no dice though. same error. i only changed yarn-site.xml on the namenode though. you think i need to copy it to all the datanodes and restart the NM's too?
> >
> > any other suggestions?
> >
> > 'ppreciate the help!
> >
> > Cheers,
> > Stephen.
> >
> > On Sat, Oct 15, 2016 at 8:46 PM, Allan Wilson <wilsoncr...@gmail.com> wrote:
> > Just saw Gopal's response...that def needs updating too.
> >
> > Sent from my iPhone
> >
> > On Oct 15, 2016, at 9:31 PM, Stephen Sprague <sprag...@gmail.com> wrote:
> >
> >> thanks guys. lemme answer.
> >>
> >> Sreenath-
> >> 1. yarn.acl.enable = false (ie. i did not set it)
> >> 2. this: http://dwrdevnn1.sv2.trulia.com:9766 displays index.html with an *empty* list
> >>
> >> Gopal-
> >> 3. i'll replace 0.0.0.0 with dwrdevnn1.sv2.trulia.com and see what happens...
> >>
> >> Allan-
> >> 4. yes, metrics are enabled.
> >>
> >> I'll let you know what happens with Gopal's suggestion.
> >>
> >> Cheers,
> >> Stephen.
> >>
> >> On Sat, Oct 15, 2016 at 8:20 PM, Allan Wilson <wilsoncr...@gmail.com> wrote:
> >> Are you emitting metrics to the ATS?
> >>
> >> yarn.timeline-service.enabled=true
> >>
> >> Sent from my iPhone
> >>
> >> On Oct 15, 2016, at 8:36 PM, Sreenath Somarajapuram <ssomarajapu...@hortonworks.com> wrote:
> >>
> >>> Hi Stephen,
> >>>
> >>> The error message is coming from ATS, and it says that the application data is not available.
> >>> And yes, tez_application_1476574340629_0001 is a legit value. It can be considered as the id for Tez application details.
> >>>
> >>> Please help me with these:
> >>> 1. Do you have yarn.acl.enable = true in yarn-site.xml?
> >>> 2. On going to http://dwrdevnn1.sv2.trulia.com:9766 from your browser window, the UI is supposed to display a list of DAGs. Are you able to view them?
> >>>
> >>> Thanks,
> >>> Sreenath
> >>>
> >>> From: Stephen Sprague <sprag...@gmail.com>
> >>> Reply-To: "user@tez.apache.org" <user@tez.apache.org>
> >>> Date: Sunday, October 16, 2016 at 7:16 AM
> >>> To: "user@tez.apache.org" <user@tez.apache.org>
> >>> Subject: Tez UI
> >>>
> >>> hey guys,
> >>> i'm having a hard time getting the Tez UI to work. I'm sure i'm doing something wrong but i can't seem to figure it out. Here's my scenario.
> >>>
> >>> 1. i'm using nginx as the webserver. port 9766. using that port without params correctly displays index.html.
> >>> (i followed the instructions on unzipping the war file - that seems ok - i'm using tez-ui2 )
> >>>
> >>> 2. i run a Tez job. It runs fine.
> >>>
> >>> 3. i click on the "History" hyperlink in the RM UI at 8088.
> >>>
> >>> 4. it attempts to run http://dwrdevnn1.sv2.trulia.com:8088/proxy/application_1476574340629_0001/#/tez-app/application_1476574340629_0001
> >>>
> >>> 5. which yields this error:
> >>>
> >>> <image.png>
> >>>
> >>> i see "id: tez_application_1476574340629_0001" is that "tez_" prefix legit?
> >>>
> >>> 6. the ATS is running on port 8188. I've modified the file config/configs.env as well: cf. timeline: "http://dwrdevnn1.sv2.trulia.com:8188",
> >>>
> >>> 7. here are those details:
> >>>
> >>> yarn 29762 1 12 18:10 pts/5 00:00:11 /usr/lib/jvm/java-8-oracle/jre/bin/java -Dproc_timelineserver -Xmx1000m -Dhadoop.log.dir=/var/log/hadoop-yarn -Dyarn.log.dir=/var/log/hadoop-yarn -Dhadoop.log.file=yarn-yarn-timelineserver-dwrdevnn1.log -Dyarn.log.file=yarn-yarn-timelineserver-dwrdevnn1.log -Dyarn.home.dir= -Dyarn.id.str=yarn -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/usr/lib/hadoop/lib/native -Dyarn.policy.file=hadoop-policy.xml -Dhadoop.log.dir=/var/log/hadoop-yarn -Dyarn.log.dir=/var/log/hadoop-yarn -Dhadoop.log.file=yarn-yarn-timelineserver-dwrdevnn1.log -Dyarn.log.file=yarn-yarn-timelineserver-dwrdevnn1.log -Dyarn.home.dir=/usr/lib/hadoop-yarn -Dhadoop.home.dir=/usr/lib/hadoop-yarn -Dhadoop.root.logger=INFO,RFA -Dyarn.root.logger=INFO,RFA -Djava.library.path=/usr/lib/hadoop/lib/native -classpath /etc/hadoop/conf:/etc/hadoop/conf:/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*:/opt/pepperdata/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-yarn/lib/*:/etc/hadoop/conf/timelineserver-config/log4j.properties org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer
> >>>
> >>> $ sudo netstat -lanp |grep 29762
> >>> tcp   0   0 0.0.0.0:10200   0.0.0.0:*   LISTEN   29762/java
> >>> tcp   0   0 0.0.0.0:8188    0.0.0.0:*   LISTEN   29762/java
> >>>
> >>> 8. the configs in yarn-site.xml
> >>> <property>
> >>>   <name>yarn.timeline-service.hostname</name>
> >>>   <value>0.0.0.0</value>
> >>> </property>
> >>> <property>
> >>>   <name>yarn.timeline-service.enabled</name>
> >>>   <value>true</value>
> >>> </property>
> >>> <property>
> >>>   <name>yarn.timeline-service.webapp.address</name>
> >>>   <value>0.0.0.0:8188</value>
> >>> </property>
> >>> <property>
> >>>   <name>yarn.timeline-service.http-cross-origin.enabled</name>
> >>>   <value>true</value>
> >>> </property>
> >>> <property>
> >>>   <name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
> >>>   <value>true</value>
> >>> </property>
> >>>
> >>> 9. and tez-site.xml are as follows:
> >>> <property>
> >>>   <description>Enable Tez to use the Timeline Server for History Logging</description>
> >>>   <name>tez.history.logging.service.class</name>
> >>>   <value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
> >>> </property>
> >>>
> >>> <!-- port 9766 defined in nginx config file -->
> >>> <property>
> >>>   <description>URL for where the Tez UI is hosted</description>
> >>>   <name>tez.tez-ui.history-url.base</name>
> >>>   <value>http://dwrdevnn1.sv2.trulia.com:9766</value>
> >>> </property>
> >>>
> >>> <!-- from tez-ui README.txt -->
> >>> <property>
> >>>   <name>tez.runtime.convert.user-payload.to.history-text</name>
> >>>   <value>true</value>
> >>>   <description>Should be enabled to get the configuration options. If enabled, the config options are set as userpayload per input/output.
> >>>   </description>
> >>> </property>
> >>>
> >>> <property>
> >>>   <name>tez.allow.disabled.timeline-domains</name>
> >>>   <value>true</value>
> >>> </property>
> >>>
> >>> So i don't get it. Any ideas why this fails?
> >>>
> >>> thanks,
> >>> Stephen.
> >>>
> >>> <image.png>
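
For anyone retracing this thread: Gopal's suggestion was to replace 0.0.0.0 in the timeline settings with the real hostname. The snippet below is only a minimal sketch of how one might spot such values in a copy of yarn-site.xml before pushing it to each node. The function name `wildcard_timeline_properties` and the `SAMPLE` text are made up for illustration (the sample just repeats three properties from item 8), and the premise that a wildcard address can send the Tez AM to the wrong host is an assumption drawn from the discussion, not something confirmed here.

```python
import xml.etree.ElementTree as ET

WILDCARD = "0.0.0.0"

# Hypothetical sample: three of the yarn-site.xml properties quoted in item 8.
SAMPLE = """<configuration>
  <property>
    <name>yarn.timeline-service.hostname</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>yarn.timeline-service.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.timeline-service.webapp.address</name>
    <value>0.0.0.0:8188</value>
  </property>
</configuration>"""


def wildcard_timeline_properties(xml_text):
    """Return {name: value} for yarn.timeline-service.* properties
    whose host part is the wildcard address 0.0.0.0."""
    root = ET.fromstring(xml_text)
    flagged = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        # Compare only the host part, so "0.0.0.0:8188" is flagged too.
        host = value.split(":")[0]
        if name.startswith("yarn.timeline-service.") and host == WILDCARD:
            flagged[name] = value
    return flagged


if __name__ == "__main__":
    for name, value in wildcard_timeline_properties(SAMPLE).items():
        print(f"{name} = {value}")
```

Running the same check against the yarn-site.xml on the Hive CLI host, the HiveServer2 host, and a datanode would show whether all three copies agree, which is what Hitesh's reply asks for.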