The RM needs CORS support turned on so that the Tez UI can query the live Tez AM. This support was added in Hadoop 2.7.2.
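Once CORS is configured, a small check along these lines may help confirm that the RM is actually sending the header. This is only a sketch: `cors_allows` and `fetch_headers` are names made up for this example, and the host names are the ones from the browser error quoted in this thread. The request sends an `Origin` header the way the browser does and then looks for `Access-Control-Allow-Origin` in the response. As for the settings themselves, I believe they are `hadoop.http.cross-origin.enabled` plus adding `org.apache.hadoop.security.HttpCrossOriginFilterInitializer` to `hadoop.http.filter.initializers` in core-site.xml, but please verify that against the HttpAuthentication page for your Hadoop version.

```python
import urllib.request


def cors_allows(response_headers, origin):
    """Return True if the response headers permit requests from `origin`."""
    # Header names are case-insensitive, so normalise them before the lookup.
    normalised = {name.lower(): value for name, value in response_headers.items()}
    allowed = normalised.get("access-control-allow-origin")
    # The server may answer with the wildcard or echo the requesting origin.
    return allowed is not None and allowed.strip() in ("*", origin)


def fetch_headers(url, origin, timeout=10):
    """GET `url` with an Origin header, as a browser would, and return the response headers."""
    request = urllib.request.Request(url, headers={"Origin": origin})
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())


# Example (needs network access to the RM while the AM is still running;
# the URL and origin are copied from the error message in this thread):
# headers = fetch_headers(
#     "http://rm:8088/proxy/application_1475091857089_0021/ws/v1/tez/dagProgress?dagID=1",
#     "http://tez-dev.internal:8080",
# )
# print(cors_allows(headers, "http://tez-dev.internal:8080"))
```

If `cors_allows` comes back false for the Tez UI origin, the browser will keep rejecting the dagProgress calls no matter what the Tez UI itself is configured to do.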
https://hadoop.apache.org/docs/current2/hadoop-project-dist/hadoop-common/HttpAuthentication.html

On Wed, Sep 28, 2016 at 6:28 PM, Madhusudan Ramanna <m.rama...@ymail.com> wrote:

> Well, don't see much in yarn logs
>
> However, in the browser console of tez-ui we see:
>
> >>>>>>>>
>
> XMLHttpRequest cannot load http://rm:8088/proxy/application_1475091857089_0021/ws/v1/tez/dagProgress?dagID=1&_=1475104772228. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'http://tez-dev.internal:8080' is therefore not allowed access.
>
> <<<<<<<
>
> Is proxy supposed to redirect http://rm:8088/proxy/application_1475091857089_0021/ws/v1/tez/dagProgress?dagID=1&_=1475104772228 to
>
> http://timelineserver:8188/ws/v1/tez/dagProgress?dagID=1&_=1475104772228
>
>
> On Wednesday, September 28, 2016 3:02 PM, Hitesh Shah <hit...@apache.org> wrote:
>
>
> Ok thanks - so it does look like it is publishing data correctly for the most part. You may wish to start digging through the yarn app logs for which data is not showing up as well as the yarn timeline logs to see if there are any exceptions being thrown.
> > — Hitesh > > > > On Sep 28, 2016, at 2:50 PM, Madhusudan Ramanna <m.rama...@ymail.com> > wrote: > > > > Here is history.txt.appattempt_1475091857089_0015_000001 (clipped) > > > > > > {"entity":"tez_application_1475091857089_0015"," > entitytype":"TEZ_APPLICATION","otherinfo":{"user":"apxqueue","config":""} > > {"entity":"tez_appattempt_1475091857089_0015_000001"," > entitytype":"TEZ_APPLICATION_ATTEMPT","relatedEntities":[{" > entity":"application_1475091857089_0015","entitytype":"applicationId"},{ > "entity":"appattempt_1475091857089_0015_000001","entitytype":" > applicationAttemptId"}],"events":[{"ts":1475098751116," > eventtype":"AM_LAUNCHED"}],"otherinfo":{"appSubmitTime":1475098748512}} > > {"entity":"tez_appattempt_1475091857089_0015_000001"," > entitytype":"TEZ_APPLICATION_ATTEMPT","relatedEntities":[{" > entity":"application_1475091857089_0015","entitytype":"applicationId"},{ > "entity":"appattempt_1475091857089_0015_000001","entitytype":" > applicationAttemptId"}],"events":[{"ts":1475098753324," > eventtype":"AM_STARTED"}]} > > {"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ > ID","relatedEntities":[{"entity":"tez_application_1475091857089_0015"," > entitytype":"TEZ_APPLICATION"},{"entity":"tez_appattempt_ > 1475091857089_0015_000001","entitytype":"TEZ_APPLICATION_ > ATTEMPT"},{"entity":"application_1475091857089_0015","entitytype":" > applicationId"},{"entity":"appattempt_1475091857089_0015_ > 000001","entitytype":"applicationAttemptId"},{"entity":"apxqueue"," > entitytype":"user"}],"primaryfilters":{"dagName":"pager:0.1.3-SNAPSHOT"," > callerId":"application_1475091857089_0015","callerType":"Coordinator"}," > events":[{"ts":1475098772855,"eventtype":"DAG_SUBMITTED"}]," > otherinfo":{"dagPlan":{"dagName":"pager:0.1.3-SNAPSHOT","dagContext":{" > callerId":"application_1475091857089_0015","callerType":"Coordinator"," > context":"Coordinator","description":"Tez graph 'pager:0.1.3-SNAPSHOT'"}," > version":2,"vertices":[{"vertexName":"parser"," 
> processorClass":"com.xyz.cv2.mrv2.ShimMapper","outEdgeIds":["772105978"]," > additionalInputs":[{"name":"_initial","class":"org.apache. > tez.mapreduce.input.MRInput","initializer":"org.apache.tez. > mapreduce.common.MRInputAMSplitGenerator"}]},{"vertexName":"pager"," > processorClass":"com.xyz.cv2.mrv2.ShimMapper","inEdgeIds":[ > "772105978"]}],"edges":[{"edgeId":"772105978","inputVertexName":"parser"," > outputVertexName":"pager","dataMovementType":"ONE_TO_ONE" > ,"dataSourceType":"PERSISTED","schedulingType":"SEQUENTIAL", > "edgeSourceClass":"org.apache.tez.runtime.library.output. > UnorderedKVOutput","edgeDestinationClass":"org.apache.tez.runtime.library. > input.UnorderedKVInput"}]},"callerId":"application_1475091857089_0015"," > callerType":"Coordinator"}} > > {"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ > ID","events":[{"ts":1475098773464,"eventtype":" > DAG_INITIALIZED"}],"otherinfo":{"vertexNameIdMapping":{" > pager":"vertex_1475091857089_0015_1_01","parser":"vertex_ > 1475091857089_0015_1_00"}}} > > {"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ > ID","events":[{"ts":1475098773467,"eventtype":"DAG_STARTED"}]} > > {"entity":"vertex_1475091857089_0015_1_00"," > entitytype":"TEZ_VERTEX_ID","relatedEntities":[{"entity":" > dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID"}]," > events":[{"ts":1475098773628,"eventtype":"VERTEX_ > INITIALIZED"}],"otherinfo":{"vertexName":"parser","initRequestedTime": > 1475098773472,"initTime":1475098773628,"numTasks":1,"processorClassName":" > com.xyz.cv2.mrv2.ShimMapper","servicePlugin":{" > taskSchedulerName":"TezYarn","taskSchedulerClassName":"org. > apache.tez.dag.app.rm.YarnTaskSchedulerService","taskCommunicatorName":" > TezYarn","taskCommunicatorClassName":"org.apache.tez.dag.app. > TezTaskCommunicatorImpl","containerLauncherName":"TezYarn"," > containerLauncherClassName":"org.apache.tez.dag.app.launcher. 
> TezContainerLauncherImpl"}}} > > {"entity":"vertex_1475091857089_0015_1_00"," > entitytype":"TEZ_VERTEX_ID","relatedEntities":[{"entity":" > dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID"}]," > events":[{"ts":1475098773630,"eventtype":"VERTEX_STARTED"}],"otherinfo":{" > startRequestedTime":1475098773507,"startTime":1475098773630}} > > {"entity":"vertex_1475091857089_0015_1_00"," > entitytype":"TEZ_VERTEX_ID","events":[{"ts":0,"eventtype":" > VERTEX_CONFIGURE_DONE","eventinfo":{"numTasks":1}}],"otherinfo":{}} > > {"entity":"vertex_1475091857089_0015_1_01"," > entitytype":"TEZ_VERTEX_ID","relatedEntities":[{"entity":" > dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID"}]," > events":[{"ts":1475098773639,"eventtype":"VERTEX_ > INITIALIZED"}],"otherinfo":{"vertexName":"pager","initRequestedTime": > 1475098773507,"initTime":1475098773639,"numTasks":1,"processorClassName":" > com.xyz.cv2.mrv2.ShimMapper","servicePlugin":{" > taskSchedulerName":"TezYarn","taskSchedulerClassName":"org. > apache.tez.dag.app.rm.YarnTaskSchedulerService","taskCommunicatorName":" > TezYarn","taskCommunicatorClassName":"org.apache.tez.dag.app. > TezTaskCommunicatorImpl","containerLauncherName":"TezYarn"," > containerLauncherClassName":"org.apache.tez.dag.app.launcher. > TezContainerLauncherImpl"}}} > > {"entity":"vertex_1475091857089_0015_1_01"," > entitytype":"TEZ_VERTEX_ID","events":[{"ts":0,"eventtype":" > VERTEX_CONFIGURE_DONE","eventinfo":{"numTasks":1," > updatedEdgeManagers":{"parser":{"schedulingType":" > SEQUENTIAL","edgeSourceClass":"org.apache.tez.runtime.library.output. > UnorderedKVOutput","dataMovementType":"ONE_TO_ONE" > ,"edgeDestinationClass":"org.apache.tez.runtime.library. 
> input.UnorderedKVInput","dataSourceType":"PERSISTED"}}}}],"otherinfo":{}} > > {"entity":"vertex_1475091857089_0015_1_01"," > entitytype":"TEZ_VERTEX_ID","relatedEntities":[{"entity":" > dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ID"}]," > events":[{"ts":1475098773642,"eventtype":"VERTEX_STARTED"}],"otherinfo":{" > startRequestedTime":1475098773636,"startTime":1475098773642}} > > {"entity":"task_1475091857089_0015_1_00_000000","entitytype" > :"TEZ_TASK_ID","relatedEntities":[{"entity":"vertex_1475091857089_0015_1_ > 00","entitytype":"TEZ_VERTEX_ID"}],"events":[{"ts": > 1475098773645,"eventtype":"TASK_STARTED"}],"otherinfo":{" > startTime":1475098773645,"scheduledTime":1475098773645}} > > {"entity":"tez_container_1475091857089_0015_01_000002", > "entitytype":"TEZ_CONTAINER_ID","relatedEntities":[{"entity":"appattempt_ > 1475091857089_0015_000001","entitytype":"TEZ_APPLICATION_ > ATTEMPT"},{"entity":"container_1475091857089_0015_ > 01_000002","entitytype":"containerId"}],"events":[{"ts" > :1475098775614,"eventtype":"CONTAINER_LAUNCHED"}]} > > {"entity":"attempt_1475091857089_0015_1_00_000000_0","entitytype":"TEZ_ > TASK_ATTEMPT_ID","relatedEntities":[{"entity":"ip-10-1-2-173.us-west-2. > compute.internal:8041","entitytype":"nodeId"},{"entity":"container_ > 1475091857089_0015_01_000002","entitytype":"containerId"},{" > entity":"task_1475091857089_0015_1_00_000000","entitytype" > :"TEZ_TASK_ID"}],"events":[{"ts":1475098782472,"eventtype": > "TASK_ATTEMPT_STARTED"}],"otherinfo":{"inProgressLogsURL":"ip-10-1-2- > 173.us-west-2.compute.internal:8042\/node\/containerlogs\/container_ > 1475091857089_0015_01_000002\/apxqueue","completedLogsURL":" > http:\/\/ip-10-1-3-71.us-west-2.compute.internal:19888\/ > jobhistory\/logs\/\/ip-10-1-2-173.us-west-2.compute. 
> internal:8041\/container_1475091857089_0015_01_000002\/v_parser_attempt_ > 1475091857089_0015_1_00_000000_0\/apxqueue"}} > > {"entity":"attempt_1475091857089_0015_1_00_000000_0","entitytype":"TEZ_ > TASK_ATTEMPT_ID","events":[{"ts":1475098785740,"eventtype": > "TASK_ATTEMPT_FINISHED"}],"otherinfo":{"creationTime": > 1475098773667,"allocationTime":1475098775535,"startTime": > 1475098782472,"endTime":1475098785740,"timeTaken": > 3268,"status":"SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{" > counterGroupName":"org.apache.tez.common.counters. > DAGCounter","counters":[{"counterName":"DATA_LOCAL_ > TASKS","counterValue":1}]},{"counterGroupName":"org.apache. > tez.common.counters.FileSystemCounter","counterGroupDisplayName":"File > System Counters","counters":[{"counterName":"FILE_BYTES_ > WRITTEN","counterValue":42},{"counterName":"HDFS_BYTES_READ" > ,"counterValue":14297},{"counterName":"HDFS_READ_OPS"," > counterValue":2}]},{"counterGroupName":"org.apache.tez.common.counters. 
> TaskCounter","counters":[{"counterName":"GC_TIME_MILLIS", > "counterValue":265},{"counterName":"CPU_MILLISECONDS","counterValue": > 7860},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":265814016},{" > counterName":"VIRTUAL_MEMORY_BYTES","counterValue": > 5392384000},{"counterName":"COMMITTED_HEAP_BYTES"," > counterValue":265814016},{"counterName":"INPUT_RECORDS_ > PROCESSED","counterValue":1},{"counterName":"INPUT_SPLIT_ > LENGTH_BYTES","counterValue":14297},{"counterName":"OUTPUT_ > BYTES_WITH_OVERHEAD","counterValue":6},{"counterName":"OUTPUT_BYTES_ > PHYSICAL","counterValue":34}]}]},"lastDataEvents":{" > lastDataEvents":[{"TEZ_TASK_ATTEMPT_ID":"","ts":1475098773609}]}}} > > {"entity":"task_1475091857089_0015_1_00_000000","entitytype" > :"TEZ_TASK_ID","events":[{"ts":1475098785751,"eventtype":" > TASK_FINISHED"}],"otherinfo":{"startTime":1475098782472," > endTime":1475098785751,"timeTaken":3279,"status":" > SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{" > counterGroupName":"org.apache.tez.common.counters. > DAGCounter","counters":[{"counterName":"DATA_LOCAL_ > TASKS","counterValue":1}]},{"counterGroupName":"org.apache. > tez.common.counters.FileSystemCounter","counterGroupDisplayName":"File > System Counters","counters":[{"counterName":"FILE_BYTES_ > WRITTEN","counterValue":42},{"counterName":"HDFS_BYTES_READ" > ,"counterValue":14297},{"counterName":"HDFS_READ_OPS"," > counterValue":2}]},{"counterGroupName":"org.apache.tez.common.counters. 
> TaskCounter","counters":[{"counterName":"GC_TIME_MILLIS", > "counterValue":265},{"counterName":"CPU_MILLISECONDS","counterValue": > 7860},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":265814016},{" > counterName":"VIRTUAL_MEMORY_BYTES","counterValue": > 5392384000},{"counterName":"COMMITTED_HEAP_BYTES"," > counterValue":265814016},{"counterName":"INPUT_RECORDS_ > PROCESSED","counterValue":1},{"counterName":"INPUT_SPLIT_ > LENGTH_BYTES","counterValue":14297},{"counterName":"OUTPUT_ > BYTES_WITH_OVERHEAD","counterValue":6},{"counterName":"OUTPUT_BYTES_ > PHYSICAL","counterValue":34}]}]},"successfulAttemptId":" > attempt_1475091857089_0015_1_00_000000_0"}} > > {"entity":"vertex_1475091857089_0015_1_00"," > entitytype":"TEZ_VERTEX_ID","events":[{"ts":1475098785759," > eventtype":"VERTEX_FINISHED"}],"otherinfo":{"endTime": > 1475098785759,"timeTaken":12129,"status":"SUCCEEDED"," > diagnostics":"","counters":{"counterGroups":[{" > counterGroupName":"org.apache.tez.common.counters. > DAGCounter","counters":[{"counterName":"DATA_LOCAL_ > TASKS","counterValue":1}]},{"counterGroupName":"org.apache. > tez.common.counters.FileSystemCounter","counterGroupDisplayName":"File > System Counters","counters":[{"counterName":"FILE_BYTES_ > WRITTEN","counterValue":42},{"counterName":"HDFS_BYTES_READ" > ,"counterValue":14297},{"counterName":"HDFS_READ_OPS"," > counterValue":2}]},{"counterGroupName":"org.apache.tez.common.counters. 
> TaskCounter","counters":[{"counterName":"GC_TIME_MILLIS", > "counterValue":265},{"counterName":"CPU_MILLISECONDS","counterValue": > 7860},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":265814016},{" > counterName":"VIRTUAL_MEMORY_BYTES","counterValue": > 5392384000},{"counterName":"COMMITTED_HEAP_BYTES"," > counterValue":265814016},{"counterName":"INPUT_RECORDS_ > PROCESSED","counterValue":1},{"counterName":"INPUT_SPLIT_ > LENGTH_BYTES","counterValue":14297},{"counterName":"OUTPUT_ > BYTES_WITH_OVERHEAD","counterValue":6},{"counterName":"OUTPUT_BYTES_ > PHYSICAL","counterValue":34}]}]},"stats":{"firstTaskStartTime": > 1475098782472,"firstTasksToStart":["task_1475091857089_0015_1_00_ > 000000"],"lastTaskFinishTime":1475098785740,"lastTasksToFinish":["task_ > 1475091857089_0015_1_00_000000"],"minTaskDuration": > 3268,"maxTaskDuration":3268,"avgTaskDuration":3268," > shortestDurationTasks":["task_1475091857089_0015_1_00_000000"]," > longestDurationTasks":["task_1475091857089_0015_1_00_000000"]}," > numFailedTaskAttempts":0,"numKilledTaskAttempts":0,"numCompletedTasks":1," > numSucceededTasks":1,"numKilledTasks":0,"numFailedTasks":0," > servicePlugin":{"taskSchedulerName":"TezYarn"," > taskSchedulerClassName":"org.apache.tez.dag.app.rm. > YarnTaskSchedulerService","taskCommunicatorName":"TezYarn"," > taskCommunicatorClassName":"org.apache.tez.dag.app. > TezTaskCommunicatorImpl","containerLauncherName":"TezYarn"," > containerLauncherClassName":"org.apache.tez.dag.app.launcher. 
> TezContainerLauncherImpl"}}} > > {"entity":"task_1475091857089_0015_1_01_000000","entitytype" > :"TEZ_TASK_ID","relatedEntities":[{"entity":"vertex_1475091857089_0015_1_ > 01","entitytype":"TEZ_VERTEX_ID"}],"events":[{"ts": > 1475098785787,"eventtype":"TASK_STARTED"}],"otherinfo":{" > startTime":1475098785787,"scheduledTime":1475098785787}} > > {"entity":"attempt_1475091857089_0015_1_01_000000_0","entitytype":"TEZ_ > TASK_ATTEMPT_ID","relatedEntities":[{"entity":"ip-10-1-2-173.us-west-2. > compute.internal:8041","entitytype":"nodeId"},{"entity":"container_ > 1475091857089_0015_01_000002","entitytype":"containerId"},{" > entity":"task_1475091857089_0015_1_01_000000","entitytype" > :"TEZ_TASK_ID"}],"events":[{"ts":1475098785847,"eventtype": > "TASK_ATTEMPT_STARTED"}],"otherinfo":{"inProgressLogsURL":"ip-10-1-2- > 173.us-west-2.compute.internal:8042\/node\/containerlogs\/container_ > 1475091857089_0015_01_000002\/apxqueue","completedLogsURL":" > http:\/\/ip-10-1-3-71.us-west-2.compute.internal:19888\/ > jobhistory\/logs\/\/ip-10-1-2-173.us-west-2.compute. > internal:8041\/container_1475091857089_0015_01_000002\/ > v_pager_attempt_1475091857089_0015_1_01_000000_0\/apxqueue"}} > > {"entity":"attempt_1475091857089_0015_1_01_000000_0","entitytype":"TEZ_ > TASK_ATTEMPT_ID","events":[{"ts":1475098785988,"eventtype": > "TASK_ATTEMPT_FINISHED"}],"otherinfo":{"creationTime": > 1475098785788,"allocationTime":1475098785793,"startTime": > 1475098785847,"endTime":1475098785988,"timeTaken":141, > "status":"SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{" > counterGroupName":"org.apache.tez.common.counters. > DAGCounter","counters":[{"counterName":"OTHER_LOCAL_ > TASKS","counterValue":1}]},{"counterGroupName":"org.apache. 
> tez.common.counters.TaskCounter","counters":[{"counterName":"CPU_ > MILLISECONDS","counterValue":330},{"counterName":"PHYSICAL_ > MEMORY_BYTES","counterValue":265814016},{"counterName":" > VIRTUAL_MEMORY_BYTES","counterValue":5404876800},{" > counterName":"COMMITTED_HEAP_BYTES","counterValue": > 265814016},{"counterName":"SHUFFLE_PHASE_TIME","counterValue":27},{" > counterName":"FIRST_EVENT_RECEIVED","counterValue":25},{ > "counterName":"LAST_EVENT_RECEIVED","counterValue":25}]} > ]},"lastDataEvents":{"lastDataEvents":[{"TEZ_TASK_ATTEMPT_ID":"attempt_ > 1475091857089_0015_1_00_000000_0","ts":1475098785739}]}}} > > {"entity":"task_1475091857089_0015_1_01_000000","entitytype" > :"TEZ_TASK_ID","events":[{"ts":1475098785989,"eventtype":" > TASK_FINISHED"}],"otherinfo":{"startTime":1475098785847," > endTime":1475098785989,"timeTaken":142,"status":" > SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{" > counterGroupName":"org.apache.tez.common.counters. > DAGCounter","counters":[{"counterName":"OTHER_LOCAL_ > TASKS","counterValue":1}]},{"counterGroupName":"org.apache. > tez.common.counters.TaskCounter","counters":[{"counterName":"CPU_ > MILLISECONDS","counterValue":330},{"counterName":"PHYSICAL_ > MEMORY_BYTES","counterValue":265814016},{"counterName":" > VIRTUAL_MEMORY_BYTES","counterValue":5404876800},{" > counterName":"COMMITTED_HEAP_BYTES","counterValue": > 265814016},{"counterName":"SHUFFLE_PHASE_TIME","counterValue":27},{" > counterName":"FIRST_EVENT_RECEIVED","counterValue":25},{ > "counterName":"LAST_EVENT_RECEIVED","counterValue":25}]} > ]},"successfulAttemptId":"attempt_1475091857089_0015_1_01_000000_0"}} > > {"entity":"vertex_1475091857089_0015_1_01"," > entitytype":"TEZ_VERTEX_ID","events":[{"ts":1475098785990," > eventtype":"VERTEX_FINISHED"}],"otherinfo":{"endTime": > 1475098785990,"timeTaken":12348,"status":"SUCCEEDED"," > diagnostics":"","counters":{"counterGroups":[{" > counterGroupName":"org.apache.tez.common.counters. 
> DAGCounter","counters":[{"counterName":"OTHER_LOCAL_ > TASKS","counterValue":1}]},{"counterGroupName":"org.apache. > tez.common.counters.TaskCounter","counters":[{"counterName":"CPU_ > MILLISECONDS","counterValue":330},{"counterName":"PHYSICAL_ > MEMORY_BYTES","counterValue":265814016},{"counterName":" > VIRTUAL_MEMORY_BYTES","counterValue":5404876800},{" > counterName":"COMMITTED_HEAP_BYTES","counterValue": > 265814016},{"counterName":"SHUFFLE_PHASE_TIME","counterValue":27},{" > counterName":"FIRST_EVENT_RECEIVED","counterValue":25},{ > "counterName":"LAST_EVENT_RECEIVED","counterValue":25}]}]},"stats":{" > firstTaskStartTime":1475098785847,"firstTasksToStart":["task_ > 1475091857089_0015_1_01_000000"],"lastTaskFinishTime":1475098785988," > lastTasksToFinish":["task_1475091857089_0015_1_01_ > 000000"],"minTaskDuration":141,"maxTaskDuration":141," > avgTaskDuration":141,"shortestDurationTasks":["task_ > 1475091857089_0015_1_01_000000"],"longestDurationTasks":["task_ > 1475091857089_0015_1_01_000000"]},"numFailedTaskAttempts":0," > numKilledTaskAttempts":0,"numCompletedTasks":1,"numSucceededTasks":1," > numKilledTasks":0,"numFailedTasks":0,"servicePlugin":{" > taskSchedulerName":"TezYarn","taskSchedulerClassName":"org. > apache.tez.dag.app.rm.YarnTaskSchedulerService","taskCommunicatorName":" > TezYarn","taskCommunicatorClassName":"org.apache.tez.dag.app. > TezTaskCommunicatorImpl","containerLauncherName":"TezYarn"," > containerLauncherClassName":"org.apache.tez.dag.app.launcher. > TezContainerLauncherImpl"}}} > > {"entity":"dag_1475091857089_0015_1","entitytype":"TEZ_DAG_ > ID","events":[{"ts":1475098785994,"eventtype":" > DAG_FINISHED"}],"otherinfo":{"startTime":1475098773467," > endTime":1475098785994,"timeTaken":12527,"status":" > SUCCEEDED","diagnostics":"","counters":{"counterGroups":[{" > counterGroupName":"org.apache.tez.common.counters. 
> DAGCounter","counters":[{"counterName":"NUM_SUCCEEDED_ > TASKS","counterValue":2},{"counterName":"TOTAL_LAUNCHED_ > TASKS","counterValue":2},{"counterName":"OTHER_LOCAL_ > TASKS","counterValue":1},{"counterName":"DATA_LOCAL_ > TASKS","counterValue":1},{"counterName":"AM_CPU_ > MILLISECONDS","counterValue":1730},{"counterName":"AM_GC_ > TIME_MILLIS","counterValue":82}]},{"counterGroupName":" > org.apache.tez.common.counters.FileSystemCounter","counterGroupDisplayName":"File > System Counters","counters":[{"counterName":"FILE_BYTES_ > WRITTEN","counterValue":42},{"counterName":"HDFS_BYTES_READ" > ,"counterValue":14297},{"counterName":"HDFS_READ_OPS"," > counterValue":2}]},{"counterGroupName":"org.apache.tez.common.counters. > TaskCounter","counters":[{"counterName":"GC_TIME_MILLIS", > "counterValue":265},{"counterName":"CPU_MILLISECONDS","counterValue": > 8190},{"counterName":"PHYSICAL_MEMORY_BYTES","counterValue":531628032},{" > counterName":"VIRTUAL_MEMORY_BYTES","counterValue": > 10797260800},{"counterName":"COMMITTED_HEAP_BYTES"," > counterValue":531628032},{"counterName":"INPUT_RECORDS_ > PROCESSED","counterValue":1},{"counterName":"INPUT_SPLIT_ > LENGTH_BYTES","counterValue":14297},{"counterName":"OUTPUT_ > BYTES_WITH_OVERHEAD","counterValue":6},{"counterName":"OUTPUT_BYTES_ > PHYSICAL","counterValue":34},{"counterName":"SHUFFLE_PHASE_ > TIME","counterValue":27},{"counterName":"FIRST_EVENT_ > RECEIVED","counterValue":25},{"counterName":"LAST_EVENT_ > RECEIVED","counterValue":25}]}]},"completionApplicationAttemptId > ":"appattempt_1475091857089_0015_000001","numFailedTaskAttempts":0," > numKilledTaskAttempts":0,"numCompletedTasks":2,"numSucceededTasks":2," > numKilledTasks":0,"numFailedTasks":0}} > > {"entity":"tez_container_1475091857089_0015_01_000002", > "entitytype":"TEZ_CONTAINER_ID","relatedEntities":[{"entity":"appattempt_ > 1475091857089_0015_000001","entitytype":"TEZ_APPLICATION_ > ATTEMPT"},{"entity":"container_1475091857089_0015_ > 
01_000002","entitytype":"containerId"}],"events":[{"ts" > :1475098794497,"eventtype":"CONTAINER_STOPPED"}]," > otherinfo":{"exitStatus":0}} > > > > > > > > > > On Wednesday, September 28, 2016 1:52 PM, Hitesh Shah <hit...@apache.org> > wrote: > > > > > > To pinpoint the issue, one approach would be to change the history > logger to SimpleHistoryLogger . i.e comment out the property for > tez.history.logging.service.class in the configs so that it falls back to > the default value. This should generate a history log file as part of the > application logs which should help us understand whether tez itself is not > generating the data or YARN timeline is somehow losing it. Any exceptions > in the DAGAppMaster log and/or the yarn timeline logs when this job runs? > > > > — HItesh > > > > > > > > > On Sep 28, 2016, at 1:30 PM, Madhusudan Ramanna <m.rama...@ymail.com> > wrote: > > > > > > Hitesh, > > > > > > Some information like appId is getting through to timeline server, but > not all. See attached. 
> > > > > > Here is the output of > > > > > > http://timelinehost:port/ws/v1/timeline/TEZ_DAG_ID/ > > > {"entities":[{"events":[{"timestamp":1475094093409," > eventtype":"DAG_FINISHED","eventinfo":{}},{"timestamp": > 1475094062692,"eventtype":"DAG_STARTED","eventinfo":{}},{ > "timestamp":1475094062688,"eventtype":"DAG_INITIALIZED"," > eventinfo":{}},{"timestamp":1475094062055,"eventtype":" > DAG_SUBMITTED","eventinfo":{}}],"entitytype":"TEZ_DAG_ID"," > entity":"dag_1475091857089_0007_1","starttime":1475094062055,"domain":" > DEFAULT","relatedentities":{},"primaryfilters":{},"otherinfo":{}}]} > > > > > > http://host:8188/ws/v1/timeline/TEZ_DAG_ID/dag_1475091857089_0007_1 > > > > > > {"events":[{"timestamp":1475094093409,"eventtype":" > DAG_FINISHED","eventinfo":{}},{"timestamp":1475094062692," > eventtype":"DAG_STARTED","eventinfo":{}},{"timestamp": > 1475094062688,"eventtype":"DAG_INITIALIZED","eventinfo":{ > }},{"timestamp":1475094062055,"eventtype":"DAG_SUBMITTED"," > eventinfo":{}}],"entitytype":"TEZ_DAG_ID","entity":"dag_ > 1475091857089_0007_1","starttime":1475094062055,"domain":"DEFAULT"," > relatedentities":{},"primaryfilters":{},"otherinfo":{}} > > > > > > > > > > > > On Wednesday, September 28, 2016 8:44 AM, Hitesh Shah < > hit...@apache.org> wrote: > > > > > > > > > Hello Madhusudan, > > > > > > Thanks for the patience. Let us take this to a jira where once you > attach more logs, we can root cause the issue. > > > > > > A few things to attach to the jira: > > > - yarn-site.xml > > > - tez-site.xml > > > - hadoop version > > > - timeline server log for the time period in question > > > - application logs for any tez app which fails to display > > > - output of http://timelinehost:port/ws/v1/timeline/TEZ_DAG_ID/<dag_id>/ > ( e.g. 
dag_1475014682883_0027_1 ) > > > > > > thanks > > > — Hitesh > > > > > > > On Sep 27, 2016, at 10:42 PM, Madhusudan Ramanna < > m.rama...@ymail.com> wrote: > > > > > > > > So I downloaded Tez commit 91a397b0ba and built the dist package. > We're not seeing the zip exception anymore. > > > > > > > > However, now Tez UI is completely broken. Not at all sure what is > happening here. Please see attached screenshots. > > > > > > > > > > > > 2016-09-28 05:11:40,903 [INFO] [main] |web.WebUIService|: Tez UI > History URL: http://dev-cv2.aws:8080/tez-ui/#/tez-app/application_ > 1475014682883_0027 > > > > 2016-09-28 05:11:40,908 [INFO] [main] |history.HistoryEventHandler|: > Initializing HistoryEventHandler withrecoveryEnabled=true, > historyServiceClassName=org.apache.tez.dag.history.logging.ats. > ATSHistoryLoggingService > > > > 2016-09-28 05:11:41,474 [INFO] [main] |impl.TimelineClientImpl|: > Timeline service address: http://ts-ip.aws:8188/ws/v1/timeline/ > > > > 2016-09-28 05:11:41,474 [INFO] [main] |ats.ATSHistoryLoggingService|: > Initializing ATSHistoryLoggingService with maxEventsPerBatch=5, > maxPollingTime(ms)=10, waitTimeForShutdown(ms)=-1, > TimelineACLManagerClass=org.apache.tez.dag.history.ats.acls. > ATSHistoryACLPolicyManager > > > > 2016-09-28 05:11:41,644 [INFO] [main] |impl.TimelineClientImpl|: > Timeline service address: http://ts-ip.aws:8188/ws/v1/timeline/ > > > > > > > > > > > > >>> DAG Execution > > > > > > > > 2016-09-28 05:11:52,779 [INFO] [IPC Server handler 0 on 44039] > |history.HistoryEventHandler|: [HISTORY][DAG:dag_ > 1475014682883_0027_1][Event:DAG_SUBMITTED]: dagID=dag_1475014682883_0027_1, > submitTime=1475039511185 > > > > > > > > > > > > Timeline server is up and running. 
Tez UI is however not able to > display DAG and other details > > > > > > > > thanks, > > > > Madhu > > > > > > > > > > > > > > > > On Saturday, September 24, 2016 12:01 PM, Hitesh Shah < > hit...@apache.org> wrote: > > > > > > > > > > > > tez-dist tar balls are not published to maven today - only the > module specific jars are. But yes, you could just try a local build to see > if you can reproduce the issue with the commit in question. > > > > > > > > — Hitesh > > > > > > > > > > > > > On Sep 23, 2016, at 6:23 PM, Madhusudan Ramanna < > m.rama...@ymail.com> wrote: > > > > > > > > > > Hitesh and Zhiyuan, > > > > > > > > > > Apache snapshots doesn't seem to have tez-dist > > > > > > > > > > http://repository.apache.org/content/groups/snapshots/org/ > apache/tez/tez-dist/ > > > > > > > > > > The last one seems to be 0.2.0-SNAPSHOT > > > > > > > > > > Should I just download based on the commit and recompile ? > > > > > > > > > > thanks, > > > > > Madhu > > > > > > > > > > > > > > > On Friday, September 23, 2016 5:19 PM, Hitesh Shah < > hit...@apache.org> wrote: > > > > > > > > > > > > > > > Hello Madhusudan, > > > > > > > > > > If you look at the MANIFEST.MF inside any of the tez jars, it will > provide the commit hash via the SCM-Revision field. > > > > > > > > > > The tez client and the DAGAppMaster also log this info at runtime. > > > > > > > > > > — Hitesh > > > > > > > > > > > On Sep 23, 2016, at 4:08 PM, Madhusudan Ramanna < > m.rama...@ymail.com> wrote: > > > > > > > > > > > > Zhiyuan, > > > > > > > > > > > > We just pulled down the latest snapshot from Apache repository. > Question, is how can I figure out branch and commit information from the > snapshot artifact ? > > > > > > > > > > > > thanks, > > > > > > Madhu > > > > > > > > > > > > > > > > > > On Friday, September 23, 2016 10:38 AM, zhiyuan yang < > sjtu....@gmail.com> wrote: > > > > > > > > > > > > > > > > > > Hi Madhu, > > > > > > > > > > > > It looks like a Inflater-Deflater mismatch to me. 
From stack > traces I see you cherry-picked this patch instead of using master branch. > > > > > > Would you mind double check whether the patch is correctly > cherry-picked? > > > > > > > > > > > > Thanks! > > > > > > Zhiyuan > > > > > > > > > > > >> On Sep 23, 2016, at 10:21 AM, Madhusudan Ramanna < > m.rama...@ymail.com> wrote: > > > > > >> > > > > > >> Hello, > > > > > >> > > > > > >> We're using the Apache snapshot repository to pull latest tez > snapshots. > > > > > >> > > > > > >> We've started seeing this exception: > > > > > >> > > > > > >> org.apache.tez.dag.api.TezUncheckedException: > java.util.zip.ZipException: incorrect header check > > > > > >> at org.apache.tez.dag.library.vertexmanager. > ShuffleVertexManager.handleVertexManagerEvent( > ShuffleVertexManager.java:622) > > > > > >> at org.apache.tez.dag.library.vertexmanager. > ShuffleVertexManager.onVertexManagerEventReceived( > ShuffleVertexManager.java:579) > > > > > >> at org.apache.tez.dag.app.dag.impl.VertexManager$ > VertexManagerEventReceived.invoke(VertexManager.java:606) > > > > > >> at org.apache.tez.dag.app.dag.impl.VertexManager$ > VertexManagerEvent$1.run(VertexManager.java:647) > > > > > >> at org.apache.tez.dag.app.dag.impl.VertexManager$ > VertexManagerEvent$1.run(VertexManager.java:642) > > > > > >> at java.security.AccessController.doPrivileged(Native Method) > > > > > >> at javax.security.auth.Subject.doAs(Subject.java:422) > > > > > >> at org.apache.hadoop.security.UserGroupInformation.doAs( > UserGroupInformation.java:1628) > > > > > >> at org.apache.tez.dag.app.dag.impl.VertexManager$ > VertexManagerEvent.call(VertexManager.java:642) > > > > > >> at org.apache.tez.dag.app.dag.impl.VertexManager$ > VertexManagerEvent.call(VertexManager.java:631) > > > > > >> at java.util.concurrent.FutureTask.run(FutureTask.java:266) > > > > > >> at java.util.concurrent.ThreadPoolExecutor.runWorker( > ThreadPoolExecutor.java:1142) > > > > > >> at 
java.util.concurrent.ThreadPoolExecutor$Worker.run( > ThreadPoolExecutor.java:617) > > > > > >> at java.lang.Thread.run(Thread.java:745) > > > > > >> Caused by: java.util.zip.ZipException: incorrect header check > > > > > >> at java.util.zip.InflaterInputStream.read( > InflaterInputStream.java:164) > > > > > >> at java.io.FilterInputStream.read(FilterInputStream.java:107) > > > > > >> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1792) > > > > > >> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769) > > > > > >> at org.apache.commons.io.IOUtils.copy(IOUtils.java:1744) > > > > > >> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:462) > > > > > >> > > > > > >> > > > > > >> since this commit > > > > > >> > > > > > >> https://github.com/apache/tez/commit/ > da4098b9d6f72e6d4aacc1623622a0875408d2ba > > > > > >> > > > > > >> > > > > > >> Wanted to bring this to your attention. For now we've locked > the snapshot version down. > > > > > >> > > > > > >> thanks, > > > > > >> Madhu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > <Screen Shot 2016-09-27 at 10.27.02 PM.png><Screen Shot 2016-09-27 > at 10.27.13 PM.png><Screen Shot 2016-09-27 at 10.39.20 PM.png> > > > > > > > > > > > > > > <Screen Shot 2016-09-28 at 1.26.35 PM.png> > > > > > > > > >