Yes, something along those lines.
This might help a bit more: http://techtonka.com/?p=174
thanks
— Hitesh
On Jun 18, 2015, at 6:44 PM, r7raul1...@163.com wrote:
Do you mean a log like hdfs-audit.log.9?
r7raul1...@163.com
From: Hitesh Shah
Date: 2015-06-19 02:28
To: user
Subject: Re: hive 1.1.0 on tez0.53 error
Also, if you have access to the name node audit logs, can you search for all
accesses of the
/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/
directory and see if/when someone tried to delete it?
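
A minimal sketch of such a search, in case it helps. It assumes the audit log uses the usual key=value layout (allowed=, ugi=, ip=, cmd=, src=, dst=, perm=) with a leading "YYYY-MM-DD HH:MM:SS,mmm" timestamp; the function name find_accesses is made up for illustration and is not part of Hadoop, Hive, or Tez.

```python
import re

# Assumption: each audit line carries whitespace-separated key=value pairs,
# e.g. "... cmd=delete  src=/some/path  dst=null ...".
FIELD_RE = re.compile(r"(\w+)=(\S+)")


def find_accesses(lines, path_prefix, commands=None):
    """Yield (timestamp, fields) for audit entries whose src or dst is under path_prefix.

    If commands is given (e.g. {"delete", "rename"}), only those cmd values are kept.
    """
    prefix = path_prefix.rstrip("/")
    for line in lines:
        fields = dict(FIELD_RE.findall(line))
        src = fields.get("src", "")
        dst = fields.get("dst", "")
        if not (src.startswith(prefix) or dst.startswith(prefix)):
            continue
        if commands is not None and fields.get("cmd") not in commands:
            continue
        # Assumption: the first 23 characters are the log4j timestamp.
        yield line[:23], fields
```

Passing the open file object for hdfs-audit.log.9 as lines, the session directory as path_prefix, and {"delete", "rename"} as commands would list who removed or moved the directory and when. Note that a delete issued on a parent directory (for example /tmp/hive/lujian itself) would not match this prefix test and would need a separate search.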
thanks
— Hitesh
On Jun 17, 2015, at 7:57 PM, r7raul1...@163.com wrote:
Here is hive log:
Status: Running (Executing on YARN cluster with App id
application_1433219182593_180456)
Map 1: -/- Reducer 2: 0/5 Reducer 3: 0/5
Map 1: 0(+1)/1 Reducer 2: 0/5 Reducer 3: 0/5
Map 1: 1/1 Reducer 2: 0(+1)/5 Reducer 3: 0/5
Map 1: 1/1 Reducer 2: 0(+5)/5 Reducer 3: 0/5
Map 1: 1/1 Reducer 2: 2(+3)/5 Reducer 3: 0/5
Map 1: 1/1 Reducer 2: 4(+1)/5 Reducer 3: 0(+5)/5
Map 1: 1/1 Reducer 2: 5/5 Reducer 3: 0(+5)/5
Map 1: 1/1 Reducer 2: 5/5 Reducer 3: 5/5
Loading data to table testtmp.tmp_pm_cpttr_hot_srch partition (cur_flg=0,
ds=2015-06-16)
Partition testtmp.tmp_pm_cpttr_hot_srch{cur_flg=0, ds=2015-06-16} stats:
[numFiles=5, numRows=0, totalSize=0, rawDataSize=0]
OK
Time taken: 3.885 seconds
OK
Time taken: 0.266 seconds
OK
Time taken: 0.067 seconds
Query ID = lujian_2015061718_f048ad51-d72f-458f-8480-bef366606a68
Total jobs = 1
Launching Job 1 out of 1
Status: Running (Executing on YARN cluster with App id
application_1433219182593_180456)
Map 1: 0/1 Map 2: -/-
Map 1: 0/1 Map 2: 0/1
Map 1: 0/1 Map 2: 0/1
Map 1: 0(+0,-1)/1 Map 2: 0(+0,-1)/1
Map 1: 0(+0,-1)/1 Map 2: 0(+0,-1)/1
Map 1: 0(+0,-2)/1 Map 2: 0(+0,-2)/1
Map 1: 0(+0,-2)/1 Map 2: 0(+0,-2)/1
Map 1: 0(+0,-3)/1 Map 2: 0(+0,-3)/1
Status: Failed
Vertex failed, vertexName=Map 2, vertexId=vertex_1433219182593_180456_3_01,
diagnostics=[Task failed, taskId=task_1433219182593_180456_3_01_00,
diagnostics=[TaskAttempt 0 failed, info=[Container
container_1433219182593_180456_01_14 finished with diagnostics set to
[Container failed. File does not exist:
hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
]], TaskAttempt 1 failed, info=[Container
container_1433219182593_180456_01_16 finished with diagnostics set to
[Container failed. File does not exist:
hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
]], TaskAttempt 2 failed, info=[Container
container_1433219182593_180456_01_18 finished with diagnostics set to
[Container failed. File does not exist:
hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
]], TaskAttempt 3 failed, info=[Container
container_1433219182593_180456_01_20 finished with diagnostics set to
[Container failed. File does not exist:
hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex
vertex_1433219182593_180456_3_01 [Map 2] killed/failed due to:null]
Vertex killed, vertexName=Map 1, vertexId=vertex_1433219182593_180456_3_00,
diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as
other vertex failed. failedTasks:0, Vertex vertex_1433219182593_180456_3_00
[Map 1] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:1 killedVertices:1
FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.tez.TezTask
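
The repeated diagnostics above can be condensed with a short script. This is only a sketch: it assumes the console output has been saved as text, and missing_files is a hypothetical helper name, not something provided by Hive or Tez.

```python
import re
from collections import Counter

# Matches the path that follows "File does not exist:", even when the
# console wraps the path onto the next line. Brackets are excluded so a
# trailing "]]" is never captured as part of the path.
MISSING_FILE_RE = re.compile(r"File does not exist:\s*([^\s\]]+)")


def missing_files(log_text):
    """Count how often each path is reported as missing in the diagnostics."""
    return Counter(MISSING_FILE_RE.findall(log_text))
```

Run over the log above, this should show whether every failed task attempt is complaining about the same tez-conf.pb path or about different files.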
I think maybe the first successful job deleted tez-conf.pb?
r7raul1...@163.com
From: Hitesh Shah
Date: 2015-06-18 10:46
To: user
Subject: Re: hive 1.1.0 on tez0.53 error
That particular log is a red herring and not really an issue that is
causing the failure.
The main problem based on the log is this:
2015-06-17 18:00:43,543 INFO [AsyncDispatcher event handler]
history.HistoryEventHandler:
[HISTORY][DAG:dag_1433219182593_180456_3][Event:DAG_FINISHED]:
dagId=dag_1433219182593_180456_3, startTime=1434535228467,
finishTime=1434535243529, timeTaken=15062, status=FAILED,
diagnostics=Vertex failed, vertexName=Map 2,
vertexId=vertex_1433219182593_180456_3_01, diagnostics=[Task failed,
taskId=task_1433219182593_180456_3_01_00, diagnostics=[TaskAttempt 0
failed, info=[Container container_1433219182593_180456_01_14 finished
with diagnostics set to [Container failed. File does not