Yes, something along those lines.

This might help a bit more: http://techtonka.com/?p=174
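
For example, here is a rough sketch of that search in Python. It assumes the usual audit line layout, a timestamp prefix followed by tab-separated key=value fields (allowed=, ugi=, ip=, cmd=, src=, dst=, perm=). Please check that against your own hdfs-audit.log first. The function names are made up for illustration.

```python
import re

# Session directory taken from the failing query in this thread.
SESSION_DIR = "/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da"

# Assumption: audit fields appear as key=value tokens separated by whitespace/tabs.
FIELD = re.compile(r"(\w+)=(\S+)")


def parse_audit_line(line):
    """Return the key=value fields of one audit line as a dict."""
    return dict(FIELD.findall(line))


def find_accesses(lines, path_prefix=SESSION_DIR, cmds=("delete", "rename")):
    """Yield (timestamp, ugi, cmd, src) for accesses under path_prefix.

    Pass cmds=() to list every access, not only deletes and renames.
    """
    for line in lines:
        fields = parse_audit_line(line)
        src = fields.get("src", "")
        if not src.startswith(path_prefix):
            continue
        if cmds and fields.get("cmd") not in cmds:
            continue
        # Assumption: each line starts with "YYYY-MM-DD HH:MM:SS,mmm" (23 chars).
        timestamp = line[:23]
        yield timestamp, fields.get("ugi"), fields.get("cmd"), src
```

You could open hdfs-audit.log.9 (and the neighbouring rotated files), pass the file object to find_accesses, and compare the timestamps of any hits against the DAG failure time of 2015-06-17 18:00:43.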

thanks
— Hitesh

On Jun 18, 2015, at 6:44 PM, [email protected] wrote:

> 
> A log like this one, hdfs-audit.log.9?
> [email protected]
>  
> From: Hitesh Shah
> Date: 2015-06-19 02:28
> To: user
> Subject: Re: hive 1.1.0 on tez0.53 error
> Also, if you have access to the name node audit logs, can you search for all 
> accesses of 
> "/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/" 
> directory and see if/when someone tried to delete it?
>  
> thanks
> — Hitesh
>  
> On Jun 17, 2015, at 7:57 PM, [email protected] wrote:
>  
> > Here is the Hive log:
> > Status: Running (Executing on YARN cluster with App id 
> > application_1433219182593_180456)
> >
> > Map 1: -/-  Reducer 2: 0/5  Reducer 3: 0/5
> > Map 1: 0(+1)/1      Reducer 2: 0/5  Reducer 3: 0/5
> > Map 1: 1/1  Reducer 2: 0(+1)/5      Reducer 3: 0/5
> > Map 1: 1/1  Reducer 2: 0(+5)/5      Reducer 3: 0/5
> > Map 1: 1/1  Reducer 2: 2(+3)/5      Reducer 3: 0/5
> > Map 1: 1/1  Reducer 2: 4(+1)/5      Reducer 3: 0(+5)/5
> > Map 1: 1/1  Reducer 2: 5/5  Reducer 3: 0(+5)/5
> > Map 1: 1/1  Reducer 2: 5/5  Reducer 3: 5/5
> > Loading data to table testtmp.tmp_pm_cpttr_hot_srch partition (cur_flg=0, 
> > ds=2015-06-16)
> > Partition testtmp.tmp_pm_cpttr_hot_srch{cur_flg=0, ds=2015-06-16} stats: 
> > [numFiles=5, numRows=0, totalSize=0, rawDataSize=0]
> > OK
> > Time taken: 3.885 seconds
> > OK
> > Time taken: 0.266 seconds
> > OK
> > Time taken: 0.067 seconds
> > Query ID = lujian_20150617180000_f048ad51-d72f-458f-8480-bef366606a68
> > Total jobs = 1
> > Launching Job 1 out of 1
> >
> >
> > Status: Running (Executing on YARN cluster with App id 
> > application_1433219182593_180456)
> >
> > Map 1: 0/1  Map 2: -/-
> > Map 1: 0/1  Map 2: 0/1
> > Map 1: 0/1  Map 2: 0/1
> > Map 1: 0(+0,-1)/1   Map 2: 0(+0,-1)/1
> > Map 1: 0(+0,-1)/1   Map 2: 0(+0,-1)/1
> > Map 1: 0(+0,-2)/1   Map 2: 0(+0,-2)/1
> > Map 1: 0(+0,-2)/1   Map 2: 0(+0,-2)/1
> > Map 1: 0(+0,-3)/1   Map 2: 0(+0,-3)/1
> > Status: Failed
> > Vertex failed, vertexName=Map 2, vertexId=vertex_1433219182593_180456_3_01, 
> > diagnostics=[Task failed, taskId=task_1433219182593_180456_3_01_000000, 
> > diagnostics=[TaskAttempt 0 failed, info=[Container 
> > container_1433219182593_180456_01_000014 finished with diagnostics set to 
> > [Container failed. File does not exist: 
> > hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
> > ]], TaskAttempt 1 failed, info=[Container 
> > container_1433219182593_180456_01_000016 finished with diagnostics set to 
> > [Container failed. File does not exist: 
> > hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
> > ]], TaskAttempt 2 failed, info=[Container 
> > container_1433219182593_180456_01_000018 finished with diagnostics set to 
> > [Container failed. File does not exist: 
> > hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
> > ]], TaskAttempt 3 failed, info=[Container 
> > container_1433219182593_180456_01_000020 finished with diagnostics set to 
> > [Container failed. File does not exist: 
> > hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
> > ]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex 
> > vertex_1433219182593_180456_3_01 [Map 2] killed/failed due to:null]
> > Vertex killed, vertexName=Map 1, vertexId=vertex_1433219182593_180456_3_00, 
> > diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as 
> > other vertex failed. failedTasks:0, Vertex vertex_1433219182593_180456_3_00 
> > [Map 1] killed/failed due to:null]
> > DAG failed due to vertex failure. failedVertices:1 killedVertices:1
> > FAILED: Execution Error, return code 2 from 
> > org.apache.hadoop.hive.ql.exec.tez.TezTask
> >
> > I think maybe the first successful job deleted tez-conf.pb?
> >
> > [email protected]
> > 
> > From: Hitesh Shah
> > Date: 2015-06-18 10:46
> > To: user
> > Subject: Re: hive 1.1.0 on tez0.53 error
> > That particular log is a red herring and not really an issue that is 
> > causing the failure.
> > 
> > The main problem based on the log is this:
> > 
> > 2015-06-17 18:00:43,543 INFO [AsyncDispatcher event handler] 
> > history.HistoryEventHandler: 
> > [HISTORY][DAG:dag_1433219182593_180456_3][Event:DAG_FINISHED]: 
> > dagId=dag_1433219182593_180456_3, startTime=1434535228467, 
> > finishTime=1434535243529, timeTaken=15062, status=FAILED, 
> > diagnostics=Vertex failed, vertexName=Map 2, 
> > vertexId=vertex_1433219182593_180456_3_01, diagnostics=[Task failed, 
> > taskId=task_1433219182593_180456_3_01_000000, diagnostics=[TaskAttempt 0 
> > failed, info=[Container container_1433219182593_180456_01_000014 finished 
> > with diagnostics set to [Container failed. File does not exist: 
> > hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
> > ]], TaskAttempt 1 failed, info=[Container 
> > container_1433219182593_180456_01_000016 finished with diagnostics set to 
> > [Container failed. File does not exist: 
> > hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
> > ]], TaskAttempt 2 failed, info=[Container 
> > container_1433219182593_180456_01_000018 finished with diagnostics set to 
> > [Container failed. File does not exist: 
> > hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
> > ]], TaskAttempt 3 failed, info=[Container 
> > container_1433219182593_180456_01_000020 finished with diagnostics set to 
> > [Container failed. File does not exist: 
> > hdfs://yhd-jqhadoop2.int.yihaodian.com:8020/tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/tez-conf.pb
> > ]]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex 
> > vertex_1433219182593_180456_3_01 [Map 2] killed/failed due to:null]
> > Vertex killed, vertexName=Map 1, vertexId=vertex_1433219182593_180456_3_00, 
> > diagnostics=[Vertex received Kill while in RUNNING state., Vertex killed as 
> > other vertex failed. failedTasks:0, Vertex vertex_1433219182593_180456_3_00 
> > [Map 1] killed/failed due to:null]
> > DAG failed due to vertex failure. failedVertices:1 killedVertices:1, 
> > counters=Counters: 2, org.apache.tez.common.counters.DAGCounter, 
> > NUM_FAILED_TASKS=7, NUM_KILLED_TASKS=1
> > 
> > While the DAG was running, it seems the local resources (tez-conf.pb) 
> > needed for the YARN container disappeared. As a result, container 
> > launches started failing, eventually leading to a DAG failure.
> > 
> > — Hitesh
> > 
> > 
> > On Jun 17, 2015, at 7:31 PM, [email protected] wrote:
> > 
> > >  Hive returns:
> > >  Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
> > >
> > >  I checked the log and found:
> > > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
> > >  No lease on 
> > > /tmp/hive/lujian/_tez_session_dir/86bc0010-4816-4251-95aa-bb37b8d029da/.tez/application_1433219182593_180456/recovery/1/summary:
> > >  File does not exist. Holder DFSClient_NONMAPREDUCE_-1030523577_1 does 
> > > not have any open files.
> > > at 
> > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2938)
> > > at 
> > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3002)
> > > at 
> > > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2982)
> > > at 
> > > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:626)
> > >
> > >
> > >
> > > For a more detailed log, see my attachment.
> > >
> > > [email protected]
> > > <tezerror.rar>
