[ https://issues.apache.org/jira/browse/IMPALA-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16943685#comment-16943685 ]

Tim Armstrong commented on IMPALA-8972:
---------------------------------------

https://cwiki.apache.org/confluence/display/IMPALA/Debugging+Impala+Minidumps 
has instructions.
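As a rough sketch of the workflow that page walks through (the symbol directory and output file here are assumptions; the .dmp path is the one impalad logged below, and you need Breakpad symbols generated from the exact impalad build that crashed):

```shell
# Hypothetical layout: symbols/ must contain Breakpad symbol files dumped
# from the impalad binary that produced this minidump.
minidump_stackwalk \
  /var/log/impala/minidumps/impalad/6352d57e-7493-b4db-27e7f36f-518eec8e.dmp \
  symbols/ > stacktrace.txt 2>/dev/null

# The top of stacktrace.txt identifies the crashing thread and frame.
head -n 40 stacktrace.txt
```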

If you have access to the system, it may be easier to enable core dumps 
(there's an option in CM) and then open the core dumps with GDB. The core 
dumps are huge, but there are fewer steps involved in opening them.
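A minimal sketch of that core-dump route (the impalad binary path is an assumption for a parcel-based CDH install; the pid 4881 is from the crash log below, and the core usually lands in the daemon's working directory):

```shell
# Allow core files to be written; the CM option achieves the same thing
# for the managed impalad process.
ulimit -c unlimited

# After the next crash, open the core against the matching binary, e.g.:
#   gdb /opt/cloudera/parcels/CDH/lib/impala/sbin/impalad core.4881
# and inside GDB dump every thread's backtrace:
#   (gdb) thread apply all bt
```

GDB needs the exact binary that produced the core; a mismatched build gives garbage backtraces.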

> Impala daemon crashing frequently
> ---------------------------------
>
>                 Key: IMPALA-8972
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8972
>             Project: IMPALA
>          Issue Type: Question
>          Components: Infrastructure
>    Affects Versions: Impala 2.8.0
>         Environment: Impala version 2.8.0-cdh5-INTERNAL RELEASE (build )
>            Reporter: Ashok
>            Priority: Major
>
> Hi Team,
>  
> Impala daemon is crashing frequently and needs to be restarted.
>  
> Please help troubleshoot this.
>  
> I can see the error messages below in the daemon logs.
>  
> 1.
>  
> Java exception follows:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/ppd_ro_im_dpart_bkp/_impala_insert_staging/e847d4231bb8c531_c166c98d00000000/.e847d4231bb8c531-c166c98d00000002_664317806_dir/e847d4231bb8c531-c166c98d00000002_844965293_data.0.parq (inode 17854099): File does not exist. Holder DFSClient_NONMAPREDUCE_-924590406_1 does not have any open files.
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3635)
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3438)
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3294)
>  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
>  at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
>  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
>  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1472)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1409)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>  at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:409)
>  at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>  at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
>  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1739)
>  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1535)
>  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:689)
> Wrote minidump to /var/log/impala/minidumps/impalad/6352d57e-7493-b4db-27e7f36f-518eec8e.dmp
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x00007f42744b72fc, pid=4881, tid=139899197028096
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_80-b15) (build 1.7.0_80-b15)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # C [libkudu_client.so.0+0x27d2fc] void std::_Rb_tree<std::string, std::pair<std::string const, std::string>, std::_Select1st<std::pair<std::string const, std::string> >, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >::_M_insert_unique<std::_Rb_tree_iterator<std::pair<std::string const, std::string> > >(std::_Rb_tree_iterator<std::pair<std::string const, std::string> >, std::_Rb_tree_iterator<std::pair<std::string const, std::string> >)+0x2381c
>  
> 2.
>  
> W0917 22:35:49.505252  1265 BlockReaderFactory.java:778] I/O error constructing remote block reader.
> W0917 22:35:49.505252  1265 BlockReaderFactory.java:778] I/O error constructing remote block reader.
> Java exception follows:
> java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.111.92.61:46531, remote=/10.111.92.61:50010, for file /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/graph_ppt_in_list/job_id=149042/ea46e454a6357b08-2622f21800000002_731828525_data.0., for pool BP-1380753826-10.128.50.16-1462783635263 block 1081077775_7337065
>  at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
>  at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
>  at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
>  at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:889)
>  at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:965)
>  at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:147)
> W0917 22:35:49.506021  1265 DFSInputStream.java:699] Failed to connect to /10.111.92.61:50010 for block, add to deadNodes and continue. java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.111.92.61:46531, remote=/10.111.92.61:50010, for file /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/graph_ppt_in_list/job_id=149042/ea46e454a6357b08-2622f21800000002_731828525_data.0., for pool BP-1380753826-10.128.50.16-1462783635263 block 1081077775_7337065
> Java exception follows:
> java.io.IOException: Got Aborting Impala for OP_READ_BLOCK, status=ERROR, self=/10.111.92.61:46531, remote=/10.111.92.61:50010, for file /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/graph_ppt_in_list/job_id=149042/ea46e454a6357b08-2622f21800000002_731828525_data.0., for pool BP-1380753826-10.128.50.16-1462783635263 block 1081077775_7337065
>  at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
>  at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
>  at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
>  at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:889)
>  at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:965)
>  at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:147)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
