[ 
https://issues.apache.org/jira/browse/IMPALA-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942145#comment-16942145
 ] 

Tim Armstrong commented on IMPALA-8972:
---------------------------------------

Honestly, the newer the better. We've gotten really good feedback on the
performance and stability of the 2.12 release (CDH 5.15+ if you're using
Cloudera) if you don't want to do a major version upgrade. Otherwise, the 3.x
releases have a lot of new features and improvements, as well as all the good
stuff in the 2.x line.

I can't really offer much specific advice about the crash; I don't
recognise the backtrace, and I have very limited cycles to spend trying to
identify bugs that we've probably already fixed. If you can track down the
query that's causing it, that's always useful.
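One hedged way to track down the offending query (a sketch, not an official Impala procedure): the `_impala_insert_staging` path in the log embeds the fragment instance ID, whose prefix is the query ID; Impala prints query IDs in `hi:lo` hex form, with the `:` replaced by `_` in file names. Once extracted, the ID can be grepped for in the impalad logs to find the SQL text. The staging path below is taken from this report; the log locations would vary per installation.

```shell
# Example log line containing an insert-staging path (from the report above).
staging='/user/hive/warehouse/x.db/t/_impala_insert_staging/e847d4231bb8c531_c166c98d00000000/file.parq'

# Pull out the hi_lo pair after _impala_insert_staging/ and rejoin it with a
# colon, which is how Impala formats query IDs in its logs and web UI
# (assumption: the staging directory is named after the query ID).
qid=$(echo "$staging" | sed -n 's#.*_impala_insert_staging/\([0-9a-f]*\)_\([0-9a-f]*\)/.*#\1:\2#p')
echo "$qid"

# Then search the daemon logs for that ID to find the query text, e.g.:
#   grep -R "$qid" /var/log/impala/impalad.INFO*
```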

But yeah, the main thing that's going to make you happier is upgrading.

> Impala daemon crashing frequently
> ---------------------------------
>
>                 Key: IMPALA-8972
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8972
>             Project: IMPALA
>          Issue Type: Question
>          Components: Infrastructure
>    Affects Versions: Impala 2.8.0
>         Environment: Impala version 2.8.0-cdh5-INTERNAL RELEASE (build )
>            Reporter: Ashok
>            Priority: Major
>
> Hi Team,
>  
> The Impala daemon is crashing frequently and needs to be restarted.
>  
> Please help with troubleshooting this.
>  
> I can see the error messages below in the daemon logs:
>  
> 1.
>  
> Java exception follows:
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/ppd_ro_im_dpart_bkp/_impala_insert_staging/e847d4231bb8c531_c166c98d00000000/.e847d4231bb8c531-c166c98d00000002_664317806_dir/e847d4231bb8c531-c166c98d00000002_844965293_data.0.parq (inode 17854099): File does not exist. Holder DFSClient_NONMAPREDUCE_-924590406_1 does not have any open files.
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3635)
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:3438)
>  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3294)
>  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
>  at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
>  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
>  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
>  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1472)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1409)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
>  at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
>  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:409)
>  at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>  at com.sun.proxy.$Proxy10.addBlock(Unknown Source)
>  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1739)
>  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1535)
>  at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:689)
> Wrote minidump to 
> /var/log/impala/minidumps/impalad/6352d57e-7493-b4db-27e7f36f-518eec8e.dmp
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> # SIGSEGV (0xb) at pc=0x00007f42744b72fc, pid=4881, tid=139899197028096
> #
> # JRE version: Java(TM) SE Runtime Environment (7.0_80-b15) (build 1.7.0_80-b15)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.80-b11 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # C [libkudu_client.so.0+0x27d2fc] void std::_Rb_tree<std::string, std::pair<std::string const, std::string>, std::_Select1st<std::pair<std::string const, std::string> >, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >::_M_insert_unique<std::_Rb_tree_iterator<std::pair<std::string const, std::string> > >(std::_Rb_tree_iterator<std::pair<std::string const, std::string> >, std::_Rb_tree_iterator<std::pair<std::string const, std::string> >)+0x2381c
>  
> 2.
>  
> W0917 22:35:49.505252  1265 BlockReaderFactory.java:778] I/O error constructing remote block reader.
> Java exception follows:
> java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.111.92.61:46531, remote=/10.111.92.61:50010, for file /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/graph_ppt_in_list/job_id=149042/ea46e454a6357b08-2622f21800000002_731828525_data.0., for pool BP-1380753826-10.128.50.16-1462783635263 block 1081077775_7337065
>  at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
>  at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
>  at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
>  at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:889)
>  at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:965)
>  at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:147)
> W0917 22:35:49.506021  1265 DFSInputStream.java:699] Failed to connect to /10.111.92.61:50010 for block, add to deadNodes and continue.
> java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.111.92.61:46531, remote=/10.111.92.61:50010, for file /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/graph_ppt_in_list/job_id=149042/ea46e454a6357b08-2622f21800000002_731828525_data.0., for pool BP-1380753826-10.128.50.16-1462783635263 block 1081077775_7337065
> Aborting Impala
> Java exception follows:
> java.io.IOException: Got error for OP_READ_BLOCK, status=ERROR, self=/10.111.92.61:46531, remote=/10.111.92.61:50010, for file /user/hive/warehouse/steelwedge_psnokiadmt_prod.db/graph_ppt_in_list/job_id=149042/ea46e454a6357b08-2622f21800000002_731828525_data.0., for pool BP-1380753826-10.128.50.16-1462783635263 block 1081077775_7337065
>  at org.apache.hadoop.hdfs.RemoteBlockReader2.checkSuccess(RemoteBlockReader2.java:467)
>  at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:432)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:881)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:759)
>  at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:376)
>  at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
>  at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:889)
>  at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:965)
>  at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:147)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
