Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/10792 )

Change subject: IMPALA-3040: Remove cache directive before dropping a table
......................................................................


Patch Set 2:

The exception thrown is:

E0614 17:03:05.768528 17538 HdfsTable.java:909] Encountered an error loading block metadata for table: cachedb.cached_tbl_part
Java exception follows:
java.util.concurrent.ExecutionException: java.io.FileNotFoundException: File does not exist: /test-warehouse/cachedb.db/cached_tbl_part/j=2/b14eab6ad3ac682a-1338d1ba00000000_385360643_data.0.
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2157)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2127)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2040)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:583)
  at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:94)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1080)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)
  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
  at java.util.concurrent.FutureTask.get(FutureTask.java:188)
  at org.apache.impala.catalog.HdfsTable.loadMetadataAndDiskIds(HdfsTable.java:904)
  at org.apache.impala.catalog.HdfsTable.updatePartitionsFromHms(HdfsTable.java:1403)
  at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1253)
  at org.apache.impala.catalog.HdfsTable.load(HdfsTable.java:1199)
  at org.apache.impala.catalog.CatalogServiceCatalog.reloadTable(CatalogServiceCatalog.java:1460)
  at org.apache.impala.catalog.TableLoadingMgr.execAsyncRefreshWork(TableLoadingMgr.java:320)
  at org.apache.impala.catalog.TableLoadingMgr.access$500(TableLoadingMgr.java:48)
  at org.apache.impala.catalog.TableLoadingMgr$1.call(TableLoadingMgr.java:175)
  at org.apache.impala.catalog.TableLoadingMgr$1.call(TableLoadingMgr.java:171)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: File does not exist: /test-warehouse/cachedb.db/cached_tbl_part/j=2/b14eab6ad3ac682a-1338d1ba00000000_385360643_data.0.
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2157)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2127)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2040)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:583)
  at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:94)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1080)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
  at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
  at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1326)
  at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1311)
  at org.apache.hadoop.hdfs.DFSClient.getBlockLocations(DFSClient.java:1369)
  at org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:250)
  at org.apache.hadoop.hdfs.DistributedFileSystem$2.doCall(DistributedFileSystem.java:246)
  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:246)
  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileBlockLocations(DistributedFileSystem.java:237)
  at org.apache.impala.catalog.HdfsTable.refreshFileMetadata(HdfsTable.java:503)
  at org.apache.impala.catalog.HdfsTable.access$000(HdfsTable.java:116)
  at org.apache.impala.catalog.HdfsTable$FileMetadataLoadRequest.call(HdfsTable.java:335)
  at org.apache.impala.catalog.HdfsTable$FileMetadataLoadRequest.call(HdfsTable.java:317)
  ... 4 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /test-warehouse/cachedb.db/cached_tbl_part/j=2/b14eab6ad3ac682a-1338d1ba00000000_385360643_data.0.
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
  at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2157)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2127)
  at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2040)
  at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:583)
  at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getBlockLocations(AuthorizationProviderProxyClientProtocol.java:94)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:377)
  at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1080)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
  at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)
  at org.apache.hadoop.ipc.Client.call(Client.java:1510)
  at org.apache.hadoop.ipc.Client.call(Client.java:1447)
  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:231)
  at com.sun.proxy.$Proxy11.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:268)
  at sun.reflect.GeneratedMethodAccessor25.invoke(Unknown Source)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:606)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:258)
  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
  at com.sun.proxy.$Proxy12.getBlockLocations(Unknown Source)
  at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1324)
  ... 15 more


I went through the code again and have a new theory, though I'm not confident about it:
I think the table is dropped concurrently with https://github.com/apache/impala/blob/e6abf8e86058349531caabe0a800432b1703e8f1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L1361, and listPartitionNames() returns an empty list. The partition in the table is therefore dropped at L1400 and never loaded back. loadMetadataAndDiskIds(), on the other hand, operates on the list generated at L1389, so it still runs even though msPartitionNames is empty, and it throws the exception. A minimal sketch of the ordering I have in mind is below.
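To make that ordering concrete, here is a self-contained illustration using only the JDK (this is not Impala code; the directory and file names are made up). The point is that the per-file work list is snapshotted before the concurrent drop is observed, so the metadata load still runs against files that no longer exist:

// Hypothetical illustration of the suspected race, not Impala code.
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

public class StaleListingRace {
    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("cached_tbl_part");
        Path file = Files.createFile(dir.resolve("data.0"));

        // "Loader": snapshot the work list first (analogous to the list built
        // at HdfsTable.java:1389), to be processed later.
        List<Path> workList = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            stream.forEach(workList::add);
        }

        // "Dropper": the table is dropped concurrently, removing the data file.
        Files.delete(file);

        // The loader still operates on the stale snapshot and fails, just as
        // refreshFileMetadata() hits FileNotFoundException on the dropped file.
        for (Path p : workList) {
            try {
                Files.readAttributes(p, BasicFileAttributes.class);
            } catch (NoSuchFileException e) {
                System.out.println("File does not exist: " + p);
            }
        }
        Files.delete(dir);
    }
}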
Now the question is whether listPartitionNames() returns an empty list if the table doesn't exist. The first thing to notice is that listPartitionNames() may throw NoSuchObjectException, so intuitively one would expect that to happen when the table is missing, but that is not the case. The relevant code is at https://github.com/apache/hive/blob/966b83e3b9123bb455572d47878601d60b86999e/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L4717. The NoSuchObjectException is only thrown by fireReadTablePreEvent(), which is a hook mechanism and is likely a no-op in most setups. The backend implementation is at https://github.com/apache/hive/blob/966b83e3b9123bb455572d47878601d60b86999e/standalone-metastore/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L3247. It merely executes a select query and never checks whether the table exists. So yes, it will return an empty list.
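If that reading is right, an empty result from listPartitionNames() is ambiguous: it looks the same for a partitioned table with no partitions and for a table that was just dropped. A rough sketch of that ambiguity using the plain Hive metastore client (placeholder db/table names, metastore config assumed via HiveConf; this is only an illustration, not what the patch does):

// Hedged sketch, not Impala code: shows that only an explicit existence check
// distinguishes "no partitions" from "table was dropped concurrently".
import java.util.List;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;

public class ListPartitionNamesAmbiguity {
    public static void main(String[] args) throws Exception {
        IMetaStoreClient client = new HiveMetaStoreClient(new HiveConf());
        // Returns [] both when the table simply has no partitions and when the
        // table no longer exists -- no NoSuchObjectException in the latter case.
        List<String> names = client.listPartitionNames("cachedb", "cached_tbl_part", (short) -1);
        if (names.isEmpty() && !client.tableExists("cachedb", "cached_tbl_part")) {
            System.out.println("Table was dropped concurrently; skip the file-metadata load.");
        }
        client.close();
    }
}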


--
To view, visit http://gerrit.cloudera.org:8080/10792
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id7701a499405e961456adea63f3592b43bd69170
Gerrit-Change-Number: 10792
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Comment-Date: Wed, 27 Jun 2018 02:02:02 +0000
Gerrit-HasComments: No