Adriano created IMPALA-9055:
-------------------------------

             Summary: HDFS Caching with Impala: Expiration 26687997791:19:48:13.951 exceeds the max relative expiration time of <maxTtl>
                 Key: IMPALA-9055
                 URL: https://issues.apache.org/jira/browse/IMPALA-9055
             Project: IMPALA
          Issue Type: Bug
          Components: Catalog
            Reporter: Adriano


HDFS Caching with Impala:
If we create a cache pool and specify maxTtl with the hdfs command, e.g.:
{{sudo -u hdfs hdfs cacheadmin -addPool case422446 -owner impala -group hdfs -mode 755 -limit 100000000000 -maxTtl 7d}}

and then alter a table in Impala to cache one of its partitions, e.g.:
{{alter table foo partition (p1=1) set cached in 'foo'}}
the statement fails with the exception:
ERROR: ImpalaRuntimeException: Expiration 26687997791:19:48:13.951 exceeds the max relative expiration time of 604800000 ms.
        at org.apache.hadoop.hdfs.server.namenode.CacheManager.validateExpiryTime(CacheManager.java:378)
        at org.apache.hadoop.hdfs.server.namenode.CacheManager.addDirective(CacheManager.java:528)
        at org.apache.hadoop.hdfs.server.namenode.FSNDNCacheOp.addCacheDirective(FSNDNCacheOp.java:45)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheDirective(FSNamesystem.java:6782)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addCacheDirective(NameNodeRpcServer.java:1883)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addCacheDirective(ClientNamenodeProtocolServerSideTranslatorPB.java:1265)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

CAUSED BY: InvalidRequestException: Expiration 26687997791:19:48:13.951 exceeds the max relative expiration time of 604800000 ms.
        at org.apache.hadoop.hdfs.server.namenode.CacheManager.validateExpiryTime(CacheManager.java:378)
        at org.apache.hadoop.hdfs.server.namenode.CacheManager.addDirective(CacheManager.java:528)
        at org.apache.hadoop.hdfs.server.namenode.FSNDNCacheOp.addCacheDirective(FSNDNCacheOp.java:45)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheDirective(FSNamesystem.java:6782)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addCacheDirective(NameNodeRpcServer.java:1883)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addCacheDirective(ClientNamenodeProtocolServerSideTranslatorPB.java:1265)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

CAUSED BY: RemoteException: Expiration 26687997791:19:48:13.951 exceeds the max relative expiration time of 604800000 ms.
        at org.apache.hadoop.hdfs.server.namenode.CacheManager.validateExpiryTime(CacheManager.java:378)
        at org.apache.hadoop.hdfs.server.namenode.CacheManager.addDirective(CacheManager.java:528)
        at org.apache.hadoop.hdfs.server.namenode.FSNDNCacheOp.addCacheDirective(FSNDNCacheOp.java:45)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheDirective(FSNamesystem.java:6782)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addCacheDirective(NameNodeRpcServer.java:1883)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addCacheDirective(ClientNamenodeProtocolServerSideTranslatorPB.java:1265)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:523)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1685)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

This is always reproducible.

The workaround is to omit maxTtl from the hdfs command when creating the pool.
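
The expiration in the error message is far larger than any TTL a user would set, so it may help to convert it to milliseconds and compare it with the pool limit. The following is only a rough sketch: it assumes the value is printed as days:hours:minutes:seconds.milliseconds, and the helper name duration_to_ms is hypothetical (it is not an HDFS or Impala API).

```python
# Decode the relative expiration printed in the error message and compare it
# with the pool's maxTtl. duration_to_ms is a hypothetical helper, not part of
# HDFS or Impala; the input format is an assumption based on the error text.

MS_PER_SECOND = 1000
MS_PER_MINUTE = 60 * MS_PER_SECOND
MS_PER_HOUR = 60 * MS_PER_MINUTE
MS_PER_DAY = 24 * MS_PER_HOUR


def duration_to_ms(text):
    """Convert 'days:hours:minutes:seconds.millis' to milliseconds."""
    days, hours, minutes, rest = text.split(":")
    seconds, millis = rest.split(".")
    return (int(days) * MS_PER_DAY
            + int(hours) * MS_PER_HOUR
            + int(minutes) * MS_PER_MINUTE
            + int(seconds) * MS_PER_SECOND
            + int(millis))


requested_ms = duration_to_ms("26687997791:19:48:13.951")
max_ttl_ms = 7 * MS_PER_DAY   # the pool was created with -maxTtl 7d
java_long_max = 2**63 - 1     # largest value of a signed 64-bit Java long

print(requested_ms)                        # 2305843009213693951
print(max_ttl_ms)                          # 604800000
print(requested_ms == java_long_max // 4)  # True
print(requested_ms > max_ttl_ms)           # True
```

The 7d limit matches the 604800000 ms in the error, and the requested value equals one quarter of the largest signed 64-bit integer (rounded down). That suggests, without confirming it, that the directive is being requested with a "never expires" style sentinel rather than a finite TTL, which would also be consistent with the workaround above.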

Here are the repro steps:

{code:java}
impala-shell -i <ImpalaD hostname> -q "drop table if exists foo;"
impala-shell -i <ImpalaD hostname> -q "create table if not exists foo (c1 int, c2 string) partitioned by (p1 int) stored as parquet location '/user/hive/warehouse/foo';"
impala-shell -i <ImpalaD hostname> -q "insert into foo partition (p1 = 1) values (1, 'one');"
sudo -u hdfs hdfs cacheadmin -removePool foo
sudo -u hdfs hdfs cacheadmin -listDirectives -stats
sudo -u hdfs hdfs cacheadmin -listPools -stats
sudo -u hdfs hdfs cacheadmin -addPool foo -owner impala -group hdfs -mode 755 -limit 100000000000 -maxTtl 7d
sudo -u hdfs hdfs cacheadmin -addDirective -path /user/hive/warehouse/foo -pool foo -ttl 7d
sudo -u hdfs hdfs cacheadmin -listDirectives -stats
sudo -u hdfs hdfs cacheadmin -listPools -stats
impala-shell -i <ImpalaD hostname> -q "alter table foo set uncached;"
sleep 5
impala-shell -i <ImpalaD hostname> -q "alter table foo partition (p1=1) set cached in 'foo';"
{code}

I did not find an existing open Jira for this issue, which appears to be reproducible in many CDH versions. I would appreciate it if you could take a look.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
