Michael Smith created IMPALA-12516:
--------------------------------------
Summary: HDFS in minicluster fails to use cache with RHEL 8 on ARM
Key: IMPALA-12516
URL: https://issues.apache.org/jira/browse/IMPALA-12516
Project: IMPALA
Issue Type: Task
Reporter: Michael Smith
When running HDFS on ARM as part of Impala's test minicluster (Graviton 2/3
instances running RHEL 8),
query_test/test_hdfs_caching.py::TestHdfsCaching::test_table_is_cached fails.
I've traced this to HDFS returning LocatedFileStatus objects where:
* in most environments, BlockLocation.getCachedHosts has at least one entry
* on m6g and m7g instances with RHEL 8, BlockLocation.getCachedHosts is empty
The HDFS datanode logs the following warnings, which appear related; 2199 is the
size of the data entry we expect to be able to read from the cache.
{code}
2023-10-24 13:16:20,906 INFO org.apache.hadoop.hdfs.server.datanode.DataNode:
DatanodeCommand action: DNA_CACHE for BP-1880771169-127.0.0.1-1698175576078 of
[1073741878, 1073745044, 1073745046]
2023-10-24 13:16:20,906 WARN
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetCache: Failed to
cache 1073741878_BP-1880771169-127.0.0.1-1698175576078: could not reserve 2199
more bytes in the cache: dfs.datanode.max.locked.memory of 64000 exceeded.
2023-10-24 13:16:20,906 WARN
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetCache: Failed to
cache 1073745044_BP-1880771169-127.0.0.1-1698175576078: could not reserve 115
more bytes in the cache: dfs.datanode.max.locked.memory of 64000 exceeded.
2023-10-24 13:16:20,906 WARN
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetCache: Failed to
cache 1073745046_BP-1880771169-127.0.0.1-1698175576078: could not reserve 115
more bytes in the cache: dfs.datanode.max.locked.memory of 64000 exceeded.
{code}
https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/MemoryStorage.html#Limit_RAM_used_for_replicas_in_Memory
describes configuring dfs.datanode.max.locked.memory and the memlock ulimit.
Increasing these settings to 100000 and 100 respectively in my environment fixes
the test.
We should update the minicluster config to allow increasing
dfs.datanode.max.locked.memory in relevant test environments.
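A sketch of the relevant settings, using the values from the workaround above (the exact file names and the user account are assumptions; the minicluster may generate these configs differently):

{code}
<!-- hdfs-site.xml: raise the datanode cache limit (bytes) -->
<property>
  <name>dfs.datanode.max.locked.memory</name>
  <value>100000</value>
</property>
{code}

with a matching memlock ulimit for whichever user runs the datanode (units in limits.conf are KB; the "impala" user below is hypothetical):

{code}
# /etc/security/limits.conf: allow the datanode user to lock up to 100 KB
impala  -  memlock  100
{code}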
--
This message was sent by Atlassian Jira
(v8.20.10#820010)