[ https://issues.apache.org/jira/browse/IMPALA-12516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783438#comment-17783438 ]
ASF subversion and git services commented on IMPALA-12516:
----------------------------------------------------------
Commit 2af924d4e5f9191a275e2ddeea6ba866329b0b1d in impala's branch
refs/heads/master from Michael Smith
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2af924d4e ]
IMPALA-12516: Set HDFS limit based on memlock
With RHEL 8 on AWS Graviton instances,
dfs.datanode.max.locked.memory=64000 is insufficient to run
query_test/test_hdfs_caching.py::TestHdfsCaching::test_table_is_cached.
Sets dfs.datanode.max.locked.memory based on 'ulimit -l', and sets
memlock to 64MB in bootstrap_system.sh to match modern defaults and
provide space for future HDFS caching tests.
The new setting can be seen in admin output like:
node-1 will use ports DATANODE_PORT=31002, DATANODE_HTTP_PORT=31012,
DATANODE_IPC_PORT=31022, DATANODE_HTTPS_PORT=31032,
DATANODE_CLIENT_PORT=31042, NODEMANAGER_PORT=31102,
NODEMANAGER_LOCALIZER_PORT=31122, NODEMANAGER_WEBUI_PORT=31142,
KUDU_TS_RPC_PORT=31202, and KUDU_TS_WEBUI_PORT=31302;
DATANODE_LOCKED_MEM=65536000
Change-Id: I7722ddd0c7fbd9bbd1979503952b7522b808194a
Reviewed-on: http://gerrit.cloudera.org:8080/20623
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Joe McDonnell <[email protected]>
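A minimal shell sketch of the derivation the commit message describes, for
reference. The DATANODE_LOCKED_MEM name and the 65536000 value come from the
admin output above; the exact conversion the minicluster scripts apply is an
assumption here (64MB of memlock is 65536 KB, and 65536 * 1000 = 65536000,
which suggests a factor of 1000 rather than 1024, leaving a little headroom
under the hard limit).
{code}
# Derive the datanode locked-memory budget from the memlock ulimit.
# `ulimit -l` reports kilobytes; dfs.datanode.max.locked.memory is in bytes.
MEMLOCK_KB=$(ulimit -l)
if [ "$MEMLOCK_KB" = "unlimited" ]; then
  # No OS limit to derive from; fall back to a fixed budget (assumed value).
  DATANODE_LOCKED_MEM=65536000
else
  # Factor of 1000 (not 1024) keeps the budget safely under the hard limit;
  # with the 64MB memlock from bootstrap_system.sh this yields 65536000.
  DATANODE_LOCKED_MEM=$((MEMLOCK_KB * 1000))
fi
echo "DATANODE_LOCKED_MEM=$DATANODE_LOCKED_MEM"
{code}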
> HDFS in minicluster fails to use cache with RHEL 8 on ARM
> ---------------------------------------------------------
>
> Key: IMPALA-12516
> URL: https://issues.apache.org/jira/browse/IMPALA-12516
> Project: IMPALA
> Issue Type: Task
> Reporter: Michael Smith
> Assignee: Michael Smith
> Priority: Major
> Fix For: Impala 4.4.0
>
>
> When running HDFS on ARM as part of Impala's test minicluster (Graviton 2/3
> instances running RHEL 8),
> query_test/test_hdfs_caching.py::TestHdfsCaching::test_table_is_cached fails.
> I've traced this to HDFS returning LocatedFileStatus objects where:
> * in most environments, BlockLocation.getCachedHosts() has at least one entry
> * on m6g and m7g instances with RHEL 8, BlockLocation.getCachedHosts() is
> empty (see the shell cross-check sketched below)
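> A hedged cross-check from the shell, independent of the Java client API (the
> command and its BYTES_NEEDED/BYTES_CACHED columns are standard HDFS cacheadmin
> output; the interpretation below is what the failing test implies):
> {code}
> # List all cache directives together with byte-level stats.
> hdfs cacheadmin -listDirectives -stats
> # On healthy hosts BYTES_CACHED catches up to BYTES_NEEDED once a directive
> # is satisfied; on the affected Graviton/RHEL 8 hosts it stays at 0,
> # matching the empty getCachedHosts() results above.
> {code}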
> The HDFS datanode shows the following warnings, which seem related. 2199
> bytes is the size of the data entry we expect to be able to read from cache.
> {code}
> 2023-10-24 13:16:20,906 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeCommand action: DNA_CACHE for BP-1880771169-127.0.0.1-1698175576078 of [1073741878, 1073745044, 1073745046]
> 2023-10-24 13:16:20,906 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetCache: Failed to cache 1073741878_BP-1880771169-127.0.0.1-1698175576078: could not reserve 2199 more bytes in the cache: dfs.datanode.max.locked.memory of 64000 exceeded.
> 2023-10-24 13:16:20,906 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetCache: Failed to cache 1073745044_BP-1880771169-127.0.0.1-1698175576078: could not reserve 115 more bytes in the cache: dfs.datanode.max.locked.memory of 64000 exceeded.
> 2023-10-24 13:16:20,906 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetCache: Failed to cache 1073745046_BP-1880771169-127.0.0.1-1698175576078: could not reserve 115 more bytes in the cache: dfs.datanode.max.locked.memory of 64000 exceeded.
> {code}
> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/MemoryStorage.html#Limit_RAM_used_for_replicas_in_Memory
> mentions configuring dfs.datanode.max.locked.memory and the memlock ulimit.
> Increasing these settings to 100000 bytes and 100 KB respectively in my
> environment fixes the test. We should update the minicluster config to allow
> increasing dfs.datanode.max.locked.memory in relevant test environments.
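> A minimal sketch of that workaround, assuming the shell's hard memlock limit
> already allows 100 KB (otherwise this needs a limits.conf entry or a systemd
> override instead, and the datanode must be restarted to pick up either change):
> {code}
> # memlock is expressed in kilobytes by ulimit; raise it for this shell.
> ulimit -l 100
> # Set a matching byte limit (100000 bytes fits within 100 KB) in
> # hdfs-site.xml:
> #   <property>
> #     <name>dfs.datanode.max.locked.memory</name>
> #     <value>100000</value>
> #   </property>
> # Then verify what the configuration resolves to:
> hdfs getconf -confKey dfs.datanode.max.locked.memory
> {code}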