[
https://issues.apache.org/jira/browse/HDFS-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shilun Fan updated HDFS-16550:
------------------------------
Component/s: journal-node
Hadoop Flags: Reviewed
Target Version/s: 3.4.0
Affects Version/s: 3.4.0
> [SBN read] Improper cache-size for journal node may cause cluster crash
> -----------------------------------------------------------------------
>
> Key: HDFS-16550
> URL: https://issues.apache.org/jira/browse/HDFS-16550
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: journal-node
> Affects Versions: 3.4.0
> Reporter: Tao Li
> Assignee: Tao Li
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: image-2022-04-21-09-54-29-751.png,
> image-2022-04-21-09-54-57-111.png, image-2022-04-21-12-32-56-170.png
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> When we introduced {*}SBN Read{*}, we encountered a situation during upgrade
> the JournalNodes.
> Cluster Info:
> *Active: nn0*
> *Standby: nn1*
> 1. Rolling restart journal node. {color:#ff0000}(related config:
> fs.journalnode.edit-cache-size.bytes=1G, -Xms1G, -Xmx=1G){color}
> 2. The cluster runs for a while, edits cache usage is increasing and memory
> is used up.
> 3. {color:#ff0000}Active namenode(nn0){color} shutdown because of “{_}Timed
> out waiting 120000ms for a quorum of nodes to respond”{_}.
> 4. Transfer nn1 to Active state.
> 5. {color:#ff0000}New Active namenode(nn1){color} also shutdown because of
> “{_}Timed out waiting 120000ms for a quorum of nodes to respond” too{_}.
> 6. {color:#ff0000}The cluster crashed{color}.
>
> Related code:
> {code:java}
> JournaledEditsCache(Configuration conf) {
> capacity = conf.getInt(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_KEY,
> DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_DEFAULT);
> if (capacity > 0.9 * Runtime.getRuntime().maxMemory()) {
> Journal.LOG.warn(String.format("Cache capacity is set at %d bytes but " +
> "maximum JVM memory is only %d bytes. It is recommended that you " +
> "decrease the cache size or increase the heap size.",
> capacity, Runtime.getRuntime().maxMemory()));
> }
> Journal.LOG.info("Enabling the journaled edits cache with a capacity " +
> "of bytes: " + capacity);
> ReadWriteLock lock = new ReentrantReadWriteLock(true);
> readLock = new AutoCloseableLock(lock.readLock());
> writeLock = new AutoCloseableLock(lock.writeLock());
> initialize(INVALID_TXN_ID);
> } {code}
> Currently, *fs.journalNode.edit-cache-size-bytes* can be set to a larger size
> than the memory requested by the process. If
> {*}fs.journalNode.edit-cache-sie.bytes > 0.9 *
> Runtime.getruntime().maxMemory(){*}, only warn logs are printed during
> journalnode startup. This can easily be overlooked by users. However, as the
> cluster runs to a certain period of time, it is likely to cause the cluster
> to crash.
>
> NN log:
> !image-2022-04-21-09-54-57-111.png|width=1012,height=47!
> !image-2022-04-21-12-32-56-170.png|width=809,height=218!
> IMO, we should not set the {{cache size}} to a fixed value, but to the ratio
> of maximum memory, which is 0.2 by default.
> This avoids the problem of too large cache size. In addition, users can
> actively adjust the heap size when they need to increase the cache size.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]