tomscut created HDFS-16550:
------------------------------
Summary: [SBN read] Improper cache-size for journal node may cause
cluster crash
Key: HDFS-16550
URL: https://issues.apache.org/jira/browse/HDFS-16550
Project: Hadoop HDFS
Issue Type: Bug
Reporter: tomscut
Assignee: tomscut
Attachments: image-2022-04-21-09-54-29-751.png,
image-2022-04-21-09-54-57-111.png
When we introduced SBN Read, we ran into the following problem while upgrading
the JournalNodes.
Cluster Info:
*Active: nn0*
*Standby: nn1*
1. Rolling restart of the JournalNodes. {color:#FF0000}(related config:
dfs.journalnode.edit-cache-size.bytes=1G, -Xms1G, -Xmx1G){color}
2. The cluster runs for a while.
3. {color:#FF0000}Active namenode(nn0){color} shut down with the error "Timed
out waiting 120000ms for a quorum of nodes to respond".
4. nn1 was transitioned to Active state.
5. {color:#FF0000}New Active namenode(nn1){color} also shut down with the error
"Timed out waiting 120000ms for a quorum of nodes to respond".
6. {color:#FF0000}The cluster crashed{color}.
Related code:
{code:java}
JournaledEditsCache(Configuration conf) {
  capacity = conf.getInt(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_KEY,
      DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_DEFAULT);
  if (capacity > 0.9 * Runtime.getRuntime().maxMemory()) {
    Journal.LOG.warn(String.format("Cache capacity is set at %d bytes but " +
        "maximum JVM memory is only %d bytes. It is recommended that you " +
        "decrease the cache size or increase the heap size.",
        capacity, Runtime.getRuntime().maxMemory()));
  }
  Journal.LOG.info("Enabling the journaled edits cache with a capacity " +
      "of bytes: " + capacity);
  ReadWriteLock lock = new ReentrantReadWriteLock(true);
  readLock = new AutoCloseableLock(lock.readLock());
  writeLock = new AutoCloseableLock(lock.writeLock());
  initialize(INVALID_TXN_ID);
}
{code}
Currently, *dfs.journalnode.edit-cache-size.bytes* can be set to a size larger
than the maximum heap available to the process. If
{*}dfs.journalnode.edit-cache-size.bytes > 0.9 *
Runtime.getRuntime().maxMemory(){*}, only a warning is logged during
JournalNode startup, which users can easily overlook. However, after the
cluster has been running for some time, this is likely to cause the cluster to
crash.
!image-2022-04-21-09-54-57-111.png|width=1227,height=57!
IMO, when {*}dfs.journalnode.edit-cache-size.bytes > threshold *
Runtime.getRuntime().maxMemory(){*}, we should throw an exception and
{color:#FF0000}fail fast{color}, giving users a clear hint to update the
related configuration.
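A minimal sketch of the proposed fail-fast check, written as a standalone class so it can be run outside Hadoop. The class name EditCacheSizeCheck, the method validate, and the threshold value 0.5 are hypothetical and only for illustration; the actual threshold and exception type would be decided in the patch.

```java
/**
 * Hypothetical helper illustrating the proposed fail-fast behaviour.
 * It is not part of Hadoop; names and the threshold are assumptions.
 */
final class EditCacheSizeCheck {

  private EditCacheSizeCheck() {
  }

  /**
   * Rejects a cache capacity that exceeds threshold * maxMemoryBytes.
   *
   * @param capacityBytes  configured edit cache size in bytes
   * @param maxMemoryBytes maximum JVM memory, e.g. Runtime.getRuntime().maxMemory()
   * @param threshold      allowed fraction of the heap, in the range (0, 1]
   * @throws IllegalArgumentException if the capacity is above the limit
   */
  static void validate(long capacityBytes, long maxMemoryBytes,
      double threshold) {
    if (threshold <= 0.0 || threshold > 1.0) {
      throw new IllegalArgumentException(
          "threshold must be in (0, 1], but was " + threshold);
    }
    long limit = (long) (threshold * maxMemoryBytes);
    if (capacityBytes > limit) {
      // Fail fast instead of only logging a warning.
      throw new IllegalArgumentException(String.format(
          "Cache capacity is set at %d bytes but may be at most %d bytes "
              + "(%.2f of the maximum JVM memory of %d bytes). Decrease the "
              + "cache size or increase the heap size.",
          capacityBytes, limit, threshold, maxMemoryBytes));
    }
  }

  public static void main(String[] args) {
    long oneGiB = 1024L * 1024 * 1024;
    double threshold = 0.5; // assumed value, for illustration only

    // A 512 MiB cache on a 1 GiB heap is within the assumed limit.
    validate(oneGiB / 2, oneGiB, threshold);

    // The configuration from this report: 1 GiB cache on a 1 GiB heap.
    try {
      validate(oneGiB, oneGiB, threshold);
    } catch (IllegalArgumentException e) {
      System.out.println("Startup rejected: " + e.getMessage());
    }
  }
}
```

In the JournaledEditsCache constructor, a check like this would take the place of the warn log, so a misconfigured JournalNode would refuse to start rather than fail later under load.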