[ https://issues.apache.org/jira/browse/HDFS-16550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17640224#comment-17640224 ]

ASF GitHub Bot commented on HDFS-16550:
---------------------------------------

xkrogen commented on code in PR #4209:
URL: https://github.com/apache/hadoop/pull/4209#discussion_r1033958544


##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournaledEditsCache.java:
##########
@@ -277,11 +279,10 @@ void storeEdits(byte[] inputData, long newStartTxn, long newEndTxn,
         initialize(INVALID_TXN_ID);
         Journal.LOG.warn(String.format("A single batch of edits was too " +
                 "large to fit into the cache: startTxn = %d, endTxn = %d, " +
-                "input length = %d. The capacity of the cache (%s) must be " +
+                "input length = %d. The capacity of the cache must be " +

Review Comment:
   Can we keep the key here in the error message, but print both? Like `(%s or %s)`, where one is `DFS_JOURNALNODE_EDIT_CACHE_SIZE_KEY` and the other is `DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_KEY`.
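   A sketch of the message shape this suggestion describes, with both config keys interpolated. The key-name strings are the real config names, but the class, method, and trailing wording are illustrative only, not the actual patch:

   ```java
   // Illustrative sketch: the "too large" warning printing both config keys,
   // as suggested in the review. Class and method names are hypothetical.
   public class CacheWarnMessage {
       static final String SIZE_KEY = "dfs.journalnode.edit-cache-size.bytes";
       static final String FRACTION_KEY = "dfs.journalnode.edit-cache-size.fraction";

       static String tooLargeBatchMessage(long startTxn, long endTxn, int inputLength) {
           return String.format("A single batch of edits was too large to fit into "
               + "the cache: startTxn = %d, endTxn = %d, input length = %d. "
               + "The capacity of the cache (%s or %s) must be increased.",
               startTxn, endTxn, inputLength, SIZE_KEY, FRACTION_KEY);
       }

       public static void main(String[] args) {
           System.out.println(tooLargeBatchMessage(1L, 10L, 4096));
       }
   }
   ```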



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournaledEditsCache.java:
##########
@@ -221,6 +224,24 @@ public void testCacheMalformedInput() throws Exception {
     cache.retrieveEdits(-1, 10, new ArrayList<>());
   }
 
+  @Test
+  public void testCacheFraction() {
+    // Set dfs.journalnode.edit-cache-size.bytes.
+    Configuration config = new Configuration();
+    config.setInt(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_KEY, 1);
+    config.setFloat(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_KEY, 0.1f);
+    cache = new JournaledEditsCache(config);
+    assertEquals(1, cache.getCapacity(), 0.0);

Review Comment:
   why do we do this as a floating-point comparison? can't we just do `assertEquals(1, cache.getCapacity())`?



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournaledEditsCache.java:
##########
@@ -121,12 +121,14 @@ class JournaledEditsCache {
   // ** End lock-protected fields **
 
   JournaledEditsCache(Configuration conf) {
+    float fraction = conf.getFloat(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_KEY,
+        DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_DEFAULT);

Review Comment:
   maybe enforce a check here to guarantee that `fraction < 1.0`? to fail fast in case of misconfigurations
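   A minimal sketch of such a fail-fast check, assuming the fraction must lie in (0, 1); the class and method names here are hypothetical, not part of the patch:

   ```java
   // Illustrative fail-fast validation of the fraction config, as suggested
   // in the review. Rejects values outside (0, 1) at construction time.
   public class FractionCheck {
       static float validateFraction(float fraction) {
           if (fraction <= 0.0f || fraction >= 1.0f) {
               throw new IllegalArgumentException(
                   "dfs.journalnode.edit-cache-size.fraction must be in (0, 1), but was "
                   + fraction);
           }
           return fraction;
       }

       public static void main(String[] args) {
           // A valid fraction passes through unchanged.
           System.out.println(validateFraction(0.5f));
       }
   }
   ```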



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournaledEditsCache.java:
##########
@@ -221,6 +224,24 @@ public void testCacheMalformedInput() throws Exception {
     cache.retrieveEdits(-1, 10, new ArrayList<>());
   }
 
+  @Test
+  public void testCacheFraction() {

Review Comment:
   how about `testCacheSizeConfigs`? since we are testing both configs and how they interact



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournaledEditsCache.java:
##########
@@ -221,6 +224,24 @@ public void testCacheMalformedInput() throws Exception {
     cache.retrieveEdits(-1, 10, new ArrayList<>());
   }
 
+  @Test
+  public void testCacheFraction() {
+    // Set dfs.journalnode.edit-cache-size.bytes.
+    Configuration config = new Configuration();
+    config.setInt(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_KEY, 1);
+    config.setFloat(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_KEY, 0.1f);
+    cache = new JournaledEditsCache(config);
+    assertEquals(1, cache.getCapacity(), 0.0);
+
+    // Don't set dfs.journalnode.edit-cache-size.bytes.
+    Configuration config1 = new Configuration();
+    config1.setFloat(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_KEY, 0.1f);
+    cache = new JournaledEditsCache(config1);
+    assertEquals(
+        memory * config1.getFloat(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_KEY, 0.0f),
+        cache.getCapacity(), 0.0);

Review Comment:
   how about just this?
   ```suggestion
       assertEquals(memory / 10, cache.getCapacity());
   ```



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml:
##########
@@ -4955,6 +4955,19 @@
   </description>
 </property>
 
+<property>
+  <name>dfs.journalnode.edit-cache-size.fraction</name>
+  <value>0.5f</value>
+  <description>
+    This ratio refers to the proportion of the maximum memory of the JVM.
+    Used to calculate the size of the edits cache that is kept in the JournalNode's memory.

Review Comment:
   we should explicitly mention that this is an alternative to the `.bytes` 
config and either copy some of the guidance there (about when it will be 
enabled, txn size, etc.) or reference it like "see the documentation for 
`dfs.journalnode.edit-cache-size.bytes` to learn more"



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournaledEditsCache.java:
##########
@@ -221,6 +224,24 @@ public void testCacheMalformedInput() throws Exception {
     cache.retrieveEdits(-1, 10, new ArrayList<>());
   }
 
+  @Test
+  public void testCacheFraction() {
+    // Set dfs.journalnode.edit-cache-size.bytes.
+    Configuration config = new Configuration();
+    config.setInt(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_KEY, 1);
+    config.setFloat(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_KEY, 0.1f);
+    cache = new JournaledEditsCache(config);
+    assertEquals(1, cache.getCapacity(), 0.0);
+
+    // Don't set dfs.journalnode.edit-cache-size.bytes.
+    Configuration config1 = new Configuration();
+    config1.setFloat(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_KEY, 0.1f);
+    cache = new JournaledEditsCache(config1);
+    assertEquals(
+        memory * config1.getFloat(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_FRACTION_KEY, 0.0f),
+        cache.getCapacity(), 0.0);
+  }

Review Comment:
   can we also confirm the default behavior (neither config set) ?



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml:
##########
@@ -4955,6 +4955,19 @@
   </description>
 </property>
 
+<property>
+  <name>dfs.journalnode.edit-cache-size.fraction</name>
+  <value>0.5f</value>
+  <description>
+    This ratio refers to the proportion of the maximum memory of the JVM.
+    Used to calculate the size of the edits cache that is kept in the JournalNode's memory.
+    The recommended value is less than 0.9. We recommend using
+    dfs.journalnode.edit-cache-size.fraction instead of
+    dfs.journalnode.edit-cache-size.bytes. If we set
+    dfs.journalnode.edit-cache-size.bytes, this parameter will not take effect.

Review Comment:
   I don't think we specifically recommend one or the other? They are just 
different ways to configure it.



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml:
##########
@@ -4955,6 +4955,19 @@
   </description>

Review Comment:
   for `dfs.journalnode.edit-cache-size.bytes`, we also need to update the 
documentation:
   
   1. remove the default value, since there is none. mention that the size 
defaults to the value of the `.fraction` config
   2. mention that it takes precedence over the `.fraction` config, so if both 
are set, then `.bytes` is used



##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournaledEditsCache.java:
##########
@@ -56,8 +56,11 @@ public class TestJournaledEditsCache {
       PathUtils.getTestDir(TestJournaledEditsCache.class, false);
   private JournaledEditsCache cache;
 
+  private long memory;
+

Review Comment:
   this field seems kind of unnecessary 

> [SBN read] Improper cache-size for journal node may cause cluster crash
> -----------------------------------------------------------------------
>
>                 Key: HDFS-16550
>                 URL: https://issues.apache.org/jira/browse/HDFS-16550
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Tao Li
>            Assignee: Tao Li
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2022-04-21-09-54-29-751.png, 
> image-2022-04-21-09-54-57-111.png, image-2022-04-21-12-32-56-170.png
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> When we introduced {*}SBN Read{*}, we encountered a problem while upgrading 
> the JournalNodes.
> Cluster Info: 
> *Active: nn0*
> *Standby: nn1*
> 1. Rolling restart of the JournalNodes. {color:#ff0000}(related config: 
> dfs.journalnode.edit-cache-size.bytes=1G, -Xms1G, -Xmx1G){color}
> 2. The cluster runs for a while; edits cache usage keeps increasing until 
> memory is used up.
> 3. The {color:#ff0000}Active namenode (nn0){color} shut down because of 
> “{_}Timed out waiting 120000ms for a quorum of nodes to respond{_}”.
> 4. Transfer nn1 to Active state.
> 5. The {color:#ff0000}new Active namenode (nn1){color} also shut down because of 
> “{_}Timed out waiting 120000ms for a quorum of nodes to respond{_}”.
> 6. {color:#ff0000}The cluster crashed{color}.
>  
> Related code:
> {code:java}
> JournaledEditsCache(Configuration conf) {
>   capacity = conf.getInt(DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_KEY,
>       DFSConfigKeys.DFS_JOURNALNODE_EDIT_CACHE_SIZE_DEFAULT);
>   if (capacity > 0.9 * Runtime.getRuntime().maxMemory()) {
>     Journal.LOG.warn(String.format("Cache capacity is set at %d bytes but " +
>         "maximum JVM memory is only %d bytes. It is recommended that you " +
>         "decrease the cache size or increase the heap size.",
>         capacity, Runtime.getRuntime().maxMemory()));
>   }
>   Journal.LOG.info("Enabling the journaled edits cache with a capacity " +
>       "of bytes: " + capacity);
>   ReadWriteLock lock = new ReentrantReadWriteLock(true);
>   readLock = new AutoCloseableLock(lock.readLock());
>   writeLock = new AutoCloseableLock(lock.writeLock());
>   initialize(INVALID_TXN_ID);
> } {code}
> Currently, *dfs.journalnode.edit-cache-size.bytes* can be set to a size larger 
> than the memory available to the process. If 
> {*}dfs.journalnode.edit-cache-size.bytes > 0.9 * 
> Runtime.getRuntime().maxMemory(){*}, only a warn log is printed during 
> JournalNode startup. This is easily overlooked by users, but once the 
> cluster has been running for some time, it is likely to cause the cluster 
> to crash.
>  
> NN log:
> !image-2022-04-21-09-54-57-111.png|width=1012,height=47!
> !image-2022-04-21-12-32-56-170.png|width=809,height=218!
> IMO, we should not set the {{cache size}} to a fixed value, but rather to a 
> ratio of the maximum JVM memory, 0.2 by default.
> This avoids an oversized cache. In addition, users can simply increase the 
> heap size when they need a larger cache.
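
An illustrative sketch of the sizing proposed above: an explicit `.bytes` value takes precedence, and otherwise the capacity is a fraction (0.2 by default, per the description) of the JVM max heap. The class and method names are hypothetical, not the actual patch:

```java
// Sketch of ratio-based cache sizing with an explicit-bytes override,
// following the proposal in the issue description. Names are illustrative.
public class CacheSizing {
    static final float DEFAULT_FRACTION = 0.2f;

    static long computeCapacity(long configuredBytes, float fraction, long maxMemory) {
        if (configuredBytes > 0) {
            return configuredBytes; // explicit .bytes config wins when both are set
        }
        return (long) (fraction * maxMemory); // otherwise scale with the heap
    }

    public static void main(String[] args) {
        long maxMem = Runtime.getRuntime().maxMemory();
        // With no explicit byte size, capacity tracks the heap and can never
        // exceed it, which avoids the oversized-cache failure described above.
        System.out.println(computeCapacity(0L, DEFAULT_FRACTION, maxMem));
    }
}
```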



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
