tomscut commented on a change in pull request #4082:
URL: https://github.com/apache/hadoop/pull/4082#discussion_r834896270



##########
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
##########
@@ -1509,13 +1509,18 @@ synchronized void abortCurrentLogSegment() {
    * effect.
    */
   @Override
-  public synchronized void purgeLogsOlderThan(final long minTxIdToKeep) {
+  public synchronized void purgeLogsOlderThan(long minTxIdToKeep) {
     // Should not purge logs unless they are open for write.
     // This prevents the SBN from purging logs on shared storage, for example.
     if (!isOpenForWrite()) {
       return;
     }
-    
+
+    // Reset purgeLogsFrom to avoid purging edit log which is in progress.
+    if (isSegmentOpen()) {
+      minTxIdToKeep = minTxIdToKeep > curSegmentTxId ? curSegmentTxId : minTxIdToKeep;

Review comment:
   Hi @jojochuang @Hexiaoqiao @ayushtkn, please also take a look. Thank you
   very much.
   
   This problem started with tailing in-progress edits, and
   [HDFS-14317](https://issues.apache.org/jira/browse/HDFS-14317) does a good
   job of avoiding it.
   
   However, if the SNN's edit-log roll operation is accidentally disabled by
   configuration and the ANN's automatic roll period is very long, the edit
   log that is still in progress may also be purged.
   
   Although we added assertions, assertions are generally disabled in
   production (we don't normally add `-ea` to the JVM parameters). This problem
   and the logs also prove that we do not strictly ensure
   `(minTxIdToKeep <= curSegmentTxId)`, which is dangerous for the NameNode.
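
   To illustrate the point about `-ea`, here is a minimal sketch, not code from this PR. The class name `AssertVsCheckSketch` and both method names are hypothetical. It shows that a Java `assert` statement is skipped entirely unless the JVM is started with `-ea`, while an explicit check always runs.

```java
// Hypothetical illustration only; none of these names exist in FSEditLog.
final class AssertVsCheckSketch {

  /** Returns true only when the JVM was started with -ea (assertions enabled). */
  static boolean assertionsEnabled() {
    boolean enabled = false;
    // The assignment inside the assert is evaluated only when assertions are on.
    assert enabled = true;
    return enabled;
  }

  /** Explicit guard: enforced whether or not -ea is set. */
  static void requirePurgeBound(long minTxIdToKeep, long curSegmentTxId) {
    if (minTxIdToKeep > curSegmentTxId) {
      throw new IllegalStateException(
          "minTxIdToKeep " + minTxIdToKeep
              + " is past the open segment start " + curSegmentTxId);
    }
  }

  public static void main(String[] args) {
    System.out.println("assertions enabled: " + assertionsEnabled());
    requirePurgeBound(300L, 300L); // passes: the bound holds
  }
}
```

   In a typical production JVM, `assertionsEnabled()` returns `false`, so an invariant guarded only by `assert` is never actually checked there.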
   
   We should reset `minTxIdToKeep` to strictly ensure that the in-progress edit
   log is never purged, and then wait for the ANN's automatic roll to finalize
   the edit log. After the next checkpoint, the ANN will automatically purge
   the finalized edit log (see the stack mentioned above).
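
   The reset described above amounts to clamping the purge bound to the start of the open segment. Here is a minimal standalone sketch of that step. The class `PurgeBoundSketch`, the method `effectiveMinTxIdToKeep`, and the explicit `segmentOpen` flag are hypothetical stand-ins for illustration, not the actual `FSEditLog` members.

```java
// Hypothetical stand-in for the clamping step; not the actual FSEditLog code.
final class PurgeBoundSketch {

  /**
   * Returns the transaction id below which logs may be purged.
   *
   * @param minTxIdToKeep  bound requested by the caller
   * @param segmentOpen    whether an edit log segment is currently in progress
   * @param curSegmentTxId first transaction id of the in-progress segment
   */
  static long effectiveMinTxIdToKeep(
      long minTxIdToKeep, boolean segmentOpen, long curSegmentTxId) {
    if (!segmentOpen) {
      // No in-progress segment to protect, so keep the requested bound.
      return minTxIdToKeep;
    }
    // Never let the bound move past the start of the in-progress segment.
    return Math.min(minTxIdToKeep, curSegmentTxId);
  }
}
```

   `Math.min` gives the same result as the ternary in the diff. With the bound clamped this way, a request of 500 while the open segment starts at 300 purges only logs older than 300.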




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


