tomscut commented on PR #4087:
URL: https://github.com/apache/hadoop/pull/4087#issuecomment-1097474203

   > Hi @tomscut, sorry for the delay in my response.
   > 
   > I am inclined to agree with @sunchao that the approach laid out in 
[HDFS-14378](https://issues.apache.org/jira/browse/HDFS-14378) is a better 
long-term solution.
   > 
   > > Simply preventing every SNN from triggering an edit log roll on the 
active NN might be risky (see 
[HDFS-2737](https://issues.apache.org/jira/browse/HDFS-2737)).
   > 
   > Can you clarify what from 
[HDFS-2737](https://issues.apache.org/jira/browse/HDFS-2737) makes you feel 
that it is risky? I skimmed the discussion and didn't notice anything alarming. 
You may also want to see [this comment on 
HDFS-14378](https://issues.apache.org/jira/browse/HDFS-14378?focusedCommentId=16907765&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16907765)
 where this same point was discussed.
   > 
   > That all being said, I think this PR may be a good step in the interim, 
since [HDFS-14378](https://issues.apache.org/jira/browse/HDFS-14378) is a more 
substantial change. I would appreciate some other opinions, though. cc 
@simbadzina @aajisaka @shvachko
   
   Thank you very much for your comments, @xkrogen. 
   It is mentioned in the description of HDFS-2737:
   ```
   Currently, the edit log tailing process can only read finalized log 
segments. So, if the active NN is not rolling its logs periodically, the SBN 
will lag a lot. This also causes many datanode messages to be queued up in the 
PendingDatanodeMessage structure.
   
   To combat this, the active NN needs to roll its logs periodically – perhaps 
based on a time threshold, or perhaps based on a number of transactions. I'm 
not sure yet whether it's better to have the NN roll on its own or to have the 
SBN ask the active NN to roll its logs.
   ```
   The PendingDatanodeMessage backlog described here is what strikes me as a 
bit risky. However, since `SBN READ` was added, `Journal` supports `read 
inProgress`, so the standby no longer has to wait for a segment to be finalized 
before tailing it. If we enable `read inProgress`, then even if we stop every 
SNN from triggering edit log rolls, the pendingDatanodeMessage problem should 
not be too serious. 
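
   To make the argument concrete, here is a minimal toy model (not Hadoop 
code) of how far a standby can tail with and without `read inProgress`. The 
function name, the segment tuples, and the txid numbers are all made up for 
illustration. In a real cluster I believe this behaviour is controlled by 
`dfs.ha.tail-edits.in-progress`, but please treat that as an assumption to 
verify against your Hadoop version.

```python
def visible_txid(segments, read_in_progress):
    """Return the highest txid a standby could tail in this toy model.

    segments: list of (first_txid, last_txid, finalized) tuples, oldest first.
    read_in_progress: if False, only finalized segments are readable.
    """
    highest = 0
    for first_txid, last_txid, finalized in segments:
        if finalized or read_in_progress:
            highest = max(highest, last_txid)
    return highest


# Hypothetical state: the active NN has written txids 1..1000, but only
# the segment 1..100 was finalized because nobody triggered a roll since.
segments = [(1, 100, True), (101, 1000, False)]
active_txid = 1000

# Lag when the standby can only read finalized segments.
lag_finalized_only = active_txid - visible_txid(segments, read_in_progress=False)
# Lag when the standby can also read the in-progress segment.
lag_in_progress = active_txid - visible_txid(segments, read_in_progress=True)

print(lag_finalized_only, lag_in_progress)
```

   In this model the lag without `read inProgress` grows with every 
transaction written since the last roll, which is the situation HDFS-2737 
describes. With `read inProgress` the lag no longer depends on rolling at all.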
   
   I would also appreciate some other opinions.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

