tomscut commented on PR #6236: URL: https://github.com/apache/hadoop/pull/6236#issuecomment-1790025896
> @shuaiqig Thanks for your report. But I am confused why uploading the fsimage from the standby could hold the write lock on the active NameNode side. Did you print any stack trace? Thanks again. BTW, update description which copy from JIRA.

Thanks @shuaiqig for your report, and thanks @Hexiaoqiao for your comments.

There are some real problems here. When the SNN does a `checkpoint`, it uploads an `fsimage` that may be tens of gigabytes, which makes the disk holding the ANN metadata very busy. When `rollEditLog()` is called, the ANN writes `seen_txid` in both `dfs.namenode.name.dir` and `dfs.namenode.edits.dir` (regardless of whether they are isolated or not) while holding the write lock. If ioutil is high, writing the small `seen_txid` file takes a long time, which indirectly causes the ANN to hold the write lock for a long time.

We added a separate lock to achieve mutual exclusion.
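
To illustrate the idea of a separate lock, here is a minimal Java sketch. It is an assumption made for illustration only, not the code in this PR: `SeenTxidSketch`, `fsLock`, `seenTxidLock`, and `persistedTxId` are hypothetical names and do not correspond to Hadoop classes or fields. The sketch keeps only in-memory work under the global write lock and moves the small `seen_txid` disk write under its own lock, so slow disk I/O no longer extends the time the global write lock is held.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch; this is NOT Hadoop's FSNamesystem or FSEditLog. */
public class SeenTxidSketch {
    // Stands in for the global namesystem lock (assumption).
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    // Separate lock that serializes only the seen_txid disk writes.
    private final ReentrantLock seenTxidLock = new ReentrantLock();
    // Stand-ins for dfs.namenode.name.dir and dfs.namenode.edits.dir.
    private final List<Path> storageDirs;
    private long lastTxId;       // guarded by fsLock
    private long persistedTxId;  // guarded by seenTxidLock

    public SeenTxidSketch(List<Path> storageDirs) {
        this.storageDirs = new ArrayList<>(storageDirs);
    }

    /** Rolls the (imaginary) edit log and returns the new transaction id. */
    public long rollEditLog() throws IOException {
        long txid;
        fsLock.writeLock().lock();
        try {
            // Only cheap in-memory work happens under the global write lock.
            txid = ++lastTxId;
        } finally {
            fsLock.writeLock().unlock();
        }
        // The potentially slow disk write happens after the lock is released.
        writeSeenTxid(txid);
        return txid;
    }

    private void writeSeenTxid(long txid) throws IOException {
        seenTxidLock.lock();
        try {
            // Skip stale values so a delayed writer cannot move seen_txid backwards.
            if (txid <= persistedTxId) {
                return;
            }
            byte[] data = (txid + "\n").getBytes(StandardCharsets.UTF_8);
            for (Path dir : storageDirs) {
                Files.createDirectories(dir);
                Files.write(dir.resolve("seen_txid"), data);
            }
            persistedTxId = txid;
        } finally {
            seenTxidLock.unlock();
        }
    }

    /** Reads back the value stored in dir/seen_txid. */
    public static long readSeenTxid(Path dir) throws IOException {
        byte[] raw = Files.readAllBytes(dir.resolve("seen_txid"));
        return Long.parseLong(new String(raw, StandardCharsets.UTF_8).trim());
    }
}
```

In this sketch, readers and writers of the namespace contend only for `fsLock`, while concurrent `seen_txid` writers are mutually excluded by `seenTxidLock`. Whether the real change can release the write lock before the file write depends on the ordering guarantees `rollEditLog()` must keep, which this sketch does not model.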
