[ https://issues.apache.org/jira/browse/HDFS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14960114#comment-14960114 ]
Yi Liu edited comment on HDFS-7964 at 10/16/15 6:40 AM:
--------------------------------------------------------

Thanks [~daryn] for the work. Further comments:

*1.* In FSEditLogAsync#run
{code}
  @Override
  public void run() {
    try {
      while (true) {
        ....
        if (doSync) {
          ...
          logSync(getLastWrittenTxId());
          ...
{code}
I think it's better to pass the txid of the current edit to {{logSync}}; there is no need to wait for all written txids to be synced. That would be more efficient, and the client would get a faster response.

*2.*
{code}
-log4j.rootLogger=OFF, CONSOLE
+log4j.rootLogger=DEBUG, CONSOLE
{code}
Any reason to change it?

*3.*
{code}
call.abortResponse(syncEx);
{code}
It seems this code is not available?


was (Author: hitliuyi):
Thanks [~daryn] for the work. Further comments:

*1.* In FSEditLogAsync#run
{code}
  @Override
  public void run() {
    try {
      while (true) {
        ....
        if (doSync) {
          ...
          logSync(getLastWrittenTxId());
          ...
{code}
I think it's better to pass the txid of the current edit to {{logSync}}; there is no need to wait for all written txids to be synced. That would be more efficient, and the client would get a faster response.

*2.*
{code}
+        editsBatchedInSync = txid - synctxid - 1;
{code}
Isn't it "txid - synctxid"? The txid is the max txid written, and synctxid is the max txid already synced. Suppose txid = 20 and synctxid = 10; then editsBatchedInSync should be (txid - synctxid) = (20 - 10) = 10.
You can also see this in the existing log message:
{code}
final String msg =
    "Could not sync enough journals to persistent storage " +
    "due to " + e.getMessage() + ". " +
    "Unsynced transactions: " + (txid - synctxid);
{code}

*3.*
{code}
-log4j.rootLogger=OFF, CONSOLE
+log4j.rootLogger=DEBUG, CONSOLE
{code}
Any reason to change it?

*4.*
{code}
call.abortResponse(syncEx);
{code}
It seems this code is not available?

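To make the arithmetic in the comment concrete, here is a minimal sketch of the two counters it discusses. {{ToyJournal}} and its methods are hypothetical stand-ins written for illustration; they are not Hadoop's FSEditLog or FSEditLogAsync. The sketch assumes {{txid}} is the highest txid written and {{synctxid}} the highest txid already synced, as the comment describes, and lets {{logSync}} take an explicit target txid.

```java
public class SyncBatchSketch {
    /** Hypothetical in-memory journal; a stand-in, not Hadoop's FSEditLog. */
    static final class ToyJournal {
        private long txid = 0;      // max txid written (buffered)
        private long synctxid = 0;  // max txid already synced

        /** Buffer one edit and return its txid. */
        long write() { return ++txid; }

        /** Number of edits written but not yet synced. */
        long unsynced() { return txid - synctxid; }

        /**
         * Sync everything up to and including upTo.
         * Returns how many edits this sync covered.
         */
        long logSync(long upTo) {
            long target = Math.min(upTo, txid);
            if (target <= synctxid) return 0; // already durable
            long batched = target - synctxid; // no "- 1" here
            synctxid = target;
            return batched;
        }
    }

    public static void main(String[] args) {
        ToyJournal j = new ToyJournal();
        for (int i = 0; i < 10; i++) j.write();
        j.logSync(10);                          // txid = 10, synctxid = 10
        for (int i = 0; i < 10; i++) j.write(); // txid = 20, synctxid = 10
        System.out.println("unsynced = " + j.unsynced());
        // Syncing only up to a caller's own txid covers a smaller batch
        // than syncing up to the last written txid.
        System.out.println("sync to 15 covers " + j.logSync(15));
        System.out.println("sync to 20 covers " + j.logSync(20));
    }
}
```

In this toy model a caller whose edit is txid 15 only needs {{logSync(15)}} to return; whether that is actually cheaper in the real patch depends on how the journal flushes its buffers, which this sketch does not model.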
> Add support for async edit logging
> ----------------------------------
>
>                 Key: HDFS-7964
>                 URL: https://issues.apache.org/jira/browse/HDFS-7964
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: 2.0.2-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-7964.patch, HDFS-7964.patch
>
>
> Edit logging is a major source of contention within the NN. LogEdit is
> called within the namespace write lock, while logSync is called outside of
> the lock to allow greater concurrency. The handler thread remains busy until
> logSync returns to provide the client with a durability guarantee for the
> response.
> Write-heavy RPC load and/or slow IO cause handlers to stall in logSync.
> Although the write lock is not held, readers are limited/starved and the call
> queue fills. Combining an edit log thread with postponed RPC responses from
> HADOOP-10300 will provide the same durability guarantee but immediately free
> up the handlers.
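The design in the issue description (a dedicated edit log thread plus postponed responses) can be sketched roughly as follows. This is a simplified model under stated assumptions, not the patch itself: {{AsyncEditLogSketch}}, {{PendingEdit}}, and the use of {{CompletableFuture}} as a stand-in for a postponed RPC response are all hypothetical, and the "journal" is just an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncEditLogSketch implements Runnable {
    /** One queued edit plus the response to release once it is durable. */
    static final class PendingEdit {
        final long txid;
        final String op;
        final CompletableFuture<Long> response = new CompletableFuture<>();
        PendingEdit(long txid, String op) { this.txid = txid; this.op = op; }
    }

    private final BlockingQueue<PendingEdit> queue = new LinkedBlockingQueue<>();
    private final List<String> journal = new ArrayList<>(); // sync thread only
    private long nextTxId = 1;           // guarded by "this"
    private volatile long syncedTxId = 0;

    /**
     * Called by a handler thread. Assigns a txid, queues the edit, and
     * returns at once; synchronized so txid order matches queue order.
     */
    public synchronized CompletableFuture<Long> logEdit(String op) {
        PendingEdit e = new PendingEdit(nextTxId++, op);
        queue.add(e);
        return e.response;
    }

    public long syncedTxId() { return syncedTxId; }

    /** Edit log thread: batch queued edits, "sync" once, then respond. */
    @Override
    public void run() {
        List<PendingEdit> batch = new ArrayList<>();
        try {
            while (true) {
                batch.add(queue.take()); // block until there is work
                queue.drainTo(batch);    // pick up whatever else is waiting
                for (PendingEdit e : batch) journal.add(e.op);
                // A real implementation would flush to stable storage here.
                syncedTxId = batch.get(batch.size() - 1).txid;
                // Only now is it safe to answer the clients.
                for (PendingEdit e : batch) e.response.complete(e.txid);
                batch.clear();
            }
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // exit on shutdown
        }
    }

    public static void main(String[] args) throws Exception {
        AsyncEditLogSketch log = new AsyncEditLogSketch();
        Thread syncer = new Thread(log, "edit-sync");
        syncer.setDaemon(true);
        syncer.start();
        CompletableFuture<Long> a = log.logEdit("mkdir /a");
        CompletableFuture<Long> b = log.logEdit("mkdir /b");
        // The "handler" is free here; responses complete after the sync.
        System.out.println("durable txids: " + a.get() + ", " + b.get());
        syncer.interrupt();
    }
}
```

The point of the sketch is the ordering: the handler returns from {{logEdit}} without waiting, while the response is completed only after the batch containing its txid has been synced, so the durability guarantee described in the issue is preserved.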