[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14712727#comment-14712727 ] Yong Zhang commented on HDFS-1148: -- Failed unit tests: hadoop.hdfs.TestFileStatus is not related to this patch Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, performance Reporter: Todd Lipcon Assignee: Yong Zhang Attachments: HDFS-1148.001.patch, HDFS-1148.002.patch, HDFS-1148.003.patch, hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14711727#comment-14711727 ] Hadoop QA commented on HDFS-1148: -

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 18m 25s | Pre-patch trunk has 4 extant Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 8m 0s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 5s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 25s | The applied patch generated 1 new checkstyle issues (total was 121, now 111). |
| {color:green}+1{color} | whitespace | 0m 3s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 2m 1s | Post-patch findbugs hadoop-hdfs-project/hadoop-hdfs compilation is broken. |
| {color:green}+1{color} | findbugs | 2m 1s | The patch does not introduce any new Findbugs (version ) warnings. |
| {color:green}+1{color} | native | 0m 37s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 6m 39s | Tests failed in hadoop-hdfs. |
| | | 49m 49s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestFileStatus |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12752270/HDFS-1148.003.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / eee0d45 |
| Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12117/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12117/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12117/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12117/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12117/console |

This message was automatically generated.

Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, performance Reporter: Todd Lipcon Assignee: Yong Zhang Attachments: HDFS-1148.001.patch, HDFS-1148.002.patch, HDFS-1148.003.patch, hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14663029#comment-14663029 ] Hadoop QA commented on HDFS-1148: -

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 17m 25s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 7m 51s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 41s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 24s | The applied patch generated 1 new checkstyle issues (total was 122, now 111). |
| {color:red}-1{color} | whitespace | 0m 2s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 21s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 34s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 2s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 190m 0s | Tests failed in hadoop-hdfs. |
| | | 234m 20s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestReplaceDatanodeOnFailure |
| | hadoop.hdfs.server.datanode.TestBlockRecovery |
| Timed out tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| | org.apache.hadoop.cli.TestHDFSCLI |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12749415/HDFS-1148.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8f73bdd |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11943/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11943/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11943/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11943/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11943/console |

This message was automatically generated.

Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, performance Reporter: Todd Lipcon Assignee: Yong Zhang Attachments: HDFS-1148.001.patch, HDFS-1148.002.patch, hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654926#comment-14654926 ] Hadoop QA commented on HDFS-1148: -

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 19m 6s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | javac | 8m 12s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 0s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 28s | The applied patch generated 5 new checkstyle issues (total was 122, now 115). |
| {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 25s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 39s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 96m 22s | Tests failed in hadoop-hdfs. |
| | | 143m 34s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestNameEditsConfigs |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.server.namenode.TestAclConfigFlag |
| | hadoop.hdfs.server.namenode.TestListCorruptFileBlocks |
| | hadoop.hdfs.server.namenode.TestBlockPlacementPolicyRackFaultTolerant |
| | hadoop.hdfs.server.namenode.TestFavoredNodesEndToEnd |
| | hadoop.hdfs.server.namenode.web.resources.TestWebHdfsDataLocality |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12748681/HDFS-1148.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / d540374 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11901/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11901/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11901/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11901/console |

This message was automatically generated.

Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, performance Reporter: Todd Lipcon Assignee: Yong Zhang Attachments: HDFS-1148.001.patch, hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14623705#comment-14623705 ] Yong Zhang commented on HDFS-1148: -- Yes, but I think we still have some apps that do remote reads. Could you please assign this task to me, or may I take it? The same goes for HDFS-3767. Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, performance Reporter: Todd Lipcon Assignee: Dave Thompson Attachments: hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1478#comment-1478 ] Dave Thompson commented on HDFS-1148: - 5 years out, I am not working on this, nor am I up on its current relevance. Please close it out as you see fit, or take over as assignee. Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, performance Reporter: Todd Lipcon Assignee: Dave Thompson Attachments: hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14321180#comment-14321180 ] Todd Lipcon commented on HDFS-1148: --- I don't think this is nearly as useful anymore now that HBase and other apps that do a large number of random reads are usually using short-circuit read and thus avoiding any datanode code in their hot path. If we want to optimize the remote read random read rate, this would be useful for that purpose -- just doesn't seem urgent. Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, performance Reporter: Todd Lipcon Assignee: Dave Thompson Attachments: hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319397#comment-14319397 ] Yong Zhang commented on HDFS-1148: -- Hi everyone, what is the status for this improvement? Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: datanode, performance Reporter: Todd Lipcon Assignee: Dave Thompson Attachments: hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044712#comment-13044712 ] Todd Lipcon commented on HDFS-1148: --- As I was updating HDFS-941 to trunk tonight, I took the opportunity to look into the blocking behavior again. While running TestParallelRead (with N_ITERATIONS bumped up 10x) I ran:
{code}
$ while true ; do jstack 3378 | grep -A2 BLOCK >> /tmp/blocked ; done
{code}
and then when it was done:
{code}
$ grep 'at ' /tmp/blocked | sort | uniq -c | sort -nk1
      1 at java.lang.Object.wait(Native Method)
      6 at org.apache.hadoop.hdfs.DFSInputStream.getBlockRange(DFSInputStream.java:313)
     27 at org.apache.hadoop.hdfs.TestParallelRead$ReadWorker.read(TestParallelRead.java:142)
    137 at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockInputStream(FSDataset.java:1286)
    183 at org.apache.hadoop.hdfs.server.datanode.BlockSender.init(BlockSender.java:100)
    220 at org.apache.hadoop.hdfs.DFSInputStream.getFileLength(DFSInputStream.java:206)
    251 at org.apache.hadoop.hdfs.server.datanode.FSDataset.getBlockFile(FSDataset.java:1267)
{code}
Three of the top four contention points are on the FSDataset monitor lock. The client-side DFSInputStream.getFileLength one is surprising, but not related to this particular JIRA.
Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1148-old.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
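The jstack sampling in the comment above can also be approximated from inside the JVM. The following is only a sketch under assumptions: `BlockedFrameSampler` and `sampleOnce` are hypothetical names, not part of Hadoop, and the sketch counts only the topmost frame of each BLOCKED thread, whereas `grep -A2` also captures the frames beneath it.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;

// Hypothetical helper (not Hadoop code): an in-process analogue of the
// "jstack | grep BLOCK" sampling loop.
class BlockedFrameSampler {
  // Takes one snapshot of all live threads and, for every thread in the
  // BLOCKED state, increments the counter for its topmost stack frame.
  static void sampleOnce(Map<String, Integer> counts) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    // false, false: skip locked-monitor and synchronizer details.
    for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
      if (info == null || info.getThreadState() != Thread.State.BLOCKED) {
        continue;
      }
      StackTraceElement[] stack = info.getStackTrace();
      if (stack.length == 0) {
        continue;
      }
      String frame = stack[0].getClassName() + "." + stack[0].getMethodName();
      Integer old = counts.get(frame);
      counts.put(frame, old == null ? 1 : old + 1);
    }
  }
}
```

Calling `sampleOnce` repeatedly while the workload runs and then sorting the map by count gives a tally comparable to the `sort | uniq -c` output.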
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13044714#comment-13044714 ] Todd Lipcon commented on HDFS-1148: --- Uploaded another copy of the diff that ignores whitespace at http://cloudera-todd.s3.amazonaws.com/1148-whitespace.txt -- easier to read that way. Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1148-old.txt, hdfs-1148-trunk.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016050#comment-13016050 ] Suresh Srinivas commented on HDFS-1148: --- Todd, I want to understand which methods have a lot of contention. Also, you say: "As I understood it, the non-fair mode is more efficient. Since the lock is mostly uncontended". I am not sure what you mean by "mostly uncontended", because I understand the problem description as saying there is a lot of contention... Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1148-old.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016054#comment-13016054 ] Todd Lipcon commented on HDFS-1148: ---
bq. Todd, I want to understand what methods have lot contention
I actually don't remember anymore - this was a while back that I saw this, and only once I added HDFS-941. Since it was a read workload, it makes sense that it would be getBlockInputStream, metaFileExists, getVisibleLength, and getMetaDataInputStream.
bq. I am not sure what you mean mostly uncontended, because I understand the problem description as there are lot of contentions
Sorry, what I meant here is that, once it's converted to read-write lock, there is very little contention for the exclusive (write) lock. It's very rare to write small blocks, whereas small frequent reads come often from applications like HBase. So, we mostly see lots of readers and only the occasional writer.
Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1148-old.txt, patch-HDFS-1148-rel0.20.2.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009755#comment-13009755 ] Dmytro Molkov commented on HDFS-1148: - Todd, in your patch you are using ReentrantReadWriteLock in a Non-Fair mode, is this intentional? Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1148-old.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009782#comment-13009782 ] Todd Lipcon commented on HDFS-1148: --- As I understood it, the non-fair mode is more efficient. Since the lock is mostly uncontended, I figured strict fairness wasn't important. Do you have some results that indicate the non-fair mode is a problem? Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1148-old.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13009940#comment-13009940 ] dhruba borthakur commented on HDFS-1148: I too have figured out lately that for these types of locks, non-fair mode gives much better performance. Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-1148-old.txt In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
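For readers unfamiliar with the fairness setting discussed in the comments above: in java.util.concurrent, `ReentrantReadWriteLock` is non-fair when built with the no-argument constructor and fair when built with `new ReentrantReadWriteLock(true)`. A minimal sketch follows; the wrapper class `FairnessSketch` is a hypothetical name used only for illustration and does not come from the patch.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration of the two construction modes.
class FairnessSketch {
  // Non-fair (the default): acquisition order is unspecified and threads
  // may barge ahead of waiters, which typically gives higher throughput.
  static ReentrantReadWriteLock nonFair() {
    return new ReentrantReadWriteLock();
  }

  // Fair: threads contend using an approximately arrival-order policy,
  // at some cost in throughput.
  static ReentrantReadWriteLock fair() {
    return new ReentrantReadWriteLock(true);
  }
}
```

`isFair()` on the returned lock reports which mode is in effect.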
[jira] Commented: (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984851#action_12984851 ] dhruba borthakur commented on HDFS-1148: hi todd, if you have this patch, may we have a look? thanks. Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866515#action_12866515 ] Todd Lipcon commented on HDFS-1148: --- YCSB output (same test setup as described in HDFS-941). This test is run with the HDFS-941 improvements plus FSDataset being converted to a readwrite lock.
[OVERALL],RunTime(ms), 94325
[OVERALL],Throughput(ops/sec), 10601.643254704479
[READ], Operations, 100
[READ], AverageLatency(ms), 3.747273
[READ], MinLatency(ms), 0
[READ], MaxLatency(ms), 1360
[READ], 95thPercentileLatency(ms), 10
[READ], 99thPercentileLatency(ms), 15
[READ], Return=0, 100
Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866715#action_12866715 ] dhruba borthakur commented on HDFS-1148: Sounds like a good idea to me. How complicated is the code change? Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1148) Convert FSDataset to ReadWriteLock
[ https://issues.apache.org/jira/browse/HDFS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12866716#action_12866716 ] Todd Lipcon commented on HDFS-1148: --- Pretty small in 0.20, but FSDataset is substantially reworked in trunk, and I haven't taken a look there. In general I just took a conservative approach, and made anything that might possibly change something use the write lock - even without being fancy, it completely dropped this class off my jstacks. Convert FSDataset to ReadWriteLock -- Key: HDFS-1148 URL: https://issues.apache.org/jira/browse/HDFS-1148 Project: Hadoop HDFS Issue Type: Improvement Components: data-node Reporter: Todd Lipcon Assignee: Todd Lipcon In benchmarking HDFS-941 I noticed that for the random read workload, the FSDataset lock is highly contended. After converting it to a ReentrantReadWriteLock, I saw a ~25% improvement on both latency and ops/second. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
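As a rough illustration of the conservative approach described in the comment above (read-only accessors take the shared lock, and anything that might change state takes the exclusive lock), here is a minimal Java sketch. The class `BlockIndexSketch`, its map, and its method bodies are hypothetical stand-ins chosen for illustration; they are not the actual FSDataset code or the attached patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a dataset-like class that previously guarded
// all of its methods with "synchronized".
class BlockIndexSketch {
  // No-argument constructor: non-fair mode.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<Long, String> blockFiles = new HashMap<Long, String>();

  // Pure lookup: any number of readers may hold the read lock at once.
  String getBlockFile(long blockId) {
    lock.readLock().lock();
    try {
      return blockFiles.get(blockId);
    } finally {
      lock.readLock().unlock();
    }
  }

  // Mutation: takes the exclusive write lock.
  void addBlock(long blockId, String path) {
    lock.writeLock().lock();
    try {
      blockFiles.put(blockId, path);
    } finally {
      lock.writeLock().unlock();
    }
  }

  // Might change state, so it conservatively takes the write lock too.
  boolean removeBlock(long blockId) {
    lock.writeLock().lock();
    try {
      return blockFiles.remove(blockId) != null;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```

The unlock calls sit in `finally` blocks because, unlike a `synchronized` method, an explicit lock is not released automatically when an exception propagates.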