[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869330#comment-15869330 ] Hadoop QA commented on HDFS-11402: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 35s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 42s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}124m 49s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.tools.TestHdfsConfigFields | | | hadoop.hdfs.TestRollingUpgrade | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | HDFS-11402 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852975/HDFS-11402.01.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux 043c43761295 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / a136936 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/18386/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18386/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18386/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > HDFS Snapshots should capture point-in-time copies of OPEN files > > > Key: HDFS-11402 > URL: https://issues.apache.org/jira/browse/HDFS-11402 > Project: Hadoop HDFS > Issue Type: Improvement >
[jira] [Commented] (HDFS-10506) OIV's ReverseXML processor cannot reconstruct some snapshot details
[ https://issues.apache.org/jira/browse/HDFS-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869270#comment-15869270 ] Surendra Singh Lilhore commented on HDFS-10506: --- Reviewed and tested the patch.. Looks good to me. > OIV's ReverseXML processor cannot reconstruct some snapshot details > --- > > Key: HDFS-10506 > URL: https://issues.apache.org/jira/browse/HDFS-10506 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 2.8.0 >Reporter: Colin P. McCabe >Assignee: Akira Ajisaka > Attachments: HDFS-10506.01.patch, HDFS-10506.02.patch, > HDFS-10506.03.patch > > > OIV's ReverseXML processor cannot reconstruct some snapshot details. > Specifically, should contain a and field, > but does not. should contain a field. OIV also > needs to be changed to emit these fields into the XML (they are currently > missing). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way
[ https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey reassigned HDFS-11419: --- Assignee: Chen Liang > BlockPlacementPolicyDefault is choosing datanode in an inefficient way > -- > > Key: HDFS-11419 > URL: https://issues.apache.org/jira/browse/HDFS-11419 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode > Reporter: Chen Liang > Assignee: Chen Liang > > Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} will end up > calling into {{chooseRandom}}, which will first find a random datanode by calling > {code}DatanodeDescriptor chosenNode = chooseDataNode(scope, excludedNodes);{code} > and then check whether that returned datanode satisfies the storage type requirement: > {code}storage = chooseStorage4Block( chosenNode, blocksize, results, entry.getKey());{code} > If yes, {{numOfReplicas--;}}; otherwise, the node is added to the excluded nodes, > and the loop runs again until {{numOfReplicas}} is down to 0. > The problem here is that the storage type is not considered until after a > random node has already been returned. We've seen a case where a cluster has a > large number of datanodes, while only a few satisfy the storage type > condition. So, for the most part, this code blindly picks random datanodes > that do not satisfy the storage type, adds the node to the excluded set, and tries > again and again. > To make matters worse, the way {{NetworkTopology#chooseRandom}} works is > that, given a set of excluded nodes, it first finds a random datanode, and then, > if that node is in the excluded set, tries to find another random node. So the more > excluded nodes there are, the more likely a random node will be in the > excluded set, in which case one iteration is basically wasted. > Therefore, this JIRA proposes to augment/modify the relevant classes so > that datanodes can be found more efficiently. There are currently two > different high-level solutions we are considering: > 1. Add a field to the Node base types to describe the storage type info, and > when searching for a node, take such field(s) into account, and do not > return a node that does not meet the storage type requirement. > 2. Change the {{NetworkTopology}} class to be aware of storage types: for each > storage type, there is one tree subset that connects all the nodes with that > type, and one search happens on only that subset. So unexpected storage > types are simply not in the search space. > Thanks [~szetszwo] for the offline discussion, and any comments are more than > welcome.
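To make the inefficiency concrete, here is a hypothetical, much-simplified model of the selection loop described above (illustrative Java only, not the actual {{BlockPlacementPolicyDefault}} code; the node numbering and storage-type check are stand-ins):

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Simplified model of the loop described above: a node is drawn at random
// first, and the storage-type check happens only afterwards, so when few
// nodes carry the required storage type most draws are wasted.
// Assumes matchingNodes >= replicasNeeded, or the loop never terminates.
public class ChooseRandomSketch {

    // Nodes 0..matchingNodes-1 are assumed to have the required storage type.
    static int wastedDraws(int totalNodes, int matchingNodes,
                           int replicasNeeded, long seed) {
        Random rnd = new Random(seed);
        Set<Integer> chosen = new HashSet<>();
        int wasted = 0;
        while (chosen.size() < replicasNeeded) {
            int candidate = rnd.nextInt(totalNodes);     // blind random pick
            boolean hasType = candidate < matchingNodes; // type checked only now
            if (!hasType || chosen.contains(candidate)) {
                wasted++;                                // iteration thrown away
                continue;
            }
            chosen.add(candidate);
        }
        return wasted;
    }

    public static void main(String[] args) {
        // 1000 datanodes, only 5 with the required storage type, 3 replicas:
        // the vast majority of draws are wasted.
        System.out.println(wastedDraws(1000, 5, 3, 42L));
        // If every node has the type, a single replica wastes no draw.
        System.out.println(wastedDraws(1000, 1000, 1, 42L));
    }
}
```

A storage-type-aware search (either proposed solution) would make the denominator of each draw the matching subset rather than the whole cluster, eliminating the wasted iterations.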
[jira] [Commented] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way
[ https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869262#comment-15869262 ] Yiqun Lin commented on HDFS-11419: -- I looked into this again and found something that does not seem entirely accurate. {quote} So, for the most part, this code blindly picks random datanodes that do not satisfy the storage type, and adds the node to excluded and tries again and again. {quote} From the code of {{BlockPlacementPolicyDefault#chooseRandom}}, the node is not added to the excluded set when the random node does not satisfy the storage type; it is added only when a suitable storage is found. The related code: {code} ... storage = chooseStorage4Block( chosenNode, blocksize, results, entry.getKey()); if (storage != null) { <=== added to excludedNodes only when a storage was found numOfReplicas--; if (firstChosen == null) { firstChosen = storage; } // add node (subclasses may also add related nodes) to excludedNode addToExcludedNodes(chosenNode, excludedNodes); ... {code}
[jira] [Comment Edited] (HDFS-11417) Add datanode admin command to get the storage info.
[ https://issues.apache.org/jira/browse/HDFS-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869251#comment-15869251 ] Surendra Singh Lilhore edited comment on HDFS-11417 at 2/16/17 5:33 AM: I feel this is required when you have only a shell interface to access the cluster data. was (Author: surendrasingh): I think this is required when you have only a shell interface to access the cluster data. > Add datanode admin command to get the storage info. > --- > > Key: HDFS-11417 > URL: https://issues.apache.org/jira/browse/HDFS-11417 > Project: Hadoop HDFS > Issue Type: New Feature > Affects Versions: 2.7.3 > Reporter: Surendra Singh Lilhore > Assignee: Surendra Singh Lilhore > > It would be good to add one admin command for the datanode to get data directory > info like storage type, directory path, number of blocks, capacity, and used > space. This will be helpful in large clusters where a DN has multiple data > directories configured.
[jira] [Commented] (HDFS-11417) Add datanode admin command to get the storage info.
[ https://issues.apache.org/jira/browse/HDFS-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869251#comment-15869251 ] Surendra Singh Lilhore commented on HDFS-11417: --- I think this is required when you have only a shell interface to access the cluster data.
[jira] [Commented] (HDFS-9388) Refactor decommission related code to support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-9388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869184#comment-15869184 ] Ming Ma commented on HDFS-9388: --- [~manojg], most of the work has been done by other jiras. Some specific items left include renaming DecommissionManager and deciding whether the comments about decommission should be updated. Please feel free to assign it to yourself. Thank you! > Refactor decommission related code to support maintenance state for datanodes > - > > Key: HDFS-9388 > URL: https://issues.apache.org/jira/browse/HDFS-9388 > Project: Hadoop HDFS > Issue Type: Sub-task > Reporter: Ming Ma > > Lots of code can be shared between the existing decommission functionality > and the to-be-added maintenance state support for datanodes. To make it easier to > add maintenance state support, let us first modify the existing code to make > it more general.
[jira] [Commented] (HDFS-10506) OIV's ReverseXML processor cannot reconstruct some snapshot details
[ https://issues.apache.org/jira/browse/HDFS-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869179#comment-15869179 ] Hadoop QA commented on HDFS-10506: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 27s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 20s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 9s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s{color} | {color:green} the 
patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 96m 55s{color} | {color:green} hadoop-hdfs in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}129m 50s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | HDFS-10506 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852953/HDFS-10506.03.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle | | uname | Linux d037e9ef6a16 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0741dd3 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/18385/testReport/ | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/18385/console | | Powered by | Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > OIV's ReverseXML processor cannot reconstruct some snapshot details > --- > > Key: HDFS-10506 > URL: https://issues.apache.org/jira/browse/HDFS-10506 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 2.8.0 >Reporter: Colin P. McCabe >Assignee: Akira Ajisaka > Attachments: HDFS-10506.01.patch, HDFS-10506.02.patch, > HDFS-10506.03.patch > > > OIV's ReverseXML processor cannot reconstruct some snapshot details. > Specifically, should contain a and
[jira] [Commented] (HDFS-11265) Extend visualization for Maintenance Mode under Datanode tab in the NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869173#comment-15869173 ] Hudson commented on HDFS-11265: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11260 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11260/]) HDFS-11265. Extend visualization for Maintenance Mode under Datanode tab (mingma: rev a136936d018b5cebb7aad9a01ea0dcc366e1c3b8) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/static/hadoop.css > Extend visualization for Maintenance Mode under Datanode tab in the NameNode > UI > --- > > Key: HDFS-11265 > URL: https://issues.apache.org/jira/browse/HDFS-11265 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Affects Versions: 3.0.0-alpha1 >Reporter: Manoj Govindassamy >Assignee: Elek, Marton > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: ex.png, HDFS-11265.001.patch, icons.png, x.png > > > With HDFS-9391, DataNodes in MaintenanceModes states are shown under DataNode > page in NameNode UI, but they are lacking icon visualization like the ones > shown for other node states. Need to extend the icon visualization to cover > Maintenance Mode. > {code} >
[jira] [Commented] (HDFS-11411) Avoid OutOfMemoryError in TestMaintenanceState test runs
[ https://issues.apache.org/jira/browse/HDFS-11411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869170#comment-15869170 ] Ming Ma commented on HDFS-11411: Looks good. Nits: * For the {{testExpectedReplication}} case, should we move the setup() call into the function that calls startCluster? * Maybe at the end of the function, call teardown first, then setup for the next iteration. Otherwise, setup will be called twice (once from the test case setup and again from the added explicit call) for the first iteration of the test case. > Avoid OutOfMemoryError in TestMaintenanceState test runs > > > Key: HDFS-11411 > URL: https://issues.apache.org/jira/browse/HDFS-11411 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode > Affects Versions: 3.0.0-alpha1 > Reporter: Manoj Govindassamy > Assignee: Manoj Govindassamy > Attachments: HDFS-11411.01.patch > > > TestMaintenanceState test runs are seeing OutOfMemoryError issues quite > frequently now. Need to fix tests that are consuming lots of memory/threads. > {noformat} > --- > T E S T S > --- > Running org.apache.hadoop.hdfs.TestMaintenanceState > Tests run: 21, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 219.479 sec > <<< FAILURE! - in org.apache.hadoop.hdfs.Te > testTransitionFromDecommissioned(org.apache.hadoop.hdfs.TestMaintenanceState) > Time elapsed: 0.64 sec <<< ERROR! > java.lang.OutOfMemoryError: unable to create new native thread > testTakeDeadNodeOutOfMaintenance(org.apache.hadoop.hdfs.TestMaintenanceState) > Time elapsed: 0.031 sec <<< ERROR! > java.lang.OutOfMemoryError: unable to create new native thread > testWithNNAndDNRestart(org.apache.hadoop.hdfs.TestMaintenanceState) Time > elapsed: 0.03 sec <<< ERROR! > java.lang.OutOfMemoryError: unable to create new native thread > testMultipleNodesMaintenance(org.apache.hadoop.hdfs.TestMaintenanceState) > Time elapsed: 60.127 sec <<< ERROR! 
> java.io.IOException: Problem starting http server > Results : > Tests in error: > > TestMaintenanceState.testTransitionFromDecommissioned:225->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.s > > TestMaintenanceState.testTakeDeadNodeOutOfMaintenance:636->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.s > > TestMaintenanceState.testWithNNAndDNRestart:692->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.startCluste > > TestMaintenanceState.testMultipleNodesMaintenance:532->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.start > {noformat}
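The setup/teardown ordering suggested above can be sketched as follows (a hypothetical plain-Java skeleton; {{setup}}, {{teardown}}, and the counter are illustrative stand-ins for the JUnit fixture and the {{MiniDFSCluster}}/{{startCluster}} resources, not the actual test code):

```java
// Sketch of the suggested pattern: when one test method iterates over several
// cluster configurations, tear down at the end of each iteration and set up
// again only before the next one, so setup() is not invoked twice for the
// first iteration (once by the framework's @Before, once explicitly).
public class IterativeClusterTestSketch {

    static int activeClusters = 0;   // stands in for MiniDFSCluster resources

    static void setup()    { activeClusters++; }
    static void teardown() { activeClusters--; }

    /** Assumes setup() was already called once by the test framework. */
    static void testExpectedReplication(int[] replicationFactors) {
        for (int i = 0; i < replicationFactors.length; i++) {
            if (i > 0) {
                setup();             // fresh fixture for every later iteration
            }
            // ... start cluster with replicationFactors[i], run assertions ...
            teardown();              // release resources before the next round
        }
    }

    public static void main(String[] args) {
        setup();                     // framework-style @Before
        testExpectedReplication(new int[] {1, 2, 3});
        System.out.println("leaked clusters: " + activeClusters); // expect 0
    }
}
```

The point of the pattern is simply that setup and teardown calls stay balanced across iterations, so each iteration starts from a fresh fixture and no cluster resources accumulate.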
[jira] [Updated] (HDFS-11265) Extend visualization for Maintenance Mode under Datanode tab in the NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-11265: --- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha3 2.9.0 Status: Resolved (was: Patch Available) Thanks [~elektrobank] for the contribution and [~manojg] for the review. Committed to trunk and branch-2.
[jira] [Commented] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way
[ https://issues.apache.org/jira/browse/HDFS-11419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869150#comment-15869150 ] Yiqun Lin commented on HDFS-11419: -- Great proposal, [~vagarychen]! Actually, the problem is that the block placement policy does not use the expected storage types when choosing random nodes, which lets some invalid nodes be chosen. Of the two solutions, I prefer the first one. Since storage types were introduced in the DataNode, it would be better to add this info as a field of Node, indicating which storage types the Node contains. If we do this, we only need a small change to {{NetworkTopology#chooseRandom}}.
[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manoj Govindassamy updated HDFS-11402: -- Attachment: HDFS-11402.01.patch > HDFS Snapshots should capture point-in-time copies of OPEN files > > > Key: HDFS-11402 > URL: https://issues.apache.org/jira/browse/HDFS-11402 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs > Affects Versions: 2.6.0 > Reporter: Manoj Govindassamy > Assignee: Manoj Govindassamy > Attachments: HDFS-11402.01.patch > > > *Problem:* > 1. When files are being written and HDFS Snapshots are taken in parallel, the Snapshots > do capture all these files, but the files being written do not have their point-in-time > length captured in the Snapshots. That is, these open files are not frozen in HDFS > Snapshots. The open files grow/shrink in length, just like the original file, even after > the snapshot time. > 2. At the time of file close or any other metadata modification operation on > these files, HDFS reconciles the file length and records the modification in > the last taken Snapshot. All the previously taken Snapshots continue to have > those open files with no modification recorded. So, all those previous > snapshots end up using the final modification record in the last snapshot, and > after the file close, the file lengths in all those snapshots end up the same. > Assume File1 is opened for write and a total of 1MB written to it. While the > writes are happening, snapshots are taken in parallel. > {noformat} > |---Time---T1---T2-T3T4--> > |---Snap1--Snap2-Snap3---> > |---File1.open---write-write---close-> > {noformat} > Then at time, > T2: > Snap1.File1.length = 0 > T3: > Snap1.File1.length = 0 > Snap2.File1.length = 0 > > T4: > Snap1.File1.length = 1MB > Snap2.File1.length = 1MB > Snap3.File1.length = 1MB > *Proposal* > 1. At the time of taking a Snapshot, {{SnapshotManager#createSnapshot}} can > optionally request {{DirectorySnapshottableFeature#addSnapshot}} to freeze > open files. > 2. {{DirectorySnapshottableFeature#addSnapshot}} can consult the > {{LeaseManager}} and get a list of INodesInPath for all open files under the > snapshot dir. > 3. {{DirectorySnapshottableFeature#addSnapshot}}, after the Snapshot creation, > Diff creation, and updating of modification time, can invoke > {{INodeFile#recordModification}} for each of the open files. This way, the > Snapshot just taken will have a {{FileDiff}} with {{fileSize}} captured for > each of the open files. > 4. The above model follows the current Snapshot and Diff protocols and doesn't > introduce any new disk formats. So, I don't think we will be needing any new > FSImage Loader/Saver changes for Snapshots. > 5. One of the design goals of HDFS Snapshot was the ability to take any number of > snapshots in O(1) time. Though the LeaseManager keeps all the open files with > leases in an in-memory map, an iteration is still needed to prune the needed open > files and then run recordModification on each of them. So, it will not be > strictly O(1) with the above proposal. But it's going to be only a marginal > increase, as the new order will be O(open_files_under_snap_dir). In order to > avoid changing HDFS Snapshots' behavior for open files and their time > complexity, this improvement can be made under a new config > {{"dfs.namenode.snapshot.freeze.openfiles"}} which by default can be > {{false}}.
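A minimal sketch of the freezing step in the proposal above (illustrative Java only; {{OpenFile}}, the path-prefix pruning, and the boolean config flag are hypothetical stand-ins, not the actual {{LeaseManager}} or {{DirectorySnapshottableFeature}} APIs):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of the proposal: at snapshot-creation time, consult a
// lease-manager-like view of open files, prune to those under the snapshot
// root, and record each file's current length into the new snapshot's diff.
public class SnapshotFreezeSketch {

    static final class OpenFile {
        final String path;
        final long currentLength;
        OpenFile(String path, long currentLength) {
            this.path = path;
            this.currentLength = currentLength;
        }
    }

    /** Returns path -> frozen length for open files under snapRoot. */
    static Map<String, Long> freezeOpenFiles(List<OpenFile> leasedFiles,
                                             String snapRoot,
                                             boolean freezeEnabled) {
        Map<String, Long> frozen = new HashMap<>();
        if (!freezeEnabled) {   // dfs.namenode.snapshot.freeze.openfiles=false
            return frozen;      // current behavior: open files are not frozen
        }
        // O(open files), not O(1): the marginal cost discussed in point 5.
        for (OpenFile f : leasedFiles) {
            if (f.path.startsWith(snapRoot + "/")) {
                frozen.put(f.path, f.currentLength); // point-in-time length
            }
        }
        return frozen;
    }

    public static void main(String[] args) {
        List<OpenFile> leases = Arrays.asList(
                new OpenFile("/data/logs/app.log", 1_048_576L),
                new OpenFile("/tmp/scratch", 10L));
        System.out.println(freezeOpenFiles(leases, "/data", true));
    }
}
```

The flag models the proposed {{"dfs.namenode.snapshot.freeze.openfiles"}} default-off behavior: when disabled, nothing is recorded and snapshots treat open files exactly as they do today.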
[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11402:
--------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11402:
--------------------------------------
    Status: Open  (was: Patch Available)
[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11402:
--------------------------------------
    Attachment: (was: HDFS-11402.01.patch)
[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11402:
--------------------------------------
    Attachment: HDFS-11402.01.patch

Attached v01 patch, which addresses the following:
1. {{LeaseManager#getINodeWithLeases()}} returns the set of open files, in {{INodesInPath}} form, under any given {{INodeDirectory}}.
2. {{INodesInPath#isAncestor()}} can find whether a given {{INodeDirectory}} is an ancestor of it. Used by {{LeaseManager}} to gather open files under a snapshot root.
3. New config param {{dfs.namenode.snapshot.freeze.openfiles}} in {{DFSConfigKeys}} to turn this feature on/off.
4. When {{dfs.namenode.snapshot.freeze.openfiles}} is enabled, {{DirectorySnapshottableFeature#addSnapshot}} gathers the open files under the snapshot root from {{LeaseManager}} and records a modification for each of them to freeze the file length in the snapshot.

Tests:
1. {{TestLeaseManager}} updated to test the {{LeaseManager#getINodeWithLeases()}} functionality.
2. {{TestOpenFilesWithSnapshot}} and {{TestSnapshotManager}} updated to test snapshots taken while files are being written: verify that file lengths in snapshots stay frozen even after more writes go to the live file, verify that open files outside snapshottable directories are not affected, verify that file lengths stay frozen across a NameNode restart, etc.

[~yzhangal], [~jingzhao], [~andrew.wang], can you please review the patch and share your suggestions/comments?
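The isAncestor pruning mentioned in the patch summary can be illustrated with a small path-string model. The method below is hypothetical: the real {{INodesInPath#isAncestor}} works on inode arrays rather than strings, but the containment test it performs is the same idea.

```java
// Hypothetical string-path version of the ancestor test used to prune the
// LeaseManager's open files down to those under a snapshot root.
public class AncestorCheckSketch {
  /** True if dir is a strict ancestor directory of path; both are absolute paths. */
  static boolean isAncestor(String dir, String path) {
    if (dir.equals("/")) {
      return path.length() > 1 && path.startsWith("/");
    }
    // Appending the separator avoids the false match a bare startsWith(dir)
    // would give for sibling directories like "/snapdir2".
    return path.startsWith(dir + "/");
  }

  public static void main(String[] args) {
    System.out.println(isAncestor("/snapdir", "/snapdir/sub/File1")); // true
    System.out.println(isAncestor("/snapdir", "/snapdir2/File1"));    // false
  }
}
```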
[jira] [Updated] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-11402:
--------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HDFS-11265) Extend visualization for Maintenance Mode under Datanode tab in the NameNode UI
[ https://issues.apache.org/jira/browse/HDFS-11265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869107#comment-15869107 ]

Ming Ma commented on HDFS-11265:
--------------------------------
Strictly speaking, live decommissioned nodes can serve read requests as the least preferred replicas. But even with that, the existing patch LGTM. +1.

> Extend visualization for Maintenance Mode under Datanode tab in the NameNode UI
> -------------------------------------------------------------------------------
>
>                 Key: HDFS-11265
>                 URL: https://issues.apache.org/jira/browse/HDFS-11265
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Elek, Marton
>         Attachments: ex.png, HDFS-11265.001.patch, icons.png, x.png
>
> With HDFS-9391, DataNodes in Maintenance Mode states are shown under the DataNode page in the NameNode UI, but they are lacking the icon visualization shown for other node states. Need to extend the icon visualization to cover Maintenance Mode.
> {code}
>
[jira] [Commented] (HDFS-11402) HDFS Snapshots should capture point-in-time copies of OPEN files
[ https://issues.apache.org/jira/browse/HDFS-11402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869091#comment-15869091 ]

Manoj Govindassamy commented on HDFS-11402:
-------------------------------------------
[~yzhangal], LeaseManager maintains the INodeFile IDs of all leased open files in memory. Moreover, with the proposed design, LeaseManager now exposes a way to get all files with leases under a given directory. So at the time of a snapshot, the SnapshotManager will only run extra operations for the open files under the snapshot directory, and not for all open files in the system. For each open file, creation of the diff record is very lightweight, as it only copies the length field from the file and not the blocks.

Your proposal of maintaining a diff record for each open file at the time of the file-length update, well before the snapshot, is _not_ going to make the overall time complexity O(1). The diff record computation is not what contributes the extra time at snapshot time. It is the generation of the list of open files under the snapshot directory, plus the diff record saving for each of those open files, that makes the overall complexity of snapshot creation O(open files count under the snap root).

And, the {{dfs.namenode.snapshot.freeze.openfiles}} config is {{false}} by default. Users who are interested in this feature can enable it at a marginal cost; otherwise there is no performance dip in the normal snapshot operation.
[jira] [Commented] (HDFS-11411) Avoid OutOfMemoryError in TestMaintenanceState test runs
[ https://issues.apache.org/jira/browse/HDFS-11411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869075#comment-15869075 ]

Manoj Govindassamy commented on HDFS-11411:
-------------------------------------------
Test failures are not related to the patch.

> Avoid OutOfMemoryError in TestMaintenanceState test runs
> --------------------------------------------------------
>
>                 Key: HDFS-11411
>                 URL: https://issues.apache.org/jira/browse/HDFS-11411
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-11411.01.patch
>
> TestMaintenanceState test runs are seeing OutOfMemoryError issues quite frequently now. Need to fix tests that are consuming lots of memory/threads.
> {noformat}
> -------------------------------------------------------
>  T E S T S
> -------------------------------------------------------
> Running org.apache.hadoop.hdfs.TestMaintenanceState
> Tests run: 21, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 219.479 sec <<< FAILURE! - in org.apache.hadoop.hdfs.Te
> testTransitionFromDecommissioned(org.apache.hadoop.hdfs.TestMaintenanceState) Time elapsed: 0.64 sec <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
> testTakeDeadNodeOutOfMaintenance(org.apache.hadoop.hdfs.TestMaintenanceState) Time elapsed: 0.031 sec <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
> testWithNNAndDNRestart(org.apache.hadoop.hdfs.TestMaintenanceState) Time elapsed: 0.03 sec <<< ERROR!
> java.lang.OutOfMemoryError: unable to create new native thread
> testMultipleNodesMaintenance(org.apache.hadoop.hdfs.TestMaintenanceState) Time elapsed: 60.127 sec <<< ERROR!
> java.io.IOException: Problem starting http server > Results : > Tests in error: > > TestMaintenanceState.testTransitionFromDecommissioned:225->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.s > > TestMaintenanceState.testTakeDeadNodeOutOfMaintenance:636->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.s > > TestMaintenanceState.testWithNNAndDNRestart:692->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.startCluste > > TestMaintenanceState.testMultipleNodesMaintenance:532->AdminStatesBaseTest.startCluster:413->AdminStatesBaseTest.start > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11352) Potential deadlock in NN when failing over
[ https://issues.apache.org/jira/browse/HDFS-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869026#comment-15869026 ] Yongjun Zhang commented on HDFS-11352: -- Thanks [~ajisakaa]. > Potential deadlock in NN when failing over > -- > > Key: HDFS-11352 > URL: https://issues.apache.org/jira/browse/HDFS-11352 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.4, 2.6.6 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Critical > Labels: high-availability > Fix For: 2.7.4, 2.6.6 > > Attachments: HDFS-11352-branch-2.7.000.patch > > > HDFS-11180 fixed a general class of deadlock that can occur when failing over > between the MetricsSystemImpl and FSEditLog (see comments on that JIRA for > more details). In trunk and branch-2/branch-2.8 this fix was successful by > making the metrics calls not synchronize on FSEditLog. > In branch-2.6 and branch-2.7 there is one more method, > {{FSNamesystem#getTransactionsSinceLastCheckpoint}}, which still requires the > lock on FSEditLog and thus can result in the same deadlock scenario. This can > be seen by running {{TestFSNamesystemMBean#testWithFSEditLogLock}} _with the > patch in HDFS-11290_ on either of these branches (it fails currently). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10506) OIV's ReverseXML processor cannot reconstruct some snapshot details
[ https://issues.apache.org/jira/browse/HDFS-10506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka updated HDFS-10506: - Attachment: HDFS-10506.03.patch Rebased. > OIV's ReverseXML processor cannot reconstruct some snapshot details > --- > > Key: HDFS-10506 > URL: https://issues.apache.org/jira/browse/HDFS-10506 > Project: Hadoop HDFS > Issue Type: Bug > Components: tools >Affects Versions: 2.8.0 >Reporter: Colin P. McCabe >Assignee: Akira Ajisaka > Attachments: HDFS-10506.01.patch, HDFS-10506.02.patch, > HDFS-10506.03.patch > > > OIV's ReverseXML processor cannot reconstruct some snapshot details. > Specifically, should contain a and field, > but does not. should contain a field. OIV also > needs to be changed to emit these fields into the XML (they are currently > missing). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11352) Potential deadlock in NN when failing over
[ https://issues.apache.org/jira/browse/HDFS-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869008#comment-15869008 ]

Akira Ajisaka commented on HDFS-11352:
--------------------------------------
Hi [~yzhangal], yes, it's correct.
[jira] [Commented] (HDFS-11412) Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}
[ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868977#comment-15868977 ] Ming Ma commented on HDFS-11412: Thanks [~manojg]. * Regarding whether to use the default replication factor or the max replication factor, do you care about the following use case? default == 3, max == 30. Block A has a large replication factor of 30, and we would like to keep at least 20 live replicas around during maintenance. Then 20 nodes with replicas of Block A are put into maintenance at the same time. To make sure there are at least 20 live replicas after maintenance, the system needs to honor minReplicationToBeInMaintenance == 20. * Impact on the {{getExpectedLiveRedundancyNum}} calculation. Set minReplicationToBeInMaintenance to 3. Block B's replication factor is 2. Put one of its replicas into maintenance. Inside {{getExpectedLiveRedundancyNum}}, {{Math.max(expectedRedundancy - numberReplicas.maintenanceReplicas(), getMinReplicationToBeInMaintenance())}} == 3. Ideally the function should return 2. > Maintenance minimum replication config value allowable range should be {0 - > DefaultReplication} > --- > > Key: HDFS-11412 > URL: https://issues.apache.org/jira/browse/HDFS-11412 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Affects Versions: 3.0.0-alpha1 >Reporter: Manoj Govindassamy >Assignee: Manoj Govindassamy > Attachments: HDFS-11412.01.patch > > > Currently the allowed value range for Maintenance Min Replication > {{dfs.namenode.maintenance.replication.min}} is 0 to > {{dfs.namenode.replication.min}} (default=1). Users wanting not to affect the > performance of the cluster would wish to have the Maintenance Min Replication > number greater than 1, say 2. In the current design, it is possible to have > this Maintenance Min Replication configuration, but only after changing the > NameNode level Block Min Replication to 2, and which could slowdown the > overall latency for client writes.
> Technically speaking we should be allowing Maintenance Min Replication to be > in range 0 to dfs.replication.max. > * There is always config value of 0 for users not wanting any > availability/performance during maintenance. > * And, performance centric workloads can still get maintenance done without > major disruptions by having a bigger Maintenance Min Replication. Setting the > upper limit as dfs.replication.max could be an overkill as it could trigger > re-replication which Maintenance State is trying to avoid. So, we could allow > the {{dfs.namenode.maintenance.replication.min}} in the range {{0 to > dfs.replication}} > {noformat} > if (minMaintenanceR < 0) { > throw new IOException("Unexpected configuration parameters: " > + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY > + " = " + minMaintenanceR + " < 0"); > } > if (minMaintenanceR > minR) { > throw new IOException("Unexpected configuration parameters: " > + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY > + " = " + minMaintenanceR + " > " > + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY > + " = " + minR); > } > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
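Ming Ma's second point above can be reproduced outside the NameNode. The sketch below is not the actual BlockManager code; it only mirrors the {{Math.max(...)}} expression quoted in the comment, with a hypothetical minReplicationToBeInMaintenance of 3:

```java
// Standalone sketch (NOT the real BlockManager) of the calculation discussed
// in the comment above; names mirror the JIRA discussion for readability.
public class ExpectedRedundancySketch {

    // Hypothetical config: dfs.namenode.maintenance.replication.min == 3
    static final int MIN_REPLICATION_TO_BE_IN_MAINTENANCE = 3;

    // Mirrors Math.max(expectedRedundancy - maintenanceReplicas,
    //                  getMinReplicationToBeInMaintenance())
    static int getExpectedLiveRedundancyNum(int expectedRedundancy,
                                            int maintenanceReplicas) {
        return Math.max(expectedRedundancy - maintenanceReplicas,
                        MIN_REPLICATION_TO_BE_IN_MAINTENANCE);
    }

    public static void main(String[] args) {
        // Block B: replication factor 2, one replica entering maintenance.
        int live = getExpectedLiveRedundancyNum(2, 1);
        // Prints 3, even though only 2 replicas of Block B can ever exist.
        System.out.println(live);
    }
}
```

With replication factor 2 and one replica in maintenance, the sketch returns 3 rather than 2, illustrating the concern that the result is not capped at the block's own replication factor.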
[jira] [Commented] (HDFS-11412) Maintenance minimum replication config value allowable range should be {0 - DefaultReplication}
[ https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868952#comment-15868952 ] Manoj Govindassamy commented on HDFS-11412: --- Test failures are from the patch. Will submit a revised patch soon. > Maintenance minimum replication config value allowable range should be {0 - > DefaultReplication} > --- > > Key: HDFS-11412 > URL: https://issues.apache.org/jira/browse/HDFS-11412 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: datanode, namenode >Affects Versions: 3.0.0-alpha1 >Reporter: Manoj Govindassamy >Assignee: Manoj Govindassamy > Attachments: HDFS-11412.01.patch > > > Currently the allowed value range for Maintenance Min Replication > {{dfs.namenode.maintenance.replication.min}} is 0 to > {{dfs.namenode.replication.min}} (default=1). Users wanting not to affect the > performance of the cluster would wish to have the Maintenance Min Replication > number greater than 1, say 2. In the current design, it is possible to have > this Maintenance Min Replication configuration, but only after changing the > NameNode level Block Min Replication to 2, and which could slowdown the > overall latency for client writes. > Technically speaking we should be allowing Maintenance Min Replication to be > in range 0 to dfs.replication.max. > * There is always config value of 0 for users not wanting any > availability/performance during maintenance. > * And, performance centric workloads can still get maintenance done without > major disruptions by having a bigger Maintenance Min Replication. Setting the > upper limit as dfs.replication.max could be an overkill as it could trigger > re-replication which Maintenance State is trying to avoid. 
So, we could allow > the {{dfs.namenode.maintenance.replication.min}} in the range {{0 to > dfs.replication}} > {noformat} > if (minMaintenanceR < 0) { > throw new IOException("Unexpected configuration parameters: " > + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY > + " = " + minMaintenanceR + " < 0"); > } > if (minMaintenanceR > minR) { > throw new IOException("Unexpected configuration parameters: " > + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY > + " = " + minMaintenanceR + " > " > + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY > + " = " + minR); > {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-7877) Support maintenance state for datanodes
[ https://issues.apache.org/jira/browse/HDFS-7877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868948#comment-15868948 ] Ming Ma commented on HDFS-7877: --- ok. Will follow up the discussion in HDFS-11412. > Support maintenance state for datanodes > --- > > Key: HDFS-7877 > URL: https://issues.apache.org/jira/browse/HDFS-7877 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Reporter: Ming Ma >Assignee: Ming Ma > Attachments: HDFS-7877-2.patch, HDFS-7877.patch, > Supportmaintenancestatefordatanodes-2.pdf, > Supportmaintenancestatefordatanodes.pdf > > > This requirement came up during the design for HDFS-7541. Given this feature > is mostly independent of upgrade domain feature, it is better to track it > under a separate jira. The design and draft patch will be available soon. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11414) Ozone : move StorageContainerLocation protocol to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-11414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868898#comment-15868898 ] Hadoop QA commented on HDFS-11414: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 49s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s{color} | {color:green} HDFS-7240 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s{color} | {color:green} HDFS-7240 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 32s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 has 86 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 51s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-7240 has 10 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color} | {color:green} HDFS-7240 passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 6s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 40s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client generated 12 new + 86 unchanged - 0 fixed = 98 total (was 86) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s{color} | {color:green} hadoop-hdfs-client in the patch passed. 
{color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 10s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 19s{color} | {color:red} The patch generated 3 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 84m 16s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-project/hadoop-hdfs-client | | | org.apache.hadoop.ozone.protocol.proto.StorageContainerLocationProtocolProtos$ContainerRequestProto.PARSER isn't final but should be At StorageContainerLocationProtocolProtos.java:be At StorageContainerLocationProtocolProtos.java:[line 2859] | | | Class org.apache.hadoop.ozone.protocol.proto.StorageContainerLocationProtocolProtos$ContainerRequestProto defines non-transient non-serializable instance field unknownFields In StorageContainerLocationProtocolProtos.java:instance field unknownFields In
[jira] [Commented] (HDFS-11352) Potential deadlock in NN when failing over
[ https://issues.apache.org/jira/browse/HDFS-11352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868884#comment-15868884 ] Yongjun Zhang commented on HDFS-11352: -- Hi [~xkrogen] and [~ajisakaa], thanks for your work here. It looks to me that the reason trunk doesn't need this patch is that it has HDFS-7501. Because HDFS-7501 is not backported to 2.7.x and 2.6.x, we still need HDFS-11352 there. Does that sound correct to you? Thanks. > Potential deadlock in NN when failing over > -- > > Key: HDFS-11352 > URL: https://issues.apache.org/jira/browse/HDFS-11352 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.7.4, 2.6.6 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Critical > Labels: high-availability > Fix For: 2.7.4, 2.6.6 > > Attachments: HDFS-11352-branch-2.7.000.patch > > > HDFS-11180 fixed a general class of deadlock that can occur when failing over > between the MetricsSystemImpl and FSEditLog (see comments on that JIRA for > more details). In trunk and branch-2/branch-2.8 this fix was successful by > making the metrics calls not synchronize on FSEditLog. > In branch-2.6 and branch-2.7 there is one more method, > {{FSNamesystem#getTransactionsSinceLastCheckpoint}}, which still requires the > lock on FSEditLog and thus can result in the same deadlock scenario. This can > be seen by running {{TestFSNamesystemMBean#testWithFSEditLogLock}} _with the > patch in HDFS-11290_ on either of these branches (it fails currently). -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11414) Ozone : move StorageContainerLocation protocol to hdfs-client
[ https://issues.apache.org/jira/browse/HDFS-11414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Liang updated HDFS-11414: -- Attachment: HDFS-11414-HDFS-7240.003.patch fix the missing package-info warnings. > Ozone : move StorageContainerLocation protocol to hdfs-client > - > > Key: HDFS-11414 > URL: https://issues.apache.org/jira/browse/HDFS-11414 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Chen Liang >Assignee: Chen Liang > Attachments: HDFS-11414-HDFS-7240.001.patch, > HDFS-11414-HDFS-7240.002.patch, HDFS-11414-HDFS-7240.003.patch > > > {{StorageContainerLocation}} classes are client-facing classes of containers, > similar to {{XceiverClient}}, so they should be moved to hadoop-hdfs-client. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11184) Ozone: SCM: Make SCM use container protocol
[ https://issues.apache.org/jira/browse/HDFS-11184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868753#comment-15868753 ] Chen Liang commented on HDFS-11184: --- v005 patch LGTM, +1 > Ozone: SCM: Make SCM use container protocol > --- > > Key: HDFS-11184 > URL: https://issues.apache.org/jira/browse/HDFS-11184 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Anu Engineer > Fix For: HDFS-7240 > > Attachments: HDFS-11184-HDFS-7240.001.patch, > HDFS-11184-HDFS-7240.002.patch, HDFS-11184-HDFS-7240.003.patch, > HDFS-11184-HDFS-7240.004.patch, HDFS-11184-HDFS-7240.005.patch > > > SCM will start using container protocol to communicate with datanodes. > This change introduces some test failures due to some missing features which > will be moved to KSM. Will file separate JIRA to track disabled ozone tests. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11419) BlockPlacementPolicyDefault is choosing datanode in an inefficient way
Chen Liang created HDFS-11419: - Summary: BlockPlacementPolicyDefault is choosing datanode in an inefficient way Key: HDFS-11419 URL: https://issues.apache.org/jira/browse/HDFS-11419 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Reporter: Chen Liang Currently in {{BlockPlacementPolicyDefault}}, {{chooseTarget}} ends up calling into {{chooseRandom}}, which first finds a random datanode by calling {code}DatanodeDescriptor chosenNode = chooseDataNode(scope, excludedNodes);{code}, then checks whether that returned datanode satisfies the storage type requirement {code}storage = chooseStorage4Block( chosenNode, blocksize, results, entry.getKey());{code} If yes, {{numOfReplicas--;}}; otherwise, the node is added to the excluded nodes, and the loop runs again until {{numOfReplicas}} is down to 0. A problem here is that the storage type is not considered until after a random node has already been returned. We've seen a case where a cluster has a large number of datanodes, while only a few satisfy the storage type condition. So, for the most part, this code blindly picks random datanodes that do not satisfy the storage type, adds each of them to the excluded set, and tries again and again. To make matters worse, the way {{NetworkTopology#chooseRandom}} works is that, given a set of excluded nodes, it first finds a random datanode, then, if that node is in the excluded set, tries to find another random node. So the more excluded nodes there are, the more likely a random node will be in the excluded set, in which case we basically waste one iteration. Therefore, this JIRA proposes to augment/modify the relevant classes so that datanodes can be found more efficiently. There are currently two different high-level solutions we are considering: 1. add some field to the Node base types to describe the storage type info, and when searching for a node, take such field(s) into account, and do not return a node that does not meet the storage type requirement. 2.
change the {{NetworkTopology}} class to be aware of storage types: for each storage type, there is one tree subset that connects all the nodes with that type, and a search happens on only one such subset, so unexpected storage types are simply not in the search space. Thanks [~szetszwo] for the offline discussion, and any comments are more than welcome. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
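To get a rough feel for the inefficiency described above, here is a back-of-the-envelope sketch (assumed node counts, not Hadoop code). Because rejected nodes go into the excluded set, the loop effectively samples datanodes without replacement, and when only k of n nodes carry the required storage type, the expected number of distinct candidates examined before the first hit is the classic first-success position, (n + 1) / (k + 1):

```java
// Toy cost model for the retry-with-excluded-nodes pattern described in
// HDFS-11419; the cluster sizes below are illustrative assumptions.
public class ChooseRandomCost {

    // Expected number of distinct candidate nodes examined before the first
    // qualifying node is found, when k of n nodes satisfy the storage type
    // and rejected candidates are excluded (sampling without replacement):
    // E = (n + 1) / (k + 1).
    static double expectedCandidatesExamined(int n, int k) {
        return (n + 1) / (double) (k + 1);
    }

    public static void main(String[] args) {
        // Hypothetical cluster: 1000 datanodes, only 10 with the wanted type.
        System.out.println(expectedCandidatesExamined(1000, 10)); // prints 91.0
    }
}
```

With 1000 datanodes of which only 10 qualify, roughly 91 candidates are examined on average per placement, and that is before counting the extra draws {{chooseRandom}} wastes re-hitting nodes already in the excluded set.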
[jira] [Commented] (HDFS-11375) Display the volume storage type in datanode UI
[ https://issues.apache.org/jira/browse/HDFS-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868573#comment-15868573 ] Hudson commented on HDFS-11375: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11259 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11259/]) HDFS-11375. Display the volume storage type in datanode UI. Contributed (liuml07: rev 0741dd3b9abdeb65bb783c1a8b01f078c4bdba17) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/datanode/datanode.html > Display the volume storage type in datanode UI > -- > > Key: HDFS-11375 > URL: https://issues.apache.org/jira/browse/HDFS-11375 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, ui >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: DN_UI_Aftrerfix.png, HDFS-11375.01.patch, > HDFS-11375.02.patch > > > Volume storage info is useful for debugging the issue related to policy... -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11375) Display the volume storage type in datanode UI
[ https://issues.apache.org/jira/browse/HDFS-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mingliang Liu updated HDFS-11375: - Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha3 2.9.0 Status: Resolved (was: Patch Available) Committed to {{trunk}} and {{branch-2}} branches. Thanks for your contribution [~surendrasingh]. Thanks for your review [~brahmareddy] and [~hanishakoneru]. > Display the volume storage type in datanode UI > -- > > Key: HDFS-11375 > URL: https://issues.apache.org/jira/browse/HDFS-11375 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, ui >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Minor > Fix For: 2.9.0, 3.0.0-alpha3 > > Attachments: DN_UI_Aftrerfix.png, HDFS-11375.01.patch, > HDFS-11375.02.patch > > > Volume storage info is useful for debugging the issue related to policy... -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11418) HttpFS should support old SSL clients
[ https://issues.apache.org/jira/browse/HDFS-11418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868527#comment-15868527 ] Hadoop QA commented on HDFS-11418: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 51s{color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} shelldocs {color} | {color:blue} 0m 13s{color} | {color:blue} Shelldocs was not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 58s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} branch-2 passed with JDK v1.8.0_121 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 20s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 45s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} branch-2 passed with JDK v1.8.0_121 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s{color} | {color:green} branch-2 passed with JDK v1.7.0_121 {color} | | 
{color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s{color} | {color:green} the patch passed with JDK v1.8.0_121 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 17s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} shellcheck {color} | {color:red} 0m 7s{color} | {color:red} The patch generated 18 new + 518 unchanged - 0 fixed = 536 total (was 518) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed with JDK v1.8.0_121 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} the patch passed with JDK v1.7.0_121 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 26s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed with JDK v1.7.0_121. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 34m 36s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:b59b8b7 | | JIRA Issue | HDFS-11418 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852895/HDFS-11418.branch-2.001.patch | | Optional Tests | asflicense mvnsite unit shellcheck shelldocs compile javac javadoc mvninstall xml | | uname | Linux 63d601be64cb 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 323782b | | Default Java | 1.7.0_121 | | Multi-JDK versions | /usr/lib/jvm/java-8-oracle:1.8.0_121
[jira] [Commented] (HDFS-11225) NameNode crashed because deleteSnapshot held FSNamesystem lock too long
[ https://issues.apache.org/jira/browse/HDFS-11225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868510#comment-15868510 ] Wei-Chiu Chuang commented on HDFS-11225: Actually, because INodeDirectory.cleanSubtree is invoked in a few other places, (e.g. FSDirectory.unprotectedDelete, FSDirectory.unprotectedRenameTo), delete and rename operations may also suffer from the same bug. > NameNode crashed because deleteSnapshot held FSNamesystem lock too long > --- > > Key: HDFS-11225 > URL: https://issues.apache.org/jira/browse/HDFS-11225 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.4.0 > Environment: CDH5.8.2, HA >Reporter: Wei-Chiu Chuang >Assignee: Manoj Govindassamy >Priority: Critical > Labels: high-availability > > The deleteSnapshot operation is synchronous. In certain situations this > operation may hold FSNamesystem lock for too long, bringing almost every > NameNode operation to a halt. > We have observed one incidence where it took so long that ZKFC believes the > NameNode is down. All other IPC threads were waiting to acquire FSNamesystem > lock. This specific deleteSnapshot took ~70 seconds. ZKFC has connection > timeout of 45 seconds by default, and if all IPC threads wait for > FSNamesystem lock and can't accept new incoming connection, ZKFC times out, > advances epoch and NameNode will therefore lose its active NN role and then > fail. 
> Relevant log: > {noformat} > Thread 154 (IPC Server handler 86 on 8020): > State: RUNNABLE > Blocked count: 2753455 > Waited count: 89201773 > Stack: > > org.apache.hadoop.hdfs.server.namenode.INode$BlocksMapUpdateInfo.addDeleteBlock(INode.java:879) > > org.apache.hadoop.hdfs.server.namenode.INodeFile.destroyAndCollectBlocks(INodeFile.java:508) > > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763) > > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763) > > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763) > > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.destroyAndCollectBlocks(INodeDirectory.java:763) > > org.apache.hadoop.hdfs.server.namenode.INodeReference.destroyAndCollectBlocks(INodeReference.java:339) > > org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.destroyAndCollectBlocks(INodeReference.java:606) > > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.destroyDeletedList(DirectoryWithSnapshotFeature.java:119) > > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$ChildrenDiff.access$400(DirectoryWithSnapshotFeature.java:61) > > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:319) > > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff.destroyDiffAndCollectBlocks(DirectoryWithSnapshotFeature.java:167) > > org.apache.hadoop.hdfs.server.namenode.snapshot.AbstractINodeDiffList.deleteSnapshotDiff(AbstractINodeDiffList.java:83) > > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:745) > > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776) > > 
org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747) > > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:747) > > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:776) > > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:747) > > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:789) > {noformat} > After the ZKFC determined NameNode was down and advanced epoch, the NN > finished deleting snapshot, and sent the edit to journal nodes, but it was > rejected because epoch was updated. See the following stacktrace: > {noformat} > 10.0.16.21:8485: IPC's epoch 17 is less than the last promised epoch 18 > at > org.apache.hadoop.hdfs.qjournal.server.Journal.checkRequest(Journal.java:429) > at > org.apache.hadoop.hdfs.qjournal.server.Journal.checkWriteRequest(Journal.java:457) > at > org.apache.hadoop.hdfs.qjournal.server.Journal.journal(Journal.java:352) > at >
[jira] [Updated] (HDFS-11418) HttpFS should support old SSL clients
[ https://issues.apache.org/jira/browse/HDFS-11418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-11418: -- Attachment: HDFS-11418.branch-2.001.patch Patch branch-2.001 * Add env HTTPFS_SSL_CIPHERS, default to a list of selected ciphers * Configure Tomcat to accept a list of ciphers TODO * Discuss Allen's idea of strong security by default Testing done * hadoop-hdfs-httpfs unit tests * Verify HTTPFS_SSL_CIPHERS value on stdout during httpfs startup * Run https://github.com/jzhuge/hadoop-bats-tests/blob/master/httpfs.bats in insecure, SSL, and SSL+Kerberos single node setup * sslscan results should include only the listed ciphers * On CentOS 6.6, run the following curl command. Expect {{NSS error -12286}} without the fix. {noformat} curl -v -k --negotiate -u: -sS 'https://HTTPFS_HOST:14000/webhdfs/v1/?op=liststatus' {noformat} > HttpFS should support old SSL clients > - > > Key: HDFS-11418 > URL: https://issues.apache.org/jira/browse/HDFS-11418 > Project: Hadoop HDFS > Issue Type: Improvement > Components: httpfs >Affects Versions: 2.8.0, 2.7.4, 2.6.6 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Attachments: HDFS-11418.branch-2.001.patch > > > HADOOP-13812 upgraded Tomcat to 6.0.48 which filters weak ciphers. Old SSL > clients such as curl stop working. The symptom is {{NSS error -12286}} when > running {{curl -v}}. > Instead of forcing the SSL clients to upgrade, we can configure Tomcat to > explicitly allow enough weak ciphers so that old SSL clients can work. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-11418) HttpFS should support old SSL clients
[ https://issues.apache.org/jira/browse/HDFS-11418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Zhuge updated HDFS-11418: -- Status: Patch Available (was: Open) > HttpFS should support old SSL clients > - > > Key: HDFS-11418 > URL: https://issues.apache.org/jira/browse/HDFS-11418 > Project: Hadoop HDFS > Issue Type: Improvement > Components: httpfs >Affects Versions: 2.8.0, 2.7.4, 2.6.6 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Minor > Attachments: HDFS-11418.branch-2.001.patch > > > HADOOP-13812 upgraded Tomcat to 6.0.48 which filters weak ciphers. Old SSL > clients such as curl stop working. The symptom is {{NSS error -12286}} when > running {{curl -v}}. > Instead of forcing the SSL clients to upgrade, we can configure Tomcat to > explicitly allow enough weak ciphers so that old SSL clients can work. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-10899) Add functionality to re-encrypt EDEKs.
[ https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868472#comment-15868472 ] Hadoop QA commented on HDFS-10899: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 11 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 6s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 12s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 41s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 19s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 6s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 22s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 3s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}114m 23s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}150m 3s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency | | | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | HDFS-10899 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852861/HDFS-10899.09.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml | | uname | Linux 9223997301ec 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
[jira] [Commented] (HDFS-10899) Add functionality to re-encrypt EDEKs.
[ https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868464#comment-15868464 ] Hadoop QA commented on HDFS-10899: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 6s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 1s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 11 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 28s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 50s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 
35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 40s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 56s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 28s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}138m 24s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | HDFS-10899 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852861/HDFS-10899.09.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml | | uname | Linux 0d52057c4281 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0fc6f38 | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | unit |
[jira] [Commented] (HDFS-11375) Display the volume storage type in datanode UI
[ https://issues.apache.org/jira/browse/HDFS-11375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868453#comment-15868453 ] Mingliang Liu commented on HDFS-11375: -- +1 The failing tests are not related; let's fix checkstyle warnings with existing ones. > Display the volume storage type in datanode UI > -- > > Key: HDFS-11375 > URL: https://issues.apache.org/jira/browse/HDFS-11375 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, ui >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Minor > Attachments: DN_UI_Aftrerfix.png, HDFS-11375.01.patch, > HDFS-11375.02.patch > > > Volume storage info is useful for debugging the issue related to policy... -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11417) Add datanode admin command to get the storage info.
[ https://issues.apache.org/jira/browse/HDFS-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868417#comment-15868417 ] Kihwal Lee commented on HDFS-11417: --- Sorry, it is actually you who are working on both issues! > Add datanode admin command to get the storage info. > --- > > Key: HDFS-11417 > URL: https://issues.apache.org/jira/browse/HDFS-11417 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 2.7.3 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > > It is good to add one admin command for datanode to get the data directory > info like storage type, directory path, number of block, capacity, used > space. This will be help full in large cluster where DN has multiple data > directory configured. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11417) Add datanode admin command to get the storage info.
[ https://issues.apache.org/jira/browse/HDFS-11417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868413#comment-15868413 ] Kihwal Lee commented on HDFS-11417: --- More info was added to the datanode UI. HDFS-11375 is improving it. You might want to give them input if something you think is important is missing. > Add datanode admin command to get the storage info. > --- > > Key: HDFS-11417 > URL: https://issues.apache.org/jira/browse/HDFS-11417 > Project: Hadoop HDFS > Issue Type: New Feature >Affects Versions: 2.7.3 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore > > It is good to add one admin command for datanode to get the data directory > info like storage type, directory path, number of block, capacity, used > space. This will be help full in large cluster where DN has multiple data > directory configured. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size
[ https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868412#comment-15868412 ] Jing Zhao commented on HDFS-8498: - [~jojochuang], currently I do not plan to backport this change to branch 2.x. But please feel free to do it if you think it's necessary and I will be happy to review. > Blocks can be committed with wrong size > --- > > Key: HDFS-8498 > URL: https://issues.apache.org/jira/browse/HDFS-8498 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.5.0 >Reporter: Daryn Sharp >Assignee: Jing Zhao >Priority: Critical > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch > > > When an IBR for a UC block arrives, the NN updates the expected location's > block and replica state _only_ if it's on an unexpected storage for an > expected DN. If it's for an expected storage, only the genstamp is updated. > When the block is committed, and the expected locations are verified, only > the genstamp is checked. The size is not checked but it wasn't updated in > the expected locations anyway. > A faulty client may misreport the size when committing the block. The block > is effectively corrupted. If the NN issues replications, the received IBR is > considered corrupt, the NN invalidates the block, immediately issues another > replication. The NN eventually realizes all the original replicas are > corrupt after full BRs are received from the original DNs. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
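The invariant the description is after can be illustrated with a small sketch (hypothetical code, not the actual patch, which fixed the length bookkeeping in the client's DataStreamer): a commit should not trust the client's claimed length when the replica lengths reported by the DataNodes disagree with it.

```java
// Sketch: validate a client's committed block length against the lengths the
// DataNodes reported in incremental block reports, instead of trusting the
// client blindly. A faulty client misreporting the size would be rejected
// here rather than corrupting the block.
public class CommitCheck {
    public static boolean commitBlock(long clientReportedLen, long[] replicaReportedLens) {
        for (long replicaLen : replicaReportedLens) {
            if (replicaLen != clientReportedLen) {
                return false; // reject the commit: sizes disagree
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(commitBlock(1024, new long[]{1024, 1024, 1024})); // true
        System.out.println(commitBlock(512,  new long[]{1024, 1024, 1024})); // false
    }
}
```

Without such a check, the scenario in the description unfolds: the NN accepts the bad size, later IBRs look corrupt, and the NN invalidates good replicas one by one.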
[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size
[ https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868394#comment-15868394 ] Hudson commented on HDFS-8498: -- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11258 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11258/]) HDFS-8498. Blocks can be committed with wrong size. Contributed by Jing (jing9: rev 627da6f7178e18aa41996969c408b6f344e297d1) * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/StripedDataStreamer.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java * (edit) hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java > Blocks can be committed with wrong size > --- > > Key: HDFS-8498 > URL: https://issues.apache.org/jira/browse/HDFS-8498 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.5.0 >Reporter: Daryn Sharp >Assignee: Jing Zhao >Priority: Critical > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch > > > When an IBR for a UC block arrives, the NN updates the expected location's > block and replica state _only_ if it's on an unexpected storage for an > expected DN. If it's for an expected storage, only the genstamp is updated. > When the block is committed, and the expected locations are verified, only > the genstamp is checked. The size is not checked but it wasn't updated in > the expected locations anyway. > A faulty client may misreport the size when committing the block. The block > is effectively corrupted. If the NN issues replications, the received IBR is > considered corrupt, the NN invalidates the block, immediately issues another > replication. The NN eventually realizes all the original replicas are > corrupt after full BRs are received from the original DNs. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size
[ https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868354#comment-15868354 ] Wei-Chiu Chuang commented on HDFS-8498: --- [~jingzhao] very nice work! Do you plan to cherry pick the fix into 2.x branches? Thanks! > Blocks can be committed with wrong size > --- > > Key: HDFS-8498 > URL: https://issues.apache.org/jira/browse/HDFS-8498 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.5.0 >Reporter: Daryn Sharp >Assignee: Jing Zhao >Priority: Critical > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch > > > When an IBR for a UC block arrives, the NN updates the expected location's > block and replica state _only_ if it's on an unexpected storage for an > expected DN. If it's for an expected storage, only the genstamp is updated. > When the block is committed, and the expected locations are verified, only > the genstamp is checked. The size is not checked but it wasn't updated in > the expected locations anyway. > A faulty client may misreport the size when committing the block. The block > is effectively corrupted. If the NN issues replications, the received IBR is > considered corrupt, the NN invalidates the block, immediately issues another > replication. The NN eventually realizes all the original replicas are > corrupt after full BRs are received from the original DNs. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size
[ https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868333#comment-15868333 ] Jing Zhao commented on HDFS-8498: - I've committed the patch into trunk. > Blocks can be committed with wrong size > --- > > Key: HDFS-8498 > URL: https://issues.apache.org/jira/browse/HDFS-8498 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.5.0 >Reporter: Daryn Sharp >Assignee: Jing Zhao >Priority: Critical > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch > > > When an IBR for a UC block arrives, the NN updates the expected location's > block and replica state _only_ if it's on an unexpected storage for an > expected DN. If it's for an expected storage, only the genstamp is updated. > When the block is committed, and the expected locations are verified, only > the genstamp is checked. The size is not checked but it wasn't updated in > the expected locations anyway. > A faulty client may misreport the size when committing the block. The block > is effectively corrupted. If the NN issues replications, the received IBR is > considered corrupt, the NN invalidates the block, immediately issues another > replication. The NN eventually realizes all the original replicas are > corrupt after full BRs are received from the original DNs. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-8498) Blocks can be committed with wrong size
[ https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8498: Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0-alpha3 Status: Resolved (was: Patch Available) Thanks for the review, [~jnp]! I will commit the patch shortly. > Blocks can be committed with wrong size > --- > > Key: HDFS-8498 > URL: https://issues.apache.org/jira/browse/HDFS-8498 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 2.5.0 >Reporter: Daryn Sharp >Assignee: Jing Zhao >Priority: Critical > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-8498.000.patch, HDFS-8498.001.patch > > > When an IBR for a UC block arrives, the NN updates the expected location's > block and replica state _only_ if it's on an unexpected storage for an > expected DN. If it's for an expected storage, only the genstamp is updated. > When the block is committed, and the expected locations are verified, only > the genstamp is checked. The size is not checked but it wasn't updated in > the expected locations anyway. > A faulty client may misreport the size when committing the block. The block > is effectively corrupted. If the NN issues replications, the received IBR is > considered corrupt, the NN invalidates the block, immediately issues another > replication. The NN eventually realizes all the original replicas are > corrupt after full BRs are received from the original DNs. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11418) HttpFS should support old SSL clients
John Zhuge created HDFS-11418: - Summary: HttpFS should support old SSL clients Key: HDFS-11418 URL: https://issues.apache.org/jira/browse/HDFS-11418 Project: Hadoop HDFS Issue Type: Improvement Components: httpfs Affects Versions: 2.8.0, 2.7.4, 2.6.6 Reporter: John Zhuge Assignee: John Zhuge Priority: Minor HADOOP-13812 upgraded Tomcat to 6.0.48 which filters weak ciphers. Old SSL clients such as curl stop working. The symptom is {{NSS error -12286}} when running {{curl -v}}. Instead of forcing the SSL clients to upgrade, we can configure Tomcat to explicitly allow enough weak ciphers so that old SSL clients can work. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-11417) Add datanode admin command to get the storage info.
Surendra Singh Lilhore created HDFS-11417: - Summary: Add datanode admin command to get the storage info. Key: HDFS-11417 URL: https://issues.apache.org/jira/browse/HDFS-11417 Project: Hadoop HDFS Issue Type: New Feature Affects Versions: 2.7.3 Reporter: Surendra Singh Lilhore Assignee: Surendra Singh Lilhore It is good to add one admin command for datanode to get the data directory info like storage type, directory path, number of blocks, capacity, and used space. This will be helpful in large clusters where the DN has multiple data directories configured. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.
[ https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-10899: - Attachment: (was: HDFS-10899.09.patch) > Add functionality to re-encrypt EDEKs. > -- > > Key: HDFS-10899 > URL: https://issues.apache.org/jira/browse/HDFS-10899 > Project: Hadoop HDFS > Issue Type: New Feature > Components: encryption, kms >Reporter: Xiao Chen >Assignee: Xiao Chen > Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, > HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, > HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, > HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt > edek design doc.pdf > > > Currently when an encryption zone (EZ) key is rotated, it only takes effect > on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key > rotation, for improved security. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.
[ https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-10899: - Attachment: HDFS-10899.09.patch > Add functionality to re-encrypt EDEKs. > -- > > Key: HDFS-10899 > URL: https://issues.apache.org/jira/browse/HDFS-10899 > Project: Hadoop HDFS > Issue Type: New Feature > Components: encryption, kms >Reporter: Xiao Chen >Assignee: Xiao Chen > Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, > HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, > HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, > HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt > edek design doc.pdf > > > Currently when an encryption zone (EZ) key is rotated, it only takes effect > on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key > rotation, for improved security. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.
[ https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-10899: - Attachment: (was: HDFS-10899.09.patch) > Add functionality to re-encrypt EDEKs. > -- > > Key: HDFS-10899 > URL: https://issues.apache.org/jira/browse/HDFS-10899 > Project: Hadoop HDFS > Issue Type: New Feature > Components: encryption, kms >Reporter: Xiao Chen >Assignee: Xiao Chen > Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, > HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, > HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, > HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt > edek design doc.pdf > > > Currently when an encryption zone (EZ) key is rotated, it only takes effect > on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key > rotation, for improved security. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.
[ https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-10899: - Attachment: HDFS-10899.09.patch > Add functionality to re-encrypt EDEKs. > -- > > Key: HDFS-10899 > URL: https://issues.apache.org/jira/browse/HDFS-10899 > Project: Hadoop HDFS > Issue Type: New Feature > Components: encryption, kms >Reporter: Xiao Chen >Assignee: Xiao Chen > Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, > HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, > HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, > HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt > edek design doc.pdf > > > Currently when an encryption zone (EZ) key is rotated, it only takes effect > on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key > rotation, for improved security. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
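Conceptually, re-encrypting an EDEK means unwrapping it with the old encryption-zone key version to recover the DEK, then wrapping the same DEK under the new key version; the file data encrypted with the DEK is never touched. A self-contained sketch of that idea using plain javax.crypto (illustration only, not the HDFS/KMS API, and key-version bookkeeping is omitted):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.security.SecureRandom;
import java.util.Arrays;

// Re-encrypt an EDEK: decrypt with the old EZ key to get the DEK, then
// encrypt the unchanged DEK under the new EZ key with a fresh IV.
public class ReencryptSketch {
    public static byte[] wrap(SecretKey ezKey, byte[] iv, byte[] dek) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, ezKey, new GCMParameterSpec(128, iv));
        return c.doFinal(dek);
    }

    public static byte[] unwrap(SecretKey ezKey, byte[] iv, byte[] edek) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, ezKey, new GCMParameterSpec(128, iv));
        return c.doFinal(edek);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey oldEzKey = kg.generateKey();
        SecretKey newEzKey = kg.generateKey();   // EZ key rotation produced this
        SecureRandom rnd = new SecureRandom();
        byte[] dek = new byte[16];
        byte[] iv = new byte[12];
        rnd.nextBytes(dek);
        rnd.nextBytes(iv);

        byte[] edek = wrap(oldEzKey, iv, dek);          // EDEK stored with the file
        // --- re-encryption step ---
        byte[] recoveredDek = unwrap(oldEzKey, iv, edek);
        byte[] newIv = new byte[12];
        rnd.nextBytes(newIv);
        byte[] newEdek = wrap(newEzKey, newIv, recoveredDek);

        System.out.println(Arrays.equals(recoveredDek, dek)); // DEK unchanged
        System.out.println(newEdek.length > 0);               // new EDEK under new key
    }
}
```

The security win stated in the description follows: after rotation plus re-encryption, a compromise of the old EZ key version no longer exposes the stored EDEKs.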
[jira] [Commented] (HDFS-6301) NameNode: persist XAttrs in fsimage and record XAttrs modifications to edit log.
[ https://issues.apache.org/jira/browse/HDFS-6301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868107#comment-15868107 ] Xiao Chen commented on HDFS-6301: - FYI - created HDFS-11410 for the above and Andrew +1'ed, plan to commit today. > NameNode: persist XAttrs in fsimage and record XAttrs modifications to edit > log. > > > Key: HDFS-6301 > URL: https://issues.apache.org/jira/browse/HDFS-6301 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: namenode >Affects Versions: HDFS XAttrs (HDFS-2006) >Reporter: Yi Liu >Assignee: Yi Liu > Fix For: HDFS XAttrs (HDFS-2006) > > Attachments: HDFS-6301.1.patch, HDFS-6301.patch > > > Store XAttrs in fsimage so that XAttrs are retained across NameNode restarts. > Implement a new edit log opcode, {{OP_SET_XATTRS}}. -- This message was sent by Atlassian JIRA (v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-11100) Recursively deleting file protected by sticky bit should fail
[ https://issues.apache.org/jira/browse/HDFS-11100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867968#comment-15867968 ] Wei-Chiu Chuang commented on HDFS-11100: I will commit the 005 patch to trunk by end of day if there are no objections. > Recursively deleting file protected by sticky bit should fail > - > > Key: HDFS-11100 > URL: https://issues.apache.org/jira/browse/HDFS-11100 > Project: Hadoop HDFS > Issue Type: Bug > Components: fs >Affects Versions: 2.6.0 >Reporter: John Zhuge >Assignee: John Zhuge >Priority: Critical > Labels: permissions > Attachments: HDFS-11100.001.patch, HDFS-11100.002.patch, > HDFS-11100.003.patch, HDFS-11100.004.patch, HDFS-11100.005.patch, hdfs_cmds > > > Recursively deleting a directory that contains files or directories protected > by sticky bit should fail but it doesn't in HDFS. In the case below, > {{/tmp/test/sticky_dir/f2}} is protected by sticky bit, thus recursive > deleting {{/tmp/test/sticky_dir}} should fail. > {noformat} > + hdfs dfs -ls -R /tmp/test > drwxrwxrwt - jzhuge supergroup 0 2016-11-03 18:08 > /tmp/test/sticky_dir > -rwxrwxrwx 1 jzhuge supergroup 0 2016-11-03 18:08 > /tmp/test/sticky_dir/f2 > + sudo -u hadoop hdfs dfs -rm -skipTrash /tmp/test/sticky_dir/f2 > rm: Permission denied by sticky bit: user=hadoop, > path="/tmp/test/sticky_dir/f2":jzhuge:supergroup:-rwxrwxrwx, > parent="/tmp/test/sticky_dir":jzhuge:supergroup:drwxrwxrwt > + sudo -u hadoop hdfs dfs -rm -r -skipTrash /tmp/test/sticky_dir > Deleted /tmp/test/sticky_dir > {noformat} > Centos 6.4 behavior: > {noformat} > $ ls -lR /tmp/test > /tmp/test: > total 4 > drwxrwxrwt 2 systest systest 4096 Nov 3 18:36 sbit > /tmp/test/sbit: > total 0 > -rw-rw-rw- 1 systest systest 0 Nov 2 13:45 f2 > $ sudo -u mapred rm -fr /tmp/test/sbit > rm: cannot remove `/tmp/test/sbit/f2': Operation not permitted > $ chmod -t /tmp/test/sbit > $ sudo -u mapred rm -fr /tmp/test/sbit > {noformat} -- This message was sent by Atlassian JIRA 
(v6.3.15#6346) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
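The expected semantics above follow the POSIX sticky-bit rule: when the parent directory has the sticky bit set, only the superuser, the owner of the parent directory, or the owner of the entry itself may delete that entry, and a recursive delete should fail as soon as any entry violates the rule. A minimal sketch of that check (hypothetical names, not the actual HDFS-11100 patch code):

```java
// Hypothetical sketch (not Hadoop code): the POSIX sticky-bit rule that the
// report above expects HDFS to enforce on every entry of a recursive delete.
public class StickyBitCheck {

    /**
     * A user may delete an entry inside a sticky directory only if they are
     * the superuser, the owner of the parent directory, or the owner of the
     * entry itself.
     */
    public static boolean canDelete(String user, boolean parentSticky,
                                    String parentOwner, String entryOwner,
                                    boolean isSuperuser) {
        if (!parentSticky || isSuperuser) {
            return true; // the rule only applies under a sticky parent
        }
        return user.equals(parentOwner) || user.equals(entryOwner);
    }

    public static void main(String[] args) {
        // Mirrors the report: user "hadoop" deleting jzhuge's file under a
        // sticky directory owned by jzhuge must be refused.
        System.out.println(canDelete("hadoop", true, "jzhuge", "jzhuge", false)); // false
        System.out.println(canDelete("jzhuge", true, "jzhuge", "jzhuge", false)); // true
    }
}
```

A recursive delete then simply applies this check to every descendant before removing anything, which is what the CentOS `rm -fr` transcript above demonstrates.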
[jira] [Commented] (HDFS-11333) Print a user friendly error message when plugins are not found
[ https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867677#comment-15867677 ] Hudson commented on HDFS-11333: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11256 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11256/]) HDFS-11333. Print a user friendly error message when plugins are not (weichiu: rev 859bd159ae554174200334b5eb1d7e8dbef958ad) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java > Print a user friendly error message when plugins are not found > -- > > Key: HDFS-11333 > URL: https://issues.apache.org/jira/browse/HDFS-11333 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 0.21.0 >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Minor > Labels: supportability > Fix For: 2.9.0, 2.7.4, 3.0.0-alpha3, 2.8.1 > > Attachments: HDFS-11333.001.patch, HDFS-11333.002.patch, > HDFS-11333.002.patch, HDFS-11333.003.patch > > > If NameNode is unable to find plugins (specified in dfs.namenode.plugins), it > terminates abruptly with the following stack trace: > {quote} > Failed to start namenode. 
> java.lang.RuntimeException: java.lang.ClassNotFoundException: Class XXX not > found > at org.apache.hadoop.conf.Configuration.getClasses(Configuration.java:2178) > at > org.apache.hadoop.conf.Configuration.getInstances(Configuration.java:2250) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:713) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:691) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:843) > at org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:822) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1543) > at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1611) > {quote} > We should catch this exception, log a warning message and let it proceed, as > missing the third party library does not affect the functionality of > NameNode. We caught this bug during a CDH upgrade where a third party plugin > was not in the lib directory of the newer version of CDH.
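The fix the description proposes — catch the exception, log a warning naming the offending configuration key, and continue — can be sketched as follows. This is a hypothetical standalone class, not the committed HDFS-11333 patch (which wires the handling into NameNode/DataNode startup):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposal above: instead of letting a
// ClassNotFoundException abort startup, catch it, print a friendly warning
// pointing at dfs.namenode.plugins, and proceed without the missing plugin.
public class PluginLoader {

    public static List<Object> loadPlugins(List<String> classNames) {
        List<Object> plugins = new ArrayList<>();
        for (String name : classNames) {
            try {
                plugins.add(Class.forName(name)
                        .getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                // The real code would use the service's logger; startup
                // continues without the plugin that failed to load.
                System.err.println("Unable to load plugin " + name
                        + " (check dfs.namenode.plugins): " + e);
            }
        }
        return plugins;
    }
}
```

With this shape, a misconfigured or missing third-party jar costs one warning line instead of a NameNode that refuses to start.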
[jira] [Updated] (HDFS-11333) Print a user friendly error message when plugins are not found
[ https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-11333: --- Resolution: Fixed Fix Version/s: 2.8.1 3.0.0-alpha3 2.7.4 2.9.0 Status: Resolved (was: Patch Available) Committed the patch to trunk, branch-2, branch-2.8 and branch-2.7. Thanks very much for the review from [~linyiqun] and comments from [~aw]!
[jira] [Updated] (HDFS-11333) Print a user friendly error message when plugins are not found
[ https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-11333: --- Priority: Minor (was: Major)
[jira] [Updated] (HDFS-11333) Print a user friendly error message when plugins are not found
[ https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-11333: --- Issue Type: Improvement (was: Bug)
[jira] [Updated] (HDFS-11333) Print a user friendly error message when plugins are not found
[ https://issues.apache.org/jira/browse/HDFS-11333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-11333: --- Summary: Print a user friendly error message when plugins are not found (was: Namenode unable to start if plugins can not be found)
[jira] [Commented] (HDFS-6804) race condition between transferring block and appending block causes "Unexpected checksum mismatch exception"
[ https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867561#comment-15867561 ] Brahma Reddy Battula commented on HDFS-6804: [~jojochuang] As you mentioned, the issue was fixed by HDFS-11160. I have updated the test case for this particular scenario, which is a very narrow race; I verified it by reverting HDFS-11160 and HDFS-11056. > race condition between transferring block and appending block causes > "Unexpected checksum mismatch exception" > -- > > Key: HDFS-6804 > URL: https://issues.apache.org/jira/browse/HDFS-6804 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.2.0 >Reporter: Gordon Wang >Assignee: Brahma Reddy Battula > Attachments: Testcase_append_transfer_block.patch > > > We found some error log in the datanode. like this > {noformat} > 2014-07-22 01:49:51,338 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Ex > ception for BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 > java.io.IOException: Terminating due to a checksum error.java.io.IOException: > Unexpected checksum mismatch while writing > BP-2072804351-192.168.2.104-1406008383435:blk_1073741997_9248 from > /192.168.2.101:39495 > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:536) > at > org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:703) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:575) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:115) > at > org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:68) > at > org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:221) > at java.lang.Thread.run(Thread.java:744) > {noformat} > While on the source datanode, the log says the block is transmitted.
> {noformat} > 2014-07-22 01:49:50,805 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Da > taTransfer: Transmitted > BP-2072804351-192.168.2.104-1406008383435:blk_1073741997 > _9248 (numBytes=16188152) to /192.168.2.103:50010 > {noformat} > When the destination datanode gets the checksum mismatch, it reports a bad > block to the NameNode, and the NameNode marks the replica on the source datanode as > corrupt. But the replica on the source datanode is actually valid, because > it can pass checksum verification. > In all, the replica on the source datanode is wrongly marked as corrupted.
[jira] [Commented] (HDFS-10899) Add functionality to re-encrypt EDEKs.
[ https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867528#comment-15867528 ] Hadoop QA commented on HDFS-10899: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s{color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 11 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 8s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 
24s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 43s{color} | {color:red} hadoop-hdfs-project in the patch failed. {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 0m 43s{color} | {color:red} hadoop-hdfs-project in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 43s{color} | {color:red} hadoop-hdfs-project in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 37s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 4s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 37s{color} | {color:red} hadoop-hdfs-project_hadoop-hdfs generated 2 new + 7 unchanged - 0 fixed = 9 total (was 7) {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 54s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 25s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 30m 52s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Image:yetus/hadoop:a9ad5d6 | | JIRA Issue | HDFS-10899 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12852766/HDFS-10899.09.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle cc xml | | uname | Linux 3135be15d837 3.13.0-107-generic #154-Ubuntu SMP Tue Dec 20 09:57:27 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh | | git revision | trunk / cd3e59a | | Default Java | 1.8.0_121 | | findbugs | v3.0.0 | | mvninstall | https://builds.apache.org/job/PreCommit-HDFS-Build/18379/artifact/patchprocess/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt | | compile |
[jira] [Reopened] (HDFS-6804) race condition between transferring block and appending block causes "Unexpected checksum mismatch exception"
[ https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reopened HDFS-6804: --- [~brahmareddy] I haven't run the patch, but if you attach a patch I am happy to review it. Assigning the jira to you, thanks!
[jira] [Assigned] (HDFS-6804) race condition between transferring block and appending block causes "Unexpected checksum mismatch exception"
[ https://issues.apache.org/jira/browse/HDFS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned HDFS-6804: - Assignee: Brahma Reddy Battula (was: Wei-Chiu Chuang)
[jira] [Commented] (HDFS-11391) Numeric usernames do no work with WebHDFS FS (write access)
[ https://issues.apache.org/jira/browse/HDFS-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867501#comment-15867501 ] ASF GitHub Bot commented on HDFS-11391: --- Github user pvillard31 closed the pull request at: https://github.com/apache/hadoop/pull/186 > Numeric usernames do no work with WebHDFS FS (write access) > --- > > Key: HDFS-11391 > URL: https://issues.apache.org/jira/browse/HDFS-11391 > Project: Hadoop HDFS > Issue Type: Bug > Components: webhdfs >Affects Versions: 2.7.3 >Reporter: Pierre Villard >Assignee: Pierre Villard > Fix For: 2.9.0, 3.0.0-alpha3, 2.8.1 > > > In HDFS-4983, a property has been introduced to configure the pattern > validating name of users interacting with WebHDFS because default pattern was > excluding names starting with numbers. > Problem is that this fix works only for read access. In case of write access > against data node, the default pattern is still applied whatever the > configuration is.
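For context, the validation involved can be sketched with plain {{java.util.regex}}. The default pattern below is quoted from memory and should be treated as an assumption, but it shows why a name such as {{12345user}} is rejected until the pattern property (introduced by HDFS-4983) is relaxed:

```java
import java.util.regex.Pattern;

// Sketch of the username validation described above. DEFAULT approximates
// WebHDFS's default pattern (an assumption, quoted from memory); NUMERIC_OK
// is a relaxed variant that also admits names starting with a digit.
public class UserNamePattern {
    static final Pattern DEFAULT =
        Pattern.compile("^[A-Za-z_][A-Za-z0-9._-]*[$]?$");
    static final Pattern NUMERIC_OK =
        Pattern.compile("^[A-Za-z0-9_][A-Za-z0-9._-]*[$]?$");

    public static boolean valid(Pattern p, String user) {
        return p.matcher(user).matches();
    }
}
```

The bug report says the configured pattern is honored on the NameNode read path but the DataNode write path keeps validating with {{DEFAULT}}, so a numeric user passes the first check and then fails the second.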
[jira] [Commented] (HDFS-11391) Numeric usernames do no work with WebHDFS FS (write access)
[ https://issues.apache.org/jira/browse/HDFS-11391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867500#comment-15867500 ] ASF GitHub Bot commented on HDFS-11391: --- Github user pvillard31 commented on the issue: https://github.com/apache/hadoop/pull/186 Committed as part of 8e53f2b9b08560bf4f8e81e697063277dbdc68f9. Closing.
[jira] [Updated] (HDFS-10899) Add functionality to re-encrypt EDEKs.
[ https://issues.apache.org/jira/browse/HDFS-10899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-10899: - Attachment: HDFS-10899.09.patch patch 9: - added a throttler. Note this won't be very accurate, as we do not want frequent lock/unlock, and the throttling has to happen outside the locks after a batch is created. Better approaches welcome. (I wasn't sure whether "follow-on" meant a separate jira is preferable; I can split this out if so, to keep the central patch size sane.) - Should have mentioned in patch 8: I think the code is more readable after the refactor, but it seems we can't lock/unlock within each method. This is similar to {{FSDirEncryptionZoneOp#getEncryptionKeyInfo}}. - Fixed precommit complaints. > Add functionality to re-encrypt EDEKs. > -- > > Key: HDFS-10899 > URL: https://issues.apache.org/jira/browse/HDFS-10899 > Project: Hadoop HDFS > Issue Type: New Feature > Components: encryption, kms >Reporter: Xiao Chen >Assignee: Xiao Chen > Attachments: editsStored, HDFS-10899.01.patch, HDFS-10899.02.patch, > HDFS-10899.03.patch, HDFS-10899.04.patch, HDFS-10899.05.patch, > HDFS-10899.06.patch, HDFS-10899.07.patch, HDFS-10899.08.patch, > HDFS-10899.09.patch, HDFS-10899.wip.2.patch, HDFS-10899.wip.patch, Re-encrypt > edek design doc.pdf > > > Currently when an encryption zone (EZ) key is rotated, it only takes effect > on new EDEKs. We should provide a way to re-encrypt EDEKs after the EZ key > rotation, for improved security.
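The shape described in the comment above — collect a batch while holding the namesystem lock, then do the slow work and the throttling pause after releasing it — can be sketched as follows. Names and structure are hypothetical, not the HDFS-10899 patch itself:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: batch-at-a-time processing where the throttling
// sleep happens *outside* the lock, so readers are never blocked while
// the re-encryption worker pauses between batches.
public class BatchedReencryptor {
    private final ReentrantLock nsLock = new ReentrantLock();
    public final List<String> done = new ArrayList<>(); // processed EDEKs

    public void reencryptAll(Iterator<String> edeks, int batchSize, long pauseMs) {
        while (edeks.hasNext()) {
            List<String> batch = new ArrayList<>();
            nsLock.lock();
            try {
                // Collect the next batch quickly while holding the lock.
                while (edeks.hasNext() && batch.size() < batchSize) {
                    batch.add(edeks.next());
                }
            } finally {
                nsLock.unlock();
            }
            process(batch); // slow work (e.g. contacting the KMS), lock-free
            try {
                Thread.sleep(pauseMs); // throttle after releasing the lock
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    void process(List<String> batch) {
        done.addAll(batch);
    }
}
```

This is why the throttling "won't be very accurate": the pause is applied per batch rather than per item, trading precision for fewer lock acquisitions.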
[jira] [Commented] (HDFS-11238) Fix checkstyle warnings in NameNode#createNameNode
[ https://issues.apache.org/jira/browse/HDFS-11238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867461#comment-15867461 ] Hudson commented on HDFS-11238: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11252 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/11252/]) HDFS-11238. Fix checkstyle warnings in NameNode#createNameNode. (aajisaka: rev 8acb376c9c5f7f52a097be221ed18877a403bece) * (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java > Fix checkstyle warnings in NameNode#createNameNode > -- > > Key: HDFS-11238 > URL: https://issues.apache.org/jira/browse/HDFS-11238 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Trivial > Fix For: 3.0.0-alpha3 > > Attachments: HDFS-11238.001.patch > > > switch and case should be at the same indentation level; avoid nested > blocks; array brackets at an illegal position
[jira] [Commented] (HDFS-11401) Reduce verbosity of logs with favored nodes and block pinning enabled
[ https://issues.apache.org/jira/browse/HDFS-11401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867450#comment-15867450 ] Weiwei Yang commented on HDFS-11401: Hi [~rakeshr], [~umamaheswararao] Please let me know if you have any comments with v2 patch, thank you! > Reduce verbosity of logs with favored nodes and block pinning enabled > - > > Key: HDFS-11401 > URL: https://issues.apache.org/jira/browse/HDFS-11401 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer & mover, datanode >Affects Versions: 2.8.0 >Reporter: Thiruvel Thirumoolan >Assignee: Weiwei Yang >Priority: Minor > Attachments: HDFS-11401.01.patch, HDFS-11401.02.patch, > testBalancerWithPinnedBlocks.output.txt > > > I am working on enabling favored nodes for HBase (HBASE-15531). Was trying > out what happens if favored nodes is used and HDFS balancer is enabled. Ran > the unit test TestBalancer#testBalancerWithPinnedBlocks from branch-2.8. > There were too many exceptions and error messages with this (output attached) > since pinned blocks can't be moved. > Is there any way to reduce this logging since block pinning is intentional? > On a real cluster, this could be too much. HDFS-6133 enabled block pinning.