[jira] [Commented] (HDFS-13032) Make AvailableSpaceBlockPlacementPolicy more adaptive
[ https://issues.apache.org/jira/browse/HDFS-13032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517763#comment-16517763 ]

genericqa commented on HDFS-13032:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| +1 | mvninstall | 26m 48s | trunk passed |
| +1 | compile | 0m 57s | trunk passed |
| +1 | checkstyle | 0m 14s | trunk passed |
| +1 | mvnsite | 1m 0s | trunk passed |
| +1 | shadedclient | 12m 14s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 1m 52s | trunk passed |
| +1 | javadoc | 0m 47s | trunk passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 0m 59s | the patch passed |
| +1 | compile | 0m 52s | the patch passed |
| +1 | javac | 0m 52s | the patch passed |
| +1 | checkstyle | 0m 9s | the patch passed |
| +1 | mvnsite | 0m 57s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 12m 10s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 2s | the patch passed |
| +1 | javadoc | 0m 46s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 82m 11s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 33s | The patch does not generate ASF License warnings. |
| | | 145m 19s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.tools.TestHdfsConfigFields |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-13032 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12907063/HDFS-13032.002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 1bdfd96ae9f2 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 2d87592 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/24478/artifact/out/whitespace-eol.txt |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/24478/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/24478/testReport/ |
| Max. process+thread count | 3312 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Updated] (HDDS-166) Create a landing page for Ozone
[ https://issues.apache.org/jira/browse/HDDS-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elek, Marton updated HDDS-166:
    Status: Patch Available (was: Open)

Uploaded both the source and the rendered version (which also includes the current version of the Ozone docs from the hadoop-ozone/docs project).

It is supposed to be committed to the ozone-site repository.

> Create a landing page for Ozone
> -------------------------------
>
> Key: HDDS-166
> URL: https://issues.apache.org/jira/browse/HDDS-166
> Project: Hadoop Distributed Data Store
> Issue Type: Task
> Components: document
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: ozone-site-rendered.tar.gz, ozone-site-source.tar.gz
>
> As the Ozone release cycle is separated from Hadoop, we need a separate page to
> publish the releases.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-166) Create a landing page for Ozone
[ https://issues.apache.org/jira/browse/HDDS-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elek, Marton updated HDDS-166:
    Attachment: ozone-site-source.tar.gz

> Create a landing page for Ozone
> -------------------------------
>
> Key: HDDS-166
> URL: https://issues.apache.org/jira/browse/HDDS-166
> Project: Hadoop Distributed Data Store
> Issue Type: Task
> Components: document
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: ozone-site-rendered.tar.gz, ozone-site-source.tar.gz
>
> As the Ozone release cycle is separated from Hadoop, we need a separate page to
> publish the releases.
[jira] [Updated] (HDDS-166) Create a landing page for Ozone
[ https://issues.apache.org/jira/browse/HDDS-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elek, Marton updated HDDS-166:
    Attachment: (was: ozone-site.tar.gz)

> Create a landing page for Ozone
> -------------------------------
>
> Key: HDDS-166
> URL: https://issues.apache.org/jira/browse/HDDS-166
> Project: Hadoop Distributed Data Store
> Issue Type: Task
> Components: document
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: ozone-site-rendered.tar.gz
>
> As the Ozone release cycle is separated from Hadoop, we need a separate page to
> publish the releases.
[jira] [Updated] (HDDS-166) Create a landing page for Ozone
[ https://issues.apache.org/jira/browse/HDDS-166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Elek, Marton updated HDDS-166:
    Attachment: ozone-site-rendered.tar.gz

> Create a landing page for Ozone
> -------------------------------
>
> Key: HDDS-166
> URL: https://issues.apache.org/jira/browse/HDDS-166
> Project: Hadoop Distributed Data Store
> Issue Type: Task
> Components: document
> Reporter: Elek, Marton
> Assignee: Elek, Marton
> Priority: Major
> Attachments: ozone-site-rendered.tar.gz, ozone-site.tar.gz
>
> As the Ozone release cycle is separated from Hadoop, we need a separate page to
> publish the releases.
[jira] [Commented] (HDDS-175) Refactor ContainerInfo to remove Pipeline object from it
[ https://issues.apache.org/jira/browse/HDDS-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517712#comment-16517712 ]

genericqa commented on HDDS-175:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 18 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 2m 18s | Maven dependency ordering for branch |
| +1 | mvninstall | 26m 45s | trunk passed |
| +1 | compile | 28m 47s | trunk passed |
| +1 | checkstyle | 0m 19s | trunk passed |
| +1 | mvnsite | 4m 17s | trunk passed |
| +1 | shadedclient | 14m 33s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site hadoop-ozone/integration-test |
| +1 | findbugs | 4m 42s | trunk passed |
| +1 | javadoc | 4m 23s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 55s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 7s | the patch passed |
| +1 | compile | 28m 2s | the patch passed |
| +1 | cc | 28m 2s | the patch passed |
| +1 | javac | 28m 2s | the patch passed |
| +1 | checkstyle | 0m 25s | the patch passed |
| +1 | mvnsite | 4m 44s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 51s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site hadoop-ozone/integration-test |
| -1 | findbugs | 1m 0s | hadoop-hdds/common generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 | findbugs | 0m 44s | hadoop-hdds/server-scm generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) |
| +1 | javadoc | 4m 48s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 21s | hadoop-yarn-site in the patch passed. |
| +1 | unit | 1m 9s | common in the patch passed. |
| +1 | unit | 0m 24s | client in the patch passed. |
| +1 | unit | 2m 23s | server-scm in the patch passed. |
| +1 | unit |
[jira] [Assigned] (HDFS-13032) Make AvailableSpaceBlockPlacementPolicy more adaptive
[ https://issues.apache.org/jira/browse/HDFS-13032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie reassigned HDFS-13032:
    Assignee: Tao Jie

> Make AvailableSpaceBlockPlacementPolicy more adaptive
> -----------------------------------------------------
>
> Key: HDFS-13032
> URL: https://issues.apache.org/jira/browse/HDFS-13032
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 2.8.2
> Reporter: Tao Jie
> Assignee: Tao Jie
> Priority: Major
> Attachments: HDFS-13032.001.patch, HDFS-13032.002.patch
>
> In a heterogeneous HDFS cluster, datanode capacity and usage are very
> different.
> Now we can use HDFS-8131, a usage-aware block placement policy, to deal with
> the problem. However, this policy could be more flexible.
> 1. The probability of a node with high usage being chosen is fixed once the
> parameter is set. That is, the probability is always the same no matter whether
> its usage is 90% or 70%. When the usage of a node is close to full, its
> probability of being chosen should be lower.
> 2. When the difference in usage is below 5% (hard-coded), the two nodes are
> considered to have the same usage. I think it's OK when usage is 30% and 35%,
> but when usage is 93% and 98%, they should not be treated equally. The
> correction of probability could be smoother.
> In my opinion, when we choose one node from two candidates (A: usage 30%, B:
> usage 60%), we can calculate the probability according to the available
> storage: p(A) = 70%/(70% + 40%), p(B) = 40%/(70% + 40%). When a node is close
> to full, the probability would be very small.
> Also, we could have another factor to weaken this correction and make the
> modification less aggressive.
> Any thoughts? [~liushaohui]
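The two-candidate calculation proposed in the description can be sketched in Java. This is only an illustration of the arithmetic, not the AvailableSpaceBlockPlacementPolicy code or the attached patch: the class name, both method names, and the `weight` parameter are assumptions made for the example. In particular, the linear blend toward 0.5 is just one possible reading of "another factor to weaken this correction".

```java
/** Illustrative sketch only; not the HDFS block placement implementation. */
final class AvailableSpaceChoice {
    private AvailableSpaceChoice() {}

    /**
     * Probability of picking node A over node B, proportional to free space.
     * usageA and usageB are used fractions in [0, 1].
     */
    static double probabilityOfA(double usageA, double usageB) {
        if (usageA < 0 || usageA > 1 || usageB < 0 || usageB > 1) {
            throw new IllegalArgumentException("usage must be in [0, 1]");
        }
        double freeA = 1.0 - usageA;
        double freeB = 1.0 - usageB;
        double total = freeA + freeB;
        if (total == 0.0) {
            return 0.5; // both nodes full: no basis for a preference
        }
        return freeA / total;
    }

    /**
     * Hypothetical damping factor: weight = 0 gives a uniform choice (0.5),
     * weight = 1 gives the fully proportional probability.
     */
    static double dampenedProbabilityOfA(double usageA, double usageB, double weight) {
        if (weight < 0 || weight > 1) {
            throw new IllegalArgumentException("weight must be in [0, 1]");
        }
        double p = probabilityOfA(usageA, usageB);
        return 0.5 + weight * (p - 0.5);
    }

    public static void main(String[] args) {
        // A: usage 30%, B: usage 60%, as in the description.
        System.out.println(probabilityOfA(0.30, 0.60));
        // Nearly full nodes: 93% versus 98% are no longer treated as equal.
        System.out.println(probabilityOfA(0.93, 0.98));
    }
}
```

With this rule the 93% and 98% nodes get clearly different probabilities (7/9 versus 2/9), which is the behaviour the description asks for near full capacity.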
[jira] [Comment Edited] (HDFS-13609) [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via RPC
[ https://issues.apache.org/jira/browse/HDFS-13609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517672#comment-16517672 ]

Erik Krogen edited comment on HDFS-13609 at 6/20/18 1:05 AM:

Thanks for the review [~shv]!
# I agree that this would be much cleaner. In many cases, {{inProgressOk}} will be equivalent to {{optimizeLatency}}. However, there are a few cases where this is not currently true. If you agree with my assessments below about these cases being OK with RPC, and that we can handle {{BackupImage}} as I describe below, I think I can remove this new parameter and limit the scope of the change.
** {{FSEditLog#openForWrite()}} - It is using {{selectInputStreams}} to confirm that no one else is writing new transactions. It seems fine to allow this to use the RPC mechanism.
** {{BootstrapStandby#checkLogsAvailableForRead()}} - It is confirming that a range of transaction IDs is available. Seems fine to allow this to use the RPC mechanism.
** {{NameNodeRpcServer#getEventBatchList()}} - Serves ranges of transactions for the INotify feature. Seems fine (actually, seems desirable) to let this use the RPC mechanism. However, on a slightly unrelated note, one portion of this will need to be changed to work properly in a read-from-standby environment... Filed HDFS-13689 for this.
** {{NameNode#copyEditLogSegmentsToSharedDir()}} - This is only called on {{NameNode#initializeSharedEdits()}}, i.e. a separate startup flag for the NameNode. I don't think it's necessary to optimize for this situation.
** {{BackupImage#tryConvergeJournalSpool()}} - This code is doing some sketchy things and making assumptions about the streams returned that will not be true when using the RPC mechanism. We need to prevent this from using the RPC mechanism, but given that this is only for the BackupNode, I recommend we avoid adding a new API / parameter just for this situation and disable the RPC mechanism on the BackupNode entirely. I instead propose that we add a way for the BackupNode to disable RPC reads on the {{QuorumJournalManager}}. This could take the form of an undocumented config parameter, or, my preference, add a static method {{QuorumJournalManager.disableRPCJournalStreams()}} which the BackupNode can call.
# Agreed. I will fix this in the next patch.
# I thought more about why an operator might want to change this config. I can imagine situations where I would want to increase it: if RPC response time from the JournalNodes is high and the number of transactions per second is very high (say, a very high write workload). But I can't think of a reason to lower it; this is more about just setting a sanity-check upper bound. This makes me think we should (a) raise the default limit to 5000: even with an RPC round-trip time of 100ms, which is quite high, this would allow 50k transactions per second, and (b) make it undocumented as you described. I will incorporate this into the next patch.

was (Author: xkrogen):
Thanks for the review [~shv]!
# I agree that this would be much cleaner. In many cases, {{inProgressOk}} will be equivalent to {{optimizeLatency}}. However, there are a few cases where this is not currently true:
** {{FSEditLog#openForWrite()}} - It is using {{selectInputStreams}} to confirm that no one else is writing new transactions. It seems fine to allow this to use the RPC mechanism.
** {{BootstrapStandby#checkLogsAvailableForRead()}} - It is confirming that a range of transaction IDs is available. Seems fine to allow this to use the RPC mechanism.
** {{NameNodeRpcServer#getEventBatchList()}} - Serves ranges of transactions for the INotify feature. Seems fine (actually, seems desirable) to let this use the RPC mechanism. However, on a slightly unrelated note, one portion of this will need to be changed to work properly in a read-from-standby environment... Filed HDFS-13689 for this.
** {{NameNode#copyEditLogSegmentsToSharedDir()}} - This is only called on {{NameNode#initializeSharedEdits()}}, i.e. a separate startup flag for the NameNode. I don't think it's necessary to optimize for this situation.
** {{BackupImage#tryConvergeJournalSpool()}} - This code is doing some sketchy things and making assumptions about the streams returned that will not be true when using the RPC mechanism. We need to prevent this from using the RPC mechanism, but given that this is only for the BackupNode, I recommend we avoid adding a new API / parameter just for this situation and disable the RPC mechanism on the BackupNode entirely. I instead propose that we add a way for the BackupNode to disable RPC reads on the {{QuorumJournalManager}}. This could take the form of an undocumented config parameter, or, my preference, add a static method {{QuorumJournalManager.disableRPCJournalStreams()}} which the BackupNode can call.
If you agree that we can handle
[jira] [Commented] (HDFS-13609) [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via RPC
[ https://issues.apache.org/jira/browse/HDFS-13609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517672#comment-16517672 ]

Erik Krogen commented on HDFS-13609:

Thanks for the review [~shv]!
# I agree that this would be much cleaner. In many cases, {{inProgressOk}} will be equivalent to {{optimizeLatency}}. However, there are a few cases where this is not currently true:
** {{FSEditLog#openForWrite()}} - It is using {{selectInputStreams}} to confirm that no one else is writing new transactions. It seems fine to allow this to use the RPC mechanism.
** {{BootstrapStandby#checkLogsAvailableForRead()}} - It is confirming that a range of transaction IDs is available. Seems fine to allow this to use the RPC mechanism.
** {{NameNodeRpcServer#getEventBatchList()}} - Serves ranges of transactions for the INotify feature. Seems fine (actually, seems desirable) to let this use the RPC mechanism. However, on a slightly unrelated note, one portion of this will need to be changed to work properly in a read-from-standby environment... Filed HDFS-13689 for this.
** {{NameNode#copyEditLogSegmentsToSharedDir()}} - This is only called on {{NameNode#initializeSharedEdits()}}, i.e. a separate startup flag for the NameNode. I don't think it's necessary to optimize for this situation.
** {{BackupImage#tryConvergeJournalSpool()}} - This code is doing some sketchy things and making assumptions about the streams returned that will not be true when using the RPC mechanism. We need to prevent this from using the RPC mechanism, but given that this is only for the BackupNode, I recommend we avoid adding a new API / parameter just for this situation and disable the RPC mechanism on the BackupNode entirely. I instead propose that we add a way for the BackupNode to disable RPC reads on the {{QuorumJournalManager}}. This could take the form of an undocumented config parameter, or, my preference, add a static method {{QuorumJournalManager.disableRPCJournalStreams()}} which the BackupNode can call.
If you agree that we can handle {{BackupImage}} as I described, I think I can remove this new parameter and limit the scope of the change.
# Agreed. I will fix this in the next patch.
# I thought more about why an operator might want to change this config. I can imagine situations where I would want to increase it: if RPC response time from the JournalNodes is high and the number of transactions per second is very high (say, a very high write workload). But I can't think of a reason to lower it; this is more about just setting a sanity-check upper bound. This makes me think we should (a) raise the default limit to 5000: even with an RPC round-trip time of 100ms, which is quite high, this would allow 50k transactions per second, and (b) make it undocumented as you described. I will incorporate this into the next patch.

> [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via RPC
> ---------------------------------------------------------------------------------
>
> Key: HDFS-13609
> URL: https://issues.apache.org/jira/browse/HDFS-13609
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ha, namenode
> Reporter: Erik Krogen
> Assignee: Erik Krogen
> Priority: Major
> Attachments: HDFS-13609-HDFS-12943.000.patch,
> HDFS-13609-HDFS-12943.001.patch, HDFS-13609-HDFS-12943.002.patch
>
> See HDFS-13150 for the full design.
> This JIRA is targeted at the NameNode-side changes to enable tailing
> in-progress edits via the RPC mechanism added in HDFS-13608. Most changes are
> in the QuorumJournalManager.
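The throughput estimate in point 3 of the comment (a limit of 5000 transactions per RPC and a 100ms round trip giving 50k transactions per second) can be checked with a short calculation. The sketch below assumes that RPCs are issued strictly one after another and that each response carries the maximum number of transactions. The class and method names are made up for the example and are not part of Hadoop.

```java
/** Illustrative arithmetic only; names are hypothetical, not Hadoop APIs. */
final class TailingThroughputBound {
    private TailingThroughputBound() {}

    /**
     * Upper bound on transactions per second when each RPC returns at most
     * maxTxnsPerRpc transactions and RPCs are issued back-to-back.
     */
    static long maxTxnsPerSecond(int maxTxnsPerRpc, long rpcRoundTripMillis) {
        if (maxTxnsPerRpc <= 0 || rpcRoundTripMillis <= 0) {
            throw new IllegalArgumentException("arguments must be positive");
        }
        return (long) maxTxnsPerRpc * 1000L / rpcRoundTripMillis;
    }

    public static void main(String[] args) {
        // The figures from the comment: limit 5000, round trip 100 ms.
        System.out.println(maxTxnsPerSecond(5000, 100));
    }
}
```

The bound scales linearly with the limit, which fits the comment's view that the setting is mainly a sanity-check ceiling rather than a tuning knob.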
[jira] [Created] (HDFS-13689) NameNodeRpcServer getEditsFromTxid assumes it is run on active NameNode
Erik Krogen created HDFS-13689:

    Summary: NameNodeRpcServer getEditsFromTxid assumes it is run on active NameNode
    Key: HDFS-13689
    URL: https://issues.apache.org/jira/browse/HDFS-13689
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Components: hdfs, namenode
    Reporter: Erik Krogen
    Assignee: Erik Krogen

{{NameNodeRpcServer#getEditsFromTxid}} currently decides which transactions are able to be served, i.e. which transactions are durable, using the following logic:
{code}
long syncTxid = log.getSyncTxId();
// If we haven't synced anything yet, we can only read finalized
// segments since we can't reliably determine which txns in in-progress
// segments have actually been committed (e.g. written to a quorum of JNs).
// If we have synced txns, we can definitely read up to syncTxid since
// syncTxid is only updated after a transaction is committed to all
// journals. (In-progress segments written by old writers are already
// discarded for us, so if we read any in-progress segments they are
// guaranteed to have been written by this NameNode.)
boolean readInProgress = syncTxid > 0;
{code}
This assumes that the NameNode serving this request is the current writer/active NameNode, which may not be true in the ObserverNode situation. Since {{selectInputStreams}} now has an {{onlyDurableTxns}} flag, which, if enabled, will only return durable/committed transactions, we can instead leverage this to provide the same functionality. We should utilize this to avoid consistency issues when serving this request from the ObserverNode.
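The difference between the two approaches can be shown with a toy model. Everything in this sketch is hypothetical: the class, the method names, and the idea of passing a journal-reported committed transaction ID as a plain `long` are assumptions for the example, not the `NameNodeRpcServer` or `selectInputStreams` code. It only contrasts a rule that depends on the local writer's sync point with a rule that filters on what the journal reports as committed.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model; all names are hypothetical and not Hadoop APIs. */
final class DurableReadSketch {
    private DurableReadSketch() {}

    /**
     * Writer-local rule from the quoted snippet: in-progress segments are
     * readable only once this node has synced at least one transaction.
     */
    static boolean readInProgressOnWriter(long localSyncTxid) {
        return localSyncTxid > 0;
    }

    /**
     * Journal-side rule (the idea behind an "only durable" flag): keep the
     * transactions at or below the committed ID reported by the journal,
     * regardless of which node is asking.
     */
    static List<Long> selectDurable(List<Long> txids, long committedTxid) {
        List<Long> durable = new ArrayList<>();
        for (long txid : txids) {
            if (txid <= committedTxid) {
                durable.add(txid);
            }
        }
        return durable;
    }
}
```

In this model a node that does not write to the journal has no useful local sync point, so the first rule tells it nothing about in-progress segments, while the second rule behaves the same on every node.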
[jira] [Commented] (HDDS-177) Create a releasable ozonefs artifact
[ https://issues.apache.org/jira/browse/HDDS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517647#comment-16517647 ]

genericqa commented on HDDS-177:

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 31 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 2m 27s | Maven dependency ordering for branch |
| +1 | mvninstall | 29m 14s | trunk passed |
| +1 | compile | 32m 6s | trunk passed |
| +1 | checkstyle | 0m 35s | trunk passed |
| +1 | mvnsite | 26m 29s | trunk passed |
| +1 | shadedclient | 12m 31s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-ozone hadoop-tools/hadoop-tools-dist hadoop-tools hadoop-dist hadoop-ozone/acceptance-test . |
| +1 | findbugs | 0m 42s | trunk passed |
| +1 | javadoc | 8m 16s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 31s | Maven dependency ordering for patch |
| +1 | mvninstall | 40m 58s | the patch passed |
| +1 | compile | 32m 47s | the patch passed |
| +1 | javac | 32m 47s | the patch passed |
| +1 | checkstyle | 0m 35s | the patch passed |
| +1 | mvnsite | 21m 41s | the patch passed |
| +1 | shellcheck | 0m 2s | There were no new shellcheck issues. |
| +1 | shelldocs | 0m 14s | There were no new shelldocs issues. |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 8s | The patch has no ill-formed XML file. |
| +1 | shadedclient | 11m 15s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-project hadoop-tools/hadoop-tools-dist hadoop-tools hadoop-ozone hadoop-ozone/acceptance-test . hadoop-dist |
| +1 | findbugs | 0m 42s | the patch passed |
| -1 | javadoc | 6m 54s | root in the patch failed. |
|| || || || Other Tests ||
| -1 | unit | 21m 14s | root in the patch failed. |
| +1 | asflicense | 2m 9s | The patch does not generate ASF License warnings. |
| | | 253m 36s | |

|| Reason || Tests ||
| Failed junit tests |
[jira] [Commented] (HDDS-175) Refactor ContainerInfo to remove Pipeline object from it
[ https://issues.apache.org/jira/browse/HDDS-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517625#comment-16517625 ]

Ajay Kumar commented on HDDS-175:

Patch v3 removes some unused imports.

> Refactor ContainerInfo to remove Pipeline object from it
> ---------------------------------------------------------
>
> Key: HDDS-175
> URL: https://issues.apache.org/jira/browse/HDDS-175
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: SCM
> Affects Versions: 0.2.1
> Reporter: Ajay Kumar
> Assignee: Ajay Kumar
> Priority: Major
> Fix For: 0.2.1
> Attachments: HDDS-175.00.patch, HDDS-175.01.patch, HDDS-175.02.patch,
> HDDS-175.03.patch
>
> Refactor ContainerInfo to remove the Pipeline object from it. We can add the 4
> fields below to ContainerInfo to recreate the pipeline if required:
> # pipelineId
> # replication type
> # expected replication count
> # DataNode where its replica exists
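The four fields listed in the issue description could be held roughly as follows. This is a sketch under assumptions, not the HDDS `ContainerInfo` class or the attached patch: the class name, field types, enum values, and the `recreatePipeline()` helper are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only; names and types are assumptions, not the HDDS classes. */
final class ContainerInfoSketch {
    /** Hypothetical replication types for the example. */
    enum ReplicationType { STAND_ALONE, RATIS }

    /** Minimal stand-in for a pipeline rebuilt from the stored fields. */
    static final class PipelineSketch {
        final String id;
        final ReplicationType type;
        final int replicationCount;
        final List<String> datanodes;

        PipelineSketch(String id, ReplicationType type, int replicationCount,
                       List<String> datanodes) {
            this.id = id;
            this.type = type;
            this.replicationCount = replicationCount;
            this.datanodes = datanodes;
        }
    }

    private final String pipelineId;                // 1. pipelineId
    private final ReplicationType replicationType;  // 2. replication type
    private final int expectedReplicationCount;     // 3. expected replication count
    private final List<String> replicaDatanodes;    // 4. DataNodes holding a replica

    ContainerInfoSketch(String pipelineId, ReplicationType replicationType,
                        int expectedReplicationCount, List<String> replicaDatanodes) {
        this.pipelineId = pipelineId;
        this.replicationType = replicationType;
        this.expectedReplicationCount = expectedReplicationCount;
        // Defensive copy so later changes to the caller's list do not leak in.
        this.replicaDatanodes =
            Collections.unmodifiableList(new ArrayList<>(replicaDatanodes));
    }

    /** Rebuilds a pipeline description on demand instead of storing the object. */
    PipelineSketch recreatePipeline() {
        return new PipelineSketch(pipelineId, replicationType,
            expectedReplicationCount, replicaDatanodes);
    }
}
```

Storing only these values keeps the container record small and lets the pipeline be reconstructed when a caller actually needs it.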
[jira] [Updated] (HDDS-175) Refactor ContainerInfo to remove Pipeline object from it
[ https://issues.apache.org/jira/browse/HDDS-175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay Kumar updated HDDS-175: Attachment: HDDS-175.03.patch > Refactor ContainerInfo to remove Pipeline object from it > - > > Key: HDDS-175 > URL: https://issues.apache.org/jira/browse/HDDS-175 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-175.00.patch, HDDS-175.01.patch, HDDS-175.02.patch, > HDDS-175.03.patch > > > Refactor ContainerInfo to remove Pipeline object from it. We can add below 4 > fields to ContainerInfo to recreate pipeline if required: > # pipelineId > # replication type > # expected replication count > # DataNode where its replica exist -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-175) Refactor ContainerInfo to remove Pipeline object from it
[ https://issues.apache.org/jira/browse/HDDS-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517596#comment-16517596 ] Hudson commented on HDDS-175: - SUCCESS: Integrated in Jenkins build Hadoop-precommit-ozone-acceptance #19 (See [https://builds.apache.org/job/Hadoop-precommit-ozone-acceptance/19/]) > Refactor ContainerInfo to remove Pipeline object from it > - > > Key: HDDS-175 > URL: https://issues.apache.org/jira/browse/HDDS-175 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-175.00.patch, HDDS-175.01.patch, HDDS-175.02.patch > > > Refactor ContainerInfo to remove Pipeline object from it. We can add below 4 > fields to ContainerInfo to recreate pipeline if required: > # pipelineId > # replication type > # expected replication count > # DataNode where its replica exist -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13532) RBF: Adding security
[ https://issues.apache.org/jira/browse/HDFS-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517575#comment-16517575 ] Xiao Chen edited comment on HDFS-13532 at 6/19/18 10:00 PM: Thanks for the work here [~zhengxg3] and all. The last page of the doc looks familiar. :) Some high level questions from the doc. I have not followed RBF closely and my apologies if these are stupid comments/questions... * I second what Inigo said above. It's not clear to me how DTr is used. * It looks like we'll add the same mechanism to the router, so clients can auth with kerberos, then get a delegation token for subsequent authentications. Is this understanding correct? * I'm not a very security person - the router proxying as client part seems fine. But IMO that should only work if the client auth'ed via kerberos; if client->router auth is dt, then router should not auth to NN via kerberos, but only via the provided DTnn. * Who's gonna renew the router tokens? Tokens from different NNs may have different expiration time, hence need to be renewed at different intervals. RM currently does this, it's kinda nice to reuse RM to handle the DTr token renewal / cancelation. * [~daryn] at one point mentioned he's working on some token issuer interface. Not sure if it will benefit/collide with the work here. was (Author: xiaochen): Thanks for the work here [~zhengxg3] and all. The last page of the doc looks familiar. :) Some high level questions from the doc. I have not followed RBF closely and my apologies if these are stupid questions... * I second what Inigo said above. It's not clear to me how DTr is used. * It looks like we'll add the same mechanism to the router, so clients can auth with kerberos, then get a delegation token for subsequent authentications. Is this understanding correct? * I'm not a very security person - the router proxying as client part seems fine. 
But IMO that should only work if the client auth'ed via kerberos; if client->router auth is dt, then router should not auth to NN via kerberos, but only via the provided DTnn. * Who's gonna renew the router tokens? Tokens from different NNs may have different expiration time, hence need to be renewed at different intervals. RM currently does this, it's kinda nice to reuse RM to handle the DTr token renewal / cancelation. * [~daryn] at one point mentioned he's working on some token issuer interface. Not sure if it will benefit/collide with the work here. > RBF: Adding security > > > Key: HDFS-13532 > URL: https://issues.apache.org/jira/browse/HDFS-13532 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Íñigo Goiri >Assignee: Sherwood Zheng >Priority: Major > Attachments: Security_for_Router-based Federation_design_doc.pdf > > > HDFS Router based federation should support security. This includes > authentication and delegation tokens. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13532) RBF: Adding security
[ https://issues.apache.org/jira/browse/HDFS-13532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517575#comment-16517575 ] Xiao Chen commented on HDFS-13532: -- Thanks for the work here [~zhengxg3] and all. The last page of the doc looks familiar. :) Some high level questions from the doc. I have not followed RBF closely and my apologies if these are stupid questions... * I second what Inigo said above. It's not clear to me how DTr is used. * It looks like we'll add the same mechanism to the router, so clients can auth with kerberos, then get a delegation token for subsequent authentications. Is this understanding correct? * I'm not a very security person - the router proxying as client part seems fine. But IMO that should only work if the client auth'ed via kerberos; if client->router auth is dt, then router should not auth to NN via kerberos, but only via the provided DTnn. * Who's gonna renew the router tokens? Tokens from different NNs may have different expiration time, hence need to be renewed at different intervals. RM currently does this, it's kinda nice to reuse RM to handle the DTr token renewal / cancelation. * [~daryn] at one point mentioned he's working on some token issuer interface. Not sure if it will benefit/collide with the work here. > RBF: Adding security > > > Key: HDFS-13532 > URL: https://issues.apache.org/jira/browse/HDFS-13532 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Íñigo Goiri >Assignee: Sherwood Zheng >Priority: Major > Attachments: Security_for_Router-based Federation_design_doc.pdf > > > HDFS Router based federation should support security. This includes > authentication and delegation tokens. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-177) Create a releasable ozonefs artifact
[ https://issues.apache.org/jira/browse/HDDS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517570#comment-16517570 ] Elek, Marton commented on HDDS-177: --- The patch has finally proved to be good. We had a good run of the acceptance tests (including the new ozonefs tests): https://builds.apache.org/job/Hadoop-precommit-ozone-acceptance/18 (You can also check the end of the console output.) The dependency and tab problems are also fixed in the latest patch (waiting for Jenkins). > Create a releasable ozonefs artifact > - > > Key: HDDS-177 > URL: https://issues.apache.org/jira/browse/HDDS-177 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Filesystem >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-177.001.patch, HDDS-177.002.patch, > HDDS-177.003.patch, HDDS-177.004.patch, HDDS-177.005.patch, > HDDS-177.006.patch, HDDS-177.007.patch > > > The current ozonefs implementation is under hadoop-tools/hadoop-ozone and uses > the version of hadoop (3.2.0-SNAPSHOT currently) which is wrong. > The other problem is that we have no single hadoop independent artifact from > ozonefs which could be used with any hadoop version. > In this patch I propose the following modification: > * move hadoop-tools/hadoop-ozone to hadoop-ozone/ozonefs and use the hdds > version (0.2.1-SNAPSHOT) > * Create a shaded artifact which includes all the required jar files to use > ozonefs (hdds/ozone client) > * Create an ozonefs acceptance test to test it with the latest stable hadoop > version -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13680) Httpfs does not support custom authentication
[ https://issues.apache.org/jira/browse/HDFS-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joris Nogneng updated HDFS-13680: - Resolution: Not A Problem Status: Resolved (was: Patch Available) There is no need to set the Authentication Handler at this level. The value read from the property file will be used by default. > Httpfs does not support custom authentication > - > > Key: HDFS-13680 > URL: https://issues.apache.org/jira/browse/HDFS-13680 > Project: Hadoop HDFS > Issue Type: Bug > Components: httpfs >Reporter: Joris Nogneng >Priority: Major > Attachments: HDFS-13680.01.patch > > > Currently Httpfs Authentication Filter does not support any custom > authentication: the Authentication Handler can only be > PseudoAuthenticationHandler or KerberosDelegationTokenAuthenticationHandler. > We should allow other authentication handlers to manage custom authentication. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-177) Create a releasable ozonefs artifact
[ https://issues.apache.org/jira/browse/HDDS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton updated HDDS-177: -- Attachment: HDDS-177.007.patch > Create a releasable ozonefs artifact > - > > Key: HDDS-177 > URL: https://issues.apache.org/jira/browse/HDDS-177 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Filesystem >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-177.001.patch, HDDS-177.002.patch, > HDDS-177.003.patch, HDDS-177.004.patch, HDDS-177.005.patch, > HDDS-177.006.patch, HDDS-177.007.patch > > > The current ozonefs implementation is under hadoop-tools/hadoop-ozone and uses > the version of hadoop (3.2.0-SNAPSHOT currently) which is wrong. > The other problem is that we have no single hadoop independent artifact from > ozonefs which could be used with any hadoop version. > In this patch I propose the following modification: > * move hadoop-tools/hadoop-ozone to hadoop-ozone/ozonefs and use the hdds > version (0.2.1-SNAPSHOT) > * Create a shaded artifact which includes all the required jar files to use > ozonefs (hdds/ozone client) > * Create an ozonefs acceptance test to test it with the latest stable hadoop > version -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-177) Create a releasable ozonefs artifact
[ https://issues.apache.org/jira/browse/HDDS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517536#comment-16517536 ] Elek, Marton commented on HDDS-177: --- FYI: I am trying to stabilize a Jenkins job to test any new patch with the acceptance test suite: https://builds.apache.org/job/Hadoop-precommit-ozone-acceptance Ideally this patch will be committed after a good build. > Create a releasable ozonefs artifact > - > > Key: HDDS-177 > URL: https://issues.apache.org/jira/browse/HDDS-177 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Filesystem >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-177.001.patch, HDDS-177.002.patch, > HDDS-177.003.patch, HDDS-177.004.patch, HDDS-177.005.patch, HDDS-177.006.patch > > > The current ozonefs implementation is under hadoop-tools/hadoop-ozone and uses > the version of hadoop (3.2.0-SNAPSHOT currently) which is wrong. > The other problem is that we have no single hadoop independent artifact from > ozonefs which could be used with any hadoop version. > In this patch I propose the following modification: > * move hadoop-tools/hadoop-ozone to hadoop-ozone/ozonefs and use the hdds > version (0.2.1-SNAPSHOT) > * Create a shaded artifact which includes all the required jar files to use > ozonefs (hdds/ozone client) > * Create an ozonefs acceptance test to test it with the latest stable hadoop > version -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-177) Create a releasable ozonefs artifact
[ https://issues.apache.org/jira/browse/HDDS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton updated HDDS-177: -- Attachment: HDDS-177.006.patch > Create a releasable ozonefs artifact > - > > Key: HDDS-177 > URL: https://issues.apache.org/jira/browse/HDDS-177 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Filesystem >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-177.001.patch, HDDS-177.002.patch, > HDDS-177.003.patch, HDDS-177.004.patch, HDDS-177.005.patch, HDDS-177.006.patch > > > The current ozonefs implementation is under hadoop-tools/hadoop-ozone and uses > the version of hadoop (3.2.0-SNAPSHOT currently) which is wrong. > The other problem is that we have no single hadoop independent artifact from > ozonefs which could be used with any hadoop version. > In this patch I propose the following modification: > * move hadoop-tools/hadoop-ozone to hadoop-ozone/ozonefs and use the hdds > version (0.2.1-SNAPSHOT) > * Create a shaded artifact which includes all the required jar files to use > ozonefs (hdds/ozone client) > * Create an ozonefs acceptance test to test it with the latest stable hadoop > version -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode
[ https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517514#comment-16517514 ] Wei-Chiu Chuang commented on HDFS-13672: I've given it some thought. Instead of breaking down the iteration by the number of blocks, would it make sense to have a time-based configuration? For example, break out of the loop after 1 second? > clearCorruptLazyPersistFiles could crash NameNode > - > > Key: HDFS-13672 > URL: https://issues.apache.org/jira/browse/HDFS-13672 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > > I started a NameNode on a pretty large fsimage. Since the NameNode is started > without any DataNodes, all blocks (100 million) are "corrupt". > Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write > lock for a long time: > {noformat} > 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held > for 46024 ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543) > java.lang.Thread.run(Thread.java:748) > Number of suppressed write-lock reports: 0 > Longest write-lock held interval: 46024 > {noformat} > Here's the relevant code: > {code} > writeLock(); > try { > final Iterator<BlockInfo> it = > blockManager.getCorruptReplicaBlockIterator(); > while (it.hasNext()) { > Block b = it.next(); > BlockInfo blockInfo = blockManager.getStoredBlock(b); > if (blockInfo.getBlockCollection().getStoragePolicyID() == > lpPolicy.getId()) { >
filesToDelete.add(blockInfo.getBlockCollection()); > } > } > for (BlockCollection bc : filesToDelete) { > LOG.warn("Removing lazyPersist file " + bc.getName() + " with no > replicas."); > changed |= deleteInternal(bc.getName(), false, false, false); > } > } finally { > writeUnlock(); > } > {code} > In essence, the iteration over corrupt replica list should be broken down > into smaller iterations to avoid a single long wait. > Since this operation holds NameNode write lock for more than 45 seconds, the > default ZKFC connection timeout, it implies an extreme case like this (100 > million corrupt blocks) could lead to NameNode failover. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
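A minimal sketch of the time-based idea raised in the HDFS-13672 comment above: hold the write lock for a bounded interval, release it, then resume where the scan left off. This is not the FSNamesystem code. `TimeBoundedScan`, `scan`, and `maxHoldNanos` are hypothetical names, a plain `ReentrantReadWriteLock` stands in for the namesystem lock, and the sketch assumes the iterator stays valid while the lock is released, which would have to be verified for the real corrupt-replica iterator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class TimeBoundedScan {

  /**
   * Collects the items accepted by {@code matches}, never holding the write
   * lock much longer than {@code maxHoldNanos} at a time. At least one item
   * is processed per lock acquisition, so the scan always makes progress.
   */
  static <T> List<T> scan(Iterator<T> it, Predicate<? super T> matches,
      ReentrantReadWriteLock lock, long maxHoldNanos) {
    List<T> matched = new ArrayList<>();
    boolean more = true;
    while (more) {
      lock.writeLock().lock();
      try {
        long deadline = System.nanoTime() + maxHoldNanos;
        while (it.hasNext()) {
          T item = it.next();
          if (matches.test(item)) {
            matched.add(item);
          }
          // Overflow-safe comparison: stop once the time budget is used up.
          if (System.nanoTime() - deadline >= 0) {
            break;
          }
        }
        more = it.hasNext();
      } finally {
        // Releasing here gives other lock waiters a chance between batches.
        lock.writeLock().unlock();
      }
    }
    return matched;
  }
}
```

With a budget of one second, the 100 million block scan described in the issue would be split into many short lock holds instead of a single 46-second one, at the cost of the scan no longer seeing one consistent snapshot.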
[jira] [Commented] (HDFS-13448) HDFS Block Placement - Ignore Locality for First Block Replica
[ https://issues.apache.org/jira/browse/HDFS-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517509#comment-16517509 ] BELUGA BEHR commented on HDFS-13448: I'm not sure why the build failed... any help please? :) > HDFS Block Placement - Ignore Locality for First Block Replica > -- > > Key: HDFS-13448 > URL: https://issues.apache.org/jira/browse/HDFS-13448 > Project: Hadoop HDFS > Issue Type: New Feature > Components: block placement, hdfs-client >Affects Versions: 2.9.0, 3.0.1 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HDFS-13448.10.patch, HDFS-13448.11.patch, > HDFS-13448.12.patch, HDFS-13448.13.patch, HDFS-13448.6.patch, > HDFS-13448.7.patch, HDFS-13448.8.patch > > > According to the HDFS Block Place Rules: > {quote} > /** > * The replica placement strategy is that if the writer is on a datanode, > * the 1st replica is placed on the local machine, > * otherwise a random datanode. The 2nd replica is placed on a datanode > * that is on a different rack. The 3rd replica is placed on a datanode > * which is on a different node of the rack as the second replica. > */ > {quote} > However, there is a hint for the hdfs-client that allows the block placement > request to not put a block replica on the local datanode _where 'local' means > the same host as the client is being run on._ > {quote} > /** >* Advise that a block replica NOT be written to the local DataNode where >* 'local' means the same host as the client is being run on. >* >* @see CreateFlag#NO_LOCAL_WRITE >*/ > {quote} > I propose that we add a new flag that allows the hdfs-client to request that > the first block replica be placed on a random DataNode in the cluster. The > subsequent block replicas should follow the normal block placement rules. > The issue is that when the {{NO_LOCAL_WRITE}} is enabled, the first block > replica is not placed on the local node, but it is still placed on the local > rack. 
Where this comes into play is where you have, for example, a flume > agent that is loading data into HDFS. > If the Flume agent is running on a DataNode, then by default, the DataNode > local to the Flume agent will always get the first block replica and this > leads to un-even block placements, with the local node always filling up > faster than any other node in the cluster. > Modifying this example, if the DataNode is removed from the host where the > Flume agent is running, or this {{NO_LOCAL_WRITE}} is enabled by Flume, then > the default block placement policy will still prefer the local rack. This > remedies the situation only so far as now the first block replica will always > be distributed to a DataNode on the local rack. > This new flag would allow a single Flume agent to distribute the blocks > randomly, evenly, over the entire cluster instead of hot-spotting the local > node or the local rack. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
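The three behaviours described in the HDFS-13448 report above can be sketched as a filter over candidate DataNodes for the first replica. This is an illustration only, not the BlockPlacementPolicyDefault logic: `FirstReplicaSketch`, `Node`, `Hint`, and the value `RANDOM_FIRST_REPLICA` are hypothetical names (only `NO_LOCAL_WRITE` appears in the report), and the sketch assumes the writer runs on a DataNode host and ignores fallbacks such as a full or unavailable node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, for illustration only; not part of Hadoop.
final class FirstReplicaSketch {

  // RANDOM_FIRST_REPLICA is an assumed name for the flag proposed in the issue.
  enum Hint { DEFAULT, NO_LOCAL_WRITE, RANDOM_FIRST_REPLICA }

  static final class Node {
    final String host;
    final String rack;

    Node(String host, String rack) {
      this.host = host;
      this.rack = rack;
    }
  }

  /** Returns the hosts eligible to receive the first block replica. */
  static List<String> candidates(List<Node> nodes, Node writer, Hint hint) {
    List<String> eligible = new ArrayList<>();
    for (Node n : nodes) {
      boolean local = n.host.equals(writer.host);
      boolean sameRack = n.rack.equals(writer.rack);
      boolean ok;
      switch (hint) {
        case DEFAULT:
          ok = local;               // the writer's own DataNode
          break;
        case NO_LOCAL_WRITE:
          ok = sameRack && !local;  // still confined to the local rack
          break;
        default:
          ok = true;                // any DataNode in the cluster
      }
      if (ok) {
        eligible.add(n.host);
      }
    }
    return eligible;
  }
}
```

Picking uniformly at random from the returned list models the hot-spotting argument: the candidate set is a single node by default, one rack with `NO_LOCAL_WRITE`, and the whole cluster only with the proposed flag.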
[jira] [Commented] (HDFS-13448) HDFS Block Placement - Ignore Locality for First Block Replica
[ https://issues.apache.org/jira/browse/HDFS-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517506#comment-16517506 ] genericqa commented on HDFS-13448: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 36s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 43s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 32m 43s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 3m 30s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 6m 9s{color} | {color:red} branch has errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 16s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 25s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 20s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 3s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 37m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 37m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 37m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 30s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 2m 30s{color} | {color:red} patch has errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 53s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 27s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 2m 1s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 4s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 50s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}262m 45s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.client.impl.TestBlockReaderLocal | | | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | HDFS-13448 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12928373/HDFS-13448.13.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc | | uname | Linux 787c4544bc3e 3.13.0-143-generic
[jira] [Commented] (HDDS-177) Create a releasable ozonefs artifact
[ https://issues.apache.org/jira/browse/HDDS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517467#comment-16517467 ] genericqa commented on HDDS-177: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 31 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 29s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 45s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 20m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 49s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone hadoop-tools/hadoop-tools-dist hadoop-tools hadoop-dist hadoop-ozone/acceptance-test . 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 37s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 6m 56s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 15s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} mvninstall {color} | {color:red} 0m 12s{color} | {color:red} hadoop-dist in the patch failed. {color} | | {color:red}-1{color} | {color:red} compile {color} | {color:red} 0m 16s{color} | {color:red} root in the patch failed. {color} | | {color:red}-1{color} | {color:red} javac {color} | {color:red} 0m 16s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} mvnsite {color} | {color:red} 0m 17s{color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} shellcheck {color} | {color:green} 0m 0s{color} | {color:green} There were no new shellcheck issues. {color} | | {color:green}+1{color} | {color:green} shelldocs {color} | {color:green} 0m 15s{color} | {color:green} There were no new shelldocs issues. {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch 2 line(s) with tabs. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 8s{color} | {color:green} The patch has no ill-formed XML file. 
{color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 0m 22s{color} | {color:red} patch has errors when building and testing our client artifacts. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: . hadoop-dist hadoop-ozone hadoop-ozone/acceptance-test hadoop-tools hadoop-tools/hadoop-tools-dist {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} javadoc {color} | {color:red} 0m 16s{color} | {color:red} root in the patch failed. {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 17s{color} | {color:red} root in the patch failed. {color} | | {color:blue}0{color} | {color:blue} asflicense {color} | {color:blue} 0m 17s{color} | {color:blue} ASF License check generated no output? {color} | | {color:black}{color} | {color:black} {color} | {color:black}106m 59s{color} | {color:black}
[jira] [Commented] (HDFS-13658) fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 1 replica
[ https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517446#comment-16517446 ] genericqa commented on HDFS-13658: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 33s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 14s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 36s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 9s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 6m 47s{color} | {color:red} branch has errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 17s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 31s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 29m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 29m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 4m 23s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 2m 8s{color} | {color:red} patch has errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 7m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 19s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 5s{color} | {color:green} hadoop-common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 47s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 51s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 18m 6s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 55s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}269m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestBlockStoragePolicy | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | | | hadoop.hdfs.server.namenode.TestReencryptionWithKMS | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | HDFS-13658 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12928366/HDFS-13658.005.patch | | Optional Tests | asflicense mvnsite compile javac
[jira] [Updated] (HDDS-177) Create a releasable ozonefs artifact
[ https://issues.apache.org/jira/browse/HDDS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton updated HDDS-177: -- Attachment: HDDS-177.005.patch > Create a releasable ozonefs artifact > - > > Key: HDDS-177 > URL: https://issues.apache.org/jira/browse/HDDS-177 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Filesystem >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-177.001.patch, HDDS-177.002.patch, > HDDS-177.003.patch, HDDS-177.004.patch, HDDS-177.005.patch > > > The current ozonefs implementation is under hadoop-tools/hadoop-ozone and uses > the version of hadoop (3.2.0-SNAPSHOT currently), which is wrong. > The other problem is that we have no single hadoop-independent artifact from > ozonefs which could be used with any hadoop version. > In this patch I propose the following modifications: > * move hadoop-tools/hadoop-ozone to hadoop-ozone/ozonefs and use the hdds > version (0.2.1-SNAPSHOT) > * Create a shaded artifact which includes all the required jar files to use > ozonefs (hdds/ozone client) > * Create an ozonefs acceptance test to test it with the latest stable hadoop > version -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-177) Create a releasable ozonefs artifact
[ https://issues.apache.org/jira/browse/HDDS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton updated HDDS-177: -- Attachment: HDDS-177.004.patch > Create a releasable ozonefs artifact > - > > Key: HDDS-177 > URL: https://issues.apache.org/jira/browse/HDDS-177 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Filesystem >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-177.001.patch, HDDS-177.002.patch, > HDDS-177.003.patch, HDDS-177.004.patch > > > The current ozonefs implementation is under hadoop-tools/hadoop-ozone and uses > the version of hadoop (3.2.0-SNAPSHOT currently), which is wrong. > The other problem is that we have no single hadoop-independent artifact from > ozonefs which could be used with any hadoop version. > In this patch I propose the following modifications: > * move hadoop-tools/hadoop-ozone to hadoop-ozone/ozonefs and use the hdds > version (0.2.1-SNAPSHOT) > * Create a shaded artifact which includes all the required jar files to use > ozonefs (hdds/ozone client) > * Create an ozonefs acceptance test to test it with the latest stable hadoop > version -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12571) Ozone: remove spaces from the beginning of the hdfs script
[ https://issues.apache.org/jira/browse/HDFS-12571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517402#comment-16517402 ] Elek, Marton commented on HDFS-12571: - Unfortunately I can't test it on a Mac currently. But if it's Mac-specific, it will be a different problem. If it can be reproduced, I suggest creating a separate issue for that. > Ozone: remove spaces from the beginning of the hdfs script > > > Key: HDFS-12571 > URL: https://issues.apache.org/jira/browse/HDFS-12571 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: HDFS-7240 >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Critical > Labels: ozoneMerge > Fix For: HDFS-7240 > > Attachments: HDFS-12571-HDFS-7240.001.patch > > > It seems that during one of the previous merges some unnecessary spaces were > added to the hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs file. > After a dist build I cannot start the server with the hdfs command: > {code} > /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-functions.sh: line 398: > syntax error near unexpected token `<' > /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-functions.sh: line 398: ` > done < <(for text in "${input[@]}"; do' > /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 70: > hadoop_deprecate_envvar: command not found > /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 87: > hadoop_bootstrap: command not found > /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 104: > hadoop_parse_args: command not found > /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 105: shift: > : numeric argument required > /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 110: > hadoop_find_confdir: command not found > /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 111: > hadoop_exec_hadoopenv: command not found > /tmp/hadoop-3.1.0-SNAPSHOT/bin/../libexec/hadoop-config.sh: line 112: > hadoop_import_shellprofiles: command not 
found > {code} > See the space here: > https://github.com/apache/hadoop/blob/d0bd0f623338dbb558d0dee5e747001d825d92c5/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs > Or see the latest version at: > https://github.com/apache/hadoop/blob/HDFS-7240/hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs > To be honest I don't understand how it could work for others, as it seems to > be an older change. Maybe some git magic removed it on OSX (I use Linux). > Anyway, I uploaded a patch to fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-177) Create a releasable ozonefs artifact
[ https://issues.apache.org/jira/browse/HDDS-177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton updated HDDS-177: -- Attachment: HDDS-177.003.patch > Create a releasable ozonefs artifact > - > > Key: HDDS-177 > URL: https://issues.apache.org/jira/browse/HDDS-177 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Filesystem >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-177.001.patch, HDDS-177.002.patch, > HDDS-177.003.patch > > > The current ozonefs implementation is under hadoop-tools/hadoop-ozone and uses > the version of hadoop (3.2.0-SNAPSHOT currently), which is wrong. > The other problem is that we have no single hadoop-independent artifact from > ozonefs which could be used with any hadoop version. > In this patch I propose the following modifications: > * move hadoop-tools/hadoop-ozone to hadoop-ozone/ozonefs and use the hdds > version (0.2.1-SNAPSHOT) > * Create a shaded artifact which includes all the required jar files to use > ozonefs (hdds/ozone client) > * Create an ozonefs acceptance test to test it with the latest stable hadoop > version -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work started] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode
[ https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-13672 started by Gabor Bota. - > clearCorruptLazyPersistFiles could crash NameNode > - > > Key: HDFS-13672 > URL: https://issues.apache.org/jira/browse/HDFS-13672 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > > I started a NameNode on a pretty large fsimage. Since the NameNode is started > without any DataNodes, all blocks (100 million) are "corrupt". > Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write > lock for a long time: > {noformat} > 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held > for 46024 ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543) > java.lang.Thread.run(Thread.java:748) > Number of suppressed write-lock reports: 0 > Longest write-lock held interval: 46024 > {noformat} > Here's the relevant code: > {code} > writeLock(); > try { > final Iterator it = > blockManager.getCorruptReplicaBlockIterator(); > while (it.hasNext()) { > Block b = it.next(); > BlockInfo blockInfo = blockManager.getStoredBlock(b); > if (blockInfo.getBlockCollection().getStoragePolicyID() == > lpPolicy.getId()) { > filesToDelete.add(blockInfo.getBlockCollection()); > } > } > for (BlockCollection bc : filesToDelete) { > LOG.warn("Removing lazyPersist file " + bc.getName() + " with no > replicas."); > changed |= deleteInternal(bc.getName(), 
false, false, false); > } > } finally { > writeUnlock(); > } > {code} > In essence, the iteration over corrupt replica list should be broken down > into smaller iterations to avoid a single long wait. > Since this operation holds NameNode write lock for more than 45 seconds, the > default ZKFC connection timeout, it implies an extreme case like this (100 > million corrupt blocks) could lead to NameNode failover. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode
[ https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517378#comment-16517378 ] Gabor Bota commented on HDFS-13672: --- Hi [~jojochuang], What would be a reasonable number of elements to iterate over in one pass? Should this number (iterations per pass) be configurable from outside? Thanks, Gabor > clearCorruptLazyPersistFiles could crash NameNode > - > > Key: HDFS-13672 > URL: https://issues.apache.org/jira/browse/HDFS-13672 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > > I started a NameNode on a pretty large fsimage. Since the NameNode is started > without any DataNodes, all blocks (100 million) are "corrupt". > Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write > lock for a long time: > {noformat} > 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held > for 46024 ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543) > java.lang.Thread.run(Thread.java:748) > Number of suppressed write-lock reports: 0 > Longest write-lock held interval: 46024 > {noformat} > Here's the relevant code: > {code} > writeLock(); > try { > final Iterator it = > blockManager.getCorruptReplicaBlockIterator(); > while (it.hasNext()) { > Block b = it.next(); > BlockInfo blockInfo = blockManager.getStoredBlock(b); > if (blockInfo.getBlockCollection().getStoragePolicyID() == > lpPolicy.getId()) { > 
filesToDelete.add(blockInfo.getBlockCollection()); > } > } > for (BlockCollection bc : filesToDelete) { > LOG.warn("Removing lazyPersist file " + bc.getName() + " with no > replicas."); > changed |= deleteInternal(bc.getName(), false, false, false); > } > } finally { > writeUnlock(); > } > {code} > In essence, the iteration over corrupt replica list should be broken down > into smaller iterations to avoid a single long wait. > Since this operation holds NameNode write lock for more than 45 seconds, the > default ZKFC connection timeout, it implies an extreme case like this (100 > million corrupt blocks) could lead to NameNode failover. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
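The suggestion in HDFS-13672 to break the scan into smaller iterations can be sketched roughly as follows. This is a self-contained illustration and not the actual FSNamesystem patch: `BatchedScan`, `scanInBatches`, and the `batchSize` parameter are hypothetical names, a plain `ReentrantReadWriteLock` stands in for the namesystem lock, and a generic iterator stands in for the corrupt-replica iterator. It assumes the underlying collection tolerates the lock being released between batches, which the real iterator may not.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Predicate;

public class BatchedScan {
  /**
   * Scans the iterator in batches of at most batchSize items, taking the
   * write lock once per batch and releasing it before the next batch so
   * other lock users can make progress. Matching items are appended to out.
   * Returns the number of non-empty batches processed.
   * (Hypothetical helper; batchSize would be the configurable value.)
   */
  static <T> int scanInBatches(Iterator<T> it, Predicate<T> matches,
      List<T> out, int batchSize, ReentrantReadWriteLock lock) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    int batches = 0;
    boolean more = true;
    while (more) {
      lock.writeLock().lock();
      try {
        int seen = 0;
        while (seen < batchSize && it.hasNext()) {
          T item = it.next();
          if (matches.test(item)) {
            out.add(item);
          }
          seen++;
        }
        if (seen > 0) {
          batches++;
        }
        more = it.hasNext();
      } finally {
        // Released after every batch, so no single hold spans the full scan.
        lock.writeLock().unlock();
      }
    }
    return batches;
  }
}
```

With a bounded batch, the longest single write-lock hold scales with `batchSize` rather than with the total number of corrupt blocks, which is the property needed to stay under the 45-second ZKFC timeout mentioned in the issue. What batch size is reasonable is the open question raised in the comment above.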
[jira] [Updated] (HDDS-170) Fix TestBlockDeletingService#testBlockDeletionTimeout
[ https://issues.apache.org/jira/browse/HDDS-170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain updated HDDS-170: - Attachment: HDDS-170.001.patch > Fix TestBlockDeletingService#testBlockDeletionTimeout > - > > Key: HDDS-170 > URL: https://issues.apache.org/jira/browse/HDDS-170 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Mukul Kumar Singh >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-170.001.patch > > > TestBlockDeletingService#testBlockDeletionTimeout times out while waiting for > the expected error message. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-170) Fix TestBlockDeletingService#testBlockDeletionTimeout
[ https://issues.apache.org/jira/browse/HDDS-170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain reassigned HDDS-170: Assignee: Lokesh Jain (was: Mukul Kumar Singh) > Fix TestBlockDeletingService#testBlockDeletionTimeout > - > > Key: HDDS-170 > URL: https://issues.apache.org/jira/browse/HDDS-170 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Mukul Kumar Singh >Assignee: Lokesh Jain >Priority: Major > > TestBlockDeletingService#testBlockDeletionTimeout times out while waiting for > the expected error message. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-12857) StoragePolicyAdmin should support schema based path
[ https://issues.apache.org/jira/browse/HDFS-12857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517320#comment-16517320 ] Xiao Chen commented on HDFS-12857: -- Cherry-picked to branch-3.0. Ran the changed test before pushing. > StoragePolicyAdmin should support schema based path > --- > > Key: HDFS-12857 > URL: https://issues.apache.org/jira/browse/HDFS-12857 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Fix For: 3.1.0, 3.0.4 > > Attachments: HDFS-12857.001.patch > > > When we execute storagepolicy admin command with full schema path, then it > will throw this exception > {noformat} > ./hdfs storagepolicies -getstoragepolicy -path > hdfs://localhost:39133/user1/bar > java.lang.IllegalArgumentException: Wrong FS: > hdfs://localhost:39133/user1/bar, expected: viewfs://cluster/ > {noformat} > This is because path schema is not matching with {{fs.defaultFS}}. > {{fs.defaultFS}} configured with {{viewFs}} and in file path {{hdfs}} is > used. This is broken because of HDFS-11968 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12857) StoragePolicyAdmin should support schema based path
[ https://issues.apache.org/jira/browse/HDFS-12857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Chen updated HDFS-12857: - Fix Version/s: 3.0.4 > StoragePolicyAdmin should support schema based path > --- > > Key: HDFS-12857 > URL: https://issues.apache.org/jira/browse/HDFS-12857 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.0 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Fix For: 3.1.0, 3.0.4 > > Attachments: HDFS-12857.001.patch > > > When we execute storagepolicy admin command with full schema path, then it > will throw this exception > {noformat} > ./hdfs storagepolicies -getstoragepolicy -path > hdfs://localhost:39133/user1/bar > java.lang.IllegalArgumentException: Wrong FS: > hdfs://localhost:39133/user1/bar, expected: viewfs://cluster/ > {noformat} > This is because path schema is not matching with {{fs.defaultFS}}. > {{fs.defaultFS}} configured with {{viewFs}} and in file path {{hdfs}} is > used. This is broken because of HDFS-11968 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13683) HDFS StoragePolicy commands should work with Federation
[ https://issues.apache.org/jira/browse/HDFS-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517319#comment-16517319 ] Xiao Chen commented on HDFS-13683: -- Yup, will be done via HDFS-12857 > HDFS StoragePolicy commands should work with Federation > --- > > Key: HDFS-13683 > URL: https://issues.apache.org/jira/browse/HDFS-13683 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, tools >Affects Versions: 3.0.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > > In a federated cluster (defaultFS = viewfs://cluster3), getStoragePolicy with > hdfs uri will run into the following error: > {noformat} > [root@cdh6-0-0-centos73-17443-1 ~]# hdfs storagepolicies -getStoragePolicy > -path > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896 > IllegalArgumentException: Wrong FS: > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896, > expected: viewfs://cluster3/ > {noformat} > Taking a quick look at the code, I think > [this|https://github.com/apache/hadoop/blob/3e37a9a70ba93430da1b47f2a8b50358348396b0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L106] > is the culprit: > {code} > final FileSystem fs = FileSystem.get(conf); > // should do: final FileSystem fs = p.getFilesystem(conf); > {code} > We should have a review of all shell and see if anything else is missing. At > the minimum, we should fix all places in StoragePolicyAdmin. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
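The "Wrong FS" error quoted in HDFS-13683 comes from choosing the filesystem from the default configuration instead of from the path itself. The toy model below illustrates that difference using only `java.net.URI`. It is not Hadoop code: `FsResolution` and its methods are hypothetical stand-ins, and the acceptance check is a simplified assumption about how a filesystem rejects foreign paths. In Hadoop itself, the per-path lookup is `Path#getFileSystem(Configuration)`.

```java
import java.net.URI;

public class FsResolution {
  /** Identifies a filesystem by scheme and authority, e.g. "hdfs://ns2". */
  static String fsKey(URI u) {
    return u.getScheme() + "://" + u.getAuthority();
  }

  /** Models the current behaviour: always use the default filesystem. */
  static String resolveFromDefault(URI defaultFs, URI path) {
    return fsKey(defaultFs);
  }

  /** Models the proposed behaviour: a qualified path picks its own filesystem. */
  static String resolveFromPath(URI defaultFs, URI path) {
    return path.getScheme() == null ? fsKey(defaultFs) : fsKey(path);
  }

  /** Simplified check: unqualified paths are accepted, qualified ones must match. */
  static boolean accepts(String fs, URI path) {
    return path.getScheme() == null || fs.equals(fsKey(path));
  }
}
```

With a default of `viewfs://cluster3/` and a path under `hdfs://ns2/`, `resolveFromDefault` yields a filesystem that does not accept the path, which corresponds to the exception in the report, while `resolveFromPath` yields one that does. Unqualified paths resolve to the default filesystem either way.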
[jira] [Commented] (HDFS-13683) HDFS StoragePolicy commands should work with Federation
[ https://issues.apache.org/jira/browse/HDFS-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517309#comment-16517309 ] Xiao Chen commented on HDFS-13683: -- Thanks Gabor for the investigation. Yes this looks to be a dup of HDFS-12857 :) branch-3.0 is still active, which will release to branch-3.0.x (latest is 3.0.3). More details at: http://hadoop.apache.org/versioning.html > HDFS StoragePolicy commands should work with Federation > --- > > Key: HDFS-13683 > URL: https://issues.apache.org/jira/browse/HDFS-13683 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, tools >Affects Versions: 3.0.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > > In a federated cluster (defaultFS = viewfs://cluster3), getStoragePolicy with > hdfs uri will run into the following error: > {noformat} > [root@cdh6-0-0-centos73-17443-1 ~]# hdfs storagepolicies -getStoragePolicy > -path > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896 > IllegalArgumentException: Wrong FS: > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896, > expected: viewfs://cluster3/ > {noformat} > Taking a quick look at the code, I think > [this|https://github.com/apache/hadoop/blob/3e37a9a70ba93430da1b47f2a8b50358348396b0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L106] > is the culprit: > {code} > final FileSystem fs = FileSystem.get(conf); > // should do: final FileSystem fs = p.getFilesystem(conf); > {code} > We should have a review of all shell and see if anything else is missing. At > the minimum, we should fix all places in StoragePolicyAdmin. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13683) HDFS StoragePolicy commands should work with Federation
[ https://issues.apache.org/jira/browse/HDFS-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517315#comment-16517315 ] Gabor Bota commented on HDFS-13683: --- Then maybe it would be a good idea to backport this to branch-3.0 > HDFS StoragePolicy commands should work with Federation > --- > > Key: HDFS-13683 > URL: https://issues.apache.org/jira/browse/HDFS-13683 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, tools >Affects Versions: 3.0.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > > In a federated cluster (defaultFS = viewfs://cluster3), getStoragePolicy with > hdfs uri will run into the following error: > {noformat} > [root@cdh6-0-0-centos73-17443-1 ~]# hdfs storagepolicies -getStoragePolicy > -path > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896 > IllegalArgumentException: Wrong FS: > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896, > expected: viewfs://cluster3/ > {noformat} > Taking a quick look at the code, I think > [this|https://github.com/apache/hadoop/blob/3e37a9a70ba93430da1b47f2a8b50358348396b0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L106] > is the culprit: > {code} > final FileSystem fs = FileSystem.get(conf); > // should do: final FileSystem fs = p.getFilesystem(conf); > {code} > We should have a review of all shell and see if anything else is missing. At > the minimum, we should fix all places in StoragePolicyAdmin. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-9457) Datanode Logging Improvement
[ https://issues.apache.org/jira/browse/HDFS-9457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HDFS-9457: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Datanode Logging Improvement > > > Key: HDFS-9457 > URL: https://issues.apache.org/jira/browse/HDFS-9457 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 2.7.0, 3.0.0-alpha1 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: logging.patch > > > Please accept my patch for some minor clean-up of logging. Patch is against > 3.0.0 but applies to previous versions as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13448) HDFS Block Placement - Ignore Locality for First Block Replica
[ https://issues.apache.org/jira/browse/HDFS-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HDFS-13448: --- Status: Patch Available (was: Open) Submitted new patch with [~templedf]'s suggestions. I am accustomed to a different mocking framework that throws errors if a mock's signature doesn't actually match any calls during the test, so my previous work assumed that if the source node was 'null' then the arguments of my mock wouldn't match and that would raise an exception. I see that in Mockito that is not the case. > HDFS Block Placement - Ignore Locality for First Block Replica > -- > > Key: HDFS-13448 > URL: https://issues.apache.org/jira/browse/HDFS-13448 > Project: Hadoop HDFS > Issue Type: New Feature > Components: block placement, hdfs-client >Affects Versions: 3.0.1, 2.9.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HDFS-13448.10.patch, HDFS-13448.11.patch, > HDFS-13448.12.patch, HDFS-13448.13.patch, HDFS-13448.6.patch, > HDFS-13448.7.patch, HDFS-13448.8.patch > > > According to the HDFS Block Place Rules: > {quote} > /** > * The replica placement strategy is that if the writer is on a datanode, > * the 1st replica is placed on the local machine, > * otherwise a random datanode. The 2nd replica is placed on a datanode > * that is on a different rack. The 3rd replica is placed on a datanode > * which is on a different node of the rack as the second replica. > */ > {quote} > However, there is a hint for the hdfs-client that allows the block placement > request to not put a block replica on the local datanode _where 'local' means > the same host as the client is being run on._ > {quote} > /** >* Advise that a block replica NOT be written to the local DataNode where >* 'local' means the same host as the client is being run on. 
>* >* @see CreateFlag#NO_LOCAL_WRITE >*/ > {quote} > I propose that we add a new flag that allows the hdfs-client to request that > the first block replica be placed on a random DataNode in the cluster. The > subsequent block replicas should follow the normal block placement rules. > The issue is that when the {{NO_LOCAL_WRITE}} is enabled, the first block > replica is not placed on the local node, but it is still placed on the local > rack. Where this comes into play is where you have, for example, a flume > agent that is loading data into HDFS. > If the Flume agent is running on a DataNode, then by default, the DataNode > local to the Flume agent will always get the first block replica and this > leads to un-even block placements, with the local node always filling up > faster than any other node in the cluster. > Modifying this example, if the DataNode is removed from the host where the > Flume agent is running, or this {{NO_LOCAL_WRITE}} is enabled by Flume, then > the default block placement policy will still prefer the local rack. This > remedies the situation only so far as now the first block replica will always > be distributed to a DataNode on the local rack. > This new flag would allow a single Flume agent to distribute the blocks > randomly, evenly, over the entire cluster instead of hot-spotting the local > node or the local rack. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13448) HDFS Block Placement - Ignore Locality for First Block Replica
[ https://issues.apache.org/jira/browse/HDFS-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HDFS-13448: --- Attachment: HDFS-13448.13.patch > HDFS Block Placement - Ignore Locality for First Block Replica > -- > > Key: HDFS-13448 > URL: https://issues.apache.org/jira/browse/HDFS-13448 > Project: Hadoop HDFS > Issue Type: New Feature > Components: block placement, hdfs-client >Affects Versions: 2.9.0, 3.0.1 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HDFS-13448.10.patch, HDFS-13448.11.patch, > HDFS-13448.12.patch, HDFS-13448.13.patch, HDFS-13448.6.patch, > HDFS-13448.7.patch, HDFS-13448.8.patch > > > According to the HDFS Block Place Rules: > {quote} > /** > * The replica placement strategy is that if the writer is on a datanode, > * the 1st replica is placed on the local machine, > * otherwise a random datanode. The 2nd replica is placed on a datanode > * that is on a different rack. The 3rd replica is placed on a datanode > * which is on a different node of the rack as the second replica. > */ > {quote} > However, there is a hint for the hdfs-client that allows the block placement > request to not put a block replica on the local datanode _where 'local' means > the same host as the client is being run on._ > {quote} > /** >* Advise that a block replica NOT be written to the local DataNode where >* 'local' means the same host as the client is being run on. >* >* @see CreateFlag#NO_LOCAL_WRITE >*/ > {quote} > I propose that we add a new flag that allows the hdfs-client to request that > the first block replica be placed on a random DataNode in the cluster. The > subsequent block replicas should follow the normal block placement rules. > The issue is that when the {{NO_LOCAL_WRITE}} is enabled, the first block > replica is not placed on the local node, but it is still placed on the local > rack. 
Where this comes into play is where you have, for example, a flume > agent that is loading data into HDFS. > If the Flume agent is running on a DataNode, then by default, the DataNode > local to the Flume agent will always get the first block replica and this > leads to un-even block placements, with the local node always filling up > faster than any other node in the cluster. > Modifying this example, if the DataNode is removed from the host where the > Flume agent is running, or this {{NO_LOCAL_WRITE}} is enabled by Flume, then > the default block placement policy will still prefer the local rack. This > remedies the situation only so far as now the first block replica will always > be distributed to a DataNode on the local rack. > This new flag would allow a single Flume agent to distribute the blocks > randomly, evenly, over the entire cluster instead of hot-spotting the local > node or the local rack. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
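The hot-spotting described in this issue can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not HDFS code: the class `FirstReplicaSim`, its `Mode` enum, and the node numbering are all hypothetical, the writer is assumed to sit on node 0 of rack 0, and only the first replica is modelled.

```java
import java.util.Random;

/** Toy model of first-replica placement; NOT the HDFS BlockPlacementPolicy. */
final class FirstReplicaSim {
    /** Hypothetical placement modes for the first replica only. */
    enum Mode { LOCAL_NODE, LOCAL_RACK, RANDOM_NODE }

    /**
     * Returns how many first replicas each node receives.
     * Nodes are numbered 0..racks*nodesPerRack-1 and rack 0 holds nodes
     * 0..nodesPerRack-1. The writer is assumed to run on node 0.
     * nodesPerRack must be at least 2 for LOCAL_RACK.
     */
    static int[] simulate(Mode mode, int racks, int nodesPerRack, int blocks, long seed) {
        Random rnd = new Random(seed);
        int[] counts = new int[racks * nodesPerRack];
        for (int b = 0; b < blocks; b++) {
            int target;
            switch (mode) {
                case LOCAL_NODE:
                    // Default behaviour when the writer is on a DataNode.
                    target = 0;
                    break;
                case LOCAL_RACK:
                    // NO_LOCAL_WRITE-like behaviour: skip node 0, stay on rack 0.
                    target = 1 + rnd.nextInt(nodesPerRack - 1);
                    break;
                default:
                    // Proposed behaviour: any node in the cluster.
                    target = rnd.nextInt(counts.length);
            }
            counts[target]++;
        }
        return counts;
    }

    /** Number of first replicas that landed on rack 0. */
    static int onLocalRack(int[] counts, int nodesPerRack) {
        int sum = 0;
        for (int i = 0; i < nodesPerRack; i++) {
            sum += counts[i];
        }
        return sum;
    }
}
```

With `LOCAL_NODE` every block lands on node 0, and with `LOCAL_RACK` every block still lands on rack 0. Only `RANDOM_NODE` spreads the first replica over all racks, which is the effect the proposed flag is after.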
[jira] [Resolved] (HDFS-13683) HDFS StoragePolicy commands should work with Federation
[ https://issues.apache.org/jira/browse/HDFS-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Bota resolved HDFS-13683.
---
Resolution: Duplicate

> HDFS StoragePolicy commands should work with Federation
> ---
>
> Key: HDFS-13683
> URL: https://issues.apache.org/jira/browse/HDFS-13683
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, tools
> Affects Versions: 3.0.0
> Reporter: Gabor Bota
> Assignee: Gabor Bota
> Priority: Major
>
> In a federated cluster (defaultFS = viewfs://cluster3), getStoragePolicy with
> an hdfs URI runs into the following error:
> {noformat}
> [root@cdh6-0-0-centos73-17443-1 ~]# hdfs storagepolicies -getStoragePolicy -path hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896
> IllegalArgumentException: Wrong FS: hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896, expected: viewfs://cluster3/
> {noformat}
> Taking a quick look at the code, I think
> [this|https://github.com/apache/hadoop/blob/3e37a9a70ba93430da1b47f2a8b50358348396b0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L106]
> is the culprit:
> {code}
> final FileSystem fs = FileSystem.get(conf);
> // should do: final FileSystem fs = p.getFileSystem(conf);
> {code}
> We should review all shell commands and see if anything else is missing. At
> the minimum, we should fix all places in StoragePolicyAdmin.
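The "Wrong FS" failure in this report comes down to checking a path against the default filesystem instead of the filesystem named by the path itself. The sketch below models only that idea with `java.net.URI`; it does not use Hadoop classes, and `FsResolveSketch`, `accepts`, and `resolve` are hypothetical names chosen for illustration.

```java
import java.net.URI;
import java.util.Objects;

/** Toy model of the "Wrong FS" check; not the Hadoop FileSystem API. */
final class FsResolveSketch {
    /** A filesystem only accepts paths that carry its own scheme and authority. */
    static boolean accepts(URI fsUri, URI path) {
        if (path.getScheme() == null) {
            return true; // scheme-less paths are relative to the filesystem
        }
        return path.getScheme().equalsIgnoreCase(fsUri.getScheme())
            && Objects.equals(path.getAuthority(), fsUri.getAuthority());
    }

    /** Picks the filesystem URI from the path itself, falling back to the default. */
    static URI resolve(URI defaultFs, URI path) {
        if (path.getScheme() == null) {
            return defaultFs;
        }
        String authority = path.getAuthority() == null ? "" : path.getAuthority();
        return URI.create(path.getScheme() + "://" + authority + "/");
    }
}
```

For example, with a default of `viewfs://cluster3/` and a path under `hdfs://ns2/`, `accepts(defaultFs, path)` is false, while `accepts(resolve(defaultFs, path), path)` is true. That mirrors the difference between using the default filesystem and asking the path for its own.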
[jira] [Commented] (HDFS-13658) fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 1 replica
[ https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517228#comment-16517228 ]

Kitti Nanasi commented on HDFS-13658:
-

Thanks for the comments, [~xiaochen]! I uploaded patch v005, which addresses most of your comments:
* I added a "listonereplicablocks" flag to fsck.
* The isStriped check is removed.
* I added test coverage in TestNameNodeMetrics.
* I verify the new metric in the existing test cases of TestLowRedundancyBlockQueues, which covers the test cases you suggested. However, I kept the TestOneReplicaBlocksAlert integration test as well, because it checks that everything works together. Do you think I should keep or remove the integration test?
* About using a set instead of a single long: if I use a long, the metric increment still works fine, because I know the current number of replicas. When the metric is decremented, however, I would need to know what the replica count was at the time of the previous increment. The decrement can happen for various reasons, for example if the whole file was removed or if more replicas were created for the block, and in some cases there is no information about the previous replica count. But I agree with you that I shouldn't store the block infos. Do you have any suggestions on how to fix that? I can only think of creating another priority queue in LowRedundancyBlocks, but that would probably ruin a bunch of other things, or of storing just the block's id, for example, instead of the whole block info.
> fsck, dfsadmin -report, and NN WebUI should report number of blocks that have > 1 replica > --- > > Key: HDFS-13658 > URL: https://issues.apache.org/jira/browse/HDFS-13658 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.1.0 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-13658.001.patch, HDFS-13658.002.patch, > HDFS-13658.003.patch, HDFS-13658.004.patch, HDFS-13658.005.patch > > > fsck, dfsadmin -report, and NN WebUI should report number of blocks that have > 1 replica. We have had many cases opened in which a customer has lost a disk > or a DN losing files/blocks due to the fact that they had blocks with only 1 > replica. We need to make the customer better aware of this situation and that > they should take action. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
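The trade-off discussed in the comment above (a plain counter versus remembering which blocks were counted) can be sketched as follows. This is an illustration of the "store only the block id" idea, not the code in any of the attached patches; `OneReplicaTracker` and its method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Tracks blocks with exactly one live replica by id only (hypothetical sketch). */
final class OneReplicaTracker {
    private final Set<Long> oneReplicaBlockIds = new HashSet<>();

    /** Call whenever a block's live replica count changes. */
    void onReplicaCountChanged(long blockId, int liveReplicas) {
        if (liveReplicas == 1) {
            oneReplicaBlockIds.add(blockId); // no effect if already tracked
        } else {
            oneReplicaBlockIds.remove(blockId); // no effect if never tracked
        }
    }

    /** Call when a block is removed, for example because its file was deleted. */
    void onBlockDeleted(long blockId) {
        oneReplicaBlockIds.remove(blockId);
    }

    /** Current number of blocks with exactly one replica. */
    int count() {
        return oneReplicaBlockIds.size();
    }
}
```

Because adding and removing ids is idempotent, the caller never needs to know the previous replica count, which is the piece of information a single long counter would require. The cost is one boxed `Long` per tracked block.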
[jira] [Updated] (HDFS-13658) fsck, dfsadmin -report, and NN WebUI should report number of blocks that have 1 replica
[ https://issues.apache.org/jira/browse/HDFS-13658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-13658: Attachment: HDFS-13658.005.patch > fsck, dfsadmin -report, and NN WebUI should report number of blocks that have > 1 replica > --- > > Key: HDFS-13658 > URL: https://issues.apache.org/jira/browse/HDFS-13658 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.1.0 >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Major > Attachments: HDFS-13658.001.patch, HDFS-13658.002.patch, > HDFS-13658.003.patch, HDFS-13658.004.patch, HDFS-13658.005.patch > > > fsck, dfsadmin -report, and NN WebUI should report number of blocks that have > 1 replica. We have had many cases opened in which a customer has lost a disk > or a DN losing files/blocks due to the fact that they had blocks with only 1 > replica. We need to make the customer better aware of this situation and that > they should take action. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13448) HDFS Block Placement - Ignore Locality for First Block Replica
[ https://issues.apache.org/jira/browse/HDFS-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] BELUGA BEHR updated HDFS-13448: --- Status: Open (was: Patch Available) > HDFS Block Placement - Ignore Locality for First Block Replica > -- > > Key: HDFS-13448 > URL: https://issues.apache.org/jira/browse/HDFS-13448 > Project: Hadoop HDFS > Issue Type: New Feature > Components: block placement, hdfs-client >Affects Versions: 3.0.1, 2.9.0 >Reporter: BELUGA BEHR >Assignee: BELUGA BEHR >Priority: Minor > Attachments: HDFS-13448.10.patch, HDFS-13448.11.patch, > HDFS-13448.12.patch, HDFS-13448.6.patch, HDFS-13448.7.patch, > HDFS-13448.8.patch > > > According to the HDFS Block Place Rules: > {quote} > /** > * The replica placement strategy is that if the writer is on a datanode, > * the 1st replica is placed on the local machine, > * otherwise a random datanode. The 2nd replica is placed on a datanode > * that is on a different rack. The 3rd replica is placed on a datanode > * which is on a different node of the rack as the second replica. > */ > {quote} > However, there is a hint for the hdfs-client that allows the block placement > request to not put a block replica on the local datanode _where 'local' means > the same host as the client is being run on._ > {quote} > /** >* Advise that a block replica NOT be written to the local DataNode where >* 'local' means the same host as the client is being run on. >* >* @see CreateFlag#NO_LOCAL_WRITE >*/ > {quote} > I propose that we add a new flag that allows the hdfs-client to request that > the first block replica be placed on a random DataNode in the cluster. The > subsequent block replicas should follow the normal block placement rules. > The issue is that when the {{NO_LOCAL_WRITE}} is enabled, the first block > replica is not placed on the local node, but it is still placed on the local > rack. Where this comes into play is where you have, for example, a flume > agent that is loading data into HDFS. 
> If the Flume agent is running on a DataNode, then by default, the DataNode > local to the Flume agent will always get the first block replica and this > leads to un-even block placements, with the local node always filling up > faster than any other node in the cluster. > Modifying this example, if the DataNode is removed from the host where the > Flume agent is running, or this {{NO_LOCAL_WRITE}} is enabled by Flume, then > the default block placement policy will still prefer the local rack. This > remedies the situation only so far as now the first block replica will always > be distributed to a DataNode on the local rack. > This new flag would allow a single Flume agent to distribute the blocks > randomly, evenly, over the entire cluster instead of hot-spotting the local > node or the local rack. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517173#comment-16517173 ] Yiqun Lin edited comment on HDFS-13671 at 6/19/18 2:37 PM: --- [~jojochuang], the link of FoldedTreeSet design doc:https://issues.apache.org/jira/secure/attachment/12767102/HDFS%20Block%20and%20Replica%20Management%2020151013.pdf {quote}Yiqun Lin, do you happen to know what 's the deletion rate in your cluster before HDFS-9260? (I doubt if it's at 344k blocks/sec ) {quote} [~xiaochen], I haven't tested the case without the patch of HDFS-9260. I can have a test if I have some free time, :). was (Author: linyiqun): [~jojochuang], the link of FoldedTreeSet design doc: [^HDFS Block and Replica Management 20151013.pdf] {quote}Yiqun Lin, do you happen to know what 's the deletion rate in your cluster before HDFS-9260? (I doubt if it's at 344k blocks/sec ) {quote} [~xiaochen], I haven't tested the case without the patch of HDFS-9260. I can have a test if I have some free time, :). > Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet > -- > > Key: HDFS-13671 > URL: https://issues.apache.org/jira/browse/HDFS-13671 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.3 >Reporter: Yiqun Lin >Priority: Major > > NameNode hung when deleting large files/blocks. 
The stack info: > {code} > "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 > tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911) > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > {code} > In the current 
deletion logic in NameNode, there are mainly two steps: > * Collect INodes and all blocks to be deleted, then delete INodes. > * Remove blocks chunk by chunk in a loop. > Actually the first step should be a more expensive operation and will takes > more time. However, now we always see NN hangs during the remove block > operation. > Looking into this, we introduced a new structure {{FoldedTreeSet}} to have a > better performance in dealing FBR/IBRs. But compared with early > implementation in remove-block logic, {{FoldedTreeSet}} seems more slower > since It will take additional time to balance tree node. When there are large > block to be removed/deleted, it looks bad. > For the get type operations in {{DatanodeStorageInfo}}, we only provide the > {{getBlockIterator}} to return blocks iterator and no other get operation > with specified block. Still we need to use {{FoldedTreeSet}} in > {{DatanodeStorageInfo}}? As we know {{FoldedTreeSet}} is benefit for Get not > Update. Maybe we can revert this to the early implementation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail:
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517173#comment-16517173 ] Yiqun Lin commented on HDFS-13671: -- [~jojochuang], the link of FoldedTreeSet design doc: [^HDFS Block and Replica Management 20151013.pdf] {quote}Yiqun Lin, do you happen to know what 's the deletion rate in your cluster before HDFS-9260? (I doubt if it's at 344k blocks/sec ) {quote} [~xiaochen], I haven't tested the case without the patch of HDFS-9260. I can have a test if I have some free time, :). > Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet > -- > > Key: HDFS-13671 > URL: https://issues.apache.org/jira/browse/HDFS-13671 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.3 >Reporter: Yiqun Lin >Priority: Major > > NameNode hung when deleting large files/blocks. The stack info: > {code} > "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 > tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911) > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270) > at > 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > {code} > In the current deletion logic in NameNode, there are mainly two steps: > * Collect INodes and all blocks to be deleted, then delete INodes. > * Remove blocks chunk by chunk in a loop. > Actually the first step should be a more expensive operation and will takes > more time. However, now we always see NN hangs during the remove block > operation. > Looking into this, we introduced a new structure {{FoldedTreeSet}} to have a > better performance in dealing FBR/IBRs. But compared with early > implementation in remove-block logic, {{FoldedTreeSet}} seems more slower > since It will take additional time to balance tree node. When there are large > block to be removed/deleted, it looks bad. > For the get type operations in {{DatanodeStorageInfo}}, we only provide the > {{getBlockIterator}} to return blocks iterator and no other get operation > with specified block. Still we need to use {{FoldedTreeSet}} in > {{DatanodeStorageInfo}}? As we know {{FoldedTreeSet}} is benefit for Get not > Update. Maybe we can revert this to the early implementation. 
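The second step of the deletion logic described above, removing blocks chunk by chunk in a loop, might look roughly like the sketch below. The lock standing in for the namesystem lock, the chunk size, and the names `ChunkedRemoval` and `removeInChunks` are assumptions made for illustration. The real code path is the one shown in the stack trace.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.LongConsumer;

/** Hypothetical sketch of chunked block removal; not the FSNamesystem code. */
final class ChunkedRemoval {
    /**
     * Removes block ids in chunks, holding the lock only while one chunk is
     * processed so other work can run between chunks.
     * chunkSize must be positive. Returns the number of chunks processed.
     */
    static int removeInChunks(List<Long> blockIds, int chunkSize,
                              ReentrantLock lock, LongConsumer remover) {
        int chunks = 0;
        for (int start = 0; start < blockIds.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, blockIds.size());
            lock.lock();
            try {
                for (long id : blockIds.subList(start, end)) {
                    // The cost of this call depends on the backing structure.
                    remover.accept(id);
                }
            } finally {
                lock.unlock();
            }
            chunks++;
        }
        return chunks;
    }
}
```

The time spent under the lock per chunk is the chunk size multiplied by the cost of a single removal, so a slower per-block remove in the backing set directly lengthens each pause seen by other handlers.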
[jira] [Commented] (HDFS-13616) Batch listing of multiple directories
[ https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517070#comment-16517070 ]

lpstudy commented on HDFS-13616:

I am working on erasure coding and have run into a question I could not find an answer to, and I do not know where else to post it.

Question: Hadoop 3.0 supports erasure coding with a striped layout, which requires creating multiple output streams in order to write data into one file. However, as far as I know, Hadoop does not support writing to one file simultaneously. So my question is: how is this achieved?

> Batch listing of multiple directories
> -
>
> Key: HDFS-13616
> URL: https://issues.apache.org/jira/browse/HDFS-13616
> Project: Hadoop HDFS
> Issue Type: New Feature
> Affects Versions: 3.2.0
> Reporter: Andrew Wang
> Assignee: Andrew Wang
> Priority: Major
> Attachments: BenchmarkListFiles.java, HDFS-13616.001.patch,
> HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of
> partition directories. This can end up being bottlenecked on RTT when
> partition directories contain a small number of files. This is fairly common,
> since fine-grained partitioning is used for partition pruning by the query
> engines.
> A batched listing API that takes multiple paths amortizes the RTT cost.
> Initial benchmarks show a 10-20x improvement in metadata loading performance.
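The amortization argument in the issue description is simple arithmetic: one RPC per directory pays the round trip once per directory, while a batched call pays it once per batch. The sketch below shows only that calculation. `BatchListingCost` is a hypothetical helper, server-side listing time is ignored, and the figures in the usage note are illustrative rather than taken from the benchmark attached to the issue.

```java
/** Back-of-the-envelope RTT cost of listing many directories (hypothetical helper). */
final class BatchListingCost {
    /** Total round-trip time in ms when each RPC lists up to batchSize directories. */
    static long totalRttMillis(int directories, int batchSize, long rttMillis) {
        long rpcs = (directories + batchSize - 1L) / batchSize; // ceiling division
        return rpcs * rttMillis;
    }
}
```

For 1000 partition directories and a 2 ms round trip, listing them one by one costs 2000 ms of RTT alone, while batches of 100 cost 20 ms. The saving shrinks once directories are large enough that server-side listing time dominates.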
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517057#comment-16517057 ] Wei-Chiu Chuang commented on HDFS-13671: [~linyiqun] thanks for reporting the issue. It seems you tried to attach a file (HDFS Block and Replica Management 20151013.pdf), but it wasn't uploaded. Would you please share this file again? > Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet > -- > > Key: HDFS-13671 > URL: https://issues.apache.org/jira/browse/HDFS-13671 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.3 >Reporter: Yiqun Lin >Priority: Major > > NameNode hung when deleting large files/blocks. The stack info: > {code} > "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 > tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911) > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180) > at >
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > {code} > In the current deletion logic in NameNode, there are mainly two steps: > * Collect INodes and all blocks to be deleted, then delete INodes. > * Remove blocks chunk by chunk in a loop. > Actually the first step should be a more expensive operation and will takes > more time. However, now we always see NN hangs during the remove block > operation. > Looking into this, we introduced a new structure {{FoldedTreeSet}} to have a > better performance in dealing FBR/IBRs. But compared with early > implementation in remove-block logic, {{FoldedTreeSet}} seems more slower > since It will take additional time to balance tree node. When there are large > block to be removed/deleted, it looks bad. > For the get type operations in {{DatanodeStorageInfo}}, we only provide the > {{getBlockIterator}} to return blocks iterator and no other get operation > with specified block. Still we need to use {{FoldedTreeSet}} in > {{DatanodeStorageInfo}}? As we know {{FoldedTreeSet}} is benefit for Get not > Update. Maybe we can revert this to the early implementation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516996#comment-16516996 ] genericqa commented on HDDS-178: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 37s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 24s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 7s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 38s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 4m 0s{color} | {color:red} branch has errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 2m 21s{color} | {color:red} patch has errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 56s{color} | {color:red} hadoop-hdds/container-service generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 42s{color} | {color:green} container-service in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 22m 29s{color} | {color:red} integration-test in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 42s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}126m 37s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdds/container-service | | | Format-string method String.format(String, Object[]) called with format string "Ignoring delete blocks for containerId: {}. Outdated delete transactionId {} < {}" wants 0 arguments but is given 3 in
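The FindBugs warning above is about SLF4J-style `{}` placeholders being passed to `String.format`, which only understands `%` specifiers. The sketch below reproduces the effect with the format string quoted in the report. The class and method names are hypothetical, the parameter types are assumed to be `long`, and the `%d` variant is just one possible correction, not necessarily what the patch ended up doing.

```java
/** Demonstrates why "{}" placeholders do nothing in String.format (hypothetical sketch). */
final class FormatPlaceholders {
    /** "{}" is not a java.util.Formatter specifier, so all three arguments are ignored. */
    static String slf4jStyle(long containerId, long txId, long current) {
        return String.format(
            "Ignoring delete blocks for containerId: {}. Outdated delete transactionId {} < {}",
            containerId, txId, current);
    }

    /** Same message with %d specifiers, which String.format does substitute. */
    static String formatterStyle(long containerId, long txId, long current) {
        return String.format(
            "Ignoring delete blocks for containerId: %d. Outdated delete transactionId %d < %d",
            containerId, txId, current);
    }
}
```

`slf4jStyle(7, 3, 5)` returns the template unchanged, braces included, while `formatterStyle(7, 3, 5)` returns "Ignoring delete blocks for containerId: 7. Outdated delete transactionId 3 < 5". Passing the `{}` template straight to an SLF4J logger call would be another way to resolve the warning.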
[jira] [Commented] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516956#comment-16516956 ] genericqa commented on HDDS-178: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 25s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 26m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 14s{color} | {color:green} trunk passed {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 4m 28s{color} | {color:red} branch has errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 48s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 23s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 50s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 27m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 27s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:red}-1{color} | {color:red} shadedclient {color} | {color:red} 2m 17s{color} | {color:red} patch has errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 56s{color} | {color:red} hadoop-hdds/container-service generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 46s{color} | {color:green} container-service in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 27m 42s{color} | {color:red} integration-test in the patch failed. {color} | | {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 45s{color} | {color:red} The patch generated 1 ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}129m 44s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdds/container-service | | | Format-string method String.format(String, Object[]) called with format string "Ignoring delete blocks for containerId: {}. Outdated delete transactionId {} < {}" wants 0 arguments but is given 3 in
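The FindBugs finding quoted above is what one would expect when an SLF4J-style template with {} placeholders is handed to String.format, which only understands %-style specifiers and silently ignores surplus arguments. A minimal sketch of the difference follows; the class and method names are hypothetical and this is not the code from the HDDS-178 patch.

```java
final class FormatStringSketch {
    // SLF4J-style template: String.format finds no % specifiers here,
    // so the three arguments are ignored and the braces are printed literally.
    static String broken(long containerId, long txId, long latestTxId) {
        return String.format(
            "Ignoring delete blocks for containerId: {}. Outdated delete transactionId {} < {}",
            containerId, txId, latestTxId);
    }

    // Same message with %d specifiers, which String.format does substitute.
    static String fixed(long containerId, long txId, long latestTxId) {
        return String.format(
            "Ignoring delete blocks for containerId: %d. Outdated delete transactionId %d < %d",
            containerId, txId, latestTxId);
    }
}
```

An alternative, if the message is only ever logged, would be to keep the {} template and pass it to a logger that supports that placeholder syntax instead of formatting it first.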
[jira] [Comment Edited] (HDFS-13683) HDFS StoragePolicy commands should work with Federation
[ https://issues.apache.org/jira/browse/HDFS-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516916#comment-16516916 ] Gabor Bota edited comment on HDFS-13683 at 6/19/18 10:31 AM: - But the fix is missing from [branch-3.0|https://github.com/apache/hadoop/blob/d1dcc39222b6d1d8ba10f38a3f2fb69e4d6548b3/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L156] (which is ok, since there will be no new upstream release from the branch) was (Author: gabor.bota): But the fix is missing from [branch-3.0|https://github.com/apache/hadoop/blob/d1dcc39222b6d1d8ba10f38a3f2fb69e4d6548b3/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L156] (which is ok, since there will be no upstream release from the branch) > HDFS StoragePolicy commands should work with Federation > --- > > Key: HDFS-13683 > URL: https://issues.apache.org/jira/browse/HDFS-13683 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, tools >Affects Versions: 3.0.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > > In a federated cluster (defaultFS = viewfs://cluster3), getStoragePolicy with > hdfs uri will run into the following error: > {noformat} > [root@cdh6-0-0-centos73-17443-1 ~]# hdfs storagepolicies -getStoragePolicy > -path > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896 > IllegalArgumentException: Wrong FS: > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896, > expected: viewfs://cluster3/ > {noformat} > Taking a quick look at the code, I think > [this|https://github.com/apache/hadoop/blob/3e37a9a70ba93430da1b47f2a8b50358348396b0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L106] > is the culprit: > {code} > final FileSystem fs = FileSystem.get(conf); > // should do: final FileSystem fs = p.getFilesystem(conf); > {code} > We should 
have a review of all shell and see if anything else is missing. At > the minimum, we should fix all places in StoragePolicyAdmin. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
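The "Wrong FS" error described above comes from asking the default filesystem (viewfs://cluster3) to handle a fully qualified hdfs:// path. The stdlib-only sketch below models that check with java.net.URI; checkPath and fileSystemUriFor are hypothetical helpers written for illustration, not Hadoop APIs. In Hadoop itself, the path-aware lookup the issue suggests corresponds to Path#getFileSystem(Configuration).

```java
import java.net.URI;
import java.util.Objects;

final class WrongFsSketch {
    /** Rejects a qualified path whose scheme or authority differs from the filesystem's. */
    static void checkPath(URI fsUri, URI path) {
        if (path.getScheme() == null) {
            return; // unqualified paths are resolved by the filesystem itself
        }
        boolean sameScheme = path.getScheme().equalsIgnoreCase(fsUri.getScheme());
        boolean sameAuthority = Objects.equals(path.getAuthority(), fsUri.getAuthority());
        if (!sameScheme || !sameAuthority) {
            throw new IllegalArgumentException("Wrong FS: " + path + ", expected: " + fsUri);
        }
    }

    /** Picks the filesystem from the path when it is qualified, else falls back to the default. */
    static URI fileSystemUriFor(URI path, URI defaultFs) {
        if (path.getScheme() == null) {
            return defaultFs;
        }
        String authority = path.getAuthority() == null ? "" : path.getAuthority();
        return URI.create(path.getScheme() + "://" + authority + "/");
    }
}
```

With the default filesystem the check fails for hdfs://ns2/..., while a filesystem chosen from the path itself passes the same check.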
[jira] [Comment Edited] (HDFS-13683) HDFS StoragePolicy commands should work with Federation
[ https://issues.apache.org/jira/browse/HDFS-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516916#comment-16516916 ] Gabor Bota edited comment on HDFS-13683 at 6/19/18 10:31 AM: - But the fix is missing from [branch-3.0|https://github.com/apache/hadoop/blob/d1dcc39222b6d1d8ba10f38a3f2fb69e4d6548b3/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L156] (which is ok, since there will be no upstream release from the branch) was (Author: gabor.bota): But the fix is missing from [branch-3.0|https://github.com/apache/hadoop/blob/d1dcc39222b6d1d8ba10f38a3f2fb69e4d6548b3/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L156] > HDFS StoragePolicy commands should work with Federation > --- > > Key: HDFS-13683 > URL: https://issues.apache.org/jira/browse/HDFS-13683 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, tools >Affects Versions: 3.0.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > > In a federated cluster (defaultFS = viewfs://cluster3), getStoragePolicy with > hdfs uri will run into the following error: > {noformat} > [root@cdh6-0-0-centos73-17443-1 ~]# hdfs storagepolicies -getStoragePolicy > -path > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896 > IllegalArgumentException: Wrong FS: > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896, > expected: viewfs://cluster3/ > {noformat} > Taking a quick look at the code, I think > [this|https://github.com/apache/hadoop/blob/3e37a9a70ba93430da1b47f2a8b50358348396b0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L106] > is the culprit: > {code} > final FileSystem fs = FileSystem.get(conf); > // should do: final FileSystem fs = p.getFilesystem(conf); > {code} > We should have a review of all shell and see if anything else is missing. 
At > the minimum, we should fix all places in StoragePolicyAdmin.
[jira] [Resolved] (HDDS-180) CloseContainer should commit all pending open keys for a container
[ https://issues.apache.org/jira/browse/HDDS-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-180. -- Resolution: Fixed > CloseContainer should commit all pending open keys for a container > -- > > Key: HDDS-180 > URL: https://issues.apache.org/jira/browse/HDDS-180 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > > When a close container command gets executed, it will first mark the > container in closing state. All the open Keys for the container will now > have to be committed. This requires us to track all pending open keys for a > container on a DataNode. This Jira aims to address all these. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13683) HDFS StoragePolicy commands should work with Federation
[ https://issues.apache.org/jira/browse/HDFS-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516916#comment-16516916 ] Gabor Bota commented on HDFS-13683: --- But the fix is missing from [branch-3.0|https://github.com/apache/hadoop/blob/d1dcc39222b6d1d8ba10f38a3f2fb69e4d6548b3/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L156] > HDFS StoragePolicy commands should work with Federation > --- > > Key: HDFS-13683 > URL: https://issues.apache.org/jira/browse/HDFS-13683 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, tools >Affects Versions: 3.0.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > > In a federated cluster (defaultFS = viewfs://cluster3), getStoragePolicy with > hdfs uri will run into the following error: > {noformat} > [root@cdh6-0-0-centos73-17443-1 ~]# hdfs storagepolicies -getStoragePolicy > -path > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896 > IllegalArgumentException: Wrong FS: > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896, > expected: viewfs://cluster3/ > {noformat} > Taking a quick look at the code, I think > [this|https://github.com/apache/hadoop/blob/3e37a9a70ba93430da1b47f2a8b50358348396b0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L106] > is the culprit: > {code} > final FileSystem fs = FileSystem.get(conf); > // should do: final FileSystem fs = p.getFilesystem(conf); > {code} > We should have a review of all shell and see if anything else is missing. At > the minimum, we should fix all places in StoragePolicyAdmin. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-181) CloseContainer should commit all pending open Keys on a dataode
[ https://issues.apache.org/jira/browse/HDDS-181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDDS-181: - Description: A close container command arrives in the Datanode by the SCM heartBeat response.It will then be queued up over the ratis pipeline. Once the command execution starts inside the Datanode, it will mark the container in CLOSING State. All the pending open keys for the container now will be committed followed by the transition of the container state from CLOSING to CLOSED. For achieving this, all the open keys for a container need to be tracked. This Jira aims to address this. was: A close container command arrives in the Datanode by the SCM heartBeat response. It will then be queued up over the ratis pipeline. Once the command execution starts inside the Datanode, it will mark the container in CLOSING State. All the pending open keys for the container now will be committed followed by the transition of the container state from CLOSING to CLOSED state. For achieving this, all the open keys for a container need to be tracked. This Jira aims to address this. > CloseContainer should commit all pending open Keys on a dataode > --- > > Key: HDDS-181 > URL: https://issues.apache.org/jira/browse/HDDS-181 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > > A close container command arrives in the Datanode by the SCM heartBeat > response.It will then be queued up over the ratis pipeline. Once the command > execution starts inside the Datanode, it will mark the container in CLOSING > State. All the pending open keys for the container now will be committed > followed by the transition of the container state from CLOSING to CLOSED. For > achieving this, all the open keys for a container need to be tracked. > This Jira aims to address this. 
[jira] [Created] (HDDS-181) CloseContainer should commit all pending open Keys on a dataode
Shashikant Banerjee created HDDS-181: Summary: CloseContainer should commit all pending open Keys on a dataode Key: HDDS-181 URL: https://issues.apache.org/jira/browse/HDDS-181 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 A close container command arrives in the Datanode by the SCM heartBeat response. It will then be queued up over the ratis pipeline. Once the command execution starts inside the Datanode, it will mark the container in CLOSING State. All the pending open keys for the container now will be committed followed by the transition of the container state from CLOSING to CLOSED state. For achieving this, all the open keys for a container need to be tracked. This Jira aims to address this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
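The close sequence described above (mark CLOSING, commit every pending open key, then move to CLOSED) can be sketched as a small state machine. Everything here is an assumption made for illustration: the class, the String key type, and the in-memory bookkeeping are hypothetical and do not reflect the Ozone Datanode implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class CloseContainerSketch {
    enum State { OPEN, CLOSING, CLOSED }

    private State state = State.OPEN;
    // Keys opened against this container but not yet committed (hypothetical tracking).
    private final Set<String> pendingOpenKeys = new LinkedHashSet<>();
    private final List<String> committedKeys = new ArrayList<>();

    synchronized void openKey(String key) {
        if (state != State.OPEN) {
            throw new IllegalStateException("container is " + state);
        }
        pendingOpenKeys.add(key);
    }

    /** Marks the container CLOSING, commits all tracked open keys, then marks it CLOSED. */
    synchronized void close() {
        state = State.CLOSING;
        committedKeys.addAll(pendingOpenKeys);
        pendingOpenKeys.clear();
        state = State.CLOSED;
    }

    synchronized State state() { return state; }
    synchronized List<String> committedKeys() { return new ArrayList<>(committedKeys); }
    synchronized int pendingCount() { return pendingOpenKeys.size(); }
}
```

The point of the sketch is only the ordering: no key can be opened once the container has left OPEN, and CLOSED is reached only after the pending set is empty.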
[jira] [Work stopped] (HDFS-13393) Improve OOM logging
[ https://issues.apache.org/jira/browse/HDFS-13393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-13393 stopped by Gabor Bota. - > Improve OOM logging > --- > > Key: HDFS-13393 > URL: https://issues.apache.org/jira/browse/HDFS-13393 > Project: Hadoop HDFS > Issue Type: Improvement > Components: balancer mover, datanode >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > > It is not uncommon to find "java.lang.OutOfMemoryError: unable to create new > native thread" errors in an HDFS cluster. Most often this happens when a > DataNode creates DataXceiver threads, or when the balancer creates threads for > moving blocks around. > In most cases, the "OOM" is a symptom of the number of threads reaching the system > limit, rather than actually running out of memory, and the current logging of > this message is usually misleading (suggesting this is due to insufficient > memory). > How about capturing the OOM, and if it is due to "unable to create new native > thread", printing a more helpful message like "bump your ulimit" or "take a > jstack of the process"? > Even better, surface this error to make it more visible. It usually takes a > while for an in-depth investigation to start after users notice that some job fails, by which > time the evidence (like jstack output) may already be gone.
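The proposal above amounts to inspecting the OutOfMemoryError message and logging a hint when the failure is about thread creation rather than heap. A minimal sketch of that idea follows; the class, method names, and message wording are hypothetical, and it prints to stderr rather than using Hadoop's logging.

```java
final class OomLoggingSketch {
    static final String NATIVE_THREAD_MARKER = "unable to create new native thread";

    /** Builds a log message that distinguishes thread-limit failures from heap exhaustion. */
    static String describe(OutOfMemoryError e) {
        String msg = e.getMessage();
        if (msg != null && msg.contains(NATIVE_THREAD_MARKER)) {
            return "Thread creation failed (" + msg + "): the process has likely hit a "
                + "thread limit rather than run out of heap. Consider raising the ulimit "
                + "and take a jstack of the process.";
        }
        return "OutOfMemoryError: " + msg;
    }

    /** Starts a thread and reports a clearer message if the JVM cannot create it. */
    static boolean startThread(Runnable task) {
        try {
            new Thread(task).start();
            return true;
        } catch (OutOfMemoryError e) {
            System.err.println(describe(e));
            return false;
        }
    }
}
```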
[jira] [Created] (HDDS-180) CloseContainer should commit all pending open keys for a container
Shashikant Banerjee created HDDS-180: Summary: CloseContainer should commit all pending open keys for a container Key: HDDS-180 URL: https://issues.apache.org/jira/browse/HDDS-180 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When a close container command gets executed, it will first mark the container in closing state. All the open Keys for the container will now have to be committed. This requires us to track all pending open keys for a container on a DataNode. This Jira aims to address all these. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode
[ https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota reassigned HDFS-13672: - Assignee: Gabor Bota > clearCorruptLazyPersistFiles could crash NameNode > - > > Key: HDFS-13672 > URL: https://issues.apache.org/jira/browse/HDFS-13672 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > > I started a NameNode on a pretty large fsimage. Since the NameNode is started > without any DataNodes, all blocks (100 million) are "corrupt". > Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write > lock for a long time: > {noformat} > 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held > for 46024 ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543) > java.lang.Thread.run(Thread.java:748) > Number of suppressed write-lock reports: 0 > Longest write-lock held interval: 46024 > {noformat} > Here's the relevant code: > {code} > writeLock(); > try { > final Iterator it = > blockManager.getCorruptReplicaBlockIterator(); > while (it.hasNext()) { > Block b = it.next(); > BlockInfo blockInfo = blockManager.getStoredBlock(b); > if (blockInfo.getBlockCollection().getStoragePolicyID() == > lpPolicy.getId()) { > filesToDelete.add(blockInfo.getBlockCollection()); > } > } > for (BlockCollection bc : filesToDelete) { > LOG.warn("Removing lazyPersist file " + bc.getName() + " with no > replicas."); > changed |= 
deleteInternal(bc.getName(), false, false, false); > } > } finally { > writeUnlock(); > } > {code} > In essence, the iteration over corrupt replica list should be broken down > into smaller iterations to avoid a single long wait. > Since this operation holds NameNode write lock for more than 45 seconds, the > default ZKFC connection timeout, it implies an extreme case like this (100 > million corrupt blocks) could lead to NameNode failover. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
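The suggestion above, breaking one long scan into smaller iterations, can be sketched with a plain ReentrantReadWriteLock: take the write lock per batch and release it between batches so other lock users can make progress. This is an illustration under simplifying assumptions, not NameNode code: scanInBatches is a hypothetical helper, and it indexes into a fixed list, whereas a real implementation would need a cursor that stays valid if the underlying collection changes while the lock is released.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

final class BatchedScanSketch {
    /**
     * Applies the action to every item, holding the write lock for at most
     * batchSize items at a time. Returns the number of lock acquisitions.
     */
    static <T> int scanInBatches(List<T> items, int batchSize,
                                 ReentrantReadWriteLock lock, Consumer<T> action) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        int next = 0;
        while (next < items.size()) {
            lock.writeLock().lock();
            try {
                int end = Math.min(next + batchSize, items.size());
                for (; next < end; next++) {
                    action.accept(items.get(next));
                }
            } finally {
                lock.writeLock().unlock(); // other threads can take the lock between batches
            }
            batches++;
        }
        return batches;
    }
}
```

The batch size trades total scan time against the longest single lock hold, which is the quantity that matters for the ZKFC timeout mentioned in the issue.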
[jira] [Work started] (HDFS-13683) HDFS StoragePolicy commands should work with Federation
[ https://issues.apache.org/jira/browse/HDFS-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-13683 started by Gabor Bota. - > HDFS StoragePolicy commands should work with Federation > --- > > Key: HDFS-13683 > URL: https://issues.apache.org/jira/browse/HDFS-13683 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, tools >Affects Versions: 3.0.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > > In a federated cluster (defaultFS = viewfs://cluster3), getStoragePolicy with > hdfs uri will run into the following error: > {noformat} > [root@cdh6-0-0-centos73-17443-1 ~]# hdfs storagepolicies -getStoragePolicy > -path > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896 > IllegalArgumentException: Wrong FS: > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896, > expected: viewfs://cluster3/ > {noformat} > Taking a quick look at the code, I think > [this|https://github.com/apache/hadoop/blob/3e37a9a70ba93430da1b47f2a8b50358348396b0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L106] > is the culprit: > {code} > final FileSystem fs = FileSystem.get(conf); > // should do: final FileSystem fs = p.getFilesystem(conf); > {code} > We should have a review of all shell and see if anything else is missing. At > the minimum, we should fix all places in StoragePolicyAdmin. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13683) HDFS StoragePolicy commands should work with Federation
[ https://issues.apache.org/jira/browse/HDFS-13683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516910#comment-16516910 ] Gabor Bota commented on HDFS-13683: --- Seems like very much a duplicated issue of HDFS-12857. [~xiaochen], what do you think? > HDFS StoragePolicy commands should work with Federation > --- > > Key: HDFS-13683 > URL: https://issues.apache.org/jira/browse/HDFS-13683 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, tools >Affects Versions: 3.0.0 >Reporter: Gabor Bota >Assignee: Gabor Bota >Priority: Major > > In a federated cluster (defaultFS = viewfs://cluster3), getStoragePolicy with > hdfs uri will run into the following error: > {noformat} > [root@cdh6-0-0-centos73-17443-1 ~]# hdfs storagepolicies -getStoragePolicy > -path > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896 > IllegalArgumentException: Wrong FS: > hdfs://ns2/hbase/WALs/cdh6-0-0-centos73-17443-4.vpc.cloudera.com,22101,1527272361896, > expected: viewfs://cluster3/ > {noformat} > Taking a quick look at the code, I think > [this|https://github.com/apache/hadoop/blob/3e37a9a70ba93430da1b47f2a8b50358348396b0/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/StoragePolicyAdmin.java#L106] > is the culprit: > {code} > final FileSystem fs = FileSystem.get(conf); > // should do: final FileSystem fs = p.getFilesystem(conf); > {code} > We should have a review of all shell and see if anything else is missing. At > the minimum, we should fix all places in StoragePolicyAdmin. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-179) CloseContainer command should be executed only if all the prior "Write" type container requests get executed
[ https://issues.apache.org/jira/browse/HDDS-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee updated HDDS-179: - Description: When a close Container command request comes to a Datanode (via SCM heartbeat response) through the Ratis protocol, all previously enqueued "Write" type requests such as putKey, WriteChunk, DeleteKey, and CompactChunk should be executed before the CloseContainer request is executed. This synchronization needs to be handled in the containerStateMachine. This Jira aims to address this. was: When a close Container command request comes to a Datanode (via SCM heartbeat response) through the Ratis protocol, all previously enqueued "Write" type requests such as putKey, WriteChunk, DeleteKey, and CompactChunk should be executed before the CloseContainer request is executed. This synchronization needs to be handled in the containerStateMachine. This Jira aims to address this. > CloseContainer command should be executed only if all the prior "Write" type > container requests get executed > - > > Key: HDDS-179 > URL: https://issues.apache.org/jira/browse/HDDS-179 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client, Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > > When a close Container command request comes to a Datanode (via SCM heartbeat > response) through the Ratis protocol, all previously enqueued "Write" type > requests such as putKey, WriteChunk, DeleteKey, and CompactChunk should be > executed before the CloseContainer request is executed. This > synchronization needs to be handled in the containerStateMachine. This Jira > aims to address this.
[jira] [Created] (HDDS-179) CloseContainer command should be executed only if all the prior "Write" type container requests get executed
Shashikant Banerjee created HDDS-179: Summary: CloseContainer command should be executed only if all the prior "Write" type container requests get executed Key: HDDS-179 URL: https://issues.apache.org/jira/browse/HDDS-179 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client, Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When a close Container command request comes to a Datanode (via SCM heartbeat response) through the Ratis protocol, all previously enqueued "Write" type requests such as putKey, WriteChunk, DeleteKey, and CompactChunk should be executed before the CloseContainer request is executed. This synchronization needs to be handled in the containerStateMachine. This Jira aims to address this.
[jira] [Commented] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516867#comment-16516867 ] Lokesh Jain commented on HDDS-178: -- v2 patch fixes a few comments and debug logs. > DeleteBlocks should not be handled by open containers > - > > Key: HDDS-178 > URL: https://issues.apache.org/jira/browse/HDDS-178 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Datanode >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-178.001.patch, HDDS-178.002.patch > > > In the case of open containers deleteBlocks command just adds an entry in the > log but does not delete the blocks. These blocks are deleted only when > container is closed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain updated HDDS-178: - Attachment: HDDS-178.002.patch > DeleteBlocks should not be handled by open containers > - > > Key: HDDS-178 > URL: https://issues.apache.org/jira/browse/HDDS-178 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Datanode >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-178.001.patch, HDDS-178.002.patch > > > In the case of open containers deleteBlocks command just adds an entry in the > log but does not delete the blocks. These blocks are deleted only when > container is closed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain updated HDDS-178: - Status: Patch Available (was: Open) > DeleteBlocks should not be handled by open containers > - > > Key: HDDS-178 > URL: https://issues.apache.org/jira/browse/HDDS-178 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Datanode >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-178.001.patch > > > In the case of open containers deleteBlocks command just adds an entry in the > log but does not delete the blocks. These blocks are deleted only when > container is closed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain updated HDDS-178: - Attachment: HDDS-178.001.patch > DeleteBlocks should not be handled by open containers > - > > Key: HDDS-178 > URL: https://issues.apache.org/jira/browse/HDDS-178 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Datanode >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-178.001.patch > > > In the case of open containers deleteBlocks command just adds an entry in the > log but does not delete the blocks. These blocks are deleted only when > container is closed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-178) DeleteBlocks should not be handled by open containers
Lokesh Jain created HDDS-178: Summary: DeleteBlocks should not be handled by open containers Key: HDDS-178 URL: https://issues.apache.org/jira/browse/HDDS-178 Project: Hadoop Distributed Data Store Issue Type: Task Components: Ozone Datanode Reporter: Lokesh Jain Assignee: Lokesh Jain In the case of open containers deleteBlocks command just adds an entry in the log but does not delete the blocks. These blocks are deleted only when container is closed. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13687) ConfiguredFailoverProxyProvider could direct requests to SBN
[ https://issues.apache.org/jira/browse/HDFS-13687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516754#comment-16516754 ] Chao Sun commented on HDFS-13687: - Thanks [~xkrogen] and [~shv] for the very valuable comments! Regarding Konstantin's comments: 1. Very good point. Sorry, I didn't know that {{getServiceStatus}} requires super privilege. Another option might be to add another interface/protocol to get the active/standby state from NN, [as proposed in the original JIRA|https://issues.apache.org/jira/browse/HDFS-2917?focusedCommentId=13204178=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13204178]. 2. Yes, good point. Perhaps we can reuse {{NameNodeHAProxyFactory}} to create the factory needed in our case. 3. Can you elaborate more on this? Currently {{ConfiguredFailoverProxyProvider}} does assume all remote addresses are NN, right? Will go back to HDFS-12976 in the meantime. > ConfiguredFailoverProxyProvider could direct requests to SBN > > > Key: HDFS-13687 > URL: https://issues.apache.org/jira/browse/HDFS-13687 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Minor > Attachments: HDFS-13687.000.patch > > > In case there are multiple SBNs, and {{dfs.ha.allow.stale.reads}} is set to > true, failover could go to a SBN which then may serve read requests from > client. This may not be the expected behavior. This issue arises when we are > working on HDFS-12943 and HDFS-12976. > A better approach for this could be to check {{HAServiceState}} and find out > the active NN when performing failover. This also can reduce the number of > failovers the client has to do in case of multiple SBNs.
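The approach in the issue description, checking each candidate's HA state and failing over only to the active NameNode, can be sketched as below. All names are hypothetical: the State enum stands in for HAServiceState, addresses are plain strings, and the state probe is left abstract because, as the comment notes, how a client may obtain that state is still an open question.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class ActiveNameNodeSketch {
    // Stand-in for the real HA state type; only the two states discussed here.
    enum State { ACTIVE, STANDBY }

    /** Returns the first address whose probed state is ACTIVE, skipping unreachable nodes. */
    static Optional<String> findActive(List<String> addresses,
                                       Function<String, State> stateProbe) {
        for (String address : addresses) {
            try {
                if (stateProbe.apply(address) == State.ACTIVE) {
                    return Optional.of(address);
                }
            } catch (RuntimeException e) {
                // Probe failed (for example the node is down): try the next candidate.
            }
        }
        return Optional.empty();
    }
}
```

Compared with rotating blindly through the configured addresses, this lets a client skip standby nodes in one pass when several are configured.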
[jira] [Commented] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516709#comment-16516709 ]

genericqa commented on HDFS-13682:
--

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 29s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 56s | Maven dependency ordering for branch |
| +1 | mvninstall | 26m 54s | trunk passed |
| +1 | compile | 29m 36s | trunk passed |
| +1 | checkstyle | 0m 23s | trunk passed |
| +1 | mvnsite | 2m 25s | trunk passed |
| -1 | shadedclient | 5m 10s | branch has errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 44s | trunk passed |
| +1 | javadoc | 1m 55s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 18s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 50s | the patch passed |
| +1 | compile | 28m 26s | the patch passed |
| +1 | javac | 28m 26s | the patch passed |
| +1 | checkstyle | 0m 24s | the patch passed |
| +1 | mvnsite | 2m 22s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | shadedclient | 2m 14s | patch has errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 59s | the patch passed |
| +1 | javadoc | 1m 52s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 9m 15s | hadoop-common in the patch passed. |
| -1 | unit | 94m 33s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 45s | The patch does not generate ASF License warnings. |
| | | 217m 37s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
| | hadoop.hdfs.qjournal.server.TestJournalNodeSync |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-13682 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12928286/HDFS-13682.02.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux bbd5b16ce9bf 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f386e78 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit |