[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987610#comment-16987610 ] Ayush Saxena commented on HDFS-15023: - Thanks [~ferhui], overall looks good. {code:java} System.setIn(inOriginial); {code} In the test, this should probably be moved to a finally block; otherwise, if the test fails midway, the stream won't get reset. [~vinayakumarb], is this fine with you? Anything you would like to add... > [SBN read] ZKFC should check the state before joining the election > -- > > Key: HDFS-15023 > URL: https://issues.apache.org/jira/browse/HDFS-15023 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Major > Attachments: HDFS-15023.001.patch, HDFS-15023.002.patch > > > As discussed in HDFS-14961, ZKFC should not join the election when its state is > observer. > Right now, when a namenode is an observer, it joins the election and ends up > becoming a standby. > The MonitorDaemon thread call chain is: > doHealthChecks -> enterState(State.SERVICE_HEALTHY) -> recheckElectability() > -> elector.joinElection(targetToData(localTarget)) -> joinElectionInternal -> > createLockNodeAsync > Callback from ZooKeeper: > processResult -> becomeStandby -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
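The finally-block pattern the review comment asks for can be sketched as follows; the class and method names here are illustrative, not from the actual patch — only the save/restore of System.in mirrors the suggestion.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StdinRestoreSketch {
    // Save the real stdin, swap in a stub, and restore it in finally so
    // the stream is reset even if an assertion fails mid-test.
    static String runWithStubbedStdin(String stubbedInput) {
        InputStream inOriginal = System.in;      // save the real stdin
        try {
            System.setIn(new ByteArrayInputStream(
                stubbedInput.getBytes(StandardCharsets.UTF_8)));
            // ... test body that reads from System.in would go here ...
            return stubbedInput.toUpperCase();   // placeholder for real work
        } finally {
            System.setIn(inOriginal);            // always restored
        }
    }
}
```

If the reset stays in the test body instead, any failure before it leaves System.in pointing at the stub for every later test in the same JVM.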
[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987586#comment-16987586 ] Jinglun commented on HDFS-13811: Hi [~linyiqun], thanks for your comments! I rolled back the changes except the annotation. Uploaded v07. > RBF: Race condition between router admin quota update and periodic quota > update service > --- > > Key: HDFS-13811 > URL: https://issues.apache.org/jira/browse/HDFS-13811 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Dibyendu Karmakar >Assignee: Jinglun >Priority: Major > Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, > HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, > HDFS-13811.004.patch, HDFS-13811.005.patch, HDFS-13811.006.patch, > HDFS-13811.007.patch > > > If we try to update the quota of an existing mount entry while the periodic > quota update service is running on the same mount entry, the mount table is > left in an _inconsistent state._ > The transactions here are: > A - the quota update service fetches the mount table entries. > B - the quota update service updates the mount table with the current usage. > A' - the user tries to update the quota using the admin command. > With the transaction sequence [ A A' B ], the quota update service writes the > mount table back with the old quota value.
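The [ A A' B ] interleaving can be replayed deterministically to show the lost update; the class and field below are hypothetical stand-ins for the router's mount-table state, not the actual RBF code.

```java
public class QuotaRaceSketch {
    // Hypothetical mount-table quota; the real code updates a state store.
    static long quota;

    // Replays the [ A A' B ] sequence from the issue description and
    // returns the quota that ends up in the "mount table".
    static long raceOutcome(long original, long adminUpdate) {
        quota = original;
        long cached = quota;   // A  - service fetches (caches) the entry
        quota = adminUpdate;   // A' - admin command updates the quota
        quota = cached;        // B  - service writes back its stale copy
        return quota;          // the admin's update has been silently lost
    }
}
```

Because step B writes back a whole entry cached before A', the admin's change never reaches the store, which is exactly the inconsistency described above.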
[jira] [Updated] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-13811: --- Attachment: HDFS-13811.007.patch
[jira] [Updated] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-15023: --- Attachment: HDFS-15023.002.patch
[jira] [Commented] (HDFS-15023) [SBN read] ZKFC should check the state before joining the election
[ https://issues.apache.org/jira/browse/HDFS-15023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987578#comment-16987578 ] Fei Hui commented on HDFS-15023: [~ayushtkn] Thanks for your comments. I made a mistake; I shouldn't have changed the fix mentioned in HDFS-14961. Uploaded the v002 patch with a UT.
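The guard being discussed can be sketched roughly as below; the enum and method are illustrative, not the actual ZKFailoverController code.

```java
public class ZkfcElectionGuardSketch {
    // States mirror the HAServiceState names; the guard method is a
    // hypothetical stand-in for the check the patch adds.
    enum ServiceState { ACTIVE, STANDBY, OBSERVER }

    // Idea from the issue: an observer NameNode must not enter the ZK
    // election, otherwise createLockNodeAsync's ZooKeeper callback would
    // flip it back via processResult -> becomeStandby.
    static boolean shouldJoinElection(ServiceState state) {
        return state != ServiceState.OBSERVER;
    }
}
```

With this check in recheckElectability(), an observer simply skips elector.joinElection() instead of being demoted to standby.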
[jira] [Commented] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI
[ https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987568#comment-16987568 ] Xieming Li commented on HDFS-14990: --- [~ayushtkn] Thank you for your feedback. I will just ignore this ticket for the moment. > HDFS: No symbolic icon to represent decommissioning state of datanode in Name > node WEB UI > - > > Key: HDFS-14990 > URL: https://issues.apache.org/jira/browse/HDFS-14990 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, ui >Affects Versions: 3.2.1 >Reporter: Souryakanta Dwivedy >Assignee: Xieming Li >Priority: Minor > Attachments: image-2019-11-15-17-31-23-213.png, > image-2019-11-16-02-09-10-545.png > > > There is no symbolic icon to represent the decommissioning state of a datanode > in the NameNode web UI. > Expected output: > Like the other datanode states (In Service, Down, Decommissioned, etc.), > an icon should also be added for the decommissioning state. > > !image-2019-11-15-17-31-23-213.png! >
[jira] [Commented] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987567#comment-16987567 ] Xieming Li commented on HDFS-14983: --- I have uploaded a patch that works, and I hope to get some feedback. I performed a very simple test in my environment. In core-site.xml I added: {code:java} - - {code} {code:java} $ export HADOOP_PROXY_USER=dummyuser $ hdfs dfs -ls ls: User: sri@DEV is not allowed to impersonate dummyuser $ $ sudo hdfs dfsrouteradmin -refreshSuperUserGroupsConfiguration Successfully updated super user groups configuration on router 0.0.0.0:8111 $ $ hdfs dfs -ls {code} If everything looks okay, I will keep adding unit tests, Javadoc, and documentation. > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Assignee: Xieming Li >Priority: Minor > Attachments: HDFS-14983.draft.001.patch > > > NameNode can update its proxyuser config via -refreshSuperUserGroupsConfiguration > without restarting, but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode.
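The core-site.xml snippet above did not survive the mail archive. A typical proxyuser configuration for this kind of test might look like the following; the superuser name ("sri", inferred from the error message) and the values are placeholders, not recovered from the original comment:

```xml
<!-- Placeholder values; the actual properties in the original comment were lost. -->
<property>
  <name>hadoop.proxyuser.sri.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.sri.users</name>
  <value>dummyuser</value>
</property>
```

After the -refreshSuperUserGroupsConfiguration call picks this up on the router, impersonating dummyuser via HADOOP_PROXY_USER succeeds, matching the shell transcript above.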
[jira] [Commented] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI
[ https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987566#comment-16987566 ] Ayush Saxena commented on HDFS-14990: - The symbol is there for me in the UI too
[jira] [Updated] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14983: -- Attachment: HDFS-14983.draft.001.patch Status: Patch Available (was: In Progress)
[jira] [Updated] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI
[ https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14990: -- Status: Open (was: Patch Available)
[jira] [Updated] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI
[ https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14990: -- Attachment: (was: HDFS-14893.draft.001.patch)
[jira] [Commented] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted
[ https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987561#comment-16987561 ] Hadoop QA commented on HDFS-15031: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 15s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 16s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 36s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 26 unchanged - 0 fixed = 27 total (was 26) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 19s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}111m 34s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}168m 35s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer | | | hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy | | | hadoop.hdfs.TestDecommission | | | hadoop.hdfs.TestFileChecksumCompositeCrc | | | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks | | | hadoop.hdfs.TestEncryptedTransfer | | | hadoop.hdfs.TestReconstructStripedFile | | | hadoop.hdfs.TestDFSPermission | | | hadoop.hdfs.client.impl.TestBlockReaderLocal | | | hadoop.hdfs.server.diskbalancer.TestDiskBalancer | | | hadoop.hdfs.TestReadStripedFileWithMissingBlocks | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-15031 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987418/HDFS-15031.000.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux aa3f5e79d42e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI
[ https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li updated HDFS-14990: -- Attachment: HDFS-14893.draft.001.patch Status: Patch Available (was: In Progress)
[jira] [Commented] (HDFS-14990) HDFS: No symbolic icon to represent decommissioning state of datanode in Name node WEB UI
[ https://issues.apache.org/jira/browse/HDFS-14990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987560#comment-16987560 ] Xieming Li commented on HDFS-14990: --- Ping. Any thoughts on this ticket?
[jira] [Commented] (HDFS-15027) Correct target DN's log while balancing.
[ https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987559#comment-16987559 ] Hadoop QA commented on HDFS-15027: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 59s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 53s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 18s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 50s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 38s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 19s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 97m 45s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}167m 12s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean | | | hadoop.hdfs.TestEncryptionZonesWithKMS | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-15027 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987421/HDFS-15027.000.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 074700de44c3 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 54e7605 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/28442/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28442/testReport/ | | Max. process+thread count | 2928 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Commented] (HDFS-14997) BPServiceActor process command from NameNode asynchronously
[ https://issues.apache.org/jira/browse/HDFS-14997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987530#comment-16987530 ] Xiaoqiao He commented on HDFS-14997: Ping [~elgoiri], [~sodonnell], [~weichiu], any further comments here? > BPServiceActor process command from NameNode asynchronously > --- > > Key: HDFS-14997 > URL: https://issues.apache.org/jira/browse/HDFS-14997 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Xiaoqiao He >Assignee: Xiaoqiao He >Priority: Major > Attachments: HDFS-14997.001.patch, HDFS-14997.002.patch, > HDFS-14997.003.patch, HDFS-14997.004.patch, HDFS-14997.005.patch > > > There are two core functions in the #BPServiceActor main process flow: reporting > (#sendHeartbeat, #blockReport, #cacheReport) and #processCommand. If > #processCommand takes a long time, it blocks the reporting flow, and it can > take a long time (over 1000s in the worst case I have seen) when the IO load of > the DataNode is very high. Since some IO operations run under #datasetLock, > processing some commands (such as #DNA_INVALIDATE) has to wait a long time to > acquire #datasetLock. In that case, the #heartbeat is not sent to the NameNode > in time, which triggers other disasters. > I propose to process commands asynchronously so that #BPServiceActor is not > blocked from sending heartbeats back to the NameNode under high IO load. > Notes: > 1. Lifeline could be one effective solution; however, some old branches do not > support this feature. > 2. IO operations under #datasetLock are a separate issue; I think we should > solve that in another JIRA.
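The proposal can be sketched as a small producer/consumer hand-off; the class and method names are illustrative, not the actual BPServiceActor changes in the patches.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: hand NameNode commands to a dedicated thread so a slow command
// (e.g. DNA_INVALIDATE waiting on the dataset lock) no longer delays the
// next heartbeat sent from the main actor loop.
public class AsyncCommandSketch {
    private final BlockingQueue<Runnable> commands = new LinkedBlockingQueue<>();
    private final Thread worker;

    AsyncCommandSketch() {
        worker = new Thread(() -> {
            try {
                while (true) {
                    commands.take().run();  // process commands one by one
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // shut down quietly
            }
        }, "command-processor");
        worker.setDaemon(true);
        worker.start();
    }

    // Called from the heartbeat loop: returns immediately instead of
    // executing the command inline.
    void enqueue(Runnable command) {
        commands.add(command);
    }
}
```

The single worker thread preserves the original in-order command processing while decoupling it from the heartbeat cadence.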
[jira] [Commented] (HDFS-15010) BlockPoolSlice#addReplicaThreadPool static pool should be initialized by static method
[ https://issues.apache.org/jira/browse/HDFS-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987532#comment-16987532 ] Surendra Singh Lilhore commented on HDFS-15010: --- [~yuvaldeg], sure, I will do it. I need to create a new patch; give me some time. > BlockPoolSlice#addReplicaThreadPool static pool should be initialized by > static method > -- > > Key: HDFS-15010 > URL: https://issues.apache.org/jira/browse/HDFS-15010 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.1.2 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-15010.001.patch, HDFS-15010.02.patch, > HDFS-15010.03.patch, HDFS-15010.04.patch, HDFS-15010.05.patch > > > The {{BlockPoolSlice#initializeAddReplicaPool()}} method currently initializes the > static thread pool instance. But when two {{BPServiceActor}}s try to load block > pools in parallel, they may create different instances. > So {{BlockPoolSlice#initializeAddReplicaPool()}} should be a static method.
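A minimal sketch of the fix, assuming the pool is a plain static field; the names are illustrative, not the actual BlockPoolSlice code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Because the pool is a static field shared by all BlockPoolSlice
// instances, its lazy initializer must be static and synchronized;
// otherwise two BPServiceActors loading block pools in parallel can each
// observe null and create a second pool.
public class AddReplicaPoolSketch {
    private static ExecutorService addReplicaThreadPool;

    static synchronized ExecutorService getInitializedPool(int numThreads) {
        if (addReplicaThreadPool == null) {
            addReplicaThreadPool = Executors.newFixedThreadPool(numThreads);
        }
        return addReplicaThreadPool;
    }
}
```

An instance-level synchronized method would not help here: each actor holds a different monitor, so both could still race past the null check.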
[jira] [Commented] (HDFS-14519) NameQuota is not update after concat operation, so namequota is wrong
[ https://issues.apache.org/jira/browse/HDFS-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987529#comment-16987529 ] Ayush Saxena commented on HDFS-14519: - Will try to backport in a couple of days. > NameQuota is not update after concat operation, so namequota is wrong > - > > Key: HDFS-14519 > URL: https://issues.apache.org/jira/browse/HDFS-14519 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ranith Sardar >Assignee: Ranith Sardar >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14519.001.patch, HDFS-14519.002.patch, > HDFS-14519.003.patch > >
[jira] [Commented] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries
[ https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987528#comment-16987528 ] Ayush Saxena commented on HDFS-15009: - [~hemanthboyina] can you reopen and provide patches for the lower branches. > FSCK "-list-corruptfileblocks" return Invalid Entries > - > > Key: HDFS-15009 > URL: https://issues.apache.org/jira/browse/HDFS-15009 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-15009.001.patch, HDFS-15009.002.patch, > HDFS-15009.003.patch, HDFS-15009.004.patch > > > Scenario: we have two directories, dir1 and dir10, and only dir10 has > corrupt files. > Now if we run -list-corruptfileblocks for dir1, the corrupt file count shown > for dir1 is actually that of dir10: > {code:java} > while (blkIterator.hasNext()) { > BlockInfo blk = blkIterator.next(); > final INodeFile inode = getBlockCollection(blk); > skip++; > if (inode != null) { > String src = inode.getFullPathName(); > if (src.startsWith(path)){ > corruptFiles.add(new CorruptFileBlockInfo(src, blk)); > count++; > if (count >= DEFAULT_MAX_CORRUPT_FILEBLOCKS_RETURNED) > break; > } > } > } {code}
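The kind of fix this implies can be sketched as follows: require the matched prefix to end on a path-component boundary rather than using a plain startsWith. The helper name is hypothetical, not from the actual patch.

```java
// A plain startsWith makes "/dir10/f" count against path "/dir1";
// additionally requiring a '/' (or an exact match) at the boundary
// restricts the match to the directory itself and its children.
public class PathPrefixSketch {
    static boolean isUnderPath(String src, String path) {
        if (!src.startsWith(path)) {
            return false;
        }
        // Accept the directory itself, a path already ending in '/',
        // or a child whose next character is the path separator.
        return src.length() == path.length()
            || path.endsWith("/")
            || src.charAt(path.length()) == '/';
    }
}
```

With this check, the corrupt-file count for /dir1 no longer includes files under /dir10.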
[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986882#comment-16986882 ] huhaiyang edited comment on HDFS-15024 at 12/4/19 4:17 AM: --- [~xkrogen] [~csun] [~vagarychen] Thanks for your comments! My understanding: normally, if we only configure 2 NNs (dfs.ha.namenodes.ns1 = nn1,nn2), with nn1 currently active and nn2 standby, then when the client connects to nn2 it needs to retry and quickly fails over to nn1. However, when the connection to nn1 fails due to network problems, the next attempt (the third) sleeps for a period of time before retrying. Since HDFS-6440, more than 2 NameNodes are supported. If we configure 3 NNs (dfs.ha.namenodes.ns1 = nn1,nn2,nn3), with nn1 active, nn2 standby, and nn3 standby (or observer), then when the client connects to nn2 it retries and quickly fails over to nn3; when it connects to nn3, it retries and quickly fails over to nn1. However, when the connection to nn1 fails due to network problems, the next attempt (the fourth) sleeps for a period of time before retrying. In other words, the client should try every configured NN node once, and only when no NN meets the requirements should it sleep and retry. In FailoverOnNetworkExceptionRetry#getFailoverOrRetrySleepTime, I think using the number of NameNodes as a condition in the sleep-time calculation is more reasonable (which is the current v01 patch). Please let me know whether this is correct. Thanks
> [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a > condition of calculation of sleep time > --- > > Key: HDFS-15024 > URL: https://issues.apache.org/jira/browse/HDFS-15024 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.10.0, 3.3.0, 3.2.1 >Reporter: huhaiyang >Priority: Major > Attachments: HDFS-15024.001.patch, client_error.log > > > When we enable the ONN , there will be three NN nodes for the client > configuration, > Such as configuration > > dfs.ha.namenodes.ns1 > nn2,nn3,nn1 > > Currently, > nn2 is in standby state > nn3 is in observer state > nn1 is in active state > When the user performs an access HDFS operation > ./bin/hadoop --loglevel debug fs > -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider > -mkdir /user/haiyang1/test8 > You need to request nn1 when you execute the msync method, > Actually connect nn2 first and failover is required > In connection nn3 does not meet the requirements, failover needs to be > performed, but at this time, failover operation needs to be performed during > a period of hibernation > Finally, it took a period of hibernation to connect the successful request to > nn1 > In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime The current > default implementation is Sleep time is calculated when more than one > failover operation is performed > I think that the Number of NameNodes as a condition of calculation of sleep > time is more reasonable > That is, in the current test, executing failover on connection nn3 does not > need to sleep time to directly connect to the next nn node > See client_error.log for details -- This message was sent by Atlassian Jira
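The policy proposed in the comment above can be sketched as follows. This is a minimal, hypothetical illustration: the class name, constants, and backoff formula are assumptions, not the actual FailoverOnNetworkExceptionRetry implementation or the v01 patch:

```java
// Hypothetical sketch (assumed names and constants): fail over to the
// next NameNode immediately until every configured NN has been tried
// once in this pass; only then sleep, with capped exponential backoff.
public class FailoverSleepSketch {
    static final long BASE_DELAY_MS = 500;
    static final long MAX_DELAY_MS = 15000;

    static long getSleepTime(int failovers, int numNameNodes) {
        if (failovers < numNameNodes - 1) {
            return 0; // more NNs left to try: no sleep between failovers
        }
        // completed a full pass over all NNs: back off exponentially
        int passes = failovers - (numNameNodes - 1);
        long delay = BASE_DELAY_MS * (1L << Math.min(passes, 10));
        return Math.min(MAX_DELAY_MS, delay);
    }
}
```

With 3 NameNodes, the first two failovers return 0 (try nn2, nn3, nn1 back to back), and only the third and later attempts sleep.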
[jira] [Commented] (HDFS-14546) Document block placement policies
[ https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987504#comment-16987504 ] Amithsha commented on HDFS-14546: - [~weichiu] Yes will remove the PR from jira. Have updated the git but no response so removing it. > Document block placement policies > - > > Key: HDFS-14546 > URL: https://issues.apache.org/jira/browse/HDFS-14546 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Amithsha >Priority: Major > Labels: documentation > Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, > HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, > HDFS-14546-06.patch, HDFS-14546-07.patch, HdfsDesign.patch > > > Currently, all the documentation refers to the default block placement policy. > However, over time there have been new policies: > * BlockPlacementPolicyRackFaultTolerant (HDFS-7891) > * BlockPlacementPolicyWithNodeGroup (HDFS-3601) > * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006) > We should update the documentation to refer to them explaining their > particularities and probably how to setup each one of them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987503#comment-16987503 ] Yiqun Lin commented on HDFS-13811: -- [~LiJinglun], the following change to the unit test can also be rolled back. Would you mind rolling it back? Everything else looks good to me.
{noformat}
+  /**
+   * Test {@link RouterQuotaUpdateService#periodicInvoke()} updates quota usage
+   * in RouterQuotaManager.
+   */
   @Test
   public void testQuotaUpdating() throws Exception {
     long nsQuota = 30;
@@ -498,15 +504,14 @@ public void testQuotaUpdating() throws Exception {
         .spaceQuota(ssQuota).build());
     addMountTable(mountTable);
-    // Call periodicInvoke to ensure quota updated in quota manager
-    // and state store.
-    RouterQuotaUpdateService updateService = routerContext.getRouter()
-        .getQuotaCacheUpdateService();
+    // Call periodicInvoke to ensure quota updated in quota manager.
+    Router router = routerContext.getRouter();
+    RouterQuotaUpdateService updateService =
+        router.getQuotaCacheUpdateService();
     updateService.periodicInvoke();
     // verify initial quota value
-    MountTable updatedMountTable = getMountTable(path);
-    RouterQuotaUsage quota = updatedMountTable.getQuota();
+    RouterQuotaUsage quota = router.getQuotaManager().getQuotaUsage(path);
     assertEquals(nsQuota, quota.getQuota());
     assertEquals(ssQuota, quota.getSpaceQuota());
     assertEquals(1, quota.getFileAndDirectoryCount());
@@ -520,17 +525,16 @@ public void testQuotaUpdating() throws Exception {
     appendData(path + "/file", routerClient, BLOCK_SIZE);
     updateService.periodicInvoke();
-    updatedMountTable = getMountTable(path);
-    quota = updatedMountTable.getQuota();
+    quota = router.getQuotaManager().getQuotaUsage(path);
-    // verify if quota has been updated in state store
+    // verify if quota usage has been updated in RouterQuotaManager.
     assertEquals(nsQuota, quota.getQuota());
     assertEquals(ssQuota, quota.getSpaceQuota());
     assertEquals(3, quota.getFileAndDirectoryCount());
     assertEquals(BLOCK_SIZE, quota.getSpaceConsumed());
     // verify quota sync on adding new destination to mount entry.
-    updatedMountTable = getMountTable(path);
+    MountTable updatedMountTable = getMountTable(path);
{noformat}
> RBF: Race condition between router admin quota update and periodic quota > update service > --- > > Key: HDFS-13811 > URL: https://issues.apache.org/jira/browse/HDFS-13811 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Dibyendu Karmakar >Assignee: Jinglun >Priority: Major > Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, > HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, > HDFS-13811.004.patch, HDFS-13811.005.patch, HDFS-13811.006.patch > > > If we try to update quota of an existing mount entry and at the same time > periodic quota update service is running on the same mount entry, it is > leading the mount table to _inconsistent state._ > Here transactions are: > A - Quota update service is fetching mount table entries. > B - Quota update service is updating the mount table with current usage. > A' - User is trying to update quota using admin cmd. > and the transaction sequence is [ A A' B ] > quota update service is updating the mount table with old quota value. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
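The race described in the issue (sequence [ A A' B ]) disappears if the periodic service only ever writes the usage it computed and never writes the quota value back, so a concurrent admin update cannot be clobbered with the stale quota fetched in step A. The sketch below is a minimal illustration of that idea; all names are assumed for illustration, not the actual RouterQuotaUpdateService code:

```java
// Minimal illustration (assumed names, not the actual router code):
// the quota value is owned by the admin path; the periodic refresh
// writes usage only, so the [A A' B] interleaving cannot clobber it.
public class QuotaSyncSketch {
    private long quota;   // written only by the admin command (A')
    private long usage;   // written only by the periodic service (B)

    public synchronized void adminSetQuota(long newQuota) {
        this.quota = newQuota;
    }

    public synchronized void periodicRefresh(long computedUsage) {
        this.usage = computedUsage; // deliberately leaves quota untouched
    }

    public synchronized long getQuota() { return quota; }
    public synchronized long getUsage() { return usage; }
}
```

Even if the refresh in B was computed from data fetched before the admin update A', the quota set by A' survives.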
[jira] [Commented] (HDFS-15005) Backport HDFS-12300 to branch-2
[ https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987494#comment-16987494 ] Hadoop QA commented on HDFS-15005: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 17m 49s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 5s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 52s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 48s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 32s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} branch-2 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 14s{color} | {color:green} branch-2 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} branch-2 passed with JDK v1.8.0_222 {color} | || || || || {color:brown} Patch 
Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 29s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 9 new + 235 unchanged - 1 fixed = 244 total (was 236) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s{color} | {color:green} the patch passed with JDK v1.8.0_222 {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 73m 51s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}123m 42s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:f555aa740b5 | | JIRA Issue | HDFS-15005 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987413/HDFS-15005-branch-2.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux b4127fb9aeac 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | branch-2 / 0ac6dc7 | | maven | version: Apache Maven 3.3.9 |
[jira] [Commented] (HDFS-14825) [Dynamometer] Workload doesn't start unless an absolute path of Mapper class given
[ https://issues.apache.org/jira/browse/HDFS-14825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987462#comment-16987462 ] Hudson commented on HDFS-14825: --- SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #17716 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17716/]) HDFS-14825. [Dynamometer] Workload doesn't start unless an absolute path (aajisaka: rev 54e760511a2e2f8e5ecf1eb8762434fcd041f4d6) * (edit) hadoop-tools/hadoop-dynamometer/hadoop-dynamometer-workload/src/main/java/org/apache/hadoop/tools/dynamometer/workloadgenerator/WorkloadDriver.java > [Dynamometer] Workload doesn't start unless an absolute path of Mapper class > given > -- > > Key: HDFS-14825 > URL: https://issues.apache.org/jira/browse/HDFS-14825 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Soya Miyoshi >Assignee: Takanobu Asanuma >Priority: Major > Fix For: 3.3.0 > > > When starting a workload by start-workload.sh, unless an absolute path of > Mapper is given, the workload doesn't start. > > {code:java} > $ hadoop/tools/dynamometer/dynamometer-workload/bin/start-workload.sh \ > -Dauditreplay.input-path=hdfs:///user/souya/input/audit \ > -Dauditreplay.output-path=hdfs:///user/souya/results/ \ > -Dauditreplay.num-threads=50 -Dauditreplay.log-start-time.ms=5 \ > -nn_uri hdfs://namenode_address:port/ \ > -mapper_class_name AuditReplayMapper > {code} > results in > {code:java} > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further > details.
> Exception in thread "main" java.lang.ClassNotFoundException: Class > org.apache.hadoop.tools.dynamometer.workloadgenerator.AuditReplayMapper not > found > at > org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2572) > at > org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.getMapperClass(WorkloadDriver.java:183) > at > org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.run(WorkloadDriver.java:127) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.main(WorkloadDriver.java:172) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
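Judging by the commit title above (making the workload start without a fully qualified Mapper class name), the fix direction amounts to a name-resolution fallback. The helper below is a hypothetical sketch; the method and parameter names are assumptions, not the actual WorkloadDriver code:

```java
// Hypothetical sketch (assumed names, not the actual WorkloadDriver fix):
// try the class name exactly as given first, then fall back to prefixing
// a default package, so a bare "AuditReplayMapper" can still resolve.
public class MapperResolver {
    static Class<?> resolve(String name, String defaultPackage)
            throws ClassNotFoundException {
        try {
            return Class.forName(name);
        } catch (ClassNotFoundException e) {
            return Class.forName(defaultPackage + "." + name);
        }
    }
}
```

For example, with the default package set to the workload-generator package, both the short and the fully qualified name would load the same class.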
[jira] [Commented] (HDFS-15027) Correct target DN's log while balancing.
[ https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987459#comment-16987459 ] Xudong Cao commented on HDFS-15027: --- cc [~weichiu] Sorry, patch uploaded again, this is just a minor log improve, I think there's no need for unit test. > Correct target DN's log while balancing. > > > Key: HDFS-15027 > URL: https://issues.apache.org/jira/browse/HDFS-15027 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 3.2.1 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > Attachments: HDFS-15027.000.patch > > > During HDFS balancing, after the target DN copied a block from the proxy DN, > it prints a log following the pattern below: > *Moved BLOCK from BALANCER* > This is somehow misleading, maybe we can improve the pattern like: > *Copied BLOCK from PROXY DN, initiated by* *BALANCER* > > An example log of target DN during balancing: > 1. Wrong log printing before jira: > {code:java} > 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from > /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code} > 2. Correct log printing after jira: > {code:java} > 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from > /192.168.202.11:9866, initiated by /192.168.202.13:44502, > delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDFS-15027) Correct target DN's log while balancing.
[ https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987459#comment-16987459 ] Xudong Cao edited comment on HDFS-15027 at 12/4/19 2:40 AM: cc [~weichiu] Sorry, patch uploaded again, this is just a minor log improvement, I think there's no need for unit test. > Correct target DN's log while balancing. > > > Key: HDFS-15027 > URL: https://issues.apache.org/jira/browse/HDFS-15027 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 3.2.1 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > Attachments: HDFS-15027.000.patch > > > During HDFS balancing, after the target DN copied a block from the proxy DN, > it prints a log following the pattern below: > *Moved BLOCK from BALANCER* > This is somehow misleading, maybe we can improve the pattern like: > *Copied BLOCK from PROXY DN, initiated by* *BALANCER* > > An example log of target DN during balancing: > 1. Wrong log printing before jira: > {code:java} > 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from > /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code} > 2. Correct log printing after jira: > {code:java} > 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from > /192.168.202.11:9866, initiated by /192.168.202.13:44502, > delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.
[ https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xudong Cao updated HDFS-15027: -- Attachment: HDFS-15027.000.patch > Correct target DN's log while balancing. > > > Key: HDFS-15027 > URL: https://issues.apache.org/jira/browse/HDFS-15027 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 3.2.1 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > Attachments: HDFS-15027.000.patch > > > During HDFS balancing, after the target DN copied a block from the proxy DN, > it prints a log following the pattern below: > *Moved BLOCK from BALANCER* > This is somehow misleading, maybe we can improve the pattern like: > *Copied BLOCK from PROXY DN, initiated by* *BALANCER* > > An example log of target DN during balancing: > 1. Wrong log printing before jira: > {code:java} > 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from > /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code} > 2. Correct log printing after jira: > {code:java} > 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from > /192.168.202.11:9866, initiated by /192.168.202.13:44502, > delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.
[ https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xudong Cao updated HDFS-15027: -- Description: During HDFS balancing, after the target DN copied a block from the proxy DN, it prints a log following the pattern below: *Moved BLOCK from BALANCER* This is somehow misleading, maybe we can improve the pattern like: *Copied BLOCK from PROXY DN, initiated by* *BALANCER* An example log of target DN during balancing: 1. Wrong log printing before jira: {code:java} 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code} 2. Correct log printing after jira: {code:java} 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from /192.168.202.11:9866, initiated by /192.168.202.13:44502, delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code} was: During HDFS balancing, after the target DN copied a block from the proxy DN, it prints a log following the pattern below: *Moved BLOCK from BALANCER* This is somehow misleading, maybe we can improve the pattern like: *Copied BLOCK from PROXY DN, initiated by* *BALANCER* An example log of target DN during balancing: # Wrong log printing before jira: {code:java} 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code} # Correct log printing after jira: {code:java} 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from /192.168.202.11:9866, initiated by /192.168.202.13:44502, delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code} > Correct target DN's log while 
balancing. > > > Key: HDFS-15027 > URL: https://issues.apache.org/jira/browse/HDFS-15027 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 3.2.1 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > > During HDFS balancing, after the target DN copied a block from the proxy DN, > it prints a log following the pattern below: > *Moved BLOCK from BALANCER* > This is somehow misleading, maybe we can improve the pattern like: > *Copied BLOCK from PROXY DN, initiated by* *BALANCER* > > An example log of target DN during balancing: > 1. Wrong log printing before jira: > {code:java} > 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from > /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code} > 2. Correct log printing after jira: > {code:java} > 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from > /192.168.202.11:9866, initiated by /192.168.202.13:44502, > delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15027) Correct target DN's log while balancing.
[ https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xudong Cao updated HDFS-15027: -- Description: During HDFS balancing, after the target DN copied a block from the proxy DN, it prints a log following the pattern below: *Moved BLOCK from BALANCER* This is somehow misleading, maybe we can improve the pattern like: *Copied BLOCK from PROXY DN, initiated by* *BALANCER* An example log of target DN during balancing: # Wrong log printing before jira: {code:java} 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code} # Correct log printing after jira: {code:java} 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from /192.168.202.11:9866, initiated by /192.168.202.13:44502, delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code} was: During HDFS balancing, after the target DN copied a block from the proxy DN, it prints a log following the pattern below: *Moved BLOCK from BALANCER* This is somehow misleading, maybe we can improve the pattern like: *Copied BLOCK from PROXY DN, initiated by* *BALANCER* > Correct target DN's log while balancing. 
> > > Key: HDFS-15027 > URL: https://issues.apache.org/jira/browse/HDFS-15027 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 3.2.1 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > > During HDFS balancing, after the target DN copied a block from the proxy DN, > it prints a log following the pattern below: > *Moved BLOCK from BALANCER* > This is somehow misleading, maybe we can improve the pattern like: > *Copied BLOCK from PROXY DN, initiated by* *BALANCER* > > An example log of target DN during balancing: > # Wrong log printing before jira: > {code:java} > 2019-12-04 09:33:19,718 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Moved BP-1426342230-192.168.202.11-1575277482603:blk_1073741889_1065 from > /192.168.202.13:56322, delHint=54a14a41-0d7c-4487-b4f0-ce2848f86b48{code} > # Correct log printing after jira: > {code:java} > 2019-12-04 10:32:06,707 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: > Copied BP-428834875-192.168.202.11-1575425340126:blk_1073741918_1094 from > /192.168.202.11:9866, initiated by /192.168.202.13:44502, > delHint=84a0626a-5fa4-4c66-a776-074f537d4235{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130
[ https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987452#comment-16987452 ] Hadoop QA commented on HDFS-14998: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 47s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 35m 26s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 6s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 18s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 53m 52s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14998 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987417/HDFS-14998.004.patch | | Optional Tests | dupname asflicense mvnsite | | uname | Linux 9bf47831d059 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / f1ab7f1 | | maven | version: Apache Maven 3.3.9 | | Max. process+thread count | 309 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28441/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Update Observer Namenode doc for ZKFC after HDFS-14130 > -- > > Key: HDFS-14998 > URL: https://issues.apache.org/jira/browse/HDFS-14998 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Minor > Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, > HDFS-14998.003.patch, HDFS-14998.004.patch > > > After HDFS-14130, we should update observer namenode doc, observer namenode > can run with ZKFC running -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-14825) [Dynamometer] Workload doesn't start unless an absolute path of Mapper class given
[ https://issues.apache.org/jira/browse/HDFS-14825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akira Ajisaka resolved HDFS-14825. -- Fix Version/s: 3.3.0 Hadoop Flags: Reviewed Resolution: Fixed Merged into trunk. Thank you [~tasanuma], [~soyamiyoshi], and [~xkrogen]. > [Dynamometer] Workload doesn't start unless an absolute path of Mapper class > given > -- > > Key: HDFS-14825 > URL: https://issues.apache.org/jira/browse/HDFS-14825 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Soya Miyoshi >Assignee: Takanobu Asanuma >Priority: Major > Fix For: 3.3.0 > > > When starting a workload by start-workload.sh, unless an absolute path of > Mapper is given, the workload doesn't start. > > {code:java} > $ hadoop/tools/dynamometer/dynamometer-workload/bin/start-workload.sh - \ > Dauditreplay.input-path=hdfs:///user/souya/input/audit \ > -Dauditreplay.output-path=hdfs:///user/souya/results/ \ > -Dauditreplay.num-threads=50 -Dauditreplay.log-start-time.ms=5 \ > -nn_uri hdfs://namenode_address:port/ \ > -mapper_class_name AuditReplayMapper > {code} > results in > {code:java} > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further > details. 
> Exception in thread "main" java.lang.ClassNotFoundException: Class > org.apache.hadoop.tools.dynamometer.workloadgenerator.AuditReplayMapper not > found > at > org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2572) > at > org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.getMapperClass(WorkloadDriver.java:183) > at > org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.run(WorkloadDriver.java:127) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76) > at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90) > at > org.apache.hadoop.tools.dynamometer.workloadgenerator.WorkloadDriver.main(WorkloadDriver.java:172) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-14980) diskbalancer query command always tries to contact to port 9867
[ https://issues.apache.org/jira/browse/HDFS-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDFS-14980: Assignee: Aravindan Vijayan (was: Siddharth Wagle) > diskbalancer query command always tries to contact to port 9867 > --- > > Key: HDFS-14980 > URL: https://issues.apache.org/jira/browse/HDFS-14980 > Project: Hadoop HDFS > Issue Type: Bug > Components: diskbalancer >Reporter: Nilotpal Nandi >Assignee: Aravindan Vijayan >Priority: Major > > The diskbalancer query command always tries to connect to port 9867 even when the > datanode IPC port is different. > In this setup, the datanode IPC port is set to 20001. > > The diskbalancer report command works fine and connects to IPC port 20001 > > {noformat} > hdfs diskbalancer -report -node 172.27.131.193 > 19/11/12 08:58:55 INFO command.Command: Processing report command > 19/11/12 08:58:57 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 19/11/12 08:58:57 INFO block.BlockTokenSecretManager: Setting block keys > 19/11/12 08:58:57 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 19/11/12 08:58:58 INFO command.Command: Reporting volume information for > DataNode(s). These DataNode(s) are parsed from '172.27.131.193'. > Processing report command > Reporting volume information for DataNode(s). These DataNode(s) are parsed > from '172.27.131.193'. > [172.27.131.193:20001] - : 3 > volumes with node data density 0.05. > [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK1/] - 0.15 used: > 39343871181/259692498944, 0.85 free: 220348627763/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. > [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK2/] - 0.15 used: > 39371179986/259692498944, 0.85 free: 220321318958/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/dn/] - 0.19 used: > 49934903670/259692498944, 0.81 free: 209757595274/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. > > {noformat} > > But the diskbalancer query command fails and tries to connect to port 9867 > (the default port). > > {noformat} > hdfs diskbalancer -query 172.27.131.193 > 19/11/12 06:37:15 INFO command.Command: Executing "query plan" command. > 19/11/12 06:37:16 INFO ipc.Client: Retrying connect to server: > /172.27.131.193:9867. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 19/11/12 06:37:17 INFO ipc.Client: Retrying connect to server: > /172.27.131.193:9867. Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > .. > .. > .. > 19/11/12 06:37:25 ERROR tools.DiskBalancerCLI: Exception thrown while running > DiskBalancerCLI. > {noformat} > > > Expectation: > the diskbalancer query command should work without explicitly specifying the > datanode IPC port address -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
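The report path resolves the node's real IPC port, while the query path falls back to the default. A minimal sketch of the address resolution the expectation implies — 9867 is the default `dfs.datanode.ipc.address` port, and `configuredIpcPort` stands in for whatever the tool would read from the cluster configuration (an assumption, not the actual patch):

```java
/** Sketch: derive the DataNode IPC address for `hdfs diskbalancer -query <node>`. */
public class QueryTarget {
    // Default port of dfs.datanode.ipc.address.
    static final int DEFAULT_IPC_PORT = 9867;

    public static String resolve(String nodeArg, int configuredIpcPort) {
        if (nodeArg.contains(":")) {
            return nodeArg;  // explicit host:port wins
        }
        // Otherwise prefer the configured IPC port over the hardcoded default.
        int port = configuredIpcPort > 0 ? configuredIpcPort : DEFAULT_IPC_PORT;
        return nodeArg + ":" + port;
    }
}
```

Under this scheme, `-query 172.27.131.193` on the cluster above would target port 20001 instead of retrying against 9867.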
[jira] [Assigned] (HDFS-14980) diskbalancer query command always tries to contact to port 9867
[ https://issues.apache.org/jira/browse/HDFS-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDFS-14980: Assignee: Siddharth Wagle (was: Aravindan Vijayan) > diskbalancer query command always tries to contact to port 9867 > --- > > Key: HDFS-14980 > URL: https://issues.apache.org/jira/browse/HDFS-14980 > Project: Hadoop HDFS > Issue Type: Bug > Components: diskbalancer >Reporter: Nilotpal Nandi >Assignee: Siddharth Wagle >Priority: Major > > The diskbalancer query command always tries to connect to port 9867 even when the > datanode IPC port is different. > In this setup, the datanode IPC port is set to 20001. > > The diskbalancer report command works fine and connects to IPC port 20001 > > {noformat} > hdfs diskbalancer -report -node 172.27.131.193 > 19/11/12 08:58:55 INFO command.Command: Processing report command > 19/11/12 08:58:57 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 19/11/12 08:58:57 INFO block.BlockTokenSecretManager: Setting block keys > 19/11/12 08:58:57 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 19/11/12 08:58:58 INFO command.Command: Reporting volume information for > DataNode(s). These DataNode(s) are parsed from '172.27.131.193'. > Processing report command > Reporting volume information for DataNode(s). These DataNode(s) are parsed > from '172.27.131.193'. > [172.27.131.193:20001] - : 3 > volumes with node data density 0.05. > [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK1/] - 0.15 used: > 39343871181/259692498944, 0.85 free: 220348627763/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. > [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK2/] - 0.15 used: > 39371179986/259692498944, 0.85 free: 220321318958/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/dn/] - 0.19 used: > 49934903670/259692498944, 0.81 free: 209757595274/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. > > {noformat} > > But the diskbalancer query command fails and tries to connect to port 9867 > (the default port). > > {noformat} > hdfs diskbalancer -query 172.27.131.193 > 19/11/12 06:37:15 INFO command.Command: Executing "query plan" command. > 19/11/12 06:37:16 INFO ipc.Client: Retrying connect to server: > /172.27.131.193:9867. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 19/11/12 06:37:17 INFO ipc.Client: Retrying connect to server: > /172.27.131.193:9867. Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > .. > .. > .. > 19/11/12 06:37:25 ERROR tools.DiskBalancerCLI: Exception thrown while running > DiskBalancerCLI. > {noformat} > > > Expectation: > the diskbalancer query command should work without explicitly specifying the > datanode IPC port address -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted
[ https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Íñigo Goiri updated HDFS-15031: --- Status: Patch Available (was: Open) > Allow BootstrapStandby to download FSImage if the directory is already > formatted > > > Key: HDFS-15031 > URL: https://issues.apache.org/jira/browse/HDFS-15031 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Danny Becker >Assignee: Danny Becker >Priority: Minor > Attachments: HDFS-15031.000.patch > > > Currently, BootstrapStandby will only download the latest FSImage if it has > formatted the local image directory. This can be an issue when there are out > of date FSImages on a Standby NameNode, as the non-interactive mode will not > format the image directory, and BootstrapStandby will return an error code. > The changes here simply allow BootstrapStandby to download the latest FSImage > to the image directory, without needing to format first. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted
[ https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Becker updated HDFS-15031: Attachment: HDFS-15031.000.patch > Allow BootstrapStandby to download FSImage if the directory is already > formatted > > > Key: HDFS-15031 > URL: https://issues.apache.org/jira/browse/HDFS-15031 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Danny Becker >Assignee: Danny Becker >Priority: Minor > Attachments: HDFS-15031.000.patch > > > Currently, BootstrapStandby will only download the latest FSImage if it has > formatted the local image directory. This can be an issue when there are out > of date FSImages on a Standby NameNode, as the non-interactive mode will not > format the image directory, and BootstrapStandby will return an error code. > The changes here simply allow BootstrapStandby to download the latest FSImage > to the image directory, without needing to format first. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted
[ https://issues.apache.org/jira/browse/HDFS-15031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Danny Becker reassigned HDFS-15031: --- Assignee: Danny Becker > Allow BootstrapStandby to download FSImage if the directory is already > formatted > > > Key: HDFS-15031 > URL: https://issues.apache.org/jira/browse/HDFS-15031 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs, namenode >Reporter: Danny Becker >Assignee: Danny Becker >Priority: Minor > > Currently, BootstrapStandby will only download the latest FSImage if it has > formatted the local image directory. This can be an issue when there are out > of date FSImages on a Standby NameNode, as the non-interactive mode will not > format the image directory, and BootstrapStandby will return an error code. > The changes here simply allow BootstrapStandby to download the latest FSImage > to the image directory, without needing to format first. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15031) Allow BootstrapStandby to download FSImage if the directory is already formatted
Danny Becker created HDFS-15031: --- Summary: Allow BootstrapStandby to download FSImage if the directory is already formatted Key: HDFS-15031 URL: https://issues.apache.org/jira/browse/HDFS-15031 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs, namenode Reporter: Danny Becker Currently, BootstrapStandby will only download the latest FSImage if it has formatted the local image directory. This can be an issue when there are out of date FSImages on a Standby NameNode, as the non-interactive mode will not format the image directory, and BootstrapStandby will return an error code. The changes here simply allow BootstrapStandby to download the latest FSImage to the image directory, without needing to format first. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
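The behavioral change described above can be sketched as a small decision function. The names are illustrative, not the actual BootstrapStandby fields; the point is that an already-formatted directory now leads to a download instead of an error code:

```java
/** Sketch of the bootstrap decision described in HDFS-15031. */
public class BootstrapDecision {
    public enum Action { FORMAT_THEN_DOWNLOAD, DOWNLOAD_ONLY, ABORT }

    /**
     * Old behavior: an already-formatted directory combined with
     * non-interactive mode (no reformat allowed) returned an error code.
     * New behavior: an already-formatted directory simply skips the format
     * step and downloads the latest FSImage over it.
     */
    public static Action decide(boolean alreadyFormatted, boolean mayFormat) {
        if (alreadyFormatted) {
            return Action.DOWNLOAD_ONLY;
        }
        return mayFormat ? Action.FORMAT_THEN_DOWNLOAD : Action.ABORT;
    }
}
```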
[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130
[ https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987416#comment-16987416 ] Fei Hui commented on HDFS-14998: [~ayushtkn][~csun][~shv] Thanks for your comments. Upload v005 patch. After HDFS-14130, we state that users can run ZKFC on observer namenode and zkfc will participate in the election of Active until the namenode is transitioned to standby state. > Update Observer Namenode doc for ZKFC after HDFS-14130 > -- > > Key: HDFS-14998 > URL: https://issues.apache.org/jira/browse/HDFS-14998 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Minor > Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, > HDFS-14998.003.patch, HDFS-14998.004.patch > > > After HDFS-14130, we should update observer namenode doc, observer namenode > can run with ZKFC running -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130
[ https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fei Hui updated HDFS-14998: --- Attachment: HDFS-14998.004.patch > Update Observer Namenode doc for ZKFC after HDFS-14130 > -- > > Key: HDFS-14998 > URL: https://issues.apache.org/jira/browse/HDFS-14998 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Minor > Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, > HDFS-14998.003.patch, HDFS-14998.004.patch > > > After HDFS-14130, we should update observer namenode doc, observer namenode > can run with ZKFC running -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14963) Add HDFS Client machine caching active namenode index mechanism.
[ https://issues.apache.org/jira/browse/HDFS-14963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987411#comment-16987411 ] Xudong Cao commented on HDFS-14963: --- cc [~weichiu] All review comments have been resolved, could someone please merge this patch? Then we can proceed with HDFS-14969 (as they modify some of the same files). > Add HDFS Client machine caching active namenode index mechanism. > > > Key: HDFS-14963 > URL: https://issues.apache.org/jira/browse/HDFS-14963 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 3.1.3 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > Labels: multi-sbnn > > In a multi-NameNode scenario, a new hdfs client always begins its rpc calls at > the 1st namenode, polls each namenode in turn, and finally determines the current Active > namenode. > This brings at least two problems: > # Extra failover overhead, especially when clients are created frequently. > # Unnecessary log printing: suppose there are 3 NNs and the 3rd is the ANN, and > a client starts rpc with the 1st NN; it stays silent when failing over > from the 1st NN to the 2nd NN, but when failing over from the 2nd NN to the 3rd > NN it prints some unnecessary logs; in some scenarios these logs can be > very numerous: > {code:java} > 2019-11-07 11:35:41,577 INFO retry.RetryInvocationHandler: > org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): > Operation category READ is not supported in state standby.
Visit > https://s.apache.org/sbnn-error > at > org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:98) > at > org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:2052) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1459) > ...{code} > We can introduce a solution for this problem: on the client machine, for every > hdfs cluster, cache its current Active NameNode index in a separate cache > file named by its uri. *Note these cache files are shared by all hdfs client > processes on this machine*. > For example, suppose there are hdfs://ns1 and hdfs://ns2, and the client > machine's cache file directory is /tmp; then: > # the ns1 cluster's cache file is /tmp/ns1 > # the ns2 cluster's cache file is /tmp/ns2 > And then: > # When a client starts, it reads the current Active NameNode index from the > corresponding cache file based on the target hdfs uri, and then directly makes > an rpc call to the right ANN. > # After each failover, the client needs to write the latest Active > NameNode index to the corresponding cache file based on the target hdfs uri. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
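A minimal sketch of the proposed cache, assuming one plain-text file per nameservice under a configurable directory. This is illustrative only: since the description says the files are shared by all client processes on the machine, a real implementation would also need atomic writes and file locking:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch of a per-nameservice cache of the last known Active NameNode index. */
public class ActiveNnIndexCache {
    private final Path dir;

    public ActiveNnIndexCache(Path cacheDir) {
        this.dir = cacheDir;
    }

    /** Read the cached index for a nameservice (e.g. "ns1"); 0 if absent or unreadable. */
    public int read(String nameservice) {
        try {
            return Integer.parseInt(Files.readString(dir.resolve(nameservice)).trim());
        } catch (IOException | NumberFormatException e) {
            return 0;  // no cache yet: start probing from the first NameNode
        }
    }

    /** Record the index after a successful failover. */
    public void write(String nameservice, int activeIndex) throws IOException {
        Files.createDirectories(dir);
        Files.writeString(dir.resolve(nameservice), Integer.toString(activeIndex));
    }
}
```

A client would call `read` once at startup to pick its first RPC target, and `write` after each observed failover.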
[jira] [Updated] (HDFS-15005) Backport HDFS-12300 to branch-2
[ https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HDFS-15005: Attachment: HDFS-15005-branch-2.002.patch > Backport HDFS-12300 to branch-2 > --- > > Key: HDFS-15005 > URL: https://issues.apache.org/jira/browse/HDFS-15005 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Major > Attachments: HDFS-15005-branch-2.000.patch, > HDFS-15005-branch-2.001.patch, HDFS-15005-branch-2.002.patch > > > Having DT related information is very useful in audit log. This tracks effort > to backport HDFS-12300 to branch-2. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15005) Backport HDFS-12300 to branch-2
[ https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987346#comment-16987346 ] Chao Sun commented on HDFS-15005: - Thanks [~weichiu], and sorry I missed many unused imports. One question though: for the style issue, do you think we should fix things other than the unused imports? I'm wondering whether we should keep it consistent with the original patch. > Backport HDFS-12300 to branch-2 > --- > > Key: HDFS-15005 > URL: https://issues.apache.org/jira/browse/HDFS-15005 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Major > Attachments: HDFS-15005-branch-2.000.patch, > HDFS-15005-branch-2.001.patch > > > Having DT related information is very useful in audit log. This tracks effort > to backport HDFS-12300 to branch-2. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130
[ https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987338#comment-16987338 ] Konstantin Shvachko commented on HDFS-14998: I see you guys are all on the same page here, just need to formulate it in the document. # We do not require turning off ZKFC on an Observer Node anymore. So we should just remove any mentioning of it in the doc. # ZKFC running on Observer Node does not participate in the election of Active, only Standby Nodes do. Should clarify it in the doc. > Update Observer Namenode doc for ZKFC after HDFS-14130 > -- > > Key: HDFS-14998 > URL: https://issues.apache.org/jira/browse/HDFS-14998 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Minor > Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, > HDFS-14998.003.patch > > > After HDFS-14130, we should update observer namenode doc, observer namenode > can run with ZKFC running -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15010) BlockPoolSlice#addReplicaThreadPool static pool should be initialized by static method
[ https://issues.apache.org/jira/browse/HDFS-15010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987305#comment-16987305 ] Yuval Degani commented on HDFS-15010: - [~surendrasingh], would you consider backporting this patch to branch-2? > BlockPoolSlice#addReplicaThreadPool static pool should be initialized by > static method > -- > > Key: HDFS-15010 > URL: https://issues.apache.org/jira/browse/HDFS-15010 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 3.1.2 >Reporter: Surendra Singh Lilhore >Assignee: Surendra Singh Lilhore >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-15010.001.patch, HDFS-15010.02.patch, > HDFS-15010.03.patch, HDFS-15010.04.patch, HDFS-15010.05.patch > > > The {{BlockPoolSlice#initializeAddReplicaPool()}} method currently initializes the > static thread pool instance. But when two {{BPServiceActor}} actors try to > load the block pool in parallel, they may create different instances. > So the {{BlockPoolSlice#initializeAddReplicaPool()}} method should be a static > method. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
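The race and the proposed fix can be sketched as follows: a static, synchronized initializer guarantees that two concurrently starting BPServiceActor threads observe the same pool. The class and method names mirror the report, but the body is an illustrative stand-in, not the actual patch:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Sketch: race-free initialization of a static shared thread pool. */
public class AddReplicaPoolHolder {
    private static ExecutorService addReplicaThreadPool;

    /**
     * Static + synchronized: the class-level lock serializes concurrent
     * callers, so at most one pool is ever created. An instance-level
     * initializer guarding a static field would let two threads each
     * observe null and create separate pools.
     */
    public static synchronized ExecutorService initializeAddReplicaPool(int threads) {
        if (addReplicaThreadPool == null) {
            addReplicaThreadPool = Executors.newFixedThreadPool(threads);
        }
        return addReplicaThreadPool;
    }
}
```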
[jira] [Commented] (HDFS-15009) FSCK "-list-corruptfileblocks" return Invalid Entries
[ https://issues.apache.org/jira/browse/HDFS-15009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987303#comment-16987303 ] Yuval Degani commented on HDFS-15009: - [~ayushtkn], [~hemanthboyina], would you mind backporting this patch to branch-2? Seems highly relevant. > FSCK "-list-corruptfileblocks" return Invalid Entries > - > > Key: HDFS-15009 > URL: https://issues.apache.org/jira/browse/HDFS-15009 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Fix For: 3.3.0 > > Attachments: HDFS-15009.001.patch, HDFS-15009.002.patch, > HDFS-15009.003.patch, HDFS-15009.004.patch > > > Scenario: if we have two directories, dir1 and dir10, and only dir10 has > corrupt files. > Now if we run -list-corruptfileblocks for dir1, the corrupt file count shown > for dir1 is actually dir10's: > {code:java} > while (blkIterator.hasNext()) { > BlockInfo blk = blkIterator.next(); > final INodeFile inode = getBlockCollection(blk); > skip++; > if (inode != null) { > String src = inode.getFullPathName(); > if (src.startsWith(path)){ > corruptFiles.add(new CorruptFileBlockInfo(src, blk)); > count++; > if (count >= DEFAULT_MAX_CORRUPT_FILEBLOCKS_RETURNED) > break; > } > } > } {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
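The quoted loop matches on a plain src.startsWith(path), so "/dir1" also matches every file under "/dir10". A boundary-aware check along these lines (a sketch of the idea, not the actual patch) avoids that:

```java
/** Sketch: prefix matching that respects path-component boundaries. */
public class PathPrefix {

    /** True only if src is path itself or lies under the path directory. */
    public static boolean isUnderPath(String src, String path) {
        if (!src.startsWith(path)) {
            return false;
        }
        // Accept an exact match, a prefix that already ends with the
        // separator (e.g. "/"), or a prefix followed by a separator —
        // so "/dir1" matches "/dir1/f" but not "/dir10/f".
        return src.length() == path.length()
            || path.endsWith("/")
            || src.charAt(path.length()) == '/';
    }
}
```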
[jira] [Commented] (HDFS-14519) NameQuota is not update after concat operation, so namequota is wrong
[ https://issues.apache.org/jira/browse/HDFS-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987291#comment-16987291 ] Erik Krogen commented on HDFS-14519: [~RANith] or [~ayushtkn], are you interested in putting together a branch-2 backport? It seems that the test case needs to be moved. > NameQuota is not update after concat operation, so namequota is wrong > - > > Key: HDFS-14519 > URL: https://issues.apache.org/jira/browse/HDFS-14519 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ranith Sardar >Assignee: Ranith Sardar >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2 > > Attachments: HDFS-14519.001.patch, HDFS-14519.002.patch, > HDFS-14519.003.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15027) Correct target DN's log while balancing.
[ https://issues.apache.org/jira/browse/HDFS-15027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987284#comment-16987284 ] Wei-Chiu Chuang commented on HDFS-15027: Was the patch file deleted from the attachments? > Correct target DN's log while balancing. > > > Key: HDFS-15027 > URL: https://issues.apache.org/jira/browse/HDFS-15027 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 3.2.1 >Reporter: Xudong Cao >Assignee: Xudong Cao >Priority: Minor > > During HDFS balancing, after the target DN copies a block from the proxy DN, > it prints a log following the pattern below: > *Moved BLOCK from BALANCER* > This is somewhat misleading; maybe we can improve the pattern to something like: > *Copied BLOCK from PROXY DN, initiated by* *BALANCER* -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15030) Add setQuota operation to HttpFS
[ https://issues.apache.org/jira/browse/HDFS-15030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987280#comment-16987280 ] Wei-Chiu Chuang commented on HDFS-15030: I thought it was added by HDFS-8631? Looks like the HttpFS part was removed in the final patch. > Add setQuota operation to HttpFS > > > Key: HDFS-15030 > URL: https://issues.apache.org/jira/browse/HDFS-15030 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > > setQuota operation is missing in HttpFS -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15022) Add new RPC to transfer data block with external shell script across Datanode
[ https://issues.apache.org/jira/browse/HDFS-15022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987276#comment-16987276 ] Wei-Chiu Chuang commented on HDFS-15022: This looks like a new feature to me. Can we add documentation about this feature, how it is used and so on? > Add new RPC to transfer data block with external shell script across Datanode > - > > Key: HDFS-15022 > URL: https://issues.apache.org/jira/browse/HDFS-15022 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yang Yun >Assignee: Yang Yun >Priority: Minor > Attachments: HDFS-15022.patch, HDFS-15022.patch > > > Replicating data blocks is expensive when some Datanodes are down, especially > on slow storage. Add a new RPC to replicate a block across datanodes with an > external shell script. Users can choose a more effective way to copy block files. > In our setup, Archive volumes are configured on remote reliable storage; we > just add a link file on the new datanode pointing to the remote file when doing > replication. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
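A hedged sketch of the dispatch such a feature implies: if an external script is configured, build its command line (which would then be handed to ProcessBuilder); otherwise fall back to the normal replication path. The script path and its argument order are assumptions, since the patch itself isn't shown:

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: choose between a configured external copy script and native transfer. */
public class BlockTransferCommand {

    /**
     * Build the external copy command, or return null to signal that the
     * built-in DataNode transfer should be used when no script is configured.
     * Argument contract (source block file, then target datanode) is assumed.
     */
    public static List<String> build(String scriptPath, String blockFile,
                                     String targetDatanode) {
        if (scriptPath == null || scriptPath.isEmpty()) {
            return null;  // fall back to the normal replication path
        }
        List<String> cmd = new ArrayList<>();
        cmd.add(scriptPath);
        cmd.add(blockFile);
        cmd.add(targetDatanode);
        return cmd;  // e.g. new ProcessBuilder(cmd).start() on the DataNode side
    }
}
```

For the link-file setup described above, the script could simply create a link to the remote copy instead of streaming the block.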
[jira] [Commented] (HDFS-15005) Backport HDFS-12300 to branch-2
[ https://issues.apache.org/jira/browse/HDFS-15005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987274#comment-16987274 ] Wei-Chiu Chuang commented on HDFS-15005: Cool. For my future reference: https://dzone.com/articles/string-concatenation-performacne-improvement-in-ja Can you take care of the checkstyle warnings? Other than that I am +1 > Backport HDFS-12300 to branch-2 > --- > > Key: HDFS-15005 > URL: https://issues.apache.org/jira/browse/HDFS-15005 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Reporter: Chao Sun >Assignee: Chao Sun >Priority: Major > Attachments: HDFS-15005-branch-2.000.patch, > HDFS-15005-branch-2.001.patch > > > Having DT related information is very useful in audit log. This tracks effort > to backport HDFS-12300 to branch-2. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14546) Document block placement policies
[ https://issues.apache.org/jira/browse/HDFS-14546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987251#comment-16987251 ] Wei-Chiu Chuang commented on HDFS-14546: [~Amithsha] sounds like you've moved to the patch files here in the JIRA? Could you abandon the GitHub PR if that's stale? > Document block placement policies > - > > Key: HDFS-14546 > URL: https://issues.apache.org/jira/browse/HDFS-14546 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Íñigo Goiri >Assignee: Amithsha >Priority: Major > Labels: documentation > Attachments: HDFS-14546-01.patch, HDFS-14546-02.patch, > HDFS-14546-03.patch, HDFS-14546-04.patch, HDFS-14546-05.patch, > HDFS-14546-06.patch, HDFS-14546-07.patch, HdfsDesign.patch > > > Currently, all the documentation refers to the default block placement policy. > However, over time there have been new policies: > * BlockPlacementPolicyRackFaultTolerant (HDFS-7891) > * BlockPlacementPolicyWithNodeGroup (HDFS-3601) > * BlockPlacementPolicyWithUpgradeDomain (HDFS-9006) > We should update the documentation to refer to them, explaining their > particularities and probably how to set up each one of them. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
[ https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987235#comment-16987235 ] Hadoop QA commented on HDFS-14957: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 1m 3s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 34s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 30s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 40s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 149 unchanged - 1 fixed = 149 total (was 150) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 25s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 20s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 8s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green}100m 56s{color} | {color:green} hadoop-hdfs in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 32s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}163m 6s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14957 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987386/HDFS-14957.001.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 403e9a680e72 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0c217fe | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28439/testReport/ | | Max. process+thread count | 2732 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28439/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > INodeReference Space Consumed was not same in QuotaUsage and ContentSummary >
[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130
[ https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987202#comment-16987202 ] Ayush Saxena commented on HDFS-14998: - [~ferhui] you need not copy-paste the words exactly; if you have understood what [~csun] meant to say, you can write it in your own words, in whatever way you feel conveys the meaning to everyone. Let me clarify it once more. Earlier, ZKFC was not allowed to run on the node hosting an Observer NameNode: since ZKFC was not aware of the ONN, it used to transition it to Standby, so it was mandatory to turn ZKFC off on the Observer NameNode if you wanted to use the feature. In the present scenario, ZKFC is aware of the Observer NameNode and does not bother it, so you can have ZKFC running on the Observer node; it is no longer compulsory to turn it off. It is also not required to turn it on for the normal working of an Observer, since an Observer cannot participate in failover. But if a user has a use case where the NameNode is moved back and forth between Observer and Standby depending on load or some other condition (either manually or through scripts), he can keep ZKFC running so that when the NameNode is moved to Standby state it can participate in automatic failover. With ZKFC aware of the ONN, this sheds the need to turn ZKFC on and off whenever the NameNode transitions states. We are just being suggestive, not forcing any user to choose one way; whatever his use case may be, he can go ahead with it. What Chao mentioned is one use case where ZKFC can be kept running on the ONN; if someone does not have such a use case, he is free to turn ZKFC off. You can frame it in your own words; whatever way you think conveys this to users correctly in fewer words is fine with both of us. [~csun] correct me if I have gone wrong on the concept somewhere. 
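The behavior described above can be summed up in a small sketch. This is a hypothetical illustration only, not the actual ZKFC code; the enum and method names are made up:

```java
// Hypothetical sketch of the post-HDFS-14130 behavior discussed above, not
// the actual ZKFC code: the failover controller checks the local NameNode's
// state and stays out of the election while the node is an Observer, instead
// of transitioning it to Standby.
public class ElectabilitySketch {
    enum HAState { ACTIVE, STANDBY, OBSERVER }

    // Observers never take part in automatic failover; once the node is
    // transitioned back to Standby, it becomes electable again.
    static boolean shouldJoinElection(HAState state) {
        return state != HAState.OBSERVER;
    }

    public static void main(String[] args) {
        System.out.println(shouldJoinElection(HAState.OBSERVER)); // false
        System.out.println(shouldJoinElection(HAState.STANDBY));  // true
    }
}
```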
> Update Observer Namenode doc for ZKFC after HDFS-14130 > -- > > Key: HDFS-14998 > URL: https://issues.apache.org/jira/browse/HDFS-14998 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Minor > Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, > HDFS-14998.003.patch > > > After HDFS-14130, we should update observer namenode doc, observer namenode > can run with ZKFC running -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15016) RBF: getDatanodeReport() should return the latest update
[ https://issues.apache.org/jira/browse/HDFS-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987196#comment-16987196 ] Ayush Saxena commented on HDFS-15016: - No preference as such; I am fine as long as it covers our change. > RBF: getDatanodeReport() should return the latest update > > > Key: HDFS-15016 > URL: https://issues.apache.org/jira/browse/HDFS-15016 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-15016.000.patch > > > Currently, when the Router calls getDatanodeReport() (or > getDatanodeStorageReport()) and the DN is in multiple clusters, it just takes > the one that comes first. It should consider the latest update. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14908) LeaseManager should check parent-child relationship when filter open files.
[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987194#comment-16987194 ] Íñigo Goiri commented on HDFS-14908: Thanks for rebasing and checking the issue. HDFS-15009 seems to be the one doing the change. [~hemanthboyina] do you mind taking a look? > LeaseManager should check parent-child relationship when filter open files. > --- > > Key: HDFS-14908 > URL: https://issues.apache.org/jira/browse/HDFS-14908 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.1 >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch, > HDFS-14908.003.patch, HDFS-14908.004.patch, HDFS-14908.005.patch, > HDFS-14908.006.patch, HDFS-14908.007.patch, HDFS-14908.TestV4.patch, > Test.java, TestV2.java, TestV3.java > > > Now when doing listOpenFiles(), LeaseManager only checks whether the filter > path is the prefix of the open files. We should check whether the filter path > is the parent/ancestor of the open files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
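The difference between the plain prefix check and the parent/ancestor check discussed in this issue can be illustrated with a small sketch. This is hypothetical code, not the actual LeaseManager implementation; the method names are made up:

```java
// Hypothetical sketch, not the actual LeaseManager code: a bare string
// prefix check wrongly treats /a/b2/f as being under the filter path /a/b,
// while an ancestor check requires a path-separator boundary.
public class PathFilterSketch {
    static boolean prefixMatch(String filter, String openFile) {
        return openFile.startsWith(filter);
    }

    static boolean ancestorMatch(String filter, String openFile) {
        // Match the filter path itself, or any descendant below a "/".
        String prefix = filter.endsWith("/") ? filter : filter + "/";
        return openFile.equals(filter) || openFile.startsWith(prefix);
    }

    public static void main(String[] args) {
        // /a/b2/f is NOT under /a/b, but a bare prefix check says it is.
        System.out.println(prefixMatch("/a/b", "/a/b2/f"));   // true (wrong)
        System.out.println(ancestorMatch("/a/b", "/a/b2/f")); // false (right)
        System.out.println(ancestorMatch("/a/b", "/a/b/f"));  // true
    }
}
```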
[jira] [Commented] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
[ https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987189#comment-16987189 ] Ayush Saxena commented on HDFS-14957: - bq. for calculating space consumed, we are considering the file which exists in diff (In Snapshot) and also the file's getBlocks() Can you link the JIRA as part of which this was done? Do you find any reason for doing so? The people involved there might have had some reasons; just to be sure, we can pull them in here. Anyway, ContentSummary and QuotaUsage are different APIs and can have different behaviors, provided that is justified. > INodeReference Space Consumed was not same in QuotaUsage and ContentSummary > --- > > Key: HDFS-14957 > URL: https://issues.apache.org/jira/browse/HDFS-14957 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.4 >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14957.001.patch, HDFS-14957.JPG > > > for INodeReferences , space consumed was different in QuotaUsage and Content > Summary -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15016) RBF: getDatanodeReport() should return the latest update
[ https://issues.apache.org/jira/browse/HDFS-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987188#comment-16987188 ] Íñigo Goiri commented on HDFS-15016: Thanks [~ayushtkn], let me fix the logging. Any preference for the test? > RBF: getDatanodeReport() should return the latest update > > > Key: HDFS-15016 > URL: https://issues.apache.org/jira/browse/HDFS-15016 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-15016.000.patch > > > Currently, when the Router calls getDatanodeReport() (or > getDatanodeStorageReport()) and the DN is in multiple clusters, it just takes > the one that comes first. It should consider the latest update. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101
[ https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987183#comment-16987183 ] Ayush Saxena commented on HDFS-15012: - HDFS-13101 fixes FSImage corruption, which also seems like a critical bug fix; will reverting it not leave us vulnerable again? Moreover, it has already been released as part of lower versions. Reverting would solve this problem but would reopen the older one. IMO reverting isn't going to give any big relief. Anyway, there is no release planned soon, so you all can take some more time to get to a solution. If you have any more pointers on how to reproduce the problem, let us know and we can try to help too. > NN fails to parse Edit logs after applying HDFS-13101 > - > > Key: HDFS-15012 > URL: https://issues.apache.org/jira/browse/HDFS-15012 > Project: Hadoop HDFS > Issue Type: Bug > Components: nn >Reporter: Eric Lin >Assignee: Shashikant Banerjee >Priority: Blocker > Labels: release-blocker > > After applying HDFS-13101, and deleting and creating large number of > snapshots, SNN exited with below error: > > {code:sh} > 2019-11-18 08:28:06,528 ERROR > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception > on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, > snapshotName=distcp-3479-31-old, > RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, Rpc > CallId=1] > java.lang.AssertionError: Element already exists: > element=partition_isactive=true, DELETED=[partition_isactive=true] > at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193) > at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239) > at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250) > 
at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790) > at > org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332) > at > org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144) > at > 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615) >
[jira] [Commented] (HDFS-15016) RBF: getDatanodeReport() should return the latest update
[ https://issues.apache.org/jira/browse/HDFS-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987172#comment-16987172 ] Ayush Saxena commented on HDFS-15016: - Thanx [~elgoiri] for the patch. The fix LGTM. Just FYI, this log will not be emitted if the second report of the DN has the greater last update: {code:java} LOG.debug("{} is in multiple subclusters", nodeId); {code} even though the DN would have been reported from multiple subclusters. If you don't have any issue with it, I am also fine with it. > RBF: getDatanodeReport() should return the latest update > > > Key: HDFS-15016 > URL: https://issues.apache.org/jira/browse/HDFS-15016 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Íñigo Goiri >Assignee: Íñigo Goiri >Priority: Major > Attachments: HDFS-15016.000.patch > > > Currently, when the Router calls getDatanodeReport() (or > getDatanodeStorageReport()) and the DN is in multiple clusters, it just takes > the one that comes first. It should consider the latest update. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
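The "take the latest update" behavior discussed in this issue can be sketched as follows. This is a hypothetical illustration, not the Router's actual code; `DnReport` and its `lastUpdate` field are made-up stand-ins for the real DatanodeInfo report:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the fix discussed above: when the same DataNode is
// reported by multiple subclusters, keep the report with the latest update
// time instead of whichever report arrived first.
public class LatestReportSketch {
    static class DnReport {
        final String nodeId;
        final long lastUpdate;
        DnReport(String nodeId, long lastUpdate) {
            this.nodeId = nodeId;
            this.lastUpdate = lastUpdate;
        }
    }

    static Map<String, DnReport> mergeReports(DnReport... reports) {
        Map<String, DnReport> byId = new HashMap<>();
        for (DnReport r : reports) {
            // Map.merge keeps the existing entry only if it is newer.
            byId.merge(r.nodeId, r,
                (oldR, newR) -> newR.lastUpdate > oldR.lastUpdate ? newR : oldR);
        }
        return byId;
    }

    public static void main(String[] args) {
        Map<String, DnReport> merged = mergeReports(
            new DnReport("dn1", 100L),   // stale report from one subcluster
            new DnReport("dn1", 250L));  // fresher report from another
        System.out.println(merged.get("dn1").lastUpdate); // 250
    }
}
```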
[jira] [Commented] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101
[ https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987164#comment-16987164 ] Wei-Chiu Chuang commented on HDFS-15012: We are actively working on this. However given the nature of the problem it is not easy to reproduce in a unit test format. To be on the safe side, I would suggest reverting HDFS-13101 for now. Bump the priority to blocker and add release-blocker label to this jira. > NN fails to parse Edit logs after applying HDFS-13101 > - > > Key: HDFS-15012 > URL: https://issues.apache.org/jira/browse/HDFS-15012 > Project: Hadoop HDFS > Issue Type: Bug > Components: nn >Reporter: Eric Lin >Assignee: Shashikant Banerjee >Priority: Blocker > Labels: release-blocker > > After applying HDFS-13101, and deleting and creating large number of > snapshots, SNN exited with below error: > > {code:sh} > 2019-11-18 08:28:06,528 ERROR > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception > on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, > snapshotName=distcp-3479-31-old, > RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, Rpc > CallId=1] > java.lang.AssertionError: Element already exists: > element=partition_isactive=true, DELETED=[partition_isactive=true] > at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193) > at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239) > at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755) > at > 
org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790) > at > org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332) > at > org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615) > {code} > We confirmed that fsimage and edit files were NOT corrupted, as reverting > HDFS-13101 fixed the issue. So the logic introduced in HDFS-13101 is broken > and failed to parse edit log files. -- This message was sent
[jira] [Updated] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101
[ https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-15012: --- Target Version/s: 3.0.4, 3.3.0, 2.8.6, 2.9.3, 3.1.4, 3.2.2, 2.10.1 Labels: release-blocker (was: ) > NN fails to parse Edit logs after applying HDFS-13101 > - > > Key: HDFS-15012 > URL: https://issues.apache.org/jira/browse/HDFS-15012 > Project: Hadoop HDFS > Issue Type: Bug > Components: nn >Reporter: Eric Lin >Assignee: Shashikant Banerjee >Priority: Blocker > Labels: release-blocker > > After applying HDFS-13101, and deleting and creating large number of > snapshots, SNN exited with below error: > > {code:sh} > 2019-11-18 08:28:06,528 ERROR > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception > on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, > snapshotName=distcp-3479-31-old, > RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, Rpc > CallId=1] > java.lang.AssertionError: Element already exists: > element=partition_isactive=true, DELETED=[partition_isactive=true] > at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193) > at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239) > at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790) > at > 
org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332) > at > org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615) > {code} > We confirmed that fsimage and edit files were NOT corrupted, as reverting > HDFS-13101 fixed the issue. So the logic introduced in HDFS-13101 is broken > and failed to parse edit log files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional
[jira] [Updated] (HDFS-15012) NN fails to parse Edit logs after applying HDFS-13101
[ https://issues.apache.org/jira/browse/HDFS-15012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-15012: --- Priority: Blocker (was: Critical) > NN fails to parse Edit logs after applying HDFS-13101 > - > > Key: HDFS-15012 > URL: https://issues.apache.org/jira/browse/HDFS-15012 > Project: Hadoop HDFS > Issue Type: Bug > Components: nn >Reporter: Eric Lin >Assignee: Shashikant Banerjee >Priority: Blocker > > After applying HDFS-13101, and deleting and creating large number of > snapshots, SNN exited with below error: > > {code:sh} > 2019-11-18 08:28:06,528 ERROR > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception > on operation DeleteSnapshotOp [snapshotRoot=/path/to/hdfs/file, > snapshotName=distcp-3479-31-old, > RpcClientId=b16a6cb5-bdbb-45ae-9f9a-f7dc57931f37, Rpc > CallId=1] > java.lang.AssertionError: Element already exists: > element=partition_isactive=true, DELETED=[partition_isactive=true] > at org.apache.hadoop.hdfs.util.Diff.insert(Diff.java:193) > at org.apache.hadoop.hdfs.util.Diff.delete(Diff.java:239) > at org.apache.hadoop.hdfs.util.Diff.combinePosterior(Diff.java:462) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.initChildren(DirectoryWithSnapshotFeature.java:240) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature$DirectoryDiff$2.iterator(DirectoryWithSnapshotFeature.java:250) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:755) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790) > at > org.apache.hadoop.hdfs.server.namenode.INodeReference.cleanSubtree(INodeReference.java:332) > at > 
org.apache.hadoop.hdfs.server.namenode.INodeReference$WithName.cleanSubtree(INodeReference.java:583) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtreeRecursively(INodeDirectory.java:760) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectoryWithSnapshotFeature.cleanDirectory(DirectoryWithSnapshotFeature.java:753) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.cleanSubtree(INodeDirectory.java:790) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.DirectorySnapshottableFeature.removeSnapshot(DirectorySnapshottableFeature.java:235) > at > org.apache.hadoop.hdfs.server.namenode.INodeDirectory.removeSnapshot(INodeDirectory.java:259) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.SnapshotManager.deleteSnapshot(SnapshotManager.java:301) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:688) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:232) > at > org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:141) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:903) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:756) > at > org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:324) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1144) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:796) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:614) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:676) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823) > at > org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547) > at > 
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615) > {code} > We confirmed that fsimage and edit files were NOT corrupted, as reverting > HDFS-13101 fixed the issue. So the logic introduced in HDFS-13101 is broken > and failed to parse edit log files. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
[ https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987144#comment-16987144 ] Íñigo Goiri commented on HDFS-14957: * Can we minimize the diff? Currently lines 889 to 893 are the same as before but show as different because of the format, let's avoid that; or if there are changes, isolate them. * We should add a javadoc to getDiskSpaceQuota() explaining the high level and probably the issue raised by this JIRA. * Similar for the test, it would be good to explain what is different from before. > INodeReference Space Consumed was not same in QuotaUsage and ContentSummary > --- > > Key: HDFS-14957 > URL: https://issues.apache.org/jira/browse/HDFS-14957 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.4 >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14957.001.patch, HDFS-14957.JPG > > > for INodeReferences , space consumed was different in QuotaUsage and Content > Summary -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14901) RBF: Add Encryption Zone related ClientProtocol APIs
[ https://issues.apache.org/jira/browse/HDFS-14901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987140#comment-16987140 ] Íñigo Goiri commented on HDFS-14901: The test is fairly fast: https://builds.apache.org/job/PreCommit-HDFS-Build/28435/testReport/org.apache.hadoop.hdfs.server.federation.router/TestRouterEncryptionZone/ I would prefer to use Before and After instead of BeforeClass as it is safer. In the AfterClass (or After now), we may want to check whether the cluster is null before shutting it down, and set it to null afterwards. > RBF: Add Encryption Zone related ClientProtocol APIs > > > Key: HDFS-14901 > URL: https://issues.apache.org/jira/browse/HDFS-14901 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14901.001.patch, HDFS-14901.002.patch, > HDFS-14901.003.patch > > > Currently the listEncryptionZones, reencryptEncryptionZone, and listReencryptionStatus > APIs are not implemented in the Router. > This JIRA intends to implement the above-mentioned APIs.
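The lifecycle pattern Íñigo is suggesting can be sketched as follows. This is a plain-Java stand-in, not the actual patch: in the real test the two methods would carry JUnit's @Before/@After annotations and the field would be the router mini-cluster; both are assumptions here.

```java
// Sketch of the null-guarded per-test setup/teardown pattern from the review.
// Assumption: in the real test these are JUnit @Before/@After methods and
// "cluster" is the mini DFS cluster; here it is just a placeholder object.
public class TeardownPattern {
    private static StringBuilder cluster;  // stand-in for the mini cluster

    static void setUp() {
        cluster = new StringBuilder("running");  // @Before: start the cluster
    }

    static void tearDown() {
        // @After: guard against setUp having failed part-way. Only shut down
        // (and null out) a cluster that was actually created, so teardown is
        // safe to run after a failed or skipped setup.
        if (cluster != null) {
            cluster = null;
        }
    }

    static boolean isShutDown() {
        return cluster == null;
    }

    public static void main(String[] args) {
        tearDown();          // safe even before any setUp ran
        setUp();
        tearDown();
        System.out.println(isShutDown());
    }
}
```

Running per test (rather than per class) trades a little speed for isolation: each test gets a fresh cluster and a teardown that cannot throw on a half-initialized one.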
[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130
[ https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987138#comment-16987138 ] Chao Sun commented on HDFS-14998: - [~ferhui] without ZKFC running it is not quite useful to just run the DFS admin command to transition the observer to SBN, since it won't be eligible for failover unless you also launch the ZKFC on that host. {quote} The only benefit for running ZKFC on Observer NameNode is that it would join election when the Observer NameNode is transitioned to Standby state. {quote} Yes this is correct. {quote} >From your words dynamic transition(between observer and standby) can be done >only when ZKFC running on it. But i think dynamic transition can be done >without ZKFC running. Please correct me, thanks. {quote} It can be done without ZKFC running, but with extra steps such as launching ZKFC / shutting down ZKFC etc. > Update Observer Namenode doc for ZKFC after HDFS-14130 > -- > > Key: HDFS-14998 > URL: https://issues.apache.org/jira/browse/HDFS-14998 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Minor > Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, > HDFS-14998.003.patch > > > After HDFS-14130, we should update observer namenode doc, observer namenode > can run with ZKFC running
[jira] [Created] (HDFS-15030) Add setQuota operation to HttpFS
hemanthboyina created HDFS-15030: Summary: Add setQuota operation to HttpFS Key: HDFS-15030 URL: https://issues.apache.org/jira/browse/HDFS-15030 Project: Hadoop HDFS Issue Type: Improvement Reporter: hemanthboyina Assignee: hemanthboyina setQuota operation is missing in HttpFS
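A setQuota call through HttpFS would presumably look like the existing WebHDFS-style REST operations. The sketch below only builds the request URL; the operation name (SETQUOTA), the parameter names, and the default HttpFS port 14000 are assumptions modeled on WebHDFS conventions, since the actual API is whatever HDFS-15030 eventually lands.

```java
// Hypothetical request-URL construction for an HttpFS setQuota operation.
// Assumptions: op=SETQUOTA and the two quota parameter names mirror WebHDFS
// usage; 14000 is the default HttpFS port. The real HDFS-15030 API may differ.
public class SetQuotaUrl {
    static String setQuotaUrl(String host, String path,
                              long nsQuota, long ssQuota) {
        return String.format(
            "http://%s:14000/webhdfs/v1%s?op=SETQUOTA"
                + "&namespacequota=%d&storagespacequota=%d",
            host, path, nsQuota, ssQuota);
    }

    public static void main(String[] args) {
        // e.g. cap /user/alice at 1000 names and 10 GiB of storage
        System.out.println(setQuotaUrl("httpfs.example.com", "/user/alice",
                                       1000L, 10737418240L));
    }
}
```

The resulting URL would be issued as an HTTP PUT, the same way HttpFS already proxies other WebHDFS mutations to the NameNode.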
[jira] [Commented] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
[ https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987095#comment-16987095 ] hemanthboyina commented on HDFS-14957: -- Attached the patch, please review.
[jira] [Updated] (HDFS-14957) INodeReference Space Consumed was not same in QuotaUsage and ContentSummary
[ https://issues.apache.org/jira/browse/HDFS-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hemanthboyina updated HDFS-14957: - Attachment: HDFS-14957.001.patch Status: Patch Available (was: Open)
[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986920#comment-16986920 ] Hadoop QA commented on HDFS-13811: -- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 36s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 49s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 32s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 45s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 49s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 33s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 48s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 6s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 60m 57s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-13811 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12987366/HDFS-13811.006.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 82775d96bdf0 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 0c217fe | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/28438/testReport/ | | Max. process+thread count | 3134 (vs. ulimit of 5500) | | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: hadoop-hdfs-project/hadoop-hdfs-rbf | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28438/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated.
[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986891#comment-16986891 ] huhaiyang edited comment on HDFS-15024 at 12/3/19 1:26 PM: --- Current In RetryPolicies Implemented {code:java} /** * @return 0 if this is our first failover/retry (i.e., retry immediately), * sleep exponentially otherwise */ private long getFailoverOrRetrySleepTime(int times) { return times == 0 ? 0 : calculateExponentialTime(delayMillis, times, maxDelayBase); } {code} It is reasonable to consider the number of namenode as a condition for calculating sleep duration was (Author: haiyang hu): Current In RetryPolicies Implemented {code:java} /** * @return 0 if this is our first failover/retry (i.e., retry immediately), * sleep exponentially otherwise */ private long getFailoverOrRetrySleepTime(int times) { return times == 0 ? 0 : calculateExponentialTime(delayMillis, times, maxDelayBase); } {code} It is reasonable to consider the number of namenode as a condition for calculating sleep duration > [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a > condition of calculation of sleep time > --- > > Key: HDFS-15024 > URL: https://issues.apache.org/jira/browse/HDFS-15024 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 2.10.0, 3.3.0, 3.2.1 >Reporter: huhaiyang >Priority: Major > Attachments: HDFS-15024.001.patch, client_error.log > > > When we enable the ONN , there will be three NN nodes for the client > configuration, > Such as configuration > > dfs.ha.namenodes.ns1 > nn2,nn3,nn1 > > Currently, > nn2 is in standby state > nn3 is in observer state > nn1 is in active state > When the user performs an access HDFS operation > ./bin/hadoop --loglevel debug fs > -Ddfs.client.failover.proxy.provider.ns1=org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider > -mkdir /user/haiyang1/test8 > You need to request nn1 when you execute the msync method, > Actually connect nn2 first 
and failover is required > In connection nn3 does not meet the requirements, failover needs to be > performed, but at this time, failover operation needs to be performed during > a period of hibernation > Finally, it took a period of hibernation to connect the successful request to > nn1 > In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime The current > default implementation is Sleep time is calculated when more than one > failover operation is performed > I think that the Number of NameNodes as a condition of calculation of sleep > time is more reasonable > That is, in the current test, executing failover on connection nn3 does not > need to sleep time to directly connect to the next nn node > See client_error.log for details
[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986891#comment-16986891 ] huhaiyang edited comment on HDFS-15024 at 12/3/19 1:26 PM: --- Current In RetryPolicies Implemented {code:java} /** * @return 0 if this is our first failover/retry (i.e., retry immediately), * sleep exponentially otherwise */ private long getFailoverOrRetrySleepTime(int times) { return times == 0 ? 0 : calculateExponentialTime(delayMillis, times, maxDelayBase); } {code} It is reasonable to consider the number of namenode as a condition for calculating sleep duration was (Author: haiyang hu): In RetryPolicies Implemented /** * @return 0 if this is our first failover/retry (i.e., retry immediately), * sleep exponentially otherwise */ private long getFailoverOrRetrySleepTime(int times) { return times == 0 ? 0 : calculateExponentialTime(delayMillis, times, maxDelayBase); }
[jira] [Commented] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986891#comment-16986891 ] huhaiyang commented on HDFS-15024: -- In RetryPolicies Implemented /** * @return 0 if this is our first failover/retry (i.e., retry immediately), * sleep exponentially otherwise */ private long getFailoverOrRetrySleepTime(int times) { return times == 0 ? 0 : calculateExponentialTime(delayMillis, times, maxDelayBase); }
[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986882#comment-16986882 ] huhaiyang edited comment on HDFS-15024 at 12/3/19 1:11 PM: --- [~xkrogen] [~csun] [~vagarychen] Thanks for your comments! I understand Normally, if we only set 2 NNS, dfs.ha.namenodes.ns1 nn1,nn2 Currently, nn1 is in active state nn2 is in standby state when the client connects to nn2, it needs to retry, and will quickly connect to nn1. However, when the nn1 fails to connect due to network problems, the next time( the third time), the sleep timeout will be performed for a period of time to retry Current HDFS-6440 Support more than 2 NameNodes. if we set 3 NNS, dfs.ha.namenodes.ns1 nn1,nn2,nn3 nn1 is in active state nn2 is in standby state nn3 is in standby state(or observer state) when the client connects to nn2, it needs to retry, and will quickly connect to nn3. and the client connects to nn3, it needs to retry, and will quickly connect to nn1. However, when the nn1 fails to connect due to network problems, the next time( the fourth time), the sleep timeout will be performed for a period of time to retry. That is to say, it is necessary to connect all the configured NN nodes once. If no NN nodes the requirements are found, required to perform sleep and retry... In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime , I think that the Number of NameNodes as a condition of calculation of sleep time is more reasonable((which is current v01 patch)). was (Author: haiyang hu): [~xkrogen][~csun][~vagarychen] Thanks for your comments! I understand Normally, if we only set 2 NNS, dfs.ha.namenodes.ns1 nn1,nn2 Currently, nn1 is in active state nn2 is in standby state when the client connects to nn2, it needs to retry, and will quickly connect to nn1. 
However, when the nn1 fails to connect due to network problems, the next time( the third time), the sleep timeout will be performed for a period of time to retry Current HDFS-6440 Support more than 2 NameNodes. if we set 3 NNS, dfs.ha.namenodes.ns1 nn1,nn2,nn3 nn1 is in active state nn2 is in standby state nn3 is in standby state(or observer state) when the client connects to nn2, it needs to retry, and will quickly connect to nn3. and the client connects to nn3, it needs to retry, and will quickly connect to nn1. However, when the nn1 fails to connect due to network problems, the next time( the fourth time), the sleep timeout will be performed for a period of time to retry. That is to say, it is necessary to connect all the configured NN nodes once. If no NN nodes the requirements are found, required to perform sleep and retry... In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime , I think that the Number of NameNodes as a condition of calculation of sleep time is more reasonable((which is current v01 patch)). 
[jira] [Comment Edited] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986882#comment-16986882 ] huhaiyang edited comment on HDFS-15024 at 12/3/19 1:09 PM: --- [~xkrogen][~csun][~vagarychen] Thanks for your comments! I understand Normally, if we only set 2 NNS, dfs.ha.namenodes.ns1 nn1,nn2 Currently, nn1 is in active state nn2 is in standby state when the client connects to nn2, it needs to retry, and will quickly connect to nn1. However, when the nn1 fails to connect due to network problems, the next time( the third time), the sleep timeout will be performed for a period of time to retry Current HDFS-6440 Support more than 2 NameNodes. if we set 3 NNS, dfs.ha.namenodes.ns1 nn1,nn2,nn3 nn1 is in active state nn2 is in standby state nn3 is in standby state(or observer state) when the client connects to nn2, it needs to retry, and will quickly connect to nn3. and the client connects to nn3, it needs to retry, and will quickly connect to nn1. However, when the nn1 fails to connect due to network problems, the next time( the fourth time), the sleep timeout will be performed for a period of time to retry. That is to say, it is necessary to connect all the configured NN nodes once. If no NN nodes the requirements are found, required to perform sleep and retry... In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime , I think that the Number of NameNodes as a condition of calculation of sleep time is more reasonable((which is current v01 patch)). was (Author: haiyang hu): [~xkrogen][~csun][~vagarychen] Thanks for your comments! I understand Normally, if we only set 2 NNS, dfs.ha.namenodes.ns1 nn1,nn2 Currently, nn1 is in active state nn2 is in standby state when the client connects to nn2, it needs to retry, and will quickly connect to nn1. 
However, when the nn1 fails to connect due to network problems, the next time( the third time), the sleep timeout will be performed for a period of time to retry Current HDFS-6440 Support more than 2 NameNodes. if we set 3 NNS, dfs.ha.namenodes.ns1 nn1,nn2,nn3 nn1 is in active state nn2 is in standby state nn3 is in standby state(or observer state) when the client connects to nn1, it needs to retry, and will quickly connect to nn2. and the client connects to nn2, it needs to retry, and will quickly connect to nn1. However, when the nn1 fails to connect due to network problems, the next time( the fourth time), the sleep timeout will be performed for a period of time to retry. That is to say, it is necessary to connect all the configured NN nodes once. If no NN nodes the requirements are found, required to perform sleep and retry... In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime , I think that the Number of NameNodes as a condition of calculation of sleep time is more reasonable((which is current v01 patch)). 
[jira] [Commented] (HDFS-15024) [SBN read] In FailoverOnNetworkExceptionRetry , Number of NameNodes as a condition of calculation of sleep time
[ https://issues.apache.org/jira/browse/HDFS-15024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986882#comment-16986882 ] huhaiyang commented on HDFS-15024: -- [~xkrogen][~csun][~vagarychen] Thanks for your comments! I understand Normally, if we only set 2 NNS, dfs.ha.namenodes.ns1 nn1,nn2 Currently, nn1 is in active state nn2 is in standby state when the client connects to nn2, it needs to retry, and will quickly connect to nn1. However, when the nn1 fails to connect due to network problems, the next time( the third time), the sleep timeout will be performed for a period of time to retry Current HDFS-6440 Support more than 2 NameNodes. if we set 3 NNS, dfs.ha.namenodes.ns1 nn1,nn2,nn3 nn1 is in active state nn2 is in standby state nn3 is in standby state(or observer state) when the client connects to nn1, it needs to retry, and will quickly connect to nn2. and the client connects to nn2, it needs to retry, and will quickly connect to nn1. However, when the nn1 fails to connect due to network problems, the next time( the fourth time), the sleep timeout will be performed for a period of time to retry. That is to say, it is necessary to connect all the configured NN nodes once. If no NN nodes the requirements are found, required to perform sleep and retry... In FailoverOnNetworkExceptionRetry getFailoverOrRetrySleepTime , I think that the Number of NameNodes as a condition of calculation of sleep time is more reasonable((which is current v01 patch)). 
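The change huhaiyang is arguing for in this thread — use the number of configured NameNodes, not a fixed failover count of one, as the threshold before sleeping — could be sketched roughly like this. It is not the actual HDFS-15024.001 patch; the `NUM_NAMENODES` constant and the deterministic backoff (the real `calculateExponentialTime` in RetryPolicies includes a random factor) are simplifications for illustration.

```java
// Sketch of the HDFS-15024 idea: retry immediately until every configured
// NameNode has been tried once, then back off exponentially. Assumptions:
// NUM_NAMENODES would really come from dfs.ha.namenodes.<ns>, and the real
// backoff calculation is randomized; both are simplified here.
public class FailoverSleepSketch {
    static final long DELAY_MILLIS = 500;      // failover.sleep.base
    static final long MAX_DELAY_BASE = 15000;  // failover.sleep.max
    static final int NUM_NAMENODES = 3;        // e.g. nn1, nn2, nn3

    // Deterministic stand-in for RetryPolicies.calculateExponentialTime.
    static long exponentialTime(long time, int retries, long cap) {
        return Math.min(cap, time * (1L << retries));
    }

    // Current behavior: only the very first failover is "free".
    static long currentSleep(int times) {
        return times == 0 ? 0
            : exponentialTime(DELAY_MILLIS, times, MAX_DELAY_BASE);
    }

    // Proposed behavior: the first (NUM_NAMENODES - 1) failovers are free,
    // so one full pass over all NameNodes never sleeps; with 3 NNs the
    // client can skip a standby and an observer before backing off.
    static long proposedSleep(int times) {
        int freeFailovers = NUM_NAMENODES - 1;
        return times < freeFailovers ? 0
            : exponentialTime(DELAY_MILLIS, times - freeFailovers + 1,
                              MAX_DELAY_BASE);
    }

    public static void main(String[] args) {
        for (int t = 0; t <= 3; t++) {
            System.out.println(t + ": current=" + currentSleep(t)
                + "ms proposed=" + proposedSleep(t) + "ms");
        }
    }
}
```

With the three-node configuration from the issue description, failing over from nn2 to nn3 and then nn3 to nn1 would both cost 0 ms under the proposed rule, matching the "no sleep until all NNs were tried" behavior the reporter wants.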
[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986867#comment-16986867 ] Jinglun commented on HDFS-13811: Hi [~linyiqun], thanks for your nice comments! I fixed the failed unit tests and followed your suggestions. Uploaded v06. > RBF: Race condition between router admin quota update and periodic quota > update service > --- > > Key: HDFS-13811 > URL: https://issues.apache.org/jira/browse/HDFS-13811 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Dibyendu Karmakar >Assignee: Jinglun >Priority: Major > Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, > HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, > HDFS-13811.004.patch, HDFS-13811.005.patch, HDFS-13811.006.patch > > > If we try to update quota of an existing mount entry and at the same time > periodic quota update service is running on the same mount entry, it is > leading the mount table to _inconsistent state._ > Here transactions are: > A - Quota update service is fetching mount table entries. > B - Quota update service is updating the mount table with current usage. > A' - User is trying to update quota using admin cmd. > and the transaction sequence is [ A A' B ] > quota update service is updating the mount table with old quota value. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-13811: --- Attachment: HDFS-13811.006.patch > RBF: Race condition between router admin quota update and periodic quota > update service > --- > > Key: HDFS-13811 > URL: https://issues.apache.org/jira/browse/HDFS-13811 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Dibyendu Karmakar >Assignee: Jinglun >Priority: Major > Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, > HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, > HDFS-13811.004.patch, HDFS-13811.005.patch, HDFS-13811.006.patch > > > If we try to update quota of an existing mount entry and at the same time > periodic quota update service is running on the same mount entry, it is > leading the mount table to _inconsistent state._ > Here transactions are: > A - Quota update service is fetching mount table entries. > B - Quota update service is updating the mount table with current usage. > A' - User is trying to update quota using admin cmd. > and the transaction sequence is [ A A' B ] > quota update service is updating the mount table with old quota value. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13671) Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet
[ https://issues.apache.org/jira/browse/HDFS-13671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986840#comment-16986840 ] Stephen O'Donnell commented on HDFS-13671: -- I have seen a few occurrences of an issue where operations against the FoldedTreeSet structure (removeAndGet and get) appear to get worse over time. In one case some time back, I recall a NameNode that worked fine for about 4 weeks and then deletes became painfully slow. The delete code held the NN lock for too long per iteration and ultimately blocked IBRs etc. Restarting the NN fixed the issue, for about another 4 weeks, when it returned. Once the slowdown started happening it would not clear without a restart. I recently came across a similar-looking issue where the DataNodes (which also use FoldedTreeSet for the replica map) appeared to slow down on get operations from the FoldedTreeSet; again, a restart seems to have cleared the problem. This makes me wonder if the issue is that the data structure somehow degrades (e.g. does not balance correctly) over time, rather than being slow from the outset. However, I have not been able to figure out how that could be the case. > Namenode deletes large dir slowly caused by FoldedTreeSet#removeAndGet > -- > > Key: HDFS-13671 > URL: https://issues.apache.org/jira/browse/HDFS-13671 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.1.0, 3.0.3 >Reporter: Yiqun Lin >Priority: Major > > NameNode hung when deleting large files/blocks. 
The stack info: > {code} > "IPC Server handler 4 on 8020" #87 daemon prio=5 os_prio=0 > tid=0x7fb505b27800 nid=0x94c3 runnable [0x7fa861361000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.compare(FoldedTreeSet.java:474) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.removeAndGet(FoldedTreeSet.java:849) > at > org.apache.hadoop.hdfs.util.FoldedTreeSet.remove(FoldedTreeSet.java:911) > at > org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.removeBlock(DatanodeStorageInfo.java:252) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:194) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.removeBlock(BlocksMap.java:108) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlockFromMap(BlockManager.java:3813) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.removeBlock(BlockManager.java:3617) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.removeBlocks(FSNamesystem.java:4270) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:4244) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInt(FSNamesystem.java:4180) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:4164) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:871) > at > org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.delete(AuthorizationProviderProxyClientProtocol.java:311) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:625) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617) > {code} > In the current 
deletion logic in NameNode, there are mainly two steps: > * Collect INodes and all blocks to be deleted, then delete INodes. > * Remove blocks chunk by chunk in a loop. > Actually the first step should be the more expensive operation and should > take more time. However, we now always see the NN hang during the > remove-block operation. > Looking into this: we introduced a new structure {{FoldedTreeSet}} to get > better performance when dealing with FBRs/IBRs. But compared with the early > implementation of the remove-block logic, {{FoldedTreeSet}} seems slower, > since it takes additional time to rebalance tree nodes. When there are many > blocks to be removed/deleted, this looks bad. > For the get-type operations in {{DatanodeStorageInfo}}, we only provide > {{getBlockIterator}} to return a block iterator, and no other get operation > for a specified block. Do we still need to use {{FoldedTreeSet}} in > {{DatanodeStorageInfo}}? As we know, {{FoldedTreeSet}} benefits Get, not > Update. Maybe we can revert this to the early implementation. -- This message was sent by Atlassian Jira (v8.3.4#803005)
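The second step above, removing blocks chunk by chunk, can be sketched roughly as below. This is an illustrative toy, not the FSNamesystem code; the chunk size and the lock comments are assumptions, and the point is that the per-chunk cost is dominated by the `remove` calls against the blocks map, which is where FoldedTreeSet is reported to degrade.

```java
import java.util.List;
import java.util.Set;

public class ChunkedDeleteSketch {

    // Removes the collected blocks from the blocks map in fixed-size chunks.
    // In the real NameNode the namesystem write lock is re-acquired per chunk
    // so that IBRs and other RPCs can make progress in between; here we only
    // model the loop structure and count the chunks.
    static int removeBlocksInChunks(List<Long> collected, Set<Long> blocksMap,
                                    int chunkSize) {
        int chunks = 0;
        for (int i = 0; i < collected.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, collected.size());
            // --- write lock would be taken here ---
            for (Long block : collected.subList(i, end)) {
                blocksMap.remove(block); // the call that is slow with FoldedTreeSet
            }
            // --- write lock would be released here ---
            chunks++;
        }
        return chunks;
    }
}
```

If each `remove` gets slower as the structure degrades, the lock is held longer per chunk, which matches the observed blocking of IBRs during large deletes.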
[jira] [Commented] (HDFS-14998) Update Observer Namenode doc for ZKFC after HDFS-14130
[ https://issues.apache.org/jira/browse/HDFS-14998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986832#comment-16986832 ] Fei Hui commented on HDFS-14998: [~csun] [~ayushtkn] Thanks for your comments. {quote} With ZKFC running in both hosts, user just need to run the DFS admin commands to achieve that. {quote} *Without* ZKFC running, the user could also run the DFS admin commands to achieve that (transition an observer to standby, or a standby to observer). Is that right? The only benefit of running ZKFC on the Observer NameNode is that it would join the election when the Observer NameNode is transitioned to Standby state. From your words, dynamic transition (between observer and standby) can be done only when ZKFC is running on it. But I think dynamic transition can be done without ZKFC running. Please correct me, thanks. > Update Observer Namenode doc for ZKFC after HDFS-14130 > -- > > Key: HDFS-14998 > URL: https://issues.apache.org/jira/browse/HDFS-14998 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 3.3.0 >Reporter: Fei Hui >Assignee: Fei Hui >Priority: Minor > Attachments: HDFS-14998.001.patch, HDFS-14998.002.patch, > HDFS-14998.003.patch > > > After HDFS-14130, we should update the observer namenode doc: the observer > namenode can run with ZKFC running -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15028) Keep the capacity of volume and reduce a system call
[ https://issues.apache.org/jira/browse/HDFS-15028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986774#comment-16986774 ] Yang Yun commented on HDFS-15028: - I don't have numbers for how much overhead the change saves on a local disk; it is probably negligible. But we hit this issue when we mounted a slow remote volume as Archive storage. Yes, it breaks detection of underlying changes to the volume size; the fix should be configurable and disabled by default to keep the current behavior. > Keep the capacity of volume and reduce a system call > > > Key: HDFS-15028 > URL: https://issues.apache.org/jira/browse/HDFS-15028 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Reporter: Yang Yun >Assignee: Yang Yun >Priority: Minor > Attachments: HDFS-15028.patch > > > The local volume does not change, so keep the first value of the capacity > and reuse it for each heartbeat. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
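The configurable caching discussed above could look roughly like this sketch. It is not the actual FsVolumeImpl code: the flag name is hypothetical, and the `LongSupplier` stands in for the DF/statfs system call that reads the volume capacity.

```java
import java.util.function.LongSupplier;

public class CachedCapacityVolume {

    // Hypothetical flag (e.g. a new "fixed volume size" config), disabled by
    // default so the current behavior of probing on every heartbeat is kept.
    private final boolean fixedCapacity;
    private final LongSupplier capacityProbe; // stands in for the DF system call
    private long cachedCapacity = -1;

    public CachedCapacityVolume(boolean fixedCapacity, LongSupplier capacityProbe) {
        this.fixedCapacity = fixedCapacity;
        this.capacityProbe = capacityProbe;
    }

    // Called on every heartbeat. With the flag on, only the first call hits
    // the (possibly slow, remote) volume; later calls reuse the cached value.
    public long getCapacity() {
        if (fixedCapacity && cachedCapacity >= 0) {
            return cachedCapacity;
        }
        cachedCapacity = capacityProbe.getAsLong();
        return cachedCapacity;
    }
}
```

The trade-off is exactly the one raised in the comment: with the flag on, a resize of the underlying volume is not noticed until restart, which is why it should default to off.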
[jira] [Commented] (HDFS-13811) RBF: Race condition between router admin quota update and periodic quota update service
[ https://issues.apache.org/jira/browse/HDFS-13811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16986773#comment-16986773 ] Yiqun Lin commented on HDFS-13811: -- Thanks for updating the patch, [~LiJinglun]. The change looks almost good to me. Some minor comments: * Can you update the javadoc of the quota service class? {code:java} /** * Service to periodically update the {@link RouterQuotaUsage} * cached information in the {@link Router} and update corresponding * mount table in State Store. */ public class RouterQuotaUpdateService extends PeriodicService { {code} * I don't think we need to change the following places, since the mount table entry queried from the state store should already contain the quota usage now. {code:java} -updatedMountTable = getMountTable(path); -quota = updatedMountTable.getQuota(); ... -assertEquals(2, mountQuota1.getFileAndDirectoryCount()); +assertEquals(0, mountQuota1.getFileAndDirectoryCount()); {code} * Can you look into the failed unit test? It's related. > RBF: Race condition between router admin quota update and periodic quota > update service > --- > > Key: HDFS-13811 > URL: https://issues.apache.org/jira/browse/HDFS-13811 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Dibyendu Karmakar >Assignee: Jinglun >Priority: Major > Attachments: HDFS-13811-000.patch, HDFS-13811-HDFS-13891-000.patch, > HDFS-13811.001.patch, HDFS-13811.002.patch, HDFS-13811.003.patch, > HDFS-13811.004.patch, HDFS-13811.005.patch > > > If we try to update quota of an existing mount entry and at the same time > periodic quota update service is running on the same mount entry, it is > leading the mount table to _inconsistent state._ > Here transactions are: > A - Quota update service is fetching mount table entries. > B - Quota update service is updating the mount table with current usage. > A' - User is trying to update quota using admin cmd. 
> and the transaction sequence is [ A A' B ] > quota update service is updating the mount table with old quota value. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
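The [ A A' B ] interleaving described in the issue can be reproduced with a small model. This is a toy illustration of the lost-update race, not the Router code; the class and method names are invented, and the fix shown (writing back only the usage fields) is one possible approach, not necessarily the one the patches take.

```java
public class QuotaRaceSketch {

    static class MountEntry {
        long quota; // set by the admin command
        long usage; // refreshed by the periodic quota update service
    }

    static class StateStore {
        private final MountEntry entry = new MountEntry();

        // A: the periodic service fetches a snapshot of the mount entry.
        synchronized MountEntry snapshot() {
            MountEntry copy = new MountEntry();
            copy.quota = entry.quota;
            copy.usage = entry.usage;
            return copy;
        }

        // A': the admin command updates the quota.
        synchronized void adminSetQuota(long quota) {
            entry.quota = quota;
        }

        // B (buggy): the service writes the whole snapshot back,
        // clobbering the quota the admin set in between.
        synchronized void writeBackWholeEntry(MountEntry snapshot, long newUsage) {
            entry.quota = snapshot.quota;
            entry.usage = newUsage;
        }

        // B (fixed): only the usage field is written back, so a
        // concurrent admin quota update survives.
        synchronized void writeBackUsageOnly(long newUsage) {
            entry.usage = newUsage;
        }

        synchronized long quota() { return entry.quota; }
    }
}
```

Running the sequence A, A', B against `writeBackWholeEntry` ends with the old quota in the store (the inconsistent state the issue describes), while the usage-only write-back preserves the admin's update.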