[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518901#comment-16518901 ]

Anu Engineer commented on HDFS-10285:
-------------------------------------

+1, I also completely agree with [~umamaheswararao]'s proposal. The internal SPS code can come in the next phase of SPS. This way HBase will be able to start using the feature, and whether SPS is internal or external is not visible to clients. I suggest that we start a DISCUSS thread for the SPS merge.

> Storage Policy Satisfier in Namenode
> ------------------------------------
>
>                 Key: HDFS-10285
>                 URL: https://issues.apache.org/jira/browse/HDFS-10285
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, namenode
>    Affects Versions: HDFS-10285
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Major
>         Attachments: HDFS-10285-consolidated-merge-patch-00.patch, HDFS-10285-consolidated-merge-patch-01.patch, HDFS-10285-consolidated-merge-patch-02.patch, HDFS-10285-consolidated-merge-patch-03.patch, HDFS-10285-consolidated-merge-patch-04.patch, HDFS-10285-consolidated-merge-patch-05.patch, HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, Storage-Policy-Satisfier-in-HDFS-May10.pdf, Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These policies can be set on a directory or file to specify the user's preference for where the physical blocks should be stored. When the user sets the storage policy before writing data, the blocks are stored according to the policy's preferences.
> If the user sets the storage policy after the file has been written and completed, the blocks will already have been written with the default storage policy (namely DISK). The user then has to run the 'Mover tool' explicitly, passing all such file names as a list.
> In some distributed-system scenarios (e.g. HBase) it would be difficult to collect all the files and run the tool, as different nodes can write files separately and files can have different paths.
> Another scenario: when the user renames a file from a directory with one effective storage policy (inherited from the parent directory) to a directory with a different effective storage policy, the inherited storage policy is not copied from the source; the file instead takes its effective policy from the destination parent. This rename operation is just a metadata change in the Namenode; the physical blocks still remain placed according to the source storage policy.
> So tracking all such business-logic-based file names across distributed nodes (e.g. region servers) and running the Mover tool could be difficult for admins.
> The proposal here is to provide an API in the Namenode itself to trigger storage policy satisfaction. A daemon thread inside the Namenode would track such calls and send them to the DNs as movement commands.
> Will post the detailed design thoughts document soon.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
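To make the pain point in the description above concrete, this is roughly the manual workflow it refers to, sketched with the standard HDFS CLI (the path and policy name are hypothetical, and the commands assume a running cluster):

```shell
# Set a storage policy on a directory that already holds data; this is
# only a metadata change, so existing blocks keep their current placement.
hdfs storagepolicies -setStoragePolicy -path /hbase/data -policy COLD

# Today the admin must explicitly run the Mover tool over every affected
# path to migrate the physical blocks to match the policy.
hdfs mover -p /hbase/data
```

The JIRA's proposal moves this second step into the Namenode, so clients such as HBase can request policy satisfaction per path instead of collecting file lists for the Mover.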
[jira] [Commented] (HDDS-94) Change ozone datanode command to start the standalone datanode plugin
[ https://issues.apache.org/jira/browse/HDDS-94?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518723#comment-16518723 ]

Elek, Marton commented on HDDS-94:
----------------------------------

Thank you very much for doing this, [~Sandeep Nemuri]. I did a quick survey offline and asked multiple people. Most of them voted to use "datanode" as the command line name of the standalone ozone datanode (ozone datanode instead of ozone hddsdn). I propose to replace the original datanode subcommand instead of creating hddsdn (sorry for not including this info in the original description). I also propose to update the docker-compose files:

./hadoop-ozone/acceptance-test/src/test/acceptance/basic/docker-compose.yaml
./hadoop-dist/src/main/compose/ozoneperf/docker-compose.yaml
./hadoop-dist/src/main/compose/ozone/docker-compose.yaml

The namenode service could be removed from them (they use "ozone datanode", which should be fine after the change). It could be done in another jira, but I prefer to do it in one step to make it easier to test. (BTW, I tested it by modifying the docker-compose files locally and it worked very well.) Thanks again.

> Change ozone datanode command to start the standalone datanode plugin
> ---------------------------------------------------------------------
>
>                 Key: HDDS-94
>                 URL: https://issues.apache.org/jira/browse/HDDS-94
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Elek, Marton
>            Assignee: Sandeep Nemuri
>            Priority: Major
>              Labels: newbie
>             Fix For: 0.2.1
>
>         Attachments: HDDS-94.001.patch
>
>
> The current ozone datanode command starts the regular hdfs datanode with an enabled HddsDatanodeService as a datanode plugin.
> The goal is to start only HddsDatanodeService.java (the main function is already there, but GenericOptionParser should be adopted).
[jira] [Commented] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518718#comment-16518718 ]

Hudson commented on HDFS-13682:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14457 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/14457/])
HDFS-13682. Cannot create encryption zone after KMS auth token expires. (xiao: rev 32f867a6a907c05a312657139d295a92756d98ef)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestSecureEncryptionZoneWithKMS.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/crypto/key/kms/KMSClientProvider.java

> Cannot create encryption zone after KMS auth token expires
> ----------------------------------------------------------
>
>                 Key: HDFS-13682
>                 URL: https://issues.apache.org/jira/browse/HDFS-13682
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: encryption, kms, namenode
>    Affects Versions: 3.0.0
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>            Priority: Critical
>             Fix For: 3.2.0, 3.1.1, 3.0.4
>
>         Attachments: HDFS-13682.01.patch, HDFS-13682.02.patch, HDFS-13682.03.patch, HDFS-13682.dirty.repro.branch-2.patch, HDFS-13682.dirty.repro.patch
>
>
> Our internal testing reported this behavior recently.
> {noformat}
> [root@nightly6x-1 ~]# sudo -u hdfs /usr/bin/kinit -kt /cdep/keytabs/hdfs.keytab hdfs -l 30d -r 30d
> [root@nightly6x-1 ~]# sudo -u hdfs klist
> Ticket cache: FILE:/tmp/krb5cc_994
> Default principal: h...@gce.cloudera.com
> Valid starting       Expires              Service principal
> 06/12/2018 03:24:09  07/12/2018 03:24:09  krbtgt/gce.cloudera@gce.cloudera.com
> [root@nightly6x-1 ~]# sudo -u hdfs hdfs crypto -createZone -keyName key77 -path /user/systest/ez
> RemoteException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
> {noformat}
> Upon further investigation, it's because the KMS client (cached in the HDFS NN) cannot authenticate with the server after the authentication token (which is cached by KMSCP) expires, even though the HDFS client RPC has valid Kerberos credentials.
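For illustration only, here is a minimal sketch of the relogin-and-retry idea discussed in the comments on this issue. The actual fix lives in KMSClientProvider and UserGroupInformation (per the commit above); the `KmsAuthRetrySketch` class, `callWithRelogin` helper, and `KmsCall` interface are hypothetical names invented for this sketch.

```java
// Hypothetical illustration of refreshing Kerberos credentials and
// retrying once when a cached KMS auth token has expired.
import java.io.IOException;
import org.apache.hadoop.security.UserGroupInformation;

class KmsAuthRetrySketch {
  /** Run a KMS operation; on auth failure, refresh the TGT and retry once. */
  static <T> T callWithRelogin(KmsCall<T> call) throws IOException {
    try {
      return call.run();
    } catch (IOException authFailure) {
      // The cached auth token may have expired even though the caller's
      // Kerberos credentials are valid: refresh the login user's TGT from
      // the keytab (a no-op if not keytab-based or still fresh) and retry.
      UserGroupInformation ugi = UserGroupInformation.getLoginUser();
      ugi.checkTGTAndReloginFromKeytab();
      return call.run();
    }
  }

  /** A KMS operation that may fail with an authentication error. */
  interface KmsCall<T> {
    T run() throws IOException;
  }
}
```

This mirrors the bug report: the first attempt fails once the token cached by the KMS client expires, and only a credential refresh on the long-lived (NN-cached) client lets the retry succeed.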
[jira] [Updated] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-13682:
-----------------------------
    Resolution: Fixed
  Hadoop Flags: Reviewed
 Fix Version/s: 3.0.4
                3.1.1
                3.2.0
        Status: Resolved  (was: Patch Available)
[jira] [Updated] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-13682:
-----------------------------
    Component/s: kms
[jira] [Commented] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518699#comment-16518699 ]

Xiao Chen commented on HDFS-13682:
----------------------------------

Thanks a lot, Wei-Chiu! The test failure is HDFS-13662, unrelated to this patch. Committing.
[jira] [Commented] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518670#comment-16518670 ]

genericqa commented on HDFS-13682:
----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 33s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 41s | Maven dependency ordering for branch |
| +1 | mvninstall | 27m 21s | trunk passed |
| +1 | compile | 29m 36s | trunk passed |
| +1 | checkstyle | 0m 23s | trunk passed |
| +1 | mvnsite | 2m 26s | trunk passed |
| +1 | shadedclient | 14m 14s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 42s | trunk passed |
| +1 | javadoc | 1m 54s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 50s | the patch passed |
| +1 | compile | 28m 29s | the patch passed |
| +1 | javac | 28m 29s | the patch passed |
| +1 | checkstyle | 0m 24s | the patch passed |
| +1 | mvnsite | 2m 22s | the patch passed |
| +1 | whitespace | 0m 1s | The patch has no whitespace issues. |
| +1 | shadedclient | 11m 19s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 57s | the patch passed |
| +1 | javadoc | 1m 54s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 9m 17s | hadoop-common in the patch passed. |
| -1 | unit | 97m 46s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 45s | The patch does not generate ASF License warnings. |
| | | 239m 22s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.client.impl.TestBlockReaderLocal |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDFS-13682 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12928520/HDFS-13682.03.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 38b60da8f0c5 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9a9e969 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC1 |
| unit |
[jira] [Commented] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518520#comment-16518520 ]

Wei-Chiu Chuang commented on HDFS-13682:
----------------------------------------

+1 pending Jenkins
[jira] [Commented] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518444#comment-16518444 ]

Xiao Chen commented on HDFS-13682:
----------------------------------

Thanks for the review and the offline discussion, [~jojochuang]! Actually my memory overflowed, and we can use {{UGI#shouldRelogin}}. [^HDFS-13682.03.patch] uploaded.
[jira] [Updated] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-13682:
-----------------------------
    Attachment: HDFS-13682.03.patch
[jira] [Commented] (HDFS-13682) Cannot create encryption zone after KMS auth token expires
[ https://issues.apache.org/jira/browse/HDFS-13682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518345#comment-16518345 ]

Wei-Chiu Chuang commented on HDFS-13682:
----------------------------------------

Thanks, [~xiaochen]! OK, I think the bug report & fix make sense, so it looks like the externally managed subjects are handled by HADOOP-9747. The shadedclient error doesn't seem related. Could you consider renaming ugiCanRelogin to something like shouldUseLoginUser()? Somehow the name ugiCanRelogin confused me. +1 after that.
[jira] [Commented] (HDFS-13448) HDFS Block Placement - Ignore Locality for First Block Replica
[ https://issues.apache.org/jira/browse/HDFS-13448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518314#comment-16518314 ]

Daniel Templeton commented on HDFS-13448:
-----------------------------------------

The build failures are unrelated:
{quote}
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (common-test-bats-driver) on project hadoop-common: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part .. @ 4:69 in /testptch/hadoop/hadoop-common-project/hadoop-common/target/antrun/build-main.xml
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn -rf :hadoop-common
{quote}
It's also really hard to find that output. It took some reverse engineering. Someone should look into that...

The unit test failures are worth looking at. I recognize at least one of them as flaky, but I can't assert they all are flaky. (Maybe someone else could.) Aside from the unit tests, the patch looks good to me. I haven't done a formal final pass, but I think you got it. [~daryn], wanna take another look? Thanks for sticking through it, [~belugabehr].
> HDFS Block Placement - Ignore Locality for First Block Replica
> --------------------------------------------------------------
>
>                 Key: HDFS-13448
>                 URL: https://issues.apache.org/jira/browse/HDFS-13448
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: block placement, hdfs-client
>    Affects Versions: 2.9.0, 3.0.1
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Minor
>         Attachments: HDFS-13448.10.patch, HDFS-13448.11.patch, HDFS-13448.12.patch, HDFS-13448.13.patch, HDFS-13448.6.patch, HDFS-13448.7.patch, HDFS-13448.8.patch
>
>
> According to the HDFS Block Placement Rules:
> {quote}
> /**
>  * The replica placement strategy is that if the writer is on a datanode,
>  * the 1st replica is placed on the local machine,
>  * otherwise a random datanode. The 2nd replica is placed on a datanode
>  * that is on a different rack. The 3rd replica is placed on a datanode
>  * which is on a different node of the rack as the second replica.
>  */
> {quote}
> However, there is a hint for the hdfs-client that allows the block placement request to not put a block replica on the local datanode, _where 'local' means the same host as the client is being run on_:
> {quote}
> /**
>  * Advise that a block replica NOT be written to the local DataNode where
>  * 'local' means the same host as the client is being run on.
>  *
>  * @see CreateFlag#NO_LOCAL_WRITE
>  */
> {quote}
> I propose that we add a new flag that allows the hdfs-client to request that the first block replica be placed on a random DataNode in the cluster. The subsequent block replicas should follow the normal block placement rules.
> The issue is that when {{NO_LOCAL_WRITE}} is enabled, the first block replica is not placed on the local node, but it is still placed on the local rack. Where this comes into play is where you have, for example, a Flume agent that is loading data into HDFS.
> If the Flume agent is running on a DataNode, then by default, the DataNode local to the Flume agent will always get the first block replica, and this leads to uneven block placement, with the local node always filling up faster than any other node in the cluster.
> Modifying this example, if the DataNode is removed from the host where the Flume agent is running, or {{NO_LOCAL_WRITE}} is enabled by Flume, then the default block placement policy will still prefer the local rack. This remedies the situation only so far as the first block replica will now always be distributed to a DataNode on the local rack.
> This new flag would allow a single Flume agent to distribute the blocks randomly, evenly, over the entire cluster instead of hot-spotting the local node or the local rack.
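For context, the existing hint quoted in the description is passed at file-creation time. A minimal sketch of how a client supplies it today (the path and sizes are hypothetical, and this requires a running HDFS cluster; the proposed new flag is intentionally not shown since its name is not settled in this thread):

```java
import java.util.EnumSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

class NoLocalWriteExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // CREATE + NO_LOCAL_WRITE: ask the NN not to place the first replica
    // on this client's local DataNode. As the JIRA notes, the replica may
    // still land on the local rack, which is the limitation being addressed.
    EnumSet<CreateFlag> flags =
        EnumSet.of(CreateFlag.CREATE, CreateFlag.NO_LOCAL_WRITE);
    try (FSDataOutputStream out = fs.create(
        new Path("/flume/events/part-0000"), // hypothetical path
        FsPermission.getFileDefault(),
        flags,
        4096,                 // buffer size
        (short) 3,            // replication factor
        128 * 1024 * 1024L,   // block size
        null)) {              // no progress callback
      out.writeBytes("example payload\n");
    }
  }
}
```

Under the proposal, a Flume-like writer could opt in to fully random first-replica placement the same way, avoiding both the local-node and local-rack hot spots described above.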
[jira] [Commented] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518270#comment-16518270 ]

genericqa commented on HDDS-178:
--------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 23s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 2m 4s | Maven dependency ordering for branch |
| +1 | mvninstall | 26m 54s | trunk passed |
| +1 | compile | 28m 57s | trunk passed |
| +1 | checkstyle | 0m 28s | trunk passed |
| +1 | mvnsite | 1m 14s | trunk passed |
| +1 | shadedclient | 12m 23s | branch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
| +1 | findbugs | 0m 51s | trunk passed |
| +1 | javadoc | 1m 6s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 24s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 48s | the patch passed |
| +1 | compile | 27m 48s | the patch passed |
| +1 | javac | 27m 48s | the patch passed |
| +1 | checkstyle | 0m 27s | the patch passed |
| +1 | mvnsite | 1m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 10m 24s | patch has no errors when building and testing our client artifacts. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: hadoop-ozone/integration-test |
| +1 | findbugs | 0m 58s | the patch passed |
| +1 | javadoc | 1m 8s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 46s | container-service in the patch passed. |
| -1 | unit | 23m 29s | integration-test in the patch failed. |
| +1 | asflicense | 0m 45s | The patch does not generate ASF License warnings. |
| | | 142m 52s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.container.common.statemachine.commandhandler.TestCloseContainerByPipeline |
| | hadoop.ozone.TestStorageContainerManager |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | HDDS-178 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12928476/HDDS-178.004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient
[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode
[ https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518158#comment-16518158 ] genericqa commented on HDFS-13672: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 25s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 13s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 3s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 12s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 56s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 54s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 10s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 5s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 33s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}158m 8s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy | | | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.server.namenode.TestReencryptionWithKMS | | | hadoop.hdfs.client.impl.TestBlockReaderLocal | | | hadoop.hdfs.TestDFSClientRetries | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | HDFS-13672 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12928461/HDFS-13672.001.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 1a5051bc4e08 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 2d87592 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_171 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/24480/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results |
[jira] [Commented] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518100#comment-16518100 ] Lokesh Jain commented on HDDS-178: -- The v4 patch fixes the failure in TestKeys. > DeleteBlocks should not be handled by open containers > - > > Key: HDDS-178 > URL: https://issues.apache.org/jira/browse/HDDS-178 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Datanode >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-178.001.patch, HDDS-178.002.patch, > HDDS-178.003.patch, HDDS-178.004.patch > > > In the case of open containers, the deleteBlocks command just adds an entry to the > log but does not delete the blocks. These blocks are deleted only when the > container is closed.
[jira] [Updated] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain updated HDDS-178: - Attachment: HDDS-178.004.patch > DeleteBlocks should not be handled by open containers > - > > Key: HDDS-178 > URL: https://issues.apache.org/jira/browse/HDDS-178 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Datanode >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-178.001.patch, HDDS-178.002.patch, > HDDS-178.003.patch, HDDS-178.004.patch > > > In the case of open containers, the deleteBlocks command just adds an entry to the > log but does not delete the blocks. These blocks are deleted only when the > container is closed.
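The behavior the issue description reports can be modeled in a few lines of plain Java (class and method names are illustrative, not the actual Ozone datanode code): while a container is OPEN, a deleteBlocks command is only recorded, and the recorded deletes are applied when the container is closed.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the deferred-delete behavior described in HDDS-178.
// Illustrative only; the real code path involves SCM delete logs and
// the datanode container state machine.
public class DeferredDeleteSketch {
    public enum State { OPEN, CLOSED }

    private State state = State.OPEN;
    private final Set<String> liveBlocks = new HashSet<>();
    private final List<String> pendingDeletes = new ArrayList<>();

    public void putBlock(String id) { liveBlocks.add(id); }

    public void deleteBlocks(List<String> ids) {
        if (state == State.OPEN) {
            pendingDeletes.addAll(ids); // open container: only log the command
        } else {
            liveBlocks.removeAll(ids);  // closed container: delete immediately
        }
    }

    /** Closing the container applies the deletes recorded while it was open. */
    public void close() {
        state = State.CLOSED;
        liveBlocks.removeAll(pendingDeletes);
        pendingDeletes.clear();
    }

    public boolean contains(String id) { return liveBlocks.contains(id); }
}
```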
[jira] [Created] (HDFS-13691) Usage of ErasureCoder/Codec in Hadoop
Chaitanya Mukka created HDFS-13691: -- Summary: Usage of ErasureCoder/Codec in Hadoop Key: HDFS-13691 URL: https://issues.apache.org/jira/browse/HDFS-13691 Project: Hadoop HDFS Issue Type: Wish Components: erasure-coding, hdfs Reporter: Chaitanya Mukka While looking through the source code of Hadoop, we found that the ErasureCoder and ErasureCodec APIs are not used in HDFS. HDFS still uses the RawErasureCoder API. We have been working on an Erasure Codec plugin for [Clay Codes|https://www.usenix.org/conference/fast18/presentation/vajha], which uses these APIs. We would like to know the progress on integrating the ErasureCodec API into HDFS. We could not find any active JIRA on this; please correct us if we are wrong. (HDFS-7337 seems to be resolved.)
[jira] [Commented] (HDDS-170) Fix TestBlockDeletingService#testBlockDeletionTimeout
[ https://issues.apache.org/jira/browse/HDDS-170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518069#comment-16518069 ] genericqa commented on HDDS-170: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 19s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 28m 23s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 30m 26s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 47s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 22s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 55s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 27m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 27m 58s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 10m 58s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 54s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 7s{color} | {color:green} common in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 41s{color} | {color:green} container-service in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 23s{color} | {color:red} integration-test in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}144m 6s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.ozone.TestStorageContainerManager | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | HDDS-170 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12928382/HDDS-170.001.patch | | Optional Tests | asflicense compile javac javadoc
[jira] [Commented] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518044#comment-16518044 ] genericqa commented on HDDS-178: | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 38s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 35s{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 21s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 30m 39s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 24s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 48s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 46s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 0s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 49s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 25s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 9s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 6s{color} | {color:green} patch has no errors when building and testing our client artifacts. 
{color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s{color} | {color:blue} Skipped patched modules with no Java source: hadoop-ozone/integration-test {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 53s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 0m 43s{color} | {color:red} container-service in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 25m 47s{color} | {color:red} integration-test in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}148m 59s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.ozone.container.common.report.TestReportPublisher | | | hadoop.ozone.TestStorageContainerManager | | | hadoop.ozone.web.client.TestKeys | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd | | JIRA Issue | HDDS-178 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12928455/HDDS-178.003.patch | | Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient
[jira] [Updated] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode
[ https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HDFS-13672: -- Status: Patch Available (was: In Progress) > clearCorruptLazyPersistFiles could crash NameNode > - > > Key: HDFS-13672 > URL: https://issues.apache.org/jira/browse/HDFS-13672 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > Attachments: HDFS-13672.001.patch > > > I started a NameNode on a pretty large fsimage. Since the NameNode is started > without any DataNodes, all blocks (100 million) are "corrupt". > Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write > lock for a long time: > {noformat} > 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held > for 46024 ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543) > java.lang.Thread.run(Thread.java:748) > Number of suppressed write-lock reports: 0 > Longest write-lock held interval: 46024 > {noformat} > Here's the relevant code: > {code} > writeLock(); > try { > final Iterator it = > blockManager.getCorruptReplicaBlockIterator(); > while (it.hasNext()) { > Block b = it.next(); > BlockInfo blockInfo = blockManager.getStoredBlock(b); > if (blockInfo.getBlockCollection().getStoragePolicyID() == > lpPolicy.getId()) { > filesToDelete.add(blockInfo.getBlockCollection()); > } > } > for (BlockCollection bc : filesToDelete) { > LOG.warn("Removing lazyPersist file " + bc.getName() 
+ " with no replicas."); > changed |= deleteInternal(bc.getName(), false, false, false); > } > } finally { > writeUnlock(); > } > {code} > In essence, the iteration over the corrupt replica list should be broken down > into smaller iterations to avoid a single long wait. > Since this operation holds the NameNode write lock for more than 45 seconds, the > default ZKFC connection timeout, it implies that an extreme case like this (100 > million corrupt blocks) could lead to a NameNode failover.
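The fix direction the description suggests (smaller iterations so the write lock is released periodically) can be sketched in plain Java. This is illustrative only: `BatchedScrubber` stands in for the scrubber, a `List` stands in for the corrupt-replica iterator, and the real patch must additionally cope with the iterator being invalidated while the lock is dropped.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of batched lock acquisition: process at most batchSize items per
// write-lock hold, so other NameNode operations can interleave between batches.
public class BatchedScrubber {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    public List<String> scrub(List<String> corruptBlocks, int batchSize) {
        List<String> filesToDelete = new ArrayList<>();
        Iterator<String> it = corruptBlocks.iterator();
        while (it.hasNext()) {
            lock.writeLock().lock();
            try {
                int processed = 0;
                while (it.hasNext() && processed < batchSize) {
                    // Stand-in for the storage-policy check and collection add.
                    filesToDelete.add(it.next());
                    processed++;
                }
            } finally {
                lock.writeLock().unlock(); // other readers/writers can run here
            }
        }
        return filesToDelete;
    }
}
```

In FSNamesystem the gap between batches is exactly where concurrent mutations can occur, so the real implementation has to re-validate (or re-fetch) its position in the corrupt-replica list after re-acquiring the lock; the sketch above elides that.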
[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518035#comment-16518035 ] Surendra Singh Lilhore commented on HDFS-10285: --- Agree with [~umamaheswararao]'s proposal. Let's merge the external SPS code and continue working on internal SPS in the next phase... > Storage Policy Satisfier in Namenode > > > Key: HDFS-10285 > URL: https://issues.apache.org/jira/browse/HDFS-10285 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: HDFS-10285 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Attachments: HDFS-10285-consolidated-merge-patch-00.patch, > HDFS-10285-consolidated-merge-patch-01.patch, > HDFS-10285-consolidated-merge-patch-02.patch, > HDFS-10285-consolidated-merge-patch-03.patch, > HDFS-10285-consolidated-merge-patch-04.patch, > HDFS-10285-consolidated-merge-patch-05.patch, > HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, > Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, > Storage-Policy-Satisfier-in-HDFS-May10.pdf, > Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf > > > Heterogeneous storage in HDFS introduced the concept of storage policies. These > policies can be set on a directory/file to specify the user's preference for where > the physical blocks should be stored. When the user sets the storage policy before writing > data, the blocks can take advantage of the storage policy preference and the > physical blocks are stored accordingly. > If the user sets the storage policy after the file has been written and completed, then > the blocks will have been written with the default storage policy (namely > DISK). The user has to run the ‘Mover tool’ explicitly, specifying all such > file names as a list. In some distributed system scenarios (e.g. HBase) it > would be difficult to collect all the files and run the tool, as different > nodes can write files separately and files can have different paths.
> Another scenario is when the user renames a file from a directory with one effective storage > policy (inherited from the parent directory) into a directory with another effective storage > policy: the rename does not copy the inherited storage policy from the > source, so the file takes the effective policy of the destination file/dir's parent. > This rename operation is just a metadata change in the Namenode; the > physical blocks still remain with the source storage policy. > So, tracking all such file names across distributed nodes (e.g. region servers) and > running the Mover tool could be difficult for admins. > Here the proposal is to provide an API in the Namenode itself to trigger > storage policy satisfaction. A daemon thread inside the Namenode would track > such calls and send them to the DNs as movement commands. > Will post the detailed design thoughts document soon.
[jira] [Commented] (HDDS-175) Refactor ContainerInfo to remove Pipeline object from it
[ https://issues.apache.org/jira/browse/HDDS-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518019#comment-16518019 ] Shashikant Banerjee commented on HDDS-175: -- Thanks, [~ajayydv], for updating the patch. I just had a quick look. There are some unwanted changes in the file "WritingYarnApplications.md", and there are test failures as well as some FindBugs issues. Can you please check these? I will have a closer look at it. > Refactor ContainerInfo to remove Pipeline object from it > - > > Key: HDDS-175 > URL: https://issues.apache.org/jira/browse/HDDS-175 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.2.1 >Reporter: Ajay Kumar >Assignee: Ajay Kumar >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-175.00.patch, HDDS-175.01.patch, HDDS-175.02.patch, > HDDS-175.03.patch > > > Refactor ContainerInfo to remove the Pipeline object from it. We can add the following 4 > fields to ContainerInfo to recreate the pipeline if required: > # pipelineId > # replication type > # expected replication count > # DataNodes where its replicas exist
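A hypothetical shape of the refactor this issue describes, as a sketch (field names and types are illustrative, not the actual HDDS classes): ContainerInfo keeps only the four listed fields and recreates a pipeline description on demand instead of embedding a Pipeline object.

```java
import java.util.List;

// Illustrative slimmed-down ContainerInfo per the four fields listed above.
public class ContainerInfoSketch {
    public enum ReplicationType { RATIS, STAND_ALONE }

    private final String pipelineId;
    private final ReplicationType replicationType;
    private final int expectedReplicaCount;
    private final List<String> replicaDatanodes; // DataNodes holding replicas

    public ContainerInfoSketch(String pipelineId, ReplicationType type,
                               int expectedReplicaCount, List<String> replicaDatanodes) {
        this.pipelineId = pipelineId;
        this.replicationType = type;
        this.expectedReplicaCount = expectedReplicaCount;
        this.replicaDatanodes = replicaDatanodes;
    }

    /** Recreate a pipeline description from the stored fields when required. */
    public String recreatePipeline() {
        return pipelineId + "/" + replicationType + "/" + expectedReplicaCount
            + "/" + String.join(",", replicaDatanodes);
    }
}
```

The design point is that the Pipeline becomes derived state: SCM can rebuild it lazily from these fields rather than persisting and keeping it consistent inside every ContainerInfo.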
[jira] [Commented] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode
[ https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518008#comment-16518008 ] Gabor Bota commented on HDFS-13672: --- In my v001 patch, I'm solving the issue with a limited number of blocks per iteration. [~jojochuang], by +break out from the loop after 1 second+ do you mean breaking out of the whole clearCorruptLazyPersistFiles function, or just releasing the lock and continuing the iteration? If the {{LazyPersistFileScrubber}} runs periodically, we don't even have to remove all the corrupt replica blocks in one run. > clearCorruptLazyPersistFiles could crash NameNode > - > > Key: HDFS-13672 > URL: https://issues.apache.org/jira/browse/HDFS-13672 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > Attachments: HDFS-13672.001.patch > > > I started a NameNode on a pretty large fsimage. Since the NameNode is started > without any DataNodes, all blocks (100 million) are "corrupt".
> Afterwards I observed FSNamesystem#clearCorruptLazyPersistFiles() held write > lock for a long time: > {noformat} > 18/06/12 12:37:03 INFO namenode.FSNamesystem: FSNamesystem write lock held > for 46024 ms via > java.lang.Thread.getStackTrace(Thread.java:1559) > org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:945) > org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:198) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1689) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.clearCorruptLazyPersistFiles(FSNamesystem.java:5532) > org.apache.hadoop.hdfs.server.namenode.FSNamesystem$LazyPersistFileScrubber.run(FSNamesystem.java:5543) > java.lang.Thread.run(Thread.java:748) > Number of suppressed write-lock reports: 0 > Longest write-lock held interval: 46024 > {noformat} > Here's the relevant code: > {code} > writeLock(); > try { > final Iterator it = > blockManager.getCorruptReplicaBlockIterator(); > while (it.hasNext()) { > Block b = it.next(); > BlockInfo blockInfo = blockManager.getStoredBlock(b); > if (blockInfo.getBlockCollection().getStoragePolicyID() == > lpPolicy.getId()) { > filesToDelete.add(blockInfo.getBlockCollection()); > } > } > for (BlockCollection bc : filesToDelete) { > LOG.warn("Removing lazyPersist file " + bc.getName() + " with no > replicas."); > changed |= deleteInternal(bc.getName(), false, false, false); > } > } finally { > writeUnlock(); > } > {code} > In essence, the iteration over corrupt replica list should be broken down > into smaller iterations to avoid a single long wait. > Since this operation holds NameNode write lock for more than 45 seconds, the > default ZKFC connection timeout, it implies an extreme case like this (100 > million corrupt blocks) could lead to NameNode failover. 
[jira] [Comment Edited] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode
[ https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16518008#comment-16518008 ] Gabor Bota edited comment on HDFS-13672 at 6/20/18 10:26 AM: - In my v001 patch, I'm solving the issue with a limited number of blocks per iteration. [~jojochuang], by _break out from the loop after 1 second_ do you mean breaking out of the whole clearCorruptLazyPersistFiles function, or just releasing the lock and continuing the iteration? If the {{LazyPersistFileScrubber}} runs periodically, we don't even have to remove all the corrupt replica blocks in one run. was (Author: gabor.bota): In my v001 patch, I'm solving the issue with a limited number of blocks per iteration. [~jojochuang], by +break out from the loop after 1 second+ do you mean breaking out of the whole clearCorruptLazyPersistFiles function, or just releasing the lock and continuing the iteration? If the {{LazyPersistFileScrubber}} runs periodically, we don't even have to remove all the corrupt replica blocks in one run. > clearCorruptLazyPersistFiles could crash NameNode > - > > Key: HDFS-13672 > URL: https://issues.apache.org/jira/browse/HDFS-13672 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > Attachments: HDFS-13672.001.patch > > > I started a NameNode on a pretty large fsimage. Since the NameNode is started > without any DataNodes, all blocks (100 million) are "corrupt".
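The batching idea discussed in the comments above can be sketched as follows. This is a hypothetical illustration, not the actual HDFS-13672.001.patch: `ChunkedScrubber`, `BATCH_SIZE`, and `scrub` are invented names, and a plain `ReentrantReadWriteLock` stands in for the FSNamesystem lock. The point is simply that the write lock is released and re-acquired between fixed-size batches, so a huge corrupt-replica list never pins the lock for one long interval.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch of chunked scrubbing: take the write lock once per batch of
// BATCH_SIZE blocks instead of holding it across the whole list.
public class ChunkedScrubber {
    static final int BATCH_SIZE = 1000; // hypothetical tuning knob

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Processes all blocks and returns how many times the write lock was
     *  acquired, i.e. how many batches were needed. */
    public int scrub(List<Long> corruptBlocks) {
        int acquisitions = 0;
        int i = 0;
        while (i < corruptBlocks.size()) {
            lock.writeLock().lock();
            acquisitions++;
            try {
                int end = Math.min(i + BATCH_SIZE, corruptBlocks.size());
                for (; i < end; i++) {
                    // filter lazyPersist files and queue them for deletion here
                }
            } finally {
                // other operations can interleave between batches
                lock.writeLock().unlock();
            }
        }
        return acquisitions;
    }

    public static void main(String[] args) {
        List<Long> blocks = new ArrayList<>();
        for (long b = 0; b < 2500; b++) {
            blocks.add(b);
        }
        int n = new ChunkedScrubber().scrub(blocks);
        System.out.println(n); // 2500 blocks in batches of 1000 -> prints 3
    }
}
```

With this shape, the "release the lock and continue the iteration" option and the "stop after a time budget" option only differ in the loop's exit condition, which is why a periodically scheduled scrubber does not need to finish in one run.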
[jira] [Updated] (HDFS-13672) clearCorruptLazyPersistFiles could crash NameNode
[ https://issues.apache.org/jira/browse/HDFS-13672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Bota updated HDFS-13672: -- Attachment: HDFS-13672.001.patch > clearCorruptLazyPersistFiles could crash NameNode > - > > Key: HDFS-13672 > URL: https://issues.apache.org/jira/browse/HDFS-13672 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Wei-Chiu Chuang >Assignee: Gabor Bota >Priority: Major > Attachments: HDFS-13672.001.patch
[jira] [Updated] (HDFS-13690) Improve error message when creating encryption zone while KMS is unreachable
[ https://issues.apache.org/jira/browse/HDFS-13690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-13690: Description: In failure testing, we stopped the KMS and then tried to run some encryption related commands. {{hdfs crypto -createZone}} will complain with a short "RemoteException: Connection refused." This message could be improved to explain that we cannot connect to the KMSClientProvider. For example, {{hadoop key list}} while KMS is down will error: {code} -bash-4.1$ hadoop key list Cannot list keys for KeyProvider: KMSClientProvider[http://hdfs-cdh5-vanilla-1.vpc.cloudera.com:16000/kms/v1/]: Connection refused java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at sun.net.NetworkClient.doConnect(NetworkClient.java:175) at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) at sun.net.www.http.HttpClient.<init>(HttpClient.java:211) at sun.net.www.http.HttpClient.New(HttpClient.java:308) at sun.net.www.http.HttpClient.New(HttpClient.java:326) at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:996) at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:932) at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:850) at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:186) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticator.authenticate(DelegationTokenAuthenticator.java:125) at 
org.apache.hadoop.security.authentication.client.AuthenticatedURL.openConnection(AuthenticatedURL.java:216) at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticatedURL.openConnection(DelegationTokenAuthenticatedURL.java:312) at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:397) at org.apache.hadoop.crypto.key.kms.KMSClientProvider$1.run(KMSClientProvider.java:392) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614) at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createConnection(KMSClientProvider.java:392) at org.apache.hadoop.crypto.key.kms.KMSClientProvider.getKeys(KMSClientProvider.java:479) at org.apache.hadoop.crypto.key.KeyShell$ListCommand.execute(KeyShell.java:286) at org.apache.hadoop.crypto.key.KeyShell.run(KeyShell.java:79) at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70) at org.apache.hadoop.crypto.key.KeyShell.main(KeyShell.java:513) {code} was: In failure testing, we stopped the KMS and then tried to run some encryption related commands. {{hdfs crypto -createZone}} will complain with a short "RemoteException: Connection refused." This message could be improved to explain that we cannot connect to the KMSClientProvider. 
[jira] [Work started] (HDFS-13690) Improve error message when creating encryption zone while KMS is unreachable
[ https://issues.apache.org/jira/browse/HDFS-13690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-13690 started by Kitti Nanasi. --- > Improve error message when creating encryption zone while KMS is unreachable > > > Key: HDFS-13690 > URL: https://issues.apache.org/jira/browse/HDFS-13690 > Project: Hadoop HDFS > Issue Type: Improvement > Components: encryption, hdfs, kms >Reporter: Kitti Nanasi >Assignee: Kitti Nanasi >Priority: Minor
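A minimal sketch of the kind of message improvement this issue asks for, assuming nothing about the eventual patch: `KmsErrorWrapper` and `wrap` are hypothetical names, and the message wording is invented. The idea is simply to attach the KMS provider URL and an actionable hint to the bare ConnectException before it reaches the user, while keeping the original exception as the cause.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical helper: turn a bare "Connection refused" into a message that
// names the unreachable KMS endpoint and suggests what to check.
public class KmsErrorWrapper {
    public static IOException wrap(String kmsUrl, ConnectException cause) {
        return new IOException("Cannot connect to KMSClientProvider[" + kmsUrl
            + "]; check that the KMS is running and reachable: "
            + cause.getMessage(), cause); // keep the original cause attached
    }

    public static void main(String[] args) {
        IOException e = wrap("http://kms.example.com:16000/kms/v1/",
            new ConnectException("Connection refused"));
        System.out.println(e.getMessage());
    }
}
```

Callers that catch the ConnectException around the KMS HTTP call could rethrow the wrapped exception, so both {{hdfs crypto -createZone}} and {{hadoop key list}} would surface the provider URL instead of only "RemoteException: Connection refused."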
[jira] [Updated] (HDFS-13690) Improve error message when creating encryption zone while KMS is unreachable
[ https://issues.apache.org/jira/browse/HDFS-13690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kitti Nanasi updated HDFS-13690:
[jira] [Created] (HDFS-13690) Improve error message when creating encryption zone while KMS is unreachable
Kitti Nanasi created HDFS-13690: --- Summary: Improve error message when creating encryption zone while KMS is unreachable Key: HDFS-13690 URL: https://issues.apache.org/jira/browse/HDFS-13690 Project: Hadoop HDFS Issue Type: Improvement Components: encryption, hdfs, kms Reporter: Kitti Nanasi Assignee: Kitti Nanasi
[jira] [Updated] (HDDS-170) Fix TestBlockDeletingService#testBlockDeletionTimeout
[ https://issues.apache.org/jira/browse/HDDS-170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-170: --- Status: Patch Available (was: Open) > Fix TestBlockDeletingService#testBlockDeletionTimeout > - > > Key: HDDS-170 > URL: https://issues.apache.org/jira/browse/HDDS-170 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Mukul Kumar Singh >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-170.001.patch > > > TestBlockDeletingService#testBlockDeletionTimeout times out while waiting for the > expected error message.
[jira] [Commented] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517945#comment-16517945 ] Lokesh Jain commented on HDDS-178: -- v3 patch fixes unit test failures, findbugs and license issues. > DeleteBlocks should not be handled by open containers > - > > Key: HDDS-178 > URL: https://issues.apache.org/jira/browse/HDDS-178 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Datanode >Reporter: Lokesh Jain >Assignee: Lokesh Jain >Priority: Major > Attachments: HDDS-178.001.patch, HDDS-178.002.patch, > HDDS-178.003.patch > > > In the case of open containers deleteBlocks command just adds an entry in the > log but does not delete the blocks. These blocks are deleted only when > container is closed.
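The behavior described in the issue above (an open container only records deleteBlocks requests and applies them when the container closes) can be modeled with a small sketch. `ContainerDeletes` and its methods are invented names for illustration only; this is not Ozone's actual container code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the described behavior: while open, delete requests are
// only logged as pending; closing the container applies them.
public class ContainerDeletes {
    private boolean open = true;
    private final Deque<Long> pendingDeletes = new ArrayDeque<>();
    private int deletedBlocks = 0;

    public void deleteBlock(long blockId) {
        if (open) {
            pendingDeletes.add(blockId); // open container: record intent only
        } else {
            deletedBlocks++; // closed container: delete immediately
        }
    }

    public void close() {
        open = false;
        while (!pendingDeletes.isEmpty()) { // apply deferred deletes on close
            pendingDeletes.poll();
            deletedBlocks++;
        }
    }

    public int deletedBlocks() {
        return deletedBlocks;
    }

    public static void main(String[] args) {
        ContainerDeletes c = new ContainerDeletes();
        c.deleteBlock(1L);
        c.deleteBlock(2L);
        System.out.println(c.deletedBlocks()); // prints 0 while still open
        c.close();
        System.out.println(c.deletedBlocks()); // prints 2 after close
    }
}
```

The issue argues this deferral is wrong for deleteBlocks; the sketch just makes the current open-vs-closed split concrete.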
[jira] [Updated] (HDDS-178) DeleteBlocks should not be handled by open containers
[ https://issues.apache.org/jira/browse/HDDS-178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain updated HDDS-178: - Attachment: HDDS-178.003.patch
[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517908#comment-16517908 ] genericqa commented on HDFS-10285: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 7s{color} | {color:red} HDFS-10285 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-10285 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12908083/HDFS-10285-consolidated-merge-patch-05.patch | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/24479/console | | Powered by | Apache Yetus 0.8.0-SNAPSHOT http://yetus.apache.org | This message was automatically generated. > Storage Policy Satisfier in Namenode > > > Key: HDFS-10285 > URL: https://issues.apache.org/jira/browse/HDFS-10285 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: HDFS-10285 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Attachments: HDFS-10285-consolidated-merge-patch-00.patch, > HDFS-10285-consolidated-merge-patch-01.patch, > HDFS-10285-consolidated-merge-patch-02.patch, > HDFS-10285-consolidated-merge-patch-03.patch, > HDFS-10285-consolidated-merge-patch-04.patch, > HDFS-10285-consolidated-merge-patch-05.patch, > HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, > Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, > Storage-Policy-Satisfier-in-HDFS-May10.pdf, > Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf > > > Heterogeneous storage in HDFS introduced the concept of storage policy. 
[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode
[ https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517906#comment-16517906 ] Uma Maheswara Rao G commented on HDFS-10285: Hi All, After all the long discussions offline, I would like to summarize the current state of arguments/approaches. From [~andrew.wang]: Interested to do this process inside NN to reduce maintenance cost etc. He also agreed to have this running optionally outside. From [~anu]: He has no interest to run the process inside NN, and in fact he was the one who proposed to start this process outside NN. We worked so far to satisfy both arguments. In the offline discussion today, Anu proposed to go ahead with the merge using the existing workable external SPS part in the first phase, and we continue to improve the feature as alternatives are proposed. This feature can be Alpha. From [~chris.douglas]: He is fine with both options, and he proposed the context-based abstractions which we agreed on and implemented so far. From [~daryn]: He is fine with running this process outside. If we want to run this internal to NN, he proposed to couple it with RM instead of keeping the logic/queues in a separate daemon thread. From Uma: Interested primarily to run with NN, but no major concerns with starting as a separate process to move forward on the project. From [~rakeshr]: He has no concerns either way, as users can run it depending on their usage model. Here I am trying to propose that the current code supports both options, but internal to NN does not depend on RM in the current code base. So, how about we move forward with the external SPS option for the merge and continue discussing internal SPS? Internal SPS will take time to integrate with RM and test, and there may not be much common code between the two. While we continue the discussion on internal SPS, and since there are no concerns with external SPS, we could move forward with external SPS for the merge. If that works, we will make the necessary cleanups and go for the external SPS merge. 
However, we will not mark this feature as Stable until we have run it for some time. So, it should be OK to keep improving the code incrementally instead of doing nothing and blocking each other on arguments. Thanks > Storage Policy Satisfier in Namenode > > > Key: HDFS-10285 > URL: https://issues.apache.org/jira/browse/HDFS-10285 > Project: Hadoop HDFS > Issue Type: New Feature > Components: datanode, namenode >Affects Versions: HDFS-10285 >Reporter: Uma Maheswara Rao G >Assignee: Uma Maheswara Rao G >Priority: Major > Attachments: HDFS-10285-consolidated-merge-patch-00.patch, > HDFS-10285-consolidated-merge-patch-01.patch, > HDFS-10285-consolidated-merge-patch-02.patch, > HDFS-10285-consolidated-merge-patch-03.patch, > HDFS-10285-consolidated-merge-patch-04.patch, > HDFS-10285-consolidated-merge-patch-05.patch, > HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, > Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, > Storage-Policy-Satisfier-in-HDFS-May10.pdf, > Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf