[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219155#comment-16219155 ]

Rakesh R commented on HDFS-12682:
---------------------------------

[~xiaochen] Good catch! Thanks for the quick updates.

IIUC, the problem occurs only for the SystemErasureCodingPolicies. In that case, can we change the jira subject line to reflect that: {{ECAdmin -listPolicies will always show SystemErasureCodingPolicies state as DISABLED}}

bq. since the ECP object will always have the state, so for calls other than -listPolicies, the object will have a state to DISABLED (the default).

Instead of handling the state value separately for {{-listPolicies}} and the other function calls, can we keep the state field value consistent across all calls, for better code maintainability?

> ECAdmin -listPolicies will always show policy state as DISABLED
> ---------------------------------------------------------------
>
>                 Key: HDFS-12682
>                 URL: https://issues.apache.org/jira/browse/HDFS-12682
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>            Reporter: Xiao Chen
>            Assignee: Xiao Chen
>            Priority: Blocker
>              Labels: hdfs-ec-3.0-must-do
>         Attachments: HDFS-12682.01.patch, HDFS-12682.02.patch, HDFS-12682.03.patch
>
>
> On a real cluster, {{hdfs ec -listPolicies}} will always show policy state as DISABLED.
> {noformat}
> [hdfs@nightly6x-1 root]$ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1, State=DISABLED]
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=3, State=DISABLED]
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4, State=DISABLED]
> [hdfs@nightly6x-1 root]$ hdfs ec -getPolicy -path /ecec
> XOR-2-1-1024k
> {noformat}
> This is because when [deserializing protobuf|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java#L2942], the static instance of [SystemErasureCodingPolicies class|https://github.com/apache/hadoop/blob/branch-3.0.0-beta1/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SystemErasureCodingPolicies.java#L101] is first checked, and always returns the cached policy objects, which are created by default with state=DISABLED.
> All the existing unit tests pass, because that static instance that the client (e.g. ECAdmin) reads in unit test is updated by NN. :)

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
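The root cause given in the issue description (the static cache is consulted first, so the state sent over the wire is never applied) can be illustrated with a minimal sketch. Every class and method name below is a simplified, hypothetical stand-in and not the real Hadoop code; only the lookup order is modeled.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-ins; these are not the real Hadoop classes. */
final class StaleStateSketch {
  enum State { DISABLED, ENABLED }

  /** Simplified policy that carries a mutable state defaulting to DISABLED. */
  static final class Policy {
    final byte id;
    final String name;
    State state = State.DISABLED;

    Policy(byte id, String name) {
      this.id = id;
      this.name = name;
    }
  }

  /** Simplified wire message, standing in for the protobuf object. */
  static final class PolicyProto {
    final byte id;
    final String name;
    final State state;

    PolicyProto(byte id, String name, State state) {
      this.id = id;
      this.name = name;
      this.state = state;
    }
  }

  /** Client-side table of built-in policies, created once with the default state. */
  static final Map<Byte, Policy> SYSTEM_POLICIES = new HashMap<>();
  static {
    SYSTEM_POLICIES.put((byte) 1, new Policy((byte) 1, "RS-6-3-1024k"));
  }

  /** Models the described behavior: the cache wins and the wire state is ignored. */
  static Policy convert(PolicyProto proto) {
    Policy cached = SYSTEM_POLICIES.get(proto.id);
    if (cached != null) {
      return cached; // proto.state is never looked at on this path
    }
    Policy policy = new Policy(proto.id, proto.name);
    policy.state = proto.state;
    return policy;
  }

  public static void main(String[] args) {
    // The "NameNode" reports the policy as ENABLED...
    PolicyProto fromNameNode = new PolicyProto((byte) 1, "RS-6-3-1024k", State.ENABLED);
    // ...but the client prints the cached default instead.
    System.out.println(convert(fromNameNode).state); // prints DISABLED
  }
}
```

In this sketch only non-system policies (ids missing from the table) pick up the state from the message, which matches the observation that the problem is limited to the system policies.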
[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16217928#comment-16217928 ]

Lei (Eddy) Xu commented on HDFS-12682:
--------------------------------------

Thanks for reporting this and working on it, [~xiaochen]

* I think we should keep {{ErasureCodingPolicyInfo}} as {{@InterfaceAudience.Private}}, and make {{ErasureCodingPolicy}} private as well. These classes should not be used outside of HDFS. Lessons from HADOOP-14957 :)
* I wonder whether it is possible to always set {{state}} in {{PBHelperClient#convertErasureCodingPolicy}}:
{code}
public static ErasureCodingPolicy convertErasureCodingPolicy(ErasureCodingPolicyProto proto) {
  ...
  // Apply the state carried by the protobuf message whenever it is present.
  if (proto.hasState()) {
    policy.setState(proto.getState());
  }
  return policy;
}
{code}
[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214971#comment-16214971 ]

Hadoop QA commented on HDFS-12682:
----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 9m 45s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 58s | Maven dependency ordering for branch |
| +1 | mvninstall | 15m 10s | trunk passed |
| +1 | compile | 15m 27s | trunk passed |
| +1 | checkstyle | 2m 22s | trunk passed |
| +1 | mvnsite | 2m 44s | trunk passed |
| +1 | shadedclient | 14m 25s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 4m 43s | trunk passed |
| +1 | javadoc | 2m 0s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 13s | the patch passed |
| +1 | compile | 14m 12s | the patch passed |
| +1 | cc | 14m 12s | the patch passed |
| +1 | javac | 14m 12s | the patch passed |
| -0 | checkstyle | 2m 26s | root: The patch generated 4 new + 557 unchanged - 2 fixed = 561 total (was 559) |
| +1 | mvnsite | 2m 59s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 23s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 5m 48s | the patch passed |
| +1 | javadoc | 2m 12s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 1m 36s | hadoop-hdfs-client in the patch passed. |
| -1 | unit | 84m 19s | hadoop-hdfs in the patch failed. |
| -1 | unit | 119m 18s | hadoop-mapreduce-client-jobclient in the patch failed. |
| -1 | asflicense | 0m 38s | The patch generated 1 ASF License warnings. |
| | | 310m 20s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| Timed out junit tests | org.apache.hadoop.mapred.pipes.TestPipeApplication |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:ca8ddc6 |
| JIRA Issue | HDFS-12682 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12893487/HDFS-12682.02.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle cc |
| uname | Linux e44d5b7f938d
[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16211473#comment-16211473 ]

Xiao Chen commented on HDFS-12682:
----------------------------------

Thanks for the response Sammi, good find on HDFS-12686!

I propose we fix the problem as follows:
- Remove the state from {{ErasureCodingPolicy}}. The motivation is that {{ErasureCodingPolicy}} is returned with {{HdfsFileStatus}}, which impacts all clients listing HDFS. We want to make it as lightweight as possible, and keep Andrew's work on HDFS-11565 for performance.
- Add a new class {{ErasureCodingPolicyInfo}} (or whatever name people find intuitive) that contains the policy and its state. This will be used by the ECAdmin-purpose APIs, as well as internally for HDFS persistence.

I will prepare a patch in this direction for demonstration. If you or any watchers have concerns, please feel free to speak up.
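The two-part proposal above (an immutable policy plus a separate class that pairs it with a state) might look roughly like the following. This is only a sketch under assumptions: the class names, fields, and the two-value enum are hypothetical simplifications, not the classes from the actual patch.

```java
import java.util.Objects;

/** Hypothetical sketch of the proposed split; not the code from the patch. */
final class PolicyInfoSketch {
  enum State { DISABLED, ENABLED }

  /** Immutable policy: safe to share and cheap to attach to every file status. */
  static final class Policy {
    private final String name;
    private final byte id;

    Policy(String name, byte id) {
      this.name = Objects.requireNonNull(name);
      this.id = id;
    }

    String getName() { return name; }
    byte getId() { return id; }
  }

  /** Admin-facing wrapper: the shared policy plus the state one cluster assigns to it. */
  static final class PolicyInfo {
    private final Policy policy;
    private State state;

    PolicyInfo(Policy policy, State state) {
      this.policy = Objects.requireNonNull(policy);
      this.state = Objects.requireNonNull(state);
    }

    Policy getPolicy() { return policy; }
    State getState() { return state; }
    void setState(State newState) { this.state = Objects.requireNonNull(newState); }

    @Override
    public String toString() {
      return "Name=" + policy.getName() + ", Id=" + policy.getId() + ", State=" + state;
    }
  }

  public static void main(String[] args) {
    Policy shared = new Policy("XOR-2-1-1024k", (byte) 4);
    // Two wrappers can disagree about the state without touching the shared policy.
    PolicyInfo onClusterA = new PolicyInfo(shared, State.ENABLED);
    PolicyInfo onClusterB = new PolicyInfo(shared, State.DISABLED);
    System.out.println(onClusterA);
    System.out.println(onClusterB);
  }
}
```

The point of the split in this sketch is that listing calls can keep handing out the small immutable object, while only the admin-style calls pay for the wrapper.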
[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210790#comment-16210790 ]

SammiChen commented on HDFS-12682:
----------------------------------

Hi [~xiaochen], thanks for reporting this issue. Inspired by your discovery, I found that the same issue exists when system EC policies are persisted to and loaded from the fsImage (HDFS-12686). The current {{convertErasureCodingPolicy}} function works well in most cases. For special cases, such as getting all erasure coding policies and persisting policies into the fsImage, I think we need a new variant that does a full conversion.

{quote}
The problem I see from HDFS-12258's implementation though, is the mutable ECPS is saved on the immutable ECP, breaking assumptions such as shared single instance policy. At the same time the policy is still not persisted independently. I think ECPS is highly dependent on the missing piece from HDFS-7337: policies are not persisted to NN metadata. The state of whether a policy is enabled could be persisted together with the policy, without impacting HDFSFileStatus.
{quote}
Persisting EC policies is implemented in HDFS-7337.

{quote}
I think this bug (HDFS-12682) and HDFS-12258 would make more sense if we could first persist policies to NN metadata. Would also be helpful to separate out something like ErasureCodingPolicyAndState for the policy-specific APIs, so the state isn't deserialized onto HDFSFileStatus.
{quote}
For HDFS-12258, [~zhouwei], [~drankye] and I discussed two different approaches when we first thought about how to implement it. One is the currently implemented approach, which adds one extra "state" field to the existing ECP definition. The other is to define a new class, something like {{ErasureCodingPolicyWithState}}, to hold the ECP and the new policy state field. They are almost equally good. The only concern is that introducing the new {{ErasureCodingPolicyWithState}} may add complexity to the API interfaces and for end users. There are multiple EC-related APIs. If we return {{ErasureCodingPolicyWithState}} for {{getAllErasureCodingPolicies}}, should we return {{ErasureCodingPolicyWithState}} or {{ErasureCodingPolicy}} for {{getErasureCodingPolicy}}? Also, is it worth introducing a new class definition in Hadoop that has only one extra field? After all these considerations, the current approach was chosen to leverage the existing ECP.

Please let me know if you have other concerns. Thanks!
[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210630#comment-16210630 ]

Xiao Chen commented on HDFS-12682:
----------------------------------

I familiarized myself with the code more and investigated further. Also including [~zhouwei], [~Sammi] and [~rakeshr], since this is related to the initial work of HDFS-12258, and somewhat related to HDFS-7337.

HDFS-12258 adds good functionality. Before HDFS-12258, the information "what policies are enabled" was not persisted to NN metadata, so {{-listPolicies}} always returned the default {{RS-6-3-1024k}}. Even if some files/dirs were already under a different policy, after an NN restart {{-listPolicies}} still returned RS-6-3 only. This feels definitely wrong, and I believe it is the motivation of HDFS-12258.

The problem I see with HDFS-12258's implementation, though, is that the mutable ECPS is saved on the immutable ECP, breaking assumptions such as a shared single-instance policy. At the same time, the policy is still not persisted independently. I think ECPS is highly dependent on the missing piece from HDFS-7337: policies are not persisted to NN metadata. The state of whether a policy is enabled could be persisted together with the policy, without impacting HDFSFileStatus.

I think this bug (HDFS-12682) and HDFS-12258 would make more sense if we could first persist policies to NN metadata. It would also be helpful to separate out something like {{ErasureCodingPolicyAndState}} for the policy-specific APIs, so the state isn't deserialized onto {{HDFSFileStatus}}.

Please let me know what you think. Thanks!
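The "shared single instance" concern, together with the note in the issue description that unit tests pass because the NN updates the same static instance the client reads, can be shown with a small model. The names here are hypothetical, and a per-process policy table is modeled as a plain object so both situations can run in one program.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of per-process policy tables; not the real Hadoop classes. */
final class SharedInstanceSketch {
  enum State { DISABLED, ENABLED }

  /** Policy with a mutable state field, created as DISABLED. */
  static final class Policy {
    final String name;
    State state = State.DISABLED;

    Policy(String name) { this.name = name; }
  }

  /** Stands in for the static table that exists once per JVM. */
  static final class PolicyTable {
    private final Map<String, Policy> byName = new HashMap<>();

    PolicyTable() {
      byName.put("RS-6-3-1024k", new Policy("RS-6-3-1024k"));
    }

    Policy get(String name) { return byName.get(name); }
  }

  public static void main(String[] args) {
    // Unit test setup: NameNode and client live in one JVM and share one table.
    PolicyTable sameJvm = new PolicyTable();
    sameJvm.get("RS-6-3-1024k").state = State.ENABLED; // "NameNode" enables the policy
    System.out.println(sameJvm.get("RS-6-3-1024k").state); // client sees ENABLED

    // Real cluster: each process has its own table.
    PolicyTable nameNodeJvm = new PolicyTable();
    PolicyTable clientJvm = new PolicyTable();
    nameNodeJvm.get("RS-6-3-1024k").state = State.ENABLED;
    System.out.println(clientJvm.get("RS-6-3-1024k").state); // client sees DISABLED
  }
}
```

Under these assumptions the in-process case hides the bug, because the write made on behalf of the NameNode lands on the very object the client later reads.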
[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210317#comment-16210317 ]

Xiao Chen commented on HDFS-12682:
----------------------------------

Had a quick discussion with Andrew; here's the outcome:

ErasureCodingPolicyState should not be part of the ErasureCodingPolicy, since ECP should be immutable. The state was added recently by HDFS-12258.

The concern is that ECP is part of HdfsFileStatus, so it is pretty impactful to clients doing listing. Although apps like Hive and Impala probably don't care about the ECP, returning a wrong ECPS would simply be a bug.

I'll look at whether we can do HDFS-12258 in another way, so that HdfsFileStatus only contains the immutable ECP, and -listPolicies gets the ECPS along with the ECP.

Thanks Andrew for the discussion!
[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210230#comment-16210230 ]

Andrew Wang commented on HDFS-12682:
------------------------------------

My concern was actually for the clients, since there are apps (Hive, Impala) that list thousands or millions of files. I assume we can take a hybrid approach, where we get most of the fields from the static class, but get the enabled/disabled state from the PB?
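One way to read the hybrid approach suggested above is sketched below: the immutable fields come from the cached instance, while the state is taken from the message and kept next to the policy rather than written onto the shared object. All names are hypothetical, and a null state models an unset optional protobuf field.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a hybrid conversion; not the real PBHelperClient code. */
final class HybridConvertSketch {
  enum State { DISABLED, ENABLED }

  /** Immutable policy that can be cached and shared safely. */
  static final class Policy {
    final byte id;
    final String name;

    Policy(byte id, String name) {
      this.id = id;
      this.name = name;
    }
  }

  /** Simplified wire message; state is null when the optional field is unset. */
  static final class PolicyProto {
    final byte id;
    final String name;
    final State state;

    PolicyProto(byte id, String name, State state) {
      this.id = id;
      this.name = name;
      this.state = state;
    }
  }

  /** Conversion result: a (possibly cached) policy plus the state from the wire. */
  static final class PolicyWithState {
    final Policy policy;
    final State state;

    PolicyWithState(Policy policy, State state) {
      this.policy = policy;
      this.state = state;
    }
  }

  static final Map<Byte, Policy> SYSTEM_POLICIES = new HashMap<>();
  static {
    SYSTEM_POLICIES.put((byte) 1, new Policy((byte) 1, "RS-6-3-1024k"));
  }

  static PolicyWithState convert(PolicyProto proto) {
    // Reuse the cached immutable object when one exists, to avoid re-allocation.
    Policy cached = SYSTEM_POLICIES.get(proto.id);
    Policy policy = (cached != null) ? cached : new Policy(proto.id, proto.name);
    // Always honor the state from the message, falling back to the default.
    State state = (proto.state != null) ? proto.state : State.DISABLED;
    return new PolicyWithState(policy, state);
  }

  public static void main(String[] args) {
    PolicyWithState result = convert(new PolicyProto((byte) 1, "RS-6-3-1024k", State.ENABLED));
    System.out.println(result.policy.name + " State=" + result.state);
  }
}
```

In this sketch the listing path can still call a cache-only lookup and skip the wrapper entirely, so the per-file cost that motivated the cache is unchanged.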
[jira] [Commented] (HDFS-12682) ECAdmin -listPolicies will always show policy state as DISABLED
[ https://issues.apache.org/jira/browse/HDFS-12682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16210220#comment-16210220 ]

Xiao Chen commented on HDFS-12682:
----------------------------------

Looking at the history of HDFS-11565, I understand the caching was done to save redundant object constructions in the NN. I'm thinking we should keep that behavior for the NN, but construct a new object every time for the client. I don't think this overhead will matter for a downstream long-running dfsclient that calls {{DFS#getAllErasureCodingPolicies}} repeatedly, since numerous other objects are created during the RPC anyway. There would also be the headache of updating these cached objects without the FSN lock.

Would this be okay with the intent of HDFS-11565, [~andrew.wang]?