[ 
https://issues.apache.org/jira/browse/HDFS-14039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16670160#comment-16670160
 ] 

Kitti Nanasi commented on HDFS-14039:
-------------------------------------

Thanks [~xiaochen] for reporting this!

This problem occurs because:
 # The fsimage contains the list of erasure coding policies and their states 
(enabled or disabled). Initially, RS-3-2-1024k is disabled.
 # We change the default policy to RS-3-2-1024k in the configuration and 
restart the namenode, so ErasureCodingPolicyManager.init() runs and enables 
the new default policy. The policies are stored in multiple data structures in 
ErasureCodingPolicyManager (enabledPolicies, policiesByName, allPolicies), 
which are still in sync at this point, but the enablement is not written to 
the edit logs.
 # Because the namenode was restarted, the erasure coding policy list is then 
loaded from the fsimage (ErasureCodingPolicyManager.loadPolicies()), in which 
RS-3-2-1024k is still disabled. However, only one of the data structures 
(allPolicies) is updated with the loaded policy list; the others 
(enabledPolicies, policiesByName) stay the same, so they still have 
RS-3-2-1024k in the enabled state.
 # Executing the enable command for RS-3-2-1024k has no effect, because 
ErasureCodingPolicyManager.enablePolicy() checks enabledPolicies first and 
finds the policy already there, so allPolicies is never updated.
 # The listPolicies command shows the state stored in allPolicies, so 
RS-3-2-1024k is still reported as DISABLED.
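
The sequence above can be modeled with a small toy class. This is only a 
sketch under simplifying assumptions, not the real ErasureCodingPolicyManager: 
the class name EcPolicyDesyncDemo and the helper methods reproduce(), 
listedAsEnabled() and trackedAsEnabled() are hypothetical, policies are reduced 
to a name plus an enabled flag, and policiesByName is left out because it 
behaves like enabledPolicies in this scenario.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: field names follow the comment above, everything else is
// a hypothetical simplification of ErasureCodingPolicyManager.
class EcPolicyDesyncDemo {
  // Name -> enabled flag; this is the state a listing would report.
  private final Map<String, Boolean> allPolicies = new TreeMap<>();
  // Names of the policies the manager believes are enabled.
  private final Set<String> enabledPolicies = new TreeSet<>();

  /** Step 2: startup enables the configured default in every structure. */
  void init(Map<String, Boolean> known, String defaultPolicy) {
    allPolicies.putAll(known);
    for (Map.Entry<String, Boolean> e : known.entrySet()) {
      if (e.getValue()) {
        enabledPolicies.add(e.getKey());
      }
    }
    allPolicies.put(defaultPolicy, true);
    enabledPolicies.add(defaultPolicy);
  }

  /** Step 3: loading the image replaces allPolicies only (the modeled bug). */
  void loadPolicies(Map<String, Boolean> fromImage) {
    allPolicies.clear();
    allPolicies.putAll(fromImage);
    // enabledPolicies is deliberately left untouched here.
  }

  /** Step 4: returns true only if the call actually changed any state. */
  boolean enablePolicy(String name) {
    if (enabledPolicies.contains(name)) {
      return false; // early return, allPolicies is never corrected
    }
    enabledPolicies.add(name);
    allPolicies.put(name, true);
    return true;
  }

  /** Step 5: the listing reads allPolicies. */
  boolean listedAsEnabled(String name) {
    return allPolicies.getOrDefault(name, false);
  }

  boolean trackedAsEnabled(String name) {
    return enabledPolicies.contains(name);
  }

  /** Replays steps 1 to 3 with RS-3-2-1024k disabled in the image. */
  static EcPolicyDesyncDemo reproduce() {
    Map<String, Boolean> image = new TreeMap<>();
    image.put("RS-6-3-1024k", true);
    image.put("RS-3-2-1024k", false);
    EcPolicyDesyncDemo manager = new EcPolicyDesyncDemo();
    manager.init(image, "RS-3-2-1024k");
    manager.loadPolicies(image);
    return manager;
  }

  public static void main(String[] args) {
    EcPolicyDesyncDemo manager = reproduce();
    System.out.println("enable changed state: "
        + manager.enablePolicy("RS-3-2-1024k"));
    System.out.println("listed as enabled: "
        + manager.listedAsEnabled("RS-3-2-1024k"));
    System.out.println("tracked as enabled: "
        + manager.trackedAsEnabled("RS-3-2-1024k"));
  }
}
```

In this model the enable call changes nothing, the listing keeps reporting the 
policy as disabled, and the enabled set keeps reporting it as enabled, which 
matches the mismatch described in steps 4 and 5.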

The workaround is to enable the policy before setting it as default in the 
config.

A simple fix could be to keep the data structures in 
ErasureCodingPolicyManager in sync and to enable the default policy again when 
loading the fsimage (ErasureCodingPolicyManager.loadPolicies()).
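
A rough sketch of that idea, again as a toy model rather than a patch: the 
class EcPolicySyncSketch and its setState() helper are hypothetical, policies 
are reduced to a name plus an enabled flag, and policiesByName is omitted for 
brevity. The point is only that loadPolicies() rebuilds every structure from 
the image and then re-enables the configured default.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only: not the real ErasureCodingPolicyManager.
class EcPolicySyncSketch {
  private final Map<String, Boolean> allPolicies = new TreeMap<>();
  private final Set<String> enabledPolicies = new TreeSet<>();
  private final String defaultPolicy;

  EcPolicySyncSketch(String defaultPolicy) {
    this.defaultPolicy = defaultPolicy;
  }

  /** Rebuilds both structures from the image, then re-enables the default. */
  void loadPolicies(Map<String, Boolean> fromImage) {
    allPolicies.clear();
    enabledPolicies.clear();
    for (Map.Entry<String, Boolean> e : fromImage.entrySet()) {
      setState(e.getKey(), e.getValue());
    }
    setState(defaultPolicy, true);
  }

  /** Single place that updates state, so the structures cannot diverge. */
  private void setState(String name, boolean enabled) {
    allPolicies.put(name, enabled);
    if (enabled) {
      enabledPolicies.add(name);
    } else {
      enabledPolicies.remove(name);
    }
  }

  boolean listedAsEnabled(String name) {
    return allPolicies.getOrDefault(name, false);
  }

  boolean trackedAsEnabled(String name) {
    return enabledPolicies.contains(name);
  }

  public static void main(String[] args) {
    Map<String, Boolean> image = new TreeMap<>();
    image.put("RS-6-3-1024k", true);
    image.put("RS-3-2-1024k", false); // still disabled in the fsimage
    EcPolicySyncSketch manager = new EcPolicySyncSketch("RS-3-2-1024k");
    manager.loadPolicies(image);
    System.out.println("listed as enabled: "
        + manager.listedAsEnabled("RS-3-2-1024k"));
    System.out.println("tracked as enabled: "
        + manager.trackedAsEnabled("RS-3-2-1024k"));
  }
}
```

Routing every state change through one helper is a design choice of this 
sketch; the actual fix may keep the structures consistent in a different way.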

> ec -listPolicies doesn't show correct state for the default policy when the 
> default is not RS(6,3)
> --------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-14039
>                 URL: https://issues.apache.org/jira/browse/HDFS-14039
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: erasure-coding
>    Affects Versions: 3.0.0
>            Reporter: Xiao Chen
>            Assignee: Kitti Nanasi
>            Priority: Major
>
> {noformat}
> $ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5], State=DISABLED
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2], State=DISABLED
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1], State=ENABLED
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, 
> Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], 
> CellSize=1048576, Id=3], State=DISABLED
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, 
> numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4], State=DISABLED
> $ hdfs ec -enablePolicy -policy XOR-2-1-1024k
> Erasure coding policy XOR-2-1-1024k is enabled
> $ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5], State=DISABLED
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2], State=DISABLED
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1], State=ENABLED
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, 
> Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], 
> CellSize=1048576, Id=3], State=DISABLED
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, 
> numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4], State=ENABLED
> ----------------------------------
> $ #set default to be RS-3-2 for dfs.namenode.ec.system.default.policy, and 
> restart NN
> (this seems to be what's triggering the failure)
> -----------------------------------
> $ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5], State=DISABLED
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2], State=DISABLED
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1], State=ENABLED
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, 
> Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], 
> CellSize=1048576, Id=3], State=DISABLED
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, 
> numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4], State=ENABLED
> $ hdfs ec -enablePolicy -policy RS-3-2-1024k
> Erasure coding policy RS-3-2-1024k is enabled
> $ hdfs ec -listPolicies
> Erasure Coding Policies:
> ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5], State=DISABLED
> ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2], State=DISABLED
> ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, 
> numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1], State=ENABLED
> ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, 
> Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], 
> CellSize=1048576, Id=3], State=DISABLED
> ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, 
> numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4], State=ENABLED
> {noformat}
> The last 2 listings should show RS-3-2 as ENABLED, and RS-6-3 as DISABLED if 
> it was not enabled before.



