[
https://issues.apache.org/jira/browse/HDFS-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877094#comment-15877094
]
Manoj Govindassamy edited comment on HDFS-11412 at 2/22/17 7:17 PM:
--------------------------------------------------------------------
Thanks for the review and detailed comments [~mingma].
(1)
bq. Regarding whether to use default replication factor or max replication
factor,..Block A have large replication factor 30
I am assuming this is because the file was created with an explicit
replication factor of 30.
bq. and would like to keep at least 20 live replicas around during
maintenance...the system need to honor minReplicationToBeInMaintenance == 20.
True. This means one can argue that the allowable range for
minReplicationToBeInMaintenance should be {{0 - dfs.namenode.replication.max}}.
Though this is a valid case, such a wide range can have adverse effects: it can
force replication of a larger number of blocks (to honor
minReplicationToBeInMaintenance), even for files that were not created with a
higher replication factor. Under normal operation, the only way a file can have
many more replicas than the default replication is by explicitly setting the
file replication param at file creation. Also, maintenance operations are
supposed to be of short duration only. So, to avoid re-replication of blocks,
we can argue that the value range should be {{0 - dfs.replication}}. Let me
know your thoughts.
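To make the proposed range concrete, here is a minimal stand-alone sketch of the check, assuming the upper bound is the default replication ({{dfs.replication}}). The class name, method names and plain int parameters are hypothetical and only for illustration; this is not the actual BlockManager code or the attached patch.

```java
import java.io.IOException;

/** Hypothetical stand-alone sketch; not the actual BlockManager code. */
class MaintenanceMinReplicationCheck {

  /** True if the value lies in the proposed range [0, defaultReplication]. */
  static boolean isInRange(int minMaintenanceR, int defaultReplication) {
    return minMaintenanceR >= 0 && minMaintenanceR <= defaultReplication;
  }

  /**
   * Rejects values outside the proposed range. The config key strings are
   * written out literally here instead of using DFSConfigKeys constants.
   */
  static void validate(int minMaintenanceR, int defaultReplication)
      throws IOException {
    if (!isInRange(minMaintenanceR, defaultReplication)) {
      throw new IOException("Unexpected configuration parameters: "
          + "dfs.namenode.maintenance.replication.min = " + minMaintenanceR
          + " is outside [0, dfs.replication = " + defaultReplication + "]");
    }
  }

  public static void main(String[] args) throws IOException {
    validate(0, 3); // accepted: no availability guarantee during maintenance
    validate(2, 3); // accepted: bigger than 1, still within default replication
    try {
      validate(4, 3); // rejected: would force re-replication
    } catch (IOException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

With this bound, a value such as 2 is accepted without touching the NameNode level block min replication, while anything above the default replication is refused at startup.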
(2)
bq. Impact on getExpectedLiveRedundancyNum calculation.
{noformat}
public short getExpectedLiveRedundancyNum(BlockInfo block,
    NumberReplicas numberReplicas) {
  final short expectedRedundancy = getExpectedRedundancyNum(block);
  return (short)Math.max(expectedRedundancy -
      numberReplicas.maintenanceReplicas(),
      getMinMaintenanceStorageNum(block));
}
{noformat}
{{getExpectedLiveRedundancyNum()}} is used in a variety of places other than
the maintenance state monitor. Just replacing this with
*Math.min*(expectedRedundancy - numberReplicas.maintenanceReplicas(),
getMinMaintenanceStorageNum(block)) might not be right, as
minReplicationToBeInMaintenance = 0 is also a valid config. In that case the
expression would return the min value, which is 0, and that can break
non-maintenance callers.
So, during normal BlockManager block validation/reconstruction, we want the
above expression to honor {{expectedRedundancy -
numberReplicas.maintenanceReplicas()}}. Maybe we need to return the max or min
value based on how the block replication is set compared to the default
replication. Your thoughts, please?
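As a worked example of the concern above, the sketch below compares the current Math.max form with the Math.min form for a block with replication 3, one replica in maintenance, and minReplicationToBeInMaintenance = 0. It uses plain ints in place of BlockInfo and NumberReplicas, and the class and method names are hypothetical.

```java
/** Hypothetical sketch using plain ints instead of BlockInfo/NumberReplicas. */
class LiveRedundancySketch {

  /** Mirrors the current expression, which uses Math.max. */
  static short withMax(int expectedRedundancy, int maintenanceReplicas,
      int minMaintenance) {
    return (short) Math.max(expectedRedundancy - maintenanceReplicas,
        minMaintenance);
  }

  /** The alternative under discussion, which uses Math.min. */
  static short withMin(int expectedRedundancy, int maintenanceReplicas,
      int minMaintenance) {
    return (short) Math.min(expectedRedundancy - maintenanceReplicas,
        minMaintenance);
  }

  public static void main(String[] args) {
    // Replication 3, one replica in maintenance,
    // minReplicationToBeInMaintenance = 0.
    System.out.println(withMax(3, 1, 0)); // prints 2
    System.out.println(withMin(3, 1, 0)); // prints 0
  }
}
```

With Math.min, the expected live redundancy drops to 0 for this block, which is the value non-maintenance callers would then see.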
> Maintenance minimum replication config value allowable range should be {0 -
> DefaultReplication}
> -----------------------------------------------------------------------------------------------
>
> Key: HDFS-11412
> URL: https://issues.apache.org/jira/browse/HDFS-11412
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Affects Versions: 3.0.0-alpha1
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Attachments: HDFS-11412.01.patch
>
>
> Currently the allowed value range for Maintenance Min Replication
> {{dfs.namenode.maintenance.replication.min}} is 0 to
> {{dfs.namenode.replication.min}} (default=1). Users who do not want to affect
> the performance of the cluster may wish to set the Maintenance Min Replication
> number greater than 1, say 2. In the current design this is possible, but only
> after changing the NameNode level Block Min Replication to 2, which could
> increase the overall latency of client writes.
> Technically speaking, we should allow Maintenance Min Replication to be
> in the range 0 to dfs.replication.max.
> * There is always the config value of 0 for users not wanting any
> availability/performance during maintenance.
> * And performance-centric workloads can still get maintenance done without
> major disruptions by having a bigger Maintenance Min Replication. Setting the
> upper limit to dfs.replication.max could be overkill, as it could trigger
> re-replication, which Maintenance State is trying to avoid. So, we could allow
> {{dfs.namenode.maintenance.replication.min}} in the range {{0 to
> dfs.replication}}.
> {noformat}
> if (minMaintenanceR < 0) {
>   throw new IOException("Unexpected configuration parameters: "
>       + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>       + " = " + minMaintenanceR + " < 0");
> }
> if (minMaintenanceR > minR) {
>   throw new IOException("Unexpected configuration parameters: "
>       + DFSConfigKeys.DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY
>       + " = " + minMaintenanceR + " > "
>       + DFSConfigKeys.DFS_NAMENODE_REPLICATION_MIN_KEY
>       + " = " + minR);
> }
> {noformat}