[ 
https://issues.apache.org/jira/browse/HDFS-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015401#comment-15015401
 ] 

Mark Sun commented on HDFS-9411:
--------------------------------

Hi [~vinayrpet], sorry for the late reply.

* *About using DataNode configuration or not.*
   ** I mixed up StorageType and DataNode StorageInfo. What I mean is that the 
DataNode could store the zone_label in the VERSION file of each Volume, or in 
another file in the same directory (a rough sketch follows this list). Of 
course, we would then also have to add a new command/field to the heartbeat.
   ** Besides the considerations we already discussed, setting the zone_label 
per Volume could be better if we want Volume-level isolation in the future.
   ** Updating a zone_label is not like applying a simple configuration change; 
it looks more like refreshNodes:
      *** Both NameNode and DataNode are involved.
      *** In some cases, we should ensure the necessary work (block moving or 
erasure) is done before the label finally changes.
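
A rough sketch of the per-Volume idea above, assuming the zone_label is kept in 
a Java-properties-style file next to (or inside) the Volume's VERSION file; the 
property key and helper class are hypothetical, not an existing API:

{code:java}
import java.io.*;
import java.util.Properties;

/** Hypothetical helper: persist/read a zone_label in a Volume-level properties file. */
public class VolumeZoneLabel {
  // Assumed property key; not part of the current VERSION file format.
  private static final String ZONE_LABEL_PROP = "zoneLabel";

  /** Write (or overwrite) the zone_label entry in the given properties file. */
  public static void writeZoneLabel(File versionFile, String zoneLabel) throws IOException {
    Properties props = readProps(versionFile);
    props.setProperty(ZONE_LABEL_PROP, zoneLabel);
    try (OutputStream out = new FileOutputStream(versionFile)) {
      props.store(out, "volume properties including zone_label");
    }
  }

  /** Read the zone_label, falling back to DEFAULT_ZONE when none is set. */
  public static String readZoneLabel(File versionFile) throws IOException {
    return readProps(versionFile).getProperty(ZONE_LABEL_PROP, "DEFAULT_ZONE");
  }

  private static Properties readProps(File f) throws IOException {
    Properties props = new Properties();
    if (f.exists()) {
      try (InputStream in = new FileInputStream(f)) {
        props.load(in);
      }
    }
    return props;
  }
}
{code}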

* *About the BPP-related API.*
   ** Your comments made things a bit clearer to me:
      *** The client does not set the xattr explicitly; instead, the 
ZoneProvider in the NameNode reads the policy from the user context and the 
zone stats, and then sets the zone layout as an xattr on the file.
      *** The set/getZone API & CLI can be used to change the zone layout 
explicitly after the file is created, and new files/subdirectories can inherit 
the zone layout from their parent.
   ** Still not clear:
      *** Where does the NameNode get the ZoneProvider implementation for a 
given user?
      *** If the NameNode stores the zone layout as an xattr, how do we deal 
with zoneStats changes? How do we distinguish between “we want to store the 
block in zone_a” and “we want to store the block in zone_b, but zone_b is full 
according to zoneStats, so we use zone_a”?
      *** How do we deal with large files? zoneStats changes constantly; will 
we update the zone layout with the new zoneStats for later blocks?
   ** Proposal (a rough sketch of such a provider follows this list):
      *** The NameNode does not store the xattr at all; the layout is simply 
re-calculated for each block allocation.
      *** If we want a per-file policy, we should store *the policy itself* in 
the xattr, not the layout result, because zoneStats keeps changing.
      *** Maybe we also need to pass & store the policies to the DataNode, for 
a new zone-based Balancer or Mover.
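
A rough sketch of the proposal above; these interfaces are hypothetical (not 
the current HDFS API) and only illustrate storing the policy while re-resolving 
the layout per block allocation from the latest zoneStats:

{code:java}
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical ZoneProvider: the xattr holds the policy (the intent), and the
 * zones are re-resolved for every block allocation against the current
 * zoneStats, so "zone_b preferred but full, fall back to zone_a" is decided at
 * allocation time instead of being frozen into a stored layout.
 */
interface ZoneProvider {
  /** Resolve zones for one block allocation; later blocks of a large file see fresher stats. */
  Set<String> chooseZones(ZonePolicy policy, Map<String, ZoneStats> zoneStats, int numReplicas);
}

/** What the xattr would store: the policy itself, not the resolved layout. */
interface ZonePolicy {
  List<String> preferredZones();   // e.g. ["zone_b", "zone_a"] in preference order
  byte[] serialize();              // compact encoding for the xattr value
}

/** Per-zone usage numbers (assumed shape of the "zoneStats" mentioned above). */
interface ZoneStats {
  long remainingBytes();
  boolean isFull();
}
{code}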

* *About feature on-off control and DEFAULT_ZONE*
   ** Agreed, it is just a philosophical preference. I simply think 
DEFAULT_ZONE keeps the API uniform with fewer if/else branches: by default 
every node belongs to DEFAULT_ZONE, so the code path stays the same as the 
current implementation.
   ** As another option, we could use polymorphism to reduce the if/else and 
code complexity caused by the on-off control (see the sketch after this list).
   ** Sure, both are OK :)
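
A minimal sketch of the polymorphism option above (all class and method names 
are hypothetical): call sites always go through a ZoneResolver, and the 
"feature off" case is just a trivial implementation that maps every node to 
DEFAULT_ZONE, so no on-off if/else is needed at the call sites:

{code:java}
import java.util.Map;

/** Hypothetical resolver: every caller asks this, regardless of whether the feature is on. */
interface ZoneResolver {
  String zoneOf(String datanodeUuid);
}

/** Feature disabled: everything lives in DEFAULT_ZONE, i.e. the current code path. */
class DefaultZoneResolver implements ZoneResolver {
  static final String DEFAULT_ZONE = "DEFAULT_ZONE";

  @Override
  public String zoneOf(String datanodeUuid) {
    return DEFAULT_ZONE;
  }
}

/** Feature enabled: labels come from whatever store (VERSION file, heartbeat, config) we settle on. */
class LabelZoneResolver implements ZoneResolver {
  private final Map<String, String> labelByNode;

  LabelZoneResolver(Map<String, String> labelByNode) {
    this.labelByNode = labelByNode;
  }

  @Override
  public String zoneOf(String datanodeUuid) {
    return labelByNode.getOrDefault(datanodeUuid, DefaultZoneResolver.DEFAULT_ZONE);
  }
}
{code}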

* *Make zone transparent to existing BPPs*
   ** I proposed expanding a zone into a node list to make zones transparent to 
BPPs (see the sketch after this list), because:
      *** The code is more generic: the new constraint is that the BPP should 
place X replicas within a given set<node>, and nothing restricts that 
constraint to being produced by the ZoneProvider. (We could even use this 
constraint to re-implement local-node choosing.)
      *** It is more flexible: in some ambiguous scenarios, one replica may be 
placed in either zone_a or zone_b.
      *** The code stays loosely coupled: BPP code changes do not need to know 
about the zone_label.
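
A rough sketch of "expand zone to node list"; all types and method names here 
are placeholders (not the real BlockPlacementPolicy API) and only illustrate 
that the BPP sees a plain candidate node set and never the zone_label itself:

{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical glue between a ZoneProvider-style expander and an existing-style BPP. */
class ZoneAwarePlacement {

  /** Existing-style placement, constrained to choose replicas only from the candidate set. */
  interface NodeSetPlacementPolicy {
    List<String> chooseTargets(int numReplicas, Set<String> candidateNodes, String writerNode);
  }

  /** Expands one or more zones (possibly "zone_a or zone_b") into concrete nodes. */
  interface ZoneExpander {
    Set<String> expand(Set<String> zones);
  }

  private final NodeSetPlacementPolicy bpp;
  private final ZoneExpander expander;

  ZoneAwarePlacement(NodeSetPlacementPolicy bpp, ZoneExpander expander) {
    this.bpp = bpp;
    this.expander = expander;
  }

  /** The BPP only receives a node set; the zone_label never leaks into its code. */
  List<String> choose(int numReplicas, Set<String> allowedZones, String writerNode) {
    Set<String> candidates = new HashSet<>(expander.expand(allowedZones));
    return bpp.chooseTargets(numReplicas, candidates, writerNode);
  }
}
{code}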

> HDFS ZoneLabel support
> ----------------------
>
>                 Key: HDFS-9411
>                 URL: https://issues.apache.org/jira/browse/HDFS-9411
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>         Attachments: HDFS_ZoneLabels-16112015.pdf
>
>
> HDFS currently stores data blocks on different datanodes chosen by 
> BlockPlacement Policy. These datanodes are random within the 
> scope(local-rack/different-rack/nodegroup) of network topology. 
> In Multi-tenant (Tenant can be user/service) scenario, blocks of any tenant 
> can be on any datanodes.
> Based on the applications of different tenants, a datanode might sometimes get 
> busy, slowing down the other tenant's applications. It would be better if 
> admins had a provision to logically divide the cluster among multiple tenants.
> ZONE_LABELS can logically divide the cluster datanodes into multiple Zones.
> High level design doc to follow soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
