[
https://issues.apache.org/jira/browse/HDFS-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343776#comment-15343776
]
Vinayakumar B commented on HDFS-9411:
-------------------------------------
Hi [~drankye], answering your earlier comments.
bq. 1. Would it be good to support generic node label instead of ZoneLabel? I
thought it may be useful for some considerations like cluster provisioning and
management, security, repl/EC task scheduling and etc. in addition to block
placement. The label could help specify some node attributes about network,
CPU, storage, usage, and some other application domains
Yes, the new design provides generic node label support, which also covers EC
tasks.
bq. 2. Given generic node label is used, maybe we can leverage file/directory
attributes to implement the requirement? Like we create/manage zones of files
expressed in file attributes and place blocks based on flexible node label
combinations.
Yes, the design uses xAttrs to support label expressions on paths.
bq. 3. So in the design, Zone or ZoneLabel will be the first factor to block
placement, and will dominate storage policies, right?
Yes, the NodeLabel expression will be an additional factor in selecting a node,
applied before the storage within that node is selected based on the storage
policy.
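To illustrate the ordering only, here is a minimal sketch of the two-stage selection (label first, storage policy second). It is not code from the design doc or from HDFS: the `DataNode` class, the `select_targets` function, the single-label "expression", and the ordered list standing in for a storage policy are all hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Hypothetical stand-in for a datanode: a name, its labels, its storage types."""
    name: str
    labels: set = field(default_factory=set)
    storage_types: set = field(default_factory=set)


def select_targets(nodes, required_label, preferred_storage):
    """Return (node name, storage type) pairs.

    Stage 1: keep only nodes carrying required_label (None means no constraint).
    Stage 2: on each remaining node, pick the first storage type from
             preferred_storage that the node actually has.
    """
    targets = []
    for node in nodes:
        # Stage 1: the label expression filters nodes (assumed: one plain label).
        if required_label is not None and required_label not in node.labels:
            continue
        # Stage 2: the storage policy picks a storage inside the chosen node.
        for storage in preferred_storage:
            if storage in node.storage_types:
                targets.append((node.name, storage))
                break
    return targets


nodes = [
    DataNode("dn1", {"tenantA"}, {"DISK", "SSD"}),
    DataNode("dn2", {"tenantB"}, {"SSD"}),
    DataNode("dn3", {"tenantA"}, {"DISK"}),
]

# Path labelled "tenantA", policy prefers SSD and falls back to DISK.
result = select_targets(nodes, "tenantA", ["SSD", "DISK"])
print(result)
```

In this sketch dn2 is never considered for a "tenantA" path, even though it has the preferred SSD storage, because the label filter runs before the storage choice.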
bq. 4. How this might relate to federation and block pool?
IMO, this doesn't have any specific relation to federation. A Datanode's label
applies to all the NameNodes it serves, so the label should be created in all
NameNodes before the DN is labelled.
Hope this answers your earlier questions. I look forward to more comments on
the new doc.
> HDFS NodeLabel support
> ----------------------
>
> Key: HDFS-9411
> URL: https://issues.apache.org/jira/browse/HDFS-9411
> Project: Hadoop HDFS
> Issue Type: New Feature
> Reporter: Vinayakumar B
> Assignee: Vinayakumar B
> Attachments: HDFSNodeLabels-20-06-2016.pdf,
> HDFS_ZoneLabels-16112015.pdf
>
>
> HDFS currently stores data blocks on different datanodes chosen by the
> BlockPlacement Policy. These datanodes are chosen at random within the
> scope (local-rack/different-rack/nodegroup) of the network topology.
> In a multi-tenant scenario (a tenant can be a user or a service), blocks of
> any tenant can be on any datanode.
> Depending on the applications of different tenants, a datanode might
> sometimes get busy, slowing down another tenant's application. It would be
> better if admins had a provision to logically divide the cluster among
> multiple tenants.
> NodeLabels give users more options to specify constraints for selecting
> nodes with specific requirements.
> High-level design doc to follow soon.