[ 
https://issues.apache.org/jira/browse/HDFS-9411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343776#comment-15343776
 ] 

Vinayakumar B commented on HDFS-9411:
-------------------------------------

Hi [~drankye], answering your earlier comments.
bq. 1. Would it be good to support generic node label instead of ZoneLabel? I 
thought it may be useful for some considerations like cluster provisioning and 
management, security, repl/EC task scheduling and etc. in addition to block 
placement. The label could help specify some node attributes about network, 
CPU, storage, usage, and some other application domains
Yes, the new design provides generic node label support, which also considers 
EC tasks.

bq. 2. Given generic node label is used, maybe we can leverage file/directory 
attributes to implement the requirement? Like we create/manage zones of files 
expressed in file attributes and place blocks based on flexible node label 
combinations.
Yes, the design leverages xattrs to support label expressions on paths.
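
As a rough illustration of the idea (not the API from the design doc), a label 
expression could be stored as an extended attribute on a directory and resolved 
for a file by walking up to the nearest ancestor that carries one. The 
attribute name `user.hdfs.node.label.expr`, the expression value, and the 
ancestor-inheritance rule are all assumptions made for this sketch; a plain map 
stands in for the NameNode's xattr store.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: a map stands in for per-path xattrs held by the NameNode.
class PathLabelExpressions {
    // Assumed attribute name; the real name would come from the design doc.
    static final String XATTR_NAME = "user.hdfs.node.label.expr";

    private final Map<String, Map<String, String>> xattrsByPath = new HashMap<>();

    // Attach a label expression to an absolute path, as setting an xattr would.
    void setLabelExpression(String path, String expression) {
        xattrsByPath.computeIfAbsent(path, p -> new HashMap<>())
                    .put(XATTR_NAME, expression);
    }

    // Resolve the expression for an absolute path: its own xattr first,
    // then the nearest ancestor that has one (assumed inheritance rule).
    Optional<String> resolve(String path) {
        String current = path;
        while (current != null) {
            Map<String, String> attrs = xattrsByPath.get(current);
            if (attrs != null && attrs.containsKey(XATTR_NAME)) {
                return Optional.of(attrs.get(XATTR_NAME));
            }
            current = parent(current);
        }
        return Optional.empty();
    }

    // Parent of an absolute path, or null once the root has been checked.
    private static String parent(String path) {
        if (path.equals("/")) {
            return null;
        }
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    public static void main(String[] args) {
        PathLabelExpressions store = new PathLabelExpressions();
        store.setLabelExpression("/tenants/a", "tenantA");
        System.out.println(store.resolve("/tenants/a/data/file1"));
        System.out.println(store.resolve("/tmp/file2"));
    }
}
```

On a real cluster the attribute would be set through the existing xattr 
interfaces (for example `hdfs dfs -setfattr`), with whatever name and syntax 
the design doc defines.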

bq. 3. So in the design, Zone or ZoneLabel will be the first factor to block 
placement, and will dominate storage policies, right?
Yes, the NodeLabel expression will be another factor in selecting a node, 
applied before the storage within that node is selected based on the storage 
policy.
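
The ordering described above can be sketched as a two-stage filter: first keep 
only the nodes whose labels satisfy the path's expression, then pick a storage 
on each remaining node according to the storage policy. This is a minimal 
sketch under simplifying assumptions, not the actual BlockPlacementPolicy code: 
the `Node` class, a label expression reduced to a single required label, and a 
storage policy reduced to an ordered list of preferred storage types are all 
hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of label-first, storage-policy-second target selection.
class LabelAwarePlacement {
    // Simplified stand-in for a datanode: a name, its labels, its storage types.
    static final class Node {
        final String name;
        final Set<String> labels;
        final Set<String> storageTypes;

        Node(String name, List<String> labels, List<String> storageTypes) {
            this.name = name;
            this.labels = new HashSet<>(labels);
            this.storageTypes = new HashSet<>(storageTypes);
        }
    }

    // Returns "node:storageType" targets. requiredLabel stands in for a full
    // label expression; preferredTypes stands in for a storage policy and is
    // ordered from most to least preferred.
    static List<String> chooseTargets(List<Node> nodes, String requiredLabel,
                                      List<String> preferredTypes) {
        List<String> targets = new ArrayList<>();
        for (Node node : nodes) {
            // Stage 1: the label expression decides whether the node is eligible.
            if (!node.labels.contains(requiredLabel)) {
                continue;
            }
            // Stage 2: within an eligible node, the policy picks the storage.
            for (String type : preferredTypes) {
                if (node.storageTypes.contains(type)) {
                    targets.add(node.name + ":" + type);
                    break;
                }
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("dn1", Arrays.asList("tenantA"), Arrays.asList("DISK", "SSD")),
            new Node("dn2", Arrays.asList("tenantB"), Arrays.asList("SSD")),
            new Node("dn3", Arrays.asList("tenantA"), Arrays.asList("DISK")));
        System.out.println(
            chooseTargets(nodes, "tenantA", Arrays.asList("SSD", "DISK")));
    }
}
```

In this example dn2 is skipped even though it has an SSD, because the label 
check runs before any storage type is considered.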

bq. 4. How this might relate to federation and block pool?
IMO, this doesn't have any specific relation to federation. A Datanode's label 
applies to all the NameNodes it serves, so the label should be created in all 
NameNodes before the DN is labelled.

Hope this answers your earlier questions. Looking forward to more feedback on 
the new doc.

> HDFS NodeLabel support
> ----------------------
>
>                 Key: HDFS-9411
>                 URL: https://issues.apache.org/jira/browse/HDFS-9411
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Vinayakumar B
>            Assignee: Vinayakumar B
>         Attachments: HDFSNodeLabels-20-06-2016.pdf, 
> HDFS_ZoneLabels-16112015.pdf
>
>
> HDFS currently stores data blocks on different datanodes chosen by the 
> BlockPlacement Policy. These datanodes are chosen at random within the 
> scope (local-rack/different-rack/nodegroup) of the network topology. 
> In a multi-tenant scenario (a tenant can be a user or a service), blocks of 
> any tenant can be on any datanode.
> Depending on the applications of different tenants, a datanode might 
> sometimes get busy, slowing down other tenants' applications. It would be 
> better if admins had a provision to logically divide the cluster among 
> multiple tenants.
> NodeLabels give users more options to specify constraints for selecting 
> specific nodes with specific requirements.
> High level design doc to follow soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

