[ 
https://issues.apache.org/jira/browse/HDFS-12946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16647401#comment-16647401
 ] 

Xiao Chen commented on HDFS-12946:
----------------------------------

Thanks [~knanasi] for the work here!

It'd be nice to have a quick summary of offline discussions, for context. I'll 
try to do it below this time. :)
{quote}[~zvenczel] and Kitti wondered if it would have made sense to do this 
check in the NN (instead of on the client side via multiple RPCs). This way, 
the check could also be injected into enableECPolicy, and the NN can expose 
the result via jmx. I think this is a good idea, and I appreciate Kitti's quick 
turnaround on implementing it.
{quote}
Looking at the patch though, I'm a little worried that this new RPC seems to be 
very 'light' compared to other RPCs. We should investigate whether there are 
other possibilities so that we do not 'pollute' the ClientNamenodeProtocol. One 
way I found is to do it in fsck, where we could call FSN directly. There may be 
better alternatives, but I'd need more time to investigate.

> Add a tool to check rack configuration against EC policies
> ----------------------------------------------------------
>
>                 Key: HDFS-12946
>                 URL: https://issues.apache.org/jira/browse/HDFS-12946
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: erasure-coding
>            Reporter: Xiao Chen
>            Assignee: Kitti Nanasi
>            Priority: Major
>         Attachments: HDFS-12946.01.patch, HDFS-12946.02.patch
>
>
> From testing we have seen setups with problematic racks / datanodes that 
> would not suffice for basic EC usage. These are usually discovered only after 
> the tests have failed.
> We should provide a way to check this beforehand.
> Some scenarios:
> - not enough datanodes compared to EC policy's highest data+parity number
> - not enough racks to satisfy BPPRackFaultTolerant
> - racks too uneven to satisfy BPPRackFaultTolerant
> - highly uneven racks (so that BPP's considerLoad logic may exclude some busy 
> nodes on the rack, resulting in the second scenario above)
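
For illustration only, the first two scenarios above (too few datanodes, too few racks) could be checked roughly as in the sketch below. This is not the code from the attached patches and not an existing Hadoop API: the class EcTopologyCheck, its methods, and the Policy stand-in are all hypothetical names. The rack rule is also an assumption: it supposes that rack fault tolerance needs enough racks that no single rack holds more than parityUnits blocks of one block group, i.e. ceil((data + parity) / parity) racks.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not the HDFS-12946 patch and not a Hadoop API. */
final class EcTopologyCheck {

    /** Minimal stand-in for an EC policy: data and parity units per block group. */
    static final class Policy {
        final String name;
        final int dataUnits;
        final int parityUnits;

        Policy(String name, int dataUnits, int parityUnits) {
            if (dataUnits <= 0 || parityUnits <= 0) {
                throw new IllegalArgumentException("units must be positive");
            }
            this.name = name;
            this.dataUnits = dataUnits;
            this.parityUnits = parityUnits;
        }

        int totalUnits() {
            return dataUnits + parityUnits;
        }
    }

    /** Scenario 1: the highest data+parity number among the given policies. */
    static int minDataNodes(List<Policy> policies) {
        int max = 0;
        for (Policy p : policies) {
            max = Math.max(max, p.totalUnits());
        }
        return max;
    }

    /**
     * Scenario 2, under the assumption stated in the lead-in: each policy
     * needs ceil(totalUnits / parityUnits) racks; take the largest.
     */
    static int minRacks(List<Policy> policies) {
        int max = 0;
        for (Policy p : policies) {
            // Integer ceiling division.
            int needed = (p.totalUnits() + p.parityUnits - 1) / p.parityUnits;
            max = Math.max(max, needed);
        }
        return max;
    }

    /** True if the cluster has enough datanodes and racks for every policy. */
    static boolean isSupported(List<Policy> policies, int dataNodes, int racks) {
        return dataNodes >= minDataNodes(policies) && racks >= minRacks(policies);
    }

    public static void main(String[] args) {
        List<Policy> enabled = Arrays.asList(
            new Policy("RS-6-3", 6, 3),
            new Policy("RS-10-4", 10, 4));
        System.out.println("min datanodes: " + minDataNodes(enabled));
        System.out.println("min racks: " + minRacks(enabled));
        System.out.println("9 DNs on 3 racks ok? " + isSupported(enabled, 9, 3));
    }
}
```

The uneven-rack scenarios are not covered here; they would need per-rack datanode counts rather than just totals.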



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
