[ 
https://issues.apache.org/jira/browse/HADOOP-18629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17688460#comment-17688460
 ] 

ASF GitHub Bot commented on HADOOP-18629:
-----------------------------------------

steveloughran commented on PR #5391:
URL: https://github.com/apache/hadoop/pull/5391#issuecomment-1429555546

   -1 to anything exposing internal hdfs implementation methods. Sorry.
   
   People start using them and expect them to be stable and maintained. There 
is also the little detail that cloud deployments do not always have the hdfs 
jars on the classpath; this PR would break those deployments.
   
   What would make sense would be to use createFile() and for hdfs to add a 
.opt() option for those favoured nodes. createFile() is the public API; .opt() 
options can be ignored by other filesystems, *or reimplemented*. There is a 
lot more in terms of design and wiring up, but the benefit is portability 
and maintainability.
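
   To illustrate the point about .opt() being ignorable, here is a minimal, 
self-contained Java sketch. It is a toy model only: it does not use Hadoop's 
classes, and the names CreateBuilder, ToyFileSystem, HdfsLikeFs, 
ObjectStoreLikeFs and the option key "example.create.favored.nodes" are all 
invented for this example. It only shows the shape of the idea: the caller 
writes one code path, a filesystem that understands the hint applies it, and 
one that does not simply drops it.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class OptSketch {

    // Hypothetical option key, invented for this sketch; not a real Hadoop key.
    static final String FAVORED_NODES = "example.create.favored.nodes";

    /** Toy stand-in for the builder a createFile() call would return. */
    static final class CreateBuilder {
        private final Set<String> understoodKeys;
        private final Map<String, String> applied = new LinkedHashMap<>();

        CreateBuilder(Set<String> understoodKeys) {
            this.understoodKeys = understoodKeys;
        }

        /** Optional hint: kept if the filesystem understands the key, dropped otherwise. */
        CreateBuilder opt(String key, String value) {
            if (understoodKeys.contains(key)) {
                applied.put(key, value);
            }
            return this;
        }

        /** Returns the options that would take effect (a real builder would open a stream). */
        Map<String, String> build() {
            return Collections.unmodifiableMap(new LinkedHashMap<>(applied));
        }
    }

    /** Toy filesystem abstraction; the path is unused because nothing is written. */
    interface ToyFileSystem {
        CreateBuilder createFile(String path);
    }

    /** Understands the favored-nodes hint. */
    static final class HdfsLikeFs implements ToyFileSystem {
        public CreateBuilder createFile(String path) {
            return new CreateBuilder(Set.of(FAVORED_NODES));
        }
    }

    /** Understands no optional keys, so every opt() is ignored. */
    static final class ObjectStoreLikeFs implements ToyFileSystem {
        public CreateBuilder createFile(String path) {
            return new CreateBuilder(Set.of());
        }
    }

    /** Caller code: identical for every filesystem, with no HDFS-specific types. */
    static Map<String, String> copyTo(ToyFileSystem fs) {
        return fs.createFile("/data/part-0000")
                 .opt(FAVORED_NODES, "dn1.example.com,dn2.example.com")
                 .build();
    }

    public static void main(String[] args) {
        System.out.println("hdfs-like:    " + copyTo(new HdfsLikeFs()));
        System.out.println("object store: " + copyTo(new ObjectStoreLikeFs()));
    }
}
```

   In Hadoop itself the corresponding entry point would be 
FileSystem.createFile(Path), whose builder exposes opt(); which key hdfs would 
read for favoured nodes is exactly the design work still to be done, so the 
key above should not be taken as a proposal.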




> Hadoop DistCp supports specifying favoredNodes for data copying
> ---------------------------------------------------------------
>
>                 Key: HADOOP-18629
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18629
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: tools
>            Reporter: zhuyaogai
>            Priority: Major
>              Labels: pull-request-available
>
> When importing large-scale data into HBase, we generate the HFiles on 
> another Hadoop cluster, use the DistCp tool to copy the data to the HBase 
> cluster, and then bulk load the data into the HBase table. However, the data 
> locality is rather low, which may result in high query latency. Locality 
> recovers only after a compaction. Therefore, we can increase the data 
> locality by specifying the favoredNodes in DistCp.
> Could I submit a pull request to optimize it?



