zhuyaogai created HADOOP-18629:
----------------------------------
Summary: Hadoop DistCp supports specifying favoredNodes for data copying
Key: HADOOP-18629
URL: https://issues.apache.org/jira/browse/HADOOP-18629
Project: Hadoop Common
Issue Type: New Feature
Components: tools
Reporter: zhuyaogai
When importing large-scale data into HBase, we typically generate the HFiles
on another Hadoop cluster, copy them to the HBase cluster with the DistCp tool,
and then bulk load them into the HBase table. However, the data locality of the
copied files is rather low, which may result in high query latency. Locality
only recovers after a compaction rewrites the data. We could therefore improve
data locality at copy time by letting DistCp specify favoredNodes when it
writes the files.
May I submit a pull request to implement this?
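To make the proposal concrete, here is a minimal sketch of how a favoredNodes option value could be parsed. The class name FavoredNodesParser, the "host:port,host:port" format, and the example host names are assumptions for illustration only; none of them exist in DistCp today. The resulting array has the type accepted by the HDFS DistributedFileSystem#create overload that takes an InetSocketAddress[] favoredNodes argument. As far as I understand, HDFS treats favored nodes as a placement hint rather than a guarantee.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of DistCp: turns a comma-separated
// "host:port" list into the address array used for favored nodes.
final class FavoredNodesParser {
    private FavoredNodesParser() {}

    /** Parses e.g. "dn1.example.com:9866,dn2.example.com:9866". */
    static InetSocketAddress[] parse(String spec) {
        List<InetSocketAddress> nodes = new ArrayList<>();
        if (spec == null || spec.trim().isEmpty()) {
            // No favored nodes requested: let HDFS place blocks as usual.
            return new InetSocketAddress[0];
        }
        for (String entry : spec.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException(
                    "Expected host:port but got: " + trimmed);
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            // Unresolved on purpose: no DNS lookup happens while parsing.
            nodes.add(InetSocketAddress.createUnresolved(host, port));
        }
        return nodes.toArray(new InetSocketAddress[0]);
    }
}
```

In the HBase case, the favored nodes would presumably be the DataNodes co-located with the RegionServers that will serve the bulk-loaded regions, so that the copied HFiles already have local replicas before any compaction runs.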