[
https://issues.apache.org/jira/browse/HBASE-12596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Victor Xu updated HBASE-12596:
------------------------------
Description:
Normally, a bulkload takes two steps: 1. run a job to write the HFiles to be
loaded; 2. move these HFiles to the right HDFS directory. However, locality
can be lost during the first step. Why not write the HFiles directly into the
right place? This is easy to do because StoreFile.WriterBuilder has the
"withFavoredNodes" method; we just need to call it in HFileOutputFormat's
getNewWriter().
This feature is disabled by default and can be enabled by setting
'hbase.bulkload.locality.sensitive.enabled'.
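The decision described above can be modelled with a minimal, stdlib-only Java
sketch. It is not the attached patch and does not use the HBase API: the class
name, the plain Map standing in for the job configuration, the region-to-host
lookup and the host names are all hypothetical. Only the property name and the
"disabled by default" behaviour come from the description. In HBase itself, the
resulting array is what would be handed to StoreFile.WriterBuilder's
"withFavoredNodes" inside getNewWriter().

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

public class LocalityAwareWriterSketch {
    // Property name taken from the issue description.
    static final String LOCALITY_KEY = "hbase.bulkload.locality.sensitive.enabled";

    // An absent key means "disabled", matching the stated default.
    static boolean localityEnabled(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(LOCALITY_KEY, "false"));
    }

    // Returns the favored nodes for a new HFile writer, or an empty array
    // when the feature is off or the region's host is unknown.
    // regionHosts is a hypothetical region-name -> region-server lookup.
    static InetSocketAddress[] favoredNodesFor(Map<String, String> conf,
                                               Map<String, InetSocketAddress> regionHosts,
                                               String regionName) {
        if (!localityEnabled(conf)) {
            return new InetSocketAddress[0];
        }
        InetSocketAddress host = regionHosts.get(regionName);
        if (host == null) {
            return new InetSocketAddress[0];
        }
        return new InetSocketAddress[] { host };
    }

    public static void main(String[] args) {
        Map<String, InetSocketAddress> regionHosts = new HashMap<>();
        // Hypothetical host and port; createUnresolved avoids a DNS lookup.
        regionHosts.put("region-a",
                InetSocketAddress.createUnresolved("rs1.example.com", 60020));

        Map<String, String> conf = new HashMap<>();
        System.out.println("default: "
                + favoredNodesFor(conf, regionHosts, "region-a").length + " favored node(s)");

        conf.put(LOCALITY_KEY, "true");
        System.out.println("enabled: "
                + favoredNodesFor(conf, regionHosts, "region-a").length + " favored node(s)");
    }
}
```

Returning an empty array when the flag is off keeps the default write path
unchanged, which is one way to read "disabled by default".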
was: Normally, a bulkload takes two steps: 1. run a job to write the HFiles to
be loaded; 2. move these HFiles to the right HDFS directory. However, locality
can be lost during the first step. Why not write the HFiles directly into the
right place? This is easy to do because StoreFile.WriterBuilder has the
"withFavoredNodes" method; we just need to call it in HFileOutputFormat's
getNewWriter().
> bulkload needs to follow locality
> ---------------------------------
>
> Key: HBASE-12596
> URL: https://issues.apache.org/jira/browse/HBASE-12596
> Project: HBase
> Issue Type: Improvement
> Components: HFile, regionserver
> Affects Versions: 0.98.8
> Environment: hadoop-2.3.0, hbase-0.98.8, jdk1.7
> Reporter: Victor Xu
> Assignee: Victor Xu
> Attachments: HBASE-12596-0.98-v1.patch, HBASE-12596-master-v1.patch,
> HBASE-12596.patch
>
>
> Normally, a bulkload takes two steps: 1. run a job to write the HFiles to be
> loaded; 2. move these HFiles to the right HDFS directory. However, locality
> can be lost during the first step. Why not write the HFiles directly into the
> right place? This is easy to do because StoreFile.WriterBuilder has the
> "withFavoredNodes" method; we just need to call it in HFileOutputFormat's
> getNewWriter().
> This feature is disabled by default and can be enabled by setting
> 'hbase.bulkload.locality.sensitive.enabled'.