[
https://issues.apache.org/jira/browse/HBASE-11335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matteo Bertozzi updated HBASE-11335:
------------------------------------
Status: Patch Available (was: Open)
> Fix the TABLE_DIR param in TableSnapshotInputFormat
> ---------------------------------------------------
>
> Key: HBASE-11335
> URL: https://issues.apache.org/jira/browse/HBASE-11335
> Project: HBase
> Issue Type: Bug
> Components: mapreduce, snapshots
> Affects Versions: 0.98.3, 0.96.2
> Reporter: deepankar
> Attachments: HBASE_11335-0.96-v1.patch, HBASE_11335-trunk-v1.patch
>
>
> In the class *TableSnapshotInputFormat* (or *TableSnapshotInputFormatImpl*),
> in the function
> {code}
> public static void setInput(Job job, String snapshotName, Path restoreDir)
> throws IOException {
> {code}
> we set restoreDir (a temporary root) as the table directory:
> {code}
> conf.set(TABLE_DIR_KEY, restoreDir.toString());
> {code}
> This parameter is later used to compute the InputSplits, in particular
> when calculating the preferred hosts for each split:
> {code}
> Path tableDir = new Path(conf.get(TABLE_DIR_KEY));
> List<String> hosts = getBestLocations(conf,
> HRegion.computeHDFSBlocksDistribution(conf, htd, hri, tableDir));
> {code}
> This returns an empty *HDFSBlocksDistribution*, because the restored
> root directory contains no directory named after the region name from hri.
> As a result, tasks are scheduled without regard to data locality.
> The fix is simple: call
> {code}FSUtils.getTableDir(rootDir, tableDesc.getTableName()) {code}
> in the getSplits function to obtain the table directory.
> More discussion is in the comment linked below:
> https://issues.apache.org/jira/browse/HBASE-8369?focusedCommentId=14012085&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14012085
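
The path mismatch described above can be modeled in plain Java without any HBase or Hadoop dependency. This is only a rough sketch: the class TableDirSketch, its helper methods, and the region name are hypothetical, and the layout <root>/data/<namespace>/<table>/<region> is an assumption about what FSUtils.getTableDir resolves to. It is not the code from the attached patches.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class TableDirSketch {
    // Assumed on-disk layout: <root>/data/<namespace>/<table>/<encodedRegionName>.
    // This stands in for FSUtils.getTableDir(rootDir, tableName) in the sketch.
    static Path tableDir(Path rootDir, String namespace, String table) {
        return rootDir.resolve("data").resolve(namespace).resolve(table);
    }

    // A block distribution can only be computed if the region directory is
    // found directly under the directory that is treated as the table dir.
    static boolean regionDirExists(Path tableDir, String encodedRegionName) {
        return Files.isDirectory(tableDir.resolve(encodedRegionName));
    }

    public static void main(String[] args) throws IOException {
        Path rootDir = Files.createTempDirectory("hbase-root");
        Path restoreDir = Files.createTempDirectory("restore");
        String region = "region-abc"; // hypothetical encoded region name

        // The region's files live under the real root directory.
        Files.createDirectories(tableDir(rootDir, "default", "t1").resolve(region));

        // Reported behavior: restoreDir is used as the table dir, so no region dir is found.
        System.out.println("restoreDir as table dir: " + regionDirExists(restoreDir, region));
        // Proposed behavior: derive the table dir from the root directory.
        System.out.println("table dir from rootDir:  "
            + regionDirExists(tableDir(rootDir, "default", "t1"), region));
    }
}
```

In this model the first lookup finds nothing, which corresponds to the empty block distribution in the report, while the second lookup finds the region directory.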