[
https://issues.apache.org/jira/browse/HBASE-14442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14900513#comment-14900513
]
Nathan commented on HBASE-14442:
--------------------------------
Can I just add a unit test to TestMultiTableInputFormat.java that calls
getSplits directly, without starting an MR job? Something like this:
@Test
public void testScanA5ToA5Splits() throws IOException,
    InterruptedException, ClassNotFoundException {
  String start = "aaaaa";
  String stop = "aaaaa";
  String jobName = "ScanA5ToA5Splits";
  Configuration c = new Configuration(TEST_UTIL.getConfiguration());
  List<Scan> scans = new ArrayList<Scan>();
  for (String tableName : TABLES) {
    Scan scan = new Scan();
    scan.addFamily(INPUT_FAMILY);
    scan.setAttribute(Scan.SCAN_ATTRIBUTES_TABLE_NAME,
        Bytes.toBytes(tableName));
    if (start != null) {
      scan.setStartRow(Bytes.toBytes(start));
    }
    if (stop != null) {
      scan.setStopRow(Bytes.toBytes(stop));
    }
    scans.add(scan);
    LOG.info("scan before: " + scan);
  }
  Job job = new Job(c, jobName);
  initJob(scans, job);
  job.setReducerClass(ScanReducer.class);
  job.setNumReduceTasks(1); // one to get final "first" and "last" key
  FileOutputFormat.setOutputPath(job, new Path(job.getJobName()));
  MultiTableInputFormat format = new MultiTableInputFormat();
  // A directly constructed format is not configured by the MR framework,
  // so hand it the job configuration that carries the scans.
  format.setConf(job.getConfiguration());
  // startRow equals stopRow (both in the same region), so there should be
  // exactly one split per scan.
  Assert.assertEquals(scans.size(), format.getSplits(job).size());
}
> MultiTableInputFormatBase.getSplits does not build split for a scan whose
> startRow=stopRow=(startRow of a region)
> ----------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-14442
> URL: https://issues.apache.org/jira/browse/HBASE-14442
> Project: HBase
> Issue Type: Bug
> Components: mapreduce
> Affects Versions: 1.1.2
> Reporter: Nathan
> Assignee: Nathan
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> I created a Scan whose startRow and stopRow are both equal to a region's
> startRow, and found that no map task was built for it.
> The following is the source code of the condition:
> (startRow.length == 0 || keys.getSecond()[i].length == 0 ||
> Bytes.compareTo(startRow, keys.getSecond()[i]) < 0) &&
> (stopRow.length == 0 || Bytes.compareTo(stopRow,
> keys.getFirst()[i]) > 0)
> I think a "=" should be added, so that a stopRow equal to a region's
> startRow still matches that region.
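
To make the reported behaviour concrete, here is a minimal, self-contained sketch of the quoted condition evaluated against a hypothetical table with three regions. It does not use HBase: the `compare` helper stands in for `Bytes.compareTo`, and the class name, region boundaries, and `inclusiveStop` flag are assumptions made for illustration only. The flag switches the second comparison from `> 0` to `>= 0`, which is one reading of the proposed "=".

```java
import java.nio.charset.StandardCharsets;

class SplitConditionSketch {
    // Unsigned lexicographic comparison, standing in for HBase's Bytes.compareTo.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // The condition quoted in the issue; inclusiveStop = true applies the assumed fix.
    static boolean regionMatches(byte[] startRow, byte[] stopRow,
                                 byte[] regionStart, byte[] regionEnd,
                                 boolean inclusiveStop) {
        boolean startBeforeRegionEnd = startRow.length == 0
            || regionEnd.length == 0
            || compare(startRow, regionEnd) < 0;
        int cmp = compare(stopRow, regionStart);
        boolean stopAfterRegionStart = stopRow.length == 0
            || (inclusiveStop ? cmp >= 0 : cmp > 0);
        return startBeforeRegionEnd && stopAfterRegionStart;
    }

    // Counts matching regions in a hypothetical table split at "aaaaa" and "bbbbb".
    static int countSplits(String start, String stop, boolean inclusiveStop) {
        String[] regionStarts = {"", "aaaaa", "bbbbb"};
        String[] regionEnds = {"aaaaa", "bbbbb", ""};
        int count = 0;
        for (int i = 0; i < regionStarts.length; i++) {
            if (regionMatches(bytes(start), bytes(stop),
                    bytes(regionStarts[i]), bytes(regionEnds[i]), inclusiveStop)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Scan with startRow = stopRow = start key of the middle region.
        System.out.println(countSplits("aaaaa", "aaaaa", false)); // condition as quoted
        System.out.println(countSplits("aaaaa", "aaaaa", true));  // with ">=" on the stopRow check
    }
}
```

In this sketch the quoted condition matches no region for the reported scan, while the `>=` variant matches exactly the region that starts at "aaaaa". The trade-off is that the `>=` variant also matches that region for an ordinary range scan whose exclusive stopRow is "aaaaa" (for example start "a", stop "aaaaa"), producing a split that returns no rows, so a real fix may want to special-case startRow equal to stopRow.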