[ https://issues.apache.org/jira/browse/HBASE-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512859#comment-14512859 ]
Hadoop QA commented on HBASE-13356:
-----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12728190/HBASE-13356.patch
against master branch at commit cd83d39fb4f50db901b699ba5470b5f709c95c69.
ATTACHMENT ID: 12728190
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:green}+1 tests included{color}. The patch appears to include 12 new
or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all
supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the
total number of protoc compiler warnings.
{color:red}-1 javadoc{color}. The javadoc tool appears to have generated 4
warning messages.
{color:red}-1 checkstyle{color}. The applied patch generated
1965 checkstyle errors (more than the master's current 1900 errors).
{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 2.0.3) warnings.
{color:red}-1 release audit{color}. The applied patch generated 7 release
audit warnings (more than the master's current 0 warnings).
{color:red}-1 lineLengths{color}. The patch introduces the following lines
longer than 100:
+ * MultiTableSnapshotInputFormat generalizes {@link org.apache.hadoop.hbase.mapred.TableSnapshotInputFormat}
+ * allowing a MapReduce job to run over one or more table snapshots, with one or more scans configured for each.
+ * Internally, the input format delegates to {@link org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat}
+ * and thus has the same performance advantages; see {@link org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat} for
+ * Usage is similar to TableSnapshotInputFormat, with the following exception: initMultiTableSnapshotMapperJob takes in a map
+ * from snapshot name to a collection of scans. For each snapshot in the map, each corresponding scan will be applied;
+ * the overall dataset for the job is defined by the concatenation of the regions and tables included in each snapshot/scan
+ * {@link org.apache.hadoop.hbase.mapred.TableMapReduceUtil#initMultiTableSnapshotMapperJob(Map, Class, Class, Class, JobConf, boolean, Path)}
+ * Internally, this input format restores each snapshot into a subdirectory of the given tmp directory. Input splits and
+ * record readers are created as described in {@link org.apache.hadoop.hbase.mapreduce.TableSnapshotInputFormat}
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:green}+1 core tests{color}. The patch passed unit tests in .
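
For anyone wanting to reproduce the lineLengths result locally before resubmitting, the check can be approximated with a few lines of Java. This is only a sketch under stated assumptions: the class name PatchLineLengthCheck and its method are hypothetical, and whether the real precommit script counts the leading "+" or skips file headers the same way is not confirmed by this report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the HBase precommit tooling.
final class PatchLineLengthCheck {

  /** Limit quoted in the report above. */
  static final int MAX_LINE_LENGTH = 100;

  /**
   * Returns the added lines of a unified diff whose content (without the
   * leading '+') is longer than maxLength characters.
   */
  static List<String> tooLongAddedLines(List<String> patchLines, int maxLength) {
    List<String> offenders = new ArrayList<>();
    for (String line : patchLines) {
      // Added lines start with '+'; "+++" marks a file header, not content.
      boolean isAdded = line.startsWith("+") && !line.startsWith("+++");
      if (isAdded && line.substring(1).length() > maxLength) {
        offenders.add(line);
      }
    }
    return offenders;
  }

  public static void main(String[] args) {
    StringBuilder longLine = new StringBuilder("+ * ");
    for (int i = 0; i < 120; i++) {
      longLine.append('x');
    }
    List<String> patch = Arrays.asList(
        "+++ b/Example.java",
        "+ * a short javadoc line",
        longLine.toString());
    // Print every offending line, in the order it appears in the patch.
    for (String offender : tooLongAddedLines(patch, MAX_LINE_LENGTH)) {
      System.out.println(offender);
    }
  }
}
```

Wrapping the flagged javadoc lines so that each stays within the limit should clear this item.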
Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/13816//testReport/
Release audit warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/patchReleaseAuditWarnings.txt
Release Findbugs (version 2.0.3) warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors:
https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/checkstyle-aggregate.html
Javadoc warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/13816//artifact/patchprocess/patchJavadocWarnings.txt
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/13816//console
This message is automatically generated.
> HBase should provide an InputFormat supporting multiple scans in mapreduce
> jobs over snapshots
> ----------------------------------------------------------------------------------------------
>
> Key: HBASE-13356
> URL: https://issues.apache.org/jira/browse/HBASE-13356
> Project: HBase
> Issue Type: New Feature
> Components: mapreduce
> Reporter: Andrew Mains
> Assignee: Andrew Mains
> Priority: Minor
> Attachments: HBASE-13356.patch
>
>
> Currently, HBase supports the pushing of multiple scans to mapreduce jobs
> over live tables (via MultiTableInputFormat) but only supports a single scan
> for mapreduce jobs over table snapshots. It would be handy to support
> multiple scans over snapshots as well, probably through another input format
> (MultiTableSnapshotInputFormat?). To mimic the functionality present in
> MultiTableInputFormat, the new input format would likely have to take in the
> names of all snapshots used in addition to the scans.
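
The shape of the proposed input (a map from snapshot name to a collection of scans, with every scan applied to its snapshot and the results concatenated) can be illustrated without any HBase dependency. The sketch below is illustrative only: SnapshotScanPlan and ScanRange are hypothetical stand-ins, not HBase classes, and a real job would pass actual Scan objects to the initMultiTableSnapshotMapperJob method named in the patch's javadoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these types are HBase classes.
final class SnapshotScanPlan {

  /** Hypothetical stand-in for a scan: just a start/stop row range. */
  static final class ScanRange {
    final String startRow;
    final String stopRow;

    ScanRange(String startRow, String stopRow) {
      this.startRow = startRow;
      this.stopRow = stopRow;
    }
  }

  /**
   * Expands a snapshot-name-to-scans map into one work unit per
   * (snapshot, scan) pair, following the map's iteration order.
   */
  static List<String> expand(Map<String, ? extends Collection<ScanRange>> snapshotScans) {
    List<String> units = new ArrayList<>();
    for (Map.Entry<String, ? extends Collection<ScanRange>> entry : snapshotScans.entrySet()) {
      for (ScanRange scan : entry.getValue()) {
        units.add(entry.getKey() + ":[" + scan.startRow + "," + scan.stopRow + ")");
      }
    }
    return units;
  }

  public static void main(String[] args) {
    // LinkedHashMap keeps insertion order so the output is stable.
    Map<String, List<ScanRange>> snapshotScans = new LinkedHashMap<>();
    snapshotScans.put("snapshot1",
        Arrays.asList(new ScanRange("a", "f"), new ScanRange("m", "q")));
    snapshotScans.put("snapshot2",
        Arrays.asList(new ScanRange("a", "z")));
    for (String unit : expand(snapshotScans)) {
      System.out.println(unit);
    }
  }
}
```

Keying the map by snapshot name mirrors the description's point that the input format has to be told which snapshots are involved in addition to the scans themselves.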
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)