[
https://issues.apache.org/jira/browse/HBASE-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597117#comment-14597117
]
Hadoop QA commented on HBASE-13702:
-----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12741191/HBASE-13702-v4.patch
against master branch at commit d51a184051d968dc3bdc00b1c9257c0a9e5ff8a6.
ATTACHMENT ID: 12741191
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:green}+1 tests included{color}. The patch appears to include 7 new
or modified tests.
{color:green}+1 hadoop versions{color}. The patch compiles with all
supported hadoop versions (2.4.1 2.5.2 2.6.0)
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 protoc{color}. The applied patch does not increase the
total number of protoc compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.
{color:green}+1 checkstyle{color}. The applied patch does not increase the
total number of checkstyle errors
{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 2.0.3) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100
{color:green}+1 site{color}. The mvn post-site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
org.apache.hadoop.hbase.util.TestProcessBasedCluster
org.apache.hadoop.hbase.mapreduce.TestImportExport
Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/14517//testReport/
Release Findbugs (version 2.0.3) warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/14517//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors:
https://builds.apache.org/job/PreCommit-HBASE-Build/14517//artifact/patchprocess/checkstyle-aggregate.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/14517//console
This message is automatically generated.
> ImportTsv: Add dry-run functionality and log bad rows
> -----------------------------------------------------
>
> Key: HBASE-13702
> URL: https://issues.apache.org/jira/browse/HBASE-13702
> Project: HBase
> Issue Type: New Feature
> Reporter: Apekshit Sharma
> Assignee: Apekshit Sharma
> Attachments: HBASE-13702-v2.patch, HBASE-13702-v3.patch,
> HBASE-13702-v4.patch, HBASE-13702.patch
>
>
> The ImportTsv job skips bad records by default (though it keeps a count).
> -Dimporttsv.skip.bad.lines=false can be used to make the job fail as soon
> as a bad row is encountered.
> Being able to easily determine which rows in an input are corrupted, rather
> than failing on one row at a time, seems like a good feature to have.
> Moreover, tools of this kind should have 'dry-run' functionality, which
> essentially does a quick run of the tool without making any changes while
> still reporting any errors/warnings and success/failure.
> To identify corrupted rows, simply logging them should be enough. In the
> worst case, all rows will be logged and the size of the logs will be the same
> as the input size, which seems fine. However, the user might have to do some
> work figuring out where the logs are. Is there some link we can show to the
> user when the tool starts which can help them with that?
> For the dry run, we can simply use if-else to skip over writing out KVs, and
> any other mutations, if present.
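
The two ideas in the description above (log every bad row instead of failing
on the first one, and guard the write path with an if-else for a dry run) can
be sketched as follows. This is only an illustration under stated assumptions,
not the code in the attached patch: the class DryRunSketch, the Result holder,
the run() method and the expectedColumns parameter are all hypothetical, and an
in-memory list stands in for writing KVs to HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an import job; no HBase classes are used.
final class DryRunSketch {

    /** Outcome of one pass over the input. */
    static final class Result {
        final List<String> written = new ArrayList<>();   // rows that were "written"
        final List<String> badLines = new ArrayList<>();  // rows that failed parsing
        int goodCount = 0;                                // rows that parsed cleanly
    }

    /**
     * Parses tab-separated lines. Bad rows are logged and collected rather
     * than aborting the run. When dryRun is true, the write step is skipped.
     */
    static Result run(List<String> lines, int expectedColumns, boolean dryRun) {
        Result result = new Result();
        for (String line : lines) {
            String[] fields = line.split("\t", -1);
            // Assumption: a row is "bad" if the column count is wrong
            // or the first field (the row key) is empty.
            if (fields.length != expectedColumns || fields[0].isEmpty()) {
                System.err.println("Bad line: " + line);
                result.badLines.add(line);
                continue;
            }
            result.goodCount++;
            if (!dryRun) {
                // A real job would emit KVs here; the sketch records the row.
                result.written.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("row1\tvalue1", "malformed", "row2\tvalue2");
        Result r = run(input, 2, true);
        System.out.println("good=" + r.goodCount
            + " bad=" + r.badLines.size()
            + " written=" + r.written.size());
    }
}
```

Running the same input with dryRun set to false takes the write branch for the
two well-formed rows, while the malformed row is still only logged and counted.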
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)