[ https://issues.apache.org/jira/browse/SQOOP-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15124564#comment-15124564 ]
Sqoop QA bot commented on SQOOP-2811:
-------------------------------------
Testing file
[SQOOP-2811.patch|https://issues.apache.org/jira/secure/attachment/12785295/SQOOP-2811.patch]
against branch sqoop2 took 1:05:57.655409.
{color:red}Overall:{color} -1 due to error(s), see details below:
{color:green}SUCCESS:{color} Clean was successful
{color:green}SUCCESS:{color} Patch applied correctly
{color:red}ERROR:{color} Patch does not add/modify any test case
{color:green}SUCCESS:{color} License check passed
{color:green}SUCCESS:{color} Patch compiled
{color:green}SUCCESS:{color} All unit tests passed (executed 1676 tests)
{color:orange}WARNING:{color} Test coverage has decreased
([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2153/artifact/patch-process/cobertura_report.txt])
* Package {{connector/connector-hdfs}} has lower test coverage: Line coverage
decreased by 5% (from 80% to 75%), Branch coverage decreased by 0% (from 59% to
59%)
{color:green}SUCCESS:{color} No new findbugs warnings
([report|https://builds.apache.org/job/PreCommit-SQOOP-Build/2153/artifact/patch-process/findbugs_report.txt])
{color:green}SUCCESS:{color} All integration tests passed (executed 190 tests)
Console output is available
[here|https://builds.apache.org/job/PreCommit-SQOOP-Build/2153/console].
This message is automatically generated.
> Sqoop2: Extracting sequence files may result in duplicates
> ----------------------------------------------------------
>
> Key: SQOOP-2811
> URL: https://issues.apache.org/jira/browse/SQOOP-2811
> Project: Sqoop
> Issue Type: Bug
> Affects Versions: 1.99.6
> Reporter: Abraham Fine
> Assignee: Abraham Fine
> Attachments: SQOOP-2811.patch
>
>
> In the HDFS extractor we use:
> {code:java}
> if (start > filereader.getPosition()) {
> filereader.sync(start); // sync to start
> }
> {code}
> to jump to the correct point in the sequence file that we want to extract.
> If the sequence file is small, multiple start points may `sync` to the same
> point and we could end up extracting the same record multiple times.
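
The behaviour described above can be illustrated with a small stand-alone sketch. It does not use the Hadoop API: {{syncTo}} is a simplified, hypothetical model of seeking to the next sync marker, and the class name, offsets, file length, and post-header position are all made-up values chosen for illustration. Under those assumptions, it shows how several split start offsets can resolve to the same read position when a file contains few sync markers.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class SyncDuplicateSketch {

    /**
     * Simplified model of seeking to a sync marker: returns the first
     * marker offset at or after position, or fileLength when no marker
     * remains. This is an assumption, not the Hadoop implementation.
     */
    static long syncTo(TreeSet<Long> syncPoints, long position, long fileLength) {
        Long next = syncPoints.ceiling(position);
        return next != null ? next : fileLength;
    }

    /**
     * Mirrors the guard quoted in the issue: only sync when the split
     * start lies beyond the reader's current position.
     */
    static long readStart(TreeSet<Long> syncPoints, long start,
                          long currentPosition, long fileLength) {
        if (start > currentPosition) {
            return syncTo(syncPoints, start, fileLength);
        }
        return currentPosition;
    }

    public static void main(String[] args) {
        // Hypothetical small file with a single sync marker at offset 2000.
        TreeSet<Long> syncPoints = new TreeSet<>(Arrays.asList(2000L));
        long fileLength = 3000L;
        long positionAfterHeader = 100L; // assumed reader position after open

        // Hypothetical split starts handed to three extractors.
        long[] splitStarts = {500L, 1000L, 1500L};
        for (long start : splitStarts) {
            long begin = readStart(syncPoints, start, positionAfterHeader, fileLength);
            System.out.println("split start " + start + " begins reading at " + begin);
        }
    }
}
```

In this model all three split starts resolve to offset 2000, so each extractor would begin at the same record unless something else prevents the overlap.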