[
https://issues.apache.org/jira/browse/SQOOP-671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13492875#comment-13492875
]
Jarek Jarcec Cecho commented on SQOOP-671:
------------------------------------------
Hi Hari,
Thank you for your question. Please accept my apology for not being descriptive
enough in the first place. We're currently overriding Mapper.run() and
bypassing the usual record handling through context.write(). As a result, we're
missing the mapreduce counters that are generated during the default process.
Other counters should be intact (like the number of spawned mappers, the number
of spawned reducers, ...). This means that we're currently not able to tell how
many records (rows) we've imported or how many bytes we've transferred. That's
what I meant by "some counters are lost" - it should have been "some default
mapreduce counters are lost".
The reason is the current implementation of the mapreduce execution (not
submission) engine, and I believe that it needs to be fixed there. It's
completely independent of how you're submitting the job to the cluster (and
thus independent of the submission engine). Moreover, there is already a
callback in the submission engine that queries counters after a given
submission finishes, but it always returns null at the moment. Please note
that this "second" issue is covered by SQOOP-678.
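To illustrate the execution-engine side, here is a rough sketch of the idea
only, not Sqoop's actual code. It assumes records are handed from a reader
thread to a writer through a queue, so nothing passes through context.write(),
and the writer keeps its own tallies. The class name CounterSketch, the method
transfer(), and the counter group/name in the trailing comment are all made up
for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: count records and bytes ourselves while data is passed
// between threads, so the totals can be reported as counters afterwards.
public class CounterSketch {
    // Sentinel marking end of input; compared by identity, never by value.
    private static final String END = new String("<end>");

    public final AtomicLong records = new AtomicLong();
    public final AtomicLong bytes = new AtomicLong();

    public void transfer(List<String> rows) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
        // Reader thread: stands in for the side that extracts the data.
        Thread reader = new Thread(() -> {
            try {
                for (String row : rows) {
                    queue.put(row);
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        reader.start();
        try {
            // Writer side: tally every row that actually arrives.
            for (String row = queue.take(); row != END; row = queue.take()) {
                records.incrementAndGet();
                bytes.addAndGet(row.getBytes(StandardCharsets.UTF_8).length);
            }
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("transfer interrupted", e);
        }
        // Inside a Mapper.run(Context context) override, the totals could be
        // published here through Hadoop's counter API, for example:
        //   context.getCounter("SqoopSketch", "ROWS").increment(records.get());
        // (the group and counter names above are assumptions of this sketch).
    }
}
```

The tallies live in plain AtomicLong fields so that the threading code does not
need to touch the Hadoop context; only the thread running Mapper.run() would
publish them at the end.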
Jarcec
> Mapreduce counters are not used in generated mapreduce jobs
> -----------------------------------------------------------
>
> Key: SQOOP-671
> URL: https://issues.apache.org/jira/browse/SQOOP-671
> Project: Sqoop
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Jarek Jarcec Cecho
> Fix For: 2.0.0
>
>
> As we're using threads to pass data instead of the Hadoop native way, we're
> losing some counters (bytes written, number of entries) that might be of
> interest to the end user. We should propagate those counters ourselves.