[
https://issues.apache.org/jira/browse/HCATALOG-373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268847#comment-13268847
]
[email protected] commented on HCATALOG-373:
--------------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4971/
-----------------------------------------------------------
(Updated 2012-05-05 01:23:50.756124)
Review request for hcatalog and Francis Liu.
Changes
-------
Patch updated to include a test verifying that counters exist when input has
been read.
I initially set out to add a test specific to this functionality, but it
overlapped significantly with existing tests, so I instead updated an existing
test to check that the bytes-read counter is set when data has actually been
read. This seems like the best approach because ongoing maintenance will be
low. The check includes a comment explaining why it exists, to help future
developers.
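As a rough illustration of the kind of check described above (this is not the
code in HCatMapReduceTest), the sketch below asserts on a bytes-read counter
only when records were actually read. JobCounters, CounterCheck, readAll,
verifyBytesRead and the counter name BYTES_READ are hypothetical stand-ins
invented for this example; they are not Hadoop or HCatalog classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a job's counters; NOT the Hadoop Counters class.
class JobCounters {
    private final Map<String, Long> values = new HashMap<>();

    void increment(String name, long amount) {
        values.merge(name, amount, Long::sum);
    }

    long get(String name) {
        return values.getOrDefault(name, 0L);
    }
}

class CounterCheck {
    // Counter name is an assumption made for this sketch.
    static final String BYTES_READ = "BYTES_READ";

    // Simulates a reader that reports the bytes it reads through the counters.
    static int readAll(String[] records, JobCounters counters) {
        int count = 0;
        for (String record : records) {
            counters.increment(BYTES_READ,
                    record.getBytes(StandardCharsets.UTF_8).length);
            count++;
        }
        return count;
    }

    // The check only applies when data was read: a zero counter after a
    // non-empty read suggests the reporter was never wired to the reader.
    static void verifyBytesRead(int recordsRead, JobCounters counters) {
        if (recordsRead > 0 && counters.get(BYTES_READ) <= 0) {
            throw new AssertionError(
                    "records were read but " + BYTES_READ + " was not set");
        }
    }
}
```

Guarding on the number of records read keeps the check from failing for jobs
whose input is legitimately empty.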
Suggestions from the previous review have also been incorporated. Specifically,
we now reuse hcatSplit instead of recasting. The loop that set properties was
unused, so I removed it.
Summary
-------
Update ProgressReporter to work with both the old (mapred) and new (mapreduce)
MapReduce APIs. Delay creating the base record reader until a StatusReporter is
available, so that counters can be used.
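The general shape of a reporter usable through both APIs can be sketched as
follows. This is a minimal illustration, not the patch itself: OldApiReporter
and NewApiStatusReporter are simplified, hypothetical stand-ins for
org.apache.hadoop.mapred.Reporter and
org.apache.hadoop.mapreduce.StatusReporter (the real types have more methods
and different signatures), and FakeTaskContext and DualApiReporter are likewise
invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical stand-in for the old-API reporter interface.
interface OldApiReporter {
    void incrCounter(String group, String name, long amount);
    void progress();
}

// Simplified, hypothetical stand-in for the new-API abstract reporter class.
abstract class NewApiStatusReporter {
    public abstract long getCounterValue(String group, String name);
    public abstract void progress();
}

// Hypothetical stand-in for the task context that owns the counters.
class FakeTaskContext {
    private final Map<String, Long> counters = new HashMap<>();
    int progressCalls = 0;

    void increment(String group, String name, long amount) {
        counters.merge(group + ":" + name, amount, Long::sum);
    }

    long value(String group, String name) {
        return counters.getOrDefault(group + ":" + name, 0L);
    }
}

// One object that can be handed to code expecting either reporter type.
// Both views delegate to the same context, so counters stay consistent.
class DualApiReporter extends NewApiStatusReporter implements OldApiReporter {
    private final FakeTaskContext context;

    DualApiReporter(FakeTaskContext context) {
        this.context = context;
    }

    @Override
    public void incrCounter(String group, String name, long amount) {
        context.increment(group, name, amount);
    }

    @Override
    public long getCounterValue(String group, String name) {
        return context.value(group, name);
    }

    // A single implementation satisfies both the interface and the
    // abstract class, since the signatures match.
    @Override
    public void progress() {
        context.progressCalls++;
    }
}
```

Because Java allows one superclass plus any number of interfaces, extending
the abstract class and implementing the interface is enough for a single
instance to be passed wherever either type is expected.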
This addresses bug HCATALOG-373.
https://issues.apache.org/jira/browse/HCATALOG-373
Diffs (updated)
-----
src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e
src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4
src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081
src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd
src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0
Diff: https://reviews.apache.org/r/4971/diff
Testing
-------
"ant clean test" passes.
I can now run Pig + HCatalog queries using the Elephant-Bird deprecated-API
wrappers, which is the use case that originally surfaced this issue.
Thanks,
Travis
> ProgressReporter should work with both old and new MR API
> ---------------------------------------------------------
>
> Key: HCATALOG-373
> URL: https://issues.apache.org/jira/browse/HCATALOG-373
> Project: HCatalog
> Issue Type: Bug
> Reporter: Travis Crawford
> Assignee: Travis Crawford
> Attachments: HCATALOG-373_progress_reporter.diff,
> HCATALOG-373_progress_reporter_2.diff, HCATALOG-373_progress_reporter_3.diff,
> HCATALOG-373_progress_reporter_4.diff
>
>
> {{org.apache.hcatalog.mapreduce.ProgressReporter}} currently implements
> {{org.apache.hadoop.mapred.Reporter}}. It should also extend
> {{org.apache.hadoop.mapreduce.StatusReporter}} so it works with code
> expecting either an old or new API reporter.
> The use case is using a wrapper so a serde works with a new-API input format.
> https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/mapred/input/DeprecatedInputFormatWrapper.java#L163