[ 
https://issues.apache.org/jira/browse/HCATALOG-373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268847#comment-13268847
 ] 

[email protected] commented on HCATALOG-373:
--------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4971/
-----------------------------------------------------------

(Updated 2012-05-05 01:23:50.756124)


Review request for hcatalog and Francis Liu.


Changes
-------

Patch updated to include a test verifying that counters exist once input has
been read. I initially set out to add a test specific to this functionality,
but it overlapped significantly with existing tests, so I updated an existing
test to check that the bytes-read counter is set when data has actually been
read. This seems like the best approach because ongoing maintenance will be
low. The check includes a comment explaining why it exists, to help future
developers.

Suggestions from the previous review have also been incorporated. Specifically,
we reuse hcatSplit instead of recasting. Because the loop that set properties
is unused, I removed that section.


Summary
-------

Update ProgressReporter to work with both the old and new mapreduce APIs.
Delay creating the base record reader until a StatusReporter is available, so
that counters can be used.
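
As a rough illustration of the idea only (this is not the code in the patch):
in Hadoop the old-API reporter is an interface and the new-API reporter is an
abstract class, so one class can implement the former and extend the latter.
The sketch below uses simplified stand-in types so it compiles without Hadoop
on the classpath. OldApiReporter, NewApiStatusReporter, DualReporter and
getCounterValue are hypothetical names invented for this example, not the
real Hadoop or HCatalog signatures.

```java
import java.util.HashMap;
import java.util.Map;

public class DualApiReporterSketch {

    /** Hypothetical stand-in for the old-API reporter, which is an interface. */
    interface OldApiReporter {
        void setStatus(String status);
        void progress();
        void incrCounter(String group, String name, long amount);
    }

    /** Hypothetical stand-in for the new-API reporter, which is an abstract class. */
    abstract static class NewApiStatusReporter {
        public abstract void setStatus(String status);
        public abstract void progress();
        public abstract long getCounterValue(String group, String name);
    }

    /** One object usable by code that expects either reporter type. */
    static class DualReporter extends NewApiStatusReporter implements OldApiReporter {
        private final Map<String, Long> counters = new HashMap<>();
        private String status = "";
        private int progressCalls = 0;

        // A single implementation satisfies both the interface and the abstract class.
        @Override public void setStatus(String status) { this.status = status; }
        @Override public void progress() { progressCalls++; }

        @Override public void incrCounter(String group, String name, long amount) {
            counters.merge(group + ":" + name, amount, Long::sum);
        }

        @Override public long getCounterValue(String group, String name) {
            return counters.getOrDefault(group + ":" + name, 0L);
        }

        String getStatus() { return status; }
        int getProgressCalls() { return progressCalls; }
    }

    public static void main(String[] args) {
        DualReporter reporter = new DualReporter();
        OldApiReporter oldView = reporter;        // what old-API code would see
        NewApiStatusReporter newView = reporter;  // what new-API code would see

        oldView.incrCounter("io", "bytesRead", 128L);
        newView.setStatus("reading");
        System.out.println(newView.getCounterValue("io", "bytesRead"));
    }
}
```

Both views share the same underlying state, so a counter incremented through
the old-API view is visible through the new-API view.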


This addresses bug HCATALOG-373.
    https://issues.apache.org/jira/browse/HCATALOG-373


Diffs (updated)
-----

  src/java/org/apache/hcatalog/mapreduce/HCatBaseInputFormat.java 268167e 
  src/java/org/apache/hcatalog/mapreduce/HCatRecordReader.java 65f96f4 
  src/java/org/apache/hcatalog/mapreduce/InternalUtil.java 1837081 
  src/java/org/apache/hcatalog/mapreduce/ProgressReporter.java fb379cd 
  src/test/org/apache/hcatalog/mapreduce/HCatMapReduceTest.java f3d07a0 

Diff: https://reviews.apache.org/r/4971/diff


Testing
-------

"ant clean test" passes

I can also run pig+hcatalog queries through the Elephant-Bird deprecated API
wrappers, which is the use case that originally surfaced this issue.


Thanks,

Travis


                
> ProgressReporter should work with both old and new MR API
> ---------------------------------------------------------
>
>                 Key: HCATALOG-373
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-373
>             Project: HCatalog
>          Issue Type: Bug
>            Reporter: Travis Crawford
>            Assignee: Travis Crawford
>         Attachments: HCATALOG-373_progress_reporter.diff, 
> HCATALOG-373_progress_reporter_2.diff, HCATALOG-373_progress_reporter_3.diff, 
> HCATALOG-373_progress_reporter_4.diff
>
>
> {{org.apache.hcatalog.mapreduce.ProgressReporter}} currently implements 
> {{org.apache.hadoop.mapred.Reporter}}. It should also extend 
> {{org.apache.hadoop.mapreduce.StatusReporter}} so it works with code 
> expecting either an old or new API reporter.
> The use case is using a wrapper so a serde works with a new-API input format.
> https://github.com/kevinweil/elephant-bird/blob/master/src/java/com/twitter/elephantbird/mapred/input/DeprecatedInputFormatWrapper.java#L163

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
