[ 
https://issues.apache.org/jira/browse/MAPREDUCE-5670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877847#comment-13877847
 ] 

Jason Lowe commented on MAPREDUCE-5670:
---------------------------------------

Thanks for picking this up, Chen!  Comments on the patch:

* org.apache.hadoop.mapred.lib.db.TestCombineFileRecordReader is in the wrong 
package, should be org.apache.hadoop.mapred.lib.TestCombineFileRecordReader
* There's an unused Mockito import in one of the tests, and it seems like 
Mockito could save quite a bit of test code.  A mocked reporter with 
verify(mockReporter).progress() to verify the progress method was called would 
accomplish the same with much less code.  Did that not work for some reason?  
The unused import implies that approach was probably attempted, so curious if 
that's the case why it didn't work out.
* The original bug involves a bunch of empty files, so the test doesn't need to 
write any data to the test files and all the lengths should be zero.  Bonus 
points if the test calls next() on the record reader, verifies it returns false 
and verifies progress was called the correct number of times.
* There's an unnecessary try...finally block in testInitNextRecordReader

> CombineFileRecordReader should report progress when moving to the next file
> ---------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5670
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5670
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.9
>            Reporter: Jason Lowe
>            Assignee: Chen He
>            Priority: Minor
>         Attachments: MR-5670.patch
>
>
> If a combine split consists of many "empty" files (i.e.: no record found by 
> the underlying record reader) then theoretically a task can timeout due to 
> lack of reported progress.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Reply via email to