[ 
https://issues.apache.org/jira/browse/MRUNIT-91?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237321#comment-13237321
 ] 

Brock Noland commented on MRUNIT-91:
------------------------------------

Yeah, this is an interesting one. In a map-only job your outputs are written 
directly to HDFS in the order you emit them. In a MapReduce job they are 
sorted and then delivered to the reducers. Based on that, I don't see a clear 
answer on whether we should care about the mapper's output order by default.

My feeling that the default should be false (in the new API, where we could 
change it compatibly) comes from the first time I tested a mapper with 
MapDriver: it failed due to ordering, and I was surprised. It's very much a 
gut feeling.
                
> runTest() should optionally ignore output order
> -----------------------------------------------
>
>                 Key: MRUNIT-91
>                 URL: https://issues.apache.org/jira/browse/MRUNIT-91
>             Project: MRUnit
>          Issue Type: Improvement
>    Affects Versions: 0.8.1
>            Reporter: William McNeill
>            Priority: Minor
>              Labels: order
>
> Currently MapDriver.runTest() assumes that the order of pairs emitted by the 
> mapper matches the order of the MapDriver.addOutput() calls. However, there 
> are valid mappers that for a given input pair produce output pairs whose 
> order is unspecified for testing purposes. (For example, if the mapper being 
> tested uses a set object for deduplication before emission.) runTest() cannot 
> be used to test these kinds of mappers.
> A workaround is to not use runTest() but instead put the output of run() into 
> a Set and assert that the contents of the set are correct, bypassing MRUnit's 
> validation code.
> A possible improvement would be to add a boolean orderMatters parameter to 
> MapDriver.runTest(), invoking an order-insensitive version of 
> TestDriver.validate() when orderMatters is false and the existing version 
> otherwise.
> For clarity's sake only mappers are discussed in this feature request, but 
> the same applies to reducers as well.
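
For illustration, here is a rough sketch of what an order-insensitive check 
along the lines described above could look like. It is not MRUnit code: the 
class OrderInsensitiveCheck and the helper sameIgnoringOrder() are made-up 
names, and Map.Entry stands in for the pairs returned by run(). It compares 
occurrence counts rather than plain Sets, on the assumption that duplicate 
output pairs should still be counted.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OrderInsensitiveCheck {

    // Counts how many times each element occurs, so duplicates are preserved.
    public static <T> Map<T, Integer> countOccurrences(List<T> items) {
        Map<T, Integer> counts = new HashMap<>();
        for (T item : items) {
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    // Hypothetical helper (not part of MRUnit): true when both lists hold
    // the same elements the same number of times, in any order.
    public static <T> boolean sameIgnoringOrder(List<T> expected, List<T> actual) {
        return countOccurrences(expected).equals(countOccurrences(actual));
    }

    public static void main(String[] args) {
        // Map.Entry is a stand-in for the key/value pairs a mapper emits.
        List<Map.Entry<String, Integer>> expected = Arrays.asList(
                new SimpleEntry<String, Integer>("a", 1),
                new SimpleEntry<String, Integer>("b", 1));
        List<Map.Entry<String, Integer>> actual = Arrays.asList(
                new SimpleEntry<String, Integer>("b", 1),
                new SimpleEntry<String, Integer>("a", 1));

        // Order-sensitive comparison, analogous to the current behavior.
        System.out.println("ordered match:   " + expected.equals(actual));
        // Order-insensitive comparison, analogous to orderMatters == false.
        System.out.println("unordered match: " + sameIgnoringOrder(expected, actual));
    }
}
```

Putting the output into a plain Set, as in the workaround, would treat a 
pair emitted twice the same as a pair emitted once. Counting occurrences 
avoids that, at the cost of requiring the key and value types to implement 
equals() and hashCode() consistently.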


        
