[jira] [Commented] (MRUNIT-69) new mrunit api

Bertrand Dechoux (JIRA) Wed, 15 Aug 2012 03:57:42 -0700

    [ 
https://issues.apache.org/jira/browse/MRUNIT-69?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13434963#comment-13434963
 ]


Bertrand Dechoux commented on MRUNIT-69:
----------------------------------------

I do have questions about the annotation-based API (assuming there is no need 
to extend a provided parent test class).

1) How do you intend to check input/output types? I don't see how that can 
be done without exposing a context (like the typed driver or a parent class). 
Would it no longer be enforced?

2) How do you intend to check for methods which should not be called, e.g. 
withKeyGroupingComparator during a reducer test? I don't see how that can be 
done without exposing a context. Would it no longer be enforced?

3) I like the idea of an in-memory FileSystem implementation. I know Cascading 2 
now has a local mode, which sounds a bit similar. But it might be too abstract, 
or too strongly tied to Cascading concepts, to be of any use to MRUnit. This 
feature would also be a must for Pig/Hive when you want to run the same 
query locally, without the cluster latency. So it would be interesting to 
see whether something like this already exists around Hadoop.
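
To make questions 1) and 2) concrete, here is a minimal sketch in plain Java. 
All class names are hypothetical and none of this is MRUnit's actual API. It 
only contrasts a typed driver, where the compiler checks key/value types and 
an inapplicable method simply does not exist, with an untyped configuration 
(as an annotation-driven runner might hold it), where the same misuse can 
only be rejected at run time.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BiFunction;

/** Hypothetical typed driver for a reducer-only test; not MRUnit's real class. */
final class ReducerTestSketch<K, V, R> {
    private final BiFunction<K, List<V>, R> reducer;
    private K key;
    private List<V> values = new ArrayList<>();

    ReducerTestSketch(BiFunction<K, List<V>, R> reducer) {
        this.reducer = reducer;
    }

    // The generic parameters turn a mistyped key or value into a compile error.
    ReducerTestSketch<K, V, R> withInput(K key, List<V> values) {
        this.key = key;
        this.values = new ArrayList<>(values);
        return this;
    }

    // There is deliberately no withKeyGroupingComparator method here, so the
    // call cannot even be written for a reducer-only test.
    R run() {
        return reducer.apply(key, values);
    }
}

/** Hypothetical untyped configuration, standing in for an annotation-based setup. */
final class UntypedTestSketch {
    private final boolean reducerOnly;

    UntypedTestSketch(boolean reducerOnly) {
        this.reducerOnly = reducerOnly;
    }

    // Without a typed context, the misuse is only detected when the test runs.
    UntypedTestSketch withKeyGroupingComparator(Comparator<?> comparator) {
        if (reducerOnly) {
            throw new IllegalStateException(
                "grouping comparator is not used in a reducer-only test");
        }
        return this;
    }
}
```

In the first class the enforcement comes for free from the type system; in 
the second it has to be reimplemented as run-time checks, which is the 
trade-off the two questions above are asking about.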
                
> new mrunit api
> --------------
>
>                 Key: MRUNIT-69
>                 URL: https://issues.apache.org/jira/browse/MRUNIT-69
>             Project: MRUnit
>          Issue Type: Umbrella
>    Affects Versions: 0.8.1
>            Reporter: Jim Donofrio
>            Assignee: Jim Donofrio
>
> So I am curious what the plan is for the long-term future of MRUNIT?
> I think currently MRUNIT is useful for just unit testing a single mapper or 
> reducer but currently there is a void for testing more complicated features 
> such as MultipleInputs, MultipleOutputs, a driver class, counters, among 
> other things. I wonder if instead of adding support to the current MRUNIT 
> framework for these extra features it would be more useful to add in hooks to 
> the existing LocalJobRunner and MiniMRCluster classes to provide methods to 
> more easily verify file output from text files, sequence files, etc. This 
> would allow MRUNIT to test driver classes, MultipleInputs, MultipleOutputs, 
> etc. MRUNIT would also then test against the real hadoop code instead of an 
> implementation that mimics hadoop which can miss some bugs such as the 
> ReduceDriver that did not reuse the same object until 0.8.0. MRUNIT would 
> also keep up with new map reduce features instead of us having to implement 
> fake versions of them.
> I understand that performance would be an issue due to the file I/O, but I 
> wonder how fast the LocalJobRunner would be if we wrote a new class 
> extending FileSystem to allow users to write out fake files to memory and 
> make the LocalJobRunner read from them.
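
The in-memory file idea in the description above could look roughly like the 
following sketch. It is plain Java with hypothetical names and deliberately 
does not extend Hadoop's FileSystem, so it stays self-contained. A real 
version would have to implement that class's abstract methods and be wired 
into the job configuration, which is not shown here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory file store; only an assumption about the general shape. */
final class InMemoryFileStoreSketch {
    // Path -> file contents; nothing ever touches the disk.
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();

    // The written bytes become visible under the path once the stream is closed.
    OutputStream create(final String path) {
        return new ByteArrayOutputStream() {
            @Override
            public void close() {
                files.put(path, toByteArray());
            }
        };
    }

    // Reads back a previously written file, or fails like a missing file would.
    InputStream open(String path) throws FileNotFoundException {
        byte[] data = files.get(path);
        if (data == null) {
            throw new FileNotFoundException(path);
        }
        return new ByteArrayInputStream(data);
    }

    boolean exists(String path) {
        return files.containsKey(path);
    }
}
```

A test would write its fake input through create(...), run the job, and then 
verify the output by reading it back with open(...), all without file I/O.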

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
