[ 
https://issues.apache.org/jira/browse/PIG-72?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564573#action_12564573
 ] 

Xu Zhang commented on PIG-72:
-----------------------------

This patch overhauls how the local mini clusters are set up and torn down 
compared with the 1st version of the patch.  We now start the local mini 
clusters in the setUp() method and shut them down in the tearDown() method.

Any Pig unit test case class that wants to use the local mini clusters needs 
to extend the PigTestCase abstract class and call super.setUp() from its own 
setUp() method and super.tearDown() from its own tearDown() method.  Also, 
make sure PigServer is instantiated only after super.setUp() has been called.
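
The lifecycle described above might look roughly like the sketch below.  Note 
that PigTestCase and FakePigServer here are simplified stand-ins written only 
for this example, so that it compiles without Pig, Hadoop or JUnit on the 
classpath; the real PigTestCase ships with the patch and its internals are not 
reproduced here.

```java
// Sketch only: stand-in classes, not the code from the patch.
public class MiniClusterLifecycleSketch {

    // Hypothetical stand-in for the patch's PigTestCase abstract class.
    static abstract class PigTestCase {
        protected boolean clusterRunning = false;

        protected void setUp() {
            // The real class would start the local mini clusters here.
            clusterRunning = true;
        }

        protected void tearDown() {
            // The real class would shut the local mini clusters down here.
            clusterRunning = false;
        }
    }

    // Hypothetical placeholder for PigServer.
    static class FakePigServer {
        final String execType;

        FakePigServer(String execType) {
            this.execType = execType;
        }
    }

    static class TestWhatEver extends PigTestCase {
        FakePigServer pig;

        @Override
        protected void setUp() {
            super.setUp();                          // start the mini clusters first
            pig = new FakePigServer("mapreduce");   // only then create the server
        }

        @Override
        protected void tearDown() {
            pig = null;
            super.tearDown();                       // shut the mini clusters down
        }
    }

    public static void main(String[] args) {
        TestWhatEver test = new TestWhatEver();
        test.setUp();
        System.out.println("cluster running after setUp: " + test.clusterRunning);
        test.tearDown();
        System.out.println("cluster running after tearDown: " + test.clusterRunning);
    }
}
```

In a real test class the JUnit runner calls setUp() and tearDown() around each 
test method, so the main() method above is only there to make the sketch 
runnable on its own.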

If you need to, you can specify the number of datanodes and tasktrackers 
through the corresponding PigTestCase constructor.  The default values are both 
2. 
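
As a rough illustration of choosing the cluster size, the sketch below assumes 
a constructor of the form PigTestCase(int numDataNodes, int numTaskTrackers).  
That parameter list is an assumption made for this example, as is the stand-in 
PigTestCase class itself; check the patch for the actual signature.

```java
// Sketch only: the constructor shape is assumed, not taken from the patch.
public class ClusterSizeSketch {

    // Hypothetical stand-in for the patch's PigTestCase abstract class.
    static abstract class PigTestCase {
        protected final int numDataNodes;
        protected final int numTaskTrackers;

        // No-argument form: both values default to 2.
        protected PigTestCase() {
            this(2, 2);
        }

        protected PigTestCase(int numDataNodes, int numTaskTrackers) {
            this.numDataNodes = numDataNodes;
            this.numTaskTrackers = numTaskTrackers;
        }
    }

    // Relies on the defaults.
    static class TestWithDefaults extends PigTestCase {
    }

    // Asks for a larger mini cluster.
    static class TestWithBiggerCluster extends PigTestCase {
        TestWithBiggerCluster() {
            super(4, 3);
        }
    }

    public static void main(String[] args) {
        PigTestCase a = new TestWithDefaults();
        PigTestCase b = new TestWithBiggerCluster();
        System.out.println(a.numDataNodes + " datanodes, " + a.numTaskTrackers + " tasktrackers");
        System.out.println(b.numDataNodes + " datanodes, " + b.numTaskTrackers + " tasktrackers");
    }
}
```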

Note that if you do not want to use the local mini clusters (for example, when 
you just want to run Pig in local mode) and your class extends PigTestCase, 
you can override the setUp() and tearDown() methods without calling 
super.setUp() and super.tearDown() in them.  Alternatively, you can simply 
extend TestCase instead.

The way to invoke the Pig unit tests is unchanged from the 1st version of the 
patch; i.e., you still issue the "ant test" command in the top directory of 
your Pig project. 

On a related note, the total amount of time used to execute all the existing 
Pig test cases is 19 - 20 minutes on my computer. 


> Porting Pig unit tests to use MiniDFSCluster and MiniMRCluster on the local 
> machine
> -----------------------------------------------------------------------------------
>
>                 Key: PIG-72
>                 URL: https://issues.apache.org/jira/browse/PIG-72
>             Project: Pig
>          Issue Type: Test
>          Components: tools
>            Reporter: Xu Zhang
>         Attachments: PortPigUnitTestToMiniClusters.patch
>
>
> We have the need to port the Pig unit tests to use MiniDFSCluster and 
> MiniMRCluster, so that tests can be executed with the DFS and MR threads on 
> the local machine.   This feature will eliminate the need to set up a real 
> distributed hadoop cluster before running the unit tests, as everything will 
> now be carried out with the (mini) cluster on the user's local machine.  
> One prerequisite for using this feature is a hadoop jar that has the class 
> files for MiniDFSCluster, MiniMRCluster and other supporting components.  I 
> have been able to generate such a jar file with a special target added by 
> myself to hadoop's build.xml and have also logged a hadoop jira to request 
> this target be a permanent part of that build file.  If possible, we can just 
> replace hadoop15.jar with this jar file on the SVN source tree and then the 
> users will never need to worry about the availability of this jar file. 
> Please find such a hadoop jar file in the attachment.
> To use the feature in unit tests, the user just needs to call 
> MiniClusterBuilder.buildCluster() before a PigServer instance is created with 
> the string "mapreduce" as the parameter to its constructor.  Here is an 
> example of how the MiniClusterBuilder is used in a test case class:
>         public class TestWhatEver extends TestCase {
>                 private String initString = "mapreduce";
>                 private MiniClusterBuilder cluster = MiniClusterBuilder.buildCluster();
>
>                 @Test
>                 public void testGroupCountWithMultipleFields() throws Exception {
>                         PigServer pig = new PigServer(initString);
>                         // Do something with the pig server, such as registering
>                         // and executing Pig queries. The queries will be executed
>                         // with the local cluster.
>                 }
>
>                 // More test cases if needed
>         }
> To run the unit tests with the local cluster, under the top directory of the 
> source tree, issue the command "ant test". Notice that you do not need to 
> specify the location of the hadoop-site.xml file with the command line option 
> "-Djunit.hadoop.conf=<dir>" anymore. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
