Corbin,

Have you looked at PigUnit? https://issues.apache.org/jira/browse/PIG-1404

On Tue, Jul 20, 2010 at 11:07 PM, Corbin Hoenes <cor...@tynt.com> wrote:

> okay no attachments...try this gist:
>
> http://gist.github.com/484135
>
> On Jul 21, 2010, at 12:02 AM, Corbin Hoenes wrote:
>
> > Trying to attach the PigRunner class in case that helps give you a start
> > using register script.
> >
> > On Jul 20, 2010, at 11:56 PM, Corbin Hoenes wrote:
> >
> >> Hey Todd, we run against entire Pig scripts with some helper classes we
> >> built. Basically, they preprocess the variables and then call register
> >> script. The test looks like this:
> >>
> >>     @Before
> >>     public void setUp() throws Exception {
> >>         Helper.delete(OUT_FILE);
> >>         runner = new PigRunner();
> >>     }
> >>
> >>     @Test
> >>     public void testRecordCount() throws Exception {
> >>         runner.execute("myscript.pig", "param1=foo", "param2=bar");
> >>
> >>         Iterator<Tuple> tuples = runner.getPigServer().openIterator("foo");
> >>         assertEquals(41L, Helper.countTuples(tuples));
> >>     }
> >>
> >> It's been very useful for us to test this way. Would love to see more
> >> chatter about other techniques.
> >>
> >> On Jul 20, 2010, at 3:26 PM, ToddG wrote:
> >>
> >>> I'd like to include running various Pig scripts in my continuous build
> >>> system. Of course, I'll only use small datasets for this, and in the
> >>> beginning, I'll only target a local machine instance. However, this
> >>> brings up several questions:
> >>>
> >>> Q: What's the best way to run Pig from Java? Here's what I'm doing,
> >>> following a pattern I found in some of the Pig tests:
> >>>
> >>> 1. Create Pig resources in a base class (shamelessly copied from
> >>> PigExecTestCase):
> >>>
> >>>     protected MiniCluster cluster;
> >>>     protected PigServer pigServer;
> >>>
> >>>     @Before
> >>>     public void setUp() throws Exception {
> >>>
> >>>         String execTypeString = System.getProperty("test.exectype");
> >>>         if (execTypeString != null && execTypeString.length() > 0) {
> >>>             execType = PigServer.parseExecType(execTypeString);
> >>>         }
> >>>         if (execType == MAPREDUCE) {
> >>>             cluster = MiniCluster.buildCluster();
> >>>             pigServer = new PigServer(MAPREDUCE, cluster.getProperties());
> >>>         } else {
> >>>             pigServer = new PigServer(LOCAL);
> >>>         }
> >>>     }
> >>>
> >>> 2. Test classes subclass this to get access to the MiniCluster and
> >>> PigServer (copied from TestPigSplit):
> >>>
> >>>     @Test
> >>>     public void notestLongEvalSpec() throws Exception {
> >>>         inputFileName = "notestLongEvalSpec-input.txt";
> >>>         createInput(new String[] {"0\ta"});
> >>>
> >>>         pigServer.registerQuery("a = load '" + inputFileName + "';");
> >>>         for (int i = 0; i < 500; i++) {
> >>>             pigServer.registerQuery("a = filter a by $0 == '1';");
> >>>         }
> >>>         Iterator<Tuple> iter = pigServer.openIterator("a");
> >>>         while (iter.hasNext()) {
> >>>             throw new Exception();
> >>>         }
> >>>     }
> >>>
> >>> 3. ERROR
> >>>
> >>> This pattern works for simple Pig directives, but I want to load up
> >>> entire Pig scripts, which have REGISTER and DEFINE directives. In that
> >>> case pigServer.registerQuery() fails with:
> >>>
> >>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Unrecognized alias REGISTER
> >>>     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1170)
> >>>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
> >>>     at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
> >>>     at org.apache.pig.PigServer.registerQuery(PigServer.java:441)
> >>>     at com.audiencescience.apollo.reporting.NetworkRevenueReportTest.shouldParseNetworkRevenueReportScript(NetworkRevenueReportTest.java:74)
> >>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> >>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> >>>
> >>> Any suggestions?
> >>>
> >>> -Todd
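
Corbin's message in this thread mentions helper classes that "preprocess the variables" before calling register script. The sketch below is one guess at what such a preprocessing step could look like, using only the Java standard library. It is not the PigRunner code from the gist, and it is far simpler than Pig's own parameter substitution. The class name PigParamSubstitution and the methods parseParams and substitute are hypothetical names invented for this illustration. It replaces $name placeholders with values taken from "name=value" strings, and leaves positional references such as $0 and unknown names untouched.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig and not the PigRunner class from the gist.
public class PigParamSubstitution {

    // Matches $name where name starts with a letter or underscore,
    // so positional field references such as $0 are never matched.
    private static final Pattern PARAM =
            Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    // Turns "name=value" strings (the style passed to runner.execute in the
    // thread) into an ordered map. Splits on the first '=' only.
    public static Map<String, String> parseParams(String... pairs) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : pairs) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected name=value, got: " + pair);
            }
            params.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return params;
    }

    // Replaces each $name that has a value in params; names without a value
    // are copied through unchanged.
    public static String substitute(String script, Map<String, String> params) {
        Matcher m = PARAM.matcher(script);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in values literal.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String script = "a = load '$input';\nb = filter a by $0 == '$param1';";
        Map<String, String> params = parseParams("input=data.txt", "param1=foo");
        System.out.println(substitute(script, params));
    }
}
```

The substituted text could then be written to a temporary file and handed to whatever script-loading call the test harness uses. Leaving unknown names in place, rather than failing, is a design choice made here for brevity; a stricter helper might throw on undefined parameters so that typos surface in the test run.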