Trying to attach the PigRunner class, in case it helps you get started with 
register script.
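
In case the attachment does not come through: the following is only a rough, hypothetical sketch of the "preprocess the variables" step described in the quoted message below, not the actual PigRunner. The class and method names (ParamSubstitution, parseParams, substitute) are made up for illustration, and it assumes parameters appear in the script as $name. Positional references such as $0 are deliberately left alone.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: substitutes $name parameters in script text.
public class ParamSubstitution {

    // Parse "name=value" strings (e.g. "param1=foo") into an ordered map.
    public static Map<String, String> parseParams(String... params) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String p : params) {
            int eq = p.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected name=value, got: " + p);
            }
            result.put(p.substring(0, eq), p.substring(eq + 1));
        }
        return result;
    }

    // Replace each $name with its value. Names must start with a letter or
    // underscore, so positional references like $0 are never matched.
    // Names with no supplied value are left untouched.
    public static String substitute(String script, Map<String, String> params) {
        Pattern var = Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");
        Matcher m = var.matcher(script);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String script = "a = load '$param1'; b = filter a by $0 == '$param2';";
        System.out.println(substitute(script, parseParams("param1=foo", "param2=bar")));
    }
}
```

The substituted text would then be handed to the PigServer. The sketch stops short of that step so that it runs without Pig on the classpath.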


On Jul 20, 2010, at 11:56 PM, Corbin Hoenes wrote:

> Hey Todd, we run against entire Pig scripts using some helper classes we 
> built. Basically, they preprocess the variables and then call register 
> script. The test looks like this:
> 
>    @Before
>    public void setUp() throws Exception {
>        Helper.delete(OUT_FILE);
>        runner = new PigRunner();
>    }
> 
> 
>    @Test
>    public void testRecordCount() throws Exception {
>       runner.execute("myscript.pig", "param1=foo","param2=bar");
> 
>       Iterator<Tuple> tuples = runner.getPigServer().openIterator("foo");
>       assertEquals(41L, Helper.countTuples(tuples));
>    }
> 
> It's been very useful for us to test this way.  Would love to see more 
> chatter about other techniques.
> 
> On Jul 20, 2010, at 3:26 PM, ToddG wrote:
> 
> 
>> I'd like to include running various PIG scripts in my continuous build 
>> system. Of course, I'll only use small datasets for this, and in the 
>> beginning, I'll only target a local machine instance. However, this brings 
>> up several questions:
>> 
>> 
>> Q: What's the best way to run PIG from Java? Here's what I'm doing, following 
>> a pattern I found in some of the Pig tests:
>> 
>> 1. Create Pig resources in a base class (shamelessly copied from 
>> PigExecTestCase):
>> 
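>>   // Defaults to local mode; overridden below by the test.exectype property
>>   protected ExecType execType = LOCAL;
>>   protected MiniCluster cluster;
>>   protected PigServer pigServer;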
>> 
>>   @Before
>>   public void setUp() throws Exception {
>> 
>>       String execTypeString = System.getProperty("test.exectype");
>>       if(execTypeString!=null && execTypeString.length()>0){
>>           execType = PigServer.parseExecType(execTypeString);
>>       }
>>       if(execType == MAPREDUCE) {
>>           cluster = MiniCluster.buildCluster();
>>           pigServer = new PigServer(MAPREDUCE, cluster.getProperties());
>>       } else {
>>           pigServer = new PigServer(LOCAL);
>>       }
>>   }
>> 
>> 2. Test classes subclass this to get access to the MiniCluster and 
>> PigServer (copied from TestPigSplit):
>> 
>>   @Test
>>   public void notestLongEvalSpec() throws Exception{
>>       inputFileName = "notestLongEvalSpec-input.txt";
>>       createInput(new String[] {"0\ta"});
>> 
>>       pigServer.registerQuery("a = load '" + inputFileName + "';");
>>       for (int i=0; i< 500; i++){
>>           pigServer.registerQuery("a = filter a by $0 == '1';");
>>       }
>>       Iterator<Tuple> iter = pigServer.openIterator("a");
>>       while (iter.hasNext()){
>>           throw new Exception();
>>       }
>>   }
>> 
>> 3. ERROR
>> 
>> This pattern works for simple PIG directives, but I want to load entire 
>> pig scripts that contain REGISTER and DEFINE directives. For those, 
>> pigServer.registerQuery() fails with:
>> 
>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during parsing. Unrecognized alias REGISTER
>>   at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1170)
>>   at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
>>   at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
>>   at org.apache.pig.PigServer.registerQuery(PigServer.java:441)
>>   at com.audiencescience.apollo.reporting.NetworkRevenueReportTest.shouldParseNetworkRevenueReportScript(NetworkRevenueReportTest.java:74)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> 
>> Any suggestions?
>> 
>> -Todd
> 
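
One possible workaround for the REGISTER failure quoted above, offered as a rough sketch rather than a tested recipe: split the script into statements and pull the REGISTER lines out before anything reaches registerQuery(). This assumes REGISTER is the only statement the query parser rejects, which is what the stack trace suggests but which I have not verified for DEFINE. ScriptSplitter is a hypothetical name, and the naive split on ';' will break on semicolons inside string literals or comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: separates REGISTER statements
// from the statements that can be passed to registerQuery().
public class ScriptSplitter {
    // Matches "REGISTER <path>" case-insensitively; group 1 is the path.
    private static final Pattern REGISTER =
            Pattern.compile("(?is)^register\\s+(.+)$");

    public final List<String> jars = new ArrayList<>();
    public final List<String> queries = new ArrayList<>();

    public static ScriptSplitter split(String script) {
        ScriptSplitter result = new ScriptSplitter();
        // Naive statement split: assumes no ';' inside quotes or comments.
        for (String raw : script.split(";")) {
            String stmt = raw.trim();
            if (stmt.isEmpty()) {
                continue;
            }
            Matcher m = REGISTER.matcher(stmt);
            if (m.matches()) {
                result.jars.add(m.group(1).trim());
            } else {
                // Restore the terminating semicolon removed by split().
                result.queries.add(stmt + ";");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ScriptSplitter s = split("REGISTER myudfs.jar;\na = load 'in';");
        System.out.println(s.jars + " " + s.queries);
    }
}
```

In a test, each entry in jars could then go to pigServer.registerJar(...) and each entry in queries to pigServer.registerQuery(...), in that order.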
