This is currently not possible. There is some discussion of how to achieve this here: http://wiki.apache.org/pig/TuringCompletePig . This topic is, naturally, very interesting to a number of parties, so any thoughts are welcome.
At the moment, the answer is to just write your control structures in a wrapper language, be it Java or some scripting language. We've been working on a Pig DSL for Ruby, which you may find useful: http://github.com/ningliang/piglet

-Dmitriy

On Fri, Jul 23, 2010 at 7:13 PM, Yong-gang Cao <[email protected]> wrote:

> I want to do know whether it's possible to do loop in pig and end loop by
> some feedback variable.
>
> More specifically
> 1. I want to read a set of files/directories with different names, and
> process them in the same workflow and then join the result of all of the
> processed result.
> e.g. A=load 'a.txt' as (a,b,c); AGroup=group A by a, count(A) as ACount;
> B=load 'b.txt' as (a,b,c); BGroup=group B by a,count(B) as BCount;
> C=load 'b.txt' as (a,b,c); CGroup=group C by a,count(C) as CCount;
> ....
> X=load 'x.txt' as (a,b,c); XGroup=group X by a,count(X) as XCount;
> Result= foreach (join AGroup by a, BGroup by a, CGroup by a, ...,
> XGroup by a) generate AGroup::a, ACount, BCount, CCount, .... XCount
>
> Is it possible to simplify my statements by using loop like statements?
>
> 2. I want to run one statement again and again until one UDF's value is 0
> e.g. I want something like following
> A = load something;
> while(true){
> A= foreach A generate UDF1(a), FEEDBACKUDF(a) as Signal;
> if(Signal==0)
> break;
> }
>
> Is it possible to do above things in Pig? and How?
> Thanks a lot!
> --
> Regards,
>
> Yong-gang Cao
> Seattle,WA,98104
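To make the wrapper-language suggestion above concrete, here is a minimal sketch in Python. It is only an illustration, not how piglet or any Pig tooling works. `build_script` generates the repetitive load/group/count/join statements from question 1 for a list of input files, and `run_until_zero` shows the shape of the feedback loop from question 2. The function names, the aliases (`data0`, `grp0`, `cnt0`, `joined`, `result`), and the `run_step` callback are all hypothetical; the generated Pig Latin assumes each file has the three fields `(a, b, c)` as in the question.

```python
def build_script(paths):
    """Return Pig Latin text that counts rows per key 'a' in each input
    file and joins the per-file counts on that key.

    Aliases (data0, grp0, cnt0, ...) are made up for this sketch.
    """
    if len(paths) < 2:
        raise ValueError("need at least two inputs to join")
    lines = []
    for i, path in enumerate(paths):
        # One load/group/count triple per input file.
        lines.append(f"data{i} = LOAD '{path}' AS (a, b, c);")
        lines.append(f"grp{i} = GROUP data{i} BY a;")
        lines.append(
            f"cnt{i} = FOREACH grp{i} GENERATE group AS a, COUNT(data{i}) AS n;"
        )
    # Join all per-file counts on the shared key.
    join_keys = ", ".join(f"cnt{i} BY a" for i in range(len(paths)))
    lines.append(f"joined = JOIN {join_keys};")
    # Keep the key once, then one count column per input.
    fields = ["cnt0::a"] + [f"cnt{i}::n AS n{i}" for i in range(len(paths))]
    lines.append(f"result = FOREACH joined GENERATE {', '.join(fields)};")
    return "\n".join(lines) + "\n"


def run_until_zero(run_step, max_rounds=100):
    """Call run_step(round_no) repeatedly until it returns 0.

    run_step is a hypothetical callback: in practice it would launch one
    Pig job and read back the feedback signal that job stored.
    Returns the number of rounds executed.
    """
    for round_no in range(1, max_rounds + 1):
        if run_step(round_no) == 0:
            return round_no
    raise RuntimeError("signal never reached 0")


if __name__ == "__main__":
    print(build_script(["a.txt", "b.txt", "x.txt"]))
```

The generated text could be written to a file and handed to the `pig` command line. The loop itself stays in Python, since Pig has no control flow of its own, and `max_rounds` guards against a signal that never converges.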
