I'm most interested in the macro expansion and in importing other files for 
shared common code.  I could be missing something, but is the TempStorage thing 
necessary?

bot_filter.pig:
--------------
define bot_cleanser(user) {
    A = load 'bc_input' using TempStorage();
    B = filter A by not is_a_bot($user);
    store B into 'bc_output' using TempStorage();
}
----------------
main.pig:
-------------------
import bot_filter.pig;

A = load 'fact';
store A into 'bc_input' using TempStorage();
inline bot_cleanser('username');
B = load 'bc_output' using TempStorage();
C = group B by user;
...
store Z into 'processed';
-----------------------

Couldn't we pass aliases in instead and remove lots of boilerplate?

bot_filter.pig:
--------------
define bot_cleanser[X,Y](user) {
    Y = filter X by not is_a_bot($user);
}
----------------
main.pig:
-------------------
import bot_filter.pig;

A = load 'fact';
inline bot_cleanser[A,B]('username');
C = group B by user;
...
store Z into 'processed';
-----------------------

The inline would then substitute A for X, B for Y, and 'username' for user.  
Aliases are kept separate from the other parameters because inlining may 
actually declare new aliases, and the semantic differences should be easier to 
deal with that way.  In particular, the [A, B] above essentially declares that 
the macro 'shares' these aliases with the caller, and that none of its other 
aliases overlap with the caller's.

Any aliases not declared up front are renamed so as not to collide when 
inlined.  I look at the macro expansion and function examples and see tons of 
alias naming boilerplate that should IMO be implicit somehow.  Pig already has 
a lot of alias and field naming boilerplate, and I would like to avoid 
introducing more.  Otherwise, I'm sure I'll use a preprocessor again to get rid 
of it :).
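
To make the renaming idea concrete, here is a rough Python sketch of what such 
a preprocessor step could look like.  Everything in it is my own assumption for 
illustration, not part of the proposal or of Pig itself: the inline_macro 
function, the macro_<n>_ prefix, and the temporary alias T in the example body 
are all made up.  It also works on plain text, so it would happily rewrite a 
field name or a word inside a quoted string that happens to match an alias.

```python
import re

def inline_macro(body, shared, params, expansion_id):
    """Expand one macro invocation (illustrative sketch only).

    body         -- list of statement strings from the macro definition
    shared       -- macro alias -> caller alias, e.g. {"X": "A", "Y": "B"}
    params       -- parameter name -> replacement text, e.g. {"user": "username"}
    expansion_id -- number that makes the private alias names unique
    """
    # Collect every alias the macro assigns to (identifier before '=').
    assigned = set()
    for stmt in body:
        m = re.match(r"\s*([A-Za-z_]\w*)\s*=", stmt)
        if m:
            assigned.add(m.group(1))

    # Shared aliases map to the caller's names; all others get a
    # unique prefix so they cannot collide with the caller's aliases.
    rename = dict(shared)
    for alias in assigned - set(shared):
        rename[alias] = "macro_%d_%s" % (expansion_id, alias)

    expanded = []
    for stmt in body:
        # Rename aliases, skipping identifiers that follow '$' (parameters).
        stmt = re.sub(r"(?<!\$)\b([A-Za-z_]\w*)\b",
                      lambda m: rename.get(m.group(1), m.group(1)), stmt)
        # Then substitute $name parameters as plain text.
        stmt = re.sub(r"\$(\w+)", lambda m: params[m.group(1)], stmt)
        expanded.append(stmt)
    return expanded

# Hypothetical macro body with one private alias (T).
body = ["T = filter X by not is_a_bot($user);",
        "Y = distinct T;"]
for line in inline_macro(body, {"X": "A", "Y": "B"}, {"user": "username"}, 1):
    print(line)
```

The shared aliases X and Y come out as the caller's A and B, while T, which 
was never declared up front, comes out as macro_1_T and cannot clash with 
anything in main.pig.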




On Oct 15, 2010, at 4:39 PM, Alan Gates wrote:

> After several months of mulling things around Richard and I have put  
> together a proposed design for adding control flow to Pig.  See 
> http://wiki.apache.org/pig/TuringCompletePig 
>  for complete details.  Please give us your feedback.
> 
> Alan.
