On Oct 18, 2010, at 12:28 PM, Scott Carey wrote:

It is the last that looks to complicate things a bit. We should probably pass the fields visible to the macro as well as the aliases.

That might look like:

------------------------
define disjunctive_join_filter[out RESULT, A.(a,b,c), B.(a,b,c)] (filterStr) {
 inline join_filter[MATCH_1, A.(a,c), B.(a,c)]($filterStr);
 inline join_filter[MATCH_2, A.(b,c), B.(b,c)]($filterStr);
 RESULT_GROUP = COGROUP MATCH_1 BY (a,b,c), MATCH_2 BY (a,b,c); // not sure if it can avoid alias ambiguity here, would hate to have to project for it
 RESULT = FOREACH RESULT_GROUP {
   CHOSEN = (IsEmpty(MATCH_1) ? MATCH_2 : MATCH_1);
   GENERATE FLATTEN(CHOSEN);
 }
}

I am concerned that declaring, for each alias, which of its fields the macro accesses is fairly complex, especially for a first version of this feature. The same functionality can be achieved by passing those field names as ordinary parameters in the general parameter list. That is, your example can be rewritten as:

define disjunctive_join_filter[out RESULT, A, B](filterStr, A_a, A_b, A_c, B_a, B_b, B_c) {

This will require better documentation from macro writers to clearly declare what each parameter is for, but it will work. And nothing prevents us from extending to the syntax you suggest in the future if it becomes clear that it will be useful.
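
As a rough sketch only, the body might then look something like the following. It assumes that join_filter is also rewritten to take its key fields as ordinary parameters, and that field names are substituted with $ just like any other parameter; neither is settled.

```pig
define disjunctive_join_filter[out RESULT, A, B](filterStr, A_a, A_b, A_c, B_a, B_b, B_c) {
  -- Hypothetical: join_filter is assumed to accept its key fields as plain parameters.
  inline join_filter[MATCH_1, A, B]($filterStr, $A_a, $A_c, $B_a, $B_c);
  inline join_filter[MATCH_2, A, B]($filterStr, $A_b, $A_c, $B_b, $B_c);
  -- The COGROUP and FOREACH steps would stay as in your version.
}
```

The cost is that the caller has to pass six field names in the right order, which is where the documentation burden comes from.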

I would also like to get people's feedback on the syntax. I'm not wild about brackets meaning aliases and parentheses meaning parameters. Would explicit lists be better?

define macro in A, B out Y, Z (param1, param2) { ... }

Alan.

