The first case will give you a tuple which contains, as its first element, a
tuple of all of the fields matched by *, and as its second element, 'input'.
The second will give you a flat tuple which contains all of the fields matched
by *, and then as its last element, 'input'.
This is what I thought, but to be sure I ran this UDF:
import org.apache.pig.EvalFunc;
import java.io.IOException;
import org.apache.pig.data.Tuple;

public class ATHING extends EvalFunc<String> {
    public String exec(Tuple input) throws IOException {
        System.out.println(input.toString());
        return null;
    }
}
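
To make the difference in shape concrete, here is a rough plain-Java sketch
that does not use Pig at all. It assumes a tuple can be modelled as a
java.util.List, and the class name UdfArgShapes, the helpers case1/case2 and
the three-field sample row are all made up for illustration. It only mimics
how the arguments end up nested, not how Pig actually builds them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: lists stand in for Pig tuples.
public class UdfArgShapes {

    // Case 1: CustomUDF(TOTUPLE(*), 'input')
    // The row is wrapped in its own tuple, so the UDF sees two elements.
    static List<Object> case1(List<Object> row, String extra) {
        List<Object> args = new ArrayList<>();
        args.add(new ArrayList<>(row)); // first element: the whole row, nested
        args.add(extra);                // second element: the constant
        return args;
    }

    // Case 2: CustomUDF(*, 'input')
    // The row's fields are passed directly, followed by the constant.
    static List<Object> case2(List<Object> row, String extra) {
        List<Object> args = new ArrayList<>(row); // fields stay at the top level
        args.add(extra);                          // constant is the last element
        return args;
    }

    public static void main(String[] argv) {
        // Sample row with three fields (assumed for the example).
        List<Object> row = Arrays.asList("a", "b", "c");
        System.out.println(case1(row, "input")); // [[a, b, c], input]
        System.out.println(case2(row, "input")); // [a, b, c, input]
    }
}
```

So with a three-field row, the sketch for Case 1 has 2 elements and the
sketch for Case 2 has 4.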
2011/11/24 Prashant Kommireddi <[email protected]>
> I have a question regarding the pig data types.
>
> If I have a UDF, say 'CustomUDF' and I do something like this:
>
> REGISTER 'foo.jar';
>
> A = LOAD '/shared/a.dat';
>
> What would be the difference in the data types for UDF arguments between
> -->
>
> Case 1 : B = FOREACH A GENERATE CustomUDF(TOTUPLE(*), 'input'); AND
> Case 2 : B = FOREACH A GENERATE CustomUDF(*, 'input');
>
> I am sure Case 1 is (tuple, chararray). Can anyone let me know the data
> type for Case 2 arguments?
>
> Thanks,
> Prashant
>