Small question: the Python UDF doc says that "variable names inside a schema
string are not used anywhere, they just make the syntax identifiable to the
parser" (https://pig.apache.org/docs/r0.9.0/udf.html#schemafunction).
However, it looks like Pig is picking up those field names and keeping them
unless I override them.
For instance, if I have a Python UDF:
@outputSchema('a:int')
def my_udf(x):
    return 123
And a Pig script:
-- assuming the UDF above is saved as my_udfs.py
REGISTER 'my_udfs.py' USING jython AS my_udfs;
raw = LOAD 'data.txt' USING PigStorage() AS (x:int);
with_udf = FOREACH raw GENERATE my_udfs.my_udf(x);
Running DESCRIBE on with_udf gives me:
with_udf: {a: int}
Is the doc incorrect there?
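In case it helps to look at the UDF side outside of Pig, here is a minimal
sketch. It assumes the decorator does nothing more than attach the schema
string to the function; the outputSchema defined below is my own stand-in for
illustration, not Pig's actual code, and the split on ':' is only a rough
approximation of how a "name:type" string might be read. It shows that the
name 'a' is part of the string that gets handed over along with the function:

```python
def outputSchema(schema_str):
    # Hypothetical stand-in for the decorator Pig supplies to Python UDFs.
    # Assumption: it only records the schema string on the function object.
    def wrap(func):
        func.outputSchema = schema_str
        return func
    return wrap


@outputSchema('a:int')
def my_udf(x):
    return 123


# Rough approximation of reading "name:type"; not Pig's real schema parser.
field_name, field_type = my_udf.outputSchema.split(':')
print("field name: %s, field type: %s" % (field_name, field_type))
```

So whatever consumes the schema string has the field name available to it,
which would be consistent with what DESCRIBE reports.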
Thanks,
Doug