Small question: the Python UDF doc says that "variable names inside a schema
string are not used anywhere, they just make the syntax identifiable to the
parser" (https://pig.apache.org/docs/r0.9.0/udf.html#schemafunction).
However, it looks like Pig is picking up those field names and keeping them if
I don't override them.

For instance, if I have a Python UDF:

@outputSchema('a:int')
def my_udf(x):
    return 123

And a Pig script:

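-- The file name 'my_udfs.py' is assumed here; the namespace matches the call below.
REGISTER 'my_udfs.py' USING jython AS my_udfs;
raw = LOAD 'data.txt' USING PigStorage() AS (x:int);
with_udf = FOREACH raw GENERATE my_udfs.my_udf(x);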

Running DESCRIBE on with_udf gives me:

with_udf: {a: int}

Is the doc incorrect there?
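
In case it helps anyone reproduce this, here is a minimal sketch for exercising the UDF outside Pig. The outputSchema defined below is a stand-in I am assuming for local testing only, not Pig's own code (Pig supplies the real decorator when the script is registered); all it does is store the schema string on the function.

```python
# Stand-in for the decorator that Pig provides to Jython UDFs.
# Assumption: it only records the schema string; this is not Pig's code.
def outputSchema(schema_str):
    def wrap(func):
        func.outputSchema = schema_str  # keep the declared schema as metadata
        return func
    return wrap


@outputSchema('a:int')
def my_udf(x):
    return 123


if __name__ == '__main__':
    print(my_udf(5))            # value returned by the UDF
    print(my_udf.outputSchema)  # schema string, including the field name "a"
```

This only shows that the schema string, field name included, travels with the function; what Pig then does with the name is the part I am asking about.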

Thanks,
Doug
