pig-0.17.0/bin/pig -x local

Very basic UDF file:

#!/usr/bin/python3

from pig_util import outputSchema

@outputSchema("as:int")
def square(num):
    if num is None:
        return None
    return num * num

@outputSchema("word:chararray")
def concat(word):
    return word + word
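
To rule out the Python logic itself, the two functions can be exercised outside Pig. The following is only a minimal sketch: the outputSchema defined here is a hypothetical stand-in for the decorator from pig_util (assumed to be unavailable outside a Pig job), and it only records the schema string on the function.

```python
# Hypothetical stand-in for pig_util.outputSchema, used only so the UDFs
# can be imported and called without Pig. It stores the schema string on
# the function and returns the function unchanged.
def outputSchema(schema):
    def wrap(func):
        func.outputSchema = schema
        return func
    return wrap


@outputSchema("as:int")
def square(num):
    if num is None:
        return None
    return num * num


@outputSchema("word:chararray")
def concat(word):
    return word + word


if __name__ == "__main__":
    print(square(4))      # 16
    print(square(None))   # None
    print(concat("ab"))   # abab
```

If these calls behave as expected, the failure is more likely in how Pig launches or talks to the Python process than in the functions themselves.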

Exceedingly simple Pig script:

REGISTER '/home/scs/woodcock/SD411/lab_udf/test.py'
    USING org.apache.pig.scripting.streaming.Python.PythonScriptEngine AS myFuncs;

A = LOAD '/home/scs/woodcock/SD411/DATA/accident.csv' USING PigStorage(',')
    AS (state:int, name:chararray);

B = FOREACH A GENERATE myFuncs.square(state) AS state, name;



If I do a "DUMP A" I get exactly what I would expect.

But, on a "DUMP B", I get a failed job:

java.lang.Exception: org.apache.pig.impl.streaming.StreamingUDFException: LINE :
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE :
    at org.apache.pig.impl.builtin.StreamingUDF$ProcessErrorThread.run(StreamingUDF.java:506)

grunt> Exception in thread "Thread-82" java.lang.NullPointerException: Cannot invoke "java.util.concurrent.BlockingQueue.put(Object)" because the return value of "org.apache.pig.impl.builtin.StreamingUDF.access$500(org.apache.pig.impl.builtin.StreamingUDF)" is null
    at org.apache.pig.impl.builtin.StreamingUDF$ProcessOutputThread.run(StreamingUDF.java:471)
2024-10-29 13:02:15,296 [communication thread] INFO org.apache.hadoop.mapred.LocalJobRunner - map > map

Any idea what is going wrong?
