Strange stack trace. Did you get it from the map task log of the job that reads
the data?

Try replacing your UDF with:
@outputSchema("y:float")
def ptile(value):
    return 0.0

Just to see whether the error comes from your UDF.
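
One likely cause, assuming I read the script correctly: FOREACH calls the UDF
once per row, so ptile receives a single int, and set(value) on an int raises a
TypeError. That would also explain why the function works in plain Python when
you hand it a whole list. The sketch below reproduces this outside Pig. The
outputSchema stand-in, call_per_row and ptile_bag are names I made up for
illustration, and I have not run ptile_bag on a cluster.

```python
# Stand-in for the decorator Pig's Jython runtime injects, so this file
# also runs under plain Python for testing.
def outputSchema(schema):
    def wrap(func):
        func.outputSchema = schema
        return func
    return wrap


@outputSchema("y:float")
def ptile(value):
    # The UDF from the original post, unchanged.
    unique = set(value)
    maps = {}
    pc = float(1) / (len(unique) - 1)
    for n, i in enumerate(unique):
        maps[i] = n * pc
    return [maps.get(i) for i in value]


def call_per_row(value):
    # Mimics FOREACH: one call per row, with a single int argument.
    try:
        return ptile(value)
    except TypeError as err:
        return "TypeError: %s" % err


# Hypothetical bag-based variant (name and schema are assumptions).
@outputSchema("ptiles:bag{t:tuple(value:int, pct:float)}")
def ptile_bag(bag):
    values = [t[0] for t in bag]           # a bag arrives as a list of tuples
    unique = sorted(set(values))           # sort so rank follows value order
    if len(unique) < 2:
        return [(v, 0.0) for v in values]  # avoid dividing by zero
    step = 1.0 / (len(unique) - 1)
    rank = dict((v, n * step) for n, v in enumerate(unique))
    return [(v, rank[v]) for v in values]


print(call_per_row(10))
print(ptile_bag([(10,), (30,), (20,), (10,)]))
```

To use something like ptile_bag, the Pig script would have to collect all rows
first (for example with GROUP data ALL) and pass the resulting bag of values to
the UDF instead of a single field. Note that the original also enumerates an
unordered set, so even on a list its ranks need not follow value order.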


2014-03-16 3:09 GMT+04:00 Jason W <[email protected]>:

> The stack trace I posted is from the logs. Thanks.
>
>
>
>
>
> On Saturday, March 15, 2014 6:42 PM, Ankit Bhatnagar <[email protected]>
> wrote:
> Check the log file. It should be in the folder where you are running the
> script.
>
>
> On 3/15/14 2:12 PM, "Jason W" <[email protected]> wrote:
>
> >
> >
> >Updated for a typo:
> >
> >hi all,
> >
> >My script basically performs a calculation and outputs an array of values.
> >
> >
> >Pig script:
> >
> >register pcent.py using jython as pc;
> >data = load 'Data.csv' as (value:int);
> >B = FOREACH data GENERATE pc.ptile(value) as ptiles;
> >C = group B by (ptiles);
> >store C into 'C';
> >
> >
> >UDF:
> >
> >@outputSchema("y:float")
> >def ptile(value):
> >    unique = set(value)
> >    maps = {}
> >    pc = float(1)/(len(unique)-1)
> >    for n, i in enumerate(unique):
> >        maps[i] = (n*pc)
> >    return [maps.get(i) for i in value]
> >
> >
> >The Python script runs fine, but I keep getting an error in Pig, and the
> >stack trace is not very helpful:
> >
> >
> >Backend error message
> >---------------------
> >org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error executing function
> >    at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:120)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:410)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:344)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
> >    at o
> >
> >Pig Stack Trace
> >---------------
> >ERROR 0: Error executing function
> >
> >org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error executing function
> >    at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:120)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:410)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:344)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
> >================================================================================
> >
> >
> >On Saturday, March 15, 2014 5:08 PM, Jason W <[email protected]> wrote:
> >hi all,
> >
> >My script basically performs a calculation and outputs an array of values.
> >
> >
> >Pig script:
> >
> >register pcent.py using jython as pc;
> >data = load 'Data.csv' as (value:int);
> >B = FOREACH data GENERATE pc.ptile(value);
> >C = group B by (percentiles);
> >store C into 'C';
> >
> >
> >UDF:
> >
> >@outputSchema("y:float")
> >def ptile(value):
> >    unique = set(value)
> >    maps = {}
> >    pc = float(1)/(len(unique)-1)
> >    for n, i in enumerate(unique):
> >        maps[i] = (n*pc)
> >    return [maps.get(i) for i in value]
> >
> >
> >The Python script runs fine, but I keep getting an error in Pig, and the
> >stack trace is not very helpful:
> >
> >
> >Backend error message
> >---------------------
> >org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error executing function
> >    at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:120)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:410)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:344)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
> >    at o
> >
> >Pig Stack Trace
> >---------------
> >ERROR 0: Error executing function
> >
> >org.apache.pig.backend.executionengine.ExecException: ERROR 0: Error executing function
> >    at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:120)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:337)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:410)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:344)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:372)
> >    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:297)
> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:283)
> >    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
> >================================================================================
> >
>
>
