[
https://issues.apache.org/jira/browse/PIG-2739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Dai updated PIG-2739:
----------------------------
Description:
The following script does not work:
{code}
register 'util.py' using jython as util;
A = load '1.txt' as (sentence:chararray);
B = foreach A generate flatten(util.tokenize(sentence));
dump B;
{code}
util.py
{code}
@outputSchema("words:{(word:chararray)}")
def tokenize(sentence):
    return sentence.split(' ')
{code}
Error message:
org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught error from UDF: org.apache.pig.scripting.jython.JythonFunction [Error executing function]
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:288)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:304)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:332)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:353)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:294)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:273)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:268)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Caused by: java.io.IOException: Error executing function
    at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:122)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:262)
    ... 11 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: Cannot convert jython type (org.python.core.PyList) to pig datatype java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
    at org.apache.pig.scripting.jython.JythonUtils.pythonToPig(JythonUtils.java:113)
    at org.apache.pig.scripting.jython.JythonFunction.exec(JythonFunction.java:117)
    ... 12 more
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.pig.data.Tuple
    at org.apache.pig.scripting.jython.JythonUtils.pythonToPig(JythonUtils.java:69)
    ... 13 more
The problem is that Pig expects every element of the returned list to be a
tuple, so a plain list of strings cannot be converted to a bag. This is
unintuitive in Python.
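Until PyList maps to a bag automatically, one possible workaround is to wrap each word in a one-element tuple before returning. The version of util.py below is only a sketch: it assumes Pig supplies the outputSchema decorator when the file is registered with "using jython" (as the original script relies on), and the fallback decorator is a hypothetical stand-in added here only so the file can be run outside Pig.

```python
# util.py -- workaround sketch: return a list of one-element tuples

try:
    outputSchema  # assumed to be provided by Pig's Jython runtime
except NameError:
    # Hypothetical stand-in, used only when running outside Pig.
    def outputSchema(schema):
        def wrap(func):
            func.outputSchema = schema
            return func
        return wrap


@outputSchema("words:{(word:chararray)}")
def tokenize(sentence):
    # Each bag element must be a tuple, so wrap every word in a 1-tuple.
    return [(word,) for word in sentence.split(' ')]
```

With this shape each list element is already a tuple, so the String-to-Tuple cast reported in the stack trace above should not be reached.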
> PyList should map to Bag automatically in Jython
> ------------------------------------------------
>
> Key: PIG-2739
> URL: https://issues.apache.org/jira/browse/PIG-2739
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.10.0, 0.11
> Reporter: Daniel Dai
> Assignee: Daniel Dai