Cheolsoo Park created PIG-4124:
----------------------------------

             Summary: Command for Python streaming udf should be configurable
                 Key: PIG-4124
                 URL: https://issues.apache.org/jira/browse/PIG-4124
             Project: Pig
          Issue Type: Improvement
            Reporter: Cheolsoo Park
            Assignee: Cheolsoo Park
             Fix For: 0.14.0


In my cluster, multiple versions of Python are installed, such as python2.6 
and python2.7. Since some modules are only available for non-default Python 
versions, it would be nice if the python command were configurable by the 
user.
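
As a rough sketch of the requested behavior, the launcher could read the interpreter command from a user-supplied property and fall back to {{python}} when the property is unset. The property name {{udf.python.command}} and the helpers {{resolve_python_command}} and {{build_argv}} are hypothetical, chosen only for illustration; they are not taken from Pig.

```python
import shlex

# Hypothetical property key, used only for this illustration.
PYTHON_COMMAND_KEY = "udf.python.command"
DEFAULT_PYTHON_COMMAND = "python"


def resolve_python_command(props):
    """Return the interpreter command from props, or the default if unset or blank."""
    value = (props.get(PYTHON_COMMAND_KEY) or "").strip()
    return value or DEFAULT_PYTHON_COMMAND


def build_argv(props, script_path):
    """Build the argument list used to launch the UDF script."""
    # shlex.split lets the configured command carry its own flags, e.g. "python2.7 -u".
    return shlex.split(resolve_python_command(props)) + [script_path]
```

Falling back to {{python}} would keep existing scripts working unchanged when the user sets nothing.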

For example, I have a streaming udf that imports pytz. It fails with the 
following error if it runs with {{python}}:
{code}
: Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE 4: ImportError: No module named pytz
: File /mnt1/var/lib/hadoop/nm-local-dir/usercache/cheolsoop/appcache/application_1407968511815_0021/container_1407968511815_0021_01_001322/tmp/udfs.py, line 4, in <module>
: import pytz
: at org.apache.pig.impl.builtin.StreamingUDF$ProcessErrorThread.run(StreamingUDF.java:519)
{code}
It works if I use {{python2.7}} as the command.
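
To see which installed interpreter can actually import a given module on a node, a small check like the one below may help. It is only a diagnostic sketch: the helper name {{can_import}} is made up for this example, and the candidate command names are the ones mentioned in this issue, which may differ on other clusters. It assumes Python 3.5 or later for the script that runs the check.

```python
import subprocess
import sys


def can_import(command, module):
    """Return True if running `command -c "import module"` exits with status 0."""
    try:
        result = subprocess.run(
            [command, "-c", "import " + module],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The interpreter is not installed or not executable on this node.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Module to probe; defaults to pytz, the module from the failing UDF.
    module = sys.argv[1] if len(sys.argv) > 1 else "pytz"
    for command in ("python", "python2.6", "python2.7"):
        print(command, can_import(command, module))
```

An interpreter for which this prints True is a candidate for the configurable command.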



--
This message was sent by Atlassian JIRA
(v6.2#6252)
