[
https://issues.apache.org/jira/browse/PIG-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14172502#comment-14172502
]
Mike Sukmanowsky commented on PIG-4124:
---------------------------------------
Awesome, thanks [~cheolsoo]! So we'll be able to use this with
{{SET pig.streaming.udf.python.command '/path/to/virtualenv/bin/python'}}?
I wasn't sure whether that would invalidate your {{language.toLowerCase()}}
clause above. Perhaps it needs to be:
{code}
private boolean isPython() {
    return language.toLowerCase().indexOf("python") != -1;
}
{code}
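To illustrate the concern, here is a minimal standalone sketch comparing an exact match against the substring test suggested above. The class name, method names, and sample values are hypothetical and are not taken from the PIG-4124 patch; whether the patch actually compares the configured command this way is an assumption on my part.

```java
import java.util.Locale;

// Hypothetical standalone sketch; not the code from the PIG-4124 patch.
class PythonCheckSketch {
    // Exact match: only the bare name "python" (in any case) qualifies.
    static boolean isPythonExact(String language) {
        return language.toLowerCase(Locale.ROOT).equals("python");
    }

    // Substring match, as suggested above: any value containing "python" qualifies.
    static boolean isPythonSubstring(String language) {
        return language.toLowerCase(Locale.ROOT).contains("python");
    }

    public static void main(String[] args) {
        // Sample values are illustrative only.
        String[] samples = {"python", "Python2.7", "/path/to/virtualenv/bin/python", "ruby"};
        for (String s : samples) {
            System.out.println(s + " exact=" + isPythonExact(s)
                    + " substring=" + isPythonSubstring(s));
        }
    }
}
```

One trade-off of the substring test is that it also accepts any unrelated value that happens to contain "python" somewhere in its path.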
> Command for Python streaming udf should be configurable
> -------------------------------------------------------
>
> Key: PIG-4124
> URL: https://issues.apache.org/jira/browse/PIG-4124
> Project: Pig
> Issue Type: Improvement
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Fix For: 0.14.0
>
> Attachments: PIG-4124-1.patch, PIG-4124-2.patch
>
>
> In my cluster, multiple versions of Python are installed, such as python2.6 and
> python2.7. Since some modules are only available on non-default Python
> versions, it would be nice if the Python command were configurable by the
> user.
> For example, I have a streaming UDF that imports pytz. It fails with the following
> error if it runs with {{python}}:
> {code}
> : Caused by: org.apache.pig.impl.streaming.StreamingUDFException: LINE 4: ImportError: No module named pytz
> : File /mnt1/var/lib/hadoop/nm-local-dir/usercache/cheolsoop/appcache/application_1407968511815_0021/container_1407968511815_0021_01_001322/tmp/udfs.py, line 4, in <module>
> : import pytz
> : at org.apache.pig.impl.builtin.StreamingUDF$ProcessErrorThread.run(StreamingUDF.java:519)
> {code}
> But it works if I use {{python2.7}} as the command.