[ 
https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arnab Nandi updated PIG-928:
----------------------------

    Attachment: pig.scripting.patch.arnab
                test.zip
                calltrace.png

Building on Julien's and Woody's code, this patch provides pluggable scripting 
support in native Pig.

##Syntax:##

register 'test.py' USING org.apache.pig.scripting.jython.JythonScriptEngine;

This makes all functions inside test.py available as Pig functions.
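
For illustration, a file such as test.py might contain nothing more than plain 
function definitions. This is a minimal sketch: the function names below are 
hypothetical and are not taken from test.zip, and how the patch declares output 
schemas is not shown here.

```python
# Hypothetical contents of a script like test.py (names are illustrative only).
# Each top-level function is a candidate to be exposed as a Pig function.

def square(n):
    # Return the square of a numeric value.
    return n * n

def concat(a, b):
    # Join two values as strings.
    return str(a) + str(b)

if __name__ == '__main__':
    # Quick local check, run outside Pig with a plain Python interpreter.
    print(square(4))
    print(concat('pig', 928))
```

Once the file is registered with the statement above, both functions would 
presumably be callable by name from a Pig script.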

##Things in this patch: ##

1. Modifications to parser .jjt file

2. ScriptEngine abstract class and a Jython implementation (JythonScriptEngine).

3. Ability to ship .py files similar to .jars, loaded on demand.

4. Input checking and Schema support.


##Things NOT in this patch: ##

1. Inline code support (replacing 'test.py' with `multiline inline code`). I 
would prefer to submit this as a separate bug.

2. Scripting engines and examples other than Jython (e.g. BeanShell and Rhino)

3. JUnit-based test harness (tests are provided separately as test.zip)

4. Efficient Python<->Pig object transforms. The current ones are not very 
efficient (see calltrace.png); I preferred the cleaner implementation first. 
Non-obvious optimizations such as object reuse can be introduced as a separate 
bug.


##Notes: ##

1. I went with "register" instead of "define" since files can contain multiple 
functions, similar to .jars. IMHO this makes more sense: using "define" would 
introduce the concept of "codeblock aliases", and function names would look 
like "alias.functionName()". That is possible but inconsistent, since we cannot 
have "alias2.functionName()" (which would require separate interpreter 
instances, etc.).

2. This has been tested in both local and mapred modes.

3. We assume .py files are simply a list of functions. Since the entire file is 
loaded, functions in the same file can depend on one another. No effort is made 
to resolve imports, though.

4. You'll need to add jython.jar to the classpath, or compile it into pig.jar.
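
To illustrate note 3, here is a rough model in plain Python of what "the entire 
file is loaded" implies for dependent functions. It is only a sketch under 
stated assumptions: load_functions and the script contents are hypothetical, 
and this is not the code path used by JythonScriptEngine in the patch.

```python
# Hypothetical script text; stands in for the contents of a file like test.py.
SCRIPT_SOURCE = '''
def normalize(word):
    # Helper used by same_word() below.
    return word.strip().lower()

def same_word(a, b):
    # Depends on normalize(), defined in the same file.
    return normalize(a) == normalize(b)
'''

def load_functions(source):
    """Execute the whole script once and return its top-level callables by name.

    Assumption: this mimics "load the entire file"; it is not the patch's code.
    """
    namespace = {}
    exec(source, namespace)
    return dict((name, obj) for name, obj in namespace.items()
                if callable(obj) and not name.startswith('_'))

functions = load_functions(SCRIPT_SOURCE)
print(sorted(functions))
print(functions['same_word'](' Pig ', 'pig'))
```

Because both definitions end up in one shared namespace, same_word can call 
normalize without any extra wiring. A function that relied on an unresolved 
import would fail in the same way it would in a standalone interpreter.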


Would love comments and code-followups!


> UDFs in scripting languages
> ---------------------------
>
>                 Key: PIG-928
>                 URL: https://issues.apache.org/jira/browse/PIG-928
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Alan Gates
>             Fix For: 0.8.0
>
>         Attachments: calltrace.png, package.zip, pig-greek.tgz, 
> pig.scripting.patch.arnab, pyg.tgz, scripting.tgz, scripting.tgz, test.zip
>
>
> It should be possible to write UDFs in scripting languages such as python, 
> ruby, etc.  This frees users from needing to compile Java, generate a jar, 
> etc.  It also opens Pig to programmers who prefer scripting languages over 
> Java.

