[ 
https://issues.apache.org/jira/browse/PIG-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12879181#action_12879181
 ] 

Alan Gates commented on PIG-928:
--------------------------------

I propose the following syntax for register:

{code}
REGISTER _filename_ [USING _class_ [AS _namespace_]]
{code}

This is backwards compatible with the current version of register.
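
To make the nesting of the optional clauses concrete, here is a rough sketch of how a statement in the proposed form could be recognized. This is not Pig's parser. The regular expression, the {{parse_register}} helper, and the characters allowed in each clause are assumptions made for illustration only.

```python
import re

# Hypothetical pattern for the proposed grammar:
#   REGISTER filename [USING class [AS namespace]]
# AS is nested inside the USING group, so it cannot appear on its own.
REGISTER_RE = re.compile(
    r"^\s*REGISTER\s+(?P<filename>\S+?)"
    r"(?:\s+USING\s+(?P<engine>[\w.]+)"
    r"(?:\s+AS\s+(?P<namespace>\w+))?)?"
    r"\s*;?\s*$",
    re.IGNORECASE,
)


def parse_register(statement):
    """Return the filename, engine and namespace of a REGISTER statement.

    Clauses that are absent come back as None.
    """
    match = REGISTER_RE.match(statement)
    if match is None:
        raise ValueError("not a valid REGISTER statement: %r" % statement)
    return match.groupdict()
```

A plain {{REGISTER foo.jar;}} still parses, with both optional fields left as None, which is the backwards-compatible case. A statement with AS but no USING is rejected.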

_class_ in the USING clause would need to implement a new interface,
ScriptEngine (or something similar), which would be used to interpret the file.
If no USING clause is given, then it is assumed that _filename_ is a jar.  I
like this better than the 'lang python' option we had earlier because it allows
users to add new engines without modifying the parser.  We should, however,
provide a pre-defined set of scripting engines and names, so that, for example,
python translates to {{org.apache.pig.script.jython.JythonScriptingEngine}}.
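
As a rough illustration of the pre-defined names, the lookup could be as simple as the sketch below. The {{ENGINE_ALIASES}} table and {{resolve_engine}} are hypothetical names. Only the python entry comes from the proposal above, and exact-match lookup is an assumption.

```python
# Hypothetical table of short engine names. Only the "python" entry is
# taken from the proposal; further entries would be added the same way.
ENGINE_ALIASES = {
    "python": "org.apache.pig.script.jython.JythonScriptingEngine",
}


def resolve_engine(using):
    """Map the value of a USING clause to a script engine class name.

    None means no USING clause was given, so the file is treated as a jar.
    A value that is not a known short name is assumed to already be a
    fully qualified class name and is returned unchanged.
    """
    if using is None:
        return None
    return ENGINE_ALIASES.get(using, using)
```

Because unknown values fall through unchanged, a user-supplied engine class can be named directly without touching the parser or this table.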

If the AS clause is not given, then the basename of _filename_ defines the 
namespace name for all functions defined in that file.  This allows us to avoid
function name clashes.  If the AS clause is given, this defines an alternate 
namespace.  This allows us to avoid name clashes for filenames.  Functions would
have to be referenced by full namespace names, though aliases can be given via 
DEFINE.
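
The namespace rule can be sketched as follows. The {{namespace_for}} and {{qualified_name}} helpers are hypothetical, and stripping the file extension from the basename is an assumption based on {{myfuncs.py}} yielding the namespace myfuncs in the example further down.

```python
import posixpath


def namespace_for(filename, as_clause=None):
    """Pick the namespace for the functions defined in a registered script.

    An explicit AS clause wins. Otherwise the basename of the file is
    used, with its extension removed (an assumption, see above).
    """
    if as_clause is not None:
        return as_clause
    base = posixpath.basename(filename)
    return posixpath.splitext(base)[0]


def qualified_name(namespace, function):
    """Build the full name by which a function would be referenced."""
    return "%s.%s" % (namespace, function)
```

Two files with the same basename, such as {{/home/alan/myfuncs.py}} and {{/home/bob/myfuncs.py}}, would both default to myfuncs, which is the clash the AS clause is meant to resolve.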

Note that the AS clause is a sub-clause of the USING clause, and cannot be used 
alone, so there is no ability to give namespaces to jars.

As far as I can tell there is no need for a SHIP clause in the register.
Additional python modules that are needed can be registered.  As long as Pig
lazily searches for functions and does not automatically find every function in
every file we register, this will work fine.
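
One way to picture the lazy search is a registry that only records files at REGISTER time and interprets a file the first time one of its functions is referenced. This is a sketch under assumptions, not a description of how Pig would do it. {{LazyRegistry}} and the {{loader}} callable (which stands in for a script engine) are hypothetical.

```python
class LazyRegistry:
    """Record registered scripts, but interpret each one only on first use."""

    def __init__(self, loader):
        # loader: callable taking a filename and returning a dict that maps
        # function names to callables (a stand-in for a script engine).
        self._loader = loader
        self._files = {}    # namespace -> filename
        self._loaded = {}   # namespace -> {function name: callable}

    def register(self, namespace, filename):
        # Nothing is read or interpreted here.
        self._files[namespace] = filename

    def lookup(self, name):
        # name is a full namespace name such as "myfuncs.a".
        namespace, _, function = name.rpartition(".")
        if namespace not in self._loaded:
            self._loaded[namespace] = self._loader(self._files[namespace])
        return self._loaded[namespace][function]
```

A file that is registered only so that it gets moved over, and whose functions are never referenced from Pig Latin, is never handed to the loader.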

So, taken all together, this would look like the following.  Assume we have two
python files, {{/home/alan/myfuncs.py}}:

{code}
import mymod

def a():
    ...

def b():
    ...
{code}

and {{/home/bob/myfuncs.py}}:

{code}
def a():
    ...

def c():
    ...
{code}

and the following Pig Latin:

{code}
REGISTER /home/alan/myfuncs.py USING python;
REGISTER /home/alan/mymod.py; -- no need for USING since I won't be looking in
                              -- here for functions, it just has to be moved over
REGISTER /home/bob/myfuncs.py  USING python AS hisfuncs;

DEFINE b myfuncs.b();

A = LOAD 'mydata' as (x, y, z);
B = FOREACH A GENERATE myfuncs.a(x), b(y), hisfuncs.a(z);
...
{code}



> UDFs in scripting languages
> ---------------------------
>
>                 Key: PIG-928
>                 URL: https://issues.apache.org/jira/browse/PIG-928
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Alan Gates
>            Assignee: Aniket Mokashi
>             Fix For: 0.8.0
>
>         Attachments: calltrace.png, package.zip, pig-greek.tgz, 
> pig.scripting.patch.arnab, pyg.tgz, RegisterPythonUDF2.patch, 
> RegisterScriptUDFDefineParse.patch, scripting.tgz, scripting.tgz, test.zip
>
>
> It should be possible to write UDFs in scripting languages such as python, 
> ruby, etc.  This frees users from needing to compile Java, generate a jar, 
> etc.  It also opens Pig to programmers who prefer scripting languages over 
> Java.

