[ 
https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy V. Ryaboy updated PIG-1551:
-----------------------------------

          Status: Resolved  (was: Patch Available)
    Release Note: 
The idea is simple: Pig users frequently need a simple function that the 
standard Java libraries already provide, but for which no UDF has been 
written. Dynamic Invokers allow a Pig programmer to refer to Java functions 
without having to wrap them in custom Pig UDFs, at the cost of doing some Java 
reflection on every function call.

{code}
DEFINE UrlDecode InvokeForString('java.net.URLDecoder.decode', 'String String');
encoded_strings = LOAD 'encoded_strings.txt' as (encoded:chararray);
decoded_strings = FOREACH encoded_strings GENERATE UrlDecode(encoded, 'UTF-8');
{code}
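
To make the reflection cost mentioned above concrete, here is a minimal Java 
sketch of a reflective call to the same method. It is an illustration only, 
not Pig's actual InvokeForString code; the class ReflectiveCallSketch and the 
helper callStatic are hypothetical names, and the lookup is repeated on every 
call purely to keep the sketch short.

```java
import java.lang.reflect.Method;

// Illustrative only: NOT Pig's InvokeForString implementation.
class ReflectiveCallSketch {

    // Resolve a public static method from a "package.Class.method" path
    // and call it with the given arguments via reflection.
    static Object callStatic(String fullPath, Class<?>[] paramTypes, Object... args) {
        int lastDot = fullPath.lastIndexOf('.');
        String className = fullPath.substring(0, lastDot);
        String methodName = fullPath.substring(lastDot + 1);
        try {
            Method method = Class.forName(className).getMethod(methodName, paramTypes);
            return method.invoke(null, args); // null receiver: the method is static
        } catch (ReflectiveOperationException e) {
            // Wrap checked reflection failures so callers need no throws clause.
            throw new IllegalArgumentException("Cannot invoke " + fullPath, e);
        }
    }

    public static void main(String[] argv) {
        Object decoded = callStatic(
            "java.net.URLDecoder.decode",
            new Class<?>[] {String.class, String.class},
            "a%20b%26c", "UTF-8");
        System.out.println(decoded); // prints "a b&c"
    }
}
```

Compared with a hand-written UDF that calls URLDecoder.decode directly, each 
call here goes through Method.invoke, which is the overhead the note refers to.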

Currently, Dynamic Invokers can be used for any static method that either 
takes no arguments or takes some combination of Strings, ints, longs, doubles, 
floats, or arrays of these, and that returns a String, an int, a long, a 
double, or a float. Numeric arguments must be primitives; the boxed 
(capital-letter) numeric classes are not supported as arguments. The return 
type determines which Invoker to use: InvokeForString, InvokeForInt, 
InvokeForLong, InvokeForDouble, or InvokeForFloat.

The DEFINE keyword binds an alias to a Java method, as above. The first 
argument to the InvokeFor* constructor is the fully qualified name of the 
desired method (package, class, and method). The second argument is a 
space-delimited, ordered list of the method's argument types. It can be 
omitted, or given as an empty string, if the method takes no arguments. Valid 
type names are String, Long, Float, Double, and Int. Invokers can also work 
with array arguments, represented in Pig as DataBags of single-tuple elements; 
to declare one, append [] to the type name, for example string[]. Type names 
are not case-sensitive.
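
The rules above for the second constructor argument can be sketched in Java as 
follows. This mirrors only what the note states (space-delimited, 
case-insensitive names, primitives for the numeric types, [] for arrays, empty 
string for no arguments); SignatureParseSketch, parseType, parseSignature and 
resolve are hypothetical names, and Pig's real parsing may differ.

```java
import java.lang.reflect.Method;
import java.util.Locale;

// Hypothetical helper that follows the rules in the note; not Pig's source.
class SignatureParseSketch {

    // Map one case-insensitive type name to the Java class it stands for.
    // Numeric names map to primitives, as the note requires.
    static Class<?> parseType(String name) {
        switch (name.toLowerCase(Locale.ROOT)) {
            case "string":   return String.class;
            case "int":      return int.class;
            case "long":     return long.class;
            case "float":    return float.class;
            case "double":   return double.class;
            case "string[]": return String[].class;
            case "int[]":    return int[].class;
            case "long[]":   return long[].class;
            case "float[]":  return float[].class;
            case "double[]": return double[].class;
            default:
                throw new IllegalArgumentException("Unsupported type: " + name);
        }
    }

    // Turn "String String" into {String.class, String.class};
    // a null or blank signature means "no arguments".
    static Class<?>[] parseSignature(String signature) {
        if (signature == null || signature.trim().isEmpty()) {
            return new Class<?>[0];
        }
        String[] names = signature.trim().split("\\s+");
        Class<?>[] types = new Class<?>[names.length];
        for (int i = 0; i < names.length; i++) {
            types[i] = parseType(names[i]);
        }
        return types;
    }

    // Look up the public method named by "package.Class.method" with that signature.
    static Method resolve(String fullPath, String signature) {
        int lastDot = fullPath.lastIndexOf('.');
        try {
            Class<?> owner = Class.forName(fullPath.substring(0, lastDot));
            return owner.getMethod(fullPath.substring(lastDot + 1), parseSignature(signature));
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot resolve " + fullPath, e);
        }
    }

    public static void main(String[] argv) {
        System.out.println(resolve("java.net.URLDecoder.decode", "String String"));
        System.out.println(resolve("java.lang.System.currentTimeMillis", ""));
    }
}
```

The second lookup shows the no-argument case: a method such as 
java.lang.System.currentTimeMillis returns a long, so by the rule above it 
would be bound with InvokeForLong.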

Because invokers accept array arguments, methods such as those in 
org.apache.commons.math.stat.StatUtils can be applied to the results of 
grouping your datasets, for example. A word of caution, though: the resulting 
UDF is not optimized for Hadoop, and the very significant benefits of 
implementing the Algebraic and Accumulative interfaces are lost. Use this 
feature with care.
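
A rough Java sketch of the array case follows, under stated assumptions: the 
bag is modelled as a plain List<Double> (one value per tuple), and a local 
mean method stands in for a StatUtils-style function so the example needs no 
external library. BagArgumentSketch and its methods are hypothetical names, 
not Pig code.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a List<Double> stands in for a Pig bag.
class BagArgumentSketch {

    // Stand-in for a library method that takes a double[] argument.
    public static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Copy every value of the "bag" into one primitive array.
    static double[] toDoubleArray(List<Double> bag) {
        double[] out = new double[bag.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = bag.get(i);
        }
        return out;
    }

    // Reflectively call a public static double method(double[]) of this class.
    static double invokeOnBag(String methodName, List<Double> bag) {
        try {
            Method m = BagArgumentSketch.class.getMethod(methodName, double[].class);
            // The cast to Object passes the array as a single argument.
            return (Double) m.invoke(null, (Object) toDoubleArray(bag));
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot invoke " + methodName, e);
        }
    }

    public static void main(String[] argv) {
        System.out.println(invokeOnBag("mean", Arrays.asList(1.0, 2.0, 3.0, 6.0))); // prints 3.0
    }
}
```

In this sketch the whole bag is copied into one array before the method runs, 
so nothing can be computed incrementally or combined from partial results, 
which is the kind of benefit the Algebraic and Accumulative interfaces exist 
to provide.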
      Resolution: Fixed

Committed.

> Improve dynamic invokers to deal with no-arg methods and array parameters
> -------------------------------------------------------------------------
>
>                 Key: PIG-1551
>                 URL: https://issues.apache.org/jira/browse/PIG-1551
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.8.0
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>             Fix For: 0.8.0
>
>         Attachments: PIG-1551.patch, PIG_1551.2.patch, PIG_1551.3.patch
>
>
> PIG-1354 introduced a set of UDFs that can be used to dynamically wrap simple 
> Java methods in a UDF, so that users don't need to create trivial wrappers if 
> they are ok sacrificing some speed.
> This issue is to extend the set of methods that can be wrapped this way to 
> include methods that do not take any arguments, and methods that take arrays 
> of {int,long,float,double,string} as arguments. 
> Arrays are expected to be represented by bags in Pig. Notably, this allows 
> users to wrap statistical functions in o.a.commons.math.stat.StatUtils.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.