[ 
https://issues.apache.org/jira/browse/PIG-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Olga Natkovich updated PIG-1678:
--------------------------------

    Fix Version/s:     (was: 0.9.0)

> Need a easy way to bind to external Java library functions that require 
> object constructors such as Apache Commons Math library
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-1678
>                 URL: https://issues.apache.org/jira/browse/PIG-1678
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: David Ciemiewicz
>
> I would like to have a trivial way to bind and invoke Java library functions 
> from within Pig without creating wrapper functions in Java. In particular, 
> there is need to do this for dynamic (non-static) class methods which first 
> require creation of a class instance object to invoke the method.
> Note that the new Pig 0.8 built-in function Invoker only works for static 
> class methods and does not support instantiation of class objects and 
> subsequent dynamic method invocation.
> For instance, I need functions out of the Apache Commons Math library 
> (http://commons.apache.org/math/) such as 
> BetaDistributionImpl.cumulativeProbability.
>     
> http://commons.apache.org/math/apidocs/org/apache/commons/math/distribution/BetaDistributionImpl.html
> To use this class, I must first create a new object with a parameterized 
> constructor -- BetaDistributionImpl(alpha,beta) and then I can invoke a 
> method. This two stage process of object instantiation and then method 
> invocation is a bit clumsy, necessitating a wrapper function.
> I would like to be able to do a simple Pig definition to declare a binding to 
> and instantiate instances of a Java class and invoke methods on these 
> instances.  In the case of Apache Commons Math distribution 
> BetaDistributionImpl, I must parameterize the objection creation with values 
> from my data I am processing with Pig followed by an invocation of a method 
> with a third parameter.
> {code}
> register commons-math-2.1.jar;
> define (new 
> org.apache.commons.math.distribution.BetaDistributionImpl((double) alpha, 
> (double) beta))
>             . cumulativeProbability((double) x) BetaIncomplete(x, alpha, beta)
> {code}
> Writing a Pig Eval<Double> wrapper function that does the same thing requires 
> about 100 lines of Java code to implement the binding to do all the necessary 
> comments, imports, parameter coercions, exception handling and output scheme 
> declarations.  And that's just one wrapper for one method.  The class has on 
> the order of 10-20 methods and there are on the order of 100-200 classes.
> And alternate form to consider is if I could just say something like:
> {code}
> register commons-math-2.1.jar;
> import org.apache.commons.math.distribution.BetaDistributionImpl as BetaDist;
> B = foreach A as
>        alpha,
>        beta,
>        x,
>        BetaDist(alpha,beta).cumulativeProbability(x) as prob;
> {code}
> Ideally I'd be able to register or include a list of all the bindings to the 
> library.
> Of course in the case, Pig should automatically coerce all parameters to 
> their corresponding implementation types e.g. a double parameter in the Java 
> function would dictate that Pig coerce int, long, float, double, chararray, 
> and bytearray to double automagically (albeit some compiler warning might be 
> warranted).
> One question about this proposal is how to handle methods that throw 
> exceptions such as:
> {code}
> public double cumulativeProbability(double x) throws MathException
> {code}
> I think I would propose that Pig provide a means for handling the exception 
> case such as a simple annotation in the declaration:
> {code}
> register commons-math-2.1.jar;
> import org.apache.commons.math.distribution.BetaDistributionImpl as BetaDist, 
> return null on (MathException, Exception);
> {code}
> Or we could get even more fancy and permit wholesale default handling for 
> every method that might throw an exception:
> {code}
> register commons-math-2.1.jar as ApacheMathCommons;
> ApacheMathCommons warn and return null on (MathException, AnyException);
> import org.apache.commons.math.distribution.BetaDistributionImpl as BetaDist;
> {code}
> I'm sure if people think about it, there are probably potentially cleaner 
> ways to import the bindings and handle exceptions cases.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to