[ 
https://issues.apache.org/jira/browse/PIG-2421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173823#comment-13173823
 ] 

Jonathan Coveney commented on PIG-2421:
---------------------------------------

As a result of this JIRA, https://issues.apache.org/jira/browse/PIG-2430, I 
thought I'd mention some things Julien and I had chatted about.

Right now, the whole getArgToFuncMapping mechanism is pretty rough around the 
edges. It doesn't handle varargs, it relies on a somewhat cumbersome FuncSpec, 
etc. It could be a lot more elegant to let people register either an EvalFunc 
or an EvalFuncFactory. This would decouple the logic that chooses which 
EvalFunc to create, based on whatever criteria, from the EvalFuncs themselves. 
The factory could have any number of parameters and any number of methods, but 
it would let people more easily handle the various cases that come up in that 
JIRA. We could, of course, provide a lot of convenience methods and build 
simpler interfaces on top for people to extend, but I think part of the 
difficulty of Pig is that you have your high-level EvalFunc, and if you want 
to do anything beyond that, you're knee-deep in the code. It'd be nice to have 
abstractions that provide a lot more power and let you write more elegant 
layers on top to expose to people who aren't real power users.
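
To make the factory idea a bit more concrete, here is a rough, self-contained 
sketch. Nothing in it is Pig API: SimpleEvalFunc, EvalFuncFactory (as written 
here), LongSum, DoubleSum, SumFactory and FactoryDemo are all hypothetical 
names made up for illustration, and plain Java lists stand in for Pig tuples 
and schemas. It only shows how a factory could pick an implementation from the 
argument types, including a variable number of arguments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a UDF; not the real Pig EvalFunc class.
interface SimpleEvalFunc {
    Object exec(List<Object> args);
}

// Hypothetical factory: chooses an implementation from the argument types.
interface EvalFuncFactory {
    SimpleEvalFunc create(List<Class<?>> argTypes);
}

// Sums any number of Long arguments and returns a Long.
final class LongSum implements SimpleEvalFunc {
    @Override
    public Object exec(List<Object> args) {
        long total = 0;
        for (Object arg : args) {
            total += (Long) arg;
        }
        return total;
    }
}

// Sums any number of numeric arguments and returns a Double.
final class DoubleSum implements SimpleEvalFunc {
    @Override
    public Object exec(List<Object> args) {
        double total = 0;
        for (Object arg : args) {
            total += ((Number) arg).doubleValue();
        }
        return total;
    }
}

// The selection logic lives here, separate from the functions themselves.
final class SumFactory implements EvalFuncFactory {
    @Override
    public SimpleEvalFunc create(List<Class<?>> argTypes) {
        if (argTypes.isEmpty()) {
            throw new IllegalArgumentException("need at least one argument");
        }
        boolean allLong = true;
        for (Class<?> type : argTypes) {
            if (!Number.class.isAssignableFrom(type)) {
                throw new IllegalArgumentException(
                        "unsupported argument type: " + type.getName());
            }
            if (type != Long.class) {
                allLong = false;
            }
        }
        return allLong ? new LongSum() : new DoubleSum();
    }
}

final class FactoryDemo {
    // Derives the argument types, asks the factory for a function, runs it.
    static Object call(EvalFuncFactory factory, Object... args) {
        List<Class<?>> types = new ArrayList<>();
        for (Object arg : args) {
            types.add(arg.getClass());
        }
        return factory.create(types).exec(Arrays.asList(args));
    }

    public static void main(String[] argv) {
        System.out.println(call(new SumFactory(), 1L, 2L, 3L));
        System.out.println(call(new SumFactory(), 1L, 2.5));
    }
}
```

In this sketch the all-Long call is routed to LongSum and the mixed call to 
DoubleSum, while a non-numeric argument is rejected by the factory before any 
function runs. A real design would presumably work against Pig schemas rather 
than Java classes; that part is left open here.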
                
> EvalFuncs need redesigned
> -------------------------
>
>                 Key: PIG-2421
>                 URL: https://issues.apache.org/jira/browse/PIG-2421
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.11
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: PIG-newudf.patch, examples.patch
>
>
> The current EvalFunc interface (and associated Algebraic and Accumulator 
> interfaces) has grown unwieldy.  In particular, people have noted the 
> following issues:
> # Writing a UDF requires a lot of boilerplate code.
> # Since UDFs always pass a tuple, users are required to manage their own type 
> checking for input.
> # Declaring schemas for output data is confusing.
> # Writing a UDF that accepts multiple different parameters (using 
> getArgToFuncMapping) is confusing.
> # Using Algebraic and Accumulator interfaces often entails duplicating code 
> from the initial implementation.
> # UDF implementors are exposed to the internals of Pig since they have to 
> know when to return a tuple (Initial, Intermediate) and when not to (exec, 
> Final).
> # The separation of Initial, Intermediate, and Final into separate classes 
> forces code duplication and makes it hard for UDFs in other languages to use 
> those interfaces.
> # There is unused code in the current interface that occasionally causes 
> confusion (e.g. isAsynchronous).
> Any change must be done in a way that allows existing UDFs to continue 
> working essentially forever.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
