[
https://issues.apache.org/jira/browse/PIG-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13127979#comment-13127979
]
Jonathan Coveney commented on PIG-2317:
---------------------------------------
I think this looks really excellent. Very Ruby. I think we should allow the
simpler syntax, but also offer a more robust syntax for people who want it.
Wrapping things in a class should simplify things a lot, especially because we
can provide helper methods to support it.
The more robust syntax could look something like
{code}
class MyUdfs < PigUdf
  outputSchema "str:chararray"
  def helloworld
    "Hello World"
  end

  filterFunc
  def amihappy
    true
  end
end
{code}
It could also support
{code}
NoImNot = PigUdf.filterFunc do |x|
  # implicit return; an explicit `return` in a plain block exits the enclosing scope
  !x
end
{code}
The benefit of the latter is brevity; the benefit of the former is that helper
functions and the like are easier to write.
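As a rough sketch of how both forms might work, assuming a hypothetical PigUdf
base class (nothing here is an existing Pig or JRuby API, and `udf_info` is a
name made up for illustration): the class-level markers could remember their
metadata and attach it to the next method defined, using Ruby's `method_added`
hook, while `filterFunc` with a block could simply hand back the block.

```ruby
# Hypothetical base class; a sketch, not the actual patch.
class PigUdf
  class << self
    # Metadata for each annotated method, keyed by method name.
    def udf_info
      @udf_info ||= {}
    end

    # Class-level marker: the next method defined returns this schema.
    def outputSchema(schema)
      @pending = { type: :eval, schema: schema }
    end

    # With a block: return it as a standalone filter (the brief syntax).
    # Without a block: mark the next method defined as a filter.
    def filterFunc(&block)
      return block if block
      @pending = { type: :filter, schema: "boolean" }
    end

    private

    # Ruby calls this hook every time an instance method is defined.
    def method_added(name)
      super
      return unless @pending
      udf_info[name] = @pending
      @pending = nil
    end
  end
end

class MyUdfs < PigUdf
  outputSchema "str:chararray"
  def helloworld
    "Hello World"
  end

  filterFunc
  def amihappy
    true
  end
end

NoImNot = PigUdf.filterFunc { |x| !x }

MyUdfs.udf_info[:helloworld][:schema]  # => "str:chararray"
MyUdfs.udf_info[:amihappy][:type]      # => :filter
NoImNot.call(true)                     # => false
```

Methods defined without a preceding marker are left out of `udf_info`, so
plain helper methods stay private to the class as far as Pig is concerned.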
Also, and this is something the other scripting languages haven't tackled but I
think we could do pretty easily, we could support algebraic UDFs, so that
serious users could write efficient ones:
{code}
class Count < AlgPigUdf
  def initial(t)
  end

  def intermed(t)
  end

  def final(t)
  end
end
{code}
Another possible syntax:
{code}
Count = PigUdf.algebraic do |udf|
  udf.initial :func1
  udf.intermed :func2
  udf.final :func3
end
{code}
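To make the three stages concrete, here is a minimal sketch of a filled-in
Count using the subclass form. AlgPigUdf is a hypothetical stand-in, and
`simulate` and `chunk_size` are made-up names that only imitate locally what
Pig would do across map, combine and reduce. The argument shapes are
simplified assumptions: `initial` sees one record, while `intermed` and `final`
see a list of partial counts.

```ruby
# Hypothetical base class standing in for the proposed AlgPigUdf.
class AlgPigUdf
  # Local stand-in for the map -> combine -> reduce flow.
  def self.simulate(records, chunk_size = 2)
    udf = new
    # Map side: one partial result per input record.
    partials = records.map { |t| udf.initial(t) }
    # Combiner: merge partial results chunk by chunk.
    combined = partials.each_slice(chunk_size).map { |chunk| udf.intermed(chunk) }
    # Reduce side: merge whatever partial results remain.
    udf.final(combined)
  end
end

class Count < AlgPigUdf
  # Each input record contributes a count of 1.
  def initial(t)
    1
  end

  # Partial counts are merged by summing them.
  def intermed(partials)
    partials.reduce(0, :+)
  end

  # The final count is the sum of the remaining partial counts.
  def final(partials)
    partials.reduce(0, :+)
  end
end

Count.simulate(%w[a b c d e])  # => 5
```

The efficiency comes from `intermed`: because partial counts can be merged
early, far less data has to reach the final stage.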
Now, one issue comes up with user-named classes that didn't come up before: in
the previous example we knew exactly what the class was called, since we
wrapped the user's code in it. Now we don't. However, we are having users
require the gem, so ideally we'd have a way to find out which classes they
defined, among other things. We could of course do the same trick of wrapping
them in a class, but then you have the same issue. So ideally it would be all
Ruby: after the script has run, functions defined in the gem give you
information on the classes, how to invoke them, etc. (i.e. all the stuff Pig
needs, but conveniently exposed in Ruby so that the user can manipulate it as
well). My ideal would be being able to do something like
{code}
PigUdf.getClasses               # returns a list of the defined classes
PigUdf.getFunctions(className)  # returns a hash of the defined functions and their schemas
{code}
And more, who knows.
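One way this could be done in pure Ruby, sketched under the assumption of a
hypothetical PigUdf base class shipped in the gem (REGISTRY is a made-up name,
and here getFunctions takes the class object rather than its name): Ruby's
`inherited` hook fires whenever a subclass is created, so the gem can record
every UDF class without knowing its name in advance.

```ruby
# Hypothetical base class; a sketch, not an existing API.
class PigUdf
  # Every subclass, mapped to its annotated functions and their schemas.
  REGISTRY = {}

  class << self
    # Ruby calls this hook whenever a subclass of PigUdf is created.
    def inherited(subclass)
      super
      REGISTRY[subclass] = {}
    end

    # Remember the schema for the next method defined.
    def outputSchema(schema)
      @pending_schema = schema
    end

    # Ruby calls this hook whenever an instance method is defined.
    def method_added(name)
      super
      return unless @pending_schema
      REGISTRY[self][name] = @pending_schema
      @pending_schema = nil
    end

    # List of the classes defined so far.
    def getClasses
      REGISTRY.keys
    end

    # Hash of function name => schema for one class.
    def getFunctions(klass)
      REGISTRY.fetch(klass, {}).dup
    end
  end
end

class MyUdfs < PigUdf
  outputSchema "str:chararray"
  def helloworld
    "Hello World"
  end
end

PigUdf.getClasses             # => [MyUdfs]
PigUdf.getFunctions(MyUdfs)   # => {:helloworld=>"str:chararray"}
```

Pig could call these after evaluating the script, and the user could call
them from Ruby for the same information.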
If you can think of a more Ruby-esque syntax that allows this (I'm sure you can
:), do tell. It'd be easy for PigUdf.new, but different for the subclasses.
Would love your thoughts.
> Ruby/Jruby UDFs
> ---------------
>
> Key: PIG-2317
> URL: https://issues.apache.org/jira/browse/PIG-2317
> Project: Pig
> Issue Type: New Feature
> Reporter: Jacob Perkins
> Assignee: Jacob Perkins
> Priority: Minor
> Fix For: 0.9.2
>
> Attachments: jruby_scripting.patch, jruby_scripting_2_real.patch
>
>
> It should be possible to write UDFs in Ruby. These UDFs will be registered in
> the same way as python and javascript UDFs.