[
https://issues.apache.org/jira/browse/PIG-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Packer updated PIG-3359:
---------------------------------
Attachment: PIG-3359-v2.diff
Hi, thanks for taking a look at this. I simply wasn't aware of that
pattern--I've uploaded an updated patch with registerCode and registerJar moved
back to PigServer. I think it's a bit weird that PigServer instantiates
QueryParserDriver (QPD), which then instantiates a new PigServer, but I agree
it's better to be consistent with what's already there.
I kept doParamSubstitution(...) in PigContext because the param lists are
stored there. I noticed that in some code paths PigContext wasn't being informed
of which parameters were being used (e.g. GruntParser has its own param
substitution method for some reason), so I wanted to make sure that at least
each call to doParamSubstitution updates the params List object.
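As a minimal sketch of that idea (not the code in the patch): the class name
ParamContextSketch, the "key=value" storage format, and the $NAME-only syntax
below are all assumptions made for illustration, and Pig's real PigContext and
its preprocessor differ. The point is only that the substitution method itself
records the parameters, so no caller can substitute without the context
knowing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a context object that owns the param list.
// This is NOT Pig's actual PigContext API.
class ParamContextSketch {
    // Simplified syntax: only $NAME is recognised (assumption).
    private static final Pattern PARAM =
            Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    // Parameters seen so far, stored as "key=value" strings (assumption).
    private final List<String> params = new ArrayList<>();

    List<String> getParams() {
        return params;
    }

    // Substitutes $NAME occurrences and records every supplied parameter,
    // so the context is updated on each call.
    String doParamSubstitution(String script, Map<String, String> values) {
        for (Map.Entry<String, String> e : values.entrySet()) {
            String entry = e.getKey() + "=" + e.getValue();
            if (!params.contains(entry)) {
                params.add(entry);
            }
        }
        Matcher m = PARAM.matcher(script);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            // Unknown parameters are left untouched in this sketch.
            String replacement = (value != null) ? value : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a map containing K=5, a line such as DEFINE knn KNearestNeighbors('$K');
would come back with '5' in place of '$K', and getParams() would then contain
"K=5" no matter which code path made the call.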
Finally, the updated patch adds global param substitutions in macros. The use
case is as follows: a macro file defines a Java EvalFunc UDF with a constructor
argument, which needs to be configurable by the main Pig script. In my case,
this was for a K-Nearest-Neighbors UDF; I wanted to be able to specify K.
Thanks for bearing with me--this is my first time touching the main Pig
codebase outside of Piggybank and I'm still trying to grasp the overall
architecture.
> Register Statements and Param Substitution in Macros
> ----------------------------------------------------
>
> Key: PIG-3359
> URL: https://issues.apache.org/jira/browse/PIG-3359
> Project: Pig
> Issue Type: Bug
> Components: parser
> Reporter: Jonathan Packer
> Assignee: Jonathan Packer
> Attachments: PIG-3359_test.tar.gz, PIG-3359-v1.diff, PIG-3359-v2.diff
>
>
> There are some gaps in the functionality of macros that I've made a patch to
> address. The goal is to provide everything you'd need to make reusable
> algorithms libraries.
> 1. You can't register udfs inside a macro
> 2. Parameter substitutions aren't done inside macros
> 3. Resources (including macros) should not be redundantly acquired if they
> are already present.
> Rohini's patch https://issues.apache.org/jira/browse/PIG-3204 should address
> problem 3 where Pig reparses everything every time it reads a line, but there
> still would be a problem if two separate files import the same macro / udf
> file.
> To get this working, I moved methods for registering jars/udfs and param
> substitution from PigServer to PigContext so they can be accessed in
> QueryParserDriver which processes macros (QPD was already passed a PigContext
> reference). Is that ok?