[ https://issues.apache.org/jira/browse/PIG-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Packer updated PIG-3359:
---------------------------------

    Attachment: PIG-3359-v5.diff

The latest patch should fix the corner cases raised on RB (Review Board). I
kept doParamSubstitutionForMacros because the regular parser from
PigScriptParser.jj did not work for the following macro file:

/*
%default BINARY_FUNC 'divide'
*/
/* a comment */ %default BINARY_FUNC 'multiply'

REGISTER 'udfs.py' USING jython AS udfs;

DEFINE ApplyBinaryFunc(data)
RETURNS funced {
    $funced = FOREACH $data GENERATE udfs.$BINARY_FUNC($0, $1);
};

DEFINE MacroThatAppliesCross(data1, data2)
RETURNS crossed {
    $crossed = CROSS $data1, $data2;
};

With this file, the regular parser does not substitute $BINARY_FUNC inside the
macro body. I don't know how the .jj files work, but they look complicated, and
I think my latest regex solution, while admittedly hacky, should handle 95%+ of
situations (including these test macros).
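
For illustration only, a regex-based pass of the kind described above might
look roughly like the following Python sketch. It is not the code in
PIG-3359-v5.diff (which is Java), and the names substitute_defaults,
BLOCK_COMMENT, DEFAULT_DECL and PARAM_REF are hypothetical. It assumes that
block comments are stripped first, that each %default sits on its own line
with a single-quoted value, and that later declarations override earlier ones.

```python
import re

# Hypothetical names; this is a sketch, not the patch's implementation.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
DEFAULT_DECL = re.compile(
    r"^[ \t]*%default[ \t]+(\w+)[ \t]+'([^']*)'[ \t]*$", re.MULTILINE
)
PARAM_REF = re.compile(r"\$([A-Za-z_]\w*)")


def substitute_defaults(script):
    """Apply %default values to $NAME references; unknown names are kept."""
    # Strip block comments first so a commented-out %default is ignored.
    uncommented = BLOCK_COMMENT.sub("", script)

    # Collect declarations; a later %default for the same name wins.
    defaults = dict(DEFAULT_DECL.findall(uncommented))

    # Remove the declaration lines from the text that is returned.
    body = DEFAULT_DECL.sub("", uncommented)

    def replace(match):
        name = match.group(1)
        # Macro arguments such as $data are not declared via %default,
        # so they fall through unchanged. Positional refs like $0 never
        # match PARAM_REF because it requires a leading letter or underscore.
        return defaults.get(name, match.group(0))

    return PARAM_REF.sub(replace, body)
```

Applied to the macro file above, this sketch would rewrite udfs.$BINARY_FUNC to
udfs.multiply and leave $data, $funced, $0 and $1 alone. It shares the
limitations of any regex approach: it does not handle "--" line comments, and
a "/*" inside a string literal would be treated as the start of a comment.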
                
> Register Statements and Param Substitution in Macros
> ----------------------------------------------------
>
>                 Key: PIG-3359
>                 URL: https://issues.apache.org/jira/browse/PIG-3359
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>            Reporter: Jonathan Packer
>            Assignee: Jonathan Packer
>         Attachments: PIG-3359_test.tar.gz, PIG-3359-v1.diff, 
> PIG-3359-v2.diff, PIG-3359-v3.diff, PIG-3359-v3-test-failures.txt, 
> PIG-3359-v4.diff, PIG-3359-v5.diff
>
>
> There are some gaps in the functionality of macros that I've made a patch to
> address. The goal is to provide everything you'd need to make reusable
> algorithm libraries.
> 1. You can't register UDFs inside a macro.
> 2. Parameter substitutions aren't done inside macros.
> 3. Resources (including macros) are redundantly acquired even if they are
> already present.
> Rohini's patch (https://issues.apache.org/jira/browse/PIG-3204) should
> address the part of problem 3 where Pig reparses everything every time it
> reads a line, but there would still be a problem if two separate files import
> the same macro / UDF file.
> To get this working, I moved the methods for registering jars/UDFs and for
> param substitution from PigServer to PigContext so that they can be accessed
> from QueryParserDriver, which processes macros (QPD was already passed a
> PigContext reference). Is that ok?

