[
https://issues.apache.org/jira/browse/PIG-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565975#action_12565975
]
Olga Natkovich commented on PIG-58:
-----------------------------------
This is in reply to
https://issues.apache.org/jira/browse/PIG-58?focusedCommentId=12565956#action_12565956
>> Given the use of a distinct preprocessor, are there any existing ones that
>> make sense?
I found this one: http://antenna.sourceforge.net/wtkpreprocess.php, which
basically implements a C-style preprocessor for Java, but I am not sure that we
want or need to go this far. See my comments on the CPP preprocessor.
>> Does the pig language support literals with whitespace, carriage-returns
>> and so on? If so, how does one define a parameter in a file with such
>> characters in them?
I believe that Pig allows string constants to contain arbitrary text, including
whitespace and carriage returns. To put such a value in the parameter file, you
would enclose it in quotes, and we can simply allow literals to span multiple
lines. I can clarify this in the document.
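To illustrate the idea, here is a minimal sketch of reading such a parameter
file. The file format is not settled in this discussion, so everything here is
an assumption: entries of the form name=value, single quotes around values
that contain whitespace or line breaks, and no escaping of quotes inside a
value. The names ENTRY_RE and parse_param_file are hypothetical.

```python
import re

# Assumed entry format: name=value or name='value'.
# A negated character class also matches newlines, so a quoted
# value may span several lines.
ENTRY_RE = re.compile(r"(\w+)\s*=\s*(?:'([^']*)'|(\S+))")


def parse_param_file(text):
    """Return a dict of parameter names to values (hypothetical format)."""
    params = {}
    for match in ENTRY_RE.finditer(text):
        name, quoted, bare = match.groups()
        # Prefer the quoted form; an empty quoted value is still valid.
        params[name] = quoted if quoted is not None else bare
    return params
```

For example, a file containing date=20080110 followed by a quoted value that
breaks across two lines would yield two entries, with the line break kept
inside the second value.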
>> How does this approach compare to solutions used in SQL?
As far as I could tell, this is not part of standard SQL. In MySQL, users can
define variables with the SET statement and then use them in their SQL statements:
http://dev.mysql.com/doc/refman/5.0/en/user-variables.html.
> parameterized Pig scripts
> -------------------------
>
> Key: PIG-58
> URL: https://issues.apache.org/jira/browse/PIG-58
> Project: Pig
> Issue Type: New Feature
> Reporter: Olga Natkovich
>
> This feature has been requested by several users and would be very useful in
> conjunction with streaming. The feature would allow a Pig script to include
> parameters that are replaced at run time. For instance, if your script needs
> to run on a daily basis over the data of the previous day, you would be able
> to reuse the same script and provide the date as a run-time parameter.
> Example:
> =======
> Pig script myscript.pig:
> A = load '/data/mydata/%date%';
> B = filter A by $0>'5';
> .....
> Pig command line:
> pig -param date='20080110' myscript.pig
> Proposed interface and implementation:
> Interface:
> =======
> (0) Substitution will only be supported with Pig script files.
> (1) Parameters are specified on the command line via the -param <param>=<val>
> construct. Multiple parameters can be specified. They are applied to the
> script in the order they are specified on the command line.
> (2) Default values for the parameters can be specified within the script via
> a declare statement:
> declare <param>=<value>
> (3) Within the script, the parameter will be enclosed in %%. \% can be used
> to escape.
> Implementation:
> ============
> Use a preprocessor to do the substitution. The preprocessor would be invoked by
> Main before grunt is instantiated and would do the following:
> - create a new file in a temp location
> - build a hash of parameters from the command line and declare statements
> - for each line in the original script
> if this is a declare line, skip it
> else for each unescaped pattern %<identifier>% look for a match in the hash.
> Replace, if found. Write the line to the temp file.
> - pass the temp file to grunt.
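
The substitution steps in the quoted proposal can be sketched roughly as
follows. This is an illustration in Python, not the actual Pig implementation.
The function and pattern names are hypothetical, and several details are
assumptions the proposal does not settle: declare values are taken verbatim,
command-line values override declare defaults, an escaped \% is left untouched,
and unknown parameters are left in place. Writing the result to a temp file and
handing it to grunt is omitted.

```python
import re

# Hypothetical patterns for the proposed syntax.
DECLARE_RE = re.compile(r"^\s*declare\s+(\w+)\s*=\s*(.*?)\s*;?\s*$")
# Match an escaped percent first so that \%name% is never substituted.
TOKEN_RE = re.compile(r"\\%|%(\w+)%")


def preprocess(lines, cli_params):
    """Return the script lines with %name% placeholders substituted."""
    params = {}
    # Collect defaults from declare lines (values taken verbatim).
    for line in lines:
        match = DECLARE_RE.match(line)
        if match:
            params[match.group(1)] = match.group(2)
    # Assumption: command-line values override declared defaults.
    params.update(cli_params)

    def replace(match):
        name = match.group(1)
        if name is None:
            return match.group(0)  # escaped \% stays as written
        # Unknown parameters are left in place rather than removed.
        return params.get(name, match.group(0))

    output = []
    for line in lines:
        if DECLARE_RE.match(line):
            continue  # declare lines are not passed on
        output.append(TOKEN_RE.sub(replace, line))
    return output
```

With the example script from the description, calling preprocess with
{"date": "20080110"} would turn the load path into '/data/mydata/20080110' and
leave the filter line unchanged.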
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.