[
https://issues.apache.org/jira/browse/PIG-58?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12586897#action_12586897
]
Alan Gates commented on PIG-58:
-------------------------------
[Alan before] 1) Dryrun isn't a good name for the command line option. It's far
too generic. Knowing nothing about parameter substitution and looking at the
usage statement of pig, I'd assume that dryrun meant that pig was going to parse
my query but not run it. I would suggest a name like preproc or preproconly.
[Olga] I was hoping that we would extend dryrun over time to include things
that you were talking about. I did not mean it to stay parameter substitution
specific. What do you think?
Seems fine. We should put in a comment articulating that in the command line
option.
[Alan before] 2) Why did we choose to put all of the data for the unit tests in
separate files in a test/data directory? To date the approach has been to have
the tests themselves generate the data they need on the fly. There are pros and
cons to switching, but I think we should discuss and have a policy of how unit
tests handle their data before we start adding a directory with a lot of files
in it.
[Olga] This is how the test cases were created when they were given to me.
Since I did not have a strong opinion and did not have time to redo them, I left
them as is. I am not really sure that we need a policy on this. What are your
concerns with different tests doing what makes most sense for the kind of
things they are testing? Let me know if you think it is ok to leave it as is or
if you think that it needs to be reworked before committing the changes.
I guess I like having one way to do things instead of 3 (does that mean I'm
more of a Python programmer than a Perl programmer?). But it isn't worth delaying
getting this in for.
[Alan before] 5) PigFileParser.jj doesn't skip over commented lines in the pig
code. It should ignore anything on a line after --.
[Olga] The logic is to not change any lines including comments that don't have
parameters to be substituted. The Grunt parser will skip the comments.
I was referring to the fact that it will do substitutions in comments. If we
had comments that could be terminated, this could lead to strange behavior
(consider C-style /* comment */ comments and then defining a parameter that
contained */). But our comments are currently to line end only (--), so it
should be fine.
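
To make the concern concrete, here is a small Python sketch (not code from the
patch). The substitute() helper and the "note" parameter are hypothetical and
only mimic %name% replacement. It shows why a value containing */ would matter
for terminated comments but not for comments that run to the end of the line:

```python
import re

def substitute(line, params):
    # Hypothetical stand-in for the preprocessor: replace %name% with its
    # value when the name is known, otherwise leave the pattern untouched.
    return re.sub(r"%(\w+)%", lambda m: params.get(m.group(1), m.group(0)), line)

# A parameter value that happens to contain a C-style comment terminator.
params = {"note": "done */ rm = load 'x';"}

# Line comment: everything after -- is still a comment after substitution.
line_comment = substitute("-- status: %note%", params)

# Hypothetical block comment: the substituted */ would close the comment
# early, leaving the rest of the value outside the comment.
block_comment = substitute("/* status: %note% */", params)

print(line_comment)
print(block_comment)
```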
> parameterized Pig scripts
> -------------------------
>
> Key: PIG-58
> URL: https://issues.apache.org/jira/browse/PIG-58
> Project: Pig
> Issue Type: New Feature
> Reporter: Olga Natkovich
> Attachments: PIG-58_v1.patch, PIG-58_v2
>
>
> This feature has been requested by several users and would be very useful in
> conjunction with streaming. The feature would allow a pig script to include
> parameters that are replaced at run time. For instance, if your script needs
> to run on a daily basis over the data of the previous day, you would be able
> to use the script and provide a date as a run-time parameter to it.
> Example:
> =======
> Pig script myscript.pig:
> A = load '/data/mydata/%date%';
> B = filter A by $0>'5';
> .....
> Pig command line:
> pig -param date='20080110' myscript.pig
> Proposed interface and implementation:
> Interface:
> =======
> (0) Substitution will be only supported with pig script files.
> (1) Parameters are specified on the command line via -param <param>=<val>
> construct. Multiple parameters can be specified. They are applied to the
> script in the order they are specified on the command line.
> (2) Default values for the parameters can be specified within the script via
> declare statement:
> declare <param>=<value>
> (3) Within the script the parameter will be enclosed in %%. \% can be used
> to escape.
> Implementation:
> ============
> Use preprocessor to do the substitution. The preprocessor would be invoked by
> Main before grunt is instantiated and do the following:
> - create a new file in temp location
> - build a hash of parameters from command line and declare statement
> - for each line in the original script
>     if this is a declare line, skip it
>     else for each unescaped pattern %<identifier>% look for a match in the hash.
>     Replace, if found. Write the line to the temp file.
> - pass the temp file to grunt.
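
The steps above can be sketched roughly as follows. This is an illustrative
Python sketch, not the attached patch. The names preprocess,
preprocess_to_temp, PARAM and DECLARE are hypothetical. It assumes that values
given on the command line override declare defaults, that identifiers look
like [A-Za-z_][A-Za-z0-9_]*, and that escaped \% sequences are passed through
unchanged. The description does not settle any of these points.

```python
import re
import tempfile

# Unescaped %name%: the lookbehind rejects a pattern that starts with "\%".
PARAM = re.compile(r"(?<!\\)%([A-Za-z_][A-Za-z0-9_]*)%")
# Assumed declare syntax: declare <param>=<value>, optional trailing ";".
DECLARE = re.compile(r"^\s*declare\s+(\w+)\s*=\s*(.*?)\s*;?\s*$")

def preprocess(script_lines, cli_params):
    """Return the script lines with known %param% patterns substituted."""
    params = {}
    # Build the hash: defaults from declare statements first ...
    for line in script_lines:
        m = DECLARE.match(line)
        if m:
            params[m.group(1)] = m.group(2)
    # ... then command-line values (assumption: these take precedence).
    params.update(cli_params)

    out = []
    for line in script_lines:
        if DECLARE.match(line):
            continue  # declare lines are skipped, not written out
        # Replace each match found in the hash; leave unknown names as-is.
        out.append(PARAM.sub(lambda m: params.get(m.group(1), m.group(0)), line))
    return out

def preprocess_to_temp(script_path, cli_params):
    """Write the substituted script to a temp file and return its path."""
    with open(script_path) as f:
        lines = f.read().splitlines()
    tmp = tempfile.NamedTemporaryFile("w", suffix=".pig", delete=False)
    with tmp:
        for line in preprocess(lines, cli_params):
            tmp.write(line + "\n")
    return tmp.name  # this path would then be handed to grunt
```

For the example script, preprocess(lines, {"date": "20080110"}) would turn the
load line into A = load '/data/mydata/20080110'; and leave the filter line
unchanged.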