Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by OlgaN:
http://wiki.apache.org/pig/ParameterSubstitution

------------------------------------------------------------------------------
  In addition to supplying a parameter value directly, a user can supply a 
command whose output becomes the parameter value. This is done with the 
`declare` statement. 
  
  {{{
- #declare CMD `generate_date`
+ %declare CMD `generate_date`
  A = load '/data/mydata/$CMD';
  B = filter A by $0>'5';
  .....
@@ -40, +40 @@

  
  In this example, Pig executes the `generate_date` command when it 
encounters the `declare` statement and assigns the result (stdout) to the 
parameter `CMD`. The value of `CMD` is substituted before the load statement 
runs.
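
The behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Pig's actual preprocessor: the names `preprocess`, `run_shell` and `DECLARE` are hypothetical, the command is assumed to run through the system shell, and stripping the trailing newline from stdout is an assumption as well.

```python
import re
import subprocess

# Hypothetical pattern for: %declare NAME `command`
DECLARE = re.compile(r"^%declare\s+(\w+)\s+`([^`]*)`\s*$")
PARAM = re.compile(r"\$(\w+)")


def run_shell(command):
    """Run a command through the shell and return its stdout.

    Dropping the trailing newline is an assumption of this sketch.
    """
    done = subprocess.run(command, shell=True, capture_output=True,
                          text=True, check=True)
    return done.stdout.rstrip("\n")


def preprocess(script, run=run_shell):
    """Resolve %declare lines, then substitute $NAME in the lines that follow."""
    params = {}
    output = []
    for line in script.splitlines():
        match = DECLARE.match(line.strip())
        if match:
            name, command = match.groups()
            params[name] = run(command)  # stdout becomes the parameter value
            continue
        # Unknown names such as the positional reference $0 are left untouched.
        output.append(PARAM.sub(lambda m: params.get(m.group(1), m.group(0)), line))
    return "\n".join(output)
```

Passing a stub for `run` makes it possible to try the substitution without having a real `generate_date` command installed.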
  
- `declare` statement starts with `#` to indicate that it is part of the 
preprocessor that performs parameter substitution rather than Pig language 
itself. 
+ `declare` statement starts with `%` to indicate that it is part of the 
preprocessor that performs parameter substitution rather than Pig language 
itself. 
  
  `declare` can also be used to define one parameter in terms of others:
  
  {{{
- #declare param1 ($param2 + $param3)
+ %declare param1 ($param2 + $param3)
  }}}
  
  With the exception of string literals, which can span multiple lines, 
`declare` is a single-line command in the initial release.
@@ -53, +53 @@

  The command specified within a `declare` statement can itself take 
parameters, which must be substituted as well.
  
  {{{
- #declare CMD `generate_date $date`
+ %declare CMD `generate_date $date`
  A = load '/data/mydata/$CMD';
  B = filter A by $0>'5';
  .....
@@ -64, +64 @@

  Note that variables passed on the command line must be resolved prior to the 
declare statement. The following sequence would cause an error:
  
  {{{
- #declare A `cmd1 $B`
+ %declare A `cmd1 $B`
- #declare $B `cmd2`
+ %declare $B `cmd2`
  }}}
  
  The command name itself can be a parameter.
  
  {{{
- #declare CMD `$mycmd $date`
+ %declare CMD `$mycmd $date`
  A = load '/data/mydata/$CMD';
  B = filter A by $0>'5';
  .....
@@ -108, +108 @@

  
  The `declare` command takes the highest precedence. Defining the same 
parameter in multiple `declare` commands is an error: Pig reports an error 
message and aborts processing.
  
- Default parameter values can be specified in a script using `#default <param> 
<value>` statement. This statement is identical to `declare` except that it has 
the lowest precedence meaning that its value is only used if it has not been 
defined before.
+ Default parameter values can be specified in a script using `%default <param> 
<value>` statement. This statement is identical to `declare` except that it has 
the lowest precedence meaning that its value is only used if it has not been 
defined before.
  
  {{{
- #default cmd=generate_name
+ %default cmd=generate_name
  }}}
+ 
+ Values specified on the command line or in a configuration file can be 
commands or expressions that include other parameters. Their format is 
identical to the `declare` and `default` format. The same rule applies: 
variables must be resolved before they can be used. The following order is used:
+ 
+  1. Configuration files will be scanned in the order they are specified on 
the command line. Within each file, the parameters are processed in the order 
they are specified.
+  2. Command line parameters will be scanned in the order they are specified 
on the command line.
+  3. declare/default commands will be processed in the order they appear in 
the pig script.
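
The ordering above, together with the rule that a variable must be resolved before it is used, can be sketched as follows. This is a minimal Python illustration, not Pig's implementation: `substitute`, `resolve` and the tuple-based input shapes are hypothetical, only `$name` references are handled (no backtick commands), and letting a later configuration-file or command-line value overwrite an earlier one is an assumption of the sketch.

```python
import re

PARAM = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def substitute(text, params):
    """Replace each $name in text; fail if a name has not been resolved yet."""
    def lookup(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"undefined parameter: {name}")
        return params[name]
    return PARAM.sub(lookup, text)


def resolve(config_files, command_line, script_statements):
    """Resolve parameters in the order: config files, command line, script.

    config_files: list of files, each a list of (name, raw_value) pairs
    command_line: list of (name, raw_value) pairs
    script_statements: list of (kind, name, raw_value), kind is
        "declare" or "default"
    """
    params = {}
    # Steps 1 and 2: configuration files first, then command-line values.
    external = [pair for pairs in config_files for pair in pairs]
    external += list(command_line)
    for name, raw in external:
        params[name] = substitute(raw, params)
    # Step 3: declare/default statements in script order.
    declared = set()
    for kind, name, raw in script_statements:
        if kind == "default":
            if name not in params:  # lowest precedence: only fills a gap
                params[name] = substitute(raw, params)
        else:
            if name in declared:
                raise ValueError(f"parameter declared twice: {name}")
            declared.add(name)
            params[name] = substitute(raw, params)  # highest precedence
    return params
```

With this sketch, a value that refers to a parameter defined only later in the sequence fails with an undefined-parameter error, which mirrors the failing `declare` sequence shown earlier on the page.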
  
  === Debugging ===
  
