Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.
The following page has been changed by OlgaN:
http://wiki.apache.org/pig/ParameterSubstitution

------------------------------------------------------------------------------
  In addition to supplying parameter value, a user can supply a command to execute to generate a parameter value. This can be done using `declare` statement.

  {{{
- #declare CMD `generate_date`
+ %declare CMD `generate_date`
  A = load '/data/mydata/$CMD';
  B = filter A by $0>'5';
  .....
@@ -40, +40 @@
  For this example, pig would execute `generate_date` command when it encounters the `declare` statement and assigns the result (stdout) to parameter `CMD`. The value of `CMD` is substituted prior to running the load statement.

- `declare` statement starts with `#` to indicate that it is part of the preprocessor that performs parameter substitution rather than Pig language itself.
+ `declare` statement starts with `%` to indicate that it is part of the preprocessor that performs parameter substitution rather than Pig language itself.

  `declare` can also be used to define one parameter in terms of others:

  {{{
- #declare param1 ($param2 + $param3)
+ %declare param1 ($param2 + $param3)
  }}}

  With exception to string literals that can span multiple lines, for initial release, `declare` is a single-line command.
@@ -53, +53 @@
  The command specified within `declare` statement can take parameters which need to be substituted as well.

  {{{
- #declare CMD `generate_date $date`
+ %declare CMD `generate_date $date`
  A = load '/data/mydata/$CMD';
  B = filter A by $0>'5';
  .....
@@ -64, +64 @@
  Note that variables passed on the command line must be resolved prior to the declare statement. The following sequence would cause an error:

  {{{
- #declare A `cmd1 $B`
+ %declare A `cmd1 $B`
- #declare $B `cmd2`
+ %declare $B `cmd2`
  }}}

  Command name itself can be a parameter.

  {{{
- #declare CMD `$mycmd $date`
+ %declare CMD `$mycmd $date`
  A = load '/data/mydata/$CMD';
  B = filter A by $0>'5';
  .....
@@ -108, +108 @@
  `declare` command takes the highest precedence. Having multiple `declare` commands defining the same parameter is an error that results in an error message and abort of the processing.

- Default parameter values can be specified in a script using `#default <param> <value>` statement. This statement is identical to `declare` except that it has the lowest precedence meaning that its value is only used if it has not been defined before.
+ Default parameter values can be specified in a script using `%default <param> <value>` statement. This statement is identical to `declare` except that it has the lowest precedence meaning that its value is only used if it has not been defined before.

  {{{
- #default cmd=generate_name
+ %default cmd=generate_name
  }}}
+
+ Values specified from the command line as well as configuration file can be commands or expressions including other parameters. Their format is identical to `declare` and `default` format. Also, the same rule that variables need to be resolved before they can be used applies. The following order will be used:
+
+  1. Configuration files will be scanned in the order they are specified on the command line. Within each file, the parameteres are processed in the order they are specified.
+  2. Command line parameters will be scanned in the order they are specified on the command line.
+  3. declare/default commands will be processed in the order they appear in the pig script.

  === Debugging ===