[ https://issues.apache.org/jira/browse/PIG-564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704455#action_12704455 ]
Viraj Bhat edited comment on PIG-564 at 4/29/09 8:04 PM:
---------------------------------------------------------

Another special character, "/", is not handled correctly. It is particularly useful for passing nested output directories, but the value is truncated at the "/" when it is passed to Pig.

Example case:
{code}
a = load '/user/viraj/test1' using PigStorage(',') as (id1:int, char1:chararray);
b = foreach a generate id1;
store b into '/user/viraj/$outputfolder' using PigStorage();
{code}

Run this script as:
{code}
shell>hadoop fs -mkdir /user/viraj/paramtest
shell>java -cp pig.jar:/home/viraj/hadoop-0.18.0-dev/conf/ -Dhod.server='' org.apache.pig.Main -param outputfolder="paramtest/moretest" -r paramtest.pig
2009-04-30 03:00:21,234 [main] INFO org.apache.pig.Main - Dry run completed. Substituted pig script is at paramtest.pig.substituted
{code}

Now if we open the substituted Pig script (paramtest.pig.substituted), the output folder has been truncated to "paramtest":
{code}
a = load '/user/viraj/test1' using PigStorage(',') as (id1:int, char1:chararray);
b = foreach a generate id1;
store b into '/user/viraj/paramtest' using PigStorage();
{code}

Viraj

      was (Author: viraj):
    Another special character, "/", is not handled correctly. It is particularly useful for passing nested output directories, but the value is truncated at the "/" when it is passed to Pig.

Example case:
{code}
a = load '/user/viraj/test1' using PigStorage(',') as (id1:int, char1:chararray);
b = foreach a generate id1;
store b into '/user/viraj/$outputfolder' using PigStorage();
{code}

Run this script as:
{code}
shell>hadoop fs -mkdir /user/viraj/paramtest
shell>java -cp pig.jar:/home/viraj/hadoop-0.18.0-dev/conf/ -Dhod.server='' org.apache.pig.Main -param outputfolder="paramtest/moretest" paramtest.pig
2009-04-30 03:00:21,234 [main] INFO org.apache.pig.Main - Dry run completed. Substituted pig script is at paramtest.pig.substituted
{code}

Now if we open the substituted Pig script (paramtest.pig.substituted), the output folder has been truncated to "paramtest":
{code}
a = load '/user/viraj/test1' using PigStorage(',') as (id1:int, char1:chararray);
b = foreach a generate id1;
store b into '/user/viraj/paramtest' using PigStorage();
{code}

Viraj

> Parameter Substitution using -param option does not seem to work when parameters contain special characters such as +,=,-,?,' "
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-564
>                 URL: https://issues.apache.org/jira/browse/PIG-564
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Viraj Bhat
>            Assignee: Olga Natkovich
>
> Consider the following Pig script which uses parameter substitution
> {code}
> %default qual '/user/viraj'
> %default mydir 'mydir_myextraqual'
> VISIT_LOGS = load '$qual/$mydir' as (a,b,c);
> dump VISIT_LOGS;
> {code}
> If you run the script as:
> ==================================================================================================================
> java -cp pig.jar:${HADOOP_HOME}/conf/ -Dhod.server='' org.apache.pig.Main -param mydir=mydir-myextraqual mypigparamsub.pig
> ==================================================================================================================
> You get the following error:
> ==================================================================================================================
> 2008-12-15 19:49:43,964 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - java.io.IOException: /user/viraj/mydir does not exist
>         at org.apache.pig.backend.executionengine.PigSlicer.validate(PigSlicer.java:109)
>         at org.apache.pig.impl.io.ValidatingInputFileSpec.validate(ValidatingInputFileSpec.java:59)
>         at org.apache.pig.impl.io.ValidatingInputFileSpec.<init>(ValidatingInputFileSpec.java:44)
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:200)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:742)
>         at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370)
>         at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>         at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279)
>         at java.lang.Thread.run(Thread.java:619)
> java.io.IOException: Unable to open iterator for alias: VISIT_LOGS [Job terminated with anomalous status FAILED]
>         at org.apache.pig.PigServer.openIterator(PigServer.java:389)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:269)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:178)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:84)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:64)
>         at org.apache.pig.Main.main(Main.java:306)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>         ... 6 more
> ==================================================================================================================
> Also tried using: -param mydir='mydir\-myextraqual'
> This behavior occurs if the parameter value contains characters such as +,=,?.
> A workaround for this behavior is using a param_file which contains <param_name>=<param_value> on each line, with the <param_value> enclosed by quotes. For example:
> mydir='mydir-myextraqual'
> and then running the pig script as:
> java -cp pig.jar:${HADOOP_HOME}/conf/ -Dhod.server='' org.apache.pig.Main -param_file myparamfile mypigparamsub.pig
> The following issues need to be fixed:
> 1) In -param option if parameter value contains special characters, it is truncated
> 2) In param_file, if param_value contains special characters, it should be enclosed in quotes
> 3) If 2 is a known issue then it should be documented in http://wiki.apache.org/pig/ParameterSubstitution

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
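
The param_file workaround described in the issue can be sketched as follows, assuming a POSIX shell. The file name myparamfile, the script name mypigparamsub.pig and the classpath are copied from the report. The Pig command is only printed, not executed, so the sketch runs without a Pig or Hadoop installation; whether it avoids the truncation on a given Pig version is not verified here.

```shell
#!/bin/sh
# Write the parameter file from the report: one <param_name>=<param_value>
# per line, with the value enclosed in quotes.
cat > myparamfile <<'EOF'
mydir='mydir-myextraqual'
EOF

# Show what was written.
cat myparamfile

# Print (do not run) the invocation quoted in the issue. The quoted
# heredoc delimiter keeps ${HADOOP_HOME} and the empty '' literal.
cat <<'EOF'
java -cp pig.jar:${HADOOP_HOME}/conf/ -Dhod.server='' org.apache.pig.Main -param_file myparamfile mypigparamsub.pig
EOF
```

To actually run the script, paste the printed command into a shell where HADOOP_HOME is set and pig.jar is in the current directory.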