Nice job figuring out a fix!
You should seriously file a bug with EMR for that. That's kind of
ridiculous.

D

On Wed, Aug 17, 2011 at 6:03 PM, Dexin Wang <[email protected]> wrote:

> I solved my own problem and want to share the fix with anyone who might
> encounter the same issue.
>
> I pass a colon-separated list, then convert it to a comma-separated list
> inside the Pig script using the declare command.
>
> Submit the Pig job like this:
>
>     -p SOURCE_DIRS="2011-08:2011-07:2011-06"
>
> and in the Pig script:
>
>     %declare SOURCE_DIRS_CONVERTED `echo $SOURCE_DIRS | tr ':' ','`;
>     LOAD '/root_dir/{$SOURCE_DIRS_CONVERTED}' ...
>
>
> On Wed, Aug 17, 2011 at 4:21 PM, Dexin Wang <[email protected]> wrote:
>
> > Hi,
> >
> > I'm running Pig jobs using Amazon's Pig support, where you submit jobs with
> > comma-concatenated parameters like this:
> >
> >      elastic-mapreduce --pig-script --args myscript.pig --args
> > -p,PARAM1=value1,-p,PARAM2=value2,-p,PARAM3=value3
> >
> > In my script, I need to pass multiple directories for the pig script to
> > load like this:
> >
> >      raw = LOAD '/root_dir/{$SOURCE_DIRS}'
> >
> > and SOURCE_DIRS is computed. For example, it can be
> > "2011-08,2011-07,2011-06", meaning my Pig script needs to load data for the
> > past 3 months. This works fine when I run my job in local or direct
> > Hadoop mode. But with Amazon Pig, I have to do something like this:
> >
> >      elastic-mapreduce --pig-script --args myscript.pig
> > -p,SOURCE_DIRS="2011-08,2011-07,2011-06"
> >
> > but EMR just replaces the commas with spaces, which breaks the parameter
> > passing syntax. I've tried adding backslashes before the commas, but I simply
> > end up with a backslash and a space where each comma was.
> >
> > So the question becomes:
> >
> > 1. Can I do something different from what I'm doing to pass multiple
> > folders to the Pig script (without commas), or
> > 2. Does anyone know how to properly pass commas to elastic-mapreduce?
> >
> > Thanks!
> >
> > Dexin
> >
>

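For anyone who wants to check the colon-to-comma workaround outside of Pig first, here is a minimal shell sketch of the round trip. The variable SOURCE_DIRS_COMMA is a hypothetical name for the computed list and is not from the thread; the second tr call is the same command the declare line runs in backticks.

```shell
#!/bin/sh
# SOURCE_DIRS_COMMA is a hypothetical name for the computed directory list.
SOURCE_DIRS_COMMA="2011-08,2011-07,2011-06"

# Swap commas for colons so the value can sit inside the
# comma-joined argument list without being split.
SOURCE_DIRS=$(echo "$SOURCE_DIRS_COMMA" | tr ',' ':')

# Same command the Pig declare line runs in backticks:
# turn the colons back into commas for the LOAD glob.
SOURCE_DIRS_CONVERTED=$(echo "$SOURCE_DIRS" | tr ':' ',')

echo "parameter to submit: -p,SOURCE_DIRS=$SOURCE_DIRS"
echo "path seen by LOAD:   /root_dir/{$SOURCE_DIRS_CONVERTED}"
```

This only works as long as the directory names themselves never contain a colon; any other separator that cannot appear in the names would do as well.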