Yes, you can use the "set" keyword in the script to set such properties.
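
For illustration, here is a minimal sketch of the quoted script with "set" lines added. The property names are assumptions based on Hadoop 0.20/1.x and Pig 0.8 or later, and the byte values are arbitrary, so check both against your versions. In Hadoop's FileInputFormat the split size is roughly max(minSize, min(maxSize, blockSize)), so raising mapred.min.split.size makes splits larger and gives fewer mappers. To get more mappers, the sketch lowers mapred.max.split.size instead. Pig may also combine small splits back together, so pig.maxCombinedSplitSize is lowered to match.

```pig
-- Assumed property names and illustrative values; adjust for your cluster.
-- Cap each input split at 16 MB so a large file yields more map tasks.
set mapred.max.split.size 16777216;
-- Keep Pig's split combination from merging splits back past 16 MB.
set pig.maxCombinedSplitSize 16777216;

raw = LOAD 'input.txt';
-- convert_somehow is the placeholder UDF from the question; its full
-- argument list was elided there, so only $1 and $2 are shown.
processed = FOREACH raw GENERATE convert_somehow($1, $2);
STORE processed INTO 'output.txt';
```

This only helps if the input is splittable. A file compressed with a non-splittable codec such as gzip is still read by a single mapper regardless of these settings.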

On Jan 11, 2012, at 6:12 PM, Yang <[email protected]> wrote:

> I have a Pig script that is basically a map-only job:
> 
> raw = LOAD 'input.txt' ;
> 
> processed = FOREACH raw GENERATE convert_somehow($1,$2...);
> 
> store processed into 'output.txt';
> 
> 
> 
> I have many nodes on my cluster, so I want Pig to process the input
> with more mappers, but it generates only 2 part-m-xxxxx files, i.e.
> it uses only 2 mappers.
> 
> In a Hadoop job it's possible to pass the mapper count and
> -Dmapred.min.split.size= ; would this also work for Pig? The PARALLEL
> keyword only works for reducers.
> 
> 
> Thanks
> Yang
