Yes, this works in Pig too. Use the "set" keyword at the top of the script to set Hadoop job properties such as mapred.min.split.size.
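As a rough sketch of what that could look like for your script: the property names below are standard Hadoop/Pig settings, but the 16 MB value is only an illustrative assumption, and which properties are honored depends on your Pig and Hadoop versions, so please check it against your setup. Note that raising mapred.min.split.size makes splits larger (fewer mappers); to get more mappers you would normally lower the maximum split size instead.

```pig
-- Assumed value: 16 MB splits; pick a size that suits your input and cluster.
set mapred.max.split.size 16777216;

-- Pig may merge small splits back together, so cap the combined size as well
-- (alternatively, set pig.splitCombination to false).
set pig.maxCombinedSplitSize 16777216;

raw = LOAD 'input.txt';
-- rest of the script (FOREACH ... GENERATE, STORE) stays unchanged
```

If it takes effect, you should see more part-m-xxxxx files in the output directory, one per mapper.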
On Jan 11, 2012, at 6:12 PM, Yang <[email protected]> wrote:

> I have a pig script that does basically a map-only job:
>
> raw = LOAD 'input.txt' ;
>
> processed = FOREACH raw GENERATE convert_somehow($1,$2...);
>
> store processed into 'output.txt';
>
> I have many nodes on my cluster, so I want PIG to process the input in
> more mappers. but it generates only 2 part-m-xxxxx files, i.e.
> using 2 mappers.
>
> in hadoop job it's possible to pass mapper count and
> -Dmapred.min.split.size= , would this also work for PIG? the PARALLEL
> keyword only works for reducers
>
> Thanks
> Yang
