Dear Pig-Group,
I am trying to use the new feature in Pig 0.8 that allows custom Map-Reduce
jobs in a data flow. The book "Programming Pig" gives a quite simple example
of it, and it puzzles me. The example is below:
crawl = load 'webcrawl' as (url, pageid);
normalized = foreach crawl generate normalize(url);
goodurls = mapreduce 'blacklistchecker.jar'
    store normalized into 'input'
    load 'output' as (url, pageid)
    `com.acmeweb.security.BlackListChecker -i input -o output`;
My mapreduce program needs three parameters: two input paths and one output
path. My question is, how can I pass them to the "mapreduce" command?
By the way, would you please give more details about the mapreduce command?
I could find little documentation about it.
Thanks very much!!
Sincerely,
Yan Meng
June 5th, 2013