Hi,

On Thursday, June 6, 2013, weishenyun <[email protected]> wrote:

> node and seven slave nodes. But only one reduce task running on a single
> node is trying to fetch pages from those three sites. It's too slow to fetch
> all three sites only through one task of one node. How can I speed up the
> job? What should I configure so that each site crawling task will be taken
> by different tasks on different nodes?
In short, set 'mapred.reduce.tasks', but please also see this thread:
http://www.mail-archive.com/[email protected]/msg06584.html

--
*Lewis*
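For illustration only, here is roughly what setting 'mapred.reduce.tasks' could look like as a Hadoop-style property entry. This is a sketch under assumptions, not a tested configuration: the value 7 is chosen here only to match the seven slave nodes mentioned in the question, and the file it belongs in (for example mapred-site.xml on the cluster, or the configuration shipped with the job) depends on your setup.

```xml
<!-- Sketch: request several reduce tasks instead of one. -->
<!-- The value 7 is an assumption (one per slave node); tune it for your cluster. -->
<property>
  <name>mapred.reduce.tasks</name>
  <value>7</value>
</property>
```

The property only sets how many reduce tasks the job is split into. Which URLs end up in which task is decided separately by the job's partitioning, so the thread linked above is still worth reading.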

