Re: parallel mapping on single server

2008-07-12 Thread hong
Hi, I have a question about the strategy described by Jonman Chu: Hadoop will try to split the file according to how it is split up in HDFS. Take wordcount as an example. Suppose "hadoop" is a word in the input file, block 1 ends with "had", and block 2 starts with "oop". How is this case handled?
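A minimal sketch of how a line-oriented reader can cope with a record that straddles a block boundary, assuming the input is newline-delimited text (as with Hadoop's default text input, where the record is a line, not a word). The function name `read_split` and the sample data are hypothetical; this is plain Python that mimics the idea, not Hadoop's actual code. A reader that does not start at offset 0 discards everything up to the first newline, and every reader finishes the last line it started even if that line runs past the end of its split. A word such as "hadoop" is therefore only ever seen whole, by the reader that owns the line containing it.

```python
def read_split(data: bytes, start: int, length: int) -> list[bytes]:
    """Return the lines owned by the split [start, start + length)."""
    end = start + length
    pos = start
    if start != 0:
        # Skip the (possibly partial) first line: the previous split's
        # reader is responsible for it and reads past its own end to get it.
        newline = data.find(b"\n", pos)
        if newline == -1:
            return []
        pos = newline + 1
    lines = []
    # "pos <= end" lets this reader take a line that begins exactly at the
    # boundary, matching the unconditional skip done by the next reader.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        if newline == -1:
            newline = len(data)
        lines.append(data[pos:newline])  # may extend beyond `end`
        pos = newline + 1
    return lines


# Hypothetical input: with 9-byte blocks, block 1 ends with "had"
# and block 2 starts with "oop".
data = b"hello hadoop world\nsecond line here\nthird\n"
block = 9
for start in range(0, len(data), block):
    print(start, read_split(data, start, min(block, len(data) - start)))
```

With this scheme the first reader returns the whole line "hello hadoop world", and the reader for the second block returns nothing, because the only line it touches belongs to the first reader.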

Re: parallel mapping on single server

2008-07-10 Thread hong
Hi, following up on Haijun Cao's reply: suppose we have set 8 map tasks. How does each map know which part of the input file it should process? On 2008-7-10, at 2:33 AM, Haijun Cao wrote: Set number of map slots per tasktracker to 8 in order to run 8 map tasks on one machine (assuming one tasktracker per
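One way to picture how a map task knows its part of the input: each task is handed a split described by a start offset and a length within the file, and it reads only that byte range. The sketch below is a simplified illustration under that assumption. The function `make_splits` and the sizes used are hypothetical, and Hadoop's real split calculation also takes the block size and configured minimum split size into account.

```python
def make_splits(file_size: int, split_size: int) -> list[tuple[int, int]]:
    """Return (start_offset, length) pairs that cover the file exactly once."""
    if split_size <= 0:
        raise ValueError("split_size must be positive")
    splits = []
    start = 0
    while start < file_size:
        # The last split may be shorter than split_size.
        length = min(split_size, file_size - start)
        splits.append((start, length))
        start += length
    return splits


# Hypothetical example: an 800-byte file divided among 8 map tasks.
for task_id, (start, length) in enumerate(make_splits(800, 100)):
    print(f"map task {task_id}: bytes {start}..{start + length - 1}")
```

The number of map tasks then follows from the number of splits, while the slot setting quoted above only limits how many of them run at the same time on one tasktracker.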

RE: parallel mapping on single server

2008-07-09 Thread Haijun Cao
Set the number of map slots per tasktracker to 8 in order to run 8 map tasks on one machine (assuming one tasktracker per machine) at the same time:

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
  <description>The maximum number of map tasks that will be run simultaneously