The mapred.tasktracker.* settings control the maximum number of map or reduce tasks a tasktracker can run at once. They can differ from machine to machine: if you have multiple nodes, you can choose the values for each one based on its hardware. If you set mapred.tasktracker.map.tasks.maximum to 4, the tasktracker on that machine will run at most 4 map tasks at any given time; mapred.tasktracker.reduce.tasks.maximum limits reduce tasks in the same way.
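As a rough sketch of the 4+4 setup discussed in this thread, the per-node limits could look like this in that machine's mapred-site.xml. The file name and layout assume a Hadoop 1.x (MRv1) install, and 4 is only a starting point to tune from, not a recommendation for every machine.

```xml
<!-- mapred-site.xml on the tasktracker node (assumes Hadoop 1.x / MRv1 property names) -->
<configuration>
  <!-- At most 4 map tasks run concurrently on this tasktracker -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <!-- At most 4 reduce tasks run concurrently on this tasktracker -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```

The tasktracker typically reads these values when it starts, so it would need a restart after a change.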
The mapred.map.tasks and mapred.reduce.tasks settings are cluster-wide defaults for how many tasks (maps or reducers) a job is configured with. They are overridden by the job when it is submitted to the jobtracker, or by the client itself. You do not have to set mapred.map.tasks or mapred.reduce.tasks, because the config already ships defaults for them (in Hadoop 1.x, 2 for mapred.map.tasks and 1 for mapred.reduce.tasks).

On Fri, Aug 1, 2014 at 4:06 PM, sindhu hosamane <[email protected]> wrote:

> Thanks a ton for ur help Harsh . I am a newbie in hadoop.
> If i have set
> mapred.tasktracker.map.tasks.maximum = 4
> mapred.tasktracker.reduce.tasks.maximum = 4
> Should i also bother or set below values
> mapred.map.Tasks and mapred.reduce.Tasks .
> If yes then what is the ideal value?
>
> On Fri, Aug 1, 2014 at 12:00 AM, Harsh J <[email protected]> wrote:
>
>> You can perhaps start with a generic 4+4 configuration (which matches
>> your cores), and tune your way upwards or downwards from there based
>> on your results.
>>
>> On Thu, Jul 31, 2014 at 8:35 PM, Sindhu Hosamane <[email protected]> wrote:
>> > Hello friends ,
>> >
>> > If i am running my experiment on a server with 2 processors (4 cores each ) .
>> > To say it has 2 processors and 8 cores .
>> > What would be the ideal values for mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum to get maximum performance.
>> > I am running cascalog queries on data of size 280 MB.
>> > I have multiple datanodes running on same machine.
>> >
>> > Your help is very much appreciated.
>> >
>> > Regards,
>> > sindhu
>>
>> --
>> Harsh J

--
Nitin Pawar
