All the same, no change in version...it was and still is 0.20.2. Other people do have access to this system to change things like conf files, but nobody's owning up, so I have to figure this out. I have verified that the mapred.map.tasks property is not being set in the mapred-site.xml files on the cluster or in the job itself. I'm just out of other ideas about where it might be getting set...
Thanks,

Brendan

On Fri, Nov 4, 2011 at 11:04 AM, Robert Evans <[email protected]> wrote:

> What versions of Hadoop were you running with previously, and what version
> are you running with now?
>
> --Bobby Evans
>
> On 11/4/11 9:33 AM, "Brendan W." <[email protected]> wrote:
>
> Hi,
>
> In the jobs running on my cluster of 20 machines, I used to run jobs (via
> "hadoop jar ...") that would spawn around 4000 map tasks. Now when I run
> the same jobs, that number is 20; and I notice that in the job
> configuration, the parameter mapred.map.tasks is set to 20, whereas it
> never used to be present at all in the configuration file.
>
> Changing the input split size in the job doesn't affect this--I get the
> size split I ask for, but the *number* of input splits is still capped at
> 20--i.e., the job isn't reading all of my data.
>
> The mystery to me is where this parameter could be getting set. It is not
> present in the mapred-site.xml file in <hadoop home>/conf on any machine in
> the cluster, and it is not being set in the job (I'm running out of the
> same jar I always did; no updates).
>
> Is there *anywhere* else this parameter could possibly be getting set?
> I've stopped and restarted map-reduce on the cluster with no effect...it's
> getting re-read in from somewhere, but I can't figure out where.
>
> Thanks a lot,
>
> Brendan
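[Editor's note: one way to chase down a property like this is a brute-force sweep of every configuration directory on every node. The sketch below is a minimal version of that sweep; the directory list in CONF_DIRS is an assumption about typical 0.20.x layouts and should be adjusted for the actual cluster. On 0.20.2, a stray hadoop-site.xml (the old single config file) is also merged in if present, so it is worth including in the search.]

```shell
#!/bin/sh
# Sketch: find every config file under the usual Hadoop conf locations
# that mentions a given property. Run on each node (e.g. via ssh in a loop).
# CONF_DIRS is an assumption; point it at your real install paths.
PROP="mapred.map.tasks"
CONF_DIRS="${HADOOP_HOME:-/usr/lib/hadoop}/conf /etc/hadoop"

for d in $CONF_DIRS; do
  [ -d "$d" ] || continue
  # -r: recurse, -l: print only matching filenames.
  # Swap -l for -A2 to see the surrounding <property> block instead.
  grep -rl "$PROP" "$d" 2>/dev/null
done
```

If the sweep turns up nothing, the other place worth checking is the final merged configuration of a submitted job: the 0.20 JobTracker web UI links to each job's job.xml, which shows every property after all config sources have been merged, and can confirm whether the value is arriving from the client side or the cluster side.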
