I have lots of small files in Hive, and the MapReduce jobs are too slow. Is there a way to improve the speed?

2010/6/10 Edward Capriolo <[email protected]>

> On Wed, Jun 9, 2010 at 3:04 AM, wd <[email protected]> wrote:
>
>> I've tried hive 0.5, the option not work too.
>> And find this page[
>> http://markmail.org/message/k32nrcb2ncsq67ef?q=mapred.map.tasks+#query:mapred.map.tasks%20+page:1+mid:k32nrcb2ncsq67ef+state:results]
>> via google.
>>
>> 2010/6/9 wd <[email protected]>
>>
>> hi,
>>>
>>> I'm using hive svn rev946854. And try to set mapred.map.tasks=1 at hive
>>> cli, but seemes it doesn't work, total map tasks still over 300+.
>>>
>>> Is this a svn version problem?
>>>
>>
>>
> You answered your own question, look in the link
>
> "You cannot force *mapred.map.tasks* but can specify mapred.reduce.tasks.
> "
>
> Map tasks is based on the number of input files and folders. Even though
> hive uses a CombinedInput format you still can get a number of mappers.
>
> Edward
>
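
Since the number of map tasks follows the number of input files, one thing that might help with many small files is to let Hive combine them into larger splits and merge small output files. The following is only a sketch of session settings, assuming a Hive build that ships CombineHiveInputFormat and a Hadoop version that honours these split-size properties; the byte values are illustrative and not tuned for any cluster. The `--` lines are notes for the reader and can be dropped if the CLI rejects them.

```sql
-- Assumption: CombineHiveInputFormat is available in this Hive build.
-- It packs several small files into one split, so fewer mappers are started.
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Upper bound on the size of a combined split, in bytes (illustrative value).
set mapred.max.split.size=256000000;

-- Minimum split sizes used when grouping blocks per node and per rack (illustrative values).
set mapred.min.split.size.per.node=128000000;
set mapred.min.split.size.per.rack=128000000;

-- Merge small files produced by map-only and map-reduce jobs,
-- so later queries read fewer, larger files.
set hive.merge.mapfiles=true;
set hive.merge.mapredfiles=true;
set hive.merge.size.per.task=256000000;
```

Even with these set, the mapper count can stay above 1, because files in different folders or partitions may not be combined into the same split.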
