Hi
In Hadoop, setting mapred.map.tasks does not control the number of mappers;
it is at most a hint to the InputFormat. One map task is created for each
input split. To reduce the number of map tasks you can use
CombineFileInputFormat, if that fits your implementation. It packs several
files or blocks into each split, so one mapper handles what would otherwise
be several splits, at the cost of a slight loss of data locality.
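To make the link between splits and mappers concrete, here is a rough sketch
in Python (not Hadoop code) of the split arithmetic as I understand it from
the new-API FileInputFormat: split size = max(minSize, min(maxSize,
blockSize)), with each file split on its own. The function names and the
300 MB file sizes are made up for illustration, and the sketch assumes
non-empty, splittable files; your InputFormat may behave differently.

```python
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # the last split of a file may run up to 10% over split size


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size is the block size, clamped to [min_size, max_size]."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_length, split_size):
    """Number of splits produced for one non-empty, splittable file."""
    splits = 0
    remaining = file_length
    # Cut full-size splits while more than 1.1 split sizes remain.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left over becomes the final (possibly short) split.
    if remaining > 0:
        splits += 1
    return splits


def count_map_tasks(file_lengths, block_size, min_size=1, max_size=2**63 - 1):
    """One map task per split; files are never merged into a shared split."""
    split_size = compute_split_size(block_size, min_size, max_size)
    return sum(count_splits(length, split_size) for length in file_lengths)


# Hypothetical input: four 300 MB files on a cluster with 128 MB blocks.
files = [300 * MB] * 4
default_tasks = count_map_tasks(files, block_size=128 * MB)
raised_min_tasks = count_map_tasks(files, block_size=128 * MB,
                                   min_size=512 * MB)
```

With these made-up sizes each file yields three splits by default, so 12 map
tasks in total. Raising the minimum split size above the file size (the
mapred.min.split.size property in this Hadoop generation) gives one split
per file, so 4. Plain FileInputFormat never goes below one split per file,
which is where CombineFileInputFormat comes in.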
Hope it helps!
Regards
Bejoy K S
-----Original Message-----
From: Mohammed Al khooja <[email protected]>
Date: Mon, 5 Dec 2011 20:04:25
To: <[email protected]>
Subject: Too many mappers - Mahout lda
Hi,
I'm running LDA on 123 file parts (block size is 128 MB). However, Mahout
is creating 1555 mappers. I tried setting mapred.map.tasks in
mapred-site.xml, but I guess Mahout overrides it.
Does this have anything to do with the block size or the split size?
Thanks.
--
M.khouja