Hi,

I'm trying to multiply two relatively small matrices, (4219*200) and
(200*54622), but it is taking too long because only a single mapper
is launched. I'm running this on a 10-node cluster.

I have tried changing the MAHOUT_OPTS in the mahout file:

MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.map.tasks=18"
MAHOUT_OPTS="$MAHOUT_OPTS -Dmapred.reduce.tasks=9"

I have also tried passing the options directly on the command line:

mahout matrixmult -Dmapred.map.tasks=18 -Dmapred.reduce.tasks=9 \
  --numRowsA 200 --numColsA 4819 --numRowsB 200 --numColsB 54622 \
  --inputPathA matrixA --inputPathB matrixB

But no luck with this either.

My Hadoop mapred-site.xml looks like this:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>serverX:54311</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.child.ulimit</name>
    <value>unlimited</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
    <final>true</final>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2000m</value>
  </property>
</configuration>

Am I missing something in the configuration?

Right now, with 1 mapper, the map task takes about 4 minutes on
average to advance 1%.

Thank you,
Rafael Alfaro
