In the end it's up to Hadoop how many input splits (and therefore map tasks) you get. Try calling FileInputFormat.setMaxInputSplitSize() with a smallish value, like your 10MB (10000000).
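To make the effect of that setting concrete, here is a rough sketch of the split arithmetic as I understand it from the new-API FileInputFormat (split size = max(minSize, min(maxSize, blockSize)), with a 10% slack on the last split). It is plain Java with no Hadoop dependency; the class and method names are made up for illustration, and the file is assumed to be a single file of exactly 104 * 1024 * 1024 bytes. The job's actual input format may compute splits differently, so treat the numbers as an estimate only.

```java
// Illustrative only: mirrors the split-size arithmetic of the new-API
// FileInputFormat. SplitEstimate and its methods are hypothetical names.
// In a real driver the call would be something like:
//   FileInputFormat.setMaxInputSplitSize(job, 10000000L);
public class SplitEstimate {
    // The last split is allowed to be up to 10% larger than the split size.
    static final double SPLIT_SLOP = 1.1;

    // Split size is the block size, capped by maxSize and floored by minSize.
    public static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Count the splits produced for one splittable file of the given length.
    public static int countSplits(long fileLength, long splitSize) {
        int splits = 0;
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the tail becomes the final split
        }
        return splits;
    }

    public static void main(String[] args) {
        long fileLength = 104L * 1024 * 1024; // assumed exact size of the 104MB file
        long blockSize = 64L * 1024 * 1024;   // default 64MB block size

        // No max split size set: the split size equals the block size.
        long defaultSplit = computeSplitSize(blockSize, 1L, Long.MAX_VALUE);
        System.out.println(countSplits(fileLength, defaultSplit)); // prints 2

        // Max split size of 10000000 bytes, as suggested above.
        long cappedSplit = computeSplitSize(blockSize, 1L, 10000000L);
        System.out.println(countSplits(fileLength, cappedSplit)); // prints 11
    }
}
```

So with the cap in place you would expect roughly 11 splits for this file, provided the job really goes through this code path.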
I'm not sure Hadoop parameters can be set as system properties like that anyway.

On Wed, Jan 16, 2013 at 7:48 AM, Stuti Awasthi <[email protected]> wrote:
> Hi,
>
> I am trying to multiply a dense matrix of size [100 x 100k]. The size of the
> file is 104MB, and with the default block size of 64MB only 2 blocks are
> created.
> So I reduced the block size to 10MB, and now my file is divided into 11 blocks
> across the cluster. The cluster has 10 nodes: 1 NN/JT and 9 DN/TT.
>
> Every time I run the Mahout MatrixMultiplicationJob from the command line, I
> can see on the JobTracker web UI that only 1 map task is launched. According
> to my understanding of InputSplit, there should be 11 map tasks launched.
> Apart from this, the map task stays at 0.99% completion, and in the task
> logs I can see that the map task is spilling the map output.
>
> Mahout command:
>
> mahout matrixmult -Dmapred.child.java.opts=-Xmx1024M
> -Dfs.inmemory.size.mb=200 -Dio.sort.factor=100 -Dio.sort.mb=200
> -Dio.file.buffer.size=131072 --inputPathA /test/matrixA --numRowsA 100
> --numColsA 100000 --inputPathB /test/matrixA --numRowsB 100 --numColsB 100000
> --tempDir /test/temp
>
> Now I want to know why only 1 map task is launched every time, and how I
> can tune the cluster's performance so that I can perform dense matrix
> multiplication of the order [90K x 1 Million].
>
> Thanks
> Stuti
