Finally, I am able to generate the correct output. I was actually just checking the wrong file :)
Thanks
Stuti

From: Stuti Awasthi
Sent: Monday, January 14, 2013 5:42 PM
To: '[email protected]'
Subject: RE: MatrixMultiplicationJob Input query

Hi,

I have made a little progress but am still not able to get the output. I have created a sequence file with Key<IntWritable> and Value<VectorWritable> in the following format:

    Key   Value
    0     [1,2,3]
    1     [4,5,6]

Following is the dump output using seqdumper:

    $ mahout seqdumper -s /test/file
    Input Path: /test/points/file1
    Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.math.VectorWritable
    Key: 0: Value: org.apache.mahout.math.VectorWritable@ef83d3
    Key: 1: Value: org.apache.mahout.math.VectorWritable@ef83d3
    Count: 2
    13/01/14 17:30:22 INFO driver.MahoutDriver: Program took 313 ms

Following is the dump output using vectordump:

    $ mahout vectordump -s /test/file
    {2:3.0,1:2.0,0:1.0}
    {2:6.0,1:5.0,0:4.0}
    13/01/14 17:29:16 INFO driver.MahoutDriver: Program took 364 ms

When I executed "matrixmult" from Mahout, the MR job completed successfully, but the output file that is created is 0 bytes:

    $ mahout matrixmult --inputPathA /test/file --numRowsA 2 --numColsA 3 --inputPathB /test/points/file1 --numRowsB 2 --numColsB 3 --tempDir /test/temp
    13/01/14 17:26:14 INFO common.AbstractJob: Command line arguments: {--endPhase=2147483647, --inputPathA=/test/points/file1, --inputPathB=/test/file, --numColsA=3, --numColsB=3, --numRowsA=2, --numRowsB=2, --startPhase=0, --tempDir=/test/temp}
    13/01/14 17:26:17 INFO mapred.FileInputFormat: Total input paths to process : 1
    13/01/14 17:26:17 INFO mapred.FileInputFormat: Total input paths to process : 1
    13/01/14 17:26:17 INFO mapred.JobClient: Running job: job_201301101352_0031
    13/01/14 17:26:18 INFO mapred.JobClient: map 0% reduce 0%
    13/01/14 17:26:33 INFO mapred.JobClient: map 100% reduce 0%
    13/01/14 17:26:45 INFO mapred.JobClient: map 100% reduce 100%
    13/01/14 17:26:50 INFO mapred.JobClient: Job complete: job_201301101352_0031
    13/01/14 17:26:50 INFO mapred.JobClient: Counters: 30
    13/01/14 17:26:50 INFO mapred.JobClient: Job Counters
    13/01/14 17:26:50 INFO mapred.JobClient: Launched reduce tasks=1
    13/01/14 17:26:50 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=11706
    13/01/14 17:26:50 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
    13/01/14 17:26:50 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
    13/01/14 17:26:50 INFO mapred.JobClient: Launched map tasks=1
    13/01/14 17:26:50 INFO mapred.JobClient: Data-local map tasks=1
    13/01/14 17:26:50 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=9883
    13/01/14 17:26:50 INFO mapred.JobClient: File Input Format Counters
    13/01/14 17:26:50 INFO mapred.JobClient: Bytes Read=0
    13/01/14 17:26:50 INFO mapred.JobClient: File Output Format Counters
    13/01/14 17:26:50 INFO mapred.JobClient: Bytes Written=223
    13/01/14 17:26:50 INFO mapred.JobClient: FileSystemCounters
    13/01/14 17:26:50 INFO mapred.JobClient: FILE_BYTES_READ=114
    13/01/14 17:26:50 INFO mapred.JobClient: HDFS_BYTES_READ=611
    13/01/14 17:26:50 INFO mapred.JobClient: FILE_BYTES_WRITTEN=44871
    13/01/14 17:26:50 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=223
    13/01/14 17:26:50 INFO mapred.JobClient: Map-Reduce Framework
    13/01/14 17:26:50 INFO mapred.JobClient: Map output materialized bytes=114
    13/01/14 17:26:50 INFO mapred.JobClient: Map input records=2
    13/01/14 17:26:50 INFO mapred.JobClient: Reduce shuffle bytes=0
    13/01/14 17:26:50 INFO mapred.JobClient: Spilled Records=6
    13/01/14 17:26:50 INFO mapred.JobClient: Map output bytes=204
    13/01/14 17:26:50 INFO mapred.JobClient: Total committed heap usage (bytes)=217907200
    13/01/14 17:26:50 INFO mapred.JobClient: CPU time spent (ms)=1030
    13/01/14 17:26:50 INFO mapred.JobClient: Map input bytes=0
    13/01/14 17:26:50 INFO mapred.JobClient: SPLIT_RAW_BYTES=249
    13/01/14 17:26:50 INFO mapred.JobClient: Combine input records=6
    13/01/14 17:26:50 INFO mapred.JobClient: Reduce input records=3
    13/01/14 17:26:50 INFO mapred.JobClient: Reduce input groups=3
    13/01/14 17:26:50 INFO mapred.JobClient: Combine output records=3
    13/01/14 17:26:50 INFO mapred.JobClient: Physical memory (bytes) snapshot=273883136
    13/01/14 17:26:50 INFO mapred.JobClient: Reduce output records=3
    13/01/14 17:26:50 INFO mapred.JobClient: Virtual memory (bytes) snapshot=4233621504
    13/01/14 17:26:50 INFO mapred.JobClient: Map output records=6
    13/01/14 17:26:50 INFO driver.MahoutDriver: Program took 35987 ms

Please suggest where I am going wrong.

Thanks
Stuti Awasthi

-----Original Message-----
From: Stuti Awasthi
Sent: Monday, January 14, 2013 4:05 PM
To: [email protected]
Subject: RE: MatrixMultiplicationJob Input query

Hi Ashish,

I'm running the job like this:

    mahout matrixmult --inputPathA <Inputpath of MatrixA> --numRowsA <Rowno of MatrixA> --numColsA <Column no of MatrixA> --inputPathB <Inputpath of MatrixB> --numRowsB <Rowno of MatrixB> --numColsB <Column no of MatrixB> --tempDir <temporaryDir path>

I am getting InputFormat errors in the mapper, as the mapper expects <TupleWritable> format. I'm not sure what the keys and Value<TupleWritable> should be as input to the mapper to get this job working.

-Stuti

-----Original Message-----
From: ashish negi [mailto:[email protected]]
Sent: Monday, January 14, 2013 2:29 PM
To: [email protected]
Subject: Re: MatrixMultiplicationJob Input query

Could you tell us how you are trying to run the job?

Regards,

On Mon, Jan 14, 2013 at 1:42 PM, Stuti Awasthi <[email protected]> wrote:

> Hi Ashish,
> I tried to run the "matrixmult" example of Mahout but am getting errors
> in the input format. I want to create the matrix input file in the
> format required by MatrixMultiplicationJob for further
> processing. I'm facing issues creating that file, as I am unsure
> what the key and values for the input file should be. Any help will
> be appreciated.
>
> Thanks
> Stuti
>
> -----Original Message-----
> From: ashish negi [mailto:[email protected]]
> Sent: Monday, January 14, 2013 1:04 PM
> To: [email protected]
> Subject: Re: MatrixMultiplicationJob Input query
>
> Hi Stuti,
>
> I don't have a specific answer to your question, but why don't you try
> a few examples to guess the algorithm, or browse the source code?
>
> Regards,
> Ashish
>
> On Mon, Jan 14, 2013 at 12:20 PM, Stuti Awasthi <[email protected]> wrote:
>
> > Hi,
> > I want to execute MatrixMultiplicationJob provided in Mahout. I
> > understand that the input file should be in SequenceFileFormat with
> > input Key as IntWritable and Value as TupleWritable.
> >
> > If my matrix is 2x3 like:
> >
> > A = 1 2 3
> >     4 5 6
> >
> > What should I keep as key and value in SequenceFileFormat so that I
> > can provide it to MatrixMultiplicationJob as input? Sorry for the
> > basic question, but I'm new to Mahout and am not finding many details
> > on MatrixMultiplicationJob execution.
> >
> > Thanks
> > Stuti
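For anyone checking their own output against this thread, here is a minimal plain-Python sketch of what the job appears to compute for the 2x3 matrix above. It assumes, based on my understanding of Mahout's row-join approach and not on anything confirmed in this thread, that rows of A and B with the same IntWritable key are paired and that the result is transpose(A) times B. That would be consistent with the "Reduce output records=3" counter in the log, since a 2x3 input then yields a 3x3 result. The function name transpose_times is hypothetical, and no Hadoop or Mahout classes are used.

```python
# Plain-Python sketch only: no Hadoop or Mahout involved.
# Assumption: the job pairs rows of A and B by key and sums the
# outer products, which yields transpose(A) * B.
from collections import defaultdict


def transpose_times(rows_a, rows_b):
    """Return transpose(A) * B as {row_index: [values]} from row-keyed inputs."""
    n_cols_b = len(next(iter(rows_b.values())))
    out = defaultdict(lambda: [0.0] * n_cols_b)
    # "Join" step: only keys present in both inputs are paired.
    for key in rows_a.keys() & rows_b.keys():
        row_a, row_b = rows_a[key], rows_b[key]
        # "Map" step: each A[key][j] scales B's row and is emitted under index j.
        for j, a_val in enumerate(row_a):
            for k, b_val in enumerate(row_b):
                # "Reduce" step: contributions sharing the same j are summed.
                out[j][k] += a_val * b_val
    return dict(out)


# Same layout as the sequence file: key = row index, value = row vector.
matrix = {0: [1.0, 2.0, 3.0], 1: [4.0, 5.0, 6.0]}
product = transpose_times(matrix, matrix)
for index in sorted(product):
    print(index, product[index])
```

If the real job behaves differently (for example, a plain A times B), the shapes passed via --numRowsA/--numColsA and --numRowsB/--numColsB would need to agree accordingly, so treat this only as a way to sanity-check small inputs by hand.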
