Hi
I have made a little progress but am still not able to get the output.
I have created a SequenceFile with Key <IntWritable> and Value <VectorWritable>
in the following format:
Key Value
0 [1,2,3]
1 [4,5,6]
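To make the layout concrete, here is a minimal plain-Python sketch of the row-wise format described above: one record per matrix row, keyed by the row index. It only models the data; it does not use the Hadoop or Mahout API, and the helper name `matrix_to_records` is hypothetical.

```python
# Plain-Python model of the row-wise layout; NOT the Hadoop/Mahout API.
# matrix_to_records is a hypothetical helper used only for illustration.

def matrix_to_records(matrix):
    """Return (row_index, row_vector) pairs, one record per matrix row."""
    return [(i, list(row)) for i, row in enumerate(matrix)]


A = [[1, 2, 3],
     [4, 5, 6]]

records = matrix_to_records(A)
for key, value in records:
    # The key plays the role of the IntWritable, the value of the VectorWritable.
    print(key, value)
```

In the real file the key would be an IntWritable and the value a VectorWritable wrapping the row vector.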
The following is the dump output from seqdumper:
$ mahout seqdumper -s /test/file
Input Path: /test/points/file1
Key class: class org.apache.hadoop.io.IntWritable Value Class: class
org.apache.mahout.math.VectorWritable
Key: 0: Value: org.apache.mahout.math.VectorWritable@ef83d3
Key: 1: Value: org.apache.mahout.math.VectorWritable@ef83d3
Count: 2
13/01/14 17:30:22 INFO driver.MahoutDriver: Program took 313 ms
The following is the dump output from vectordump:
$ mahout vectordump -s /test/file
{2:3.0,1:2.0,0:1.0}
{2:6.0,1:5.0,0:4.0}
13/01/14 17:29:16 INFO driver.MahoutDriver: Program took 364 ms
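As a sanity check, the two vectordump lines above can be turned back into dense rows and compared with the intended matrix. This is a small sketch that assumes the `{index:value,...}` text format shown in the dump; `parse_vector_line` is a hypothetical helper, not part of Mahout.

```python
# Parse vectordump-style lines such as "{2:3.0,1:2.0,0:1.0}" into dense rows.
# parse_vector_line is a hypothetical helper, not a Mahout function.

def parse_vector_line(line, size):
    """Convert one '{index:value,...}' line into a dense list of floats."""
    dense = [0.0] * size
    body = line.strip().strip("{}")
    for entry in body.split(","):
        if not entry:
            continue  # an empty vector "{}" has no entries
        index, value = entry.split(":")
        dense[int(index)] = float(value)
    return dense


dump_lines = ["{2:3.0,1:2.0,0:1.0}", "{2:6.0,1:5.0,0:4.0}"]
rows = [parse_vector_line(line, 3) for line in dump_lines]
print(rows)
```

The entries are listed out of order in the dump, but each one carries its own index, so the rows come back as 1 2 3 and 4 5 6.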
When I executed "matrixmult" from Mahout, the MR job completed successfully, but
the output file it created is 0 bytes:
$ mahout matrixmult --inputPathA /test/file --numRowsA 2 --numColsA 3
--inputPathB /test/points/file1 --numRowsB 2 --numColsB 3 --tempDir /test/temp
13/01/14 17:26:14 INFO common.AbstractJob: Command line arguments:
{--endPhase=2147483647, --inputPathA=/test/points/file1,
--inputPathB=/test/file, --numColsA=3, --numColsB=3, --numRowsA=2,
--numRowsB=2, --startPhase=0, --tempDir=/test/temp}
13/01/14 17:26:17 INFO mapred.FileInputFormat: Total input paths to process : 1
13/01/14 17:26:17 INFO mapred.FileInputFormat: Total input paths to process : 1
13/01/14 17:26:17 INFO mapred.JobClient: Running job: job_201301101352_0031
13/01/14 17:26:18 INFO mapred.JobClient: map 0% reduce 0%
13/01/14 17:26:33 INFO mapred.JobClient: map 100% reduce 0%
13/01/14 17:26:45 INFO mapred.JobClient: map 100% reduce 100%
13/01/14 17:26:50 INFO mapred.JobClient: Job complete: job_201301101352_0031
13/01/14 17:26:50 INFO mapred.JobClient: Counters: 30
13/01/14 17:26:50 INFO mapred.JobClient: Job Counters
13/01/14 17:26:50 INFO mapred.JobClient: Launched reduce tasks=1
13/01/14 17:26:50 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=11706
13/01/14 17:26:50 INFO mapred.JobClient: Total time spent by all reduces
waiting after reserving slots (ms)=0
13/01/14 17:26:50 INFO mapred.JobClient: Total time spent by all maps
waiting after reserving slots (ms)=0
13/01/14 17:26:50 INFO mapred.JobClient: Launched map tasks=1
13/01/14 17:26:50 INFO mapred.JobClient: Data-local map tasks=1
13/01/14 17:26:50 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=9883
13/01/14 17:26:50 INFO mapred.JobClient: File Input Format Counters
13/01/14 17:26:50 INFO mapred.JobClient: Bytes Read=0
13/01/14 17:26:50 INFO mapred.JobClient: File Output Format Counters
13/01/14 17:26:50 INFO mapred.JobClient: Bytes Written=223
13/01/14 17:26:50 INFO mapred.JobClient: FileSystemCounters
13/01/14 17:26:50 INFO mapred.JobClient: FILE_BYTES_READ=114
13/01/14 17:26:50 INFO mapred.JobClient: HDFS_BYTES_READ=611
13/01/14 17:26:50 INFO mapred.JobClient: FILE_BYTES_WRITTEN=44871
13/01/14 17:26:50 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=223
13/01/14 17:26:50 INFO mapred.JobClient: Map-Reduce Framework
13/01/14 17:26:50 INFO mapred.JobClient: Map output materialized bytes=114
13/01/14 17:26:50 INFO mapred.JobClient: Map input records=2
13/01/14 17:26:50 INFO mapred.JobClient: Reduce shuffle bytes=0
13/01/14 17:26:50 INFO mapred.JobClient: Spilled Records=6
13/01/14 17:26:50 INFO mapred.JobClient: Map output bytes=204
13/01/14 17:26:50 INFO mapred.JobClient: Total committed heap usage
(bytes)=217907200
13/01/14 17:26:50 INFO mapred.JobClient: CPU time spent (ms)=1030
13/01/14 17:26:50 INFO mapred.JobClient: Map input bytes=0
13/01/14 17:26:50 INFO mapred.JobClient: SPLIT_RAW_BYTES=249
13/01/14 17:26:50 INFO mapred.JobClient: Combine input records=6
13/01/14 17:26:50 INFO mapred.JobClient: Reduce input records=3
13/01/14 17:26:50 INFO mapred.JobClient: Reduce input groups=3
13/01/14 17:26:50 INFO mapred.JobClient: Combine output records=3
13/01/14 17:26:50 INFO mapred.JobClient: Physical memory (bytes)
snapshot=273883136
13/01/14 17:26:50 INFO mapred.JobClient: Reduce output records=3
13/01/14 17:26:50 INFO mapred.JobClient: Virtual memory (bytes)
snapshot=4233621504
13/01/14 17:26:50 INFO mapred.JobClient: Map output records=6
13/01/14 17:26:50 INFO driver.MahoutDriver: Program took 35987 ms
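For reference, here is a plain-Python sketch of what I would expect the result to look like. It rests on an assumption: as far as I can tell from the Mahout source (DistributedRowMatrix.times), the job computes transpose(A) * B rather than A * B, by joining row i of A with row i of B and summing the scaled rows of B per column index of A. The function name `transpose_times` is hypothetical and this is not the Mahout code.

```python
# Plain-Python model of transpose(A) * B computed row by row.
# ASSUMPTION: this mirrors how matrixmult behaves; it is not the Mahout code.
from collections import defaultdict


def transpose_times(records_a, records_b, num_cols_b):
    """Return ({row_index: row}, emitted) for transpose(A) * B."""
    rows_b = dict(records_b)
    partial = defaultdict(lambda: [0.0] * num_cols_b)
    emitted = 0  # number of (key, vector) pairs the "map" step would emit
    for key, row_a in records_a:
        if key not in rows_b:
            continue  # inner join: rows present in only one input are dropped
        row_b = rows_b[key]
        for j, a_ij in enumerate(row_a):
            if a_ij == 0:
                continue
            emitted += 1
            acc = partial[j]  # "reduce" step: sum the vectors sharing key j
            for k, b_ik in enumerate(row_b):
                acc[k] += a_ij * b_ik
    return dict(partial), emitted


records = [(0, [1, 2, 3]), (1, [4, 5, 6])]
product, emitted = transpose_times(records, records, 3)
for key in sorted(product):
    print(key, product[key])
print("emitted:", emitted)
```

With the same 2x3 matrix on both sides this gives a 3x3 result, which would agree with the "Map output records=6" and "Reduce output records=3" counters above.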
Please suggest where I am going wrong.
Thanks
Stuti Awasthi
-----Original Message-----
From: Stuti Awasthi
Sent: Monday, January 14, 2013 4:05 PM
To: [email protected]
Subject: RE: MatrixMultiplicationJob Input query
Hi Ashish,
I'm running the job like this:
mahout matrixmult --inputPathA <Inputpath of MatrixA> --numRowsA <Rowno of
MatrixA> --numColsA <Column no of MatrixA> --inputPathB <Inputpath of MatrixB>
--numRowsB <Rowno of MatrixB> --numColsB <Column no of MatrixB> --tempDir
<temporaryDir path>
I am getting InputFormat errors in the mapper, as the mapper expects
<TupleWritable> values. I'm not sure what the key and the Value<TupleWritable>
should be as input to the mapper to get this job working.
-Stuti
-----Original Message-----
From: ashish negi [mailto:[email protected]]
Sent: Monday, January 14, 2013 2:29 PM
To: [email protected]
Subject: Re: MatrixMultiplicationJob Input query
Could you tell us how you are trying to run the job?
Regards,
On Mon, Jan 14, 2013 at 1:42 PM, Stuti Awasthi
<[email protected]> wrote:
> Hi Ashish,
> I tried to run the "matrixmult" example of Mahout but am getting errors
> in the input format. I want to create the matrix input file in the
> format required by MatrixMultiplicationJob for further
> processing. I'm facing issues creating that file, as I am unsure
> what the key and values for the input file should be. Any help will
> be appreciated.
>
> Thanks
> Stuti
>
> -----Original Message-----
> From: ashish negi [mailto:[email protected]]
> Sent: Monday, January 14, 2013 1:04 PM
> To: [email protected]
> Subject: Re: MatrixMultiplicationJob Input query
>
> Hi Stuti,
>
> I don't have a specific answer to your question, but why don't you try
> a few examples to work out the algorithm, or browse the source code?
>
> Regards,
> Ashish
>
> On Mon, Jan 14, 2013 at 12:20 PM, Stuti Awasthi <[email protected]> wrote:
>
> > Hi,
> > I want to execute MatrixMultiplicationJob provided in Mahout. I
> > understand that the input file should be in SequenceFileFormat with
> > input Key as IntWritable and Value as TupleWritable.
> >
> > If my matrix is 2x3 like :
> >
> > A = 1 2 3
> >     4 5 6
> >
> > What should I use as the key and value in the SequenceFile so that I
> > can provide it to MatrixMultiplicationJob as input? Sorry for the
> > basic question, but I'm new to Mahout and cannot find many details on
> > running MatrixMultiplicationJob.
> >
> > Thanks
> > Stuti