Hi Jake,

As requested, the stats from the job are listed below:

Counter                      Map                Reduce    Total
Job Counters
  Launched reduce tasks      0                  0         2
  Rack-local map tasks       0                  0         69
  Launched map tasks         0                  0         194
  Data-local map tasks       0                  0         125
FileSystemCounters
  FILE_BYTES_READ            66,655,795,630     0         66,655,795,630
  HDFS_BYTES_READ            12,871,657,393     0         12,871,657,393
  FILE_BYTES_WRITTEN         103,841,910,638    0         103,841,910,638
Map-Reduce Framework
  Combine output records     0                  0         0
  Map input records          54,675             0         54,675
  Spilled Records            4,720,084,588      0         4,720,084,588
  Map output bytes           33,805,552,500     0         33,805,552,500
  Map input bytes            12,804,666,825     0         12,804,666,825
  Map output records         1,690,277,625      0         1,690,277,625
  Combine input records      0                  0         0
In response to your suggestion, I do have a server with lots of RAM; however, I would like to stick with keeping the files on HDFS. As I am running a PCA analysis, I would have to re-import the data into HDFS afterwards to run SVD. (We tried to run similar computations on a machine with >64 GB of RAM, and the previous R implementation crashed after a few days...)

Because I am limited by my resources, I coded up a slower but effective implementation of the transpose job that I could share; a rough sketch of the idea follows the quoted thread below. It avoids loading all the data onto one node by transposing the matrix in pieces. The slowest part is combining the pieces back into one matrix. :(

-Vincent

On Fri, May 6, 2011 at 2:54 PM, Jake Mannix <[email protected]> wrote:
>
> On Fri, May 6, 2011 at 6:01 AM, Vincent Xue <[email protected]> wrote:
>
> > Dear Mahout Users,
> >
> > I am using Mahout-0.5-SNAPSHOT to transpose a dense matrix of 55000 x
> > 31000. My matrix is stored on the HDFS as a
> > SequenceFile<IntWritable,VectorWritable>, consuming just about 13 GB.
> > When I run the transpose function on my matrix, the function falls over
> > during the reduce phase. On closer inspection, I noticed that I was
> > receiving the following error:
> >
> > FSError: java.io.IOException: No space left on device
> >
> > I thought this was not possible considering that I was only using 15% of
> > the 2.5 TB in the cluster, but when I closely monitored the disk space,
> > it was true that the 40 GB hard drive on the node was running out of
> > space. Unfortunately, all of my nodes are limited to 40 GB, and I have
> > not been successful in transposing my matrix.
>
> Running HDFS on nodes with only 40 GB of hard disk each is a recipe
> for disaster, IMO. There are lots of temporary files created by map/reduce
> jobs, and working on an input file of size 13 GB you're bound to run into
> this.
>
> Can you show us what your job tracker says the amount of
> HDFS_BYTES_WRITTEN (and other similar numbers) was during your job?
>
> > From this observation, I would like to know if there is any alternative
> > method to transpose my matrix, or if there is something I am missing?
>
> Do you have a server with 26 GB of RAM lying around somewhere?
> You could do it on one machine without hitting disk. :)
>
> -jake
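
For illustration only (this is not the actual job referred to above): a minimal, single-process sketch of the block-wise idea, assuming a dense SequenceFile<IntWritable, VectorWritable> input. The class name, method signature, and one-output-file-per-block layout are invented for the example.

// Hypothetical sketch: transpose one column block of a dense row matrix
// stored as SequenceFile<IntWritable, VectorWritable>. Only the block's
// (colEnd - colStart) x numRows doubles are held in memory at a time.
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.VectorWritable;

public class BlockTranspose {

  public static void transposeBlock(Configuration conf, Path input, Path output,
                                    int numRows, int colStart, int colEnd)
      throws IOException {
    int blockWidth = colEnd - colStart;
    // One output row per column in this block.
    DenseVector[] block = new DenseVector[blockWidth];
    for (int i = 0; i < blockWidth; i++) {
      block[i] = new DenseVector(numRows);
    }

    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, input, conf);
    IntWritable rowId = new IntWritable();
    VectorWritable row = new VectorWritable();
    try {
      // Scan every input row once, copying its entries for this column block.
      while (reader.next(rowId, row)) {
        for (int c = colStart; c < colEnd; c++) {
          block[c - colStart].set(rowId.get(), row.get().get(c));
        }
      }
    } finally {
      reader.close();
    }

    // Write the transposed rows (one per column of the block) to this
    // block's output file.
    SequenceFile.Writer writer = SequenceFile.createWriter(
        fs, conf, output, IntWritable.class, VectorWritable.class);
    try {
      VectorWritable value = new VectorWritable();
      for (int c = colStart; c < colEnd; c++) {
        value.set(block[c - colStart]);
        writer.append(new IntWritable(c), value);
      }
    } finally {
      writer.close();
    }
  }
}

Each call holds just one column block in memory, and the per-block output files still have to be stitched back together afterwards, which corresponds to the slow "combining the pieces" step mentioned above.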
