For the first issue, I've tried adding that argument on the command
line, but it has no effect on the transpose job. It does have an
effect on another job the program runs, though. From inspecting the
code of TransposeJob and DistributedRowMatrix, I wouldn't expect it to
have an effect:
DistributedRowMatrix:

  public DistributedRowMatrix transpose() throws IOException {
    Path outputPath = new Path(rowPath.getParent(),
        "transpose-" + (System.nanoTime() & 0xFF));
    JobConf conf = TransposeJob.buildTransposeJobConf(rowPath, outputPath, numRows);
    JobClient.runJob(conf);
    DistributedRowMatrix m = new DistributedRowMatrix(outputPath,
        outputTmpPath, numCols, numRows);
    m.configure(this.conf);
    return m;
  }
TransposeJob:

  public static JobConf buildTransposeJobConf(Path matrixInputPath,
                                              Path matrixOutputPath,
                                              int numInputRows) throws IOException {
    JobConf conf = new JobConf(TransposeJob.class);
    ...
    return conf;
  }
(Hopefully email formatting doesn't mangle that too badly)
The job is created with an entirely new JobConf, not based on any
existing conf that would carry a setting for mapred.child.java.opts.
The transpose method then simply runs the job without modifying the
conf either, so it's not surprising that adding options on the command
line doesn't affect it. I imagine my problem could be solved simply by
having buildTransposeJobConf optionally accept an existing JobConf,
which transpose could pass as 'this.conf'.
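To illustrate what I mean (this is just a sketch of the idea, with
java.util.Properties standing in for Hadoop's JobConf, not actual Mahout
code): a conf built from scratch drops whatever the caller set, while one
derived from the caller's conf keeps it.

```java
import java.util.Properties;

public class ConfPropagation {

  // Stand-in for the current buildTransposeJobConf(): starts from an
  // empty conf, so any settings the caller made are lost.
  static Properties buildFresh() {
    Properties conf = new Properties();
    conf.setProperty("mapred.job.name", "transpose");
    return conf;
  }

  // Proposed variant: start from the caller's conf, so options like
  // mapred.child.java.opts survive into the transpose job.
  static Properties buildFrom(Properties base) {
    Properties conf = new Properties();
    conf.putAll(base);
    conf.setProperty("mapred.job.name", "transpose");
    return conf;
  }

  public static void main(String[] args) {
    Properties caller = new Properties();
    caller.setProperty("mapred.child.java.opts", "-Xmx1024m");

    // Fresh conf never sees the caller's setting.
    System.out.println(buildFresh().getProperty("mapred.child.java.opts"));      // null
    // Derived conf carries it through.
    System.out.println(buildFrom(caller).getProperty("mapred.child.java.opts")); // -Xmx1024m
  }
}
```

In the real code the derived version could simply use JobConf's copy
constructor instead of putAll.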
For the second issue, that would be nice.
On 04/27/2011 12:56 PM, Jake Mannix wrote:
There are two issues here - a) giving more memory to your reducers (have you
tried specifying -Dmapred.child.java.opts=-Xmx1024m, or something like that,
on the command line?), and b)
https://issues.apache.org/jira/browse/MAHOUT-639 which I should really have
gotten cleaned up and committed.
-jake
On Wed, Apr 27, 2011 at 12:43 PM, Paul Mahon <[email protected]> wrote:
I'm having trouble using Mahout's (0.4) DistributedRowMatrix transpose
method. The matrix I'm transposing is about 12 million rows by 2.5 million
columns. It's quite sparse (no more than 10 non-zero elements per row) so
memory shouldn't be a problem. However, running transpose always runs out of
memory in the reduce step:
2011-04-27 10:51:29,910 FATAL org.apache.hadoop.mapred.TaskTracker: Error
running child : java.lang.OutOfMemoryError: Java heap space
        at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:434)
        at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
        at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:134)
        at org.apache.mahout.math.hadoop.TransposeJob$TransposeReducer.reduce(TransposeJob.java:142)
        at org.apache.mahout.math.hadoop.TransposeJob$TransposeReducer.reduce(TransposeJob.java:122)
        at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
In digging into the problem I found out that the reduce task is being run
with -Xmx200m. That is the default hadoop mapred.child.java.opts, since I
didn't override it in the mapred conf on the machine running the job. It
should be possible to set parameters that are used by TransposeJob when
called from the transpose method, but it seems there isn't a way to do so.
Did I miss some other way of transposing the matrix or some way to
configure the transpose job?