Hi all,

I have been comparing LDA run directly on some text corpora against LDA run
after an SVD step. However, once the rank has been reduced with SVD, the LDA
job immediately fails with "Spill failed" errors, which doesn't make any sense
to me.
Any help is appreciated! Thanks!

Chunk of the error:
11/05/26 12:50:56 INFO mapred.JobClient: Running job: job_201105261036_0002
11/05/26 12:50:57 INFO mapred.JobClient:  map 0% reduce 0%
11/05/26 12:51:12 INFO mapred.JobClient: Task Id : attempt_201105261036_0002_m_000000_0, Status : FAILED
java.io.IOException: Spill failed
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1044)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
    at org.apache.mahout.common.IntPairWritable.write(IntPairWritable.java:83)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:892)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at org.apache.mahout.clustering.lda.LDAMapper.map(LDAMapper.java:71)
    at org.apache.mahout.clustering.lda.LDAMapper.map(LDAMapper.java:36)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.IllegalArgumentException: Found NaN for topic=(%d,%d) [-2, -2]
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:116)
    at org.apache.mahout.clustering.lda.LDAReducer.reduce(LDAReducer.java:41)
    at org.apache.mahout.clustering.lda.LDAReducer.reduce(LDAReducer.java:29)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1222)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1265)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:686)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1173)


Florie
