Hello all:

I'm testing the partial implementation of Random Forest on Hadoop 2.0.0-cdh4.1.1.

I'm trying to modify the algorithm. All I want to do is store more information in the leaves of the trees: a leaf currently contains only the label, and I want to add one more value (a leaf weight) alongside it:

@Override
public void readFields(DataInput in) throws IOException {
  // Must consume the fields in the same order writeNode() emits them.
  label = in.readDouble();
  leafWeight = in.readDouble(); // new field
}

@Override
protected void writeNode(DataOutput out) throws IOException {
  out.writeDouble(label);
  out.writeDouble(leafWeight); // new field
}
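
As far as I can tell the two methods are symmetric. Here is a minimal standalone sketch of the round trip I expect (a simplified stand-in I wrote for this mail, not the actual Mahout Leaf class):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class LeafRoundTrip {

  // Mirrors writeNode(): label first, then the new leafWeight field.
  static void writeNode(DataOutput out, double label, double leafWeight) throws IOException {
    out.writeDouble(label);
    out.writeDouble(leafWeight);
  }

  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    writeNode(new DataOutputStream(buffer), 1.0, 0.25);

    // Mirrors readFields(): consume the doubles in the order they were written.
    DataInput in = new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
    double label = in.readDouble();
    double leafWeight = in.readDouble();

    System.out.println("label=" + label + ", leafWeight=" + leafWeight);
  }
}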

And I get the following error:

13/02/27 06:53:27 INFO mapreduce.BuildForest: Partial Mapred implementation
13/02/27 06:53:27 INFO mapreduce.BuildForest: Building the forest...
13/02/27 06:53:27 INFO mapreduce.BuildForest: Weights Estimation: IR
13/02/27 06:53:37 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/02/27 06:53:39 INFO input.FileInputFormat: Total input paths to process : 1
13/02/27 06:53:39 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/02/27 06:53:39 WARN snappy.LoadSnappy: Snappy native library not loaded
13/02/27 06:53:39 INFO mapred.JobClient: Running job: job_201302270205_0013
13/02/27 06:53:40 INFO mapred.JobClient: map 0% reduce 0%
13/02/27 06:54:18 INFO mapred.JobClient: map 20% reduce 0%
13/02/27 06:54:42 INFO mapred.JobClient: map 40% reduce 0%
13/02/27 06:55:03 INFO mapred.JobClient: map 60% reduce 0%
13/02/27 06:55:26 INFO mapred.JobClient: map 70% reduce 0%
13/02/27 06:55:27 INFO mapred.JobClient: map 80% reduce 0%
13/02/27 06:55:49 INFO mapred.JobClient: map 100% reduce 0%
13/02/27 06:56:04 INFO mapred.JobClient: Job complete: job_201302270205_0013
13/02/27 06:56:04 INFO mapred.JobClient: Counters: 24
13/02/27 06:56:04 INFO mapred.JobClient: File System Counters
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of bytes read=0
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of bytes written=1828230
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of read operations=0
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of large read operations=0
13/02/27 06:56:04 INFO mapred.JobClient: FILE: Number of write operations=0
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of bytes read=1381649
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of bytes written=1680
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of read operations=30
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of large read operations=0
13/02/27 06:56:04 INFO mapred.JobClient: HDFS: Number of write operations=10
13/02/27 06:56:04 INFO mapred.JobClient: Job Counters
13/02/27 06:56:04 INFO mapred.JobClient: Launched map tasks=10
13/02/27 06:56:04 INFO mapred.JobClient: Data-local map tasks=10
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all maps in occupied slots (ms)=254707
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all reduces in occupied slots (ms)=0
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
13/02/27 06:56:04 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
13/02/27 06:56:04 INFO mapred.JobClient: Map-Reduce Framework
13/02/27 06:56:04 INFO mapred.JobClient: Map input records=20
13/02/27 06:56:04 INFO mapred.JobClient: Map output records=10
13/02/27 06:56:04 INFO mapred.JobClient: Input split bytes=1540
13/02/27 06:56:04 INFO mapred.JobClient: Spilled Records=0
13/02/27 06:56:04 INFO mapred.JobClient: CPU time spent (ms)=12070
13/02/27 06:56:04 INFO mapred.JobClient: Physical memory (bytes) snapshot=949579776
13/02/27 06:56:04 INFO mapred.JobClient: Virtual memory (bytes) snapshot=8412340224
13/02/27 06:56:04 INFO mapred.JobClient: Total committed heap usage (bytes)=478412800
Exception in thread "main" java.lang.IllegalStateException: java.io.EOFException
at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:104)
at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:38)
at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.processOutput(PartialBuilder.java:129)
at org.apache.mahout.classifier.df.mapreduce.partial.PartialBuilder.parseOutput(PartialBuilder.java:96)
at org.apache.mahout.classifier.df.mapreduce.Builder.build(Builder.java:312)
at org.apache.mahout.classifier.df.mapreduce.BuildForest.buildForest(BuildForest.java:246)
at org.apache.mahout.classifier.df.mapreduce.BuildForest.run(BuildForest.java:200)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.mahout.classifier.df.mapreduce.BuildForest.main(BuildForest.java:270)
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:180)
at java.io.DataInputStream.readLong(DataInputStream.java:399)
at java.io.DataInputStream.readDouble(DataInputStream.java:451)
at org.apache.mahout.classifier.df.node.Leaf.readFields(Leaf.java:136)
at org.apache.mahout.classifier.df.node.Node.read(Node.java:85)
at org.apache.mahout.classifier.df.mapreduce.MapredOutput.readFields(MapredOutput.java:64)
at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2114)
at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2242)
at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.computeNext(SequenceFileIterator.java:95)
... 10 more
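
In case it helps, this is roughly how I intend to dump the job's raw SequenceFile output to find the record that breaks. This is an untested sketch: the part-file path is a placeholder, it uses the deprecated SequenceFile.Reader constructor that CDH4 still ships, and it assumes MapredOutput's no-arg constructor is accessible:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.classifier.df.mapreduce.MapredOutput;
import org.apache.mahout.classifier.df.mapreduce.partial.TreeID;

public class DumpForestOutput {

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Placeholder: one part file from the partial builder's output directory.
    Path part = new Path("forest-output/part-m-00000");

    SequenceFile.Reader reader = new SequenceFile.Reader(fs, part, conf);
    try {
      TreeID key = new TreeID();
      MapredOutput value = new MapredOutput(); // assuming the no-arg constructor is visible
      // Each record deserializes one tree; the EOFException should surface on
      // the first record whose leaves were written in the old (one-double) layout.
      while (reader.next(key, value)) {
        System.out.println("partition=" + key.partition() + " tree=" + key.treeId());
      }
    } finally {
      reader.close();
    }
  }
}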

Does anyone know what the problem is? From the stack trace, the EOFException is thrown inside Leaf.readFields, as if the stream ends before the second double can be read.

Is it possible to write more information in the leaves of the tree?

Thank you very much.


Best regards,

Sara
