I'm getting started with Giraph and I have a basic program running
using the PseudoRandomVertexInputFormat. However, when I switch it to
the IntIntNullIntTextInputFormat and specify an input file, I get a
divide-by-zero error. I'm assuming that (1) I'm not setting this up
properly and (2) there should probably be a length check before the
division happens.
This is the error I'm getting:
java.lang.ArithmeticException: / by zero
at org.apache.giraph.graph.LocalityInfoSorter.prioritizeLocalInputSplits(LocalityInfoSorter.java:107)
at org.apache.giraph.graph.LocalityInfoSorter.<init>(LocalityInfoSorter.java:71)
at org.apache.giraph.graph.BspServiceWorker.reserveInputSplit(BspServiceWorker.java:228)
at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:317)
at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:604)
at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:368)
at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:569)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
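A minimal sketch of the kind of length check meant in point (2), assuming the failing line takes a remainder by the number of input splits. That assumption is not confirmed against the Giraph source; the class, method, and parameter names below are hypothetical and are not Giraph's API.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for illustration only; not Giraph's LocalityInfoSorter. */
final class SplitOffsetSketch {

  /**
   * Picks the index of the split a worker would try first.
   * Returns -1 when the split list is empty, instead of computing
   * a remainder by zero and throwing ArithmeticException.
   */
  public static int startIndex(int workerHash, List<String> splitPaths) {
    if (splitPaths.isEmpty()) {
      return -1; // nothing to reserve; the caller decides how to report this
    }
    // floorMod keeps the result non-negative even for negative hashes.
    return Math.floorMod(workerHash, splitPaths.size());
  }

  public static void main(String[] args) {
    // Empty split list: guarded, no exception.
    System.out.println(startIndex(7, Collections.<String>emptyList()));
    // Four splits: 7 mod 4.
    System.out.println(startIndex(7, Arrays.asList("a", "b", "c", "d")));
  }
}
```

With a guard like this, an empty split list would surface as an explicit "no input splits" condition rather than as an arithmetic exception deep in worker setup.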
This is the change that causes the program to no longer work:
------------------------------------
import org.apache.hadoop.io.LongWritable;
import org.apache.giraph.graph.Edge;
-import org.apache.giraph.graph.EdgeListVertex;
+import org.apache.giraph.graph.IntIntNullIntVertex;
import org.apache.giraph.graph.GiraphJob;
-import org.apache.giraph.io.PseudoRandomVertexInputFormat;
+import org.apache.giraph.io.IntIntNullIntTextInputFormat;
import org.apache.giraph.io.AdjacencyListTextVertexOutputFormat;
/**
* Simple function to return the in degree for each vertex.
*/
-public class SimpleInDegreeCountVertex extends EdgeListVertex<
- LongWritable, DoubleWritable, DoubleWritable, DoubleWritable>
- implements Tool {
+public class SimpleInDegreeCountVertex extends IntIntNullIntVertex
implements Tool {
private Configuration conf;
@Override
- public void compute(Iterable<DoubleWritable> messages) {
+ public void compute(Iterable<IntWritable> messages) {
voteToHalt();
}
@@ -80,15 +78,11 @@ public class SimpleInDegreeCountVertex extends EdgeListVertex<
GiraphJob job = new GiraphJob(getConf(), getClass().getName());
job.setVertexClass(SimpleInDegreeCountVertex.class);
- job.setVertexInputFormatClass(PseudoRandomVertexInputFormat.class);
+ job.setVertexInputFormatClass(IntIntNullIntTextInputFormat.class);
job.setVertexOutputFormatClass(AdjacencyListTextVertexOutputFormat.class);
job.setWorkerConfiguration(10, 10, 100.0f);
- job.getConfiguration().setLong(
- PseudoRandomVertexInputFormat.AGGREGATE_VERTICES, 100l);
- job.getConfiguration().setLong(
- PseudoRandomVertexInputFormat.EDGES_PER_VERTEX, 2l);
-
+ FileInputFormat.addInputPath(job.getInternalJob(), new Path("connections/1.txt"));
FileOutputFormat.setOutputPath(job.getInternalJob(), new Path("in_degree_output"));
boolean isVerbose = cmd.hasOption('v');
------------------------------------
"connections/1.txt" and "in_degree_output" are both in my home
directory. connections/1.txt has the following content:
1 5
2 5 6
3 5 6
4 1 2 3
I've spent some time digging through the source and comparing to some
of the example classes, but I'm having trouble working this out. Any
thoughts?
Thanks!
Vernon