I think you answered your own question: "Or am I supposed to write a
VertexOutputFormat implementation that generates no output for the vertices
that have no data?" Yes!
But don't be put off; it is actually a very simple class to override. Here is
an example for something like what you describe:
package com.ebay.foo.bar.giraph.io.formats;

import org.apache.giraph.graph.Vertex;
import org.apache.giraph.io.formats.TextVertexOutputFormat;
import org.apache.hadoop.io.BooleanWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.TaskAttemptContext;

import java.io.IOException;

public class ExampleOutputFormat extends
    TextVertexOutputFormat<Text, Text, BooleanWritable> {

  // Writes a vertex only when its value is non-empty.
  public class ExampleWriter extends TextVertexWriter {
    @Override
    public void writeVertex(
        Vertex<Text, Text, BooleanWritable> vertex)
        throws IOException, InterruptedException {
      // Vertices with an empty value produce no output record.
      if (!vertex.getValue().toString().isEmpty()) {
        getRecordWriter().write(vertex.getId(), vertex.getValue());
      }
    }
  }

  @Override
  public TextVertexWriter createVertexWriter(TaskAttemptContext context)
      throws IOException, InterruptedException {
    return new ExampleWriter();
  }
}
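
If it helps to see the filtering idea on its own, here is a minimal sketch in plain Java with no Giraph or Hadoop dependencies. Everything in it is hypothetical and for illustration only: the class name FilteredOutputSketch, the method writeNonEmpty, and the use of a Map from vertex id to value as a stand-in for the graph. It also assumes the id and value are joined by a tab, which may differ from what your job is configured to emit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the output phase: not part of Giraph.
public class FilteredOutputSketch {

    // Returns one "id<TAB>value" line per vertex whose value is non-empty;
    // vertices with an empty value contribute nothing to the output.
    static List<String> writeNonEmpty(Map<String, String> vertexValues) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> entry : vertexValues.entrySet()) {
            if (!entry.getValue().isEmpty()) {
                lines.add(entry.getKey() + "\t" + entry.getValue());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Insertion-ordered map so the output order is predictable.
        Map<String, String> graph = new LinkedHashMap<>();
        graph.put("a", "reached");
        graph.put("b", "");        // no data: should be skipped
        graph.put("c", "reached");
        for (String line : writeNonEmpty(graph)) {
            System.out.println(line);
        }
    }
}
```

The point is only that the decision about which vertices appear in the output lives in the writer, so the computation itself does not need to know about it.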
Thomas A J Schweiger
Sr. Software Architect
GDI-Inc Data Services-Seattle
Office: (425) 586-2669
email: [email protected]
________________________________
From: [email protected] [[email protected]] on behalf of Matthew
Cornell [[email protected]]
Sent: Monday, August 25, 2014 11:38 AM
To: user
Subject: How do I output only a subset of a graph?
Hi Folks. I have a graph computation that starts with a subset of vertices of a
certain type and propagates information through the graph to a set of target
vertices, which are also a subset of the graph. I want to output only information
from those particular vertices, but I don't see a way to do this in the various
VertexOutputFormat subclasses, which all seem oriented to outputting something
for every vertex in the graph. How do I do this? E.g., are there hooks for the
output phase where I can filter output? Or am I supposed to write a
VertexOutputFormat implementation that generates no output for the vertices
that have no data? Thanks in advance.
--
Matthew Cornell | [email protected] |
413-626-3621 | 34 Dickinson Street, Amherst MA 01002 |
http://matthewcornell.org