I have had some trouble scaling it too; that is an issue I've been working at from several angles for a few months now. The main problem is the explosion of messages that occurs.
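To put a rough number on that explosion, here is a minimal back-of-the-envelope sketch in plain Java (no Giraph dependency). It assumes the broadcast pattern in which a vertex with d out-edges sends each of its d neighbor ids to each of its d neighbors, so d * d messages per vertex in the first superstep. That pattern is an assumption about how the neighbor broadcast behaves, not a measurement, and the class and method names are hypothetical.

```java
import java.util.Arrays;

/**
 * Rough estimate of first-superstep message volume for a neighbor-list
 * broadcast such as triangle closing. Hypothetical helper, not Giraph code.
 */
final class MessageVolume {

  /**
   * Assumption: a vertex with d out-edges sends each of its d neighbor ids
   * to each of its d neighbors, i.e. d * d messages in total.
   */
  static long broadcastMessages(long[] outDegrees) {
    long total = 0;
    for (long d : outDegrees) {
      total += d * d;
    }
    return total;
  }

  /** Number of edges, for comparison: the sum of the out-degrees. */
  static long edgeCount(long[] outDegrees) {
    return Arrays.stream(outDegrees).sum();
  }

  public static void main(String[] args) {
    // 1,000 ordinary vertices of degree 10 plus one hub of degree 10,000.
    long[] degrees = new long[1001];
    Arrays.fill(degrees, 10L);
    degrees[1000] = 10_000L;

    System.out.println("edges:    " + edgeCount(degrees));
    System.out.println("messages: " + broadcastMessages(degrees));
  }
}
```

Under that assumption, the 20,000-edge example graph produces 100,100,000 messages, and the single hub accounts for 100,000,000 of them. High-degree vertices dominate the cost, which is why both spilling and deduplicating broadcast messages look attractive.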
It might be worth trying the spill-to-disk features. There was a thread in the JIRA (I think for GIRAPH-328 or 322, maybe a bit earlier; I can check...) where Maja explained that the spill also halts computation when messages build up, so that we never quite overrun our memory reserves during the computation/message stages. This trades time for space, but it is something I have been meaning to experiment with, as in many situations it's a trade well worth making. I will be experimenting with this option myself soon; it's on my "short list" of Giraph stuff-to-do!

I am also independently working on some ways to deduplicate broadcast messages, such as those used in triangle closing, so that in-memory runs of this algorithm are possible at interesting scales. That idea has undergone some "evolution" and is still underway (it's the aforementioned GIRAPH-322), so more to follow there when my schoolwork lets up... ;)

Eli

On Sun, Oct 7, 2012 at 12:11 PM, Vernon Thommeret <[email protected]> wrote:

> Thanks. I ended up getting it working. Having some issues scaling it,
> but working on it.
>
> On Mon, Sep 24, 2012 at 1:17 PM, Eli Reisman <[email protected]> wrote:
> > The io format types have to be compatible. Since
> > IdWithValueVertexOutputFormat does not specify the types it takes, it
> > just attempts to output them using the Writable interface. I use it to
> > output data from the SimpleTriangleClosingVertex. I also had to write an
> > InputFormat to accept IntWritable id's and IntWritable out-edge
> > destinations. Otherwise, should work.
> >
> > On Mon, Sep 24, 2012 at 12:06 AM, Avery Ching <[email protected]> wrote:
> >>
> >> I don't think the types are compatible.
> >>
> >> public class SimpleTriangleClosingVertex extends EdgeListVertex<
> >>     IntWritable, SimpleTriangleClosingVertex.IntArrayListWritable,
> >>     NullWritable, IntWritable>
> >>
> >> You'll need to use an input format and output format that fits these
> >> types. Otherwise the issue is likely to be serialization/deserialization
> >> here.
> >>
> >> On 9/23/12 10:44 PM, Vernon Thommeret wrote:
> >>>
> >>> I'm trying to get the SimpleTriangleClosingVertex to run, but getting
> >>> this error:
> >>>
> >>> java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: IPC server unable to read call parameters: null
> >>>     at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionRequest(BasicRPCCommunications.java:923)
> >>>     at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:327)
> >>>     at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:604)
> >>>     at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:377)
> >>>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:578)
> >>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
> >>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
> >>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >>>     at java.security.AccessController.doPrivileged(Native Method)
> >>>     at javax.security.auth.Subject.doAs(Subject.java:396)
> >>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
> >>>     at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >>> Caused by: org.apache.hadoop.ipc.RemoteException: IPC server
> >>>
> >>> This is the diff that causes the issue:
> >>>
> >>> @@ -33,7 +33,7 @@ import org.apache.hadoop.fs.Path;
> >>>  import org.apache.hadoop.io.IntWritable;
> >>>
> >>>  import org.apache.giraph.graph.GiraphJob;
> >>> -import org.apache.giraph.graph.IntIntNullIntVertex;
> >>> +import org.apache.giraph.examples.SimpleTriangleClosingVertex;
> >>>  import org.apache.giraph.io.IntIntNullIntTextInputFormat;
> >>>  import org.apache.giraph.io.AdjacencyListTextVertexOutputFormat;
> >>>
> >>> @@ -44,16 +44,12 @@ import org.apache.log4j.Logger;
> >>>  /**
> >>>   * Simple function to return the in degree for each vertex.
> >>>   */
> >>> -public class SharedConnectionsVertex extends IntIntNullIntVertex implements Tool {
> >>> +public class SharedConnections implements Tool {
> >>>
> >>>    private Configuration conf;
> >>>    private static final Logger LOG = Logger.getLogger(SharedConnections.class);
> >>>
> >>> -  public void compute(Iterable<IntWritable> messages) {
> >>> -    voteToHalt();
> >>> -  }
> >>> -
> >>>    @Override
> >>>    public final int run(final String[] args) throws Exception {
> >>>      Options options = new Options();
> >>> @@ -71,7 +67,7 @@ public class SharedConnections extends IntIntNullIntVertex implements Tool {
> >>>
> >>>      GiraphJob job = new GiraphJob(getConf(), getClass().getName());
> >>>
> >>> -    job.setVertexClass(SharedConnections.class);
> >>> +    job.setVertexClass(SimpleTriangleClosingVertex.class);
> >>>      job.setVertexInputFormatClass(IntIntNullIntTextInputFormat.class);
> >>>      job.setVertexOutputFormatClass(AdjacencyListTextVertexOutputFormat.class);
> >>>      job.setWorkerConfiguration(10, 10, 100.0f);
> >>>
> >>> --
> >>>
> >>> I.e. I have a dummy job that just outputs the vertices which works,
> >>> but trying to switch the vertex class doesn't seem to work. I'm
> >>> running the latest version of Giraph (rev 1388628). Should this work
> >>> or should I try something different?
> >>>
> >>> Thanks!
> >>> Vernon
