Brief follow-up: GIRAPH-314, which is not rebased or committed yet, is another piece of this puzzle. In it I attempt to combine the messages and add a primitive (hacky) ability to amortize the supersteps in which vertices message each other, keeping the per-superstep message volume down. It's a blatant trade of time for space, and probably a desperate cry for help too. I will update it ASAP so you can play with it. I had pretty promising results, but that was when I had a cluster to play with ;)
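To make the time-for-space trade concrete, here is a minimal sketch in plain Java. It is not the GIRAPH-314 patch and uses no Giraph API; the class name, the method names, and the "send only in your own phase" rule (vertex id modulo the number of phases) are all assumptions made up for illustration. It only counts how many messages would be in flight per superstep if each vertex sent one message per out-edge during its assigned phase.

```java
import java.util.Arrays;

/** Hypothetical illustration only: not GIRAPH-314 and not Giraph API. */
final class StaggeredBroadcast {

  /**
   * Assumed rule: with sending spread over `phases` supersteps, a vertex
   * sends only in the superstep whose index matches its id modulo `phases`.
   */
  static boolean sendsInSuperstep(int vertexId, long superstep, int phases) {
    return Math.floorMod(vertexId, phases) == (int) (superstep % phases);
  }

  /**
   * Counts messages sent in each of the `phases` supersteps, assuming every
   * vertex sends one message per out-edge, exactly once, in its own phase.
   */
  static long[] messagesPerSuperstep(int[] outDegrees, int phases) {
    long[] counts = new long[phases];
    for (int v = 0; v < outDegrees.length; v++) {
      for (int s = 0; s < phases; s++) {
        if (sendsInSuperstep(v, s, phases)) {
          counts[s] += outDegrees[v];
        }
      }
    }
    return counts;
  }

  public static void main(String[] args) {
    int[] outDegrees = {4, 4, 4, 4, 4, 4, 4, 4};
    // Everyone sends at once: a single superstep carries all the messages.
    System.out.println(Arrays.toString(messagesPerSuperstep(outDegrees, 1)));
    // Sending spread over four supersteps: more time, lower peak volume.
    System.out.println(Arrays.toString(messagesPerSuperstep(outDegrees, 4)));
  }
}
```

The total number of messages is unchanged; only the peak per superstep drops, at the cost of running more supersteps.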
First step, I'd try Maja's recipe for spill-to-disk during messaging. Her advice is in those 314-322-328 threads.

On Mon, Oct 8, 2012 at 3:55 PM, Eli Reisman <[email protected]> wrote:

> I have had some trouble scaling it too; that is an issue I've been working
> at from several angles for a few months now. The main problem is the
> explosion of messaging that occurs.
>
> It might be worth trying to employ the spill-to-disk features. There was a
> thread in the JIRA (I think for GIRAPH-328 or 322, maybe a bit earlier, I
> can check...) where Maja explained that the spill also halts computation
> when messages build up so that we never quite overrun our memory reserves
> during the computation/message stages. This trades time for space, but is
> something I have been meaning to experiment with, as in many situations
> it's a trade well worth making. I will be experimenting with this option
> myself soon, it's on my "short list" of Giraph stuff-to-do!
>
> I am also independently working on some ways to deduplicate broadcast
> messages such as those used in triangle closing so that in-memory runs of
> this algorithm are possible at interesting scales. That idea has undergone
> some "evolution" and is still underway (it's the aforementioned GIRAPH-322),
> so more to follow there when my schoolwork lets up... ;)
>
> Eli
>
> On Sun, Oct 7, 2012 at 12:11 PM, Vernon Thommeret <[email protected]> wrote:
>
>> Thanks. I ended up getting it working. Having some issues scaling it,
>> but working on it.
>>
>> On Mon, Sep 24, 2012 at 1:17 PM, Eli Reisman <[email protected]> wrote:
>> > The io format types have to be compatible. Since
>> > IdWithValueVertexOutputFormat does not specify the types it takes, it just
>> > attempts to output them using the Writable interface. I use it to output
>> > data from the SimpleTriangleClosingVertex. I also had to write an
>> > InputFormat to accept IntWritable ids and IntWritable out-edge
>> > destinations. Otherwise, should work.
>> >
>> > On Mon, Sep 24, 2012 at 12:06 AM, Avery Ching <[email protected]> wrote:
>> >>
>> >> I don't think the types are compatible.
>> >>
>> >> public class SimpleTriangleClosingVertex extends EdgeListVertex<
>> >>     IntWritable, SimpleTriangleClosingVertex.IntArrayListWritable,
>> >>     NullWritable, IntWritable>
>> >>
>> >> You'll need to use an input format and output format that fits these
>> >> types. Otherwise the issue is likely to be serialization/deserialization
>> >> here.
>> >>
>> >> On 9/23/12 10:44 PM, Vernon Thommeret wrote:
>> >>>
>> >>> I'm trying to get the SimpleTriangleClosingVertex to run, but getting
>> >>> this error:
>> >>>
>> >>> java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: IPC
>> >>> server unable to read call parameters: null
>> >>>     at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionRequest(BasicRPCCommunications.java:923)
>> >>>     at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:327)
>> >>>     at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:604)
>> >>>     at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:377)
>> >>>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:578)
>> >>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>> >>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>> >>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>> >>>     at java.security.AccessController.doPrivileged(Native Method)
>> >>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>> >>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>> >>>     at org.apache.hadoop.mapred.Child.main(Child.java:264)
>> >>> Caused by: org.apache.hadoop.ipc.RemoteException: IPC server
>> >>>
>> >>> This is the diff that causes the issue:
>> >>>
>> >>> @@ -33,7 +33,7 @@ import org.apache.hadoop.fs.Path;
>> >>>  import org.apache.hadoop.io.IntWritable;
>> >>>
>> >>>  import org.apache.giraph.graph.GiraphJob;
>> >>> -import org.apache.giraph.graph.IntIntNullIntVertex;
>> >>> +import org.apache.giraph.examples.SimpleTriangleClosingVertex;
>> >>>  import org.apache.giraph.io.IntIntNullIntTextInputFormat;
>> >>>  import org.apache.giraph.io.AdjacencyListTextVertexOutputFormat;
>> >>>
>> >>> @@ -44,16 +44,12 @@ import org.apache.log4j.Logger;
>> >>>  /**
>> >>>   * Simple function to return the in degree for each vertex.
>> >>>   */
>> >>> -public class SharedConnectionsVertex extends IntIntNullIntVertex implements Tool {
>> >>> +public class SharedConnections implements Tool {
>> >>>
>> >>>    private Configuration conf;
>> >>>    private static final Logger LOG = Logger.getLogger(SharedConnections.class);
>> >>>
>> >>> -  public void compute(Iterable<IntWritable> messages) {
>> >>> -    voteToHalt();
>> >>> -  }
>> >>> -
>> >>>    @Override
>> >>>    public final int run(final String[] args) throws Exception {
>> >>>      Options options = new Options();
>> >>> @@ -71,7 +67,7 @@ public class SharedConnections extends IntIntNullIntVertex implements Tool {
>> >>>
>> >>>      GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>> >>>
>> >>> -    job.setVertexClass(SharedConnections.class);
>> >>> +    job.setVertexClass(SimpleTriangleClosingVertex.class);
>> >>>      job.setVertexInputFormatClass(IntIntNullIntTextInputFormat.class);
>> >>>      job.setVertexOutputFormatClass(AdjacencyListTextVertexOutputFormat.class);
>> >>>      job.setWorkerConfiguration(10, 10, 100.0f);
>> >>>
>> >>> --
>> >>>
>> >>> I.e. I have a dummy job that just outputs the vertices which works,
>> >>> but trying to switch the vertex class doesn't seem to work. I'm
>> >>> running the latest version of Giraph (rev 1388628). Should this work
>> >>> or should I try something different?
>> >>>
>> >>> Thanks!
>> >>> Vernon
