Hey Eli,

Thanks for the suggestions. I've been playing with this nights and weekends, which is why there's been such a delay :). I should have more time in a couple weeks and will dig back in and report back.

Vernon

On Mon, Oct 8, 2012 at 9:00 PM, Eli Reisman <[email protected]> wrote:
> Brief follow-up:
>
> GIRAPH-314, which is not rebased or committed yet, is another part of this
> puzzle where I attempt to combine the messages and allow primitive (hacky)
> ability to amortize the supersteps where vertices message each other to keep
> the volume of messages down per-superstep. It's a blatant trade of time for
> space, and probably a desperate cry for help too. I will update it ASAP so
> you can play with it. I had pretty promising results but that was when I had
> a cluster to play with ;)
>
> First step, I'd try Maja's recipe for spill-to-disk during messaging. Her
> advice is in those 314-322-328 threads.
>
>
> On Mon, Oct 8, 2012 at 3:55 PM, Eli Reisman <[email protected]>
> wrote:
>>
>> I have had some trouble scaling it too, that is an issue I've been working
>> at from several angles for a few months now. The main problem is the
>> explosion of messaging that occurs.
>>
>> It might be worth trying to employ the spill-to-disk features, there was a
>> thread in the JIRA (I think for GIRAPH-328 or 322, maybe a bit earlier I can
>> check...) where Maja explained that the spill also halts computation when
>> messages build up so that we never quite overrun our memory reserves during
>> the computation/message stages. This trades time for space, but is something
>> I have been meaning to experiment with, as in many situations it's a trade
>> well worth making. I will be experimenting with this option myself soon, it's
>> on my "short list" of Giraph stuff-to-do!
>>
>> I am also independently working on some ways to deduplicate broadcast
>> messages such as those used in triangle closing so that in-memory runs of
>> this algorithm are possible at interesting scales. That idea has undergone
>> some "evolution" and is still underway, (it's the aforementioned GIRAPH-322)
>> so more to follow there when my schoolwork lets up...
>> ;)
>>
>> Eli
>>
>>
>>
>> On Sun, Oct 7, 2012 at 12:11 PM, Vernon Thommeret <[email protected]>
>> wrote:
>>>
>>> Thanks. I ended up getting it working. Having some issues scaling it,
>>> but working on it.
>>>
>>> On Mon, Sep 24, 2012 at 1:17 PM, Eli Reisman <[email protected]>
>>> wrote:
>>> > The io format types have to be compatible. Since
>>> > IdWithValueVertexOutputFormat does not specify the types it takes, it
>>> > just attempts to output them using the Writable interface. I use it to
>>> > output data from the SimpleTriangleClosingVertex. I also had to write an
>>> > InputFormat to accept IntWritable ids and IntWritable out-edge
>>> > destinations. Otherwise, should work.
>>> >
>>> >
>>> >
>>> > On Mon, Sep 24, 2012 at 12:06 AM, Avery Ching <[email protected]>
>>> > wrote:
>>> >>
>>> >> I don't think the types are compatible.
>>> >>
>>> >> public class SimpleTriangleClosingVertex extends EdgeListVertex<
>>> >> IntWritable, SimpleTriangleClosingVertex.IntArrayListWritable,
>>> >> NullWritable, IntWritable>
>>> >>
>>> >> You'll need to use an input format and output format that fits these
>>> >> types. Otherwise the issue is likely to be serialization/deserialization
>>> >> here.
>>> >>
>>> >>
>>> >> On 9/23/12 10:44 PM, Vernon Thommeret wrote:
>>> >>>
>>> >>> I'm trying to get the SimpleTriangleClosingVertex to run, but getting
>>> >>> this error:
>>> >>>
>>> >>> java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: IPC server unable to read call parameters: null
>>> >>>     at org.apache.giraph.comm.BasicRPCCommunications.sendPartitionRequest(BasicRPCCommunications.java:923)
>>> >>>     at org.apache.giraph.graph.BspServiceWorker.loadVertices(BspServiceWorker.java:327)
>>> >>>     at org.apache.giraph.graph.BspServiceWorker.setup(BspServiceWorker.java:604)
>>> >>>     at org.apache.giraph.graph.GraphMapper.setup(GraphMapper.java:377)
>>> >>>     at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:578)
>>> >>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>> >>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>> >>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>>> >>>     at java.security.AccessController.doPrivileged(Native Method)
>>> >>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>> >>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
>>> >>>     at org.apache.hadoop.mapred.Child.main(Child.java:264)
>>> >>> Caused by: org.apache.hadoop.ipc.RemoteException: IPC server
>>> >>>
>>> >>> This is the diff that causes the issue:
>>> >>>
>>> >>> @@ -33,7 +33,7 @@ import org.apache.hadoop.fs.Path;
>>> >>>  import org.apache.hadoop.io.IntWritable;
>>> >>>
>>> >>>  import org.apache.giraph.graph.GiraphJob;
>>> >>> -import org.apache.giraph.graph.IntIntNullIntVertex;
>>> >>> +import org.apache.giraph.examples.SimpleTriangleClosingVertex;
>>> >>>  import org.apache.giraph.io.IntIntNullIntTextInputFormat;
>>> >>>  import org.apache.giraph.io.AdjacencyListTextVertexOutputFormat;
>>> >>>
>>> >>> @@ -44,16 +44,12 @@ import org.apache.log4j.Logger;
>>> >>>  /**
>>> >>>   * Simple function to return the in degree for each vertex.
>>> >>>   */
>>> >>> -public class SharedConnectionsVertex extends IntIntNullIntVertex implements Tool {
>>> >>> +public class SharedConnections implements Tool {
>>> >>>
>>> >>>    private Configuration conf;
>>> >>>    private static final Logger LOG =
>>> >>>      Logger.getLogger(SharedConnections.class);
>>> >>>
>>> >>> -  public void compute(Iterable<IntWritable> messages) {
>>> >>> -    voteToHalt();
>>> >>> -  }
>>> >>> -
>>> >>>    @Override
>>> >>>    public final int run(final String[] args) throws Exception {
>>> >>>      Options options = new Options();
>>> >>> @@ -71,7 +67,7 @@ public class SharedConnections extends IntIntNullIntVertex implements Tool {
>>> >>>
>>> >>>      GiraphJob job = new GiraphJob(getConf(), getClass().getName());
>>> >>>
>>> >>> -    job.setVertexClass(SharedConnections.class);
>>> >>> +    job.setVertexClass(SimpleTriangleClosingVertex.class);
>>> >>>      job.setVertexInputFormatClass(IntIntNullIntTextInputFormat.class);
>>> >>>      job.setVertexOutputFormatClass(AdjacencyListTextVertexOutputFormat.class);
>>> >>>      job.setWorkerConfiguration(10, 10, 100.0f);
>>> >>>
>>> >>> --
>>> >>>
>>> >>> I.e. I have a dummy job that just outputs the vertices which works,
>>> >>> but trying to switch the vertex class doesn't seem to work. I'm
>>> >>> running the latest version of Giraph (rev 1388628). Should this work
>>> >>> or should I try something different?
>>> >>>
>>> >>> Thanks!
>>> >>> Vernon
>>> >>
>>> >>
>>> >
>>
>>
>
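
For anyone hitting the same error: the mismatch Avery describes can be illustrated with a small plain-Java sketch. It does not use Giraph or Hadoop at all. Every class in it (IntW, NullW, IntListW, Vertex, InputFormat, TriangleVertex, IntIntNullIntInput, IntListInput) is a hypothetical stand-in, and it assumes for illustration that the vertex and the input format are parameterized over the same four types (id, vertex value, edge value, message). The check just compares those type arguments by reflection. It only shows the idea of "types have to be compatible"; it is not how Giraph itself validates a job.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Arrays;

public class TypeCompatSketch {
    // Hypothetical stand-ins for writable types; NOT the Hadoop classes.
    static class IntW {}
    static class NullW {}
    static class IntListW {}

    // Hypothetical stand-ins for a vertex and an input format, each
    // parameterized as <id, vertex value, edge value, message> (an assumption).
    static abstract class Vertex<I, V, E, M> {}
    static abstract class InputFormat<I, V, E, M> {}

    // Mirrors the shape of the vertex quoted in the thread: the vertex
    // value is a list type, not a plain int.
    static class TriangleVertex extends Vertex<IntW, IntListW, NullW, IntW> {}

    // A format that produces int vertex values (mismatch in the second slot).
    static class IntIntNullIntInput extends InputFormat<IntW, IntW, NullW, IntW> {}

    // A format written to match the vertex's four types.
    static class IntListInput extends InputFormat<IntW, IntListW, NullW, IntW> {}

    // Reads the actual type arguments a class passes to its generic superclass.
    static Type[] typeArgs(Class<?> c) {
        return ((ParameterizedType) c.getGenericSuperclass()).getActualTypeArguments();
    }

    // Two classes are "compatible" here if all four type arguments match exactly.
    static boolean compatible(Class<?> vertex, Class<?> format) {
        return Arrays.equals(typeArgs(vertex), typeArgs(format));
    }

    public static void main(String[] args) {
        System.out.println("int-valued format: "
            + compatible(TriangleVertex.class, IntIntNullIntInput.class));
        System.out.println("list-valued format: "
            + compatible(TriangleVertex.class, IntListInput.class));
    }
}
```

In this sketch the int-valued format fails the check because its vertex value type differs from the vertex's, which is the same kind of mismatch as pairing IntIntNullIntTextInputFormat with SimpleTriangleClosingVertex in the diff above.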
