I see your reasoning. In general I'm open to using MR when necessary (e.g. I used to propose it instead of the automatic vertex creation), but here it could get tricky. I can see the additional HDFS usage as a downside (you have to be able to store 2x the graph). However, once the graph is pre-partitioned, this additional stage would not be necessary again for successive jobs (only when a different number of workers is used). Still, it does add a non-trivial step to the process.
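
Just to make sure I picture the same thing, here is a rough sketch of the pre-partitioning job you describe below, written against the plain Hadoop mapreduce API. The class names, the "giraph.numPartitions" config key and the modulo hash are only placeholders; the real job would go through the configured VertexInputFormat and the same partitioner the workers use:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Mapper: reads vertices from the input splits and keys each one by its
// target partition id (a simple modulo hash stands in for the real partitioner).
public class PrePartitionMapper
    extends Mapper<LongWritable, Text, IntWritable, Text> {

  private int numPartitions;

  @Override
  protected void setup(Context context) {
    // Placeholder config key; in practice this comes from the worker count.
    numPartitions = context.getConfiguration().getInt("giraph.numPartitions", 1);
  }

  @Override
  protected void map(LongWritable offset, Text vertexLine, Context context)
      throws IOException, InterruptedException {
    // Assume the vertex id is the first tab-separated token of the line.
    long vertexId = Long.parseLong(vertexLine.toString().split("\t")[0]);
    int partitionId = (int) (Math.abs(vertexId) % numPartitions);
    context.write(new IntWritable(partitionId), vertexLine);
  }
}

// Reducer: all vertices of one partition arrive together, so it just writes
// them out as that partition's file on HDFS.
public class PrePartitionReducer
    extends Reducer<IntWritable, Text, IntWritable, Text> {

  @Override
  protected void reduce(IntWritable partitionId, Iterable<Text> vertices,
      Context context) throws IOException, InterruptedException {
    for (Text vertex : vertices) {
      context.write(partitionId, vertex);
    }
  }
}

Each reducer's output would then correspond to one partition (or one worker's set of partitions), so the Giraph job only needs the output path plus the partition-to-worker mapping to read its data directly from HDFS.
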
On Sun, Jul 22, 2012 at 10:49 PM, Alessandro Presta <[email protected]> wrote:
> Exactly. On paper, the amount of data around should be the same as during
> the computation, but in practice we do use a lot more memory.
> You can play with the settings and just push the problem a little farther
> away, by caching less and flushing requests more frequently, so then the
> bottleneck is on the servers.
> We're basically sending (k-1)/k of the graph through the network, where k
> is the number of workers.
>
> What I'm thinking is that in INPUT_SUPERSTEP we're doing what MapReduce is
> really good at (sorting and aggregating) in a probably inefficient (or at
> least non-scalable) way.
> We could try implementing it with a MapReduce job instead, where the
> mappers take input splits and emit (partition_id, vertex) (they would have
> access to the partitioner) and reducers just output the built partitions
> to HDFS.
> The computation stage would then be the usual Giraph job, where each
> worker knows where to get its partitions from HDFS.
> I can try making this change and see how it goes. It would just be one MR
> job, so we're not selling our souls to iterative MR.
>
> I can also see many cases where one might not want to shuffle vertices
> around at all: each worker reads a roughly equal part of the input (forget
> about bigger vertices for now) and simply communicates its own vertex ids
> to the master. Partition "a posteriori" instead of "a priori".
>
> What do you think?
>
> On 7/20/12 9:42 PM, "Eli Reisman" <[email protected]> wrote:
>
>>What we are seeing in the metrics is the three-way load of
>>
>>1. reading InputSplits from HDFS (mostly over the wire as there is no
>>locality right now)
>>2. creating temporary collections of vertices, sending them on netty
>>3. simultaneously receiving collections of vertices on netty from remote
>>nodes that will be placed in the local workers' partitions for processing
>>stages
>

--
Claudio Martella
[email protected]
