Dear all,

After having worked with Giraph for some weeks, I feel there are two features 'missing' in Giraph. It may be that I simply missed them in the Javadoc, since the documentation is a work in progress at this point.

In another Pregel clone, Stanford GPS, it is possible to define a global object map, which all workers can use to share data, such as the current phase of the algorithm. I have not been able to find such a feature in Giraph. Of course it would be possible to (ab)use aggregators for this, but I doubt this is the easiest or most efficient approach.

Furthermore, it would be very helpful if there were one special vertex that has the role of a master. It should not have to correspond to an existing vertex in the graph; in fact, it would be easier if it did not. This master node would then be able to perform some centralized steps in the algorithm, whose output can then be shared with the other workers via the global object map. The master node could have the same interface as the workers (compute(), getAggregator(), getConf(), etc.). Again, it would be possible to solve this in another way, for example in the VertexReader, but this would make the code less elegant and would require picking a vertex id that does not exist in the graph, which is difficult if the input is not known in advance.
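
To make the two ideas a bit more concrete, here is a rough, single-threaded sketch in plain Java. It is only an illustration of the proposal: GlobalObjectMap, Master, Vertex and run() are hypothetical names that exist neither in Giraph nor in Stanford GPS, and the "EXPAND"/"CONTRACT" phases are made up for the example. The master runs once per superstep before the vertices and publishes the current phase, which every vertex then reads from the shared map.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the proposal; none of these types are part of Giraph. */
final class MasterSketch {

    /** Hypothetical global object map: written by the master, read by vertices. */
    static final class GlobalObjectMap {
        private final Map<String, Object> values = new HashMap<>();

        void put(String key, Object value) { values.put(key, value); }

        Object get(String key) { return values.get(key); }
    }

    /** Hypothetical master: performs the centralized step of each superstep. */
    static final class Master {
        void compute(long superstep, GlobalObjectMap globals) {
            // Made-up phase switch: two "EXPAND" supersteps, then "CONTRACT".
            globals.put("phase", superstep < 2 ? "EXPAND" : "CONTRACT");
        }
    }

    /** Stand-in vertex that only records which phase it saw in each superstep. */
    static final class Vertex {
        final List<String> phasesSeen = new ArrayList<>();

        void compute(GlobalObjectMap globals) {
            phasesSeen.add((String) globals.get("phase"));
        }
    }

    /** Runs the toy superstep loop: master first, then every vertex. */
    static List<Vertex> run(int numVertices, int numSupersteps) {
        GlobalObjectMap globals = new GlobalObjectMap();
        Master master = new Master();
        List<Vertex> vertices = new ArrayList<>();
        for (int i = 0; i < numVertices; i++) {
            vertices.add(new Vertex());
        }
        for (long superstep = 0; superstep < numSupersteps; superstep++) {
            master.compute(superstep, globals);
            for (Vertex vertex : vertices) {
                vertex.compute(globals);
            }
        }
        return vertices;
    }

    public static void main(String[] args) {
        List<Vertex> vertices = run(3, 4);
        // prints [EXPAND, EXPAND, CONTRACT, CONTRACT]
        System.out.println(vertices.get(0).phasesSeen);
    }
}
```

In a real distributed setting the map would have to be broadcast to all workers between supersteps, which this sketch ignores entirely. The point is only that the master never needs a vertex id of its own.
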
I realize I am biased because of my earlier experience with Stanford GPS, but I believe these features would neither be very difficult to implement nor add bulk to the API. They would, however, make many graph algorithms easier to implement, because many of these algorithms have some notion of a centralized master node. During the next 5 months I will be working with Giraph for my Master's project, so I would be more than willing to help implement these features, ideally after receiving some pointers from more experienced Giraph developers.

Regards,
Jan van der Lugt