Apache Giraph is a framework for graph processing. It currently runs on top of MapReduce ("MR"), but is getting its own coordination via YARN soon: http://giraph.apache.org
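
To give a rough idea of the vertex-centric BSP model (vertices compute, send messages to their neighbours, then wait at a barrier before the next superstep), here is a toy single-process sketch in plain Python. It is not Giraph's API and does not run on Hadoop; the names run_bsp_max, graph and values are made up for illustration, and the graph is assumed to be undirected with every vertex present as a key.

```python
from collections import defaultdict


def run_bsp_max(graph, values, max_supersteps=50):
    """Toy BSP loop: propagate the largest value through each connected component.

    graph  -- dict mapping vertex -> list of neighbour vertices (hypothetical format)
    values -- dict mapping vertex -> initial numeric value
    Returns (final_values, supersteps_executed).
    """
    values = dict(values)  # work on a copy, leave the caller's dict alone

    # Superstep 0: every vertex sends its own value to all its neighbours.
    inbox = defaultdict(list)
    for vertex, neighbours in graph.items():
        for n in neighbours:
            inbox[n].append(values[vertex])
    supersteps = 1

    # Later supersteps: only vertices that received messages do any work.
    while inbox and supersteps < max_supersteps:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            best = max(messages)
            if best > values[vertex]:
                # Value changed, so tell the neighbours in the next superstep.
                values[vertex] = best
                for n in graph[vertex]:
                    outbox[n].append(best)
        # "Barrier": messages sent in this superstep become visible only now.
        inbox = outbox
        supersteps += 1

    return values, supersteps


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
    values = {"a": 3, "b": 6, "c": 2, "d": 1}
    final_values, steps = run_bsp_max(graph, values)
    print(final_values, steps)
```

The point of the sketch is that the "cross-communication" happens through messages delivered between supersteps, so no vertex ever reads another vertex's state directly. In a real system the loop over vertices is spread across workers and the framework handles the barrier.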
You may also check out the generic BSP system, Apache Hama: http://hama.apache.org (Giraph uses BSP too, if I am not wrong, but it doesn't use Hama; it works over MR instead).

On Wed, Sep 26, 2012 at 9:51 PM, Jane Wayne <jane.wayne2...@gmail.com> wrote:
> i'll look for myself, but could you please let me know what is giraph?
> is it another layer on hadoop like hive/pig or an api like mahout?
>
> On Wed, Sep 26, 2012 at 12:09 PM, Jonathan Bishop <jbishop....@gmail.com>
> wrote:
>> Yes, Giraph seems like the best way to go - it is mainly a vertex
>> evaluation with message passing between vertices. Synchronization is
>> handled for you.
>>
>> On Wed, Sep 26, 2012 at 8:36 AM, Jane Wayne <jane.wayne2...@gmail.com> wrote:
>>
>>> hi,
>>>
>>> i know that some algorithms cannot be parallelized and adapted to the
>>> mapreduce paradigm. however, i have noticed that in most cases where i
>>> find myself struggling to express an algorithm in mapreduce, the
>>> problem is mainly due to no ability to cross-communicate between
>>> mappers or reducers.
>>>
>>> one naive approach i've seen mentioned here and elsewhere, is to use a
>>> database to store data for use by all the mappers. however, i have
>>> seen many arguments (that i agree with largely) against this approach.
>>>
>>> in general, my question is this: has anyone tried to implement an
>>> algorithm using mapreduce where mappers required cross-communications?
>>> how did you solve this limitation of mapreduce?
>>>
>>> thanks,
>>>
>>> jane.
>>>

--
Harsh J