Hi Miklos! Yes, I've already looked at your patch. It's perfect! (Actually, it was on my to-do list.)
By the way, having been developing 0.2 for so long, I think it is time to release Hama 0.2 and move on to the next features. (You can check our roadmap here: http://wiki.apache.org/hama/RoadMap) Here are my more detailed thoughts on the 0.2 release:

1) Currently, Apache Hama 0.2 provides *basic* functionality that lets users understand the Bulk Synchronous Parallel model. That's it! So there is no further performance improvement planned for it. I think it will also offer great opportunities to participate for new volunteers like you. :)

2) Once we release Hama 0.2, we can start on Hama 0.3 right away. IMO, it will be *product-level* software that includes many useful features with improved performance, such as an in/output system, runtime compression of BSP messages, network-condition-based mechanisms, a web UI, etc.

For the above reasons, I would prefer to commit your patch to the 0.3 version. What do you think?

P.S. I hope you forgive me; I'm CC'ing our community so we can discuss this together.

2011/3/18 Miklós Erdélyi <[email protected]>:
> Hi Edward,
>
> Thanks for accepting my first little patch for HAMA (TestBSPPeer
> modification)!
>
> I'm a PhD student in Hungary at the University of Pannonia. At the
> moment my research topic is related to link-based graph-similarity
> algorithms, for which I plan to do measurements on implementations in
> HAMA BSP.
> I've already written a prototype on top of HAMA which supports
> Pregel-like graph computations without fault tolerance. It supports
> loading graph partitions by peers, performing local computations on
> vertices, grouping incoming messages per vertex, and stopping after
> a fixed number of iterations.
>
> Currently the greatest bottleneck is the messaging part: that is where
> I would like to help improve HAMA. I'd like to implement asynchronous
> message processing and grouping of incoming messages by message tag. The
> latter would allow more efficient iteration through messages intended
> for a specific vertex, while the former would shorten the computational
> and messaging phases generally.
>
> As a first step I've uploaded to JIRA a simple patch which speeds up
> HAMA messaging by delivering messages for the local peer directly into
> its queue (i.e., there's no network transfer for locally delivered
> messages). Also, this patch adds a peer InetNetworkAddress cache to
> avoid costly InetNetworkAddress construction on every message sent.
> You have already commented on this patch so I guess you have already
> looked at it.
>
> In case you have ideas or pointers for the HAMA-based graph
> computation framework in general or improving messaging performance in
> particular, please let me know! :)
>
> Thanks,
> Miklos

--
Best Regards, Edward J. Yoon
http://blog.udanax.org
http://twitter.com/eddieyoon
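For readers following along: the Pregel-like loop Miklos describes (peers hold a graph partition, incoming messages are grouped per vertex, each vertex computes locally, and the job stops after a fixed number of supersteps) can be sketched in plain Java. This is a toy illustration, not the actual HAMA or Pregel API; all class and variable names here are made up. Each vertex just propagates the maximum value it has seen.

```java
import java.util.*;

// Toy sketch of a fixed-iteration, per-vertex-grouped-message superstep loop.
// Not HAMA's real API; purely illustrative.
public class PregelSketch {
    public static Map<Integer, Integer> run() {
        // Tiny "partition": directed chain 0 -> 1 -> 2.
        Map<Integer, int[]> edges = new HashMap<>();
        edges.put(0, new int[]{1});
        edges.put(1, new int[]{2});
        edges.put(2, new int[]{});

        // Initial per-vertex values.
        Map<Integer, Integer> value = new HashMap<>();
        value.put(0, 5);
        value.put(1, 1);
        value.put(2, 3);

        final int supersteps = 3; // fixed iteration count, as in the prototype
        Map<Integer, List<Integer>> inbox = new HashMap<>(); // messages grouped per vertex
        for (int step = 0; step < supersteps; step++) {
            Map<Integer, List<Integer>> outbox = new HashMap<>();
            for (int v : edges.keySet()) {
                // Local computation: fold the vertex's grouped messages into its value.
                int max = value.get(v);
                for (int m : inbox.getOrDefault(v, Collections.emptyList())) {
                    max = Math.max(max, m);
                }
                value.put(v, max);
                // Send the new value along all outgoing edges.
                for (int dst : edges.get(v)) {
                    outbox.computeIfAbsent(dst, k -> new ArrayList<>()).add(max);
                }
            }
            inbox = outbox; // barrier: the next superstep consumes these messages
        }
        return value;
    }

    public static void main(String[] args) {
        // After 3 supersteps, the global max (5) has propagated down the chain.
        System.out.println(PregelSketch.run());
    }
}
```

Grouping the inbox by destination vertex is what makes the per-vertex compute step a simple local fold, which is the property the message-tag grouping Miklos mentions would generalize.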
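The two optimizations in the JIRA patch (short-circuiting messages addressed to the local peer into its own queue, and caching resolved peer addresses so they are not reconstructed on every send) can likewise be sketched standalone. Again, these class and field names are illustrative assumptions, not HAMA's real messaging classes.

```java
import java.util.*;
import java.util.concurrent.*;

// Illustrative sketch (not HAMA's actual classes) of the patch's two ideas:
// (1) messages for the local peer are enqueued directly, skipping the
//     network transfer path entirely;
// (2) destination addresses are resolved once and cached per peer.
public class LocalDeliverySketch {
    final String localPeer;
    final Queue<String> localQueue = new ConcurrentLinkedQueue<>();
    final Map<String, Object> addressCache = new HashMap<>(); // peer -> resolved address
    int networkSends = 0; // counts messages that would actually hit the network

    LocalDeliverySketch(String localPeer) {
        this.localPeer = localPeer;
    }

    void send(String destPeer, String msg) {
        if (destPeer.equals(localPeer)) {
            // Short-circuit: deliver straight into the local queue.
            localQueue.add(msg);
        } else {
            // Resolve the destination address once, then reuse the cached value.
            addressCache.computeIfAbsent(destPeer, p -> new Object() /* stand-in for a resolved address */);
            networkSends++; // stand-in for the remote transfer
        }
    }

    public static void main(String[] args) {
        LocalDeliverySketch s = new LocalDeliverySketch("peer1");
        s.send("peer1", "a"); // local: no network hop
        s.send("peer2", "b"); // remote: resolves and caches peer2's address
        s.send("peer2", "c"); // remote: cache hit, no new address object
        System.out.println(s.localQueue.size() + " local, " + s.networkSends + " remote");
    }
}
```

The win is that local traffic never pays serialization or transfer costs, and remote traffic pays address construction only once per destination peer.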
