That's really cool. BTW, have you tried these algorithms in a distributed environment?
On Mon, Aug 3, 2009 at 6:33 PM, Bin Cai <[email protected]> wrote:
> *X-RIME (http://xrime.sourceforge.net/): Hadoop based large scale social
> network analysis*
>
> *Motivation*
> Today's telecom service providers and Internet-based social network sites
> possess huge user communities. They hold large amounts of data about their
> users and want to generate core competency from that data. A key enabler for
> this is a cost-efficient solution for social data management and social
> network analysis (SNA).
>
> Such a solution faces a few challenges. The most important one is that the
> solution should be able to handle massive and heterogeneous data sets.
> Facing this challenge, traditional data warehouse based solutions are
> usually not cost-efficient enough. On the other hand, existing SNA tools are
> mostly used in single workstation mode and are not scalable enough. To this
> end, low-cost and highly scalable data management and processing
> technologies from the cloud computing community should be brought in to help.
>
> However, most existing cloud based data analysis solutions try to
> provide SQL-like general purpose query languages and do not directly
> support social network analysis. This makes them hard to optimize and hard
> to use for SNA users. So, we came up with X-RIME to fill this gap.
>
> Briefly speaking, X-RIME aims to provide a few value-added layers on
> top of existing cloud infrastructure, to support smart decision loops based
> on massive data sets and SNA. To end users, X-RIME is a library consisting of
> Map-Reduce programs, which are used for raw data pre-processing,
> transformation, calculation of SNA metrics and structures, and graph / network
> visualization. The library could be integrated with other Hadoop based data
> warehouses (e.g., HIVE) to build more comprehensive solutions.
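To make the Map-Reduce framing concrete, here is a minimal sketch of how one of the simplest SNA metrics, vertex degree, could be expressed as a map step and a reduce step. This is not X-RIME's code and does not use Hadoop: it is plain Python that simulates the map, shuffle, and reduce phases in memory, and every function name in it (`map_edge`, `shuffle`, `reduce_degree`, `vertex_degrees`) is hypothetical. It also assumes an undirected graph given as a list of edge pairs.

```python
from collections import defaultdict


def map_edge(edge):
    """Map phase: emit (vertex, 1) for each endpoint of an undirected edge."""
    u, v = edge
    yield u, 1
    yield v, 1


def shuffle(pairs):
    """Shuffle phase: group emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_degree(vertex, counts):
    """Reduce phase: the degree of a vertex is the sum of its emitted ones."""
    return vertex, sum(counts)


def vertex_degrees(edges):
    """Run the three phases in memory over a list of (u, v) edges."""
    pairs = (pair for edge in edges for pair in map_edge(edge))
    return dict(reduce_degree(v, c) for v, c in shuffle(pairs).items())


# Hypothetical toy graph, used only for illustration.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degrees = vertex_degrees(edges)
print(degrees)
```

On a real cluster the shuffle step is done by the framework and the mappers and reducers run on different nodes, which is what the question about a distributed environment is getting at.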
>
> *Currently Supported SNA Metrics and Structures*
> vertex degree statistics
> weakly connected components (WCC)
> strongly connected components (SCC)
> bi-connected components (BCC)
> ego-centric density
> breadth-first search / single source shortest path (BFS/SSSP)
> K-core
> maximal cliques
> PageRank
> hyperlink-induced topic search (HITS)
> minimal spanning tree (MST)
>

--
Best Regards, Edward J. Yoon @ NHN, corp.
[email protected]
http://blog.udanax.org
