[ 
https://issues.apache.org/jira/browse/GIRAPH-96?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152291#comment-13152291
 ] 

Claudio Martella commented on GIRAPH-96:
----------------------------------------

It is indeed a nice discussion. The amount of data to be read is the same after 
all, but we're talking about random I/O here. It's a possibility. Also, 
in reply to Gianmarco: the idea of having an HBase InputReader is the same as 
the current discussion on supporting Hive, Pig and HCatalog. If you store your 
data in HBase it can be quite useful, as much as it is not for MR. The 
lazy approach could be something to investigate, and in any case it would be 
necessary only with huge graphs or, as in my case, where we have 
computations that don't necessarily touch the whole graph.

Good out-of-core data structures/maps are hard to find; maybe 
LinkedIn's Krati or LevelDB (but I guess we'd have license issues there).
                
> Support for Graphs with Huge adjacency lists
> --------------------------------------------
>
>                 Key: GIRAPH-96
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-96
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp
>    Affects Versions: 0.70.0
>            Reporter: Arun Suresh
>
> Currently the vertex initialize() method is passed the complete adjacency 
> list as a HashMap. All the current concrete implementations of Vertex iterate 
> over the adjacency list and recreate new Data Structures within the Vertex 
> instance to hold/manipulate the adjacency list. This would cease to be 
> feasible once the size of the adjacency list becomes really huge.
> I propose storing the adjacency list and all vertex information (and incoming 
> messages ?) in a distributed data store such as HBase. The adjacency list can 
> be lazily loaded via HBase Scans. I was thinking of an HBase schema where the 
> row Id is a concatenation of VertexID+OutboundVertexId with a single column 
> containing the edge.
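
To make the proposed schema a bit more concrete, here is a rough sketch of the 
row-key idea in plain Java. It is not Giraph or HBase code: the class 
EdgeRowKeys, its methods, the use of long vertex ids and the Double edge value 
are all assumptions made for illustration, and a TreeMap ordered by unsigned 
bytes stands in for the sorted table. With fixed-width big-endian ids, all 
edges of one vertex share an 8-byte prefix, so they form one contiguous key 
range that can be walked lazily. In HBase the same range would presumably be 
expressed as a Scan bounded by a start row and a stop row.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.NavigableMap;

/** Hypothetical sketch of the VertexID+OutboundVertexId row key; not Giraph or HBase API. */
class EdgeRowKeys {
    /** Compares keys byte by byte as unsigned values, the way sorted row keys are ordered. */
    static final Comparator<byte[]> UNSIGNED_LEX = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    };

    /** Row key = 8-byte source id followed by 8-byte target id, both big-endian. */
    static byte[] rowKey(long sourceId, long targetId) {
        return ByteBuffer.allocate(16).putLong(sourceId).putLong(targetId).array();
    }

    /** Smallest key greater than every key with this prefix, or null if no such key exists. */
    static byte[] prefixStop(byte[] prefix) {
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (prefix[i] != (byte) 0xff) {
                byte[] stop = Arrays.copyOf(prefix, i + 1);
                stop[i]++;
                return stop;
            }
        }
        return null; // prefix is all 0xff bytes: the range runs to the end of the table
    }

    /** Collects the target ids of sourceId's out-edges from one contiguous key range. */
    static List<Long> outEdges(NavigableMap<byte[], Double> table, long sourceId) {
        byte[] start = ByteBuffer.allocate(8).putLong(sourceId).array();
        byte[] stop = prefixStop(start);
        NavigableMap<byte[], Double> range = (stop == null)
            ? table.tailMap(start, true)
            : table.subMap(start, true, stop, false);
        List<Long> targets = new ArrayList<>();
        for (byte[] key : range.keySet()) {
            targets.add(ByteBuffer.wrap(key).getLong(8)); // bytes 8..15 hold the target id
        }
        return targets;
    }
}
```

The table passed to outEdges must be built with EdgeRowKeys.UNSIGNED_LEX as its 
comparator, e.g. new TreeMap<>(EdgeRowKeys.UNSIGNED_LEX). The fixed-width 
encoding is the design choice that matters here: with variable-length ids, the 
keys of one vertex could interleave with those of another whose id shares the 
same leading bytes.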


        
