[jira] [Commented] (GIRAPH-249) Move part of the graph out-of-core when memory is low

Claudio Martella (JIRA) Wed, 18 Jul 2012 06:42:37 -0700

    [ 
https://issues.apache.org/jira/browse/GIRAPH-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417085#comment-13417085
 ]


Claudio Martella commented on GIRAPH-249:
-----------------------------------------

Just to be clear, I'm cautious mostly about the out-of-core graph, not about
out-of-core messages. The graph is a first-class citizen of the framework: it
represents state and structure, and affects most of the code path.
Anyway, estimating the size of an object is not an easy task in Java,
particularly for Messages, which are user-defined and can be composed of
multiple objects. For this reason I think we have two approaches:

1) we ask the Messages to implement a sizeOf() method, following the approach
of: http://www.javaworld.com/javaworld/javaqa/2003-12/02-qa-1226-sizeof.html
2) we keep the Messages in serialized format, whose size we can calculate
easily.
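
To make the two options concrete, here is a minimal sketch, not Giraph code.
SizedMessage, WeightsMessage and MessageSizeDemo are hypothetical names, and
the header and reference sizes used by sizeOf() are assumptions for a typical
64-bit JVM; the real values depend on the JVM and its flags, which is exactly
why (1) is only an estimate while (2) is an exact count.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical interface (not part of Giraph): a message that can estimate
// its own heap footprint and also serialize itself.
interface SizedMessage {
    long sizeOf();                                  // approach (1)
    void write(DataOutput out) throws IOException;  // basis for approach (2)
}

// Hypothetical user-defined message: a source vertex id plus some weights.
final class WeightsMessage implements SizedMessage {
    private final long sourceId;
    private final double[] weights;

    WeightsMessage(long sourceId, double[] weights) {
        this.sourceId = sourceId;
        this.weights = weights;
    }

    // Approach (1): hand-written estimate. The three constants below are
    // assumptions, not values guaranteed by any JVM.
    @Override
    public long sizeOf() {
        final long objectHeader = 16;
        final long arrayHeader = 16;
        final long reference = 8;
        return objectHeader + Long.BYTES + reference
            + arrayHeader + (long) weights.length * Double.BYTES;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(sourceId);
        out.writeInt(weights.length);
        for (double w : weights) {
            out.writeDouble(w);
        }
    }
}

final class MessageSizeDemo {
    // Approach (2): serialize the message and count the bytes produced.
    static int serializedSize(SizedMessage message) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            message.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        SizedMessage m = new WeightsMessage(42L, new double[] {0.5, 1.5, 2.5});
        System.out.println("estimated heap bytes: " + m.sizeOf());
        System.out.println("serialized bytes:     " + serializedSize(m));
    }
}
```

With (1) every message author has to keep sizeOf() in sync with the fields by
hand; with (2) the framework can measure any message without its cooperation,
at the cost of serializing it.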

With respect to (2), one of the things we discussed at the last workshop in
Berlin, suggested by Owen and also tackled by Stanford's GPS, is that
continuous object creation puts heavy pressure on the GC and costs a lot of
performance. MapReduce re-uses objects; GPS and Stratosphere keep the data in
serialized format inside a byte[]. It's not something for this JIRA, but this
could be a good moment to open the appropriate ticket and start the discussion
elsewhere, as the two things go together. I'll do that.
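
As a rough illustration of the byte[] idea combined with object re-use, here
is a sketch under stated assumptions. It is not how GPS, Stratosphere or
MapReduce actually implement it; ReusableMessage, SerializedMessageStore and
the fixed 16-byte record layout are hypothetical. Messages live as raw bytes
in a single heap buffer, and one mutable object is refilled on every read, so
iterating allocates no per-message objects.

```java
import java.nio.ByteBuffer;

// Hypothetical mutable message, refilled in place instead of reallocated.
final class ReusableMessage {
    long targetId;
    double value;
}

// Hypothetical append-only store: all messages are kept serialized in one
// heap ByteBuffer, which is backed by a single byte[].
final class SerializedMessageStore {
    private static final int RECORD_BYTES = Long.BYTES + Double.BYTES;
    private ByteBuffer buf = ByteBuffer.allocate(RECORD_BYTES * 4);
    private int used = 0;

    void add(long targetId, double value) {
        if (used + RECORD_BYTES > buf.capacity()) {
            // Grow by doubling and copy the bytes written so far.
            ByteBuffer bigger = ByteBuffer.allocate(buf.capacity() * 2);
            bigger.put(buf.array(), 0, used);
            buf = bigger;
        }
        // Absolute puts: write at an explicit offset, ignoring the position.
        buf.putLong(used, targetId);
        buf.putDouble(used + Long.BYTES, value);
        used += RECORD_BYTES;
    }

    // Exact memory used by the stored messages, known at all times.
    int sizeInBytes() {
        return used;
    }

    int count() {
        return used / RECORD_BYTES;
    }

    // Deserializes record `index` into the caller-supplied object.
    void read(int index, ReusableMessage reuse) {
        int offset = index * RECORD_BYTES;
        reuse.targetId = buf.getLong(offset);
        reuse.value = buf.getDouble(offset + Long.BYTES);
    }
}

final class SerializedStoreDemo {
    static double sum(SerializedMessageStore store) {
        ReusableMessage msg = new ReusableMessage(); // the only allocation
        double total = 0;
        for (int i = 0; i < store.count(); i++) {
            store.read(i, msg);
            total += msg.value;
        }
        return total;
    }

    public static void main(String[] args) {
        SerializedMessageStore store = new SerializedMessageStore();
        for (int i = 1; i <= 5; i++) {
            store.add(100L + i, i);
        }
        System.out.println("bytes held: " + store.sizeInBytes());
        System.out.println("sum:        " + sum(store));
    }
}
```

The same buffer answers the size question from (2) for free: sizeInBytes() is
exact, so a memory-pressure check would not need any per-object estimate.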
                
> Move part of the graph out-of-core when memory is low
> -----------------------------------------------------
>
>                 Key: GIRAPH-249
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-249
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Alessandro Presta
>            Assignee: Alessandro Presta
>         Attachments: GIRAPH-249.patch, GIRAPH-249.patch, GIRAPH-249.patch, 
> GIRAPH-249.patch, GIRAPH-249.patch
>
>
> There has been some talk about Giraph's scaling limitations due to keeping 
> the whole graph and messages in RAM.
> We need to investigate methods to fall back to disk when running out of 
> memory, while gracefully degrading performance.
> This issue is for graph storage. Messages should probably be a separate 
> issue, although the interplay between the two is crucial.
> We should also discuss what our primary goals are here: completing a job 
> (albeit slowly) instead of failing when the graph is too big, while still 
> encouraging memory optimizations and high-memory clusters; or restructuring 
> Giraph to be as efficient as possible in disk mode, making it almost a 
> standard way of operating.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
