[ https://issues.apache.org/jira/browse/GIRAPH-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413250#comment-13413250 ]

Eli Reisman commented on GIRAPH-249:
------------------------------------

What you have said really has me thinking about a clean, simple thing we 
could do along these lines to add this functionality. What if, only during 
INPUT_SUPERSTEP, each mapper on a worker spills to disk any incoming 
partitions that other workers send to it, as soon as they come off the wire? 
Those are fully built partitions that will not be used or changed again 
until the end of the input split -> partition phase. The InputSplit we are 
currently reading stays in memory, and we flush partitions to other workers 
as we create them, so that part stays the same anyway. Then, at the end of 
superstep -1, we clean up and read back from disk into memory whatever 
partitions this worker will own for the compute supersteps. This is good 
because the mutations (or dynamic repartitioning, if we choose it) happen 
from that point on for the rest of the run, and having that data in RAM is a 
plus for us then. Metrics show INPUT_SUPERSTEP is the danger phase, as I 
already mentioned, so this might solve the problem well enough to scale up 
significantly without dramatic changes to the code or to run speeds yet (the 
input step is already slow). There is a rough sketch of the idea below.
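
To make that concrete, here is a rough Java sketch of the spill/reload 
cycle. PartitionSpiller, the byte-array partition format, and the 
one-file-per-partition layout are all hypothetical placeholders for 
illustration, not Giraph's actual classes:

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the INPUT_SUPERSTEP spill idea. Partition data is
// modeled as raw bytes; the class name, file layout, and API are
// placeholders, not Giraph's real code.
public class PartitionSpiller {
  private final File spillDir;

  public PartitionSpiller(File spillDir) {
    this.spillDir = spillDir;
    spillDir.mkdirs();
  }

  // As a partition arrives off the wire from another worker during
  // INPUT_SUPERSTEP, write it straight to local disk instead of keeping it
  // in memory; it won't be touched again until the input phase ends.
  public void spill(int partitionId, byte[] serializedPartition)
      throws IOException {
    File f = new File(spillDir, "partition-" + partitionId);
    try (OutputStream out =
        new BufferedOutputStream(new FileOutputStream(f))) {
      out.write(serializedPartition);
    }
  }

  // At the end of superstep -1, read every spilled partition back into
  // memory so the compute supersteps still run entirely in RAM.
  public Map<Integer, byte[]> loadAll() throws IOException {
    Map<Integer, byte[]> partitions = new HashMap<>();
    File[] spilled = spillDir.listFiles();
    if (spilled != null) {
      for (File f : spilled) {
        int id = Integer.parseInt(
            f.getName().substring("partition-".length()));
        partitions.put(id, Files.readAllBytes(f.toPath()));
        f.delete(); // clean up the spill file once it is back in memory
      }
    }
    return partitions;
  }
}

The nice property is that nothing about the wire protocol or the compute 
supersteps would change; only where the bytes sit during superstep -1.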

What do you think? If this yields good results and the metrics show it, it sets 
the stage to do something more comprehensive down the road with a 
proof-of-concept under our belts.

Nice work either way, btw!

                
> Move part of the graph out-of-core when memory is low
> -----------------------------------------------------
>
>                 Key: GIRAPH-249
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-249
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Alessandro Presta
>            Assignee: Alessandro Presta
>         Attachments: GIRAPH-249.patch
>
>
> There has been some talk about Giraph's scaling limitations due to keeping 
> the whole graph and messages in RAM.
> We need to investigate methods to fall back to disk when running out of 
> memory, while gracefully degrading performance.
> This issue is for graph storage. Messages should probably be a separate 
> issue, although the interplay between the two is crucial.
> We should also discuss what our primary goals are here: completing a job 
> (albeit slowly) instead of failing when the graph is too big, while still 
> encouraging memory optimizations and high-memory clusters; or restructuring 
> Giraph to be as efficient as possible in disk mode, making it almost a 
> standard way of operating.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira