[
https://issues.apache.org/jira/browse/GIRAPH-249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alessandro Presta updated GIRAPH-249:
-------------------------------------
Attachment: GIRAPH-249.patch
This is a first stab.
I replaced the HashMap that stores partitions in a worker with a
WorkerPartitionMap.
A WorkerPartitionMap has a normal in-memory map, and the ability to store
entire partitions to the local FS when memory is low.
In order to provide the normal views of the contents of a map, we operate
lazily by loading the out-of-core partitions as we iterate.
We always add the requested partition to the in-memory map (moving another one
to disk to make room) in order to allow modification.
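
To make the idea concrete, here is a minimal sketch of a map that keeps some partitions in memory and spills the rest to local files. It is not the WorkerPartitionMap from the patch: the class name SpillingPartitionMap, the fixed maxInMemory limit (a stand-in for a real memory check), the oldest-inserted eviction policy, and the use of Java serialization are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; not the WorkerPartitionMap from the patch. */
class SpillingPartitionMap<P extends Serializable> {
  private final int maxInMemory; // assumed stand-in for a real memory check
  private final Path spillDir;
  // Insertion-ordered, so the first entry is the oldest-inserted partition.
  private final Map<Integer, P> inMemory = new LinkedHashMap<>();
  private final Set<Integer> onDisk = new HashSet<>();

  SpillingPartitionMap(int maxInMemory, Path spillDir) {
    if (maxInMemory < 1) {
      throw new IllegalArgumentException("maxInMemory must be at least 1");
    }
    this.maxInMemory = maxInMemory;
    this.spillDir = spillDir;
  }

  /** Stores a partition in memory, spilling another one first if needed. */
  void put(int id, P partition) throws IOException {
    if (onDisk.remove(id)) {
      Files.deleteIfExists(fileFor(id)); // the new value supersedes the spilled copy
    }
    if (!inMemory.containsKey(id)) {
      makeRoom();
    }
    inMemory.put(id, partition);
  }

  /** Returns the partition, moving it back into memory if it was spilled. */
  @SuppressWarnings("unchecked")
  P get(int id) throws IOException, ClassNotFoundException {
    P partition = inMemory.get(id);
    if (partition != null || !onDisk.contains(id)) {
      return partition; // in memory, or unknown id (null)
    }
    makeRoom();
    try (ObjectInputStream in =
        new ObjectInputStream(Files.newInputStream(fileFor(id)))) {
      partition = (P) in.readObject();
    }
    Files.delete(fileFor(id));
    onDisk.remove(id);
    inMemory.put(id, partition);
    return partition;
  }

  int inMemoryCount() {
    return inMemory.size();
  }

  int size() {
    return inMemory.size() + onDisk.size();
  }

  /** Spills oldest-inserted partitions until one in-memory slot is free. */
  private void makeRoom() throws IOException {
    while (inMemory.size() >= maxInMemory) {
      Iterator<Map.Entry<Integer, P>> it = inMemory.entrySet().iterator();
      Map.Entry<Integer, P> victim = it.next();
      try (ObjectOutputStream out =
          new ObjectOutputStream(Files.newOutputStream(fileFor(victim.getKey())))) {
        out.writeObject(victim.getValue());
      }
      onDisk.add(victim.getKey());
      it.remove();
    }
  }

  private Path fileFor(int id) {
    return spillDir.resolve("partition-" + id);
  }
}
```

With maxInMemory set to 1, putting partitions 0 and 1 leaves only partition 1 in memory, and a later get(0) swaps the two. Because get always returns an in-memory object, callers can modify the partition they receive.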
The option "giraph.outOfCoreGraph" controls whether we use WorkerPartitionMap
or a normal HashMap as before.
"giraph.minFreeMemoryRatio" controls how much free memory we want to preserve
out of the maximum available memory for the program.
If out-of-core is enabled and the memory limit is exceeded, we start spilling
partitions to disk.
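
As a rough sketch of how the two options could drive the spill decision: the helper below measures free memory against Runtime.maxMemory(), counting heap that the JVM has not yet allocated as free. That is only one possible reading of the check. The class name MemoryMonitor, the use of java.util.Properties in place of the job configuration, and the default values ("false" and "0.2") are assumptions, not taken from the patch.

```java
import java.util.Properties;

/** Hypothetical helper; the check in the patch may differ. */
class MemoryMonitor {
  /** Fraction of maxMemory that is not currently occupied by live or garbage objects. */
  static double freeMemoryRatio(long maxMemory, long totalMemory, long freeMemory) {
    long used = totalMemory - freeMemory; // heap allocated by the JVM minus its free part
    return (double) (maxMemory - used) / maxMemory;
  }

  /** True if the free fraction of the maximum heap is below the given ratio. */
  static boolean isMemoryLow(double minFreeMemoryRatio) {
    Runtime rt = Runtime.getRuntime();
    double ratio = freeMemoryRatio(rt.maxMemory(), rt.totalMemory(), rt.freeMemory());
    return ratio < minFreeMemoryRatio;
  }

  /** Spill only when out-of-core is enabled and memory is low. */
  static boolean shouldSpill(Properties conf) {
    // Default values here are assumptions made for this sketch.
    boolean outOfCore =
        Boolean.parseBoolean(conf.getProperty("giraph.outOfCoreGraph", "false"));
    double minRatio =
        Double.parseDouble(conf.getProperty("giraph.minFreeMemoryRatio", "0.2"));
    return outOfCore && isMemoryLow(minRatio);
  }
}
```

Since freeMemory() includes garbage that has not been collected yet, this check can report low memory earlier than strictly necessary; whether to trigger a GC before checking is left open here.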
A few remarks:
- We may want to change other logic (e.g. mutations and message assignment) in
order to minimize the number of times we iterate over the partitions. For
example, we might group operations by partition, and interleave message
assignment with computation. This will become irrelevant once we also have
out-of-core messages (they will most likely be stored outside of the vertices).
- The code to determine if we're low on memory is kind of spaghetti. I'm not
sure whether I should check maxMemory or totalMemory, and whether/when it's a
good idea to run the GC.
This logic will get more complex when we add out-of-core messages, since the
two will somehow compete for the available memory, and we want to make sure we
make the best use of it.
- We might also have to change the input splitting phase. I think currently we
send partitions over as soon as they reach the max number of vertices. It looks
like we keep only one partition per owner, so this may not present a problem
(as long as we have several more partitions than workers).
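
The first remark above suggests grouping operations by partition. A minimal sketch of that idea follows, assuming pending operations can be tagged with the id of the partition they target. The PartitionBatcher and Op types are hypothetical and do not exist in Giraph.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of batching pending operations per partition. */
class PartitionBatcher {
  /** A pending operation (e.g. a mutation) targeting one partition. */
  static final class Op {
    final int partitionId;
    final String description;

    Op(int partitionId, String description) {
      this.partitionId = partitionId;
      this.description = description;
    }
  }

  /** Groups operations so that each partition needs to be visited only once. */
  static Map<Integer, List<Op>> groupByPartition(List<Op> ops) {
    Map<Integer, List<Op>> grouped = new TreeMap<>();
    for (Op op : ops) {
      grouped.computeIfAbsent(op.partitionId, k -> new ArrayList<>()).add(op);
    }
    return grouped;
  }

  /** Counts partition switches when applying ops in list order, one partition at a time. */
  static int loadsInOrder(List<Op> ops) {
    int loads = 0;
    Integer current = null;
    for (Op op : ops) {
      if (current == null || current != op.partitionId) {
        loads++;
        current = op.partitionId;
      }
    }
    return loads;
  }
}
```

For operations that alternate between partitions 0, 1, 0, 1, applying them in arrival order switches partitions four times, while the grouped form touches each of the two partitions once. With out-of-core partitions, each switch may mean a round trip to disk.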
> Move part of the graph out-of-core when memory is low
> -----------------------------------------------------
>
> Key: GIRAPH-249
> URL: https://issues.apache.org/jira/browse/GIRAPH-249
> Project: Giraph
> Issue Type: Improvement
> Reporter: Alessandro Presta
> Assignee: Alessandro Presta
> Attachments: GIRAPH-249.patch
>
>
> There has been some talk about Giraph's scaling limitations due to keeping
> the whole graph and messages in RAM.
> We need to investigate methods to fall back to disk when running out of
> memory, while gracefully degrading performance.
> This issue is for graph storage. Messages should probably be a separate
> issue, although the interplay between the two is crucial.
> We should also discuss what our primary goals are here: completing a job
> (albeit slowly) instead of failing when the graph is too big, while still
> encouraging memory optimizations and high-memory clusters; or restructuring
> Giraph to be as efficient as possible in disk mode, making it almost a
> standard way of operating.