The example where we actually ran out of memory was with 500K vertices and 500M edges, but yes, as a general rule we should strive to reduce our memory footprint, so as to push the point where we need to go out-of-core as far away as possible.
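To make the compressed-structures idea (from Gianmarco's message below) concrete: one standard trick is to sort each adjacency list, store the gaps between consecutive neighbor ids, and varint-encode the gaps, so most edges cost 1-2 bytes instead of 8. The sketch below is just an illustration I put together, not code from the patch, and the class and method names are made up:

{code}
// Hypothetical illustration (not from the GIRAPH-249 patch): store a
// sorted neighbor list as varint-encoded gaps. Consecutive ids tend to
// be close, so most gaps fit in 1-2 bytes instead of 8 per long.
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class CompressedAdjacency {

  /** Encodes an ascending list of neighbor ids as varint gaps. */
  public static byte[] encode(long[] sortedNeighbors) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    long prev = 0;
    for (long id : sortedNeighbors) {
      long gap = id - prev;  // the first "gap" is the id itself
      prev = id;
      // Varint: 7 bits per byte, high bit set on all but the last byte.
      while ((gap & ~0x7FL) != 0) {
        out.write((int) ((gap & 0x7F) | 0x80));
        gap >>>= 7;
      }
      out.write((int) gap);
    }
    return out.toByteArray();
  }

  /** Decodes numNeighbors ids back out of the packed byte array. */
  public static long[] decode(byte[] packed, int numNeighbors) {
    long[] neighbors = new long[numNeighbors];
    long prev = 0;
    int pos = 0;
    for (int i = 0; i < numNeighbors; i++) {
      long gap = 0;
      int shift = 0;
      byte b;
      do {
        b = packed[pos++];
        gap |= (long) (b & 0x7F) << shift;
        shift += 7;
      } while ((b & 0x80) != 0);
      prev += gap;
      neighbors[i] = prev;
    }
    return neighbors;
  }

  public static void main(String[] args) {
    long[] ids = {3, 17, 18, 42, 1000};
    byte[] packed = encode(ids);
    // 6 bytes instead of 40 for this toy list.
    System.out.println(packed.length + " bytes for " + ids.length + " neighbors");
    System.out.println(Arrays.toString(decode(packed, ids.length)));
  }
}
{code}

The obvious trade-off is CPU spent decoding on each edge traversal, but that may be a good deal when the alternative is spilling to disk.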
On 8/17/12 2:11 PM, "Gianmarco De Francisci Morales" <[email protected]> wrote:

> Very interesting.
>
> On a side note, a graph with 100k vertices and 100M edges is largish but
> not that big after all.
> If it does not fit in 10+ GB of memory, it means that each edge occupies
> around 100B (amortizing the cost of the vertex over the edges).
> In my opinion this deserves some thought.
> If memory is an issue, why not think about compressed memory structures,
> at least for common graph formats?
>
> Cheers,
> --
> Gianmarco
>
>
> On Wed, Aug 15, 2012 at 11:20 PM, Eli Reisman <[email protected]> wrote:
>
>> Great metrics, this made for a very interesting read, and great code too,
>> as always. This must have been a lot of work. I like the idea of
>> eliminating the extra temporary storage data structures where possible,
>> even when not going out-of-core. I think that, plus avoiding extra object
>> creation during the workflow, can still do a lot for an in-core job's
>> memory profile. This is looking really good, and it sounds like with the
>> config options it's also pluggable depending on your hardware situation,
>> so it sounds great to me. Great work!
>>
>> On Wed, Aug 15, 2012 at 12:23 PM, Alessandro Presta (JIRA)
>> <[email protected]> wrote:
>>
>> > [ https://issues.apache.org/jira/browse/GIRAPH-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435437#comment-13435437 ]
>> >
>> > Alessandro Presta commented on GIRAPH-249:
>> > ------------------------------------------
>> >
>> > Thanks Claudio, good observation.
>> > You got me curious, so I quickly ran a shortest-paths benchmark:
>> > 500k vertices, 100 edges/vertex, 10 workers.
>> >
>> > This is with trunk:
>> >
>> > {code}
>> > hadoop jar giraph-trunk.jar org.apache.giraph.benchmark.ShortestPathsBenchmark \
>> >     -Dgiraph.useNetty=true -v -V 500000 -e 100 -w 10
>> > {code}
>> >
>> > {code}
>> > Setup (milliseconds)                    3,297
>> > Vertex input superstep (milliseconds)  11,644
>> > Superstep 0 (milliseconds)                821
>> > Superstep 1 (milliseconds)              1,125
>> > Superstep 2 (milliseconds)              1,165
>> > Superstep 3 (milliseconds)              5,394
>> > Superstep 4 (milliseconds)              5,650
>> > Superstep 5 (milliseconds)              5,483
>> > Superstep 6 (milliseconds)              5,150
>> > Superstep 7 (milliseconds)              4,476
>> > Superstep 8 (milliseconds)              4,043
>> > Superstep 9 (milliseconds)              3,400
>> > Superstep 10 (milliseconds)             3,192
>> > Superstep 11 (milliseconds)             2,533
>> > Superstep 12 (milliseconds)             2,342
>> > Superstep 13 (milliseconds)             1,983
>> > Superstep 14 (milliseconds)             2,251
>> > Superstep 15 (milliseconds)             1,375
>> > Superstep 16 (milliseconds)             1,184
>> > Superstep 17 (milliseconds)             1,075
>> > Superstep 18 (milliseconds)             1,176
>> > Superstep 19 (milliseconds)             1,092
>> > Superstep 20 (milliseconds)             1,108
>> > Superstep 21 (milliseconds)             1,192
>> > Superstep 22 (milliseconds)             1,159
>> > Superstep 23 (milliseconds)             1,100
>> > Superstep 24 (milliseconds)             1,167
>> > Superstep 25 (milliseconds)             1,058
>> > Superstep 26 (milliseconds)             1,051
>> > Superstep 27 (milliseconds)             1,135
>> > Superstep 28 (milliseconds)             1,184
>> > Superstep 29 (milliseconds)             1,116
>> > Superstep 30 (milliseconds)             1,126
>> > Superstep 31 (milliseconds)             1,192
>> > Shutdown (milliseconds)                 2,309
>> > Total (milliseconds)                   85,757
>> > {code}
>> >
>> > And with at most 5 partitions in memory (out-of-core):
>> >
>> > {code}
>> > hadoop jar giraph-249.jar org.apache.giraph.benchmark.ShortestPathsBenchmark \
>> >     -Dgiraph.useNetty=true -Dgiraph.useOutOfCoreGraph=true \
>> >     -Dgiraph.maxPartitionsInMemory=5 -v -V 500000 -e 100 -w 10
>> > {code}
>> >
>> > {code}
>> > Setup (milliseconds)                    2,639
>> > Vertex input superstep (milliseconds)  19,595
>> > Superstep 0 (milliseconds)             10,836
>> > Superstep 1 (milliseconds)             20,315
>> > Superstep 2 (milliseconds)             20,726
>> > Superstep 3 (milliseconds)             27,407
>> > Superstep 4 (milliseconds)             26,620
>> > Superstep 5 (milliseconds)             27,134
>> > Superstep 6 (milliseconds)             25,948
>> > Superstep 7 (milliseconds)             25,819
>> > Superstep 8 (milliseconds)             24,434
>> > Superstep 9 (milliseconds)             24,183
>> > Superstep 10 (milliseconds)            23,823
>> > Superstep 11 (milliseconds)            22,908
>> > Superstep 12 (milliseconds)            22,548
>> > Superstep 13 (milliseconds)            21,974
>> > Superstep 14 (milliseconds)            22,401
>> > Superstep 15 (milliseconds)            22,198
>> > Superstep 16 (milliseconds)            21,434
>> > Superstep 17 (milliseconds)            21,617
>> > Superstep 18 (milliseconds)            21,179
>> > Superstep 19 (milliseconds)            21,302
>> > Superstep 20 (milliseconds)            21,324
>> > Superstep 21 (milliseconds)            21,442
>> > Superstep 22 (milliseconds)            20,875
>> > Superstep 23 (milliseconds)            20,906
>> > Superstep 24 (milliseconds)            20,968
>> > Superstep 25 (milliseconds)            20,829
>> > Superstep 26 (milliseconds)            20,763
>> > Superstep 27 (milliseconds)            20,790
>> > Superstep 28 (milliseconds)            21,109
>> > Superstep 29 (milliseconds)            21,158
>> > Superstep 30 (milliseconds)            21,014
>> > Superstep 31 (milliseconds)            21,055
>> > Shutdown (milliseconds)                   175
>> > Total (milliseconds)                  729,459
>> > {code}
>> >
>> > So yeah, the slowdown is more like an order of magnitude. Of course,
>> > this is nothing scientific.
>> >
>> > > Move part of the graph out-of-core when memory is low
>> > > -----------------------------------------------------
>> > >
>> > >          Key: GIRAPH-249
>> > >          URL: https://issues.apache.org/jira/browse/GIRAPH-249
>> > >      Project: Giraph
>> > >   Issue Type: Improvement
>> > >     Reporter: Alessandro Presta
>> > >     Assignee: Alessandro Presta
>> > >  Attachments: GIRAPH-249.patch, GIRAPH-249.patch, GIRAPH-249.patch,
>> > > GIRAPH-249.patch, GIRAPH-249.patch, GIRAPH-249.patch
>> > >
>> > > There has been some talk about Giraph's scaling limitations due to
>> > > keeping the whole graph and messages in RAM.
>> > > We need to investigate methods to fall back to disk when running out
>> > > of memory, while gracefully degrading performance.
>> > > This issue is for graph storage. Messages should probably be a
>> > > separate issue, although the interplay between the two is crucial.
>> > > We should also discuss what our primary goals are here: completing a
>> > > job (albeit slowly) instead of failing when the graph is too big,
>> > > while still encouraging memory optimizations and high-memory
>> > > clusters; or restructuring Giraph to be as efficient as possible in
>> > > disk mode, making it almost a standard way of operating.
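
To illustrate the general idea in the issue description above (this is a deliberate simplification, not the implementation in the attached patches): keep a bounded, access-ordered map of partitions and spill the least-recently-used one to local disk when the cap is exceeded. All names below are made up, and Java serialization stands in for whatever format a real implementation would use:

{code}
// Deliberately simplified sketch of the spilling idea, NOT the actual
// GIRAPH-249 implementation: a bounded, access-ordered map of partitions
// that writes the least-recently-used one to local disk when full.
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.LinkedHashMap;
import java.util.Map;

public class SpillingPartitionStore<P extends Serializable> {
  private final File spillDir;
  private final LinkedHashMap<Integer, P> inMemory;

  public SpillingPartitionStore(final int maxPartitionsInMemory, File spillDir) {
    this.spillDir = spillDir;
    // accessOrder=true makes the eldest entry the least recently used.
    this.inMemory = new LinkedHashMap<Integer, P>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<Integer, P> eldest) {
        if (size() > maxPartitionsInMemory) {
          spill(eldest.getKey(), eldest.getValue());
          return true;  // evict from memory once it is safely on disk
        }
        return false;
      }
    };
  }

  public void put(int partitionId, P partition) {
    inMemory.put(partitionId, partition);
  }

  /** Returns a partition, reloading it from disk if it was spilled. */
  @SuppressWarnings("unchecked")
  public P get(int partitionId) throws IOException, ClassNotFoundException {
    P partition = inMemory.get(partitionId);
    if (partition == null) {
      File file = fileFor(partitionId);
      try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
        partition = (P) in.readObject();
      }
      file.delete();
      inMemory.put(partitionId, partition);  // may spill another partition
    }
    return partition;
  }

  private void spill(int partitionId, P partition) {
    try (ObjectOutputStream out =
        new ObjectOutputStream(new FileOutputStream(fileFor(partitionId)))) {
      out.writeObject(partition);
    } catch (IOException e) {
      throw new RuntimeException("Failed to spill partition " + partitionId, e);
    }
  }

  private File fileFor(int partitionId) {
    return new File(spillDir, "partition-" + partitionId);
  }
}
{code}

Even this naive version shows where the time goes in the benchmark: once partitions no longer fit, every superstep pays a roughly constant serialization and disk round-trip cost, which matches the ~20-second floor per superstep in the out-of-core run.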
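And on Eli's earlier point about temporary data structures and object creation: for in-core jobs, a large chunk of the footprint is per-edge object overhead, so replacing per-edge objects with parallel primitive arrays already buys a lot before any compression. A schematic comparison (again my own example, not code from the patch):

{code}
// Schematic comparison (my own example, not from the patch) of per-edge
// objects vs. parallel primitive arrays for out-edge storage.
public class EdgeStorageSketch {

  // One object per edge: on a 64-bit JVM this costs an object header plus
  // padding plus a reference from the containing list or array, easily
  // 40+ bytes per edge even though the payload is only 12 bytes.
  static final class EdgeObject {
    final long targetId;
    final float weight;
    EdgeObject(long targetId, float weight) {
      this.targetId = targetId;
      this.weight = weight;
    }
  }

  // Parallel primitive arrays: a flat 12 bytes per edge, no per-edge
  // allocations, and only two objects for the garbage collector to track.
  static final class EdgeArrays {
    final long[] targetIds;
    final float[] weights;
    EdgeArrays(int numEdges) {
      this.targetIds = new long[numEdges];
      this.weights = new float[numEdges];
    }
  }
}
{code}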
