I don't know about imho! My Python task has still not finished after 11 hours. I forgot to mention that we halved the number of incoming and outgoing edges. We defined a function that returns a pair of random integers between 0 and 80 for the numbers of incoming and outgoing edges of each vertex, so we expect the average total degree of a vertex (in plus out) to be about 80.
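
For concreteness, here is a minimal sketch of this setup. It assumes the sampler uses Python's random.randint, which includes both endpoints. The names sample_degrees and storage_estimate_gib are made up for illustration, and the storage figures simply reuse Tiago's back-of-envelope formula from the quoted message below (two 64-bit integers per edge plus one 64-bit edge index), so they are a lower bound at best.

```python
import random

N_VERTICES = 50_000_000   # graph size from the original question
MAX_DEGREE = 80           # upper bound passed to randint


def sample_degrees():
    """Hypothetical degree sampler: returns an (in_degree, out_degree) pair."""
    # random.randint includes both endpoints, so each value averages 40.
    return random.randint(0, MAX_DEGREE), random.randint(0, MAX_DEGREE)


def storage_estimate_gib(n_edges, bits_per_int=64):
    """Rough estimate: two integers per edge, plus one edge index per edge."""
    gib = 8 * 1024 ** 3                      # bits in one GiB
    endpoints = n_edges * 2 * bits_per_int / gib
    edge_index = n_edges * bits_per_int / gib
    return endpoints, edge_index


# Every edge is counted once as an out-edge, so the expected number of
# edges is N times the mean out-degree.
mean_out_degree = MAX_DEGREE / 2
expected_edges = int(N_VERTICES * mean_out_degree)

endpoints, edge_index = storage_estimate_gib(expected_edges)
print(f"expected edges: {expected_edges:.2e}")
print(f"endpoints: {endpoints:.1f} GiB, edge index: {edge_index:.1f} GiB")

# With graph-tool installed, the sampler would be passed along the lines of
#     g = graph_tool.generation.random_graph(N_VERTICES, sample_degrees)
# (not run here).
```

With the halved degrees this comes to about 2 billion edges, so roughly half of the storage Tiago estimated for the original 4 billion, before counting any temporary structures.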
Monitoring the processes on our system with top, I can see that all RAM and 80 GB of swap are in use, and the Python program's CPU usage is usually above 90%. So it seems that memory is perhaps not the main factor in the run time at this moment. I am not sure how much the two randint(0,80) calls we make for the number of edges contribute to this CPU load!

Thanks for your advice,
Arash

On Fri, Feb 1, 2013 at 10:45 PM, Ronnie Ghose <[email protected]> wrote:

> imho build it in C. so use the boost graph libraries natively?. ..... the
> graphs you're talking about are big enough i don't think libraries are going
> to make such a difference considering the memory usage is from the graph
> itself rather than any internal structures iirc.... @Tiago? You might want
> to try memory mapping since your graph is big enough to not fit in ram. or
> have part of it memory mapped ...
>
>
> On Fri, Feb 1, 2013 at 5:52 PM, Arash Fard <[email protected]> wrote:
>
>> Hi Tiago,
>>
>> Thanks for your useful hints. I do research on distributed algorithms for
>> subgraph pattern matching on very large datasets; that is why I need such a
>> large graph for performance measurements.
>>
>> Before trying your suggestions, I added a very large swap file to the
>> system. Now, GraphTool has been running for more than 5 hours. Initially,
>> its memory usage kept increasing and its CPU usage was low, but now
>> the memory usage has become stable (about 125 GB on physical memory and 55
>> GB on swap) and the average CPU usage is decently high. I hope that the
>> process will finish in the next few hours; otherwise, I will try your
>> suggestions.
>>
>> I wonder if you know any other tool which might be more efficient for
>> creating very large graphs?
>>
>> I appreciate your help.
>>
>> Arash
>>
>> On Fri, Feb 1, 2013 at 1:36 PM, Tiago de Paula Peixoto
>> <[email protected]> wrote:
>>
>>> On 02/01/2013 05:04 PM, Arash wrote:
>>> > Hi,
>>> >
>>> > I try to create a very large graph using random_graph function. My
>>> > computer has 128 GB physical memory and 16 GB swap partition. When I
>>> > try to create a graph with 50 million for number of vertices, and 80
>>> > for both numbers of incoming and outgoing edges, it consumes all
>>> > memory space and then crashes.
>>> >
>>> > Do you have any idea how I can overcome the memory limit? Is there any
>>> > way to make random_graph function more memory efficient?
>>>
>>> Try to pass the option "random=False". In this case the edges will not
>>> be randomized, but it may need less memory. If it fits your memory, then
>>> you may do random_rewire() with the option strat="erdos", if you don't
>>> have any prescribed degree sequence. This should need less memory, since
>>> a list of the edges does not need to be built internally.
>>>
>>> But note that you are dealing with a very large graph indeed, with 4
>>> billion edges. If you use two 64 bit integers to specify an edge, this
>>> already amounts to 80 * 50 * 1e6 * 64 * 2 / (8 * 1024 ** 3) ~ 60 GB.
>>> Since the edge indexes are also needed, this is increased by another 30
>>> GB. Thus, simply the storage of such a thing would require at least 90
>>> GB, but probably even more. If some temporary data structure has a size
>>> O(E), it will cross over your 128 GB limit pretty easily.
>>>
>>> The usage could be halved by using 32 bit integers instead, but the
>>> library would need to be modified.
>>>
>>> But I can't help but wonder if you really need a random graph this
>>> large...
>>>
>>> Cheers,
>>> Tiago
>>>
>>> --
>>> Tiago de Paula Peixoto <[email protected]>
>>>
>>>
>>> _______________________________________________
>>> graph-tool mailing list
>>> [email protected]
>>> http://lists.skewed.de/mailman/listinfo/graph-tool
