I don't know about imho! My python task is not yet finished after 11 hours.
I forgot to mention that we halved the number of incoming and outgoing
edges. Actually, we have defined a function which returns a pair of random
integers between 0 and 80 for incoming and outgoing edges, so we expect the
average degree of vertices to be 80.

Monitoring the processes on our system using top, I can see that all RAM
and 80GB on swap is used, and the python program has CPU usage usually
higher than 90%. So, it seems that perhaps the memory is not the main issue
on run time at this moment. I am not sure how much the 2 randint(0,80)
functions we have called for the number of edges are responsible in this
CPU load!

Thanks for your advice,

Arash

On Fri, Feb 1, 2013 at 10:45 PM, Ronnie Ghose <[email protected]>wrote:

> imho build it in C. so use the boost graph libraries natively?. ..... the
> graphs your talking about are big enough i don't think libraries are going
> to make such a difference considering the memory usage is from the graph
> itself rather than any internal structures iirc.... @Tiago? You might want
> to try memory mapping since your graph is big enough to not fit in ram. or
> have part of it memory mapped ...
>
>
> On Fri, Feb 1, 2013 at 5:52 PM, Arash Fard <[email protected]> wrote:
>
>> Hi Tiago,
>>
>> Thanks for your useful hints. I do research on distributed algorithms for
>> subgraph pattern matching on very large datasets; that is way I need such a
>> large graph for performance measurements.
>>
>> Before trying your suggestions, I added a very large swap file to the
>> system. Now, GraphTool has been running for more than 5 hours. Initially,
>> it was keeping increasing memory usage and its CPU usage was low, but now
>> the memory usage has become stable (about 125 GB on physical memory and 55
>> GB on swap) and the average CPU usage is decently high. I hope that the
>> process will finish in the next few hours; otherwise, I will try your
>> suggestions.
>>
>> I wonder if you now any other tool which might be more efficient for
>> creating very large graphs?
>>
>> I appreciate your help.
>>
>> Arash
>>
>>  On Fri, Feb 1, 2013 at 1:36 PM, Tiago de Paula Peixoto 
>> <[email protected]>wrote:
>>
>>> On 02/01/2013 05:04 PM, Arash wrote:
>>> > Hi,
>>> >
>>> > I try to create a very large graph using random_graph function. My
>>> computer
>>> > has 128 GB physical memory and 16 GB swap partition. When I try to
>>> create a
>>> > graph  with 50 million for number of vertices, and 80 for both numbers
>>> of
>>> > incoming and outgoing edges, it consumes all memory space and then
>>> crashes.
>>> >
>>> > Do you have any idea how I can overcome the memory limit? Is there any
>>> way
>>> > to make random_graph function more memory efficient?
>>>
>>> Try to pass the option "random=False". In this case the edges will not
>>> be randomized, but it may need less memory. If it fits your memory, then
>>> you may do random_rewire() with the option strat="erdos", if you don't
>>> have any prescribed degree sequence. This should need less memory, since
>>> a list of the edges does not need to be built internally.
>>>
>>> But note that you are dealing with a very large graph indeed, with 4
>>> billion edges. If you use two 64 bit integers to specify an edge, this
>>> already amounts to 80 * 50 * 1e6 * 64 * 2 / (8 * 1024 ** 3) ~ 60 GB.
>>> Since the edge indexes are also needed, this is increased by another 30
>>> GB.  Thus, simply the storage of such a thing would require at least 90
>>> GB, but probably even more. If some temporary data structure has a size
>>> O(E), it will cross over your 128 GB limit pretty easily.
>>>
>>> The use could be halved by using 32 bit integers instead, but the
>>> library would need to be modified.
>>>
>>> But I can't help but wonder if you really need a random graph this
>>> large...
>>>
>>> Cheers,
>>> Tiago
>>>
>>> --
>>> Tiago de Paula Peixoto <[email protected]>
>>>
>>>
>>> _______________________________________________
>>> graph-tool mailing list
>>> [email protected]
>>> http://lists.skewed.de/mailman/listinfo/graph-tool
>>>
>>>
>>
>> _______________________________________________
>> graph-tool mailing list
>> [email protected]
>> http://lists.skewed.de/mailman/listinfo/graph-tool
>>
>>
>
> _______________________________________________
> graph-tool mailing list
> [email protected]
> http://lists.skewed.de/mailman/listinfo/graph-tool
>
>
_______________________________________________
graph-tool mailing list
[email protected]
http://lists.skewed.de/mailman/listinfo/graph-tool

Reply via email to