IMHO, build it in C++, i.e. use the Boost graph library natively. The
graphs you're talking about are big enough that I don't think the choice
of library will make much of a difference, since the memory usage comes
from the graph itself rather than from any internal structures, IIRC...
@Tiago? You might also want to try memory mapping, since your graph is too
big to fit in RAM, or at least have part of it memory mapped.
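To illustrate the memory-mapping idea, something like this (just a sketch,
not graph-tool specific; the file name and layout are made up): keep the
raw edge list in a memory-mapped array on disk and let the OS page it in
on demand instead of holding it all in RAM.

    import numpy as np

    N = 50 * 10**6            # vertices
    E = 80 * N                # edges (out-degree 80 per vertex)

    # Two 64-bit endpoints per edge, stored in a ~60 GB file on disk and
    # paged in lazily by the OS rather than living entirely in memory.
    edges = np.memmap("edges.dat", dtype=np.uint64, mode="w+", shape=(E, 2))

    # e.g. write one edge without touching the rest of the file
    edges[0] = (0, 1)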


On Fri, Feb 1, 2013 at 5:52 PM, Arash Fard <[email protected]> wrote:

> Hi Tiago,
>
> Thanks for your useful hints. I do research on distributed algorithms for
> subgraph pattern matching on very large datasets; that is why I need such a
> large graph for performance measurements.
>
> Before trying your suggestions, I added a very large swap file to the
> system. Now, graph-tool has been running for more than 5 hours. Initially,
> its memory usage kept increasing and its CPU usage was low, but now the
> memory usage has become stable (about 125 GB of physical memory and 55
> GB of swap) and the average CPU usage is decently high. I hope that the
> process will finish in the next few hours; otherwise, I will try your
> suggestions.
>
> I wonder if you know of any other tool that might be more efficient for
> creating very large graphs?
>
> I appreciate your help.
>
> Arash
>
>  On Fri, Feb 1, 2013 at 1:36 PM, Tiago de Paula Peixoto 
> <[email protected]>wrote:
>
>> On 02/01/2013 05:04 PM, Arash wrote:
>> > Hi,
>> >
>> > I am trying to create a very large graph using the random_graph
>> > function. My computer has 128 GB of physical memory and a 16 GB swap
>> > partition. When I try to create a graph with 50 million vertices, and
>> > in- and out-degrees of 80 for every vertex, it consumes all the memory
>> > and then crashes.
>> >
>> > Do you have any idea how I can overcome the memory limit? Is there any
>> > way to make the random_graph function more memory efficient?
>>
>> Try passing the option "random=False". In this case the edges will not
>> be randomized, but it may need less memory. If the graph fits in memory,
>> you can then do random_rewire() with the option strat="erdos", if you
>> don't need a prescribed degree sequence. This should need less memory,
>> since a list of the edges does not have to be built internally.
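>>
>> Something along these lines (untested, and the exact parameter names may
>> differ between graph-tool versions):
>>
>>     from graph_tool.generation import random_graph, random_rewire
>>
>>     # build the graph with in- and out-degree 80 for every vertex,
>>     # but without shuffling the edges (random=False needs less memory)
>>     g = random_graph(50 * 10**6, lambda: (80, 80), directed=True,
>>                      random=False)
>>
>>     # then randomize the edges in place, Erdos-Renyi style, without
>>     # building an explicit edge list
>>     random_rewire(g, strat="erdos")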
>>
>> But note that you are dealing with a very large graph indeed, with 4
>> billion edges. If you use two 64 bit integers to specify an edge, this
>> already amounts to 80 * 50 * 1e6 * 64 * 2 / (8 * 1024 ** 3) ~ 60 GB.
>> Since the edge indexes are also needed, this is increased by another 30
>> GB.  Thus, simply the storage of such a thing would require at least 90
>> GB, but probably even more. If some temporary data structure has a size
>> O(E), it will cross over your 128 GB limit pretty easily.
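>>
>> (As a quick sanity check of those numbers:
>>
>>     edges = 80 * 50 * 10**6                  # 4e9 edges
>>     endpoints_gb = edges * 2 * 8 / 1024**3   # ~60 GB for the two endpoints
>>     index_gb = edges * 8 / 1024**3           # ~30 GB for the edge indexes
>>
>> which already gives roughly 90 GB before any temporaries.)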
>>
>> The memory usage could be halved by using 32-bit integers instead, but
>> the library would need to be modified.
>>
>> But I can't help but wonder if you really need a random graph this
>> large...
>>
>> Cheers,
>> Tiago
>>
>> --
>> Tiago de Paula Peixoto <[email protected]>
>>
>>
_______________________________________________
graph-tool mailing list
[email protected]
http://lists.skewed.de/mailman/listinfo/graph-tool
