Ah, that point about the attributes is interesting. I hadn't realized that
attribute handling was pluggable. It makes sense now that you describe it,
though.

After poking at my problem a bit more, I realized that having the graph
persist may not be worth the effort it would take to implement.

Thanks for the pointers!

-Cale

On Wed Feb 25 2015 at 4:22:53 PM Tamas Nepusz <[email protected]> wrote:

> > Thanks, Tamas. I'll add a little more color to what I'm doing.
> Thanks for the clarification, it is a bit clearer now.
>
> > The advantage I see of memory mapping is that if we can get to the point
> > where a particular igraph_t (and all of the data it references) is backed
> > by a memory mapped file, then loading it doesn't involve parsing at all.
> Getting all the memory that igraph_t refers to either directly or
> indirectly is a bit complicated, especially when igraph is embedded in a
> higher level language. Let me explain it. igraph_t itself is basically six
> growable vectors (to store the edge list and the first and second level
> indexes of the edge list) plus a pointer to another chunk of memory that is
> used by the attribute handler to store the graph/vertex/edge attributes.
> Attribute handling is decoupled from the "core" igraph library to allow
> different igraph interfaces (R, Python, Ruby etc) to bring their own
> implementations of the attribute handler -- this is how we ensure that the
> Python interface can store arbitrary Python objects as attributes while the
> R interface can store arbitrary R objects. Mapping the six vectors to a file
> is probably doable with a bit of hacking -- however, mapping the memory that
> is used for storing the attributes is probably impossible without hacking
> the host language itself. For instance, in the Python interface, the pointer
> in igraph_t points to an array containing three Python dicts; one for the
> graph attributes, one for the vertex attributes and one for the edge
> attributes. Each of these dicts in turn probably refers to other Python
> objects (the keys and values of the dicts), but these cannot be
> memory-mapped easily because the Python interpreter is responsible for
> managing the memory that they occupy.
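
To make the split described above concrete, here is a minimal sketch in
Python. It is not igraph's actual code or layout: CoreGraph and
DictAttributeHandler are hypothetical names, and only two flat arrays stand
in for the six vectors. The point is that the flat integer arrays reduce to
raw bytes trivially, while the attribute slot holds interpreter-managed
objects that have no such flat form.

```python
from array import array


class DictAttributeHandler:
    """Hypothetical handler: keeps attributes in three plain dicts."""

    def init(self):
        # One dict each for graph, vertex and edge attributes.
        return [{}, {}, {}]


class CoreGraph:
    """Hypothetical stand-in for the core structure, not igraph's layout."""

    def __init__(self, n, edges, handler):
        self.n = n
        # Flat, fixed-width integer arrays: these could be backed by a file.
        self.src = array("q", (s for s, _ in edges))
        self.dst = array("q", (d for _, d in edges))
        # Opaque to the core; only the handler knows what lives here.
        self.attr = handler.init()


g = CoreGraph(3, [(0, 1), (1, 2)], DictAttributeHandler())
graph_attrs, vertex_attrs, edge_attrs = g.attr

# Arbitrary interpreter-managed objects can be attached as attributes.
vertex_attrs["payload"] = [object(), {"any": "python object"}, None]

# The core arrays serialize to raw bytes directly; the attributes do not.
flat_bytes = g.src.tobytes() + g.dst.tobytes()
```
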
>
> So, *if* you don't use graph, vertex and edge attributes at all, then
> *maybe* it is doable, although I'm not sure about the amount of hacking
> involved. Theoretically, igraph_t is an opaque data type for all parts of
> the igraph library except a single file: type_indexededgelist.c. The idea is
> that you could implement a, say, type_mmapped_edgelist.c, replace
> type_indexededgelist.c with it, recompile igraph and then you get the same
> library with a different storage method for graphs. The truth is that no one
> ever really tried it (as far as I know).
>
> > Also, I don't think it's the case that "you have to load the entire graph
> > into memory at some point". If using a memory mapped file, I think you'd
> > be able to rely on the OS to only have the subset of pages resident that
> > you actually need to run the particular graph operations.
> True -- I was sort of assuming that the analysis you would like to perform
> would need the entire graph sooner or later anyway, so you would be just
> spreading out the cost of loading the graph across a longer time interval.
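
As a rough illustration of the memory-mapping idea discussed above, here is a
minimal sketch using only Python's standard library. The on-disk record
format (two little-endian 64-bit vertex ids per edge) and the helper names
write_edges and edge_at are assumptions made up for this example; igraph
itself defines no such file format. Opening the mapping involves no parsing,
and a single edge is decoded in place, leaving it to the OS to decide which
pages of the file become resident.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical on-disk record: two little-endian int64 vertex ids per edge.
EDGE = struct.Struct("<qq")


def write_edges(path, edges):
    """Write the edge list as fixed-size binary records."""
    with open(path, "wb") as f:
        for src, dst in edges:
            f.write(EDGE.pack(src, dst))


def edge_at(mm, i):
    """Decode edge i directly from the mapping, without reading the rest."""
    return EDGE.unpack_from(mm, i * EDGE.size)


path = os.path.join(tempfile.mkdtemp(), "edges.bin")
write_edges(path, [(0, 1), (1, 2), (2, 0), (2, 3)])

with open(path, "rb") as f:
    # Length 0 maps the whole file; nothing is parsed at this point.
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    try:
        edge_count = len(mm) // EDGE.size
        third_edge = edge_at(mm, 2)
    finally:
        mm.close()
```

Fixed-size records are what make the random access work here: the byte
offset of edge i is just i times the record size, so no index has to be
built at load time.
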
>
> T.
>
> _______________________________________________
> igraph-help mailing list
> [email protected]
> https://lists.nongnu.org/mailman/listinfo/igraph-help
>