[Puppet-dev] Re: Graph databases and languages

Henrik Lindberg Tue, 24 May 2011 17:45:17 -0700

Hi,

We have not used Neo4j or Gremlin, but we have used Google's Bigtable via Google App Engine. Our experience is that things break down at large scale when the data is kept as many small, individually stored pieces. We hit limits pretty quickly (i.e. at thousands of things).

Our approach is to use models that are stored in documents which contain cross references to elements in the same or other documents. A document is the unit of storage as seen from the application, and we use binary storage of data broken up into optimal chunks for the specific storage technology. This works remarkably well.

Our solutions are built with EMF, and we rely on proxy references, which make it possible to refer to elements in documents that are not currently loaded, and on the ability to handle change-sets/deltas.
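
To make the proxy idea concrete, here is a minimal Python sketch. It is not EMF and does not use EMF's API; the names (Element, Document, Proxy, DocumentSet) and the in-memory store are all made up for illustration. It only shows the general pattern: a cross reference is kept as a (document, element) pair, and the target document is loaded the first time the reference is followed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Proxy:
    """An unresolved cross reference: (document URI, element id)."""
    doc_uri: str
    element_id: str


@dataclass
class Element:
    """A model element; refs maps a reference name to a Proxy."""
    name: str
    refs: Dict[str, Proxy] = field(default_factory=dict)


@dataclass
class Document:
    """The unit of storage: a named collection of elements."""
    uri: str
    elements: Dict[str, Element] = field(default_factory=dict)


class DocumentSet:
    """Loads documents on demand and resolves proxies against them."""

    def __init__(self, loader: Callable[[str], Document]):
        self._loader = loader            # hypothetical: fetches a document by URI
        self._loaded: Dict[str, Document] = {}
        self.load_count = 0              # how many documents were actually loaded

    def resolve(self, proxy: Proxy) -> Element:
        doc = self._loaded.get(proxy.doc_uri)
        if doc is None:                  # load the target document only when needed
            doc = self._loader(proxy.doc_uri)
            self._loaded[proxy.doc_uri] = doc
            self.load_count += 1
        return doc.elements[proxy.element_id]


# Two in-memory documents standing in for a real storage backend (assumption).
_STORE = {
    "doc:a": Document("doc:a", {"web": Element("web", {"db": Proxy("doc:b", "db")})}),
    "doc:b": Document("doc:b", {"db": Element("db")}),
}

docs = DocumentSet(lambda uri: _STORE[uri])
web = docs.resolve(Proxy("doc:a", "web"))
db = docs.resolve(web.refs["db"])        # following the reference loads doc:b
```

The point of the design is that doc:b is never touched unless something actually follows a reference into it.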


Interested in hearing more about the bottlenecks in the current design.

What are the frequent operations in a large network? My guess is that there would be many small changes in a large set of data from many machines, and that there is a problem with all of them having to send their entire data set to a master (lots of redundancy). Any technology that efficiently gets deltas to the master would work. If the database is accessed over the network, there is a risk that assembling the data sets generates a load that is just as heavy as sending entire documents from agents.
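
As a rough illustration of what "getting deltas to the master" could look like, here is a small Python sketch. It assumes a catalog can be treated as a mapping from a resource id to its attributes; that shape, and the function names compute_delta and apply_delta, are my own assumptions and not how Puppet stores or ships catalogs.

```python
from typing import Any, Dict

Catalog = Dict[str, Dict[str, Any]]   # resource id -> attributes (assumed shape)


def compute_delta(old: Catalog, new: Catalog) -> Dict[str, Any]:
    """Return only what changed between two snapshots of a catalog."""
    return {
        "removed": sorted(k for k in old if k not in new),
        "upserted": {k: v for k, v in new.items() if old.get(k) != v},
    }


def apply_delta(base: Catalog, delta: Dict[str, Any]) -> Catalog:
    """Rebuild the new snapshot on the master from the old one plus a delta."""
    result = {k: v for k, v in base.items() if k not in delta["removed"]}
    result.update(delta["upserted"])
    return result


old = {
    "File[/etc/motd]": {"mode": "0644"},
    "Service[ntp]": {"ensure": "running"},
    "Package[vim]": {"ensure": "present"},
}
new = {
    "File[/etc/motd]": {"mode": "0600"},      # changed, so it is sent
    "Service[ntp]": {"ensure": "running"},    # unchanged, so it is not sent
    "Package[git]": {"ensure": "present"},    # added; Package[vim] was removed
}

delta = compute_delta(old, new)
rebuilt = apply_delta(old, delta)
```

Only the changed and added resources plus the list of removed ids cross the wire; the master, which already holds the old snapshot, can rebuild the new one from that.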

Just my 2c (but I am just speculating and guessing) - love to hear more about requirements and anticipated issues.

Did not know about Gremlin - looks cool, will take a closer look. Seems like it could be useful on top of the type of models we use.


Regards
- henrik

On 5/24/11 5:48 AM, Luke Kanies wrote:

Hi all,

I've been thinking for a while of experimenting with graph databases -- 
especially Neo4j[1], but there are others out there -- and just this week I ran 
across a graph language, Gremlin[2].

I know Volcane has done some experimentation with Neo4j, but has anyone else 
messed with any of these?

I'm especially wondering how suitable it'd be to store the catalogs for all 
hosts on a large network, and what kind of benefits we'd see from that over, 
say, storing them in a document database or key/value store.

1 - http://neo4j.org/
2 - https://github.com/tinkerpop/gremlin/wiki


