Hi,
We have not used Neo4j or Gremlin, but we have used Google "Bigtable" via Google App Engine. Our experience is that things break down at large scale when the data is kept as many small, individually stored pieces. We hit limits pretty quickly (i.e. at thousands of things).

Our approach is to use models that are stored in documents which contain cross references to elements in the same or other documents. A document is the unit of storage as seen from the application, and we use binary storage of data broken up into optimal chunks for the specific storage technology. This works remarkably well.

Our solutions are built with EMF, and we rely on proxy references that enable referring to elements in documents that are not currently loaded, and on the ability to handle change-sets/deltas.
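
To make the proxy idea concrete, here is a minimal sketch in Python (not EMF itself). All names in it (Document, ProxyRef, DocumentStore, the sample data) are hypothetical and only illustrate the principle: a reference records which document and which element it points at, and the target document is loaded only when the reference is followed.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Unit of storage: a named set of elements keyed by a local id."""
    name: str
    elements: dict = field(default_factory=dict)


@dataclass(frozen=True)
class ProxyRef:
    """Reference to an element in a document that may not be loaded yet."""
    doc_name: str
    element_id: str


class DocumentStore:
    """Loads documents on demand and resolves proxy references."""

    def __init__(self, loader):
        self._loader = loader   # callable: document name -> Document
        self._loaded = {}       # documents currently held in memory

    def is_loaded(self, doc_name):
        return doc_name in self._loaded

    def get(self, doc_name):
        # Load the document the first time it is requested, then reuse it.
        if doc_name not in self._loaded:
            self._loaded[doc_name] = self._loader(doc_name)
        return self._loaded[doc_name]

    def resolve(self, ref):
        # Following the reference is what triggers loading the target.
        return self.get(ref.doc_name).elements[ref.element_id]


# Hypothetical stored data: "web01" refers to an element in "common".
RAW = {
    "web01": {
        "File[/etc/motd]": {"requires": ProxyRef("common", "Package[base]")},
    },
    "common": {
        "Package[base]": {"ensure": "installed"},
    },
}


def load(name):
    """Stand-in for reading one document from the storage backend."""
    return Document(name, dict(RAW[name]))


store = DocumentStore(load)
motd = store.get("web01").elements["File[/etc/motd]"]
# At this point only "web01" is in memory; "common" is loaded on resolve.
target = store.resolve(motd["requires"])
```

The point of the design is that the application only pays for the documents it actually navigates into, while references across documents stay cheap to store.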

I am interested in hearing more about the bottlenecks in the current design.
What are the frequent operations in a large network? My guess is that there would be many small changes in a large set of data from many machines, and that the problem is that every machine has to send its entire dataset to a master (lots of redundancy). Any technology that efficiently gets deltas to the master would work. If the database is accessed over the network, there is a risk that assembling the data sets generates a load that is just as heavy as sending entire documents from the agents.
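
As a rough illustration of sending deltas instead of whole datasets, here is a minimal Python sketch. It assumes each machine's data can be treated as a flat mapping from a resource name to its attributes; the function names and sample data are hypothetical and not taken from any existing tool.

```python
def compute_delta(old, new):
    """Return the changes that turn `old` into `new`."""
    return {
        "added": {k: new[k] for k in new.keys() - old.keys()},
        "removed": sorted(old.keys() - new.keys()),
        "changed": {
            k: new[k] for k in old.keys() & new.keys() if old[k] != new[k]
        },
    }


def apply_delta(base, delta):
    """Apply a delta on the master side and return the updated copy."""
    result = dict(base)
    for key in delta["removed"]:
        result.pop(key, None)
    result.update(delta["added"])
    result.update(delta["changed"])
    return result


# Hypothetical data: what the master already holds for one machine ...
previous = {
    "Package[ntp]": {"ensure": "installed"},
    "Service[ntp]": {"ensure": "running"},
    "File[/etc/motd]": {"content": "hello"},
}
# ... and what the machine has now.
current = {
    "Package[ntp]": {"ensure": "installed"},
    "Service[ntp]": {"ensure": "stopped"},
    "User[deploy]": {"ensure": "present"},
}

# Only the delta travels over the network, not the full dataset.
delta = compute_delta(previous, current)
rebuilt = apply_delta(previous, delta)
```

With many machines and few changes per run, the delta stays small even when the full dataset is large, which is where the saving would come from.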

Just my 2c (but I am just speculating and guessing). I would love to hear more about requirements and anticipated issues.

I did not know about Gremlin. It looks cool, and I will take a closer look. It seems like it could be useful on top of the type of models we use.

Regards
- henrik

On 5/24/11 5:48 AM, Luke Kanies wrote:
Hi all,

I've been thinking for a while of experimenting with graph databases -- 
especially Neo4j[1], but there are others out there -- and just this week I ran 
across a graph language, Gremlin[2].

I know Volcane has done some experimentation with Neo4j, but has anyone else 
messed with any of these?

I'm especially wondering how suitable it'd be to store the catalogs for all 
hosts on a large network, and what kind of benefits we'd see from that over, 
say, storing them in a document database or key/value store.

1 - http://neo4j.org/
2 - https://github.com/tinkerpop/gremlin/wiki



--
You received this message because you are subscribed to the Google Groups "Puppet 
Developers" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/puppet-dev?hl=en.
