[Neo4j] Re: Fans of Neo4j From Chinese
OK, thanks. That helps me a lot.

Gtalk: houbo...@gmail.com
skype: bolin.hou

-- Original message --
From: Tobias Ivarsson <tobias.ivars...@neotechnology.com>
Sent: Saturday, March 19, 2011, 5:59 PM
To: Neo4j user discussions <user@lists.neo4j.org>
Subject: Re: [Neo4j] Fans of Neo4j From Chinese

Neo4j serializes commits, i.e. at most one thread is committing a transaction at any time. For the actual work of building up the data to be committed, Neo4j supports multiple concurrent threads. This fact alone, that there is a single congestion point, means that if an application is very write centric, as in your case, it is unlikely to scale beyond two threads, with one building up the next commit while the other is committing its data. It might scale to a few more threads than that if the buildup time is significantly larger than the commit time. It is simple time slicing: only one train can be at the station at once, so you have to do the maths on how many trains can be out on the track during that time.

It is also worth keeping in mind that for CPU-bound operations an application doesn't scale much further than the number of CPUs in the computer. The threads that are not in commit mode, i.e. the ones that are building up the data for their next commit, are CPU bound and contend for the same CPU resources. This means that your application is not going to scale much further than the number of CPUs in your computer, and few desktop/laptop computers have more than 4 CPUs these days, which makes 5 threads about the most you can squeeze out of it. Anything more than that is just going to add contention, and possibly even slow things down.

Finally, the (CPU-bound) threads that create the graph might be contending on the same resources, as Peter said. If multiple threads modify the same node or relationship, i.e. if they create relationships to the same node (the root node for example), they are all going to block on that resource. Neo4j only allows one transaction to modify each entity at a time. This means that to get maximum concurrency out of your data creation, each thread should be creating its own disconnected subgraph. If the subgraphs have connected parts, the connections to the global data should be made last in the transaction (in a predictable order to avoid deadlocks[1]), to maximize the time the thread is operational before hitting the congestion point that is the (potentially) contended data.

Cheers,
Tobias

[1] Neo4j will detect if a deadlock has occurred and throw a DeadlockDetectedException in that case.

2011/3/18 孤竹 <ho...@foxmail.com>:

Hi,

Sorry to disturb you. I am a Chinese engineer, so please excuse my bad English :). Recently I have been learning Neo4j and trying to use it in my project. But when I put load on Neo4j with 5, 10, 20 and 30 threads, I found that the number of nodes inserted does not change noticeably (sometimes it does not change at all ~ ~!). Does the number of threads not matter? Does the kernel serialize the work? Are there any documents about the performance of Neo4j?

Thanks for your help.

The program is as follows. I submit this function to an ExecutorService with 5/10/30 threads, then measure how many nodes are inserted in the same period of time. (The counts do not change noticeably.)

    Transaction tx = null;
    Node before = null;
    try {
        for (int i = 0; i < 100; i++) {
            if (stop == true) {
                return;
            }
            if (graphDb == null) {
                return;
            }
            try {
                if (tx == null) {
                    tx = graphDb.beginTx();
                }
                // increment the reference count by 1
                writeCount.addAndGet(1);
                int startNodeString = name.addAndGet(1);
                Node start = getOrCreateNodeWithOutIndex("" + startNodeString);
                if (before == null) {
                    // root node. Hahaha, I got U
                    Node root = graphDb.getNodeById(0);
                    root.createRelationshipTo(start, LEAD);
                }
                if (before != null) {
                    before.createRelationshipTo(start, LOVES);
                }
                int endNodeName = name.addAndGet(1);
                Node end = getOrCreateNodeWithOutIndex("" + endNodeName);
                start.createRelationshipTo(end, KNOWS);
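To make the train-station arithmetic in Tobias's reply concrete, here is a rough sketch in plain Java. It is only a back-of-the-envelope model, not anything from Neo4j: the class name, the method name and the parameters (build time, commit time, CPU count) are all hypothetical, and it assumes that building is purely CPU bound while a single thread at a time sits in the serialized commit.

```java
final class CommitScaling {
    /**
     * Rough upper bound on the number of writer threads that can still help.
     *
     * buildMs  - assumed time one thread needs to build up a transaction (CPU bound)
     * commitMs - assumed time one transaction spends in the serialized commit
     * cpus     - number of CPUs in the machine
     */
    public static int usefulThreads(double buildMs, double commitMs, int cpus) {
        if (buildMs < 0 || commitMs <= 0 || cpus < 1) {
            throw new IllegalArgumentException("times must be positive and cpus >= 1");
        }
        // One train at the station: while one thread commits,
        // buildMs / commitMs other threads can be out on the track building.
        int bySerialCommit = 1 + (int) Math.floor(buildMs / commitMs);
        // Building is CPU bound: at most one builder per CPU, plus the committing thread.
        int byCpu = cpus + 1;
        return Math.min(bySerialCommit, byCpu);
    }

    public static void main(String[] args) {
        // Build time about equal to commit time: two threads is the limit.
        System.out.println(usefulThreads(10, 10, 4));   // prints 2
        // Build time much larger than commit time on 4 CPUs: capped at 5.
        System.out.println(usefulThreads(100, 10, 4));  // prints 5
    }
}
```

With these assumed numbers the model reproduces the two figures mentioned in the reply: about two threads when building and committing take similar time, and about five threads on a 4-CPU machine.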
Re: [Neo4j] Finding a Path Between Nodes (filtered by relationship property)
Add something like:

    "return filter": {
        "language": "javascript",
        "body": "position.lastRelationship().hasProperty(\"userGroupId\") && position.lastRelationship().getProperty(\"userGroupId\") == 111;"
    }})

to your traversal.

On Mon, Mar 21, 2011 at 9:44 AM, Kevin Dieter <kevin.die...@megree.com> wrote:

Hi,

I am using the REST API from a .NET application and need to find paths between nodes, including or excluding relationships based on a property value. For example:

1. Node1 has an outgoing relationship of type Friend, with relationship property userGroupId=111, to Node2.
2. Node2 has an outgoing relationship of type Family, with relationship property userGroupId=111, to Node3.
3. Node1 has an outgoing relationship of type WorkedWith, with relationship property userGroupId=222, to Node3.

I would like to find paths from Node1 to Node3 using relationships of any type, but only using those relationships with userGroupId=111. This should return the path that includes Node2 and the first two relationships, but should not include the direct path that uses the third relationship.

Is this possible using the REST API?

Thanks,
Kevin

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
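Kevin's expected result can be checked with a small in-memory model. The sketch below is plain Java and does not use the Neo4j REST API or any Neo4j types; the class FilteredPaths, the Rel type and the method names are hypothetical. It also differs from the suggested return filter in one respect: it skips non-matching relationships while expanding, instead of filtering positions after the fact.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FilteredPaths {
    /** Hypothetical in-memory relationship; not a Neo4j type. */
    static final class Rel {
        final String from;
        final String to;
        final String type;
        final int userGroupId;

        Rel(String from, String to, String type, int userGroupId) {
            this.from = from;
            this.to = to;
            this.type = type;
            this.userGroupId = userGroupId;
        }
    }

    /** The three relationships from Kevin's example. */
    public static List<Rel> kevinsExample() {
        return Arrays.asList(
                new Rel("Node1", "Node2", "Friend", 111),
                new Rel("Node2", "Node3", "Family", 111),
                new Rel("Node1", "Node3", "WorkedWith", 222));
    }

    /** Finds all simple paths from start to end that only use relationships of the given group. */
    public static List<List<String>> findPaths(List<Rel> rels, String start, String end, int groupId) {
        List<List<String>> result = new ArrayList<>();
        List<String> path = new ArrayList<>();
        path.add(start);
        walk(rels, start, end, groupId, path, result);
        return result;
    }

    private static void walk(List<Rel> rels, String current, String end, int groupId,
                             List<String> path, List<List<String>> result) {
        if (current.equals(end)) {
            result.add(new ArrayList<>(path));
            return;
        }
        for (Rel r : rels) {
            // Follow outgoing relationships of any type, but only with the wanted userGroupId,
            // and never revisit a node that is already on the current path.
            if (!r.from.equals(current) || r.userGroupId != groupId || path.contains(r.to)) {
                continue;
            }
            path.add(r.to);
            walk(rels, r.to, end, groupId, path, result);
            path.remove(path.size() - 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(findPaths(kevinsExample(), "Node1", "Node3", 111)); // [[Node1, Node2, Node3]]
        System.out.println(findPaths(kevinsExample(), "Node1", "Node3", 222)); // [[Node1, Node3]]
    }
}
```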
Re: [Neo4j] Graph traversal doubt
Hi,

In Gremlin do:

    g.v(torontoId).inE('traveled_to').outV.outE('traveled_to').inV

If you want it ranked by frequency, do:

    m = [:];
    g.v(torontoId).inE('traveled_to').outV.outE('traveled_to').inV.groupCount(m)

Take care,
Marko.
http://markorodriguez.com

On Mar 20, 2011, at 11:52 PM, Peter Neubauer wrote:

Adriano, how about something like this?

    import org.junit.Test;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Path;
    import org.neo4j.graphdb.traversal.Evaluation;
    import org.neo4j.graphdb.traversal.Evaluator;
    import org.neo4j.graphdb.traversal.TraversalDescription;
    import org.neo4j.kernel.Traversal;

    import common.Neo4jAlgoTestCase;

    public class TraversalTest extends Neo4jAlgoTestCase
    {
        @Test
        public void test2Steps()
        {
            graph.makeEdge( "John", "Paris" );
            graph.makeEdge( "Peter", "Paris" );
            graph.makeEdge( "John", "Rome" );
            graph.makeEdge( "Peter", "Toronto" );
            graph.makeEdge( "Adriano", "Toronto" );
            graph.makeEdge( "Adriano", "Tokyo" );
            Node node = graph.getNode( "Toronto" );
            TraversalDescription td = Traversal.description().evaluator( new Evaluator()
            {
                @Override
                public Evaluation evaluate( Path path )
                {
                    // Only paths of exactly two steps are results; do not go deeper.
                    if ( path.length() == 2 )
                    {
                        return Evaluation.INCLUDE_AND_PRUNE;
                    }
                    return Evaluation.EXCLUDE_AND_CONTINUE;
                }
            } );
            for ( Node res : td.traverse( node ).nodes() )
            {
                System.out.println( res );
            }
        }
    }

Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Sun, Mar 20, 2011 at 10:32 PM, Adriano Henrique de Almeida <adrianoalmei...@gmail.com> wrote:

Hi,

I have the attached graph, in which persons traveled to some cities. What I want to find out is, for a given city, for instance Toronto, which other cities were visited by the people who traveled there (in the attached graph these are Tokyo (by Adriano) and Paris (by Peter)). To retrieve this information, I wrote the following code:

    Collection<Node> allNodes = new ArrayList<Node>();
    Node toronto = db.getNodeById(torontoId);
    // First I get the Toronto node and its relationships to find out who traveled there
    Iterable<Relationship> relationships = toronto.getRelationships(Relationships.TRAVELED_TO, Direction.INCOMING);
    for (Relationship relationship : relationships) {
        Node[] nodes = relationship.getNodes();
        // For each relationship found, I get all nodes that are somehow related to this relationship
        for (Node node : nodes) {
            Collection<Node> citiesNode = node.traverse(Order.DEPTH_FIRST, StopEvaluator.DEPTH_ONE,
                    ReturnableEvaluator.ALL_BUT_START_NODE, Relationships.TRAVELED_TO, Direction.OUTGOING).getAllNodes();
            // And finally I traverse the graph from these nodes to find where the other people traveled to
            allNodes.addAll(citiesNode);
        }
    }

Well, with this I can get the results I wanted. However, it seemed to me that what I did was too complicated :). So, my question is: is there any way to do this traversal in a more straightforward manner?

Thanks in advance.

--
Adriano Almeida
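The two-step pattern in this thread (city, then the people who traveled there, then their other cities, optionally counted as in Marko's groupCount) can be checked on Peter's sample data without a database. This is a plain-Java sketch with hypothetical names (CoTravel, alsoVisited); it does not use Neo4j or Gremlin, and unlike the Gremlin expression it leaves the starting city out of the result.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class CoTravel {
    /** Peter's sample edges as {person, city} pairs. */
    public static List<String[]> sampleTrips() {
        return Arrays.asList(
                new String[] {"John", "Paris"},
                new String[] {"Peter", "Paris"},
                new String[] {"John", "Rome"},
                new String[] {"Peter", "Toronto"},
                new String[] {"Adriano", "Toronto"},
                new String[] {"Adriano", "Tokyo"});
    }

    /** Counts, per other city, how many visitors of the given city also traveled there. */
    public static Map<String, Integer> alsoVisited(List<String[]> trips, String city) {
        // Step 1: incoming "traveled_to" edges give the people who visited the city.
        Set<String> travelers = new HashSet<>();
        for (String[] trip : trips) {
            if (trip[1].equals(city)) {
                travelers.add(trip[0]);
            }
        }
        // Step 2: outgoing edges of those people give the other cities, counted by frequency.
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] trip : trips) {
            if (travelers.contains(trip[0]) && !trip[1].equals(city)) {
                counts.merge(trip[1], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(alsoVisited(sampleTrips(), "Toronto")); // {Paris=1, Tokyo=1}
    }
}
```

The result matches what Adriano expects from his graph: Tokyo (via Adriano) and Paris (via Peter).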
[Neo4j] Possible performance regression issue?
Here's the quick summary of what we're encountering:

We are inserting large numbers of activity stream entries on a nearly constant basis. To optimize transaction handling, we queue these up and have a single scheduled task that reads the entries from the queue and persists them to Neo. Within these transactions, it's possible that a very large number of relationships will be created and deleted (sometimes created and deleted within the same transaction, since we are managing something similar to an index).

I've noticed that the time required to handle the inserts (not just the total, but the time per insert) degrades DRAMATICALLY if there are more than a few hundred entries to write. It is very fast if there are 100 entries in the batch, but very slow if there are over 1000.

With Neo 1.1, we did not notice this behavior. We have tried Neo 1.2 and 1.3 and both seem to exhibit this behavior.

Can anyone provide any insight into possible causes/fixes?

Thanks,
Rick
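For readers who want to reproduce the setup Rick describes, the queue-plus-single-writer arrangement can be modeled in plain Java as below. This is only a sketch under assumptions: the class and method names are hypothetical, no Neo4j calls are made, and the cap on the batch size is merely a knob for comparing per-entry time at different batch sizes, not a fix confirmed anywhere in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class BatchedWriter {
    /**
     * Drains the queue in chunks of at most maxBatch entries and
     * returns the size of each chunk that was handled.
     */
    public static List<Integer> drainInBatches(BlockingQueue<String> queue, int maxBatch) {
        List<Integer> sizes = new ArrayList<>();
        List<String> batch = new ArrayList<>();
        while (queue.drainTo(batch, maxBatch) > 0) {
            // A real writer would open one transaction here, persist the
            // entries in `batch`, and commit before taking the next chunk.
            sizes.add(batch.size());
            batch.clear();
        }
        return sizes;
    }

    /** Fills a queue with the given number of placeholder entries. */
    public static BlockingQueue<String> queueOf(int entries) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < entries; i++) {
            queue.add("entry-" + i);
        }
        return queue;
    }

    public static void main(String[] args) {
        System.out.println(drainInBatches(queueOf(250), 100)); // [100, 100, 50]
    }
}
```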
[Neo4j] New Blog post: Strategies for Scaling Neo4j
With especial thanks to Mark Harwood and Alex Averbuch, I wrote this on approaches for scaling:

http://jim.webber.name/2011/03/22/ef4748c3-6459-40b6-bcfa-818960150e0f.aspx

Your thoughts would be most welcome.

Jim
Re: [Neo4j] Fans of Neo4j From Chinese
I'd like to explore this question a bit further. Does this mean that basically there's no way to scale beyond a single thread/CPU for disconnected graphs if you have complex graph dependencies (e.g. you cannot create disjoint subgraphs)?
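For the case where subgraphs cannot be kept fully disjoint, Tobias's advice in this thread was to touch the shared data last and in a predictable order. The sketch below illustrates only the ordering idea, using plain Java locks as a stand-in for per-entity write locks. Nothing here is Neo4j API: SharedNode, connectLast and run are hypothetical names, and how Neo4j acquires its locks internally is not shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

final class OrderedLocking {
    /** Hypothetical stand-in for a shared node that only one writer may modify at a time. */
    static final class SharedNode {
        final long id;
        final ReentrantLock lock = new ReentrantLock();
        int relationshipCount = 0;

        SharedNode(long id) {
            this.id = id;
        }
    }

    /** Locks the shared nodes in ascending id order, updates them, then releases the locks. */
    public static void connectLast(List<SharedNode> shared) {
        List<SharedNode> ordered = new ArrayList<>(shared);
        // The predictable order: every thread sorts by id before locking,
        // so two threads can never hold the same locks in opposite order.
        ordered.sort(Comparator.comparingLong(n -> n.id));
        int locked = 0;
        try {
            for (SharedNode n : ordered) {
                n.lock.lock();
                locked++;
            }
            for (SharedNode n : ordered) {
                n.relationshipCount++;
            }
        } finally {
            for (int i = locked - 1; i >= 0; i--) {
                ordered.get(i).lock.unlock();
            }
        }
    }

    /** Two threads ask for the same nodes in opposite order; returns the total update count. */
    public static int run(int rounds) {
        SharedNode a = new SharedNode(1);
        SharedNode b = new SharedNode(2);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                connectLast(Arrays.asList(a, b));
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < rounds; i++) {
                connectLast(Arrays.asList(b, a));
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return a.relationshipCount + b.relationshipCount;
    }

    public static void main(String[] args) {
        System.out.println(run(1000)); // 4000
    }
}
```

Without the sort, the two threads would take the locks in opposite order and could block each other forever; with it, both finish and every update is counted.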
Re: [Neo4j] New Blog post: Strategies for Scaling Neo4j
Great post. Only thing I'd add is that a weakness of 1 & 2 is that while they scale ~linearly for reads, they don't scale writes. Maybe that's obvious but it may be worth pointing out anyway.

Cheers,

-EE

--
Emil Eifrém, CEO [e...@neotechnology.com]
Neo Technology, www.neotechnology.com
Cell: +46 733 462 271 | US: 206 403 8808
http://blogs.neotechnology.com/emil
http://twitter.com/emileifrem
Re: [Neo4j] Neo4j and Microsoft Azure
On Mon, Feb 14, 2011 at 17:15, Peter Neubauer <peter.neuba...@neotechnology.com> wrote:

Hi all Graphytes, there has been a lot of interest in Neo4j from the Microsoft side of things, so Magnus Mårtensson and I did a write-up on how to get a first version of a Neo4j Server hosted on Microsoft Azure. Enjoy, and as always feel free to give feedback to the community!
http://blog.neo4j.org/2011/02/announcing-neo4j-on-windows-azure.html

Great post indeed. But recently I read about Microsoft Trinity [1]. What are your opinions about that? Will it be a competitor for Neo4j? Do you think it would be a good idea to support hyperedges natively in future Neo4j releases?

Best regards.

[1] http://research.microsoft.com/en-us/projects/trinity/default.aspx

--
Javier de la Rosa
http://versae.es
Re: [Neo4j] New Blog post: Strategies for Scaling Neo4j
Duly updated, thanks for the feedback.

Jim
[Neo4j] Re: Fans of Neo4j From Chinese
OK, thanks for your help! It helps me a lot!

I have another question. In my application there are lots of nodes and relationships (maybe a million nodes and tens of thousands of relationships). I have a way to reduce the number of relationships, but then the number of nodes grows (by the same ratio). Would that be faster or better for my searches? I think it would be faster, because the nodes have an index~

Please give me some advice :)
[Neo4j] Re: Re: Fans of Neo4j From Chinese
Sorry, I did not make that clear. My nodes have many relationships, and there will be far more relationships than nodes. By ten thousand relationships I meant the relationships connecting one node to other nodes :)

-- Original message --
From: Rick Bullotta <rick.bullo...@thingworx.com>
Sent: Tuesday, March 22, 2011, 9:55 AM
To: Neo4j user discussions <user@lists.neo4j.org>
Subject: Re: [Neo4j] Re: Fans of Neo4j From Chinese

I will be surprised if you do not have at least as many relationships as nodes (since usually each node is connected to at least one other node).
Re: [Neo4j] Performance expectations for Neo4j.
Bård,

sorry, this seems to have slipped through the list. I think you should be able to attach a picture to the list. Let me know if you have problems with that!

Cheers,

/peter neubauer

On Tue, Mar 8, 2011 at 12:47 PM, Bård Lind <bard.l...@gmail.com> wrote:

Hi Peter and David.

Thank you very much for your replies. Following your input I have run some more tests, setting -Xmx for memory and repeating the tests. Each test returns 170138 nodes.

Single run on laptop, with encrypted SSD disk:
    java -jar -Xmx128m graphdb-1.0.SNAPSHOT.jar     7481.0 ms
    java -jar -Xmx1024m graphdb-1.0.SNAPSHOT.jar    5545.0 ms
    (I was not able to assign 2 GB of memory.)

Single run on ultra-fast Ubuntu:
    java -jar -Xmx1024m graphdb-1.0.SNAPSHOT.jar    2795.0 ms
    java -jar -Xmx2048m graphdb-1.0.SNAPSHOT.jar    2734.0 ms

Repeated runs on laptop (java -jar -Xmx1024m graphdb-1.0.SNAPSHOT.jar):
    1st run: 5422.0 ms
    2nd run: 1328.0 ms
    3rd run: 1031.0 ms

Repeated runs on Ubuntu (java -jar -Xmx2048m graphdb-1.0.SNAPSHOT.jar):
    1st run: 2797.0 ms
    2nd run: 573.0 ms
    3rd run: 448.0 ms

These tests indicate that disk I/O speed, and possibly CPU, is crucial for the first fetch, while subsequent retrievals perform acceptably. What do you think about these results, David and Peter?

The scenario I'm trying to prove: show a tree structure of a Customer, with sub-companies, accounts and subscriptions.

1. Using a single user id, find all the resources a user_id has access to.
2. Show these resources in a tree structure with parameters including id, parent_id, type and name.
3. Use the tree for navigation client-side.

The resources are Customer (which can have Customer children), Account (child of Customer) and Subscription (child of Account). Normally a user's access starts at a single customer or account level. The user then has read, write and inherit access to all children. Different users have access to different parts of the full Customer tree.

May I send a picture describing the scenario to this list, or must I post the picture to a different location and only send the link?

The code I use is:

    public Collection<Node> findGraphBasedOnPrincipal(long userId) {
        Collection<Node> nodeList = new LinkedHashSet<Node>();
        Monitor monFind = MonitorFactory.start();
        final TraversalDescription PRINCIPAL_TRAVERSAL = Traversal.description()
                .relationships(RelationshipTypeTelenor.IS_MEMBER_OF_GROUP, Direction.OUTGOING)
                .relationships(RelationshipTypeTelenor.SECURITY, Direction.OUTGOING)
                .relationships(RelationshipTypeTelenor.IS_CHILD_RESOURCE_OF, Direction.INCOMING)
                .depthFirst()
                .evaluator(new Evaluator() {
                    @Override
                    public Evaluation evaluate(Path path) {
                        // Node 1 is excluded and nothing below it is visited.
                        if (path.endNode().getId() == 1) {
                            return Evaluation.EXCLUDE_AND_PRUNE;
                        }
                        return Evaluation.INCLUDE_AND_CONTINUE;
                    }
                })
                .uniqueness(Uniqueness.RELATIONSHIP_GLOBAL);
        Transaction tx = graphDb.beginTx();
        try {
            Node startNode = graphDb.getNodeById(userId);
            Monitor monTrav = MonitorFactory.start();
            Iterable<Node> nodes = PRINCIPAL_TRAVERSAL.traverse(startNode).nodes();
            Iterator<Node> iter = nodes.iterator();
            int count = 0;
            while (iter.hasNext()) {
                Node next = iter.next();
                count++;
            }
            monTrav.stop();
            // <-- This is where I stop the timer regarding Neo4j performance.
            System.out.println("INFO: Fetch of " + count + " nodes in: " + monTrav);
            for (Node node : nodes) {
                nodeList.add(node);
            }
            tx.success();
        } finally {
            tx.finish();
        }
        return nodeList;
    }

Scenario 2: Does user x have access to resource y?

Here we are thinking of using something like the ACL example on the Neo4j wiki, which does not include loading the full customer graph.

Hope this clarifies my challenge a bit.
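The gap between the first run and the repeated runs in Bård's numbers can be observed with a very small timing harness. The sketch below is plain Java with hypothetical names (WarmupTiming, timeRuns), and the workload in main is only a stand-in that counts to 170138; it does not touch Neo4j or JAMon. The point is simply to time each repetition separately so that the cold first run is not averaged together with the warm ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

final class WarmupTiming {
    /** Runs the task the given number of times and returns the elapsed milliseconds of each run. */
    public static List<Double> timeRuns(IntSupplier task, int runs) {
        List<Double> millis = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.getAsInt();
            millis.add((System.nanoTime() - start) / 1_000_000.0);
        }
        return millis;
    }

    public static void main(String[] args) {
        // Stand-in workload: in the real test this would be the traversal that counts the nodes.
        IntSupplier workload = () -> {
            int count = 0;
            for (int i = 0; i < 170138; i++) {
                count++;
            }
            return count;
        };
        List<Double> timings = timeRuns(workload, 3);
        for (int run = 0; run < timings.size(); run++) {
            System.out.println("run " + (run + 1) + ": " + timings.get(run) + " ms");
        }
    }
}
```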
Looking forward to your suggestions :-)

Big smile from Bård

On Mon, Mar 7, 2011 at 5:31 PM, David Montag <david.mon...@neotechnology.com> wrote:

Bård,

Great to hear you're evaluating us for your solution. I have a couple of questions. First, how much RAM do you have in the machine, and how much heap are you allocating for the Java process? Peter's question about running it multiple times is also very relevant. Secondly,