On 10/16/07, Klas Ehnrot wrote:
> First of all, thank you all for your valuable input. There is still
> some work to do, so I'll get right on to commenting on your comments :)
>
> > Use a linked list structure: from the "subreference node", have a
> > NEXT_CUSTOMER relationship instead of a lot of CUSTOMER relationships
> > to the customers. Or use a tree structure (each "tree node" is allowed
> > to have 100k entries and so on). A list structure works well when you
> > only need to insert, delete and iterate over entries. A tree is better
> > when you need to go from an entry to the "subreference node" (for
> > example, to find the type).
>
> OK, I have a few questions around this. First of all, can you explain
> more on how the linked list setup is better for performance than
> connecting all customers to the subreference node? Does Neo have a
> performance issue connecting a lot of nodes to a single node and then
> iterating over them?
>
Yes, Neo has a problem with this. If a node has 10M relationships of
type A and 1 of type B, iterating over the type B relationships from
that node (if not cached) could result in a lot of type A relationships
being checked to see whether they are of type B. This problem can be
fixed, and I think we will do that, but it isn't trivial and is going
to take some time.

Regarding the linked list setup vs. all relationships on the same node,
I can say this: using a linked list will result in slightly lower
performance when iterating over the elements, but it will use a lot
less memory, so basically it is a memory vs. speed trade-off. If you
use a tree you get the best of both worlds.

> Second, how do we feel about usability surrounding this issue? Maybe
> it makes sense to connect the customers to each other when we are
> talking performance or node space optimization, but it doesn't feel
> natural (at least for me) to connect the customers together if they
> don't have a logical explanation for being connected (like being
> friends etc.)
>

If they are friends, you connect them with a relationship type named
"FRIEND" and probably expose an ICustomer.getFriends() :
Iterable<ICustomer>. If the customers are stored in a linked list (for
whatever reason makes that logical), you use a relationship type named
"NEXT_ELEMENT" and don't expose it in the interface.

> I believe it's very important to demonstrate that Neo is easy to use
> and code for, and I've tried to keep it simple in the guide.
>

Indeed, and I think the code can be simplified some more (will get back
to you on that).

> Regarding the tree structure, I agree with this layout, but shouldn't
> we base the node tree on some property of the customer, such as Name or
> Id, so that the tree structure makes sense? To just do it would seem a
> bit strange, but then again, if there is a performance issue as
> described above, we might have to do it?
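
For reference, the simplified tree layout under discussion (fixed-size buckets under the sub-reference node, not keyed on any customer property) might look roughly like the sketch below. This is a plain in-memory toy, not Neo's API: the class names `BucketNode` and `BucketTreeSketch`, the `capacity` parameter and the bucket labels are all made up for illustration. It only appends, and it bounds the fan-out of the buckets but not of the root, so a very large data set would need more levels.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory node (hypothetical, not Neo's Node): one parent, many children.
final class BucketNode {
    final String label;
    BucketNode parent; // one hop towards the sub-reference node
    final List<BucketNode> children = new ArrayList<>();

    BucketNode(String label) {
        this.label = label;
    }

    void add(BucketNode child) {
        child.parent = this;
        children.add(child);
    }
}

// Two-level tree: sub-reference node -> buckets -> customers.
final class BucketTreeSketch {
    final BucketNode root = new BucketNode("CUSTOMERS"); // the sub-reference node
    final int capacity;                                  // max customers per bucket

    BucketTreeSketch(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    // Append a customer to the last bucket, opening a new bucket when it is full.
    BucketNode addCustomer(String name) {
        BucketNode bucket = root.children.isEmpty()
                ? null
                : root.children.get(root.children.size() - 1);
        if (bucket == null || bucket.children.size() >= capacity) {
            bucket = new BucketNode("bucket" + root.children.size());
            root.add(bucket);
        }
        BucketNode customer = new BucketNode(name);
        bucket.add(customer);
        return customer;
    }

    // Walk parent links from any entry back up to the sub-reference node.
    static BucketNode subReferenceOf(BucketNode entry) {
        BucketNode current = entry;
        while (current.parent != null) {
            current = current.parent;
        }
        return current;
    }

    // Largest number of children hanging off any single node in the tree.
    int maxFanOut() {
        int max = root.children.size();
        for (BucketNode bucket : root.children) {
            max = Math.max(max, bucket.children.size());
        }
        return max;
    }
}
```

With a capacity of 100 and 1000 customers, this gives 10 buckets of 100 entries each, and any customer reaches the sub-reference node in two parent hops.
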
Sure, you could do that, and when I use the tree structure it is most
often to index something (that makes sense). I just pointed out (since
Björn brought up the scalability issue) that a simplified tree
structure could do the trick.

> As you have seen, my example already demands quite a lot of coding just
> to get a simple domain model, and I think this addition will add even
> more code and complexity, so we have to ask ourselves if we need it.
> At what number of estimated customers do we need this approach, etc.?
>

It depends on the hardware and what you do. For example, just adding
customers and removing them (i.e. relationship create/delete) will work
fine, but iterating over all customers could pose a problem.

I don't think we should add any of this (linked lists/trees) to the
design guide, but the question is whether the "sub-reference node
pattern" is good or bad. The more networked you make your data, the
better Neo becomes at taking care of it. I mean, you could store
everything in a single property on a node, but by using the
sub-reference node pattern you make your data more networked.

/Johan
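
To make the type-checking cost and the linked-list alternative from this thread concrete, here is a minimal sketch. It is a toy in-memory model and an assumption throughout, not Neo's API or storage format: each node keeps all its outgoing relationships in one list, so finding those of one type means inspecting every entry. The names `ToyNode`, `ToyRel` and `LayoutSketch`, and the second relationship type "OTHER", are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy relationship (hypothetical, not Neo's Relationship): a typed edge to an end node.
final class ToyRel {
    final ToyNode end;
    final String type;

    ToyRel(ToyNode end, String type) {
        this.end = end;
        this.type = type;
    }
}

// Toy node: all outgoing relationships live in one list, whatever their type.
final class ToyNode {
    final String name;
    final List<ToyRel> rels = new ArrayList<>();

    ToyNode(String name) {
        this.name = name;
    }

    void connectTo(ToyNode end, String type) {
        rels.add(new ToyRel(end, type));
    }

    // Collect end nodes of relationships with the given type and add the
    // number of relationships inspected to checks[0].
    List<ToyNode> follow(String type, int[] checks) {
        List<ToyNode> result = new ArrayList<>();
        for (ToyRel rel : rels) {
            checks[0]++;
            if (rel.type.equals(type)) {
                result.add(rel.end);
            }
        }
        return result;
    }
}

final class LayoutSketch {
    // Every customer hangs directly off the sub-reference node.
    static ToyNode star(int customers) {
        ToyNode root = new ToyNode("customers");
        for (int i = 0; i < customers; i++) {
            root.connectTo(new ToyNode("c" + i), "CUSTOMER");
        }
        root.connectTo(new ToyNode("other"), "OTHER");
        return root;
    }

    // Customers form a chain; the sub-reference node only points at the first.
    static ToyNode chain(int customers) {
        ToyNode root = new ToyNode("customers");
        ToyNode previous = root;
        for (int i = 0; i < customers; i++) {
            ToyNode customer = new ToyNode("c" + i);
            previous.connectTo(customer, "NEXT_CUSTOMER");
            previous = customer;
        }
        root.connectTo(new ToyNode("other"), "OTHER");
        return root;
    }

    // How many relationships on root are inspected to find those of one type.
    static int checksToFind(ToyNode root, String type) {
        int[] checks = {0};
        root.follow(type, checks);
        return checks[0];
    }

    // Iterate the chain by following NEXT_CUSTOMER until it ends.
    static int countChain(ToyNode root) {
        int count = 0;
        ToyNode current = root;
        while (true) {
            List<ToyNode> next = current.follow("NEXT_CUSTOMER", new int[1]);
            if (next.isEmpty()) {
                return count;
            }
            current = next.get(0);
            count++;
        }
    }
}
```

With 1000 customers in this model, `checksToFind(star(1000), "OTHER")` inspects 1001 relationships on the sub-reference node, while `checksToFind(chain(1000), "OTHER")` inspects only 2. The chain pays for that with one hop per customer when iterating.
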

