Re: [Neo] General questions
Hi Ilya!

> 1) Inheritance. Suppose I want to draw a tree where every node is derived from another node, say Car -> Sedan -> Toyota. When creating the Toyota node I don't want to recreate the same properties and relationships that Sedan has.

I think this example can help you:
http://blog.neo4j.org/2010/03/modeling-categories-in-graph-database.html

Java code for the example is found here (the blog post uses Python):
http://github.com/neo4j-examples/java-shop-categories

> 2) Multi-value properties. When it comes to properties I see I can store key=value pairs, but what if I want to store key=values? For example, I'd like to have a property possibleColors whose values are red, blue, green, etc.

Neo4j has multi-value properties, see:
http://api.neo4j.org/current/org/neo4j/graphdb/PropertyContainer.html#setProperty%28java.lang.String,%20java.lang.Object%29

> Do I need to create separate nodes and properties for each color?

I'd go for having a node represent each color, and then draw relationships from car nodes to all color nodes that are possible for that car. Then you can store extra information about the color as properties on the color node (and by adding relationships, for example to the manufacturer node or whatever).

> 3) Interfaces. If I have the following: person - drives - car, person - drives - boat. Is there a way to define a drives interface, which might have its own properties, that can be used as a template or even reused in both relationships above? If not, what's a good approach to achieve this?

In our IMDB example we have Actor --Role-- Movie, where the Role is a wrapper around the relationship. Not sure if that's what you're looking for.
http://wiki.neo4j.org/content/IMDB_The_Domain

Or maybe you're looking for something like the meta model?
http://components.neo4j.org/neo4j-meta-model/

> 4) Categorizing/grouping multiple nodes including relationships. What is the best strategy for grouping nodes if I want to include relationships as well? For example: person 1 - owns -> car 1, car 1 - sold -> person 2, then car 1 - sold -> person 3. I want to keep track of how long each person had car 1; this also needs to account for each person having more than one car.

Not sure what your requirements are here. One simple way to model this would be purchased and sold relationships between persons and cars, both having a property that keeps the date. Depending on your needs, you could have a current_owner relationship as well.

> 5) Neo clustering (load balancing) / accessing from multiple sources. From what I read, I understand that Neo is only accessible from the JVM that runs it.

I leave this one for someone else to chime in on!

/anders
___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
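A minimal sketch of the purchased/sold model suggested in the reply to question 4, written in plain Python rather than against the Neo4j API. The tuple layout, the relationship names PURCHASED and SOLD, the helper ownership_days, and all dates are assumptions made for illustration only.

```python
from datetime import date

# Hypothetical in-memory stand-in for a graph: each relationship is a
# (start_node, relationship_type, end_node, properties) tuple.
relationships = [
    ("person 1", "PURCHASED", "car 1", {"date": date(2008, 1, 10)}),
    ("person 1", "SOLD", "car 1", {"date": date(2009, 3, 1)}),
    ("person 2", "PURCHASED", "car 1", {"date": date(2009, 3, 1)}),
    ("person 2", "SOLD", "car 1", {"date": date(2010, 2, 1)}),
    ("person 3", "PURCHASED", "car 1", {"date": date(2010, 2, 1)}),
    ("person 1", "PURCHASED", "car 2", {"date": date(2009, 6, 15)}),
]


def _dates(person, car, rel_type):
    """Sorted dates of all relationships of one type between person and car."""
    return sorted(
        props["date"]
        for start, kind, end, props in relationships
        if start == person and end == car and kind == rel_type
    )


def ownership_days(person, car, today):
    """Total days `person` has owned `car`.

    Each purchase is paired with the sale that follows it; a purchase with
    no matching sale is treated as still owned up to `today`.
    """
    bought = _dates(person, car, "PURCHASED")
    sold = _dates(person, car, "SOLD")
    total = 0
    for i, start in enumerate(bought):
        end = sold[i] if i < len(sold) else today
        total += (end - start).days
    return total


if __name__ == "__main__":
    today = date(2010, 5, 2)
    for person in ("person 1", "person 2", "person 3"):
        print(person, "owned car 1 for", ownership_days(person, "car 1", today), "days")
```

Because the dates live on the relationships, one person can own several cars (person 1 also holds car 2 here) without any extra grouping node.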
[Neo] Evaluating Neo4J as an enterprise class application
We are evaluating Neo4J for a business critical application.

There will be a User Interface (UI) component to browse the graph, create nodes and properties, and create/modify relationships. The data set spans 7 domains and is expected to be around 40 GB. Users will manipulate data in 3 domains. A back end integration is expected to manage data in the remaining 4 domains. I use the word domain to mean nodes/relationships/attributes that are grouped to perform one activity, like Sale Order, Shipping, Distribution etc. The domains are related to each other, and queries traverse across different domains.

We are expecting 500 users per hour to use the system. Each user may initiate a query once every 2 minutes. Each query is expected to traverse 20,000 nodes and collect 10 properties for filtering/display.

I am accountable for implementing this system. You probably know what accountable means :) Say it is related to Guillotine. What should I do to convince myself to move forward?

Things that come to my mind are stability, scalability, auditing and monitoring.
Stability means the JVM/application won't crash.
Scalability means each user will get a response in 1-2 seconds for up to 500 users.
Auditing means the system reports its performance for all interactions.
Monitoring means the health and performance of the system are made visible.

Comments and pointers to related articles are appreciated.

TIA
SDev
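A back-of-the-envelope sketch of the load these figures imply. It assumes the worst case, that all 500 users are active at the same time, and it reads "10 properties" as 10 properties per visited node; both readings are assumptions, not something the message states. The function name estimate_load is made up for this example.

```python
def estimate_load(users, seconds_between_queries, nodes_per_query, properties_per_node):
    """Return (queries/s, node visits/s, property reads/s) for a steady load."""
    queries_per_second = users / seconds_between_queries
    node_visits_per_second = queries_per_second * nodes_per_query
    property_reads_per_second = node_visits_per_second * properties_per_node
    return queries_per_second, node_visits_per_second, property_reads_per_second


if __name__ == "__main__":
    # Figures from the message; simultaneous activity is an assumption.
    qps, visits, reads = estimate_load(
        users=500,
        seconds_between_queries=120,   # one query every 2 minutes per user
        nodes_per_query=20_000,
        properties_per_node=10,
    )
    print(f"{qps:.2f} queries/s, {visits:,.0f} node visits/s, {reads:,.0f} property reads/s")
```

Under these assumptions the system has to sustain roughly four queries per second, which is a more useful target for a stress test than the raw user count.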
[Neo] General questions
Max and Anders,

Thanks much for your replies and detailed info! I think I have a better understanding now of reusability via inheritance and interfaces. I see now that if I want to define things once and then extend/reuse them, I simply need to create my own wrappers in the language of choice (i.e. Java). The IMDB Role as a wrapper around a relationship is a perfect example.

For multi-value, I like the setProperty API; very useful.

Categorizing/grouping: I realized I might not have been very clear, so let me provide a different example. Suppose I have Actor - stars in -> Show - airs on -> Network. I want to keep track of when a given actor starred in a given show that airs on a given network. I'd like to keep this 3-way relationship constant, but have a property (or something similar) that keeps track of dates when actors/networks change while the relationships and the show remain constant.

Max, for clustering, I checked the RESTful API (very useful), but it still ties everything to the one app server that Neo is running under. I won't be able to run that server as a cluster unless I have some smart logic that makes a server not running Neo hit another server that runs Neo via the REST API, which is messy. It's definitely a possibility, but I hope there is a better way.

Regards,
Ilya
Re: [Neo] General questions
Ilya,

Remember that we have no 3-way relationships (as in types); you have to use a node for anything that connects more than two objects. So break Show down into Episodes if you want to track the first (or only) airing of each episode.

Actor 1 = Node 1, Class Actor
Show 1 = Node 2, Class Show
Network 1 = Node 3, Class Network
Season 1 = Node 4, Class Season
Season 2 = Node 5, Class Season
Episodes 1-12 = Nodes 6-17, Class Episodes
Network 2 = Node 18, Class Network
Episodes 13-24 = Nodes 19-30, Class Episodes

Actor 1 STARRED_ON Show 1
Actor 1 APPEARED_ON Episode 3
Actor 1 APPEARED_ON Episode 20
Episode 1 AIRED_ON Network 1
Episode 20 AIRED_ON Network 2

Now, if you want to keep track of re-runs on different networks, then you have to change that AIRED_ON relationship into a node class.

Aired_on 1 PLAYED_ON Network 2
Aired_on 1 IS_A Episode 20

The trick for me has been to find all the many-to-many relationships and break them down into nodes. If the idea is to eventually have a data warehouse with slowly changing dimensions, then bake those in as instances.

With the REST API, Neo4j becomes a web server. Have 50 of them if you want, multicast the writes, load balance the reads, etc.

On Sun, May 2, 2010 at 4:44 PM, ilya il...@nyc.rr.com wrote:
> Suppose I have Actor - stars in -> Show - airs on -> Network. I want to keep track of when a given actor starred in a given show that airs on a given network. I'd like to keep this 3-way relationship constant, but have a property (or something similar) that keeps track of dates when actors/networks change while the relationships and the show remain constant.
> [...]
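A small sketch of the "turn the relationship into a node" idea from the reply above, in plain Python rather than the Neo4j API. The Graph class, the node ids, the "Airing" kind and the dates are all hypothetical; only the IS_A and PLAYED_ON relationship names come from the message.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory graph: nodes with properties, typed outgoing edges."""

    def __init__(self):
        self.nodes = {}                  # node id -> properties
        self.out = defaultdict(list)     # node id -> [(relationship type, target id)]

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props
        return node_id

    def relate(self, start, rel_type, end):
        self.out[start].append((rel_type, end))

    def targets(self, start, rel_type):
        return [end for kind, end in self.out[start] if kind == rel_type]


def airings_of(graph, episode):
    """All (date, network) pairs for an episode, oldest first."""
    result = []
    for node_id, props in graph.nodes.items():
        if props.get("kind") == "Airing" and episode in graph.targets(node_id, "IS_A"):
            network = graph.targets(node_id, "PLAYED_ON")[0]
            result.append((props["date"], network))
    return sorted(result)


g = Graph()
g.add_node("episode 20", kind="Episode")
g.add_node("network 1", kind="Network")
g.add_node("network 2", kind="Network")

# Each airing is its own node, so a re-run is just one more airing node
# pointing at the same episode and at whichever network showed it.
g.add_node("airing 1", kind="Airing", date="2010-01-05")
g.relate("airing 1", "IS_A", "episode 20")
g.relate("airing 1", "PLAYED_ON", "network 2")

g.add_node("airing 2", kind="Airing", date="2010-04-12")
g.relate("airing 2", "IS_A", "episode 20")
g.relate("airing 2", "PLAYED_ON", "network 1")

if __name__ == "__main__":
    for when, network in airings_of(g, "episode 20"):
        print(when, network)
```

The date sits on the airing node, so the episode and the networks stay unchanged no matter how many times the episode is shown.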
Re: [Neo] Evaluating Neo4J as an enterprise class application
We have stress tested Neo4j with over 500 concurrent users in a webapp with a smaller dataset, and we found no performance issues. We even wrap their API in a domain layer that adds some extra overhead.

One thing to keep in mind is that if your data ever grows to a point where it needs to be distributed among machines, you can't do that with the free version of Neo, but I think they support it with one of the commercial licenses.

In my experience with Neo so far (5+ months), since it is embedded, if you use Java you get a much better experience than with any relational DB with an ORM layer such as Hibernate. The data is not transported from your DB to a result set, then to a POJO, then to your view objects. With Neo the data may be in memory when you request it, and there is no JDBC layer between your code and the graph.

We have also purposely crashed the JVM and app, hoping that at some point it would corrupt the graph, and we have been doing this repeatedly, at least 10 times a day, for the last 5 months. It has always recovered and completed queued transactions. So far we have not been able to corrupt the graph or bring it down.

Backing up the data is also easy, as a copy of the graph folder is all you need.

PROS
- Fast
- Easy API
- Reliable
- High performance in our use cases
- The Neo team is fast at answering doubts and questions
- No SQL
- Fast relationship traversals; in the relational world this usually means JOINs, which are not very scalable.
- Ideal for scenarios where there are multiple relationships and interconnected objects

CONS
- Free version cannot be distributed across multiple machines
- Only one process or JVM can access the graph at a time
- No SQL (if you like SQL)
- Filtered traversals where results should be ordered usually require a full scan/traversal and then reordering the results. This is not scalable when pagination is required and the results number in the millions. We have fixed this issue, though, by having a separate index for single ordered relationships. In a nutshell, this is what your typical relational DB provides as a B-tree index on properties, which allows you to query with ORDER BY quickly. Neo at this time does not have that, so you have to keep your own indexes if you want ordered traversals. (Not a trivial task to implement.)

2010/5/2 suryadev vasudev suryadev.vasu...@gmail.com
> We are evaluating Neo4J for a business critical application.
> [...]

--
Raul Raja
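A rough sketch of the "keep your own ordered index" idea from the last CONS item. This is not how the poster implemented it and not a Neo4j feature; the OrderedIndex class and its methods are hypothetical. It keeps (sort key, node id) pairs in a sorted Python list so that a page of ordered results can be sliced out without traversing and re-sorting everything.

```python
import bisect


class OrderedIndex:
    """Sorted list of (sort_key, node_id) pairs, maintained on every write."""

    def __init__(self):
        self._entries = []

    def add(self, sort_key, node_id):
        # insort keeps the list sorted, so reads never need a full sort.
        bisect.insort(self._entries, (sort_key, node_id))

    def remove(self, sort_key, node_id):
        entry = (sort_key, node_id)
        i = bisect.bisect_left(self._entries, entry)
        if i < len(self._entries) and self._entries[i] == entry:
            del self._entries[i]

    def page(self, page_number, page_size):
        """Node ids for one page of results, in sort-key order."""
        start = page_number * page_size
        return [node_id for _, node_id in self._entries[start:start + page_size]]


if __name__ == "__main__":
    index = OrderedIndex()
    for price, node_id in [(30, "c"), (10, "a"), (20, "b"), (40, "d")]:
        index.add(price, node_id)
    print(index.page(0, 2))
    print(index.page(1, 2))
```

A plain list makes each insert linear in the index size, so for millions of entries a persistent B-tree style structure, as the message suggests, would be the more realistic choice. The hard part in practice is keeping the index in step with the graph inside the same transaction.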