Re: [Neo] General questions

2010-05-02 Thread Anders Nawroth
Hi Ilya!

 1)  Inheritance.
 Suppose I want to draw a tree where every node is derived from another node, say
 Car -> Sedan -> Toyota
 So when creating the Toyota node I don’t need to recreate the same properties and
 relationships that Sedan has.

I think this example can help you:
http://blog.neo4j.org/2010/03/modeling-categories-in-graph-database.html
Java code for the example is found here (blog post uses Python):
http://github.com/neo4j-examples/java-shop-categories
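
For illustration, the idea of inheriting properties through a chain of category nodes can be sketched in plain Python. This is a toy in-memory model, not the Neo4j API and not the code from the linked post; the Node class, its parent field, and the property names are made up for the example.

```python
# Hypothetical in-memory model; "parent" stands in for a relationship
# pointing from a node to the node it is derived from.
class Node:
    def __init__(self, name, parent=None, **props):
        self.name = name
        self.parent = parent
        self.props = dict(props)

    def get(self, key):
        """Return the nearest value for key, walking up the parent chain."""
        node = self
        while node is not None:
            if key in node.props:
                return node.props[key]
            node = node.parent
        raise KeyError(key)


car = Node("Car", wheels=4)
sedan = Node("Sedan", parent=car, doors=4)
toyota = Node("Toyota", parent=sedan, maker="Toyota")

# Toyota stores only its own property; the rest is resolved from Sedan and Car.
print(toyota.get("maker"), toyota.get("doors"), toyota.get("wheels"))
```

The point of the sketch is that the Toyota node does not copy anything: a lookup simply follows the chain upward until some ancestor defines the property.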

 2)  Multi value properties
 When it comes to properties I see I can store key=value pairs, but what if I
 want to store key=values?
 For example, I’d like to have a property possibleColors whose values are
 red, blue, green, etc.

Neo4j has multivalue properties, see:
http://api.neo4j.org/current/org/neo4j/graphdb/PropertyContainer.html#setProperty%28java.lang.String,%20java.lang.Object%29

 Do I need to create separate nodes and properties for each color?

I'd go for having a node represent each color, and then draw 
relationships from car nodes to all color nodes that are possible for 
that car. Then you can store extra information on the color as 
properties on the color node (and by adding relationships for example to 
the manufacturer node or whatever).
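
To make the trade-off concrete, here is a small sketch of both options in plain Python. It is not Neo4j code; the dictionaries, the allow helper, and the "metallic" property are assumptions made up for the example.

```python
from collections import defaultdict

# Option A: a multi-valued property stored directly on the car.
car_1 = {"name": "car 1", "possibleColors": ["red", "blue", "green"]}

# Option B: one node per color, so a color can carry its own properties.
color_nodes = {
    "red": {"metallic": False},   # made-up property
    "blue": {"metallic": True},   # made-up property
}
colors_of = defaultdict(set)  # car name -> color names (outgoing direction)
cars_of = defaultdict(set)    # color name -> car names (incoming direction)


def allow(car_name, color_name):
    """Record a relationship between a car and a color node, both directions."""
    if color_name not in color_nodes:
        raise KeyError(color_name)
    colors_of[car_name].add(color_name)
    cars_of[color_name].add(car_name)


allow("car 1", "red")
allow("car 1", "blue")
allow("car 2", "blue")

# With color nodes, "which cars come in blue?" is a direct lookup.
print(sorted(cars_of["blue"]))
```

With Option A the same question means checking the possibleColors list of every car, which is the practical argument for color nodes once colors need their own data or reverse lookups.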

 3)  Interfaces
 If I have the following: person -- drives -- car, person -- drives -- boat
 Is there a way to define a drives interface, which might have its own
 properties, that can be used as a template or even reused in both relationships
 above? If not, what’s a good approach to achieve this?

In our IMDB example we have Actor --Role-- Movie, where the Role is a 
wrapper around the relationship. Not sure if that's what you're looking for.
http://wiki.neo4j.org/content/IMDB_The_Domain
Or maybe you're looking for something like the meta model?
http://components.neo4j.org/neo4j-meta-model/
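
A rough sketch of the wrapper idea, applied to the drives case, might look like the following. It is plain Python rather than the IMDB example's Java code; the Relationship and Drives classes and the license_class property are hypothetical names chosen for illustration.

```python
class Relationship:
    """Stand-in for a raw graph relationship with free-form properties."""

    def __init__(self, start, rel_type, end):
        self.start = start
        self.type = rel_type
        self.end = end
        self.props = {}


class Drives:
    """Domain wrapper that gives every DRIVES relationship the same accessors."""

    def __init__(self, rel):
        if rel.type != "DRIVES":
            raise ValueError("expected a DRIVES relationship")
        self._rel = rel

    @property
    def vehicle(self):
        return self._rel.end

    @property
    def license_class(self):
        return self._rel.props.get("license_class")

    @license_class.setter
    def license_class(self, value):
        # The value lives on the underlying relationship, not on the wrapper.
        self._rel.props["license_class"] = value


# The same wrapper is reused for person-drives-car and person-drives-boat.
car_rel = Relationship("person 1", "DRIVES", "car 1")
boat_rel = Relationship("person 1", "DRIVES", "boat 1")
drives_car, drives_boat = Drives(car_rel), Drives(boat_rel)
drives_car.license_class = "B"
```

The "interface" is defined once in the wrapper class, while the data itself stays on each relationship.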

 4)  Categorizing/grouping multiple nodes including relationships
 What is the best strategy for grouping nodes if I want to include
 relationships as well?
 For example, person 1 -- owns -> car 1, car 1 -- sold -> person 2, then car 1
 -- sold -> person 3.
 I want to keep track of how long each person had car 1; this also needs to
 account for each person having more than one car.

Not sure what your requirements are here. One simple way to model this 
would be purchased and sold relationships between persons and cars, 
both having a property keeping the date. Depending on your needs, you 
could have a current_owner relationship as well.
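
As a sketch of how dated purchased/sold relationships answer the "how long" question, consider the plain-Python example below. It is not Neo4j code; the tuple layout, the days_owned helper, and all dates are made up, and it assumes a person buys a given car at most once.

```python
from datetime import date

# (person, relationship type, car, date property) -- all values are made up.
rels = [
    ("person 1", "PURCHASED", "car 1", date(2008, 1, 1)),
    ("person 1", "SOLD", "car 1", date(2009, 1, 1)),
    ("person 2", "PURCHASED", "car 1", date(2009, 1, 1)),
    ("person 2", "SOLD", "car 1", date(2009, 7, 1)),
    ("person 1", "PURCHASED", "car 2", date(2009, 3, 1)),  # still owned
]


def days_owned(person, car, today):
    """Days between PURCHASED and SOLD, or until today if not yet sold."""
    bought = sold = None
    for who, rel_type, what, when in rels:
        if (who, what) == (person, car):
            if rel_type == "PURCHASED":
                bought = when
            elif rel_type == "SOLD":
                sold = when
    if bought is None:
        raise LookupError(f"{person} never purchased {car}")
    return ((sold or today) - bought).days


print(days_owned("person 1", "car 1", today=date(2010, 5, 2)))
```

Because each relationship is keyed by both person and car, a person owning several cars needs no extra modeling.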

 5)  Neo clustering (load balancing) / accessing from multiple sources
 From what I read, I understand that Neo is only accessible from the JVM that runs
 it.

I leave this one for someone else to chime in on!


/anders

___
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo] Evaluating Neo4J as an enterprise class application

2010-05-02 Thread suryadev vasudev
We are evaluating Neo4j for a business-critical application. There will be a
user interface (UI) component to browse the graph, create nodes and properties,
and create/modify relationships. The data set spans 7 domains and is
expected to be around 40 GB.
Users will manipulate data in 3 domains. A back-end integration is expected
to manage data in the remaining 4 domains. I use the word domain to mean
nodes/relationships/attributes that are grouped to perform one activity, like
Sale Order, Shipping, Distribution, etc. The domains are related to each
other, and queries traverse across different domains.
We are expecting 500 users per hour to use the system. Each user may
initiate a query once every 2 minutes. Each query is expected to traverse
20,000 nodes and collect 10 properties for filtering/display.
I am accountable for implementing this system. You probably know what
accountable means :) Say it is related to the guillotine.
What should I do to convince myself to move forward? Things that come to
mind are stability, scalability, auditing and monitoring. Stability means
the JVM/application won't crash. Scalability means each user will get a
response in 1-2 seconds for up to 500 users. Auditing means the system
reports its performance for all interactions. Monitoring means the health and
performance of the system are made visible.
Comments and pointers to related articles are appreciated.
TIA
SDev
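
For a rough sense of scale, the figures above can be turned into a back-of-the-envelope load estimate. The calculation assumes the worst-case reading of the numbers, namely that all 500 users are active at the same time and each issues one query every 2 minutes; the variable names are only for illustration.

```python
# Figures taken from the message above.
users = 500
seconds_between_queries = 2 * 60
nodes_per_query = 20_000

# Assumption: all users are active concurrently (worst case).
queries_per_second = users / seconds_between_queries
node_visits_per_second = queries_per_second * nodes_per_query

print(f"{queries_per_second:.2f} queries/s")
print(f"{node_visits_per_second:,.0f} node visits/s")
```

Under that assumption the system would need to sustain roughly four queries per second, each finishing inside the 1-2 second response target.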


[Neo] General questions

2010-05-02 Thread ilya
Max and Anders,

Thanks much for your replies and detailed info!

I think I have a better understanding now of reusability via inheritance
and interfaces. I see now that if I want to define things once and then
extend/reuse them, I simply need to create my own wrappers in my language of
choice (e.g. Java). The IMDB Role as a wrapper around a relationship is a
perfect example.

For multi-value properties, I like the setProperty API: very useful.

Categorizing/grouping: I realize I might not have been very clear, so let
me provide a different example.

Suppose I have: Actor -- stars in -> Show -- airs on -> Network

I want to keep track of when a given actor started in a given show that airs
on a given Network. I’d like to keep this 3-way relationship constant but
have a property (or something similar) that keeps track of dates when
Actors/Networks change while the relationships and the show remain constant.

Max, for clustering, I checked the RESTful API (very useful), but it still ties
everything to the one app server that Neo is running under.
I won’t be able to run that server as a cluster unless I have some smart logic
that makes a server not running Neo hit the other server that runs Neo via
the REST API, which is messy. It’s definitely a possibility, but I hope there is
a better way.

Regards,

Ilya



Re: [Neo] General questions

2010-05-02 Thread Max De Marzi Jr.
Ilya,

Remember that there are no 3-way relationships (as in types); you have to use a
node for anything that connects more than two objects. So break Show down
into Episodes if you want to track the first (or only) airing of each
episode.

Actor 1 = Node 1 Class Actor
Show 1 = Node 2 Class Show
Network 1 = Node 3 Class Network
Season 1 = Node 4 Class Season
Season 2 = Node 5 Class Season
Episodes 1-12 = Nodes 6-17 Class Episodes
Network 2 = Node 18 Class Network
Episodes 13-24 = Nodes 19-30 Class Episodes

Actor 1 STARRED_ON Show 1
Actor 1 APPEARED_ON Episode 3
Actor 1 APPEARED_ON Episode 20
Episode 1 AIRED_ON Network 1
Episode 20 AIRED_ON Network 2
--

Now, if you want to keep track of Re-Runs on different networks then you
have to change that AIRED_ON relationship into a Node class.

Aired_on 1 PLAYED_ON Network 2
Aired_on 1 IS_A Episode 20

The trick for me has been to find all the many-to-many relationships and
break them down into nodes. If the idea is to eventually have a data
warehouse with slowly changing dimensions, then bake those in as
Instances.
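
To illustrate turning the airing into a node, here is a small plain-Python sketch. It is a toy in-memory graph, not Neo4j code; the helper names, the "Airing" class name, and the dates are made up, while IS_A and PLAYED_ON follow the relationship names used above.

```python
# Toy in-memory graph; node ids and helper names are made up for this sketch.
nodes = {}  # node id -> {"class": ..., plus properties}
rels = []   # (start id, relationship type, end id)


def add_node(node_id, cls, **props):
    nodes[node_id] = {"class": cls, **props}


def relate(start, rel_type, end):
    rels.append((start, rel_type, end))


add_node("episode 20", "Episode")
add_node("network 1", "Network")
add_node("network 2", "Network")

# Each airing is its own node, so it can carry a date and point at a network.
add_node("airing 1", "Airing", aired="2010-01-10")  # made-up date
add_node("airing 2", "Airing", aired="2010-04-18")  # made-up date (a re-run)
relate("airing 1", "IS_A", "episode 20")
relate("airing 1", "PLAYED_ON", "network 2")
relate("airing 2", "IS_A", "episode 20")
relate("airing 2", "PLAYED_ON", "network 1")


def airings_of(episode):
    """Return (date, network) for every airing of an episode, oldest first."""
    result = []
    for start, rel_type, end in rels:
        if rel_type == "IS_A" and end == episode:
            network = next(
                e for s, t, e in rels if s == start and t == "PLAYED_ON"
            )
            result.append((nodes[start]["aired"], network))
    return sorted(result)


print(airings_of("episode 20"))
```

Once the airing is a node, a re-run on another network is just one more airing node; neither the episode nor the networks have to change.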

With the REST API, Neo4j becomes a web server. Have 50 of them if you want,
multicast the writes, load balance the reads, etc.
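
The "multicast the writes, load balance the reads" idea can be sketched as a tiny router in plain Python. This is not Neo4j or REST client code: the Replica and Cluster classes are hypothetical stand-ins for servers behind the REST API, and the sketch ignores failures, ordering, and consistency entirely.

```python
import itertools


class Replica:
    """Stand-in for one server holding a full copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    def __init__(self, replicas):
        self.replicas = list(replicas)
        self._next = itertools.cycle(self.replicas)

    def write(self, key, value):
        # Multicast: every replica receives every write.
        for replica in self.replicas:
            replica.data[key] = value

    def read(self, key):
        # Round-robin: each read goes to the next replica in turn.
        replica = next(self._next)
        return replica.name, replica.data.get(key)


cluster = Cluster(Replica(f"server {i}") for i in range(1, 4))
cluster.write("node 1", {"name": "Actor 1"})
served_by = [cluster.read("node 1")[0] for _ in range(4)]
```

A real setup would also have to handle a replica missing a write, which is where most of the difficulty lies.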





Re: [Neo] Evaluating Neo4J as an enterprise class application

2010-05-02 Thread Raul Raja Martinez
We have stress tested Neo4j with over 500 concurrent users in a webapp with
a smaller dataset and we found no performance issues.
We even wrap their API in a domain layer that adds some extra overhead.
One thing to keep in mind is that if your data ever grows to a point where
it needs to be distributed among machines, you can't do that with
the free version of Neo, but I think they support it with one of the
commercial licenses.

In my experience so far with Neo (5+ months), since it is embedded, if you use
Java you get a much better experience than with any relational DB plus an
ORM layer such as Hibernate. The data is not transported from your DB to a
result set, then to a POJO, then to your view objects.
With Neo the data may already be in memory when you request it, and there is no
JDBC layer between your code and the graph.

We have also purposely crashed the JVM and app, hoping that at some point it
would corrupt the graph, and we have been doing this repeatedly, at least 10
times a day, for the last 5 months. It has always recovered and completed
queued transactions. So far we have not been able to corrupt the graph or
bring it down. Backing up the data is also easy, as a copy of the graph
folder is all you need.

PROS

- Fast
- Easy api
- Reliable
- High Performance in our use cases
- The Neo team is quick to answer questions
- No SQL
- Fast relationship traversals, in the relational world this usually means
JOINs which are not very scalable.
- Ideal for scenarios where there are multiple relationships and
interconnected objects

CONS

- The free version cannot be distributed across multiple machines
- Only one process or JVM can access the graph at a time
- No SQL (if you like SQL)
- Filtered traversals where results should be ordered usually require a full
scan/traversal followed by reordering the results. This does not scale when
pagination is required and there are millions of results. We have worked
around this issue by having a separate index for single ordered relationships.
In a nutshell, this is what your typical relational DB provides as a B-tree
index on properties, which lets you query with ORDER BY quickly. Neo currently
does not have that, so you have to maintain your own indexes if you want
ordered traversals. (Not a trivial task to implement.)
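
As a rough illustration of keeping your own ordered index, here is a minimal sketch in plain Python using the standard bisect module. It is not the implementation described above and not Neo4j code; the OrderedIndex class, its methods, and the sample data are assumptions made up for the example, and a sorted list stands in for the B-tree a relational database would use.

```python
import bisect


class OrderedIndex:
    """Keeps (sort_key, node_id) pairs sorted so pages can be sliced directly."""

    def __init__(self):
        self._entries = []

    def add(self, sort_key, node_id):
        # insort keeps the list sorted on every insert (O(n) per insert here).
        bisect.insort(self._entries, (sort_key, node_id))

    def page(self, offset, limit):
        """Return node ids for one page, already in sort order."""
        return [node_id for _, node_id in self._entries[offset:offset + limit]]


index = OrderedIndex()
for node_id, title in [(1, "delta"), (2, "alpha"), (3, "charlie"), (4, "bravo")]:
    index.add(title, node_id)

first_page = index.page(0, 2)
second_page = index.page(2, 2)
```

The cost moves from read time to write time: each page is a slice of an already sorted structure, so no full traversal and re-sort is needed per request, but the index has to be kept in step with every change to the graph.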






-- 
Raul Raja