from:"Niels Hoogeveen"

Re: [Neo4j] Modeling subrelationships in Neo4j

2011-12-07 Thread Niels Hoogeveen


I think my explanation was not as clear as it should have been. 
I wasn't suggesting to replace the relationships with a node, but to shadow the 
relationshiptypes with a node.
Let's say we have two relationshiptypes, KNOWS and FRIEND, where we want to 
state that friends form a subset of the people a person knows. Additionally we 
have a relationshiptype SUBRELATIONSHIP indicating that a relationshiptype is a 
subtype of another relationshiptype.
For the two relationshiptypes KNOWS and FRIEND, create nodes and store the name 
of the relationshiptype in a property on that node. These two nodes must 
somehow be indexed. You can do that with Lucene, though in my own 
application I have chosen to create a namespace node attached to the reference 
node, and create a relationship from that namespace node to each 
relationshiptype node. This allows for a quick lookup of the relationshiptype 
nodes. 
Additionally a relationship of type SUBRELATIONSHIP should be created from the 
FRIEND node to the KNOWS node. 
Now methods for the retrieval of relationships should be written, so you don't 
fetch just the relationships with a given relationshiptype, but traverse all 
subrelationshiptypes too and fetch all relationships on a node with those 
subrelationshiptypes. 
Example:
pete -- FRIEND -- jake
pete -- FRIEND -- ellen
pete -- KNOWS -- patty
Suppose we want to fetch all the people pete knows.
We traverse the hierarchy of relationshiptypes under KNOWS, and get an Iterable 
with the two relationshiptype nodes associated with KNOWS and FRIEND. Then we 
iterate over these relationshiptype nodes, fetching the relationships on the 
pete-node with the corresponding relationshiptype, thereby returning an 
Iterable with the nodes associated with jake, ellen and patty.
For faster lookups, I have decided to use the id of the relationshiptype node 
as the name of the relationships used, but this is not a requirement for this 
solution.
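
A minimal sketch of the lookup described above, written against plain in-memory maps rather than the Neo4j API so it runs on its own. The class and method names (RelTypeHierarchy, addSubrelationship, relate, typeAndSubtypes, related) are made up for illustration and are not from my application; in a real graph the maps would be the relationshiptype nodes, the SUBRELATIONSHIP relationships, and the relationships on the person nodes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** In-memory stand-in for relationshiptype nodes and the SUBRELATIONSHIP edges between them. */
final class RelTypeHierarchy {
    // supertype name -> names of its direct subtypes
    private final Map<String, List<String>> subtypes = new HashMap<>();
    // person -> (relationshiptype name -> related persons)
    private final Map<String, Map<String, List<String>>> rels = new HashMap<>();

    /** Records that subtype is a SUBRELATIONSHIP of supertype. */
    void addSubrelationship(String subtype, String supertype) {
        subtypes.computeIfAbsent(supertype, k -> new ArrayList<>()).add(subtype);
    }

    /** Records a relationship of the given type from one person to another. */
    void relate(String from, String type, String to) {
        rels.computeIfAbsent(from, k -> new HashMap<>())
            .computeIfAbsent(type, k -> new ArrayList<>())
            .add(to);
    }

    /** The given type plus every type below it in the hierarchy. */
    Set<String> typeAndSubtypes(String type) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(type);
        while (!todo.isEmpty()) {
            String current = todo.poll();
            if (seen.add(current)) {
                todo.addAll(subtypes.getOrDefault(current, Collections.emptyList()));
            }
        }
        return seen;
    }

    /** Everyone related to person by the given type or by any of its subtypes. */
    List<String> related(String person, String type) {
        List<String> result = new ArrayList<>();
        Map<String, List<String>> byType = rels.getOrDefault(person, Collections.emptyMap());
        for (String t : typeAndSubtypes(type)) {
            result.addAll(byType.getOrDefault(t, Collections.emptyList()));
        }
        return result;
    }
}
```

With FRIEND registered under KNOWS and the three relationships from the example, related("pete", "KNOWS") yields patty, jake and ellen, while related("pete", "FRIEND") yields only jake and ellen.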
Niels




 Date: Wed, 7 Dec 2011 18:16:15 +0530
 From: sourajit.ba...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Modeling subrelationships in Neo4j
 
 To Niels' approach,
 Wouldn't it be a very dense graph ? For e.g. there will be several people
 inter-connected by KNOWS; if we model KNOWS as a node, there would be lots
 of edges originating from it.
 
 
 On Wed, Dec 7, 2011 at 5:16 PM, Alistair Jones 
 alistair.jo...@neotechnology.com wrote:
 
  Qualifying the relationships with an additional property (or properties)
  sounds like a sensible approach.
 
  The simplest thing to do would be to have a boolean property to distinguish
  the two types, so they would both have relationship type KNOWS, and also
  a boolean property "well".  You could use this in a cypher query like this:
 
  start Alistair = node(1) match Alistair -[r:KNOWS]- friend where r.well =
  true return friend.name
 
  Alternatively, as Rick suggests, if you wanted a sliding scale of knowing,
  you could have a numerical property, and then do more sophisticated
  traversals.  This is analogous to a weighted graph that you might use for
  route planning, where each of the relationships is weighted with a property
  distance or time.  In cypher:
 
  start Alistair = node(1) match Alistair -[r:KNOWS]- friend where
  r.how_well > 50 return friend.name
 
  This property-based approach is less sophisticated than Niels' true
  relationship-type-hierarchy approach, but I guess it depends on your domain
  what will be most appropriate.  I think using properties is probably
  simpler to implement if it meets your needs.
 
  -Alistair
 
  On 6 December 2011 14:14, Rick Otten rot...@manta.com wrote:
 
   Can you do this with properties on the relationship?
  
   In your example a KNOWS relationship could have a "how well" property,
   with values 1 to 100.
  
   You could define KNOWS_BETTER as  [ 50 < how well < 80 ].
   KNOWS_BEST as [ 80 <= how well <= 100 ].
  
   I'm not sure what the difference between a sub relationship and a
   relationship qualified with properties really is.
  
  
   -Original Message-
   From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org]
   On Behalf Of Sourajit Basak
   Sent: Tuesday, December 06, 2011 6:09 AM
   To: user@lists.neo4j.org
   Subject: [Neo4j] Modeling subrelationships in Neo4j
  
   Is it possible to create subrelationships in neo4j ? For e.g. a
   relationship called KNOWS_BETTER as a subrelationship of KNOWS.
  
   Do I need to explicitly connect the nodes using both relationships for
  the
   traversal to work ? Lets say, I create this
  
   neo4j -- KNOWS_BETTER -- graphDB, does this entails the following ?
   neo4j -- KNOWS -- graphDB.
  
   Such a scenario can be modeled in OWL Ontology, wondering if neo4j has
  any
   capabilities.
  
   Note: Under the hood, most OWL Ontology implementations do create these
   *extra* inferred links internally.
   ___
   Neo4j mailing list
   User@lists.neo4j.org

Re: [Neo4j] Modeling subrelationships in Neo4j

2011-12-06 Thread Niels Hoogeveen


It cannot directly be done through the standard API, but of course it can be 
implemented.
I do this myself in an application I am building. For every RelationshipType, I 
create a Node, and between those Nodes there can be subtyping relationships. 
To make lookup fast, I use the node-id of the RelationshipTypeNodes as the 
RelationshipType name, and give it a more meaningful name by means of a 
property on the RelationshipTypeNode.
This way the Node belonging to a RelationshipType can be fetched without 
overhead and it allows me to change the name of the relationship type. The 
downside of the approach is that relationships have no meaningful name when 
displayed in neoclipse.
Of course you need to write your own methods to fetch relationships from nodes, 
because you may want to fetch not only the ones with the RelationshipType you 
supply, but also those with a RelationshipType that is a subtype thereof.
Niels

 Date: Tue, 6 Dec 2011 16:39:19 +0530
 From: sourajit.ba...@gmail.com
 To: user@lists.neo4j.org
 Subject: [Neo4j] Modeling subrelationships in Neo4j
 
 Is it possible to create subrelationships in neo4j ? For e.g. a relationship
 called KNOWS_BETTER as a subrelationship of KNOWS.
 
 Do I need to explicitly connect the nodes using both relationships for the
 traversal to work ? Lets say, I create this
 
 neo4j -- KNOWS_BETTER -- graphDB, does this entails the following ?
 neo4j -- KNOWS -- graphDB.
 
 Such a scenario can be modeled in OWL Ontology, wondering if neo4j has any
 capabilities.
 
 Note: Under the hood, most OWL Ontology implementations do create these
 *extra* inferred links internally.

Re: [Neo4j] Moving to u...@neo4j.org

2011-11-30 Thread Niels Hoogeveen

Good decision. Immediately signed up.

 From: peter.neuba...@neotechnology.com
 Date: Wed, 30 Nov 2011 13:55:44 +0100
 To: user@lists.neo4j.org
 Subject: [Neo4j] Moving to u...@neo4j.org

 Hi all,
 we are going to move from mailman to google groups,
 http://groups.google.com/a/neo4j.org/groups/dir soon, I just wanted to
 give you a heads-up that I will invite all of the current members to
 http://groups.google.com/a/neo4j.org/group/user/topics?lnk when we are
 ready.

 Just wanted to warn you that there might be a surprising welcome
 message from that group soonish, hope you don't mind!

 Let me know if you have any objections.

 Happy hacking!

 /peter neubauer

 GTalk:  neubauer.peter
 Skype   peter.neubauer
 Phone   +46 704 106975
 LinkedIn   http://www.linkedin.com/in/neubauer
 Twitter  http://twitter.com/peterneubauer

 brew install neo4j && neo4j start
 heroku addons:add neo4j

[Neo4j] collation and wild card queries

2011-11-28 Thread Niels Hoogeveen


In order to have proper sort order for Strings with diacritical characters, I 
started using Lucene's ICUCollationKeyAnalyzer. This indeed gives the proper 
sort order for queries, but for some reason wild card queries no longer seem to 
work. This applies for both the normal CollationKeyAnalyzer and for the ICU 
variant. Exact queries work, but as soon as a wild card is added the query no 
longer returns any results.
Does anyone have an idea how to solve this?
I'd like to be able to have an index that allows both diacritics-aware sort 
order and support for wild cards.
Niels


  

Re: [Neo4j] Neo4j upcoming features importance poll

2011-11-22 Thread Niels Hoogeveen

I noticed work on supernodes being committed to GitHub. Looking forward to 
seeing this in 1.6-SNAPSHOT. I would like to test this sooner rather than later. 
The node#getDegree methods are a great addition.
Niels

 From: peter.neuba...@neotechnology.com
 Date: Tue, 22 Nov 2011 15:51:15 +0100
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Neo4j upcoming features importance poll

 Uservoice seems great. If rapportive uses it,
 http://feedback.rapportive.com/forums/42557-general then it is good in
 my book. I think we should try it if we can integrate this with GIThub
 issues.

 Pablo, impressive feedback on http://www.doodle.com/wg8k77vwq6b654bv !
 I think Mattias will be delighted that the supernode support is on top
 ;)

 Cheers,

 /peter neubauer

 GTalk:  neubauer.peter
 Skype   peter.neubauer
 Phone   +46 704 106975
 LinkedIn   http://www.linkedin.com/in/neubauer
 Twitter  http://twitter.com/peterneubauer

 http://www.neo4j.org  - NOSQL for the Enterprise.
 http://startupbootcamp.org/- Öresund - Innovation happens HERE.

 On Sat, Nov 19, 2011 at 9:29 PM, Peter Bell li...@pbell.com wrote:
  Uservoice might be a good fit. I used it for feature voting on an OSS 
  project and it worked out pretty well...

  Sent from my iPhone

  On Nov 19, 2011, at 2:11 PM, Nigel Small ni...@nigelsmall.name wrote:

  Actually sounds like we may have finally found a use for Google Wave! :-P

  On 19 Nov 2011 13:09, Pablo Pareja ppar...@era7.com wrote:

  Yeah it'd be great having something more wiki-like that everyone could
  edit.
  I have no idea though about how could this be done
  Any ideas?

  Pablo

  On Sat, Nov 19, 2011 at 7:54 PM, Nigel Small ni...@nigelsmall.name
  wrote:

  How about something like Wufoo?

  http://www.wufoo.com/

  *Nigel Small*
  Phone: +44 7814 638 246
  Blog: http://nigelsmall.name/
  GTalk: ni...@nigelsmall.name
  MSN: nasm...@live.co.uk
  Skype: technige
  Twitter: @technige https://twitter.com/#!/technige
  LinkedIn: http://uk.linkedin.com/in/nigelsmall

  On 19 November 2011 18:45, Peter Neubauer
  peter.neuba...@neotechnology.comwrote:

  I really like this. Is there any other transparent public method you poll,
  like a Google form that everyone can edit?
  On Nov 19, 2011 7:19 PM, Pablo Pareja ppar...@era7.com wrote:

  I just added a link for every possible upcoming feature and created an
  issue for those which didn't have one so far.

  Sorry for those who voted already but since the options changed, their vote
  was lost, could you please vote again?
  From now on every time we add a new feature to the poll we should create
  its respective issue before adding it.
  At least, whenever a new option is added, the votes for the rest of options
  are conserved, so we should be able to update our votes just adding our vote
  (or not) to the new ones.
  Sorry for the inconvenience!

  Pablo

  On Sat, Nov 19, 2011 at 5:21 PM, Pablo Pareja ppar...@era7.com
  wrote:

  Ok, I just did that for the first one; the bad thing about this is that
  every time I edit one of the options, all the votes cast for it get lost
  and you have to edit your vote again...
  So maybe from now on I'd be better adding new features to the poll only
  once their respective issues have been raised in github.
  What do you think?

  Pablo

  On Sat, Nov 19, 2011 at 5:16 PM, Pablo Pareja ppar...@era7.com
  wrote:

  Yeah that'd be cool, if you give me the links I can put them as part of
  the options themselves (with bit.ly or something like that).
  Cheers,

  Pablo

  On Sat, Nov 19, 2011 at 1:30 PM, Peter Neubauer 
  peter.neuba...@neotechnology.com wrote:

  Guys,
  This is great! Could you raise issues for these and we mitigate missing
  voting on Github with this, linking back to github for discussion?
  On Nov 19, 2011 1:09 PM, Pablo Pareja ppar...@era7.com
  wrote:

  @Linan get_or_create feature added ;)
  @Mattias I mean being required to specify a node type at creation time,
  (as how things are right now with relationships)

  On Sat, Nov 19, 2011 at 1:01 PM, Mattias Persson
  matt...@neotechnology.comwrote:

  What exactly does mandatory node types mean?

  2011/11/19 Pablo Pareja ppar...@era7.com

  Hi all,

  I was thinking it'd be cool to create a sort of a poll in order to know
  which features (that are missing right now...) are the most important ones
  for the community. I just did a quick google search for free online poll
  creation platforms and found doodle site, (btw do you know a better site to
  do this?).
  The address for the poll is:
  http://www.doodle.com/wg8k77vwq6b654bv
  So far I just added three features that came to my mind while I was
  creating it, so please say which features you're missing and I'll add them
  so that we can all vote for them or not.
  What do you think about all this?
  Cheers,

Re: [Neo4j] Lucene sort with diacritic characters

2011-11-11 Thread Niels Hoogeveen

anyone?

 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Thu, 10 Nov 2011 20:20:46 +0100
 Subject: [Neo4j] Lucene sort with diacritic characters

 When retrieving items from a Lucene index, using the sort method, it seems 
 the order doesn't abide proper rules for sorting diacritic characters. 
 For example, Århus comes later in the list than Zürich and Ḩalab comes later 
 than Žužemberk. 
 Can someone help me solve this?
 Niels   

Re: [Neo4j] Lucene sort with diacritic characters

2011-11-11 Thread Niels Hoogeveen

Thanks Rick, I will try out your suggestions.
Niels

 From: rick.bullo...@thingworx.com
 To: user@lists.neo4j.org
 Date: Fri, 11 Nov 2011 07:33:44 -0700
 Subject: Re: [Neo4j] Lucene sort with diacritic characters

 You probably need to create a custom analyzer using one of Lucene's collation 
 filters (which you will provide as a parameter to the Neo4J index creation 
 method).  Unfortunately, you can't apply a new analyzer after the fact.  I 
 think you'll need to delete and regenerate the index.  Lucene has some 
 built-in language specific collation filters, but there is also a contributed 
 package, ICUCollationKeyFilter, which may have some advantages in terms of 
 performance.  Unfortunately, I do not have direct experience in using either, but 
 hopefully this will help get you pointed in the right direction.

 Rick


Re: [Neo4j] Lucene sort with diacritic characters

2011-11-11 Thread Niels Hoogeveen


It works like a dream.
One note for others needing this functionality. The ICUCollationKeyAnalyzer has 
a constructor which takes a Collator (from icu4j) as argument. Neo4j's index 
requires a constructor without arguments, so it's necessary to wrap the 
ICUCollationKeyAnalyzer and provide it the appropriate Collator in the 
constructor. For me Collator.SECONDARY was the best choice.
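
To illustrate what SECONDARY strength does to the sort order, here is a small sketch that uses the JDK's own java.text.Collator as a stand-in. It is not the ICUCollationKeyAnalyzer wrapper itself (that needs Lucene and icu4j on the classpath), and the class name DiacriticSort, the choice of Locale.ROOT and the explicit canonical decomposition are assumptions made for the example.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Stand-in example: diacritics-aware ordering with the JDK collator. */
final class DiacriticSort {

    /** Root-locale collator where accents matter for ordering but case does not. */
    static Collator secondaryCollator() {
        Collator collator = Collator.getInstance(Locale.ROOT);
        collator.setStrength(Collator.SECONDARY);
        // Decompose accented letters so that e.g. A-with-ring is compared as A plus a ring mark.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator;
    }

    /** Returns a sorted copy of the names using the collator above. */
    static List<String> sorted(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy, secondaryCollator());
        return copy;
    }
}
```

Sorting by plain code point puts Zürich before Århus, which is the behaviour reported at the start of this thread; with the collator the order is Århus, Zürich. Note that a Danish or Swedish locale would deliberately sort Å after Z, so the locale passed to the collator matters.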
Niels


Re: [Neo4j] Function to check whether two nodes are connected?

2011-10-27 Thread Niels Hoogeveen


There is one caveat to this method, you'd have to know which node is most 
densely connected. 

Suppose one of the nodes has 100,000 relationships (incoming and outgoing) and 
the other node has only a few relationships, then you'd want to iterate over 
the relationships of the second node.

A solution could be to iterate over both sets of relationships at the same time:

public boolean areConnected(Node n1, Node n2, RelationshipType relType, Direction dir) {

  Iterator<Relationship> rels1 = n1.getRelationships(relType, dir).iterator();
  Iterator<Relationship> rels2 = n2.getRelationships(relType, dir).iterator();

  // Advance both iterators in lockstep; stop as soon as the shorter one runs out.
  while (rels1.hasNext() && rels2.hasNext()) {
    Relationship rel1 = rels1.next();
    Relationship rel2 = rels2.next();

    if (rel1.getEndNode().equals(n2))
      return true;
    else if (rel2.getEndNode().equals(n1))
      return true;
  }
  return false;
}
 Date: Thu, 27 Oct 2011 18:39:01 +0200
 From: bplsi...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Function to check whether two nodes are connected?
 
 Easy: just one.
 
 For now, I've written this, but I'm still not sure it is the simplest 
 way to write it
 
  public boolean areConnected(Node n1, Node n2, Relationship rel,
          Direction dir) throws Exception {
      Iterable<Relationship> relationships = n1.getRelationships(dir);

      for (Relationship r : relationships) {
          // I am only working with Dynamic Relationships
          if (r.getType().equals(rel.getType())) {
              if (dir == Direction.OUTGOING) {
                  if (r.getEndNode().equals(n2)) { return true; }
              } else {
                  if (r.getStartNode().equals(n2)) { return true; }
              }
          }
      }
      return false;
  }
 
 Bruno
 
 Le 27/10/2011 18:31, Peter Neubauer a écrit :
  Bruno,
  There is no such function low level, but you can use a Shortest path algo to
  check this. What is the maximum length for a path between the nodes?
  On Oct 27, 2011 6:14 PM, Bruno Paiva Lima da Silvabplsi...@gmail.com
  wrote:
 
  Hello there!
  First of all, thanks for the help in all my previous questions, all the
  answers have been helping me to use Neo4j with success.
 
  I have a very simple question, but I haven't found the answer yet...
 
  I'd like to have a function, which signature would be more or less like
  this:
 
  public areTheyConnected(Node *n1*,Node *n2*,Relationship *rel*,Direction
  *dir*)
 
  which returns true iff there is an edge of type *rel*, between *n1* and
  *n2*, in the *dir* direction (the direction has n1 as reference).
 
  Example:
 
  In my graph, I have: Bob knows Tom, Tom knows Peter, Jack knows Tom
 
  areTheyConnected(nodeBob,nodeTom,relKnows,Direction.OUTGOING) returns
  true; (Bob knows Tom)
  areTheyConnected(nodeTom,nodeJack,relKnows,Direction.INCOMING) also
  returns true; (Jack knows Tom)
 
  areTheyConnected(nodeBob,nodeTom,relKnows,Direction.INCOMING) returns
  false; (Tom doesn't know Bob)
 
  Is there an easy method (constant time, or close) for that?
 
  Thank you very much,
  Bruno

Re: [Neo4j] Function to check whether two nodes are connected?

2011-10-27 Thread Niels Hoogeveen


I see I made a bit of a mistake with this one. The gist of the solution 
remains, but I made a mistake dealing with the directions of relationship.
It should be something like this.
public boolean areConnected(Node n1, Node n2, RelationshipType relType, Direction dir) {

  // Seen from n2, the same relationship has the opposite direction.
  Direction dir2 = null;
  if (dir.equals(Direction.INCOMING))
    dir2 = Direction.OUTGOING;
  else if (dir.equals(Direction.OUTGOING))
    dir2 = Direction.INCOMING;
  else dir2 = Direction.BOTH;

  Iterator<Relationship> rels1 = n1.getRelationships(relType, dir).iterator();
  Iterator<Relationship> rels2 = n2.getRelationships(relType, dir2).iterator();

  // Advance both iterators in lockstep; stop as soon as the shorter one runs out.
  while (rels1.hasNext() && rels2.hasNext()) {
    Relationship rel1 = rels1.next();
    Relationship rel2 = rels2.next();

    // getOtherNode works for either direction, unlike getEndNode.
    if (rel1.getOtherNode(n1).equals(n2))
      return true;
    else if (rel2.getOtherNode(n2).equals(n1))
      return true;
  }
  return false;
}
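
The reason the lockstep loop is cheap can be shown without a database. The sketch below is an assumption-laden stand-in: adjacency lists in plain maps replace Node and Relationship, and the names LockstepCheck, connect, hasEdge and steps are invented for the example. It counts loop iterations so the bound by the less connected node is visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for the lockstep connectivity check. */
final class LockstepCheck {
    // node -> nodes it points to, and node -> nodes pointing to it
    private final Map<String, List<String>> outgoing = new HashMap<>();
    private final Map<String, List<String>> incoming = new HashMap<>();

    /** Number of loop iterations used by the most recent hasEdge call. */
    int steps;

    /** Adds a directed edge from -> to. */
    void connect(String from, String to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        incoming.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    /** True if there is an edge from -> to; walks both edge lists at the same time. */
    boolean hasEdge(String from, String to) {
        Iterator<String> out = outgoing.getOrDefault(from, Collections.emptyList()).iterator();
        Iterator<String> in = incoming.getOrDefault(to, Collections.emptyList()).iterator();
        steps = 0;
        // Once either list is exhausted without a match, the edge cannot exist.
        while (out.hasNext() && in.hasNext()) {
            steps++;
            if (out.next().equals(to) || in.next().equals(from)) {
                return true;
            }
        }
        return false;
    }
}
```

If a hub has a thousand outgoing edges and the target has a single incoming one, the loop runs at most once, whichever of the two nodes is passed first.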

Re: [Neo4j] Function to check whether two nodes are connected?

2011-10-27 Thread Niels Hoogeveen

You know me and my obsession for densely connected nodes :-)

 Date: Thu, 27 Oct 2011 17:37:07 +
 From: peter.neuba...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Function to check whether two nodes are connected?

 Good catch Niels, thanks - my brain is in jet lag mode :-\

Re: [Neo4j] Article: The Coming SQL Collapse

2011-10-14 Thread Niels Hoogeveen


Hijack alert (going completely off topic)
I noticed the following statement: all reasoning is best with a linked list 
data structure.
When looking at the underlying store we see that the RelationshipRecord indeed 
forms two linked lists, one for the incoming side of the relationship and one 
for the outgoing side of the relationship, yet the API provides more or less 
the methods of a Set. The insert mechanism for Relationships guarantees that 
the two linked lists cannot contain duplicates (hence they form a set). The 
lists are always prepended, with an entry point to the head of the list stored 
in the NodeRecord.
In the past I have (ad nauseam) proposed a partitioning of the Relationship 
linked lists per Direction per RelationshipType. I am not going to repeat my 
arguments, they can be found here: 
http://lists.neo4j.org/pipermail/user/2011-August/011191.html and in other 
posts to the mailing list around that time.
Partitioning the two linked lists per Direction per RelationshipType, I now 
realize, also makes it possible to treat the two linked lists as 
implementations of the LinkedList interface in a meaningful way. For many 
practical purposes an ordering of Relationships makes little sense when the 
Relationships of a Node are not grouped by some criteria, but once we apply 
such a grouping, ordering starts to make sense. The simplest example I can 
think of is a timeline, where all relationships are either appended or 
prepended to the linked lists (depending on the preferred timeline arrow), so 
each iteration over the Relationships of a certain node for a given Direction 
and RelationshipType returns them in the insert order (or inverse insert 
order).
Supporting all methods of a linked list would also allow for constructs like 
createRelationshipTo(node, SOME_REL, 2, 4), where 2 and 4 represent the 
positions in the linked lists (throwing IndexOutOfBoundsExceptions when 
appropriate).
Since linked list data structures are foundational to the Neo4j engine it would 
make sense to make these structures more explicit in the API, so application 
programmers can take advantage of the inherent ordering of the underlying 
storage. Many applications eventually present information in some default sort 
order, so it would be nice if it were possible to insert relationships 
according to some sort criterion.
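
To make the proposal a bit more concrete, here is a rough in-memory sketch of what partitioned, ordered relationship chains could look like. It is only an illustration under my own assumptions: the class PartitionedChains, the string-based relationship type POSTED and the neighbours method are invented for the example, and nothing here reflects how the Neo4j store or API actually behaves. The four-argument createRelationshipTo mirrors the positional construct mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Illustration only: an in-memory model, not Neo4j's store or API.
final class PartitionedChains {
    enum Direction { OUTGOING, INCOMING }

    static final class Node {
        final String name;
        // One chain per (relationship type, direction) instead of one chain per side.
        private final Map<String, LinkedList<Node>> chains = new HashMap<>();

        Node(String name) { this.name = name; }

        private LinkedList<Node> chain(String type, Direction dir) {
            return chains.computeIfAbsent(type + "/" + dir, k -> new LinkedList<>());
        }

        // Default insert: prepend to both partitions, so iteration is newest first.
        void createRelationshipTo(Node other, String type) {
            chain(type, Direction.OUTGOING).addFirst(other);
            other.chain(type, Direction.INCOMING).addFirst(this);
        }

        // Positional insert: outPos and inPos are indexes into the two partitions.
        // Both positions are checked before anything is modified.
        void createRelationshipTo(Node other, String type, int outPos, int inPos) {
            LinkedList<Node> out = chain(type, Direction.OUTGOING);
            LinkedList<Node> in = other.chain(type, Direction.INCOMING);
            if (outPos < 0 || outPos > out.size() || inPos < 0 || inPos > in.size()) {
                throw new IndexOutOfBoundsException("position outside chain");
            }
            out.add(outPos, other);
            in.add(inPos, this);
        }

        // Names of the neighbours for one type and direction, in chain order.
        List<String> neighbours(String type, Direction dir) {
            List<String> names = new ArrayList<>();
            for (Node n : chain(type, dir)) names.add(n.name);
            return names;
        }
    }

    public static void main(String[] args) {
        Node pete = new Node("pete");
        pete.createRelationshipTo(new Node("a"), "POSTED");
        pete.createRelationshipTo(new Node("b"), "POSTED");
        pete.createRelationshipTo(new Node("c"), "POSTED");
        pete.createRelationshipTo(new Node("d"), "POSTED", 1, 0);
        System.out.println(pete.neighbours("POSTED", Direction.OUTGOING));
    }
}
```

With plain prepending the chain reads as an inverse timeline; the positional variant is what would let an application keep the chain in some other sort order.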
Niels

 From: okramma...@gmail.com
 Date: Fri, 14 Oct 2011 11:28:16 -0600
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Article: The Coming SQL Collapse
 
 Hi,
 
  This is not conducive to Bayesian-based reasoning, evidential reasoning, 
  other forms of logics (classical and non-classical)  
  
  How would you model those to a suitable domain model?
  Can you give a good example?
  Michael
 
 Here is an article that argues for support of other data semantics in the Web 
 of Data (RDF world) beyond description logics. In here, you will find 
 examples of other forms of reasoning.
 
   http://arxiv.org/abs/0905.3378
 
 Unlike the triple/quad-store world, graph databases provide a very generic 
 data model with limited constraints on meaning. Unfortunately (in my 
 opinion), graph databases like OrientDB and DEX employ typing at the graph 
 database level. Neo4j provides it at the Spring Data Graph level -- a level 
 above. This is good in that Neo4j is not pushing a world view too low into the 
 stack. The world of RDF, on the other hand, and its strong bent towards OWL 
 (description logic) makes it such that the entire technology stack is mixed 
 up with this logic. And, while this logic is very cool, it's not the only way 
 to do things -- the only way to view the world.
 
  however, at some point, there is always an assumption, and the 
 foundational assumption of graph databases is all reasoning is best with a 
 linked list data structure.
 
 See ya,
 Marko.
 
 http://markorodriguez.com
 

Re: [Neo4j] Article: The Coming SQL Collapse

2011-10-14 Thread Niels Hoogeveen


I concur. In my opinion Neo4j is more a storage engine with certain storage 
features than a database management system. This is already exemplified by the 
absence of a query language as primary interface. 
The author is therefore wrong in his assessment that there is no separation of 
logical model and physical model. There is no logical model, so the separation 
is complete: any logical model can be bolted onto the physical model, or can be 
stored in a separate repository. 
In general, I think, NOSQL databases are more storage engine than database 
management system. It's exactly the control over storage that forms the niche 
NOSQL databases operate in. Distributed key-value lookup and tree/graph 
traversals are typical application domains where SQL engines don't provide the 
hooks to process certain questions or actions efficiently or scalably. 
Niels

 Date: Sat, 15 Oct 2011 01:12:58 +0200
 From: a...@morgner.de
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Article: The Coming SQL Collapse
 
 My 2 cents:
 
 The Neo4j API is clean, open, and sort of low level by intention. It is 
 neither ugly nor smelly, nor does it violate anything.
 
 Neo4j in general is very stable. But, of course, if you try the latest 
 snapshot, it may have bugs (as any software has).
 
 Since May 2010, we've been developing a CMS based on Neo4j (structr) and 
 doing some graph-related projects.
 
 Coming from the Oracle world, I can only say that working with Neo4j is 
 a revelation.
 
 Axel
 
 
 On 14.10.2011 14:48, Tobias Ivarsson wrote:
  We had an interesting discussion about this internally at Neo Technology
  today. We thought it might be of interest to the broader community. I don't
  think the discussion is over, so it would be interesting to continue it on
  the public mailing list.
 
  It regards the initial paragraphs of an article posted to dzone recently:
  http://www.dzone.com/links/rss/the_coming_sql_collapse.html
  It mentions Neo4j and how the author dislikes a common way of using Neo4j
  for building applications.
 
  It would be interesting to hear suggestions on how to improve this.
 
  Forwarded conversation follows:
 
  On Fri, Oct 14, 2011 at 10:13 AM, Tobias Ivarsson
  tobias.ivars...@neotechnology.com  wrote:
 
  I found this while reading feeds in bed last night:
 
  *The Coming SQL Collapse*
  http://www.dzone.com/links/rss/the_coming_sql_collapse.html
 
 
 
  The things he says about SQL vs NOSQL are not very interesting, but I'd like
  to raise what he says about Neo4j:
 
  I looked at neo4j briefly the other day, and quite predictably thought
  'wow, this looks like a serious tinkertoy: it's basically a bunch of nodes
  where you just blob your attributes.' Worse than that, to wrap objects
  around it, you have to have them explicitly incorporate their node class,
  which is ugly, smelly, violates every law of separation of concerns and
  logical vs. physical models. On the plus side, as I started to look at it
  more, I realized that it was the perfect way to implement a backend for a
  bayesian inference engine (more on that later). Why? Because inference
  doesn't care particularly about all the droll requirements that are settled
  for you by SQL, and there are no real set operations to speak of.
 
  He attacks our pattern of building domain models with Neo4j, calling it
  "ugly, smelly" and in violation of "every law of separation of concerns
  and logical vs. physical models". Is he right? My feeling is that he is
  brainwashed by too many so-called best practices, but since Neo4j has been
  my main model for a long time now, my perspective is likely skewed. I'd like
  to hear your thoughts.
 
  --
  Tobias Ivarsson tobias.ivars...@neotechnology.com
 
  On Fri, Oct 14, 2011 at 10:32 AM, Rickard Öberg
  rickard.ob...@neotechnology.com  wrote:
 
  Well, I'd tend to agree with the author. Mixing persistence details with
  the domain model itself is really a bad idea. Infrastructure details should
  not pollute the domain logic as they do with the currently suggested usage
  of Neo4j. But I think both Spring Data Graph and the Qi4j usage model fix
  this, as they allow you to keep many of those things outside of the domain
  code.
 
  /Rickard
 
  On Fri, Oct 14, 2011 at 11:45 AM, Tobias Ivarsson
  tobias.ivarsson@neotechnology.com  wrote:
 
  On Fri, Oct 14, 2011 at 11:21 AM, Rickard Öberg
  rickard.ob...@neotechnology.com  wrote:
 
  On 10/14/11 17:16 , Tobias Ivarsson wrote:
 
  I was hoping for a bit more elaboration, of why it is a bad idea.
 
  Spring Data Neo4j operates mainly in the same way (at least it did when
  I was part of the design process), it just hides the details of it.
 
  The model we suggest is not to mix infrastructure details (nodes,
  relationships, traversals) with the domain logic. We suggest the domain
  logic be a separate layer, acting on domain data objects (defined as a
  set of interfaces). What we do suggest though

Re: [Neo4j] HyperRelationship example

2011-09-24 Thread Niels Hoogeveen


When I wrote the wiki page for Enhanced-API, I ended up using all the words I 
had spent on the hyperrelationship example, so I decided to keep the original 
page alive, but link it to the enhanced API page.

 Date: Sat, 24 Sep 2011 19:45:47 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] HyperRelationship example
 
 Here you go:
 https://github.com/neo4j/graph-collections/wiki/HyperRelationship-example
 
 Though that page just has a link to:
 https://github.com/neo4j/graph-collections/wiki/Enhanced-API
 
 Bryce
 
 On Sat, Sep 24, 2011 at 5:00 PM, loldrup lold...@gmail.com wrote:
 
 
  Niels Hoogeveen wrote:
  
   I just posted an example on how to use HyperRelationships:
  
  
  https://github.com/peterneubauer/graph-collections/wiki/HyperRelationship-example
  
 
  This link now gives 404. Does it have a new address? If so, what is it?
 
 
  Jon
 

Re: [Neo4j] Modelling with neo4j

2011-09-24 Thread Niels Hoogeveen


You raise interesting questions, most of them very much related to the work I 
did on Enhanced API.

Let me start with the distinction between Node and Relationship, which in my 
opinion too is a bit artificial. I understand when creating a graph database, 
it is helpful to have something like vertices and edges, but indeed see those 
more as modalities of the elements of the graph than as clearly separated 
types. This was one of the reasons to unify all elements of the graph with one 
underlying type.

At the time, I saw two options: 

a) make the graph bipartite, so that all relationships and properties become 
nodes and use relationships only as a hidden linking feature
b) create shadow nodes for relationships and properties when needed and let the 
API handle that transparently

I chose option b for performance reasons. There are likely many 
applications where most of the relationships are simple, i.e. they link two 
nodes while possibly having some properties. Using a bipartite layout for such 
relationships adds nothing, but it takes twice as many links to traverse.

The shadow node solution only treats relationships and properties as special 
(having relationships to them) when that is needed. 

Now to the typing issues. Neo4j has chosen not to add typing features to the 
database and I actually like that. It allows for optional type systems that can 
be used but whose use is not enforced. 

Type systems are nice beasts, especially when dealing with large and complex 
applications, but they impose a development overhead, mostly felt in small 
quick-and-dirty applications. This is true for programming languages, where 
many people prefer a dynamically typed language such as Javascript, Python, 
Ruby or PHP over a statically typed language such as Java, Scala, C# or 
Haskell, and I think it is also true for databases. I think one of the reasons 
NOSQL became so popular is that the type system of an RDBMS adds overhead to 
simple applications. 

An RDBMS needs a type system because the storage layout requires that. Tables 
have a fixed number of columns, where each column has a designated type. While 
this is a great feature when processing massive amounts of similar data, it can 
also make the application brittle. The tight coupling between type system and 
storage layout makes rapid schema evolution difficult.

Neo4j doesn't impose a type system like an RDBMS does, because its storage 
layout doesn't require it. Something is either a node, a relationship or a 
property, but the combinations don't need explicit modelling for the sake of 
storage.

Because of this untyped nature of the database, it now becomes possible to add 
a type system that not only is optional, but can in fact be made as strong or 
as weak as the application demands.

Unfortunately Neo4j doesn't provide all the necessary hooks for a type system, 
another reason why I started Enhanced API. It was not my intention with that 
API to provide a full-fledged type system to Neo4j, but to provide the 
necessary hooks so a type system can be created.

Of course there is some type-creep in Neo4j. Properties and relationships have 
names, which in almost every application are used as types. Say we have several 
nodes we like to use to store information about people, where each of those 
nodes has a property last_name. This property name effectively is used as a 
type. For all nodes the property name will denote the same fact: the last name 
of a person. 

This is not necessarily required by the Neo4j database. Different nodes may use 
the same property name to denote different things even with different 
datatypes. It is possible to have nodes with property name last_name that for 
some nodes is a String while it is an Integer for other nodes. While this is 
possible, I venture this is not all that common. The same property name will 
likely be used to denote the same fact and have the same datatype across the 
graph and therefore in most common cases be used like a type. 

The same applies to relationships, where the name will in general be used to 
denote the same type of relationship. It is unlikely an application will use 
the FRIEND relationship to sometimes denote a friendship between two people 
while at other times using that relationship name to denote the address of a 
building.

This is as far as typing goes in Neo4j, but it is there and means we have to 
incorporate it into the API somehow. 

This is the reason why I decided to add subtyping of relationship-types and 
property-types in the API, a feature that may be of interest to the model you 
describe in your email.

Joe is a janitor at the school.

Here we see three elements: Joe, is janitor at, and the school, which can 
indeed be modeled with two nodes and a relationship.

There is however a more general statement here of the form: person works with 
organization. Suppose we want to store the fact:

Jane is principal of the school. Again we can model this with two

Re: [Neo4j] Modelling with neo4j

2011-09-24 Thread Niels Hoogeveen

 kind of like object blobs), so in our metamodel, they are not 
  stored as nodes, relationships, and properties, but rather, as a JSON blob, 
  serialized as a string to a node property.  That has worked out really 
  well.  When we do need to filter/manipulate those, we do them at the domain 
  level
 
  Just wanted to share some more examples.
 
  Rick
 
  
  From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On Behalf 
  Of Niels Hoogeveen [pd_aficion...@hotmail.com]
  Sent: Saturday, September 24, 2011 9:14 AM
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Modelling with neo4j
 

Re: [Neo4j] Modelling with neo4j

2011-09-24 Thread Niels Hoogeveen

 to parse and traverse.
 
 - We often found that there were data structures in our application domain 
 for which it was OK to be opaque - e.g. although the structures were deep 
 and complex, they did not require searchability or traversability (e.g. they 
 were kind of like object blobs), so in our metamodel, they are not stored as 
 nodes, relationships, and properties, but rather, as a JSON blob, serialized 
 as a string to a node property.  That has worked out really well.  When we do 
 need to filter/manipulate those, we do them at the domain level
 
 Just wanted to share some more examples.
 
 Rick
 
 
 From: user-boun...@lists.neo4j.org [user-boun...@lists.neo4j.org] On Behalf 
 Of Niels Hoogeveen [pd_aficion...@hotmail.com]
 Sent: Saturday, September 24, 2011 9:14 AM
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Modelling with neo4j
 

Re: [Neo4j] Modelling with neo4j

2011-09-24 Thread Niels Hoogeveen


Subtyping works as follows in Enhanced API.
When calling getRelationships(RelationshipType, Direction) or any of its 
alternatives, the API looks up all subtypes of that relationship type and then 
calls getRelationshipTypes(Direction, RelationshipType and its subtypes). All 
you need to do is create a RelationshipType IS_JANITOR_OF and a 
RelationshipType WORKS_FOR and state that the former is a subtype of the 
latter. 
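
The lookup described above can be sketched roughly as follows. This is not the Enhanced API code, just a small in-memory illustration under my own assumptions: SubtypeLookup, declareSubtype, relate and targets are hypothetical names, relationship types are plain strings, direction is left out for brevity, and IS_PRINCIPAL_OF is an invented second subtype.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustration only: hypothetical names, not the Enhanced API itself.
final class SubtypeLookup {
    // supertype name -> names of its direct subtypes
    private final Map<String, Set<String>> directSubtypes = new HashMap<>();
    // (source, type, target) triples standing in for stored relationships
    private final List<String[]> relationships = new ArrayList<>();

    void declareSubtype(String subtype, String supertype) {
        directSubtypes.computeIfAbsent(supertype, k -> new LinkedHashSet<>()).add(subtype);
    }

    void relate(String source, String type, String target) {
        relationships.add(new String[] {source, type, target});
    }

    // The type itself plus all of its transitive subtypes.
    Set<String> withSubtypes(String type) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(type);
        while (!todo.isEmpty()) {
            String t = todo.pop();
            if (seen.add(t)) {
                todo.addAll(directSubtypes.getOrDefault(t, Collections.<String>emptySet()));
            }
        }
        return seen;
    }

    // Targets reachable from source over the given type or any of its subtypes.
    List<String> targets(String source, String type) {
        Set<String> types = withSubtypes(type);
        List<String> result = new ArrayList<>();
        for (String[] r : relationships) {
            if (r[0].equals(source) && types.contains(r[1])) result.add(r[2]);
        }
        return result;
    }

    public static void main(String[] args) {
        SubtypeLookup g = new SubtypeLookup();
        g.declareSubtype("IS_JANITOR_OF", "WORKS_FOR");
        g.declareSubtype("IS_PRINCIPAL_OF", "WORKS_FOR");
        g.relate("Joe", "IS_JANITOR_OF", "school");
        g.relate("Jane", "IS_PRINCIPAL_OF", "school");
        System.out.println(g.targets("Joe", "WORKS_FOR"));
        System.out.println(g.targets("Joe", "IS_PRINCIPAL_OF"));
    }
}
```

Asking for WORKS_FOR finds Joe's IS_JANITOR_OF relationship, while asking for the sibling type IS_PRINCIPAL_OF does not, which is the subset behaviour subtyping is meant to give.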
Haskell type classes are a great mechanism for ad-hoc polymorphism and in some 
ways are preferable to subtyping, though not necessarily in the context of a 
database. They do allow you to say there is a commonality between WORKS_AT 
and IS_JANITOR_OF, but they don't allow you to state that the relationships 
of type IS_JANITOR_OF are a subset of the relationships of type WORKS_AT. 
In a database context the subsumption rule is actually quite important and 
Haskell type classes don't offer that. The combination of type classes and 
subtyping is as far as I know still an open research topic. It is not without 
reason that Scala (which has subtyping) doesn't have type classes, though it 
allows similar constructs through implicit conversions. Working in both 
disciplines at the same time (poor man's type classes through implicit 
conversions in combination with subtyping) seems to be non-trivial. 
Niels

 Date: Sat, 24 Sep 2011 08:09:48 -0700
 From: lold...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Modelling with neo4j
 
 Subtyping of relationship types sounds like the cure to my problems.
 When creating a relationship IS_A_JANITOR_OF, will a corresponding
 relationship type IS_A_JANITOR_OF-relationship-type automatically be
 created?
 
 If I have a simple relationship can I then ask which relationship types its
 type is a subtype of?
 
 Regarding interfaces:
 I took the idea of interfaces from Haskell's type classes, which make great
 sense as interfaces. In Neo4j we could imagine that relationships with types
 WORKS_AT and REFERS_TO might have something in common (e.g. they both have
 to specify a boss who gives them orders).
 
 For now I don't think my problem requires interfaces before it can be
 solved, but I only just started so who knows :)
 
 Jon
 On Sep 24, 2011 3:15 PM, Niels Hoogeveen [via Neo4j Community Discussions]
 ml-node+s438527n3364304...@n3.nabble.com wrote:
 
 
 

Re: [Neo4j] Unrolled Linked List

2011-09-23 Thread Niels Hoogeveen

A quick skim of the code shows me you have a baseNode which is an entry point 
for the ULL. This is a logical candidate node to use for the purpose of locking.
What are the pros and cons to locking the baseNode on every read and write 
operation?
Niels
 Date: Fri, 23 Sep 2011 09:39:38 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Unrolled Linked List

 Good stuff.

 I am presently looking into concurrent use of a given UnrolledLinkedList, at
 least within the same graph database instance; it might be a little bit harder
 in an HA environment.  It's hard enough writing test cases for this, maybe even
 harder than making it work properly!  Hoping that some utility code I am
 going to produce will help with testing concurrency of other data
 structures.

 By concurrent use I mean concurrent use of the data within the graph, not of
 the given instantiation of the class, e.g. what happens when one thread gets
 an instance of ULL based off a given node and is iterating over it, then
 another thread gets an instance of a ULL and writes into it.

 Cheers
 Bryce

 On Fri, Sep 23, 2011 at 4:57 AM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:

  It looks really cool.
  I always find it fun to create something and later find out it is an
  already known construction (something worth inventing).
  Anyway, I pulled your code and will remove the dependencies on the
  Enhanced API stuff this week. After that we can start adding some
  documentation.
  Niels

   Date: Thu, 22 Sep 2011 15:57:13 +1200
   From: bryc...@gmail.com
   To: user@lists.neo4j.org
   Subject: [Neo4j] Unrolled Linked List

   Hi all,

   I have added an in-graph representation of an unrolled linked list to the
   graph collections code, currently just in my GitHub repo:
   https://github.com/brycenz/graph-collections

   See this in particular:

  https://github.com/brycenz/graph-collections/blob/master/src/main/java/org/neo4j/collections/list/UnrolledLinkedList.java

   The name comes from:
   http://en.wikipedia.org/wiki/Unrolled_linked_list

   And it works roughly in the same manner, though I had the idea prior to
   reading the wiki article.

   As the UnrolledLinkedList class implements the NodeCollection interface it
   can be used as the backing of an IndexedRelationship, which is done in tests
   here:

  https://github.com/brycenz/graph-collections/blob/master/src/test/java/org/neo4j/collections/indexedrelationship/TestUnrolledLinkedListIndexedRelationship.java

   The main reason for me being interested in this, and an example of where
   this is (probably) really useful is in the following case:

  - you have a number of tag (or category, folder etc.) nodes
  - they each link to a large number of document (or article, comments,
  post etc.) nodes
  - using a single relationship type
  - you generally only are interested in showing the newest documents in
  descending date order (showing the head, in a paged ui)
  - documents are generally added in ascending date order (added to the
  head)

   The benefits come from being able to iterate over a small percentage of a
   collection of nodes in a fixed order without having to first load all the
   nodes and sort them.  This reduces the amount of data read in from disk,
   reduces the turnover of data in memory, and therefore aids with reduction in
   garbage collection.  In my case I have a large number of tags with a large
   number of items against them, I might read the first 100-200 items out of a
   collection of 30,000 and therefore by not reading in the other 29800
   relationships / nodes (per tag) I should be saving 90% or more. Here's
   hoping.

   From the java doc: The structure is broken into pages of links to nodes
   where the size of the page can be controlled at initial construction time.
   Page size is not fixed but instead can float between a lower bound and an
   upper bound. The bounds are at a fixed margin from the page size of M. When
   a page drops below the lower bound it will be joined onto an adjacent
   page, and when the page goes above the upper bound it will be split in half.

   I am about to do some tests with this based on my use case and will report
   back on the performance impacts.

   Cheers
   Bryce

   P.S. still thinking about how to make this thread safe, any suggestions
   would be appreciated (presently only one thread will be able to write at a
   time, I am worried about a thread reading while another is writing,
   especially when it joins / splits pages or changes the head).
   ___
   Neo4j mailing list
   User@lists.neo4j.org
   https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Unrolled Linked List

2011-09-23 Thread Niels Hoogeveen


Read integrity is really a dog. We haven't even begun to address that in the 
other collections. 
With regard to write locks (and this is something we should check in 
SortedTree too), consider code like:
 page.setProperty( ITEM_COUNT, ((Integer) page.getProperty( ITEM_COUNT )) + 1 );
This is only thread-safe if the value returned by page.getProperty( ITEM_COUNT ) 
is read from a locked node.
Niels
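
The point about ITEM_COUNT can be illustrated with a minimal plain-Java sketch. It is not Neo4j code: a map stands in for the page's properties, a ReentrantLock stands in for the write lock on the node, and PageCounter with all of its methods is a hypothetical name. What matters is that the lock is taken before the read, so the value being incremented cannot be stale.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a page node; not part of the Neo4j API.
class PageCounter {
    static final String ITEM_COUNT = "item_count";

    private final Map<String, Object> properties = new HashMap<>();
    private final ReentrantLock lock = new ReentrantLock();

    PageCounter() {
        properties.put(ITEM_COUNT, 0);
    }

    // Take the lock BEFORE reading, then read, modify and write while holding it.
    void increment() {
        lock.lock();
        try {
            int current = (Integer) properties.get(ITEM_COUNT);
            properties.put(ITEM_COUNT, current + 1);
        } finally {
            lock.unlock();
        }
    }

    int count() {
        lock.lock();
        try {
            return (Integer) properties.get(ITEM_COUNT);
        } finally {
            lock.unlock();
        }
    }

    // Starts `threads` workers that each increment `perThread` times,
    // waits for them, and returns the final count.
    static int runConcurrently(int threads, int perThread) {
        PageCounter page = new PageCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    page.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return page.count();
    }
}
```

If the read happened outside the lock, two threads could both read the same count and one increment would be lost.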

 Date: Sat, 24 Sep 2011 09:14:07 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Unrolled Linked List
 
 For writing that works well, and in fact for add node I am doing that (just
 realised I am not for remove node but I should be).  The problems for
 reading are:
 
- it should allow multiple threads to read at the same time
- it shouldn't dictate that client code has a transaction in order to
read
 
 As a simple solution that's probably workable (and probably the safest), and
 it means that HA will just work, but restricting access to one thread at a
 time for a given node collection isn't the best.
 
 Maybe the client code should set whether it locks the data structure when
 reading, or fails with a ConcurrentModificationException when reading and
 data is changed.
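
The second option (failing with a ConcurrentModificationException) could look roughly like the sketch below. It is a single-threaded, in-memory illustration under stated assumptions: VersionedCollection is a hypothetical class, and the version counter stands in for something like a property on the base node that every write would bump. It says nothing about how graph-collections actually behaves.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical in-memory stand-in for a node collection; not Neo4j code.
class VersionedCollection implements Iterable<String> {
    private final List<String> items = new ArrayList<>();
    private int version = 0; // bumped on every structural change

    void add(String item) {
        items.add(item);
        version++;
    }

    @Override
    public Iterator<String> iterator() {
        final int expected = version; // version seen when iteration started
        return new Iterator<String>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < items.size();
            }

            @Override
            public String next() {
                // Fail fast if the collection changed since iteration began.
                if (version != expected) {
                    throw new ConcurrentModificationException();
                }
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return items.get(index++);
            }
        };
    }

    // Returns true if a write made during iteration is detected by the reader.
    static boolean detectsWriteDuringRead() {
        VersionedCollection collection = new VersionedCollection();
        collection.add("a");
        collection.add("b");
        Iterator<String> reader = collection.iterator();
        reader.next();
        collection.add("c"); // simulated writer
        try {
            reader.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

The reader needs no lock in this variant, at the price of having to restart when a writer gets in between.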
 

Re: [Neo4j] Unrolled Linked List

2011-09-22 Thread Niels Hoogeveen

It looks really cool. 
I always find it fun to create something and later find out it is an already 
known construction (something worth inventing).
Anyway, I pulled your code and will remove the dependencies on the Enhanced 
API stuff this week. After that we can start adding some documentation.
Niels

 Date: Thu, 22 Sep 2011 15:57:13 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: [Neo4j] Unrolled Linked List

 Hi all,

 I have added an in-graph representation of an unrolled linked list to the
 graph collections code, currently just in my GitHub repo:
 https://github.com/brycenz/graph-collections

 See this in particular:
 https://github.com/brycenz/graph-collections/blob/master/src/main/java/org/neo4j/collections/list/UnrolledLinkedList.java

 The name comes from:
 http://en.wikipedia.org/wiki/Unrolled_linked_list

 And it works roughly in the same manner, though I had the idea prior to
 reading the wiki article.

 As the UnrolledLinkedList class implements the NodeCollection interface it
 can be used as the backing of an IndexedRelationship, which is done in tests
 here:
 https://github.com/brycenz/graph-collections/blob/master/src/test/java/org/neo4j/collections/indexedrelationship/TestUnrolledLinkedListIndexedRelationship.java

 The main reason for me being interested in this, and an example of where
 this is (probably) really useful is in the following case:

- you have a number of tag (or category, folder etc.) nodes
- they each link to a large number of document (or article, comments,
post etc.) nodes
- using a single relationship type
- you generally only are interested in showing the newest documents in
descending date order (showing the head, in a paged ui)
- documents are generally added in ascending date order (added to the
head)

 The benefits come from being able to iterate over a small percentage of a
 collection of nodes in a fixed order without having to first load all the
 nodes and sort them.  This reduces the amount of data read in from disk,
 reduces the turnover of data in memory, and therefore aids with reduction in
 garbage collection.  In my case I have a large number of tags with a large
 number of items against them, I might read the first 100-200 items out of a
 collection of 30,000 and therefore by not reading in the other 29800
 relationships / nodes (per tag) I should be saving 90% or more. Here's
 hoping.

 From the java doc: The structure is broken into pages of links to nodes
 where the size of the page can be controlled at initial construction time.
 Page size is not fixed but instead can float between a lower bound, and an
 upper bound. The bounds are at a fixed margin from the page size of M. When
 a page drops below the lower bound it will be joined onto an adjacent
 page, and when the page goes above the upper bound it will be split in half.
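
The paging rule quoted from the javadoc can be sketched in memory as follows. This is plain Java and not the graph-collections implementation: UnrolledListSketch, its constructor parameters and its method names are all assumptions made for illustration, and the pages are ordinary lists rather than graph nodes.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory illustration only; the real class stores its pages in the graph.
class UnrolledListSketch<T> {
    private final int upperBound;
    private final int lowerBound;
    // Each page holds a short run of items; pages are kept in list order.
    private final List<List<T>> pages = new ArrayList<>();

    UnrolledListSketch(int pageSize, int margin) {
        this.upperBound = pageSize + margin;
        this.lowerBound = pageSize - margin;
    }

    // Adds at the head; the head page is split if it grows past the upper bound.
    void addFirst(T item) {
        if (pages.isEmpty()) {
            pages.add(new ArrayList<>());
        }
        pages.get(0).add(0, item);
        splitIfTooLarge(0);
    }

    // Removes an item; a page that drops below the lower bound is joined
    // onto an adjacent page (and re-split if the result is too large).
    boolean remove(T item) {
        for (int i = 0; i < pages.size(); i++) {
            List<T> page = pages.get(i);
            if (page.remove(item)) {
                if (page.size() < lowerBound && pages.size() > 1) {
                    // `keep` is the page that absorbs its successor.
                    int keep = (i + 1 < pages.size()) ? i : i - 1;
                    pages.get(keep).addAll(pages.remove(keep + 1));
                    splitIfTooLarge(keep);
                }
                return true;
            }
        }
        return false;
    }

    // Reads the first n items, touching only as many pages as needed.
    List<T> firstN(int n) {
        List<T> result = new ArrayList<>();
        for (List<T> page : pages) {
            for (T item : page) {
                if (result.size() == n) {
                    return result;
                }
                result.add(item);
            }
        }
        return result;
    }

    int pageCount() {
        return pages.size();
    }

    private void splitIfTooLarge(int index) {
        List<T> page = pages.get(index);
        if (page.size() > upperBound) {
            int mid = page.size() / 2;
            List<T> tail = new ArrayList<>(page.subList(mid, page.size()));
            page.subList(mid, page.size()).clear();
            pages.add(index + 1, tail);
        }
    }
}
```

With newest items added at the head, firstN only walks the leading pages, which is the saving described above for reading 100-200 items out of 30,000.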

 I am about to do some tests with this based on my use case and will report
 back on the performance impacts.

 Cheers
 Bryce

 P.S. still thinking about how to make this thread safe, any suggestions
 would be appreciated (presently only one thread will be able to write at a
 time, I am worried about a thread reading while another is writing,
 especially when it joins / splits pages or changes the head).

Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface

2011-09-20 Thread Niels Hoogeveen


Hi Bryce,
Sorry for the late response.
I understand it's difficult to come up with a really good use-case for making 
NodeCollection more general in the context of IndexedRelationships, but I like 
to think of that interface as something we can eventually use for all sorts of 
collections, not just the ones derived from SortedTree. 
There is of course the issue that relationships cannot attach to 
relationships, so collections of relationships will need to be addressed by Id. 
This is not necessarily a bad thing, because it decouples the container and the 
elements. In other words the container knows what elements it contains, but the 
elements don't know in what containers they are placed. 
Another option would be to create shadow nodes for contained relationships. 
Instead of adding a relationship to the collection, its shadow node is added 
and both the shadow node and the relationship contain pointers (properties with 
Id values) towards each other.
I think it would be best if we do indeed create a GraphCollection interface 
parameterized by <T extends PropertyContainer>, even if that type parameter for 
now is always a Node. It doesn't add much complexity to do it now; if we skip 
it we may regret that later, and by then it will be harder to do because there 
is an installed base.
Niels
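
A minimal sketch of what such a parameterized interface might look like is given below. To keep it compilable on its own, PropertyContainer, Node and Relationship are tiny stand-ins for the Neo4j types of the same names, and GraphCollection, its methods and ListBackedCollection are hypothetical; none of this is the actual graph-collections code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-ins for the Neo4j types of the same names, so the sketch compiles alone.
interface PropertyContainer {
    long getId();
}

interface Node extends PropertyContainer {
}

interface Relationship extends PropertyContainer {
}

// Hypothetical interface: a collection over any kind of property container.
interface GraphCollection<T extends PropertyContainer> extends Iterable<T> {
    void add(T element);

    boolean remove(T element);
}

// Hypothetical narrowing, so node-only code can keep using a NodeCollection type.
interface NodeCollection extends GraphCollection<Node> {
}

// Toy list-backed implementation, only to show that the types line up.
class ListBackedCollection<T extends PropertyContainer> implements GraphCollection<T> {
    private final List<T> elements = new ArrayList<>();

    @Override
    public void add(T element) {
        elements.add(element);
    }

    @Override
    public boolean remove(T element) {
        return elements.remove(element);
    }

    @Override
    public Iterator<T> iterator() {
        return elements.iterator();
    }
}
```

The type parameter costs little today, and a collection of relationships later becomes a matter of choosing T rather than changing the interface.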

 Date: Sat, 17 Sep 2011 14:19:04 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Neo4j graph collections introduction of NodeCollection   
 interface
 
 Hi Niels,
 
 I had wondered about having a collection interface that covered both nodes
 and relationships.  There were a couple of reasons I didn't go with that
 right now, though well worthwhile discussing it and going with a
 GraphCollection super interface if it fits properly.
 
 Firstly I wanted to get something out there so people could have a look,
 and having something that matched what IndexedRelationship currently
 required was easiest first step.  Biggest thing specific in there to that
 functionality is the addNode method returning a relationship.
 
 The other issue was more wondering how a relationship collection would work.
  Say I have a relationship collection, and I have a relationship R1 between
 node A and B, how am I going to represent that relationship within some
 graph based data structure that makes sense.  There could be a node X that
 is part of the relationship collection data structure (e.g. tree) and that
 node could have an attribute that has the relationship id on it, but that
 doesn't seem like it would be very performant.  There could be a
 relationship between X and A that also gave the relationship type of R1, so
 you could find the relationship based on that, but there isn't
 any guarantee of the relationship type being unique.  What it would need to
 properly model it is the ability to have a relationship between X and R1,
 i.e. a relationship from a node to a relationship.
 
 If instead of being able to add any given relationship to the relationship
 collection you instead restrict it to being relationships matching a certain
 criteria from a given node then it is practically the same thing as a
 relationship expander.
 
 Or if you instead have a way through the relationship collection to create
 relationships from a given node to a set of other arbitrary nodes, with the
 relationship collection having a fixed relationship type and direction, then
 that is practically the current IndexedRelationship.
 
 I guess a way it could work is similar to IndexedRelationship, basically
 more general case of that class, where you have a method on the relationship
 collection createRelationship(startNode, endNode, relationshipType,
 direction) that was then stored in an internal data structure to create a
 pseudo relationship between the start and end, and then being able to
 iterate over this set of relationships.  Not sure exactly what the use case
 of that would be.  Maybe of more interest could be the same situation where
 the relationship type and direction are fixed, then you may have a friend
 of set of relationships that you create between arbitrary nodes and then
 iterate over all of those.
 
 I can't personally think of a good way of adding a set of arbitrary
 relationships into a collection stored in a graph data structure.
 
 Thoughts?
 
 Cheers
 Bryce
 
 P.S. Peter, I had thought to remove the passing in of the graph database and
 instead just getting it from the node, or only passing in the graph database
 and creating the node internally.
 
 On Sat, Sep 17, 2011 at 2:19 AM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:
 
 
  Hi Bryce,
  I really like what you are trying to achieve here.
  One question:
  Instead of having NodeCollection, why not have GraphCollection<T extends
  PropertyContainer>? That way we can have collections of both Relationships
  and Nodes.
  Niels
 
   Date: Fri, 16 Sep 2011 17:37:29 +1200
   From: bryc...@gmail.com
   To: user@lists.neo4j.org
   Subject: [Neo4j] Neo4j graph

Re: [Neo4j] Radix tree

2011-09-16 Thread Niels Hoogeveen

Peter,
I agree, we need to work on documentation once the dust has settled around the 
changes Bryce has been working on.
Niels

 From: peter.neuba...@neotechnology.com
 Date: Fri, 16 Sep 2011 13:59:07 +0200
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Radix tree

 Yes, great work Davide!

 I think we need to start enabling documentation for the graph
 collections as it is evolving pretty rapidly!

 Cheers,

 /peter neubauer

 GTalk:  neubauer.peter
 Skype   peter.neubauer
 Phone   +46 704 106975
 LinkedIn   http://www.linkedin.com/in/neubauer
 Twitter  http://twitter.com/peterneubauer

 http://www.neo4j.org   - Your high performance graph database.
 http://startupbootcamp.org/- Öresund - Innovation happens HERE.
 http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

 On Thu, Sep 15, 2011 at 1:48 PM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:

  Thanks to the good work of Davide, graph-collections now contains an 
  implementation of a radix tree. See: http://en.wikipedia.org/wiki/Radix_tree
  This particular data structure can be used to store nodes sorted by a String 
  value, which is very handy when you want to create associative arrays in Neo4j.
  Niels

Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface

2011-09-16 Thread Niels Hoogeveen


Bryce's point makes perfect sense. The argument graphDb().createNode() gives 
the constructor an instance of Node, which contains a reference to the 
database, so there is no real need to additionally supply the database instance.
Of course his example would have been less confusing if he'd written:

Node indexedNode = graphDb().createNode(); 
SortedTree st = new SortedTree( graphDb(), indexedNode, new IdComparator(), 
true, RelTypes.INDEXED_RELATIONSHIP.name() );

 From: peter.neuba...@neotechnology.com
 Date: Fri, 16 Sep 2011 15:22:27 +0200
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Neo4j graph collections introduction of NodeCollection   
 interface
 
 Also,
 since you can do node.getGraphDatabase(), I think we don't need to
 pass in the graphdb instance in
 new SortedTree( graphDb(), graphDb().createNode(), new IdComparator(),
 true, RelTypes.INDEXED_RELATIONSHIP.name() );
 ?
 
 Cheers,
 
 /peter neubauer
 
 GTalk:  neubauer.peter
 Skype   peter.neubauer
 Phone   +46 704 106975
 LinkedIn   http://www.linkedin.com/in/neubauer
 Twitter  http://twitter.com/peterneubauer
 
 http://www.neo4j.org   - Your high performance graph database.
 http://startupbootcamp.org/- Öresund - Innovation happens HERE.
 http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
 
 
 
 On Fri, Sep 16, 2011 at 7:37 AM, Bryce bryc...@gmail.com wrote:
  Hi,
 
  I had mentioned in a previous thread that I was working on introducing a
  NodeCollection interface to remove the dependency from IndexedRelationship
  to SortedTree.  I have an initial cut of this up now in my github repo:
  https://github.com/brycenz/graph-collections
 
  It would be great to get community feedback on this as I think that having a
  well designed and common NodeCollection interface would help for multiple
  use cases, e.g. sortedTreeNodeCollection.addAll(linkedListNodeCollection)
  doing exactly what you think it would.
 
  IndexedRelationship now takes a node to index relationships from, a
  relationship type, and a direction, as well as a NodeCollection at creation
  time.  As in the unit tests this then leads to:
 
  Node indexedNode = graphDb().createNode();
  SortedTree st = new SortedTree( graphDb(), graphDb().createNode(), new
  IdComparator(), true, RelTypes.INDEXED_RELATIONSHIP.name() );
 
  IndexedRelationship ir = new IndexedRelationship( indexedNode,
  RelTypes.INDEXED_RELATIONSHIP, Direction.OUTGOING, st );
 
  To create the IndexedRelationship.  To later add nodes to the relationship
  you need to create an instance of IndexedRelationship without the
  NodeCollection:
 
  IndexedRelationship ir = new IndexedRelationship( indexedNode,
  RelTypes.INDEXED_RELATIONSHIP, Direction.OUTGOING );
 
 
  What this means from a NodeCollection implementation point of view is that
  firstly it needs to use the NodeCollection.RelationshipType.VALUE
  relationship to connect from its internal data structure to the nodes being
  added to the collection, and it needs to be able to recreate itself from a
  base node that is passed into a constructor (that only takes the base node).
   A node collection also needs to store its class name on the base node for
  later construction purposes, as well as any other data required to recreate
  the NodeCollection instance (in the case of SortedTree this is the
  comparator class, the tree name, and whether it is a unique index).
 
  Niels, you may want to have a good look over SortedTree, I have made a few
  changes to it, mainly around introduction of a base node, and changing of
  the end value relationships.  This could be cleaned up better, but I wanted
  to start with minimal changes.
 
  Both IndexedRelationship and IndexedRelationshipExpander have no
  dependencies on SortedTree now, and should work with any properly
  implemented NodeCollection.  I will be putting together a paged linked list
  NodeCollection next to try this.
 
  Some future thoughts for NodeCollection, addition of as many of the
  java.util.Collection methods (e.g. addAll, removeAll, retainAll, contains,
  containsAll) as well as an abstract base NodeCollection to help provide
  non-optimised support for these methods.
 
  Cheers
  Bryce

Re: [Neo4j] Neo4j graph collections introduction of NodeCollection interface

2011-09-16 Thread Niels Hoogeveen


Hi Bryce,
I really like what you are trying to achieve here. 
One question:
Instead of having NodeCollection, why not have GraphCollection<T extends 
PropertyContainer>? That way we can have collections of both Relationships and 
Nodes.
Niels

 Date: Fri, 16 Sep 2011 17:37:29 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: [Neo4j] Neo4j graph collections introduction of NodeCollection   
 interface
 
 Hi,
 
 I had mentioned in a previous thread that I was working on introducing a
 NodeCollection interface to remove the dependency from IndexedRelationship
 to SortedTree.  I have an initial cut of this up now in my github repo:
 https://github.com/brycenz/graph-collections
 
 It would be great to get community feedback on this as I think that having a
 well designed and common NodeCollection interface would help for multiple
 use cases, e.g. sortedTreeNodeCollection.addAll(linkedListNodeCollection)
 doing exactly what you think it would.
 
 IndexedRelationship now takes a node to index relationships from, a
 relationship type, and a direction, as well as a NodeCollection at creation
 time.  As in the unit tests this then leads to:
 
 Node indexedNode = graphDb().createNode();
 SortedTree st = new SortedTree( graphDb(), graphDb().createNode(), new
 IdComparator(), true, RelTypes.INDEXED_RELATIONSHIP.name() );
 
 IndexedRelationship ir = new IndexedRelationship( indexedNode,
 RelTypes.INDEXED_RELATIONSHIP, Direction.OUTGOING, st );
 
 To create the IndexedRelationship.  To later add nodes to the relationship
 you need to create an instance of IndexedRelationship without the
 NodeCollection:
 
 IndexedRelationship ir = new IndexedRelationship( indexedNode,
 RelTypes.INDEXED_RELATIONSHIP, Direction.OUTGOING );
 
 
 What this means from a NodeCollection implementation point of view is that
 firstly it needs to use the NodeCollection.RelationshipType.VALUE
 relationship to connect from its internal data structure to the nodes being
 added to the collection, and it needs to be able to recreate itself from a
 base node that is passed into a constructor (that only takes the base node).
  A node collection also needs to store its class name on the base node for
 later construction purposes, as well as any other data required to recreate
 the NodeCollection instance (in the case of SortedTree this is the
 comparator class, the tree name, and whether it is a unique index).
 
 Niels, you may want to have a good look over SortedTree, I have made a few
 changes to it, mainly around introduction of a base node, and changing of
 the end value relationships.  This could be cleaned up better, but I wanted
 to start with minimal changes.
 
 Both IndexedRelationship and IndexedRelationshipExpander have no
 dependencies on SortedTree now, and should work with any properly
 implemented NodeCollection.  I will be putting together a paged linked list
 NodeCollection next to try this.
 
 Some future thoughts for NodeCollection, addition of as many of the
 java.util.Collection methods (e.g. addAll, removeAll, retainAll, contains,
 containsAll) as well as an abstract base NodeCollection to help provide
 non-optimised support for these methods.
 
 Cheers
 Bryce

[Neo4j] Radix tree

2011-09-15 Thread Niels Hoogeveen


Thanks to the good work of Davide, graph-collections now contains an 
implementation of a radix tree. See: http://en.wikipedia.org/wiki/Radix_tree
This particular data structure can be used to store nodes sorted by a String 
value, which is very handy when you want to create associative arrays in Neo4j.
Niels 
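
For readers unfamiliar with the structure, here is a small in-memory radix tree used as an associative array. It is only a sketch of the general technique and not Davide's implementation: RadixTreeSketch and its put/get methods are hypothetical names, and values are plain Java objects rather than nodes.

```java
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of a radix tree; not the graph-collections code.
class RadixTreeSketch<V> {
    private static final class Entry<V> {
        String label; // edge label leading to this entry
        V value;      // null when no key ends here
        Map<Character, Entry<V>> children = new TreeMap<>(); // sorted by first char

        Entry(String label, V value) {
            this.label = label;
            this.value = value;
        }
    }

    private final Entry<V> root = new Entry<>("", null);

    void put(String key, V value) {
        Entry<V> node = root;
        String rest = key;
        while (true) {
            if (rest.isEmpty()) {
                node.value = value;
                return;
            }
            Entry<V> child = node.children.get(rest.charAt(0));
            if (child == null) {
                node.children.put(rest.charAt(0), new Entry<>(rest, value));
                return;
            }
            int common = commonPrefixLength(rest, child.label);
            if (common < child.label.length()) {
                // Split the edge: keep the shared prefix, push the remainder down.
                Entry<V> lower = new Entry<>(child.label.substring(common), child.value);
                lower.children = child.children;
                child.label = child.label.substring(0, common);
                child.value = null;
                child.children = new TreeMap<>();
                child.children.put(lower.label.charAt(0), lower);
            }
            node = child;
            rest = rest.substring(common);
        }
    }

    V get(String key) {
        Entry<V> node = root;
        String rest = key;
        while (!rest.isEmpty()) {
            Entry<V> child = node.children.get(rest.charAt(0));
            if (child == null || !rest.startsWith(child.label)) {
                return null;
            }
            rest = rest.substring(child.label.length());
            node = child;
        }
        return node.value;
    }

    private static int commonPrefixLength(String a, String b) {
        int limit = Math.min(a.length(), b.length());
        int i = 0;
        while (i < limit && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }
}
```

Keys that share a prefix share a path, so "test" and "team" both hang off a single "te" edge.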

Re: [Neo4j] regarding supernode

2011-09-09 Thread Niels Hoogeveen


Peter, 
I'd gladly put out a piece of code demonstrating the use of IndexedRelationship, 
using this LIVES_IN example, though I get the impression the question here 
relates to the normal relationship index. However, when supernodes (I still 
don't like that term for densely connected nodes) come into play, the normal 
relationship index doesn't offer much help.
Niels

 From: pe...@neubauer.se
 Date: Fri, 9 Sep 2011 09:58:28 +0200
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] regarding supernode
 
 Linan,
 I think your example would be great for doing a JUnit test showing
 this. Niels, could you do that, plz? In that case, I can add a
 graph and explanations to it.
 
 /peter
 
 On Thu, Sep 8, 2011 at 11:25 PM, Linan Wang tali.w...@gmail.com wrote:
  On Wed, Sep 7, 2011 at 5:21 PM, Linan Wang tali.w...@gmail.com wrote:
  hi,
  I don't quite understand RelationshipIndex and RelationshipExpander.
  Say I have a supernode city (Beijing) that 10 million users link to
  through the relationship LIVES_IN. How should I index it? Should it be
  something like:
  RelationshipIndex idx = db.index().forRelationships(CITY_LIVES_IN);
  idx.add(rel, LIVES_IN, Beijing);
  if so, what's the advantage over this?
  Index<Node> idx = db.index().forNodes(CITY_LIVES_IN);
  idx.add(user, LIVES_IN, beijing);
  (I read source code of LuceneIndex.java, found out that the
  implementation of the add method is shared between Index<Node> and
  RelationshipIndex.)
  ok, i answer my own question:
  RelationshipIndex has the function query which takes startNode and
  endNode as extra parameters.
  so if traverse only depth 1, it could be faster than using Traverser.
  am i right here? (please say yes!)
  then the question is how to take advantage of it for more than 1?
 
 
  about RelationshipExpander. i don't see how RelationshipIndex could
  help combining with RelationshipExpander, when use
  GraphAlgoFactory.shortestPath(RelationshipExpander expander, int
  maxDepth)?
 
  thanks for help!
 
  --
  Best regards
 
  Linan Wang
 
 
 
 
  --
  Best regards
 
  Linan Wang

Re: [Neo4j] Issues with IndexedRelationship

2011-09-08 Thread Niels Hoogeveen

Excellent... I did a code review and think this is a huge improvement over what 
we had.
Peter, can you pull these changes? I no longer have the privileges to do so.
Niels

 Date: Thu, 8 Sep 2011 17:24:44 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Issues with IndexedRelationship

 I have made the changes to SortedTree regarding relationships
 vs nodes, and have got all the tests passing.  The changes are pushed up to
 my github account (and a pull request has been raised).

 The changes can be seen here:
 https://github.com/brycenz/graph-collections

 On Thu, Sep 8, 2011 at 3:41 PM, Bryce bryc...@gmail.com wrote:

  Another thought if there is going to be a larger refactor of the code is
  whether the indexing mechanism should be broken out as a strategy for the
  IndexedRelationship.  At present it is tied to SortedTree, but if an
  interface was extracted out that had addNode, removeNode, iterator, and
  isUniqueIndex then other indexing implementations could be used in certain
  cases.

  The particular other implementation I am currently thinking of that could
  be of use to me would be a paged linked list.  So that would have a linked
  list of pages, each with min < x < max KEY_VALUE (or equivalent)
  relationships.  I think that could work quite well for the situation where
  the index is descending date ordered, and generally just appended at the
  most recent end, and results are retrieved in a paged manner generally from
  near the most recent.

  But more to the point there could be any number of implementations that
  would be good for given different situations.

  That does bring up a question though, there was some discussion a while ago
  about some functionality along the lines of IndexedRelationship being pulled
  into the core, so is that overkill for now if there is going to be another
  core offering later?

  On Thu, Sep 8, 2011 at 2:38 PM, Niels Hoogeveen pd_aficion...@hotmail.com
   wrote:

  I think we don't have to worry about backwards compatibility much yet.
  There has not been a formal release of the component, so if there are people
  using the software, they will accept that they are bleeding edgers.
  Indeed addNode should return the KEY_VALUE relationship and I think we
  should change the signature of SortedTree to turn it into
  Iterable<Relationship>. No need to maintain a Node iterator, the node is
  always one getEndNode away.
  Niels

   Date: Thu, 8 Sep 2011 14:17:59 +1200
   From: bryc...@gmail.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] Issues with IndexedRelationship

   Will have to experiment with changing my ids to be stored as longs, it does
   make perfect sense really that it would be better.  Thanks for the hint.

   In regards to SortedTree returning the KEY_VALUE relationship instead of the
   end Node, I had thought of that too, and it would definitely help.  Could
   end up being a significant change to SortedTree though, e.g.:
 sortedTree.addNode( node );
   Would need to return the KEY_VALUE relationship instead of a boolean.  Which
   not knowing where else SortedTree is used could be a large change?

   Maybe SortedTree would have two iterators available: a key_value
   relationship iterator, and a node iterator.  Having a quick look at it now
   it seems that it could work ok that way without introducing much code
   duplication.

   On Thu, Sep 8, 2011 at 12:46 PM, Niels Hoogeveen
   pd_aficion...@hotmail.comwrote:

Two longs is certainly cheaper than a string. Two longs take 128 bit
  and
are stored in the main record of the PropertyContainer, while a String
  would
require a 64 bit pointer in the main record of the
  PropertyContainer, and
an additional read in the String store where the string representation
  will
take up 256 bits. So both memory-wise, as perfomance wise, it is
  better to
store a UUID as two long values.

The main issue is something that needs a deeper fix than adding ID's.
SortedTree now returns Nodes when traversing the tree. We should
  however
return the KEY_VALUE Relationship to the indexed Node. Then
IndexedRelationship.DirectRelationship can be created with that
  relationship
as an argument. We get the Direction and the RelationshipType for
  free.
Niels

 Date: Thu, 8 Sep 2011 11:36:11 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Issues with IndexedRelationship

 Hi Niels,

 Sorry I didn't quite write the bit about (1) clearly enough.  The
  problem
is
 that it presently throws an Exception where it shouldn't.

 This stems from IndexedRelationship.DirectRelationship:
 this.endRelationship = endNode.getSingleRelationship(
 SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );

 So if the end node has more than one incoming KEY_VALUE relationship
  a
more

Re: [Neo4j] Issues with IndexedRelationship

2011-09-08 Thread Niels Hoogeveen

I like this idea

 Date: Thu, 8 Sep 2011 15:41:52 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Issues with IndexedRelationship

 Another thought if there is going to be a larger refactor of the code is
 whether the indexing mechanism should be broken out as a strategy for the
 IndexedRelationship.  At present it is tied to SortedTree, but if an
 interface was extracted out that had addNode, removeNode, iterator, and
 isUniqueIndex then other indexing implementations could be used in certain
 cases.
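A minimal sketch of what such an extracted interface might look like. Only the four method names (addNode, removeNode, iterator, isUniqueIndex) come from the message; the generic parameter, the return types and the TreeSet-backed stand-in are assumptions made so the example runs without Neo4j.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

final class IndexStrategySketch {

    /** Hypothetical strategy interface; only the four method names come from the message. */
    interface IndexStrategy<N> extends Iterable<N> {
        boolean addNode(N node);     // false if a unique index already holds the node
        boolean removeNode(N node);  // false if the node was not indexed
        boolean isUniqueIndex();
    }

    /** Sorted, unique stand-in for the role SortedTree plays, backed by a TreeSet. */
    static final class SortedSetIndex<N> implements IndexStrategy<N> {
        private final TreeSet<N> entries;

        SortedSetIndex(Comparator<N> order) {
            this.entries = new TreeSet<>(order);
        }

        @Override public boolean addNode(N node) { return entries.add(node); }
        @Override public boolean removeNode(N node) { return entries.remove(node); }
        @Override public boolean isUniqueIndex() { return true; }
        @Override public Iterator<N> iterator() { return entries.iterator(); }
    }

    /** Callers depend only on the interface, so the index implementation can be swapped. */
    static <N> List<N> contents(IndexStrategy<N> index) {
        List<N> result = new ArrayList<>();
        for (N node : index) result.add(node);
        return result;
    }

    public static void main(String[] args) {
        IndexStrategy<Long> byDateDescending =
                new SortedSetIndex<>(Comparator.<Long>reverseOrder());
        byDateDescending.addNode(20110907L);
        byDateDescending.addNode(20110908L);
        System.out.println(contents(byDateDescending));
    }
}
```

In the real component the element type would presumably be a Neo4j Node or Relationship; the point of the sketch is only that IndexedRelationship would hold an IndexStrategy rather than a SortedTree.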

 The particular other implementation I am currently thinking of that could be
 of use to me would be a paged linked list.  So that would have a linked list
 of pages, each with min < x < max KEY_VALUE (or equivalent) relationships.
 I think that could work quite well for the situation where the index is in
 descending date order, is generally just appended to at the most recent end,
 and results are generally retrieved in a paged manner from near the most
 recent.
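The paged linked list idea can be sketched in memory as follows. This is an illustration under assumptions, not the proposed Neo4j implementation: entries are plain timestamps, pages only enforce a maximum size (the minimum bound is not modelled), and the class and method names are made up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical paged list: newest page first, newest entry first within each page. */
final class PagedRecentList {
    private final int pageSize;
    private final Deque<Deque<Long>> pages = new ArrayDeque<>();

    PagedRecentList(int pageSize) {
        if (pageSize < 1) throw new IllegalArgumentException("pageSize must be positive");
        this.pageSize = pageSize;
    }

    /** Appends at the most recent end; assumes timestamps arrive in non-decreasing order. */
    void add(long timestamp) {
        Deque<Long> newest = pages.peekFirst();
        if (newest == null || newest.size() == pageSize) {
            newest = new ArrayDeque<>();
            pages.addFirst(newest);   // only the head of the page list is touched
        }
        newest.addFirst(timestamp);
    }

    /** Returns up to limit entries in descending order, starting from the newest. */
    List<Long> mostRecent(int limit) {
        List<Long> result = new ArrayList<>();
        for (Deque<Long> page : pages) {
            for (Long timestamp : page) {
                if (result.size() == limit) return result;
                result.add(timestamp);
            }
        }
        return result;
    }

    int pageCount() { return pages.size(); }

    public static void main(String[] args) {
        PagedRecentList list = new PagedRecentList(2);
        for (long t = 1; t <= 5; t++) list.add(t);
        System.out.println(list.mostRecent(3));
    }
}
```

Appending and reading the first page never walk older pages, which is the property that makes the structure attractive for the "append at the recent end, read from the recent end" workload described above.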

 But more to the point, there could be any number of implementations, each
 suited to a different situation.

 That does bring up a question though, there was some discussion a while ago
 about some functionality along the lines of IndexedRelationship being pulled
 into the core, so is that overkill for now if there is going to be another
 core offering later?

Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Niels Hoogeveen


Great work Bryce,

I do have a question though. What is the rationale for the restriction 
mentioned under 1)? Do you need this for the general case (to make 
IndexedRelationshipExpander work correctly), or do you need it for your own 
application to throw that exception? If the latter is the case, I think it 
would be important to tease out the general case and offer this new behaviour 
as an option.

A unique key for the index is a good idea anyway and can be added to 
SortedTree. Generate a UUID and store it in two long properties. That way the 
two values will always be read in the first fetch of the underlying 
PropertyContainer. A getId method on the TreeNodes can then return a String 
representation of the two long values.
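A small sketch of the UUID-as-two-longs idea, using only java.util.UUID. The property names uuid_msb and uuid_lsb are hypothetical, and a plain Map stands in for the Neo4j PropertyContainer so the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

final class UuidAsLongs {
    // Hypothetical property names; a Map stands in for a Neo4j PropertyContainer.
    static final String MSB = "uuid_msb";
    static final String LSB = "uuid_lsb";

    /** Stores the 128-bit UUID as two long properties. */
    static void store(Map<String, Object> properties, UUID id) {
        properties.put(MSB, id.getMostSignificantBits());
        properties.put(LSB, id.getLeastSignificantBits());
    }

    /** Rebuilds the UUID from the two longs and returns its String form. */
    static String getId(Map<String, Object> properties) {
        long msb = (Long) properties.get(MSB);
        long lsb = (Long) properties.get(LSB);
        return new UUID(msb, lsb).toString();
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new HashMap<>();
        UUID id = UUID.randomUUID();
        store(properties, id);
        System.out.println(getId(properties).equals(id.toString()));
    }
}
```

The String form is only produced on demand in getId, so nothing but the two longs has to be persisted.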
IndexedRelationships are a relatively new development, so I think you are one 
of the first to actually try them out. Personally I have chosen to work 
directly with SortedTree, because I am working within the framework of a 
wrapper API, so I can integrate the functionality behind the regular 
createRelationshipTo and getRelationships methods.

I don't think API changes will be an issue at the moment.

Niels

 Date: Thu, 8 Sep 2011 10:22:11 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: [Neo4j] Issues with IndexedRelationship
 
 Hi,
 
 As I mentioned a while ago I am looking at using IndexedRelationships
 within my application.  The major thing that was missing for me to be able
 to do this was IndexedRelationshipExpander being able to provide all the
 relationships from the leaf end of indexed relationships through to the
 root end.  So I have been working on getting that support in there.
 
 However in writing this I have discovered a number of other issues that I
 have also fixed, and at least one I am still working on.  Since I was right
 into the extra support for expanding the relationships it is hard to break
 out these fixes as a separate commit (which I think would be ideal), so it
 will most likely all come in together hopefully later today (NZ time).
 
 Just letting everyone know in case someone else is doing development against
 indexed relationships.
 
 Quick run down of the issues, note: N -- IR(X) -- {A,B} below means there
 is an indexed relationship from N to A and B, of type X.
 
 1) Exception thrown when more than one IR terminates at a given node, e.g.:
 N1 -- IR(X) -- {A,B,C,D}
 N2 -- IR(X) -- {A,X,Y,Z}
 Will throw an exception when using the IndexedRelationshipExpander on either
 N1, or N2.
 
 2) Start / End nodes are transposed when the IR has a direction of
 incoming, i.e. the IR is created against N but across a set of incoming
 relationships:
 N -- IR(Y) -- {A,B,C}
 Will return 3 relationships N -- A, N -- B, N -- C.
 
 I have written tests for each of these, as well as a couple of other tests.
 
 Still completing (1) and have a little question about this.  In order to fix
 this I may need to introduce a unique ID stored against the IR both at the
 root and at the leaves.  Currently the relationship type is used to name the
 IR at both root and leaves, but in the case above that means you can't tell
 from node A which KEY_VALUE relationship belongs to which IR tree without
 traversing the tree.
 
 So the question is: adding this ID would mean that anyone who is already
 using this won't have the ID, and therefore, without care, their data will be
 incompatible with the updated code.  This could be managed via a check for
 the ID when accessing the tree and, if it isn't there, doing a walk over the
 tree to populate all the places where it is required.
 
 In general in developing against this code where do we sit on data
 compatibility and API compatibility?
 
 Cheers
 Bryce
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user
  

Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Niels Hoogeveen


Two longs are certainly cheaper than a string. Two longs take 128 bits and are 
stored in the main record of the PropertyContainer, while a String would 
require a 64-bit pointer in the main record of the PropertyContainer, and an 
additional read in the String store, where the string representation will take 
up 256 bits. So both memory-wise and performance-wise, it is better to store a 
UUID as two long values.


The main issue is something that needs a deeper fix than adding IDs. 
SortedTree currently returns Nodes when traversing the tree. We should instead 
return the KEY_VALUE Relationship to the indexed Node. Then 
IndexedRelationship.DirectRelationship can be created with that relationship as 
an argument. We get the Direction and the RelationshipType for free.

Niels

 Date: Thu, 8 Sep 2011 11:36:11 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Issues with IndexedRelationship
 
 Hi Niels,
 
 Sorry I didn't quite write the bit about (1) clearly enough.  The problem is
 that it presently throws an Exception where it shouldn't.
 
 This stems from IndexedRelationship.DirectRelationship:
 this.endRelationship = endNode.getSingleRelationship(
 SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );
 
 So if the end node has more than one incoming KEY_VALUE relationship, a
 "more than one relationship" exception is thrown.
 
 Instead of the getSingleRelationship I was planning on iterating over the
 relationships and matching the UUID stored at the root end of the IR with
 one of the KEY_VALUE relationships (which is why using a unique id is
 necessary rather than the relationship type).  Note: there will actually
 still be an issue if the same IR has multiple relationships to the same leaf
 node - still thinking about that might need .
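A sketch of the lookup described above: iterate over all incoming KEY_VALUE relationships of a leaf and keep those whose stored id matches the id of the IR being expanded. The KeyValueRel class and its treeId field are stand-ins invented for this example; in Neo4j the id would be read from a property on the relationship.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class KeyValueLookup {

    /** Minimal stand-in for an incoming KEY_VALUE relationship tagged with its tree's id. */
    static final class KeyValueRel {
        final String treeId;   // hypothetical property identifying the owning IR tree
        final String leaf;
        KeyValueRel(String treeId, String leaf) {
            this.treeId = treeId;
            this.leaf = leaf;
        }
    }

    /** Returns every KEY_VALUE relationship on the leaf that belongs to the given tree. */
    static List<KeyValueRel> forTree(List<KeyValueRel> incoming, String treeId) {
        List<KeyValueRel> matches = new ArrayList<>();
        for (KeyValueRel rel : incoming) {
            if (rel.treeId.equals(treeId)) matches.add(rel);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Node A is a leaf of two different indexed relationships (N1 and N2 in issue 1).
        List<KeyValueRel> incomingOnA = Arrays.asList(
                new KeyValueRel("tree-of-N1", "A"),
                new KeyValueRel("tree-of-N2", "A"));
        System.out.println(forTree(incomingOnA, "tree-of-N1").size());
    }
}
```

Returning a list rather than a single relationship leaves room for the open case mentioned above, where the same IR has more than one relationship to the same leaf node.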
 
 Is storing the UUID as two longs much quicker than storing it as a string?
 Curious about this since in my current model I have all the domain objects
 with UUIDs, and these are all stored as strings.  If it was going to help
 with either memory or performance then I would be keen to migrate this to
 two longs.
 
 Cheers
 Bryce
 

Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Niels Hoogeveen


I think we don't have to worry about backwards compatibility much yet. There 
has not been a formal release of the component, so if there are people using 
the software, they will accept that they are on the bleeding edge.

Indeed addNode should return the KEY_VALUE relationship, and I think we should 
change the signature of SortedTree to turn it into an Iterable<Relationship>. 
No need to maintain a Node iterator; the node is always one getEndNode away.
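To illustrate why a separate Node iterator is unnecessary, here is a sketch that derives the node view from a relationship iterator on demand. The Rel class is a stand-in that models only getEndNode; the map helper is an assumption of this example, not part of SortedTree.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Function;

final class EndNodeIteration {

    /** Stand-in for a Neo4j Relationship; only getEndNode is modelled. */
    static final class Rel {
        private final String endNode;
        Rel(String endNode) { this.endNode = endNode; }
        String getEndNode() { return endNode; }
    }

    /** Derives a second iterator from the first by mapping each element on demand. */
    static <A, B> Iterator<B> map(Iterator<A> source, Function<A, B> mapper) {
        return new Iterator<B>() {
            @Override public boolean hasNext() { return source.hasNext(); }
            @Override public B next() { return mapper.apply(source.next()); }
        };
    }

    public static void main(String[] args) {
        Iterator<Rel> relationships = Arrays.asList(new Rel("A"), new Rel("B")).iterator();
        Iterator<String> nodes = map(relationships, Rel::getEndNode);
        while (nodes.hasNext()) System.out.println(nodes.next());
    }
}
```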
Niels

 Date: Thu, 8 Sep 2011 14:17:59 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Issues with IndexedRelationship
 
 Will have to experiment with changing my ids to be stored as longs; it does
 make perfect sense really that it would be better.  Thanks for the hint.
 
 In regards to SortedTree returning the KEY_VALUE relationship instead of the
 end Node, I had thought of that too, and it would definitely help.  It could
 end up being a significant change to SortedTree though, e.g.:
   sortedTree.addNode( node );
 would need to return the KEY_VALUE relationship instead of a boolean.  Not
 knowing where else SortedTree is used, that could be a large change?
 
 Maybe SortedTree could have two iterators available: a KEY_VALUE
 relationship iterator and a node iterator.  Having a quick look at it now,
 it seems that it could work OK that way without introducing much code
 duplication.
 

Re: [Neo4j] IndexedRelationship some observations and questions

2011-09-05 Thread Niels Hoogeveen


Hi Bryce,

Sorry for my belated response. I have been away for a couple of days and wasn't 
able to check my emails.

I am glad you took the time to look into the IndexedRelationship module. It 
certainly could use some scrutiny.

Remarks:
1) Good catch. The unit test didn't catch this because it runs in the same 
package as IndexedRelationship itself. I didn't catch it in user code either, 
because personally I prefer to call SortedTree directly.
2) Agreed. It should be possible to define more than one IndexedRelationship 
per node.
3) I haven't tried out an anonymous inner class as Comparator. As far as I can 
tell, any object implementing Comparator<Node> should be able to work as a 
comparator.

Questions:
1) That is certainly an option. IndexedRelationships, however, offer you the 
possibility to sort your Relationships based on some value associated with a 
node (for example the creation/edit date of the document). This may be a reason 
to use IndexedRelationships even in the situation where you have fewer than 500 
entries per tag (though it would be possible to do that sorting in memory too).
2) The end node of an IndexedRelationship is always referred to by a 
Relationship with RelationshipType KEY_VALUE, and has a property tree_name 
(both are defined in SortedTree). The tree_name property has the same value as 
the RelationshipType.name used in IndexedRelationship. To traverse from a leaf 
node to the tree root, keep following the incoming relationships: KEY_VALUE 
(there is only one), KEY_ENTRY (there can be many), SUB_TREE (there can be 
many), TREE_ROOT (there is only one).
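The leaf-to-root walk from answer 2 can be sketched like this. The four relationship type names come from the message; the Incoming class, the Map used as a graph, and the assumption that each node on the way up has exactly one incoming tree relationship are simplifications made for the example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class LeafToRoot {
    enum RelType { KEY_VALUE, KEY_ENTRY, SUB_TREE, TREE_ROOT }

    /** One incoming tree relationship: its type and the node it starts from. */
    static final class Incoming {
        final RelType type;
        final String startNode;
        Incoming(RelType type, String startNode) {
            this.type = type;
            this.startNode = startNode;
        }
    }

    /** Follows incoming relationships upward until a TREE_ROOT relationship is crossed. */
    static String findRoot(Map<String, Incoming> incomingByNode, String leaf) {
        Set<String> visited = new HashSet<>();   // guards against malformed, cyclic data
        String current = leaf;
        while (visited.add(current)) {
            Incoming incoming = incomingByNode.get(current);
            if (incoming == null) break;
            current = incoming.startNode;
            if (incoming.type == RelType.TREE_ROOT) return current;
        }
        throw new IllegalStateException("no TREE_ROOT relationship found above " + leaf);
    }

    public static void main(String[] args) {
        Map<String, Incoming> graph = new HashMap<>();
        graph.put("document", new Incoming(RelType.KEY_VALUE, "entry"));
        graph.put("entry", new Incoming(RelType.KEY_ENTRY, "innerTreeNode"));
        graph.put("innerTreeNode", new Incoming(RelType.SUB_TREE, "topTreeNode"));
        graph.put("topTreeNode", new Incoming(RelType.TREE_ROOT, "tag"));
        System.out.println(findRoot(graph, "document"));
    }
}
```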
Niels


 Date: Fri, 2 Sep 2011 11:44:40 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: [Neo4j] IndexedRelationship some observations and questions
 
 Hi,
 
 I have been looking at performance options for Neo4j as presently I have
 been observing a number of performance issues.  I am still investigating the
 way to get the best performance out of what I am doing, and one cause
 might be longer-running transactions stopping other work going on (but
 that's an aside to what this message is about).
 
 One of the things that I investigated using was the IndexedRelationship work
 by Niels.  Thought I would give a bit of feedback, although I haven't quite
 got this implemented at present.
 
 1) I had to change the IndexedRelationshipExpander to be a public class in
 order to use it outside the package it's in.
 
 2) IndexedRelationship assumes only one tree root per node, whereas the
 expander allows for multiple (IndexedRelationship uses getSingleRelationship
 vs the expander using getRelationships then matching on tree name).  Having
 multiple would obviously be good, as it means you could have two types of
 relationships covered by IndexedRelationships.
 
 3) It might pay to make it clear in the Javadocs for IndexedRelationship that
 the comparator can't be an anonymous inner class.
 
 Then I have some questions about usage of this.  First a little background
 on the model I have; from reading a few things it seems quite standard.
 There are a lot of document nodes, each of which has a relationship with
 multiple tag nodes.  Documents generally have in the order of 10-20 tags,
 and tags can have as few as 1 document and sometimes tens of thousands.
 When tags are viewed through the UI they are almost always displayed with a
 descending date ordered list of documents.  It seemed to me to fit quite well
 with IndexedRelationship.
 
 1) I was thinking of having a switch over point at say around 500 documents
 for a given node where I will switch from using normal relationships to an
 IndexedRelationship as I was thinking at small numbers of relationships
 normal relationships would be quicker.  Would that be correct, or not worth
 it?
 
 2) On the tag end (which is the incoming end of the document-tag
 relationship) I was going to use an IndexedRelationshipExpander, which would
 cover the case of whether the relationship was done through normal
 relationships or through an IndexedRelationship.  I also need to get a set
 of tags from the document end, where there may be both normal relationships
 and relationships coming from multiple IndexedRelationships.  From looking
 at it, IndexedRelationshipExpander doesn't cover the reverse direction, but I
 would imagine using a relationship expander here would be correct.  What
 would be the best way of doing this?
 
 As an aside it may be a good idea to note in the configuration settings
 page:
 http://wiki.neo4j.org/content/Configuration_Settings#Optimizing_for_traversals_example
 that -XX:+UseNUMA
 only works when using the Parallel Scavenger garbage collector (default
 or -XX:+UseParallelGC) not the concurrent mark and sweep one.  Based on
 
 Cheers
 Bryce

Re: [Neo4j] Hyperedges in Neo4j

2011-09-01 Thread Niels Hoogeveen


Correct, Turing completeness is not the lower bound for non-guaranteed 
termination.

It is however possible to have some forms of recursion without sacrificing 
guaranteed termination. Neo4j traversals that remember visited paths, 
relationships or nodes are an example. (Note: it would be nice to have an 
option to remember visited (Node, RelationshipType, Direction) triples.) This 
limited form of recursion is useful as a query language. Doing so of course 
eliminates some correct statements.

When remembering nodes, the statement john (FRIEND_OF, OUTGOING) pete 
(FRIEND_OF, OUTGOING) john can no longer be returned, but we could remember 
relationships instead of nodes. This makes the former statement possible, but 
makes it impossible to return the statement john (FRIEND_OF, OUTGOING) pete 
(FRIEND_OF, OUTGOING) john (FRIEND_OF, OUTGOING) pete, unless john has more 
than one outgoing FRIEND_OF relationship with pete. (Remembering (Node, 
RelationshipType, Direction) would make that statement impossible even in the 
presence of more than one FRIEND_OF relationship from john to pete.)

While repeated paths in the graph are in principle true statements of the graph 
grammar, in many practical programming tasks we are not interested in such 
statements and in fact like to see them eliminated.
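The difference between remembering nodes and remembering relationships can be made concrete with a small in-memory sketch. It is not Neo4j's traversal framework: the Rel class, the Uniqueness enum and the paths method are invented for this example, and uniqueness is applied along the current path only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class UniquenessSketch {
    enum Uniqueness { NODE, RELATIONSHIP }

    /** A directed FRIEND_OF relationship; identity equality keeps parallel relationships distinct. */
    static final class Rel {
        final String from;
        final String to;
        Rel(String from, String to) { this.from = from; this.to = to; }
    }

    /** Returns every path (as a node sequence) from start that respects the uniqueness rule. */
    static List<List<String>> paths(List<Rel> rels, String start, Uniqueness mode) {
        List<List<String>> found = new ArrayList<>();
        List<String> path = new ArrayList<>(Arrays.asList(start));
        Set<Object> remembered = new HashSet<>();
        if (mode == Uniqueness.NODE) remembered.add(start);
        expand(rels, start, mode, path, remembered, found);
        return found;
    }

    private static void expand(List<Rel> rels, String node, Uniqueness mode,
                               List<String> path, Set<Object> remembered,
                               List<List<String>> found) {
        for (Rel rel : rels) {
            if (!rel.from.equals(node)) continue;
            Object key = (mode == Uniqueness.NODE) ? rel.to : rel;
            if (!remembered.add(key)) continue;   // already on this path: prune
            path.add(rel.to);
            found.add(new ArrayList<>(path));
            expand(rels, rel.to, mode, path, remembered, found);
            path.remove(path.size() - 1);         // backtrack
            remembered.remove(key);
        }
    }

    public static void main(String[] args) {
        List<Rel> rels = Arrays.asList(new Rel("john", "pete"), new Rel("pete", "john"));
        System.out.println(paths(rels, "john", Uniqueness.NODE));
        System.out.println(paths(rels, "john", Uniqueness.RELATIONSHIP));
    }
}
```

Both modes terminate because the remembered set can only grow up to the finite number of nodes or relationships; the relationship mode simply admits longer paths, and adding a second john-to-pete relationship admits the john, pete, john, pete path as described above.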
Niels

 From: okramma...@gmail.com
 Date: Thu, 1 Sep 2011 08:17:27 -0600
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Hyperedges in Neo4j
 
 Hey,
 
  I think a traversal should in principle be performed with a query language 
  that is not Turing-complete so we can guarantee termination.
 
 Turing completeness is not the lower bound for non-guaranteed termination. 
 You can't guarantee completion in a regular language when your String (data 
 structure) is a graph. E.g.
 
   a*
 
 The only languages guaranteed to complete are Star-free languages. That is, 
 those that don't allow for recursion.
 
 See ya,
 Marko.
 
 http://markorodriguez.com

Re: [Neo4j] API adventures in Scalaland

2011-08-29 Thread Niels Hoogeveen


Peter,

I haven't put this code out yet. It has been too much in flux to share.

I use neoclipse for visualization, which helps to check the layout of the test 
graphs I am using. I would need something more programmable for the visual 
output, since I use the node ids of the types as property names and 
relationship types. This allows types to be renamed, put in namespaces, and 
moved from one namespace to another. Using real names for properties and 
relationship types is not flexible enough. As a result the visualization in 
neoclipse looks pretty cryptic, having only numbers as labels.

I will look into neo4j/neoviz to see if I can export the graph with proper 
names for the relationships and properties; otherwise I can always roll my own 
output program. Dot is not the most complex file format to generate.

Niels

 Date: Mon, 29 Aug 2011 07:19:35 +0200
 From: peter.neuba...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] API adventures in Scalaland
 
 Niels,
 Is that Scala code in the graph collections? If you want, you could use the
 neo4j/neoviz project to output .dot graphs at any point and thus visualize
 what's happening in the graph to illustrate :)
 
 /Peter
 
 On Monday, August 29, 2011, Niels Hoogeveen pd_aficion...@hotmail.com
 wrote:
 
  In the last week I have been working on a Neo4j API in Scala, taking
 navigation in the graph as primary.
 
  Just like the Enhanced API written in Java, the Scala API generalizes each
 element (Node, Relationship, RelationshipType, property name and property
 value) of the Neo4j database as being a Vertex.
 
  Before digging into the details of the Scala API, let's start with some
 example code.
 
 val name = Db(String("name"))
 val friend = Db(VertexOut("FRIEND"))
 val john = Db(NewVertex).put(name, "John")
 val pete = Db(NewVertex).put(name, "Pete").put(friend, john)
 
  This piece of code defines the PropertyType "name" and the EdgeType
 "FRIEND", creates two vertices for the persons John and Pete, and states
 that John is a friend of Pete.
 
  In standard Neo4j API this could have been written as:
 
   Node john = db.createNode();
   Node pete = db.createNode();
   john.setProperty("name", "John");
   pete.setProperty("name", "Pete");
   pete.createRelationshipTo(john,
       DynamicRelationshipType.withName("FRIEND"));
 
  Apart from an obvious style difference, there is one immediate difference
 noticeable between the two API's.
 
  In the Neo4j API it is possible to write:
 
   john.setProperty("name", "John");
   pete.setProperty("name", 99);
 
  While the following Scala program won't typecheck:
 
   pete.put(name, 99) //ERROR
 
  The name property is defined as a String and the API enforces that type.
 This also applies when fetching a property value. In the Neo4j API we write:
 
   john.getProperty("name")  // returns java.lang.Object
 
  In the Scala API we write:
 
   john(name)   // returns java.lang.String
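For readers more at home in Java, the same compile-time guarantee can be approximated with a generic key type. This is only an analogy to the Scala API described above: Key, Vertex, put and get are names invented for the sketch, and a HashMap stands in for the database.

```java
import java.util.HashMap;
import java.util.Map;

final class TypedProperties {

    /** A key that carries the value type, loosely analogous to the PropertyType above. */
    static final class Key<T> {
        final String name;
        final Class<T> type;
        Key(String name, Class<T> type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Stand-in vertex whose properties can only be written and read through typed keys. */
    static final class Vertex {
        private final Map<String, Object> properties = new HashMap<>();

        <T> Vertex put(Key<T> key, T value) {
            properties.put(key.name, value);
            return this;
        }

        <T> T get(Key<T> key) {
            return key.type.cast(properties.get(key.name));
        }
    }

    public static void main(String[] args) {
        Key<String> name = new Key<>("name", String.class);
        Vertex john = new Vertex().put(name, "John");
        String value = john.get(name);   // no cast needed at the call site
        System.out.println(value);
        // john.put(name, 99);           // would not compile: 99 is not a String
    }
}
```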
 
  It is also possible to ask for the name of Pete's friend as follows:
 
   pete(friend andThen name)
 
  This is equal to the Neo4j call:
 
 
  
   (String) pete.getSingleRelationship(
       DynamicRelationshipType.withName("FRIEND"),
       Direction.OUTGOING).getEndNode().getProperty("name")
 
  If Pete has more than one friend, we have to define a different key to
 fetch them (the VertexOut key refers to one single relationship, either the
 one to be created, or a single existing relationship):
 
   val friends = Db(Vertices("FRIEND"))
 
  We now have a key to all FRIEND relationships so we can ask:
 
   pete(friends andThen name)
 
  This returns an Iterable[String] with the names of all Pete's friends.
 
  We can even do:
 
   pete(Rec(friends) andThen name)
 
  This returns an Iterable[String] with the names of all Pete's friends of
 friends (to the n-th degree). The Rec object recursively applies friend to
 all vertices it traverses, remembering already taken paths, traversed
 Relationships or traversed Nodes (settings are optional with sensible
 defaults).
 
  We can also write:
 
   pete(Rec(friend, 2) andThen name)
 
  This returns an Iterable[String] with the names of all Pete's friends of
 friends (to the 2nd degree)
 
  It is even possible to write:
 
   pete(Rec(friend andThen friend) andThen name)
 
  This returns an Iterable[String] with the names of all Pete's friends of
 friends (to the n-th degree where n is even)
 
  Instead of having get methods for properties and relationships and
 traversal methods on nodes, the Scala API uses one calling pattern for all
 database related objects:
 
  object(traverser)
 
  So a call like Db(String(name)) is not just a call on the database to
 return a PropertyType with name name and datatype String, it is a
 traversal from the database to that PropertyType. What is being returned
 with that call is a traverser itself.
 
  Traversers can be composed with andThen, so the output of one traverser
 is used as input for the next traverser.
 
  All traversers are typed

[Neo4j] API adventures in Scalaland

2011-08-28 Thread Niels Hoogeveen


In the last week I have been working on a Neo4j API in Scala that treats 
navigation through the graph as its primary concern.

Just like the Enhanced API written in Java, the Scala API generalizes each 
element (Node, Relationship, RelationshipType, property name and property 
value) of the Neo4j database as being a Vertex. 

Before digging into the details of the Scala API, let's start with some example 
code.

  val name = Db(String("name"))
  val friend = Db(VertexOut("FRIEND"))
  val john = Db(NewVertex).put(name, "John")
  val pete = Db(NewVertex).put(name, "Pete").put(friend, john)

This piece of code defines the PropertyType name and the EdgeType FRIEND, 
creates two vertices for the persons John and Pete, and states that John is 
a friend of Pete.

In standard Neo4j API this could have been written as:

  Node john = db.createNode();
  Node pete = db.createNode();
  john.setProperty("name", "John");
  pete.setProperty("name", "Pete");
  pete.createRelationshipTo(john, DynamicRelationshipType.withName("FRIEND"));

Apart from an obvious style difference, there is one immediately noticeable 
difference between the two APIs.

In the Neo4j API it is possible to write:

  john.setProperty("name", "John");
  pete.setProperty("name", 99);

While the following Scala program won't typecheck:

  pete.put(name, 99) //ERROR

The name property is defined as a String and the API enforces that type. This 
also applies when fetching a property value. In the Neo4j API we write:

  john.getProperty("name")  // returns java.lang.Object

In the Scala API we write:

  john(name)   // returns java.lang.String
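
The same kind of compile-time check can be approximated in plain Java by 
putting the value type on the property key. The following is only a sketch of 
the idea, not the Scala API described here: PropertyKey, Vertex, put and get 
are hypothetical names chosen for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class TypedPropertySketch {
    // A property key that carries the value type as a type parameter (hypothetical).
    static final class PropertyKey<T> {
        final String name;
        final Class<T> type;

        PropertyKey(String name, Class<T> type) {
            this.name = name;
            this.type = type;
        }
    }

    // Hypothetical vertex that stores its values untyped, keyed by property name.
    static final class Vertex {
        private final Map<String, Object> properties = new HashMap<>();

        // Only a value of type T is accepted for a PropertyKey<T>.
        <T> Vertex put(PropertyKey<T> key, T value) {
            properties.put(key.name, value);
            return this;
        }

        // The result is typed by the key, so the caller needs no cast.
        <T> T get(PropertyKey<T> key) {
            return key.type.cast(properties.get(key.name));
        }
    }

    static final PropertyKey<String> NAME = new PropertyKey<>("name", String.class);

    public static void main(String[] args) {
        Vertex pete = new Vertex().put(NAME, "Pete");
        String name = pete.get(NAME);
        System.out.println(name);
        // pete.put(NAME, 99); // rejected by the compiler: 99 is not a String
    }
}
```

The cast happens once, inside get, instead of at every call site.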

It is also possible to ask for the name of Pete's friend as follows:

  pete(friend andThen name)

This is equivalent to the Neo4j call:

  (String) pete.getSingleRelationship(
      DynamicRelationshipType.withName("FRIEND"), Direction.OUTGOING)
      .getEndNode().getProperty("name")

If Pete has more than one friend, we have to define a different key to fetch 
them (the VertexOut key refers to one single relationship, either the one to 
be created or a single existing relationship):

  val friends = Db(Vertices("FRIEND"))

We now have a key to all FRIEND relationships so we can ask:

  pete(friends andThen name)

This returns an Iterable[String] with the names of all Pete's friends.

We can even do:

  pete(Rec(friends) andThen name)

This returns an Iterable[String] with the names of all Pete's friends of 
friends (to the n-th degree). The Rec object recursively applies friends to 
all vertices it traverses, remembering the paths already taken, the 
Relationships traversed, or the Nodes traversed (these settings are optional 
and have sensible defaults).

We can also write:

  pete(Rec(friend, 2) andThen name)

This returns an Iterable[String] with the names of all Pete's friends of 
friends (to the 2nd degree).
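
What such a recursive traversal does can be pictured as a breadth-first walk 
that remembers visited nodes and optionally stops at a maximum depth. The Java 
sketch below is an illustration under assumptions, not the Rec implementation: 
the graph is a plain adjacency map, reachable is a hypothetical helper, and 
a depth of 2 is assumed to mean "friends and friends of friends".

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class RecSketch {
    /**
     * Collects every node reachable from start by following the adjacency
     * map for at most maxDepth steps. Visited nodes are remembered, so a
     * cycle does not cause endless traversal. The start node is not returned.
     */
    static List<String> reachable(Map<String, List<String>> friends, String start, int maxDepth) {
        List<String> result = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        visited.add(start);
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String node : frontier) {
                for (String friend : friends.getOrDefault(node, List.of())) {
                    // add() returns false for nodes that were already visited
                    if (visited.add(friend)) {
                        result.add(friend);
                        next.add(friend);
                    }
                }
            }
            frontier = next;
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> friends = Map.of(
                "pete", List.of("john"),
                "john", List.of("mary", "pete"),
                "mary", List.of("anna"));
        System.out.println(reachable(friends, "pete", 2));                 // [john, mary]
        System.out.println(reachable(friends, "pete", Integer.MAX_VALUE)); // [john, mary, anna]
    }
}
```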

It is even possible to write:

  pete(Rec(friend andThen friend) andThen name)

This returns an Iterable[String] with the names of all Pete's friends of 
friends (to the n-th degree, where n is even).

Instead of having get methods for properties and relationships and traversal 
methods on nodes, the Scala API uses one calling pattern for all database 
related objects:

object(traverser)

So a call like Db(String("name")) is not just a call on the database to return 
a PropertyType with the name "name" and the datatype String; it is a traversal 
from the database to that PropertyType. What that call returns is itself a 
traverser. 

Traversers can be composed with andThen, so the output of one traverser is 
used as input for the next traverser. 

All traversers are typed, so the andThen connective can only be applied when 
the output type of the left-hand-side traverser is equal to the input type of 
the right-hand-side traverser. This is checked at compile time.
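
The rule for andThen has the same shape as java.util.function.Function.andThen, 
so it can be illustrated in plain Java. This is a sketch of the typing rule 
only, not of the Scala API: Person, FRIENDS and NAMES are hypothetical 
stand-ins for a vertex and two traversers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class TypedTraversalSketch {
    // Hypothetical stand-in for a vertex with a name and outgoing FRIEND edges.
    static final class Person {
        final String name;
        final List<Person> friends;

        Person(String name, List<Person> friends) {
            this.name = name;
            this.friends = friends;
        }
    }

    // Traverser from a person to that person's friends.
    static final Function<Person, List<Person>> FRIENDS = p -> p.friends;

    // Traverser from a list of persons to their names.
    static final Function<List<Person>, List<String>> NAMES = people -> {
        List<String> names = new ArrayList<>();
        for (Person p : people) {
            names.add(p.name);
        }
        return names;
    };

    // Compiles because the output type of FRIENDS is the input type of NAMES.
    static final Function<Person, List<String>> FRIEND_NAMES = FRIENDS.andThen(NAMES);

    // NAMES.andThen(FRIENDS) would not compile: a List<String> is not a Person.

    public static void main(String[] args) {
        Person john = new Person("John", List.of());
        Person pete = new Person("Pete", List.of(john));
        System.out.println(FRIEND_NAMES.apply(pete));
    }
}
```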

Traversals work not only on Vertex objects and their subtypes (Property, 
PropertyType, Edge, EdgeType...), but also on Iterable[Vertex]. 

Instead of fetching just pete's friends, as in:

pete(friends)

we can also fetch the friends of pete and john:

val frnds = List(pete, john)
frnds(friends)

or if we don't need the frnds object later on, we simply state:

List(pete, john)(friends)

and if we want the names of those friends:

List(pete, john)(friends andThen name)

It is even possible to set properties or create relationships on an 
Iterable[Vertex]:

val age = Db(Int("age"))
val nationality = Db(String("nationality"))
List(pete, john).put(age, 40).put(nationality, "Irish")

This sets the age property to 40 and the nationality property to Irish on both 
pete and john.

It is also possible to write this as a traversal:

List(pete, john)(Put(age, 40) andThen Put(nationality, "Irish"))

All traversers are function objects, so they can both be called as a function 
and be treated as an object. This makes it possible to create traversers 
programmatically, allowing for the storage of traversers in the database, and 
many more nifty tricks.

Using the Put object, we could for example create a list of such 
actions/traversals and perform a validation on the

Re: [Neo4j] partitioning the relationship store

2011-08-18 Thread Niels Hoogeveen

Jim, 
Can you tell me how to add my suggestions for a solution to this problem to 
your issue tracker?
Niels


___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

[Neo4j] Subtyping

2011-08-16 Thread Niels Hoogeveen


Yesterday, I added subtyping to Enhanced API. 

Suppose an application has UserGroups, Users and Roles, where both UserGroups 
and Users are Vertices and Roles are BinaryEdges. There can be different 
predefined Roles, such as ADMINISTRATOR, EDITOR, MEMBER, GUEST.

With subtyping it is possible to say that each of the types ADMINISTRATOR, 
EDITOR, MEMBER, GUEST is a subtype of ROLE.

We can now call the method user.getAllBinaryEdges(ROLE, Direction.OUTGOING), 
and all roles of that user will be returned. It is also possible to ask if a 
user has any role by calling user.hasAnyBinaryEdge(ROLE, Direction.OUTGOING).

The same applies to Properties. 

Suppose a user has the properties: UserName, FullName, NickName. 

With subtyping it is possible to say that each of the types UserName, FullName, 
NickName is a subtype of Name.

We can now call the method user.getAllProperties(Name) and all names of that 
user will be returned.  It is also possible to ask if a user has any name by 
calling user.hasAnyProperty(Name).
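
One way to picture such subtype-aware lookups is a map from each type to its 
direct supertype, consulted whenever properties are fetched. The Java sketch 
below is an assumption-laden illustration, not the Enhanced API 
implementation: SubtypeSketch and declareSubtype are hypothetical names, 
properties are a plain map keyed by type name, and the type hierarchy is 
assumed to be free of cycles.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class SubtypeSketch {
    // Maps each type name to its direct supertype (hypothetical storage).
    private final Map<String, String> superTypeOf = new HashMap<>();

    void declareSubtype(String subType, String superType) {
        superTypeOf.put(subType, superType);
    }

    // True if type equals ancestor or reaches it by following supertype links.
    boolean isSubtypeOf(String type, String ancestor) {
        for (String t = type; t != null; t = superTypeOf.get(t)) {
            if (t.equals(ancestor)) {
                return true;
            }
        }
        return false;
    }

    // Returns the properties whose key is the requested type or one of its subtypes.
    Map<String, Object> getAllProperties(Map<String, Object> properties, String type) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            if (isSubtypeOf(entry.getKey(), type)) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    boolean hasAnyProperty(Map<String, Object> properties, String type) {
        return !getAllProperties(properties, type).isEmpty();
    }

    public static void main(String[] args) {
        SubtypeSketch types = new SubtypeSketch();
        types.declareSubtype("UserName", "Name");
        types.declareSubtype("FullName", "Name");
        types.declareSubtype("NickName", "Name");

        Map<String, Object> user = new LinkedHashMap<>();
        user.put("UserName", "pete42");
        user.put("FullName", "Pete Jones");
        user.put("Age", 40);

        System.out.println(types.getAllProperties(user, "Name").keySet());
        System.out.println(types.hasAnyProperty(user, "Name"));
    }
}
```

The same lookup would apply to edge types such as ADMINISTRATOR and EDITOR 
under ROLE.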

Niels 

[Neo4j] partitioning the relationship store

2011-08-16 Thread Niels Hoogeveen


At the risk of coming off as an utter bore, I would like once more to raise 
awareness of the fact that the relationships of a node are currently stored as 
one linked list. The downside of this has been discussed in many posts, so I 
shan't rehash the points. 

It's just that whatever I try to implement, this one issue keeps me from making 
the progress I would like to make. 

I know that the issue will be addressed some day; I would just like to ask 
the Neo team to give it priority. I am almost inclined to fork the kernel and 
do it myself, but I don't want to do that for obvious reasons.

Niels 

Re: [Neo4j] Subtyping

2011-08-16 Thread Niels Hoogeveen

Later today I will push the changes to Git, including tests.

 Date: Tue, 16 Aug 2011 14:51:42 +0200
 From: peter.neuba...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Subtyping

 Very cool. Is there a test demonstrating it?

 /peter


Re: [Neo4j] partitioning the relationship store

2011-08-16 Thread Niels Hoogeveen


The partitioning is a solution to the densely-connected node problem, but would 
also allow for iteration over RelationshipTypes/Directions, another feature 
I would very much like to see.
I have posted suggestions on how to approach this problem and would like to add 
those suggestions to the issue tracker so they will be taken into consideration 
when this issue is addressed. Yet I can't find an issue in Lighthouse.
I am glad to hear it is #5 in priority order. 
I would be extra pleased if the dev team would stay in touch when picking up 
this issue, because Enhanced API could benefit greatly depending on the 
approach taken.
Niels

 From: j...@neotechnology.com
 Date: Tue, 16 Aug 2011 14:40:21 +0100
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] partitioning the relationship store
 
 Hi Niels,
 
 Is this partitioning an aspect of the supernode problem? If so, there is a 
 feature request* in the devteam backlog for that.
 
 Jim
 
 * It is currently 5th in priority order.

Re: [Neo4j] n-ary relationships

2011-08-15 Thread Niels Hoogeveen


Hi Emerson,

Over the last couple of weeks, I have been working on an implementation of 
n-ary relationships on top of Neo4j. I also detailed how n-ary relationships 
could in principle be implemented in the database kernel (see: 
http://lists.neo4j.org/pipermail/user/2011-August/011191.html).

Right now I am working on traversals for n-ary relationships, in an attempt to 
remove the unnaturalness you describe.

If we look at your example and use the nomenclature of Enhanced API, you'd 
have an EdgeType REFERS with three ConnectorTypes: Referrer, Referree, and 
Course, such that we can create the following Edge:

REFERS
Referrer -- paul
Referree -- john
Course -- history

A traversal takes as input a Vertex (strictly speaking a Traversal, of which a 
Vertex is a subclass), and takes two ConnectorTypes to traverse from a Vertex 
to an Edge to a Vertex.

So if you want to know the Referrers for the course history, the traversal 
would be defined like:

// create a traversal description
TraversalDescription descr = TraversalDescription.add(Course, Referrer);

// traverse the graph based on the description, starting from the vertex history
descr.traverse(history);

The traverse method returns an Iterable<Path>, which in this case contains only 
one Path. The path consists of two Connections, (history, Course) and (paul, 
Referrer).
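
The step from a Vertex into an Edge through one ConnectorType and out again 
through another can be sketched in plain Java. This is a simplified model, not 
Enhanced API code: the Edge class and the traverse helper are hypothetical, 
vertices are represented by their names, and each connector is assumed to hold 
exactly one vertex.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class NaryEdgeSketch {
    // An n-ary edge: an edge type plus one vertex per connector type (assumption).
    static final class Edge {
        final String type;
        final Map<String, String> connections = new LinkedHashMap<>();

        Edge(String type) {
            this.type = type;
        }

        Edge connect(String connectorType, String vertex) {
            connections.put(connectorType, vertex);
            return this;
        }
    }

    // Enters every edge that holds start on connector `from`, leaves through `to`.
    static List<String> traverse(List<Edge> edges, String start, String from, String to) {
        List<String> result = new ArrayList<>();
        for (Edge edge : edges) {
            boolean startsHere = start.equals(edge.connections.get(from));
            if (startsHere && edge.connections.containsKey(to)) {
                result.add(edge.connections.get(to));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Edge refers = new Edge("REFERS")
                .connect("Referrer", "paul")
                .connect("Referree", "john")
                .connect("Course", "history");
        // Who referred someone to the history course?
        System.out.println(traverse(List.of(refers), "history", "Course", "Referrer"));
    }
}
```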

I hope this somehow answers your question.

Niels

 Date: Sun, 14 Aug 2011 18:57:22 +0200
 From: emerson.farru...@gmail.com
 To: user@lists.neo4j.org
 Subject: [Neo4j] n-ary relationships
 
 Hi,
 
 I started looking into Neo4j this morning, and played with some domain
 models to see whether it really passes a whiteboard friendliness test. I'm
 really after a persistence solution that makes it straightforward to persist
 a domain model designed from a DDD perspective, and Neo4j is looking
 promising so far. The one aspect that's not too clear is how to model n-ary
 relationships, and I'm curious as to how those of you with experience with
 Neo4j and graph databases would approach it.
 
 An edge connects two vertices, so in a graph database, a relationship
 connects two nodes. But when modeling, there are frequently relationships
 between multiple entities. For example, student John attends a History
 course at university, and John was referred to the History course by Paul.
 This relationship relates John, Paul, and the History course.
 
 There are a few ways I can think of to model this.
 
 1) A node John has an ATTENDS relationship with node History, and the
 relationship has a referrer property with Paul's ID. Simple, but keeping
 IDs as properties seems like an anti-pattern.
 2) A node Referral has a CREATED_BY relationship with node Paul,
 a FOR_COURSE relationship with node History, and a TO_STUDENT relationship
 with node John. It's effectively a three-way join.
 3) Same as 2, but with an additional ATTENDS relationship between John and
 History. This is particularly useful if a course attendant may attend a
 course without being referred.
 
 This might not be the best example in the world, but it should drive my
 point home: when relationships have a degree higher than 2, relationships
 need to be modelled as vertices to overcome the binary nature of edges. Is
 this expected behavior that's part and parcel of graph databases, or am I
 approaching the modeling incorrectly somehow?
 
 My concern is that traversals may become unnatural when this happens. Say I
 want to iterate over the attendants of a course, and show the name of who
 referred them when I do so. Will I have the graph database equivalent of n+1
 selects because the data I want to extract (referrer name) is in a different
 node (Paul) to my node of interest (John), instead of in the relationship to
 it (attends)?
 
 Any tips and opinions would be appreciated.
 
 Cheers,
 Emerson

Re: [Neo4j] Is data lost if the object graph and relationships are changed?

2011-08-15 Thread Niels Hoogeveen


All your existing relationships will remain the same, unless you remove them 
yourself.

If you make your hypothetical changes, all Persons will keep a relationship to 
Address through the RESIDES_AT relationship, even though you now create a new 
ContactInfo entity that connects to Address too.

So unless you remove RESIDES_AT relationships, there will be two paths from a 
Person to an Address: from Person via RESIDES_AT to Address and from Person via 
CONTACT_BY, ContactInfo, BY_ADDRESS.

Niels
 From: e...@nextideapartners.com
 To: user@lists.neo4j.org
 Date: Mon, 15 Aug 2011 18:41:53 -0400
 Subject: [Neo4j] Is data lost if the object graph and relationships are   
 changed?
 
 Hypothetical example, let's say I'm building a system and I want to capture
 Person and Address entities, I might model it like this
 
 Person ---(RESIDES_AT)--- Address
 
 Assume that the relationship is bi-directionally, so whether I have a person
 or address entity, I can always find the other.
 
 After 6 months of running in production, we now need to capture phone
 numbers and email addresses, so we decide to create a new entity,
 ContactInfo
 
  
                                        ---(BY_ADDRESS)--- Address
 Person ---(CONTACT_BY)--- ContactInfo  ---(BY_PHONE)  --- Phone
                                        ---(BY_EMAIL)  --- Email
 
 
 So we introduced a new entity, ContactInfo, which has relationships to
 Address, Phone, and Email entities. 
 
 My question is, since Address was originally related to Person but is now
 related to ContactInfo via Person, does neo4j automatically pick up the
 address details from the ContactInfo relationship for all Persons who used
 the prior relationship? This is important because change is inevitable, so I
 want to make sure existing data is not lost simply because a relationship
 was re-mapped in the java object hierarchy.
 
 

Re: [Neo4j] Is data lost if the object graph and relationships are changed?

2011-08-15 Thread Niels Hoogeveen


Relationships can't be changed. They are created from one Node to another Node 
with a certain RelationshipType, and can only be removed.

All Relationships you create can be navigated. 

If your original code did something like:

  person.getSingleRelationship(RESIDES_AT, Direction.OUTGOING).getEndNode()

you will now have to do something like:

  Node addressNode = null;
  for (Relationship rel : person.getRelationships(CONTACT_BY, Direction.OUTGOING)) {
    Node contactInfo = rel.getEndNode();
    // only follow BY_ADDRESS when this ContactInfo actually has an address
    if (contactInfo.hasRelationship(BY_ADDRESS, Direction.OUTGOING)) {
      addressNode = contactInfo.getSingleRelationship(BY_ADDRESS, 
          Direction.OUTGOING).getEndNode();
    }
  }

Niels
 From: e...@nextideapartners.com
 To: user@lists.neo4j.org
 Date: Mon, 15 Aug 2011 22:14:05 -0400
 Subject: Re: [Neo4j] Is data lost if the object graph and relationships are   
 changed?
 
 OK, but after making changes to the relationships, does the graph service
 automatically allow me to navigate from Person to ContactInfo to Address?
 

Re: [Neo4j] Enhanced API wiki page

2011-08-12 Thread Niels Hoogeveen

 

Re: [Neo4j] Enhanced API wiki page

2011-08-11 Thread Niels Hoogeveen


Hi Peter,

The API is indeed a bit heavy to grasp if you want to use N-ary edges. I don't 
know how to make that simpler without sacrificing functionality.

For binary edges and properties, the API is very similar to the standard Neo4j 
API, give or take some details. 

I have given considerable thought to how to traverse these hyperedges, and the 
answer is stunningly simple: the same way as we would a binary edge.

Right now we traverse from a Node to another Node by means of a RelationshipType 
(given a Direction).

We could also say in Enhanced API parlance that we traverse from a Vertex to a 
BinaryEdge following the StartConnector, then use the EndConnector to reach 
another Vertex.

So traversing the graph requires that we provide a pair of Connectors. This 
works the same for N-ary edges: we still provide a pair of Connectors, helping 
to build the path we want to return.

Example:

Suppose we have stored the fact Tom, Dick and Harry give Flo and Eddie a Book 
and a Bicycle, as explained on the Wiki page. Suppose all people in the 
database can also be FRIENDs to other people.

Now suppose we want to know the people who are friends of the people that Tom 
has given a gift to.

We provide the traverser with (Giver, Recipient, GIFT) and with 
(StartConnector, EndConnector, FRIEND).

Now we can of course further simplify this by making each step in the traversal 
to only follow one connector:

(Giver, GIFT) (Recipient, GIFT) (StartConnector, FRIEND) (EndConnector, FRIEND)

This way we can traverse not only from Vertex to Vertex (via an Edge), but 
also from a Vertex to an Edge, to a property on that Edge.

Since we want to return a path through the graph, we need to provide a list of 
Connectors describing how to get there. 

Interestingly enough the arity of an Edge has no impact on how the graph is 
traversed. It takes one connector to get to an Edge, and it takes one connector 
to get away from an Edge. 
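
The chain of single-connector steps from the example can be sketched in plain 
Java as follows. This is a hypothetical model rather than Enhanced API code: 
Connection, Step, traverse and the sample data (including the friends anna and 
bob) are made up for the illustration, and steps are assumed to come in 
enter/leave pairs.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class ConnectorStepSketch {
    // One attachment of a vertex to an edge through a named connector.
    static final class Connection {
        final int edgeId;
        final String edgeType;
        final String connector;
        final String vertex;

        Connection(int edgeId, String edgeType, String connector, String vertex) {
            this.edgeId = edgeId;
            this.edgeType = edgeType;
            this.connector = connector;
            this.vertex = vertex;
        }
    }

    // One traversal step: follow `connector` on edges of type `edgeType`.
    static final class Step {
        final String connector;
        final String edgeType;

        Step(String connector, String edgeType) {
            this.connector = connector;
            this.edgeType = edgeType;
        }

        boolean matches(Connection c) {
            return connector.equals(c.connector) && edgeType.equals(c.edgeType);
        }
    }

    // Alternates vertex -> edge (even steps) and edge -> vertex (odd steps).
    static Set<String> traverse(List<Connection> graph, String start, List<Step> steps) {
        if (steps.size() % 2 != 0) {
            throw new IllegalArgumentException("steps must come in enter/leave pairs");
        }
        Set<String> vertices = new LinkedHashSet<>();
        vertices.add(start);
        for (int i = 0; i < steps.size(); i += 2) {
            Step enter = steps.get(i);
            Step leave = steps.get(i + 1);
            Set<Integer> edges = new LinkedHashSet<>();
            for (Connection c : graph) {
                if (enter.matches(c) && vertices.contains(c.vertex)) {
                    edges.add(c.edgeId);
                }
            }
            Set<String> next = new LinkedHashSet<>();
            for (Connection c : graph) {
                if (leave.matches(c) && edges.contains(c.edgeId)) {
                    next.add(c.vertex);
                }
            }
            vertices = next;
        }
        return vertices;
    }

    // Edge 1 is the 3-ary GIFT edge; edges 2 and 3 are binary FRIEND edges.
    static List<Connection> sampleGraph() {
        List<Connection> g = new ArrayList<>();
        g.add(new Connection(1, "GIFT", "Giver", "tom"));
        g.add(new Connection(1, "GIFT", "Giver", "dick"));
        g.add(new Connection(1, "GIFT", "Recipient", "flo"));
        g.add(new Connection(1, "GIFT", "Recipient", "eddie"));
        g.add(new Connection(1, "GIFT", "Gift", "book"));
        g.add(new Connection(2, "FRIEND", "StartConnector", "flo"));
        g.add(new Connection(2, "FRIEND", "EndConnector", "anna"));
        g.add(new Connection(3, "FRIEND", "StartConnector", "eddie"));
        g.add(new Connection(3, "FRIEND", "EndConnector", "bob"));
        return g;
    }

    static List<Step> sampleSteps() {
        return List.of(
                new Step("Giver", "GIFT"), new Step("Recipient", "GIFT"),
                new Step("StartConnector", "FRIEND"), new Step("EndConnector", "FRIEND"));
    }

    public static void main(String[] args) {
        // Friends of the people Tom has given a gift to.
        System.out.println(traverse(sampleGraph(), "tom", sampleSteps())); // [anna, bob]
    }
}
```

The traversal code never inspects how many connectors an edge has, which 
matches the observation that arity does not affect traversal.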

Niels





 From: peter.neuba...@neotechnology.com
 Date: Thu, 11 Aug 2011 20:46:59 +0200
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Enhanced API wiki page
 
 Nils,
 interesting approaches! However, IMHO the API is still too heavy to
 grasp with ConnectorType, EdgeElement, EdgeType and Edge being
 involved in creating connections between facts. Is anyone seeing a
 more fluent/concise approach to this?  Also, did you have some ideas
 about how to traverse or query these hyperedges?
 
 Cheers,
 
 /peter neubauer
 

Re: [Neo4j] length of property names

2011-08-10 Thread Niels Hoogeveen

I find myself using some pretty long property names, like 
org.neo4j.collections.graphdb.node_id and wonder if this has an impact on 
performance.
Niels

From: pd_aficion...@hotmail.com
To: user@lists.neo4j.org
Subject: length of property names
Date: Mon, 8 Aug 2011 15:44:20 +0200

Quick question: what is the performance impact of the length of a property 
name? 
Niels   


Re: [Neo4j] length of property names

2011-08-10 Thread Niels Hoogeveen

Thanks Mattias
 Date: Wed, 10 Aug 2011 15:25:24 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] length of property names

 No, none whatsoever (if you don't count the potentially slightly longer
 for-loop in String#equals which maps from String to internal ID (integer)
 used in Neo4j).


 -- 
 Mattias Persson, [matt...@neotechnology.com]
 Hacker, Neo Technology
 www.neotechnology.com

[Neo4j] Enhanced API wiki page

2011-08-09 Thread Niels Hoogeveen


Today I updated the wiki page for Enhanced API. Since the last edit many 
changes have taken place, so it was time to reflect those changes on the wiki 
page.

See: https://github.com/peterneubauer/graph-collections/wiki/Enhanced-API

I also changed what was previously called an EdgeRole into a Connector. 

Every Edge has a number of Connectors to which Vertices connect.

The EdgeType of an Edge defines the ConnectorTypes of the Connectors of an Edge.

Each ConnectorType, and with it each Connector, has a ConnectionMode, which can 
be one of these four:

Unrestricted: An Edge can connect to an unlimited number of Vertices through a 
Connector with an unrestricted mode, and a Vertex can have an unlimited number 
of connected Edges with a ConnectorType with an unrestricted ConnectionMode.
Injective: An Edge can connect to only one Vertex through a Connector with 
injective mode, but a Vertex can have an unlimited number of connected Edges 
with a ConnectorType with an injective ConnectionMode.
Surjective: An Edge can connect to an unlimited number of Vertices through a 
Connector with a surjective mode, but a Vertex can only have one Edge connected 
to it with a ConnectorType with a surjective ConnectionMode.
Bijective: An Edge can connect to only one Vertex through a Connector with 
bijective mode, and a Vertex can only have one Edge connected to it with a 
ConnectorType with a bijective ConnectionMode.
All ConnectionModes have been implemented.
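
A rough way to picture the four modes is as two independent yes/no restrictions: whether the Edge side is limited to one Vertex per Connector, and whether the Vertex side is limited to one Edge per ConnectorType. The enum below is only a sketch of that reading; the name ConnectionModeSketch and both methods are made up for illustration and are not the Enhanced API's own classes.

```java
/** Hypothetical sketch of the four ConnectionModes; not the Enhanced API's actual enum. */
enum ConnectionModeSketch {
    // (edge limited to one vertex per connector, vertex limited to one edge per connector type)
    UNRESTRICTED(false, false),
    INJECTIVE(true, false),
    SURJECTIVE(false, true),
    BIJECTIVE(true, true);

    private final boolean oneVertexPerEdge;
    private final boolean oneEdgePerVertex;

    ConnectionModeSketch(boolean oneVertexPerEdge, boolean oneEdgePerVertex) {
        this.oneVertexPerEdge = oneVertexPerEdge;
        this.oneEdgePerVertex = oneEdgePerVertex;
    }

    /** May an edge attach one more vertex to a connector that already holds this many? */
    boolean edgeMayConnect(int verticesAlreadyOnConnector) {
        return !oneVertexPerEdge || verticesAlreadyOnConnector == 0;
    }

    /** May a vertex accept one more edge through this connector type, given how many it has? */
    boolean vertexMayConnect(int edgesAlreadyOnVertex) {
        return !oneEdgePerVertex || edgesAlreadyOnVertex == 0;
    }
}
```

For instance, ConnectionModeSketch.INJECTIVE.edgeMayConnect(1) is false, while INJECTIVE.vertexMayConnect(3) is true.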

The switch from EdgeRole to Connector with ConnectionModes has eliminated some 
of the more annoying type parameters found in the previous incarnation of 
Enhanced API.

  

Re: [Neo4j] Enhanced API wiki page

2011-08-09 Thread Niels Hoogeveen

I should of course market this work better.

So hereby the statement: NOW with nice and handy images, free of charge!!!

Niels
 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Wed, 10 Aug 2011 01:19:42 +0200
 Subject: [Neo4j] Enhanced API wiki page

 Today I updated the wiki page for Enhanced API. Since the last edit many 
 changes have taken place, so it was time to reflect those changes on the wiki 
 page.

 See: https://github.com/peterneubauer/graph-collections/wiki/Enhanced-API

 I also changed what was previously called an EdgeRole into a Connector. 

 Every Edge has a number of Connectors to which Vertices connect.

 The EdgeType of an Edge defines the ConnectorTypes of the Connectors of an 
 Edge.

 Each ConnectorType, and with it each Connector, has a ConnectionMode, which can 
 be one of these four:

 Unrestricted: An Edge can connect to an unlimited number of Vertices through 
 a Connector with an unrestricted mode, and a Vertex can have an unlimited 
 number of connected Edges with a ConnectorType with an unrestricted 
 ConnectionMode.
 Injective: An Edge can connect to only one Vertex through a Connector with 
 injective mode, but a Vertex can have an unlimited number of connected Edges 
 with a ConnectorType with an injective ConnectionMode.
 Surjective: An Edge can connect to an unlimited number of Vertices through a 
 Connector with a surjective mode, but a Vertex can only have one Edge 
 connected to it with a ConnectorType with a surjective ConnectionMode.
 Bijective: An Edge can connect to only one Vertex through a Connector with 
 bijective mode, and a Vertex can only have one Edge connected to it with a 
 ConnectorType with a bijective ConnectionMode.
 All ConnectionModes have been implemented.

 The switch from EdgeRole to Connector with ConnectionModes has eliminated 
 some of the more annoying type parameters found in the previous incarnation 
 of Enhanced API.


Re: [Neo4j] Enhanced API rewrite

2011-08-08 Thread Niels Hoogeveen

I can probably find the time for that. It would be fun working on these ideas 
in collaboration. I don't mind producing my usual brain-dumps and writing some 
of the code, but quality will certainly improve when it is more than just me 
paying attention to this.
Niels

 From: peter.neuba...@neotechnology.com
 Date: Mon, 8 Aug 2011 11:50:35 +0200
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Enhanced API rewrite

 Very interesting thoughts!

 I would love to have a bootcamp and explore a spike on how this would
 work out in practice. Got anything to do this autumn? ;)

 Cheers,

 /peter neubauer

 GTalk:  neubauer.peter
 Skype   peter.neubauer
 Phone   +46 704 106975
 LinkedIn   http://www.linkedin.com/in/neubauer
 Twitter  http://twitter.com/peterneubauer

 http://www.neo4j.org   - Your high performance graph database.
 http://startupbootcamp.org/- Öresund - Innovation happens HERE.
 http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

 On Sun, Aug 7, 2011 at 4:30 PM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:

  Hi Peter,

  Thanks for showing an interest.

  A Property is indeed a unary edge in the Enhanced API and therefore 
  (potentially) backed by a Node, but that Node doesn't contain the value.

  All property values are still stored the way they are stored in the 
  standard API. If someone however decides to add a Property to a Property or 
  create an Edge containing that Property, a Node will be created to store 
  those properties and connect those Edges to.

  When the associated Node of a Property is created, the ID of that Node will 
  be stored in the PropertyContainer of that property.

  Example:

  Suppose we have a property on a Person Vertex that denotes a personal 
  identity number, and the user of the application wants to annually check 
  that identity number against some other database and state when it was last 
  verified and who verified it.

  A Vertex (backed by a Node) for a particular Person is created and the 
  property is set (in that Node's PropertyContainer), just like it would be 
  the case in the standard API.

  When the verification is done, an additional property is created on the 
  PropertyContainer of that Person with the name 
  org.neo4j.collections.graphdb.[propertyname].node_id

  This property contains the ID of the node associated with the property. On 
  that node the verification date will be set, and a BinaryEdge (in principle 
  nothing but a classic Relationship) will be created to the Person Vertex of 
  the one who verified the personal identity number.

  It is certainly true that everything being a Vertex makes the Node 
  implementation more important than ever before, but it goes even further, 
  apart from a standard Vertex and the various VertexTypes, almost everything 
  is an Edge. So I would say the Relationship implementation is becoming 
  eminently important.

  There are certainly several tweaks to the storage layer I would love to see 
  incorporated, mostly to hide the implementation from the user and to make 
  sure that the maintenance of IDs takes place in core and not in a layer on 
  top of core.

  In fact all of Enhanced API could much better be maintained in core, 
  something that can actually quite easily be implemented. One of my 
  ulterior motives with the development of Enhanced API is to tease out the 
  technical requirements to push this functionality into core (whether Neo 
  Tech decides to do so, is another question of course).

  Since the Neo4j database consists mostly of records and linked lists, the 
  technical requirements to push things into core, are mostly a question of 
  adding entry-points to linked lists in some records and partitioning some 
  existing linked lists.

  I will write down those requirements in a separate post. This will include 
  support for N-ary edges, since that is actually not all that difficult to 
  implement and adds very little complexity to the database.

  Yes, traversals will become much more generalized in the Enhanced API, 
  especially when we make them composable. In fact composable traversal 
  descriptions can easily be seen as a query language giving access to all 
  parts of the database.

  Niels

  From: peter.neuba...@neotechnology.com
  Date: Sun, 7 Aug 2011 09:10:02 +0200
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Enhanced API rewrite

  Niels,
  this sounds very interesting. Given the role of properties being unary
  edges, that would mean that any classic Neo4j property would now be a
  Node with one Property in the new Vertex sense?

  Having Vertices for EVERYTHING will of course make the
  node-implementation much more important than anything else, since
  every element is backed by a node, possibly with some property. I
  wonder how this would reflect in the storage layer that might need to
  be tweaked.

  Also, as you point out, traversals will become quite

Re: [Neo4j] Enhanced API rewrite

2011-08-08 Thread Niels Hoogeveen

Hi Dmitri,
I would very much appreciate it if you tried out Enhanced API and gave me 
feedback about your findings. Apart from traversals it is more or less feature 
complete, but it could use some thorough trying out.
Niels

 Date: Mon, 8 Aug 2011 20:20:14 +0500
 From: shaban...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Enhanced API rewrite

 I'm ready to jump in too ;-)

 On Mon, Aug 8, 2011 at 3:37 PM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:

  I can probably find the time for that. It would be fun working on these
  ideas in collaboration. I don't mind producing my usual brain-dumps and
  writing some of the code, but quality will certainly improve when it is more
  than just me paying attention to this.
  Niels

 -- 
 Dmitriy Shabanov

Re: [Neo4j] Enhanced API rewrite

2011-08-07 Thread Niels Hoogeveen

Hi Peter,

Thanks for showing an interest.

A Property is indeed a unary edge in the Enhanced API and therefore
(potentially) backed by a Node, but that Node doesn't contain the value.

All property values are still stored the way they are stored in the standard
API. If someone however decides to add a Property to a Property or create an
Edge containing that Property, a Node will be created to store those properties
and connect those Edges to.

When the associated Node of a Property is created, the ID of that Node will be
stored in the PropertyContainer of that property.

Example:

Suppose we have a property on a Person Vertex that denotes a personal
identity number, and the user of the application wants to annually check that
identity number against some other database and state when it was last verified
and who verified it.

A Vertex (backed by a Node) for a particular Person is created and the property
is set (in that Node's PropertyContainer), just like it would be the case in
the standard API.

When the verification is done, an additional property is created on the
PropertyContainer of that Person with the name
org.neo4j.collections.graphdb.[propertyname].node_id

This property contains the ID of the node associated with the property. On that
node the verification date will be set, and a BinaryEdge (in principle nothing
but a classic Relationship) will be created to the Person Vertex of the one who
verified the personal identity number.
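
To make the naming scheme concrete, here is a minimal sketch in which a plain Map stands in for the PropertyContainer of the Person node. The class and method names (PropertyNodeLinkSketch, linkKey, linkPropertyToNode, backingNodeId) are invented for this illustration and the property name "identityNumber" is just an example; only the org.neo4j.collections.graphdb.[propertyname].node_id pattern comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch only: a Map stands in for a Node's PropertyContainer. */
final class PropertyNodeLinkSketch {
    static final String PREFIX = "org.neo4j.collections.graphdb.";
    static final String SUFFIX = ".node_id";

    /** Builds the name of the bookkeeping property for a given property name. */
    static String linkKey(String propertyName) {
        return PREFIX + propertyName + SUFFIX;
    }

    /** Records the ID of the node that backs the named property. */
    static void linkPropertyToNode(Map<String, Object> container, String propertyName, long nodeId) {
        container.put(linkKey(propertyName), nodeId);
    }

    /** Returns the backing node ID, or null if no node has been created for the property yet. */
    static Long backingNodeId(Map<String, Object> container, String propertyName) {
        Object id = container.get(linkKey(propertyName));
        return (id instanceof Long) ? (Long) id : null;
    }
}
```

The value of the property itself stays where it always was; only the extra node_id entry is added once a node for the property is needed.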

It is certainly true that everything being a Vertex makes the Node
implementation more important than ever before, but it goes even further, apart
from a standard Vertex and the various VertexTypes, almost everything is an
Edge. So I would say the Relationship implementation is becoming eminently
important.

There are certainly several tweaks to the storage layer I would love to see
incorporated, mostly to hide the implementation from the user and to make sure
that the maintenance of IDs takes place in core and not in a layer on top of
core.

In fact all of Enhanced API could much better be maintained in core, something
that can actually quite easily be implemented. One of my ulterior motives
with the development of Enhanced API is to tease out the technical requirements
to push this functionality into core (whether Neo Tech decides to do so, is
another question of course).

Since the Neo4j database consists mostly of records and linked lists, the
technical requirements to push things into core, are mostly a question of
adding entry-points to linked lists in some records and partitioning some
existing linked lists.

I will write down those requirements in a separate post. This will include
support for N-ary edges, since that is actually not all that difficult to
implement and adds very little complexity to the database.

Yes, traversals will become much more generalized in the Enhanced API,
especially when we make them composable. In fact composable traversal
descriptions can easily be seen as a query language giving access to all parts
of the database.

Niels

From: peter.neuba...@neotechnology.com
Date: Sun, 7 Aug 2011 09:10:02 +0200
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Enhanced API rewrite

Niels,
this sounds very interesting. Given the role of properties being unary
edges, that would mean that any classic Neo4j property would now be a
Node with one Property in the new Vertex sense?

Having Vertices for EVERYTHING will of course make the
node-implementation much more important than anything else, since
every element is backed by a node, possibly with some property. I
wonder how this would reflect in the storage layer that might need to
be tweaked.

Also, as you point out, traversals will become quite different with
this API, but let's see what the weekend brings ;)

Cheers,

/peter neubauer

On Sat, Aug 6, 2011 at 2:51 AM, Niels Hoogeveen
pd_aficion...@hotmail.com wrote:

Today I pushed a major rewrite of the Enhanced API. See:
https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb

Originally the Enhanced API was a drop-in replacement of the standard Neo4j
API. This resulted in lots of wrapper classes that needed to be maintained.

The rewrite of Enhanced API is no longer a drop-in replacement and contains
no interface/class names that can be found in the standard API.

Enhanced API no longer speaks of Nodes but of Vertices and doesn't speak of
Relationships but of Edges. This helps to prevent name clashes at the
expense

Re: [Neo4j] Node#getRelationshipTypes

2011-08-07 Thread Niels Hoogeveen

Yes, let's not argue about something as elusive as the definition of low 
hanging fruit.
In the meantime I wrote down my suggestions for store refactoring more 
succinctly and added some more suggestions.
Niels

 Date: Sun, 7 Aug 2011 22:09:48 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Node#getRelationshipTypes

 2011/8/6 Niels Hoogeveen pd_aficion...@hotmail.com

  This is the thread about store layer changes for type/direction, and in my
  opinion this is still quite low hanging fruit. Sure, the impact needs to be
  tested rigorously, which may take considerable time, but the implementation
  is quite straight-forward and the potential gains are large.

 Agreeing to disagree. Implementing it shouldn't be very hard, but that's
 only a small part of it. It would require quite hefty amounts of testing to
 be considered production quality... not even mentioning writing and testing
 migration of existing databases.

 Or we just have different views of what kind of fruit to consider low
 hanging.

  Niels
   Date: Sat, 6 Aug 2011 22:16:15 +0200
   From: matt...@neotechnology.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] Node#getRelationshipTypes

   Oh, confused this thread with store layer changes for type/direction
   of relationships. This fruit in this thread is pretty low hanging.

   On Saturday, 6 August 2011, Mattias Persson matt...@neotechnology.com wrote:
I would not consider this low hanging fruit btw

On Wednesday, 3 August 2011, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

Hmmm... Does that require the inclusion of golden parachutes as well?
Anyway, addressing the readers of this message that have time
  allocation authority. I hope my suggestion, or another technical solution
  that solves the same issues will be picked up for 1.5. This is as far as I
  can tell pretty much low hanging fruit. There are probably all sorts of
  tweaks that can improve the performance of Neo4j, but this one can improve
  the performance of Neo4j big time (under certain conditions). As a user who
  is confronted with several very densely connected nodes, I have tried all
  sorts of means to solve my issues, but none as rewarding as a solution in
  core would be.
Niels
Date: Wed, 3 Aug 2011 16:31:04 +0200
From: matt...@neotechnology.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Node#getRelationshipTypes

A golden helicopter might do the trick :)

2011/8/3 Niels Hoogeveen pd_aficion...@hotmail.com

 How does one persuade the time allocation authorities?
 Niels

  Date: Wed, 3 Aug 2011 09:28:45 +0200
  From: matt...@neotechnology.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Node#getRelationshipTypes

  Yup, it's a pretty sane approach and somewhat along the lines of how I feel
  it would be done. It's been said for a long time that this functionality will
  be implemented some day and it's just that a significant amount of time
  has to be invested... maybe not for implementing it, but for discovering
  all bugs and inconveniences to have it on par with production quality. And
  that kind of time hasn't been allocated yet.

  I appreciate your thoughts and time on all this!

  Best,
  Mattias

  2011/8/3 Niels Hoogeveen pd_aficion...@hotmail.com

   I would like to make a suggestion that would both address my feature
   request and increase performance of the database.

   Right now the NodeRecord (org.neo4j.kernel.impl.nioneo.store.NodeRecord)
   contains the ID of the first Relationship, while the RelationshipRecord
   contains the IDs of the previous and next relationship for both sides of
   the relationship.

   My suggestion is as follows:

   Create a new store:

   noderelationshiptypestore.db

   The layout of this store is given by the NodeRelationshipTypeRecord:

   id
   previousrelationshiptype
   nextrelationshiptype
   firstrelationship

   The NodeRecord would now need to point to the first outgoing
   NodeRelationshipType and to the first incoming NodeRelationshipType instead
   of to the first Relationship.

   On insert of a Relationship, one side of the relationship will update the
   store from the outgoing side, the other side will update the store for the
   incoming side.

   I will list the steps to take here for the outgoing side (the incoming side
   is almost identical).

   From the NodeReco--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com

   --
   Mattias Persson, [matt...@neotechnology.com]
   Hacker, Neo Technology
   www.neotechnology.com

[Neo4j] sub-graphs

2011-08-07 Thread Niels Hoogeveen


While I am at it, let's post another brain dump.

A couple of weeks ago, I worked on SortedTree/IndexRelationships in an attempt 
to solve the densely-connected-node-problem. 

SortedTree is a Btree laid out in the graph, sorted by some function on a node 
(e.g. the nodeId, or a property value).

This approach worked, to a degree, but at some point load performance degrades 
because of reorganizations of the tree. Too much memory is needed to keep the 
entire tree in memory, and standard nodes and relationships are simply too 
fine-grained for the job. Instead of loading each individual node and each 
individual relationship in an index block, it would be nice to be able to load 
the entire block with one read operation, and swap out an entire memory block 
when memory is needed. 

This brought me to the idea of sub-graphs. Let's say every node (and possibly 
relationship) is a graph, containing nodes and relationships. Each graph has 
its own store (if the contained graph is not empty). Relationships are 
lightweight (offset based) when associated with Nodes (and possibly 
Relationships) within the same graph, but require an extra store_id when 
associating with nodes (and possibly Relationship) outside that graph. This 
gives control over where things are stored, and what is stored together. 

Using RelationshipRoles, as I described in another post, we can state which 
association of a Relationship is certainly stored locally, which is certainly 
stored in another Graph, and which is stored either locally or in another 
Graph. This way we can have full control over the locality of the associations 
of a Relationship.

If we make each index block of SortedTree its own graph we can make sure that 
all Relationship associations are local, except the ones eventually pointing to 
the Nodes we want indexed, those are certainly stored in another Graph. This 
way the store will only contain Nodes and Relationships belonging to that index 
block, so we can load the entire store in a set of buffers and flush those 
buffers when no longer needed. 

This approach could also be used for sharding the database. Since each node in 
the graph can be a store of its own, we have a natural means to distribute 
graphs over different shards.

Let's define a shard as a set of graphs whose membership is decided by some 
rules defined on the RelationshipTypes used in the shard.

We could add the following options to the RelationshipRoles: must be in shard, 
may be in shard and must not be in shard. 

This way the RelationshipRoles used in a store determine the dependencies of 
that store.

RelationshipRoles can form 9 possible combinations of settings over the 
locality of each Relationship association, one of which is contradictory 
and some of which are tautological or inconsequential:

Must be in store and Must be in shard (is tautological)
Must be in store and May be in shard (is inconsequential)
Must be in store and Must not be in shard (is impossible)
May be in store and Must be in shard
May be in store and May be in shard (is inconsequential)
May be in store and Must not be in shard (is inconsequential)
Must not be in store and Must be in shard
Must not be in store and May be in shard (is inconsequential)
Must not be in store and Must not be in shard (is tautological)

So that leaves the following RelationshipRole options:
Must be in store 
May be in store and Must be in shard
May be in shard 
Must not be in store and Must be in shard
Must not be in store 
Must not be in shard 
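
The 3 x 3 grid of settings can be enumerated mechanically. The sketch below does only that and filters out the single contradictory pairing (in the store but not in the shard that contains the store); it does not try to encode which of the remaining pairings are tautological or inconsequential. The names LocalitySketch and RoleLocalityComboSketch are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical three-valued locality setting, used for both the store and the shard. */
enum LocalitySketch { MUST, MAY, MUST_NOT }

final class RoleLocalityComboSketch {
    /** The one contradictory pairing: must be in this store, yet must not be in its shard. */
    static boolean isImpossible(LocalitySketch store, LocalitySketch shard) {
        return store == LocalitySketch.MUST && shard == LocalitySketch.MUST_NOT;
    }

    /** Lists every "store/shard" pairing except the contradictory one. */
    static List<String> possibleCombinations() {
        List<String> result = new ArrayList<>();
        for (LocalitySketch store : LocalitySketch.values()) {
            for (LocalitySketch shard : LocalitySketch.values()) {
                if (!isImpossible(store, shard)) {
                    result.add(store + "/" + shard);
                }
            }
        }
        return result;
    }
}
```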

The default RelationshipRoles of a standard binary relationship are StartNode 
and EndNode, which both will have as default setting Must be in store. This 
way an implementation of such an approach remains backwards compatible. 

When combining RelationshipRoles into a RelationshipType, at least one 
RelationshipRole in the set must have a setting other than Must not be in 
store (a setting that is also implied by Must not be in shard). A combination 
in which every role is Must not be in store cannot be stored, since no store 
can contain any of the associated Nodes.

When creating a Relationship, a store adds that Relationship when at least one 
associated Node is actually present in the database.

When adding NodeTypes to the mix, the distribution of Nodes and Relationships 
over the various stores can even be further controlled. If we would know for 
each created Node if it must have an associated RelationshipRole, may have an 
associated RelationshipRole, or must not have an associated RelationshipRole, 
it becomes possible to decide if a Node Must be added to a store, may be added 
to a store, or must not be added to a store. For the may be added to a store 
cases, a Coordinator can decide where to store those particular Nodes.

Finally, this approach allows for distributed traversals. Traversals are always 
local; when a traversal branch hits upon a relationship association that is 
external to the store, that traversal will asynchronously be continued on that 
other store. When the traversal ends its

Re: [Neo4j] Keeping context information in the Graph

2011-08-06 Thread Niels Hoogeveen

What you describe here is a ternary edge, something I try to cover in the
Enhanced API.

Your film example can be modeled as follows:

There is an Edge STARS with the EdgeRoles: Actor, Film, Role.

We can now state:

STARS
-- Actor -- Brad Pitt
-- Film -- Fight club
-- Role -- Tyler Durden

STARS
-- Actor -- Edward Norton
-- Film -- Fight club
-- Role -- Fight club narrator, Tyler Durden

or we can state

STARS
-- Actor -- Brad Pitt, Edward Norton
-- Film -- Fight club
-- Role -- Tyler Durden

STARS
-- Actor -- Edward Norton
-- Film -- Fight club
-- Role -- Fight club narrator

or we can state

STARS
-- Actor -- Brad Pitt
-- Film -- Fight club
-- Role -- Tyler Durden

STARS
-- Actor -- Edward Norton
-- Film -- Fight club
-- Role -- Fight club narrator

STARS
-- Actor -- Edward Norton
-- Film -- Fight club
-- Role -- Tyler Durden
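
As a rough data-structure picture of such an edge, one can think of a map from role name to the set of vertices connected through that role. The sketch below uses plain strings for vertices and invented names (NaryEdgeSketch, connect, get); it is not how the Enhanced API stores edges, just an illustration of the three-role shape.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical n-ary edge: each role maps to the vertices connected through it. */
final class NaryEdgeSketch {
    final String type;
    private final Map<String, Set<String>> connections = new LinkedHashMap<>();

    NaryEdgeSketch(String type) {
        this.type = type;
    }

    /** Connects one or more vertices through the given role and returns this edge. */
    NaryEdgeSketch connect(String role, String... vertices) {
        connections.computeIfAbsent(role, r -> new LinkedHashSet<>())
                   .addAll(Arrays.asList(vertices));
        return this;
    }

    /** Returns the vertices connected through a role (empty if the role is unused). */
    Set<String> get(String role) {
        return connections.getOrDefault(role, Collections.emptySet());
    }
}
```

The second STARS edge of the first variant would then read: new NaryEdgeSketch("STARS").connect("Actor", "Edward Norton").connect("Film", "Fight club").connect("Role", "Fight club narrator", "Tyler Durden").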

Niels

Date: Sat, 6 Aug 2011 15:51:51 +0800
From: asianf...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Keeping context information in the Graph

This may be the same solution suggested by Dmitriy, but I had to visualise
it to understand the problem. The problematic solution on top, if I
understand it correctly; the proposed solution beneath it:
http://s3.amazonaws.com/neo4j/node_example.png

It's a more verbose graph, but it does model the semantics. This is all
very abstract, so let's make your example more concrete by naming the nodes
something other than letters that match to a real world example.

1. (A) Brad Pitt stars in (B) Fight Club in the role of (C) Tyler Durden.
2. (D) Edward Norton stars in (B) Fight Club in the roles of both (E) The
Narrator and [spoiler alert] (C) Tyler Durden

The creation of casting nodes F and G in the diagram may serve a practical
purpose later, for example if one was also modelling Pitt and Norton's
contract for accounting purposes, tracking media coverage of the casting
news, etc.

Stephen

On 6 August 2011 06:11, pankaj pankaj@gmail.com wrote:

Hi,

I have the following data modeling problem. Node A is related to Node B with
complex property C. I modeled it like
A-B-C. Now I have another node D related to B with complex properties C and
E. Now my graph looks like
D-B-C, A-B-C, and D-B-E. Storing it like this, I lose the information
that A was never related to B in the context of complex property E. How do I
model it?

Thanks
Pankaj


Re: [Neo4j] Node#getRelationshipTypes

2011-08-06 Thread Niels Hoogeveen

This is the thread about store layer changes for type/direction, and in my 
opinion this is still quite low hanging fruit. Sure, the impact needs to be 
tested rigorously, which may take considerable time, but the implementation is 
quite straight-forward and the potential gains are large.
Niels
 Date: Sat, 6 Aug 2011 22:16:15 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Node#getRelationshipTypes

 Oh, confused this thread with store layer changes for type/direction
 of relationships. This fruit in this thread is pretty low hanging.

 On Saturday, 6 August 2011, Mattias Persson matt...@neotechnology.com wrote:
  I would not consider this low hanging fruit btw

  On Wednesday, 3 August 2011, Niels Hoogeveen pd_aficion...@hotmail.com wrote:

  Hmmm... Does that require the inclusion of golden parachutes as well?
  Anyway, addressing the readers of this message that have time allocation 
  authority. I hope my suggestion, or another technical solution that 
  solves the same issues will be picked up for 1.5. This is as far as I can 
  tell pretty much low hanging fruit. There are probably all sorts of tweaks 
  that can improve the performance of Neo4j, but this one can improve the 
  performance of Neo4j big time (under certain conditions). As a user who is 
  confronted with several very densely connected nodes, I have tried all 
  sorts of means to solve my issues, but none as rewarding as a solution in 
  core would be.
  Niels
  Date: Wed, 3 Aug 2011 16:31:04 +0200
  From: matt...@neotechnology.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Node#getRelationshipTypes

  A golden helicopter might do the trick :)

  2011/8/3 Niels Hoogeveen pd_aficion...@hotmail.com

   How does one persuade the time allocation authorities?
   Niels

Date: Wed, 3 Aug 2011 09:28:45 +0200
From: matt...@neotechnology.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Node#getRelationshipTypes

Yup, it's a pretty sane approach and somewhat along the lines of how I feel
it would be done. It's been said for a long time that this functionality will
be implemented some day and it's just that a significant amount of time
has to be invested... maybe not for implementing it, but for discovering
all bugs and inconveniences to have it on par with production quality. And
that kind of time hasn't been allocated yet.

I appreciate your thoughts and time on all this!

Best,
Mattias

2011/8/3 Niels Hoogeveen pd_aficion...@hotmail.com

 I would like to make a suggestion that would both address my feature
 request and increase performance of the database.

 Right now the NodeRecord (org.neo4j.kernel.impl.nioneo.store.NodeRecord)
 contains the ID of the first Relationship, while the RelationshipRecord
 contains the IDs of the previous and next relationship for both sides of
 the relationship.

 My suggestion is as follows:

 Create a new store:

 noderelationshiptypestore.db

 The layout of this store is given by the NodeRelationshipTypeRecord:

 id
 previousrelationshiptype
 nextrelationshiptype
 firstrelationship

 The NodeRecord would now need to point to the first outgoing
 NodeRelationshipType and to the first incoming NodeRelationshipType instead
 of to the first Relationship.
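
 For readers who prefer to see the layout as data, the proposed record and the changed NodeRecord pointers could be pictured roughly as below. This is an in-memory sketch only: the class names, the field spellings and the NO_ID sentinel are assumptions made for illustration, and nothing here corresponds to an existing Neo4j store class.

```java
/** Hypothetical mirror of the proposed NodeRelationshipTypeRecord; not a Neo4j class. */
final class NodeRelationshipTypeRecordSketch {
    static final long NO_ID = -1L; // assumed sentinel meaning "no record"

    final long id;
    long previousRelationshipType = NO_ID; // previous type record in this node's chain
    long nextRelationshipType = NO_ID;     // next type record in this node's chain
    long firstRelationship = NO_ID;        // head of the relationship chain for this type

    NodeRelationshipTypeRecordSketch(long id) {
        this.id = id;
    }
}

/** Hypothetical NodeRecord after the change: two entry points instead of one. */
final class NodeRecordSketch {
    long firstOutgoingRelationshipType = NodeRelationshipTypeRecordSketch.NO_ID;
    long firstIncomingRelationshipType = NodeRelationshipTypeRecordSketch.NO_ID;
}
```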

 On insert of a Relationship, one side of the relationship will update the
 store from the outgoing side, the other side will update the store for the
 incoming side.

 I will list the steps to take here for the outgoing side (the incoming side
 is almost identical).

 From the NodeReco--
  Mattias Persson, [matt...@neotechnology.com]
  Hacker, Neo Technology
  www.neotechnology.com

 -- 
 Mattias Persson, [matt...@neotechnology.com]
 Hacker, Neo Technology
 www.neotechnology.com

Re: [Neo4j] Enhanced API rewrite

2011-08-06 Thread Niels Hoogeveen

Today I added fluency to the API design.

It is now possible to write:

Db().createVertex()
.setProperty(Name, John)
.setProperty(Age, 29)
.addEdgeTo(june, WIFE)
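
The chaining works because each mutator returns the vertex it was called on. The following stand-alone sketch shows that pattern with a made-up class (FluentVertexSketch); it keeps properties in a map and edges in a list of strings, which is an assumption for illustration and not how Enhanced API implements Vertex.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a Vertex; not the Enhanced API class. */
final class FluentVertexSketch {
    private final Map<String, Object> properties = new LinkedHashMap<>();
    private final List<String> edges = new ArrayList<>();

    /** Sets a property and returns this vertex so calls can be chained. */
    FluentVertexSketch setProperty(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    /** Records an edge of the given type to another vertex and returns this vertex. */
    FluentVertexSketch addEdgeTo(FluentVertexSketch other, String edgeType) {
        edges.add(edgeType + " -> " + other.getProperty("Name"));
        return this;
    }

    Object getProperty(String key) {
        return properties.get(key);
    }

    List<String> edges() {
        return edges;
    }
}
```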

I also added support for VertexTypes. A VertexType is nothing more and nothing
less than a Vertex with a unique name and a class name to initialize the
VertexType. Application programmers can decide for themselves how to implement
VertexTypes.

VertexTypes can be retrieved from a Vertex with the method Vertex#getTypes().

There are no facilities to retrieve the Vertices defined with a certain
VertexType. The connection between Vertex and VertexType is not stored as a
Relationship, but as a Long[] property on the Vertex containing the IDs of the
VertexTypes. This is to prevent the densely-connected-node-problem: each Vertex
will likely have few types, but each VertexType will likely have lots of
associated Vertices. If users want to know the Vertices of a VertexType they
can create an index for that (something that is outside the scope of Enhanced
API).

Edges all have at least one associated VertexType which is used for traversals.
An Edge can have more than one VertexType, but only the one added as EdgeType
(which extends VertexType) will be used for traversals.

Niels

From: pd_aficion...@hotmail.com
To: user@lists.neo4j.org
Date: Sat, 6 Aug 2011 02:51:23 +0200
Subject: [Neo4j] Enhanced API rewrite

Today I pushed a major rewrite of the Enhanced API. See:
https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb

Originally the Enhanced API was a drop-in replacement of the standard Neo4j
API. This resulted in lots of wrapper classes that needed to be maintained.

The rewrite of Enhanced API is no longer a drop-in replacement and contains
no interface/class names that can be found in the standard API.

Enhanced API no longer speaks of Nodes but of Vertices and doesn't speak of
Relationships but of Edges. This helps to prevent name clashes at the expense
of somewhat less recognizable names (Relationship is after all a more common
word than Edge).

This rewrite is not merely a renaming of classes and interfaces, but is in
most part a complete rewrite and also a rethinking of the API on my part.

Enhanced API consists of two basic elements: Vertex and EdgeRole. Most
elements are a subclass of Vertex, though there are some specialized versions
of EdgeRole.

Let me start with an example:

Suppose we have two vertices denoting the persons Tom and Paula, and we want
to state that Tom is the father of Paula.

For standard Neo4j we tend to write such a fact as:

Tom --Father-- Paula

For Enhanced API we can conceptually write this fact as follows:

       --StartRole--Tom
Father
       --EndRole----Paula

This should be read as follows: We have two Vertices: Tom and Paula and we
have a BinaryEdge (similar to a Relationship in the standard API) of type
Father, where Tom has the StartRole for that edge and Paula has the EndRole
for that edge.

So instead of a directed graph, we conceptually have an undirected bipartite
graph.

For binary edges (edges between two vertices), this is mostly conceptually
the case, because the API will simply allow you to write:
tom.createEdgeTo(paula, FATHER) (similar to tom.createRelationshipTo(paula,
FATHER) as we would have in the standard API).

It is also possible to fetch the start vertex of the binary relationship with
the method: edge.getStartVertex() (similar to relationship.getStartNode()),
although it is also possible to treat the binary edge as a generic edge and
fetch that Vertex as: edge.getElement(db.getStartRole()).

BinaryEdges are a special case and have special methods which cover the same
functionality as can be found in the standard Neo4j API.

In general, we can say that Vertices are connected to Edges by means of
EdgeRoles. In the binary case there are two predefined EdgeRoles: StartRole
and EndRole.

Before we get deeper into the general case of n-ary edges, let's first look
at another special case: Properties.

Properties can be thought of as unary edges, an edge that connects to only
one Vertex (as opposed to two in the binary case).

Suppose we want to state that Tom is 49 years old, we can write that as:

age(49)--PropertyRole--Tom

We have an edge of type age that is connected to the vertex Tom in the role
of a property.

Again this is mostly conceptually true, because there are lots of methods in
Enhanced API that are very similar to the ones found in the standard API:
getProperty, hasProperty, setProperty. Alternatively, we can call methods on
the property itself; after all, the age property connected to the Vertex
Tom is an object in its own right. More precisely, it is a Property, and with
that it is a UnaryEdge, which is an Edge, which is a Vertex.

From the age property we can fetch the PropertyType, but we

[Neo4j] Enhanced API rewrite

2011-08-05 Thread Niels Hoogeveen

Today I pushed a major rewrite of the Enhanced API. See:
https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb

Originally the Enhanced API was a drop-in replacement of the standard Neo4j
API. This resulted in lots of wrapper classes that needed to be maintained.

The rewrite of Enhanced API is no longer a drop-in replacement and contains no
interface/class names that can be found in the standard API.

This rewrite is not merely a renaming of classes and interfaces, but is in most
part a complete rewrite and also a rethinking of the API on my part.

Enhanced API consists of two basic elements: Vertex and EdgeRole. Most elements
are a subclass of Vertex, though there are some specialized versions of
EdgeRole.

Let me start with an example:

Suppose we have two vertices denoting the persons Tom and Paula, and we want to
state that Tom is the father of Paula.

For standard Neo4j we tend to write such a fact as:

Tom --Father-- Paula

For Enhanced API we can conceptually write this fact as follows:

       --StartRole--Tom
Father
       --EndRole----Paula

This should be read as follows: We have two Vertices: Tom and Paula and we have
a BinaryEdge (similar to a Relationship in the standard API) of type Father,
where Tom has the StartRole for that edge and Paula has the EndRole for that
edge.

So instead of a directed graph, we conceptually have an undirected bipartite
graph.

For binary edges (edges between two vertices), this is mostly conceptually the
case, because the API will simply allow you to write: tom.createEdgeTo(paula,
FATHER) (similar to tom.createRelationshipTo(paula, FATHER) as we would have in
the standard API).

BinaryEdges are a special case and have special methods which cover the same
functionality as can be found in the standard Neo4j API.

In general, we can say that Vertices are connected to Edges by means of
EdgeRoles. In the binary case there are two predefined EdgeRoles: StartRole and
EndRole.

Before we get deeper into the general case of n-ary edges, let's first look at
another special case: Properties.

Properties can be thought of as unary edges, an edge that connects to only one
Vertex (as opposed to two in the binary case).

Suppose we want to state that Tom is 49 years old, we can write that as:

age(49)--PropertyRole--Tom

We have an edge of type age that is connected to the vertex Tom in the role
of a property.

Again this is mostly conceptually true, because there are lots of methods in
Enhanced API that are very similar to the ones found in the standard API:
getProperty, hasProperty, setProperty. Alternatively, we can call methods on the
property itself; after all, the age property connected to the Vertex Tom is
an object in its own right. More precisely, it is a Property, and with that it
is a UnaryEdge, which is an Edge, which is a Vertex.

From the age property we can fetch the PropertyType, but we can also ask for
the Vertex it is connected to: getVertex(). Since a Property is an Edge we can
also fetch the connected vertex (Tom) as follows:
age.getElement(db.getPropertyRole()).

So we have seen the two special cases: unary edges and binary edges, which work
very much the same as properties and Relationships in the standard Neo4j API,
though we have given it a conceptually different perspective that unifies the
two and fits it neatly into the general case of N-ary edges.
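
The unification described so far can be sketched as a small class hierarchy.
This is only an illustration under assumptions: every name below (SketchVertex,
SketchEdge, SketchRole and so on) is invented for the sketch, and the real
Enhanced API interfaces may be shaped quite differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-ins only; these are not the Enhanced API types.
class SketchVertex {
    final String name;

    SketchVertex(String name) {
        this.name = name;
    }
}

// The roles through which an edge reaches a vertex.
enum SketchRole { START, END, PROPERTY }

// An edge is itself a vertex and connects to other vertices only via roles.
class SketchEdge extends SketchVertex {
    private final Map<SketchRole, SketchVertex> elements = new LinkedHashMap<>();

    SketchEdge(String type) {
        super(type);
    }

    SketchEdge connect(SketchRole role, SketchVertex vertex) {
        elements.put(role, vertex);
        return this;
    }

    // Generic access: which vertex plays the given role on this edge?
    SketchVertex getElement(SketchRole role) {
        return elements.get(role);
    }
}

// Binary case: a start role and an end role, plus convenience accessors.
class SketchBinaryEdge extends SketchEdge {
    SketchBinaryEdge(String type, SketchVertex start, SketchVertex end) {
        super(type);
        connect(SketchRole.START, start).connect(SketchRole.END, end);
    }

    SketchVertex getStartVertex() {
        return getElement(SketchRole.START);
    }

    SketchVertex getEndVertex() {
        return getElement(SketchRole.END);
    }
}

// Unary case: a property is an edge with a value, attached through the property role.
class SketchProperty extends SketchEdge {
    final Object value;

    SketchProperty(String type, Object value, SketchVertex owner) {
        super(type);
        this.value = value;
        connect(SketchRole.PROPERTY, owner);
    }

    SketchVertex getVertex() {
        return getElement(SketchRole.PROPERTY);
    }
}
```

In this sketch `new SketchProperty("age", 49, tom)` is at once a property, an edge and a vertex, and `getVertex()` is just shorthand for `getElement(SketchRole.PROPERTY)`.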

As said before, an Edge is a Vertex that connects other Vertices by means of
EdgeRoles. Since Edges are Vertices, they can have other Edges connected to
them. Or in standard API talk: relationships can be connected to other
relationships and they can have properties.

The concept of EdgeRoles separates Edges from Vertices, so we will effectively
have a bipartite graph where Vertices can only connect to Edges and Edges can
only connect to Vertices. Given the fact that Edges are also Vertices, Edges
can be connected to Edges, but in such a case it is unambiguous which plays the
role of Edge and which plays the role of Vertex in that connection.

Let's look at an example of an N-ary edge:

Suppose we want to state the fact that Tom gives Paula a Bicycle (no golden
helicopters in stock today). We can write that as follows:

      --Giver------Tom
GIVES --Recipient--Paula
      --Gift-------Bicycle

There is an EdgeType GIVES which defines three EdgeRoles: Giver, Recipient and
Gift, which

Re: [Neo4j] Batch find

2011-08-03 Thread Niels Hoogeveen


The batch insert is intended to push data into the database with having to do 
any look ups.
You could preprocess your input data so that it can be loaded in one go. You 
could, for example, read your input file against an existing database, fetch 
the IDs of the nodes and relationships that contain the information you need 
to update, and create two new input files: one containing data that can be 
inserted using the batch inserter, and one containing the information that 
needs to be updated (including the IDs of the PropertyContainers that need to 
be updated).
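
A rough sketch of that preprocessing step, assuming the words already in the
graph have been read into a map from word to node ID beforehand. InputSplitter
and its fields are hypothetical names used only for this example; the actual
insert and update passes are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: splits input words into "insert" and "update" sets.
final class InputSplitter {
    // Words not yet in the graph; these can go to the batch inserter.
    final List<String> toInsert = new ArrayList<>();
    // Node ID -> word for entries that already exist and need an update.
    final Map<Long, String> toUpdate = new LinkedHashMap<>();

    // existingIds is assumed to come from a read-only pass over the existing database.
    static InputSplitter split(List<String> words, Map<String, Long> existingIds) {
        InputSplitter result = new InputSplitter();
        for (String word : words) {
            Long id = existingIds.get(word);
            if (id == null) {
                // New word: keep a single copy for the insert file.
                if (!result.toInsert.contains(word)) {
                    result.toInsert.add(word);
                }
            } else {
                // Known word: remember which node has to be updated.
                result.toUpdate.put(id, word);
            }
        }
        return result;
    }
}
```

The insert list can then be written to the file for the batch inserter, and the update map to the file that is applied in a normal transactional pass.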
Niels


 Date: Wed, 3 Aug 2011 04:14:44 -0700
 From: ahmed.elshark...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Batch find
 
 I am trying to insert a document containing a list of words, and I want to
 check whether some of these words are already in my graph. In that case I
 will update their properties; otherwise I will create new nodes for the new
 words.
 

Re: [Neo4j] Batch find

2011-08-03 Thread Niels Hoogeveen

That should be without having to do any lookups

 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Wed, 3 Aug 2011 13:37:44 +0200
 Subject: Re: [Neo4j] Batch find

 The batch insert is intended to push data into the database with having to do 
 any look ups.
 You could preprocess your input data so that it can be loaded in one go. 
 You could, for example, read your input file against an existing database, 
 fetch the IDs of the nodes and relationships that contain the information 
 you need to update, and create two new input files: one containing data that 
 can be inserted using the batch inserter, and one containing the information 
 that needs to be updated (including the IDs of the PropertyContainers that 
 need to be updated).
 Niels

  Date: Wed, 3 Aug 2011 04:14:44 -0700
  From: ahmed.elshark...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Batch find

  I am trying to insert a document containing a list of words, and I want to
  check whether some of these words are already in my graph. In that case I
  will update their properties; otherwise I will create new nodes for the new
  words.


Re: [Neo4j] Node#getRelationshipTypes

2011-08-03 Thread Niels Hoogeveen


Hmmm... Does that require the inclusion of golden parachutes as well?
Anyway, addressing the readers of this message who have time allocation 
authority: I hope my suggestion, or another technical solution that solves the 
same issues, will be picked up for 1.5. This is, as far as I can tell, pretty 
much low-hanging fruit. There are probably all sorts of tweaks that can improve 
the performance of Neo4j, but this one can improve the performance of Neo4j big 
time (under certain conditions). As a user who is confronted with several very 
densely connected nodes, I have tried all sorts of means to solve my issues, 
but none as rewarding as a solution in core would be.
Niels
 Date: Wed, 3 Aug 2011 16:31:04 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Node#getRelationshipTypes
 
 A golden helicopter might do the trick :)
 
 2011/8/3 Niels Hoogeveen pd_aficion...@hotmail.com
 
 
  How does one persuade the time allocation authorities?
  Niels
 
   Date: Wed, 3 Aug 2011 09:28:45 +0200
   From: matt...@neotechnology.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] Node#getRelationshipTypes
  
   Yup, it's a pretty sane approach and somewhat along the lines of how I feel
   it would be done. It's been said a long time that this functionality will
   be implemented some day and it's just that a significant amount of time
   have to be invested... maybe not for implementing it, but for discovering
   all bugs and inconveniences to have it on par with production quality. And
   that kind of time haven't been allocated yet.
  
   I appreciate your thoughts and time on all this!
  
   Best,
   Mattias
  
   2011/8/3 Niels Hoogeveen pd_aficion...@hotmail.com
  
   
I would like to make a suggestion that would both address my feature
request and increase performance of the database.

Right now the NodeRecord (org.neo4j.kernel.impl.nioneo.store.NodeRecord)
contains the ID of the first Relationship, while the RelationshipRecord
contain the ID's of the previous and next relationship for both sides of the
relationship.

My suggestion is as follows:

Create a new store:

noderelationshiptypestore.db

The layout of this store is given by the NodeRelationshipTypeRecord:

id
previousrelationshiptype
nextrelationshiptype
firstrelationship

The NodeRecord would now need to point to the first outgoing
NodeRelationshipType and to the first incoming NodeRelationshipType instead
of to the first Relationship.

On insert of a Relationship, one side of the relationship will update the
store from the outgoing side, the other side will update the store for the
incoming side.

I will list the steps to take here for the outgoing side (the incoming side
is almost identical).

From the NodeRecord getFirstNodeRelationType (outgoing).

Keep following NextRelationshipType until the desired record is found. If
no record exists, create one, make the current FirstNodeRelationshipType in
the NodeRecord (if it exists) the NextRelationshipType of the created
NodeRelationshipType (and make the created one the previous of the current
one) and make the created NodeRelationshipType the FirstNodeRelationshipType
in the NodeRecord.

In other words: find the NodeRelationshipTypeRecord in the linked list. If
none exists, create a NodeRelationshipTypeRecord, prepend it to the existing
list and change the entry point in the NodeRecord.

We now have found the requested NodeRelationshipTypeRecord.

From NodeRelationshipTypeRecord getFirstRelationship.

Create a new RelationshipRecord and make it the FirstRelationship in the
NodeRelationshipTypeRecord.

Make the old first RelationshipRecord (if it exists) the nextRelationship
of the new first RelationshipRecord and make the new first
RelationshipRecord the previous of the old first RelationshipRecord.

In other words: prepend a new RelationshipRecord to the existing list of
Relationships and change the entry point in the NodeRelationshipTypeRecord.

Do the same for the incoming side (except for the creation of the
RelationshipRecord, we only need one of those).

Instead of a linked list of Relationships per Node we now have two linked
lists of RelationshipTypes per Node (one incoming, one outgoing), with a
linked list of Relationships per NodeRelationshipType.

With this approach only those Relationships need to be read that match the
RelationshipType and Direction given.
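
The steps above can be sketched in memory as follows. This is an illustration
of the proposed layout only, not Neo4j kernel code: the class names (NodeRec,
NodeRelTypeRecord, RelRecord) are invented, object references stand in for
record IDs, and only the outgoing side is modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a RelationshipRecord, chained per relationship type.
final class RelRecord {
    final long otherNode;
    RelRecord previous;
    RelRecord next;

    RelRecord(long otherNode) {
        this.otherNode = otherNode;
    }
}

// Stand-in for the proposed NodeRelationshipTypeRecord.
final class NodeRelTypeRecord {
    final String type;
    NodeRelTypeRecord previousRelationshipType;
    NodeRelTypeRecord nextRelationshipType;
    RelRecord firstRelationship;

    NodeRelTypeRecord(String type) {
        this.type = type;
    }
}

// Stand-in for a NodeRecord that points at a list of type records.
final class NodeRec {
    NodeRelTypeRecord firstOutgoingType;

    // Find the type record in the linked list; if none exists, prepend a new one.
    NodeRelTypeRecord findOrCreate(String type) {
        for (NodeRelTypeRecord r = firstOutgoingType; r != null; r = r.nextRelationshipType) {
            if (r.type.equals(type)) {
                return r;
            }
        }
        NodeRelTypeRecord created = new NodeRelTypeRecord(type);
        created.nextRelationshipType = firstOutgoingType;
        if (firstOutgoingType != null) {
            firstOutgoingType.previousRelationshipType = created;
        }
        firstOutgoingType = created;
        return created;
    }

    // Prepend a new relationship to the chain of its type record.
    RelRecord addOutgoing(String type, long otherNode) {
        NodeRelTypeRecord typeRecord = findOrCreate(type);
        RelRecord rel = new RelRecord(otherNode);
        rel.next = typeRecord.firstRelationship;
        if (typeRecord.firstRelationship != null) {
            typeRecord.firstRelationship.previous = rel;
        }
        typeRecord.firstRelationship = rel;
        return rel;
    }

    // Reads only the relationship chain of the requested type.
    List<Long> outgoing(String type) {
        List<Long> result = new ArrayList<>();
        for (NodeRelTypeRecord r = firstOutgoingType; r != null; r = r.nextRelationshipType) {
            if (r.type.equals(type)) {
                for (RelRecord rel = r.firstRelationship; rel != null; rel = rel.next) {
                    result.add(rel.otherNode);
                }
                break;
            }
        }
        return result;
    }

    // The relationship types present on this node, without touching any relationship.
    List<String> outgoingTypes() {
        List<String> types = new ArrayList<>();
        for (NodeRelTypeRecord r = firstOutgoingType; r != null; r = r.nextRelationshipType) {
            types.add(r.type);
        }
        return types;
    }
}
```

Because both lists are prepended to, the most recently added type and relationship come first. Listing the types only walks the short type list, which is what a getRelationshipTypes() method would need.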
   
Worst case this approach leads to an extra read operation per
RelationshipType:
   
Worst case example 1: Retrieve all Relationships, regardless of
RelationshipType or Direction. Here we have extra reads for all
NodeRelationshipType records. If the number of Relationships per

Re: [Neo4j] Memory overflow while creating big graph

2011-08-03 Thread Niels Hoogeveen


Is it possible for you to use the batch inserter, or does the data you are 
loading require a lot of lookups?
Niels

 From: jvcole...@gmail.com
 Date: Wed, 3 Aug 2011 17:57:20 -0300
 To: user@lists.neo4j.org
 Subject: [Neo4j] Memory overflow while creating big graph
 
 Hi,
 
 I'm trying to create a graph with 15M nodes and 12M relationships, but after
 inserting 400K relationships the following exception is thrown: Exception in
 thread main java.lang.OutOfMemoryError: GC overhead limit exceeded.
 
 I'm using -Xmx3g and the following configuration file for the graph:
 neostore.nodestore.db.mapped_memory = 256M
 neostore.relationshipstore.db.mapped_memory = 1G
 neostore.propertystore.db.mapped_memory = 90M
 neostore.propertystore.db.index.mapped_memory = 1M
 neostore.propertystore.db.index.keys.mapped_memory = 1M
 neostore.propertystore.db.strings.mapped_memory = 768M
 neostore.propertystore.db.arrays.mapped_memory = 130M
 cache_type = weak
 
 Can anyone help me?
 
 -- 
 Jose Vinicius Pimenta Coletto

Re: [Neo4j] graph weight scheme design advice

2011-08-03 Thread Niels Hoogeveen


Hi Boris,
What will be your decision procedure to determine which edges will be marked 
as heavy and which will be marked as light? Even if you establish a fixed 
ratio, you will still need to decide which relationships belong in one 
category and which belong in the other.
Could you elaborate a little more on your problem domain? 
Niels

 From: bo...@popcha.com
 Date: Mon, 1 Aug 2011 23:31:08 -0400
 To: user@lists.neo4j.org
 Subject: [Neo4j] graph weight scheme design advice
 
 Howdy Graphistas!
 
 I hope someone with graph modeling experience can help me with a pattern I'm
 working on.
 
 I have two kinds of edges that may connect nodes. One is very heavy,
 meaning that it has a high weight and if two nodes are connected by this
 relationship it is very important, but there are few of these. The other is
 the opposite: it is very light, but plentiful. Since there will always be
 many more of the light relationships than the heavy ones, what is the best
 way to represent these in the graph? I can set up a fixed ratio, like 5:1,
 so each of the light ones is .2 and each of the heavy ones is 1, but at this
 time I have no idea what that ratio should be because I don't know how large
 the data set is and how it is configured, so I was wondering if this is a
 known pattern and had some elegant representation. If this isn't a message
 board answer, maybe someone can point me at a paper?
 
 Many thanks!

Re: [Neo4j] Composable traversals

2011-08-02 Thread Niels Hoogeveen


It looks like this does the same as I suggested. It's a bit clunkier, but I 
understand you don't want to change the Node interface. OTOH, is there any 
reason not to extend the Node interface? After all, it is only one more 
extends. Since Nodes are all created in the neo4j-kernel component, there is no 
real reason to maintain strict binary backwards compatibility between versions, 
or do you expect people to have projects with two separate neo4j-kernel jars 
of different versions?
Niels

 Date: Tue, 2 Aug 2011 23:05:17 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Composable traversals
 
 Cool. To not mess around with interfaces too much I'm thinking of having:
 
 TraversalDescription#traverse( Node startNode, Node... additionalStartNodes );
 TraversalDescription#traverse( Path startPath, Path... additionalStartPaths );
 TraversalDescription#traverse( Iterable<Path> startPaths );
 
 that would be rather similar, wouldn't it?
 
 2011/7/30 Niels Hoogeveen pd_aficion...@hotmail.com
 
 
  I would be all for it if this could become part of 1.5.
  I am willing to put time into this.
   Date: Sat, 30 Jul 2011 11:33:01 +0200
   From: matt...@neotechnology.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] Composable traversals
  
   Yes, FYI that's the exact thing we've been discussing :)
  
   2011/7/29 Niels Hoogeveen pd_aficion...@hotmail.com
  
   
Great, I would much rather see this become part of the core API than have
this as part of the Enhanced API.
To make things work correctly, one important change to core is needed: the
Node interface needs to extend Traverser (the interface in
org.neo4j.graphdb.traversal, not the one in org.neo4j.graphdb).
This is actually not a big deal. The Traverser interface supports three
methods:
Iterator<Path> iterator() [return 1 path with 1 element in the path, being the node itself]
Iterable<Node> nodes() [return an iterable over the node itself]
Iterable<Relationship> relationships() [return an empty iterable]
With that addition, it's not all too difficult to enhance the current
implementation of Traverser. It only adds one more iteration level over the
current implementation. Instead of having one start node, we now have
multiple start paths. When returning values from the Traverser, the start
paths and the result paths need to be concatenated.
In the new scenario, all old traverse() methods can remain the same,
since Node becomes a Traverser, so those methods are just special cases
where Iterable<Path> consists of 1 path, with just 1 element.
Niels
 Date: Fri, 29 Jul 2011 18:36:28 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Composable traversals

 There have been thoughts a long while to make something like this with the
 traversal framework, but time has never been allocated to evolve it. I'm
 adding stuff to the framework in a side track and will surely add some
 aspect of composable traversers also.

 2011/7/29 Niels Hoogeveen pd_aficion...@hotmail.com

 
  I'd like to take a stab at implementing traversals in the Enhanced API. One
  of the things I'd like to do is to make traversals composable.

  Right now a Traverser is created by either calling the traverse method on
  Node, or by calling the traverse(Node) method on TraversalDescription.

  This makes traversals inherently non-composable, so we can't define a
  single traversal that returns the parents of all our friends.

  To make Traversers composable we need a function:

  Traverser traverse(Traverser, TraversalDescription)

  My take on it is to make Element (which is a superinterface of Node) into a
  Traverser.

  Traverser is basically another name for Iterable<Path>.

  Every Node (or more generally every Element) can be seen as an
  Iterable<Path>, returning a single Path, which contains a single
  path-element, the Node/Element itself.

  Composing traversals would entail the concatenation of the paths returned
  with the paths supplied, so when we ask for the parents of all our friends,
  the returned paths would take the form:

  Node --FRIEND-- Node -- PARENT -- Node
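
The composition idea can be sketched without any Neo4j types. In this
simplified model, which is an assumption made purely for illustration, a path
is a list of node names, a traverser is a list of paths, and a traversal
description is reduced to a function from a node to the nodes it reaches.
ComposableTraversal is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Simplified model: a path is a list of node names, a traverser is a list of paths.
final class ComposableTraversal {
    // Runs `step` from the end node of every start path and concatenates
    // each start path with the result found from it.
    static List<List<String>> traverse(List<List<String>> startPaths,
                                       Function<String, List<String>> step) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> start : startPaths) {
            String end = start.get(start.size() - 1);
            for (String next : step.apply(end)) {
                List<String> extended = new ArrayList<>(start);
                extended.add(next);
                result.add(extended);
            }
        }
        return result;
    }

    // A single node seen as a traverser: one path containing just that node.
    static List<List<String>> asTraverser(String node) {
        List<List<String>> paths = new ArrayList<>();
        paths.add(new ArrayList<>(Arrays.asList(node)));
        return paths;
    }
}
```

Since the output of `traverse` has the same shape as its input, the "parents of all our friends" query is just `traverse(traverse(asTraverser(me), friends), parents)`, and each returned path has the three-node form shown above.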
 
  Niels
 

Re: [Neo4j] Node#getRelationshipTypes

2011-08-02 Thread Niels Hoogeveen

 the performance may decrease by at most a factor 2, while the 
performance may increase by orders of magnitude in some quite common use cases. 
On top of that, we can also present the meta information I requested, because 
we can simply iterate over the NodeRelationshipType list and return the entries 
to the user.

Finally, this proposal makes it possible to guarantee functional, surjective 
and one-to-one Relationships. Due to the partitioning we will know if there 
already is a relationship of a certain type. If a relationship is stated to be 
functional, surjective, or one-to-one, we can raise an exception when a second 
relationship is about to be created for that particular NodeRelationshipType.
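
As a rough illustration of the constraint check, here is a sketch that only
covers the "functional" case (at most one outgoing relationship of a given
type per node). TypedOutgoing is a hypothetical class; a map of lists stands in
for the per-type relationship chains of the proposal, and the choice of
IllegalStateException is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical per-node view of outgoing relationships, partitioned by type.
final class TypedOutgoing {
    private final Map<String, List<Long>> byType = new HashMap<>();
    private final Set<String> functionalTypes = new HashSet<>();

    // Declares that a node may have at most one relationship of this type.
    void declareFunctional(String type) {
        functionalTypes.add(type);
    }

    // Because relationships are partitioned by type, the check only has to
    // look at one chain instead of scanning every relationship on the node.
    void add(String type, long target) {
        List<Long> chain = byType.computeIfAbsent(type, t -> new ArrayList<>());
        if (functionalTypes.contains(type) && !chain.isEmpty()) {
            throw new IllegalStateException("second relationship of functional type " + type);
        }
        chain.add(target);
    }

    int count(String type) {
        List<Long> chain = byType.get(type);
        return chain == null ? 0 : chain.size();
    }
}
```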

Niels



 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Tue, 2 Aug 2011 23:03:41 +0200
 Subject: Re: [Neo4j] Node#getRelationshipTypes
 
 
 Building an API on top of Neo4j of course pushes the standard API to its 
 limits. So for that matter it is already a good exercise.
 Any chance this feature request will find its way into 1.5?
 Niels
 
  Date: Tue, 2 Aug 2011 22:33:03 +0200
  From: matt...@neotechnology.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Node#getRelationshipTypes
  
  Those methods will of course be more efficient if implemented in the kernel
  compared to iterating through all relationships if the whole relationship
  chain have already been loaded for that node, otherwise it will require a
  full iteration (or at least making sure the whole chain have been loaded).
  I've never found a use case for it myself and this is the first I've heard.
  
  2011/8/1 Niels Hoogeveen pd_aficion...@hotmail.com
  
  
   I have two specific use cases for these methods:
   I'd like to present a node with the property types (names) it has content
   for and with the relationship types it has relationships for, while loading
   those properties/relationships on demand (ie. click here to see details).
   This can be done for properties: there is a getPropertyKeys() method, but
   there is no getRelationshipTypes() method.
   The other use case has to do with the Enhanced API. There I want to have
   pluggable relationships and properties. With respect to relationships there
   are already three implementations: the regular Relationship,
   SortedRelationships (which use an in-graph Btree for storage) and
   HyperRelationships, which allow n-ary relationships.
   Every Element in Enhanced API has a getRelationships() method, much like
   the getRelationships() method in Node, which should return every
   relationship attached to an Element, irrespective of its implementation.
   Right now the Element implementation has to perform the logic to distinguish
   which relationship is used for what implementation (under the hood it all
   works using normal Relationships). It would be much more elegant to iterate
   over the RelationshipTypes and dispatch the getRelationships() method to the
   appropriate RelationshipType implementations. That way the logic for
   SortedRelationships and HyperRelationships remains in their associated
   classes and is not spread around the implementation.
  
   Niels
From: michael.hun...@neotechnology.com
Date: Sun, 31 Jul 2011 23:20:50 +0200
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Node#getRelationshipTypes
   
Imho it would have to iterate as well.
   
As the type is stored with the relationship record, it can only be
accessed after having read that record.
   
There might be some minimal performance improvement in that the
relationships would not have to be fully loaded, nor put into the cache, for
that. But this is always a question of the use case: what will be done next
with those rel-types?
   
What was the use-case for this operation again?
   
Cheers
   
Michael
   
Am 31.07.2011 um 18:59 schrieb Niels Hoogeveen:
   

 Good point.
 It could for all practical purposes even be Iterable<RelationshipType>,
 so they can be lazily fetched, as long as the underlying implementation
 makes certain that any iteration of the RelationshipTypes forms a set (no
 duplicates).
 There is no need to have RelationshipTypes in any particular order, and
 if that is needed in the application, they can usually be sorted locally
 since Nodes will generally have associated Relationships of only a handful
 of RelationshipTypes.

 That said, the more important question is whether the Neo4j store can
 produce this meta-information. For sparsely connected nodes, it is possible
 to iterate over the relationships and return the set of RelationshipTypes,
 but this is not a proper solution when nodes are densely connected. So there
 is no general solution for this question yet.
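
For reference, the iterate-and-collect approach mentioned above amounts to
something like the sketch below. Relationships are reduced to their type names
here, and RelationshipTypeScan is a hypothetical helper, not part of the Neo4j
API. The point is that it has to visit every relationship of the node, which is
exactly why it does not suit densely connected nodes.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: derive the set of relationship types by a full scan.
final class RelationshipTypeScan {
    // `relationshipTypeNames` yields one type name per relationship on the node.
    // The cost is linear in the number of relationships, not in the number of types.
    static Set<String> relationshipTypes(Iterable<String> relationshipTypeNames) {
        Set<String> types = new LinkedHashSet<>();
        for (String typeName : relationshipTypeNames) {
            types.add(typeName); // the set drops duplicates
        }
        return types;
    }
}
```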
 Niels

 From: j...@neotechnology.com
 Date: Sun, 31 Jul 2011 17:29:29 +0100
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Node#getRelationshipTypes

[Neo4j] HyperEdges unify Relationships and Properties

2011-08-01 Thread Niels Hoogeveen


Over the last couple of days I have worked on improving the Enhanced API and 
made some progress unifying Properties and Relationships.

For some time, I have wanted to have a traverser which I can set up so that it 
returns a collection of properties. After all what we want to present in an 
application is not a node, but a set of properties on a node (or relationship). 
Both the node and the relationship are ultimately containers and only 
interesting for computational reasons.

Of course it is possible to unify Relationships and Properties, after all they 
are both addressed by name (albeit in the Relationship case dressed up as a 
RelationshipType).

Introducing HyperEdges (formerly named HyperRelationships) creates the right 
framework to unify Relationships and Properties.

The standard API provides support for binary edges, relating 2 Nodes by means 
of a RelationshipType (label). 

The constructor of such a binary relationship can be thought of as:

Edge(EdgeType, Node, Node)

The Enhanced API generalizes this to n-ary Edges, where the binary (2-ary) edge 
is just a special case. 

So for the n-ary case we get the constructor:

Edge(EdgeType, Node...)

This is implemented and works well for all n > 2. It also works for n = 2, 
because then we simply wrap the standard API and make direct calls to normal 
Relationships.

This leaves us with two special cases n = 0 and n = 1.

The n = 0 case could be thought of as the EdgeType itself, after all for case n 
= 0, the constructor of the edge reduces to:

Edge(EdgeType)

The n = 1 case brings us to the reason of this post, because for that case, the 
constructor of the edge is:

Edge(EdgeType, Node)

This looks strikingly similar to the constructor of a property, which would 
take the form:

Property(PropertyType, Node)

All we need to do to unify Properties and Relationships within the HyperEdge 
framework is to state that PropertyType is a subtype of EdgeType and that 
Property is a subtype of Edge.
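
In Java terms the constructors above map naturally onto a varargs
constructor. The sketch below is an illustration only: HyperEdgeSketch and
PropertySketch are made-up names, nodes and types are plain strings, and the
extra value parameter on the property is an assumption added so the property
has something to hold.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical n-ary edge: Edge(EdgeType, Node...).
class HyperEdgeSketch {
    final String edgeType;
    final List<String> nodes;

    HyperEdgeSketch(String edgeType, String... nodes) {
        this.edgeType = edgeType;
        this.nodes = Collections.unmodifiableList(Arrays.asList(nodes));
    }

    // Number of nodes the edge connects: 0, 1, 2 or more.
    int arity() {
        return nodes.size();
    }
}

// The n = 1 case: a property is an edge attached to exactly one node.
class PropertySketch extends HyperEdgeSketch {
    final Object value; // assumed extra field, not part of the constructor in the post

    PropertySketch(String propertyType, Object value, String node) {
        super(propertyType, node);
        this.value = value;
    }
}
```

Here `new HyperEdgeSketch("GIVES", "Tom", "Paula", "Bicycle")` has arity 3, `new HyperEdgeSketch("KNOWS")` is the n = 0 case, and every PropertySketch is a HyperEdgeSketch of arity 1.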

How all this relates to the transformation of a directed graph into an 
undirected bipartite graph will likely be subject of another post.

Niels
  

Re: [Neo4j] Brainstorming on my project: neo4john

2011-07-31 Thread Niels Hoogeveen


Hi John,

I think when approaching a project there are two distinct issues at play, one 
is the tooling level, 
another is the actual solution you are trying to create for an actual problem.

When looking at the tooling level it is great to have as much covered as 
possible. 
Neo4j offers a graph database and pretty good integration with Lucene. 
This overall is a good choice of tools, because there is hardly any overlapping 
functionality. 
Neo4j offers storage and navigation, while Lucene provides indexing. So the 
tools are pretty much orthogonal to each other.

When adding BDB to the mix, things become a bit messier. BDB offers indexing 
and storage, 
so now you have to decide what to use BDB for. If you choose to only use it for 
indexing, 
like an alternative to Lucene, things remain pretty much orthogonal. 
When you decide to use BDB for storage, the question becomes: what to store in 
Neo4j and what to store in BDB. 

When it comes to storing and retrieving properties to entities both seem to be 
pretty fast, and unless you have serious performance issues with the storage of 
properties, either Neo4j or BDB is suitable for the task.

When it comes to storing relationships between entities, Neo4j is by far the 
better solution. Fetching a relationship is a really cheap action, since it 
only involves moving a file pointer to a certain position (id * record 
length) and reading the record (i.e. if that data is not available in the cache 
already). 
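
To make the "id * record length" arithmetic concrete, here is a small sketch
using plain java.io. The class name RecordStoreSketch and the 16-byte
RECORD_LENGTH are assumptions for illustration only; the real Neo4j record
layout is not described in this thread.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

final class RecordStoreSketch {
    // Hypothetical fixed record size in bytes, chosen only for this example.
    static final int RECORD_LENGTH = 16;

    // With fixed-size records, the position of a record follows directly from its id.
    static long offsetOf(long id) {
        return id * RECORD_LENGTH;
    }

    // One seek and one read: no search is needed to find the record.
    static byte[] readRecord(RandomAccessFile store, long id) throws IOException {
        byte[] record = new byte[RECORD_LENGTH];
        store.seek(offsetOf(id));
        store.readFully(record);
        return record;
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("records", ".db");
        file.deleteOnExit();
        try (RandomAccessFile store = new RandomAccessFile(file, "rw")) {
            // Write four records whose first byte is their own id.
            for (int id = 0; id < 4; id++) {
                byte[] record = new byte[RECORD_LENGTH];
                record[0] = (byte) id;
                store.write(record);
            }
            byte[] third = readRecord(store, 2);
            System.out.println("first byte of record 2: " + third[0]);
        }
    }
}
```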

Once you have a relationship, it is also cheap to fetch the associated nodes 
(again moving a file pointer to a position, or reading from the cache). And 
while we are at it, once you have a node or a relationship, it is again cheap to 
fetch the properties associated with it.

The motto of Neo4j seems to be: keep it local, stupid. 

This works great, unless things are not local and this is where indexing comes 
into play. 

Suppose we know a name or a certain value and want to know what nodes or 
relationships it is associated with: a local search then becomes ineffective. 
We could iterate over all nodes (and/or all relationships) and check for that 
particular value, but that doesn't scale beyond a couple of thousand nodes or 
relationships. 

One option could be to do the indexing in the graph. We could create a node, 
easily reachable from the reference node, that functions as a tree root, and 
traverse the index to find a particular node or relationship.

It works, but is not as fast as dedicated indexing. A dedicated index will 
fetch index blocks in one read operation and manipulate those index blocks in 
memory, where an index built in Neo4j would model an index block as a set of 
nodes that need to be read one after another (and likely from very different 
places in the store). So a dedicated index is more local than Neo4j can be when 
manipulating the index trees. 

A dedicated index will beat Neo4j hands down when it comes to the raw speed of 
an index lookup/manipulation, and will likely consume less memory doing so. 

Neo4j already supports Lucene, which is great for certain jobs (full text 
indexing, composite queries), but is probably (I would have to run tests to 
verify this assumption) slower than BDB when it comes to simple key-value 
mappings. Lucene is also not very good at handling uniqueness constraints, an 
area where a more regular key-value store like BDB has an advantage too.

All this is just about the tooling level of an application (fun in its own 
right, but it doesn't solve any real problems). Things become more interesting 
when we start looking at an actual application. 

So my question is: what use cases do you want to solve with your neo4john 
project?

Your example with buttons on a screen is a bit too high level, because it 
contains a lot more tooling than just Neo4j and/or BDB. You would need 
presentation (GUI or HTML) and reactiveness (how to respond to input) and you 
would need to somehow model your domain. 

So my suggestion would be to first list a couple of real world scenarios you 
want to solve with your neo4john project and then look at your tooling to see 
what trade-offs you need to make to implement it. You may need a mix of Neo4j, 
Lucene and BDB, but maybe you don't need all three to solve your particular 
problem. 

In any case, it's important to rise above the tooling level, because that is 
only a means to a goal. Even if your project provides additional tooling, there 
is still an application level to it. Focusing on the application level is good 
practice, because only there do you actually provide solutions.

Niels


 Date: Sun, 31 Jul 2011 15:09:20 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: [Neo4j] Brainstorming on my project: neo4john
 
 Hey guys,
 
 I've been thinking that I would like to have a topic (like this current one)
 where I would be allowed to post anything related to brainstorming on my
 project which is currently a mix of neo4j and berkeleydb java

Re: [Neo4j] Node#getRelationshipTypes

2011-07-31 Thread Niels Hoogeveen


Good point. 
It could for all practical purposes even be Iterable<RelationshipType> so they 
can be lazily fetched, as long as the underlying implementation makes certain 
that any iteration of the RelationshipTypes forms a set (no duplicates).
There is no need to have RelationshipTypes in any particular order, and if that 
is needed in the application, they can usually be sorted locally since Nodes 
will generally have associated Relationships of only a handful of 
RelationshipTypes. 

That said, the more important question is whether the Neo4j store can produce 
this meta-information. For sparsely connected nodes, it is possible to iterate 
over the relationships and return the set of RelationshipTypes, but this is not 
a proper solution when nodes are densely connected. So there is no general 
solution for this question yet. 
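
The iterate-and-collect approach for sparsely connected nodes could look
roughly like the sketch below. RelationshipTypesSketch and its nested Rel
class are hypothetical stand-ins (relationship types are reduced to plain
strings); nothing here is part of the Neo4j API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RelationshipTypesSketch {
    // Stand-in for a relationship; only its type name matters here.
    static final class Rel {
        final String type;
        Rel(String type) { this.type = type; }
    }

    // Visits every relationship of the node once, so the cost grows with the
    // node's degree. That is acceptable for sparse nodes, not for dense ones.
    static Set<String> relationshipTypes(Iterable<Rel> relationshipsOfNode) {
        Set<String> types = new LinkedHashSet<>();
        for (Rel rel : relationshipsOfNode) {
            types.add(rel.type); // a Set drops duplicates for us
        }
        return types;
    }

    public static void main(String[] args) {
        List<Rel> rels = Arrays.asList(new Rel("FRIEND"), new Rel("KNOWS"), new Rel("FRIEND"));
        System.out.println(relationshipTypes(rels));
    }
}
```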
Niels

 From: j...@neotechnology.com
 Date: Sun, 31 Jul 2011 17:29:29 +0100
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Node#getRelationshipTypes
 
 Hi Niels,
 
 Ignoring the operational use for getting relationship types, I do think these 
 should be generalised from:
 
  RelationshipType[] getRelationshipTypes();
  RelationshipType[] getRelationshipTypes(Direction);
 
 to:
 
 Set<RelationshipType> getRelationshipTypes();
 Set<RelationshipType> getRelationshipTypes(Direction);
 
 Unless you need the ordering and you think the overhead of creating some 
 kind of Set is too onerous from a performance point of view.
 
 Jim

Re: [Neo4j] Brainstorming on my project: neo4john

2011-07-31 Thread Niels Hoogeveen


Interesting thought, and it is certainly true that indexing is much less of a 
concern in a graph database than in a normal RDBMS where generally every table 
needs to have a primary key and where you need to have an index on the primary 
key to be able to do joins (at least to do them somewhat quickly). 

In a graph database relationships are explicit and static, while in an RDBMS 
inter-table relationships are implicit and dynamic.

This distinction means that an RDBMS can answer some ad-hoc relationship 
questions that would be impractical to answer in a graph database.

For example, in an RDBMS I can ask for a join over the Persons table and the 
Country table and return the Person_ID and the Country_ID if the country code 
is contained in the last name of the person.

In a graph database asking that same question is not that easy, unless of 
course we have explicitly created relationships from Person nodes to Country 
nodes if the country code is contained in the last name of the person 
(unlikely). 

Being able to find relationships in an implicit and dynamic way has of course a 
performance penalty. After all it's much cheaper to follow a file pointer than 
having to lookup a value in an index (or worse do a full table scan).

That said, there are situations where we need to jump to another position in 
the graph. One way is through the use of id's, which is a very cheap non-local 
jump. The other is through indexes, which can come in two variations, in-graph 
(using a traversal to mimic a non-local jump), or through an external index 
service.

In-graph indexes can work really well, but are not as optimized for the task as 
dedicated index services are. The main reason is that dedicated index services 
can map index blocks to memory, while Neo4j is much more fine-grained, having 
to load the content of an index block node by node and relationship by 
relationship. This means that in-graph indexes don't scale all that 
well, especially when they grow larger than the allocated memory. On a 
cache miss, a dedicated index service can swap out a couple of index blocks 
where Neo4j needs to swap out individual nodes and relationships. If index 
blocks are needed again, a dedicated index service can simply load those blocks 
in one read operation, while an in-graph index would have to reload those 
individual nodes and relationships one at a time.
Niels

 From: j...@neotechnology.com
 Date: Sun, 31 Jul 2011 17:27:33 +0100
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Brainstorming on my project: neo4john
 
 Hi John, Niels,
 
 I think of indexes in Neo4j as long-lived names. Not quite the keep it 
 local that Niels mentioned, but not entirely dissimilar either. 
 
 Those long lived-names tend to give you starting points in the graph from 
 where you perform graph operations. Indexing therefore constitutes less of 
 your database design than it would in a RDBMS.
 
 Marko had a good line about this: graphs are adjacency free indexes (or words 
 to that effect). 
 
 Jim

Re: [Neo4j] Brainstorming on my project: neo4john

2011-07-31 Thread Niels Hoogeveen


Aiming to be as generic as possible can be good, but at some point you need to 
be specific too.
You mention Java and Eclipse as being generic, but they are only to a point.
When Java was introduced some of its main features were platform independence, 
static type checking, garbage collection/managed memory and checked exceptions.
Those were deliberate design decisions making Java a very specific sort of 
language, making it suitable for certain types of applications and less 
suitable for other types of applications. Java is very suitable for large 
applications that need modularisation, but it's not that great for ad-hoc 
scripting.
The same is true for Eclipse. It is a great platform to build an IDE in, but 
would be overkill for a simple game of tic-tac-toe.
Creating something that is completely generic has the downside that it actually 
becomes bad at doing something specific.
Another downside to being completely generic is that it doesn't provide people 
with clues what it can do. This is most noticeable in the programming language 
LISP, which is so generic that every construct looks like every other 
construct, giving no visual clues to what the program is actually doing. It's a 
wonderful language where you can do the most amazing dynamic magic in only a 
few lines of code (with lots of parentheses), but has always been a niche 
language, because it doesn't offer programmers concrete clues about what you 
can do with it.
Another language from that same era, COBOL, took the opposite approach and very 
explicitly made every feature available at the language level. This made COBOL 
a very special purpose language. 
At the time, for application programmers COBOL was an easy choice, because it 
offered many of the features needed for the applications. Much of that could be 
achieved in LISP too, but it never made those features explicit, so no one ever 
considered writing a business app in LISP.
That said, the demise of COBOL came because of a changing environment, and its 
specific strengths eventually became weaknesses. Still, the changing 
environment didn't make LISP a winner, instead a language like PHP became 
hugely successful, because it focused on doing one thing well: the creation of 
HTML pages.
So my point is, when you want to create something, try to have a concrete 
vision of what you want it to do. 
Now as to the discussion of what is tooling level and what is application 
level. 
I think that, as a rule of thumb, you can say that tools can be replaced by 
something else without functionally changing the application, while you cannot 
replace part of the application with something else without functionally 
changing the application.
Let's look at Neo4j. For me as an application programmer, it is a tool. I could 
in principle swap Neo4j out and replace it with another storage engine. I would 
probably take a performance hit in some areas doing so, but functionally my 
application could very much remain the same.
For Neo Tech on the other hand, Neo4j is an application. There the tooling 
level consists of things like Maven and the Java NIO API. In principle the 
tooling could be replaced. Instead of Maven, ANT scripts could be used to do 
the build and instead of the NIO API, the old fashioned IO API could be used. 
There would be a huge performance penalty swapping out NIO for IO, but 
functionally Neo4j could remain the same, only much slower. Yet it is not 
possible to remove the Node API and replace it with something else without 
changing the functionality of the application.
So the question remains what functionality you want to provide with your 
neo4john project. You could think of a storage API that is independent of the 
storage engine used. So you could swap out Neo4j and replace it with BDB, and 
vice versa. If you do that, ask yourself who would be interested in that and 
what purpose does it serve? What are the benefits of replacing one storage 
engine with the other?
When I started working on the Enhanced API, I had some concrete goals in mind 
which I wanted to solve:
1.) Make every element of the database reifiable, so they can all be used as 
first-class citizens.
2.) Provide a pluggable architecture for properties and relationships.
Both these goals make the Enhanced API more general than the standard API, but 
this is a result of the goals and not a goal in and of itself.

 

 Date: Sun, 31 Jul 2011 19:45:50 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Brainstorming on my project: neo4john
 
 Hey Niels, thanks for the concise reply.
 
 On Sun, Jul 31, 2011 at 5:10 PM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:
 
 
  Hi John,
 
  I think when approaching a project there are two distinct issues at play,
  one is the tooling level,
  another is the actual solution you are trying to create for an actual
  problem.
 
  I seem to want a generic solution for multiple problems.
 Something generic enough that it can be applied

Re: [Neo4j] Node#getRelationshipTypes

2011-07-31 Thread Niels Hoogeveen


I have two specific use cases for these methods:
I'd like to present a node with the property types (names) it has content for 
and with the relationship types it has relationships for, while loading those 
properties/relationships on demand (ie. click here to see details).
This can be done for properties: there is a getPropertyKeys() method, but there 
is no getRelationshipTypes() method.
The other use case has to do with the Enhanced API. There I want to have 
pluggable relationships and properties. With respect to relationships there are 
already three implementations: the regular Relationship, SortedRelations (which 
use an in-graph Btree for storage) and HyperRelationships which allow n-ary 
relationships.
Every Element in Enhanced API has a getRelationships() method, much like the 
getRelationships() method in Node, which should return every relationship 
attached to an Element, irrespective of its implementation. Right now the 
Element implementation has to perform the logic to distinguish which 
relationship is used for what implementation (under the hood it all works using 
normal Relationships). It would be much more elegant to iterate over the 
RelationshipTypes and dispatch the getRelationships() method to the appropriate 
RelationshipType implementations. That way the logic for SortedRelationships and 
HyperRelationships remains in their associated classes and is not spread around 
the implementation. 
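
As a rough illustration of dispatching on the relationship type, consider the
sketch below. Everything in it is hypothetical: DispatchSketch, the
RelationshipHandler interface and the string-based types and ids are
simplifications and do not reflect how the Enhanced API is actually written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DispatchSketch {
    // Hypothetical plug-in point: one handler per kind of relationship implementation.
    interface RelationshipHandler {
        List<String> getRelationships(String elementId);
    }

    private final Map<String, RelationshipHandler> handlersByType = new LinkedHashMap<>();

    void register(String relationshipType, RelationshipHandler handler) {
        handlersByType.put(relationshipType, handler);
    }

    // Iterates over the types present on the element and lets each handler do
    // its own work, so no implementation-specific logic lives in this class.
    List<String> getRelationships(String elementId, Iterable<String> typesOnElement) {
        List<String> result = new ArrayList<>();
        for (String type : typesOnElement) {
            RelationshipHandler handler = handlersByType.get(type);
            if (handler != null) {
                result.addAll(handler.getRelationships(elementId));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        DispatchSketch dispatcher = new DispatchSketch();
        dispatcher.register("SORTED", id -> Arrays.asList(id + ":sorted-1", id + ":sorted-2"));
        dispatcher.register("HYPER", id -> Collections.singletonList(id + ":hyper-1"));
        System.out.println(dispatcher.getRelationships("n1", Arrays.asList("HYPER", "SORTED")));
    }
}
```

Note that the second argument, the types present on the element, is exactly
the information a getRelationshipTypes() method would have to supply.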

Niels
 From: michael.hun...@neotechnology.com
 Date: Sun, 31 Jul 2011 23:20:50 +0200
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Node#getRelationshipTypes
 
 Imho it would have to iterate as well.
 
 As the type is stored with the relationship record and so can only be 
 accessed after having read it.
 
 It might be possible to gain some minimal performance improvement, in that 
 relationships would not have to be fully loaded or put into the cache for 
 that. But this is always a question of the use-case: what will be done next 
 with those rel-types?
 
 What was the use-case for this operation again?
 
 Cheers
 
 Michael
 

Re: [Neo4j] bdb-index

2011-07-30 Thread Niels Hoogeveen

I use the download option on GitHub, expand the zip in a directory, and run mvn 
install in that directory without any problems.
Niels

 Date: Sat, 30 Jul 2011 13:39:15 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] bdb-index

 When running the mvn install, both tests are run one after another.
 Since I didn't use mvn (xD) I ran the tests manually one by one, but what
 you say makes sense, it's likely the tests fail when run one after the
 other, I'll see what happens with an @Suite
 since there are only 2 junit tests, with @Suite they work
 Let's see if I could run mvn install (btw, avoided mvn so far because I
 cannot install the git plugin for some reason and that other error I get)
 Looks like I still need to find out how to fix this error:
 [ERROR]   The project org.neo4j:neo4j-berkeleydb-je-index:0.1-SNAPSHOT
 (E:\wrkspc\bdb-index-fork\pom.xml) has 1 error
 [ERROR] Non-resolvable parent POM: The repository system is offline but
 the artifact org.neo4j:parent-central:pom:18 is not available in the local
 repository. and 'parent.relativePath' points at wrong local POM @ line 3,
 column 11 - [Help 2]

 before I could do anything with maven...
 I'll skip trying to make maven to work for me for now, don't feel like it :)

 *I'm not qualified to fix this with maven, sorry*
 John

 On Fri, Jul 29, 2011 at 5:16 PM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:

  Hi John,
  Thanks for looking into this.
  I am still seeing the same error I had before. When running the mvn
  install, both tests are run one after another. For some reason the transaction
  log sees an unclean shutdown and tries to commit pending transactions.
  During that process the index names of the bdb indexes are being retrieved
  from binary storage. Here something goes wrong, because the index name
  returned is garbage, so the recovery process fails because it can't find the
  right index files.
  Niels

   Date: Fri, 29 Jul 2011 07:48:43 +0200
   From: cyuczie...@gmail.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] bdb-index

   I forked and fixed, the tests are all working now:
   https://github.com/13th-floor/bdb-index
   Let me know if you want me to do a pull request, ... sadly I applied
   formatting on RawBDBSpeed and the diff doesn't look pretty if you're
  trying
   to see what changed

   John.

   On Thu, Jul 28, 2011 at 7:36 PM, Niels Hoogeveen
   pd_aficion...@hotmail.com wrote:

Trying to find something useful to hide the implementation bookkeeping of
Enhanced API, I tried out bdb-index, which can be found here:
https://github.com/peterneubauer/bdb-index
It looks interesting, but fails its tests. When recovering it performs
BerkeleyDbCommand#readCommand from the log. The retrieved indexName is
  not
actually garbage. I would like to help make this component workable,
  but
area of the database is a bit beyond the scope that I know.
I know this is completely unsupported software, but can someone give me
some pointers on how to fix this issue?
Niels

Re: [Neo4j] bdb-index

2011-07-30 Thread Niels Hoogeveen

 licenses...
  [INFO]
  [INFO] --- maven-resources-plugin:2.4.3:resources (default-resources) @
  neo4j-berkeleydb-je-index ---
  [WARNING] The POM for org.apache.maven:maven-plugin-api:jar:2.0.6 is
  missing, no dependency information available
  [WARNING] The POM for org.apache.maven:maven-project:jar:2.0.6 is
  missing, no dependency information available
  [WARNING] The POM for org.apache.maven:maven-core:jar:2.0.6 is missing,
  no dependency information available
  [WARNING] The POM for org.apache.maven:maven-artifact:jar:2.0.6 is
  missing, no dependency information available
  [WARNING] The POM for org.apache.maven:maven-settings:jar:2.0.6 is
  missing, no dependency information available
  [WARNING] The POM for org.apache.maven:maven-model:jar:2.0.6 is missing,
  no dependency information available
  [WARNING] The POM for org.apache.maven:maven-monitor:jar:2.0.6 is
  missing, no dependency information available
  [WARNING] The POM for
  org.apache.maven.shared:maven-filtering:jar:1.0-beta-4 is missing, no
  dependency information available
  [WARNING] The POM for org.codehaus.plexus:plexus-interpolation:jar:1.13
  is missing, no dependency information available
  [INFO]
  
  [INFO] BUILD FAILURE
  [INFO]
  
  [INFO] Total time: 1.780s
  [INFO] Finished at: Sat Jul 30 19:32:06 CEST 2011
  [INFO] Final Memory: 16M/154M
  [INFO]
  
  [ERROR] Failed to execute goal
  org.apache.maven.plugins:maven-resources-plugin:2.4.3:resources
  (default-resources) on project neo4j-berkeleydb-je-index: Execution
  default-resources of goal
  org.apache.maven.plugins:maven-resources-plugin:2.4.3:resources failed:
  Plugin org.apache.maven.plugins:maven-resources-plugin:2.4.3 or one of
  its dependencies could not be resolved: The following
  artifacts could not be resolved:
  org.apache.maven.shared:maven-filtering:jar:1.0-beta-4,
  org.codehaus.plexus:plexus-interpolation:jar:1.13: The repository system is
  offline but the artifact
  org.apache.maven.shared:maven-filtering:jar:1.0-beta-4 is not
  available in the local repository. -> [Help 1]
  [ERROR]
  [ERROR] To see the full stack trace of the errors, re-run Maven with the
  -e switch.
  [ERROR] Re-run Maven using the -X switch to enable full debug logging.
  [ERROR]
  [ERROR] For more information about the errors and possible solutions,
  please read the following articles:
  [ERROR] [Help 1]
  http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
 
  *BOLD *part :)
  Running from command line, but on a just now downloaded zip file(as you
  said),
  it works (thus my eclipse maven still needs some work, ie. maybe allow
  it internet access even though it's on ask in firewall)
  I mean I do see those errors that you said you're seeing... can't really
  paste them here from terminal they will be broken with 80 chars per line
 
  In eclipse without maven, running AllTests, although the tests do pass,
  I failed to see that (possibly) the same exception(s) thrown by maven
  install, are happening on console. But not when tests are run each
  individually. So, your errors happen both with mvn install and AllTests
  (which runs them both one after the other, too). So that was a failure to
  notice on my part :) that counting from 0 to 100 on console must've 
  moved up
  the exceptions and since tests were all success, I didn't scroll up.
 
  Trying to fix,
 
 
 

[Neo4j] Node#getRelationshipTypes

2011-07-30 Thread Niels Hoogeveen


While working on the Enhanced API, I realized two crucial methods are missing on 
the Node interface of the standard API:

RelationshipType[] getRelationshipTypes();
RelationshipType[] getRelationshipTypes(Direction);

For Enhanced API, I'd like to be able to plug in different Relationship 
implementations (eg. SortedRelations and HyperRelations). 

Doing so is sort of possible, but only by cluttering a class called ElementImpl 
with all sorts of logic 
related to those different Relationship implementations (not the place where it 
belongs). 

The neat way would be to dispatch on RelationshipType and have the different 
Relationship implementations handle that logic.

The API for PropertyContainer on the other hand does provide a method similar 
to what I am asking for: PropertyContainer#getPropertyKeys(). 

I realize this request cannot be honored without changing the record layout of 
the neo4j store, which has a major impact.

However, there is already good reason to reconsider the record layout of the 
relationship store
to solve the issue of densely connected nodes. 

To properly solve the densely connected node issue, relationships should be 
partitioned by relationship type and by direction. 

That way only those Relationships belonging to a RelationshipType that 
contributes to the dense connectedness will take time to load, 
while other Relationships can be fetched fast.

Such partitioning immediately provides the meta information I am asking for.
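
A small in-memory sketch of this idea follows. It is not a proposal for the
actual record layout: PartitionedRelationshipsSketch, its Direction enum and
the use of plain relationship ids are all assumptions made for illustration.
The point is only that once relationships are grouped by type and direction,
the set of types is available without touching any relationship.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PartitionedRelationshipsSketch {
    enum Direction { OUTGOING, INCOMING }

    // type name -> direction -> ids of the relationships in that partition
    private final Map<String, Map<Direction, List<Long>>> partitions = new LinkedHashMap<>();

    void add(String type, Direction direction, long relationshipId) {
        partitions
            .computeIfAbsent(type, t -> new EnumMap<>(Direction.class))
            .computeIfAbsent(direction, d -> new ArrayList<>())
            .add(relationshipId);
    }

    // The meta-information comes for free: it is just the key set.
    Set<String> relationshipTypes() {
        return Collections.unmodifiableSet(partitions.keySet());
    }

    // Only the requested partition is touched; a dense type does not slow down others.
    List<Long> relationships(String type, Direction direction) {
        Map<Direction, List<Long>> byDirection = partitions.get(type);
        if (byDirection == null) {
            return Collections.emptyList();
        }
        return byDirection.getOrDefault(direction, Collections.emptyList());
    }

    public static void main(String[] args) {
        PartitionedRelationshipsSketch node = new PartitionedRelationshipsSketch();
        for (long id = 0; id < 1000; id++) {
            node.add("FOLLOWED_BY", Direction.INCOMING, id); // the dense partition
        }
        node.add("FRIEND", Direction.OUTGOING, 1000L);
        System.out.println(node.relationshipTypes());
        System.out.println(node.relationships("FRIEND", Direction.OUTGOING));
    }
}
```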

Such meta information (as exists for properties), has value beyond its use in 
the Enhanced API. 

For example, I would like to be able to present a form with the property keys and 
RelationshipTypes associated with a particular Node, 
and on request load the content belonging to a property key or a 
RelationshipType. 

For property keys this is possible, but for RelationshipTypes all relationships 
need to be fetched 
to know which RelationshipTypes are associated with a particular node. 

Especially for RelationshipTypes with many instances connected to one Node this 
is not a suitable solution 
and runs counter to the need to load those relationships on request.

Niels 

Re: [Neo4j] bdb-index

2011-07-30 Thread Niels Hoogeveen






Yes, you are right. I had looked at the code too superficially. Still, 
something goes wrong reading the indexName: when I print that name it looks 
like garbage (upon recovery), while it should be a readable index name. I 
didn't check whether the value written to the record is actually a readable 
String. 
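
One way to check this kind of thing in isolation is a round trip through a
buffer. The sketch below uses java.nio.ByteBuffer rather than the LogBuffer
class used by bdb-index, and the names LengthPrefixedNameSketch, writeName and
readName are made up for the example. It shows the length-then-characters
layout, and that reading must follow the same order as writing.

```java
import java.nio.ByteBuffer;

final class LengthPrefixedNameSketch {
    // Writes the character count first, then the characters themselves.
    static void writeName(ByteBuffer buffer, String indexName) {
        char[] chars = indexName.toCharArray();
        buffer.putInt(chars.length);
        for (char c : chars) {
            buffer.putChar(c);
        }
    }

    // Reads the fields back in exactly the order writeName wrote them.
    // Reading in any other order would turn the name into garbage.
    static String readName(ByteBuffer buffer) {
        int length = buffer.getInt();
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            chars[i] = buffer.getChar();
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        writeName(buffer, "users");
        buffer.flip(); // switch the buffer from writing to reading
        System.out.println(readName(buffer));
    }
}
```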
Niels

 Date: Sat, 30 Jul 2011 23:23:49 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] bdb-index
 
 I did a quick check of what you said
 org.neo4j.index.bdbje.BerkeleyDbCommand.writeToFile(LogBuffer)
  char[] indexName = indexId.indexName.toCharArray();
  buffer.putInt( indexName.length );
  buffer.put( indexName );
 
 I'm probably missing something but on my side it looks like it writes length
 then indexName (and I didn't update from github, just in case you've already
 fixed this)
 
 Either way, my impression of what was happening is that some files got
 deleted, except some (i.e. the log) which were still open/in use, and maybe
 when recovery was tried, either it couldn't be opened, or due to being
 opened it contained partial data, or all was well but recovery couldn't
 happen because the log needed some other files or a previous database
 snapshot upon which to apply the recovered transactions
 
 I only get that messages.log being unable to delete when I allow the test
 testFindCreatedIndex() to run, I cannot yet figure out who creates that file
 and to make sure it's being closed
 
 John.
 
 On Sat, Jul 30, 2011 at 11:09 PM, Niels Hoogeveen pd_aficion...@hotmail.com
  wrote:
 
 
  The problem is indeed related to not properly closing the bdb database, and
  that triggers another problem. In BerkeleyDbCommand data is being stored
  into the transaction log and read from the transaction log later on.
  Something goes wrong making the indexName being retrieved from the
  transaction log look like garbage.
  I think I have located the problem. In the method
  BerkeleyDbCommand#writeToFile the sequence of elements written to the buffer
  is different from the order in which the method
  BerkeleyDbCommand#readCommand reads those elements. The
  BerkeleyDbCommand#writeToFile method cannot be correct, because it first
  writes the indexName and then its length. It should of course first write
  the length and then the indexName.
  Niels
   Date: Sat, 30 Jul 2011 22:51:40 +0200
   From: cyuczie...@gmail.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] bdb-index
  
   found out that I don't need to call index.delete() all the time, instead
   BerkeleyDbDataSource.close() aka XaDataSource.close() should do what
   index.delete() does, namely closing all databases (related to this
   datasource) and their bdb environment; so I do just that.
  
   Therefore I answer some parts I asked before.
  
   And that logical.log.1 seems to be a part of XA Transactions and I must
  find
   a way to see that it's closed or something
  
   On Sat, Jul 30, 2011 at 10:15 PM, John cyuczieekc cyuczie...@gmail.com
  wrote:
  
in TestBerkeley.java
So far I've found that, bdb environment(and relevant databases) is(are)
only closed when index.delete() is called
and that can only be called when the current transaction is finished
  (else
it will complain that some bdb databases are not opened on txn commit)
   Applying all those changes, the following file is still in use (due
  to
cannot be deleted):
   
E:\wrkspc\bdb-index-fork\target\var\neo4j-db\logical.log.1
This seems to be part of neo4j, though I am not sure why would it still
  be
in use even after graphDb.shutdown()
Any ideas why that would be still in use? Is graphDb.shutdown()
  blocking
until everything is closed? or are there still threads left keeping
  files
locked? or shutdown is delegated to other threads which may still be
  doing
their work when .shutdown() returns ?
   
By looking at some testcases in neo4j, I see that *index.delete() can
  be
called before transaction finished, is this correct* ? anyone?
ie.
 beginTx();
index = graphDb.index().forNodes( INDEX_NAME );
index.delete();
restartTx();
where
 void restartTx()
{
finishTx( true );
beginTx();
}
   
in this case, if that's true that index.delete() should not cause the
  txn
commit to fail, then this needs to be fixed in bdb-index
   
Also,* is neo4j closing the indexes* somehow when graphDb.shutdown() ?
  it
seems to me the only close would be index.delete() and neo4j isn't
  closing
them, thus leaving the bdb Environment still open, thus tests that
  require
shutdown and reopen of graphdb will fail since bdb wasn't itself
  shutdown
and reopened but was left still open.
Maybe closing the indexes is left to the user then? it's fine with me,
  just
so long as I know
   
   
disorganized John :)
   
   
On Sat, Jul 30, 2011 at 9:06 PM, John cyuczieekc

Re: [Neo4j] bdb-index

2011-07-30 Thread Niels Hoogeveen


It looks as if you have modified the file header of the source files. 
Maven checks the license (the file header) and returns an error message when 
the license required is different from the license provided.
When looking at the diff of one of your edits I noticed there are extra spaces 
in the license. See: 
https://github.com/13th-floor/bdb-index/commit/7c6b59fbdc445a122aa247b391c15a23dd64cac9#src/main/java/org/neo4j/index/bdbje/BerkeleyDbBatchInserterIndexProvider.java
These extra spaces cause the Maven install to fail.
Niels

 Date: Sun, 31 Jul 2011 00:00:42 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] bdb-index
 
 related to this I've created: https://trac.neo4j.org/ticket/358
 also committed on my fork, now AllTests.java works
 https://github.com/13th-floor/bdb-index
 
 for some reason I cannot mvn install:
 [INFO] [enforcer:enforce {execution: enforce-maven}]
 [INFO] [license:check {execution: check-licenses}]
 [INFO] Checking licenses...
 [INFO] Missing header in:
 e:\down\13th-floor-bdb-index-f9a3155\src\main\java\org
 \neo4j\index\bdbje\BerkeleyDbIndex.java
 [INFO] Missing header in:
 e:\down\13th-floor-bdb-index-f9a3155\src\main\java\org
 \neo4j\index\bdbje\BerkeleyDbBatchInserterIndexProvider.java
 [INFO] Missing header in:
 e:\down\13th-floor-bdb-index-f9a3155\src\test\java\org
 \neo4j\index\bdbje\Neo4jTestCase.java
 [INFO] Missing header in:
 e:\down\13th-floor-bdb-index-f9a3155\src\test\java\org
 \neo4j\index\bdbje\TestBerkeley.java
 [INFO] Missing header in:
 e:\down\13th-floor-bdb-index-f9a3155\src\main\java\org
 \neo4j\index\bdbje\BerkeleyDbBatchInserterIndex.java
 [INFO] Missing header in:
 e:\down\13th-floor-bdb-index-f9a3155\src\main\java\org
 \neo4j\index\bdbje\BerkeleyDbDataSource.java
 [INFO] Missing header in:
 e:\down\13th-floor-bdb-index-f9a3155\src\test\java\org
 \neo4j\index\bdbje\TestBerkeleyBatchInsert.java
 [INFO] Missing header in:
 e:\down\13th-floor-bdb-index-f9a3155\src\test\java\All
 Tests.java
 [ERROR] BUILD ERROR
 [INFO] -
 [INFO] Some files do not have the expected license header
 [INFO] -
 
 But it should work, I say; maybe let me know if it doesn't
 
 On Sat, Jul 30, 2011 at 11:41 PM, John cyuczieekc cyuczie...@gmail.com wrote:
 
  org.neo4j.kernel.impl.batchinsert.BatchInserterImpl
  keeps StringLogger msgLog  still open even after shutdown()
  public void shutdown()
  {
      graphDbService.clearCaches();
      neoStore.close();
      msgLog.logMessage( Thread.currentThread() + " Clean shutdown on BatchInserter("
          + this + ")", true );
  }
  we'd need a msgLog.close(storeDir)
  and storeDir is the same param given to the constructor of
  BatchInserterImpl
  maybe someone from neo4j could do that?
 
  meanwhile I will ignore the failure to delete that file
 
 
  On Sat, Jul 30, 2011 at 11:34 PM, John cyuczieekc 
  cyuczie...@gmail.comwrote:
 
  testFindCreatedIndex() is the method that fails (due to unable to delete
  the file, else it works fine)
  but it only fails when testInsertionSpeed() is allowed to execute (ie. not
  @Ignore)
 
  messages.log contents:
  Sat Jul 30 23:31:23 CEST 2011: Thread[main,5,main] Starting
  BatchInserter(EmbeddedBatchInserter[target/var/batch])
  Sat Jul 30 23:31:42 CEST 2011: Thread[main,5,main] Clean shutdown on
  BatchInserter(EmbeddedBatchInserter[target/var/batch])
 
 
 
  On Sat, Jul 30, 2011 at 11:26 PM, John cyuczieekc 
  cyuczie...@gmail.comwrote:
 
 
 
  On Sat, Jul 30, 2011 at 11:23 PM, John cyuczieekc 
  cyuczie...@gmail.comwrote:
 
  I did a quick check of what you said
  org.neo4j.index.bdbje.BerkeleyDbCommand.writeToFile(LogBuffer)
   char[] indexName = indexId.indexName.toCharArray();
   buffer.putInt( indexName.length );
   buffer.put( indexName );
 
  I'm probably missing something but on my side it looks like it writes
  length then indexName (and I didn't update from github, just in case 
  you've
  already fixed this)
 
  Either way, my impression of what was happening is that some files got
  deleted, except some ie. the log, which were still open/in use, and maybe
  when recovery was tried, either it couldn't be opened, or due to being
  opened contained impartial data, or all was well but recovery couldn't
  happen because the log needed some other files or a previous database
  snapshot upon which to apply the recovered transactions
 
  I only get that messages.log being unable to delete when I allow the
  test testFindCreatedIndex() to run, I cannot yet figure out who creates 
  that
  file and to make sure it's being closed
 
  correction testInsertionSpeed()
 
  John.
 
 
  On Sat, Jul 30, 2011 at 11:09 PM, Niels Hoogeveen 
  pd_aficion...@hotmail.com wrote:
 
 
  The problem is indeed related to not properly closing the bdb database,
  and that is triggers another problem. In BerkeleyDbCommand data is being
  stored into the transaction

Re: [Neo4j] bdb-index

2011-07-30 Thread Niels Hoogeveen

Could you check whether the neo4j kernel jar file that Maven adds to the 
classpath is correct and complete? You can find it in your user directory, in 
the .m2 subdirectory.

 Date: Sun, 31 Jul 2011 00:40:51 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] bdb-index

 I fixed those licenses, but to my amazement I'm getting new errors which
 didn't happen before,
 I am puzzled as to why would this happen

 e:\down\13th-floor-bdb-index-f9a3155>mvn install
 [INFO] Scanning for projects...
 [INFO]

 [INFO] Building Unnamed -
 org.neo4j:neo4j-berkeleydb-je-index:jar:0.1-SNAPSHOT
 [INFO]task-segment: [install]
 [INFO]

 [INFO] [enforcer:enforce {execution: enforce-maven}]
 [INFO] [license:check {execution: check-licenses}]
 [INFO] Checking licenses...
 [INFO] [resources:resources {execution: default-resources}]
 [INFO] Using 'UTF-8' encoding to copy filtered resources.
 [INFO] Copying 1 resource
 [INFO] Copying 0 resource to META-INF
 [INFO] [compiler:compile {execution: default-compile}]
 [INFO] Compiling 14 source files to
 e:\down\13th-floor-bdb-index-f9a3155\target\
 classes
 [INFO] -
 [ERROR] COMPILATION ERROR :
 [INFO] -
 [ERROR]
 \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
 erkeleyDbDataSource.java:[31,29] package org.neo4j.index.lucene does not
 exist
 [ERROR]
 \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
 erkeleyDbBatchInserterIndexProvider.java:[32,29] package
 org.neo4j.index.lucene
 does not exist
 [ERROR]
 \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
 erkeleyDbDataSource.java:[31,29] package org.neo4j.index.lucene does not
 exist
 [ERROR]
 \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
 erkeleyDbBatchInserterIndexProvider.java:[32,29] package
 org.neo4j.index.lucene
 does not exist
 [INFO] 4 errors
 [INFO] -
 [INFO]

 [ERROR] BUILD FAILURE
 [INFO]

 [INFO] Compilation failure

 \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
 bDataSource.java:[31,29] package org.neo4j.index.lucene does not exist
 \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
 bBatchInserterIndexProvider.java:[32,29] package org.neo4j.index.lucene does
 not
  exist
 \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
 bDataSource.java:[31,29] package org.neo4j.index.lucene does not exist
 \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
 bBatchInserterIndexProvider.java:[32,29] package org.neo4j.index.lucene does
 not
  exist

 [INFO]

 [INFO] For more information, run Maven with the -e switch
 [INFO]

 [INFO] Total time: 2 seconds
 [INFO] Finished at: Sun Jul 31 00:37:19 CEST 2011
 [INFO] Final Memory: 38M/359M
 [INFO]

 e:\down\13th-floor-bdb-index-f9a3155

 On Sun, Jul 31, 2011 at 12:26 AM, Niels Hoogeveen pd_aficion...@hotmail.com
  wrote:

  It looks as if you have modified the file header of the source files.
  Maven checks the license (the file header) and returns an error message
  when the license required is different from the license provided.
  When looking at the diff of one of your edits I noticed there are extra
  spaces in the license. See:
  https://github.com/13th-floor/bdb-index/commit/7c6b59fbdc445a122aa247b391c15a23dd64cac9#src/main/java/org/neo4j/index/bdbje/BerkeleyDbBatchInserterIndexProvider.java
  These extra spaces cause the Maven install to fail.
  Niels

   Date: Sun, 31 Jul 2011 00:00:42 +0200
   From: cyuczie...@gmail.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] bdb-index

   related to this I've created: https://trac.neo4j.org/ticket/358
   also committed on my fork, now AllTests.java works
   https://github.com/13th-floor/bdb-index

   for some reason I cannot mvn install:
   [INFO] [enforcer:enforce {execution: enforce-maven}]
   [INFO] [license:check {execution: check-licenses}]
   [INFO] Checking licenses...
   [INFO] Missing header in:
   e:\down\13th-floor-bdb-index-f9a3155\src\main\java\org
   \neo4j\index\bdbje\BerkeleyDbIndex.java
   [INFO] Missing header in:
   e:\down\13th-floor-bdb-index-f9a3155\src\main\java\org
   \neo4j\index\bdbje\BerkeleyDbBatchInserterIndexProvider.java
   [INFO] Missing header in:
   e:\down\13th-floor

Re: [Neo4j] bdb-index

2011-07-30 Thread Niels Hoogeveen


I see in your edit of is the following import:
import org.neo4j.index.lucene.LuceneIndexProvider;
This is an interface defined in the legacy-index component, which is not in the 
POM (and shouldn't be). The import is not used anywhere in the file, except as 
links in the header of the class, where it doesn't belong. I guess an "organize 
imports" in Eclipse added that import based on an incorrect header.

It's best to remove the legacy-index component from your build path in Eclipse. 
In fact, it's best to let Maven manage the project for you, so only jars listed 
as dependencies in Maven are put on the build path. To work on bdb-index you 
need nothing more than the neo4j-kernel on your build path.
Niels

 Date: Sun, 31 Jul 2011 01:19:17 +0200 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] bdb-index
 
 I'm not sure how complete it is (ie. there's no org\neo4j\index folder
 inside it), but its sha1 matches, but also worth mentioning that I noticed
 it got updated a few minutes before I tried to mvn install, so it could be
 that it worked before because it was a different .jar (ie. prev version)
 Also, unpacking the jar and searching for any file named lucene* yields no
 results
 Searching for lucene* in all archives under that .m2 folder, still nothing.
 trying with 1.3 still doesn't work, not found.
 
 neo4j-kernel-1.4-SNAPSHOT.jar 817,935 bytes
 sha1: a20720ece824b372520b7afde080cdc83abb5501
 
 Thanks for the hints! All this maven knowledge will prove useful.
 John.
 
 
 On Sun, Jul 31, 2011 at 12:57 AM, Niels Hoogeveen pd_aficion...@hotmail.com
  wrote:
 
 
  Could you check if the  neo4j kernel jar file maven adds to class path is
  correct and complete. You can find it in your user directory in the .m2
  subdirectory.
 
   Date: Sun, 31 Jul 2011 00:40:51 +0200
   From: cyuczie...@gmail.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] bdb-index
  
   I fixed those licenses, but to my amazement I'm getting new errors which
   didn't happen before,
   I am puzzled as to why would this happen
  
  
   e:\down\13th-floor-bdb-index-f9a3155mvn install
   [INFO] Scanning for projects...
   [INFO]
   
   [INFO] Building Unnamed -
   org.neo4j:neo4j-berkeleydb-je-index:jar:0.1-SNAPSHOT
   [INFO]task-segment: [install]
   [INFO]
   
   [INFO] [enforcer:enforce {execution: enforce-maven}]
   [INFO] [license:check {execution: check-licenses}]
   [INFO] Checking licenses...
   [INFO] [resources:resources {execution: default-resources}]
   [INFO] Using 'UTF-8' encoding to copy filtered resources.
   [INFO] Copying 1 resource
   [INFO] Copying 0 resource to META-INF
   [INFO] [compiler:compile {execution: default-compile}]
   [INFO] Compiling 14 source files to
   e:\down\13th-floor-bdb-index-f9a3155\target\
   classes
   [INFO] -
   [ERROR] COMPILATION ERROR :
   [INFO] -
   [ERROR]
   \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
   erkeleyDbDataSource.java:[31,29] package org.neo4j.index.lucene does not
   exist
   [ERROR]
   \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
   erkeleyDbBatchInserterIndexProvider.java:[32,29] package
   org.neo4j.index.lucene
   does not exist
   [ERROR]
   \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
   erkeleyDbDataSource.java:[31,29] package org.neo4j.index.lucene does not
   exist
   [ERROR]
   \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
   erkeleyDbBatchInserterIndexProvider.java:[32,29] package
   org.neo4j.index.lucene
   does not exist
   [INFO] 4 errors
   [INFO] -
   [INFO]
   
   [ERROR] BUILD FAILURE
   [INFO]
   
   [INFO] Compilation failure
  
  
  \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
   bDataSource.java:[31,29] package org.neo4j.index.lucene does not exist
  
  \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
   bBatchInserterIndexProvider.java:[32,29] package org.neo4j.index.lucene
  does
   not
exist
  
  \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
   bDataSource.java:[31,29] package org.neo4j.index.lucene does not exist
  
  \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
   bBatchInserterIndexProvider.java:[32,29] package org.neo4j.index.lucene
  does
   not
exist
  
   [INFO]
   
   [INFO] For more information, run Maven with the -e switch

Re: [Neo4j] bdb-index

2011-07-30 Thread Niels Hoogeveen

Forgot the filename in the first sentence: 
BerkeleyDbBatchInserterIndexProvider.java

 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Sun, 31 Jul 2011 01:47:20 +0200
 Subject: Re: [Neo4j] bdb-index

 I see in your edit of is the following import:
 import org.neo4j.index.lucene.LuceneIndexProvider;
 This is an interface defined in the legacy-index component, which is not in 
 the POM ( and shouldn't be). The import is nowhere used in the file, except 
 as links in header of the class where it doesn't belong. I guess an organize 
 imports in Eclipse has added that import based on an incorrect header.

 It's best to remove the legacy-index component from your build path in 
 eclips. In fact, it's best to let maven manage the project for you, so only 
 jars listed as dependencies in maven are put in the build path. To work on 
 bdb-index you need nothing more than the neo4j-kernel on your build path.
 Niels

  Date: Sun, 31 Jul 2011 01:19:17 +0200 From: cyuczie...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] bdb-index

  I'm not sure how complete it is (ie. there's no org\neo4j\index folder
  inside it), but its sha1 matches, but also worth mentioning that I noticed
  it got updated a few minutes before I tried to mvn install, so it could be
  that it worked before because it was a different .jar (ie. prev version)
  Also, unpacking the jar and searching for any file named lucene* yields no
  results
  Searching for lucene* in all archives under that .m2 folder, still nothing.
  trying with 1.3 still doesn't work, not found.

  neo4j-kernel-1.4-SNAPSHOT.jar 817,935 bytes
  sha1: a20720ece824b372520b7afde080cdc83abb5501

  Thanks for the hints! All this maven knowledge will prove useful.
  John.

  On Sun, Jul 31, 2011 at 12:57 AM, Niels Hoogeveen pd_aficion...@hotmail.com
   wrote:

   Could you check if the  neo4j kernel jar file maven adds to class path is
   correct and complete. You can find it in your user directory in the .m2
   subdirectory.

Date: Sun, 31 Jul 2011 00:40:51 +0200
From: cyuczie...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] bdb-index

I fixed those licenses, but to my amazement I'm getting new errors which
didn't happen before,
I am puzzled as to why would this happen

e:\down\13th-floor-bdb-index-f9a3155mvn install
[INFO] Scanning for projects...
[INFO]

[INFO] Building Unnamed -
org.neo4j:neo4j-berkeleydb-je-index:jar:0.1-SNAPSHOT
[INFO]task-segment: [install]
[INFO]

[INFO] [enforcer:enforce {execution: enforce-maven}]
[INFO] [license:check {execution: check-licenses}]
[INFO] Checking licenses...
[INFO] [resources:resources {execution: default-resources}]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 0 resource to META-INF
[INFO] [compiler:compile {execution: default-compile}]
[INFO] Compiling 14 source files to
e:\down\13th-floor-bdb-index-f9a3155\target\
classes
[INFO] -
[ERROR] COMPILATION ERROR :
[INFO] -
[ERROR]
\down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
erkeleyDbDataSource.java:[31,29] package org.neo4j.index.lucene does not
exist
[ERROR]
\down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
erkeleyDbBatchInserterIndexProvider.java:[32,29] package
org.neo4j.index.lucene
does not exist
[ERROR]
\down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
erkeleyDbDataSource.java:[31,29] package org.neo4j.index.lucene does not
exist
[ERROR]
\down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\B
erkeleyDbBatchInserterIndexProvider.java:[32,29] package
org.neo4j.index.lucene
does not exist
[INFO] 4 errors
[INFO] -
[INFO]

[ERROR] BUILD FAILURE
[INFO]

[INFO] Compilation failure

   \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
bDataSource.java:[31,29] package org.neo4j.index.lucene does not exist

   \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
bBatchInserterIndexProvider.java:[32,29] package org.neo4j.index.lucene
   does
not
 exist

   \down\13th-floor-bdb-index-f9a3155\src\main\java\org\neo4j\index\bdbje\BerkeleyD
bDataSource.java:[31,29] package org.neo4j.index.lucene does not exist

Re: [Neo4j] Composable traversals

2011-07-29 Thread Niels Hoogeveen


I am going to stick as closely as possible to the current implementation of 
traversers and, where possible, reuse its code. 
As far as I can see, the current UniquenessFilter works well, so I am going to 
keep that setup for the new implementation.
Indeed it should be:
Node --FRIEND-- Node --PARENT-- Node
Both FRIEND and PARENT are just RelationshipTypes, nothing fancy with 
intermediate nodes going on. 

The iterator that returns the paths of the traversal should check, when 
returning a path, that none of the nodes/relationships in it has been deleted. 
Locking things in the traverser is probably not a good idea, since it can easily 
lock large parts of the graph for an unknown amount of time. The traverser works 
lazily, so we cannot know in advance when, or even if, the iterator will be 
advanced. Keeping nodes locked (potentially indefinitely) is not a good idea. 
A traversal can never return more than temporary snapshots of the database. 
Paths that were already returned may well have been deleted by the time the 
traversal ends, and new paths may be created that the traverser will not see, 
because that part of the graph has already been examined.
I don't see how the isolation levels found in an RDBMS can be implemented in a 
graph database. There is no notion of range locks without a schema, so 
phantom reads may always occur in traversals.
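
As a rough illustration of the "check on return" idea, here is a minimal sketch 
in plain Java. It is not Neo4j code: LiveFilteringIterator and the stillExists 
predicate are hypothetical names, and the predicate stands in for whatever test 
decides that every node/relationship of a path is still present. The check 
happens lazily, only when the caller advances the iterator, and nothing is 
locked, so the result is still only a snapshot.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical wrapper: skips elements that no longer exist when they are requested.
final class LiveFilteringIterator<T> implements Iterator<T> {
    private final Iterator<T> source;
    private final Predicate<? super T> stillExists; // assumed "not deleted" test
    private T next;
    private boolean ready;

    LiveFilteringIterator(Iterator<T> source, Predicate<? super T> stillExists) {
        this.source = source;
        this.stillExists = stillExists;
    }

    @Override
    public boolean hasNext() {
        // Pull from the source only when asked, so the traversal stays lazy.
        while (!ready && source.hasNext()) {
            T candidate = source.next();
            if (stillExists.test(candidate)) {
                next = candidate;
                ready = true;
            }
        }
        return ready;
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        ready = false;
        T result = next;
        next = null;
        return result;
    }
}
```

An element can still be deleted between the check and its use by the caller, 
which is exactly the snapshot limitation described above.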
Niels
 Date: Fri, 29 Jul 2011 07:04:41 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Composable traversals
 
 Hey Niels,
 Since they are composable, is Java going to keep track of things (as with
 recursion) on the stack, or in arrays/variables? Or could the graph keep track
 of what has been parsed so far, in-graph? (This question applies to
 non-composable traversals too; personally I like the idea of keeping track of
 this in-graph, but maybe that would be implemented later at a higher level, so
 I guess for now it will be in arrays/variables.)
 
 Just making sure, in here:
  Node --FRIEND-- Node -- PARENT -- Node
 FRIEND and PARENT are both relationship types?
 they are thus not intermediary nodes acting like they are relationships?
 (which is actually what I do with bdb where the only elemental thing is the
 Node, rels cannot be addressed ie. by ID)
 
 What happens while the traversers are executing and some other
 thread/process deletes something that the traverser has added to itself
 as a valid node/path? For example, the first Node in Node --FRIEND-- Node,
 assuming that's where the traverser currently is, gets deleted...
 Is there some notification/event, or were they locked by the traverser? Or
 will this kind of issue be dealt with later, after the traverser is implemented?
 Are these locks kept in-graph so they can be seen by other threads/processes
 (mainly thinking of processes that cannot access the same Java resource, i.e. in
 another JVM or computer, though accessing the same database; I guess this rules
 out embedded?), if there are any locks?
 
 On Fri, Jul 29, 2011 at 1:30 AM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:
 
 
  I'd like to take a stab at implementing traversals in the Enhanced API. One
  of the things I'd like to do, is to make traversals composable.
 
  Right now a Traverser is created by either calling the traverse method on
  Node, or to call the traverse(Node) method on TraversalDescription.
 
  This makes traversals inherently non-composable, so we can't define a
  single traversal that returns the parents of all our friends.
 
  To make Traversers composable we need a function:
 
  Traverser traverse(Traverser, TraversalDescription)
 
  My take on it is to make Element (which is a superinterface of Node) into a
  Traverser.
 
  Traverser is basically another name for Iterable<Path>.
 
  Every Node (or more generally every Element) can be seen as an
  Iterable<Path>, returning a single Path, which contains a single
  path-element, the Node/Element itself.
 
  Composing traversals would entail the concatenation of the paths returned
  with the paths supplied, so when we ask for the parents of all our friends,
  the returned paths would take the form:
 
  Node --FRIEND-- Node -- PARENT -- Node
 
  Niels
 
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user
 

Re: [Neo4j] bdb-index

2011-07-29 Thread Niels Hoogeveen


Hi John,
Thanks for looking into this. 
I am still seeing the same error I had before. When running mvn install, 
both tests are run one after another. For some reason the transaction log sees an 
unclean shutdown and tries to commit pending transactions. 
During that process the index names of the bdb indexes are retrieved from 
binary storage. Here something goes wrong: the index name returned is 
garbage, so the recovery process fails because it can't find the right index 
files.
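
One way a name read back from binary storage can turn into garbage is when the 
writer and the reader disagree on the order or size of the fields. The sketch 
below shows a length-prefixed round trip that stays consistent. It is an 
illustration only: IndexNameCodec is a hypothetical class, and it uses 
java.nio.ByteBuffer rather than the LogBuffer that bdb-index works with, so it 
says nothing about what BerkeleyDbCommand actually does.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a write/read pair; not the bdb-index code.
final class IndexNameCodec {
    // Writes the length first, then the characters.
    static void write(ByteBuffer buffer, String indexName) {
        char[] chars = indexName.toCharArray();
        buffer.putInt(chars.length);
        for (char c : chars) {
            buffer.putChar(c);
        }
    }

    // Reads in exactly the same order: length first, then that many characters.
    static String read(ByteBuffer buffer) {
        int length = buffer.getInt();
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            chars[i] = buffer.getChar();
        }
        return new String(chars);
    }

    // Writes a name into a fresh buffer and reads it back.
    static String roundTrip(String indexName) {
        ByteBuffer buffer = ByteBuffer.allocate(4 + 2 * indexName.length());
        write(buffer, indexName);
        buffer.flip(); // switch the buffer from writing to reading
        return read(buffer);
    }
}
```

If read() consumed the characters before the length, or assumed a different 
character width, the bytes would be interpreted at the wrong offsets and the 
resulting name would look like garbage.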
Niels

 Date: Fri, 29 Jul 2011 07:48:43 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] bdb-index
 
 I forked and fixed, the tests are all working now:
 https://github.com/13th-floor/bdb-index
 Let me know if you want me to do a pull request, ... sadly I applied
 formatting on RawBDBSpeed and the diff doesn't look pretty if you're trying
 to see what changed
 
 John.
 
 
 On Thu, Jul 28, 2011 at 7:36 PM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:
 
 
  Trying to find something useful to hide the implementation book keeping of
  Enhanced API, I tried out bdb-index as can be found here:
  https://github.com/peterneubauer/bdb-index
  It looks interesting, but fails its tests. When recovering it performs
  BerkeleyDbCommand#readCommand from the log. The retrieved indexName is
  actually garbage. I would like to help make this component workable, but
  this area of the database is a bit beyond the scope of what I know.
  I know this is completely unsupported software, but can someone give me
  some pointers on how to fix this issue?
  Niels

Re: [Neo4j] bdb-index

2011-07-29 Thread Niels Hoogeveen


What I need to store in an index depends on the type of element that needs to 
be reified.
Relationship:
  To associated Node: RelId - NodeId
  From associated Node: NodeId - RelId
RelationshipType:
  To associated Node: RelationshipType.name - NodeId
  From associated Node: NodeId - RelationshipType.name
RelationshipRole:
  To associated Node: RelationshipRole.name - NodeId
  From associated Node: NodeId - RelationshipRole.name
PropertyType:
  To associated Node: PropertyType.name - NodeId
  From associated Node: NodeId - PropertyType.name
Property:
  To associated Node: Node, PropertyType.name - NodeId
  From associated Node: NodeId - Node, PropertyType.name
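
Each of the pairs above is a lookup in two directions. A minimal in-memory 
sketch of that shape, assuming a one-to-one association between a key and a 
node ID, could look as follows. ReificationIndex is a hypothetical name and 
the HashMaps merely stand in for a real index such as bdb-index or Lucene.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical two-way lookup: key -> node ID and node ID -> key.
final class ReificationIndex<K> {
    private final Map<K, Long> toNode = new HashMap<>();
    private final Map<Long, K> fromNode = new HashMap<>();

    // Associates a key (e.g. a relationship ID or a type name) with a node ID,
    // dropping any earlier association of either side to keep the mapping 1:1.
    void put(K key, long nodeId) {
        Long oldNode = toNode.put(key, nodeId);
        if (oldNode != null && oldNode != nodeId) {
            fromNode.remove(oldNode);
        }
        K oldKey = fromNode.put(nodeId, key);
        if (oldKey != null && !oldKey.equals(key)) {
            toNode.remove(oldKey);
        }
    }

    // "To associated Node" direction; returns null when the key is unknown.
    Long nodeFor(K key) {
        return toNode.get(key);
    }

    // "From associated Node" direction; returns null when the node is unknown.
    K keyFor(long nodeId) {
        return fromNode.get(nodeId);
    }
}
```

For a RelationshipType the key would be the type name, for a Relationship its 
ID; the composite key needed for Property is left out of this sketch.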
Niels
 Date: Fri, 29 Jul 2011 06:49:31 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] bdb-index
 
 Hi xD
 I'm not clear what you need to store here, if I understand correctly you
 could store in 2 primary bdb databases the nodeID (ie. long) of each node in
 a relationship
 ie.
 key-value
 
 dbForward:
 A-B
 A-C
 X-D
 X-B
 
 dbBackward:
 B-A
 B-X
 C-A
 D-X
 
 A,B,C,D,X are all nodeIDs ie. longs
 
 this way you could check if A-B exists, or all of A's endNodes , or what
 startNodes are pointing to the endNode B
 the storing of these would be sorted and in BTree, lookup would be fast, so
 you can consider ie. A as being a set of B and C, and X being a set of B and
 D, (that is you cannot set the order as in a list, they are sorted by bdb
 for fast retrievals). (But upon this, sets, can build lists np - that is
 using only bdb; tho you won't need that using neo4j)
 So, if this is the kind of index you wanted... (I am not aware of specific
 indexes with bdb, though that doesn't mean they don't exist)
 
 Insertions would require transaction protection so both A-B in dbForward
 and B-A in dbBackward are inserted atomically. Parsing A then X of B-  in
 dbBackward for example can only be done with a cursor...
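
The dbForward/dbBackward layout described above can be sketched with sorted 
in-memory collections. This is only an approximation under stated assumptions: 
PairStore is a hypothetical class, TreeMap/TreeSet stand in for the sorted 
BTree storage of bdb, and synchronized methods stand in for the transaction 
that keeps both directions consistent.

```java
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the two bdb databases.
final class PairStore {
    private final TreeMap<Long, TreeSet<Long>> forward = new TreeMap<>();
    private final TreeMap<Long, TreeSet<Long>> backward = new TreeMap<>();

    // Inserts start-end into both directions as one step.
    synchronized void add(long start, long end) {
        forward.computeIfAbsent(start, k -> new TreeSet<>()).add(end);
        backward.computeIfAbsent(end, k -> new TreeSet<>()).add(start);
    }

    // Checks whether the pair start-end is stored.
    synchronized boolean exists(long start, long end) {
        TreeSet<Long> ends = forward.get(start);
        return ends != null && ends.contains(end);
    }

    // All end nodes of a start node, in sorted order.
    synchronized SortedSet<Long> endNodes(long start) {
        return copy(forward.get(start));
    }

    // All start nodes pointing to an end node, in sorted order.
    synchronized SortedSet<Long> startNodes(long end) {
        return copy(backward.get(end));
    }

    private static SortedSet<Long> copy(TreeSet<Long> set) {
        return set == null ? new TreeSet<Long>() : new TreeSet<Long>(set);
    }
}
```

With A=1, B=2, C=3, D=4 and X=9, adding the four pairs from the example makes 
endNodes(1) contain 2 and 3, and startNodes(2) contain 1 and 9.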
 
 Either way, I'm taking a look on that bdb-index thingy; will report back if
 I have any ideas heh
 
 John.
 
 On Thu, Jul 28, 2011 at 9:42 PM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:
 
 
  Thank you, Peter,
  There is no rush here. It would be nice to investigate this option, but it
  can wait until Mattias has returned and sifted through urgent matters. The
  question is even, if it would be a good idea to use an index to do the book
  keeping for Enhanced API.
  As it is now, the Reification of eg. a Relationship, requires one property to
  be set on a relationship, containing the node ID of the associated node. On
  the associated node is a property containing the ID of the relationship, so
  there is a bidirectional look up. Introducing an index would remove the need
  to have these additional properties, but would lead to slower look-up times
  (no matter how fast the index).
  So it's a trade-off between speed and cleanliness of namespace. Using the
  Enhanced API disallows certain property names to be used in user
  applications.
  The property names used in Enhanced API all start with
  org.neo4j.collections.graphbd., so there is little chance a user application
  would want to use those property names, but it is a restriction not found in
  the standard API, so ultimately something to consider.
  Niels
   From: peter.neuba...@neotechnology.com
   Date: Thu, 28 Jul 2011 10:39:47 -0700
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] bdb-index
  
   niels,
   in this spike, I just concentrated on getting _something_ working in
   order to test insertion speed. This is not up to real indexing
   standards, so some love is needed here. I think Mattias is the best
   person to ask about pointers, let's wait until he is back next week if
   that is ok? Maybe some other (like the standard Lucene)  index can
   suffice for the time being to test out things?
  
   Cheers,
  
   /peter neubauer
  
   GTalk:  neubauer.peter
   Skype   peter.neubauer
   Phone   +46 704 106975
   LinkedIn   http://www.linkedin.com/in/neubauer
   Twitter  http://twitter.com/peterneubauer
  
   http://www.neo4j.org   - Your high performance graph
  database.
   http://startupbootcamp.org/- Öresund - Innovation happens HERE.
   http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
  
  
  
   On Thu, Jul 28, 2011 at 10:36 AM, Niels Hoogeveen
   pd_aficion...@hotmail.com wrote:
   
    Trying to find something useful to hide the implementation book keeping
    of Enhanced API, I tried out bdb-index as can be found here:
    https://github.com/peterneubauer/bdb-index
    It looks interesting, but fails its tests. When recovering it performs
    BerkeleyDbCommand#readCommand from the log. The retrieved indexName is
    actually garbage. I would like to help make this component workable, but
    this area of the database is a bit beyond the scope of what I know.
    I know this is completely unsupported software, but can someone give me
    some pointers on how to fix this issue?
Niels

Re: [Neo4j] Composable traversals

2011-07-29 Thread Niels Hoogeveen


Great, I would much rather see this become part of the core API than have this 
as part of the Enhanced API. 
To make things work correctly, one important change to core is needed: The Node 
interface needs to extend Traverser (the interface in 
org.neo4j.graphdb.traversal, not the one in org.neo4j.graphdb). 
This is actually not a big deal. The Traverser interface supports three 
methods:
Iterator<Path> iterator() [returns 1 path with 1 element in the path, being the 
node itself]
Iterable<Node> nodes() [returns an iterable over the node itself]
Iterable<Relationship> relationships() [returns an empty iterable]
With that addition, it's not all too difficult to enhance the current 
implementation of Traverser. It only adds one more iteration level over the 
current implementation. Instead of having one start node, we now have multiple 
start paths. When returning values from the Traverser, the start paths and the 
result paths need to be concatenated. 
In the new scenario, all old traverse() methods can remain the same, since 
Node becomes a Traverser, so those methods are just special cases where 
Iterable<Path> consists of 1 path, with just 1 element.
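
The three-method contract described above can be sketched without Neo4j on the classpath. This is only an illustration: TraverserSketch and NodeSketch are hypothetical stand-ins, a path is modelled as a plain list of names, and nothing here is the actual core or Enhanced API code.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the three-method Traverser contract.
// A "path" is modelled as a List<String> of element names.
interface TraverserSketch extends Iterable<List<String>> {
    Iterable<String> nodes();
    Iterable<String> relationships();
}

// Hypothetical node that doubles as a traverser over itself.
final class NodeSketch implements TraverserSketch {
    private final String name;

    NodeSketch(String name) {
        this.name = name;
    }

    // One path, holding one element: the node itself.
    @Override
    public Iterator<List<String>> iterator() {
        return Collections.singletonList(Collections.singletonList(name)).iterator();
    }

    // An iterable over the node itself.
    @Override
    public Iterable<String> nodes() {
        return Collections.singletonList(name);
    }

    // A lone node has no relationships on its path.
    @Override
    public Iterable<String> relationships() {
        return Collections.emptyList();
    }
}
```

With a node behaving this way, a method that accepts a traverser as its starting point can also accept a single node, which is why the old traverse() methods could stay unchanged.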
Niels
 Date: Fri, 29 Jul 2011 18:36:28 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Composable traversals
 
 There have been thoughts a long while to make something like this with the
 traversal framework, but time has never been allocated to evolve it. I'm
 adding stuff to the framework in a side track and will surely add some
 aspect of composable traversers also.
 
 2011/7/29 Niels Hoogeveen pd_aficion...@hotmail.com
 
 
  I'd like to take a stab at implementing traversals in the Enhanced API. One
  of the things I'd like to do, is to make traversals composable.
 
  Right now a Traverser is created by either calling the traverse method on
  Node, or to call the traverse(Node) method on TraversalDescription.
 
  This makes traversals inherently non-composable, so we can't define a
  single traversal that returns the parents of all our friends.
 
  To make Traversers composable we need a function:
 
  Traverser traverse(Traverser, TraversalDescription)
 
  My take on it is to make Element (which is a superinterface of Node) into a
  Traverser.
 
  Traverser is basically another name for Iterable<Path>.
 
  Every Node (or more generally every Element) can be seen as an
  Iterable<Path>, returning a single Path, which contains a single
  path-element, the Node/Element itself.
 
  Composing traversals would entail the concatenation of the paths returned
  with the paths supplied, so when we ask for the parents of all our friends,
  the returned paths would take the form:
 
  Node --FRIEND--> Node --PARENT--> Node
 
  Niels
 
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user
 
 
 
 
 -- 
 Mattias Persson, [matt...@neotechnology.com]
 Hacker, Neo Technology
 www.neotechnology.com

Re: [Neo4j] HyperRelationship example

2011-07-28 Thread Niels Hoogeveen

Hi John,
Thanks for showing an interest.
The compile error you got was due to the fact that a removed class was still
hanging around in the Git repo. I renamed BinaryRelationshipRoles into
BinaryRelationshipRole, but the original file was still active in the Git repo.
I fixed that.
I have been thinking about BDB too for this situation, because the graph
database now stores some information about the associated nodes and their
reverse lookup. This of course pollutes the name/node space. It would be neat to
offload this bookkeeping information to some persistent hashmap, so the
implementation is completely transparent to the user.
I don't know how nicely BDB plays with Neo4J transactions. Does anyone have
experience with this?
Another aspect is licencing. I am no legal buff, so maybe someone else can jump
in and answer this.
Personally, I don't mind adding BDB as a dependency, but it has to work well at
the transaction level and licence wise, otherwise it's a no go for me.
I would recommend you to start using maven. There is an Eclipse plugin
m2eclipse, which allows you to use/maintain Maven projects from within Eclipse.
Niels

Date: Thu, 28 Jul 2011 05:09:54 +0200
From: cyuczie...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] HyperRelationship example

Hey Niels,

I like xD
this seems like a lot of work and professionally done; ie. something I could
not have done (I don't have that kind of experience and focus). Gratz on
that, I really appreciate seeing this.

I cloned the repo from git, manually, with eclipse (not using maven - don't
know how with eclipse)
I am getting only about 3 compile errors, like:
1) The type BinaryRelationshipRoles<T> must implement the inherited abstract
method PropertyContainer.getId()
2) The constructor PropertyType<T>(String, GraphDatabaseService) is not
visible
3) The return type is incompatible with
RelationshipContainer.getRelationships()
for
org.neo4j.collections.graphdb.impl.RelationshipIterable.RelationshipIterable(Iterable<Relationship>
rels)

Also, I am thinking to try and implement this on top of berkeleydb just
for fun/benchmarking (so to speak) to compare between that and neo4j - since
I am currently unsure which one to use for my hobby project (I like that
berkeleydb's searches are 0-1ms instead of few seconds)

Btw, would it be any interest to you if I were to fork your repo and add ie.
AllTests.java for junit and the .project and related files for eclipse
project in a pull or two ? as long as it doesn't seem useless or
cluttering... (note however I never actually, yet, used forkpull but only
read about it on github xD)

Thanks to all, for wasting some time reading this,
Greeting and salutations,
John

On Wed, Jul 27, 2011 at 8:48 PM, Niels Hoogeveen
pd_aficion...@hotmail.comwrote:

I just posted an example on how to use HyperRelationships:

https://github.com/peterneubauer/graph-collections/wiki/HyperRelationship-example

There is now a proper test for HyperRelationships, so I hereby push the
software to Beta status.

Please try out the Enhanced API and HyperRelationships and let me know what
needs improvement.

Niels

Re: [Neo4j] strange problem while getting a node property

2011-07-28 Thread Niels Hoogeveen

When iterating over all nodes, you also pull the reference node (with id = 0), 
which probably doesn't have the requested property.
If you want to list all properties of a node, it's better to use a construct 
like:
for (String key : node.getPropertyKeys()) {
    System.out.println(node.getProperty(key));
}

 Date: Thu, 28 Jul 2011 13:18:50 +0200
 From: c-...@jsnet.be
 To: user@lists.neo4j.org
 Subject: [Neo4j] strange problem while getting a node property

 Hi,
 I've this strange problem when I try to collect data from the graph with 
 the Java API in Groovy :

 db.allNodes.each { node ->
   cpt = 0
   node.getRelationships().each { rel ->
     cpt++
   }
   println("${node} ${cpt}")
   println node.getPropertyKeys()
 }

 The iteration on each node is right working.
 The iteration to count the relationships on each node is working too.

 The call node.getPropertyKeys() gives me the list of the properties like 
 this :
 [nbrel, version, maintainer, section, architecture, package, priority, 
 dataset, installedSize]

 But,

 If I call node.getProperty("package")
 I receive this error :
 Caught: org.neo4j.graphdb.NotFoundException: package property not found 
 for NodeImpl#0

 And, If I set the value just before, for test like this :
 node.setProperty("package", "test")
 println node.getProperty("package")

 I get the value.

 So I can't get property which was not set by the node.setProperty method.
 The initial data are copied into the graph with a perl script using the 
 Neo4j REST interface.

 Maybe I do something wrong,
 I'm a newbie in both Neo4j and Groovy

 Regards,
 Jean-Sébastien Stoffen

Re: [Neo4j] HyperRelationship example

2011-07-28 Thread Niels Hoogeveen

) org.neo4j.collections.graphdb.impl.NodeLikeImpl.getRelationships()
 The return type is incompatible with
 RelationshipContainer.getRelationships()
 
 3)
 org.neo4j.collections.graphdb.impl.NodeLikeImpl.getRelationships(RelationshipType...)
 The return type is incompatible with
 RelationshipContainer.getRelationships(RelationshipType[])
 
 
 John.
 
 On Thu, Jul 28, 2011 at 12:52 PM, Niels Hoogeveen pd_aficion...@hotmail.com
  wrote:
 
 
  Hi John,
  Thanks for showing an interest.
  The compile error you got was due to the fact that a removed class was
  still hanging around in the Git repo. I renamed BinaryRelationshipRoles into
  BinaryRelationshipRole, but the original file was still active in the Git
  repo. I fixed that.
  I have been thinking about BDB too for this situation, because the graph
  database now stores some information about the associated nodes and their
  reverse lookup. This of course polutes the name/node space. It would be neat
  to offload this book keeping information to some persistent hashmap, so the
  implementation is completely transparent to the user.
  I don't know how nicely BDB plays with Neo4J transactions. Does anyone have
  experience with this?
  Another aspect is licencing. I am no legal buff, so maybe someone else can
  jump in and answer this.
  Personally, I don't mind adding BDB as a dependency, but it has to work
  well at the transaction level and licence wise, otherwise it's a no go for
  me.
  I would recommend you to start using maven. There is an Eclipse plugin
  m2eclipse, which allows you to use/maintain Maven projects from within
  Eclipse.
  Niels
 
   Date: Thu, 28 Jul 2011 05:09:54 +0200
   From: cyuczie...@gmail.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] HyperRelationship example
  
   Hey Niels,
  
   I like xD
   this seems like a lot of work and professionally done; ie. something I
  could
   not have done (I don't have that kind of experience and focus). Gratz on
   that, I really appreciate seeing this.
  
   I cloned the repo from git, manually, with eclipse (not using maven -
  don't
   know how with eclipse)
   I am getting only about 3 compile errors, like:
   1) The type BinaryRelationshipRolesT must implement the inherited
  abstract
   method PropertyContainer.getId()
   2) The constructor PropertyTypeT(String, GraphDatabaseService) is not
   visible
   3) The return type is incompatible with
   RelationshipContainer.getRelationships()
   for
  
  org.neo4j.collections.graphdb.impl.RelationshipIterable.RelationshipIterable(IterableRelationship
   rels)
  
  
 Also, I am thinking to try and implement this on top of berkeleydb just
   for fun/benchmarking (so to speak) to compare between that and neo4j -
  since
   I am currently unsure which one to use for my hobby project (I like that
   berkeleydb's searches are 0-1ms instead of few seconds)
  
   Btw, would it be any interest to you if I were to fork your repo and add
  ie.
   AllTests.java for junit and the .project and related files for eclipse
   project in a pull or two ? as long as it doesn't seem useless or
   cluttering... (note however I never actually, yet, used forkpull but
  only
   read about it on github xD)
  
   Thanks to all, for wasting some time reading this,
   Greeting and salutations,
   John
  
   On Wed, Jul 27, 2011 at 8:48 PM, Niels Hoogeveen
   pd_aficion...@hotmail.comwrote:
  
   
I just posted an example on how to use HyperRelationships:
   
   
   
  https://github.com/peterneubauer/graph-collections/wiki/HyperRelationship-example
   
There is now a proper test for HyperRelationships, so I hereby push the
software to Beta status.
   
Please try out the Enhanced API and HyperRelationships and let me know
  what
needs improvement.
   
Niels

Re: [Neo4j] HyperRelationship example

2011-07-28 Thread Niels Hoogeveen

It's a trick to lock a node. When removing a property that does not exist the 
node gets locked. 

 Date: Thu, 28 Jul 2011 15:51:15 +0200
 From: cyuczie...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] HyperRelationship example

 Hey Niels,

 what is acquireLock() doing in SortedTree ?
 is removeProperty causing neo4j to acquire a lock on the Node? or its
 properties?
 also does that property need to exist? seems like not
 interesting :)

 On Wed, Jul 27, 2011 at 8:48 PM, Niels Hoogeveen
 pd_aficion...@hotmail.comwrote:

  I just posted an example on how to use HyperRelationships:

  https://github.com/peterneubauer/graph-collections/wiki/HyperRelationship-example

  There is now a proper test for HyperRelationships, so I hereby push the
  software to Beta status.

  Please try out the Enhanced API and HyperRelationships and let me know what
  needs improvement.

  Niels

Re: [Neo4j] HyperRelationship example

2011-07-28 Thread Niels Hoogeveen

   meant if you could make a wrapper such that you could use the same
   format/interface neo4j uses for their transactions, then you can, I
  did
   some
   attempt to that it works for me, also BDB Java Edition doesn't support
   nested transactions either (the C++ version does), but emulating them
  to
   use
   the same root/parent transaction is easy, my attempt is here:
  
  
  https://github.com/13th-floor/neo4john/blob/6c0371e82b7fc5b5f45d7c0ea9fb03ee4d241df9/src.obsolete/org/bdb/BETransaction.java
   probably not much relevant though. But this file here:
  
  
  https://github.com/13th-floor/neo4john/blob/master/src/org/benchtests/neo4j/TestLinkage.java
   I made to use both neo4j and bdb to do the same thing, that is:
   create nodes(uppercase named ones) with these rels:
   ROOT_LIST --   START
   ROOT_LIST --   half a million unique nodes
   ROOT_LIST --   MIDDLE
   ROOT_LIST --   another half a million unique nodes
   ROOT_LIST --   END
  
   then make both bdb and neo4j check if the following rels exist:
   ROOT_LIST --   START
   ROOT_LIST --   MIDDLE
   ROOT_LIST --   END
   (you probably saw this already in another post)
   But both bdb and neo4j now use transactions... that is, in my test
  file.
  
   About licensing, I'm not much into that but here's the license for
   Berkeley
   DB Java Edition:
  
  
  http://www.oracle.com/technetwork/database/berkeleydb/downloads/jeoslicense-086837.html
   Looks like New(or normal?) BSD license or something ...
   also
   
   Licensing
  
   Berkeley DB is available under dual license:
  
- Public license that requires that software that uses the
  Berkeley
   DB
code be free/open source software; and
- Closed source license for non-open source software.
  
   If your code is not redistributed, no license is required (free for
   in-house
   use).
  
   
  
   from http://www.orafaq.com/wiki/Berkeley_DB#Licensing
  
  
   I would totally use neo4j, if it would be as fast at searches :/ ie.
   BTree
   storage of nodes/rels? (guessing)
  
   But having 10mil rels, and seeing BDB checking if A--B in 0ms, and
  neo4j
   in
   like 0 to 66 to 310 seconds (depending on its position)
  
   is a show stopper for me, especially because I want to base everything
  on
   just nodes (without properties) and their relationships. ie. make a
  set
   or
   list of things, without having A ---[ENTRY]--   e ---[NEXT] ---   e2
   but
   instead A-b-e-c-e2  where b and c are just nodes, and also
   AllEntries-b  and AllNexts-c   (silly example with such less info
  tho)
  
   Point is, I would do lots of searches a lot (imagine a real time
  program
   running on top of nodes/rels, that is it's defined in and can access
  only
   nodes), this would likely cause those ms to add up to seconds...
  
  
   I installed maven (m2e) again, I guess I could use it, but it seems it
   creates .jar , not sure if that's useful to me while I am coding...
  seems
   better to use project/sources no? and maven only when ready to
   publish/get
   the jar ; anyway I need to learn how to use it otherwise I'm getting
   errors
   like this , when trying to build:
  
   [ERROR]   The project org.neo4j:neo4j-graph-collections:1.5-SNAPSHOT
   (E:\wrkspc\graph-collections\pom.xml) has 1 error
   [ERROR] Non-resolvable parent POM: The repository system is
  offline
   but
   the artifact org.neo4j:parent-central:pom:21 is not available in the
   local
   repositor
   y. and 'parent.relativePath' points at wrong local POM @ line 4,
  column
   11
   -   [Help 2]
  
  
  
   Anyway, with normal eclipse, I'm still showing 2 different errors:
  
   1) in org.neo4j.collections.graphdb.ComparablePropertyTypeT
  
   line 29: super(name, graphDb);
  
   The constructor PropertyTypeT(String, GraphDatabaseService) is not
   visible
  
   2) org.neo4j.collections.graphdb.impl.NodeLikeImpl.getRelationships()
   The return type is incompatible with
   RelationshipContainer.getRelationships()
  
   3)
  
  
  org.neo4j.collections.graphdb.impl.NodeLikeImpl.getRelationships(RelationshipType...)
   The return type is incompatible with
   RelationshipContainer.getRelationships(RelationshipType[])
  
  
   John.
  
   On Thu, Jul 28, 2011 at 12:52 PM, Niels Hoogeveen
   pd_aficion...@hotmail.com
   wrote:
  
  
   Hi John,
   Thanks for showing an interest.
   The compile error you got was due to the fact that a removed class
  was
   still hanging around in the Git repo. I renamed
  BinaryRelationshipRoles
   into
   BinaryRelationshipRole, but the original file was still active in the
   Git
   repo. I fixed that.
   I have been thinking about BDB too for this situation, because the
  graph
   database now stores some information about the associated nodes and
   their
   reverse lookup. This of course polutes the name/node space. It would
  be
   neat
   to offload this book keeping information to some persistent hashmap,
  so
   the
   implementation is completely

[Neo4j] bdb-index

2011-07-28 Thread Niels Hoogeveen


Trying to find something useful to hide the implementation bookkeeping of the 
Enhanced API, I tried out bdb-index, as can be found here: 
https://github.com/peterneubauer/bdb-index
It looks interesting, but fails its tests. When recovering it performs 
BerkeleyDbCommand#readCommand from the log. The retrieved indexName is not 
actually garbage. I would like to help make this component workable, but this 
area of the database is a bit beyond the scope that I know.
I know this is completely unsupported software, but can someone give me some 
pointers on how to fix this issue?
Niels 

Re: [Neo4j] bdb-index

2011-07-28 Thread Niels Hoogeveen

Should read: The retrieved indexName is actually garbage. 

 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Thu, 28 Jul 2011 19:36:21 +0200
 Subject: [Neo4j] bdb-index

 Trying to find something useful to hide the implementation book keeping of 
 Enhanced API, I tried out dbd-index as can be found 
 here:https://github.com/peterneubauer/bdb-index
 It looks interesting, but fails its tests. When recovering it performs 
 BerkeleyDbCommand#readCommand from the log. The retrieved indexName is not 
 actually garbage. I would like to help make this component workable, but area 
 of the database is a bit beyond the scope that I know.
 I know this is completely unsupported software, but can someone give me some 
 pointers on how to fix this issue?
 Niels   

Re: [Neo4j] bdb-index

2011-07-28 Thread Niels Hoogeveen


Thank you, Peter,
There is no rush here. It would be nice to investigate this option, but it can 
wait until Mattias has returned and sifted through urgent matters. 
The question is even whether it would be a good idea to use an index to do the 
bookkeeping for the Enhanced API.
As it is now, the reification of e.g. a Relationship requires one property to 
be set on the relationship, containing the node ID of the associated node. On 
the associated node is a property containing the ID of the relationship, so 
there is a bidirectional lookup. Introducing an index would remove the need for 
these additional properties, but would lead to slower lookup times (no matter 
how fast the index).
So it's a trade-off between speed and cleanliness of namespace. Using the 
Enhanced API disallows certain property names to be used in user applications.
The property names used in the Enhanced API all start with 
org.neo4j.collections.graphdb., so there is little chance a user application 
would want to use those property names, but it is a restriction not found in 
the standard API, so ultimately something to consider.
Niels
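
The bidirectional lookup described above can be sketched with plain maps standing in for Neo4j property containers. Everything in this sketch is an assumption made for illustration: the class name, the two property keys (only a shared reserved prefix is mentioned in this thread, taken here to match the package name), and the use of numeric IDs as map keys.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in; plain maps play the role of property containers.
final class ReificationSketch {
    // Assumed reserved prefix and key names, chosen only for this example.
    static final String PREFIX = "org.neo4j.collections.graphdb.";
    static final String NODE_ID_KEY = PREFIX + "node_id";
    static final String REL_ID_KEY = PREFIX + "rel_id";

    private final Map<Long, Map<String, Object>> relationshipProps = new HashMap<>();
    private final Map<Long, Map<String, Object>> nodeProps = new HashMap<>();

    // Stores one property on each side so either element can find the other.
    void reify(long relationshipId, long nodeId) {
        relationshipProps.computeIfAbsent(relationshipId, k -> new HashMap<>())
                .put(NODE_ID_KEY, nodeId);
        nodeProps.computeIfAbsent(nodeId, k -> new HashMap<>())
                .put(REL_ID_KEY, relationshipId);
    }

    // Relationship -> associated node, or null when nothing was reified.
    Long nodeFor(long relationshipId) {
        Map<String, Object> props = relationshipProps.get(relationshipId);
        return props == null ? null : (Long) props.get(NODE_ID_KEY);
    }

    // Associated node -> relationship, or null when nothing was reified.
    Long relationshipFor(long nodeId) {
        Map<String, Object> props = nodeProps.get(nodeId);
        return props == null ? null : (Long) props.get(REL_ID_KEY);
    }

    // User applications would have to avoid keys under the reserved prefix.
    static boolean isReserved(String key) {
        return key.startsWith(PREFIX);
    }
}
```

Both lookups are a direct property read on an element that is already in hand, which is the speed advantage weighed against keeping the property namespace clean.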
 From: peter.neuba...@neotechnology.com
 Date: Thu, 28 Jul 2011 10:39:47 -0700
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] bdb-index
 
 niels,
 in this spike, I just concentrated on getting _something_ working in
 order to test insertion speed. This is not up to real indexing
 standards, so some love is needed here. I think Mattias is the best
 person to ask about pointers, let's wait until he is back next week if
 that is ok? Maybe some other (like the standard Lucene)  index can
 suffice for the time being to test out things?
 
 Cheers,
 
 /peter neubauer
 
 GTalk:  neubauer.peter
 Skype   peter.neubauer
 Phone   +46 704 106975
 LinkedIn   http://www.linkedin.com/in/neubauer
 Twitter  http://twitter.com/peterneubauer
 
 http://www.neo4j.org   - Your high performance graph database.
 http://startupbootcamp.org/- Öresund - Innovation happens HERE.
 http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
 
 
 
 On Thu, Jul 28, 2011 at 10:36 AM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:
 
  Trying to find something useful to hide the implementation book keeping of 
  Enhanced API, I tried out dbd-index as can be found 
  here:https://github.com/peterneubauer/bdb-index
  It looks interesting, but fails its tests. When recovering it performs 
  BerkeleyDbCommand#readCommand from the log. The retrieved indexName is not 
  actually garbage. I would like to help make this component workable, but 
  area of the database is a bit beyond the scope that I know.
  I know this is completely unsupported software, but can someone give me 
  some pointers on how to fix this issue?
  Niels

[Neo4j] Composable traversals

2011-07-28 Thread Niels Hoogeveen


I'd like to take a stab at implementing traversals in the Enhanced API. One of 
the things I'd like to do, is to make traversals composable. 

Right now a Traverser is created either by calling the traverse method on Node 
or by calling the traverse(Node) method on TraversalDescription. 

This makes traversals inherently non-composable, so we can't define a single 
traversal that returns the parents of all our friends.

To make Traversers composable we need a function:

Traverser traverse(Traverser, TraversalDescription)

My take on it is to make Element (which is a superinterface of Node) into a 
Traverser.

Traverser is basically another name for Iterable<Path>. 

Every Node (or more generally every Element) can be seen as an Iterable<Path>, 
returning a single Path, which contains a single path-element, the Node/Element 
itself.

Composing traversals would entail the concatenation of the paths returned with 
the paths supplied, so when we ask for the parents of all our friends, the 
returned paths would take the form:

Node --FRIEND--> Node --PARENT--> Node
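
A minimal sketch of the composition idea, assuming an in-memory graph and reducing a TraversalDescription to "follow one relationship type". The class ComposableTraversalSketch and its methods are hypothetical and are not part of Neo4j or the Enhanced API; a path is modelled as a list that alternates node names and relationship types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in; none of these names come from Neo4j.
final class ComposableTraversalSketch {
    // edges.get(node).get(type) lists the nodes reached from node over that type.
    private final Map<String, Map<String, List<String>>> edges = new HashMap<>();

    void relate(String from, String type, String to) {
        edges.computeIfAbsent(from, k -> new HashMap<>())
                .computeIfAbsent(type, k -> new ArrayList<>())
                .add(to);
    }

    // A node viewed as a traverser: a single path containing only the node.
    static List<List<String>> start(String node) {
        List<List<String>> paths = new ArrayList<>();
        paths.add(Collections.singletonList(node));
        return paths;
    }

    // Each result is a supplied path concatenated with one hop from its end node.
    List<List<String>> traverse(List<List<String>> startPaths, String type) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> path : startPaths) {
            String last = path.get(path.size() - 1);
            Map<String, List<String>> byType =
                    edges.getOrDefault(last, Collections.emptyMap());
            for (String next : byType.getOrDefault(type, Collections.emptyList())) {
                List<String> extended = new ArrayList<>(path);
                extended.add(type);
                extended.add(next);
                result.add(extended);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ComposableTraversalSketch graph = new ComposableTraversalSketch();
        graph.relate("me", "FRIEND", "ann");
        graph.relate("ann", "PARENT", "carol");
        // The output of one traversal is the input of the next.
        List<List<String>> friends = graph.traverse(start("me"), "FRIEND");
        System.out.println(graph.traverse(friends, "PARENT"));
    }
}
```

Because traverse takes and returns the same shape, "parents of all our friends" is simply two calls chained together, and each returned path keeps the prefix that led to it.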

Niels
  

Re: [Neo4j] Enhanced API and HyperRelationships

2011-07-27 Thread Niels Hoogeveen

Added a test for Enhanced API and HyperRelationships. Reification works 
correctly, HyperRelationships works correctly for binary relationships. Still 
need to add tests for HyperRelationships with higher arity (will do so later 
today).
Niels

 From: pd_aficion...@hotmail.com
 To: user@lists.neo4j.org
 Date: Wed, 27 Jul 2011 00:04:32 +0200
 Subject: Re: [Neo4j] Enhanced API and HyperRelationships

 Hi Peter,
 I will start writing test-code first. Some nice creation code will definitely 
 be part of that, which I will post on the graph-collections Wiki, together 
 with the resulting data (reminder to self: install neoclipse to make neat 
 images of the graph).
 The Enhanced stuff still needs thorough testing. My Scala app only uses the 
 non-enhanced features, so it was basically a test proving the wrappers all 
 work properly. In two or three days, I am confident the software is ready for 
 others to try out.
 Niels

  From: peter.neuba...@neotechnology.com
  Date: Tue, 26 Jul 2011 14:48:36 -0700
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Enhanced API and HyperRelationships

  That is cool Niels,
  I am looking forward to you testing it out, maybe some else people?
  Also, I would love to see how to query such a structure at the API
  level. Could you post some nice creation code and the resulting graph
  so we can see how it looks?

  Cheers,

  /peter neubauer

  GTalk:  neubauer.peter
  Skype   peter.neubauer
  Phone   +46 704 106975
  LinkedIn   http://www.linkedin.com/in/neubauer
  Twitter  http://twitter.com/peterneubauer

  http://www.neo4j.org   - Your high performance graph database.
  http://startupbootcamp.org/- Öresund - Innovation happens HERE.
  http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

  On Tue, Jul 26, 2011 at 2:44 PM, Niels Hoogeveen
  pd_aficion...@hotmail.com wrote:

   I just ported my own application (12 kloc of Scala code) to use the Enhanced 
   API and got it working. Of course more thorough testing needs to be done, 
   but it proves that at least in the case of my own application the 
   Enhanced API can work as a drop-in replacement.
   Niels

   From: pd_aficion...@hotmail.com
   To: user@lists.neo4j.org
   Date: Tue, 26 Jul 2011 22:13:59 +0200
   Subject: Re: [Neo4j] Enhanced API and HyperRelationships

   A first stab at implementing the Enhanced API and HyperRelationships is 
   finished. It still needs thorough testing, so this is PRE-ALPHA 
   quality. It also still lacks proper documentation (java docs).
   The source code can be found at:
   https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb
   A description can be found at:
   https://github.com/peterneubauer/graph-collections/wiki/Enhanced-API
   Niels
From: pd_aficion...@hotmail.com
To: user@lists.neo4j.org
Date: Tue, 26 Jul 2011 01:02:20 +0200
Subject: Re: [Neo4j] Enhanced API and HyperRelationships

The implementation of HyperRelationships needs another day of work, 
though the hard parts are finished now.

Time to explain the inner workings of HyperRelationships.

HyperRelationships are a generalization of the binary relationships 
found in Neo4j.

Instead of creating a relationship from a node to another node,
we define a HyperRelationship as a set of Nodes each having a 
RelationshipRole within the HyperRelationship.

For the binary case the RelationshipRoles are StartNode and EndNode.
For HyperRelationships with an arity higher than 2, the Roles need to 
be defined for each HyperRelationshipType.

A HyperRelationship is laid out in the database as a regular 
relationship in the binary case.

For HyperRelationship with an arity higher than 2, a Node is created 
subsuming the role of Relationship.
From this Node, binary relationships (regular Neo4J relationships) are 
created for each Element of the relationship.

The RelationshipTypes of these binary relationships are a 
concatenation of the name of the HyperRelationshipType used
and the RelationshipRole of the attached Element.

Example:

Suppose we want to store the fact that Flo and Eddie give Tom, Dick 
and Harry a Book.

This is a ternary relationship, with the following RelationshipRoles:

Giver: Flo and Eddie
Recipient: Tom, Dick and Harry
Gift: Book

The GIVE relationship is first created with a Set of Roles (Giver, 
Recipient and Gift).
When the example relation is created the following binary 
relationships will be created:

HyperRelationshipNode --GIVE/#/Giver--> Flo
HyperRelationshipNode --GIVE/#/Giver--> Eddie
HyperRelationshipNode --GIVE/#/Recipient--> Tom
HyperRelationshipNode --GIVE/#/Recipient--> Dick
HyperRelationshipNode --GIVE/#/Recipient--> Harry
HyperRelationshipNode --GIVE/#/Gift--> Book
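
The naming scheme in the example above can be sketched in a few lines. HyperRelationshipSketch is a hypothetical helper that only builds the type names and prints the layout; the "/#/" separator is taken from the example, while the direction of the arrows (from the relationship node to each element) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Enhanced API.
final class HyperRelationshipSketch {

    // Concatenates HyperRelationshipType and role, e.g. "GIVE/#/Giver".
    static String binaryTypeName(String hyperType, String role) {
        return hyperType + "/#/" + role;
    }

    // One binary relationship per element, listed in the order the roles were given.
    static List<String> layOut(String hyperType, Map<String, List<String>> elementsByRole) {
        List<String> edges = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : elementsByRole.entrySet()) {
            String type = binaryTypeName(hyperType, entry.getKey());
            for (String element : entry.getValue()) {
                edges.add("HyperRelationshipNode --" + type + "--> " + element);
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        Map<String, List<String>> roles = new LinkedHashMap<>();
        roles.put("Giver", Arrays.asList("Flo", "Eddie"));
        roles.put("Recipient", Arrays.asList("Tom", "Dick", "Harry"));
        roles.put("Gift", Arrays.asList("Book"));
        for (String edge : layOut("GIVE", roles)) {
            System.out.println(edge);
        }
    }
}
```

Encoding the role in the relationship type name means the elements playing a given role can be fetched from the relationship node with an ordinary lookup by relationship type.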

[Neo4j] HyperRelationship example

2011-07-27 Thread Niels Hoogeveen


I just posted an example on how to use HyperRelationships:

https://github.com/peterneubauer/graph-collections/wiki/HyperRelationship-example

There is now a proper test for HyperRelationships, so I hereby push the 
software to Beta status.

Please try out the Enhanced API and HyperRelationships and let me know what 
needs improvement.

Niels 

Re: [Neo4j] how to scale and view or generate reports for complex graphs?

2011-07-27 Thread Niels Hoogeveen


Hi Sambodhi,
One of the means to organize complexity is by adding meta information to your 
database. This first of all helps you organize what relationships and 
properties belong to what sort of node, it may also help answer questions such 
as: what nodes belong to what type/class.
Niels


 Date: Wed, 27 Jul 2011 23:23:45 +0100
 From: sambodhi.s...@gmail.com
 To: user@lists.neo4j.org
 Subject: [Neo4j] how to scale and view or generate reports for complex
 graphs?
 
 Hi Guys!
 
 I am a bit new to Graph database. I really liked the concept, graph made
 managing relationship between the entities relatively easy. I therefore
 chose to use it in my new project. I started the development two weeks back
 and my graph has already grown so complex with static data. I am wondering
 when it goes to production with thousands of users, how would we manage it.
 What really bothers me is :
 
 a. how do view such a complex graph? I use neoecplise but am not sure it
 would be able to accommodate thousands of nodes and at the same time it
 would be easy to eyes to find a particular node.
 
 b. is there any kind of report generation tool ?
 
 c. how to scale the graph? i read few article on it but it got me more
 confused. Would be really helpful if you can provide a link to a relevant
 document.
 
 Many Thanks!
 Sambodhi

Re: [Neo4j] Enhanced API and HyperRelationships

2011-07-27 Thread Niels Hoogeveen

I have integrated the IndexedRelationships functionality into the Enhanced API,
so relationships of a certain type are maintained in a B-tree, while they can
still be manipulated through the API just like any other relationship.
I still need to test this.
As mentioned earlier today, HyperRelationships and the Enhanced API now have a
set of tests, which they pass.

Niels

From: pd_aficion...@hotmail.com
To: user@lists.neo4j.org
Date: Tue, 26 Jul 2011 22:13:59 +0200
Subject: Re: [Neo4j] Enhanced API and HyperRelationships

A first stab at implementing the Enhanced API and HyperRelationships is
finished. It still needs thorough testing, so this is PRE-ALPHA quality. It
also still lacks proper documentation (Javadoc).

The source code can be found at:
https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb

A description can be found at:
https://github.com/peterneubauer/graph-collections/wiki/Enhanced-API

Niels
From: pd_aficion...@hotmail.com
To: user@lists.neo4j.org
Date: Tue, 26 Jul 2011 01:02:20 +0200
Subject: Re: [Neo4j] Enhanced API and HyperRelationships

The implementation of HyperRelationships needs another day of work, though
the hard parts are finished now.

Time to explain the inner workings of HyperRelationships.

HyperRelationships are a generalization of the binary relationships found
in Neo4j.

Instead of creating a relationship from a node to another node,
we define a HyperRelationship as a set of Nodes each having a
RelationshipRole within the HyperRelationship.

For the binary case the RelationshipRoles are StartNode and EndNode.
For HyperRelationships with an arity higher than 2, the Roles need to be
defined for each HyperRelationshipType.

A HyperRelationship is laid out in the database as a regular relationship
in the binary case.

For a HyperRelationship with an arity higher than 2, a Node is created
that takes on the role of the Relationship.
From this Node, binary relationships (regular Neo4j relationships) are
created for each Element of the relationship.

The RelationshipTypes of these binary relationships are a concatenation of
the name of the HyperRelationshipType used
and the RelationshipRole of the attached Element.

Example:

Suppose we want to store the fact that Flo and Eddie give Tom, Dick and
Harry a Book.

This is a ternary relationship, with the following RelationshipRoles:

Giver: Flo and Eddie
Recipient: Tom, Dick and Harry
Gift: Book

The GIVE relationship is first created with a Set of Roles (Giver,
Recipient and Gift).
When the example relation is created, the following binary relationships
will be created:

HyperRelationshipNode --GIVE/#/Giver-- Flo
HyperRelationshipNode --GIVE/#/Giver-- Eddie
HyperRelationshipNode --GIVE/#/Recipient-- Tom
HyperRelationshipNode --GIVE/#/Recipient-- Dick
HyperRelationshipNode --GIVE/#/Recipient-- Harry
HyperRelationshipNode --GIVE/#/Gift-- Book
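
The layout above can be sketched in a few lines of plain Python. This is only
an in-memory illustration, not the Enhanced API: the function name, the
triple representation and the generated node names are assumptions made for
the example. Only the GIVE/#/Role naming scheme comes from the description.

```python
from itertools import count

SEPARATOR = "/#/"      # separator as used in the example above
_ids = count(1)
relationships = []     # (start_node, relationship_type, end_node) triples

def create_hyper_relationship(type_name, roles, elements):
    """Create one node standing in for the relationship, plus one binary
    relationship per attached element, named <type>/#/<role>."""
    unknown = set(elements) - set(roles)
    if unknown:
        raise ValueError(f"roles not defined for {type_name}: {sorted(unknown)}")
    hyper_node = f"HyperRelationshipNode-{next(_ids)}"
    for role in roles:
        for element in elements.get(role, []):
            relationships.append(
                (hyper_node, type_name + SEPARATOR + role, element))
    return hyper_node

give = create_hyper_relationship(
    "GIVE",
    roles=["Giver", "Recipient", "Gift"],
    elements={
        "Giver": ["Flo", "Eddie"],
        "Recipient": ["Tom", "Dick", "Harry"],
        "Gift": ["Book"],
    },
)

for start, rel_type, end in relationships:
    print(f"{start} --{rel_type}-- {end}")
```

Running it prints six lines of the same shape as the list above, one per
attached Element.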

We can now retrieve all Relationships where Flo is the Giver in a GIVE
relationship,
simply by concatenating GIVE and Giver into GIVE/#/Giver,
and then asking Flo for all incoming Relationships with that RelationshipType.

This fetches the HyperRelationship nodes, from which the other attached
Elements of the HyperRelationship can be loaded.
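
A minimal sketch of this lookup, again in plain Python over in-memory triples
rather than the real API. The node ids h1 and h2, the second GIVE and both
function names are invented for the illustration.

```python
SEPARATOR = "/#/"

# (hyper_node, relationship_type, element) triples, as in the GIVE example.
relationships = [
    ("h1", "GIVE/#/Giver", "Flo"),
    ("h1", "GIVE/#/Giver", "Eddie"),
    ("h1", "GIVE/#/Recipient", "Tom"),
    ("h1", "GIVE/#/Recipient", "Dick"),
    ("h1", "GIVE/#/Recipient", "Harry"),
    ("h1", "GIVE/#/Gift", "Book"),
    ("h2", "GIVE/#/Giver", "Tom"),       # a second, unrelated GIVE
    ("h2", "GIVE/#/Recipient", "Flo"),
    ("h2", "GIVE/#/Gift", "Pen"),
]

def hyper_relationships(element, type_name, role):
    """Hyper nodes whose <type>/#/<role> relationship points at `element`."""
    wanted = type_name + SEPARATOR + role
    return [h for h, t, e in relationships if t == wanted and e == element]

def elements_of(hyper_node):
    """Load every element attached to a hyper node, grouped by role."""
    by_role = {}
    for h, t, e in relationships:
        if h == hyper_node:
            role = t.split(SEPARATOR, 1)[1]
            by_role.setdefault(role, []).append(e)
    return by_role

for h in hyper_relationships("Flo", "GIVE", "Giver"):
    print(h, elements_of(h))
```

Only h1 is found for Flo as Giver; h2 is skipped because Flo is the Recipient
there, which is exactly what the role part of the concatenated name buys us.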

I added an extra interface FunctionalRelationshipRole, which restricts the
number of Elements attached to a RelationshipRole within a
HyperRelationship to one.

In use, this amounts to something similar to having a
getSingleRelationship method
that cannot throw an Exception, because multiple entries with the same
RelationshipType cannot be created by design.
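
The idea can be sketched as follows. This is a toy Python stand-in, not the
FunctionalRelationshipRole interface itself; the class and method names are
assumptions. The point it tries to show is that the restriction is enforced
when writing, so the single-element read has no "more than one found" case.

```python
class HyperRelationship:
    """Toy stand-in for a HyperRelationship; not the Enhanced API classes."""

    def __init__(self, functional_roles=()):
        self._functional = set(functional_roles)
        self._elements = {}          # role -> list of attached elements

    def add(self, role, element):
        attached = self._elements.setdefault(role, [])
        if role in self._functional and attached:
            # Refuse at write time, so reads never see two entries.
            raise ValueError(f"role {role!r} already has an element")
        attached.append(element)

    def get_single(self, role):
        """Meant for functional roles: add() guarantees at most one entry
        there, so this returns that entry, or None if nothing is attached."""
        attached = self._elements.get(role, [])
        return attached[0] if attached else None

give = HyperRelationship(functional_roles={"Gift"})
give.add("Giver", "Flo")
give.add("Giver", "Eddie")       # fine: Giver is not functional
give.add("Gift", "Book")
print(give.get_single("Gift"))

try:
    give.add("Gift", "Pen")      # rejected: Gift is functional
except ValueError as err:
    print("rejected:", err)
```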

Niels
From: pd_aficion...@hotmail.com
To: user@lists.neo4j.org
Date: Mon, 25 Jul 2011 02:03:54 +0200
Subject: [Neo4j] Enhanced API and HyperRelationships

Today I wrote a piece about the Enhanced API and about
HyperRelationships, which I have been working on over the last couple of days.

See: https://github.com/peterneubauer/graph-collections/wiki/Enhanced-API

The API as presented in the graph-collections repo on GitHub is not yet
feature complete with respect to HyperRelationships.
The interfaces are there, but the implementation only works for binary
relationships at present. I need one more day for the implementation.

I posted the Wiki page and the source code to open the discussion about
these new features.

Niels

Re: [Neo4j] Enhanced API and HyperRelationships

2011-07-26 Thread Niels Hoogeveen

A first stab at implementing the Enhanced API and HyperRelationships is
finished. It still needs thorough testing, so this is PRE-ALPHA quality. It
also still lacks proper documentation (Javadoc).

The source code can be found at:
https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb

A description can be found at:
https://github.com/peterneubauer/graph-collections/wiki/Enhanced-API

Niels

Re: [Neo4j] Enhanced API and HyperRelationships

2011-07-26 Thread Niels Hoogeveen

I just ported my own application (12 kloc of Scala code) to use the Enhanced
API and got it working. Of course, more thorough testing needs to be done, but
it shows that, at least in the case of my own application, the Enhanced API can
work as a drop-in replacement.
Niels