Re: [Neo4j] Indexed relationships

Niels Hoogeveen Thu, 07 Jul 2011 13:11:47 -0700
IndexedRelationship and IndexedRelationshipExpander are now in Git. See: 
https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship
An example:
class IdComparator implements java.util.Comparator<Node>{
  public int compare(Node n1, Node n2){
    long l1 = Long.reverse(n1.getId());
    long l2 = Long.reverse(n2.getId());
    if(l1 == l2) return 0;
    else if(l1 < l2) return -1;
    else return 1;
  }
}static enum RelTypes implements RelationshipType{
  DIRECT_RELATIONSHIP,
  INDEXED_RELATIONSHIP,
};
Node indexedNode = graphDb().createNode();
IndexedRelationship ir = new IndexedRelationship(RelTypes.INDEXED_RELATIONSHIP, 
Direction.OUTGOING, new IdComparator(), true, indexedNode, graphDb());
        
Node n1 = graphDb().createNode();
n1.setProperty("name", "n1");
Node n2 = graphDb().createNode();
n2.setProperty("name", "n2");
Node n3 = graphDb().createNode();
n3.setProperty("name", "n3");
Node n4 = graphDb().createNode();
n4.setProperty("name", "n4");
                
indexedNode.createRelationshipTo(n1, RelTypes.DIRECT_RELATIONSHIP);
indexedNode.createRelationshipTo(n3, RelTypes.DIRECT_RELATIONSHIP);
ir.createRelationshipTo(n2);
ir.createRelationshipTo(n4);
                
IndexedRelationshipExpander re1 = new IndexedRelationshipExpander(graphDb(), 
Direction.OUTGOING, RelTypes.DIRECT_RELATIONSHIP);
IndexedRelationshipExpander re2 = new IndexedRelationshipExpander(graphDb(), 
Direction.OUTGOING, RelTypes.INDEXED_RELATIONSHIP);
                
for(Relationship rel: re1.expand(indexedNode)){
  System.out.println(rel.getEndNode().getProperty("name"));
}
for(Relationship rel: re2.expand(indexedNode)){
  System.out.println(re2.getEndNode().getProperty("name"));
}
> From: [email protected]
> To: [email protected]
> Date: Thu, 7 Jul 2011 16:55:36 +0200
> Subject: Re: [Neo4j] Indexed relationships
> 
> 
> Hi Michael,I realize that the implementation of IndexedRelationship can in 
> fact support returning relationships, and I have a preliminary version 
> running locally now.The returned relationships can support all methods of the 
> Relationship interface, returning the node pointing to the treeRoot as the 
> startNode, and returning the node set as the key_value as the endNode.All 
> relationship properties will be stored on the KEY_VALUE relationship pointing 
> to the endNode.There is one caveat to this solution, the returned 
> relationships cannot support the getId() method,and will throw an 
> UnsupportedOperationException when being called.IndexedRelationship will 
> implement Iterable<Relationship>.With these changes, it is possible to create 
> an Expander and I am working right now to implement that.Niels
> 
> > From: [email protected]
> > To: [email protected]
> > Date: Thu, 7 Jul 2011 14:46:35 +0200
> > Subject: Re: [Neo4j] Indexed relationships
> > 
> > 
> > Hi Michael,
> > 
> > I haven't yet worked on an example. 
> > 
> > There are tests for the SortedTree implementation, 
> > but didn't add those to the IndexedRelationship class, 
> > which is simply a wrapper around SortedTree. 
> > Having a test would have caught the error 
> > that no relationship to the treeNode was created 
> > (fixed that bug and pushed it to Git) 
> > (note to self: always create a unit test, 
> > especially when code seems trivial).
> > 
> > There is no relationship expander that uses this. 
> > The RelationshipExpander has a method Iterable<Relationship> expand(Node 
> > node) 
> > which cannot be supported, since there is no direct relationship from 
> > startnode to endnode. 
> > Instead there is a path through the index tree. 
> > 
> > It's not possible to support the original relationship-traversal API 
> > since the IndexedRelationship class is not a wrapper around a node, 
> > but a wrapper around the relationships of a certain RelationshipType in the 
> > OUTGOING direction. 
> > 
> > As to the name of the class. 
> > It is essentially an indexed relationship, 
> > and not just a solution to the densely-connected-node problem. 
> > An indexed relationship can also be used to maintain 
> > a sorted set of relationships of any size, 
> > and can be used to guarantee unicity constraints. Niels
> > > From: [email protected]
> > > Date: Thu, 7 Jul 2011 13:27:00 +0200
> > > To: [email protected]
> > > Subject: Re: [Neo4j] Indexed relationships
> > > 
> > > Good work,
> > > 
> > > do you have an example ready (and/or some tests that show how it works/is 
> > > used) ?
> > > 
> > > In creation, manual traversal and automatic traversal (i.e. is there a 
> > > RelationshipExpander that uses it).
> > > 
> > > And in the constructor if there is no relationship to the treeNode, you 
> > > create a new one, but that new treeNode is not connected to the actual 
> > > node?
> > > 
> > > I'm not sure if it should support the original relationship-traversal API 
> > > / methods (getRelationships(Dir,type), etc).
> > > 
> > > Perhaps that IndexedRelationship should rather be just a wrapper around a 
> > > SuperNode ? So probably rename it to "SuperNode(Wrapper) or 
> > > HeavilyConnectedNode(Wrapper) ?)
> > > 
> > > Cheers
> > > 
> > > Michael
> > > 
> > > Am 07.07.2011 um 12:51 schrieb Niels Hoogeveen:
> > > 
> > > > 
> > > > Finished the implementation of indexed relationships. The graph 
> > > > collections component now contains the package 
> > > > https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship,
> > > >  containing the IndexedRelationship class.
> > > > This class can be used instead of regular relationships 
> > > > when:relationships need to be stored in a particular sort ordera 
> > > > unicity constraint needs to be guaranteed nodes become densely 
> > > > populated with relationships.
> > > > The implementation is traverser friendly. Given a start nodes all end 
> > > > nodes can be found by following four relationships types in outgoing 
> > > > direction. Given an end node the start node can be found by following 
> > > > these four relationship types in incoming direction. Of course this 
> > > > functionality is also covered in the API.
> > > > Niels
> > > > 
> > > >> From: [email protected]
> > > >> To: [email protected]
> > > >> Date: Thu, 7 Jul 2011 02:36:29 +0200
> > > >> Subject: Re: [Neo4j] Indexed relationships
> > > >> 
> > > >> 
> > > >> Pushed SortedTree to Git after adding a unit test and doing some 
> > > >> debugging.
> > > >> TODO:Add API for indexed relationships using SortedTree as the 
> > > >> implementation.Make SortedTree thread safe.
> > > >> With regard to the latter issue. I am considering the following 
> > > >> solution. Acquire a lock (delete a non existent property) on the node 
> > > >> that points to the root of the tree at the start of AddNode, 
> > > >> RemoveNode and Delete. No other node in the SortedTree is really 
> > > >> stable, even the rootnode may be moved down, turning another node into 
> > > >> the new rootnode, while after a couple of remove actions the original 
> > > >> rootnode may even be deleted. 
> > > >> Locking the node pointing to the rootnode, prevents all other 
> > > >> threads/transactions from updating the tree. This may seem 
> > > >> restrictive, but a single new entry or a single removal may in fact 
> > > >> have impact on much of the tree, due to balancing. More selective 
> > > >> locking would require a prebalancing tree walk, determining the 
> > > >> affected subtrees, lock them and once every affected subtree is 
> > > >> locked, perform the actual balancing. 
> > > >> Please let me hear if there are any objections to locking the node 
> > > >> pointing to the tree as the a solution to make SortedTree thread safe.
> > > >> Niels
> > > >> 
> > > >>> Date: Tue, 5 Jul 2011 08:27:57 +0200
> > > >>> From: [email protected]
> > > >>> To: [email protected]
> > > >>> Subject: Re: [Neo4j] Indexed relationships
> > > >>> 
> > > >>> Great work Nils!
> > > >>> 
> > > >>> /peter
> > > >>> 
> > > >>> Sent from my phone.
> > > >>> On Jul 4, 2011 11:39 PM, "Niels Hoogeveen" <[email protected]>
> > > >>> wrote:
> > > >>>> 
> > > >>>> Made some more changes to the SortedTree implementation. Previously
> > > >>> SortedTree would throw an exception if a duplicate entry was being 
> > > >>> added.
> > > >>>> I changed SortedTree to allow a key to point to more than one node, 
> > > >>>> unless
> > > >>> the SortedTree is created as a unique index, in which case an 
> > > >>> exception is
> > > >>> raised when an attempt is made to add a node to an existing key entry.
> > > >>>> A SortedTree once defined as unique can not be changed to a 
> > > >>>> non-unique
> > > >>> index or vice-versa.
> > > >>>> SortedTrees now have a name, which is stored in the a property of the
> > > >>> TREE_ROOT relationship and in the KEY_VALUE relationship (a new 
> > > >>> relationship
> > > >>> that points from the SortedTree to the Node inserted in the 
> > > >>> SortedTree). The
> > > >>> name of a SortedTree can not be changed.
> > > >>>> SortedTrees now store the class of the Comparator, so a SortedTree, 
> > > >>>> once
> > > >>> created, can not be used with a different Comparator.
> > > >>>> SortedTree is now an Iterable, making it possible to use it in a
> > > >>> foreach-loop.
> > > >>>> Since there are as of yet, no unit tests for SortedTree, I will 
> > > >>>> create
> > > >>> those first before pushing my changes to Git. Preliminary results so 
> > > >>> far are
> > > >>> good. I integrated the changes in my own application and it seems to 
> > > >>> work
> > > >>> fine.
> > > >>>> Todo:
> > > >>>> Decide on an API for indexed relationships. (Community input still
> > > >>> welcome).Write unit tests.Make SortedTree thread safe (Community help 
> > > >>> still
> > > >>> welcome).
> > > >>>> Niels
> > > >>>> 
> > > >>>>> From: [email protected]
> > > >>>>> To: [email protected]
> > > >>>>> Date: Mon, 4 Jul 2011 15:49:45 +0200
> > > >>>>> Subject: Re: [Neo4j] Indexed relationships
> > > >>>>> 
> > > >>>>> 
> > > >>>>> I forgot to add another recurrent issue that can be solved with 
> > > >>>>> indexed
> > > >>> relationships: guaranteed unicity constraints.
> > > >>>>>> From: [email protected]
> > > >>>>>> To: [email protected]
> > > >>>>>> Date: Mon, 4 Jul 2011 01:55:08 +0200
> > > >>>>>> Subject: [Neo4j] Indexed relationships
> > > >>>>>> 
> > > >>>>>> 
> > > >>>>>> In the thread [Neo4j] traversing densely populated nodes we 
> > > >>>>>> discussed
> > > >>> the problems arising when large numbers of relationships are added to 
> > > >>> the
> > > >>> same node.
> > > >>>>>> Over the weekend, I have worked on a solution for the
> > > >>> dense-relationship-nodes using SortedTree in the neo-graph-collections
> > > >>> component. After some minor tweaks to the implementation of 
> > > >>> SortedTree, I
> > > >>> have managed to get a workable solution, where two nodes are not 
> > > >>> directly
> > > >>> linked by a relationship, but by means of a BTree (entirely stored in 
> > > >>> the
> > > >>> graph).
> > > >>>>>> Before continuing this work, I'd like to have a discussion about
> > > >>> features, since what we have now is not just a solution for the dense
> > > >>> populated node issue, but is actually a full fledges indexed 
> > > >>> relationship,
> > > >>> which makes it suitable for other purposes too.
> > > >>>>>> An indexed relationship can for example be used to maintain a 
> > > >>>>>> sorted
> > > >>> set of relationships in the graph, that is not necessarily huge, but 
> > > >>> large
> > > >>> enough to make sorting on internal memory too expensive an operation, 
> > > >>> or
> > > >>> situations where only one out of a large number of relationships is 
> > > >>> actually
> > > >>> traversed in most cases.
> > > >>>>>> There are probably more use cases for in-graph indexed 
> > > >>>>>> relationships,
> > > >>> so I'd like to know what features are desirable and what API would 
> > > >>> Neo4J
> > > >>> users appreciate.
> > > >>>>>> P.S. I still think it would be good to consider, if technically
> > > >>> possible, partitioning the relationship store per relationship type 
> > > >>> and per
> > > >>> direction. The indexed relationship solution works, but is of course 
> > > >>> slower
> > > >>> than a direct relationship, both with respect to insert time and 
> > > >>> traversal
> > > >>> time. If dense relationships are never traversed going out of the 
> > > >>> dense
> > > >>> node, the extra structure maintained by the BTree is only extra 
> > > >>> burden.
> > > >>>>>> P.P.S. If there are people with experience to make an 
> > > >>>>>> implementation
> > > >>> thread safe, please volunteer to help make the implementation 
> > > >>> production
> > > >>> proof.
> > > >>>>>> Niels
> > > >>>>>> _______________________________________________
> > > >>>>>> Neo4j mailing list
> > > >>>>>> [email protected]
> > > >>>>>> https://lists.neo4j.org/mailman/listinfo/user
> > > >>>>> 
> > > >>>>> _______________________________________________
> > > >>>>> Neo4j mailing list
> > > >>>>> [email protected]
> > > >>>>> https://lists.neo4j.org/mailman/listinfo/user
> > > >>>> 
> > > >>>> _______________________________________________
> > > >>>> Neo4j mailing list
> > > >>>> [email protected]
> > > >>>> https://lists.neo4j.org/mailman/listinfo/user
> > > >>> _______________________________________________
> > > >>> Neo4j mailing list
> > > >>> [email protected]
> > > >>> https://lists.neo4j.org/mailman/listinfo/user
> > > >>                                          
> > > >> _______________________________________________
> > > >> Neo4j mailing list
> > > >> [email protected]
> > > >> https://lists.neo4j.org/mailman/listinfo/user
> > > >                                           
> > > > _______________________________________________
> > > > Neo4j mailing list
> > > > [email protected]
> > > > https://lists.neo4j.org/mailman/listinfo/user
> > > 
> > > _______________________________________________
> > > Neo4j mailing list
> > > [email protected]
> > > https://lists.neo4j.org/mailman/listinfo/user
> >                                       
> > _______________________________________________
> > Neo4j mailing list
> > [email protected]
> > https://lists.neo4j.org/mailman/listinfo/user
>                                         
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
                                          
_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Indexed relationships

Reply via email to