Re: [Neo4j] Issues with IndexedRelationship

2011-09-08 Thread Niels Hoogeveen

Excellent... I did a code review and think this is a huge improvement over what
we had.
Peter, can you pull these changes? I no longer have the privileges to do so.
Niels

 Date: Thu, 8 Sep 2011 17:24:44 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Issues with IndexedRelationship
 
 I have made the SortedTree changes regarding relationships vs nodes, and have
 got all the tests passing.  The changes are pushed up to my GitHub account
 (and a pull request has been raised).
 
 The changes can be seen here:
 https://github.com/brycenz/graph-collections
 

Re: [Neo4j] Issues with IndexedRelationship

2011-09-08 Thread Niels Hoogeveen

I like this idea

 Date: Thu, 8 Sep 2011 15:41:52 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Issues with IndexedRelationship
 
 Another thought if there is going to be a larger refactor of the code is
 whether the indexing mechanism should be broken out as a strategy for the
 IndexedRelationship.  At present it is tied to SortedTree, but if an
 interface was extracted out that had addNode, removeNode, iterator, and
 isUniqueIndex then other indexing implementations could be used in certain
 cases.
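
 A rough sketch of what such an extracted interface might look like. Only the
 four operation names (addNode, removeNode, iterator, isUniqueIndex) come from
 the message above. The interface name RelationshipIndexStrategy, the generic
 type parameter, the boolean return types, and the in-memory
 ListIndexStrategy are hypothetical and not part of graph-collections; a real
 version would presumably work with Neo4j Node and Relationship objects.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical strategy interface; the four operations are the ones named above.
interface RelationshipIndexStrategy<T> extends Iterable<T> {
    boolean addNode(T node);      // returns false if the entry was rejected
    boolean removeNode(T node);   // returns false if the entry was not present
    boolean isUniqueIndex();      // true if duplicate entries are refused
    @Override
    Iterator<T> iterator();
}

// Trivial in-memory stand-in, only to show that implementations are swappable.
final class ListIndexStrategy<T> implements RelationshipIndexStrategy<T> {
    private final List<T> entries = new ArrayList<>();
    private final boolean unique;

    ListIndexStrategy(boolean unique) {
        this.unique = unique;
    }

    @Override
    public boolean addNode(T node) {
        if (unique && entries.contains(node)) {
            return false;
        }
        return entries.add(node);
    }

    @Override
    public boolean removeNode(T node) {
        return entries.remove(node);
    }

    @Override
    public boolean isUniqueIndex() {
        return unique;
    }

    @Override
    public Iterator<T> iterator() {
        return entries.iterator();
    }
}
```

 With something along these lines, IndexedRelationship would hold a reference
 to the interface only, and SortedTree would become one implementation among
 several.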
 
 The particular other implementation I am currently thinking of that could be
 of use to me would be a paged linked list.  That would be a linked list of
 pages, each holding between min and max KEY_VALUE (or equivalent)
 relationships.  I think that could work quite well for the situation where
 the index is in descending date order, is generally just appended to at the
 most recent end, and results are generally retrieved in a paged manner from
 near the most recent end.
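
 To make the paged linked list idea a little more concrete, here is a minimal
 in-memory sketch. The class PagedList and its methods are hypothetical; it
 keeps plain values rather than KEY_VALUE relationships, enforces only the
 maximum page size (the minimum would matter once removal is supported), and
 says nothing about how pages would be laid out in the graph.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical paged list: newest entries live in the head page, and a new
// head page is started once the current one holds maxPerPage entries.
final class PagedList<T> {
    private final int maxPerPage;
    private final Deque<List<T>> pages = new ArrayDeque<>(); // head = newest page

    PagedList(int maxPerPage) {
        if (maxPerPage < 1) {
            throw new IllegalArgumentException("maxPerPage must be >= 1");
        }
        this.maxPerPage = maxPerPage;
    }

    // Append at the most recent end.
    void add(T entry) {
        List<T> head = pages.peekFirst();
        if (head == null || head.size() >= maxPerPage) {
            head = new ArrayList<>();
            pages.addFirst(head);
        }
        head.add(entry);
    }

    // Return up to 'limit' entries, newest first, skipping the 'offset' newest.
    List<T> newest(int offset, int limit) {
        List<T> out = new ArrayList<>();
        int skipped = 0;
        for (List<T> page : pages) {
            for (int i = page.size() - 1; i >= 0; i--) {
                if (skipped < offset) {
                    skipped++;
                    continue;
                }
                if (out.size() == limit) {
                    return out;
                }
                out.add(page.get(i));
            }
        }
        return out;
    }
}
```

 Appending touches only the head page, and reading a page of recent results
 walks only as many pages as the offset and limit require, which is the
 access pattern described above.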
 
 But more to the point, there could be any number of implementations, each
 suited to a different situation.
 
 That does bring up a question though: there was some discussion a while ago
 about functionality along the lines of IndexedRelationship being pulled into
 the core, so is this overkill for now if there is going to be another core
 offering later?
 
 On Thu, Sep 8, 2011 at 2:38 PM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:
 
 
  I think we don't have to worry much about backwards compatibility yet.
  There has not been a formal release of the component, so anyone using the
  software will accept that they are on the bleeding edge.

  Indeed, addNode should return the KEY_VALUE relationship, and I think we
  should change the signature of SortedTree to make it an
  Iterable<Relationship>. There is no need to maintain a Node iterator; the
  node is always one getEndNode away.
  Niels
 

Re: [Neo4j] Issues with IndexedRelationship

2011-09-08 Thread Peter Neubauer
Niels, Bryce,
great! Gave you access to the repo, please merge :)

/peter


[Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
Hi,

As I mentioned a while ago I am looking at using IndexedRelationships
within my application.  The major thing that was missing for me to be able
to do this was IndexedRelationshipExpander being able to provide all the
relationships from the leaf end of indexed relationships through to the
root end.  So I have been working on getting that support in there.

However in writing this I have discovered a number of other issues that I
have also fixed, and at least one I am still working on.  Since I was right
into the extra support for expanding the relationships it is hard to break
out these fixes as a separate commit (which I think would be ideal), so it
will most likely all come in together hopefully later today (NZ time).

Just letting everyone know in case someone else is doing development against
indexed relationships.

Quick run down of the issues.  Note: N -- IR(X) -- {A,B} below means there
is an indexed relationship from N to A and B, of type X.

1) Exception thrown when more than one IR terminates at a given node, e.g.:
N1 -- IR(X) -- {A,B,C,D}
N2 -- IR(X) -- {A,X,Y,Z}
Will throw an exception when using the IndexedRelationshipExpander on either
N1, or N2.

2) Start / End nodes are transposed when the IR has a direction of
incoming, i.e. the IR is created against N but across a set of incoming
relationships:
N -- IR(Y) -- {A,B,C}
Will return 3 relationships N -- A, N -- B, N -- C.

I have written tests for each of these, as well as a couple of other tests.

Still completing (1) and have a little question about this.  In order to fix
this I may need to introduce a unique ID stored against the IR both at the
root and at the leaves.  Currently the relationship type is used to name the
IR at both root and leaves, but in the case above that means you can't tell
from node A which KEY_VALUE relationship belongs to which IR tree without
traversing the tree.

So the question is: adding this ID would mean that anyone who is already
using this won't have the ID, and therefore without care their data will be
incompatible with the updated code.  This could be managed via a check for
the ID when accessing the tree and, if it isn't there, doing a walk over the
tree to populate all the places where it is required.
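
The check-and-populate approach could be sketched roughly as below. Everything
here is a stand-in: TreeEntry is a plain object rather than a Neo4j node, the
property keys are made up, and storing the ID as two long values is an
assumption for the purpose of the sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Stand-in for a tree node with properties; not a Neo4j type.
final class TreeEntry {
    final Map<String, Object> properties = new HashMap<>();
    final List<TreeEntry> children = new ArrayList<>();
}

final class TreeIdBackfill {
    static final String ID_MSB = "tree_id_msb"; // hypothetical property keys
    static final String ID_LSB = "tree_id_lsb";

    private TreeIdBackfill() {
    }

    // If the root has no ID yet, generate one and copy it to every entry below.
    static UUID ensureId(TreeEntry root) {
        Object msb = root.properties.get(ID_MSB);
        Object lsb = root.properties.get(ID_LSB);
        if (msb instanceof Long && lsb instanceof Long) {
            return new UUID((Long) msb, (Long) lsb);
        }
        UUID id = UUID.randomUUID();
        stamp(root, id);
        return id;
    }

    // Walk the tree and write the ID wherever it is needed.
    private static void stamp(TreeEntry entry, UUID id) {
        entry.properties.put(ID_MSB, id.getMostSignificantBits());
        entry.properties.put(ID_LSB, id.getLeastSignificantBits());
        for (TreeEntry child : entry.children) {
            stamp(child, id);
        }
    }
}
```

The walk only happens the first time an old tree is accessed; after that the
check at the root is a cheap property lookup.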

In general in developing against this code where do we sit on data
compatibility and API compatibility?

Cheers
Bryce
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Niels Hoogeveen

Great work Bryce,
I do have a question though.

What is the rationale for the restriction mentioned under 1)? Do you need
this for the general case (to make IndexedRelationshipExpander work correctly),
or do you need it for your own application to throw that exception? If the
latter is the case, I think it would be important to tease out the general case
and offer this new behaviour as an option.

A unique key for the index is a good idea anyway and can be added to
SortedTree. Generate a UUID and store it in two long properties. That way the
two values will always be read in the first fetch of the underlying
PropertyContainer. A getId method on the TreeNodes can then return a String
representation of the two long values.

IndexedRelationships are a relatively new development, so I think you are one of
the first to actually try them out. Personally I have chosen to work directly
with SortedTree, because I am working within the framework of a wrapper API, so
I can integrate the functionality behind the regular createRelationshipTo and
getRelationships methods.

I don't think API changes will be an issue at the moment.
Niels
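
For reference, splitting a UUID into two longs and rebuilding it needs nothing
beyond java.util.UUID. The helper class UuidLongs below is hypothetical; in
the graph the two values would presumably be written with setProperty under
two property keys of your choosing.

```java
import java.util.UUID;

// Hypothetical helper: split a UUID into two longs for storage, and rebuild it.
final class UuidLongs {
    private UuidLongs() {
    }

    // Most significant 64 bits first, least significant 64 bits second.
    static long[] toLongs(UUID id) {
        return new long[] { id.getMostSignificantBits(), id.getLeastSignificantBits() };
    }

    static UUID fromLongs(long msb, long lsb) {
        return new UUID(msb, lsb);
    }

    // What a getId method could return: the canonical 36-character string form.
    static String asString(long msb, long lsb) {
        return fromLongs(msb, lsb).toString();
    }
}
```

The round trip is lossless, so existing string IDs can be converted with
UUID.fromString and written back as two longs.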


Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
Hi Niels,

Sorry I didn't quite write the bit about (1) clearly enough.  The problem is
that it presently throws an Exception where it shouldn't.

This stems from IndexedRelationship.DirectRelationship:
  this.endRelationship = endNode.getSingleRelationship(
      SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );

So if the end node has more than one incoming KEY_VALUE relationship, a
"more than one relationship" exception is thrown.

Instead of the getSingleRelationship I was planning on iterating over the
relationships and matching the UUID stored at the root end of the IR with
one of the KEY_VALUE relationships (which is why using a unique id is
necessary rather than the relationship type).  Note: there will actually
still be an issue if the same IR has multiple relationships to the same leaf
node - still thinking about that might need .
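
The matching step could look roughly like this. KeyValueRel is a stand-in for
an incoming KEY_VALUE relationship that carries the ID of the tree it belongs
to; the class names, the fields, and the choice of two longs for the ID are
assumptions, not the graph-collections code.

```java
import java.util.List;
import java.util.Optional;

// Stand-in for an incoming KEY_VALUE relationship carrying the ID of its tree.
// In real code this would be a Neo4j Relationship with the ID stored on it.
final class KeyValueRel {
    final long treeIdMsb;
    final long treeIdLsb;
    final String name;

    KeyValueRel(long treeIdMsb, long treeIdLsb, String name) {
        this.treeIdMsb = treeIdMsb;
        this.treeIdLsb = treeIdLsb;
        this.name = name;
    }
}

final class KeyValueMatcher {
    private KeyValueMatcher() {
    }

    // Pick the relationship whose stored ID matches the tree we came from,
    // instead of assuming there is exactly one incoming KEY_VALUE relationship.
    static Optional<KeyValueRel> findForTree(List<KeyValueRel> incoming, long msb, long lsb) {
        for (KeyValueRel rel : incoming) {
            if (rel.treeIdMsb == msb && rel.treeIdLsb == lsb) {
                return Optional.of(rel);
            }
        }
        return Optional.empty();
    }
}
```

This returns the first match only, so the open case mentioned above (the same
IR having several relationships to the same leaf node) is not addressed by it.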

Is storing the UUID as two longs much quicker than storing it as a string?
I am curious about this since in my current model all the domain objects
have UUIDs, and these are all stored as strings.  If it was going to help
with either memory or performance then I would be keen to migrate them to
two longs.

Cheers
Bryce


Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Niels Hoogeveen

Two longs are certainly cheaper than a string. Two longs take 128 bits and are
stored in the main record of the PropertyContainer, while a String would
require a 64-bit pointer in the main record of the PropertyContainer, and an
additional read in the String store, where the string representation will take
up 256 bits. So both memory-wise and performance-wise, it is better to store a
UUID as two long values.


The main issue is something that needs a deeper fix than adding IDs.
SortedTree now returns Nodes when traversing the tree. We should however return
the KEY_VALUE Relationship to the indexed Node. Then
IndexedRelationship.DirectRelationship can be created with that relationship as
an argument. We get the Direction and the RelationshipType for free.
Niels


Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
Will have to experiment with changing my id's to be stored as longs, it does
make perfect sense really that it would be better.  Thanks for the hint.

In regards to SortedTree returning the KEY_VALUE relationship instead of the
end Node, I had thought of that too, and it would definitely help.  Could
end up being a significant change to SortedTree though, e.g.:
  sortedTree.addNode( node );
Would need to return the KEY_VALUE relationship instead of a boolean.  Which
not knowing where else SortedTree is used could be a large change?

Maybe SortedTree could offer two iterators: a KEY_VALUE relationship
iterator and a node iterator.  Having a quick look at it now, it seems that
could work OK without introducing much code duplication.

On Thu, Sep 8, 2011 at 12:46 PM, Niels Hoogeveen
pd_aficion...@hotmail.com wrote:


 Two longs are certainly cheaper than a string. Two longs take 128 bits and
 are stored in the main record of the PropertyContainer, while a String would
 require a 64-bit pointer in the main record of the PropertyContainer, and
 an additional read in the String store where the string representation will
 take up 256 bits. So both memory-wise and performance-wise, it is better to
 store a UUID as two long values.


 The main issue is something that needs a deeper fix than adding IDs.
 SortedTree now returns Nodes when traversing the tree. We should however
 return the KEY_VALUE Relationship to the indexed Node. Then
 IndexedRelationship.DirectRelationship can be created with that relationship
 as an argument. We get the Direction and the RelationshipType for free.
 Niels

  Date: Thu, 8 Sep 2011 11:36:11 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Issues with IndexedRelationship
 
  Hi Niels,
 
  Sorry I didn't quite write the bit about (1) clearly enough.  The problem is
  that it presently throws an Exception where it shouldn't.
 
  This stems from IndexedRelationship.DirectRelationship:
  this.endRelationship = endNode.getSingleRelationship(
  SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );
 
  So if the end node has more than one incoming KEY_VALUE relationship a
  "more than one relationship" exception is thrown.
 
  Instead of the getSingleRelationship I was planning on iterating over the
  relationships and matching the UUID stored at the root end of the IR with
  one of the KEY_VALUE relationships (which is why using a unique id is
  necessary rather than the relationship type).  Note: there will actually
  still be an issue if the same IR has multiple relationships to the same leaf
  node - still thinking about that might need .
 
  Is storing the UUID as two longs much quicker than storing it as a string?
  Curious about this since in my current model I have all the domain objects
  with UUIDs, and these are all stored as strings.  If it was going to help
  with either memory or performance then I would be keen to migrate this to
  two longs.
 
  Cheers
  Bryce
 
  On Thu, Sep 8, 2011 at 11:07 AM, Niels Hoogeveen
  pd_aficion...@hotmail.com wrote:
 
  
   Great work Bryce,
   I do have a question though.
   What is the rationale for the restriction mentioned under 1)? Do you need
   this for the general case (to make IndexedRelationshipExpander work
   correctly), or do you need it for your own application to throw that
   exception? If the latter is the case, I think it would be important to tease
   out the general case and offer this new behaviour as an option.
   A unique key for the index is a good idea anyway and can be added to
   SortedTree. Generate a UUID and store it in two long properties. That way
   the two values will always be read in the first fetch of the underlying
   PropertyContainer. A getId method on the TreeNodes can then return a String
   representation of the two long values.
   IndexedRelationships are a relatively new development, so I think you are one
   of the first to actually try it out. Personally I have chosen to directly
   work with SortedTree, because I am working within the framework of a wrapper
   API, so I can integrate the functionality behind the regular
   createRelationshipTo and getRelationships methods.
   I don't think API changes will be an issue at the moment.
   Niels
Date: Thu, 8 Sep 2011 10:22:11 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: [Neo4j] Issues with IndexedRelationship
   
Hi,
   
As I mentioned a while ago I am looking at using IndexedRelationships
within my application.  The major thing that was missing for me to be able
to do this was IndexedRelationshipExpander being able to provide all the
relationships from the leaf end of indexed relationships through to the
root end.  So I have been working on getting that support in there.
   
However in writing this I have discovered a number of other issues that I
have also fixed, and at least one I am still working on.  Since I

Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Niels Hoogeveen

I think we don't have to worry about backwards compatibility much yet. There 
has not been a formal release of the component, so if there are people using 
the software, they will accept that they are bleeding edgers. 
Indeed addNode should return the KEY_VALUE relationship and I think we should
change the signature of SortedTree to turn it into Iterable<Relationship>. No
need to maintain a Node iterator, the node is always one getEndNode away.
Niels
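
To make the proposed shape concrete, here is a minimal sketch under
stated assumptions.  The types below are plain Java stand-ins so that it
runs without the Neo4j jars; only getStartNode and getEndNode mirror
real Neo4j method names.  RelationshipTreeSketch is a hypothetical flat
list, not the actual SortedTree, and does no sorting or tree balancing.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Stand-in for a Neo4j Node (illustration only).
final class SketchNode {
    final String name;
    SketchNode(String name) { this.name = name; }
}

// Stand-in for a Neo4j Relationship (illustration only).
final class SketchRelationship {
    private final SketchNode start;
    private final SketchNode end;
    SketchRelationship(SketchNode start, SketchNode end) {
        this.start = start;
        this.end = end;
    }
    SketchNode getStartNode() { return start; }
    SketchNode getEndNode() { return end; }
}

// Hypothetical stand-in for SortedTree that shows only the API shape:
// addNode hands back the relationship, and iteration yields relationships.
final class RelationshipTreeSketch implements Iterable<SketchRelationship> {
    private final SketchNode leaf = new SketchNode("tree-leaf");
    private final List<SketchRelationship> keyValues = new ArrayList<>();

    // Returns the KEY_VALUE-style relationship created, not a boolean.
    SketchRelationship addNode(SketchNode node) {
        SketchRelationship rel = new SketchRelationship(leaf, node);
        keyValues.add(rel);
        return rel;
    }

    @Override
    public Iterator<SketchRelationship> iterator() {
        return keyValues.iterator();
    }

    public static void main(String[] args) {
        RelationshipTreeSketch tree = new RelationshipTreeSketch();
        tree.addNode(new SketchNode("A"));
        tree.addNode(new SketchNode("B"));
        // No separate node iterator: each node is one getEndNode() away.
        for (SketchRelationship rel : tree) {
            System.out.println(rel.getEndNode().name);
        }
    }
}
```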

 Date: Thu, 8 Sep 2011 14:17:59 +1200
 From: bryc...@gmail.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo4j] Issues with IndexedRelationship
 
 Will have to experiment with changing my id's to be stored as longs, it does
 make perfect sense really that it would be better.  Thanks for the hint.
 
 In regards to SortedTree returning the KEY_VALUE relationship instead of the
 end Node, I had thought of that too, and it would definitely help.  Could
 end up being a significant change to SortedTree though, e.g.:
   sortedTree.addNode( node );
 Would need to return the KEY_VALUE relationship instead of a boolean.  Which
 not knowing where else SortedTree is used could be a large change?
 
 Maybe SortedTree would have two iterator's available a key_value
 relationship iterator, and a node iterator.  Having a quick look at it now
 it seems that it could work ok that way without introducing much code
 duplication.
 
 On Thu, Sep 8, 2011 at 12:46 PM, Niels Hoogeveen
 pd_aficion...@hotmail.com wrote:
 
 
  Two longs are certainly cheaper than a string. Two longs take 128 bits and
  are stored in the main record of the PropertyContainer, while a String would
  require a 64-bit pointer in the main record of the PropertyContainer, and
  an additional read in the String store where the string representation will
  take up 256 bits. So both memory-wise and performance-wise, it is better to
  store a UUID as two long values.
 
 
  The main issue is something that needs a deeper fix than adding ID's.
  SortedTree now returns Nodes when traversing the tree. We should however
  return the KEY_VALUE Relationship to the indexed Node. Then
  IndexedRelationship.DirectRelationship can be created with that relationship
  as an argument. We get the Direction and the RelationshipType for free.
  Niels
 
   Date: Thu, 8 Sep 2011 11:36:11 +1200
   From: bryc...@gmail.com
   To: user@lists.neo4j.org
   Subject: Re: [Neo4j] Issues with IndexedRelationship
  
   Hi Niels,
  
   Sorry I didn't quite write the bit about (1) clearly enough.  The problem
  is
   that it presently throws an Exception where it shouldn't.
  
   This stems from IndexedRelationship.DirectRelationship:
   this.endRelationship = endNode.getSingleRelationship(
   SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );
  
   So if the end node has more than one incoming KEY_VALUE relationship a
  more
   than one relationship exception is thrown.
  
   Instead of the getSingleRelationship I was planning on iterating over the
   relationships and matching the UUID stored at the root end of the IR with
   one of the KEY_VALUE relationships (which is why using a unique id is
   necessary rather than the relationship type).  Note: there will actually
   still be an issue if the same IR has multiple relationships to the same
  leaf
   node - still thinking about that might need .
  
   Is storing the UUID as two longs much quicker than storing it as a
  string?
Curious about this since in my current model I have all the domain
  objects
   with UUID's, and these are all stored as strings.  If it was going to
  help
   with either memory or performance then I would be keen to migrate this to
   two longs.
  
   Cheers
   Bryce
  
   On Thu, Sep 8, 2011 at 11:07 AM, Niels Hoogeveen
   pd_aficion...@hotmail.com wrote:
  
   
Great work Bryce,
I do have a question though.
What is the rationale for the restriction mentioned under 1). Do you
  need
this for the general case (to make IndexedRelationshipExpander work
correctly), or do you need it for your own application to throw that
exception? If the latter is the case, I think it would be important to
  tease
out the general case and offer this new behaviour as an option.
A unique key for the index is a good idea anyway and can be added to
SortedTree. Generate a UUID and store it in two long properties. That
  way
the two values will always be read in the first fetch of the underlying
PropertyContainer. A getId method on the TreeNodes can then return a
  String
representation of the two long values.
IndexedRelationships are a relatively new development, so I think you are
  one
of the first to actually try it out. Personally I have chosen to
  directly
work with SortedTree, because I am working within the framework of a
  wrapper
API, so I can integrate the functionality behind the regular
createRelationshipTo and getRelationships methods.
I don't think API changes will be an issue at the moment.
Niels

Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
Another thought, if there is going to be a larger refactor of the code, is
whether the indexing mechanism should be broken out as a strategy for the
IndexedRelationship.  At present it is tied to SortedTree, but if an
interface were extracted that had addNode, removeNode, iterator, and
isUniqueIndex, then other indexing implementations could be used in certain
cases.

The particular other implementation I am currently thinking of that could be
of use to me would be a paged linked list: a linked list of pages, each with
min < x < max KEY_VALUE (or equivalent) relationships.  I think that could
work quite well for the situation where the index is in descending date
order, is generally just appended to at the most recent end, and results are
retrieved in a paged manner, generally from near the most recent.

But more to the point, there could be any number of implementations, each
suited to a different situation.

That does bring up a question though: there was some discussion a while ago
about functionality along the lines of IndexedRelationship being pulled
into the core, so is this overkill for now if there is going to be another
core offering later?
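
As a rough illustration of the strategy idea, here is a minimal in-memory
sketch.  The interface name RelationshipIndex, the class
PagedLinkedListIndex, and the generic parameter T (standing in for a
Neo4j Node) are all hypothetical; only the four operation names come from
the description above.  It enforces only a maximum page size and leaves
out the minimum bound and any graph storage.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

// Hypothetical strategy interface built from the four operations named above.
interface RelationshipIndex<T> extends Iterable<T> {
    boolean addNode(T node);
    boolean removeNode(T node);
    boolean isUniqueIndex();
}

// In-memory sketch of a paged linked list: a linked list of pages, each
// holding at most maxPageSize entries, with the newest entries at the head.
final class PagedLinkedListIndex<T> implements RelationshipIndex<T> {
    private final int maxPageSize;
    private final LinkedList<List<T>> pages = new LinkedList<>();

    PagedLinkedListIndex(int maxPageSize) {
        if (maxPageSize < 1) {
            throw new IllegalArgumentException("maxPageSize must be >= 1");
        }
        this.maxPageSize = maxPageSize;
    }

    @Override
    public boolean addNode(T node) {
        // Start a new head page when there is none or the current one is full.
        if (pages.isEmpty() || pages.getFirst().size() >= maxPageSize) {
            pages.addFirst(new ArrayList<T>());
        }
        pages.getFirst().add(0, node); // newest entry first within the page
        return true;
    }

    @Override
    public boolean removeNode(T node) {
        Iterator<List<T>> it = pages.iterator();
        while (it.hasNext()) {
            List<T> page = it.next();
            if (page.remove(node)) {
                if (page.isEmpty()) {
                    it.remove(); // drop pages that become empty
                }
                return true;
            }
        }
        return false;
    }

    @Override
    public boolean isUniqueIndex() {
        return false; // this sketch allows duplicate entries
    }

    int pageCount() {
        return pages.size();
    }

    // Iterates newest to oldest, page by page.
    @Override
    public Iterator<T> iterator() {
        List<T> flat = new ArrayList<>();
        for (List<T> page : pages) {
            flat.addAll(page);
        }
        return flat.iterator();
    }
}
```

With this shape, appending at the most recent end touches only the head
page, and reading the newest results only needs the first page or two,
which is the access pattern described above.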

On Thu, Sep 8, 2011 at 2:38 PM, Niels Hoogeveen
pd_aficion...@hotmail.com wrote:


 I think we don't have to worry about backwards compatibility much yet.
 There has not been a formal release of the component, so if there are people
 using the software, they will accept that they are bleeding edgers.
 Indeed addNode should return the KEY_VALUE relationship and I think we
 should change the signature of SortedTree to turn it into
 Iterable<Relationship>. No need to maintain a Node iterator, the node is
 always one getEndNode away.
 Niels

  Date: Thu, 8 Sep 2011 14:17:59 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Issues with IndexedRelationship
 
  Will have to experiment with changing my id's to be stored as longs, it
 does
  make perfect sense really that it would be better.  Thanks for the hint.
 
  In regards to SortedTree returning the KEY_VALUE relationship instead of
 the
  end Node, I had thought of that too, and it would definitely help.  Could
  end up being a significant change to SortedTree though, e.g.:
sortedTree.addNode( node );
  Would need to return the KEY_VALUE relationship instead of a boolean.
  Which
  not knowing where else SortedTree is used could be a large change?
 
  Maybe SortedTree would have two iterator's available a key_value
  relationship iterator, and a node iterator.  Having a quick look at it
 now
  it seems that it could work ok that way without introducing much code
  duplication.
 
  On Thu, Sep 8, 2011 at 12:46 PM, Niels Hoogeveen
  pd_aficion...@hotmail.com wrote:
 
  
   Two longs are certainly cheaper than a string. Two longs take 128 bits and
   are stored in the main record of the PropertyContainer, while a String would
   require a 64-bit pointer in the main record of the PropertyContainer, and
   an additional read in the String store where the string representation will
   take up 256 bits. So both memory-wise and performance-wise, it is better to
   store a UUID as two long values.
  
  
   The main issue is something that needs a deeper fix than adding ID's.
   SortedTree now returns Nodes when traversing the tree. We should
 however
   return the KEY_VALUE Relationship to the indexed Node. Then
   IndexedRelationship.DirectRelationship can be created with that
 relationship
   as an argument. We get the Direction and the RelationshipType for free.
   Niels
  
Date: Thu, 8 Sep 2011 11:36:11 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Issues with IndexedRelationship
   
Hi Niels,
   
Sorry I didn't quite write the bit about (1) clearly enough.  The
 problem
   is
that it presently throws an Exception where it shouldn't.
   
This stems from IndexedRelationship.DirectRelationship:
this.endRelationship = endNode.getSingleRelationship(
SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );
   
So if the end node has more than one incoming KEY_VALUE relationship
 a
   more
than one relationship exception is thrown.
   
Instead of the getSingleRelationship I was planning on iterating over
 the
relationships and matching the UUID stored at the root end of the IR
 with
one of the KEY_VALUE relationships (which is why using a unique id is
necessary rather than the relationship type).  Note: there will
 actually
still be an issue if the same IR has multiple relationships to the
 same
   leaf
node - still thinking about that might need .
   
Is storing the UUID as two longs much quicker than storing it as a
   string?
 Curious about this since in my current model I have all the domain
   objects
with UUID's, and these are all stored as strings.  If it was going to
   help
with either memory or performance then I would be keen to migrate

Re: [Neo4j] Issues with IndexedRelationship

2011-09-07 Thread Bryce
I have made the SortedTree changes regarding relationships vs nodes, and
have got all the tests passing.  The changes are pushed up to my GitHub
account (and a pull request has been raised).

The changes can be seen here:
https://github.com/brycenz/graph-collections

On Thu, Sep 8, 2011 at 3:41 PM, Bryce bryc...@gmail.com wrote:

 Another thought if there is going to be a larger refactor of the code is
 whether the indexing mechanism should be broken out as a strategy for the
 IndexedRelationship.  At present it is tied to SortedTree, but if an
 interface was extracted out that had addNode, removeNode, iterator, and
 isUniqueIndex then other indexing implementations could be used in certain
 cases.

 The particular other implementation I am currently thinking of that could
 be of use to me would be a paged linked list.  So that would have a linked
 list of pages, each with min < x < max KEY_VALUE (or equivalent)
 relationships.  I think that could work quite well for the situation where
 the index is descending date ordered, and generally just appended at the
 most recent end, and results are retrieved in a paged manner generally from
 near the most recent.

 But more to the point there could be any number of implementations that
 would be good for given different situations.

 That does bring up a question though, there was some discussion a while ago
 about some functionality along the lines of IndexedRelationship being pulled
 into the core, so is that overkill for now if there is going to be another
 core offering later?


 On Thu, Sep 8, 2011 at 2:38 PM, Niels Hoogeveen pd_aficion...@hotmail.com
  wrote:


 I think we don't have to worry about backwards compatibility much yet.
 There has not been a formal release of the component, so if there are people
 using the software, they will accept that they are bleeding edgers.
 Indeed addNode should return the KEY_VALUE relationship and I think we
 should change the signature of SortedTree to turn it into
 Iterable<Relationship>. No need to maintain a Node iterator, the node is
 always one getEndNode away.
 Niels

  Date: Thu, 8 Sep 2011 14:17:59 +1200
  From: bryc...@gmail.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo4j] Issues with IndexedRelationship
 
  Will have to experiment with changing my id's to be stored as longs, it
 does
  make perfect sense really that it would be better.  Thanks for the hint.
 
  In regards to SortedTree returning the KEY_VALUE relationship instead of
 the
  end Node, I had thought of that too, and it would definitely help.
  Could
  end up being a significant change to SortedTree though, e.g.:
sortedTree.addNode( node );
  Would need to return the KEY_VALUE relationship instead of a boolean.
  Which
  not knowing where else SortedTree is used could be a large change?
 
  Maybe SortedTree would have two iterator's available a key_value
  relationship iterator, and a node iterator.  Having a quick look at it
 now
  it seems that it could work ok that way without introducing much code
  duplication.
 
  On Thu, Sep 8, 2011 at 12:46 PM, Niels Hoogeveen
  pd_aficion...@hotmail.com wrote:
 
  
   Two longs are certainly cheaper than a string. Two longs take 128 bits and
   are stored in the main record of the PropertyContainer, while a String would
   require a 64-bit pointer in the main record of the PropertyContainer, and
   an additional read in the String store where the string representation will
   take up 256 bits. So both memory-wise and performance-wise, it is better to
   store a UUID as two long values.
  
  
   The main issue is something that needs a deeper fix than adding ID's.
   SortedTree now returns Nodes when traversing the tree. We should
 however
   return the KEY_VALUE Relationship to the indexed Node. Then
   IndexedRelationship.DirectRelationship can be created with that
 relationship
   as an argument. We get the Direction and the RelationshipType for
 free.
   Niels
  
Date: Thu, 8 Sep 2011 11:36:11 +1200
From: bryc...@gmail.com
To: user@lists.neo4j.org
Subject: Re: [Neo4j] Issues with IndexedRelationship
   
Hi Niels,
   
Sorry I didn't quite write the bit about (1) clearly enough.  The
 problem
   is
that it presently throws an Exception where it shouldn't.
   
This stems from IndexedRelationship.DirectRelationship:
this.endRelationship = endNode.getSingleRelationship(
SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );
   
So if the end node has more than one incoming KEY_VALUE relationship
 a
   more
than one relationship exception is thrown.
   
Instead of the getSingleRelationship I was planning on iterating
 over the
relationships and matching the UUID stored at the root end of the IR
 with
one of the KEY_VALUE relationships (which is why using a unique id
 is
necessary rather than the relationship type).  Note: there will
 actually
still be an issue if the same IR has multiple relationships to the
 same