Re: [Neo4j] Issues with IndexedRelationship

Niels Hoogeveen Wed, 07 Sep 2011 17:46:35 -0700

Two longs are certainly cheaper than a string. Two longs take 128 bits and are 
stored in the main record of the PropertyContainer, while a String would 
require a 64-bit "pointer" in the main record of the PropertyContainer, and an 
additional read in the String store, where the string representation will take 
up 256 bits. So both memory-wise and performance-wise, it is better to store a 
UUID as two long values.
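
As a rough illustration of the two-long representation, the sketch below splits a java.util.UUID into its two 64-bit halves and rebuilds it. The class name UuidLongs and the property keys "uuid_msb" and "uuid_lsb" are made up for this example and are not part of Neo4j or SortedTree.

```java
import java.util.UUID;

// Hypothetical helper for illustration; not part of Neo4j or SortedTree.
final class UuidLongs {
    // Property names are assumptions made for this sketch.
    static final String MSB_KEY = "uuid_msb";
    static final String LSB_KEY = "uuid_lsb";

    // Splits a UUID into its most and least significant 64 bits.
    static long[] toLongs(UUID id) {
        return new long[] { id.getMostSignificantBits(), id.getLeastSignificantBits() };
    }

    // Rebuilds the UUID from the two stored halves.
    static UUID fromLongs(long msb, long lsb) {
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        long[] parts = toLongs(id);
        // In a Neo4j application the halves would presumably be written as two
        // long properties, e.g. setProperty(MSB_KEY, parts[0]) and
        // setProperty(LSB_KEY, parts[1]) on the PropertyContainer.
        UUID restored = fromLongs(parts[0], parts[1]);
        System.out.println(id.equals(restored));
    }
}
```

A getId method could then return fromLongs(msb, lsb).toString() when a String form is needed.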



The main issue needs a deeper fix than adding IDs. 
SortedTree now returns Nodes when traversing the tree. We should, however, return 
the KEY_VALUE Relationship to the indexed Node. Then 
IndexedRelationship.DirectRelationship can be created with that relationship as 
an argument. We get the Direction and the RelationshipType for free.

Niels
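
To make the difference concrete, here is a small in-memory sketch. It does not use the Neo4j or SortedTree classes; Node, Rel, singleIncoming and keyValueRelationships are hypothetical stand-ins. It only shows why a lookup from the end node becomes ambiguous once two trees index the same node, and how handing back the KEY_VALUE relationship from the traversal avoids that lookup.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-ins for illustration only; these are NOT the Neo4j or
// SortedTree classes, and every name here is hypothetical.
final class KeyValueSketch {
    static final String KEY_VALUE = "KEY_VALUE";

    static final class Node {
        final String name;
        final List<Rel> outgoing = new ArrayList<>();
        final List<Rel> incoming = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    static final class Rel {
        final Node start;
        final Node end;
        final String type;
        Rel(Node start, Node end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
            start.outgoing.add(this);
            end.incoming.add(this);
        }
    }

    // Looks up the one incoming relationship of a type from the end node and
    // fails if there are several, which is the situation described in the thread.
    static Rel singleIncoming(Node end, String type) {
        Rel found = null;
        for (Rel r : end.incoming) {
            if (!r.type.equals(type)) continue;
            if (found != null) {
                throw new IllegalStateException("more than one " + type + " relationship");
            }
            found = r;
        }
        return found;
    }

    // Alternative: the traversal hands back the KEY_VALUE relationships of a
    // tree node, so the caller never has to look them up from the end node.
    static List<Rel> keyValueRelationships(Node treeNode) {
        List<Rel> result = new ArrayList<>();
        for (Rel r : treeNode.outgoing) {
            if (r.type.equals(KEY_VALUE)) result.add(r);
        }
        return result;
    }

    public static void main(String[] args) {
        Node tree1 = new Node("tree1");
        Node tree2 = new Node("tree2");
        Node a = new Node("A");
        new Rel(tree1, a, KEY_VALUE);
        new Rel(tree2, a, KEY_VALUE);

        try {
            singleIncoming(a, KEY_VALUE);
        } catch (IllegalStateException e) {
            System.out.println("lookup from A failed: " + e.getMessage());
        }
        for (Rel r : keyValueRelationships(tree1)) {
            System.out.println(r.start.name + " -> " + r.end.name);
        }
    }
}
```

Because each relationship already carries its start node, end node and type, nothing further needs to be looked up on the indexed node.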

> Date: Thu, 8 Sep 2011 11:36:11 +1200
> From: [email protected]
> To: [email protected]
> Subject: Re: [Neo4j] Issues with IndexedRelationship
> 
> Hi Niels,
> 
> Sorry I didn't quite write the bit about (1) clearly enough.  The problem is
> that it presently throws an Exception where it shouldn't.
> 
> This stems from IndexedRelationship.DirectRelationship:
> this.endRelationship = endNode.getSingleRelationship(
> SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );
> 
> So if the end node has more than one incoming KEY_VALUE relationship, a
> "more than one relationship" exception is thrown.
> 
> Instead of the getSingleRelationship I was planning on iterating over the
> relationships and matching the UUID stored at the root end of the IR with
> one of the KEY_VALUE relationships (which is why using a unique id is
> necessary rather than the relationship type).  Note: there will actually
> still be an issue if the same IR has multiple relationships to the same leaf
> node - still thinking about what that might need.
> 
> Is storing the UUID as two longs much quicker than storing it as a string?
>  Curious about this since in my current model I have all the domain objects
> with UUIDs, and these are all stored as strings.  If it was going to help
> with either memory or performance then I would be keen to migrate this to
> two longs.
> 
> Cheers
> Bryce
> 
> On Thu, Sep 8, 2011 at 11:07 AM, Niels Hoogeveen
> <[email protected]> wrote:
> 
> >
> > Great work Bryce,
> > I do have a question though.
> > What is the rationale for the restriction mentioned under "1)"? Do you need
> > this for the general case (to make IndexedRelationshipExpander work
> > correctly), or do you need it for your own application to throw that
> > exception? If the latter is the case, I think it would be important to tease
> > out the general case and offer this new behaviour as an option.
> > A unique key for the index is a good idea anyway and can be added to
> > SortedTree. Generate a UUID and store it in two long properties. That way
> > the two values will always be read in the first fetch of the underlying
> > PropertyContainer. A getId method on the TreeNodes can then return a String
> > representation of the two long values.
> > IndexedRelationships are a relatively new development, so I think you are one
> > of the first to actually try them out. Personally I have chosen to directly
> > work with SortedTree, because I am working within the framework of a wrapper
> > API, so I can integrate the functionality behind the regular
> > createRelationshipTo and getRelationships methods.
> > I don't think API changes will be an issue at the moment.
> > Niels
> > > Date: Thu, 8 Sep 2011 10:22:11 +1200
> > > From: [email protected]
> > > To: [email protected]
> > > Subject: [Neo4j] Issues with IndexedRelationship
> > >
> > > Hi,
> > >
> > > As I mentioned a while ago I am looking at using IndexedRelationships
> > > within my application.  The major thing that was missing for me to be
> > > able to do this was IndexedRelationshipExpander being able to provide
> > > all the relationships from the leaf end of indexed relationships through
> > > to the root end.  So I have been working on getting that support in there.
> > >
> > > However in writing this I have discovered a number of other issues that I
> > > have also fixed, and at least one I am still working on.  Since I was
> > > right into the extra support for expanding the relationships it is hard
> > > to break out these fixes as a separate commit (which I think would be
> > > ideal), so it will most likely all come in together, hopefully later
> > > today (NZ time).
> > >
> > > Just letting everyone know in case someone else is doing development
> > > against indexed relationships.
> > >
> > > Quick rundown of the issues. Note: N -- IR(X) --> {A,B} below means
> > > there is an indexed relationship from N to A & B, of type X.
> > >
> > > 1) Exception thrown when more than one IR terminates at a given node,
> > > e.g.:
> > > N1 -- IR(X) --> {A,B,C,D}
> > > N2 -- IR(X) --> {A,X,Y,Z}
> > > Will throw an exception when using the IndexedRelationshipExpander on
> > > either N1 or N2.
> > >
> > > 2) Start / End nodes are transposed when the IR has a direction of
> > > incoming, i.e. the IR is created against N but across a set of incoming
> > > relationships:
> > > N <-- IR(Y) -- {A,B,C}
> > > Will return 3 relationships N --> A, N --> B, N --> C.
> > >
> > > I have written tests for each of these, as well as a couple of other
> > > tests.
> > >
> > > Still completing (1) and have a little question about this.  In order to
> > > fix this I may need to introduce a unique ID stored against the IR both
> > > at the root and at the leaves.  Currently the relationship type is used
> > > to name the IR at both root and leaves, but in the case above that means
> > > you can't tell from node A which KEY_VALUE relationship belongs to which
> > > IR tree without traversing the tree.
> > >
> > > So the question is: adding this ID would mean that anyone who is already
> > > using this won't have the ID, and therefore, without care, will be data
> > > incompatible with the updated code.  This could be managed via a check
> > > for the ID when accessing the tree and, if it isn't there, doing a walk
> > > over the tree to populate all the places where it is required.
> > >
> > > In general in developing against this code where do we sit on data
> > > compatibility and API compatibility?
> > >
> > > Cheers
> > > Bryce
> > > _______________________________________________
> > > Neo4j mailing list
> > > [email protected]
> > > https://lists.neo4j.org/mailman/listinfo/user
