Very interesting thoughts! I would love to have a bootcamp and explore a spike on how this would work out in practice. Got anything to do this autumn? ;)
Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - Your high performance graph database.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Sun, Aug 7, 2011 at 4:30 PM, Niels Hoogeveen <[email protected]> wrote:
>
> Hi Peter,
>
> Thanks for showing an interest.
>
> A Property is indeed a unary edge in the Enhanced API and therefore
> (potentially) backed by a Node, but that Node doesn't contain the value.
>
> All property values are still stored the way they are stored in the standard
> API. If someone however decides to add a Property to a Property or create an
> Edge containing that Property, a Node will be created to store those
> properties and connect those Edges to.
>
> When the associated Node of a Property is created, the ID of that Node will
> be stored in the PropertyContainer of that property.
>
> Example:
>
> Suppose we have a property on a "Person" Vertex that denotes a personal
> identity number, and the user of the application wants to annually check that
> identity number against some other database and state when it was last
> verified and who verified it.
>
> A Vertex (backed by a Node) for a particular Person is created and the
> property is set (in that Node's PropertyContainer), just as it would be in
> the standard API.
>
> When the verification is done, an additional property is created on the
> PropertyContainer of that Person with the name
> org.neo4j.collections.graphdb.[propertyname].node_id
>
> This property contains the ID of the Node associated with the property. On
> that Node the verification date will be set and the BinaryEdge (in principle
> nothing but a classic Relationship) will be created to the "Person" Vertex of
> the one who verified the personal identity number.
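To make the lazy creation of a property's backing Node concrete, here is a minimal sketch in plain Java. It does not use the Neo4j or Enhanced API classes: SketchNode, SketchStore and backingNodeOf are hypothetical names invented for illustration, and the in-memory maps merely stand in for the database. Only the key format org.neo4j.collections.graphdb.[propertyname].node_id is taken from the message above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a Neo4j Node: an id plus a property map.
class SketchNode {
    final long id;
    final Map<String, Object> properties = new HashMap<>();

    SketchNode(long id) {
        this.id = id;
    }
}

// Hypothetical store that hands out node ids, standing in for the database.
class SketchStore {
    private final Map<Long, SketchNode> nodes = new HashMap<>();
    private long nextId = 0;

    SketchNode createNode() {
        SketchNode node = new SketchNode(nextId++);
        nodes.put(node.id, node);
        return node;
    }

    SketchNode getNodeById(long id) {
        return nodes.get(id);
    }

    // Key under which the id of a property's backing node is kept on the owner.
    static String nodeIdKey(String propertyName) {
        return "org.neo4j.collections.graphdb." + propertyName + ".node_id";
    }

    // Returns the backing node of a property, creating it only on first use.
    // The property value itself stays on the owner and is never copied.
    SketchNode backingNodeOf(SketchNode owner, String propertyName) {
        String key = nodeIdKey(propertyName);
        Object storedId = owner.properties.get(key);
        if (storedId != null) {
            return getNodeById((Long) storedId);
        }
        SketchNode backing = createNode();
        owner.properties.put(key, backing.id);
        return backing;
    }
}
```

In this sketch the verification date would be set on the node returned by backingNodeOf, while the identity number stays in the Person node's own property map, which matches the description that the backing Node does not contain the value.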
>
> It is certainly true that everything being a Vertex makes the Node
> implementation more important than ever before, but it goes even further:
> apart from a standard Vertex and the various VertexTypes, almost everything
> is an Edge. So I would say the Relationship implementation is becoming
> eminently important.
>
> There are certainly several tweaks to the storage layer I would love to see
> incorporated, mostly to hide the implementation from the user and to make
> sure that the maintenance of IDs takes place in core and not in a layer on
> top of core.
>
> In fact all of Enhanced API could much better be maintained in core,
> something that can actually quite easily be implemented. One of my "ulterior
> motives" with the development of Enhanced API is to tease out the technical
> requirements to push this functionality into core (whether Neo Tech decides
> to do so is another question, of course).
>
> Since the Neo4j database consists mostly of records and linked lists, the
> technical requirements to push things into core are mostly a question of
> adding entry-points to linked lists in some records and partitioning some
> existing linked lists.
>
> I will write down those requirements in a separate post. This will include
> support for N-ary edges, since that is actually not all that difficult to
> implement and adds very little complexity to the database.
>
> Yes, traversals will become much more generalized in the Enhanced API,
> especially when we make them composable. In fact composable traversal
> descriptions can easily be seen as a query language giving access to all
> parts of the database.
>
> Niels
>
>> From: [email protected]
>> Date: Sun, 7 Aug 2011 09:10:02 +0200
>> To: [email protected]
>> Subject: Re: [Neo4j] Enhanced API rewrite
>>
>> Niels,
>> this sounds very interesting. Given the role of properties being unary
>> edges, that would mean that any classic Neo4j property would now be a
>> Node with one Property in the new Vertex sense?
>>
>> Having Vertices for EVERYTHING will of course make the
>> node-implementation much more important than anything else, since
>> every element is backed by a node, possibly with some property. I
>> wonder how this would reflect in the storage layer that might need to
>> be tweaked.
>>
>> Also, as you point out, traversals will become quite different with
>> this API, but let's see what the weekend brings ;)
>>
>> Cheers,
>>
>> /peter neubauer
>>
>> On Sat, Aug 6, 2011 at 2:51 AM, Niels Hoogeveen
>> <[email protected]> wrote:
>> >
>> > Today I pushed a major rewrite of the Enhanced API. See:
>> > https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb
>> >
>> > Originally the Enhanced API was a drop-in replacement of the standard
>> > Neo4j API. This resulted in lots of wrapper classes that needed to be
>> > maintained.
>> >
>> > The rewrite of Enhanced API is no longer a drop-in replacement and
>> > contains no interface/class names that can be found in the standard API.
>> >
>> > Enhanced API no longer speaks of Nodes but of Vertices and doesn't speak
>> > of Relationships but of Edges. This helps to prevent name clashes at the
>> > expense of somewhat less recognizable names (Relationship is after all a
>> > more common word than Edge).
>> >
>> > This rewrite is not merely a renaming of classes and interfaces, but is
>> > for the most part a complete rewrite and also a rethinking of the API on
>> > my part.
>> >
>> > Enhanced API consists of two basic elements: Vertex and EdgeRole.
>> > Most elements are subclasses of Vertex, though there are some specialized
>> > versions of EdgeRole.
>> >
>> > Let me start with an example:
>> >
>> > Suppose we have two vertices denoting the persons Tom and Paula, and we
>> > want to state that Tom is the father of Paula.
>> >
>> > For standard Neo4j we tend to write such a fact as:
>> >
>> > Tom --Father--> Paula
>> >
>> > For Enhanced API we can conceptually write this fact as follows:
>> >
>> >        --StartRole--Tom
>> > Father
>> >        --EndRole--Paula
>> >
>> > This should be read as follows: We have two Vertices, Tom and Paula, and
>> > we have a BinaryEdge (similar to a Relationship in the standard API) of
>> > type "Father", where Tom has the StartRole for that edge and Paula has the
>> > EndRole for that edge.
>> >
>> > So instead of a directed graph, we conceptually have an undirected
>> > bipartite graph.
>> >
>> > For binary edges (edges between two vertices), this is mostly conceptually
>> > the case, because the API will simply allow you to write:
>> > tom.createEdgeTo(paula, FATHER) (similar to
>> > tom.createRelationshipTo(paula, FATHER) as we would have in the standard
>> > API).
>> >
>> > It is also possible to fetch the start vertex of the binary edge with the
>> > method edge.getStartVertex() (similar to relationship.getStartNode()),
>> > although it is also possible to treat the binary edge as a generic edge
>> > and fetch that Vertex as: edge.getElement(db.getStartRole()).
>> >
>> > BinaryEdges are a special case and have special methods which cover the
>> > same functionality as can be found in the standard Neo4j API.
>> >
>> > In general, we can say that Vertices are connected to Edges by means of
>> > EdgeRoles. In the binary case there are two predefined EdgeRoles:
>> > StartRole and EndRole.
>> >
>> > Before we get deeper into the general case of n-ary edges, let's first
>> > look at another special case: Properties.
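As an illustration of the binary case described above, the following sketch models a BinaryEdge that can be read either through its convenience methods or generically through its roles. It is plain Java written for this example, not the Enhanced API itself: the classes are heavily simplified, the EdgeRole enum stands in for db.getStartRole() and the corresponding end role, and edge types are plain strings.

```java
import java.util.EnumMap;
import java.util.Map;

// The two predefined roles of a binary edge (stand-ins for StartRole/EndRole).
enum EdgeRole { START, END }

// Hypothetical, heavily simplified Vertex: just a name.
class Vertex {
    final String name;

    Vertex(String name) {
        this.name = name;
    }

    // Convenience call, analogous to tom.createEdgeTo(paula, FATHER).
    BinaryEdge createEdgeTo(Vertex end, String edgeType) {
        return new BinaryEdge(edgeType, this, end);
    }
}

// An Edge is itself a Vertex; it links other vertices through roles.
class BinaryEdge extends Vertex {
    private final Map<EdgeRole, Vertex> elements = new EnumMap<>(EdgeRole.class);

    BinaryEdge(String edgeType, Vertex start, Vertex end) {
        super(edgeType);
        elements.put(EdgeRole.START, start);
        elements.put(EdgeRole.END, end);
    }

    // Special-case accessors, comparable to getStartNode()/getEndNode().
    Vertex getStartVertex() {
        return elements.get(EdgeRole.START);
    }

    Vertex getEndVertex() {
        return elements.get(EdgeRole.END);
    }

    // Generic access: treat the binary edge like any other edge.
    Vertex getElement(EdgeRole role) {
        return elements.get(role);
    }
}
```

Both access paths return the same object here, which is the point of the "mostly conceptual" remark: the role-based view and the start/end view describe the same stored connection.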
>> >
>> > Properties can be thought of as unary edges: edges that connect to only
>> > one Vertex (as opposed to two in the binary case).
>> >
>> > Suppose we want to state that Tom is 49 years old. We can write that as:
>> >
>> > age(49)--PropertyRole--Tom
>> >
>> > We have an edge of type "age" that is connected to the vertex Tom in the
>> > role of a property.
>> >
>> > Again this is mostly conceptually true, because there are lots of methods
>> > in Enhanced API that are very similar to the ones found in the standard
>> > API: getProperty, hasProperty, setProperty. Alternatively, we can call
>> > methods on the property itself; after all, the age property connected to
>> > the Vertex "Tom" is an object in its own right. More precisely it is a
>> > Property and with that it is a UnaryEdge, which is an Edge, which is a
>> > Vertex.
>> >
>> > From the age property we can fetch the PropertyType, but we can also ask
>> > for the Vertex it is connected to: getVertex(). Since a Property is an
>> > Edge we can also fetch the connected vertex (Tom) as follows:
>> > age.getElement(db.getPropertyRole()).
>> >
>> > So we have seen the two special cases, unary edges and binary edges, which
>> > work very much the same as properties and Relationships in the standard
>> > Neo4j API, though we have given them a conceptually different perspective
>> > that unifies the two and fits them neatly into the general case of N-ary
>> > edges.
>> >
>> > As said before, an Edge is a Vertex that connects other Vertices by means
>> > of EdgeRoles. Since Edges are Vertices, they can have other Edges
>> > connected to them. Or in standard API talk: relationships can be connected
>> > to other relationships and they can have properties.
>> >
>> > The concept of EdgeRoles separates Edges from Vertices, so we will
>> > effectively have a bipartite graph where Vertices can only connect to
>> > Edges and Edges can only connect to Vertices.
>> > Given the fact that Edges are also Vertices, Edges can be connected to
>> > Edges, but in such a case it is unambiguous which plays the role of Edge
>> > and which plays the role of Vertex in that connection.
>> >
>> > Let's look at an example of an N-ary edge:
>> >
>> > Suppose we want to state the fact that Tom gives Paula a Bicycle (no
>> > golden helicopters in stock today). We can write that as follows:
>> >
>> >       --Giver--Tom
>> > GIVES --Recipient -- Paula
>> >       --Gift -- Bicycle
>> >
>> > There is an EdgeType GIVES which defines three EdgeRoles: Giver, Recipient
>> > and Gift, which connect Tom, Paula and Bicycle to the Edge.
>> >
>> > The edge is created by first creating three EdgeElement objects that each
>> > contain a Role and the connected Vertex. We can then make the call
>> > db.createEdge(GIVES, edgeElements).
>> >
>> > An EdgeElement is whatever is connected to an Edge for a particular
>> > EdgeRole (including that EdgeRole itself).
>> >
>> > An EdgeElement can contain more than one connected Vertex. We can for
>> > example state: Tom and Dick give Paula a Bicycle.
>> >
>> > In Enhanced API notation:
>> >
>> >       --Giver--Tom, Dick
>> > GIVES --Recipient -- Paula
>> >       --Gift -- Bicycle
>> >
>> > Or we may want to state: Tom, Dick and Harry give Paula and Josephine a
>> > Bicycle and an Icecream.
>> >
>> > In Enhanced API notation:
>> >
>> >       --Giver--Tom, Dick, Harry
>> > GIVES --Recipient -- Paula, Josephine
>> >       --Gift -- Bicycle, Icecream
>> >
>> > The API allows the user to fetch an EdgeElement by means of an EdgeRole
>> > and iterate over the connected Vertices:
>> >
>> > for(EdgeElement givers: gives.getElements(Giver)){
>> >   for(Vertex giver: givers.getVertices()){
>> >     //do something with the giver Vertex
>> >   }
>> > }
>> >
>> > For those cases where an EdgeElement can contain only one Vertex, there is
>> > a FunctionalEdgeElement, which can only be used in conjunction with
>> > FunctionalEdgeRoles.
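The N-ary GIVES example can be sketched as a small, self-contained Java model. This is an illustration under simplifying assumptions rather than the Enhanced API: SimpleVertex, NaryEdge and GivesExample are hypothetical names, roles and edge types are plain strings, and the edge is built directly with a constructor instead of db.createEdge(GIVES, edgeElements).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified Vertex: just a name.
class SimpleVertex {
    final String name;

    SimpleVertex(String name) {
        this.name = name;
    }
}

// Everything connected to an edge for one role, including the role itself.
class EdgeElement {
    final String role;
    private final List<SimpleVertex> vertices;

    EdgeElement(String role, SimpleVertex... vertices) {
        this.role = role;
        this.vertices = Arrays.asList(vertices);
    }

    List<SimpleVertex> getVertices() {
        return vertices;
    }
}

// An N-ary edge is itself a vertex and holds one EdgeElement per role.
class NaryEdge extends SimpleVertex {
    private final List<EdgeElement> elements;

    NaryEdge(String edgeType, EdgeElement... elements) {
        super(edgeType);
        this.elements = Arrays.asList(elements);
    }

    // Returns the elements registered for the given role.
    List<EdgeElement> getElements(String role) {
        List<EdgeElement> result = new ArrayList<>();
        for (EdgeElement element : elements) {
            if (element.role.equals(role)) {
                result.add(element);
            }
        }
        return result;
    }
}

class GivesExample {
    // Mirrors the nested loop from the message: collect the names of all givers.
    static List<String> giverNames(NaryEdge gives) {
        List<String> names = new ArrayList<>();
        for (EdgeElement givers : gives.getElements("Giver")) {
            for (SimpleVertex giver : givers.getVertices()) {
                names.add(giver.name);
            }
        }
        return names;
    }
}
```

A functional role such as StartRole would correspond, in this model, to an EdgeElement that is only ever constructed with exactly one vertex.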
>> >
>> > StartRole, EndRole and PropertyRole are all FunctionalEdgeRoles, since we
>> > can have only one start Vertex and one end Vertex per BinaryEdge (just
>> > like there can only be one StartNode and one EndNode for a Relationship in
>> > the standard API) and we can only have one Vertex associated with a
>> > Property (just like a property cannot belong to two different Nodes in
>> > the standard Neo4j API).
>> >
>> > The Enhanced API can be used in conjunction with the standard Neo4j API.
>> > The only replacement needed is that of the database instance. The Enhanced
>> > API defines a DatabaseService interface, which extends the standard
>> > GraphDatabaseService interface and adds several enhanced methods for the
>> > creation and lookup of Vertices, Edges and several kinds of VertexTypes.
>> >
>> > Now the big question is of course, what do we gain with this entire
>> > apparatus?
>> >
>> > First of all, we have unification of the storage elements of Neo4j.
>> > Everything that can be stored in Neo4j is a Vertex:
>> >
>> > Node is very much like a Vertex (with a slightly different interface that
>> > has similar features to the standard Neo4j API, and more...)
>> > Relationship is very much like BinaryEdge, which is an Edge, which is a
>> > Vertex
>> > RelationshipType is covered by BinaryEdgeType which is an EdgeType, which
>> > is a VertexType, which is a Vertex
>> > property name is wrapped as a PropertyType which is an EdgeType, which
>> > is a VertexType, which is a Vertex
>> > property value is wrapped as a Property which is a UnaryEdge, which is an
>> > Edge, which is a Vertex
>> >
>> > Having this unification, it is possible to write traversals to every part
>> > of the Neo4j database. And that is the big boon of this unification.
>> >
>> > Every part of the database can be accessed with a traversal description.
>> >
>> > The standard Neo4j API only allows traversals to return Nodes given a
>> > start Node.
>> > The Enhanced API allows traversals from any part of the graph, whether it
>> > is a regular Vertex, an Edge or a Property (or a type thereof), to any
>> > other part of the graph, no matter if it is a regular Vertex, an Edge or
>> > a Property (or a type thereof).
>> >
>> > All that needs to be supplied are the EdgeTypes that need to be followed
>> > in a traversal (and the regular evaluators that go with it).
>> >
>> > Now the big downer to this all:
>> >
>> > I still have to write the traversal framework, which will actually follow
>> > the standard Neo4j framework, but will certainly make traversals
>> > composable.
>> >
>> > Every Vertex is not just a Vertex, but it is also a bunch of paths. Well,
>> > not really a bunch: it is a bunch of size one, and not much of a path
>> > either, since it only contains one path element, the Vertex itself.
>> >
>> > A traversal returns a bunch of paths (Iterable<Path>) and starts from a
>> > bunch of paths (still Iterable<Path>).
>> >
>> > Since the output of a traversal is the same as the input of a traversal we
>> > can now compose them. This makes it possible to write a traversal
>> > description which states that we want to retrieve the parents of our
>> > friends, or the neighbours of the parents of our friends, and even: the
>> > names of the dogs of the neighbours of the parents of our friends (after
>> > all, we can now traverse to a property).
>> >
>> > This can be achieved when we make traversal descriptions composable. Most
>> > users probably don't want to manually compose traversals; they would much
>> > rather compose traversal descriptions and let those descriptions do the
>> > composition of the traversals.
>> >
>> > These are some things to work on over the weekend, plus documentation
>> > (especially Javadoc) and more test cases (especially the integration of
>> > IndexedRelationships as SortableBinaryEdges needs thorough testing).
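The composition idea can be sketched in a few lines of plain Java. The traversal framework described above had not been written at the time of the message, so nothing here reflects its actual design: PathSketch, TraversalSketch, GraphSketch, follow and then are hypothetical names, paths are lists of element names, and List is used in place of Iterable<Path> to keep the example short. The only property taken from the message is that a traversal maps a bunch of paths to a bunch of paths, which is what makes chaining possible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A path is modeled as the list of element names visited so far.
class PathSketch {
    final List<String> elements;

    PathSketch(List<String> elements) {
        this.elements = elements;
    }

    // A single vertex seen as a path of length one.
    static PathSketch of(String start) {
        List<String> single = new ArrayList<>();
        single.add(start);
        return new PathSketch(single);
    }

    String end() {
        return elements.get(elements.size() - 1);
    }

    PathSketch extend(String next) {
        List<String> copy = new ArrayList<>(elements);
        copy.add(next);
        return new PathSketch(copy);
    }
}

// Input and output have the same type, so traversals can be chained.
interface TraversalSketch {
    List<PathSketch> traverse(List<PathSketch> starts);

    default TraversalSketch then(TraversalSketch next) {
        return starts -> next.traverse(this.traverse(starts));
    }
}

// Toy graph: edge type -> (start element -> connected elements).
class GraphSketch {
    private final Map<String, Map<String, List<String>>> edges = new HashMap<>();

    void connect(String from, String edgeType, String to) {
        edges.computeIfAbsent(edgeType, k -> new HashMap<>())
             .computeIfAbsent(from, k -> new ArrayList<>())
             .add(to);
    }

    // One-step traversal that extends every input path along one edge type.
    TraversalSketch follow(String edgeType) {
        return starts -> {
            Map<String, List<String>> byStart =
                edges.getOrDefault(edgeType, Collections.emptyMap());
            List<PathSketch> result = new ArrayList<>();
            for (PathSketch path : starts) {
                for (String next : byStart.getOrDefault(path.end(), Collections.emptyList())) {
                    result.add(path.extend(next));
                }
            }
            return result;
        };
    }
}
```

With such a model, "the dogs of the neighbours of the parents of our friends" becomes a chain like follow("FRIEND").then(follow("PARENT")).then(follow("NEIGHBOUR")).then(follow("DOG")), started from a single one-element path.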
>> >
>> > For the rest, I'd like to hear opinions and suggestions for improvement.
>> >
>> > Niels
>> > _______________________________________________
>> > Neo4j mailing list
>> > [email protected]
>> > https://lists.neo4j.org/mailman/listinfo/user

