Very interesting thoughts!

I would love to have a bootcamp and explore a spike on how this would
work out in practice. Got anything to do this autumn? ;)

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org               - Your high performance graph database.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.



On Sun, Aug 7, 2011 at 4:30 PM, Niels Hoogeveen
<[email protected]> wrote:
>
> Hi Peter,
>
> Thanks for showing an interest.
>
> A Property is indeed a unary edge in the Enhanced API and therefore 
> (potentially) backed by a Node, but that Node doesn't contain the value.
>
> All property values are still stored the way they are stored in the standard 
> API. If someone however decides to add a Property to a Property or create an 
> Edge containing that Property, a Node will be created to store those 
> properties and connect those Edges to.
>
> When the associated Node of a Property is created, the ID of that Node will 
> be stored in the PropertyContainer of that property.
>
> Example:
>
> Suppose we have a property on a "Person" Vertex that denotes a personal 
> identity number, and the user of the application wants to check that 
> identity number annually against some other database and record when it was 
> last verified and who verified it.
>
> A Vertex (backed by a Node) for a particular Person is created and the 
> property is set (in that Node's PropertyContainer), just like it would be the 
> case in the standard API.
>
> When the verification is done, an additional property is created on the 
> PropertyContainer of that Person with the name 
> org.neo4j.collections.graphdb.[propertyname].node_id
>
> This property contains the ID of the Node associated with the property. On 
> that Node the verification date will be set, and a BinaryEdge (in principle 
> nothing but a classic Relationship) will be created to the "Person" Vertex 
> of the one who verified the personal identity number.
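To check that I read the bookkeeping correctly, here is a minimal sketch using plain Java maps as a stand-in for the store. Everything in it (PropertyNodeSketch, createNode, propertyNode, the "identityNumber" and "lastVerified" keys) is hypothetical and not the Neo4j or Enhanced API; only the key format org.neo4j.collections.graphdb.[propertyname].node_id comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for a graph store; NOT the Neo4j or Enhanced API.
class PropertyNodeSketch {
    static final String PREFIX = "org.neo4j.collections.graphdb.";
    static final String SUFFIX = ".node_id";

    // node ID -> property container of that node
    final Map<Long, Map<String, Object>> nodes = new HashMap<>();
    private long nextId = 0;

    long createNode() {
        long id = nextId++;
        nodes.put(id, new HashMap<>());
        return id;
    }

    // Returns the ID of the node backing the given property, creating the
    // node on first use and recording its ID in the owner's container.
    long propertyNode(long ownerId, String propertyName) {
        Map<String, Object> owner = nodes.get(ownerId);
        String key = PREFIX + propertyName + SUFFIX;
        Object existing = owner.get(key);
        if (existing != null) {
            return (Long) existing;
        }
        long backing = createNode();
        owner.put(key, backing);
        return backing;
    }

    public static void main(String[] args) {
        PropertyNodeSketch db = new PropertyNodeSketch();
        long person = db.createNode();
        // The value itself stays in the person's own property container.
        db.nodes.get(person).put("identityNumber", "1234");
        // Only when we annotate the property is a backing node created.
        long backing = db.propertyNode(person, "identityNumber");
        db.nodes.get(backing).put("lastVerified", "2011-08-07");
        System.out.println(db.nodes.get(person));
    }
}
```

The point of the sketch is that the backing node is created lazily and only once, while the property value never moves.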
>
> It is certainly true that everything being a Vertex makes the Node 
> implementation more important than ever before, but it goes even further: 
> apart from a standard Vertex and the various VertexTypes, almost everything 
> is an Edge. So I would say the Relationship implementation is becoming 
> eminently important.
>
> There are certainly several tweaks to the storage layer I would love to see 
> incorporated, mostly to hide the implementation from the user and to make 
> sure that the maintenance of IDs takes place in core and not in a layer on 
> top of core.
>
> In fact, all of Enhanced API could be maintained much better in core, 
> something that can actually be implemented quite easily. One of my "ulterior 
> motives" with the development of Enhanced API is to tease out the technical 
> requirements to push this functionality into core (whether Neo Tech decides 
> to do so is another question, of course).
>
> Since the Neo4j database consists mostly of records and linked lists, the 
> technical requirements to push things into core are mostly a question of 
> adding entry-points to linked lists in some records and partitioning some 
> existing linked lists.
>
> I will write down those requirements in a separate post. This will include 
> support for N-ary edges, since that is actually not all that difficult to 
> implement and adds very little complexity to the database.
>
> Yes, traversals will become much more generalized in the Enhanced API, 
> especially when we make them composable. In fact composable traversal 
> descriptions can easily be seen as a query language giving access to all 
> parts of the database.
>
> Niels
>
>> From: [email protected]
>> Date: Sun, 7 Aug 2011 09:10:02 +0200
>> To: [email protected]
>> Subject: Re: [Neo4j] Enhanced API rewrite
>>
>> Niels,
>> this sounds very interesting. Given the role of properties being unary
>> edges, that would mean that any classic Neo4j property would now be a
>> Node with one Property in the new Vertex sense?
>>
>> Having Vertices for EVERYTHING will of course make the
>> node-implementation much more important than anything else, since
>> every element is backed by a node, possibly with some property. I
>> wonder how this would reflect in the storage layer that might need to
>> be tweaked.
>>
>> Also, as you point out, traversals will become quite different with
>> this API, but let's see what the weekend brings ;)
>>
>> Cheers,
>>
>> /peter neubauer
>>
>> On Sat, Aug 6, 2011 at 2:51 AM, Niels Hoogeveen
>> <[email protected]> wrote:
>> >
>> > Today I pushed a major rewrite of the Enhanced API. See: 
>> > https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb
>> >
>> > Originally the Enhanced API was a drop-in replacement of the standard 
>> > Neo4j API. This resulted in lots of wrapper classes that needed to be 
>> > maintained.
>> >
>> > The rewrite of Enhanced API is no longer a drop-in replacement and 
>> > contains no interface/class names that can be found in the standard API.
>> >
>> > Enhanced API no longer speaks of Nodes but of Vertices and doesn't speak 
>> > of Relationships but of Edges. This helps to prevent name clashes at the 
>> > expense of somewhat less recognizable names (Relationship is after all a 
>> > more common word than Edge).
>> >
>> > This rewrite is not merely a renaming of classes and interfaces, but is 
>> > for the most part a complete rewrite and also a rethinking of the API on 
>> > my part.
>> >
>> > Enhanced API consists of two basic elements: Vertex and EdgeRole. Most 
>> > elements are a subclass of Vertex, though there are some specialized 
>> > versions of EdgeRole.
>> >
>> > Let me start with an example:
>> >
>> > Suppose we have two vertices denoting the persons Tom and Paula, and we 
>> > want to state that Tom is the father of Paula.
>> >
>> > For standard Neo4j we tend to write such a fact as:
>> >
>> > Tom --Father--> Paula
>> >
>> > For Enhanced API we can conceptually write this fact as follows:
>> >
>> >       --StartRole--Tom
>> > Father
>> >       --EndRole--Paula
>> >
>> > This should be read as follows: We have two Vertices: Tom and Paula and we 
>> > have a BinaryEdge (similar to a Relationship in the standard API) of type 
>> > "Father", where Tom has the StartRole for that edge and Paula has the 
>> > EndRole for that edge.
>> >
>> > So instead of a directed graph, we conceptually have an undirected 
>> > bipartite graph.
>> >
>> > For binary edges (edges between two vertices), this is mostly conceptually 
>> > the case, because the API will simply allow you to write: 
>> > tom.createEdgeTo(paula, FATHER) (similar to 
>> > tom.createRelationshipTo(paula, FATHER) as we would have in the standard 
>> > API).
>> >
>> > It is also possible to fetch the start vertex of the binary relationship 
>> > with the method: edge.getStartVertex() (similar to 
>> > relationship.getStartNode()), although it is also possible to treat the 
>> > binary edge as a generic edge and fetch that Vertex as: 
>> > edge.getElement(db.getStartRole()).
>> >
>> > BinaryEdges are a special case and have special methods which cover the 
>> > same functionality as can be found in the standard Neo4j API.
>> >
>> > In general, we can say that Vertices are connected to Edges by means of 
>> > EdgeRoles. In the binary case there are two predefined EdgeRoles: 
>> > StartRole and EndRole.
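A minimal sketch of how I picture the binary case, assuming nothing about the actual classes in graph-collections: the names BinaryEdgeSketch and Role are hypothetical, and the methods only mirror the ones mentioned above (createEdgeTo, getStartVertex, getElement). The convenience accessors are just the generic role lookup in disguise.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in classes; not the Enhanced API implementation.
class BinaryEdgeSketch {
    enum Role { START, END }

    static class Vertex {
        final String name;

        Vertex(String name) {
            this.name = name;
        }

        // Convenience method, analogous to createRelationshipTo.
        BinaryEdge createEdgeTo(Vertex end, String type) {
            return new BinaryEdge(type, this, end);
        }
    }

    // An edge is itself a vertex; it reaches its endpoints only via roles.
    static class BinaryEdge extends Vertex {
        private final Map<Role, Vertex> elements = new LinkedHashMap<>();

        BinaryEdge(String type, Vertex start, Vertex end) {
            super(type);
            elements.put(Role.START, start);
            elements.put(Role.END, end);
        }

        // Generic access: which vertex plays the given role?
        Vertex getElement(Role role) {
            return elements.get(role);
        }

        // Binary-specific shortcuts built on the generic access.
        Vertex getStartVertex() {
            return getElement(Role.START);
        }

        Vertex getEndVertex() {
            return getElement(Role.END);
        }
    }

    public static void main(String[] args) {
        Vertex tom = new Vertex("Tom");
        Vertex paula = new Vertex("Paula");
        BinaryEdge father = tom.createEdgeTo(paula, "FATHER");
        System.out.println(father.getStartVertex().name + " --" + father.name
                + "--> " + father.getEndVertex().name);
    }
}
```

Because BinaryEdge extends Vertex here, an edge can itself be the start or end of another edge, which is the "relationships connected to relationships" case.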
>> >
>> > Before we get deeper into the general case of n-ary edges, let's first 
>> > look at another special case: Properties.
>> >
>> > Properties can be thought of as unary edges, an edge that connects to only 
>> > one Vertex (as opposed to two in the binary case).
>> >
>> > Suppose we want to state that Tom is 49 years old. We can write that as:
>> >
>> > age(49)--PropertyRole--Tom
>> >
>> > We have an edge of type "age" that is connected to the vertex Tom in the 
>> > role of a property.
>> >
>> > Again this is mostly conceptually true, because there are lots of methods 
>> > in Enhanced API that are very similar to the ones found in the standard 
>> > API: getProperty, hasProperty, setProperty. Alternatively, we can call 
>> > methods on the property itself; after all, the age property connected to 
>> > the Vertex "Tom" is an object in itself. More precisely, it is a 
>> > Property and with that it is a UnaryEdge, which is an Edge, which is a 
>> > Vertex.
>> >
>> > From the age property we can fetch the PropertyType, but we can also ask 
>> > for the Vertex it is connected to: getVertex(). Since a Property is an 
>> > Edge we can also fetch the connected vertex (Tom) as follows: 
>> > age.getElement(db.getPropertyRole()).
>> >
>> > So we have seen the two special cases: unary edges and binary edges, which 
>> > work very much the same as properties and Relationships in the standard 
>> > Neo4j API, though we have given it a conceptually different perspective 
>> > that unifies the two and fits it neatly into the general case of N-ary 
>> > edges.
>> >
>> > As said before, an Edge is a Vertex that connects other Vertices by means 
>> > of EdgeRoles. Since Edges are Vertices, they can have other Edges 
>> > connected to them. Or in standard API talk: relationships can be connected 
>> > to other relationships and they can have properties.
>> >
>> > The concept of EdgeRoles separates Edges from Vertices, so we will 
>> > effectively have a bipartite graph where Vertices can only connect to 
>> > Edges and Edges can only connect to Vertices. Given the fact that Edges 
>> > are also Vertices, Edges can be connected to Edges, but in such a case it 
>> > is unambiguous which plays the role of Edge and which plays the role of 
>> > Vertex in that connection.
>> >
>> > Let's look at an example of an N-ary edge:
>> >
>> > Suppose we want to state the fact that Tom gives Paula a Bicycle (no 
>> > golden helicopters in stock today). We can write that as follows:
>> >
>> >       --Giver--Tom
>> > GIVES --Recipient--Paula
>> >       --Gift--Bicycle
>> >
>> > There is an EdgeType GIVES which defines three EdgeRoles: Giver, Recipient 
>> > and Gift, which connect Tom, Paula and Bicycle to the Edge.
>> >
>> > The edge is created by first creating three EdgeElement objects that each 
>> > contain a Role and the connected Vertex. We can then make the call 
>> > db.createEdge(GIVES, edgeElements).
>> >
>> > An EdgeElement is whatever is connected to an Edge for a particular 
>> > EdgeRole (including that EdgeRole itself).
>> >
>> > An EdgeElement can contain more than one connected Vertex. We can for 
>> > example state: Tom and Dick give Paula a Bicycle.
>> >
>> > In Enhanced API notation:
>> >
>> >       --Giver--Tom, Dick
>> > GIVES --Recipient--Paula
>> >       --Gift--Bicycle
>> >
>> > Or we may want to state: Tom, Dick and Harry give Paula and Josephine a 
>> > Bicycle and an Icecream.
>> >
>> > In Enhanced API notation:
>> >
>> >       --Giver--Tom, Dick, Harry
>> > GIVES --Recipient--Paula, Josephine
>> >       --Gift--Bicycle, Icecream
>> >
>> > The API allows the user to fetch an EdgeElement by means of an EdgeRole 
>> > and iterate over the connected Vertices:
>> >
>> > for (EdgeElement givers : gives.getElements(Giver)) {
>> >   for (Vertex giver : givers.getVertices()) {
>> >     // do something with the giver Vertex
>> >   }
>> > }
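For my own understanding, a self-contained sketch of the N-ary case with the same loop shape. The classes below (NaryEdgeSketch and its nested Vertex, EdgeElement and Edge) are hypothetical stand-ins with roles reduced to plain strings; they are not the real Enhanced API types, and db.createEdge is replaced by a constructor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in classes; roles are plain strings for brevity.
class NaryEdgeSketch {
    static class Vertex {
        final String name;

        Vertex(String name) {
            this.name = name;
        }
    }

    // A role together with every vertex connected in that role.
    static class EdgeElement {
        final String role;
        final List<Vertex> vertices;

        EdgeElement(String role, Vertex... vertices) {
            this.role = role;
            this.vertices = Arrays.asList(vertices);
        }

        List<Vertex> getVertices() {
            return vertices;
        }
    }

    // An N-ary edge is a vertex that owns a list of edge elements.
    static class Edge extends Vertex {
        private final List<EdgeElement> elements;

        Edge(String type, EdgeElement... elements) {
            super(type);
            this.elements = Arrays.asList(elements);
        }

        List<EdgeElement> getElements(String role) {
            List<EdgeElement> result = new ArrayList<>();
            for (EdgeElement element : elements) {
                if (element.role.equals(role)) {
                    result.add(element);
                }
            }
            return result;
        }
    }

    // Same nested loop as in the mail: elements for a role, then vertices.
    static List<String> namesFor(Edge edge, String role) {
        List<String> names = new ArrayList<>();
        for (EdgeElement element : edge.getElements(role)) {
            for (Vertex vertex : element.getVertices()) {
                names.add(vertex.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Edge gives = new Edge("GIVES",
                new EdgeElement("Giver", new Vertex("Tom"), new Vertex("Dick")),
                new EdgeElement("Recipient", new Vertex("Paula")),
                new EdgeElement("Gift", new Vertex("Bicycle")));
        System.out.println(namesFor(gives, "Giver"));
    }
}
```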
>> >
>> > For those cases where an EdgeElement can contain only one Vertex, there is 
>> > a FunctionalEdgeElement, which can only be used in conjunction with 
>> > FunctionalEdgeRoles.
>> >
>> > StartRole, EndRole and PropertyRole are all FunctionalEdgeRoles, since we 
>> > can have only one start Vertex and one end Vertex per BinaryEdge (just 
>> > like there can only be one StartNode and one EndNode for a Relationship in 
>> > the standard API) and we can only have one Vertex associated with a 
>> > Property (just like a property cannot belong to two different Nodes in 
>> > the standard Neo4j API).
>> >
>> > The Enhanced API can be used in conjunction with standard Neo4j API. The 
>> > only replacement needed is that of the database instance. The Enhanced API 
>> > defines a DatabaseService interface, which extends the standard 
>> > GraphDatabaseService interface and adds several enhanced methods for the 
>> > creation and lookup of Vertices, Edges and several kinds of VertexTypes.
>> >
>> > Now the big question is of course, what do we gain with this entire 
>> > apparatus?
>> >
>> > First of all, we have unification of the storage elements of Neo4j. 
>> > Everything that can be stored in Neo4j is a Vertex:
>> >
>> > Node is very much like a Vertex (with a slightly different interface that 
>> > has similar features to the standard Neo4j API, and more...)
>> > Relationship is very much like BinaryEdge, which is an Edge, which is a 
>> > Vertex
>> > RelationshipType is covered by BinaryEdgeType which is an EdgeType, which 
>> > is a VertexType, which is a Vertex
>> > property name is wrapped as a PropertyType which is an EdgeType, which 
>> > is a VertexType, which is a Vertex
>> > property value is wrapped as a Property which is a UnaryEdge, which is an 
>> > Edge, which is a Vertex
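Spelled out as Java types, the "is a" chains above would look roughly like this. It is only a sketch of the subtype relations listed in the mail; the wrapper class HierarchySketch and the helper isVertex are my own hypothetical additions, and the real interfaces surely carry methods that are omitted here.

```java
// Only the subtype relations from the list above; no methods, no storage.
class HierarchySketch {
    interface Vertex {}
    interface VertexType extends Vertex {}
    interface Edge extends Vertex {}
    interface EdgeType extends VertexType {}
    interface UnaryEdge extends Edge {}
    interface BinaryEdge extends Edge {}
    interface BinaryEdgeType extends EdgeType {}
    interface PropertyType extends EdgeType {}
    interface Property extends UnaryEdge {}

    // True when the given type is, directly or transitively, a Vertex.
    static boolean isVertex(Class<?> type) {
        return Vertex.class.isAssignableFrom(type);
    }

    public static void main(String[] args) {
        System.out.println(isVertex(Property.class));
        System.out.println(isVertex(BinaryEdgeType.class));
    }
}
```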
>> >
>> > Having this unification, it is possible to write traversals to every part 
>> > of the Neo4j database. And that is the big boon of this unification.
>> >
>> > Every part of the database can be accessed with a traversal description.
>> >
>> > The standard Neo4j API only allows traversals to return Nodes given a 
>> > start Node. The Enhanced API allows traversals from any part of the graph, 
>> > whether it is a regular Vertex, an Edge or a Property (or a type thereof), 
>> > to any other part of the graph, no matter if it is a regular Vertex, an 
>> > Edge or a Property (or a type thereof).
>> >
>> > All that needs to be supplied are the EdgeTypes that need to be followed 
>> > in a traversal (and the regular evaluators that go with it).
>> >
>> > Now the big downer to this all:
>> >
>> > I still have to write the traversal framework, which will actually follow 
>> > the Standard Neo4j framework, but will certainly make traversals 
>> > composable.
>> >
>> > Every Vertex is not just a Vertex, but it is also a bunch of paths. Well 
>> > not really a bunch, it is a bunch of size one, and not much of a path 
>> > either, since it only contains one path element, the Vertex itself.
>> >
>> > A traversal returns a bunch of paths (Iterable<Path>) and starts from a 
>> > bunch of paths (still Iterable<Path>).
>> >
>> > Since the output of a traversal is the same as the input of a traversal we 
>> > can now compose them. This makes it possible to write a traversal 
>> > description which states that we want to retrieve the parents of our 
>> > friends, or the neighbours of the parents of our friends, and even: the 
>> > names of the dogs of the neighbours of the parents of our friends (after 
>> > all, we can now traverse to a property).
>> >
>> > This can be achieved when we make traversal descriptions composable. Most 
>> > users probably don't want to manually compose traversals, they would much 
>> > rather compose traversal descriptions and let those descriptions do the 
>> > composition of the traversals.
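The composition idea can be sketched with nothing more than java.util.function.Function, assuming a path is a list of vertex names and a traversal maps a bunch of paths to a bunch of paths. The class ComposableTraversalSketch and its connect and follow methods are hypothetical and have nothing to do with the (not yet written) Enhanced API traversal framework.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Toy graph: a path is a List<String> of vertex names. Hypothetical names.
class ComposableTraversalSketch {
    // edge type -> (source vertex -> target vertices)
    final Map<String, Map<String, List<String>>> edges = new HashMap<>();

    void connect(String from, String type, String to) {
        edges.computeIfAbsent(type, t -> new HashMap<>())
             .computeIfAbsent(from, f -> new ArrayList<>())
             .add(to);
    }

    // A traversal takes paths and returns paths, so traversals compose.
    Function<List<List<String>>, List<List<String>>> follow(String type) {
        return paths -> {
            Map<String, List<String>> byType =
                    edges.getOrDefault(type, Collections.emptyMap());
            List<List<String>> result = new ArrayList<>();
            for (List<String> path : paths) {
                String last = path.get(path.size() - 1);
                for (String next : byType.getOrDefault(last, Collections.emptyList())) {
                    // Extend a copy of the path by one step.
                    List<String> extended = new ArrayList<>(path);
                    extended.add(next);
                    result.add(extended);
                }
            }
            return result;
        };
    }

    public static void main(String[] args) {
        ComposableTraversalSketch graph = new ComposableTraversalSketch();
        graph.connect("me", "FRIEND", "ann");
        graph.connect("ann", "PARENT", "bob");
        graph.connect("bob", "NEIGHBOUR", "carl");
        // "the neighbours of the parents of our friends"
        Function<List<List<String>>, List<List<String>>> traversal =
                graph.follow("FRIEND")
                     .andThen(graph.follow("PARENT"))
                     .andThen(graph.follow("NEIGHBOUR"));
        List<List<String>> start = new ArrayList<>();
        start.add(Collections.singletonList("me"));
        System.out.println(traversal.apply(start));
    }
}
```

A single start vertex is just a one-element list holding a one-element path, which matches the "bunch of size one" remark above.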
>> >
>> > These are some things to work on over the weekend, plus documentation 
>> > (especially Javadoc) and more test cases (especially the integration of 
>> > IndexedRelationships as SortableBinaryEdges needs thorough testing).
>> >
>> > For the rest, I'd like to hear opinions and suggestions for improvement.
>> >
>> > Niels
>> > _______________________________________________
>> > Neo4j mailing list
>> > [email protected]
>> > https://lists.neo4j.org/mailman/listinfo/user
