I agree that hiding the stuff providers use to implement graph instances is
a good idea. It then establishes a single way of doing things for users and
reduces a lot of confusion. As long as we have multi-properties (which
cause no end of trouble) the possibility of returning an entire Vertex from
a remote source is going to be trouble. Better to just work with references.
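
To make "just work with references" concrete, here is a rough Java sketch of
the boundary as I read the proposal quoted below. Every name in it
(ReferenceVertex, ProviderVertex, ReferenceBoundary) is made up for
illustration and is not an existing TinkerPop class; the multi-property map
is only there to show what stays behind on the provider side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: all types below are hypothetical, not TinkerPop classes.

/** What a user would get back: id + label, nothing else. */
final class ReferenceVertex {
    private final Object id;
    private final String label;

    ReferenceVertex(Object id, String label) {
        this.id = id;
        this.label = label;
    }

    Object id() { return id; }

    String label() { return label; }
}

/** Stand-in for a provider-side vertex that carries (multi-)properties. */
final class ProviderVertex {
    final Object id;
    final String label;
    final Map<String, List<Object>> properties; // one key may hold many values

    ProviderVertex(Object id, String label, Map<String, List<Object>> properties) {
        this.id = id;
        this.label = label;
        this.properties = properties;
    }
}

/** The point where provider objects are reduced to references. */
final class ReferenceBoundary {
    private ReferenceBoundary() {}

    /** Keep only id and label; properties never cross the boundary. */
    static ReferenceVertex toReference(ProviderVertex v) {
        return new ReferenceVertex(v.id, v.label);
    }

    static List<ReferenceVertex> toReferences(List<ProviderVertex> vertices) {
        List<ReferenceVertex> refs = new ArrayList<>(vertices.size());
        for (ProviderVertex v : vertices) {
            refs.add(toReference(v));
        }
        return refs;
    }
}
```

Anything beyond id and label would then have to come from another traversal,
e.g. g.V(v).values('name'), which is the whole point.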

You put a question mark next to ReferenceGraph.tx() - I have no idea what
we do with that atm for a future version of TinkerPop. I think the model
we've used for three versions of TinkerPop now is rooted in the Neo4j
approach to transactions and is often more trouble than it should be for us
and providers. Distributed transactions are a challenge and don't apply to
every provider. Transactions are further complicated by GLVs. I still
like the idea of local subgraphs for mutations and transaction management
(but that goes against having just ReferenceGraph). I guess the short
version of what I'm getting at is that transaction behavior is quite varied
across providers and I don't know how it should be generalized right now,
but in whatever year someone starts writing the first lines of TinkerPop 4,
we should have that answer before getting too far, because shoehorning in
transactions later is going to get us into all kinds of trouble.

On Wed, Jun 28, 2017 at 9:48 AM, Marko Rodriguez <[email protected]>
wrote:

> Hello,
>
> Throughout our documentation we show uses of the “Blueprints API” (i.e.
> Graph/Vertex/Edge/etc. classes & methods) as well as the use of the
> Traversal API (i.e. Gremlin).
>
> Enabling users to have two ways of interacting with the graph system has
> its problems:
>
>         1. The DetachedXXX problem — how much data should a returned
> vertex/edge/etc. have associated with it?
>         2. graph.addVertex() and g.addV() — which should I use? The first
> is faster but is not recommended.
>         3. SubgraphStrategy leaking — I get subgraphs with Gremlin, but
> can then directly interact with the vertex objects to see more than I
> should.
>         4. VertexProgram model — I write traversals with Traversal API,
> but then develop VertexPrograms with the Blueprints API. That’s weird.
>         5. GremlinServer returning fat objects — Serializers create
> property-rich vertices and edges. The awkward HaltedTraversalStrategy
> solution.
>         6. … various permutations of these source problems.
>
> I propose that we solve this problem once and for all in TinkerPop4 as
> follows:
>
> There should be two “Graph APIs.”
>
>         1. Provider Graph API: This is the current Blueprints API with
> Graph.addVertex(), Vertex.edges(), Edge.inVertex(), etc.
>         2. User Graph API: This is a ReferenceXXX API.
>
> Let's talk about the second as it's more novel and distinct from current
> practices.
>
> We should have ReferenceGraph which is simply a reference/dummy/proxy to
> the provider Graph API. ReferenceGraph has the following API:
>
> ReferenceGraph.open()
> ReferenceGraph.close()
> ReferenceGraph.tx() // assuming we like the current transaction model (??)
> ReferenceGraph.traversal()
>
> That is it. What does this entail? Assume the following traversal:
>
> g = ReferenceGraph.open(config).traversal()
> g.V(1).out('knows')
>
> ReferenceGraph is almost like a “RemoteGraph” (RemoteStrategy) in that it
> makes a connection (remote or inter-JVM) to the provider Graph API. When
> g.V(1).out('knows') executes, it is really sending the bytecode to the
> provider Graph for execution (as specified by the config of
> ReferenceGraph.open()). Thus, once it hits the provider's graph,
> ProviderVertex, ProviderEdge, etc. are the objects being processed.
> However, what the traversal’s Iterator<Vertex> returns is ReferenceVertex!
> That is, it never returns ProviderVertex. In this way, regardless of
> whether the user is going “over the wire,” within the same JVM, against a
> different provider’s graph database, or from Gremlin-Python/C#/etc., all the
> vertices are simply ‘reference vertices’ (id + label). This means users
> never interact with the graph element objects themselves directly. They can
> ONLY interact with the graph via traversals! At most they can call
> ReferenceVertex.id() and ReferenceVertex.label(). That's it: no mutations,
> no walking edges, nada! And moreover, since ReferenceXXX has enough
> information to re-attach to the source graph, they can always do the
> following to get more information:
>
> v = g.V(1).out('knows').next()
> g.V(v).values('name')
>
> This split into two Graph APIs will enable us to make a hard boundary
> between what the provider (vendor) needs to implement and what the user
> (developer) gets to access. This distinction should solve the problems
> articulated at the start of this email.
>
> Thoughts?
> Marko.
>
> http://markorodriguez.com
>
>
>
>
