[
https://issues.apache.org/jira/browse/TINKERPOP-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
stephen mallette closed TINKERPOP-1564.
---------------------------------------
Resolution: Later
> Distributed OLTP Traversals and the Introduction of Partition Concept
> ---------------------------------------------------------------------
>
> Key: TINKERPOP-1564
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1564
> Project: TinkerPop
> Issue Type: Improvement
> Components: driver, process, server
> Affects Versions: 3.2.3
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
> Priority: Major
> Attachments: distributed-oltp.png
>
>
> This proposal unifies OLTP and OLAP into a single framework that removes the
> need for OLAP {{GraphComputer}} by introducing distributed, data local
> processing to OLTP. In essence, this is a proposal for a step-by-step query
> routing framework within {{Traversal}}. This proposal can work across
> machines in a cluster, threads on a machine, or in a hierarchical fashion
> machines&threads. The example presented will discuss distribution across
> machines in a cluster as its the most complicated scenario.
> Currently, an OLTP traversal executes at a particular machine (or thread) and
> pulls vertex/edge/etc. data to it accordingly in order to solve the
> traversal. In OLAP, the traversal is cloned and distributed to all machines
> in the cluster and traversals communicate with one another by sending
> {{Traversers}} (i.e. messages) between themselves ensuring data local
> processing. Given recent advancements in GremlinServer and
> {{RemoteTraversal}}, it is possible to add traverser routing to OLTP and
> thus, effect the computational paradigm of Gremlin OLAP in Gremlin OLTP with
> some added benefits not possible in Gremlin OLAP.
> Assume a 4 machine cluster and the following traversal:
> {code}
> g.V(1).out(‘knows’).has(‘age’,gt(20)).out(‘likes’).values(‘name’)
> {code}
> Every time there is a "walk" (adjacency), it is possible that the
> {{Traverser}} is no longer accessing data local to the current machine. In
> order to do data local query routing, every adjacency would feed into a
> {{PartitionStep}}. The traversal above would be cloned (via {{Bytecode}}
> distribution) across the cluster where "sibling" {{PartitionSteps}} would
> have network access to one another using the same protocol of
> {{RemoteConnection}} though called {{PartitionConnection}}. Thus, given the 4
> node cluster example, the above traversal would be overlaid as below. Note
> that {{partition()}} would not be a new step in the language, but simply
> provided here to show where {{PartitionStrategy}} would insert
> {{PartitionSteps}} into the traversal.
> {code}
> g.V(1).out(‘knows’).partition().has(‘age’,gt(20)).out(‘likes’).partition().values(‘name’).partition()
> | |
> ^
>
> __.out(‘knows’).partition().has(‘age’,gt(20)).out(‘likes’).partition().values(‘name’).partition()
> | |
> |
>
> __.out(‘knows’).partition().has(‘age’,gt(20)).out(‘likes’).partition().values(‘name’).partition()
> | |
> |
>
> __.out(‘knows’).partition().has(‘age’,gt(20)).out(‘likes’).partition().values(‘name’).partition()
> {code}
> The top traversal is called the "master traversal" and the other three
> "worker traversals." Note that this is identical to current Gremlin OLAP.
> Now, the master traversal would be the traversal that is {{.next()}}'d for
> results. So, when the "master traversal" is {{next()}}'d, {{g.V(1)}} will
> fetch {{v[1]}} and then its outgoing knows-adjacencies. These adjacent
> "reference vertices" would be fed into the first {{remote()}} and a "routing
> algorithm" would determine where in the cluster the particular vertex's data
> is. Thus, {{partition()}} ({{PartitionStep}}) serves as a router, pushing
> {{Traversers}} local to the data. Finally, note that the final
> {{PartitionSteps}} can only feed back to the "master traversal" for ultimate
> aggregation and return to the user.
> TinkerPop currently has all the structures in place to make this possible:
> 1. Encapsulation of computational metadata via {{Traverser}}.
> 2. The ability to detach {{Traversers}} and migrate/serialize them via
> {{Traverser.detach()}} and {{Traverser.attach()}}.
> 3. The concept of {{ReferenceElement}} so the traverser only carries
> with it enough information to re-attach at the remote site.
> 4. {{Bytecode}} and the ability to send {{Traversals}} across the
> cluster.
> 5. GremlinServer and {{Client}}/{{Cluster}} messaging protocol.
> What does {{PartitionStep}} look like? *Please see comments below*
> Here are the benefits of this model:
> * Gremlin OLTP is Gremlin OLAP. The semantics of Gremlin OLAP are exactly
> what is proposed here but with the added benefit that message passing happens
> at the partition/subgraph level, not the star vertex level.
> * There is no need for {{SparkGraphComputer}} as GremlinServer now plays the
> role of SparkServer. The added benefit, no pulling data from the graph
> database and re-representing it in an RDD or {{SequenceFile}}.
> * No longer are "local children traversals" the boundary for "OLAP." Local
> children can be processed beyond the star graph, but would require pulling
> data from a remote machine is necessary. However, given a good graph
> partitioning algorithm, local children will most likely NOT leave the
> subgraph partition and thus, will remain a local computation.
> * Failover is already built into the architecture. If a {{PartitionStep}} can
> not be accessed, but the machine's data is still available (perhaps via
> replication), then data will simply be pulled over the wire instead of
> traversers routed to the "dead node."
> * The infrastructure for side-effects and reducing barrier steps already
> implemented for Gremlin OLAP would automatically work for distributed Gremlin
> OLTP.
> * If the entire graph is hot in-memory across the cluster, then distributed
> in-memory graph computing is possible. Again, no more linear-scans over
> partitions like with Giraph/Spark/etc. ({{GraphComputer}}).
> * If transactions are worked out, then distributed OLTP Gremlin provides
> mutation capabilities (something currently not implemented for
> {{GraphComputer}}). That is {{addV}}, {{addE}}, {{drop}}, etc. just works.
> **Caveate, transactions in this environment across GremlinServer seems
> difficult.**
> So thats that. This could very well be the future of Gremlin OLAP. The
> disjoint between OLAP and OLTP would go away, the codebase would be
> simplified, and the computational gains in terms of performance and
> expressivity would be great. This is a big deal idea.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)