[
https://issues.apache.org/jira/browse/TINKERPOP3-813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717185#comment-14717185
]
Matt Frantz commented on TINKERPOP3-813:
----------------------------------------
It may be important to add more detail on the constraints of the Gremlin
Virtual Machine. For example, you could say something about the observability
of various states, partial ordering, etc. Conforming vendor implementations
may provide additional constraints, but those should fit within the constraints
of the GVM.
One of the key aspects of the GVM is the notion of "effective simultaneity"
that occurs at {{flatMap}} steps, e.g. {{out}}, or at the source, e.g. {{V}}.
Application developers must be comfortable with the notion that operations may
happen in parallel, and that a vendor reserves the right to reorder and/or
parallelize operations within the GVM constraints.
There could also be a useful discussion on equivalence classes of traversals.
Some strategies are concerned with exploiting equivalence in order to improve
performance. There is thus a traversal algebra that could identify the
properties of commutativity, associativity, etc.
> [Proposal] Make the Gremlin Graph Traversal Machine and Instruction Set
> Explicit
> --------------------------------------------------------------------------------
>
> Key: TINKERPOP3-813
> URL: https://issues.apache.org/jira/browse/TINKERPOP3-813
> Project: TinkerPop 3
> Issue Type: Improvement
> Components: process
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
>
> There is a lot of confusion as to "what is Gremlin?" Is it Groovy? Whats
> Gremlin-Scala? Does it only work for the JVM? Is it a language or is it an
> execution engine? Is Gremlin OLAP different from Gremlin OLTP? How is Cypher
> related to Gremlin? Who is John Galt?
> For TinkerPop 3.2.0, we need to make explicit *The Gremlin Graph Traversal
> Machine and Instruction Set*. Here is a direct mapping to the Java world so
> people can get a grasp of where we are going with this.
> {code}
> Gremlin-Java8 == Java programming language
> Gremlin-Scala == Scala programming language
> Gremlin-Groovy == Groovy programming language
> ...
> Gremlin steps == Java instruction set
> Gremlin machine == Java virtual machine
> {code}
> Lets go through each one of these to flush out the concepts.
> *Gremlin as a Language.*
> When you see:
> {code}
> g.V().as("a").out("knows").as("b").
> select("a","b").
> by("name").
> by("age")
> {code}
> You are bearing witness to Gremlin-Java8. This is a fluent-style graph
> traversal API that ultimately writes out Gremlin steps. The Gremlin language
> (i.e. the fluent-style for doing graph traversing) can be embedded in any
> host language that supports function composition and function nesting.
> Because of this simple requirement, there exists Gremlin-Groovy,
> Gremlin-Scala, Gremlin-Clojure, etc. However, if the host language is NOT on
> the JVM, then it doesn't compile directly to Gremlin steps. How do we support
> other languages outside the JVM? There are two ways currently, via
> GremlinServer, but also, via the step-syntax described next.
> *Gremlin as an Instruction Set.*
> {code}
> g.V().as("a").out("knows").as("b").
> select("a","b").
> by("name").
> by("age")
> {code}
> The above Gremlin-Java8 query gets compiled down to a step sequence called a
> "traversal." Here is the *unique* string representation of the query above.
> {code}
> [GraphStep([],vertex)@[a], VertexStep(OUT,[knows],vertex)@[b], SelectStep([a,
> b],[value(name), value(age)])]
> {code}
> The steps are the primitives of the Gremlin graph traversal machine. They are
> the parameterized instructions that the machine ultimately executes. The
> Gremlin instruction set is approximately 30 steps. These are sufficient to
> provide general purpose computing and what is typically required to express
> the common motifs of any user query. Moreover, with Gremlin3 there is no need
> for lambdas so the step syntax above is sufficient to completely express the
> traversal computation.
> Note that ANY one can create a "Gremlin program" (traversal). You can imagine
> a compiler that takes SPARQL and writes out Gremlin steps. Neo4j's Cypher
> could be compiled to write out Gremlin steps. Realize that SPARQL is NOT a
> JVM language. Thus, PHP/Python/Go/Rust/Lisp can all have Gremlin "languages"
> that simply write out traversal steps in the step-syntax presented
> previously. Why would anyone want to do this? -- because of the Gremlin graph
> traversal machine.
> *Gremlin as a Virtual Machine.*
> The Gremlin graph traversal machine can be executed in OLTP or OLAP. What
> does that mean? It means Gremlin can be executed single machine or
> distributed across a multi-machine compute cluster. Best of all, NOTHING
> changes about the language or the instruction set. That is the promise of the
> Gremlin machine. Most importantly, the Gremlin machine is agnostic to the
> underlying graph system. It works over Titan, Neo4j, OrientDB, Giraph,
> Hadoop, Spark, BlazeGraph, InfiniteGraph, SQLg, Bitsy, IBM Graph Store,
> AmazonDynamoDB, ArangoDB, etc. The Gremlin machine simply executes the steps
> and a result set is returned. It doesn't care if the steps were created from
> Gremlin-Java8, Cypher, SPARQL, MyGraphLanguage, R:Statistics, NetLogo, punch
> cards, etc. It works at the step level, not the human-writable level. This is
> the power of Gremlin. Gremlin is to the graph world as Java is the
> programming world. And best of all, Gremlin is owned by the Apache Software
> Foundation and thus, free to use for any reason commercial or otherwise.
> ----
> So, whats next?
> 1. Re-read http://arxiv.org/abs/1508.03843.
> 2. Hone the Gremlin step library for TinkerPop 3.2.0
> 3. Write a blog explaining this concept for developers to understand.
> 4. Write a SPARQL(--) (toy) compiler that generates Gremlin steps for for
> the blog in 3.
> Hopefully that answers the question -- "What is Gremlin?" and will allow
> vendors to realize how they can leverage the execution engine of Gremlin for
> their graph product (and if they want their own query language, simply have
> it compile to a Gremlin traversal).
> Finally, note that I did not discuss {{TraversalStrategies}} (i.e. compiler
> optimizers). These are important, but do not effect the conceptual
> distinctions made above or their practical application.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)