Marko A. Rodriguez updated TINKERPOP3-813:
------------------------------------------
Fix Version/s: 3.2.0-incubating
> [Proposal] Make the Gremlin Graph Traversal Machine and Instruction Set
> Explicit
> --------------------------------------------------------------------------------
>
> Key: TINKERPOP3-813
> URL: https://issues.apache.org/jira/browse/TINKERPOP3-813
> Project: TinkerPop 3
> Issue Type: Improvement
> Components: process
> Affects Versions: 3.0.2-incubating
> Reporter: Marko A. Rodriguez
> Assignee: Marko A. Rodriguez
> Fix For: 3.2.0-incubating
>
>
> There is a lot of confusion as to "what is Gremlin?" Is it Groovy? What's
> Gremlin-Scala? Does it only work for the JVM? Is it a language or is it an
> execution engine? Is Gremlin OLAP different from Gremlin OLTP? How is Cypher
> related to Gremlin? Who is John Galt?
> For TinkerPop 3.2.0, we need to make explicit *The Gremlin Graph Traversal
> Machine and Instruction Set*. Here is a direct mapping to the Java world so
> people can get a grasp of where we are going with this.
> {code}
> Gremlin-Java8 == Java programming language
> Gremlin-Scala == Scala programming language
> Gremlin-Groovy == Groovy programming language
> ...
> Gremlin steps == Java instruction set
> Gremlin machine == Java virtual machine
> {code}
> Let's go through each one of these to flesh out the concepts.
> *Gremlin as a Language.*
> When you see:
> {code}
> g.V().as("a").out("knows").as("b").
> select("a","b").
> by("name").
> by("age")
> {code}
> You are bearing witness to Gremlin-Java8. This is a fluent-style graph
> traversal API that ultimately writes out Gremlin steps. The Gremlin language
> (i.e. the fluent style of graph traversal) can be embedded in any
> host language that supports function composition and function nesting.
> Because of this simple requirement, there exists Gremlin-Groovy,
> Gremlin-Scala, Gremlin-Clojure, etc. However, if the host language is NOT on
> the JVM, then it doesn't compile directly to Gremlin steps. How do we support
> other languages outside the JVM? There are currently two ways: via
> GremlinServer, and via the step syntax described next.
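> To make the "fluent API that writes out steps" idea concrete, here is a
> minimal sketch in plain Java. The names ToyStep and ToyTraversal are
> hypothetical and are NOT TinkerPop classes; the printed form is a simplified
> stand-in, not the real step representation. Each fluent call only records a
> step; nothing is executed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy classes for illustration only; not TinkerPop's API.
final class ToyStep {
    final String name;        // e.g. "V", "out", "select"
    final List<String> args;  // the step's parameters
    String label;             // optional label assigned by as(...)

    ToyStep(String name, String... args) {
        this.name = name;
        this.args = Arrays.asList(args);
    }

    @Override
    public String toString() {
        String text = name + args;
        return label == null ? text : text + "@[" + label + "]";
    }
}

final class ToyTraversal {
    private final List<ToyStep> steps = new ArrayList<>();

    // Every fluent method appends a step and returns this, so calls chain.
    ToyTraversal V() { steps.add(new ToyStep("V")); return this; }
    ToyTraversal out(String edgeLabel) { steps.add(new ToyStep("out", edgeLabel)); return this; }
    ToyTraversal select(String... labels) { steps.add(new ToyStep("select", labels)); return this; }
    ToyTraversal by(String key) { steps.add(new ToyStep("by", key)); return this; }

    // as(...) labels the most recently recorded step.
    ToyTraversal as(String label) {
        steps.get(steps.size() - 1).label = label;
        return this;
    }

    List<ToyStep> steps() { return steps; }

    // Builds the same shape of query as the example in this message.
    static String demo() {
        return new ToyTraversal()
            .V().as("a").out("knows").as("b")
            .select("a", "b").by("name").by("age")
            .steps().toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

> In this toy, the host language contributes only method chaining; the product
> is a plain list of parameterized steps that something else can execute.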
> *Gremlin as an Instruction Set.*
> {code}
> g.V().as("a").out("knows").as("b").
> select("a","b").
> by("name").
> by("age")
> {code}
> The above Gremlin-Java8 query gets compiled down to a step sequence called a
> "traversal." Here is the *unique* string representation of the query above.
> {code}
> [GraphStep([],vertex)@[a], VertexStep(OUT,[knows],vertex)@[b], SelectStep([a,
> b],[value(name), value(age)])]
> {code}
> The steps are the primitives of the Gremlin graph traversal machine. They are
> the parameterized instructions that the machine ultimately executes. The
> Gremlin instruction set is approximately 30 steps. These are sufficient to
> provide general purpose computing and what is typically required to express
> the common motifs of any user query. Moreover, with Gremlin3 there is no need
> for lambdas so the step syntax above is sufficient to completely express the
> traversal computation.
> Note that ANYONE can create a "Gremlin program" (traversal). You can imagine
> a compiler that takes SPARQL and writes out Gremlin steps. Neo4j's Cypher
> could be compiled to write out Gremlin steps. Realize that SPARQL is NOT a
> JVM language. Thus, PHP/Python/Go/Rust/Lisp can all have Gremlin "languages"
> that simply write out traversal steps in the step-syntax presented
> previously. Why would anyone want to do this? Because of the Gremlin graph
> traversal machine.
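> As a rough illustration of "a compiler that writes out steps", the sketch
> below turns one SPARQL-like triple pattern into step strings. It is a toy
> under stated assumptions: ToyPatternCompiler is a hypothetical name, the
> input grammar ("?s label ?o") is invented for this sketch and is not real
> SPARQL, and the output strings only loosely imitate the step representation
> shown earlier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy compiler; the input grammar and output format are assumptions.
final class ToyPatternCompiler {

    // Compiles one pattern of the form "?subject edgeLabel ?object"
    // into a list of step-like strings.
    static List<String> compile(String pattern) {
        String[] parts = pattern.trim().split("\\s+");
        if (parts.length != 3 || !parts[0].startsWith("?") || !parts[2].startsWith("?")) {
            throw new IllegalArgumentException("expected '?s label ?o' but got: " + pattern);
        }
        String subject = parts[0].substring(1);  // drop the leading '?'
        String edgeLabel = parts[1];
        String object = parts[2].substring(1);

        List<String> steps = new ArrayList<>();
        steps.add("GraphStep(vertex)@[" + subject + "]");                           // start at all vertices
        steps.add("VertexStep(OUT,[" + edgeLabel + "],vertex)@[" + object + "]");  // follow outgoing edges
        steps.add("SelectStep([" + subject + ", " + object + "])");                // project both labels
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(compile("?a knows ?b"));
    }
}
```

> The front-end language never touches a graph here; it only emits steps, which
> is why it does not need to be a JVM language.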
> *Gremlin as a Virtual Machine.*
> The Gremlin graph traversal machine can be executed in OLTP or OLAP. What
> does that mean? It means Gremlin can be executed on a single machine or
> distributed across a multi-machine compute cluster. Best of all, NOTHING
> changes about the language or the instruction set. That is the promise of the
> Gremlin machine. Most importantly, the Gremlin machine is agnostic to the
> underlying graph system. It works over Titan, Neo4j, OrientDB, Giraph,
> Hadoop, Spark, BlazeGraph, InfiniteGraph, SQLg, Bitsy, IBM Graph Store,
> AmazonDynamoDB, ArangoDB, etc. The Gremlin machine simply executes the steps
> and a result set is returned. It doesn't care if the steps were created from
> Gremlin-Java8, Cypher, SPARQL, MyGraphLanguage, R:Statistics, NetLogo, punch
> cards, etc. It works at the step level, not the human-writable level. This is
> the power of Gremlin. Gremlin is to the graph world as Java is to the
> programming world. And best of all, Gremlin is owned by the Apache Software
> Foundation and is thus free to use for any purpose, commercial or otherwise.
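> The separation between machine and graph system can be sketched as follows.
> This is a toy, not the Gremlin machine: ToyGraph, InMemoryToyGraph and
> ToyMachine are hypothetical names, only two steps ("V" and "out") are
> supported, and the sample vertices and edges are invented for the example.
> The machine sees only a step list and a small graph interface, so it does not
> care which front end produced the steps or which back end stores the data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical back-end contract: all the toy machine needs from a graph system.
interface ToyGraph {
    List<String> vertices();
    List<String> out(String vertex, String edgeLabel);
}

// One possible back end: adjacency lists held in sorted in-memory maps.
final class InMemoryToyGraph implements ToyGraph {
    private final Map<String, Map<String, List<String>>> adjacency = new TreeMap<>();

    void addEdge(String from, String edgeLabel, String to) {
        adjacency.computeIfAbsent(from, k -> new TreeMap<>())
                 .computeIfAbsent(edgeLabel, k -> new ArrayList<>())
                 .add(to);
        adjacency.computeIfAbsent(to, k -> new TreeMap<>());  // register the target vertex too
    }

    public List<String> vertices() { return new ArrayList<>(adjacency.keySet()); }

    public List<String> out(String vertex, String edgeLabel) {
        Map<String, List<String>> edges = adjacency.getOrDefault(vertex, Collections.emptyMap());
        return edges.getOrDefault(edgeLabel, Collections.emptyList());
    }
}

final class ToyMachine {
    // Executes steps of the form {"V"} or {"out", edgeLabel}.
    // Each traverser is the path of vertices it has visited so far.
    static List<List<String>> run(ToyGraph graph, List<String[]> steps) {
        List<List<String>> traversers = new ArrayList<>();
        for (String[] step : steps) {
            List<List<String>> next = new ArrayList<>();
            if (step[0].equals("V")) {
                for (String v : graph.vertices()) next.add(Arrays.asList(v));
            } else if (step[0].equals("out")) {
                for (List<String> path : traversers) {
                    String current = path.get(path.size() - 1);
                    for (String neighbor : graph.out(current, step[1])) {
                        List<String> extended = new ArrayList<>(path);
                        extended.add(neighbor);
                        next.add(extended);
                    }
                }
            } else {
                throw new IllegalArgumentException("unknown step: " + step[0]);
            }
            traversers = next;
        }
        return traversers;
    }

    // Runs V().out("knows") over a small invented graph.
    static String demo() {
        InMemoryToyGraph graph = new InMemoryToyGraph();
        graph.addEdge("marko", "knows", "vadas");
        graph.addEdge("marko", "knows", "josh");
        graph.addEdge("josh", "created", "lop");
        List<String[]> steps = Arrays.asList(new String[]{"V"}, new String[]{"out", "knows"});
        return run(graph, steps).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

> Swapping InMemoryToyGraph for another ToyGraph implementation would leave
> ToyMachine.run and the step list unchanged, which is the property this
> section describes.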
> ----
> So, what's next?
> 1. Re-read http://arxiv.org/abs/1508.03843.
> 2. Hone the Gremlin step library for TinkerPop 3.2.0
> 3. Write a blog explaining this concept for developers to understand.
> 4. Write a SPARQL(--) (toy) compiler that generates Gremlin steps for
> the blog in 3.
> Hopefully that answers the question -- "What is Gremlin?" and will allow
> vendors to realize how they can leverage the execution engine of Gremlin for
> their graph product (and if they want their own query language, simply have
> it compile to a Gremlin traversal).
> Finally, note that I did not discuss {{TraversalStrategies}} (i.e. compiler
> optimizers). These are important, but do not affect the conceptual
> distinctions made above or their practical application.