[ https://issues.apache.org/jira/browse/TINKERPOP-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marko A. Rodriguez closed TINKERPOP-813. ---------------------------------------- Resolution: Fixed The conclusion is that {{GraphTraversal}} is the lowest level for a user to access the Gremlin virtual machine. This will all be spelled out in "DSL Patterns" doc. Closing this. > [Proposal] Make the Gremlin Graph Traversal Machine and Instruction Set > Explicit > -------------------------------------------------------------------------------- > > Key: TINKERPOP-813 > URL: https://issues.apache.org/jira/browse/TINKERPOP-813 > Project: TinkerPop > Issue Type: Improvement > Components: process > Affects Versions: 3.0.2-incubating > Reporter: Marko A. Rodriguez > Assignee: Marko A. Rodriguez > Fix For: 3.2.0-incubating > > > There is a lot of confusion as to "what is Gremlin?" Is it Groovy? Whats > Gremlin-Scala? Does it only work for the JVM? Is it a language or is it an > execution engine? Is Gremlin OLAP different from Gremlin OLTP? How is Cypher > related to Gremlin? Who is John Galt? > For TinkerPop 3.2.0, we need to make explicit *The Gremlin Graph Traversal > Machine and Instruction Set*. Here is a direct mapping to the Java world so > people can get a grasp of where we are going with this. > {code} > Gremlin-Java8 == Java programming language > Gremlin-Scala == Scala programming language > Gremlin-Groovy == Groovy programming language > ... > Gremlin steps == Java instruction set > Gremlin machine == Java virtual machine > {code} > Lets go through each one of these to flush out the concepts. > *Gremlin as a Language.* > When you see: > {code} > g.V().as("a").out("knows").as("b"). > select("a","b"). > by("name"). > by("age") > {code} > You are bearing witness to Gremlin-Java8. This is a fluent-style graph > traversal API that ultimately writes out Gremlin steps. The Gremlin language > (i.e. the fluent-style for doing graph traversing) can be embedded in any > host language that supports function composition and function nesting. > Because of this simple requirement, there exists Gremlin-Groovy, > Gremlin-Scala, Gremlin-Clojure, etc. However, if the host language is NOT on > the JVM, then it doesn't compile directly to Gremlin steps. How do we support > other languages outside the JVM? There are two ways currently, via > GremlinServer, but also, via the step-syntax described next. > *Gremlin as an Instruction Set.* > {code} > g.V().as("a").out("knows").as("b"). > select("a","b"). > by("name"). > by("age") > {code} > The above Gremlin-Java8 query gets compiled down to a step sequence called a > "traversal." Here is the *unique* string representation of the query above. > {code} > [GraphStep([],vertex)@[a], VertexStep(OUT,[knows],vertex)@[b], SelectStep([a, > b],[value(name), value(age)])] > {code} > The steps are the primitives of the Gremlin graph traversal machine. They are > the parameterized instructions that the machine ultimately executes. The > Gremlin instruction set is approximately 30 steps. These are sufficient to > provide general purpose computing and what is typically required to express > the common motifs of any user query. Moreover, with Gremlin3 there is no need > for lambdas so the step syntax above is sufficient to completely express the > traversal computation. > Note that ANY one can create a "Gremlin program" (traversal). You can imagine > a compiler that takes SPARQL and writes out Gremlin steps. Neo4j's Cypher > could be compiled to write out Gremlin steps. Realize that SPARQL is NOT a > JVM language. Thus, PHP/Python/Go/Rust/Lisp can all have Gremlin "languages" > that simply write out traversal steps in the step-syntax presented > previously. Why would anyone want to do this? -- because of the Gremlin graph > traversal machine. > *Gremlin as a Virtual Machine.* > The Gremlin graph traversal machine can be executed in OLTP or OLAP. What > does that mean? It means Gremlin can be executed single machine or > distributed across a multi-machine compute cluster. Best of all, NOTHING > changes about the language or the instruction set. That is the promise of the > Gremlin machine. Most importantly, the Gremlin machine is agnostic to the > underlying graph system. It works over Titan, Neo4j, OrientDB, Giraph, > Hadoop, Spark, BlazeGraph, InfiniteGraph, SQLg, Bitsy, IBM Graph Store, > AmazonDynamoDB, ArangoDB, etc. The Gremlin machine simply executes the steps > and a result set is returned. It doesn't care if the steps were created from > Gremlin-Java8, Cypher, SPARQL, MyGraphLanguage, R:Statistics, NetLogo, punch > cards, etc. It works at the step level, not the human-writable level. This is > the power of Gremlin. Gremlin is to the graph world as Java is the > programming world. And best of all, Gremlin is owned by the Apache Software > Foundation and thus, free to use for any reason commercial or otherwise. > ---- > So, whats next? > 1. Re-read http://arxiv.org/abs/1508.03843. > 2. Hone the Gremlin step library for TinkerPop 3.2.0 > 3. Write a blog explaining this concept for developers to understand. > 4. Write a SPARQL(--) (toy) compiler that generates Gremlin steps for for > the blog in 3. > Hopefully that answers the question -- "What is Gremlin?" and will allow > vendors to realize how they can leverage the execution engine of Gremlin for > their graph product (and if they want their own query language, simply have > it compile to a Gremlin traversal). > Finally, note that I did not discuss {{TraversalStrategies}} (i.e. compiler > optimizers). These are important, but do not effect the conceptual > distinctions made above or their practical application. -- This message was sent by Atlassian JIRA (v6.3.4#6332)