Added Gremlin's Anatomy tutorial I might add more to this, but wanted the basic component parts of Gremlin documented. Seemed best to make this part of a standalone document as it didn't quite fit that well in the reference documentation, as it already has a way of introducing those topics and I didn't want to disturb that too much. CTR
Project: http://git-wip-us.apache.org/repos/asf/tinkerpop/repo Commit: http://git-wip-us.apache.org/repos/asf/tinkerpop/commit/3aa9e70e Tree: http://git-wip-us.apache.org/repos/asf/tinkerpop/tree/3aa9e70e Diff: http://git-wip-us.apache.org/repos/asf/tinkerpop/diff/3aa9e70e Branch: refs/heads/TINKERPOP-1897 Commit: 3aa9e70ef7e50d81886954e398b4355524f7b576 Parents: 1a857da Author: Stephen Mallette <sp...@genoprime.com> Authored: Wed Feb 28 14:14:01 2018 -0500 Committer: Stephen Mallette <sp...@genoprime.com> Committed: Wed Feb 28 14:14:01 2018 -0500 ---------------------------------------------------------------------- docs/src/index.asciidoc | 3 + .../tutorials/gremlins-anatomy/index.asciidoc | 189 +++++++++++++++++++ docs/static/images/gremlin-anatomy-filter.png | Bin 0 -> 168854 bytes docs/static/images/gremlin-anatomy-group.png | Bin 0 -> 62410 bytes docs/static/images/gremlin-anatomy-navigate.png | Bin 0 -> 60514 bytes docs/static/images/gremlin-anatomy.png | Bin 0 -> 87212 bytes pom.xml | 23 +++ 7 files changed, 215 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/3aa9e70e/docs/src/index.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/index.asciidoc b/docs/src/index.asciidoc index 40fbb8c..5cc3dd5 100644 --- a/docs/src/index.asciidoc +++ b/docs/src/index.asciidoc @@ -57,6 +57,8 @@ Note the "+" following the link in each table entry - it forces an asciidoc line A gentle introduction to TinkerPop and the Gremlin traversal language that is divided into five, ten and fifteen minute tutorial blocks. |image:gremlin-dashboard.png[] |link:http://tinkerpop.apache.org/docs/x.y.z/tutorials/the-gremlin-console/[The Gremlin Console] + Provides a detailed look at The Gremlin Console and how it can be used when working with TinkerPop. +^|image:gremlin-anatomy.png[width=125] |link:http://tinkerpop.apache.org/docs/x.y.z/gremlins-anatomy/[Gremlin's Anatomy] +Identifies and explains the component parts of a Gremlin traversal. ^|image:gremlin-chef.png[width=125] |link:http://tinkerpop.apache.org/docs/x.y.z/recipes/[Gremlin Recipes] A collection of best practices and common traversal patterns for Gremlin. ^|image:gremlin-house-of-mirrors-cropped.png[width=200] |link:http://tinkerpop.apache.org/docs/x.y.z/tutorials/gremlin-language-variants/[Gremlin Language Variants] @@ -77,6 +79,7 @@ A getting started guide for users of graph databases and the Gremlin query langu Unless otherwise noted, all "publications" are externally managed: +* Mallette, S.P., link:https://www.slideshare.net/StephenMallette/gremlins-anatomy-88713465["Gremlin's Anatomy,"] DataStax User Group, February 2018. * Rodriguez, M.A., link:https://www.slideshare.net/slidarko/gremlin-1013-on-your-fm-dial["Gremlin 101.3 On Your FM Dial,"] DataStax Support and Engineering Summits, Carmel California and Las Vegas Nevada, May 2017. * Rodriguez, M.A., link:https://www.datastax.com/2017/03/graphoendodonticology["Graphoendodonticology,"] DataStax Engineering Blog, March 2017 * Rodriguez, M.A., link:http://www.datastax.com/dev/blog/gremlins-time-machine["Gremlin's Time Machine,"] DataStax Engineering Blog, September 2016. http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/3aa9e70e/docs/src/tutorials/gremlins-anatomy/index.asciidoc ---------------------------------------------------------------------- diff --git a/docs/src/tutorials/gremlins-anatomy/index.asciidoc b/docs/src/tutorials/gremlins-anatomy/index.asciidoc new file mode 100644 index 0000000..b36d881 --- /dev/null +++ b/docs/src/tutorials/gremlins-anatomy/index.asciidoc @@ -0,0 +1,189 @@ +//// +Licensed to the Apache Software Foundation (ASF) under one or more +contributor license agreements. See the NOTICE file distributed with +this work for additional information regarding copyright ownership. +The ASF licenses this file to You under the Apache License, Version 2.0 +(the "License"); you may not use this file except in compliance with +the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +//// + +image::apache-tinkerpop-logo.png[width=500,link="http://tinkerpop.apache.org"] + +*x.y.z* + +== Gremlin's Anatomy + +image:gremlin-anatomy.png[width=160,float=left]The Gremlin language is typically described by the individual +link:http://tinkerpop.apache.org/docs/x.y.z/reference/#graph-traversal-steps[steps] that make up the language, but it +is worth taking a look at the component parts of Gremlin that make a traversal work. Understanding these component +parts make it possible to discuss and understand more advanced Gremlin topics, such as +link:http://tinkerpop.apache.org/docs/x.y.z/reference/#dsl[Gremlin DSL] development and Gremlin debugging techniques. +Ultimately, Gremlin's Anatomy provides a foundational understanding for helping to read and follow Gremlin of arbitrary +complexity, which will lead you to more easily identify traversal patterns and thus enable you to craft better +traversals of your own. + +NOTE: This tutorial is based on Stephen Mallette's presentation on Gremlin's Anatomy - the slides for that presentation +can be found link:https://www.slideshare.net/StephenMallette/gremlins-anatomy-88713465[here]. + +The component parts of a Gremlin traversal can be all be identified from the following code: + +[gremlin-groovy,modern] +---- +g.V(). + has('person', 'name', within('marko', 'josh')). + outE(). + groupCount(). + by(label()).next() +---- + +In plain English, this traversal requests an out-edge label distribution for "marko" and "josh". The following +sections, will pick this traversal apart to show each component part and discuss it in some detail. + +=== GraphTraversalSource + +_`g.V()`_ - You are likely well acquainted with this bit of Gremlin. It is in virtually every traversal you read in +documentation, blog posts, or examples and is likely the start of most every traversal you will write in your own +applications. + +[gremlin-groovy,modern] +---- +g.V() +---- + +While it is well known that `g.V()` returns a list of all the vertices in the graph, the technical underpinnings of +this ubiquitous statement may be less so well established. First of all, the `g` is a variable. It could have been +`x`, `y` or anything else, but by convention, you will normally see `g`. This `g` is a `GraphTraversalSource` +and it spawns `GraphTraversal` instances with start steps. `V()` is one such start step, but there are others like +`E` for getting all the edges in the graph. The important part is that these start steps begin the traversal. + +In addition to exposing the available start steps, the `GraphTraversalSource` also holds configuration options (perhaps +think of them as pre-instructions for Gremlin) to be used for the traversal execution. The methods that allow you to +set these configurations are prefixed by the word "with". Here are a few examples to consider: + +[source,groovy] +---- +g.withStrategies(SubgraphStrategy.build().vertices(hasLabel('person')).create()). <1> + V().has('name','marko').out().values('name') +g.withSack(1.0f).V().sack() <2> +g.withComputer().V().pageRank() <3> +---- + +<1> Define a link:http://tinkerpop.apache.org/docs/x.y.z/reference/#traversalstrategy[strategy] for the traversal +<2> Define an initial link:http://tinkerpop.apache.org/docs/x.y.z/reference/#sack-step[sack] value +<3> Define a link:http://tinkerpop.apache.org/docs/x.y.z/reference/#graphcomputer[GraphComputer] to use in conjunction +with a `VertexProgram` for OLAP based traversals - for example, see +link:http://tinkerpop.apache.org/docs/x.y.z/reference/#sparkgraphcomputer[Spark] + +IMPORTANT: How you instantiate the `GraphTraversalSource` is highly depending on the graph database implementation that +you are using. Typically, they are instantiated from a `Graph` instance with the `traversal()` method, but some graph +databases, ones that are managed or "server-oriented", will simply give you a `g` to work with. Consult the +documentation of your graph database to determine how the `GraphTraversalSource` is constructed. + +=== GraphTraversal + +As you now know, a `GraphTraversal` is spawned from the start steps of a `GraphTraversalSource`. The `GraphTraversal` +contain the steps that make up the Gremlin language. Each step returns a `GraphTraversal` so that the steps can be +chained together in a fluent fashion. Revisiting the example from above: + +[gremlin-groovy,modern] +---- +g.V(). + has('person', 'name', within('marko', 'josh')). + outE(). + groupCount(). + by(label()).next() +---- + +the `GraphTraversal` components are represented by the `has()`, `outE()` and `groupCount()` steps. The key to reading +this Gremlin is to realize that the output of one step becomes the input to the next. Therefore, if you consider the +start step of `V()` and realize that it returns vertices in the graph, the input to `has()` is going to be a `Vertex`. +The `has()` step is a filtering step and will take the vertices that are passed into it and block any that do not +meet the criteria it has specified. In this case, that means that the output of the `has()` step is vertices that have +the label of "person" and the "name" property value of "josh" or "marko". + +image::gremlin-anatomy-filter.png[width=600] + +Given that you know the output of `has()`, you then also know the input to `outE()`. Recall that `outE()` is a +navigational step in that it enables movement about the graph. In this case, `outE()` tells Gremlin to take the +incoming "marko" and "josh" vertices and traverse their outgoing edges as the output. + +image::gremlin-anatomy-navigate.png[width=600] + +Now that it is clear that the output of `outE()` is an edge, you are aware of the input to `groupCount()` - edges. +The `groupCount()` step requires a bit more discussion of other Gremlin components and will thus be examined in the +following sections. At this point, it is simply worth noting that the output of `groupCount()` is a `Map` and if a +Gremlin step followed it, the input to that step would therefore be a `Map`. + +The previous paragraph ended with an interesting point, in that it implied that there were no "steps" following +`groupCount()`. Clearly, `groupCount()` is not the last function to be called in that Gremlin statement so you might +wonder what the remaining bits are, specifically: `by(label()).next()`. The following sections will discuss those +remaining pieces. + +=== Step Modulators + +It's been explained in several ways now that the output of one step becomes the input to the next, so surely the `Map` +produced by `groupCount()` will feed the `by()` step. As alluded to at the end of the previous section, that +expectation is not correct. Technically, `by()` is not a step. It is a step modulator. A step modulator modifies the +behavior of the previous step. In this case, it is telling Gremlin how the key for the `groupCount()` should be +determined. Or said another way in the context of the example, it answers this question: What do you want the "marko" +and "josh" edges to be grouped by? + +=== Anonymous Traversals + +In this case, the answer to that question is provided by the anonymous traversal `label()` as the argument to the step +modulator `by()`. An anonymous traversal is a traversal that is not bound to a `GraphTraversalSource`. It is +constructed from the double underscore class (i.e. `__`), which exposes static functions to spawn the anonymous +traversals. Typically, the double underscore is not visible in examples and code as by convention, TinkerPop typically +recommends that the functions of that class be exposed in a standalone fashion. In Java, that would mean +link:https://docs.oracle.com/javase/7/docs/technotes/guides/language/static-import.html[statically importing] the +methods, thus allowing `__.label()` to be referred to simply as `label()`. + +NOTE: In Java, the full package name for the `__` is `org.apache.tinkerpop.gremlin.process.traversal.dsl.graph`. + +In the context of the example traversal, you can imagine Gremlin getting to the `groupCount()` step with a "marko" or +"josh" outgoing edge, checking the `by()` modulator to see "what to group by", and then putting edges into buckets +by their `label()` and incrementing a counter on each bucket. + +image::gremlin-anatomy-group.png[width=600] + +The output is thus an edge label distribution for the outgoing edges of the "marko" and "josh" vertices. + +=== Terminal Step + +Terminal steps are different from the `GraphTraversal` steps in that terminal steps do not return a `GraphTraversal` +instance, but instead return the result of the `GraphTraversal`. In the case of the example, `next()` is the terminal +step and it returns the `Map` constructed in the `groupCount()` step. Other examples of terminal steps include: +`hasNext()`, `toList()`, and `iterate()`. Without terminal steps, you don't have a result. You only have a +`GraphTraversal` + +NOTE: You can read more about traversal iteration in the +link:http://tinkerpop.apache.org/docs/x.y.z/tutorials/the-gremlin-console/#result-iteration[Gremlin Console Tutorial]. + +=== Expressions + +It is worth backing up a moment to re-examine the `has()` step. Now that you have come to understand anonymous +traversals, it would be reasonable to make the assumption that the `within()` argument to `has()` falls into that +category. It does not. The `within()` option is not a step either, but instead, something called an expression. An +expression typically refers to anything not mentioned in the previously described Gremlin component categories that +can make Gremlin easier to read, write and maintain. Common examples of expressions would be string tokens, enum +values, and classes with static methods that might spawn certain required values. + +A concrete example would be the class from which `within()` is called - `P`. The `P` class spawns `Predicate` values +that can be used as arguments for certain traversals teps. Another example would be the `T` enum which provides a type +safe way to reference `id` and `label` keys in a traversal. Like anonymous traversals, these classes are usually +statically imported so that instead of having to write `P.within()`, you can simply write `within()`, as shown in the +example. + +== Conclusion + +There's much more to a traversal than just a bunch of steps. Gremlin's Anatomy puts names to each of these component +parts of a traversal and explains how they connect together. Understanding these components part should help provide +more insight into how Gremlin works and help you grow in your Gremlin abilities. \ No newline at end of file http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/3aa9e70e/docs/static/images/gremlin-anatomy-filter.png ---------------------------------------------------------------------- diff --git a/docs/static/images/gremlin-anatomy-filter.png b/docs/static/images/gremlin-anatomy-filter.png new file mode 100755 index 0000000..317bb8c Binary files /dev/null and b/docs/static/images/gremlin-anatomy-filter.png differ http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/3aa9e70e/docs/static/images/gremlin-anatomy-group.png ---------------------------------------------------------------------- diff --git a/docs/static/images/gremlin-anatomy-group.png b/docs/static/images/gremlin-anatomy-group.png new file mode 100755 index 0000000..0039c02 Binary files /dev/null and b/docs/static/images/gremlin-anatomy-group.png differ http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/3aa9e70e/docs/static/images/gremlin-anatomy-navigate.png ---------------------------------------------------------------------- diff --git a/docs/static/images/gremlin-anatomy-navigate.png b/docs/static/images/gremlin-anatomy-navigate.png new file mode 100755 index 0000000..152e40e Binary files /dev/null and b/docs/static/images/gremlin-anatomy-navigate.png differ http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/3aa9e70e/docs/static/images/gremlin-anatomy.png ---------------------------------------------------------------------- diff --git a/docs/static/images/gremlin-anatomy.png b/docs/static/images/gremlin-anatomy.png new file mode 100755 index 0000000..d83ebf7 Binary files /dev/null and b/docs/static/images/gremlin-anatomy.png differ http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/3aa9e70e/pom.xml ---------------------------------------------------------------------- diff --git a/pom.xml b/pom.xml index 259f16a..f6ff536 100644 --- a/pom.xml +++ b/pom.xml @@ -1036,6 +1036,29 @@ limitations under the License. </attributes> </configuration> </execution> + <execution> + <id>tutorial-gremlins-anatomy</id> + <phase>generate-resources</phase> + <goals> + <goal>process-asciidoc</goal> + </goals> + <configuration> + <sourceDirectory>${asciidoc.input.dir}/tutorials/gremlins-anatomy</sourceDirectory> + <sourceDocumentName>index.asciidoc</sourceDocumentName> + <outputDirectory>${htmlsingle.output.dir}/tutorials/gremlins-anatomy + </outputDirectory> + <backend>html5</backend> + <doctype>article</doctype> + <attributes> + <imagesdir>../../images</imagesdir> + <encoding>UTF-8</encoding> + <stylesdir>${asciidoctor.style.dir}</stylesdir> + <stylesheet>tinkerpop.css</stylesheet> + <source-highlighter>coderay</source-highlighter> + <basedir>${project.basedir}</basedir> + </attributes> + </configuration> + </execution> </executions> </plugin> </plugins>