robertdale commented on a change in pull request #1019: TINKERPOP-2114 Document common Gremlin anti-patterns
URL: https://github.com/apache/tinkerpop/pull/1019#discussion_r241980783
 
 

 ##########
 File path: docs/src/recipes/anti-patterns.asciidoc
 ##########
 @@ -0,0 +1,291 @@
+////
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+////
+
+[[long-traversals]]
+== Long Traversals
+
+It can be tempting to generate long traversals, e.g. to create a set of vertices and edges based on information that
+resides within an application. For example, let's consider two lists - one that contains information about persons and
+another that contains information about the relationships between these persons. To illustrate the problem we will
+create two lists with a few random map entries.
+
+[gremlin-groovy]
+----
+:set max-iteration 10
+rnd = new Random(123) ; x = []
+persons = (1..100).collect {["id": it, "name": "person ${it}", "age": rnd.nextInt(40) + 20]}
+relations = (1..500).collect {[rnd.nextInt(persons.size()), rnd.nextInt(persons.size())]}.
+  unique().grep {it[0] != it[1] && !x.contains(it.reverse())}.collect {
+    x << it
+    minAge = Math.min(persons[it[0]].age, persons[it[1]].age)
+    knowsSince = new Date().year + 1900 - rnd.nextInt(minAge)
+    ["from": persons[it[0]].id, "to": persons[it[1]].id, "since": knowsSince]
+  }
+[ "Number of persons": persons.size()
+, "Number of unique relationships": relations.size() ]
+----
+
+Now, to create the `person` vertices and the `knows` edges between them it may look like a good idea to generate a
+single graph-mutating traversal, just like this:
+
+[gremlin-groovy]
+----
+t = g
+for (person in persons) {
+  t = t.addV("person").
+          property(id, person.id).
+          property("name", person.name).
+          property("age", person.age).as("p${person.id}")
+} ; []
+for (relation in relations) {
+  t = t.addE("knows").property("since", relation.since).
+          from("p${relation.from}").
+          to("p${relation.to}")
+} ; []
+traversalAsString = org.apache.tinkerpop.gremlin.groovy.jsr223.GroovyTranslator.of("g").translate(t.bytecode) ; []
+[ "Traversal String Length": traversalAsString.length()
+, "Traversal Preview": traversalAsString.replaceFirst(/^(.{104}).*(.{64})$/, '$1 ... $2') ]
+----
+
+However, this kind of traversal does not scale and is prone to producing a `StackOverflowError`. This error can hardly
+be prevented, as the stack depth is limited by the JVM. The stack size can be increased using the `-Xss` JVM option,
+but that is not how the problem discussed here should be solved. The proper way to accomplish the same thing as in the
+traversal above is to inject the lists into the traversal and process them from there.
+
+[gremlin-groovy]
+----
+g.withSideEffect("relations", relations).
+  inject(persons).sideEffect(
+    unfold().
+    addV("person").
+      property(id, select("id")).
+      property("name", select("name")).
+      property("age", select("age")).
+    group("m").
+      by(id).
+      by(unfold())).
+  select("relations").unfold().as("r").
+  addE("knows").
+    from(select("m").select(select("r").select("from"))).
+    to(select("m").select(select("r").select("to"))).
+    property("since", select("since")).iterate()
+g
+----
+
+Obviously, this traversal is more complicated, but the number of steps is known and thus it's the best way to prevent an
+unexpected `StackOverflowError`. Furthermore, shorter traversals reduce the (de)serialization costs when such a traversal
+is sent over the wire to a Gremlin Server.
+
+NOTE: Although the example was based on a graph-mutating traversal, the same rules apply to read-only and mixed traversals.
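+
+As a minimal read-only sketch, assume the application holds a hypothetical list `ages` and that `g` contains the
+`person` vertices created above. Instead of generating one `has()` filter per list entry, the list can be passed to the
+traversal as a single argument, so the number of steps stays fixed regardless of the list size.
+
+[gremlin-groovy]
+----
+ages = [25, 30, 35] // hypothetical values held by the application
+g.V().has("person", "age", within(ages)).valueMap("name", "age")
+----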
+
+[[unspecified-keys-and-labels]]
+== Unspecified Keys and Labels
+
+Some Gremlin steps have optional arguments that represent keys (e.g. `valueMap()`) or labels (e.g. `out()`). In the
+prototyping phase of a project it's often convenient to use these steps without any arguments. However, in production
+code this is a bad idea and keys and labels should always be specified. Not only does it make the traversal easier to
+read for others, but it also ensures that the application will not break if the schema changes at some point and the
+unqualified queries start returning completely different results.
+
+The following code block shows a few examples that are good for prototyping or graph discovery.
+
+[gremlin-groovy,modern]
+----
+g.V().has("person","name","marko").out()
+g.V().has("person","name","marko").out("created").valueMap()
+g.V().has("software","name","ripple").inE().has("weight", gte(0.5)).outV().properties()
+----
+
+The next code block shows the same queries, but with specified keys and labels.
+
+[gremlin-groovy,existing]
+----
+g.V().has("person","name","marko").out("created","knows")
+g.V().has("person","name","marko").out("created").valueMap("name","lang")
+g.V().has("software","name","ripple").inE("created").has("weight", gte(0.5)).outV().
+  properties("name","age")
+----
+
+[[unnecessary-steps]]
+== Unnecessary Steps
+
+There are quite a few steps and patterns that can be combined into a much shorter form. TinkerPop tries to optimize
+queries by rewriting such patterns automatically using traversal optimization strategies. These strategies, however,
+have a few preconditions and under certain circumstances they will not attempt to rewrite a traversal. For example, if
+the traversal has path computations enabled (e.g. by using certain steps, such as `path()`, `simplePath()`, `otherV()`,
+etc.), then the assumption is that all steps are required in order to produce the desired path.
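+
+The effect can be inspected with `explain()`, which prints the traversal before and after the strategies were applied.
+The following is a minimal sketch, assuming the `modern` toy graph: the first traversal is expected to be rewritten to
+a single `out("created")` step, while the second one, which computes paths, should keep both steps.
+
+[gremlin-groovy,modern]
+----
+g.V().outE("created").inV().explain()
+g.V().outE("created").inV().path().explain()
+----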
+
+An often seen anti-pattern is the one that explicitely traverses to an edge and then to a vertex without using any filters.
 
 Review comment:
   Typo: "explicitely" should be "explicitly".

